University of Southern California


DE-FOA-0002306: Fair Data and Models for Artificial Intelligence and Machine Learning

Slots:                                                     2 available, 1 granted

Internal Deadline:                           First come, first served.

LOI:                                                        April 17, 2020

External Deadline:                          May 15, 2020

Award Information:                        Type: Cooperative Agreement

Estimated Number of Awards: 3 – 7 expected.

Anticipated Amount: A total of $8,500,000 is expected to be available to support awards made under this FOA and its companion Program Announcement to the DOE National Laboratories, subject to the appropriation of funds by Congress.


Submission Process:                     PIs must submit their application as a Limited Submission through the Office of Research Application Portal:


Materials to submit:

Link to Award:                        

Who May Serve as PI:                    Individuals with the skills, knowledge, and resources necessary to carry out the proposed research as a PI are invited to work with their organizations to develop an application for assistance. Individuals from underrepresented groups as well as individuals with disabilities are always encouraged to apply for assistance.

Budgetary Requirements:           Cost sharing is not required.

Purpose: The DOE SC program in Advanced Scientific Computing Research (ASCR) hereby announces its interest in making research data and artificial intelligence (AI) models findable, accessible, interoperable, and reusable (FAIR) to facilitate the development of new AI applications in SC’s congressionally authorized mission space, which includes the advancement of AI research and development. In particular, ASCR is interested in supporting FAIR benchmark data for AI and FAIR frameworks for relating data and AI models.

For this FOA, AI includes, for example, machine learning (ML), deep learning (DL), neural networks (NN), computer vision, and natural language processing (NLP). Data, in this context, are the digital artifacts used to generate AI models and/or employed in combination with AI models during inference. An AI model is an inference method that can be used to perform a “task,” such as prediction, diagnosis, or classification. The model is developed using training data or other knowledge. An AI task is the inference activity performed by an artificially intelligent system.

FAIR Benchmark Data:

Scientific data are different from other classes of data more typically used in AI research. Currently, algorithms developed for non-scientific data can perform poorly when applied to scientific data. The principal focus of this FOA topic is to close this gap by making scientific data publicly available to the AI community so that algorithms, tools, and techniques work for science. Further research and efforts are needed to expand critical capabilities that make unique scientific data open and FAIR (Findable, Accessible, Interoperable, and Reusable), provide sufficient metadata, provenance, and annotations to the AI community, and articulate challenge areas to focus the community on unique aspects of science data and open areas of research.

FAIR Frameworks for Data and AI Models:

Tools for training AI models on data are readily available and widely used. What is lacking, however, is a theoretical framework for understanding relationships between data and models. For example, given a specific dataset and problem, we lack rigorous methods for identifying the best model, hyperparameters, and training method to use. Given a specific dataset and problem, which additional data would be helpful to include in the training set? What information about a dataset can be deduced from a model trained on the data? What attributes of the data can be reverse engineered from a model? What can we learn about model robustness and transfer learning by looking at relationships between data and models?

The primary focus of this FOA topic is to advance our understanding of the relationships between data and models through the development of FAIR frameworks for relating them. Such frameworks should provide capabilities that advance our understanding of AI, provide new insights to help researchers with applications of AI techniques, and provide an environment where novel approaches to AI can be explored.

Proposed frameworks may focus on specific disciplines or sub-disciplines currently supported by SC’s programs in ASCR, Biological and Environmental Research (BER), Basic Energy Sciences (BES), Fusion Energy Sciences (FES), High Energy Physics (HEP), and Nuclear Physics (NP); or may focus on particular aspects or sub-areas within AI.


Visit our Institutionally Limited Submission webpage for updates and other announcements.