NSF C-Accel Award OIA-2040727 (Project Page)

NSF Convergence Accelerator Track D:

Towards Intelligent Sharing and Search for AI Models and Datasets

Jingbo Shang, Rajesh Gupta, Lucila Ohno-Machado, Arun Kumar, Giorgio Quer
University of California San Diego & Scripps Research

Abstract

A major goal of AI-driven applications is to discover the underlying patterns in domain-specific datasets. Designing, or even selecting, suitable AI models for this purpose typically requires extensive field experience and interdisciplinary knowledge. For instance, AI modeling for COVID-19 patient imaging and social distancing datasets requires an understanding not only of the epidemiological processes but also of the bioinformatics that informs mutation rates and their effects on models, coupled with socio-economic models that accurately capture living and working conditions. Such a model selection process is far beyond the capabilities of the search services available on existing platforms (e.g., Google Dataset Search, IEEE DataPort, and GitHub).

We envision an open-source, privacy-preserving intelligent system for searching and navigating large-scale collections of AI models and datasets for scientific and other applications. The envisioned system would transform AI models and datasets into ‘computational resources’ so that model-dataset pairs can be searched and matched easily based on their semantics. It will serve as a sharing portal for models and datasets, matched via contextual information captured as ‘metadata’; this relies on innovations in metadata methods and tools in the application context. More importantly, the confidential and private information embedded in the models and datasets will be protected by novel, rigorous privacy techniques. In this way, clinicians could upload a patient imaging dataset and issue a query such as “Coronavirus hazard assessment from chest CT”, and the system would return suitable AI models and related datasets without risking leakage of patient information.

Team

Publications & Pre-Prints

Contact

jshang [at] ucsd [dot] edu

Acknowledgment

This project is supported in part by the NSF Convergence Accelerator under award OIA-2040727.