2020-Fall-CSE259-AI Seminar

Graduate Class, CSE, UCSD, 2020

Class Time: Mondays, 12PM to 1PM. Room: https://ucsd.zoom.us/j/99067937524. Piazza: TBD.

Online Lecturing

Due to the COVID-19 pandemic, this course will be delivered over Zoom. All lectures will be recorded.

Overview

This seminar course focuses on discussing state-of-the-art methods in AI-related fields. We will invite researchers to talk about their most recent work.

Lecture Schedule

Recording Note: Please download the recording to watch the full talk; the Dropbox website will only stream the first hour.

| Week | Date | Speaker | Talk Title | Affiliation |
|------|-------|---------------|------------|-------------|
| 1 | 10/05 | Jian Pei | Practicing the Art of Data Science [slides] [recording] | CS@Simon Fraser University |
| 2 | 10/12 | Zhiting Hu | Learning with all experiences: A standardized ML formalism [slides] [recording] | HDSI@UCSD |
| 3 | 10/19 | Rose Yu | Physics-Guided AI for Learning Spatiotemporal Dynamics [slides] [recording] | CSE@UCSD |
| 4 | 10/26 | Tuo Zhao | Fine-Tuning of Pretrained Language Models under Limited or Weak Supervision [slides] [recording] | ISyE&CSE@GaTech |
| 5 | 11/02 | Jiajun Wu | Learning to see the physical world [slides] [recording] | CS@Stanford |
| 6 | 11/09 | Luca Bonomi | Privacy and Machine Learning in Biomedical Applications [slides] [recording] | DBMI@UCSD |
| 7 | 11/16 | Giorgio Quer | Using AI to Enable Digital Medicine [slides] [recording] | Scripps Research |
| 8 | 11/23 | Stephan Mandt | Compressing Variational Bayes [slides] [recording] | CS@UCI |
| 9 | 11/30 | Cong Yu | Better News Understanding via Language Learning | Google Research |
| 10 | 12/07 | Babak Salimi | Causal Inference for Responsible Data Science [slides] [recording] | HDSI@UCSD |

Week 1: Practicing the Art of Data Science

Abstract

Data science embraces interdisciplinary methodologies and tools, such as those in statistics, artificial intelligence/machine learning, data management, algorithms, computation, and economics. Practicing data science to empower innovative applications, however, remains an art due to many factors beyond technology, such as the sophistication of application scenarios, business demands, and the central role of human beings in the loop. In this talk, I share with the audience some experience and lessons learned from my practice of data science research and development. First, I illustrate the core value of building domain-oriented, end-to-end data science solutions that can help people gain new, interpretable domain knowledge. Second, using network embedding as an example, I demonstrate that the nature of data science practice is to connect challenges in vertical applications with general scientific principles and tools. I also discuss some future directions, particularly data strategies for enterprises and organizations concerning data as assets, privacy, fairness, accountability, and transparency.

Speaker Bio

Dr. Jian Pei is a Professor at the School of Computing Science and an associate member of the Department of Statistics and Actuarial Science, Simon Fraser University, Canada. His expertise is in developing effective and efficient data analysis techniques for novel data-intensive applications. He is a research leader in the general areas of data science, big data, data mining, and database systems. He is recognized as a Fellow of the Royal Society of Canada (RSC) (i.e., the national academy of Canada), the Canadian Academy of Engineering (CAE), ACM, and IEEE. He is one of the most cited authors in data mining, database systems, and information retrieval. His research has generated remarkable impact substantially beyond academia. His algorithms have been adopted by industry in production and by popular open-source software suites, and he is responsible for several commercial systems of record-breaking scale. As a renowned professional leader, he has played important roles in many academic organizations and activities. He is the Chair of ACM SIGKDD and was the Editor-in-Chief of IEEE TKDE. He has received many prestigious awards, including the 2017 ACM SIGKDD Innovation Award and the 2015 ACM SIGKDD Service Award. During his last leave of absence from the university, he held executive roles at two Fortune Global 500 companies. He is a mentor of the Creative Destruction Lab (CDL).

Week 2: Learning with all experiences: A standardized ML formalism

Abstract

In handling experiences ranging from data instances, knowledge, and constraints to rewards, adversaries, and lifelong interplay across an ever-growing spectrum of tasks, contemporary ML/AI research has produced a multitude of learning paradigms (e.g., supervised, unsupervised, active, reinforcement, adversarial learning), models, optimization algorithms, etc. While pushing the field forward rapidly, these results also make a comprehensive grasp of existing ML techniques more and more difficult, and make standardized, reusable, repeatable, and reliable practice and further development of ML/AI products quite costly, if possible at all. In this talk, I'll present a standardized formalism of machine learning that provides a unified mathematical framework for learning with all experiences. The formalism offers a vehicle for understanding, unifying, and generalizing current major paradigms of learning algorithms, and guidance for operationalizing ML to create problem solutions in a composable and mechanical manner. I'll show its applications in controllable text generation, learning with structured knowledge, automated data manipulation, stabilizing GAN training, etc.

Speaker Bio

Zhiting Hu is an Assistant Professor in the Halicioglu Data Science Institute at UC San Diego and a visiting research scientist at Amazon Alexa AI. He received his Bachelor's degree in Computer Science from Peking University in 2014, and his Ph.D. in Machine Learning from Carnegie Mellon University. His research interests lie in the broad area of machine learning, natural language processing, ML systems, healthcare, and other application domains. In particular, he is interested in principles, methodologies, and systems for training AI agents with all types of experiences (data, knowledge, rewards, adversaries, lifelong interplay, etc.). His research was recognized with a best demo nomination at ACL 2019 and an outstanding paper award at ACL 2016.

Week 3: Physics-Guided AI for Learning Spatiotemporal Dynamics

Abstract

Applications such as public health, transportation, climate science, and aerospace engineering require learning complex dynamics from large-scale spatiotemporal data. Such data is often non-linear, non-Euclidean, and high-dimensional, and exhibits complicated dependencies. Existing machine learning frameworks are still insufficient for learning spatiotemporal dynamics, as they often fail to exploit the underlying physics principles. I will demonstrate how to inject physical knowledge into AI to deal with these challenges. I will showcase the application of these methods to problems such as forecasting COVID-19, self-driving car trajectory modeling, and predicting turbulence and ocean currents.
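One common pattern for injecting physical knowledge into learning, consistent with the talk's framing, is to add a physics-consistency term to the training loss alongside the ordinary data-fitting term. The sketch below is a deliberately tiny illustration of that shape, using a made-up conservation penalty and toy data; it is not the speaker's specific method, and all names are illustrative.

```python
import numpy as np

def data_loss(pred, obs):
    """Ordinary supervised loss: fit the observed trajectory."""
    return float(np.mean((pred - obs) ** 2))

def conservation_penalty(pred):
    """Physics term: penalize violations of a known conservation law.
    Here the conserved quantity is the spatial sum at each time step,
    so a physically consistent prediction keeps it constant over time."""
    totals = pred.sum(axis=1)  # shape: (time,)
    return float(np.var(totals))

def physics_guided_loss(pred, obs, lam=1.0):
    """Combined objective: data fit plus a physics-consistency term."""
    return data_loss(pred, obs) + lam * conservation_penalty(pred)

# Toy spatiotemporal field with shape (time, space). The conserving
# prediction fits the noisy observations while keeping the spatial
# total exactly constant, so it incurs no physics penalty.
obs = np.array([[1.0, 2.0, 3.0],
                [1.1, 1.9, 3.0],
                [0.9, 2.1, 3.0]])
conserving = np.array([[1.0, 2.0, 3.0],
                       [1.0, 2.0, 3.0],
                       [1.0, 2.0, 3.0]])
loss = physics_guided_loss(conserving, obs, lam=10.0)
```

In practice the physics term would encode the actual governing equations (e.g., a PDE residual) and be minimized jointly with the data loss by gradient descent; the toy version only shows how the two terms combine.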

Speaker Bio

Dr. Rose Yu is an assistant professor at the University of California San Diego, Department of Computer Science and Engineering. She earned her Ph.D. in Computer Science at the University of Southern California in 2017. She was subsequently a Postdoctoral Fellow at the California Institute of Technology. She was an assistant professor at Northeastern University prior to her appointment at UCSD.

Her research focuses on advancing machine learning techniques for large-scale spatiotemporal data analysis, with applications to sustainability, health, and the physical sciences. A particular emphasis of her research is on physics-guided AI, which aims to integrate first principles with data-driven models. Among her awards, she has won a Google Faculty Research Award, an Adobe Data Science Research Award, an NSF CRII Award, and the Best Dissertation Award at USC, and was nominated as one of the "MIT Rising Stars in EECS".

Week 4: Fine-Tuning of Pretrained Language Models under Limited or Weak Supervision

Abstract

Transfer learning has fundamentally changed the landscape of natural language processing (NLP). Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. When we only have limited or weak supervision for the downstream tasks, however, the extremely high complexity of pre-trained models means that aggressive fine-tuning often causes the fine-tuned model to overfit the training data of the downstream tasks and fail to generalize to unseen data.

To address such a concern, we propose a new approach for fine-tuning of pretrained models to attain better generalization performance. Our proposed approach adopts three important ingredients: (1) Smoothness-inducing regularization, which effectively manages the complexity of the massive model; (2) Bregman proximal point optimization, which is an instance of trust-region methods and can prevent aggressive updating; (3) Self-training, which can gradually improve the model fitting and effectively suppress error propagation. Our experiments show that the proposed approach significantly outperforms existing methods in multiple NLP tasks under limited or weak supervision.

Speaker Bio

Tuo Zhao is an Assistant Professor in the School of Industrial & Systems Engineering at Georgia Tech. He received his Ph.D. in Computer Science from Johns Hopkins University. His research mainly focuses on developing methodologies, algorithms, and theories for machine learning, especially deep learning. He is also actively working on neural language models and open-source machine learning software for scientific data analysis. He has received several academic awards, including winning the INDI ADHD-200 global competition, the ASA Best Student Paper Award on statistical computing, the INFORMS Best Paper Award on data mining, and a Google Faculty Research Award.

Week 5: Learning to see the physical world

Abstract

Human intelligence is beyond pattern recognition. From a single image, we’re able to explain what we see, reconstruct the scene in 3D, predict what’s going to happen, and plan our actions accordingly. In this talk, I will present our recent work on physical scene understanding—building versatile, data-efficient, and generalizable machines that learn to see, reason about, and interact with the physical world. The core idea is to exploit the generic, causal structure behind the world, including knowledge from computer graphics, physics, and language, in the form of approximate simulation engines, and to integrate them with deep learning. Here, deep learning plays two major roles: first, it learns to invert simulation engines for efficient inference; second, it learns to augment simulation engines for constructing powerful forward models. I’ll focus on a few topics to demonstrate this idea: building scene representation for both object geometry and physics; learning expressive dynamics models for planning and control; perception and reasoning beyond vision.

Speaker Bio

Jiajun Wu is an Assistant Professor of Computer Science at Stanford University, working on computer vision, machine learning, and computational cognitive science. Before joining Stanford, he was a Visiting Faculty Researcher at Google Research. He received his PhD in Electrical Engineering and Computer Science at Massachusetts Institute of Technology. Wu’s research has been recognized through the ACM Doctoral Dissertation Award Honorable Mention, the MIT George M. Sprowls PhD Thesis Award in Artificial Intelligence and Decision-Making, the IROS Best Paper Award on Cognitive Robotics, and fellowships from Facebook, Nvidia, Samsung, and Adobe.

Week 6: Privacy and Machine Learning in Biomedical Applications

Abstract

Current health information systems enable the collection of a variety of data (e.g., genetic, environmental, and lifestyle factors) that hold great opportunities for advancing medical research and improving patient care (e.g., GWAS). However, there are significant privacy concerns around sharing these data and enabling data-driven applications (e.g., machine learning), as violations can have serious consequences (e.g., discrimination). In this talk, I will provide an overview of the differential privacy model, which allows a trusted data aggregator (e.g., a hospital) to share data with rigorous privacy guarantees. Specifically, I will present how the differential privacy model can enable privacy-protecting machine learning tasks and showcase an application to survival analyses.
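To make the differential privacy model mentioned above concrete, the standard Laplace mechanism answers a counting query by adding noise calibrated to the query's sensitivity and a privacy budget epsilon. The sketch below is a minimal illustration, not the speaker's system; the toy cohort and all parameter values are made up.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value plus Laplace(sensitivity / epsilon) noise,
    which satisfies epsilon-differential privacy for a query with the
    given L1 sensitivity."""
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Counting query over a toy patient cohort: "how many patients have the
# condition?"  Adding or removing one record changes the count by at
# most 1, so the query's sensitivity is 1.
rng = np.random.default_rng(42)
has_condition = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
true_count = int(has_condition.sum())

epsilon = 0.5  # privacy budget: smaller epsilon -> stronger privacy, more noise
noisy_count = laplace_mechanism(true_count, sensitivity=1,
                                epsilon=epsilon, rng=rng)
```

The same calibration idea underlies differentially private machine learning: gradients or sufficient statistics are perturbed instead of raw counts, so the trained model itself carries the privacy guarantee.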

Speaker Bio

Dr. Luca Bonomi is a postdoctoral researcher at the UCSD Health Department of Biomedical Informatics. His research focuses on developing formal privacy-protecting technology for biomedical data. His research is funded by an NIH K99/R00 Award. Dr. Bonomi holds a Ph.D. in Computer Science and Informatics from Emory University, where he has also received a Graduate Student Research Award for his outstanding research contributions.

Week 7: Using AI to Enable Digital Medicine

Abstract

Digitalizing human beings using biosensors to track our complex physiologic systems, processing the large amount of data generated with artificial intelligence (AI), and changing clinical practice toward individualized medicine: these are the goals of digital medicine. In this talk, we discuss how to design AI solutions in the clinical space and what the key aspects are for making a difference. We focus on two critical clinical topics that need AI: 1) atrial fibrillation (AF), and 2) viral illnesses (COVID-19). AF is the most common sustained cardiac arrhythmia, associated with stroke, heart failure, and coronary artery disease. AF detection from single-lead electrocardiography (ECG) recordings is still an open problem, as AF events may be episodic and the signal noisy. We conduct a thoughtful analysis of recent convolutional neural network architectures developed in the computer vision field, redesigned to be suitable for a one-dimensional signal, and we evaluate their performance in the detection of AF using 200 thousand seconds of ECG, highlighting the potential and pitfalls of this technology. We also discuss how to explain (with global and local post hoc explanations) this AI model for AF detection using features that are commonly used by cardiologists.

To tackle the problem of COVID-19, we start with an overview of continuous, passively monitored vital signs from 200,000 individuals wearing a Fitbit wearable device for 2 years. This large study provides the baseline for DETECT, our app-based, nationwide clinical study enrolling individuals who routinely use a smartwatch or other wireless devices to determine if individualized tracking of changes in heart rate, activity and sleep can provide early diagnosis and self-monitoring for COVID-19. We analyze data from more than 36,000 individuals, showing how we can discriminate (on an individual level) between COVID-19 and other types of infections. We discuss how this can impact both the individual and public health, and how the use of AI can be a game changer in this fight against the virus.
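The core adaptation the AF work describes, taking CNN architectures from computer vision and redesigning them for a one-dimensional signal, amounts to replacing 2D convolutions with 1D ones over the ECG time axis. The sketch below is a deliberately tiny, untrained stand-in (synthetic signal, random weights), not the speaker's architecture; every name here is illustrative.

```python
import numpy as np

def conv1d(signal, kernel):
    """Valid-mode 1D convolution (cross-correlation) of a signal with a kernel."""
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])

def relu(x):
    return np.maximum(x, 0.0)

def tiny_ecg_net(signal, kernels, w_out, b_out):
    """One conv layer -> ReLU -> global average pooling -> logistic output.
    Returns a score in (0, 1) for the window containing an AF rhythm."""
    features = np.array([relu(conv1d(signal, k)).mean() for k in kernels])
    logit = features @ w_out + b_out
    return 1.0 / (1.0 + np.exp(-logit))

# Synthetic "ECG" window and untrained toy parameters; in practice the
# kernels and output weights would be learned from labelled recordings.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.1 * rng.standard_normal(200)
kernels = [rng.standard_normal(9) * 0.1 for _ in range(4)]
w_out = rng.standard_normal(4)
b_out = 0.0

prob_af = tiny_ecg_net(signal, kernels, w_out, b_out)
```

Real architectures stack many such 1D convolutional layers (with pooling, residual connections, etc.), but the dimensional redesign from images to time series is exactly this substitution.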

Speaker Bio

Dr. Giorgio Quer received a Ph.D. degree (2011) in Information Engineering from the University of Padova, Italy. In 2007, he was a visiting researcher at the Centre for Wireless Communication at the University of Oulu, Finland. During his Ph.D., he proposed a solution for the distributed compression of wireless sensor network signals, based on the joint exploitation of Compressive Sensing and Principal Component Analysis. From 2010 to 2016, he was at the Qualcomm Institute, University of California San Diego (UCSD), working on cognitive network protocols and implementation. At Scripps Research, he leads the Data Science and Analytics team, which is involved in the All of Us Research Program (NIH), together with several efforts involving big data and AI in digital medicine, including DETECT, which works toward the use of wearables to detect COVID-19. He is a Senior Member of the IEEE and a Distinguished Lecturer for the IEEE Communications Society. His research interests include wireless sensor networks, probabilistic models, deep convolutional networks, wearable sensors, physiological signal processing, and digital medicine.

Week 8: Compressing Variational Bayes

Abstract

Neural image compression algorithms have recently outperformed their classical counterparts in rate-distortion performance and show great potential to also revolutionize video coding. In this talk, I will show how recent innovations from approximate Bayesian inference and generative modeling can lead to dramatic performance improvements in compression. In particular, I will explain how sequential variational autoencoders can be converted into video codecs, how deep latent variable models can be compressed in post-processing with variable bitrates, and how iterative amortized inference can be used to achieve the world record in image compression performance.

Speaker Bio

Stephan Mandt is an Assistant Professor of Computer Science at the University of California, Irvine. From 2016 until 2018, he was a Senior Researcher and Head of the statistical machine learning group at Disney Research, first in Pittsburgh and later in Los Angeles. He held previous postdoctoral positions at Columbia University and Princeton University. Stephan holds a Ph.D. in Theoretical Physics from the University of Cologne. He is a Fellow of the German National Merit Foundation, a Kavli Fellow of the U.S. National Academy of Sciences, and was a visiting researcher at Google Brain. Stephan regularly serves as an Area Chair for NeurIPS, ICML, AAAI, and ICLR, and is a member of the Editorial Board of JMLR. His research is currently supported by NSF, DARPA, Intel, and Qualcomm.

Week 9: Better News Understanding via Language Learning

Abstract

The news ecosystem is going through unprecedented changes and has never been more important in shaping our societies and supporting freedom and democracy. In this talk, I will briefly describe how our research group is working towards better technologies for understanding news and mitigating misinformation to help address some of those challenges.

Speaker Bio

Cong Yu is a research scientist and manager at Google Research in New York. He leads the research group on news and misinformation understanding. The group’s mission is to apply state-of-the-art NLP/ML and structured data technologies to understand newsy and fresh multi-modal information and to mitigate the spread of misinformation. Partnering with journalists and policy advisors, the group is responsible for products such as WebTables, Structured Snippets, Fact Checking at Google, and contributes to a variety of consumer-facing Google products such as GNews and Top Stories in Search.

His personal research interests are structured data exploration and mining, computational journalism, social content analysis and recommendation, human-scalable information management, and applied ML and NLP. He was a conference keynote speaker for VLDB 2019 and twice served as an industrial program co-chair for VLDB (2013 and 2018). Before Google, Cong was a Research Scientist at Yahoo! Research, also in New York. He has a PhD from the University of Michigan, Ann Arbor, advised by Prof. H.V. Jagadish.

Week 10: Causal Inference for Responsible Data Science

Abstract

Scaling and democratizing access to big data promises to provide meaningful, actionable information that supports decision-making. Today, data-driven decisions profoundly affect the course of our lives, such as whether to admit applicants to a particular school, offer them a job, or grant them a mortgage. Unfair, inconsistent, or faulty decision-making raises serious concerns about ethics and responsibility. For example, we may know that our training data is biased, but how do we avoid propagating discrimination when we use this data? How do we avoid incorrect, spurious and non-reproducible findings? How can we curate and expose existing data to make it “safe” for informed decision-making?

In this talk, I describe how we can combine techniques from causal inference and data management to develop systems and algorithms that help answer some of these questions. Many existing popular notions of fairness in ML fail to distinguish between discriminatory, non-discriminatory, and spurious correlations between sensitive attributes and the outcomes of learning algorithms. I present a new notion of fairness that subsumes and improves upon previous definitions and correctly distinguishes between fairness violations and non-violations. Further, I describe an approach to removing discrimination by repairing training data so as to remove the effects of any inappropriate and/or discriminatory causal relationships between a protected attribute and classifier predictions. Finally, I present my most recent work, which uses counterfactual reasoning and provenance to explain black-box decision-making algorithms.

Speaker Bio

Babak Salimi is an assistant professor in HDSI at UC San Diego. Before joining UC San Diego, he was a postdoctoral research associate in the Department of Computer Science and Engineering, University of Washington, where he worked with Prof. Dan Suciu and the database group. He received his Ph.D. from the School of Computer Science at Carleton University, advised by Prof. Leopoldo Bertossi. His research seeks to unify techniques from theoretical data management, causal inference, and machine learning to develop a new generation of decision-support systems that help people with heterogeneous backgrounds interpret data. His ongoing work in causal relational learning aims to develop the conceptual foundations necessary to make causal inference from complex relational data. Further, his research in the area of responsible data science develops the foundations needed to ensure fairness and accountability in the era of data-driven decisions. His research contributions have been recognized with a Postdoc Research Award at the University of Washington, a Best Demonstration Paper Award at VLDB 2018, a Best Paper Award at SIGMOD 2019, and a Research Highlight Award at SIGMOD 2020.