Zachary Lipton - Carnegie Mellon University. Pittsburgh, PA, US

Zachary Lipton

Associate Professor | Carnegie Mellon University

Pittsburgh, PA, UNITED STATES

Zachary Lipton's research spans machine learning methods and their applications in healthcare and natural language processing.

Biography

Zachary Lipton is the Chief Technology Officer and Chief Scientist at Abridge, where he oversees the builder organization responsible for all product development and AI research. He is also the Raj Reddy Associate Professor of Machine Learning at Carnegie Mellon University, where he directs the Approximately Correct Machine Intelligence (ACMI) lab. The lab's research focuses on the theoretical and engineering foundations of robust and adaptive machine learning algorithms, applications to prediction and decision-making problems in clinical medicine, natural language processing, and the impact of machine learning systems on society. He is the founder of the Approximately Correct blog (approximatelycorrect.com) and a co-author of Dive Into Deep Learning, an interactive open-source book drafted entirely through Jupyter notebooks that has reached millions of readers.

Areas of Expertise (4)

Machine Learning

Machine Intelligence

Natural Language Processing (NLP)

Deep Learning

Media Appearances (4)

OpenAI shakeup has rocked Silicon Valley, leaving some techies concerned about future of AI

CNBC (online)

2023-11-20

“I imagine Microsoft might ask for a board seat next time they decide to plow $15 billion into a startup,” said Zachary Lipton, a Carnegie Mellon University professor of machine learning and operations research.


What’s the Future for A.I.?

The New York Times (online)

2023-04-04

“This will affect tasks that are more repetitive, more formulaic, more generic,” said Zachary Lipton, a professor at Carnegie Mellon who specializes in artificial intelligence and its impact on society.


What’s wrong with “explainable A.I.”

Fortune (online)

2022-03-22

“Everyone who is serious in the field knows that most of today’s explainable A.I. is nonsense,” Zachary Lipton, a computer science professor at Carnegie Mellon University, recently told me. Lipton says he has had many radiologists reach out to him for help after their hospitals deployed a supposedly explainable A.I. system for interpreting medical imagery whose explanations don’t make sense—or, at the very least, are irrelevant to what a radiologist really wants to know about a medical image.


Is AI overhyped? Researchers weigh in on technology's promise and problems

CBC News (online)

2020-02-21

But Zachary Lipton, an assistant professor at Carnegie Mellon University's machine learning department and school of business, worries that machine learning's success at making predictions "can blind people to the fact that not every problem is a prediction problem."



Education (3)

UC San Diego: Ph.D., Computer Science 2017

UC San Diego: M.S., Computer Science 2015

Columbia University: B.A., Mathematics - Economics 2007

Articles (5)

Complementary benefits of contrastive learning and self-training under distribution shift

Advances in Neural Information Processing Systems

2024

Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning). However, despite the popularity and compatibility of these techniques, their efficacy in combination remains surprisingly unexplored. In this paper, we first undertake a systematic empirical investigation of this combination, finding (i) that in domain adaptation settings, self-training and contrastive learning offer significant complementary gains; and (ii) that in semi-supervised learning settings, surprisingly, the benefits are not synergistic. Across eight distribution shift datasets (e.g., BREEDs, WILDS), we demonstrate that the combined method obtains 3-8% higher accuracy than either approach independently.


Online label shift: Optimal dynamic regret meets practical algorithms

Advances in Neural Information Processing Systems

2024

This paper focuses on supervised and unsupervised online label shift, where the class marginals vary but the class-conditionals remain invariant. In the unsupervised setting, our goal is to adapt a learner, trained on some offline labeled data, to changing label distributions given unlabeled online data. In the supervised setting, we must both learn a classifier and adapt to the dynamically evolving class marginals given only labeled online data. We develop novel algorithms that reduce the adaptation problem to online regression and guarantee optimal dynamic regret without any prior knowledge of the extent of drift in the label distribution. Our solution is based on bootstrapping the estimates of online regression oracles that track the drifting proportions. Experiments across numerous simulated and real-world online label shift scenarios demonstrate the superior performance of our proposed approaches, often achieving 1-3% improvement in accuracy while being sample and computationally efficient.


Resolving the Human-subjects Status of Machine Learning's Crowdworkers: What ethical framework should govern the interaction of ML researchers and crowdworkers?

Queue

2023

In recent years, machine learning (ML) has relied heavily on crowdworkers both for building datasets and for addressing research questions requiring human interaction or judgment. The diversity of both the tasks performed and the uses of the resulting data makes it difficult to determine when crowdworkers are best thought of as workers versus human subjects. These difficulties are compounded by conflicting policies, with some institutions and researchers regarding all ML crowdworkers as human subjects and others holding that they rarely constitute human subjects. Notably, few ML papers involving crowdwork mention IRB oversight, raising the prospect of non-compliance with ethical and regulatory requirements.


Deep equilibrium based neural operators for steady-state PDEs

Advances in Neural Information Processing Systems

2023

Data-driven machine learning approaches are being increasingly used to solve partial differential equations (PDEs). They have shown particularly striking successes when training an operator, which takes as input a PDE in some family and outputs its solution. However, the architectural design space, especially given structural knowledge of the PDE family of interest, is still poorly understood. We seek to remedy this gap by studying the benefits of weight-tied neural network architectures for steady-state PDEs. To achieve this, we first demonstrate that the solution of most steady-state PDEs can be expressed as a fixed point of a non-linear operator. Motivated by this observation, we propose FNO-DEQ, a deep equilibrium variant of the FNO architecture that directly solves for the solution of a steady-state PDE as the infinite-depth fixed point of an implicit operator layer using a black-box root solver, and differentiates analytically through this fixed point, resulting in O(1) training memory.


Identifying Game-Based Digital Biomarkers of Cognitive Risk for Adolescent Substance Misuse: Protocol for a Proof-of-Concept Study

JMIR Research Protocols

2023

Background: Adolescents at risk for substance misuse are rarely identified early due to existing barriers to screening that include the lack of time and privacy in clinic settings. Games can be used for screening and thus mitigate these barriers. Performance in a game is influenced by cognitive processes such as working memory and inhibitory control. Deficits in these cognitive processes can increase the risk of substance use. Further, substance misuse affects these cognitive processes and may influence game performance, captured by in-game metrics such as reaction time or time for task completion. Digital biomarkers are measures generated from digital tools that explain underlying health processes and can be used to predict, identify, and monitor health outcomes. As such, in-game performance metrics may represent digital biomarkers of cognitive processes that can offer an objective method for assessing underlying risk for substance misuse.
