Research in the AI Era

We are entering an era of reasoning abundance enabled by AI, much as the internet ushered in an era of information abundance, opening new opportunities and challenges for research across all areas of computer science. AI is rapidly reshaping nearly every stage of the research lifecycle, from identifying and framing research topics and conducting literature reviews to theoretical analysis, experimental design and execution, interpreting results, writing and reviewing papers, and supporting reproducibility. As these tools become more tightly integrated, sometimes augmenting and sometimes automating core tasks, our notions of contribution, rigor, and expertise are evolving.

This seminar series brings together speakers from across all areas of computer science, spanning theory, systems, AI, and interdisciplinary work, to discuss these shifts. Each talk will offer a perspective on how AI is changing research methodologies, challenging traditional assumptions about originality and rigor, and enabling new forms of collaboration. Join us to explore what research may look like in the AI era and to discuss how we can harness these tools while preserving the core values of scientific inquiry.

Format & timing

Schedule

What Claude and I Did This Winter!

Emery Berger (UMass Amherst) – Friday, February 6 @ 1 pm [in-person]

Category: Systems

Location: CSL E144

Description

An informal talk about how everything has suddenly changed!

Abstract

Coding agents have arrived, and they represent a sea change. I spent a few weeks working with one of them (Claude Code) and was blown away. To convey just how powerful these agents are, I will talk about just a sampling of the wide range of projects Claude and I were able to get done in an absurdly small amount of time.


Glia: A Human-Inspired AI for Automated Systems Design and Optimization

Hari Balakrishnan (MIT) – Friday, February 13 @ 1 pm [in-person] (recording)

Category: Systems

Location: CSL E144

Description

This talk will focus on how AI can be used to autonomously design and optimize complex systems. This directly ties into the seminar’s theme by showcasing how AI is automating the core tasks of experimental design and execution, bringing an era of reasoning abundance to the systems research lifecycle.

Abstract

Can an AI autonomously design mechanisms for computer systems on par with the creativity and reasoning of human experts? We present Glia, an AI architecture for networked systems design that uses large language models (LLMs) in a human-inspired, multi-agent workflow. Each agent specializes in reasoning, experimentation, and analysis, collaborating through an evaluation framework that grounds abstract reasoning in empirical feedback. Unlike prior ML-for-systems methods that optimize black-box policies, Glia generates interpretable designs and exposes its reasoning process. When applied to a distributed GPU cluster for LLM inference, it produces new algorithms for request routing, scheduling, and auto-scaling that perform at human-expert levels in significantly less time, while yielding novel insights into workload behavior. Our results suggest that by combining reasoning LLMs with structured experimentation, an AI can produce creative and understandable designs for complex systems problems.

Bio

Hari Balakrishnan is the Fujitsu Professor of Computer Science at MIT. His research is in networked computer systems, with current interests in networking, sensing, and perception for sensor-equipped mobile devices connected to cloud or edge services. He has made many contributions to mobile and sensor computing, overlay and peer-to-peer networks, congestion control, Internet routing, and data management systems.

In 2010, based on the CarTel project, Balakrishnan co-founded Cambridge Mobile Telematics (CMT). CMT’s mission is to make the world’s roads and drivers safer. Using mobile sensing and IoT, signal processing, machine learning, and behavioral science, CMT’s platform measures driving behavior to improve it and reduce risk, provides crash alerts and roadside assistance, and creates a smooth connected claims process. Today, CMT is the world’s leading telematics and analytics provider, serving many millions of users in 25 countries by partnering with insurers (including powering consumer telematics programs at 21 of the top 25 US insurers), car makers, commercial mobility providers, and the public sector.

Balakrishnan received his PhD in 1998 from the EECS Department at UC Berkeley, which named him a Distinguished Alumnus in 2021, and a BTech in Computer Science in 1993 from IIT Madras, which named him a Distinguished Alumnus in 2013. He was elected to the National Academy of Engineering (2015) and to the American Academy of Arts and Sciences (2017). His honors include the Marconi Prize (2023), the ACM SIGCOMM Award for lifetime contributions to communication networks (2021), the IEEE Kobayashi Computers and Communications Award (2021), the Ernst and Young Entrepreneur of the Year Award for the New England region (2021), the Infosys Prize for Engineering and Computer Science (2020), Fellow of the IEEE (2020), Fellow of the ACM (2008), Sloan Fellow (2002), and the ACM doctoral dissertation award for Computer Science (1998). He has received several best-paper awards including six test-of-time awards for papers with long-term impact and the IEEE Bennett paper prize (2004). At MIT, he has received several honors including the Harold E. Edgerton faculty achievement award for research, teaching, and service (2003), the HKN best instructor award (2018), the Jamieson teaching award (2012), the Junior Bose teaching award (2002), and the Spira teaching award (2001). He has graduated 26 PhD students and 10 postdocs, who have made their mark in research and industry at leading universities and companies.

Balakrishnan was an advisor to Meraki from its inception in 2006 to its acquisition by Cisco in 2012. In 2003, Balakrishnan co-founded StreamBase Systems (acquired by TIBCO), the first high-performance commercial stream processing (aka complex event processing) engine. Between 2000 and 2003, he helped devise the key network QoS algorithms for Sandburst (acquired by Broadcom).


Towards A Learning-Directed Operating System

Aditya Akella (UT Austin) – Friday, February 27 @ 1 pm [in-person] (recording)

Category: Systems

Location: LGRC A112

Description

This talk describes the main pillars of a Learning-Directed Operating System that treats policy design as a data-driven, system-wide optimization problem.

Abstract

Modern applications run on increasingly heterogeneous and dynamic platforms, yet today’s operating systems (OSes) still rely on rigid, locally optimized policies that are manually designed, weakly coordinated, and slow to adapt. As a result, even when resources are plentiful, performance and tail latency are often dominated by poor policy choices rather than fundamental hardware limits.

To address this, we are building LDOS, a Learning-Directed Operating System that treats policy design as a data-driven, system-wide optimization problem. In contrast to Linux, where mechanisms and policies are tightly entangled and global system state is difficult to observe or act upon, LDOS is designed from the ground up to expose rich observability, support fast feedback loops, and enable coordinated and trustworthy machine-learned control.

This talk describes the main pillars of the LDOS approach. I will first describe UNUM, which constructs system-wide state embeddings to enable higher-quality and coordinated policy decisions. I will then introduce Darwin, a family of techniques that make ML-driven policies practical by balancing instance-optimal decisions with generalization and runtime overhead. Next, I will present C3, a framework that enforces system-wide and tail-latency guarantees despite learned, adaptive control. The talk will conclude with the core design principles behind LDOS’s clean-slate prototype and an overview of its current status.

Bio

Aditya Akella is a Professor and Regents Chair in Computer Sciences at UT Austin and a Research Scientist at Meta. His research focuses on computer systems and their intersection with machine learning and formal methods. He leads the NSF CISE Expedition on Learning-Directed Operating Systems and serves as Founding Director of the InfraAI @ UT Center. Aditya’s work has influenced the infrastructure of large-scale online services and has been recognized with honors including ACM Fellow, SIGCOMM and IMC Test of Time Awards, the SIGCOMM Rising Star Award, the IRTF Applied Networking Research Prize, the NSF CAREER Award, and multiple best paper awards.


Braintrust: social knowledgebases as scientific fiduciaries

Evan Coleman (MIT) – Friday, March 6 @ 1 pm [in-person] (recording)

Category: Interdisciplinary

Location: LGRC A112

Description

This talk will demonstrate how AI-accelerated research can unlock progress on interdisciplinary, long-horizon problems like climate change.

Abstract

Climate change presents a rare existential challenge. Nearly every economic sector contributes to it, since greenhouse gases are common byproducts of thermodynamically and economically favorable processes. At the same time, it is a planetary-scale problem with a single shared benchmark: atmospheric greenhouse gas concentration (CO2e). These conditions make climate change mitigation a natural test case for how experts coordinate to scale critical technologies under a unified, decades-long objective. Addressing and adapting to a changing climate requires navigating complex technology pathways that span many domains and do not necessarily admit end-to-end automation or simulation-driven discovery. These pathways resemble a “tech tree”: a structured narrative of progress that links partially developed ideas across disciplines to downstream sources of value. In this talk, I will present Braintrust, an early-stage open-source effort using language models to build navigable tech trees for science. Our goal is to convene researchers, research administrators, and financiers around shared representations of scientific progress. Basic research struggles to support such coordination because expertise is fragmented, first-of-a-kind efforts are risky, and incentives are weakly coupled to downstream economic value. Braintrust models tech trees as interactive structures that evolve with new evidence and human input. This approach is orthogonal to using LLMs for scientific execution: we use semantics to surface cross-domain connections, situate speculative ideas relative to established work, and represent uncertainty at the frontier where coordination and investment decisions are made. I will provide real-world examples within climate technology, and conclude by framing social knowledge bases as fiduciary tools that can support the allocation of resources in high-risk, high-reward scientific programs.

Bio

Evan Coleman is a Research Scientist in the MIT Climate Project working on applications of artificial intelligence and machine learning to climate change mitigation. He has a PhD in Theoretical High-Energy Physics from Stanford University (’22). Since joining MIT, his mandate has been to identify and address technical bottlenecks that inhibit the scaling and proliferation of climate technologies.

Evan’s research focuses on building data-driven algorithms, hardware, and scientific tooling that enable large-scale environmental monitoring, portable material characterization, and coordination of intelligent systems across the many disciplines relevant to climate change. His recent work includes applications to scalable prospecting of critical minerals and in situ elemental analysis. He is also the creator of Braintrust, an open-source project exploring the use of social knowledgebases as LLM-guided fiduciary tools for high-risk, long-horizon scientific programs.


Scientific production in the era of large language models

Yian Yin (Cornell) – Friday, March 13 @ 1 pm [TBD] (recording)

Category: AI/Interdisciplinary

Location: CSL E144

Description

This topic explores scientific production in the era of LLMs, examining the practical and conceptual shifts caused by powerful AI tools. This directly addresses the seminar’s core question of how AI is reshaping research methodologies, forcing a re-evaluation of traditional assumptions about originality, rigor, and contribution in the face of machine-assisted work. See this X thread.

Abstract

The rapid adoption of AI across disciplines is reshaping the landscape of scientific production. While enthusiasm and concern about generative AI in research are surging, empirical evidence remains scattered, and systematic understanding of large language models’ (LLMs) impact is still limited. In this talk, I draw on several large-scale analyses from my group that examine how LLM use changes the productivity of individual scientists, reshapes attention to prior works, introduces hallucinated content into the scientific record, and creates new challenges for peer review. Taken together, these results provide among the first macro-level evidence on GenAI’s impact on science, highlighting the need for institutions, journals, funding agencies, and the public to rethink how scientific work should be evaluated in this new era.

Bio

Yian Yin is an assistant professor of information science at Cornell University. As a computational social scientist, he applies and develops novel computational tools to understand how individual, social, and environmental processes independently and jointly promote (or inhibit) scientific progress and innovation. His research has been published in general-audience venues such as Science, Nature, and Nature Human Behaviour and featured in media outlets such as Forbes, Scientific American, The Atlantic, Harvard Business Review, and MIT Technology Review. In 2023, he was named to the Forbes 30 Under 30: Science list. Yian received his Ph.D. in industrial engineering and management science at Northwestern University, with research affiliations at the Northwestern Institute on Complex Systems and the Kellogg Center for Science of Science and Innovation.


Recent mathematical advances with large language models

Mark Sellke (Harvard/OpenAI) – Friday, April 3 @ 1 pm [in-person] (recording)

Category: Theory

Location: CSL E144

Description

Mark will speak about recent mathematical advances made by large language models. We will look at some specific new theorems where AI made a primary contribution, and discuss the directions of future progress.

Bio

Mark Sellke is an Assistant Professor of Statistics at Harvard working in high-dimensional probability, optimization, and machine learning, and a researcher at OpenAI. His work has been recognized by best paper awards at SODA 2020 and NeurIPS 2021, and the Sloan Research Fellowship, Bernoulli New Researcher Award, and Rollo Davidson Prize.


Accelerating Scientific Research with Gemini: Case Studies and Common Techniques

David Woodruff (CMU) – Wednesday, April 15 @ 12 pm [in-person]

Category: Theory/Automated Review

Location: LGRC 112

Description

We present a collection of case studies demonstrating how researchers have successfully collaborated with advanced AI models, specifically Google’s Gemini-based models (in particular Gemini Deep Think and its advanced variants), to solve open problems, refute conjectures, and generate new proofs across diverse areas of theoretical computer science, as well as other fields such as economics, optimization, and physics. This is based on a corresponding paper here, which includes around 18 testimonials describing open problems on which we made progress.

I will also discuss using such models for a STOC pre-submission feedback system, see: Gemini provides automated feedback for theoretical computer scientists at STOC 2026

Bio

David Woodruff is a professor at Carnegie Mellon University in the Computer Science Department. Before that he was a research scientist at IBM Almaden for ten years. He received his PhD from MIT in 2007. His research interests include data stream algorithms, distributed algorithms, machine learning, numerical linear algebra, optimization, sketching, and sparse recovery. He is the recipient of the 2020 Simons Investigator Award, the 2014 Presburger Award, Best Paper Awards at STOC 2013, PODS 2010, and PODS 2020, and a STOC 2023 Test of Time Award. At IBM he was a member of the Academy of Technology and a Master Inventor.


Calibration in the Age of AI: From Prediction to Decision Making to AI Assisted Research

Aaron Roth (UPenn) – Friday, April 17 @ 1 pm [remote]

Category: Theory

Location: LGRC A112

Description

Aaron Roth, a theoretical AI researcher, will explore the fundamental changes and challenges that AI introduces to theoretical analysis and the establishment of scientific rigor. This perspective is vital to the seminar as it examines how AI’s influence is evolving our notions of contribution and expertise across all areas of computer science research. See this X thread.

Abstract

Calibration serves as a trustworthy interface between prediction and decision making, and has (in my opinion) been getting only more important and interesting as a research topic as AI agents become commonplace. But AI tools are also going to revolutionize how mathematical research is conducted. In this talk we’ll walk through two lower bounds we have proven, establishing the optimal sample complexity for multicalibration in both the sequential and batch settings. Both of these papers were written with AI assistance, and at the end of the talk we’ll describe the process and the tools we used. We will end with Q&A and unstructured musings about what AI is already very good at and where its weak spots lie.
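As background for readers new to the talk’s central notion, the following is a minimal, illustrative sketch of binned calibration error, the basic quantity that multicalibration (the talk’s subject) strengthens to hold simultaneously across many overlapping subgroups. The binning scheme and error metric here are common textbook choices, not necessarily the definitions used in the talk or papers.

```python
# Illustrative sketch: mass-weighted binned calibration error.
# A predictor is well calibrated if, among cases where it predicts
# probability ~p, the outcome actually occurs a ~p fraction of the time.
import random

def binned_calibration_error(probs, outcomes, n_bins=10):
    """Average |mean predicted prob - empirical frequency| per bin,
    weighted by the fraction of samples falling in that bin."""
    n = len(probs)
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into last bin
        bins[i].append((p, y))
    err = 0.0
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            freq = sum(y for _, y in b) / len(b)
            err += (len(b) / n) * abs(mean_p - freq)
    return err

# A perfectly calibrated predictor: each outcome occurs with exactly the
# predicted probability, so the error shrinks toward 0 as samples grow.
rng = random.Random(0)
probs = [rng.random() for _ in range(50_000)]
outcomes = [1 if rng.random() < p else 0 for p in probs]
print(binned_calibration_error(probs, outcomes))
```

Multicalibration asks for this kind of guarantee not just overall but on every subgroup in a rich collection at once; the talk’s lower bounds concern how many samples that fundamentally requires.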



The Impact Market to Save Conference Peer Review

Karu Sankaralingam (UW Madison / NVIDIA) – Friday, May 1 @ 1 pm [TBD]

Category: Systems/Interdisciplinary/Review Process

Location: CSL E144

Description

This talk introduces the Impact Market, a market-based mechanism for conference peer review that decouples dissemination from credentialing. It proposes a multi-stage process that combines broad access to publication with longer-horizon, incentive-aligned evaluation to produce a more stable and interpretable signal of impact. This directly ties to the seminar’s theme by examining how AI-era scale and automation are pressuring us to redesign research incentives, review processes, and what “quality” and “contribution” should mean in the first place.