Research in the AI Era

We are entering an era of reasoning abundance enabled by AI, much as the internet ushered in an era of information abundance, opening new opportunities and challenges for research across all areas of computer science. AI is rapidly reshaping nearly every stage of the research lifecycle, from identifying and framing research topics and conducting literature reviews to theoretical analysis, experimental design and execution, interpreting results, writing and reviewing papers, and supporting reproducibility. As these tools become more tightly integrated, sometimes augmenting and sometimes automating core tasks, our notions of contribution, rigor, and expertise are evolving.

This seminar series brings together speakers from across all areas of computer science, spanning theory, systems, AI, and interdisciplinary work, to talk about these shifts. Each talk will offer a perspective on how AI is changing research methodologies, challenging traditional assumptions about originality and rigor, and enabling new forms of collaboration. Join us to explore what research may look like in the AI era and to discuss how we can harness these tools while preserving the core values of scientific inquiry.

Format & timing

Schedule

What Claude and I Did This Winter!

Emery Berger (UMass Amherst) – Friday, February 6 @ 1 pm [in-person]

Category: Systems

Location: CSL E144

Description

An informal talk about how everything has suddenly changed!

Abstract

Coding agents have arrived, and they represent a sea change. I spent a few weeks working with one of them (Claude Code) and was blown away. To convey just how powerful these agents are, I will talk about just a sampling of the wide range of projects Claude and I were able to get done in an absurdly small amount of time.


Glia: A Human-Inspired AI for Automated Systems Design and Optimization

Hari Balakrishnan (MIT) – Friday, February 13 @ 1 pm [in-person]

Category: Systems

Location: CSL E144

Description

This talk will focus on how AI can be used to autonomously design and optimize complex systems. This directly ties into the seminar’s theme by showcasing how AI is automating the core tasks of experimental design and execution, bringing an era of reasoning abundance to the systems research lifecycle.

Abstract

Can an AI autonomously design mechanisms for computer systems on par with the creativity and reasoning of human experts? We present Glia, an AI architecture for networked systems design that uses large language models (LLMs) in a human-inspired, multi-agent workflow. Each agent specializes in reasoning, experimentation, and analysis, collaborating through an evaluation framework that grounds abstract reasoning in empirical feedback. Unlike prior ML-for-systems methods that optimize black-box policies, Glia generates interpretable designs and exposes its reasoning process. When applied to a distributed GPU cluster for LLM inference, it produces new algorithms for request routing, scheduling, and auto-scaling that perform at human-expert levels in significantly less time, while yielding novel insights into workload behavior. Our results suggest that by combining reasoning LLMs with structured experimentation, an AI can produce creative and understandable designs for complex systems problems.

Bio

Hari Balakrishnan is the Fujitsu Professor of Computer Science at MIT. His research is in networked computer systems, with current interests in networking, sensing, and perception for sensor-equipped mobile devices connected to cloud or edge services. He has made many contributions to mobile and sensor computing, overlay and peer-to-peer networks, congestion control, Internet routing, and data management systems.

In 2010, based on the CarTel project, Balakrishnan co-founded Cambridge Mobile Telematics (CMT). CMT’s mission is to make the world’s roads and drivers safer. Using mobile sensing and IoT, signal processing, machine learning, and behavioral science, CMT’s platform measures driving behavior to improve it and reduce risk, provides crash alerts and roadside assistance, and creates a smooth connected claims process. Today, CMT is the world’s leading telematics and analytics provider, serving many millions of users in 25 countries by partnering with insurers (including powering consumer telematics programs at 21 of the top 25 US insurers), car makers, commercial mobility providers, and the public sector.

Balakrishnan received his PhD in 1998 from the EECS Department at UC Berkeley, which named him a Distinguished Alumnus in 2021, and a BTech in Computer Science in 1993 from IIT Madras, which named him a Distinguished Alumnus in 2013. He was elected to the National Academy of Engineering (2015) and to the American Academy of Arts and Sciences (2017). His honors include the Marconi Prize (2023), the ACM SIGCOMM Award for lifetime contributions to communication networks (2021), the IEEE Kobayashi Computers and Communications Award (2021), the Ernst and Young Entrepreneur of the Year Award for the New England region (2021), the Infosys Prize for Engineering and Computer Science (2020), Fellow of the IEEE (2020), Fellow of the ACM (2008), Sloan Fellow (2002), and the ACM doctoral dissertation award for Computer Science (1998). He has received several best-paper awards including six test-of-time awards for papers with long-term impact and the IEEE Bennett paper prize (2004). At MIT, he has received several honors including the Harold E. Edgerton faculty achievement award for research, teaching, and service (2003), the HKN best instructor award (2018), the Jamieson teaching award (2012), the Junior Bose teaching award (2002), and the Spira teaching award (2001). He has graduated 26 PhD students and 10 postdocs, who have made their mark in research and industry at leading universities and companies.

Balakrishnan was an advisor to Meraki from its inception in 2006 to its acquisition by Cisco in 2012. In 2003, Balakrishnan co-founded StreamBase Systems (acquired by TIBCO), the first high-performance commercial stream processing (aka complex event processing) engine. Between 2000 and 2003, he helped devise the key network QoS algorithms for Sandburst (acquired by Broadcom).


Towards A Learning-Directed Operating System

Aditya Akella (UT Austin) – Friday, February 27 @ 1 pm [remote]

Category: Systems

Location: LGRC A112

Description

This work describes the main pillars of a Learning-Directed Operating System that treats policy design as a data-driven, system-wide optimization problem.

Abstract

Modern applications run on increasingly heterogeneous and dynamic platforms, yet today’s operating systems (OSes) still rely on rigid, locally optimized policies that are manually designed, weakly coordinated, and slow to adapt. As a result, even when resources are plentiful, performance and tail latency are often dominated by poor policy choices rather than fundamental hardware limits.

To address this, we are building LDOS, a Learning-Directed Operating System that treats policy design as a data-driven, system-wide optimization problem. In contrast to Linux, where mechanisms and policies are tightly entangled and global system state is difficult to observe or act upon, LDOS is designed from the ground up to expose rich observability, support fast feedback loops, and enable coordinated and trustworthy machine-learned control.

This talk describes the main pillars of the LDOS approach. I will first describe UNUM, which constructs system-wide state embeddings to enable higher-quality and coordinated policy decisions. I will then introduce Darwin, a family of techniques that make ML-driven policies practical by balancing instance-optimal decisions with generalization and runtime overhead. Next, I will present C3, a framework that enforces system-wide and tail-latency guarantees despite learned, adaptive control. The talk will conclude with the core design principles behind LDOS’s clean-slate prototype and an overview of its current status.

Bio

Aditya Akella is a Professor and Regents Chair in Computer Sciences at UT Austin and a Research Scientist at Meta. His research focuses on computer systems and their intersection with machine learning and formal methods. He leads the NSF CISE Expedition on Learning-Directed Operating Systems and serves as Founding Director of the InfraAI @ UT Center. Aditya’s work has influenced the infrastructure of large-scale online services and has been recognized with honors including ACM Fellow, SIGCOMM and IMC Test of Time Awards, the SIGCOMM Rising Star Award, the IRTF Applied Networking Research Prize, the NSF CAREER Award, and multiple best paper awards.


Braintrust: social knowledgebases as scientific fiduciaries

Evan Coleman (MIT) – Friday, March 6 @ 1 pm [in-person]

Category: Interdisciplinary

Location: LGRC A112

Description

This talk will demonstrate how AI-accelerated research can unlock progress on interdisciplinary, long-horizon problems like climate change.

Abstract

Climate change presents a rare existential challenge. Nearly every economic sector contributes to it, since greenhouse gases are common byproducts of thermodynamically and economically favorable processes. At the same time, it is a planetary-scale problem with a single shared benchmark: atmospheric greenhouse gas concentration (CO2e). These conditions make climate change mitigation a natural test case for how experts coordinate to scale critical technologies under a unified, decades-long objective. Addressing and adapting to a changing climate requires navigating complex technology pathways that span many domains and do not necessarily admit end-to-end automation or simulation-driven discovery. These pathways resemble a “tech tree”: a structured narrative of progress that links partially developed ideas across disciplines to downstream sources of value. In this talk, I will present Braintrust, an early-stage open-source effort using language models to build navigable tech trees for science. Our goal is to convene researchers, research administrators, and financiers around shared representations of scientific progress. Basic research struggles to support such coordination because expertise is fragmented, first-of-a-kind efforts are risky, and incentives are weakly coupled to downstream economic value. Braintrust models tech trees as interactive structures that evolve with new evidence and human input. This approach is orthogonal to using LLMs for scientific execution: we use semantics to surface cross-domain connections, situate speculative ideas relative to established work, and represent uncertainty at the frontier where coordination and investment decisions are made. I will provide real-world examples within climate technology, and conclude by framing social knowledge bases as fiduciary tools that can support the allocation of resources in high-risk, high-reward scientific programs.

Bio

Evan Coleman is a Research Scientist in the MIT Climate Project working on applications of artificial intelligence and machine learning to climate change mitigation. He has a PhD in Theoretical High-Energy Physics from Stanford University (’22). Since joining MIT, his mandate has been to identify and address technical bottlenecks that inhibit the scaling and proliferation of climate technologies.

Evan’s research focuses on building data-driven algorithms, hardware, and scientific tooling that enable large-scale environmental monitoring, portable material characterization, and coordination of intelligent systems across the many disciplines relevant to climate change. His recent work includes applications to scalable prospecting of critical minerals and in situ elemental analysis. He is also the creator of Braintrust, an open-source project exploring the use of social knowledge bases as LLM-guided fiduciary tools for high-risk, long-horizon scientific programs.


Scientific production in the era of large language models

Yian Yin (Cornell) – Friday, March 13 @ 1 pm [TBD]

Category: AI/Interdisciplinary

Location: CSL E144

Description

This talk explores scientific production in the era of LLMs, examining the practical and conceptual shifts caused by powerful AI tools. It directly addresses the seminar’s core question of how AI is reshaping research methodologies, forcing a re-evaluation of traditional assumptions about originality, rigor, and contribution in machine-assisted work.

Abstract

Despite growing excitement (and concern) about the fast adoption of generative artificial intelligence (Gen AI) across all academic disciplines, empirical evidence remains fragmented, and systematic understanding of the impact of large language models (LLMs) across scientific domains is limited. We analyzed large-scale data from three major preprint repositories to show that the use of LLMs accelerates manuscript output, reduces barriers for non-native English speakers, and diversifies the discovery of prior literature. However, traditional signals of scientific quality, such as language complexity, are becoming unreliable indicators of merit just as we are experiencing an upswing in the quantity of scientific work. As AI systems advance, they will challenge our fundamental assumptions about research quality, scholarly communication, and the nature of intellectual labor. Science policymakers must consider how to evolve our scientific institutions to accommodate the rapidly changing scientific production process.


TBD

Aaron Roth (UPenn) – Friday, March 27 @ 1 pm [remote]

Category: Theory

Location: LGRC A112

Description

As a theoretical AI researcher, Aaron Roth is expected to explore the fundamental changes and challenges that AI introduces to theoretical analysis and the establishment of scientific rigor. This perspective is vital to the seminar as it examines how AI’s influence is evolving our notions of contribution and expertise across all areas of computer science research.

Abstract

TBD


On Learning-Curve Monotonicity for Maximum Likelihood Estimators

Mark Sellke (Harvard/OpenAI) – Friday, April 3 @ 1 pm [in-person]

Category: Theory

Location: CSL E144

Description

This talk is based on the paper On Learning-Curve Monotonicity for Maximum Likelihood Estimators, in which all results were derived by AI models (variants of GPT-5.2 Pro), with humans providing only prompts and verification. This strikingly illustrates the seminar’s theme by demonstrating AI’s capacity to automate the demanding task of theoretical analysis and proof generation, profoundly challenging traditional views on human-driven discovery and scientific originality.

Abstract

The property of learning-curve monotonicity, highlighted in a recent line of work by Loog, Mey, and Viering, describes algorithms whose average performance only improves given more data, for any underlying data distribution within a given family. We establish the first nontrivial monotonicity guarantees for the maximum likelihood estimator in a variety of well-specified parametric settings. For sequential prediction with log loss, we show monotonicity (in fact complete monotonicity) of the forward KL divergence for Gaussian vectors with unknown covariance and either known or unknown mean, as well as for Gamma variables with unknown scale parameter. The Gaussian setting was explicitly highlighted as open in the aforementioned works, even in dimension 1. Finally, we observe that for reverse KL divergence, a folklore trick yields monotonicity for very general exponential families. All results in this paper were derived by variants of GPT-5.2 Pro. Humans did not provide any proof strategies or intermediate arguments, but only prompted the model to continue developing additional results, and verified and transcribed its proofs.
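For readers unfamiliar with the property, a standard formalization in the spirit of Loog, Mey, and Viering can be sketched as follows (the notation here is illustrative, not taken from the paper):

```latex
% An algorithm A has a monotone learning curve over a family of
% distributions \mathcal{D} if, for every D in \mathcal{D}, its
% expected risk never increases with more i.i.d. training data:
\[
  \mathbb{E}_{S_n \sim D^n}\!\left[ R_D\big(A(S_n)\big) \right]
  \;\ge\;
  \mathbb{E}_{S_{n+1} \sim D^{n+1}}\!\left[ R_D\big(A(S_{n+1})\big) \right]
  \qquad \text{for all } n \ge 1,
\]
% where R_D is the expected loss of the learned predictor under D,
% e.g. the forward KL divergence to the true distribution in the
% Gaussian and Gamma settings described in the abstract.
```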


Gemini-based automated feedback for theoretical CS papers

David Woodruff (CMU) – Friday, April 10 @ 1 pm [TBD]

Category: Theory/Automated Review

Location: CSL E144

Description

Focusing on Gemini-based automated feedback for theoretical CS papers, this session will cover the application of AI to the peer review process. This directly relates to the seminar series by showing how AI is being integrated into the later stages of the research lifecycle, specifically by augmenting and automating the critical tasks of writing and reviewing papers to support reproducibility and rigor.

Abstract

The pursuit of truth in theoretical computer science and mathematics relies on the highest standards of proof, rigor, and clarity. While peer review is the crucial final check, the process of drafting and refining complex theoretical work often takes months, with simple errors, inconsistent variables, or subtle logical gaps frequently slowing down the entire research pipeline. But could a highly specialized AI tool act as a fast, rigorous collaborator, helping authors pre-vet their work before it ever reaches human reviewers?

To test this potential, we created an experimental program for the Annual ACM Symposium on Theory of Computing (STOC 2026) — one of the most prestigious venues in theoretical computer science. This program offered authors automated, pre-submission feedback generated by a specialized Gemini AI tool. Our objective was to provide constructive suggestions and identify potential technical issues within 24 hours of submission, helping authors polish their final drafts before the submission deadline.

The responses were very positive: the tool successfully identified a variety of issues, including calculation and logic errors. Here we report how we developed the tool and the results of its use.


The Impact Market to Save Conference Peer Review

Karu Sankaralingam (UW Madison / NVIDIA) – Friday, April 17 @ 1 pm [TBD]

Category: Systems/Interdisciplinary/Review Process

Location: CSL E144

Description

This talk introduces the Impact Market, a market-based mechanism for conference peer review that decouples dissemination from credentialing. It proposes a multi-stage process that combines broad access to publication with longer-horizon, incentive-aligned evaluation to produce a more stable and interpretable signal of impact. This directly ties to the seminar’s theme by examining how AI-era scale and automation are pressuring us to redesign research incentives, review processes, and what “quality” and “contribution” should mean in the first place.