My research group builds a variety of software systems in the course of our research. These are the systems that we are actively building:
Large Language Models (LLMs) of Code: BigCode is a large collaboration to build open and responsible large language models (LLMs) of code. BigCode has released several Code LLMs to which our group has contributed, including StarCoder and SantaCoder. I lead the working group on evaluation for the BigCode Project. MultiPL-E is the first polyglot benchmark for Code LLMs and is used by BigCode and several other organizations developing Code LLMs.
WebAssembly: WasmFX is a large collaboration to bring effect handlers (related to continuations) to WebAssembly. Our group developed a very early prototype of continuations for WebAssembly (DLS 2020). We have also worked on benchmarking WebAssembly performance (ATC 2019).
Stopify: A source-to-source compiler that adds first-class continuations to JavaScript. We use Stopify to build Ocelot, which is a JavaScript IDE that runs in a web browser. Ocelot was the IDE used to teach an introductory software engineering class (COMPSCI 220) at UMass Amherst from 2018–2022. We also used Ocelot to teach an outreach workshop that allows novices to program high-performance, soccer-playing robots. We have described some of this work in research papers (PLDI 2018 and EAAI 2020).
Here are some past projects that are no longer in active development, in chronological order:
Ovid: A control-flow analysis for JavaScript that I developed early in my PhD. It is described in a research paper at WWW 2009.
Flapjax brought functional reactive programming to JavaScript. We used it to build a variety of systems that ran for 10+ years. We described Flapjax in an OOPSLA 2009 paper that received the OOPSLA Most Influential Paper Award in 2019. We also wrote a 10-year retrospective on the SIGPLAN blog.
LambdaJS: A tested semantics for JavaScript that I developed during my PhD. LambdaJS is described in my dissertation and in an ECOOP 2010 paper, and spurred several follow-up efforts.
Strobe: A type-checker for JavaScript, that I also developed during my PhD, a few years before related tools like Microsoft TypeScript and Facebook Flow appeared. Its approach to type-checking is described in my dissertation and research papers (ESOP 2011 and FOOL 2012). This type-checker was also the foundation for other research projects (USENIX Security 2011 ECOOP 2013, DLS 2013).
PANE: An SDN controller that allows safe, decentralized network configuration. PANE policies are written in a domain-specific language, and it has a compiler and runtime system that realizes the policies in OpenFlow. PANE was the foundation for Andrew Ferguson’s dissertation and is also described in a series of research papers (HotICE 2012], HotSDN 2012, and SIGCOMM 2013).
Frenetic: A language for programming software-defined networks. PANE borrowed many ideas from Frenetic, and I joined the Frenetic project two years after its inception. During my postdoc, I led the development of a new compiler and runtime system for Frenetic, and helped evolve its language (NetKAT). My work on Frenetic is described in a series of research papers (PLDI 2013, POPL 2014, and ICFP 2015).
Rehearsal: A verification tool for Puppet, which is an industrial DSL for writing system configurations. We used Rehearsal to build Tortoise, which is an automated configuration repair tool. Both are described in research papers (PLDI 2016 and ASE 2017).
SMT-Based Robot Transition Repair (SRTR): The SRTR project is a suite of programming tools for the working roboticist. Robot control programs suffer failures that are exceptionally hard to fix: some failures are unobservable (by the robot), others occur due to environmental factors, and recovery sometimes requires human intervention. SRTR provides a DSL for writing robot control programs, and a semi-automated program synthesis and repair tools. Portions of SRTR are described in research papers (IJCAI 2018 and CoRL 2020).
Containerless: A serverless computing platform that transparently uses language-based techniques to isolate and accelerate serverless functions when possible, and falls back to virtualization only when necessary.