SEMINAL: Searching for Error Messages in Advanced Languages
Papers and Downloads
- Searching for Type-Error Messages, PLDI 2007
- SEMINAL: Searching for ML Type-Error Messages, ML Workshop 2006
- Prototype implementation
Overview
Seminal is a new approach to providing useful type-error messages during compilation, while simultaneously making compilers simpler, more reliable, and faster. The key idea is to use a simple and robust type-checker as an oracle for an error-message finder that uses search to find a good error message in the form of a "similar" code skeleton that would type-check.
Motivation
Modern programming languages have complex type systems, complex implementations, and complex rules for type inference and/or implicit type instantiation of "generic" functions. Algorithms for inferring types are often simple, elegant, and fast in practice for well-typed programs, but the simple algorithms do not produce good error messages. So what typically happens is an elegant compiler implementation gets muddied with extra state, flags, special cases, and strings in an attempt to produce marginally better error messages. This process, which we have seen first-hand in the Cyclone compiler and confirmed anecdotally with many other researchers, leads to compilers that are buggier, tougher to write, much tougher to maintain, and slower.
Approach
We maintain the simple elegance of a type-checker that
makes little attempt to produce good error messages by not
using it directly to produce error messages.
Similarly, we do not construct a complicated type-checker
just to produce error messages. Instead, the
error-message producer is a search procedure that
uses the simple type-checker as an oracle to determine if
programs type-check. The goal of the search is to find a
program skeleton very much like the ill-typed program
except the skeleton would type-check. For example, a
message might say "f(e1,e2,e3)
does not
typecheck but f(e1,_,e3)
does if you replace
_
with an expression of type
t
." Of course, there are many such skeletons
that we then rank, suggesting the "best" ones first.
Advantages
The key insight is that we can reuse an elegant type-checker for error messages with essentially no change to the type-checker's implementation. A search procedure may be too slow for compiling a large program, but we use search only for ill-typed code. As such, it is reasonable to act "at human speed" -- searching for one or two seconds is reasonable. Furthermore, the search is over program expressions, which are often simpler than program types, and does not require a detailed understanding of the type system. Finally, the error messages should be better than they were before. In short, we believe Seminal represents an easier implementation approach that produces better results, a true "win-win."
Contact
- Email (essential):
- (first initial + last name) {at} ccs.neu.edu
- Location (likely):
- West Village H, Office 326
- Post (possible):
-
Northeastern University
Khoury College of Computer Sciences
360 Huntington Ave, 2nd floor
Boston, MA 02115