Verified Compilers for a Multi-Language World
Fri 02.19.16
Verified Compilers for a Multi-Language World
Fri 02.19.16
Fri 02.19.16
Fri 02.19.16
Fri 02.19.16
Fri 02.19.16
Compilers play a critical role in the production of software. As such, they should be correct. That is, they should preserve the behavior of all programs they compile. Despite remarkable progress on formally verified compilers in recent years, these compilers suffer from a serious limitation: they are proved correct under the assumption that they will only be used to compile whole programs. This is an entirely unrealistic assumption since most software systems today are comprised of components written in different languages compiled by different compilers to a common low-level target language. The intellectual merit of this project is the development of a proof architecture for building verified compilers for today’s world of multi-language software: such verified compilers guarantee correct compilation of components and support linking with arbitrary target code, no matter its source. The project’s broader significance and importance are that verified compilation of components stands to benefit practically every software system, from safety-critical software to web browsers, because such systems use libraries or components that are written in a variety of languages. The project will achieve broad impact through the development of (i) a proof methodology that scales to realistic multi-pass compilers and multi-language software, (ii) a target language that extends LLVM—increasingly the target of choice for modern compilers—with support for compilation from type-safe source languages, and (iii) educational materials related to the proof techniques employed in the course of this project.
The project has two central themes, both of which stem from a view of compiler correctness as a language interoperability problem. First, specification of correctness of component compilation demands a formal semantics of interoperability between the source and target languages. More precisely: if a source component (say s) compiles to target component (say t), then t linked with some arbitrary target code (say t’) should behave the same as s interoperating with t’. Second, enabling safe interoperability between components compiled from languages as different as Java, Rust, Python, and C, requires the design of a gradually type-safe target language based on LLVM that supports safe interoperability between more precisely typed, less precisely typed, and type-unsafe components.
Compilers play a critical role in the production of software. As such, they should be correct. That is, they should preserve the behavior of all programs they compile. Despite remarkable progress on formally verified compilers in recent years, these compilers suffer from a serious limitation: they are proved correct under the assumption that they will only be used to compile whole programs. This is an entirely unrealistic assumption since most software systems today are comprised of components written in different languages compiled by different compilers to a common low-level target language. The intellectual merit of this project is the development of a proof architecture for building verified compilers for today’s world of multi-language software: such verified compilers guarantee correct compilation of components and support linking with arbitrary target code, no matter its source. The project’s broader significance and importance are that verified compilation of components stands to benefit practically every software system, from safety-critical software to web browsers, because such systems use libraries or components that are written in a variety of languages. The project will achieve broad impact through the development of (i) a proof methodology that scales to realistic multi-pass compilers and multi-language software, (ii) a target language that extends LLVM—increasingly the target of choice for modern compilers—with support for compilation from type-safe source languages, and (iii) educational materials related to the proof techniques employed in the course of this project.
The project has two central themes, both of which stem from a view of compiler correctness as a language interoperability problem. First, specification of correctness of component compilation demands a formal semantics of interoperability between the source and target languages. More precisely: if a source component (say s) compiles to target component (say t), then t linked with some arbitrary target code (say t’) should behave the same as s interoperating with t’. Second, enabling safe interoperability between components compiled from languages as different as Java, Rust, Python, and C, requires the design of a gradually type-safe target language based on LLVM that supports safe interoperability between more precisely typed, less precisely typed, and type-unsafe components.