This tool finds errors in code and suggests how to fix them

Author: Milton Posner
Date: 06.28.24

At first glance, the field of formal methods doesn’t seem to lend itself to an easy-to-use product intended for broad audiences. It deals with the verification of software and hardware systems based on mathematical pillars and proofs, and even in the rigorous field of computer science, it stands out for its difficulty.

But Khoury doctoral student Max von Hippel, along with his best friend and MIT Sloan School of Management graduate Juan Castaño, built a product anyway. It’s called Benchify, and it can scan Python code, identify errors, and suggest fixes — all with just a single command. By optimizing the tedious process, the package enables software engineers to spend less time testing and debugging, and more time writing new code.

Max von Hippel (left) and Juan Castaño.
Max von Hippel (left) and Juan Castaño.

Von Hippel and Castaño have been developing the tool for the last year and a half, and now they’ve secured more than half a million dollars in funding from elite startup incubator Y Combinator and venture capital firm NFX.

“People were writing back-end code that was full of bugs, they didn’t have good techniques for bug finding, and the engineers hated writing unit tests,” von Hippel says. “They were testing their code in ways that didn’t reliably find problems, so they wanted a tool to help.”

The path to that tool began in 2015 when von Hippel and Castaño roomed together as first-years in an entrepreneurship-themed dorm at Dartmouth College. They quickly grew close and began kicking around the idea of founding a business together.

Von Hippel soon transferred to the University of Arizona, finished his math degree there, then jumped to Boston to pursue a formal methods doctorate at Khoury College. Castaño stayed in New Hampshire, focusing on economics and human-center design while completing his undergraduate studies at Dartmouth.

“We’d have phone calls where we’d chat about business ideas,” von Hippel recalls. “I’d come from a technical perspective of what was possible, while Juan would come from a viability perspective of what could make money.”

The stars aligned in 2022 when Castaño began his MBA just across the river at MIT. As part of the university’s Sandbox entrepreneurship program, he began ideating and meeting with business mentors. Von Hippel joined him for sessions, at first simply to help a buddy with his education. But as they interviewed dozens of engineers and project managers about the challenges they faced in the development processes, they realized that there was an unfilled niche for a formal-methods-based tool to assist software engineers.

“Research has shown that formal methods techniques are just way too hard for people to use, even if they have bachelor’s degrees in computer science,” von Hippel explains. “The big thing we’re trying to solve is not the problem of making better formal methods; it’s the problem of making existing formal methods available to normal software engineers.”

The idea, he adds, “is to have formal methods where the engineer literally doesn’t have to know the words ‘formal methods’ … We’ll translate from technical to nontechnical for you, and if you can understand a unit test written in Python, you can look at the result and know whether it applies to your code.”

That explainability has been von Hippel and Castaño’s biggest design challenge. Whether a user’s code is “correct” depends on what that code is trying to accomplish, so the duo designed Benchify to determine the engineer’s goal, suggest quick-and-dirty patches if they would improve the code, and stay silent if the proposed patch wouldn’t help.

Two benchify screenshots. The screenshot on the left shows a snippet of python code with the command for benchify to review the code under the snippet. The screenshot on the right shows the results of the review and says that the code passed one test and failed a second. The application also suggests to change line 5 of the code to pass the second test.

But in industry, as in academia, technical challenges aren’t the only challenges. As Castaño taught von Hippel the ins and outs of venture capital and investor pitches, von Hippel found that as opposed to academic funding agencies — which often look for researchers’ confidence in one specific plan — investors were more interested in how the duo planned to adjust based on the market’s response.

“Investors seemed open to the idea that there is a certain amount of customer discovery that should happen before we decide the best way to commercialize the product,” he recalls. “The important thing to them was that we had that decision tree laid out ahead of time, that we knew what the options were, and that we knew what experiments we could do to get a sense of what works and what doesn’t.”

After years of presenting in academia, von Hippel also practiced with Castaño to reach the “appropriate” level of depth for investors. But he was surprised by investors’ technical savvy.

“I haven’t had to hide the academic aspects of what we’re doing from investors, which I suspected I would have needed to,” he says. “The venture capital firms we’ve talked to have either had deep backgrounds in computer science or brought in people who did … That level of technical chops has allowed us to have really good conversations.”

Max Von Hippel (left) and Juan Castaño pitch to investors over video chat.
Von Hippel and Castaño pitch to investors over video chat.

Those conversations bore fruit when Y Combinator and NFX made their investments in May and April, respectively. Y Combinator is especially prized, as the early-stage incubator program, whose portfolio totals a combined $600 billion valuation, funds less than two percent of applicants — among them Airbnb, Reddit, Doordash, and Dropbox. The funding affords von Hippel and Castaño two years of runway to pay themselves full time, bring in new team members for part-time work, and cover legal and miscellaneous costs.

The beta version of Benchify, which is now available, is based on a freemium model; the free version allows coders to analyze five functions per day, while the paid version allows unlimited use. For now, the tool works only on isolated Python functions and is geared toward back-end engineers. Von Hippel and Castaño are expanding it to handle JavaScript, Java, and C#, as well as more tailored product offerings for different applications.

“A big goal of launching the product, beyond just revenue, is customer exploration,” von Hippel notes. “We have a bunch of cool feature ideas, but in contrast to academia where the goal is to go as deep as possible, here we want to hit low-hanging fruit first to help with commercialization.”

It’s an effort that von Hippel, who plans to defend his doctoral dissertation in July, expects to consume at least the next 5–10 years as he and Castaño guide Benchify to wider success. And he has faith that as they make the tool better, they’ll make each other better too.

“It’s tempting to look at us and see a skill dichotomy,” von Hippel says of his computing chops and Castaño’s entrepreneurial acumen, “but both of us find the other’s regime interesting. We talk about everything … and it’s been really rewarding and interesting to learn about a world that I didn’t have much exposure to.”

The Khoury Network: Be in the know

Subscribe now to our monthly newsletter for the latest stories and achievements of our students and faculty

This field is for validation purposes and should be left unchanged.