A Boston AI lab convincing language models to write code that won't crash a fighter jet, a satellite, or a Bosch dishwasher.
Inside that wing is software. The software runs on a chip nobody outside a small procurement office has heard of, in a language a small number of engineers still write fluently, and it was last meaningfully updated when Obama was in his first term. Replacing it should take a week. Replacing it actually takes three years - because if it goes wrong, somebody dies. This is the world Code Metal is built for. It is not the part of AI that goes viral on a Tuesday afternoon.
Code Metal is a two-year-old Boston startup with sixty-six employees and roughly $178 million in the bank. It builds an AI platform that translates code from one language to another - Python to CUDA, MATLAB to C++, legacy FORTRAN to something that will run on a 2026 FPGA - and then formally proves the translation is correct. The word "proves" is doing a lot of work in that sentence. It is also why aerospace, defense and automotive companies will return their calls.
Most generative-AI code tools work like a charming intern. Code Metal works like a clerk with a pen and an extremely long checklist. That is the entire pitch. It is, somehow, working.
Pick a hardware company. Boeing, Bosch, Toshiba, Raytheon. Inside it lives an unromantic crisis: there is too much code, written in too many languages, for too many chips, and the engineers who originally wrote it are mostly retired. The next chip is always different. The compliance rules (MISRA C, DO-178C, ISO 26262) are immovable. The cost of a port is measured in person-years.
Meanwhile, the rest of the software industry has decided large language models can write the code. They can - if you do not mind a five percent chance of a quietly broken edge case. For a SaaS dashboard, five percent is a bug ticket. For an anti-lock braking system, five percent is a recall.
This is the gap Code Metal walked into. Not the gap between humans and AI. The gap between AI you can demo and AI you can certify.
Peter Morales, the CEO, came to the idea the way people in defense usually come to ideas: by being handed one. At BAE Systems, he was asked to make machine-learning algorithms run faster on an F-35, specifically the algorithms responsible for not getting shot down. The bottleneck, it turned out, was not the math. The bottleneck was getting the math to live happily on a particular chip. Morales spent a decade in AI research, including a stint at MIT Lincoln Laboratory automating things for the Air Force. He noticed, eventually, that everybody had this problem and nobody was solving it generally.
His co-founder Alex Showalter-Bucher had spent a decade at Lincoln Lab too, doing the kind of work the Navy, the Army and the Department of Homeland Security do not put in press releases. In 2023 they left, raised $3.5 million in pre-seed money from J2 Ventures and Fulcrum, and started writing the version of the tool they had always wanted.
The bet was not that AI could write code. It was that AI plus an old, deeply unfashionable branch of computer science called formal methods could write code that was provably correct. Most of Silicon Valley had forgotten formal methods existed. Code Metal remembered.
Conceptually the platform is unglamorous, which is the point. A customer hands it a body of source code - say, a control loop written in MATLAB. Code Metal breaks the code into pieces small enough that an LLM cannot meaningfully hallucinate, instruments each piece with auto-generated unit tests, translates the piece into the target language (C++, Rust, CUDA, VHDL, MISRA-compliant C, take your pick), and then verifies that the new piece behaves identically to the original. Then it optimizes. Then it benchmarks. Then it hands you back something a compliance officer will accept.
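The "translate, then check against the original" step can be illustrated with a toy. What follows is a minimal sketch, not Code Metal's method: it uses randomized differential testing, which is much weaker than the formal proof the company describes, and every name in it (`reference_step`, `translated_step`, `buggy_step`, `find_counterexample`) is hypothetical. Both "languages" are Python here, purely to keep the example runnable.

```python
import math
import random


def reference_step(error, integral, kp=0.8, ki=0.1, dt=0.01):
    """Hypothetical stand-in for the original routine: one step of a PI control loop."""
    integral = integral + error * dt
    output = kp * error + ki * integral
    return output, integral


def translated_step(error, integral, kp=0.8, ki=0.1, dt=0.01):
    """Hypothetical stand-in for a faithful machine translation of the routine."""
    new_integral = integral + dt * error
    return kp * error + ki * new_integral, new_integral


def buggy_step(error, integral, kp=0.8, ki=0.1, dt=0.01):
    """Hypothetical broken translation: the time step dt was silently dropped."""
    new_integral = integral + error
    return kp * error + ki * new_integral, new_integral


def find_counterexample(original, candidate, trials=10_000, seed=0, tol=1e-12):
    """Run both routines on the same random inputs and return the first input
    pair on which they disagree, or None if no disagreement is found.

    Finding no counterexample is evidence, not proof: untested inputs may differ.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        args = (rng.uniform(-100.0, 100.0), rng.uniform(-100.0, 100.0))
        expected = original(*args)
        actual = candidate(*args)
        same = all(
            math.isclose(e, a, rel_tol=tol, abs_tol=tol)
            for e, a in zip(expected, actual)
        )
        if not same:
            return args
    return None
```

Calling `find_counterexample(reference_step, translated_step)` returns `None`, while `find_counterexample(reference_step, buggy_step)` returns an input pair that exposes the dropped `dt`. The distance between "no counterexample in ten thousand trials" and "no counterexample exists" is exactly the gap formal methods are meant to close.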
The company calls this "Divide & Conquer, Instrument & Test, Translate & Verify, Optimize & Benchmark." It rolls off the tongue exactly as well as you would expect from a team of ex-Lincoln Lab researchers.
The flagship: translates between C++, Python, Rust, MATLAB, CUDA, OpenCL, VHDL and MISRA-compliant C, then proves the translation is equivalent.
Pushes generated code onto FPGAs, GPUs, microcontrollers and custom accelerators - without the usual nine-month porting epic.
Turns a 1998 codebase into something a 2026 engineer can actually maintain, with behavior preserved under formal verification.
Investors include Accel, Salesforce Ventures, Shield Capital, J2 Ventures and Fulcrum Venture Group. The customer list is the more telling chart, but harder to draw, so we will list it instead.
The official mission statement is to "make AI trustworthy by safely delivering the last mile for mission-critical industries." The phrase "last mile" is doing the heavy lifting. A model that can almost translate a flight controller is, in this domain, no model at all. The last twenty percent - the part where the code actually has to run, in production, on a chip, without surprises - is where the entire industry has been stalling. Code Metal has decided that gap is the whole business.
The deeper thesis is cultural. Defense and aerospace did not catch the generative-AI wave in 2023 because their procurement officers, very sensibly, refused to deploy software whose behavior could not be guaranteed. Code Metal's argument is that those officers were right - and that the way to get AI into regulated industries is not to lower the safety bar but to raise the verification ceiling.
It is a less viral position. It is also, increasingly, the lucrative one.
The chip world is in a slow-motion reshuffle. Custom silicon is everywhere - in cars, in appliances, in factory robots, in earbuds, in satellites. Each new chip brings its own SDK, its own quirks, its own performance corners. The classical answer is to throw a team of embedded engineers at the problem and wait. There is no team of embedded engineers large enough.
If Code Metal works - and the customer list suggests it does, for narrow cases, today - then the unit economics of writing software for physical things change. A medical-device company can support three new processors a year instead of one. A drone manufacturer can target both NVIDIA and a domestic FPGA without forking the codebase. A defense contractor can port a 30-year-old simulation off a deprecated compiler in weeks rather than quarters.
The reason this matters past the trade press is the obvious one. The next decade of hardware - autonomous vehicles, satellite constellations, factory automation, medical robotics - is gated almost entirely on the speed at which trustworthy software can reach trustworthy chips. Generative AI alone cannot deliver this. Formal verification alone cannot deliver this. The combination, if it scales, can.
Back to the wing. The flexing one, the one over the Pacific. The code inside it does not yet know that a Boston company is coming for the codebase. It will get a port that takes weeks instead of years. Maybe one of the engineers will go home in time for dinner. Maybe a few of them will read about Code Metal in a press release and not quite understand what is being claimed. That is generally how unglamorous revolutions work. Quietly, in someone else's industry, until the part where they are everywhere.