Electronic Data Interchange is one of those things that is everywhere and invisible. Every time a claim moves from a hospital to an insurer, an enrollment changes at a health plan, or a provider checks whether a patient is covered before a procedure, that's EDI.[1] Specifically, it's X12: a set of standards that has governed business-to-business data exchange since the 1970s, and that the United States government baked into law for health care through HIPAA Administrative Simplification.[2]
We ran into it building machine learning models for insurance claims: fraud detection, waste and abuse patterns, predictive outcome models. All of it depended on the same formats: 837s for claims, 834s for enrollment, 270/271 pairs for eligibility.[3] We needed to parse and transform millions of these files fast, inside real pipelines, with enough structure preserved that field values could flow into the models without losing their meaning.
So we looked for a library. There are options. Most were thick desktop translators built for a previous era of middleware. Others were single-stack libraries that would lock us to one runtime. Some were hosted services that handed us an API and kept the parsing somewhere else. Nothing was fast, portable, and honest about failures at the same time.
That last part kept coming up. In health care, malformed files aren't the exception; they're normal. Payer implementations vary. Real-world EDI doesn't always look like the spec. When something breaks in your pipeline at 2am, you need a parser that names the exact segment, field, and violation. Not one that returns a byte offset and calls it done.
So we built our own. The design decision that makes our parser different: instead of hand-writing a parser for each transaction type, we wrote a program that generates parsers from specs. Define the transaction spec once. Get a complete, conformant parser from it. The approach borrows from programming language tooling: write the grammar, derive the parser. Tree-sitter is the canonical example. We applied the same idea here. One spec definition, one generated parser, consistent behavior across every runtime. Hand-written rules drift; generated ones don’t.
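To make "define the spec once, derive the parser" concrete, here is a minimal sketch in Python. It is not our generator: the SPEC table, the element names, and the function names are all hypothetical, and it builds closures at runtime instead of emitting source code. The point is only that the parsing logic is written once and the per-segment behavior comes from data.

```python
# Hypothetical, heavily simplified spec: segment ID -> names for its elements.
# A real transaction spec also carries types, lengths, usage, and loop rules.
SPEC = {
    "CLM": ["claim_id", "total_charge"],
    "NM1": ["entity_code", "entity_type", "last_name", "first_name"],
}


def make_segment_parser(segment_id, element_names):
    """Derive a parser for one segment type from its spec entry."""

    def parse(elements):
        # elements[0] is the segment ID; the rest are the data elements.
        if not elements or elements[0] != segment_id:
            found = elements[0] if elements else "nothing"
            raise ValueError(f"expected {segment_id}, found {found}")
        values = elements[1:]
        # Trailing elements may be omitted in X12; treat them as empty.
        return {
            name: values[i] if i < len(values) else ""
            for i, name in enumerate(element_names)
        }

    return parse


def generate_parsers(spec):
    """Build one parser per segment ID. No per-segment code is hand-written."""
    return {
        segment_id: make_segment_parser(segment_id, names)
        for segment_id, names in spec.items()
    }


parsers = generate_parsers(SPEC)
claim = parsers["CLM"]("CLM*A1*100".split("*"))
```

Adding a segment here means adding a row to SPEC, not writing a new function, which is the property that keeps generated parsers from drifting apart.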
The core is written in Rust, which is fast and memory-safe in ways that matter when you’re processing high-volume medical data. From there it compiles to .NET, Java, Python, PHP, and JavaScript. It runs in a browser, which is how this playground works. And because each parser is derived from its schema, it knows exactly what valid looks like. When something breaks, the diagnostic names the segment, field, and violation.
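As an illustration of what "names the segment, field, and violation" can look like, here is a small Python sketch. The SCHEMA table, the Diagnostic class, and the message wording are assumptions made for this example, not the library's actual types or output. The data type codes (AN for string, R for decimal) follow X12 conventions.

```python
import re
from dataclasses import dataclass

# Hypothetical, simplified schema: per segment, (element name, data type, required).
SCHEMA = {
    "CLM": [("CLM01", "AN", True), ("CLM02", "R", True)],
}

# Optional leading minus, digits, optional decimal point.
DECIMAL = re.compile(r"-?(\d+\.?\d*|\.\d+)")


@dataclass(frozen=True)
class Diagnostic:
    segment_index: int  # 1-based position of the segment in the stream
    segment_id: str
    element: str
    message: str

    def __str__(self):
        return (
            f"segment {self.segment_index} ({self.segment_id}), "
            f"element {self.element}: {self.message}"
        )


def validate(segments, schema=SCHEMA):
    """Check each segment (a list of elements) against the schema rules."""
    diagnostics = []
    for index, seg in enumerate(segments, start=1):
        rules = schema.get(seg[0], [])
        for position, (name, data_type, required) in enumerate(rules, start=1):
            value = seg[position] if position < len(seg) else ""
            if not value:
                if required:
                    diagnostics.append(
                        Diagnostic(index, seg[0], name, "required element is missing")
                    )
            elif data_type == "R" and not DECIMAL.fullmatch(value):
                diagnostics.append(
                    Diagnostic(
                        index, seg[0], name,
                        f"expected a decimal number, found {value!r}",
                    )
                )
    return diagnostics


problems = validate([["ST", "837", "0001"], ["CLM", "A1", "12x.50"]])
for problem in problems:
    print(problem)
```

Because the checks are read from the same table that describes the segment, the diagnostic can always say which element it was looking at and what it expected there.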
Transforms are part of the same library. Parse to JSON, XML, FHIR, or any custom format you wire in. One dependency, one thing to reason about, the whole chain from raw X12 to structured data.
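A rough sketch of the "parse once, transform to whatever you need" shape, in Python. The function names, the TRANSFORMS registry, and the output layouts are invented for this example and do not describe the library's real API or its JSON and XML schemas.

```python
import json
import xml.etree.ElementTree as ET


def to_json(segments):
    """Render parsed segments (lists of elements) as a JSON array."""
    return json.dumps([{"id": s[0], "elements": s[1:]} for s in segments])


def to_xml(segments):
    """Render the same segments as a small XML document."""
    root = ET.Element("transaction")
    for s in segments:
        node = ET.SubElement(root, "segment", id=s[0])
        for position, value in enumerate(s[1:], start=1):
            element = ET.SubElement(node, "element", pos=str(position))
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical registry: a custom format is one more entry in this table.
TRANSFORMS = {"json": to_json, "xml": to_xml}


def transform(segments, target):
    """Dispatch to the transform registered under `target`."""
    try:
        render = TRANSFORMS[target]
    except KeyError:
        raise ValueError(f"no transform registered for {target!r}") from None
    return render(segments)


segments = [["CLM", "A1", "100"]]
as_json = transform(segments, "json")
as_xml = transform(segments, "xml")
```

Every output format starts from the same parsed structure, so a custom target only has to describe how to render it, not how to read X12.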
Reads the interchange header to determine delimiters and transaction type. No configuration, no schema files to manage up front.
X12 loops are implicit in the flat segment stream. The parser reconstructs that structure so you can navigate by loop, not by line number.
Diagnostics point at the exact segment, field, and violation. In language a person can read. Not a byte offset. The actual problem.
.NET, Java, Python, PHP, JavaScript, and browser via WebAssembly. One core library, no HTTP hop, no subprocess.
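Two of the ideas above, reading delimiters from the interchange header and rebuilding structure from the flat segment stream, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the library's implementation: the function and class names are hypothetical, and the structure rebuilt here covers only the HL hierarchy (HL01 is the level's ID, HL02 its parent's ID, HL03 the level code), where a real parser follows the full loop definitions of each transaction. The one fact it leans on is that the ISA segment is fixed-width: 106 characters, with the element separator at offset 3 and the segment terminator at offset 105.

```python
from dataclasses import dataclass, field

ISA_LENGTH = 106  # fixed-width ISA segment, terminator included


def read_segments(raw):
    """Detect delimiters from the ISA header, then split into lists of elements."""
    if len(raw) < ISA_LENGTH or not raw.startswith("ISA"):
        raise ValueError("interchange does not start with a complete ISA segment")
    element_sep = raw[3]
    segment_term = raw[105]
    segments = []
    for chunk in raw.split(segment_term):
        chunk = chunk.strip()  # tolerate line breaks after the terminator
        if chunk:
            segments.append(chunk.split(element_sep))
    return segments


def transaction_type(segments):
    """Return ST01 (for example '837') from the first ST segment."""
    for seg in segments:
        if seg[0] == "ST" and len(seg) > 1:
            return seg[1]
    raise ValueError("no ST segment found")


@dataclass
class HLNode:
    hl_id: str
    parent_id: str   # empty for a top-level HL
    level_code: str  # in an 837: 20 billing provider, 22 subscriber, 23 patient
    segments: list = field(default_factory=list)
    children: list = field(default_factory=list)


def build_hl_tree(segments):
    """Attach each segment to the HL level it follows and link levels by parent ID."""
    nodes, roots, current = {}, [], None
    for seg in segments:
        if seg[0] == "HL":
            current = HLNode(seg[1], seg[2], seg[3])
            nodes[current.hl_id] = current
            if current.parent_id in nodes:
                nodes[current.parent_id].children.append(current)
            else:
                roots.append(current)
        elif seg[0] == "SE":
            current = None  # transaction trailer closes the last level
        elif current is not None:
            current.segments.append(seg)
    return roots
```

With the tree in hand, a caller can walk from billing provider to subscriber to claim without counting lines, and the same file parses whether the sender used `*` and `~` or `|` and a newline.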
HIPAA Administrative Simplification standardizes how electronic health care transactions are structured and exchanged in the United States.[4] X12 is the standards body behind the format, and its work goes well beyond health care into finance, supply chain, transportation, logistics, and defense.[5] The transactions below are the health care ones we focused on first.
Used to submit a claim for payment or report an encounter. Professional, institutional, and dental variants.
Establishes, changes, reinstates, or terminates coverage between a sponsor and a health plan.
The pair used to ask whether a member has coverage for a given service and receive the answer.
X12 standards cover finance, insurance, supply chain, transportation, logistics, defense, and more.
Market frame
Most EDI tooling was built for throughput: move data from A to B, log a success, move on. That model works for stable, high-volume production routes. It starts to break down when you need to understand a failure, embed parsing into a pipeline you own, or run the same logic in more than one language.
We think about the space along two dimensions: how portable the tool is (does it run where you need it, in the language you're already using) and how transparent it is when something goes wrong.
Competitive landscape
Legacy translators
Edifecs, BizTalk, OpenText
Mature, often readable output. Usually tied to a specific platform, runtime, or licensing model. Hard to embed in a modern pipeline without significant overhead.
Where we're building
Portable and transparent
Multi-stack. Embeds in your pipeline. Diagnostics name the failure. Schema-derived. It knows what valid looks like, so it can tell you what isn't.
Hosted EDI services
SPS Commerce, TrueCommerce, VAN providers
Convenient for standard routes. You get an API, not a parser. When something breaks, debugging means a support ticket. The internals aren't yours to inspect.
DIY scripts
Hand-rolled regex and split()
Portable and cheap on day one. Breaks on any edge case the script didn't anticipate, and you own every bug. Works until it doesn't, and then it really doesn't.
Best for
Teams that need parsing inside real products, jobs, or browser workflows. Not as a managed black box.
Trade-off
Hosted services make standard routes easy, but they hide the parser when the hard part is understanding a bad transaction.
Point of view
The gap we care about is failure understanding: exact segment, exact field, exact reason, in language a person can act on.
The browser preview is live. If you want to use the parser in your own stack, email us. We will tell you where it is solid, where it is still moving, and whether it is actually a fit.
hello@josh.one