
Generating insurance documents at scale using AI

· 11 min read
Hafida Maaraf
Senior Fullstack Developer

Every Orus insurance contract ships with a stack of legal documents: the contract terms, the insurance certificate, invoices, fiscal attestations for tax-deductible contracts. Their format is regulated, their content reviewed by legal counsel, and they're rendered dynamically from subscription data: customer name, company SIREN, coverages, premium, any clauses negotiated for that specific deal.

We add new products every few weeks. At the time of writing, we render 60 distinct customer-facing document templates across our French and Spanish products, built from 218 React components in total.

Adding a new template used to take a couple of days. We've brought it down to a few hours by building a Claude Code skill that handles the tedious parts, with a human still in the loop on every non-trivial decision. This article walks through what the skill does, what we deliberately keep human, and the gotchas baked into it.

How insurance documents are rendered at Orus

We don't generate documents with an LLM at runtime. We render them deterministically, from React components, using react-pdf.

A simplified Conditions Particulières (CP) template looks like this:

export function AgreedTerms(props: AgreedTermsProps): ReactElement {
  return (
    <Document>
      <FrontPage {...props} />
      <CoveragesSection coverages={props.coverages} />
      {props.customClauses.length > 0 && (
        <CustomClausesSection clauses={props.customClauses} />
      )}
      <SignatureSection signedAt={props.signedAt} />
    </Document>
  )
}

Each section is itself a React component composed of building blocks from our internal library: H1, Paragraph, Table2Columns, TextCallout, Ul/Li (with a level prop for nested bullets), AmountText, DateText, Signature, and so on.
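
To make the level prop concrete, here is a minimal sketch of how a nested bullet list could be flattened into one entry per bullet with its depth. The BulletNode and FlatBullet shapes and the flattenBullets helper are hypothetical names for illustration, and counting levels from 0 is an assumption about how the real Li component numbers them.

```typescript
// Hypothetical shape for a bullet parsed out of the source document.
interface BulletNode {
  text: string
  children?: BulletNode[]
}

// One flat entry per bullet, carrying the depth a `level`-style prop would receive.
// Assumption: 0 is the top level; the real Li component may count differently.
interface FlatBullet {
  text: string
  level: number
}

// Depth-first walk: each bullet is emitted before its children, one level deeper.
function flattenBullets(nodes: BulletNode[], level = 0): FlatBullet[] {
  return nodes.flatMap((node) => [
    { text: node.text, level },
    ...flattenBullets(node.children ?? [], level + 1),
  ])
}

const flat = flattenBullets([
  {
    text: 'Covered events',
    children: [{ text: 'Data breach', children: [{ text: 'Notification costs' }] }],
  },
  { text: 'Exclusions' },
])
```

Each entry in flat could then be rendered as one Li with the matching level, which keeps the nesting decision in data rather than in hand-written JSX.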

This setup buys us a few things:

  • TypeScript exhaustiveness. A new product version is a new branch in a switch statement. If we forget to add the branch, the code doesn't compile.
  • Reusable components. A callout looks the same everywhere because it's the same component.
  • Pure rendering. No template language, no string concatenation, no runtime AI. Same input, same output.
  • Testability. We snapshot the rendered PDFs in CI.

The PDFs themselves are rendered in a worker pool, with OpenTelemetry tracking render duration per template. None of that involves AI. It's React, all the way down.

The job of porting a Word document to code

A new insurance product, or a new version of an existing one, almost always starts the same way: someone in the product or legal team hands us a .docx file. That document is the source of truth, validated by legal counsel.

Our job is to turn it into a React component that produces the same document, but dynamically.

This used to be a multi-day effort. The reason isn't the typing; it's everything around it.

You need to:

  • Read the entire document carefully to identify every placeholder. Placeholders look like [CONTRACT_NUMBER], [IF_DRAFT]...[END_IF_DRAFT], [FOR_EACH_ACTIVITY]...[END_FOR_EACH]. They're not always consistent across products.
  • Decide on the structure. Single file? Folder with one section per file? Reuse common chunks from an existing product?
  • Find every reusable component we already have and use them, instead of recreating a slightly different version that ends up drifting.
  • Map every placeholder to a React prop. Some of them already exist in SubscriptionProps. Some are new.
  • Wire the custom clauses at the right spot. It's the same code in every template, but it's easy to forget, and if you do, the operations team finds out months later when a customer asks why their negotiated clause isn't on the PDF.
  • Get the nesting of lists right. Word documents can have 3 levels of bullet points. Our Li component takes a level prop. Mixing the levels up changes how the PDF looks, but no rendering test fails.
  • Decide whether each paragraph is a regular Paragraph or a TextCallout or a LayoutCallout based on whether it has a single line, multiple paragraphs, lists, or tables.
  • Cross-check the result against the original DOCX, page by page.

None of this is intellectually hard, but all of it requires sustained attention, deep knowledge of the codebase, and zero tolerance for "I'll fix that later." A missing custom clause section is a contractual issue.

So this kind of work has two characteristics that make it a good candidate for AI assistance: it's tedious and rule-governed.

Why we built a skill, not a prompt

Our first instinct was the obvious one: open the DOCX, paste the relevant pieces into a chat, ask Claude to produce the React component. It worked, kind of. The output looked right at first glance, but it consistently had issues:

  • It would invent component names that don't exist in our library (<Callout> instead of <TextCallout>).
  • It would duplicate logic that already lives in a common chunk.
  • It would silently drop the custom clauses section.
  • It would format placeholders as TypeScript template literals instead of React expressions referencing real props.

All of these come from the same root cause: the model doesn't know our codebase. It knows React, it knows PDF generation in general, but it doesn't know that we have a LayoutCallout for callouts with multiple paragraphs, or that custom clauses live in subscription-props.ts and need to be wired in front-pages.tsx.

You can paper over this by stuffing the prompt with references and examples, but the result is brittle. Every time we add a new component or change a convention, the prompt rots.

What we needed was something more structured. Not a prompt, but a workflow that:

  1. Knows where to look in the codebase.
  2. Has explicit steps.
  3. Forces a validation gate before any code is generated.
  4. Encodes the domain rules we discovered the hard way (mammoth drops floating frames, custom clauses are easy to forget, never add a default case in version switches, etc.).

Claude Code skills are designed for exactly this. A skill is a workflow document that the agent loads on demand when the user invokes it. The agent reads the workflow, follows the steps, and runs the tools at each step.

Our skill is invoked like this:

/convert-editique-docx-to-react cp rcph 3.1 ./specs/cyber-guarantees.docx

And then it works through eight steps, stopping once, at step 5, for the developer to validate the plan.

The workflow, step by step

Step 1: extract the content

npx mammoth <path-to-docx> --output-format=html > /tmp/docx-content.html

mammoth is a small library that converts DOCX to HTML. It does a decent job for body text, tables, lists, and headings.

Step 2: cross-check the raw XML

This is the first lesson the skill encodes. mammoth silently drops floating text frames, the side notes and callout boxes that Word anchors to a position on the page. Most of our legal documents come from Google Docs, whose DOCX export produces such frames liberally.

So the skill always runs a second extraction pass directly on the raw XML:

unzip -p <path-to-docx> word/document.xml \
  | grep -oE '<w:t[^>]*>[^<]*</w:t>' \
  | sed 's|<[^>]*>||g'

Then it diffs the two extractions. If the XML contains text that's missing from the HTML, the skill integrates it automatically into the generated component, typically as a TextCallout or LayoutCallout.
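
The comparison itself can be sketched in a few lines of TypeScript. This is an illustration under assumptions, not the skill's actual implementation: the function names are hypothetical, both extractions are passed in as strings, and the match is a plain substring check after whitespace normalization.

```typescript
// Collapse whitespace so the same sentence compares equal in both extractions.
function normalize(text: string): string {
  return text.replace(/\s+/g, ' ').trim()
}

// Pull the text of every <w:t> run out of word/document.xml
// (same pattern as the grep command above).
function extractXmlRuns(documentXml: string): string[] {
  const runs: string[] = []
  for (const match of documentXml.matchAll(/<w:t[^>]*>([^<]*)<\/w:t>/g)) {
    const text = normalize(match[1] ?? '')
    if (text.length > 0) runs.push(text)
  }
  return runs
}

// Strip tags from the mammoth HTML to get comparable plain text.
function htmlToText(html: string): string {
  return normalize(html.replace(/<[^>]*>/g, ' '))
}

// Runs present in the raw XML but absent from the HTML:
// candidates for floating frames that were dropped.
function findDroppedText(documentXml: string, mammothHtml: string): string[] {
  const htmlText = htmlToText(mammothHtml)
  return extractXmlRuns(documentXml).filter((run) => !htmlText.includes(run))
}
```

A substring check is deliberately crude: a very short run can match by coincidence elsewhere in the document, so anything this flags (or fails to flag) still deserves a look during the final PDF review.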

A developer caught this while testing an attestation: two regulatory mentions were missing from the rendered PDF. They were in floating text boxes that mammoth had dropped. Encoding the lesson in the skill means we don't trip over it again.

Step 3: analyze the codebase

Before generating anything, the agent explores the relevant parts of our documents library:

  • Where do similar templates live? (packages/libs/documents/src/documents/agreed-terms/ for CPs, documents/insurance-certificate/ for attestations.)
  • What does the structure look like for the same product if we already have a version of it? (We don't want to invent a new structure for a 3.1 if 3.0 already exists.)
  • Which existing common chunks can we reuse?
  • What's the props type for this product?

This is how we make sure the generated component looks like it belongs in the codebase.

Step 4: identify and map placeholders

The skill scans the DOCX content for every unique placeholder and sorts them by type:

| Pattern | Type | Example |
| --- | --- | --- |
| [VARIABLE_NAME] | Variable | [CONTRACT_NUMBER], [COMPANY_NAME] |
| [IF_X]...[END_IF_X] | Conditional | [IF_DRAFT]...[END_IF_DRAFT] |
| [FOR_EACH_X]...[END_FOR_EACH] | Loop | [FOR_EACH_ACTIVITY]...[END_FOR_EACH] |

Then it tries to match each placeholder to a prop from SubscriptionProps. When it can match, it lists the source. When it can't, it flags the placeholder as needing clarification.
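
In the skill this step is carried out by the agent, not by a script, but the classification rule is mechanical enough to sketch. The code below is a hypothetical illustration: scanPlaceholders, flagUnmatched and the naive SCREAMING_SNAKE_CASE to camelCase guess for prop names are assumptions, not part of our codebase.

```typescript
type PlaceholderKind = 'variable' | 'conditional' | 'loop'

interface Placeholder {
  token: string // as written in the DOCX, e.g. "[CONTRACT_NUMBER]"
  kind: PlaceholderKind
  propCandidate: string // naive guess at a prop name, to be confirmed by a human
}

// CONTRACT_NUMBER -> contractNumber
function toCamelCase(name: string): string {
  return name.toLowerCase().replace(/_([a-z0-9])/g, (_match, c: string) => c.toUpperCase())
}

// Collect every unique placeholder and sort it into one of the three types.
function scanPlaceholders(text: string): Placeholder[] {
  const found = new Map<string, Placeholder>()
  for (const match of text.matchAll(/\[([A-Z][A-Z0-9_]*)\]/g)) {
    const name = match[1] ?? ''
    // Closing markers carry no information of their own.
    if (/^END_(IF_|FOR_EACH)/.test(name) || found.has(name)) continue
    let kind: PlaceholderKind = 'variable'
    let subject = name
    if (name.startsWith('IF_')) {
      kind = 'conditional'
      subject = name.slice('IF_'.length)
    } else if (name.startsWith('FOR_EACH_')) {
      kind = 'loop'
      subject = name.slice('FOR_EACH_'.length)
    }
    found.set(name, { token: `[${name}]`, kind, propCandidate: toCamelCase(subject) })
  }
  return Array.from(found.values())
}

// Placeholders whose guessed prop is not among the known ones need clarification.
function flagUnmatched(placeholders: Placeholder[], knownProps: Set<string>): Placeholder[] {
  return placeholders.filter((p) => !knownProps.has(p.propCandidate))
}
```

The name guess is only a starting point: a loop over [FOR_EACH_ACTIVITY] yields the candidate activity, while the real prop may well be a plural array, which is exactly the kind of mismatch that gets surfaced for clarification.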

Step 5: validate with the user (mandatory gate)

Before generating any code, the skill stops and presents the developer with the following:

## Document Analysis: cp for rcph 3.1

### Product Status

- [x] Existing product, new version

### Document Content Analysis

**Sections identified:**

1. Front page (header, contract identification, dates)
2. Coverages section (with a new "Cyber" subsection)
3. ...

**Complexity assessment:**

- Total estimated lines: ~420
- Tables found: 3
- Conditional blocks: 5

### Proposed File Structure

```text
agreed-terms/rcph/3.1/
  index.tsx
  front-page.tsx
  cyber-coverages-section.tsx
  ...
```

### Placeholders Found → Props Mapping

| DOCX Placeholder | React Prop | Type | Source |
| --------------------- | ----------------------- | ----------------- | ----------------------- |
| `[CONTRACT_NUMBER]` | `contractNumber` | `string` | `SubscriptionProps` |
| `[CYBER_LIMIT]` | `cyberCoverageLimit` | `number` | NEEDS NEW PROP |
| `[FOR_EACH_ACTIVITY]` | `activities.map(...)` | `Activity[]` | `SubscriptionProps` |
| `[RETROACTIVE_DATE]` | `retroactiveDate` | `Date` | NEEDS CLARIFICATION |

### Questions

1. Should `[CYBER_LIMIT]` be a new prop or derived from an existing field?
2. Should `[RETROACTIVE_DATE]` be a new prop or reuse `contractStartDate`?

Please confirm or adjust before I generate.

The developer reads this, answers the questions, confirms, and only then does the skill move on. No code has been written at this point.

This gate is the difference between a generator we can use and one we have to babysit. It surfaces every ambiguity at the moment when it's cheap to resolve, and it forces a shared understanding between the developer and the agent.
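
One way to picture the gate is as a plain data check. In practice it is a conversation between the developer and the agent, so the sketch below is only a model of the rule, with hypothetical type and function names: generation stays blocked while any mapping that is not backed by SubscriptionProps lacks a developer decision, or while the plan is unconfirmed.

```typescript
type PropSource = 'SubscriptionProps' | 'NEEDS NEW PROP' | 'NEEDS CLARIFICATION'

interface PlaceholderMapping {
  placeholder: string
  prop: string
  source: PropSource
  decision?: string // the developer's answer, once given
}

interface AnalysisPlan {
  mappings: PlaceholderMapping[]
  confirmedByDeveloper: boolean
}

// Mappings that still need a human answer.
function openQuestions(plan: AnalysisPlan): PlaceholderMapping[] {
  return plan.mappings.filter((m) => m.source !== 'SubscriptionProps' && !m.decision)
}

// The gate: no open question left, and an explicit confirmation.
function canGenerate(plan: AnalysisPlan): boolean {
  return plan.confirmedByDeveloper && openQuestions(plan).length === 0
}
```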

Steps 6 to 8: generate, register, validate

These are the mechanical steps:

  • Generate the TSX files using the validated plan.
  • Register the new template in the entry file, with a TypeScript-exhaustive switch (no default case, ever; we want the compiler to break when we add a new version without handling it).
  • Run npx tsc --noEmit in the documents package.
  • Verify no [PLACEHOLDER] text remains in the generated files.
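
The last check in the list above is easy to picture as code. This is a minimal sketch, assuming the generated files are available as an in-memory map from file name to source text; findLeftoverPlaceholders is a hypothetical helper, not the skill's actual command.

```typescript
interface LeftoverPlaceholder {
  file: string
  line: number // 1-based line number within the file
  token: string
}

// Report every [SCREAMING_SNAKE_CASE] token still present in the generated sources.
function findLeftoverPlaceholders(files: Record<string, string>): LeftoverPlaceholder[] {
  const leftovers: LeftoverPlaceholder[] = []
  for (const [file, source] of Object.entries(files)) {
    source.split('\n').forEach((lineText, index) => {
      for (const match of lineText.matchAll(/\[[A-Z][A-Z0-9_]*\]/g)) {
        leftovers.push({ file, line: index + 1, token: match[0] })
      }
    })
  }
  return leftovers
}
```

Type annotations such as Activity[] do not match because the brackets are empty, but indexing with an upper-case constant would, so a hit is a prompt to look at the line, not proof of a bug.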

If all of that passes, the developer opens the rendered PDF and compares it to the DOCX. The closer the upstream steps got it right, the shorter this final review.

The gotchas, distilled

A skill is essentially the codified version of everything we wish we'd known six months ago. The short list of lessons baked into ours:

Mammoth drops floating frames

Already covered. The XML cross-check runs on every conversion and is fully automatic.

Don't omit custom clauses

Every CP must support custom clauses, but the way they're wired is product-specific and easy to skip. The skill always explicitly checks for them and refuses to generate without confirming where they go.

Callout type matters

Word doesn't distinguish between "callout with a single sentence" and "callout with three paragraphs and a table." React-PDF does. The skill encodes the rule:

| Content Type | Component |
| --- | --- |
| Single paragraph or short text | TextCallout |
| Multiple paragraphs | LayoutCallout |
| Content with lists | LayoutCallout |
| Content with tables | LayoutCallout |
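
Expressed as a function, the rule in the table fits in a few lines. The CalloutContent shape and the chooseCallout name are hypothetical; only the two component names come from our library.

```typescript
// Hypothetical summary of what a callout in the source document contains.
interface CalloutContent {
  paragraphs: number
  hasList: boolean
  hasTable: boolean
}

type CalloutComponent = 'TextCallout' | 'LayoutCallout'

// Anything beyond a single paragraph of plain text needs the layout variant.
function chooseCallout(content: CalloutContent): CalloutComponent {
  if (content.hasList || content.hasTable || content.paragraphs > 1) {
    return 'LayoutCallout'
  }
  return 'TextCallout'
}
```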

Never add a default case in version switches

This one isn't AI-specific, but the skill enforces it. Every product entry file ends in a switch over the version. If we add a default case, TypeScript stops complaining when a new version is unhandled, which means the new version silently uses the wrong template. So: no defaults, ever.

The first time a new template fails to compile because we forgot to register it, we know the safety net works.
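
A minimal sketch of the pattern, assuming strict compiler settings. The RcphVersion union, the templateFor function and the returned strings are illustrative stand-ins for a real entry file, which would return components.

```typescript
// Hypothetical union of the versions that exist for one product.
type RcphVersion = '3.0' | '3.1'

// No default case: the switch covers every member of the union, so the compiler
// accepts the missing return after it. Add '3.2' to RcphVersion without a matching
// case and, under strict settings, this function stops compiling.
function templateFor(version: RcphVersion): string {
  switch (version) {
    case '3.0':
      return 'agreed-terms/rcph/3.0'
    case '3.1':
      return 'agreed-terms/rcph/3.1'
  }
}
```

With a default case in place, the same function would keep compiling after the union grows, and the new version would quietly fall through to whatever the default returns.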

What we keep human

The skill does a lot, but the validation gate at step 5 is intentional. The developer remains in the loop on every non-trivial decision: which props are new, how to handle ambiguous content, whether to introduce a new common chunk or duplicate.

There are two reasons for this.

The first is correctness. Insurance documents are legal artifacts, and we want the AI to draft the work but a human to validate the structure before the first line of code is written.

The second is knowledge transfer. Every time a developer goes through this validation step, they learn a bit more about how our document library is organized. If we made the skill fully autonomous, we'd save a few minutes per template and lose the most useful feedback loop we have for building a shared mental model.

Conclusion

A new template that used to take two days now takes a couple of hours, and the result is more consistent with the rest of the codebase than what a human writing it from scratch would produce.

We didn't reach for an LLM at runtime, and we didn't try to make the model smarter. We took something the model was already capable of doing and built a workflow around it: codebase exploration, mandatory validation gates, encoded domain rules, deterministic post-processing. The AI handles the tedious work; the workflow keeps us honest about the parts where a human stays in the loop.

We're applying the same pattern to other areas: reactor-to-job migrations, dimension scaffolding, and a few more in the pipeline. Skills let us encode institutional knowledge as workflows the agent can follow.

If you find this approach interesting and want to come hack on the next generation of internal AI tooling, or on the insurance system it supports, we're hiring software engineers in Paris. Check out our open positions.