Complex Tables & Layout Extraction

Extract data from multi-page tables, merged cells, nested layouts, and dense financial documents. Where other tools break, anyformat delivers.

Key highlights

  • Multi-page table extraction preserving structure across page breaks
  • Merged cells, row spans, and nested tables handled natively
  • Figure detection and explanation for charts and diagrams
  • Multi-stage pipeline with layout analysis, structure recognition, and field extraction
  • Calibrated confidence scoring on every extracted cell and field


Multi-page tables, merged cells, nested layouts — extracted with structure intact.


The problem: structure is the first casualty

Documents with complex tables and layouts are where most extraction tools fail. Standard OCR reads characters but loses spatial relationships. LLMs process text but cannot reliably reconstruct table structure from visual input. Even specialized tools often convert tables to Markdown, which flattens merged cells, destroys row spans, and strips the relational structure that makes the data meaningful.

The real-world documents that matter most — financial statements with nested sub-tables, medical records with multi-page lab results, insurance policies with coverage matrices, engineering specifications with parameter tables — are exactly the ones that break.

A table that spans three pages with merged header cells and hierarchical row groups is not an edge case. It is Tuesday.


Multi-stage pipeline: layout to structure to extraction

anyformat processes complex documents through a multi-stage pipeline that separates layout analysis, structure recognition, and data extraction into distinct phases:

Stage 1 — Layout analysis: The system identifies regions of the document: text blocks, tables, figures, headers, footers, and page furniture. Each region is classified and spatially anchored.

Stage 2 — Structure recognition: For tables, the system reconstructs the full grid structure: row and column boundaries, merged cells, header hierarchies, and spanning relationships. For multi-page tables, structure is stitched across page breaks with continuity preserved.

Stage 3 — Data extraction: With structure understood, the system extracts values into their correct positions within the recognized structure. The output is structured JSON that preserves every relationship — not flattened text.

This separation matters. When layout analysis, structure recognition, and extraction are collapsed into a single step (as most LLM-based tools do), the system has to solve three problems simultaneously. Errors compound. anyformat solves them in sequence, with each stage validating the previous one.
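
As a rough illustration of why the staged approach helps, the three stages can be sketched as separate functions, each consuming the previous stage's output. Everything here is hypothetical: the `Region` and `Table` types, the function names, and the pipe-delimited input are invented for the example and are not anyformat's actual API or internals.

```python
from dataclasses import dataclass, field

# Hypothetical types for illustration only; not anyformat's API.
@dataclass
class Region:
    kind: str          # "text", "table", "figure", ...
    page: int
    lines: list[str]   # raw text lines found inside the region

@dataclass
class Table:
    header: list[str]
    rows: list[list[str]] = field(default_factory=list)

def analyze_layout(pages: list[list[Region]]) -> list[Region]:
    """Stage 1: flatten the per-page regions into reading order."""
    return [region for page in pages for region in page]

def recognize_structure(regions: list[Region]) -> list[Table]:
    """Stage 2: turn each table region into a header plus rows of cells."""
    tables = []
    for region in regions:
        if region.kind != "table" or not region.lines:
            continue
        header, *body = [line.split("|") for line in region.lines]
        # Check the structure before extraction: rows must match the header width.
        if any(len(row) != len(header) for row in body):
            raise ValueError(f"ragged table on page {region.page}")
        tables.append(Table(header=header, rows=body))
    return tables

def extract_data(tables: list[Table]) -> list[list[dict[str, str]]]:
    """Stage 3: place each value under the header that governs it."""
    return [[dict(zip(t.header, row)) for row in t.rows] for t in tables]

pages = [[
    Region("text", 1, ["Quarterly summary"]),
    Region("table", 1, ["Item|Amount", "Revenue|100", "Costs|60"]),
]]
records = extract_data(recognize_structure(analyze_layout(pages)))
```

Because the width check lives in stage 2, a structural problem surfaces before any value is extracted, rather than showing up later as a silently misaligned field.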


Table structure preservation across page breaks

When a table starts on page 4 and ends on page 7, most tools treat each page fragment independently. The result is four separate partial tables with lost headers, duplicated rows, and broken relationships.

anyformat detects table continuations across page breaks and reconstructs the complete table as a single structure. Headers are associated with all rows they govern, even when the header row is pages away from the data. Row groups and sub-totals maintain their hierarchical relationships throughout.
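
A minimal sketch of the stitching idea follows. It assumes the fragments have already been identified as parts of one table and that continuation pages either repeat the header row verbatim or omit it. The `Fragment` type and `stitch` function are hypothetical names for illustration, not anyformat's implementation.

```python
from dataclasses import dataclass

# Hypothetical type: one piece of a table as it appears on a single page.
@dataclass
class Fragment:
    page: int
    rows: list[list[str]]   # the first row may be a repeated header

def stitch(fragments: list[Fragment]) -> dict:
    """Merge per-page fragments into one table with a single header."""
    ordered = sorted(fragments, key=lambda f: f.page)
    header = ordered[0].rows[0]   # assumes the first fragment carries the header
    body = []
    for frag in ordered:
        rows = frag.rows
        # Continuation pages often repeat the header; keep only one copy.
        if rows and rows[0] == header:
            rows = rows[1:]
        for row in rows:
            # Every row is keyed by the header, even if the header is pages away.
            body.append({"page": frag.page, "cells": dict(zip(header, row))})
    return {"header": header, "rows": body}

fragments = [
    Fragment(5, [["Widget B", "20"]]),                     # no header on this page
    Fragment(4, [["Item", "Qty"], ["Widget A", "10"]]),    # table starts here
    Fragment(6, [["Item", "Qty"], ["Widget C", "30"]]),    # header repeated
]
table = stitch(fragments)
```

The result is one table with three data rows, each tagged with its source page, instead of three partial tables with missing or duplicated headers.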


Merged cell and nested layout handling

Merged cells — both horizontal and vertical spans — are a persistent source of extraction errors. A cell that spans three rows creates ambiguity: which row does the value belong to? A header that spans five columns groups those columns semantically, but most tools lose that grouping.

anyformat explicitly models cell spans in its output structure. A merged cell is represented with its span coordinates, not duplicated across rows or collapsed into the first cell. Nested tables (tables within table cells) are recursively extracted as structured sub-objects.


Figure detection and classification

Complex documents contain more than text and tables. Charts, diagrams, photographs, signatures, stamps, and embedded images carry information that text extraction misses entirely.

anyformat detects figures within documents, classifies them by type (chart, diagram, photograph, signature), and generates structured descriptions that capture what the visual element represents in context. This is particularly valuable for technical documents, inspection reports, and scientific papers where figures are load-bearing content.


Confidence scoring per cell

Not every cell in a complex table is equally easy to extract. A clearly printed numeric value in a well-structured table might deserve 99% confidence. A handwritten annotation in a merged cell spanning a page break might deserve 60%.

anyformat assigns calibrated confidence scores at the cell level, not just the document or field level. This means downstream systems and human reviewers know exactly which values to trust and which to verify. The cost of a wrong value in a financial table or medical record is not abstract — per-cell confidence makes review efficient and targeted.
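
A downstream consumer could use per-cell scores to build a targeted review queue. The sketch below assumes each cell is a dictionary with a `confidence` value between 0 and 1; that field name, the scale, and the 0.9 threshold are illustrative assumptions, not documented anyformat output.

```python
def review_queue(cells: list[dict], threshold: float = 0.9) -> list[dict]:
    """Return cells below the threshold, least confident first."""
    flagged = [c for c in cells if c["confidence"] < threshold]
    return sorted(flagged, key=lambda c: c["confidence"])

# Hypothetical extracted cells from one table.
cells = [
    {"row": 0, "col": 1, "value": "4,200.00", "confidence": 0.99},
    {"row": 3, "col": 2, "value": "see note", "confidence": 0.60},
    {"row": 5, "col": 1, "value": "1,815.50", "confidence": 0.85},
]
queue = review_queue(cells)
```

Here only two of the three cells reach a human reviewer, and the least certain one comes first.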


No lossy Markdown conversion

Many extraction tools, including LlamaParse and other RAG-focused parsers, convert documents to Markdown as an intermediate representation. Markdown is a lightweight text markup format. Its common pipe-table syntax has no way to express row spans, column spans, or nested tables.

When a table with merged cells, hierarchical headers, and multi-page spans is converted to Markdown, the result is a pipe-delimited grid that has lost most of its structural information. That loss is usually unrecoverable — downstream extraction cannot reconstruct what the conversion destroyed.
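
The loss can be shown with a small example. The function below is one plausible way a converter might flatten span-aware cells into a pipe table (some tools duplicate the merged text instead). The cell dictionaries and the `col_span` field are hypothetical.

```python
def to_markdown(n_rows: int, n_cols: int, cells: list[dict]) -> str:
    """Flatten cells into a pipe table; span information has nowhere to go."""
    grid = [[""] * n_cols for _ in range(n_rows)]
    for cell in cells:
        # Only the anchor position keeps the text; spans are silently dropped.
        grid[cell["row"]][cell["col"]] = cell["text"]
    lines = ["| " + " | ".join(row) + " |" for row in grid]
    lines.insert(1, "|" + " --- |" * n_cols)   # header separator row
    return "\n".join(lines)

# A header that spans two columns...
merged = [
    {"row": 0, "col": 0, "text": "Sales", "col_span": 2},
    {"row": 1, "col": 0, "text": "Q1"},
    {"row": 1, "col": 1, "text": "Q2"},
]
# ...and a different table where the second header cell is simply empty.
unmerged = [
    {"row": 0, "col": 0, "text": "Sales"},
    {"row": 0, "col": 1, "text": ""},
    {"row": 1, "col": 0, "text": "Q1"},
    {"row": 1, "col": 1, "text": "Q2"},
]
same = to_markdown(2, 2, merged) == to_markdown(2, 2, unmerged)
print(same)  # True
```

Two structurally different tables produce identical Markdown, so nothing downstream can tell whether "Sales" governed one column or both.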

anyformat outputs structured JSON that preserves the complete table structure. Row spans, column spans, header hierarchies, cell types, and positional relationships are all retained. No intermediate Markdown step. No lossy conversion.

Reducto's RD-TableBench benchmark demonstrates how challenging complex table extraction is. anyformat addresses that challenge by preserving structure all the way through the pipeline, from layout analysis to final JSON output.


Visual grounding: see what the system sees

Every extracted value in anyformat is visually grounded — linked back to its exact position in the source document. When a reviewer questions a value, they can see the bounding box on the original document, verifying not just the extracted text but where the system found it.

For complex layouts where the same number might appear in multiple table cells, visual grounding eliminates ambiguity about which cell was extracted.
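
To make the disambiguation concrete, here is a minimal sketch in which each extracted value carries a page number and a bounding box. The `GroundedValue` type, the `(x0, y0, x1, y1)` coordinate convention, and the `values_at` helper are assumptions for illustration, not anyformat's output format.

```python
from dataclasses import dataclass

# Hypothetical grounded value: text plus where it was found.
@dataclass(frozen=True)
class GroundedValue:
    text: str
    page: int
    bbox: tuple[float, float, float, float]   # (x0, y0, x1, y1) in page units

def values_at(values: list[GroundedValue], page: int,
              x: float, y: float) -> list[GroundedValue]:
    """Return the values whose bounding box contains the point (x, y)."""
    return [
        v for v in values
        if v.page == page
        and v.bbox[0] <= x <= v.bbox[2]
        and v.bbox[1] <= y <= v.bbox[3]
    ]

# The same number appears in two different cells on page 3.
values = [
    GroundedValue("1,250", 3, (100.0, 200.0, 160.0, 215.0)),
    GroundedValue("1,250", 3, (100.0, 240.0, 160.0, 255.0)),
]
hits = values_at(values, 3, 120.0, 245.0)
```

The text alone cannot distinguish the two values, but the bounding box can: the lookup returns only the second one.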


Built for the documents that break everything else

If your documents are simple single-page forms with consistent layouts, most tools will work. If your documents contain multi-page tables with merged cells, nested structures, figures, and handwritten annotations, you need a pipeline built for complexity.

Try anyformat on your most complex documents →


anyformat is the document intelligence platform built for enterprises that process complex, high-stakes documents. ISO 27001 certified, GDPR-compliant, with zero-retention processing and on-premise deployment. Learn more at anyformat.ai

Frequently asked questions

Can anyformat extract data from multi-page tables?

Yes. anyformat's multi-stage pipeline preserves table structure across page breaks and handles merged cells and nested layouts. Unlike tools that output Markdown and lose structural information along the way, anyformat outputs structured JSON that preserves the relationships between cells, rows, and headers.

How does anyformat handle complex document layouts?

The extraction pipeline includes layout analysis, structure recognition, and contextual field extraction. It handles multi-column layouts, tables within tables, headers spanning multiple rows, and mixed content (text, tables, and images) within one pipeline run.

What makes anyformat different from OCR tools on complex documents?

Standard OCR returns raw text or bounding boxes. anyformat goes further — it understands the semantic structure of the document, extracting fields into structured JSON with confidence scoring. The multi-stage pipeline handles edge cases that break simpler tools.

Stop processing documents manually

Book a demo and see how teams cut manual document processing by 5x with anyformat.

Contact:

info@anyformat.ai
Funded by the European Union – NextGenerationEU · Government of Spain – Ministry for Digital Transformation and the Civil Service · Recovery, Transformation and Resilience Plan · Community of Madrid

Copyright © 2026 anyformat.ai · Enterprise Document Operations Automation
