Standard Metrics extracts hundreds of thousands of data points from documents for our VC/PE customers each quarter using an AI + human-in-the-loop parsing process. In the past year, we’ve dramatically increased the volume of documents we parse with AI, expanded to new document types, and improved both accuracy rates and time savings. Today, we’re excited to announce the launch of our latest AI-powered, human-in-the-loop unlock: board decks.
We’ll dive into what our AI document parsing process looks like, why board decks are a bigger challenge than more traditional financial statements for parsing, and what this change means for our customers.
How AI and human-in-the-loop document parsing works
To parse the many documents that portfolio companies send us, we use a multi-step parsing flow that incorporates pre-processing, multiple LLMs, human QA, and constant iteration to continuously improve speed while maintaining high accuracy.
Pre-processing to split large documents into smaller parts and classify those smaller documents by type (PDFs versus Excel files or P&L versus a balance sheet, for example) helps us to provide the right context lengths to LLMs. (More on why that matters can be found here.)
Using different LLMs for different parts of the process helps us reach higher accuracy, which we constantly measure and iterate on based on our evaluations framework. Why can’t we just rely on one? Different tools perform better at different things. One leading LLM, for example, is strong at PDF parsing across vision and text, but struggles more with worksheet-style data like Excel documents. Another, meanwhile, is strong at identifying forecasts versus actuals, but has a small context window.
Human QA by our managed data services team ensures a high accuracy bar, while longer-term error measurement (and process adjustments to address those errors) makes pre-QA AI outputs better and better over time.
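As a rough illustration of the pre-processing and routing steps described above, here is a minimal Python sketch. It is not the production pipeline: the model labels, the `Chunk` type, the `split_pages` and `pick_model` helpers, the 2,000-character limit, and the fallback label are all hypothetical, and the human QA step is not modeled.

```python
from dataclasses import dataclass

# Hypothetical model labels: the post does not name the LLMs involved.
MODEL_BY_DOC_TYPE = {
    "pdf": "vision-and-text-model",
    "excel": "worksheet-model",
}


@dataclass
class Chunk:
    doc_type: str  # e.g. "pdf" or "excel"
    text: str


def split_pages(pages, max_chars=2000):
    """Greedily pack consecutive pages into chunks of at most max_chars.

    A single page longer than max_chars still becomes its own chunk.
    """
    chunks, current = [], ""
    for page in pages:
        # Close the current chunk before it would exceed the limit.
        if current and len(current) + len(page) > max_chars:
            chunks.append(current)
            current = ""
        current += page
    if current:
        chunks.append(current)
    return chunks


def pick_model(chunk):
    """Route a chunk to a model by document type (assumed routing rule)."""
    return MODEL_BY_DOC_TYPE.get(chunk.doc_type, "needs-manual-triage")


pages = ["a" * 1500, "b" * 1000, "c" * 500]
chunks = [Chunk("pdf", text) for text in split_pages(pages)]
print([(len(c.text), pick_model(c)) for c in chunks])
```

Keeping the split and the routing as separate functions means each can be measured and adjusted on its own, which fits the iterate-on-evaluations approach described above.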
Long story short: AI-powered document parsing done right is complex and has required constant iteration over the past year and a half. Now, AI handles the vast majority of documents we process. For a long time, however, board decks remained particularly hard to parse with AI, a challenge that we’re excited to be solving today.
Why we built AI board deck parsing
We knew that portfolio company board deck parsing would be an important unlock for investors. Company decks average 47 pages of content. These are huge documents to search through, and searching them manually is a time-consuming and frustrating process for investors. Moreover, off-the-shelf LLMs are not yet able to handle structured data extraction accurately enough to provide consistent, clean data.
While board decks and PDFs are easy for people to read, the visual formatting that makes them readable doesn’t translate well into structured data. When we look at a chart, we immediately connect bars, labels, periods, and trends. When we see a table, we understand how the relationships between rows, columns, and headers work.
But data extraction software doesn’t see those relationships. It sees separate text elements, numbers, and shapes positioned across the page. It can detect each piece, but it can’t reliably determine the way those pieces work together.
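To make that gap concrete, here is a small Python sketch of the simplest version of the problem: turning loose, positioned text elements back into table rows. It is a simplified illustration rather than a description of any real system. The `(x, y, text)` tuple layout, the `group_into_rows` name, and the `y_tolerance` value are all assumptions made for the example.

```python
def group_into_rows(elements, y_tolerance=3.0):
    """Group (x, y, text) elements into rows, left to right.

    Elements whose y-coordinate is within y_tolerance of the first
    element of the current row are treated as part of that row.
    """
    rows = []
    # Walk the page top to bottom, then left to right.
    for x, y, text in sorted(elements, key=lambda e: (e[1], e[0])):
        if rows and abs(rows[-1]["y"] - y) <= y_tolerance:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})
    # Order each row's cells by x-position and keep only the text.
    return [[text for _, text in sorted(row["cells"])] for row in rows]


# What an extractor might "see": unordered fragments with coordinates.
elements = [
    (120, 50.5, "Q1"),
    (10, 50.0, "Metric"),
    (10, 70.0, "ARR"),
    (120, 71.0, "1.2"),
]
print(group_into_rows(elements))
```

Even this toy case needs a tolerance to decide what counts as the same row. Real decks add merged cells, multi-line headers, and charts, which a simple coordinate heuristic like this one does not handle.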

Moreover, board deck structure is highly variable. And even when a metric is pulled from the correct place and correctly identified, the definitions of metrics commonly found in board decks can differ from company to company.
AI document parsing for board decks addresses this by going beyond basic text extraction. It incorporates layout-aware processing, structural parsing, and metrics-aware context — reconstructing the semantic structure of a page rather than just reading the text on it.
It also handles layers of financial context that generic extraction tools don’t have: detecting fiscal year boundaries, classifying document types, distinguishing actuals from budgets and forecasts, and resolving unit ambiguity across tables. These are the kinds of nuances that determine whether an extracted number is reliable and useful or incorrect and misleading, and they’re not covered by standard document parsing workflows.
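Two of these checks, resolving units and separating actuals from budgets and forecasts, can be sketched in a few lines of Python. This is an illustrative approximation with made-up rules, not the logic Standard Metrics uses: the regular expressions, the function names, and the choice to return `None` or `"unknown"` when no hint is found are all assumptions.

```python
import re

# Hypothetical unit hints as they might appear in a table header.
_UNIT_PATTERNS = [
    (re.compile(r"\bthousands?\b|\$\s*k\b|\$000s?\b", re.I), 1_000),
    (re.compile(r"\bmillions?\b|\$\s*mm?\b", re.I), 1_000_000),
]


def detect_multiplier(header):
    """Return the unit multiplier hinted at in a header, or None if unclear."""
    for pattern, multiplier in _UNIT_PATTERNS:
        if pattern.search(header):
            return multiplier
    return None  # ambiguous: leave for a reviewer rather than guess


def classify_scenario(label):
    """Label a column as budget, forecast, actual, or unknown."""
    if re.search(r"\b(budget|plan)\b", label, re.I):
        return "budget"
    if re.search(r"\b(forecast|projected|projection)\b", label, re.I):
        return "forecast"
    if re.search(r"\bactuals?\b", label, re.I):
        return "actual"
    return "unknown"


def to_number(cell, multiplier):
    """Parse a cell such as '1,250' or '(300)' and apply the multiplier.

    Parentheses are read as a negative value, as is common in financials.
    Raises ValueError if the cell contains no digits.
    """
    stripped = cell.strip()
    negative = stripped.startswith("(") and stripped.endswith(")")
    value = float(re.sub(r"[^\d.]", "", stripped)) * multiplier
    return -value if negative else value


multiplier = detect_multiplier("Revenue ($ in thousands)")
print(classify_scenario("FY24 Budget"), to_number("1,250", multiplier))
```

Returning `None` or `"unknown"` instead of a default keeps an unlabeled number from silently being treated as an actual in dollars, which is the kind of error that makes an extracted figure misleading.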

What AI board deck parsing will change for our customers
With AI board deck parsing, Standard Metrics can get more of your documents parsed more quickly, without sacrificing accuracy. Instead of reading through tens of pages to map metrics by hand, our team can spend the majority of their time reviewing extracted data. Across the board, we’re parsing documents faster and with greater accuracy.
On top of speed, parsing board decks with AI is key to getting more unique datasets into the Standard Metrics platform. Board decks are the primary home for many metrics that matter to investors: ARR, net revenue retention, industry-specific metrics, and many others that simply don’t appear in a P&L, cash flow statement, or balance sheet. Parsing more of this data, structuring it on our platform, and making it accessible to our AI Analyst (and other features) ensures that all relevant KPIs are in one place and usable for audits, downstream analysis, LP reviews, and more.
If you’re not yet a Standard Metrics customer and would like to learn more, fill out the demo form below and we’ll be in touch.
