Date Published

April 7, 2026

I’ve been chasing this analysis for over six years.

Not our just-launched “Standard Metrics Private Market Report: 2025 Year in Review,” but the data to make it real.

Anyone who’s tried to benchmark private companies knows the problem: you never have enough of them, reporting consistently enough, to say something statistically meaningful.

I came into this space to do analytics. That’s what I was hired for at Y Combinator and at Battery Ventures. But private markets data infrastructure wasn’t built for analytics yet. The data was siloed across different systems, nowhere near ready for the kind of analysis I wanted to run. So I became a self-taught data engineer out of necessity, building workflows and data products to lay the data foundation. That infrastructure work ended up consuming most of my time. And even when I did get to the analysis, the sample sizes were never large enough to be statistically defensible.

That changed when I joined Standard Metrics. With roughly 10,000 private companies reporting through our platform, I can now pull insights like: companies in the top quartile of Sales and Marketing (S&M) spend grew ~5x faster than those in the bottom quartile.
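To make the quartile comparison concrete, here is a minimal sketch of that kind of analysis. It uses synthetic data and hypothetical column names, not the actual Standard Metrics dataset or schema: companies are bucketed into quartiles of S&M spend, then median growth is compared across buckets.

```python
import pandas as pd
import numpy as np

# Synthetic data for illustration only; column names are hypothetical.
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({"sm_spend_pct_revenue": rng.uniform(0.05, 0.6, n)})
# Fabricate a growth rate loosely correlated with S&M spend for demo purposes.
df["yoy_growth"] = 0.2 + 2.0 * df["sm_spend_pct_revenue"] + rng.normal(0, 0.1, n)

# Bucket companies into spend quartiles, then compare median growth.
df["sm_quartile"] = pd.qcut(
    df["sm_spend_pct_revenue"], 4, labels=["Q1", "Q2", "Q3", "Q4"]
)
median_growth = df.groupby("sm_quartile", observed=True)["yoy_growth"].median()
multiple = median_growth["Q4"] / median_growth["Q1"]
print(f"Top-quartile spenders grew {multiple:.1f}x faster (synthetic data)")
```

Medians rather than means keep a handful of outlier companies from dominating the comparison, which matters with skewed private-company financials.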

So the team and I built the report. The process turned out to be just as interesting as the findings.

 

What 10,000 companies actually report

What we have at Standard Metrics is something unusual in the private markets: an anonymized dataset of actual company-submitted financial data from roughly 10,000 venture-backed companies reporting to their investors through our platform. Not scraped from pitch decks. Not estimated from headcount proxies. Revenue, expenses, margins, burn, headcount, reported quarterly by the companies themselves.

That matters because the quality of the input determines the quality of the analysis. These companies report through Standard Metrics as part of their investor reporting workflow, which means the data has the same rigor as what goes to their boards. We normalize everything to USD using constant currency rates, segment companies into annualized revenue bins, and apply a minimum threshold of 30 companies per data point before we put a number in a chart.

The dataset we focused on spans from Q1 2021 through Q4 2025, which gives us five full years of quarterly data, going through the ZIRP cycle, the correction, and the AI boom. That depth is what makes trend analysis possible.

This is also the foundation of our paid benchmarking product. The same dataset that powers the Year in Review powers individual portfolio benchmarks for our customers.

 

From question to answer in minutes

To write the report, I needed to explore this data across dozens of cuts: revenue segment, AI vs. non-AI, sector, time period, percentile thresholds. So I opened Claude Code and started building.

What came out was a full interactive dashboard connected to our internal data, with filters for every dimension our team cared about. I didn’t write a spec. I didn’t plan the architecture. I described what we wanted to see, and Claude iterated in real time. When our team wanted a new view, I added it. When a chart wasn’t telling the story clearly, I’d describe what was off and Claude would rework it.

Our team went from “I need to see this data cut” to seeing it in minutes, not days. And then I kept going, adding more filters, more metrics, more ways to slice the dataset, because the friction between question and answer had essentially disappeared.

 

What we cut matters more than what we kept

The dashboard made it trivially easy to explore. I’d drop views into a Claude project, share my takeaways, and our team would dig further into the data together. Need to check whether the AI growth premium holds across revenue segments? Done. Want to understand whether graduation to the next bucket is what’s driving the sub-$1M segment’s struggle? Let’s dig in.

That ease is both the opportunity and the risk.

The questions you ask, the directions you drill into, the way you challenge or accept what the data appears to show, all of that determines where you end up. The LLM is remarkably good at running the analysis you ask for. It’s less good at the judgment around what’s statistically relevant, or interesting but not meaningful. At least for now…

That’s where my six years of waiting came in handy. I’ve been staring at private company financials long enough to know when something in the data doesn’t pass the smell test.

The team softened language where sample sizes were thin. We flagged where conclusions were preliminary and required further digging to speak more confidently. We excluded entire sections that looked compelling in the dashboard but didn’t hold up under scrutiny, or places where Claude made confident statements that lacked true support from the data.

The AI made the analysis faster. My years of experience shaped which questions to ask, what to trust, and what to leave out. The team made sure the findings land clearly for our readers. That’s the real story of working with these tools right now. They’re multipliers, but what you multiply matters.

 

What’s next

This report covered revenue growth, AI dynamics, and spend patterns. We barely scratched the surface.

The reason this matters is that the analysis you can run on 10,000 companies is categorically different from what you can run on 200. You can cut by revenue segment, by sector, by AI classification, and still have enough data in each cell to trust the percentiles. That’s new in private markets. And we have insights on gross margin, EBITDA, burn rates, and more coming soon. We want to hear what questions matter to you; we’re just getting started.

Download the report here. If you’d like your portfolio benchmarked against this dataset, get in touch via the form below.


Automate your portfolio reporting

Find out how you can:

  • Collect a higher volume of accurate data
  • Analyze a robust, auditable data set
  • Deliver insights that drive fund performance