The Companies House Free Data Stack at Eleven: What the APIs, Bulk Snapshots and PSC Files Actually Cover in 2026 — and the Six Gaps the Open Register Still Leaves
Eleven years on from the launch of the Free Company Information API, the Companies House data stack covers the current state of five million companies exhaustively. A guide to the bulk files, the streams and the six gaps that still define the open register.

For an organisation that holds the master record of every UK company, Companies House gives an extraordinary amount of its data away. Eleven years after the Free Company Information API launched in beta in 2015, the Registrar's published data products have quietly become the load-bearing substrate of Britain's anti-money-laundering, journalism and credit-decision economies. A single API key — taken out in minutes, free of charge — buys read access to roughly five million live company profiles, fifty million filings and the entire People with Significant Control register, all under Open Government Licence v3.
That, at least, is the headline. Spend a few weeks inside the stack and the texture is more interesting. The free tier covers the current state of the register comprehensively, but anything historical, anything inside an accounts iXBRL file, anything that requires joining across companies, and increasingly anything ECCTA has flagged as suppressible — all of that sits behind a self-service wall the data products do not breach. Britain's open register is open, but it is also flat, present-tense and selectively redacted. This is what each product does and does not include in mid-2026.
The Four Free Channels
Companies House publishes data on four broad rails. They are not interchangeable: the rate limits, refresh cycles and joinability differ enough that any serious downstream user runs at least three of them in parallel.
| Channel | Coverage | Refresh | Rate limit / size | Best for |
|---|---|---|---|---|
| Public Data API (REST) | Company profile, officers, filing history index, PSCs, charges, exemptions | Real-time | 600 calls per 5 min per key | Per-company lookups, profile pages |
| Streaming API | JSON event feed across companies, filing-history, officers, PSCs, charges, insolvency-cases, disqualified-officers | Real-time, sub-second on busy streams | Persistent HTTPS connection per stream | Change-data-capture, monitoring portfolios |
| Document API | The actual filed documents (PDF, iXBRL, JSON for selected types) | Real-time | 600 per 5 min per key, shared with REST | Pulling accounts, mortgages, CS01s |
| Bulk Product files | Snapshot archives of basic company data, PSCs, disqualifications, mortgages | Daily or monthly depending on file | Single download, up to ~1.2 GB | Bulk-load into a warehouse, point-in-time captures |
Two technical notes catch new users. First, the REST and Document APIs share the 600-per-five-minutes limit per key — Companies House does not give the document path its own quota, so anyone scraping iXBRL across thousands of companies must throttle. Second, the streaming feeds carry every event for every company on the register, not a filtered subset. You must filter client-side. On a normal weekday the all-companies stream emits roughly 200,000 events; the filing-history stream perhaps 80,000.
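The shared quota is easiest to respect with a single limiter that every REST and Document API call passes through. The sketch below is a minimal rolling-window throttle. It assumes the 600-per-five-minutes limit behaves like a sliding window, which the figures above do not confirm, and the class and parameter names are illustrative rather than part of any Companies House client.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any rolling `window_s` seconds."""

    def __init__(self, max_calls=600, window_s=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # times of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window_s:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest call leaves it.
            self._sleep(self.window_s - (now - self._stamps[0]))
```

Calling `limiter.acquire()` immediately before each request, on both the REST and Document paths, keeps one key under the shared ceiling. If the real limit turns out to be a fixed window, the same object is merely more conservative than it needs to be.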
What the Profile API Actually Returns
The company profile endpoint — /company/{number} — is the most-used path on the API by a wide margin. As of 2026 it returns the fields you would expect: registered office, company type, status, accounts and confirmation-statement reference dates, SIC codes, and a layer of new ECCTA additions. Those additions are worth listing because they are the most recent change to the schema and are not yet uniformly populated.
- verifications.entity: flag indicating the company has filed the lawful-purpose statement required at the first confirmation statement after the March 2024 commencement
- registered_office_is_in_dispute: set when the registrar is investigating an address complaint under the appropriate address rule
- has_super_secure_pscs: extended to flag the new s.790ZG suppression cases
- super_secure_managing_officer_count: count of officers whose details are protected
- has_charges: boolean, retained from the pre-ECCTA schema
Notable in their absence: the registered email address (held by the registrar but not exposed publicly, by design); the audit firm name (it lives only inside the auditor's report inside the accounts iXBRL, not in any profile field); and any historic version of the profile. The API serves the current state. If you want to know whether a director's correspondence address was the residential address two years ago, the Document API will let you fetch the AD01 filing that changed it — but you will need to know the filing exists, and the profile API will not tell you.
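Because the newer fields are not uniformly populated, a consumer has to read them defensively. The following sketch builds (without sending) an authenticated request for the profile endpoint and extracts the flags named above, returning None where a field is missing. The base URL and the Basic-auth convention (API key as username, empty password) should be checked against the current developer documentation, the helper names are hypothetical, and treating verifications.entity as a nested object is an assumption.

```python
import base64
import urllib.request

# Assumption: public REST base URL; verify against the current developer docs.
BASE_URL = "https://api.company-information.service.gov.uk"

# Top-level profile flags as named in the article.
ECCTA_FIELDS = (
    "registered_office_is_in_dispute",
    "has_super_secure_pscs",
    "super_secure_managing_officer_count",
    "has_charges",
)


def build_profile_request(company_number, api_key):
    """Build, but do not send, a GET request for /company/{number}."""
    # HTTP Basic auth: the API key is the username and the password is empty.
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    req = urllib.request.Request(f"{BASE_URL}/company/{company_number}")
    req.add_header("Authorization", f"Basic {token}")
    return req


def eccta_flags(profile):
    """Return the article's flags from a parsed profile; None = not populated."""
    flags = {name: profile.get(name) for name in ECCTA_FIELDS}
    # Assumption: `verifications.entity` is a nested object in the JSON.
    flags["verifications.entity"] = (profile.get("verifications") or {}).get("entity")
    return flags
```

A caller would pass the request to `urllib.request.urlopen`, parse the body with `json.load`, and hand the resulting dict to `eccta_flags`. Distinguishing None from False matters here: an unpopulated flag is not the same as a negative one.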
The Bulk Files in 2026
For anyone running a downstream system at scale, the Bulk Product files are where most of the work happens. The current file inventory:
| File | Format | Frequency | Approx. size | Notes |
|---|---|---|---|---|
| BasicCompanyDataAsOneFile | CSV, zipped | Monthly, 1st | ~550 MB compressed, ~5.2 m rows | Company name, number, type, status, accounts/CS dates, SIC, address |
| PSC snapshot | JSONL, zipped | Daily | ~1.2 GB | Every PSC record across the register; includes superseded entries |
| Mortgages data | JSON, zipped | Daily | ~250 MB | Charges register dump, including satisfied charges |
| Disqualified directors | CSV / JSON | Daily | <50 MB | Live disqualification orders and undertakings |
| Officers snapshot | — | — | — | Not published as a bulk file. Officers can only be retrieved per company via the API |
That last row is the single most common surprise for new users. Companies House has never published an officers bulk file. The register holds well over thirty million current and resigned officer appointments, but there is no zipped snapshot. Anyone wanting a full directors graph (to spot mass directorships, phoenix patterns or connected-party networks) has to enumerate the register via the API, at 600 calls per five minutes per key. That ceiling works out at 172,800 calls a day, so a single, unparallelised crawl of roughly five million companies takes about a month at one call per company, and longer once paginated officer lists and retries are counted.
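The crawl-time figure follows directly from the rate limit, and it is worth being able to rerun the arithmetic for a different population or call pattern. This is a back-of-envelope sketch: the 5.2 million row count comes from the bulk file table above, while the calls-per-company multiplier is an assumption you would have to measure for your own crawl.

```python
def crawl_days(companies, calls_per_company=1.0,
               calls_per_window=600, window_minutes=5):
    """Days needed to crawl `companies` with one key at the stated rate limit."""
    windows_per_day = 24 * 60 / window_minutes        # 288 five-minute windows
    calls_per_day = calls_per_window * windows_per_day  # 172,800 with defaults
    return companies * calls_per_company / calls_per_day


if __name__ == "__main__":
    # One officers-list call per company: the floor for a single key.
    print(round(crawl_days(5_200_000), 1))
    # Hypothetical 30% overhead for pagination and retries.
    print(round(crawl_days(5_200_000, calls_per_company=1.3), 1))
```

At exactly one call per company the crawl lands at around thirty days, so any overhead at all pushes it past a month. Adding keys scales throughput only if the limit is genuinely per key, as stated above.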
The PSC Snapshot and the TRS Divergence
The daily PSC snapshot is, for AML purposes, the single most valuable bulk file Companies House publishes. It dumps every PSC record — including resigned PSCs, statements of "no individual identified yet", and the rarely seen relevant legal entity entries — as one JSON document per line. As of mid-2026 it runs to roughly 7.6 million records covering approximately 4.1 million entities.
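A file of this size is best consumed as a stream, one line at a time, rather than loaded whole. The sketch below tallies records by kind and counts ceased entries. It assumes each line is an object with a `data` member carrying `kind` and, for ceased records, `ceased_on`; those key names should be verified against a real snapshot before relying on them.

```python
import json
from collections import Counter


def summarise_psc_lines(lines):
    """Tally PSC records by kind from an iterable of JSON lines."""
    kinds = Counter()
    ceased = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        # Assumption: the PSC payload sits under a "data" key.
        data = record.get("data", {})
        kinds[data.get("kind", "unknown")] += 1
        if data.get("ceased_on"):
            ceased += 1
    return {"kinds": kinds, "ceased": ceased}
```

Because the function takes any iterable of strings, an open file handle over the unzipped snapshot can be passed straight in, and memory use stays flat regardless of file size.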
What it does not contain: trust beneficial ownership. UK trusts holding a controlling interest in a UK company must register on HMRC's Trust Registration Service, but those records are not on Companies House. The TRS sits behind a closed access regime under the Money Laundering Regulations 2017 as amended, queryable only by obliged AML entities with a legitimate-interest case. The PSC snapshot will tell you a trustee has been declared as a relevant legal entity, but not who the beneficiaries are. That gap is the single biggest reason serious due-diligence work cannot stop at the Companies House data products.
The Six Gaps That Persist
Eleven years in, six structural gaps still define what the free Companies House data stack will not give you.
- No historical state. The REST and Document APIs return the current profile. To reconstruct a company's state on a past date you must replay filings forward from incorporation, or run your own snapshotting against the bulk files. Companies House keeps no time-machine endpoint.
- No structured accounts data. Accounts are filed as iXBRL, and the scope of that requirement has widened progressively since the 2011 mandate, but Companies House publishes the documents, not a parsed dataset. Pulling balance sheet, P&L or note-level data means iXBRL extraction at your end. The Financial Reporting Council's audit firm taxonomy is similarly absent.
- No officers bulk. As noted, the only way to enumerate appointments at scale is per-company crawling. A network-graph view of UK director relationships — the building block of phoenix and connected-party detection — has to be reconstructed.
- No joins across entities. The API is keyed on company number. If you want every company at a single registered office, every appointment of a single officer, or every filing by a single agent (the ACSP regime), you have to enumerate. Companies House offers a single advanced-search endpoint for officers, capped at one thousand results per query.
- Suppression now reduces what bulk contains. ECCTA's s.1088 personal-information suppression regime has, since May 2024, allowed individuals to apply for residential address, day of birth, signature and former-name redactions across the public record. The bulk files reflect the redacted version. By June 2026 an estimated 760,000 applications had been granted as the backlog cleared. Any longitudinal dataset compiled before a redaction was applied still holds the unredacted version: a quiet data-protection asymmetry between bulk users who started in 2023 and those who started in 2026.
- Cross-register data is out of scope. The Register of Overseas Entities, the Mutuals Public Register at the FCA, the Charity Commission register and the Trust Registration Service are all separate. Companies House links to the first two from individual company profiles where relevant; it does not provide a joined dataset.
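The first gap, the lack of historical state, is the one most users close themselves by keeping each bulk snapshot and diffing consecutive ones. The sketch below shows the idea on BasicCompanyData-style CSVs. The column names (CompanyNumber, CompanyName, CompanyStatus) are assumptions to check against the real file header, and at full register scale this comparison would more plausibly run in a warehouse than in an in-memory dict.

```python
import csv
import io


def load_snapshot(fileobj, key="CompanyNumber",
                  fields=("CompanyName", "CompanyStatus")):
    """Read a snapshot CSV into {company_number: {field: value}}."""
    snapshot = {}
    for row in csv.DictReader(fileobj):
        # Strip header names and values: stray whitespace is common in bulk CSVs.
        clean = {k.strip(): (v or "").strip()
                 for k, v in row.items() if k is not None}
        snapshot[clean[key]] = {f: clean.get(f, "") for f in fields}
    return snapshot


def diff_snapshots(old, new):
    """Return (added, removed, changed) company numbers between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {n: (old[n], new[n])
               for n in old.keys() & new.keys() if old[n] != new[n]}
    return added, removed, changed


if __name__ == "__main__":
    header = "CompanyName, CompanyNumber,CompanyStatus\n"
    january = load_snapshot(io.StringIO(header + "ACME LTD,00000001,Active\n"))
    february = load_snapshot(io.StringIO(header + "ACME LTD,00000001,Liquidation\n"))
    print(diff_snapshots(january, february))
```

Storing the per-month diffs rather than only the latest file is what turns a present-tense register into a point-in-time one. It also means the archive inherits whatever was visible at capture time, which is the suppression asymmetry described in the list above.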
What ECCTA Has Already Changed
Two structural changes to the data products since the Economic Crime and Corporate Transparency Act came into force are now bedded in. The first is the suppression regime above, quietly the largest single shift in a decade in what the bulk files reveal. The second is the introduction of verified status flags for directors and PSCs, which began rolling into the REST API schema in April 2025. When identity verification becomes mandatory at incorporation in late 2026, those flags will give the profile API a binary cleanliness signal that has never existed before. The flag will not appear in the bulk product until the migration is complete, which the registrar's published timetable puts in Q1 2027.
What ECCTA has not changed: the absence of an officers bulk, the absence of historical state, and the absence of structured accounts data. Each of those would require investment the Registrar has not, on the public record, committed to.
Where to Look Beyond Companies House
For anyone whose work depends on Britain's company register, the practical answer in mid-2026 is to treat Companies House as the master current-state feed and layer five supplementary sources around it: the Insolvency Service's Individual Insolvency Register and disqualification timelines; the FCA Mutuals Public Register for co-operatives and community benefit societies; the Charity Commission for England and Wales (with OSCR in Scotland and the CCNI in Northern Ireland) for charity finances; HM Land Registry's Overseas Companies That Own Property in England and Wales dataset for ROE corroboration; and your own snapshotting infrastructure for anything historical.
The Registrar gives an extraordinary amount of data away. The cost of taking that data and turning it into something usable for AML, journalism, credit or research is the unpublished, undocumented work of bridging the six gaps above — and that is what the supervised due-diligence market has, in effect, always been selling.