What Is SEC Form 13F? Filings Data Explained

What Is SEC Form 13F?

SEC Form 13F is a quarterly disclosure that the U.S. Securities and Exchange Commission requires from institutional investment managers who exercise investment discretion over at least $100 million in Section 13(f) securities. Each filing's information table lists the manager's holdings of those securities, which are mostly U.S. exchange-traded equities, with the issuer name, CUSIP, share count, and market value of each position as of the end of that quarter. It's how the public tracks who owns what across the market, from hedge funds to pension funds to banks.

Because thousands of managers file every quarter, the combined 13F dataset — available directly from the SEC's EDGAR system — spans 16+ million rows. It's one of the most interesting publicly available financial datasets, and one of the messiest to work with: it arrives as raw structured filings with almost no documentation.

In this video, we walk through loading the 13F holdings data into VerbaGPT and using the Data Notes feature to automatically generate that missing documentation.

What's in the Dataset

The holdings table alone contains 16+ million rows. Key fields include ACCESSION_NUMBER, CIK (the filer's SEC identifier), FILING_DATE, NAMEOFISSUER, CUSIP, and VALUE. The dataset sounds straightforward — but there's a critical catch.

The VALUE column changed units in 2023: filings made before January 3, 2023 report values in thousands of dollars, while filings made on or after that date report values rounded to the nearest dollar. VerbaGPT caught and documented this automatically.

This is exactly the kind of subtle, business-critical data nuance that gets missed when analysts dive into querying before understanding the data. It's also the kind of thing that lives in someone's head, not in any documentation — until now.
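To make the unit change concrete, here is a minimal Python sketch of the conversion a cross-year analysis would need. It assumes the cutoff is keyed on FILING_DATE with January 3, 2023 as the first date reported in whole dollars; the helper name `value_in_dollars` and the sample rows are hypothetical and not part of the dataset or of VerbaGPT.

```python
from datetime import date

# Assumption: filings made on or after this date report VALUE in whole dollars.
UNIT_CHANGE_DATE = date(2023, 1, 3)


def value_in_dollars(value: int, filing_date: date) -> int:
    """Return VALUE in whole dollars regardless of when the filing was made."""
    if filing_date < UNIT_CHANGE_DATE:
        return value * 1000  # older filings report thousands of dollars
    return value


# Hypothetical rows: the same position reported before and after the change.
rows = [
    {"NAMEOFISSUER": "EXAMPLE CORP", "FILING_DATE": date(2022, 11, 14), "VALUE": 1500},
    {"NAMEOFISSUER": "EXAMPLE CORP", "FILING_DATE": date(2023, 2, 14), "VALUE": 1600000},
]

normalized = [value_in_dollars(r["VALUE"], r["FILING_DATE"]) for r in rows]
print(normalized)
```

Without the conversion, the second row would look roughly a thousand times larger than the first, even though the position barely changed.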

How Data Notes Work

Instead of querying the data immediately, VerbaGPT was first tasked with investigating the structure: examining the schema, sampling rows, cross-referencing the SEC's official documentation, and generating a coherent data dictionary. The model identified field patterns, inferred the dataset's structure, and flagged anomalies like the unit change.
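The schema-and-sample step described above can be approximated by hand. The sketch below is not VerbaGPT's implementation; it is a small illustration, using Python's built-in sqlite3 module and a hypothetical `holdings` table with made-up rows, of the per-column facts that a person or a model could turn into a data dictionary.

```python
import sqlite3


def draft_data_dictionary(conn, table, sample_size=3):
    """Collect basic per-column facts as raw material for data notes."""
    cur = conn.cursor()
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = [(row[1], row[2]) for row in cur.execute(f'PRAGMA table_info("{table}")')]
    notes = {}
    for name, declared_type in columns:
        non_null, distinct, lowest, highest = cur.execute(
            f'SELECT COUNT("{name}"), COUNT(DISTINCT "{name}"), '
            f'MIN("{name}"), MAX("{name}") FROM "{table}"'
        ).fetchone()
        samples = [r[0] for r in cur.execute(
            f'SELECT DISTINCT "{name}" FROM "{table}" '
            f'WHERE "{name}" IS NOT NULL LIMIT ?', (sample_size,)
        )]
        notes[name] = {
            "type": declared_type,
            "non_null": non_null,
            "distinct": distinct,
            "min": lowest,
            "max": highest,
            "samples": samples,
        }
    return notes


# Hypothetical in-memory table with placeholder identifiers and values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holdings "
    "(CIK TEXT, FILING_DATE TEXT, NAMEOFISSUER TEXT, CUSIP TEXT, VALUE INTEGER)"
)
conn.executemany(
    "INSERT INTO holdings VALUES (?, ?, ?, ?, ?)",
    [
        ("0000000001", "2022-11-14", "EXAMPLE CORP", "000000AA1", 1500),
        ("0000000001", "2023-02-14", "EXAMPLE CORP", "000000AA1", 1600000),
        ("0000000002", "2023-02-14", "SAMPLE INC", "000000BB2", None),
    ],
)

notes = draft_data_dictionary(conn, "holdings")
for column, facts in notes.items():
    print(column, facts)
```

A profile like this only surfaces symptoms, such as an unusually wide gap between the minimum and maximum of VALUE. Explaining that gap still requires checking the SEC's documentation, which is the cross-referencing step described above.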

The result is a set of Data Notes — a contextual layer that describes the dataset in plain English. From that point on, every query VerbaGPT makes uses this context automatically. It knows what CIK means, what the VALUE column represents, and that the unit shift requires a conversion when comparing pre- and post-2023 data.
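One way to picture notes being applied to every query is as context prepended to each question. The Python sketch below is purely illustrative: `DATA_NOTES`, `build_prompt`, and the note wording are hypothetical, and nothing here reflects how VerbaGPT actually stores or applies its Data Notes.

```python
# Hypothetical notes keyed by column name; the wording is illustrative only.
DATA_NOTES = {
    "CIK": "The filer's SEC identifier.",
    "VALUE": (
        "Market value. Thousands of dollars before the 2023 unit change, "
        "whole dollars after; convert before comparing across years."
    ),
}


def build_prompt(question: str, notes: dict[str, str]) -> str:
    """Prepend the saved notes to a question so each query carries the same context."""
    lines = ["Data notes:"]
    lines += [f"- {column}: {note}" for column, note in sorted(notes.items())]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


prompt = build_prompt("Which filers grew their total VALUE year over year?", DATA_NOTES)
print(prompt)
```

The point of the design is that the notes are written once and reused, so the unit conversion does not depend on whoever asks the question remembering it.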

Key Takeaways

Public datasets are rarely well-documented — AI can generate that documentation for you
Data Notes are created once and used automatically in every subsequent query
VerbaGPT caught the VALUE unit change that would have silently corrupted any cross-year analysis
The same approach works for your private databases, not just public data

This video pairs directly with the written tutorial below — which goes deeper into the philosophy of iterative documentation and why waiting for "perfect" data docs is the wrong mental model.

Frequently Asked Questions

Who has to file a 13F?

Any institutional investment manager — including hedge funds, mutual funds, pension funds, banks, and insurance companies — that exercises investment discretion over $100 million or more in Section 13(f) securities must file Form 13F with the SEC each quarter.

What's in the 13F information table?

The information table attached to each 13F filing lists every reportable security the manager held: issuer name, CUSIP, share or principal amount, and market value as of the end of the quarter.

Where can I get 13F filings data?

13F filings are public and can be downloaded directly from the SEC's EDGAR system. The combined dataset across all filers spans 16+ million rows — which is why it's used as a sample dataset in the walkthrough above.