Unstructured

ETL for unstructured data preprocessing

AI Infrastructure · Founded 2022 · #19 of 55 in AI Infrastructure

unstructured.io · GitHub · PyPI

Company data checked July 13, 2026


Company research

Current company data

No company research card is published for Unstructured yet. The current company data below lists package, repository, and discussion metrics AI-Buzz can inspect today; AI-Buzz publishes a card when recent public metrics show a measured change with dated evidence and cited sources.

Package downloads (30d)

4.8M/30d

Dependent projects

301

dependents · latest

GitHub stars

15.1K

Hacker News

45/30d

Note: Public metrics are incomplete. They do not show private usage, paid use, customer count, or product quality, and current data alone does not prove a trend.


Company profile

What is Unstructured?

Unstructured extracts and transforms unstructured data from sources such as PDFs, images, and emails into clean, structured formats. Its tools are used to prepare input data for large language models (LLMs) and other AI applications.

Latest company data

Metric dates vary by source

Metric dates

PyPI downloads: July 12, 2026
PyPI dependents: July 13, 2026
GitHub stars: July 13, 2026
Hacker News mentions: July 13, 2026

Key metric

Package downloads (30d)

4.8M/30d

Tracked package: PyPI unstructured

▼ -17%

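As a rough illustration of how the change figure relates to the download count: if the -17% is measured against the preceding 30-day window (an assumption; the page does not state the comparison period), the earlier value can be backed out as sketched below. Both inputs are rounded on the page, so the result is approximate, and the helper name `implied_previous` is purely illustrative.

```python
def implied_previous(current: float, pct_change: float) -> float:
    """Back out the earlier value from a current value and a percent change."""
    return current / (1 + pct_change / 100)


current_downloads = 4.8e6  # 4.8M/30d as shown on the page (rounded)
change_pct = -17.0         # shown change; the comparison window is an assumption

previous = implied_previous(current_downloads, change_pct)
print(f"Implied earlier 30-day downloads: {previous / 1e6:.1f}M")
```

Because the published figures are rounded, the implied earlier value should be read as an estimate rather than a measured count.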

Additional metrics

3 metrics

Dependent projects: 301 (projects depending on tracked package: PyPI unstructured)

GitHub stars: 15.1K (main repository stars)

Hacker News: 45/30d (position #10 in category discussion)

Repository health

Maintenance data from the main open-source repository.

Key-person risk

External contributors

% of recent contributors outside the core team

Releases (30d)

Repository usage

Public repositories and source files importing packages tied to Unstructured.

Repos importing

3.4K · 0%

About Unstructured

Founders: Brian Raymond, Matthew Harrison

unstructured.io · github.com/Unstructured-IO/unstructured · PyPI: unstructured

Disclosed funding

Disclosed $105M · 3 rounds

Disclosed funding records total $105M. Category position #19 of 55 in AI Infrastructure.

Series B (2024): $40M · Menlo Ventures

Series A (2024): $40M · Menlo Ventures

Seed (2023): $25M · Madrona

Investors

Madrona, Menlo Ventures

Tracked packages (1)

1 PyPI

unstructured · PyPI · Main PyPI package

ETL for unstructured data preprocessing
