Unstructured

ETL for unstructured data preprocessing

AI Infrastructure · Founded 2022 · #16 of 55 in AI Infrastructure

Updated June 10, 2026

Company profile

What is Unstructured?

Unstructured extracts and transforms unstructured data from sources such as PDFs, images, and emails into clean, structured formats. Its tools prepare input data for large language models (LLMs) and other AI applications, which helps organizations build more accurate AI solutions.

Company research

Data as of June 10, 2026

No company research is published for this company yet.

History

Research history

8 research updates

AI Infrastructure · Unstructured: GitHub commits decreased 100% over the last 30 days · Latest
AI Infrastructure · Unstructured: GitHub commits decreased 100% over the last 30 days · 2 metrics
AI Infrastructure · Unstructured: GitHub commits decreased 100% over the last 30 days · 2 metrics
AI Infrastructure · Unstructured: GitHub commits decreased 100% over the last 30 days · 2 metrics
AI Infrastructure · Unstructured: GitHub commits decreased 96.7% over the last 30 days · 2 metrics
AI Infrastructure · Unstructured: GitHub commits decreased 97.4% over the last 30 days · 2 metrics
AI Infrastructure · Unstructured: GitHub commits decreased 97.7% over the last 30 days · 2 metrics
AI Infrastructure · Unstructured: GitHub commits decreased 98% over the last 30 days · 2 metrics

Latest company data

Primary data point

Package downloads (30d)

5.6M/30d

Tracked package: PyPI unstructured

▲ +15% · Updated 1d ago

Other data points

Dependent projects

295

Projects depending on tracked package: PyPI unstructured

Updated 15h ago

GitHub stars

14.9K

Main repository stars

Updated 15h ago

Hacker News

63/30d

Position #9 in category discussion

Updated 15h ago

Repository health

Maintenance data from the main open-source repository.

Key-person risk: 2
External contributors: 40% of recent contributors outside the core team
Releases (30d): 0

Repository usage

Public repositories and source files importing packages tied to Unstructured.

Repos importing: 3.4K · 0%

About Unstructured

Founders: Brian Raymond, Matthew Harrison

Funding

$105M · 3 rounds

Raised $105M total. Category position #16 of 55 in AI Infrastructure.

Series B · 2024 · Menlo Ventures · $40M
Series A · 2024 · Menlo Ventures · $40M
Seed · 2023 · Madrona · $25M

Investors

Madrona, Menlo Ventures

Tracked packages (1)

1 PyPI

unstructured · PyPI · Main PyPI package

ETL for unstructured data preprocessing