Strongest current signal
Package downloads
8.3M/30d
Tracked on PyPI
ETL for unstructured data preprocessing
Best current coverage: 8.3M downloads/30d, 275 dependents, and 3.4K public repos.
Lead signals
Package pull and public code usage both show up clearly for Unstructured.
275
Known dependents on PyPI
14.4K
Main repository stars
66/30d
Ranked #7 in category discussion
Research Brief
No recent Research Brief centers on Unstructured yet. Start with the latest market reporting, then return here for the stored company signals on this page.
Reading
Package pull and existing public code are the clearest current signals for Unstructured.
Package pull
8.3M/30d
Tracked package pull on PyPI
Existing public code
3.4K
Public repositories importing tracked packages
Downstream usage
275
Known dependents on PyPI
New hiring
4/30d
Mentions in recent HN Who's Hiring threads
Coverage includes npm and PyPI registries, GitHub, public code import detection, tracked hiring posts, developer discussion, and recent company news where available. As of April 13, 2026.
Sustainability and maintenance signals from the primary public repository.
These signals come from public code import detection and tracked hiring posts rather than registry totals alone.
Public repositories and source files importing packages tied to Unstructured.
Mentions of Unstructured in recent Hacker News "Who's Hiring" threads.
Background and reference details
Unstructured specializes in extracting and transforming complex unstructured data from diverse sources like PDFs, images, and emails into clean, structured formats. Their tools are critical for preparing high-quality input data for large language models (LLMs) and other AI applications, enabling organizations to build more accurate and effective AI solutions.
Raised $105M total. DAI rank #25 (top 9%) suggests strong developer adoption relative to funding.
Menlo Ventures
Madrona
77K PyPI (50% of company total)
ETL for unstructured data preprocessing
77K PyPI (50% of company total)
An open-source Python library for pre-processing unstructured data, extracting text and metadata from various document types for use with large language models.
A managed service that provides scalable and reliable data preprocessing, offering advanced features and integrations beyond the open-source library for enterprise use cases.
Public pricing snapshots collected for Unstructured
Source: Company pricing page
Updates: Weekly
Note: Extracted via automated page analysis; verify on source
Historical metrics for Unstructured
Unstructured: PyPI Downloads down 94% (1.2M to 77.2K). GitHub Stars up 2% (14.2K to 14.4K). Contributors down 91% (11 to 1).
| Date | PyPI Downloads | GitHub Stars | Contributors |
|---|---|---|---|
| Mar 15, 2026 | 1.2M | 14.2K | 11 |
| Mar 20, 2026 | - | - | - |
| Mar 22, 2026 | 1.2M | 14.2K | 1 |
| Mar 27, 2026 | - | - | - |
| Mar 29, 2026 | 1.2M | 14.4K | 3 |
| Apr 3, 2026 | - | - | - |
| Apr 5, 2026 | 1.1M | 14.4K | 1 |
| Apr 10, 2026 | - | - | - |
| Apr 11, 2026 | - | - | - |
| Apr 12, 2026 | 731.5K | 14.4K | 0 |
| Apr 13, 2026 | 77.2K | 14.4K | 1 |
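
The percentage changes in the summary above the table can be approximated from the first and last dated rows. The following is a minimal sketch, not the site's actual method: the helper names `parse_count` and `percent_change` are hypothetical, and the inputs are the rounded display values from the table, so results may differ slightly from figures computed on unrounded counts.

```python
def parse_count(text):
    """Convert a display value such as '1.2M', '77.2K', or '11' to a float.

    Returns None for '-', which the table uses when no reading was recorded.
    """
    multipliers = {"K": 1_000, "M": 1_000_000}
    text = text.strip()
    if text == "-":
        return None
    if text[-1] in multipliers:
        return float(text[:-1]) * multipliers[text[-1]]
    return float(text)


def percent_change(first, last):
    """Percent change from the first reading to the last reading."""
    return (last - first) / first * 100


# Rounded display values copied from the Mar 15, 2026 and Apr 13, 2026 rows.
first_row = {"PyPI Downloads": "1.2M", "GitHub Stars": "14.2K", "Contributors": "11"}
last_row = {"PyPI Downloads": "77.2K", "GitHub Stars": "14.4K", "Contributors": "1"}

changes = {
    metric: percent_change(parse_count(first_row[metric]), parse_count(last_row[metric]))
    for metric in first_row
}

for metric, change in changes.items():
    print(f"{metric}: {change:+.1f}%")
```

With these rounded inputs, downloads and contributors round to the stated 94% and 91% declines, while stars come out closer to 1.4% than the stated 2%, presumably because the page computes from unrounded counts. Note that the Apr 13 downloads reading is also the page's "as of" date; whether it represents a complete measurement window is not stated.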