Methodology and trust
How we collect, verify, and publish the signals behind every Research Brief and supporting page
Methodology last reviewed: April 2026
AI-Buzz tracks signals that traditional market databases miss: package downloads, code adoption, contributor activity, pricing changes, hiring demand, and supporting discussion signals. Most signals sync daily at 6 AM UTC, with some on weekly or monthly schedules. Funding data is manually verified before publication.
Not every raw movement becomes a published claim. Research Briefs are reserved for signal-led changes that survive anomaly checks, have real interpretive value, and route into inspectable company pages.
What This Methodology Covers
Most company databases track funding rounds and firmographic data. AI-Buzz tracks developer adoption and operating signals: package downloads, GitHub contributors, code adoption, pricing changes, and hiring demand. Funding tells you who raised money. Developer signals tell you which companies and products teams are actually choosing.
Research Briefs are structured signal notes, not human-reported feature stories. AI may draft copy from AI-Buzz's company data, but a human editor selects the question, verifies the figures, and approves the claim before publication.
Each brief carries a top-of-piece disclosure stating this process. The goal is honest, repeatable analysis - not the appearance of a larger editorial team.
AI drafts the brief from structured company metrics. It generates the claim statement, key facts, interpretation, and proof table.
Humans select which question to investigate, verify the figures against source data, approve or reject the claim, and control the publish decision.
If interpretation changes due to new data, we update the brief with a revision note and new as-of date. The original claim is not silently changed.
Two lead signals require custom collection infrastructure to build and maintain historical records. The underlying APIs are public, but the parsed, company-mapped, historical dataset that AI-Buzz accumulates daily is not replicable without running equivalent infrastructure for an equivalent period. This is the time-based moat. Pricing snapshots are tracked as secondary operating context, not as a lead adoption signal.
Counts public GitHub repositories that import each company's packages via ecosyste.ms. This measures real integration - developers writing production code that depends on the tool - as opposed to download counts (which include CI/CD automation) or stars (which measure interest, not usage). Updated daily.
Parses monthly Hacker News "Who's Hiring" threads to count company mentions in job postings. This is a bounded public hiring signal from a startup- and tech-skewed venue. It can be a useful leading indicator for where teams are staffing up, but it is not proof of total production usage or market share. Updated monthly with daily aggregation.
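To make the collection step concrete, here is a minimal sketch against the public HN Firebase API. The thread ID and company list are placeholders, not AI-Buzz's actual configuration, and the real pipeline also handles alias matching, rate limits, and historical backfill.

```python
import json
import re
import urllib.request

HN_API = "https://hacker-news.firebaseio.com/v0/item/{}.json"

def fetch_item(item_id: int) -> dict:
    """Fetch a single HN item (thread or comment) from the public Firebase API."""
    with urllib.request.urlopen(HN_API.format(item_id)) as resp:
        return json.load(resp)

def count_hiring_mentions(thread_id: int, companies: list[str]) -> dict[str, int]:
    """Count top-level postings in a "Who's Hiring" thread that mention each company."""
    thread = fetch_item(thread_id)
    counts = {name: 0 for name in companies}
    for comment_id in thread.get("kids", []):
        text = fetch_item(comment_id).get("text", "").lower()
        for name in companies:
            # Word-boundary match so "Modal" doesn't fire on "modality".
            if re.search(rf"\b{re.escape(name.lower())}\b", text):
                counts[name] += 1
    return counts

# Hypothetical thread ID and watchlist, for illustration only.
print(count_hiring_mentions(38000000, ["LangChain", "Modal", "Anyscale"]))
```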
Pricing page monitoring stays in the system because packaging shifts can explain go-to-market changes, but it is used as supporting context rather than a headline signal.
| Signal | Source | What's Measured | Frequency |
|---|---|---|---|
| Package Downloads | npm, PyPI | Daily download counts per tool, summed into 30-day totals with month-over-month trends (see the aggregation sketch after this table). Multi-package companies aggregate across all tools. | Daily |
| GitHub Activity | GitHub API | Stars, forks, and growth trends | Daily |
| GitHub Contributors | GitHub API | Unique commit authors in last 30 days across all company repos | Daily |
| npm/PyPI Dependents | Libraries.io | Count of packages that depend on company npm/PyPI packages | Weekly |
| Code Adoption | ecosyste.ms | Public repositories importing company packages (dependent repo count) | Daily |
| HN Mentions & Sentiment | HN API | 30-day mention counts, discussion share by category. Sentiment classified by Gemini LLM (positive / neutral / negative, ~80% accuracy). Only shown when sample size ≥ 25 comments. | Daily |
| Reddit Mentions & Sentiment | Reddit Search API | 30-day mention counts in ML subreddits (r/MachineLearning, r/LocalLLaMA, r/artificial). Sentiment classified by Gemini LLM. | Weekly |
| Job Demand | HN API | Company mentions in monthly HN hiring threads. Skews toward startup/tech roles. | Monthly |
| Docker Hub Pulls | Docker Hub | Container adoption | Weekly |
| Hugging Face Downloads | Hugging Face | ML model adoption | Weekly |
| Hugging Face Models | Hugging Face | ML model portfolio breadth | Weekly |
| Funding | TechCrunch, VentureBeat, Wikipedia | Round size, type, date, lead investors | As announced, manually verified |
| Company News | TechCrunch, VentureBeat, major tech publications | News articles tracked per company from major tech publications | Daily |
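As a minimal sketch of the 30-day aggregation described in the Package Downloads row above (the function and field names are illustrative, not AI-Buzz's internal schema):

```python
from datetime import date, timedelta

def thirty_day_totals(daily: dict[date, int], as_of: date) -> tuple[int, float]:
    """Sum daily download counts into the trailing 30-day window and compare it
    to the preceding 30-day window for a month-over-month trend."""
    current = sum(n for d, n in daily.items()
                  if as_of - timedelta(days=30) < d <= as_of)
    previous = sum(n for d, n in daily.items()
                   if as_of - timedelta(days=60) < d <= as_of - timedelta(days=30))
    mom_pct = (current - previous) / previous * 100 if previous else 0.0
    return current, mom_pct

# Multi-package companies: compute per-package windows, then sum the totals.
```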
Not all data points carry equal weight. We apply minimum thresholds to flag low-confidence signals:
| Signal | High Confidence | Low Confidence | Why It Matters |
|---|---|---|---|
| HN Sentiment | ≥30 comments (1.0 confidence) | <10 comments (0.2 confidence) | Sentiment classification requires enough comments for statistical relevance. Below 10, a single outlier can skew the ratio. |
| GitHub Velocity | ≥2 weeks history | <2 weeks history | Commit velocity compares the latest week to a 4-week average. With less than 2 weeks of data, the trend is unreliable. |
| Downloads | ≥100 / 30 days | <100 / 30 days | Very low download counts are often noise from CI bots or one-time installs. Below 100/month, the signal is too weak to draw conclusions about adoption. |
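A minimal sketch of how these thresholds could gate a signal. The table only fixes the endpoints, so the linear interpolation between 10 and 30 comments is an assumption:

```python
def hn_sentiment_confidence(comment_count: int) -> float:
    """Map comment volume to a 0.2-1.0 confidence per the thresholds above.
    The linear ramp between 10 and 30 comments is an assumption."""
    if comment_count >= 30:
        return 1.0
    if comment_count < 10:
        return 0.2
    return 0.2 + 0.8 * (comment_count - 10) / 20

def downloads_signal_usable(downloads_30d: int) -> bool:
    """Below 100 downloads per 30 days, treat the signal as noise."""
    return downloads_30d >= 100
```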
Each company receives a 0–100 data confidence score reflecting profile completeness, identifier verification, metric freshness, and signal coverage.
| Score Range | Rating | Meaning |
|---|---|---|
| 80–100 | Excellent | Fully verified identifiers, fresh metrics across all signals, complete profile. |
| 60–79 | Good | Most signals present and recently updated. Minor gaps in coverage or verification. |
| 40–59 | Fair | Some signals missing or stale. Profile may lack key identifiers like GitHub repos or package names. |
| <40 | Needs Review | Significant data gaps. Metrics may be outdated or unverified. Treat conclusions with caution. |
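As a sketch only: the four components below come straight from the description above, but the equal weighting is an illustrative assumption; AI-Buzz's actual component weights are not published here.

```python
def data_confidence_score(profile_completeness: float,
                          identifier_verification: float,
                          metric_freshness: float,
                          signal_coverage: float) -> int:
    """Combine four 0-1 component scores into a 0-100 data confidence score.
    Equal weights are an illustrative assumption, not the published formula."""
    parts = [profile_completeness, identifier_verification,
             metric_freshness, signal_coverage]
    return round(sum(parts) / len(parts) * 100)

# A fully verified, fresh, complete profile lands in the "Excellent" band.
assert data_confidence_score(1.0, 1.0, 1.0, 1.0) == 100
```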
Funding rounds are detected from news feeds (TechCrunch, VentureBeat) and cross-referenced against company announcements and Wikipedia. Every round is manually verified before appearing on a company profile. We'd rather miss a round than publish incorrect data.
Every data point goes through a 4-layer validation pipeline before publication.
Found an error? Use the "Report an error" button on any company profile, and we'll review it promptly.
Data from AI-Buzz is free to cite in articles, reports, and research. See our citation guide for attribution formats.
The Developer Adoption Index combines 4 verified adoption signals into a single composite score (0–100) for each company. It measures real developer traction through package downloads, ecosystem dependents, download growth, and contributor activity. All signals are 100% external and independently verifiable.
For score tiers, leaderboard, and usage examples, see the Developer Adoption Index page.
DAI = Σᵢ (weightᵢ × normalize(signalᵢ)) × 100

Each signal is normalized to 0–1 using percentile ranking (the fraction of non-zero values at or below the given value) before weighting. Weights sum to 1.0, and the final score is scaled to 0–100.
| Category | Component | Weight | Data Source |
|---|---|---|---|
| Core Adoption (60%) | Package Downloads | 40% | npm, PyPI Stats (30-day totals) |
| | Dependents | 20% | Libraries.io, npm registry |
| Growth & Activity (40%) | Download Growth | 25% | npm, PyPI Stats (month-over-month trend) |
| | GitHub Contributors | 15% | GitHub REST API (unique contributors in 30 days) |
Each signal is normalized using percentile ranking: for a given value, we compute the fraction of non-zero values across all companies that are at or below it. Companies with a zero value for a signal are excluded from ranking for that signal (they receive a 0.0 normalized score). This prevents zero-value companies from inflating percentiles and ensures the ranking reflects actual adoption spread.
Download growth uses trend normalization rather than percentile ranking: month-over-month growth rates are mapped to a 0–1 scale where -200% maps to 0, 0% to 0.5, and +200% to 1.0, capturing whether adoption is accelerating or declining.
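Putting the formula, weights, and normalization rules together, a minimal sketch (the weights match the table above; company and population data structures are illustrative):

```python
WEIGHTS = {"downloads": 0.40, "dependents": 0.20,
           "growth": 0.25, "contributors": 0.15}  # sums to 1.0

def percentile_rank(value: float, population: list[float]) -> float:
    """Fraction of non-zero values at or below `value`; zeros score 0.0 outright."""
    if value == 0:
        return 0.0
    nonzero = [v for v in population if v > 0]
    return sum(v <= value for v in nonzero) / len(nonzero) if nonzero else 0.0

def trend_normalize(growth_pct: float) -> float:
    """Map month-over-month growth to 0-1: -200% -> 0.0, 0% -> 0.5, +200% -> 1.0."""
    clamped = max(-200.0, min(200.0, growth_pct))
    return (clamped + 200.0) / 400.0

def dai(company: dict[str, float], population: dict[str, list[float]]) -> float:
    """DAI = sum of weight_i * normalize(signal_i), scaled to 0-100."""
    normalized = {
        key: trend_normalize(company[key]) if key == "growth"  # trend, not percentile
        else percentile_rank(company[key], population[key])
        for key in WEIGHTS
    }
    return sum(WEIGHTS[k] * normalized[k] for k in WEIGHTS) * 100
```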
All input signals are synced daily at 6:00 AM UTC via automated GitHub Actions workflows pulling live data from each API. DAI scores are recomputed after all syncs complete. Computation aborts if more than 10% of companies have metrics older than 7 days, preventing stale data from corrupting rankings.
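A minimal sketch of that staleness guard (the timestamp bookkeeping is illustrative):

```python
from datetime import datetime, timedelta, timezone

def guard_freshness(last_synced: dict[str, datetime],
                    max_age_days: int = 7,
                    max_stale_fraction: float = 0.10) -> None:
    """Abort the DAI recompute when more than 10% of companies have metrics
    older than 7 days, so stale data cannot corrupt the rankings."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    stale = sum(ts < cutoff for ts in last_synced.values())
    if stale > max_stale_fraction * len(last_synced):
        raise RuntimeError(
            f"{stale}/{len(last_synced)} companies have stale metrics; aborting")
```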
The momentum score (0–100) measures how quickly a company is gaining or losing developer traction. It combines six directional components (v2 weights, active since March 2026):
| Component | Weight | Source |
|---|---|---|
| HN mentions trend | 20% | 30-day mention count change |
| GitHub stars trend | 20% | Star count growth rate |
| Download trend | 20% | npm/PyPI download growth rate |
| Funding recency | 15% | Days since last funding round (linear decay over 365 days) |
| Code adoption | 15% | ecosyste.ms dependent repo count change |
| Job demand | 10% | HN "Who's Hiring" mention trend |
Not every company has data for all six components. When components are missing, the weights of the available components are rescaled to sum to 1.0, so the score still uses the full 0–100 range. A confidence value records the fraction of components that were available (e.g., 1.0 = all 6 present, 0.5 = 3 of 6).
When the momentum score feeds into the signal score, it is multiplied by this confidence value, so companies with fewer data points receive proportionally less credit.
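A minimal sketch of the redistribution and confidence weighting; component scores are assumed to arrive already normalized to 0–1, and the funding-recency helper shows the linear decay from the table:

```python
V2_WEIGHTS = {
    "hn_mentions": 0.20, "github_stars": 0.20, "downloads": 0.20,
    "funding_recency": 0.15, "code_adoption": 0.15, "job_demand": 0.10,
}

def funding_recency(days_since_round: int) -> float:
    """Linear decay over 365 days: a round today scores 1.0, a year ago 0.0."""
    return max(0.0, 1.0 - days_since_round / 365)

def momentum(components: dict[str, float]) -> tuple[float, float]:
    """Return (score 0-100, confidence 0-1) from whichever of the six
    components are present, each already normalized to 0-1."""
    available = {k: w for k, w in V2_WEIGHTS.items() if k in components}
    total_weight = sum(available.values())
    # Rescale the available weights so the score keeps the full 0-100 range.
    score = sum(w * components[k] for k, w in available.items()) / total_weight * 100
    confidence = len(available) / len(V2_WEIGHTS)
    return score, confidence

score, conf = momentum({"hn_mentions": 0.8, "downloads": 0.6, "code_adoption": 0.7})
weighted = score * conf  # contribution to the signal score is confidence-weighted
```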
Start with Research, then inspect the company and category pages that support each published claim.
When scoring formulas or weights change, we document them here. Weight changes are also automatically logged to Data Updates.
Simplified from 9 signals to 4 verified adoption signals. Removed sentiment proxies (HN, Reddit) and lagging indicators (funding recency, Docker pulls), and renamed the index from "Developer Momentum Index" to "Developer Adoption Index" (DAI).
Removed self-referential signals (page views, GSC impressions). All 9 signals are external. Community (40%), Adoption (50%), External Interest (10%).
Added demand-side signals (page views, GSC impressions, search appearances). Later removed in v3.0 due to circularity concerns.
Original signal score with equal weighting across HN mentions, GitHub stars, funding, and download metrics.