Servo Baseline Readiness Evaluation#
Dietrich Ayala — webtransitions.org
This report measures the Servo web engine's readiness against Baseline "Widely Available" web features using Web Features, BCD, and WPT data, and projects timelines and costs for reaching velocity parity with web platform growth.
Key results in the report:
- % difference from Baseline's "Widely Available", broken down by year (e.g., "Servo supports 72% of Baseline 2022's Widely Available features")
- % of Web Features completed
- Use BCD key implementation rates and WPT test results as backtest data to calculate velocity at the BCD category level and the Web Feature level
- Use that velocity data to project when in the future Servo will hit Baseline Widely Available for each year we have the data
- Calculate the number of engineers contributing, based on commit data, to understand the contribution inputs that produced the velocity to date
- Provide a calculator to evaluate the impact of increasing or decreasing resources on the projected date of hitting Baseline Widely Available
- Add a field to the calculator that sets an average annual salary per contributor, to understand the dollar cost of hitting Baseline Widely Available for a given year by a specific date
- Future: Use the Interop project to list priority areas to focus on to make Servo more interoperable sooner, and project the cost of that work as well
Tool Descriptions#
- Baseline uses BCD to measure web platform features' availability status against a set of core browsers chosen as the measure of availability. We want to use the "Widely Available" Baseline state as the metric for Servo's readiness as a web engine.
- Web Features groups BCD keys into logical curated sets that identify a specific feature, letting us reason about the web as ~1,100 unique parts vs ~15,000 unique BCD keys.
- Web Platform Tests are cross-browser tests for understanding interop between web engine implementations of web standards
Reference Links#
- Servo: https://servo.org/
- Servo's own measurement site: https://github.com/dklassic/AreWeBrowserYet
- Baseline: https://web-platform-dx.github.io/web-features/
- Web Features: https://web-platform-dx.github.io/web-features/web-features/
- Browser Compat Data: https://github.com/mdn/browser-compat-data/
- Web Platform Tests: https://web-platform-tests.org/
- Web Platform Tests dashboard: https://wpt.fyi/
- Interop: https://wpt.fyi/interop-2025
Research Notes#
Data Sources & Pipeline#
We combine three data sources to measure Servo's support for each web-feature:
- WPT test results via the wpt.fyi API (primary signal) — Servo runs WPT daily; results are per-test pass/fail with subtest counts. Available at https://wpt.fyi/api/runs?product=servo.
- WPT Web Features Manifest (the key mapping layer) — `WEB_FEATURES_MANIFEST.json` from WPT releases maps web-feature IDs directly to WPT test paths. 841 features mapped to 53,549 tests. Downloaded with `gh release download` on `web-platform-tests/wpt`.
- mdn-bcd-collector results (supplementary) — AreWeBrowserYet runs the collector weekly against Servo nightly; produces boolean pass/fail per BCD key per exposure context. Available as GitHub Actions artifacts from `dklassic/AreWeBrowserYet`.
Pipeline: WPT summary → map tests to features via manifest → compute per-feature pass rate → cross-reference with web-features data.json for Baseline status → aggregate.
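The pipeline can be sketched in JavaScript, matching the project's `.mjs` scripts. The data shapes below are simplified assumptions for illustration, not the exact schemas of the manifest, summary, or `data.json` files:

```javascript
// Sketch of the readiness pipeline. Data shapes are simplified assumptions:
//   manifest: { featureId: [testPath, ...] }            (WEB_FEATURES_MANIFEST)
//   results:  { testPath: { pass: n, total: n } }       (Servo WPT summary)
//   baseline: { featureId: "high" | "low" | false }     (web-features data.json)

function featurePassRates(manifest, results, baseline) {
  const out = [];
  for (const [featureId, tests] of Object.entries(manifest)) {
    let pass = 0, total = 0;
    for (const t of tests) {
      const r = results[t];
      if (!r) continue; // test not present in this Servo run
      pass += r.pass;
      total += r.total;
    }
    out.push({
      featureId,
      baseline: baseline[featureId] ?? null,
      score: total > 0 ? pass / total : null, // null = no data
    });
  }
  return out;
}

// Aggregate: share of Widely Available ("high") features at >= the threshold.
function widelyAvailableReadiness(features, threshold = 0.95) {
  const wa = features.filter((f) => f.baseline === "high");
  const full = wa.filter((f) => f.score !== null && f.score >= threshold);
  return { full: full.length, total: wa.length, pct: full.length / wa.length };
}
```

The same per-feature scores feed the by-year breakdowns and the velocity analysis below.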
BCD for Servo: Not worth generating. BCD requires versioned release data; Servo only has nightlies. The collector + WPT approach gives us what we need without the overhead. Servo is not in the Baseline core browser set and adding it to BCD would not change any feature's Baseline status.
Local Data Files#
All paths relative to ~/misc/servo-readiness/:
- `../web-features/packages/web-features/data.json` — Built from the web-features repo. 1,129 features with Baseline status (593 high, 130 low, 396 false).
- `data/servo-wpt-summary.json` — Latest Servo WPT summary (2026-02-13). 64,834 tests.
- `data/WEB_FEATURES_MANIFEST-*.json` — Feature-to-test mapping from the latest WPT release. 841 features → 53,549 tests.
- `data/collector-results/popularityBcdMap.json` — Latest collector results (2026-01-31). 415 features.
- `data/historical/` — Historical WPT summaries at quarterly intervals (2023-Q3 through 2025-Q3).
- `data/analyze.mjs` — Combined readiness analysis script.
- `data/velocity.mjs` — Velocity and projection analysis script.
- `data/projections.mjs` — Projection model and cost calculator.
- `data/stalled.mjs` — Stalled features and regression analysis.
Current Readiness (as of 2026-02-13)#
Baseline Widely Available (593 features):
| Support Level | Count | % |
|---|---|---|
| Fully supported (>=95% WPT) | 87 | 14.7% |
| Partially supported (20-95%) | 288 | 48.6% |
| Not supported (<20%) | 64 | 10.8% |
| No data | 154 | 26.0% |
By year (cumulative, fully supported):
| Year Threshold | Full | Total | % | Including Partial |
|---|---|---|---|---|
| Through 2018 | 58 | 266 | 21.8% | 69.2% |
| Through 2020 | 61 | 323 | 18.9% | 66.3% |
| Through 2022 | 71 | 430 | 16.5% | 65.1% |
| Through 2024 | 84 | 530 | 15.8% | 64.7% |
| Through 2026 | 94 | 590 | 15.9% | 64.9% |
Score distribution of 593 Widely Available features:
- 87 at 95-100% (fully supported)
- 80 at 80-95% (near full support — low-hanging fruit)
- 108 at 50-80%
- 109 at 20-50%
- 64 below 20%
- 154 no data (mostly JS built-ins not yet in WPT manifest)
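The bucketing behind this distribution can be expressed as a small helper. This is a sketch assuming lower-bound-inclusive bands (the report does not state its edge convention):

```javascript
// Bucket a feature score into the distribution bands used above.
// `score` is a 0..1 pass rate, or null when no WPT data maps to the feature.
// Assumption: each band includes its lower bound.
function bucket(score) {
  if (score === null) return "no data";
  if (score >= 0.95) return "95-100%";
  if (score >= 0.8) return "80-95%";
  if (score >= 0.5) return "50-80%";
  if (score >= 0.2) return "20-50%";
  return "0-20%";
}

// Tally a list of scores into band counts.
function distribution(scores) {
  const counts = {};
  for (const s of scores) {
    const b = bucket(s);
    counts[b] = (counts[b] ?? 0) + 1;
  }
  return counts;
}
```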
Raw WPT stats:
- 72.5% of tests have OK/Pass status (47,033/64,834)
- 86.3% of subtests pass (1,793,582/2,078,553)
Velocity Analysis#
WPT overall score trajectory:
| Date | WPT Score | Quarterly Gain |
|---|---|---|
| Apr 2023 | 30.4% | — |
| Jan 2024 | 34.3% | ~1.3pp/quarter |
| Jan 2025 | 47.4% | ~3.3pp/quarter |
| Feb 2026 | 62.4% | ~3.7pp/quarter |
Velocity is accelerating: the score more than doubled, from 30% to 62%, in under three years.
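The quarterly gains above can be recomputed from snapshot pairs. A minimal sketch, assuming snapshots are `{ date, score }` objects with scores in percentage points:

```javascript
// Quarterly velocity between WPT score snapshots.
// Each snapshot: { date: "YYYY-MM", score: percentage points (0-100) }.
const MS_PER_QUARTER = 91.25 * 24 * 3600 * 1000; // avg quarter length

function quarterlyGains(snapshots) {
  const gains = [];
  for (let i = 1; i < snapshots.length; i++) {
    const prev = snapshots[i - 1];
    const cur = snapshots[i];
    const quarters =
      (Date.parse(cur.date + "-01") - Date.parse(prev.date + "-01")) /
      MS_PER_QUARTER;
    gains.push({ to: cur.date, ppPerQuarter: (cur.score - prev.score) / quarters });
  }
  return gains;
}
```

For example, Apr 2023 (30.4%) to Jan 2024 (34.3%) works out to roughly 1.3pp/quarter, matching the table.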
Feature-level velocity (Widely Available, >=95% threshold):
- 2023-Q3: 47 fully supported
- 2024-Q1: 52 (+5 in ~2Q)
- 2024-Q3: 60 (+8 in 2Q)
- 2025-Q1: 71 (+11 in 2Q)
- 2025-Q3: 79 (+8 in 2Q)
- 2026-Q1: 87 (+8 in 2Q)
- Average rate: ~4 full features/quarter
- 53 features crossed the 95% threshold over the measurement window
- 279 features improved, averaging +27pp each
Commit & contributor velocity:
| Period | Commits/Quarter | Peak Contributors/Month |
|---|---|---|
| 2023-Q4 (low) | 399 | 14 |
| 2024-Q4 | 764 | 40 |
| 2025-Q4 | 1,179 | 52 |
The project nearly tripled its commit rate and almost quadrupled its peak contributor count from late 2023 to late 2025.
Commits vs WPT improvement: The relationship is not linear. Some quarters show high efficiency (2026-Q1: +5.4pp from 595 commits ≈ 0.009 pp/commit) while others show much lower (2025-Q4: +1.5pp from 1,179 commits ≈ 0.001 pp/commit). This suggests infrastructure/refactoring periods alternate with feature-completion bursts.
Projection Model#
Using per-feature velocity (extrapolating each feature's individual improvement rate), we project when features cross the 95% threshold:
Score distribution flow (Widely Available features):
| Snapshot | 0-20% | 20-50% | 50-80% | 80-95% | 95-100% | No Data |
|---|---|---|---|---|---|---|
| 2023-Q3 | 113 | 103 | 113 | 50 | 47 | 167 |
| 2024-Q3 | 106 | 93 | 116 | 56 | 60 | 162 |
| 2025-Q3 | 79 | 85 | 126 | 66 | 79 | 158 |
| 2026-Q1 | 64 | 86 | 122 | 80 | 87 | 154 |
Features are flowing rightward through the pipeline. The 80-95% bucket (near-completions) grew from 50 to 80.
Projected completion timeline (439 features with data):
| Timeline | Features |
|---|---|
| Already >= 95% | 87 |
| Within 1 year | 59 |
| 1-2 years | 37 |
| 2-3 years | 23 |
| 3-5 years | 30 |
| 5-10 years | 38 |
| 10+ years | 51 |
| Stalled (zero velocity) | 114 |
Key blocker: 114 features are stalled — no improvement over the measurement period. This makes 75%+ milestones "not feasible" under any contributor scaling, because those features simply aren't being worked on.
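A minimal sketch of the per-feature extrapolation, assuming simple linear continuation of each feature's observed velocity (the report's `data/projections.mjs` model may differ in detail):

```javascript
// Quarters until a feature crosses the 95% threshold at its observed
// velocity. Score and velocity are in the 0..1 scale; zero or negative
// velocity means the feature is stalled (Infinity = never, at current pace).
function quartersTo95(score, velocityPerQuarter) {
  if (score >= 0.95) return 0; // already fully supported
  if (velocityPerQuarter <= 0) return Infinity; // stalled
  return (0.95 - score) / velocityPerQuarter;
}

// Bin a projection into the timeline buckets used in the table above.
function timelineBucket(quarters) {
  if (quarters === 0) return "already";
  if (quarters === Infinity) return "stalled";
  const years = quarters / 4;
  if (years <= 1) return "<1y";
  if (years <= 2) return "1-2y";
  if (years <= 3) return "2-3y";
  if (years <= 5) return "3-5y";
  if (years <= 10) return "5-10y";
  return "10+y";
}
```

A feature at 80% gaining 5 points per quarter projects to cross the threshold in three quarters, landing in the "within 1 year" bucket.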
Cumulative projection at current pace:
| Date | Fully Supported | % of 593 |
|---|---|---|
| Now (Feb 2026) | 87 | 14.7% |
| +1 year (2027) | 146 | 24.6% |
| +2 years (2028) | 183 | 30.9% |
| +3 years (2029) | 206 | 34.7% |
| +5 years (2031) | 236 | 39.8% |
| +10 years (2036) | 274 | 46.2% |
Cost Calculator#
Key ratios (from recent 6 quarters):
- ~43 active contributors/quarter
- 21.5 commits per contributor per quarter
- 4.5 features reach 95% per quarter
- 0.105 features per contributor per quarter
Scaling assumption: Contributor scaling is sublinear (exponent 0.7) due to coordination overhead — doubling contributors yields ~1.6x the throughput, not 2x.
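The scaling assumption as code, anchored to the observed baseline of ~43 contributors producing ~4.5 features/quarter (the exponent and anchor values come from the ratios above; everything else is a sketch):

```javascript
// Sublinear contributor scaling assumed by the model: throughput grows
// as contributors^0.7, so 2x contributors => ~1.6x throughput, not 2x.
const SCALING_EXPONENT = 0.7;

function throughputMultiplier(contributorMultiplier, exp = SCALING_EXPONENT) {
  return Math.pow(contributorMultiplier, exp);
}

// Features reaching 95% per quarter at n contributors, anchored to the
// observed baseline of ~4.5 features/quarter from ~43 contributors.
function featuresPerQuarter(n, base = { contributors: 43, rate: 4.5 }) {
  return base.rate * throughputMultiplier(n / base.contributors);
}
```

Doubling to 86 contributors yields roughly 7.3 features/quarter rather than 9, which is what drives the diminishing returns in the scenario tables below.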
Projected year to reach 50% of Widely Available (297 features):
| Scenario | Target Year | Annual Cost ($150k/yr) |
|---|---|---|
| Current pace (43 eng) | ~2046 | $6.5M/yr |
| 2x contributors (86 eng) | ~2038 | $12.9M/yr |
| 3x contributors (129 eng) | ~2035 | $19.4M/yr |
| 5x contributors (215 eng) | ~2032 | $32.3M/yr |
Total cost to reach 50% at $150k/yr average salary:
| Scenario | Years | Total Cost |
|---|---|---|
| Current pace | 19.5y | $125.8M |
| 2x contributors | 12.0y | $154.8M |
| 3x contributors | 9.0y | $174.9M |
| 5x contributors | 6.3y | $203.8M |
Faster timelines cost more in total (due to the sublinear scaling overhead) but deliver the milestone years sooner.
Reverse calculator — contributors needed for a target date:
| Target % | By 2028 | By 2030 | By 2032 | By 2035 |
|---|---|---|---|---|
| 50% | 1,198 eng ($179.7M/yr) | 429 eng ($64.3M/yr) | 238 eng ($35.7M/yr) | 132 eng ($19.8M/yr) |
| 75% | Not feasible | Not feasible | Not feasible | Not feasible |
| 100% | Not feasible | Not feasible | Not feasible | Not feasible |
75% and 100% are not feasible under any resourcing because 114 features are stalled (zero velocity). Reaching those milestones requires strategic focus on currently-stalled features, not just more contributors on existing work.
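The reverse calculation can be sketched by inverting the scaling model. This is a simplified aggregate version; the report's actual solver works per-feature and excludes stalled features, so its numbers differ from this sketch:

```javascript
// Simplified reverse calculator: contributors needed to complete
// `remaining` features in `quarters`, inverting the sublinear
// throughput model (rate = baseRate * (n / baseContributors)^exp).
function contributorsNeeded(remaining, quarters, opts = {}) {
  const { baseContributors = 43, baseRate = 4.5, exp = 0.7 } = opts;
  const neededRate = remaining / quarters; // features per quarter
  return baseContributors * Math.pow(neededRate / baseRate, 1 / exp);
}

// Annual cost at an average fully-loaded salary (default $150k/yr).
function annualCost(contributors, salary = 150000) {
  return contributors * salary;
}
```

Note the inversion: because throughput scales as n^0.7, the required headcount scales as (needed rate)^(1/0.7) ≈ rate^1.43, which is why earlier target dates get disproportionately expensive.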
Sensitivity: the choice of "fully supported" threshold matters a lot:
| Threshold | Features "Fully Supported" | % of 593 |
|---|---|---|
| 80% | 167 | 28.2% |
| 85% | 136 | 22.9% |
| 90% | 112 | 18.9% |
| 95% | 87 | 14.7% |
| 99% | 63 | 10.6% |
Stalled Features Analysis#
Of the 593 Widely Available features, 141 are below 95% and have zero or negative velocity (stalled). Another 154 have no WPT data at all. Together these 295 features (50%) represent the long tail that blocks higher readiness milestones.
Stalled features by current score (below 95% only):
| Score Range | Count | Examples |
|---|---|---|
| 80-95% (near complete) | 20 | min-max-clamp (93%), supports (92.7%), unset-value (91.7%), background-clip (90.3%) |
| 50-80% (substantial) | 41 | canvas-2d (76%), select (72.6%), transforms2d (66.7%), web-audio (61.6%) |
| 20-50% (partial) | 32 | container-queries (23.4%), svg (24.1%), wasm (21.6%), console (23.1%) |
| 1-20% (minimal) | 26 | service-workers (2.2%), webvtt (6.4%), shape-outside (8.7%) |
| 0% (unsupported) | 22 | indexeddb (0.3%), webrtc (0.6%), speech-synthesis, visual-viewport |
79 features have actively regressed. This is the most actionable finding. Worst regressions:
| Feature | Was | Now | Drop |
|---|---|---|---|
| base64encodedecode | 100% | 25% | -75pp |
| object-fit | 96.8% | 25.9% | -71pp |
| webvtt | 62.9% | 6.4% | -57pp |
| empty, input-submit, supports-compat | 100% | 50% | -50pp each |
| wasm-multi-value | 66.7% | 22.2% | -44pp |
| console | 55.1% | 23.1% | -32pp |
| before-after (::before/::after) | 100% | 69.4% | -31pp |
| svg | 54% | 24.1% | -30pp |
Several features that were at 95%+ have dropped below: css-supports (95.6%→88%), background-clip (95%→90.3%), min-max-clamp (95.5%→93%), plus many that fell from 100% (dirname, figure, min-max-width-height, unset-value).
Stalled features by platform area:
| Area | Stalled Count | Avg Score | Key Gap |
|---|---|---|---|
| API (DOM, Web APIs) | 84 | 48.7% | canvas-2d, selection-api, web-audio, history, indexeddb |
| CSS | 70 | 53.4% | transforms2d, transitions, container-queries, containment |
| HTML elements | 14 | 76.8% | form inputs (range, email, checkbox, radio), iframe, embed |
| WebAssembly | 8 | 46.1% | wasm core, threads, exception-handling, mutable-globals |
| HTTP | 1 | 0% | hsts |
Top stalled groups (by web-features taxonomy):
- html-elements: 35 stalled (avg 71.6%) — many HTML elements partially work but stalled
- forms: 17 stalled (avg 71.5%) — form inputs are broadly started but incomplete
- css: 14 stalled (avg 64.9%)
- selectors: 11 stalled (avg 54.3%)
- webassembly: 8 stalled (avg 46.1%)
Features with no WPT data (154 total):
- JavaScript built-ins: 78 (array methods, promise, async, generators, typed arrays, etc.)
- APIs: 39 (WebGL extensions, composition events, DOM core, etc.)
- HTML elements: 23 (abbr, address, article, aside, blockquote, etc.)
- CSS: 6 (case-insensitive-attributes, page-breaks, resolution, etc.)
- Other: 8 (HTTP features, media types, WebAssembly features)
Impact of unblocking:
- Fixing the 20 stalled features at 80-95% → 107/593 (18.0%, up from 14.7%)
- Fixing all 61 stalled features at 50%+ → 148/593 (25.0%)
- Fixing all 141 stalled features → 228/593 (38.4%)
- The 20 near-complete stalled features need only 827 failing subtests fixed out of 31,961 total (2.6% of the stalled work for 23% of the stalled feature count)
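The arithmetic behind these impact figures is straightforward; a sketch for recomputing them against the 593 Widely Available features:

```javascript
// Impact of unblocking a stalled bucket: new fully-supported count and
// share of the Widely Available set (593 features, 87 currently full).
function unblockImpact(currentFull, fixedCount, total = 593) {
  const full = currentFull + fixedCount;
  return { full, pct: full / total };
}
```

Fixing the 20 near-complete stalled features moves the count from 87 to 107, i.e. about 18% of the Widely Available set.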
Key Insights#
- Regressions are the #1 problem — 79 features went backward, many dramatically. Fixing regressions is typically cheaper than new implementation and would recover features that were previously passing.
- The 95% threshold is strict — 80 features sit at 80-95%, meaning modest improvements could rapidly increase the "fully supported" count.
- 114 stalled features are the bottleneck for 75%+ — these features have zero velocity and block any path to higher readiness regardless of investment.
- Near-complete stalled features are extremely high-ROI — 20 features at 80-95% need only 827 failing subtests fixed (2.6% of all stalled failing subtests) but would add 20 fully-supported features.
- Velocity is accelerating — both WPT scores and feature completion rates are increasing, driven by growing contributor base.
- Scaling is sublinear — doubling contributors doesn't halve the time. Getting to 50% faster costs more in total but less in time.
- Coverage gaps remain — 154 Widely Available features (26%) have no WPT or collector data, mostly JS built-ins. These will need attention as the WPT manifest grows.
- Strategic focus > raw headcount — Unblocking stalled features and fixing regressions would matter more than adding contributors to features already improving.
Scripts & Outputs#
- `data/analyze.mjs` — Current readiness snapshot (Baseline status, by-year breakdown, feature lists)
- `data/velocity.mjs` — Historical velocity analysis (score changes, commit correlation, biggest movers)
- `data/projections.mjs` — Projection model and cost calculator (per-feature extrapolation, contributor scaling, target-date solver)
- `data/stalled.mjs` — Stalled features analysis (regressions, platform areas, impact modeling)
- `data/generate-dashboard.mjs` — Generates the interactive HTML dashboard
- `dashboard.html` — Interactive visualization with 8 charts (open in browser). Regenerate with `node data/generate-dashboard.mjs`