Accuracy
We Publish Our Track Record
Every prediction STEWARD makes is extracted, tracked, and verified against real-world outcomes. When we're wrong, it shows up here. To our knowledge, no other intelligence provider publishes its accuracy, and that is our trust differentiator.
Predictions are extracted from intelligence briefs by AI, then verified against FRED (Federal Reserve Economic Data) series, government signals, and procurement outcomes. Scores update daily.
Live Stats
Current Accuracy Metrics
Collecting Data
Verification In Progress
We are currently collecting outcome data to validate our predictive models. We will publish accuracy metrics and full methodology once we can prove our predictions match reality. We believe transparency is earned through results, not claims.
--
Predictions Extracted
0
Verified So Far
--
Accuracy TBD
Process
How We Track Accuracy
Extract Predictions
Every intelligence brief with an eval score above 42 is scanned. Forward-looking statements are extracted with predicted values, timeframes, and categories.
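As an illustration only, the eligibility filter and the shape of an extracted prediction might look like the Python sketch below. The class and field names (`Brief`, `Prediction`, `eval_score`, and so on) are hypothetical and are not STEWARD's actual schema; only the threshold of 42 and the three extracted attributes come from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# From the text: only briefs with an eval score above 42 are scanned.
EVAL_SCORE_THRESHOLD = 42


@dataclass
class Brief:
    """Hypothetical stand-in for an intelligence brief."""
    brief_id: str
    eval_score: float
    text: str


@dataclass
class Prediction:
    """Hypothetical record for one extracted forward-looking statement."""
    brief_id: str
    statement: str
    predicted_value: float
    target_date: date   # the timeframe by which the outcome should be known
    category: str       # e.g. "commodity", "regulatory", "procurement"


def briefs_to_scan(briefs: List[Brief]) -> List[Brief]:
    """Keep only briefs whose eval score is strictly above the threshold."""
    return [b for b in briefs if b.eval_score > EVAL_SCORE_THRESHOLD]
```

A brief scoring exactly 42 is excluded in this sketch, since the description says "above 42".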
Wait for Outcome
Each prediction has a target date. We wait until that date passes before attempting verification. No premature scoring.
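The waiting rule can be expressed as a simple date check. This is a minimal sketch, assuming "passes" means the current date is strictly later than the target date; the function name `ready_for_verification` is hypothetical.

```python
from datetime import date


def ready_for_verification(target_date: date, today: date) -> bool:
    """Return True only once the target date has fully passed."""
    # Strictly greater: a prediction is not scored on its target date itself.
    return today > target_date
```

Under this assumption, a prediction targeting 10 January would first become eligible for verification on 11 January.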
Verify Against Reality
Commodity predictions are verified against FRED economic data, regulatory predictions against government signals, and procurement predictions against actual award data.
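One way to picture this routing is a lookup from prediction category to verification source. The sketch below is an assumption about structure, not a description of STEWARD's implementation; the dictionary keys and the `source_for` helper are hypothetical names.

```python
from typing import Optional

# Category-to-source pairs as described in the text above.
VERIFICATION_SOURCES = {
    "commodity": "FRED economic data",
    "regulatory": "government signals",
    "procurement": "actual award data",
}


def source_for(category: str) -> Optional[str]:
    """Return the verification source for a category, or None if unknown."""
    # Assumption: a category with no known source cannot be verified
    # automatically and would need manual review.
    return VERIFICATION_SOURCES.get(category.strip().lower())
```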
Score and Publish
Each prediction is scored: exact match (100%), within 20% of the actual value (80%), directionally correct (50%), or wrong (0%). Predictions that cannot be verified receive no score and are flagged for manual review.
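For numeric predictions, the grading could be sketched in Python as follows. Several details here are assumptions made for illustration: the 1% tolerance for an "exact" match, the use of a `baseline` value (the level when the prediction was made) to judge direction, and the function name `score_prediction`. None of these is specified above.

```python
from typing import Optional

EXACT_TOLERANCE = 0.01   # assumption: "exact" means within 1% of the actual value
WITHIN_TOLERANCE = 0.20  # from the text: within 20% of the actual value


def score_prediction(predicted: float,
                     actual: Optional[float],
                     baseline: float) -> Optional[float]:
    """Grade one numeric prediction; None means unverifiable."""
    if actual is None:
        return None  # no outcome data: flag for manual review

    # Relative error against the observed value, guarding against actual == 0.
    if actual != 0:
        rel_error = abs(predicted - actual) / abs(actual)
    else:
        rel_error = 0.0 if predicted == 0 else float("inf")

    if rel_error <= EXACT_TOLERANCE:
        return 1.0
    if rel_error <= WITHIN_TOLERANCE:
        return 0.8

    # Direction: did the predicted and actual values move the same way
    # relative to the baseline?
    if (predicted - baseline) * (actual - baseline) > 0:
        return 0.5
    return 0.0
```

For example, with a baseline of 100, a prediction of 110 against an actual value of 120 has a relative error of about 8% and scores 0.8, while a prediction of 200 against an actual value of 110 misses the magnitude but gets the direction right and scores 0.5. In this sketch the magnitude checks run first, so a prediction within 20% of the actual value scores 0.8 regardless of direction.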
Scoring
Outcome Scoring Rules
Every prediction that reaches verification receives one of five outcome grades: four scored outcomes, plus Unverifiable when the data is insufficient. These are the same scores used internally to calibrate our models.
| Outcome | Score | Criteria |
|---|---|---|
| Exact Match | 1.0 | Prediction was essentially correct in both direction and magnitude |
| Within 20% | 0.8 | Predicted value was within 20% of actual observed value |
| Directionally Correct | 0.5 | Right direction but magnitude was significantly off |
| Wrong | 0.0 | Prediction was clearly incorrect |
| Unverifiable | N/A | Insufficient data to determine the outcome; flagged for manual review |
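The text does not state how individual grades roll up into a headline accuracy figure. One plausible approach, shown here purely as an assumption, is the mean of all scored outcomes with unverifiable predictions (represented as `None`) left out. The function name `accuracy` is hypothetical.

```python
from typing import Iterable, Optional


def accuracy(scores: Iterable[Optional[float]]) -> Optional[float]:
    """Mean of scored outcomes; None entries (unverifiable) are excluded."""
    scored = [s for s in scores if s is not None]
    if not scored:
        return None  # nothing verified yet, so there is no figure to report
    return sum(scored) / len(scored)
```

Under this assumption, grades of 1.0, 0.8, and 0.0 plus one unverifiable prediction average to 0.6 across the three scored outcomes, and an empty or fully unverifiable set yields no figure at all rather than 0%.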
Trust Through Transparency
We believe intelligence providers should be accountable for the accuracy of their analysis. If we can't show our track record, we shouldn't be selling intelligence.