We measure before we claim. And we publish our failures.
0.512 N71 overall: the highest of any memory system in the study (best published: 0.42)
0.55 vs 0.03 Cascade: when a fact changes, dependent facts update
0.35 vs 0.01 Absence: knowing what it no longer knows
We ran the full 100-episode suite, with the benchmark's published episodes and judges, through our production pipeline. All 1,188 questions are downloadable for validation.