Accuracy You Can Trust

We believe in transparency. Rather than making vague claims, we publish real accuracy data — tested against official examiner marks — so you can decide for yourself.

Verified against official IGCSE & GCSE examiner results
Quadratic Weighted Kappa (the gold standard): 0.97
The metric exam boards use to measure marker agreement. A score above 0.80 is considered near-perfect; the accepted threshold for AI marking systems is 0.70. Graded Pro scores 0.97 across 387 questions spanning maths and English, and a Wilcoxon signed-rank test confirms no statistically significant difference from the examiner (p = 0.12).
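For anyone who wants to check the arithmetic, both figures can be reproduced with standard open-source tools. Below is a minimal sketch, assuming two parallel arrays of (examiner, AI) marks; the arrays shown are hypothetical stand-ins, not our benchmark data.

```python
# Sketch: computing QWK and the Wilcoxon signed-rank test from paired marks.
# The mark arrays are hypothetical examples, not the 387-question benchmark.
import numpy as np
from sklearn.metrics import cohen_kappa_score
from scipy.stats import wilcoxon

examiner = np.array([3, 5, 2, 6, 4, 1, 5, 3, 7, 0, 8, 4, 6, 2, 5, 9, 1, 4, 7, 3])
ai       = np.array([3, 4, 2, 6, 5, 1, 5, 4, 7, 0, 7, 4, 7, 3, 5, 8, 2, 5, 6, 3])

# Quadratic Weighted Kappa: chance-corrected agreement that penalises
# large disagreements quadratically; 1.0 means perfect agreement.
qwk = cohen_kappa_score(examiner, ai, weights="quadratic")

# Wilcoxon signed-rank test on the paired differences: a large p-value
# means no statistically significant difference between the two markers.
stat, p = wilcoxon(examiner, ai)

print(f"QWK = {qwk:.2f}, Wilcoxon p = {p:.2f}")
```

scikit-learn's cohen_kappa_score with quadratic weights is the standard QWK implementation, and scipy.stats.wilcoxon is the standard paired signed-rank test.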

How We Compare

Metric | Graded Pro | Human Markers*
Quadratic Weighted Kappa | 0.97 | Varies by subject
Correlation with examiner | 0.97 | ~0.70
Average error (structured questions) | 0.26 marks | Not published
Average error (essay questions) | ~2 marks | 5.6 marks
Statistically different from examiner? | No (p = 0.12) | Yes

*Human marker data from a Cambridge Assessment study: 200 English scripts marked by a chief examiner were independently re-marked by experienced markers. Graded Pro results are based on 387 questions across IGCSE Higher Maths (13 students) and GCSE English Language Paper 2, compared against official examiner marks. No mark schemes or student work were adjusted in any way.

Results by Subject

All results are from real examination papers, compared against the actual marks awarded by the official examiner. The only inputs were the students' work and the official mark scheme — nothing was adjusted or modified.

Mathematics
IGCSE Higher · 13 students · 312 questions
QWK: 0.97
Exact match: 79%
Within ±1 mark: 94%
Within ±2 marks: 99.7%
Average error: 0.27 marks
Correlation: 0.97

English Language
GCSE Paper 2 · 75 questions
QWK: 0.97
Exact match: 65%
Within ±1 mark: 83%
Within ±2 marks: 89%
Average error: 0.96 marks
Correlation: 0.98

Structured Questions

Our system excels on questions with defined correct answers — the kind that make up the majority of assessments. Across 356 structured questions in both maths and English:

  • Quadratic Weighted Kappa: 0.97 (near-perfect agreement)
  • Marks identical to the examiner: 80% (356 questions)
  • Within ±1 mark of the examiner: 95% (across subjects)
  • Average error per question: 0.26 marks (across subjects)

Whether it's a 1-mark calculation or an 11-mark multi-step problem, the AI consistently matches professional marking standards.
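These figures follow directly from the paired marks, so they are easy to audit. A minimal sketch of the underlying formulas, again using hypothetical mark arrays rather than our benchmark data:

```python
# Sketch: exact match, within-±1, average error, and correlation from
# paired (examiner, AI) marks. The arrays are hypothetical examples.
import numpy as np

examiner = np.array([2, 4, 1, 6, 3, 5, 0, 7, 4, 2])
ai       = np.array([2, 4, 1, 5, 3, 5, 1, 7, 4, 3])

diff = np.abs(ai - examiner)

exact_match = np.mean(diff == 0)               # share of identical marks
within_one  = np.mean(diff <= 1)               # share within ±1 mark
mae         = diff.mean()                      # average error per question
corr        = np.corrcoef(examiner, ai)[0, 1]  # correlation with examiner

print(f"Exact: {exact_match:.0%}  ±1: {within_one:.0%}  "
      f"MAE: {mae:.2f}  r: {corr:.2f}")
```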

Extended Writing & Essays

Levelled questions, where markers use band descriptors to assess quality, are harder for any marker, human or AI. Our system uses a structured levelling process modelled on how trained markers work: identify the best-fit level, then position the mark within it (sketched in code after the results below).

  • Average error of around 2 marks on levelled questions
  • QWK of 0.95 on levelled questions alone
  • 97% correlation with professional markers across all question types
  • Significantly outperforms the average experienced human marker (mean absolute error of 2.1 marks vs 5.6)
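To make that two-step process concrete, here is a rough sketch of best-fit levelling. The band boundaries, fit scores, and function names are illustrative placeholders, not our actual marking pipeline.

```python
# Rough sketch of two-step best-fit levelling (hypothetical bands and scores):
# step 1 picks the band whose descriptors fit the response best,
# step 2 positions the mark within that band.
from dataclasses import dataclass

@dataclass
class Level:
    name: str
    lo: int  # lowest mark in the band
    hi: int  # highest mark in the band

LEVELS = [
    Level("Level 1", 1, 4),
    Level("Level 2", 5, 8),
    Level("Level 3", 9, 12),
]

def award_mark(fit_scores: dict[str, float], position: float) -> int:
    """fit_scores: how well each level's descriptors fit the response.
    position: where the response sits within the band (0.0 bottom, 1.0 top)."""
    best = max(LEVELS, key=lambda lvl: fit_scores[lvl.name])
    return best.lo + round(position * (best.hi - best.lo))

# Example: Level 2 descriptors fit best; the response sits high in the band.
print(award_mark({"Level 1": 0.2, "Level 2": 0.9, "Level 3": 0.4}, 0.7))  # 7
```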

What This Means For You

AI marking is not a replacement for your professional judgement — it's a tool that handles the heavy lifting so you can focus on what matters.

Where the AI is strongest

Short-answer questions, calculations, retrieval tasks, and structured responses across all subjects. On these question types, the AI marking is highly reliable and ready to use as-is.

Where you should review

Extended writing and essay-style responses at the very top of the mark range. The AI occasionally under-marks the strongest responses by a few marks. A quick review of your highest-performing students' work is good practice.

What we recommend

Use AI marking to get a fast, accurate first pass across a full class set. Moderate a sample — just as you would with any marking — and adjust where needed. Teachers who use this approach typically report saving 50–70% of their marking time.

Works With Any Mark Scheme

Our system isn't locked to specific subjects or curricula. Upload any mark scheme or rubric and the AI adapts — whether you're marking a maths paper, a history essay, a science practical write-up, or a language analysis.

Not Just Exams

Our accuracy benchmarks are based on formal examination papers, but Graded Pro is built for everyday marking across all types of student work. The same AI that matches chief examiner standards on exam scripts delivers consistent, rubric-linked feedback on:

  • Homework — weekly assignments marked and returned the same day, with actionable next steps
  • Classwork and in-class tasks — quick, consistent feedback while the learning is still fresh
  • Termly tests and mock exams — full cohort marking with detailed breakdowns by question
  • Coursework drafts — formative feedback that helps students improve before final submission
  • Past paper practice — students get instant, exam-standard feedback on every attempt

Wherever there's a rubric or mark scheme, Graded Pro delivers accurate, detailed feedback — whether the stakes are high or the goal is simply helping students learn from their work.

Our Commitment

We continuously test and improve our marking accuracy. We don't claim perfection — no marker, human or AI, achieves that. What we do promise is transparency about where the system performs well and where it has limitations, so you can use it with confidence.

See For Yourself

Sign up for a free trial with 150 credits and test it on your own papers.

Start Free Trial

No credit card required