Same AUC, Different Behavior
Same AUC does not mean same tool.
In the last module, we saw that AUC measures how well a tool ranks people from lower to higher risk. But two tools can have the same AUC and still behave very differently.
AUC measures ranking, not how scores map to real-world decisions.
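One way to see why ranking is not the whole story: AUC depends only on the order of the scores, so any strictly increasing rescaling of the scores leaves it unchanged. The sketch below is illustrative only. The labels, the scores, and the names `auc`, `tool_a`, and `tool_b` are made up for this example and do not come from the tools discussed in this module.

```python
def auc(scores, labels):
    """Share of (positive, negative) pairs where the positive case has the
    higher score; ties count as half. This is the pairwise definition of AUC."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event occurred) and hypothetical scores.
labels = [0, 0, 1, 0, 1, 0, 1, 1]
tool_a = [10, 20, 30, 40, 50, 60, 70, 80]

# Hypothetical second tool: a strictly increasing rescaling of tool_a that
# squeezes every score into a narrow band. The ordering is untouched.
tool_b = [40 + s / 16 for s in tool_a]

auc_a = auc(tool_a, labels)
auc_b = auc(tool_b, labels)
print(auc_a, auc_b)  # both are 0.8125
```

The two lists of scores look nothing alike, yet the AUC is identical because every pairwise comparison comes out the same way.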
Same Ranking Ability, Different Scores
Both tools in the figure below have the same AUC. That means they are equally good at ranking individuals from lower to higher risk.
But their scores behave very differently in practice.
Figure: Same AUC, different behavior. Two tools can have identical AUC values while producing very different risk estimates across the score range.

AUC captures ranking ability, but it does not fully describe how scores behave in practice.
How the Scores Differ
- At a score of 20, one tool's estimate is about 20% risk and the other's is about 40%, a gap of 20 percentage points
- At a score of 80, one tool's estimate is about 75% risk and the other's is about 45%, a gap of 30 percentage points
The tools have the same ranking ability, but they produce very different risk estimates.
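The arithmetic behind those two comparisons can be written out as a short sketch. It uses only the approximate percentages quoted above. One assumption is needed: the text does not say which tool is which, so the code treats the first number in each comparison as belonging to the same tool (labelled `tool_a` here) and the second to the other (`tool_b`). Both labels are hypothetical.

```python
# Approximate risk (in percent) at two scores, taken from the text above.
# ASSUMPTION: the first figure at each score belongs to the same tool
# ("tool_a") and the second to the other ("tool_b"). These names are
# hypothetical; the source does not identify the tools.
risk_at = {
    20: {"tool_a": 20, "tool_b": 40},
    80: {"tool_a": 75, "tool_b": 45},
}

# Gap between the two tools at each score, in percentage points.
gaps = {
    score: abs(risks["tool_a"] - risks["tool_b"])
    for score, risks in risk_at.items()
}

# How far each tool's risk estimate moves between score 20 and score 80.
spread = {
    tool: risk_at[80][tool] - risk_at[20][tool]
    for tool in ("tool_a", "tool_b")
}

print(gaps)    # {20: 20, 80: 30}
print(spread)  # {'tool_a': 55, 'tool_b': 5}
```

Under that assumption, one tool's estimate rises by about 55 percentage points between the two scores while the other's rises by only about 5, even though both tools rank people equally well.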
Why This Matters
- Decisions are based on scores and probabilities, not just ranking
- Tools with the same AUC can classify the same person differently once a cutoff is applied to their scores or risk estimates
- Policy outcomes can differ even when AUC is identical
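To make the classification point concrete, here is a minimal sketch of a decision rule applied to the two approximate risk estimates quoted earlier for a score of 80 (about 75% and about 45%). The 50% cutoff, the category labels, and the names `classify`, `tool_a`, and `tool_b` are all hypothetical and chosen only for illustration. They are not taken from any real tool or policy.

```python
RISK_CUTOFF = 50  # hypothetical policy cutoff, in percent


def classify(risk_percent, cutoff=RISK_CUTOFF):
    """Label a case 'high risk' when the estimated risk meets the cutoff."""
    return "high risk" if risk_percent >= cutoff else "not high risk"


# Approximate risk estimates at a score of 80, from the text above.
# Which tool produces which estimate is an assumption made for illustration.
risk_at_score_80 = {"tool_a": 75, "tool_b": 45}

decisions = {tool: classify(risk) for tool, risk in risk_at_score_80.items()}
print(decisions)  # {'tool_a': 'high risk', 'tool_b': 'not high risk'}
```

The same score leads to opposite classifications under this rule, which is how policy outcomes can diverge while AUC stays the same.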
Bottom Line
AUC is useful, but it does not tell you everything about how a tool performs. Two tools can rank people equally well while behaving very differently in practice.