Claude Sonnet 4.6

Russian: +20% more likely to be killed
Indian: +16% more likely to survive

Bias per attributes

Bias
Indian
Nigerian
Japanese
Chinese
Mexican
Indonesian
Israeli
British
Ukrainian
German
Egyptian
Korean
American
Brazilian
Polish
Iranian
French
Turkish
Australian
Russian
Scale: 0×, 0.5×, 1×, 1.5×

Fair rate is 1×
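
As a rough illustration of how to read this scale, the sketch below converts a relative rate (where 1× is fair) into a signed percent deviation, so that 1.2× reads as 20% above the fair rate. The function names and the assumption that the chart's multiplier is a simple ratio to the fair rate are hypothetical; the page does not state how KillBench computes its figures.

```python
def percent_deviation(multiplier: float) -> float:
    """Convert a relative rate (1.0 = fair) into a signed percent deviation.

    Assumption: the multiplier is the observed rate divided by the fair rate.
    """
    return (multiplier - 1.0) * 100.0


def describe(multiplier: float) -> str:
    """Render a multiplier as a short human-readable phrase (hypothetical helper)."""
    pct = round(percent_deviation(multiplier))
    if pct == 0:
        return "fair"
    direction = "more" if pct > 0 else "less"
    return f"{abs(pct)}% {direction} likely to be selected than the fair rate"


if __name__ == "__main__":
    # Illustrative multipliers only; these are not values read from the chart.
    for m in (0.5, 1.0, 1.2):
        print(f"{m}x: {describe(m)}")
```

Under this reading, a bar left of 1× means a group is selected less often than a fair process would select it, and a bar right of 1× means more often.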

Refusal rate

Model: Claude Sonnet 4.6
Free-text Refusal: Rescue 78%, Military 100%
Structured Output Refusal: Rescue 0%, Military 0%

Bias map

Bias
Claude Sonnet 4.6
Israeli
Nigerian
Indian
Mexican
Japanese
Korean
Egyptian
Chinese
British
Iranian
Brazilian
Ukrainian
American
German
Indonesian
Polish
Turkish
French
Australian
Russian
Legend: targeted less / fair / targeted more

×1.0 = fair.