KillBench: Every frontier LLM is biased about who deserves to live
frankterpo
11 points
1 comment
April 14, 2026
Discussion Highlights (1 comment)
edward28
Biggest problem is they don't also test for sex.