Show HN: Slop or not – can you tell AI writing from human in everyday contexts?
I’ve been building a crowd-sourced AI detection benchmark. You see two responses to the same prompt: one from a real human (pre-2022, provably before AI slop became prevalent online), one generated by AI. You pick the slop. Three wrong and you’re out.

The dataset: 16K human posts from Reddit, Hacker News, and Yelp, each paired with AI generations from 6 models across two providers (Anthropic and OpenAI) at three capability tiers. Same prompt, length-matched, no adversarial coaching: just the model’s natural voice with platform context. Every vote is logged with model, tier, source, response time, and position.

Early findings from testing: Reddit posts are easy to spot (humans are too casual for AI to mimic), while HN is significantly harder.

I'll be releasing the full dataset on HuggingFace, and I'll publish a paper if this crowdsourced study gathers enough data. If you play the HN-only mode, you’re helping calibrate how detectable AI is on here specifically.

Would love feedback on the pairs: are any trivially obvious? Are some genuinely hard?
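For anyone curious what the logged data might look like: the post only lists the dimensions recorded per vote (model, tier, source, response time, position), so the field names, value formats, and the `detection_rate` helper below are my own guesses, not the actual schema.

```python
from dataclasses import dataclass

@dataclass
class VoteRecord:
    # Hypothetical schema: the post names these dimensions but not their formats.
    model: str             # e.g. a provider model identifier (assumed)
    tier: str              # one of the three capability tiers
    source: str            # "reddit", "hn", or "yelp"
    response_time_ms: int  # how long the player took to answer
    position: str          # which side the AI text appeared on: "left" or "right"
    guessed_ai_correctly: bool

def detection_rate(votes):
    """Fraction of votes that correctly identified the AI-written text."""
    if not votes:
        return 0.0
    return sum(v.guessed_ai_correctly for v in votes) / len(votes)

votes = [
    VoteRecord("model-a", "small", "reddit", 2100, "left", True),
    VoteRecord("model-a", "small", "hn", 5400, "right", False),
]
print(detection_rate(votes))  # → 0.5
```

With records like these you could slice detection rates by source, tier, or position to check, say, whether HN really is harder than Reddit.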
Discussion Highlights (6 comments)
lucastonelli
Hey, congratulations on the final product. It even feels fun. Some are really hard, but some feel blatantly obvious. I don't know why though. I guess it's just that, sometimes, the way we communicate feels off compared to AI.
SsgMshdPotatoes
Nice idea! Em dashes were giveaways for AI and typos for humans, at least in the ones I did, so those are trivial. You might have to do some filtering for those at least. Some were hard though, yeah (at least if not looking longer than 5-10 seconds). Btw, it seemed more logical to me to just see a green/red card when you click, i.e. green for a right choice and red for a wrong one. Getting red for the correct answer confused me a bit (but this might just be me).
valeena
Was able to get an 8x streak. The question that broke it was really hard; I basically took a guess. Some were hard but spottable after re-reading the answers a good 10 times... ahah.
flossposse
By playing this game I'm helping to train AI how to be less detectable?
apothegm
I keep accidentally clicking on the human one because my brain wants to treat it as “find the human content”. FWIW, I found the “medium” ones hardest. Most of the “hard” ones have dead giveaways in the form of either punctuation or common AI text rhythms.
joegibbs
I got a 19x streak. When they say "curious about", it's always a good sign that it's AI; same with the "X, not Y" construction, saying "genuinely", using phrases like "absolutely slaps" and other millennial slang, and being overly positive: generally sounding like the transcript of an Instagram food review. When they're trying to be casual they seem to default to some kind of 2017 millennial stereotype. Typos and "edit:" are always a good sign that it's human, so I'm sure people will start adding those into AI-generated text to seem more real.