GitHub Is Becoming a Giant AI Code Dump

Athena-maref 23 points 24 comments June 24, 2026
maref.cc

Discussion Highlights (18 comments)

manytimesfail

AI generated code is the new plastic... GitHub is the Pacific Garbage patch.

esrh

ironically this is probably ai written too

AmazingTurtle

> AI users were actually 19% slower, but they thought they were 20% faster.
I don't know who made those numbers up, but for me... I can almost certainly guarantee I have never been so relaxed before. I'm doing multiple paid projects simultaneously thanks to AI, still leaning back, and customers are happy. I can confidently say: if you know how to leverage it properly, you can be both more efficient and more relaxed at the same time. I'd also argue that if you use a combination of SOTA models to code and review, and put in some thoughts of your own too, then the code is also GG.

zitrusfrucht

I love how so many blog posts that criticize AI generated code are completely AI written.

bel8

I think that's one of the reasons why VSCode adds AI as coauthor by default on commits when GitHub Copilot was used. To try to tag AI commits in an attempt to help filter that out in model training. Controversial default to say the least. https://github.com/microsoft/vscode/pull/310226
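The tagging idea in this comment relies on Git's `Co-authored-by:` commit trailer. Below is a minimal sketch of how a dataset filter might use that trailer. It assumes the co-author name or email contains a recognizable marker such as "copilot". The marker list and the function names are hypothetical, and the exact trailer text VS Code writes is not stated in the comment.

```python
import re

# Hypothetical markers: the co-author string a given tool writes may differ.
AI_COAUTHOR_MARKERS = ("copilot",)

# Matches one "Co-authored-by: Name <email>" line anywhere in the message.
# Simplification: a stricter parser would only look at the trailer block
# at the end of the commit message.
TRAILER_RE = re.compile(
    r"^co-authored-by:[ \t]*(?P<name>[^<\n]*?)[ \t]*<(?P<email>[^>\n]*)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def coauthors(message: str) -> list[tuple[str, str]]:
    """Return (name, email) pairs for every Co-authored-by line found."""
    return [(m.group("name"), m.group("email")) for m in TRAILER_RE.finditer(message)]


def is_ai_coauthored(message: str, markers: tuple[str, ...] = AI_COAUTHOR_MARKERS) -> bool:
    """True if any co-author name or email contains one of the markers."""
    return any(
        marker in name.lower() or marker in email.lower()
        for name, email in coauthors(message)
        for marker in markers
    )
```

A filter like this only catches commits where the trailer was left in place, so it would undercount AI-assisted commits whenever the default is switched off or the trailer is edited out.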

piker

I tend to agree with the title, but the content seems both AI generated and somewhat dated. The "feels 20% faster but is actually 19% slower" finding is, I believe, a few years old at this point. I'm as skeptical as the next person, but I think it would be hard to find a metric on which modern LLMs make devs "19% slower".

mDyJzDPmBdG

I must say I am more worried that I now stop in the middle of an article wondering, "Was this written by an LLM?"

fhe

this article itself is an AI dump...

mercurialsolo

This whole thing reeks of AI slop writing - wtf allowed this to be on the main page

59nadir

I saw someone on lobste.rs proudly say that they haven't written a line of Zig code in their life. They have 31 Zig repositories on GitHub. GitHub is useless at this point. (As you might imagine, they also post on HN regularly and are quite "AI positive".)

feverzsj

Don't worry, the bubble will burst soon after the midterm election.

bradgessler

Human Code Dump → AI Code Dump. Also, water is wet and the sky is blue.

bshepard

I am happy to see how much collective negativity there is toward this kind of pointless, time-wasting, inaccurate, cringey computer-generated boilerplate. Maref, whoever you are, please consider actually writing your own words, not outsourcing them to an algorithm that cannot and will not ever be able to create readable prose.

throwaway2027

Why would it matter? It was of benefit to someone, I assume. Even if the code is AI generated, I think in most cases the results were positive, so in theory the next AI training runs should be able to learn from those generated results, although that might require changes in how training is done or more aggressive filtering. It's literally free RLHF training.

npunt

Can we all band together and agree to flag articles that are so obviously AI written to be engagement bait and devoid of anything meaningful to discuss? To use an AI-ism: HN isn't an AI blog dump, it's a community.

dateusz

the blog is entirely AI driven lmao

jdw64

What I'm sensing is that even HN might be surfacing recommendations for certain advertised products. It feels like a narrative being pushed to sell a governance product called MAREF. Right now, AI is being trained on GitHub in the US and Gitee in China. As GenAI code increases, one could argue that the open source ecosystem will degrade from a high-quality dataset reviewed by humans to a codebase of plausible-looking AI-generated code. And once we start referencing that polluted data, the entire system could deteriorate. But I don't really understand why MAREF is supposed to be the answer. If we adopt MAREF, then passing MAREF's metrics becomes the target, right? But think about Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure." AI will just produce all sorts of bad code to pass those checks. If you tighten things too much, people will resort to workarounds just to fit through that narrow gap. And is all GenAI code garbage? Honestly, I don't think so. I agree that in the long term, if AI training data gets contaminated, it will degrade, and clearly code that has been reviewed by humans is better. But AlphaDev is a good counterexample: its optimizations to sort 3, 4, and 5 were discovered precisely because an AI was doing the searching. If that's the case, wouldn't it be better to just create an open source project that only accepts human-written code and funnel all the funding into that? In other words, "people who create uncontaminated AI datasets".

TimByte

The funniest part is that the article about AI slime polluting the internet was itself generated from start to finish by a neural net to promote some startup. But the problem they're so clumsily trying to monetize is absolutely real. GitHub is rapidly turning from a place with battle-tested solutions into a dumpster fire of hallucinations. And no crutches like MAREF are gonna fix that because platforms profit from showing growth in repo and commit counts even if it's all dead plastic code.
