Too many R packages: CRAN is inundated with submissions

ionychal 93 points 76 comments June 24, 2026
rworks.dev

Discussion Highlights (14 comments)

jdw64

People would typically choose based on CRAN TaskViews or follow conventional methodologies, but what I notice from this is that R is truly a language used only by those who use it. And the people who use it are usually master's students or professors; it's rarely used at the undergraduate level. So even those with that level of academic background and training must have had their own implementation roadblocks. Could that be why the use of R has exploded with the help of AI? Looking at this, I think it's fair to understand that even domain experts found programming difficult. Seeing this, can we really say that AI is always bad? For some people, it has become both the hands and a voice for their words.

Mairoce

Frankly, the bigger problem is an overreliance among R instructors on the tidyverse, an ever-expanding ecosystem of redundant functions and anti-patterns. They’re teaching new R users that everything can be solved with yet another package import and skipping over teaching them how to use the already powerful and intuitive base packages.

nickcageinacage

vibe coding hell is the reason

greenavocado

The solution to this problem will be a web of trust featuring a vouching system that auto-closes PRs by default. I already see this being implemented in projects.

dofm

R slop. Oof. What an awful thing to imagine. It's already the programming language of choice for egregious abuses of good practice.

ianbooker

I see "AI and R" from three perspectives. First, usage: using R for our undergrads in the time of LLMs is brilliant. ChatGPT slops out working code for their needs. Not pretty, but it works better than in 2022. Second, development: mastering R is hard, because it's a calculus (Kalkül). Tidyverse mediates some of it, but still. This is the perfect breeding ground for slopification. Let's see. Third, errata: I would love to know the percentage of science built on R to this day. I mean insights and analysis supported by it and its vast packages. What if somewhere, deep down in the stack, there is an ancient bug that dented all of this? I think AI might help us here, or will review slop negate this?

parsimo2010

I feel like CRAN should be used for packages that are expressly made for others to use, with effort put into the documentation and vignettes. If you’re making a package for a small team or aren’t pushing it to a large audience, then just keep it in a GitHub repository. It is almost as easy to install from GitHub with devtools as it is to install.packages().

f311a

It's the same on any package index now.

piokoch

We have too many videos (since creating one is so easy), too much music (since recording it is so easy), too many books (since publishing an e-book is so easy). Now the same story is happening again, for software. But this time it causes more trouble...

dizhn

CRAN is not a conventional package repo. Its audience is not really people who care about programming or software. It is a means to an end for them and slop is perfectly fine. The language itself is also very simple and has defaults that people don't even bother changing. For example the default output file name. It doesn't ask for an output file name when you save output. As a result of the above, it is full of packages that come with associated datasets right in the package itself. Packages with a tiny script and gigabytes of data. Or perhaps just the data without any actual code. Very weird universe.

alastairr

The surprising thing to me is that it's taken as long as it has for CRAN to have this problem. As others have said, this is happening everywhere.

jochapjo

I'm a recent first-time CRAN submitter. I believe my package went through 2 rounds of human review. I doubt R has a severe "too much AI slop" problem relative to other languages, but I can see how human reviewers would get inundated.

frogperson

Better search could solve this, I think. If packages could be automatically, semantically grouped and made searchable, then there would be far fewer packages. A lot of the time it is simply faster to remake something than to search for something appropriate. I don't think RAG is the right answer; it needs to be more capable than that. I don't know quite what that would look like. I would love to be able to filter out low-effort packages, bad docs, no tests, no recent contributions, and more after typing a semantic query for a library to use.

gdevenyi

As a sysadmin R is the absolute worst for installing and supporting packages.
