Do you even need a database?

upmostly 231 points 262 comments April 15, 2026

Discussion Highlights (20 comments)

the_inspector

In many cases, no. E.g. for caching with python, diskcache is a good choice. For small amounts of data, a JSON file does the job (you pointed to JSONL as an option). But for larger collections that should be searchable/processable, postgres is a good choice. Memory, of course, as you wrote, also seems reasonable in many cases.

vovanidze

people wildly underestimate the os page cache and modern nvme drives tbh. disk io today is basically ram speeds from 10 years ago. seeing startups spin up managed postgres + redis clusters + prisma on day 1 just to collect waitlist emails is peak feature vomit. a jsonl file and a single go binary will literally outlive most startup runways. also, the irony of a database gui company writing a post about how you dont actually need a database is pretty based.
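
A minimal sketch of the "jsonl file" idea from this comment, written in Python rather than Go for brevity. The function names and record fields (`append_signup`, `load_signups`, `email`, `ts`) are hypothetical, and it assumes a single writer process; concurrent writers would need locking on top of this.

```python
import json
import os
from datetime import datetime, timezone


def append_signup(path, email):
    """Append one signup as a single JSON line and force it to disk."""
    record = {"email": email, "ts": datetime.now(timezone.utc).isoformat()}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to persist the write before returning


def load_signups(path):
    """Read every record back; blank lines are skipped."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Appending one line per record keeps writes cheap and the file greppable; anything beyond "append and read everything" (updates, deletes, lookups by field) is where the other commenters' concerns start to apply.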

ghc

I'm so old I remember working on databases that were designed to use RAW, not files. I'm betting some databases still do, but probably only for mainframe systems nowadays.

chuckadams

I need a filesystem that does some database things. We got teased with that with WinFS and BeOS's BFS, but it seems the football always gets yanked away, and the mainstream of filesystems always reverts back to the APIs established in the 1980s.

z3ugma

At some point, don't you just end up making a low-quality, poorly-tested reinvention of SQLite by doing this and adding features?

freedomben

I avoided DBs like the plague early in my career, in favor of serialized formats on disk. I still think there's a lot of merit to that, but at this point in my career I see a lot more use case for sqlite and the relational features it comes with. At the least, I've spent a lot less time chasing down data corruption bugs since changing philosophy. Now that said, if there's value to the "database" being human readable/editable, json is still well worth a consideration. Dealing with even sqlite is a pain in the ass when you just need to tweak or read something, especially if you're not the dev.

fatih-erikli-cg

I agree. Databases are useless. You don't even need to load the data into memory. Reading it from disk whenever something is needed should be fine. I don't buy the argument that there are billions of records and so the database must be optimized for handling them. That many records are most likely something like access logs, which I think should not be stored at all. Even with postgres, it is still a file on disk. If there is a need for something like partitioning the data, it is much easier to write the code that partitions it. If there is a need to add something with text inputs, checkboxes, etc., a database with its admin tools may be a good thing. If the data is something that gets imported and exported, a database may be a good thing too. But I still don't believe in such cases; in my ten-something years of software development, something like that never happened.

gavinray

Not to nitpick, but it would be interesting to see profiling info for the benchmarks. Different languages and stdlib methods can often spend time doing unexpected things, which makes what look like apples-to-apples comparisons not quite equivalent.

srslyTrying2hlp

I tried doing this with csv files (and for an online solution, Google Sheets). I ended up just buying a VPS, putting openclaw on it, and letting it Postgres my app. I feel like this article is outdated since the invention of OpenClaw/Claude Opus level AI Agents. The difficulty is no longer programming.

fifilura

Isn't this the same case the NoSQL movement made?

jbiason

Honestly, I have been thinking about the same topic for some time, and I do realize that direct files could be faster. In my (hypothetical, 'cause I never actually sat down and wrote that) case, I wanted the personal transactions in a month, and I realized I could just keep one single file per month, and read the whole thing at once (also 'cause the application would display the whole month at once). Filesystems can be considered a key-value (or key-document) database. The funny thing about the example used in the link is that one could simply create a structure like `user/[id]/info.json` and directly access the user ID instead of running some file to find them -- again, just 'cause the examples used, search by name would be a pain, and one point where databases would handle things better.
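
A rough sketch of the `user/[id]/info.json` layout described in this comment, assuming integer user IDs and a hypothetical `name` field. All function names are made up for illustration. It shows both halves of the point: lookup by ID is a direct path access, while lookup by name has to scan every file.

```python
import json
from pathlib import Path


def user_path(root, user_id):
    # int() rejects IDs like "../x", so a key cannot escape the root directory.
    return Path(root) / "user" / str(int(user_id)) / "info.json"


def put_user(root, user_id, info):
    path = user_path(root, user_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(info), encoding="utf-8")
    tmp.replace(path)  # rename over the old file so readers never see a partial write


def get_user(root, user_id):
    """Key lookup: one path, one read, no scanning."""
    try:
        return json.loads(user_path(root, user_id).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None


def find_ids_by_name(root, name):
    """No index exists, so this opens every user's file."""
    matches = []
    for p in Path(root).glob("user/*/info.json"):
        info = json.loads(p.read_text(encoding="utf-8"))
        if info.get("name") == name:
            matches.append(int(p.parent.name))
    return sorted(matches)
```

The cost of `find_ids_by_name` grows with the number of users, which is the kind of query where an indexed database tends to handle things better.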

forinti

Many eons ago I wrote a small sales web application in Perl. I couldn't install anything on the ISP's machine, so I used file-backed hashes: one for users, one for orders, another for something else. As the years went by, I expected the client to move to something better, but he just stuck with it until he died after about 20 years; the family then took over and had everything redone (it now runs Wordpress). The last time I checked, it had hundreds of thousands of orders and still had good performance. The evolution of hardware made this hack keep its performance well past what I had expected it to endure. I'm pretty sure SQLite would be just fine nowadays.
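
For readers who haven't used file-backed hashes, here is a loose Python analogue using the standard library's `shelve` module. This is not the commenter's Perl code; the function names and the order fields are hypothetical, and it assumes one process writes at a time.

```python
import shelve


def save_order(db_path, order_id, order):
    """Store one order under its ID in a dict-like file on disk."""
    with shelve.open(db_path) as db:
        db[str(order_id)] = order  # shelve keys must be strings


def load_order(db_path, order_id):
    """Fetch one order by ID, or None if it was never saved."""
    with shelve.open(db_path) as db:
        return db.get(str(order_id))
```

As in the comment, one such file per kind of record (users, orders, and so on) gives key lookups with nothing to install, at the price of having no queries beyond "get by key".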

jwitchel

This is a great, incredibly well-written piece. Nice work showing the under-the-hood build-up of how a db works. It makes you think.

randusername

Separate from performance, I feel like databases are a sub-specialty that has its own cognitive load. I can use databases just fine, but will never be able to make wise decisions about table layouts, ORMs, migrations, backups, scaling. I don't understand the culture of "oh we need to use this tool because that's what professionals use" when the team doesn't have the knowledge or discipline to do it right and the scale doesn't justify the complexity.

ForHackernews

Surprised to see this beating SQLite after previously reading https://sqlite.org/fasterthanfs.html

XorNot

I've just built myself a useful tool which now really would benefit from a database and I'm deeply regretting not doing that from the get-go. So my opinion has thoroughly shifted to "start with a database, and if you _really_ don't need one it'll be obvious. But you probably do."

JohnMakin

everyone thinks this is a great idea until they learn about file descriptor limits the hard way
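
A small illustration of the limit this comment refers to, assuming a Unix-like system (Python's `resource` module is not available on Windows). The helper names are hypothetical. The second function shows one common way to stay clear of the limit with a one-file-per-record layout: open, read, and close within a single call instead of keeping handles around.

```python
import resource  # Unix-only standard library module


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def read_record(path):
    """Open, read, and close in one step so no descriptor outlives the call."""
    with open(path, "rb") as f:
        return f.read()
```

Printing `fd_limits()` on a given machine shows how many files a process may hold open at once; code that caches one open handle per record can exceed the soft limit long before disk space or memory becomes a problem.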

Joeboy

Don't know if it counts, but my London cinema listings website just uses static json files that I upload every weekend. All of the searching and stuff is done client side. Although I do use sqlite to create the files locally. Total hosting costs are £0 ($0) other than the domain name.

shafoshaf

Relational Databases Aren’t Dinosaurs, They’re Sharks. https://www.simplethread.com/relational-databases-arent-dino... The very small bonus you get on small apps is hardly worth the time you spend redeveloping the wheel.

MattRogish

"Do not cite the deep magic to me witch, I was there when it was written" If you want to do this for fun or for learning? Absolutely! I did my CS Masters thesis on SQL JOINS and tried building my own new JOIN indexing system (tl;dr: mine wasn't better). Learning is fun! Just don't recommend people build production systems like this. Is this article trolling? It feels like trolling. I struggle to take an article seriously that conflates databases with database management systems . A JSON file is a database. A CSV is a database. XML (shudder) is a database. PostgreSQL data files, I guess, are a database (and indexes and transaction logs). They never actually posit a scenario in which rolling your own DBMS makes sense (the only pro is "hand rolled binary search is faster than SQLite"), and their "When you might need" a DBMS misses all the scenarios, the addition of which would cause the conclusion to round to "just start with SQLite". It should basically be "if you have an entirely read-only system on a single server/container/whatever" then use JSON files. I won't even argue with that. Nobody - and I mean nobody - is running a production system processing hundreds of thousands of requests per second off of a single JSON file. I mean, if req/sec is the only consideration, at that point just cache everything to flat HTML files! Node and Typescript and code at all is unnecessary complexity. PostgreSQL (MySQL, et al) is a DBMS (DataBase Management System). It might sound pedantic but the "MS" part is the thing you're building in code: concurrency, access controls, backups, transactions: recovery, rollback, committing, etc., ability to do aggregations, joins, indexing, arbitrary queries, etc. etc. These are not just "nice to have" in the vast, vast majority of projects. "The cases where you'll outgrow flat files:" Please add "you just want to get shit done and never have to build your own database management system". Which should be just about everybody. 
If your app is meaningfully successful - and I mean more than just like a vibe-coded prototype - it will break. It will break in both spectacular ways that wake you up at 2AM and it will break in subtle ways that you won't know about until you realize something terrible has happened and you lost your data. Didn't we just have this discussion like yesterday ( https://ultrathink.art/blog/sqlite-in-production-lessons )? It feels like we're throwing away 50 years of collective knowledge, skills, and experience because it "is faster" (and in the same breath note that nobody is gonna hit these req/sec.) I know, it's really, really hard to type `yarn add sqlite3` and then `SELECT * FROM foo WHERE bar='baz'`. You're right, it's so much easier writing your own binary search and indexing logic and reordering files and query language. Not to mention now you need an AGENTS.md that says "We use our own home-grown database nonsense if you want to query the JSON file in a different way just generate more code." - NOT using standard components that LLMs know backwards-and-forwards? Gonna have a bad time. Enjoy burning your token budget on useless, counter-productive code. This is madness.
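
To make the "just use SQLite" side concrete, here is a small sketch using Python's built-in `sqlite3` module rather than the Node package named in the comment. The table `foo` and column `bar` come from the comment's example query; the `id` column, the index name, the sample rows, and `find_rows` are assumptions added for illustration.

```python
import sqlite3


def find_rows(conn, bar_value):
    """Parameterized version of SELECT ... FROM foo WHERE bar = 'baz'."""
    cur = conn.execute(
        "SELECT id, bar FROM foo WHERE bar = ? ORDER BY id", (bar_value,)
    )
    return cur.fetchall()


# In-memory database for the example; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, bar TEXT)")
conn.execute("CREATE INDEX idx_foo_bar ON foo (bar)")  # the engine maintains this

# "with conn" wraps the inserts in a transaction: commit on success, rollback on error.
with conn:
    conn.executemany(
        "INSERT INTO foo (bar) VALUES (?)", [("baz",), ("qux",), ("baz",)]
    )

rows = find_rows(conn, "baz")
```

The index, the transaction, and the query planner all come with the library, which is the "MS" part the comment argues you would otherwise end up writing yourself.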
