Show HN: Id-agent – Token efficient UUID alternative for AI agents

pranshuchittora 36 points 51 comments May 19, 2026
github.com

Discussion Highlights (15 comments)

nither

Smart idea, but one concern is that tokenization techniques and libraries may change in the future. This also looks like a very niche optimization to me. But overall, it deserves to exist. Good job.

whazor

I would be afraid of accidental prompt injection.

felipeyanez

Any plans for a Python port?

Tiberium

Is this just a reinvented humanhash?

Falimonda

A benchmark comparing conventional UUIDs and AIDs across models (hallucination rate, token usage) would be cool!

simedw

Nice package, not only is using words more token-efficient [saving time and money], but weaker models are also less likely to make mistakes when providing the key, at least in my tests. That said, for `createAliasMap`, don't you think you could create a deterministic mapping from and to UUIDs <-> word chains? That way, no additional state would be needed. [Might require fairly long word chains...]
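
A minimal sketch of the stateless mapping suggested above, to show how long the word chains get. It is not id-agent's API: the function names and the toy 4096-entry wordlist (two consonant-vowel syllables per "word") are assumptions made for illustration. At 12 bits per word, a full 128-bit UUID needs 11 words.

```python
import uuid

# Toy wordlist (assumption): 16 consonants x 4 vowels = 64 syllables,
# two syllables per word = 4096 words = 12 bits per word.
CONSONANTS = "bdfghjklmnprstvz"
VOWELS = "aeio"
SYLLABLES = [c + v for c in CONSONANTS for v in VOWELS]
WORDS = [a + b for a in SYLLABLES for b in SYLLABLES]
INDEX = {w: i for i, w in enumerate(WORDS)}

BITS_PER_WORD = 12
WORDS_PER_UUID = 11  # ceil(128 / 12)


def uuid_to_words(u: uuid.UUID) -> str:
    """Encode the 128-bit UUID value as 11 hyphen-separated words."""
    n = u.int
    parts = []
    for _ in range(WORDS_PER_UUID):
        parts.append(WORDS[n & (len(WORDS) - 1)])  # lowest 12 bits
        n >>= BITS_PER_WORD
    return "-".join(reversed(parts))


def words_to_uuid(chain: str) -> uuid.UUID:
    """Decode a word chain back to the UUID; raises on malformed input."""
    words = chain.split("-")
    if len(words) != WORDS_PER_UUID:
        raise ValueError("expected %d words" % WORDS_PER_UUID)
    n = 0
    for word in words:
        n = (n << BITS_PER_WORD) | INDEX[word]  # KeyError on unknown word
    return uuid.UUID(int=n)  # ValueError if the value exceeds 128 bits


u = uuid.uuid4()
chain = uuid_to_words(u)
print(chain)
print(words_to_uuid(chain) == u)
```

No alias table is needed because the chain carries every bit of the UUID, but the cost is visible: 11 words per ID, which may erase much of the token saving.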

thrance

An even better solution is to present the AI with local IDs and map those to UUIDs outside of its context. So when giving a list of items for the LLM to choose from, just list them with incremental numbers (1, 2, 3...) and ask for these numbers in tool schemas.
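
A rough sketch of that local-ID approach, assuming a per-conversation map that lives outside the model's context. The class name `LocalIdMap` and its methods are hypothetical, not part of any library mentioned in the thread.

```python
import uuid


class LocalIdMap:
    """Hands out 1, 2, 3... for UUIDs and resolves the numbers back."""

    def __init__(self):
        self._by_local = {}  # int -> uuid.UUID
        self._by_uuid = {}   # uuid.UUID -> int

    def local_id(self, real_id: uuid.UUID) -> int:
        """Return the number for this UUID, assigning the next one if new."""
        if real_id not in self._by_uuid:
            n = len(self._by_local) + 1
            self._by_local[n] = real_id
            self._by_uuid[real_id] = n
        return self._by_uuid[real_id]

    def resolve(self, n: int) -> uuid.UUID:
        """Map a number returned by the model back to the real UUID."""
        if n not in self._by_local:
            raise ValueError("unknown local id: %r" % (n,))
        return self._by_local[n]


# Build the list shown to the model; only the numbers enter the prompt.
ids = LocalIdMap()
items = {uuid.uuid4(): "Invoice March", uuid.uuid4(): "Invoice April"}
prompt_lines = ["%d. %s" % (ids.local_id(k), title) for k, title in items.items()]
print("\n".join(prompt_lines))

# Suppose the tool call comes back with item number 2.
print(ids.resolve(2))
```

A number the model invents fails in `resolve` instead of silently addressing the wrong record.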

railka

Why do people choose the hyphen ("-") as the separator in an identifier? When double-clicking, the ID does not select completely, unlike when an underscore ("_") is used.

jy14898

I don't like that the comparison isn't apples to apples: fewer bits will of course take fewer tokens.

> Where UUIDs cost ~23 tokens and get hallucinated by LLMs

How does this solve the hallucination problem? Just removing the hyphens (-) from the example UUID takes it from 26 tokens to 18.

mrweasel

Can someone explain why this would even be needed? Why is there a cost to generating, say, a UUIDv4? E.g. Claude Code has some regex in the client-side code that filters out "bad words", so why can't the agent just generate UUIDs client side, using zero tokens? I sort of get the "problem", but the fact that this is even needed is stupid.

synthos

Isn't this solving a subproblem of the overall issue of uncompressed tool calls polluting the context? Furthermore, this could be compressed even further with a dynamic legend of every UUID in the context. So UUID@Bravo and UUID@Delta would be the actual symbols in the context but dynamically replaced when calling tools.
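
One way the dynamic legend could look, as a sketch only. The `Legend` class, the alias list, and the regular expressions are assumptions; the `UUID@Name` form follows the comment above.

```python
import re

UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
ALIAS_RE = re.compile(r"UUID@([A-Za-z0-9]+)")
NAMES = ["Alpha", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot"]


class Legend:
    """Swaps UUIDs for short aliases on the way in and back on the way out."""

    def __init__(self):
        self.alias_of = {}  # lowercase uuid string -> alias name
        self.uuid_of = {}   # alias name -> lowercase uuid string

    def _alias_for(self, raw: str) -> str:
        key = raw.lower()
        if key not in self.alias_of:
            i = len(self.alias_of)
            name = NAMES[i] if i < len(NAMES) else "N%d" % i
            self.alias_of[key] = name
            self.uuid_of[name] = key
        return self.alias_of[key]

    def compress(self, text: str) -> str:
        """Replace every UUID in text with UUID@<alias> before it reaches the model."""
        return UUID_RE.sub(lambda m: "UUID@" + self._alias_for(m.group(0)), text)

    def expand(self, text: str) -> str:
        """Restore real UUIDs in the model's tool-call arguments.

        Aliases that were never issued are left untouched so the caller can
        reject them.
        """
        return ALIAS_RE.sub(lambda m: self.uuid_of.get(m.group(1), m.group(0)), text)


legend = Legend()
raw = ("move 123e4567-e89b-12d3-a456-426614174000 "
       "after 9b2f0c1e-5d3a-4f6b-8c7d-0a1b2c3d4e5f")
print(legend.compress(raw))
print(legend.expand("delete UUID@Bravo"))
```

The legend itself stays outside the context, so each UUID costs the model only a short symbol regardless of how often it appears.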

nkmnz

Neat idea! I'd argue that the collision risk is basically zero even though the entropy is lower, because you must validate the LLM output anyway, for two reasons: 1. LLMs might lack intrinsic entropy and reuse some UUIDs much more often. 2. Referential integrity is as important as collision resistance. An LLM must be able to reuse the correct ID in the correct place. On the other hand, using a dictionary for the IDs helps with readability, but depending on the model's strengths, it might also add a confounder. After all, tokens that represent real words will probably influence attention differently than random numbers.
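
The validation step described above can be as small as a set lookup. A sketch, assuming the application keeps the set of IDs it has issued; the function name and the sample IDs are made up for illustration.

```python
def split_known_unknown(returned_ids, issued_ids):
    """Return (known, unknown) lists, preserving the order of returned_ids."""
    issued = set(issued_ids)
    known = [i for i in returned_ids if i in issued]
    unknown = [i for i in returned_ids if i not in issued]
    return known, unknown


# Hypothetical word-style IDs; the real format may differ.
issued = ["calm-river-stone", "bold-amber-finch"]
from_model = ["bold-amber-finch", "calm-river-store"]

known, unknown = split_known_unknown(from_model, issued)
print(known)
print(unknown)
```

Anything in `unknown` is either a hallucinated or a mistyped ID and should be rejected or sent back to the model, whatever the ID format.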

yunusabd

That's nice, I've had the issue where LLMs would return non-existent uids. But does this package actually help with that? Token savings are nice, but not really my main concern. If this can measurably reduce hallucinations, it would be really useful.

> Where UUIDs cost ~23 tokens and get hallucinated by LLMs, id-agent produces memorable word-based IDs at ~14 tokens with equivalent collision resistance.

diimdeep

Just use nanoid(5): https://github.com/ai/nanoid
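
For scale, a back-of-the-envelope sketch of what five characters buy. nanoid's default alphabet has 64 symbols, so `nanoid(5)` carries 30 bits; the helper below applies the standard birthday approximation and is not part of nanoid.

```python
import math


def collision_probability(n_ids: int, bits: float) -> float:
    """Birthday approximation: p = 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n_ids * (n_ids - 1) / 2 ** (bits + 1))


nanoid5_bits = 5 * math.log2(64)  # 5 characters x 6 bits each
uuid4_bits = 122                  # random bits in a version 4 UUID

for n in (1_000, 100_000):
    print(n,
          collision_probability(n, nanoid5_bits),
          collision_probability(n, uuid4_bits))
```

At a thousand IDs a 30-bit space is comfortable; at a hundred thousand a collision becomes more likely than not, so the choice depends on how many IDs share one namespace.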

ericyd

I really don't understand the problem this is solving even after reading a bunch of comments. What's the use case where this would be beneficial?
