The experience of rendering Arabic typography and its technical debt

bookofjoe 217 points 56 comments June 13, 2026
lr0.org

Discussion Highlights (14 comments)

adam_rida

Very interesting. Arabic is a good reminder that text rendering is mostly solved only for the scripts that shaped the defaults. The hard part is that typography, shaping, bidi behavior, font fallback, search, and the editor model all leak into each other. You cannot fix one layer cleanly when the assumptions are wrong in all of them.

yorwba

A more academic treatment of justifying Arabic-script text can be found in https://quod.lib.umich.edu/j/jep/3336451.0023.104?view=text;...

mohamedkoubaa

I'd like to see some more mainstream usage of disconnected fonts for Arabic, for example like these: https://www.arabaddigital.com/en/article/2100-Quarantining-o...

jansan

Very interesting. I just implemented a text shaper and renderer from scratch with support for complex scripts like Arabic, Nastaliq and Indic (will soon post about it here on HN). Now that you write about it, the lack of stretching really is a deficiency in the OpenType spec. If you want a solution for this it has to happen in the rendering step, not the shaping (which is HarfBuzz's main task). The shaper has no information about the available space, but when rendering you could stretch individual glyphs to the desired width, similar to adjusting the width of whitespace in Latin, but more complex, because you actually have to modify the glyphs with a scale transform. I am not an expert on Arabic script by any means, but this should be possible IMO. It would at least be an interesting experiment. Of course the JSTF table would be the right way to do it, but there seems to be a lot of confusion around it. Maybe in the age of LLMs we can give it another shot.
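To make the render-time stretching idea above concrete, here is a minimal sketch, not anything from HarfBuzz or the OpenType spec. It assumes the shaper has already produced glyph advances and that something has marked which glyphs may be stretched. The `Glyph` type, the `stretchable` flag, and `stretch_line` are hypothetical names invented for illustration, and the slack is split evenly, which a real typesetter would probably not do.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str          # hypothetical glyph label, for readability only
    advance: float     # natural advance width from the shaper
    stretchable: bool  # assumption: some earlier step marks elongation candidates


def stretch_line(glyphs, target_width):
    """Return (name, horizontal scale) pairs that fill target_width.

    The leftover space is divided evenly among stretchable glyphs and
    expressed as a scale factor on each glyph's advance.
    """
    def can_stretch(g):
        return g.stretchable and g.advance > 0

    natural = sum(g.advance for g in glyphs)
    slack = target_width - natural
    count = sum(1 for g in glyphs if can_stretch(g))
    if slack <= 0 or count == 0:
        # Nothing to distribute, or nowhere to put it: leave glyphs untouched.
        return [(g.name, 1.0) for g in glyphs]

    extra = slack / count
    return [
        (g.name, (g.advance + extra) / g.advance if can_stretch(g) else 1.0)
        for g in glyphs
    ]


line = [Glyph("beh", 10, True), Glyph("alef", 5, False), Glyph("seen", 20, True)]
scales = stretch_line(line, 45)
```

With a natural width of 35 and a target of 45, each of the two stretchable glyphs receives 5 extra units. A plain scale transform also thickens vertical strokes, which is one reason the comment calls this an experiment rather than a solution.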

amluto

Is this a small typo?

> The relevant rule, W2 of UAX #9, reclassifies a digit as an ARABIC NUMBER if any of the previous strong characters in the paragraph were Arabic letters, and as a EUROPEAN NUMBER otherwise. Both render their internal digits left-to-right, which is correct: numbers everywhere on Earth are read most-significant-first.

Does the author mean most-significant-on-the-left? The statement as written is a statement about the order in which one reads or perhaps thinks the number, whereas I think the author is discussing how numbers, including collections of numbers delimited by hyphens and such, should be laid out on the page.
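For readers who want to see the W2 rule in action, here is a simplified sketch using Python's standard `unicodedata` module. It follows the rule as worded in UAX #9, where the search goes backward from each European number to the nearest strong character. It ignores embedding levels, isolates, and the other weak-type rules, and treats the whole string as one run. The function name `apply_w2` is made up for this example.

```python
import unicodedata

STRONG = {"L", "R", "AL"}


def apply_w2(text):
    """Return one bidi class per character after a simplified W2 pass.

    A European number (EN) becomes an Arabic number (AN) when the nearest
    preceding strong character is an Arabic letter (AL).
    """
    last_strong = None  # stands in for the start-of-sequence type
    result = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG:
            last_strong = cls
        if cls == "EN" and last_strong == "AL":
            cls = "AN"
        result.append(cls)
    return result


after_arabic = apply_w2("\u0628 12")       # BEH, space, "12"
after_latin = apply_w2("a 12")
after_mixed = apply_w2("\u0628 a 1")       # Latin letter intervenes
```

The same ASCII digits come out as `AN` after an Arabic letter and stay `EN` after a Latin one. Either way the digits inside a number keep their left-to-right order; what changes is how the number interacts with its neighbours.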

throw-the-towel

This article is wonderful. It's interesting, it's captivating, full of detail, and to think I never gave much thought to Arabic rendering before. This part nearly had me chuckle audibly:

> He says yes. The result is "Simplified Arabic": initial fused into medial, final into isolated, ligatures dropped. It conquers the Arab newsroom in a generation. Mrowa is assassinated at his desk eight years later, by an unrelated faction, in an unrelated dispute.

Also, it's depressing how hundreds of millions of people couldn't even get their language typeset on a computer, and our industry meanwhile was busy building AI-native AI for your groceries (have we mentioned it has AI btw?) and similar performative bullshit.

evmar

One thing I sometimes think about when I think about text layout problems is how the text we use also has a bunch of complexities that we take for granted. Think of variable-width characters and kerning and ligatures and hyphenation and justification. Imagine computers had been won by a CJK language, which has none of these problems. You could imagine a similar article about how exotic and difficult English layout is.

VeninVidiaVicii

Disclaimer: I’m not fluent in Arabic by any means, but the stretched-out-to-both-margins style looks very Quranic to me. I don’t think it looks appropriate for, say, a message about my DoorDasher.

slim

Internet Explorer 5.5 implements text-justify: kashida. For one brief, weird browser-quarter Microsoft is the only software vendor on earth that can justify Arabic correctly on a screen.

evilturnip

Arabic script is a great test to see if your terminal/renderer/UI can handle anything: contextual shaping, cursive connectivity, bidirectional text layout, diacritics and vertical displacement. I went down this rabbit hole a while back and it made me really appreciate the complexity of the script.

kqr

This explains a lot of why I think Arabic looks so beautiful on the page. I also love the way it can sound when spoken. Shame I don't have a good enough reason to take the years I would need to learn it.

samat

I feel so sorry for Arabs now. I just read the paragraph about the everyday experience of trying to write mixed English-Arabic text in email or any other editor.

> I have watched senior engineers, fluent in both Arabic and English, give up on writing a long email in Outlook on a Wednesday afternoon because the cursor would not behave, and switch to Arabic-only or English-only because the cognitive cost of fighting the editor exceeded the cost of monolingual phrasing. Actually I remember very well suffering this while using Facebook for the first time in my life, and I could not register; I was very slow typer that when I reached the moment the cursor does this weird thing, I would just stare at it and never progress.

> This is the ordinary experience of writing mixed Arabic-English text in 2026, in every major editor, email client, and chat application I know of. The pettier cousins are everywhere too, and I collect them: a range like 10–20 silently reading as twenty-to-ten, because digits are weak and the dash is neutral; a trailing exclamation mark teleporting to the far end of the line; a password, toggled visible, displaying in an order that does not match what was typed. None of these are anyone's bug, exactly.

My own Cyrillic struggles are nothing in comparison.
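The "digits are weak and the dash is neutral" remark in the quoted passage can be checked directly against the Unicode character database. The sketch below uses Python's standard `unicodedata.bidirectional`; the grouping of classes into strong, weak, and neutral follows UAX #9, while the helper names `bidi_strength` and `describe` are invented here for illustration.

```python
import unicodedata

STRONG = {"L", "R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}
NEUTRAL = {"B", "S", "WS", "ON"}


def bidi_strength(ch):
    """Classify one character as strong, weak, or neutral for bidi purposes."""
    cls = unicodedata.bidirectional(ch)
    if cls in STRONG:
        return "strong"
    if cls in WEAK:
        return "weak"
    if cls in NEUTRAL:
        return "neutral"
    return "other"  # explicit embedding/override/isolate controls, or unassigned


def describe(text):
    """Return (bidi class, strength) for each character in text."""
    return [(unicodedata.bidirectional(ch), bidi_strength(ch)) for ch in text]


range_info = describe("10\u201320!")  # digits, EN DASH, digits, exclamation mark
```

None of the characters in `10–20!` carries a direction of its own, so where each piece lands is decided by the surrounding text and the paragraph direction. That is consistent with the reversed range and the wandering exclamation mark described in the quote.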

Georgelemental

See also: https://notarabic.com/

qingcharles

I could write all day about how badly RtL languages are treated on electronic devices. Most interfaces just slop RtL text into an interface that is otherwise the totally wrong way around. One thing that amuses me is that people share these "safe zone" templates for short form video to make sure your content isn't hidden behind the buttons: https://imgur.com/a/MrIQZen But look over the shoulder of someone using the Arabic version of TikTok and you'll realize how flawed that is: https://imgur.com/a/5iLXd2o
