BetaCode to Unicode in Python

BetaCode is a common ASCII transcription for polytonic Greek. I've been dealing with it for around twelve years. (As an aside, back in 1994, I designed a METAFONT for polytonic Greek that enabled one to use BetaCode in TeX; I typeset my self-published Index to the Greek New Testament with it.)

For the last six years, my preference has been to use Unicode, so I wrote a program (initially in Java but then in Python) that uses a Trie to represent the sequences of BetaCode characters that map to a single pre-composed Unicode character.
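To give a feel for the Trie idea, here is a minimal sketch. It is not the code from beta2unicode.py: the names Trie, SAMPLE_PAIRS and convert, and the tiny conversion table, are assumptions made up for illustration. Each BetaCode sequence is stored as a path through the Trie, and conversion takes the longest sequence that matches at each position, so that "A)/" becomes one pre-composed character rather than an alpha followed by stray diacritics.

```python
class Trie:
    """A node in a trie keyed on single characters."""

    def __init__(self):
        self.children = {}   # next character -> child node
        self.value = None    # Unicode output if a key ends here

    def add(self, key, value):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, Trie())
        node.value = value

    def longest_match(self, text, start):
        """Return (value, end) for the longest key found at text[start:],
        or None if no key matches there."""
        node = self
        best = None
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.value is not None:
                best = (node.value, i + 1)
        return best


# Illustrative pairs only (an assumption); a real table is much larger
# and also has to deal with capitals, final sigma and so on.
SAMPLE_PAIRS = {
    "A": "\u03b1",     # alpha
    "A)": "\u1f00",    # alpha with smooth breathing
    "A)/": "\u1f04",   # alpha with smooth breathing and acute
    "G": "\u03b3",     # gamma
    "L": "\u03bb",     # lambda
    "N": "\u03bd",     # nu
    "O": "\u03bf",     # omicron
    "O/": "\u03cc",    # omicron with acute
}

trie = Trie()
for beta, uni in SAMPLE_PAIRS.items():
    trie.add(beta, uni)


def convert(text, trie):
    """Convert text greedily; characters with no match pass through."""
    out = []
    i = 0
    while i < len(text):
        match = trie.longest_match(text, i)
        if match is None:
            out.append(text[i])
            i += 1
        else:
            value, i = match
            out.append(value)
    return "".join(out)


print(convert("LO/GON", trie))
print(convert("A)/N", trie))
```

With this sample table, "LO/GON" comes out as five characters and "A)/N" as two, because the accent and breathing marks are folded into the letter they follow.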

I've had a version available on this site since 2002, but I've now updated it to what I've been using for my most recent work. You can download it at http://jtauber.com/2004/11/beta2unicode.py

At some stage I'll factor out the conversion pairs so the code can be reused for other conversions. The Trie code might be useful in other contexts too.

(Also see Ricoblog's Converting Greek Beta Code into Normalized Unicode.)


originally published on jtauber.com

