If you are lucky enough to have not one but two alphabets in daily use, transliteration will be a regular programming task: converting text from one script (alphabet) to another.
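To make the task concrete, here is a minimal sketch of a hand-written Serbian Latin to Cyrillic routine. It has nothing to do with how MTU works internally; the class name SerbianTransliterator and the method LatinToCyrillic are made up for illustration. It assumes the input uses precomposed letters (č, ć, đ, š, ž as single characters) and it always treats lj, nj and dž as digraphs, which is wrong for the few words where the two letters are pronounced separately (for example "injekcija").

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of MTU.
public static class SerbianTransliterator
{
    // Digraphs map to a single Cyrillic letter and must be checked
    // before the single-letter table.
    private static readonly Dictionary<string, string> Digraphs =
        new Dictionary<string, string>
        {
            { "lj", "љ" }, { "Lj", "Љ" }, { "LJ", "Љ" },
            { "nj", "њ" }, { "Nj", "Њ" }, { "NJ", "Њ" },
            { "dž", "џ" }, { "Dž", "Џ" }, { "DŽ", "Џ" }
        };

    // Position i in Latin corresponds to position i in Cyrillic.
    // Assumes precomposed (NFC) input characters.
    private const string Latin    = "abvgdđežzijklmnoprstćufhcčš";
    private const string Cyrillic = "абвгдђежзијклмнопрстћуфхцчш";

    public static string LatinToCyrillic(string text)
    {
        var result = new StringBuilder(text.Length);
        int i = 0;
        while (i < text.Length)
        {
            // Try a two-letter digraph first.
            if (i + 1 < text.Length &&
                Digraphs.TryGetValue(text.Substring(i, 2), out string digraph))
            {
                result.Append(digraph);
                i += 2;
                continue;
            }

            char c = text[i];
            int index = Latin.IndexOf(char.ToLowerInvariant(c));
            if (index < 0)
            {
                // Not a Serbian Latin letter: copy digits, punctuation etc. as-is.
                result.Append(c);
            }
            else
            {
                char mapped = Cyrillic[index];
                result.Append(char.IsUpper(c) ? char.ToUpperInvariant(mapped) : mapped);
            }
            i++;
        }
        return result.ToString();
    }
}
```

Calling SerbianTransliterator.LatinToCyrillic("Vesic.Org") returns "Весиц.Орг". Even this small version needs a digraph table and case handling, and the reverse direction and the exception words would need more rules still.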
This is not a complicated requirement, and you can easily write the necessary procedures yourself; however, there is a better way:
Microsoft Transliteration Utility (MTU) is a little-known but very useful tool for exactly that purpose. It can transliterate text typed into a text box, or convert one file to another.
There is a set of predefined transliterations:
- Serbian Cyrillic to Latin / Serbian Latin to Cyrillic
- Bosnian Cyrillic to Latin / Bosnian Latin to Cyrillic
- Hangul to Romanization
- Inuktitut to Romanization / Romanization to Inuktitut
- Malayalam to Romanization / Romanization to Malayalam
You are not limited to the set above; you can create your own transliterations using the Module Development Console.
By creating a simple text file, you can use the full power of MTU’s parsing engine: definitions of input and output characters, and transliteration rules, including definitions of new states for the transliteration state machine.
That is not all: you can even use MTU programmatically (please check the EULA before any commercial use):
- Add a reference to MSTranslitTools.DLL (it can be found in %programfiles%\Microsoft Transliteration Utility)
- Add using System.NaturalLanguage.Tools;
- The current transliteration files (.tms) can be found in %CommonProgramFiles%\Transliteration\Modules\Microsoft\
- Here is a simple code fragment to demonstrate:
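    // Load the module definition; the .tms files live in the Modules folder listed above.
    TransliteratorSpecification specification = TransliteratorSpecification.FromSpecificationFile("Serbian Latin to Cyrillic.tms");
    // Build a transliterator from the loaded specification.
    Transliterator transliterator = Transliterator.FromSpecification(specification);
    // Transliterate a sample string and print the result.
    string rezultat = transliterator.Transliterate("Vesic.Org");
    Console.WriteLine(rezultat);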