Chris Miksanek
Computer Consultant for the
Czechoslovak Genealogical Society International
Computer Chair for the
Czech and Slovak American Genealogy Society of Illinois
Chris Miksanek
Box 755
Rochester, MN 55903
chris@miksanek.com
http://www.miksanek.com/chris
Where do you find Central European fonts? How does a US keyboard express them? What considerations are in order specifically for Windows, Macintosh, OS/2, or DOS users? These are questions this presentation will address.
which are the same ASCII values expressed with different fonts. When you load a document for display or print, your computer takes the ASCII value of a byte of data and retrieves the character from the corresponding position in the selected font matrix to express the specified character.
Generally, an 'a' is an 'a' on all platforms in any country because most characters in the ‘lower ASCII’ (also called 7-bit ASCII) range (0 through 127) are mapped to the same positions on standard font matrices. The diacritical characters, however, are generally found in the upper ASCII range (128 through 255). From font to font and machine to machine, THIS is where you’ll find the inconsistencies that make it difficult to compute seamlessly in a multilingual environment.
It would be nice if there were just one codepage (font matrix), but there are several. For Czech/Slovak support there is 852 for DOS; 1250 for Windows; 8859-2 for ISO (an international standard); and CE for the Macintosh. If operating system developers (Apple, IBM, Microsoft) and standards organizations like ISO and ANSI are talking to one another, it is not obvious: the codepages are significantly inconsistent. Further, many third-party font developers don’t comply with the prevailing codepage of the platform for which they are developing a font.
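The lookup described above can be sketched in a few lines of Python. This is only an illustration: it assumes the codecs that ship with Python's standard library (cp852, cp1250, iso8859_2, mac_latin2) are fair stand-ins for the four codepages named here, and the helper name lookup is hypothetical.

```python
# Codec names from Python's standard library, used here as stand-ins
# for the font matrices of each platform (an assumption for illustration).
CODEPAGES = ["cp852", "cp1250", "iso8859_2", "mac_latin2"]


def lookup(value, codepage):
    """Return the character a codepage stores at a given byte value."""
    return bytes([value]).decode(codepage)


# Lower range: byte 97 is looked up in every codepage.
lower = {cp: lookup(97, cp) for cp in CODEPAGES}
# Upper range: byte 231 is looked up in every codepage.
upper = {cp: lookup(231, cp) for cp in CODEPAGES}

for cp in CODEPAGES:
    print(f"{cp:12} 97 -> {lower[cp]!r}   231 -> {upper[cp]!r}")
```

Byte 97 comes back as 'a' from every codepage, while byte 231 does not come back as the same character everywhere, which is the inconsistency in the upper range.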
For example, 's-hacek' is mapped differently on five of the most popular Czech fonts. Windows WorldFont CzechRome, a popular third-party font, maps it to ASCII-243; LeedsBit, a shareware ISO-8859-2 font, maps it to ASCII-185; TimelTEE, a Windows CP1250-compatible font, maps it to ASCII-154; Times CE, the Apple Czech font, maps it to ASCII-228; and Greg’s Czech, a popular Macintosh third-party font, maps it to ASCII-223. DOS, by the way, maps that same character to ASCII-231. DOS WordPerfect 5.1 has yet another method for maintaining the characters: its own.
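The standards-based values above can be checked with a short Python sketch. It assumes the standard-library codecs cp852, cp1250, iso8859_2 and mac_latin2 correspond to the DOS, Windows, ISO and Macintosh CE codepages; the third-party fonts (CzechRome, Greg’s Czech) have no such codec and are not covered.

```python
# Standard-library codec names standing in for each platform's codepage
# (an assumption; the labels on the left are only for display).
PLATFORMS = {
    "DOS (852)": "cp852",
    "Windows (1250)": "cp1250",
    "ISO 8859-2": "iso8859_2",
    "Macintosh CE": "mac_latin2",
}

S_HACEK = "\u0161"  # lowercase s with hacek (caron)

# Encode the character under each codepage and keep the single byte value.
positions = {
    name: S_HACEK.encode(codec)[0] for name, codec in PLATFORMS.items()
}

for name, value in positions.items():
    print(f"{name:15} ASCII-{value}")
```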
Some punctuation marks, e.g., curly-quotes, are problematic, as well. Macintosh curly-quotes are ASCII-210 and ASCII-211 on the Apple codepage, but when those ASCII values are displayed on a Windows machine, ASCII-210 and ASCII-211 are
A similar problem has been observed with Microsoft Excel for Windows: the text in a cell may be displayed in any font, but the ‘Formula Bar’ always uses Arial, thus misrepresenting any diacritical characters the cell might contain. Some of the newer genealogical databases allow font specifications, but inconsistencies are found there too: some text displays in a font that may not be substituted. Macintosh PAF 2.3.1 simply documents this anomaly rather than fixing it.
Other application problems with diacritical characters include a sorting order that ‘breaks’ due to differences in the alphabet, date format differences, and a problematic ‘find’ command. E-mail has its own problem: only ASCII values from 0 through 127 may be sent in e-mail or posted to USENET newsgroups.
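The sorting problem mentioned above can be seen in a small Python sketch. The word list, the ALPHABET string and the simple_key helper are all hypothetical; the alphabet is a deliberately simplified ordering (it ignores the digraph 'ch' and treats every accented letter as fully distinct), so it is not a complete Czech collation.

```python
# Simplified alphabet order, for illustration only (an assumption:
# lowercase input, no handling of the digraph 'ch').
ALPHABET = "aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def simple_key(word):
    """Rank each letter by its position in the simplified alphabet."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]


words = ["zima", "čaj", "cesta", "dub"]

by_value = sorted(words)                     # plain character-value order
by_alphabet = sorted(words, key=simple_key)  # simplified alphabet order

print(by_value)     # 'čaj' lands after 'zima'
print(by_alphabet)  # 'čaj' lands between 'cesta' and 'dub'
```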
In some cases, it is best to do without diacriticals. E-mail and USENET are two examples where they are often not used. If necessary, use UUENCODE to transmit 8-bit ASCII data.
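As a rough sketch of the UUENCODE idea, the Python below uses the standard binascii module to turn 8-bit data into a 7-bit line and back. It handles a single line of at most 45 bytes and omits the begin/end lines of a full uuencoded file; the sample word and the helper is_seven_bit are illustrative assumptions.

```python
import binascii


def is_seven_bit(data):
    """True if every byte falls in the 0-127 range."""
    return all(b < 128 for b in data)


# A word with diacriticals, stored under the Windows 1250 codepage.
raw = "Plzeň".encode("cp1250")

# One uuencoded line (binascii.b2a_uu accepts at most 45 bytes per call).
line = binascii.b2a_uu(raw)

# The receiving side reverses the step to recover the original bytes.
restored = binascii.a2b_uu(line)

print(is_seven_bit(raw))   # False: the n-hacek byte is above 127
print(is_seven_bit(line))  # True: safe for a 7-bit channel
print(restored == raw)     # True
```

The receiver still needs to know which codepage the original bytes were written in, since uuencoding only protects the byte values in transit.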
Use a popular font and one that conforms to some published standard;
Develop a cross-platform data exchange strategy;
When sharing data, always identify the font used.
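One possible shape for such an exchange strategy, sketched in Python: if the sender identifies the codepage, the receiver can re-map the bytes to the local one. The function name convert and the sample bytes are hypothetical, and the standard-library codecs cp852 and cp1250 are assumed to match the DOS and Windows codepages; fonts that follow no published standard would need a hand-built mapping instead.

```python
def convert(data, source, target):
    """Re-encode bytes from one codepage to another.

    Raises UnicodeEncodeError if the target codepage lacks a character.
    """
    return data.decode(source).encode(target)


# The word "šest" as stored under DOS 852: s-hacek is byte 231 there.
dos_bytes = bytes([231, 101, 115, 116])

# Re-map for a Windows 1250 font, where s-hacek is byte 154.
win_bytes = convert(dos_bytes, "cp852", "cp1250")

print(list(dos_bytes))
print(list(win_bytes))
```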
To find fonts, begin your search at
http://www.csagsi.org
To express characters without the proper keyboard support, select the character from a palette of characters. For both Windows and Macintosh users, there are utilities to accomplish this.


