FEEFHS Eastern European Font Presentation


The Use of Eastern-European Fonts
and their Application in Genealogical Applications

Presented at FEEFHS' Third Annual Convention,
Minneapolis, Minnesota
June 1996

Chris Miksanek
Computer Consultant for the Czechoslovak Genealogical Society International

Computer Chair for the Czech and Slovak American Genealogy Society of Illinois

Chris Miksanek
Box 755
Rochester, MN 55903

chris@miksanek.com
http://www.miksanek.com/chris


Diacritical characters, they appear cosmetic: Are they necessary?
In many European languages a diacritically-marked character is as different from its non-marked counterpart as the letters 'A' and 'Z' are to us.
cz_1.jpg
( 'scale' v. 'rock' )

Where do you find Central European fonts? How does a US keyboard express them? What considerations are in order specifically for Windows, Macintosh, OS/2, or DOS users? These are questions this presentation will address.

I think I know what a font is, what’s the problem?
A font is a character set like Helvetica, but can also contain symbols (e.g., Dingbats). The application only knows a byte’s ASCII value, NOT its 'character representation' like
cz_2.jpg

which are the same ASCII values expressed with different fonts. When you load a document for display or print, your computer takes the ASCII value of a byte of data and retrieves the character from the corresponding position in the selected font matrix to express the specified character.

Generally, an 'a' is an 'a' on all platforms in any country because most of the characters are mapped in the same place on standard font matrices for data in the ‘lower ASCII’ (also called 7-bit ASCII) range (0 through 127). But the diacritical characters are generally found in the upper ASCII range (128-255). From font to font and machine to machine, THIS is where you’ll find the inconsistencies which make it difficult to compute seamlessly in a multilingual environment.
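As a rough illustration of the point above, the following Python sketch decodes one lower-ASCII value (97) and one upper-range value (154) under four codepages. The codec names (cp852, cp1250, iso8859_2, mac_latin2) are Python's names for these tables and are assumed here as stand-ins for the font matrices; the helper name glyphs_for is hypothetical.

```python
# Python codec names, assumed here to stand in for the font matrices.
CODEPAGES = ["cp852", "cp1250", "iso8859_2", "mac_latin2"]


def glyphs_for(value):
    """Return {codepage: character} for a single byte value (0-255)."""
    return {cp: bytes([value]).decode(cp) for cp in CODEPAGES}


if __name__ == "__main__":
    for value in (97, 154):
        # repr() keeps non-printing control characters visible.
        print(value, {cp: repr(ch) for cp, ch in glyphs_for(value).items()})
```

Value 97 should come back as 'a' everywhere, while value 154 should come back as different characters depending on the codepage.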

It would be nice if there were just one codepage (font matrix), but there are several. For Czech/Slovak support there is DOS: 852; for Windows: 1250; for ISO (an international standard): 8859-2; and, for the Macintosh: CE. If there is discussion among operating system developers (Apple, IBM, Microsoft) and standards organizations like ISO and ANSI, it is not obvious. Codepages are significantly inconsistent. Further, many third-party font developers don’t comply with the reigning codepage of the platform for which they are developing a font.

For example, 's-hacek' is mapped differently on five of the most popular Czech fonts. Windows WorldFont CzechRome, a popular third-party font, maps it to ASCII-243; LeedsBit, a shareware ISO-8859-2 font, maps it to ASCII-185; TimelTEE, a Windows CP1250-compatible font, maps it to ASCII-154; Times CE, the Apple Czech font, maps it to ASCII-228; and Greg’s Czech, a popular Macintosh third-party font, maps it to ASCII-223. DOS, by the way, maps that same character to ASCII-231. DOS WordPerfect 5.1 has yet another method for maintaining the characters: its own.
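The codepage-conforming values quoted above can be checked with a short Python sketch. It assumes that Python's cp852, cp1250, iso8859_2, and mac_latin2 codecs correspond to the DOS, Windows, ISO, and Macintosh CE codepages; the third-party fonts with their own private mappings (243 and 223) have no codec and are not covered. The name byte_values is hypothetical.

```python
# Codepage labels from the text mapped to Python codec names (an assumption).
CODECS = {
    "DOS 852": "cp852",
    "Windows 1250": "cp1250",
    "ISO 8859-2": "iso8859_2",
    "Macintosh CE": "mac_latin2",
}


def byte_values(char):
    """Return {codepage label: byte value} for one character."""
    return {label: char.encode(codec)[0] for label, codec in CODECS.items()}


if __name__ == "__main__":
    # U+0161 is the lowercase s-hacek.
    for label, value in byte_values("\u0161").items():
        print(f"{label}: {value}")
```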

Some punctuation marks, e.g., curly-quotes, are problematic, as well. Macintosh curly-quotes are ASCII-210 and ASCII-211 on the Apple codepage, but when those ASCII values are displayed on a Windows machine, ASCII-210 and ASCII-211 are

cz_3.jpg

ClarisWorks 'SAVE-AS text' converts curly quotes to straight quotes, which appear to be 'platform safe.'
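The curly-quote mismatch, and the straight-quote workaround, can be sketched in Python. This assumes the Macintosh side uses the standard Apple codepage (Python's mac_roman) and the Windows side uses codepage 1252; a Windows machine set up for a different codepage would show different characters. The name straighten_quotes is hypothetical and is not how ClarisWorks does it.

```python
# Curly double quotes as stored on the Macintosh (values 210 and 211).
MAC_QUOTES = bytes([210, 211])

# Curly quotes and apostrophes mapped to their straight equivalents.
STRAIGHTEN = str.maketrans({
    "\u201c": '"',
    "\u201d": '"',
    "\u2018": "'",
    "\u2019": "'",
})


def straighten_quotes(text):
    """Replace curly quotes with straight, lower-ASCII quotes."""
    return text.translate(STRAIGHTEN)


if __name__ == "__main__":
    as_mac = MAC_QUOTES.decode("mac_roman")   # what the Macintosh user typed
    as_windows = MAC_QUOTES.decode("cp1252")  # same bytes read on Windows
    print(repr(as_mac), repr(as_windows))
    print(straighten_quotes("\u201cscale\u201d"))
```

Because straight quotes live in the lower ASCII range, they keep the same value on every codepage discussed here.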

Application, Printer, and Network Troubles
Spell checking with diacritical font:
cz_4.jpg

is a questionable spelling for the Slovak
cz_5.jpg

The font the application uses to display the spell-check panel is a system font and cannot be substituted.

A similar problem has been observed with Microsoft Excel for Windows: the text in a cell may be displayed in any font, but the ‘Formula Bar’ always uses Arial, thus misrepresenting any diacritical characters the cell might contain. Some of the newer genealogical databases allow font specifications, but inconsistencies are found there too: some text displays in a font that may not be substituted. Macintosh PAF 2.3.1 simply documents this anomaly away.

Other application problems with diacritical characters: sorting order ‘breaks’ due to differences in the alphabet; date formats differ; and the ‘find’ command is problematic. Email problems: only ASCII values from 0-127 may be sent in e-mail and posted to USENET Newsgroups.
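To make the sorting problem concrete, here is a minimal Python sketch that compares a plain sort by character value with a sort that follows a simplified Czech alphabet. The alphabet table is a deliberate simplification (real Czech collation has further rules, such as treating accented vowels only as tie-breakers), and the names ALPHABET, RANK, and czech_key are hypothetical.

```python
# Simplified Czech alphabet: "ch" is one letter filed after "h", and each
# hacek letter follows its plain counterpart.
ALPHABET = [
    "a", "á", "b", "c", "č", "d", "ď", "e", "é", "ě", "f", "g", "h", "ch",
    "i", "í", "j", "k", "l", "m", "n", "ň", "o", "ó", "p", "q", "r", "ř",
    "s", "š", "t", "ť", "u", "ú", "ů", "v", "w", "x", "y", "ý", "z", "ž",
]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def czech_key(word):
    """Turn a word into a list of alphabet positions for use as a sort key."""
    word = word.lower()
    key, i = [], 0
    while i < len(word):
        if word[i:i + 2] == "ch":
            key.append(RANK["ch"])
            i += 2
        else:
            # Characters outside the table sort after every listed letter.
            key.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return key


if __name__ == "__main__":
    names = ["Zeman", "Čermák", "Chalupa", "Hora", "Cibulka"]
    print(sorted(names))                  # ordered by raw character value
    print(sorted(names, key=czech_key))   # ordered by the table above
```

The plain sort files 'Čermák' after 'Zeman' and 'Chalupa' before 'Cibulka', which is the kind of 'break' described above.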

A light at the end of the carpal-tunnel
The approach you take depends on your specific requirements. Do you need to maintain a database that you will be sharing with colleagues in other countries or across operating system platforms (e.g., giving a Macintosh user a copy of your Windows database), or do you just need to compose and print a simple letter or document?

In some cases, it is best to do without diacriticals. Email and USENET are two examples where they are often not used. If diacriticals are necessary, use UUENCODE to transmit 8-bit ASCII data.
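One way to 'do without diacriticals' before sending a message is sketched below in Python. It relies on Unicode decomposition from the standard unicodedata module, which is an approach assumed for this illustration rather than one described in the presentation; the function names are hypothetical. Letters that have no decomposition (the Polish 'ł', for instance) pass through unchanged, so the 7-bit check is worth running afterwards.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks so that, for example, s-hacek becomes plain 's'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_seven_bit(text):
    """True if every character falls in the 0-127 range."""
    return all(ord(ch) < 128 for ch in text)


if __name__ == "__main__":
    surname = "Čermák"
    plain = strip_diacritics(surname)
    print(surname, is_seven_bit(surname))
    print(plain, is_seven_bit(plain))
```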

Use a popular font, and one that conforms to some published standard. Develop a cross-platform data exchange strategy. When sharing data, always identify the font used. To find fonts, begin your search at http://www.csagsi.org
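When the sender has identified a standard codepage, one possible exchange strategy is to convert the data before handing it over. The Python sketch below assumes the data follows one of the published codepages (it cannot help with third-party fonts that use a private mapping), and the function name transcode is hypothetical.

```python
def transcode(data, source, target):
    """Re-encode bytes from one codepage to another by way of Unicode text."""
    text = data.decode(source)
    # The default strict error handling raises an exception if the target
    # codepage lacks a character, rather than silently writing a wrong byte.
    return text.encode(target)


if __name__ == "__main__":
    windows_bytes = "Šimek".encode("cp1250")
    mac_bytes = transcode(windows_bytes, "cp1250", "mac_latin2")
    print(list(windows_bytes))
    print(list(mac_bytes))
```

Only the first value should differ between the two lists, since the remaining letters are lower ASCII.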

To express characters without the proper keyboard support, select the character from a palette of characters. For both Windows and Macintosh users, there are utilities to accomplish this.


cz_6.gif


cz_7.gif


cz_8.gif




The material presented here is Copyright 1996, 2006 Chris Miksanek.
Webmaster: Chris Miksanek
Last updated: October 7, 2006