Of Mice and Keyboards

Fonts and Input:
Where were you in 1999?

This page was originally written about a particular language using a particular script. But the underlying information is the same for everyone.

Characters and Fonts

Fix this date in your minds: 1999.

Oh, wait, it’s there already. So much the better.

For computer purposes, 1999 is the year the Unicode Standard was introduced. Technically it has been around since 1991, and is still evolving and growing. But it didn’t really hit its stride until 1999, with Unicode 3. Among other things, this revision introduced the UCAS (Unified Canadian Aboriginal Syllabics) block. It covers Inuktitut syllabics along with the characters used for writing other languages across a wide swath of northeastern Canada.

Before 1999

Before 1999, most computers recognized only one language—that is, one kind of writing. European and American computers used Roman script; Greek computers wrote in Greek; Arab and Russian computers used Arabic and Cyrillic; and so on. If you needed to write in a language your computer didn’t know—or a language no computer knew, like Sanskrit or Inuktitut or classical Greek—it had to be done by fakery.

As far as the computer was concerned, you were typing plain Roman text. To make it look like the language you wanted, such as Inuktitut syllabics, you put it in a font made especially for your target language. In your chosen font the letters “x” and “s” were no longer two crossed lines (x) or a squiggle (s); instead they might look like triangles standing on their sides (ᐊ and ᐅ). And the same for the rest of the alphabet. The person reading your file had to have the same font installed, or they would see only gibberish. You could always print out the text and send out paper copies—but then there’s not much point to having computer files in the first place.

Today, fonts of this type are known collectively as “legacy fonts”. They worked like this. Say you wanted to write ᐄ, or ᐋᒃᑲ, or ᐃᓄᒃᓱᓗᒃᑖᑎᒃᑲ ᐊᓯᐅᔪᑦ. What you actually typed would depend on the font you used.

In Nunacom, you’d type  |w or +x4v or wk4hl4|bt4v xysJ5
In ProSyl:  `w or <x4v or wk4hl4]bt4v xysJ5
In AiPaiNunavik:  ™ or €4v or wk4hl4Ìt4v xysJ5

Or the other way around. Type Aa Ii Uu, change the computer’s font, and you’d see:

Nunacom: ᒍᖑ ᖤᓂ ᙵᒥ
ProSyl: ᒍᖑ ᖤᓂ !ᒥ
AiPaiNunavik: ᒍᖑ Iᓂ =ᒥ

Luckily there are tools that will convert text from any one of these “legacy” fonts into another—or into Unicode or various other formats. For small chunks of text like these examples, I used the one over at Inuktitut Computing. Other tools can deal with whole web pages or long documents. Your old files aren’t lost forever just because your computer no longer has the ProSyl font.
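At heart, a converter of this kind is a lookup table. The Python sketch below is a toy, not the Inuktitut Computing tool. Its table holds only the Nunacom keys that can be read off the three sample phrases above, and it assumes (from those same samples) that "|" and "+" both mean "put the long-vowel dot on the next syllable". The names NUNACOM, DOT_PREFIXES, LONG and nunacom_to_unicode are made up for illustration.

```python
# Toy Nunacom-to-Unicode converter. The table is inferred from the sample
# phrases in this article only; a real converter covers the whole font.
NUNACOM = {
    "w": "ᐃ", "x": "ᐊ", "s": "ᐅ",
    "k": "ᓄ", "h": "ᓱ", "l": "ᓗ",
    "b": "ᑕ", "t": "ᑎ", "v": "ᑲ",
    "y": "ᓯ", "J": "ᔪ",
    "4": "ᒃ", "5": "ᑦ",
}

# Assumption: "|" and "+" are both prefixes that add the long-vowel dot.
DOT_PREFIXES = {"|", "+"}
LONG = {"ᐃ": "ᐄ", "ᐊ": "ᐋ", "ᑕ": "ᑖ"}


def nunacom_to_unicode(text):
    out = []
    dotted = False
    for ch in text:
        if ch in DOT_PREFIXES:
            dotted = True                  # remember the dot for the next syllable
            continue
        syllable = NUNACOM.get(ch, ch)     # unknown keys pass through unchanged
        if dotted:
            syllable = LONG.get(syllable, syllable)
            dotted = False
        out.append(syllable)
    return "".join(out)


print(nunacom_to_unicode("|w"))
print(nunacom_to_unicode("+x4v"))
print(nunacom_to_unicode("wk4hl4|bt4v xysJ5"))
```

ProSyl or AiPaiNunavik would need their own tables, since the same keystrokes mean different things in each font.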

Did you spot the Euro sign up there? Before 1998, it didn’t exist. Unicode 2.1 introduced exactly two new characters. The € was one of them. But unlike most Unicode characters, this one was so important it replaced a character in both Mac and Windows computers. If you have twenty-year-old text files in plain English, one uncommon character will have morphed into the Euro. The computer doesn’t care. All it sees is a number.
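If you want to watch a number change its meaning, here is a small Python sketch using the code-page tables that ship with Python. The choice of byte 0xDB is mine, not something from the old fonts: it happens to be the slot where Python’s mac_roman table puts the Euro, while the older ISO 8859-1 ("latin-1") table reads the same byte as a capital U with a circumflex.

```python
# One byte, different code pages. The number never changes; only the
# table used to interpret it does.
byte = bytes([0xDB])

print(byte.decode("latin-1"))      # ISO 8859-1 reading of 0xDB
print(byte.decode("mac_roman"))    # Mac Roman reading of 0xDB

# In Windows code page 1252 the Euro sits at a different number, 0x80.
print(bytes([0x80]).decode("cp1252"))

# In Unicode the Euro has one number of its own, whatever the platform.
print(f"U+{ord('€'):04X}")
```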

The character is what the computer sees.

The font is what you see. The computer does not know and does not care what you think a particular letter is supposed to look like.

After 1999

With Unicode, it became possible for computers to use any and all characters, in just about any language, as they were actually written. When I type ᖃᑉᓗᓈᖑᔪᖓ into my computer, it stores the real word, exactly as I typed it, not “c2l~NaJz” (Nunacom and ProSyl) or “c2lˆaJz” (AiPaiNunavik).
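You can check this for yourself. The following Python sketch (the variable names are just for illustration) lists what is actually stored for the Unicode word, and compares it with the Nunacom/ProSyl string from the paragraph above.

```python
import unicodedata

word = "ᖃᑉᓗᓈᖑᔪᖓ"     # what a Unicode file stores
legacy = "c2l~NaJz"      # what a Nunacom or ProSyl file stored

# Print each stored character as a code point with its official name.
for ch in word:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Every character of the Unicode word sits in the UCAS block (U+1400-U+167F);
# every character of the legacy string is plain ASCII.
print(all(0x1400 <= ord(ch) <= 0x167F for ch in word))
print(legacy.isascii())
```

The legacy string only turns into syllabics when a particular font is applied; the Unicode word is syllabics no matter which font draws it.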

This doesn’t do any good, of course, if you can’t see what you wrote. My browser knows that മാലയാലം is seven different letters—but all I see is seven boxes, because I don’t have the necessary font installed. (You probably don’t either. It’s Malayalam.)

The earliest Inuktitut fonts showed up within months after syllabics entered the Unicode Standard. First came the “third-party” fonts that you had to hunt down for yourself; later came fonts that were pre-installed in your computer. For most people that means Euphemia, the UCAS font now included with both Windows and Mac operating systems, including iOS. Next-likeliest is Pigiarniq, the Inuktitut font you get from the Nunavut government. There are plenty of others, though. At last count I’ve got fifteen, along with a fistful of legacy fonts. Pick the one that looks nicest to you.

Input

When we’re talking about text, “input” generally means something involving the keyboard. It doesn’t have to mean this: you could click on a picture of a letter and the computer will do the rest, just as if you’d typed it. But let’s keep it simple and stick with keyboard input.

Legacy fonts tended to come with a keyboard overlay, or a chart that you could post next to your computer so you knew which key led to which letter. But changing the overlay, or replacing the printed diagram, didn’t really change anything. You were still typing an a, no matter what you wanted it to look like. In a way it’s like italics. An “a” in upright type, an “a” in italics and an “a” in some other typeface may all look entirely different, but as far as your computer is concerned they’re all Character 97. Or Character 61, depending on whether your computer has ten fingers or sixteen. What the letter is, and what it looks like, are entirely different things. And both are different from what you did to make it look that way.
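The ten-fingers-or-sixteen remark is about decimal versus hexadecimal counting. A three-line Python check, using only built-in functions:

```python
print(ord("a"))         # 97, counting in tens
print(hex(ord("a")))    # 0x61, counting in sixteens
print(chr(0x61))        # a, the same character either way
```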

So Now What Do I Do?

When Unicode characters came in, the big problem wasn’t changing existing old-style text files into new Unicode ones. All that took was a few computer geeks to run up a few programs that anyone could use. The problem was how to enter all those new characters. A keyboard with several thousand keys, like a Chinese typewriter, was obviously not going to cut it.

The solution was to make keyboard layouts that pointed to a particular set of characters. For the user, that means selecting a menu item or popup, or clicking a button next to a list. You do not need a closet full of physical keyboards. With the right keyboard layout selected, your fingers can be hitting the “a” key, but onscreen you’ll see ᐊ or ᖑ as before. So what’s the difference? Unlike the old system, the characters you get after typing that “a” are real ᐊ’s or ᖑ’s. They don’t require a specific font to keep from reverting into “a”. Have you got Pigiarniq while the person reading your document has Euphemia? No problem. It’s just like writing your e-mail in Times Roman and sending it to someone who prefers Helvetica.

As with fonts, this is not much use if your computer doesn’t have the appropriate keyboard layout—or if you can’t figure out how to get to it.

User, meet keyboard. Keyboard, meet user.

The first part, having the keyboard, will depend on your computer and its operating system. Newer computers should have one or more syllabic keyboards built in. My Mac offers a choice of four—along with two ways to type Cherokee. Windows Vista and later systems include at least one Inuktitut keyboard layout; if you are on Windows XP, there is an available keyboard but you will have to install it yourself. I don’t know what the Linux options are—but this is OK, because Linux users tend to know exactly what they are doing and don’t need my help.

At worst, you may have to hunt around for a third-party keyboard layout. Or make your own. Designing a keyboard layout, or changing some detail of an existing one, can be done in an afternoon, with no programming required. If, like me, you find yourself constantly typing ᑯᖯᓗ when you meant to say ᑯᑉᓗ, just edit the keyboard to make it do what you want. If you’ve never been able to unlearn the ProSyl key correspondences, see if you can find or make a keyboard layout that maps keys to characters the way you’re used to. The document will come out exactly the same as if you’d used one of the standard keyboard layouts.

The input is how the contents of your brain turn into the contents of the computer. The computer does not know and does not care what keys you hit or what buttons you clicked to get there.
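Seen from the computer’s side, a keyboard layout is little more than a table from keys to characters. Here is a minimal Python sketch with two hypothetical layouts. LEGACY_STYLE borrows the x, s and w key positions mentioned earlier for the old fonts; PHONETIC is invented for this example, and neither is copied from a real shipping layout.

```python
# Two hypothetical layouts that reach the same three characters
# from different keys.
LEGACY_STYLE = {"x": "ᐊ", "s": "ᐅ", "w": "ᐃ"}
PHONETIC = {"a": "ᐊ", "u": "ᐅ", "i": "ᐃ"}


def type_keys(keys, layout):
    """Return the characters a layout produces for a run of key presses."""
    return "".join(layout.get(key, key) for key in keys)


first = type_keys("xsw", LEGACY_STYLE)
second = type_keys("aui", PHONETIC)

# Different fingers, identical text: the stored result keeps no trace
# of which keys were pressed.
print(first, second, first == second)
```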

The second part, using the keyboard, again depends on your computer. For some people, this may mean it’s time to read the manual. It may also help to know that the Mac and Windows keyboards have one “You don’t know unless you know” feature in common. After you’ve selected the syllabic keyboard, you also have to turn on Caps Lock to go into Inuktitut (or Cherokee) mode. Without Caps Lock, you’re in ordinary Roman type. With it, you’ve got a whole new set of keyboard layouts: with and without Shift, with and without Alt or Option.

You may think that Shift plus Caps Lock is redundant. But to the computer they are completely unrelated. It’s not like the old Shift Lock on a manual typewriter, where the letters were physically moved up or down as a block. Instead, the capital-locking function has been separately written into most keyboard layouts for your convenience. In fact, when I made my own version of the SuperGreek layout, I added the Caps Lock range. It wasn’t in the original.
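One way to picture this is as a layout with separate layers, one per combination of switches. The Python sketch below is purely hypothetical: the LAYERS table, the press function and the particular characters on each layer are invented for illustration and are not taken from any real Mac or Windows layout.

```python
# A hypothetical layout with layers. Caps Lock and Shift are independent
# switches, so together they select one of four layers for each key.
LAYERS = {
    # (caps_lock, shift): characters on that layer
    (False, False): {"a": "a"},   # plain Roman
    (False, True):  {"a": "A"},   # Shift only
    (True, False):  {"a": "ᐊ"},   # Caps Lock: syllabics
    (True, True):   {"a": "ᐋ"},   # Caps Lock plus Shift: invented for the example
}


def press(key, caps_lock=False, shift=False):
    """Look the key up on whichever layer the two switches select."""
    layer = LAYERS[(caps_lock, shift)]
    return layer.get(key, key)


print(press("a"), press("a", shift=True))
print(press("a", caps_lock=True), press("a", caps_lock=True, shift=True))
```

Because each layer is its own table, a layout author has to fill in the Caps Lock layers deliberately; nothing puts them there automatically.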

To switch back to English, hit Caps Lock again. This means that as long as you don’t do a lot of SHOUTING IN ALL CAPS—which is hard to do with the Shift key alone—you may never need to change keyboard layouts again.

This week’s Official Inuktitut Fonts Link is here. If you are using Microsoft products on a Windows computer, you will probably find all the information you need. In particular, there’s a nice PDF showing you how to enable the keyboard. Another place to look is the Fonts-and-Keyboards pages at the Pirurvik Centre.

Obligatory disclaimer: Addresses of Nunavut Government web sites tend to change at the drop of a hat. If the page you need is no longer there, backtrack to the main page and look for anything that says “Fonts”.

In short:

The character is where you are. Think character-ᒥ.

The input is how you got there. Think input-ᒃᑯᑦ. You don’t have to pronounce it. Just think it.

The legacy font is what it looks like. Think . . . Oh, go ahead. Think ᓲᕐᓗ. It looked like a snow house, but when you got closer it turned out to be a plastic replica.

The Unicode font is what you’re wearing. It might be a ᖁᓕᑦᑕᖅ lovingly hand-stitched by your grandmother, or a bright purple down jacket. Either way, you’re the same person and everyone recognizes you.

How Do I Type UCAS Syllabics on my Smartphone?

Short answer: It depends.

Longer answer: . . . on whether you’ve got an iOS device or an Android.

Displaying syllabics on iOS is not a problem. The Euphemia font—the same one you see in your full-size Mac or Windows computer—was added way back in iOS 5. So unless you’ve got the world’s oldest iPhone, any text containing syllabics should display just fine. Even before iOS 5, the Mobile Safari browser supported font embedding. That means a web page could include a font like Euphemia or Pigiarniq and everything would look as intended, without resorting to images.

Typing syllabics is a whole different matter.

iOS 7 and Earlier

The four UCAS and two Cherokee keyboards that Mac users take for granted are reduced to a single Cherokee keyboard. The “Keyman” app promises much but doesn’t seem to have heard of syllabics; so far the only Inuktitut it offers is Kalaallisut (Greenlandic), which is written in Roman type.

There’s one small loophole. Even if you can’t type syllabics, you can copy and paste. So if you have a long block of text, you can type or paste the Roman version into an online transcoder, convert to syllabics, and then paste back into your destination. But this is pretty cumbersome if you’ve got multiple separate text blocks.

iOS 8 and Later

Beginning with iOS 8, you have the option of third-party keyboards. So even if syllabics are not yet included in the built-in OS, you can go to the App Store and download the free Inuktitut Naqittautit (ᓇᕿᑦᑕᐅᑎᑦ) from the good people at the Pirurvik Centre.

Android

On Android, unlike iOS, you can download and install extra fonts. For example, there’s myAlpha, which works with the MultiLing keyboard, apparently developed by the same person. The fine print says Android “2.1 and up”; that should be comprehensive enough. It is probably safe to assume that the font will not include the ᐀ inuksuk character (present in Pigiarniq and Uqammaq, but not in standard system fonts such as Euphemia).

... and the Rest

If you know a lot about fonts and input, you may notice that I haven’t said anything about Rendering Support beyond the font level. If you don’t know a lot, breathe easy: I’m not holding out on you. Right now, the more complicated aspects of Rendering Support apply mainly to some Asian languages (the whole South and Southeast Asian cluster) and to Mediterranean scripts such as Hebrew and Arabic. The developers of Greek and UCAS Unicode scripts both avoided the problem by making pre-combined letters. A form such as ᐄ or ᾦ may require more than one keystroke, but inside your computer it’s all one character. You may recognize this as the same system used for common European letter-and-accent pairs like é or ñ.
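The one-character claim is easy to test with Python’s standard unicodedata module. In this sketch, NFC and NFD are the Unicode normalization forms that respectively combine and take apart letter-and-accent sequences; the sample strings are the ones from the paragraph above, plus an "e" followed by a combining acute accent.

```python
import unicodedata

# An e plus a combining acute accent is two code points; NFC normalization
# folds the pair into the single pre-combined character.
pair = "e\u0301"
print(len(pair), len(unicodedata.normalize("NFC", pair)))

# The syllabic is one code point, and Unicode defines nothing to split it into.
print(len("ᐄ"), unicodedata.normalize("NFD", "ᐄ") == "ᐄ")

# The Greek letter is also stored as one code point, although NFD can
# take it apart into a base omega plus combining marks.
print(len("ᾦ"), len(unicodedata.normalize("NFD", "ᾦ")))
```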

I have also not said one word about the threat to codepoint 1400—the jolly little inuksuk at the beginning of the Pigiarniq and Uqammaq fonts. Maybe if we ignore the problem it will go away.

Too specialized, or too tangential, or too . . . well, whatever. If you want to yak about that kind of thing, contact me directly.

Help!

Not sure what UCAS fonts you’ve got? Try this simple test.

Have you inherited a document that looks like gibberish, but you’re pretty sure it’s supposed to be in syllabics? Try the Legacy Fonts page.

If you have a font-related question and can’t find an answer, drop me an e-mail. I may or may not know, but it can’t hurt to ask. Conversely, let me know if you have information that you think belongs in this group of pages.