A final aspect of localization technology worth discussing is input method engines (IMEs). Input method software is an absolute necessity for languages such as Chinese, Japanese, and Korean. In East Asia, users typically type a word's pronunciation phonetically and then choose characters or character compounds from a pick list. Intelligent input methods sort the pick list by how frequently each character is used, by the context of the text already entered, or by a combination of both. Input methods are also useful for alphabetic scripts, for example to make it easier for users to enter diacritical marks or to guarantee canonical character ordering.
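To make the idea of a sorted pick list concrete, here is a minimal Python sketch of ranking candidates by a combination of frequency and context. It does not reflect how any particular IME works internally: the function name `rank_candidates`, the `context_weight` parameter, and all of the counts are assumptions invented for illustration.

```python
from collections import Counter

# Hypothetical data: phonetic reading -> {character: base usage count}.
# The counts are invented for this example.
CANDIDATES = {
    "ma": {"吗": 50, "妈": 30, "马": 20},
}


def rank_candidates(reading, previous_char, bigram_counts, context_weight=10):
    """Return the candidates for `reading`, best match first.

    Each score is the character's base frequency plus a bonus for how often
    the character has followed `previous_char` (the context).
    """
    candidates = CANDIDATES.get(reading, {})

    def score(item):
        char, freq = item
        # A Counter returns 0 for pairs it has never seen.
        return freq + context_weight * bigram_counts[(previous_char, char)]

    ranked = sorted(candidates.items(), key=score, reverse=True)
    return [char for char, _ in ranked]


# Hypothetical context statistics: 马 has followed 骑 five times.
bigrams = Counter({("骑", "马"): 5})

print(rank_candidates("ma", "", bigrams))    # frequency only
print(rank_candidates("ma", "骑", bigrams))  # frequency plus context
```

With no preceding text the order follows raw frequency; after 骑 the context bonus moves 马 to the top of the list. A real engine would draw both kinds of statistics from much larger corpora and from the user's own typing history.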
While other IMEs for Linux do exist, James Su's Smart Common Input Method (SCIM) platform [1] is arguably the best and most comprehensive. SCIM not only provides a user-friendly platform for end users, but also makes it very easy for developers to add new input methods. The platform installs seamlessly into the desktop tray on both the KDE and GNOME desktops. SCIM currently provides a wide range of input methods covering Chinese, Japanese, Korean, modern Vietnamese, Amharic, and Russian. Composing and dead key support is built in. On the backend, SCIM talks to XIM.
Compared to, say, trying to write or decipher an XKB keyboard map for X11, adding a new input method to SCIM is straightforward. Input method data files are simple UTF-8 text files; a snippet from the 自然码 zìránmǎ (自然双拼 zìránshuāngpīn) table is shown above to give you an idea. The plain-text format makes the files easy to understand and debug. The first column indicates which keys on a QWERTY keyboard to type, the second column shows the Chinese character, and the third column contains a number related to that character's usage frequency. Input methods for syllabaries or alphabetic scripts do not require the third column.
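The three-column layout described here is simple enough to read with a few lines of code. The following Python sketch is not SCIM's own loader, and it ignores any header or definition section a real table file may carry. It assumes whitespace-separated columns and treats lines starting with `#` as comments; the function name `parse_table` and the sample entries, including their frequency values, are made up for illustration.

```python
import io


def parse_table(lines):
    """Parse 'keys  text  [frequency]' lines into {keys: [(text, freq), ...]}.

    Assumptions of this sketch: columns are separated by whitespace, blank
    lines and lines starting with '#' are skipped, and a missing third
    column is treated as frequency 0.
    """
    entries = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # not a usable entry
        keys, text = fields[0], fields[1]
        freq = int(fields[2]) if len(fields) > 2 else 0
        entries.setdefault(keys, []).append((text, freq))
    # Most frequent candidate first, as a pick list would show it.
    for candidates in entries.values():
        candidates.sort(key=lambda c: c[1], reverse=True)
    return entries


# Hypothetical sample data; the frequency values are invented.
SAMPLE = """\
# keys  text  frequency
ma\t马\t120
ma\t吗\t300
a\tあ
"""

table = parse_table(io.StringIO(SAMPLE))
print(table["ma"])
print(table["a"])
```

The last sample entry has only two columns, matching the case of a syllabary or alphabetic script where no frequency is needed.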
1. SCIM: http://www.scim-im.org.