
The Web Testing Companion: The Insider's Guide to Efficient and Effective Tests

Lydia Ash

Language Guides: Latin with Diacritics

IME English
Keyboard layout US
Keystrokes Right Alt key and ,   ` c   c
Input characters ç   ç   ç
Unicode positions 00E7   0027 0063 0327 0063
Code page points - same on all code pages 0xE7   0xB8 0x63  
Names Small Latin Letter C with Combining Cedilla   Small Latin Letter C with Combining Cedilla Combining Cedilla Small Latin Letter C
Display ç  
Unicode ranges U+02B0 - U+02FF Spacing Modifier Letters
U+0300 - U+036F Combining Diacritical Marks
U+20D0 - U+20FF Combining Diacritical Marks for Symbols
Fonts
Angsana New, Arial, Arial Black, Arial Narrow, Batang, BatangChe, Book Antiqua, Bookman Old Style, Browallia New, Century Gothic, Comic Sans MS, Cordia New, Courier, Courier New, Dotum, Fixedsys, Garamond, Georgia, Gulim, GulimChe, Gungsuh, GungsuhChe, Haettenschweiler, Impact, Lucida Console, Lucida Sans Unicode, Microsoft Logo, Microsoft Sans Serif, MingLiU, Monotype Corsiva, MS Dialog, MS Dialog Light, MS Gothic, MS Mincho, MS PGothic, MS PMincho, MS Sans Serif, MS Serif, MS SystemEx, MS UI Gothic, Palatino Linotype, Small Fonts, Sylfaen, System, Tahoma, Times New Roman, Trebuchet MS, Verdana

 
In the example in the table, there are two inputs: "c" and the combining cedilla. When typed, the cedilla acts as a keyboard modifier similar to the Shift key: it changes the state of the keyboard driver rather than producing output of its own. Although nothing appears on the screen when the cedilla is typed, it combines with the letter c to create the Small Latin Letter C with Cedilla.

Note: The cedilla and other similar combining marks are also referred to as dead keys, combining characters, nonspacing marks, or floating diacritics, because they do not produce a displayed character of their own in the text.
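The combination described above can be reproduced at the Unicode level. The following is a minimal sketch, assuming Python's standard `unicodedata` module and the usual Unicode ordering in which the combining mark follows its base letter. It illustrates text normalization only, not what a keyboard driver does with a dead key.

```python
import unicodedata

# Decomposed input: base letter "c" followed by U+0327 COMBINING CEDILLA.
decomposed = "c\u0327"

# NFC normalization composes the pair into the single precomposed character.
composed = unicodedata.normalize("NFC", decomposed)

# NFD normalization goes the other way and splits it back into two code points.
round_trip = unicodedata.normalize("NFD", composed)

for label, text in (("decomposed", decomposed), ("composed", composed)):
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(f"{label}: {len(text)} code point(s): {code_points}")

print(unicodedata.name(composed))
```

A test that compares strings containing diacritics may want to normalize both sides first, since the composed and decomposed forms render the same but do not compare equal.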

Unicode (UCS-2) includes a number of these combining characters. Because processing them is more complex than ignoring them, ISO 10646 outlines three levels of support:

  • Level 1. Does not allow combining characters
  • Level 2. Allows combining marks only in certain scripts, such as Arabic, Hebrew, Indic, and Thai
  • Level 3. Allows all combining marks, including those in Cyrillic, Greek, and Latin
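When testing an application that is expected to behave like Level 1, it can help to detect whether test input contains any combining marks at all. The sketch below is one hedged way to do that: the function names `combining_marks` and `is_level1_safe` are hypothetical, and treating every character whose Unicode general category starts with "M" as a combining mark is an assumption, not the ISO 10646 definition.

```python
import unicodedata


def combining_marks(text):
    """Return (index, code point) pairs for each mark character in text.

    Assumption: any character in general category Mn, Mc, or Me counts
    as a combining mark for the purposes of this check.
    """
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch).startswith("M")
    ]


def is_level1_safe(text):
    """True when text contains no combining marks (hypothetical Level 1 check)."""
    return not combining_marks(text)


# Precomposed c with cedilla is a single code point, so it passes.
print(is_level1_safe("gar\u00e7on"))
# The decomposed form contains U+0327, so it is reported.
print(combining_marks("garc\u0327on"))
```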

Note that the Small Latin Letter C with Cedilla also exists as a single code position in code page 1252 (and a few others). It occupies position 0xE7, and its Unicode value is U+00E7.
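The code page mapping can be checked with Python's built-in `cp1252` codec. This is a small sketch rather than a full conversion routine; the helper name `to_cp1252` is hypothetical. It also shows that the decomposed sequence cannot be encoded directly, because code page 1252 has no position for the combining cedilla.

```python
import unicodedata


def to_cp1252(text):
    """Compose the text (NFC) and encode it to code page 1252 bytes."""
    return unicodedata.normalize("NFC", text).encode("cp1252")


# The precomposed character maps to the single byte 0xE7.
single = "\u00e7".encode("cp1252")
print(single.hex())

# Encoding the decomposed form directly fails: U+0327 is not in cp1252.
try:
    "c\u0327".encode("cp1252")
    direct_encode_failed = False
except UnicodeEncodeError:
    direct_encode_failed = True
print(direct_encode_failed)

# Composing first yields the same single byte as the precomposed character.
print(to_cp1252("c\u0327").hex())
```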

 




ISBN 0-4714-30218
578 Pages
May, 2003

Wiley Technology Publishing
Timely. Practical. Reliable.

 