Special Romanian Unicode characters
There is a lot of confusion about how to write the Romanian characters that denote the sounds /S/ and /ts/. The correct letters are "s with comma below" and "t with comma below", but many texts printed after 1860 incorrectly use "s with cedilla" and "t with cedilla".
This error has been perpetuated in character encoding standards for Central and Eastern Europe (including ISO-8859-2), which contain "s" and "t" with cedillas. To complicate matters further, most computer fonts draw "s-cedilla" with a true cedilla (like the Turkish letter) but draw "t-cedilla" with a comma below.
ISO-8859-16 remedies this by placing "s" and "t" with comma below at the same code positions that "s" and "t" with cedilla occupy in ISO-8859-2.
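The effect of this layout can be sketched in Python by decoding the same four bytes with both encodings. The byte values 0xAA, 0xBA, 0xDE and 0xFE are an assumption taken from the published code charts rather than from the text above, and the codec names are the ones Python's standard library uses.

```python
# Byte values assumed from the ISO-8859-2 / ISO-8859-16 code charts:
# the positions of capital S, small s, capital T and small t.
RAW = bytes([0xAA, 0xBA, 0xDE, 0xFE])

# Same bytes, two interpretations.
as_latin2 = RAW.decode("iso8859_2")    # cedilla forms
as_latin10 = RAW.decode("iso8859_16")  # comma-below forms

for byte, old, new in zip(RAW, as_latin2, as_latin10):
    print(f"0x{byte:02X}: U+{ord(old):04X} (ISO-8859-2) -> U+{ord(new):04X} (ISO-8859-16)")
```

Because the byte positions coincide, a file that was written as ISO-8859-2 shows the comma-below letters when it is simply read as ISO-8859-16, although other byte positions differ between the two standards.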
Unfortunately, "s" and "t" with comma below are not well supported in modern computer operating systems and fonts. This is why most electronic content in Romanian is still encoded using "s" and "t" with cedilla.
The Unicode standard defines the "comma-below" characters in the Latin Extended-B block (hex range 0180-024F).
Phoneme | Wrong (cedilla): character | Wrong (cedilla): Unicode position (hex) | Correct (comma below): character | Correct (comma below): Unicode position (hex) | HTML entity |
/S/ | Ş | 015E | Ș | 0218 | &#x0218; |
/S/ | ş | 015F | ș | 0219 | &#x0219; |
/ts/ | Ţ | 0162 | Ț | 021A | &#x021A; |
/ts/ | ţ | 0163 | ț | 021B | &#x021B; |
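The code points in the table can be turned into a small conversion routine. The following is a minimal sketch, not a complete normalizer: the function name `to_comma_below` is hypothetical, and it only handles the four precomposed cedilla characters, not sequences built with a combining cedilla.

```python
# Cedilla code point -> comma-below code point, as listed in the table.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # capital S
    "\u015F": "\u0219",  # small s
    "\u0162": "\u021A",  # capital T
    "\u0163": "\u021B",  # small t
})


def to_comma_below(text: str) -> str:
    """Replace s/t with cedilla by s/t with comma below (hypothetical helper)."""
    return text.translate(CEDILLA_TO_COMMA)


sample = "\u015Ftiin\u0163\u0103"      # "science" spelled with cedilla letters
fixed = to_comma_below(sample)
print([f"U+{ord(c):04X}" for c in fixed])
```

Such a replacement should only be applied to text known to be Romanian, since "s with cedilla" is a legitimate letter in Turkish.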
External links