<h2>How to choose the locale file name</h2>
- - how to choose the file name
+ <p>Locale names consist of three parts: the language code, the
+ country/region code, and an optional modifier. The format is
+ language_REGION@modifier. The language code is a code from
+ ISO 639. The two-letter code is preferred, but a three-letter
+ code is accepted if no two-letter code is available. The
+ country/region code is a code from ISO 3166. If the language
+ or region in question is missing from the ISO standard, one needs
+ to get the ISO standard updated before the locale will be
+ included in glibc. If one can't convince the ISO 639
+ maintainers that the language exists (and thus needs a
+ language code), the glibc maintainers will refuse to add the
+ locale. In addition, the glibc maintainers seem to refuse
+ "artificial languages" like Esperanto and Lojban, even if they
+ have an ISO 639 code.</p>
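+
+ <p>The language_REGION@modifier format can be sketched as a small
+ Python check. This is only an illustration: the function name
+ <tt>split_locale_name</tt> and the regular expression are
+ hypothetical, and they assume a two- or three-letter lower-case
+ language code, a two-letter upper-case region code, and a
+ lower-case modifier. glibc itself may accept other forms.</p>

```python
import re

# Assumed shape of a locale name: language_REGION or language_REGION@modifier.
#   language: two or three lower-case letters (ISO 639)
#   region:   two upper-case letters (ISO 3166)
#   modifier: optional, lower-case letters only (an assumption)
LOCALE_NAME = re.compile(
    r"^(?P<language>[a-z]{2,3})_(?P<region>[A-Z]{2})(?:@(?P<modifier>[a-z]+))?$"
)


def split_locale_name(name):
    """Return (language, region, modifier), or None if the name does not fit."""
    match = LOCALE_NAME.match(name)
    if match is None:
        return None
    return match.group("language"), match.group("region"), match.group("modifier")


print(split_locale_name("de_DE@euro"))
print(split_locale_name("de_DE"))
print(split_locale_name("German"))
```

+ <p>The modifier is reported as <tt>None</tt> when it is absent,
+ and a name that does not fit the assumed pattern yields
+ <tt>None</tt> as a whole.</p>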
+
+ <p>Little is known about the requirements for the naming of
+ modifiers. The following modifiers are currently used:
+ abegede, cyrillic, euro and saaho. This might indicate that
+ lower-case letters are preferred in modifier names.</p>
+
+ <p>It is recommended to follow RFC 3066 when selecting locale
+ names.</p>
+
+ <ul>
+
+ <li><a href="http://www.unicode.org/onlinedat/countries.html">ISO
+ 3166</a></li>
+
+ <li><a href="http://www.loc.gov/standards/iso639-2/">ISO 639</a></li>
+
+ <li><a href="http://rfc.sunsite.dk/rfc/rfc3066.html"> RFC 3066
+ - Tags for the Identification of Languages</a></li>
+
+ </ul>
<h2>Category order</h2>
<h2>Reuse when possible</h2>
- - "copy" from existing locales if the content should be identical
+ <p>One should avoid cut-n-paste when possible, and instead use
+ the <tt>copy</tt> statement to include sections from locales
+ with identical content.</p>
- <h2>LD_INDENTIFICATION</h2>
+ <h2>LC_IDENTIFICATION</h2>
- - standard refs in the LD_INDENTIFICATION
+ <p>The category entries refer to the standard followed when
+ writing the given section. The standard references should be
+ enclosed in quotes, and should not use the <U#>
+ notation. They should normally look something like this:</p>
+
+ <blockquote><pre>
+category "i18n:1997";LC_IDENTIFICATION
+ </pre></blockquote>
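+
+ <p>A rough way to check category entries against the form shown
+ above is sketched below in Python. The function name
+ <tt>check_category_line</tt> and the regular expression are
+ hypothetical; they assume each entry is written as
+ <tt>category "standard";LC_NAME</tt> on a single line and
+ reject any angle brackets inside the quotes.</p>

```python
import re

# Assumed shape: category "<standard reference>";LC_<NAME>
# The reference must be quoted and must not contain "<" or ">",
# which would indicate the <U#> notation.
CATEGORY_LINE = re.compile(
    r'^category\s+"(?P<standard>[^"<>]+)";(?P<name>LC_[A-Z_]+)$'
)


def check_category_line(line):
    """Return True if the line looks like a quoted, plain-text category entry."""
    return CATEGORY_LINE.match(line.strip()) is not None


for entry in (
    'category "i18n:1997";LC_IDENTIFICATION',   # quoted, plain text
    'category i18n:1997;LC_IDENTIFICATION',     # missing quotes
    'category "<U0069><U0031>";LC_CTYPE',       # uses the <U#> notation
):
    print(check_category_line(entry), entry)
```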
<h2>LC_MESSAGES</h2>
- - yes/no expr should have the form ^[yYnN<extra>], without 0 and 1
+ <p>The yesexpr and noexpr entries should have the form
+ <tt>^[yY<extra>]</tt> and <tt>^[nN<extra>]</tt>,
+ without 0 and 1 and without trailing "<tt>.*</tt>". The
+ reason is to make sure the expressions have the same form as
+ the expressions used in the C/POSIX locale (<tt>^[yY]</tt> and
+ <tt>^[nN]</tt>).</p>
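+
+ <p>The rule above can be sketched as a small Python check. The
+ function name <tt>has_recommended_form</tt> is hypothetical, and
+ the sketch assumes the expression is written with plain
+ characters; it only tests the shape described here and is not
+ how glibc validates locale files.</p>

```python
import re


def has_recommended_form(expr, required):
    """Check that expr is ^[...] starting with `required`, with no 0/1 and no tail.

    `required` is "yY" for yesexpr and "nN" for noexpr.
    """
    # fullmatch rejects anything after the closing bracket, such as ".*"
    match = re.fullmatch(r"\^\[([^\]]+)\]", expr)
    if match is None:
        return False
    members = match.group(1)
    if "0" in members or "1" in members:
        return False
    return members.startswith(required)


print(has_recommended_form("^[yY]", "yY"))      # C/POSIX form
print(has_recommended_form("^[yYjJ]", "yY"))    # extra letters appended
print(has_recommended_form("^[yY].*", "yY"))    # trailing ".*"
print(has_recommended_form("^[nN0]", "nN"))     # contains 0
```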
<h2>Standard documents and specifications</h2>
+ <h2>Testing the new locale file</h2>
+
+ <p>To test a new locale on a test machine, do the
+ following:</p>
+
+ <ul>
+
+ <li>Copy the new locale to
+ <tt>/usr/share/i18n/locales/<em>filename</em></tt></li>
+
+ <li>Run <tt>localedef -i <em>inputfile</em> -c -f
+ <em>charset</em> <em>locale</em></tt> to generate a
+ binary locale file in
+ <tt>/usr/lib/locale/<em>locale</em>/</tt></li>
+
+ <li>Test it using LANG=<em>locale</em>, for example by
+ running <tt>date</tt></li>
+
+ </ul>
+
+ <p>Example: generating a new <tt>de_DE@euro</tt> locale using
+ the ISO-8859-15 charset and saving it as 'de_DE':</p>
+
+ <pre>
+ cp de_DE@euro /usr/share/i18n/locales/de_DE@euro
+ localedef -i de_DE@euro -c -f ISO-8859-15 de_DE
+ LANG=de_DE date
+ </pre>
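+
+ <p>When testing several locales, the <tt>localedef</tt> call
+ above can be assembled by a script. The following Python sketch
+ only builds and prints the command line; the helper name
+ <tt>localedef_command</tt> is hypothetical, and nothing is
+ executed.</p>

```python
import shlex


def localedef_command(inputfile, charset, locale):
    """Build the localedef argument list used in the steps above."""
    return ["localedef", "-i", inputfile, "-c", "-f", charset, locale]


cmd = localedef_command("de_DE@euro", "ISO-8859-15", "de_DE")
# shlex.join quotes arguments where needed, so the line can be pasted into a shell
print(shlex.join(cmd))
```

+ <p>The resulting list could be handed to
+ <tt>subprocess.run</tt> on a test machine once the locale source
+ has been copied into place.</p>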
+
+ <p>I've made a small tool, <a href="check-locale">check-locale</a>,
+ capable of detecting a few common mistakes in locales.</p>
+
</div>
<hr>
<address><a href="mailto:pere@hungry.com">Petter Reinholdtsen</a></address>
<!-- Created: Sun Mar 21 18:14:42 CET 2004 -->
<!-- hhmts start -->
-Last modified: Sat May 15 11:56:09 CEST 2004
+Last modified: Mon Dec 20 20:17:59 CET 2004
<!-- hhmts end -->
</body>
</html>