SOURCES: pwdutils.login.defs - fix locale issue

Radoslaw Zielinski radek at karnet.pl
Tue Nov 30 16:23:46 CET 2004


Andrzej Krzysztofowicz <ankry at green.mif.pg.gda.pl> [30-11-2004 15:27]:
> Radoslaw Zielinski wrote:
>> Jakub Bogusz <qboosh at pld-linux.org> [30-11-2004 13:35]:
>>> On Tue, Nov 30, 2004 at 01:26:50PM +0100, Radoslaw Zielinski wrote:
[...]
>>>> As this shouldn't be locale dependent, until the bug is properly
>>>> resolved,
>>> regex functionality in libc is locale-dependent (uses LC_COLLATE).
>> For [a-z]?  It doesn't make sense...  What does this class contain in
>> Asian locales then, nothing?
> E.g. for pl_PL it may contain 'ą' and some capital letters. For Asian locales
> it may contain either the same as for C or the same as for pl_PL. Each
> locale character set is a superset of ASCII.

Blah, right you are.  POSIX (9.3.5 RE Bracket Expression) states:

7. In the POSIX locale, a range expression represents the set of collating
   elements that fall between two elements in the collation sequence,
   inclusive.  In other locales, a range expression has unspecified
   behavior: strictly conforming applications shall not rely on whether
   the range expression is valid, or on the set of collating elements
   matched.  A range expression shall be expressed as the starting point
   and the ending point separated by a hyphen ( '-' ).

So, as I understand it, outside the POSIX locale [a-z] may or may not
contain a pink elephant, which makes it pretty useless.  What was the
point...?

And this is just sick:

  $ echo x | LC_ALL=et_EE grep '[a-z]'  # no output: "x" is not in a-z here
  $ echo x | LC_ALL=et_EE grep '[a-x]'  # shouldn't this range be invalid?
  x
  $ echo x | LC_ALL=et_EE grep '[a-y]'  # no, "x" is somewhere...
  x
  $ echo z | LC_ALL=et_EE grep '[a-y]'  # this isn't ASCII order for sure.
  z
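
For comparison, a minimal sketch of how to get the plain ASCII meaning of
[a-z] no matter what locale the caller has set, assuming ASCII lowercase is
what is actually wanted.  The helper name has_ascii_lower is made up for
illustration; the only real mechanism used is forcing LC_ALL=C for the grep
call, which puts the range back under the POSIX-locale rule quoted above.

```shell
# Hypothetical helper (the name is illustrative, not from any package):
# succeeds if the argument contains an ASCII lowercase letter, regardless
# of the locale of the calling environment.
has_ascii_lower() {
    # LC_ALL=C applies to this grep invocation only, so [a-z] is
    # evaluated under the POSIX locale.
    printf '%s\n' "$1" | LC_ALL=C grep -q '[a-z]'
}

has_ascii_lower x && echo "x: match"
has_ascii_lower Z || echo "Z: no match"
```

Where locale-aware matching is wanted instead, a character class such as
[[:lower:]] avoids range expressions altogether.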

BTW, perl implements more logical behaviour:

  # LC_ALL is pl_PL.ISO-8859-2
  $ echo ą | perl -Mlocale -nle 'print if /[a-z]/'
  $ echo ą | perl -Mlocale -nle 'print if /[[:alpha:]]/'
  ą

-- 
Radosław Zieliński <radek at karnet.pl>
[ GPG key: http://radek.karnet.pl/ ]