Opened 9 years ago

Closed 9 years ago

#4080 closed bug (fixed)

Use libcharset instead of nl_langinfo(CODESET) if possible.

Reported by: PHO
Owned by: igloo
Priority: high
Milestone: 7.0.1
Component: libraries/base
Version: 6.13
Keywords: iconv locale
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

nl_langinfo(CODESET) does not always return a standardized encoding name that GNU libiconv understands.

This affects at least NetBSD and OpenBSD: GHC.IO.Encoding.Iconv.localeEncoding is affected, and as a result even ghc --version fails. Here is an example:

/* test1.c */
#include <stdio.h>
#include <locale.h>
#include <langinfo.h>

int main() {
    setlocale(LC_ALL, "");
    printf("nl_langinfo(CODESET) = \"%s\"\n", nl_langinfo(CODESET));
    return 0;
}
% gcc -o test1 test1.c
% LC_ALL=ja_JP.UTF-8 ./test1
nl_langinfo(CODESET) = "UTF-8"   // Good.
% iconv -f UTF-8 -t UTF-8 /dev/null && echo ok
ok
% LC_ALL=C ./test1
nl_langinfo(CODESET) = "646"     // Wtf? You mean ISO 646?
% iconv -f 646 -t UTF-8 /dev/null && echo ok
iconv: conversion from 646 unsupported
iconv: try 'iconv -l' to get the list of supported encodings
% uname -a
NetBSD netbsd 5.99.20 NetBSD 5.99.20 (ADJUSTED) #0: Mon Oct  5 15:05:08 JST 2009
  root@netbsd:/usr/obj/sys/arch/i386/compile/ADJUSTED i386
%
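The same check that `iconv -f 646 -t UTF-8` performs on the command line can be done from C by asking iconv_open() whether it accepts a given name. This is only a minimal sketch: the helper name iconv_accepts is hypothetical and not part of any library, and on systems using GNU libiconv the program may need to be linked with -liconv.

```c
#include <iconv.h>

/* Hypothetical helper (not part of any library): returns 1 if iconv can
 * open a conversion from the encoding `name` to UTF-8, and 0 otherwise. */
int iconv_accepts(const char *name)
{
    iconv_t cd = iconv_open("UTF-8", name);  /* arguments are (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return 0;                            /* iconv does not know this name */
    iconv_close(cd);
    return 1;
}
```

Calling iconv_accepts(nl_langinfo(CODESET)) after setlocale(LC_ALL, "") would show whether the name reported for the current locale is usable; from the transcript above, one would expect it to fail for "646" on the NetBSD system shown.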

So we should use libcharset, which ships with GNU libiconv, whenever it is available. See: http://www.haible.de/bruno/packages-libcharset.html

/* test2.c */
#include <stdio.h>
#include <locale.h>
#include <libcharset.h>

int main() {
    setlocale(LC_ALL, "");
    printf("locale_charset() = \"%s\"\n", locale_charset());
    return 0;
}
% gcc -o test2 test2.c -I/usr/pkg/include -L/usr/pkg/lib -lcharset
% LC_ALL=ja_JP.UTF-8 ./test2
locale_charset() = "UTF-8"    // Good.
% LC_ALL=C ./test2
locale_charset() = "ASCII"    // Good!
% iconv -f ASCII -t UTF-8 /dev/null && echo ok
ok
%
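The "if possible" part could look roughly like the following: prefer locale_charset() when libcharset was found at build time, and otherwise fall back to nl_langinfo(CODESET). This is a sketch under stated assumptions, not the attached patch. HAVE_LIBCHARSET stands for a hypothetical macro defined by a configure check, and current_codeset and normalize_codeset are hypothetical names. The fallback mapping covers only the "646" case seen above plus an empty name, which is treated as ASCII here as an assumption.

```c
#include <string.h>
#include <langinfo.h>

#ifdef HAVE_LIBCHARSET            /* hypothetical macro from a configure check */
#  include <libcharset.h>
#endif

/* Hypothetical fallback: rewrite names that iconv is known to reject.
 * Only the "646" case from this ticket is handled; treating a missing
 * or empty name as ASCII is an assumption of this sketch. */
static const char *normalize_codeset(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return "ASCII";
    if (strcmp(name, "646") == 0)
        return "ASCII";
    return name;
}

/* Returns the encoding name of the current locale.
 * The caller is expected to have called setlocale(LC_ALL, "") first. */
const char *current_codeset(void)
{
#ifdef HAVE_LIBCHARSET
    return locale_charset();
#else
    return normalize_codeset(nl_langinfo(CODESET));
#endif
}
```

When built with -DHAVE_LIBCHARSET, the program also needs the libcharset include and library paths, as in the gcc line for test2.c above.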

Attachments (1)

libcharset.patch (33.8 KB) - added by PHO 9 years ago.

Change History (5)

Changed 9 years ago by PHO

Attachment: libcharset.patch added

comment:1 Changed 9 years ago by igloo

Status: new → patch

comment:2 Changed 9 years ago by igloo

Milestone: 6.14.1
Priority: normal → high

comment:3 Changed 9 years ago by simonmar

Owner: set to igloo

Patch looks fine to me - Ian could you go ahead and validate/push please?

comment:4 Changed 9 years ago by igloo

Resolution: fixed
Status: patch → closed

Applied.