ILE C/C++ Runtime Library Functions

CCSIDs of Characters and Character Strings

Every character or character string has a CCSID associated with it. The CCSID of the character or character string depends on the origin of the data. Keep track of the CCSID of each character or character string, and convert values to the appropriate CCSID when required.

If LOCALETYPE(*LOCALEUTF) is not specified on the compilation command, the following assumptions are made:

When LOCALETYPE(*LOCALEUTF) is specified, most functions (unless otherwise specified) expect character data input in the CCSID of the LC_CTYPE category of the current locale, regardless of the source of the character data. See Unicode Support for more information.

For more information about variant and invariant characters, see Runtime Character Set. For more information about CCSIDs, code pages, and other globalization concepts, see the i5/OS® globalization topic.

Character Literal CCSID

Character literal CCSID is the CCSID of the character and character string literals in compiled source code. If a programmer does not take special action, the CCSID of these literals is set to the CCSID of the source file. The CCSID of all the literals in a compilation unit can be changed by using the TGTCCSID option on the compilation command. The #pragma convert directive can be used to change the CCSID of character and character string literals within C or C++ source code. See IBM® Rational® Development Studio for i: ILE C/C++ Compiler Reference for more information.
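As an illustration of the #pragma convert directive described above, the following minimal sketch brackets one literal with #pragma convert(819) and #pragma convert(0). The variable and function names are hypothetical, and CCSID 819 (ISO 8859-1) is chosen only as an example. Compilers other than ILE C/C++ typically ignore the directive, so the comparison in literals_differ() is meaningful only when the source file CCSID is EBCDIC.

```c
#include <string.h>

/* On ILE C, literals between the two pragmas are stored in CCSID 819. */
#pragma convert(819)
static const char converted_text[] = "HELLO";
#pragma convert(0)

/* After convert(0), literals are again stored in the CCSID that was
   in effect before the directive (normally the source file CCSID). */
static const char default_text[] = "HELLO";

/* Returns 1 if the first byte of converted_text is 0x48,
   the code point of 'H' in CCSID 819. */
int converted_text_is_819(void)
{
    return (unsigned char)converted_text[0] == 0x48;
}

/* Returns 1 if the two literals have different byte values, which is
   expected when the default literal CCSID is an EBCDIC CCSID. */
int literals_differ(void)
{
    return memcmp(converted_text, default_text, sizeof converted_text) != 0;
}
```

Because the character literal CCSID cannot be retrieved at run time, checks of this kind are useful mainly as a debugging aid.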

If LOCALETYPE(*CLD) or LOCALETYPE(*LOCALE) is specified on the compilation command, all wide character literals will be wide EBCDIC literals in the CCSID of the source file. If LOCALETYPE(*LOCALEUCS2) is specified on the compilation command, all wide character literals will be UCS-2 literals. If LOCALETYPE(*LOCALEUTF) is specified on the compilation command, all wide character literals will be UTF-32 literals.
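The effect of the LOCALETYPE option on wide character literals can be observed by inspecting the value of a literal such as L'A'. The following sketch uses a hypothetical helper, wide_literal_kind(). It rests on several assumptions: that the letter A is 0x41 in Unicode and 0xC1 in the EBCDIC CCSID of the source file, that a wide EBCDIC literal holds that value in a wider type, and that wchar_t is at least 4 bytes for UTF-32 literals and 2 bytes for UCS-2 literals.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: guesses the encoding of wide character literals
   from the numeric value of L'A' and the size of wchar_t. */
const char *wide_literal_kind(void)
{
    unsigned long a = (unsigned long)L'A';

    if (a == 0x41UL) {
        /* Unicode value of 'A'; assume the width of wchar_t
           distinguishes UTF-32 from UCS-2. */
        return sizeof(wchar_t) >= 4 ? "UTF-32" : "UCS-2";
    }
    if (a == 0xC1UL) {
        /* Assumed EBCDIC value of 'A' carried in a wide character. */
        return "wide EBCDIC";
    }
    return "unknown";
}
```

The result reflects only how the literal was compiled; it does not describe the CCSID of data read at run time.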

The programmer must be aware of the CCSID of character literal values. The character literal CCSID cannot be retrieved at run time.

Job CCSID

The CCSID of the job is always an EBCDIC CCSID. ASCII and Unicode job CCSIDs are not supported. Data read from files is sometimes in the job CCSID. Some functions (for example, getenv()) produce job CCSID output; some functions (for example, putenv()) expect job CCSID input. The CCSID used most often by the C runtime is the CCSID of the LC_CTYPE category of the current locale. If the job CCSID does not match the locale CCSID, conversion might be necessary.

The job CCSID value can be retrieved at run time by calling the QUSRJOBI API with the JOBI0400 receiver variable format. The Default Coded Character Set ID field contains the job CCSID value.

File CCSID

When a file is opened, a CCSID is associated with it. Read operations of character and string values return data in the CCSID of the file. Write operations to the file expect the data in the CCSID of the file. The CCSID associated with a file when it is opened is dependent on the function that is used to open the file:

Locale CCSID

A CCSID is associated with each category of the locale (see setlocale() — Set Locale for a list of locale categories). The most commonly used CCSID from the locale is the CCSID associated with the LC_CTYPE category of the locale. Confusion might arise if different locale categories have different CCSID values, so it is recommended that all locale categories have the same CCSID value. You can retrieve the CCSID of the LC_CTYPE category of the current locale by using the nl_langinfo() function and specifying CODESET as the nl_item. Here are some additional locale CCSID details, broken down by LOCALETYPE option specified on the compilation command:
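The nl_langinfo() call mentioned above can be sketched as follows. The helper name get_ctype_codeset() is hypothetical. The exact text returned for CODESET is implementation-defined, so the sketch only copies the string and does not interpret it.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the code set name of the LC_CTYPE category
   of the current locale into buf.
   Returns 0 on success, or -1 if buf is unusable or too small. */
int get_ctype_codeset(char *buf, size_t buflen)
{
    const char *codeset;
    size_t len;

    if (buf == NULL || buflen == 0)
        return -1;

    codeset = nl_langinfo(CODESET);
    if (codeset == NULL)
        return -1;

    len = strlen(codeset);
    if (len >= buflen)
        return -1;

    /* Copy the string, including its terminating null character,
       because a later nl_langinfo() or setlocale() call may
       overwrite the returned storage. */
    memcpy(buf, codeset, len + 1);
    return 0;
}
```

A caller would typically invoke setlocale() first to establish the locale, then call get_ctype_codeset() with a buffer of a few dozen bytes.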


