DynaPDF Manual - Page 16


Data types
ToUTF16 is defined as follows:
// declared in drv_conf.h (Linux/UNIX, Mac OS X)
#if (SIZEOF_WCHAR_T == 4)
#define ToUTF16(IPDF, s) (pdfUTF32ToUTF16((IPDF), (UI32*)(s)))
#else
// UTF-16
#define ToUTF16(IPDF, s) ((s))
#endif
This macro calls pdfUTF32ToUTF16() only if the OS uses UTF-32 as its Unicode string format.
On operating systems that already use UTF-16, no conversion is applied; the macro expands to
its string argument and adds no overhead. The function pdfUTF32ToUTF16() holds an array of 6
independent string buffers so that the macro can be used in functions which take up to six string
parameters. If DynaPDF ever supports a function with more than 6 string parameters, the
number of internal string buffers will be increased.
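The conversion step itself can be sketched as follows. This is only an illustration of the standard UTF-32 to UTF-16 mapping, not the implementation of pdfUTF32ToUTF16(); the names utf32_to_utf16() and demo_utf32_to_utf16() are hypothetical, and the caller-supplied output buffer is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of DynaPDF: converts a zero-terminated
   UTF-32 string to zero-terminated UTF-16. Returns the number of 16-bit
   units written (terminator excluded) or (size_t)-1 if dst is too small. */
size_t utf32_to_utf16(const uint32_t *src, uint16_t *dst, size_t dst_cap)
{
    size_t n = 0;
    for (; *src != 0; ++src) {
        uint32_t cp = *src;
        /* Invalid code points are replaced with U+FFFD in this sketch. */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD;
        if (cp < 0x10000) {
            /* One unit plus room for the terminator. */
            if (n + 1 >= dst_cap) return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        } else {
            /* Code points above U+FFFF need a surrogate pair. */
            if (n + 2 >= dst_cap) return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 + (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 + (cp & 0x3FF));
        }
    }
    if (n >= dst_cap) return (size_t)-1;
    dst[n] = 0;
    return n;
}

/* Usage example: "A", the euro sign, and U+1F600. */
void demo_utf32_to_utf16(void)
{
    const uint32_t in[] = { 0x41, 0x20AC, 0x1F600, 0 };
    uint16_t out[8];
    assert(utf32_to_utf16(in, out, 8) == 4);
    assert(out[1] == 0x20AC);
    assert(out[2] == 0xD83D && out[3] == 0xDE00); /* surrogate pair */
}
```

A string may therefore need more 16-bit units than it has UTF-32 characters, which is why the output cannot simply reuse the character count of the input.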
However, take care when using the macro to initialize the string members of a structure that
contains more than 6 string members:
Example:
SOME_STRUCT myStruct;
myStruct.String1 = ToUTF16(pdf, L"String1"); // OK
myStruct.String2 = ToUTF16(pdf, L"String2"); // OK
myStruct.String3 = ToUTF16(pdf, L"String3"); // OK
myStruct.String4 = ToUTF16(pdf, L"String4"); // OK
myStruct.String5 = ToUTF16(pdf, L"String5"); // OK
myStruct.String6 = ToUTF16(pdf, L"String6"); // OK
myStruct.String7 = ToUTF16(pdf, L"String7"); // Wrong!
The seventh call above overwrites the string buffer of String1 because only 6 internal string buffers
are available. If you need to keep more than 6 converted strings at the same time, you must copy
each converted string into a variable of your own before the buffer is reused!
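The effect can be reproduced without DynaPDF. The following sketch imitates a converter with six rotating buffers and shows how a private copy survives the seventh call. All names here (pool_convert(), utf16_dup(), demo_pool_overwrite(), POOL_SLOTS, SLOT_UNITS) are hypothetical and the buffer size is arbitrary; how pdfUTF32ToUTF16() manages its buffers internally is not documented here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POOL_SLOTS 6   /* mirrors the six buffers described in the text */
#define SLOT_UNITS 64  /* arbitrary slot size chosen for this sketch */

/* Stand-in for a converter that returns pointers into a rotating pool.
   This is NOT DynaPDF code; it widens ASCII only, for demonstration. */
const uint16_t *pool_convert(const char *ascii)
{
    static uint16_t pool[POOL_SLOTS][SLOT_UNITS];
    static unsigned next = 0;
    uint16_t *slot = pool[next];
    size_t i = 0;
    next = (next + 1) % POOL_SLOTS;
    for (; ascii[i] != '\0' && i < SLOT_UNITS - 1; ++i)
        slot[i] = (uint16_t)(unsigned char)ascii[i];
    slot[i] = 0;
    return slot;
}

/* Copies a zero-terminated UTF-16 string to the heap; the caller frees it. */
uint16_t *utf16_dup(const uint16_t *s)
{
    size_t len = 0;
    uint16_t *copy;
    while (s[len] != 0) ++len;
    copy = malloc((len + 1) * sizeof *copy);
    if (copy != NULL) memcpy(copy, s, (len + 1) * sizeof *copy);
    return copy;
}

void demo_pool_overwrite(void)
{
    const uint16_t *first = pool_convert("String1");
    uint16_t *kept = utf16_dup(first);       /* private copy of String1 */
    int i;
    assert(kept != NULL);
    for (i = 2; i <= 6; ++i)
        pool_convert("filler");              /* calls 2..6 use the other slots */
    assert(first[0] == 'S');                 /* still intact after six calls */
    assert(pool_convert("Other") == first);  /* seventh call reuses the slot */
    assert(first[0] == 'O');                 /* old pointer now sees new text */
    assert(kept[0] == 'S');                  /* the copy is unaffected */
    free(kept);
}
```

For a structure with more than 6 string members, each member would be assigned such a copy instead of the pointer returned by the macro, and the copies would be released once the structure is no longer needed.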
Unicode File Paths
Unicode file paths are encoded differently depending on the operating system. While NT-based
Windows systems use UTF-16 encoded Unicode file paths, non-Windows systems usually use
UTF-8 encoded Unicode file paths. All DynaPDF functions which open a file convert
UTF-16 strings to UTF-8 on non-Windows operating systems. However, to avoid this conversion
step, it is usually best to call the Ansi version of a function directly and pass a UTF-8 file path
to it.
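To illustrate the work that is saved by passing a UTF-8 path directly, the following sketch shows a generic UTF-16 to UTF-8 conversion. It is not the code DynaPDF uses internally; utf16_to_utf8() and demo_utf16_to_utf8() are hypothetical names, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of DynaPDF: converts zero-terminated
   UTF-16 to zero-terminated UTF-8. Returns the number of bytes written
   (terminator excluded) or (size_t)-1 if dst is too small. */
size_t utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_cap)
{
    size_t n = 0;
    while (*src != 0) {
        uint32_t cp = *src++;
        size_t need;
        if (cp >= 0xD800 && cp <= 0xDBFF && *src >= 0xDC00 && *src <= 0xDFFF)
            cp = 0x10000 + ((cp - 0xD800) << 10) + (*src++ - 0xDC00);
        else if (cp >= 0xD800 && cp <= 0xDFFF)
            cp = 0xFFFD;                 /* unpaired surrogate */
        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need >= dst_cap) return (size_t)-1;
        if (need == 1) {
            dst[n++] = (char)cp;
        } else if (need == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (n >= dst_cap) return (size_t)-1;
    dst[n] = '\0';
    return n;
}

/* Usage example: the path "/é" becomes the three bytes 2F C3 A9. */
void demo_utf16_to_utf8(void)
{
    const uint16_t path[] = { '/', 0x00E9, 0 };
    char out[8];
    assert(utf16_to_utf8(path, out, sizeof out) == 3);
    assert((unsigned char)out[1] == 0xC3 && (unsigned char)out[2] == 0xA9);
}
```

If the application already holds its file names as UTF-8, which is common on Linux and Mac OS X, none of this work is needed when the Ansi version of a function is called.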
CJK Multi-byte Strings
CJK multi-byte strings contain mixed 8-bit / 16-bit character codes. A CJK string can be defined
either as an Ansi string (data type char*) or as a multi-byte string (data type UI16*). The multi-byte format
 
