Monkeybread Software - DynaPDF Manual

Source[i],
Kerning[i],
last,
j - last);*/
// Update the cursor position. More text can follow.
x1 = x2;
y1 = y2;
}

DynaPDF is delivered with several example projects that demonstrate how text coordinates must
be computed and how text can be extracted. The code above is a fragment of the example project
Text Coordinates, which is delivered with all DynaPDF versions.

Text Scaling

Like character and word spacing, the current text scaling is already included in the text width that
is passed to all text callback functions. However, the value must be stored in the graphics state if
the width of a sub string must be computed. Text scaling is measured in percent of the original,
unscaled text width.
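
How the scaling value enters a sub string width can be sketched as follows. This is a minimal illustration and not DynaPDF code: the struct TextState and the function SubStringWidth() are hypothetical names, and the glyph widths are assumed to be given in 1/1000 of a text space unit. The formula follows the general PDF text model, in which the complete advance of a glyph, including character and word spacing, is multiplied by the scaling factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subset of a graphics state; these are not DynaPDF types.
struct TextState
{
   double FontSize;     // font size in text space units
   double CharSpacing;  // added after every glyph
   double WordSpacing;  // added after every single-byte code 32 (simple fonts)
   double TextScale;    // horizontal scaling in percent, 100 = unscaled
};

// Returns the width of the codes [first, first + count) of a simple font string.
// glyphWidths[i] is the width of codes[i] in 1/1000 of a text space unit.
double SubStringWidth(const TextState& gs,
                      const std::vector<unsigned char>& codes,
                      const std::vector<double>& glyphWidths,
                      std::size_t first, std::size_t count)
{
   assert(codes.size() == glyphWidths.size());
   assert(first <= codes.size() && count <= codes.size() - first);
   double width = 0.0;
   for (std::size_t i = first; i < first + count; ++i)
   {
      double advance = glyphWidths[i] / 1000.0 * gs.FontSize + gs.CharSpacing;
      if (codes[i] == 32) advance += gs.WordSpacing;
      width += advance;
   }
   // Text scaling applies to the whole advance, including both spacings.
   return width * gs.TextScale / 100.0;
}
```

Because the scaling is applied last, a scaling value of 50 halves the width of any sub string, regardless of how much of that width comes from spacing.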

Sub string coordinates

Sub string processing is somewhat more complicated because the width of a sub string cannot be
calculated from the Unicode string and one source character does not necessarily correspond to one
Unicode character.

Simple fonts use one-byte encodings in which one source byte can be decoded to one or more Unicode
characters. CID fonts also support multi-byte encodings with fixed and variable code lengths. A
sequence of n source bytes can be decoded to m Unicode characters, so there is no fixed
relationship between positions in the source string and positions in the converted Unicode string.
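
The n-to-m relationship can be made concrete with a small sketch. It does not use the DynaPDF API: DecodedText, Decode(), and FindSourceRange() are hypothetical names, the decoding table is an invented one-byte encoding in which code 0x0C stands for the ligature "ffi", and multi-byte CID encodings are not covered. The idea shown is to record, for every Unicode character, the source byte it came from, so that a position found in the Unicode text can be traced back to the source string.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Decoded text plus, for every Unicode character, the index of the source
// byte that produced it. Hypothetical type, not part of DynaPDF.
struct DecodedText
{
   std::u32string           Text;
   std::vector<std::size_t> SourceIndex;
};

// Decodes a one-byte encoded string. 'toUnicode' maps a source byte to one or
// more Unicode characters; bytes without an entry are copied unchanged.
DecodedText Decode(const std::string& source,
                   const std::map<unsigned char, std::u32string>& toUnicode)
{
   DecodedText out;
   for (std::size_t i = 0; i < source.size(); ++i)
   {
      const unsigned char code = static_cast<unsigned char>(source[i]);
      const auto it = toUnicode.find(code);
      if (it == toUnicode.end())
      {
         out.Text.push_back(static_cast<char32_t>(code));
         out.SourceIndex.push_back(i);
         continue;
      }
      for (char32_t ch : it->second)
      {
         out.Text.push_back(ch);
         out.SourceIndex.push_back(i);
      }
   }
   return out;
}

// Finds 'needle' in the decoded text and returns the range of source bytes
// [first, last] that covers it. Returns false if the text was not found.
bool FindSourceRange(const DecodedText& text, const std::u32string& needle,
                     std::size_t& first, std::size_t& last)
{
   assert(text.Text.size() == text.SourceIndex.size());
   if (needle.empty()) return false;
   const std::size_t pos = text.Text.find(needle);
   if (pos == std::u32string::npos) return false;
   first = text.SourceIndex[pos];
   last  = text.SourceIndex[pos + needle.size() - 1];
   return true;
}
```

With this table, the four source bytes 'o', 0x0C, 'c', 'e' decode to the six characters of "office". A search for "fice" starts inside the ligature, so the returned source range begins at the ligature byte, and a width computed over that range covers the whole ligature glyph.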

If a text search algorithm must provide the coordinates of a found string, it must be able to
find the position of the search text in the source string, because it is not possible to calculate the
string width from the Unicode string. DynaPDF provides several helper functions to calculate the
width of a sub string or to convert an arbitrary source string manually to Unicode. It is always
possible to calculate the exact position of a string, but the recommended strategy depends on the
text callback function used and on the kind of algorithm to be developed:

• Text extraction algorithms usually do not require the exact position of every character or word
in a string. Coordinates of sub strings are only required if word spacing must be considered,
but word spacing applies to simple fonts only. Because the code length of a simple font is
always one byte, the string width can be easily computed with fntGetTextWidth(), and in
cases where the source string is shorter than the Unicode string, the source string can be

manually converted to Unicode with TranslateRawString2() (the name is