DynaPDF Manual - Page 56
Content parsing & editing
coordinates and the orientation must be considered. The page coordinate system is de-rotated
before text extraction starts since this produces better results. The width and height must be
calculated from the crop box if set, or from the media box otherwise. Note also that the width
and height must be exchanged if the orientation is 90, -90, 270, or -270 degrees.
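The size rule above can be sketched as a small helper. This is only an illustration: the types Rect and PageSize and the function visible_page_size() are hypothetical names for this sketch and are not part of the DynaPDF API. It assumes the boxes are given in bottom-up coordinates and the orientation is a multiple of 90 degrees.

```c
#include <stdlib.h>

/* Hypothetical rectangle type for this sketch (not a DynaPDF structure). */
typedef struct { double left, bottom, right, top; } Rect;

/* Hypothetical result type: page size as seen in a viewer. */
typedef struct { double width, height; } PageSize;

/* Uses the crop box if one is set (non-NULL), the media box otherwise,
   and swaps width and height for 90, -90, 270, and -270 degrees. */
PageSize visible_page_size(const Rect *media_box, const Rect *crop_box,
                           int orientation)
{
    const Rect *box = crop_box ? crop_box : media_box;
    PageSize size;
    double tmp;

    size.width  = box->right - box->left;
    size.height = box->top - box->bottom;

    /* abs(orientation) % 180 is 90 exactly for +-90 and +-270. */
    if (abs(orientation) % 180 == 90) {
        tmp = size.width;
        size.width = size.height;
        size.height = tmp;
    }
    return size;
}
```

The result is the width and height that a search or extraction area would have to be expressed in.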
Remarks:
Note that Text can be NULL and TextLen zero even if the function returned without an error. In
this case the page contains no text.
Return value:
If the function succeeds, the return value is 1. If the function fails, the return value is 0.
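A caller therefore has to distinguish three outcomes: failure, success without text, and success with text. The following minimal sketch shows one way to do this. The enum ExtractOutcome and the function classify_extract_result() are hypothetical helpers for illustration; ret stands for the return value, and text and text_len for the returned Text and TextLen.

```c
#include <stddef.h>

/* Hypothetical outcome type for this sketch (not part of DynaPDF). */
typedef enum {
    EXTRACT_FAILED,   /* the function returned 0 */
    EXTRACT_NO_TEXT,  /* success, but the page contains no text */
    EXTRACT_HAS_TEXT  /* success, text and text_len are usable */
} ExtractOutcome;

ExtractOutcome classify_extract_result(int ret,
                                       const unsigned short *text,
                                       unsigned int text_len)
{
    if (!ret)
        return EXTRACT_FAILED;
    /* A NULL pointer or a zero length is not an error here. */
    if (text == NULL || text_len == 0)
        return EXTRACT_NO_TEXT;
    return EXTRACT_HAS_TEXT;
}
```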
FindText
Syntax:
LBOOL psrFindText(
   const PPDF* IPDF,               // PDF instance pointer
   const IPSR* Ctx,                // Parser instance pointer
   struct TFltRect* Area,          // Optional search area
   TSearchType SearchType,         // See below
   struct TTextSelection* Last,    // The previous selection if any
   const UI16* Text,               // Search text
   UI32 TextLen,                   // Text length in characters
   struct TTextSelection* SelText) // Required output structure
typedef enum TSearchType
{
   stDefault         = 0, // Case sensitive search
   stWholeWord       = 1, // Only whole words
   stCaseInSensitive = 2, // Case insensitive search
   stMatchAlways     = 4, // Return on every single character. Text and TextLen are ignored when
                          // this flag is set.
   stSearchAsIs      = 8  // Disable sorting on the x-axis
}TSearchType;
The function searches for text and stores the result so that further editing actions can be applied.
The parameter SelText is required. Its member StructSize must be set to sizeof(TTextSelection)
before the function is called. The parameter Text is required unless the flag stMatchAlways
is set. When this flag is set, the function returns on every single character, and Text and
TextLen are ignored.
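Because Text is a UI16 array and TextLen counts characters, an ordinary C string cannot be passed directly. The sketch below widens a 7-bit ASCII string into 16-bit units. It assumes that UI16 is a 16-bit unsigned integer and that the search text is expected as UTF-16 code units; neither is confirmed on this page. The function ascii_to_ui16() is a hypothetical helper, not a DynaPDF function.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies a 7-bit ASCII string into a buffer of 16-bit units.
   Returns the number of characters written (no terminator is added),
   or -1 if the input is not pure ASCII or does not fit into dst. */
int ascii_to_ui16(const char *src, uint16_t *dst, size_t dst_cap)
{
    size_t n = 0;

    while (src[n] != '\0') {
        unsigned char c = (unsigned char)src[n];
        if (c > 0x7F || n >= dst_cap)
            return -1;
        dst[n] = (uint16_t)c;
        n++;
    }
    return (int)n;
}
```

The filled buffer and the returned count would then serve as the Text and TextLen arguments. Text outside the ASCII range needs a real UTF-16 conversion instead.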
The bounding box of the found text can be computed with GetSelBBox().
Text is sorted on the x-axis by default. Set the flag stSearchAsIs to disable this sorting.
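The TSearchType values are powers of two and the text calls them flags, which suggests that they can be combined with a bitwise OR. That combination is an assumption of the following sketch, not something this page states. The enum is repeated here only to keep the sketch self-contained; real code would take it from the DynaPDF header. The functions make_search_type() and text_required() are hypothetical helpers.

```c
/* Local copy of the enum shown above, repeated for self-containment. */
typedef enum TSearchType
{
    stDefault         = 0,
    stWholeWord       = 1,
    stCaseInSensitive = 2,
    stMatchAlways     = 4,
    stSearchAsIs      = 8
} TSearchType;

/* Builds a search type from individual options.
   Assumes the flags may be OR-ed together. */
TSearchType make_search_type(int whole_word, int ignore_case, int keep_order)
{
    int flags = stDefault;

    if (whole_word)  flags |= stWholeWord;
    if (ignore_case) flags |= stCaseInSensitive;
    if (keep_order)  flags |= stSearchAsIs; /* no sorting on the x-axis */
    return (TSearchType)flags;
}

/* Text is required unless stMatchAlways is set. */
int text_required(TSearchType type)
{
    return (type & stMatchAlways) == 0;
}
```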
Optional search area
Area must be defined as the page would appear in a PDF viewer, that is, in bottom-up
coordinates with the page orientation taken into account (see GetPageOrientation()). The width
and height of a page must be calculated from the crop box if set, or from the media box otherwise.
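Viewer or user interface code often delivers rectangles in top-down coordinates, so a conversion to bottom-up coordinates may be needed before a search area is passed. The sketch below shows the arithmetic only. The type AreaRect is a hypothetical stand-in for TFltRect, whose members are not listed on this page, and area_from_top_down() is a hypothetical helper. page_height is the visible page height, that is, after width and height have been swapped for rotated pages.

```c
/* Hypothetical stand-in for TFltRect; member names are assumptions. */
typedef struct { float Left, Bottom, Right, Top; } AreaRect;

/* Converts a rectangle given by its top-left corner (x, y_from_top),
   width w, and height h in top-down coordinates into bottom-up
   coordinates for a page of the given visible height. */
AreaRect area_from_top_down(float x, float y_from_top, float w, float h,
                            float page_height)
{
    AreaRect r;

    r.Left   = x;
    r.Right  = x + w;
    r.Top    = page_height - y_from_top; /* flip the y-axis */
    r.Bottom = r.Top - h;
    return r;
}
```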