2. Architecture and Data Structures
A quick tour of the Tesseract Code
Ray Smith, Google Inc.
Tesseract Tutorial: DAS 2014 Tours France
A Note about the Coordinate System
● The pixel edges are aligned with integer coordinates.
● (0, 0) is at bottom-left.
● Width = right - left => no silly +1/-1.
Note: The API exposes a more common top-down system.
[Figure: pixel grid with both axes labeled 0, 1, 2 at the pixel edges.]
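A minimal sketch of the convention above, assuming a hypothetical `Box` struct (not Tesseract's own box class) and a hypothetical helper `ToTopDown`. Because edges sit on integer coordinates, width and height are plain differences, and converting from the internal bottom-up system to a top-down one only needs the image height.

```cpp
#include <cassert>

// Hypothetical box type for illustration only. Edges lie on integer pixel
// boundaries, with y growing upward from the bottom-left origin.
struct Box {
  int left, bottom, right, top;
  int width() const { return right - left; }   // no +1/-1 needed
  int height() const { return top - bottom; }
};

// The same box in a top-down system, where y grows downward from the top-left
// origin, so `top` is numerically smaller than `bottom`.
struct TopDownBox {
  int left, top, right, bottom;
};

// Convert a bottom-up box to top-down coordinates for an image that is
// image_height pixels tall.
TopDownBox ToTopDown(const Box& b, int image_height) {
  return TopDownBox{b.left, image_height - b.top, b.right,
                    image_height - b.bottom};
}
```

For example, `Box{1, 0, 3, 2}` has width 2 and height 2; in an image 10 pixels high, `ToTopDown` maps it to top 8 and bottom 10.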
Tesseract System Architecture
Nominally a pipeline, but not really, as there is a lot of revisiting of old decisions.
Tesseract Word Recognizer
The ‘C’ Legacy
● Large chunks of the code were originally written in C.
● Major rewrite in ~1991 with new C++ code.
● C->C++ migration has been gradual ever since.
● The majority of global functions now live in convenience classes that mirror the directory structure (for thread-compatibility purposes).
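To illustrate the last bullet, here is a minimal sketch of why moving a global function and its global state into a class helps thread compatibility. The names (`NoiseFilter`, `max_noise_size_`) are hypothetical and not taken from the Tesseract source. Each instance carries its own state, so separate instances do not share mutable globals.

```cpp
#include <cassert>

// Before (C style): one global setting shared by every caller and thread.
//   int max_noise_size = 7;
//   bool is_noise(int height) { return height < max_noise_size; }

// After: the former global function is a member function, and the former
// global variable is per-instance state. All names here are hypothetical.
class NoiseFilter {
 public:
  explicit NoiseFilter(int max_noise_size)
      : max_noise_size_(max_noise_size) {}

  // True if a blob of this height counts as noise for this instance.
  bool IsNoise(int height) const { return height < max_noise_size_; }

  void set_max_noise_size(int v) { max_noise_size_ = v; }

 private:
  int max_noise_size_;
};
```

Changing the threshold on one `NoiseFilter` leaves every other instance untouched, which is the property a single global variable cannot offer.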
Directory Structure ~ Functional Architecture
Directory   Class
api         TessBaseAPI
ccmain      Tesseract
wordrec     Wordrec
textord     Textord
classify    Classify
dict        Dict
ccstruct    CCStruct
cutil       CUtil
ccutil      CCUtil
cube        -
Key Data Structures = Page Hierarchy
Core page outlines: BLOCK, ROW, WERD, C_BLOB, C_OUTLINE
Results: PAGE_RES, BLOCK_RES, ROW_RES, WERD_RES, WERD_CHOICE, BLOB_CHOICE
Layout (old): TO_BLOCK, TO_ROW, BLOBNBOX
Layout: ColPartition, WorkingPartSet
Normalized outlines: TWERD, TBLOB, TESSLINE, EDGEPT, TPOINT
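As a rough illustration of the containment in the core page hierarchy (a block holds rows, a row holds words, a word holds blobs, a blob holds outlines), here is a simplified sketch. The `Sketch*` types are hypothetical stand-ins built on `std::vector`; the real classes use Tesseract's own list types and carry far more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the real classes.
struct SketchOutline { std::vector<int> steps; };            // cf. C_OUTLINE
struct SketchBlob { std::vector<SketchOutline> outlines; };  // cf. C_BLOB
struct SketchWord { std::vector<SketchBlob> blobs; };        // cf. WERD
struct SketchRow { std::vector<SketchWord> words; };         // cf. ROW
struct SketchBlock { std::vector<SketchRow> rows; };         // cf. BLOCK

// Walk the hierarchy top-down and count the blobs in one block.
std::size_t CountBlobs(const SketchBlock& block) {
  std::size_t n = 0;
  for (const SketchRow& row : block.rows) {
    for (const SketchWord& word : row.words) {
      n += word.blobs.size();
    }
  }
  return n;
}
```

The nested loops mirror how code typically walks a page: block by block, row by row, word by word.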
Software Engineering - Building Blocks
Containers: GenericVector, ELIST, CLIST
Coordinates: ICOORD, FCOORD, TBOX
Text: STRING, UNICHARSET
Key Parts of the Call Hierarchy
TessBaseAPI::Recognize
  Tesseract::SegmentPage
    Tesseract::AutoPageSeg
    Textord::TextordPage
  Tesseract::recog_all_words
    Tesseract::classify_word_and_language
      Tesseract::chop_word_main
        Wordrec::SegSearch
          Classify::AdaptiveClassifier
          LanguageModel::UpdateState
Tesseract’s List Implementation
● Predates STL
● Allows control over ownership of list elements
● Uses nasty macros instead of templates
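To make the last two bullets concrete, here is a minimal sketch of the general technique: an intrusive singly linked list whose typed wrapper is stamped out by a macro rather than a template. `SKETCH_LISTIZE`, `SketchLink`, and `Item` are hypothetical names, and this is not how `ELISTIZE` is actually implemented; it only shows the flavor. Elements embed their own link, so unlinking one hands ownership to the caller without copying.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical intrusive link: each element carries its own "next" pointer.
struct SketchLink {
  SketchLink* next = nullptr;
};

// Generates a typed list class CLASSNAME##_LIST for any type derived from
// SketchLink. The list owns its elements until they are popped.
#define SKETCH_LISTIZE(CLASSNAME)                                   \
  class CLASSNAME##_LIST {                                          \
   public:                                                          \
    CLASSNAME##_LIST() : head_(nullptr), size_(0) {}                \
    ~CLASSNAME##_LIST() {                                           \
      while (head_ != nullptr) delete pop_front();                  \
    }                                                               \
    CLASSNAME##_LIST(const CLASSNAME##_LIST&) = delete;             \
    CLASSNAME##_LIST& operator=(const CLASSNAME##_LIST&) = delete;  \
    /* Takes ownership of item. */                                  \
    void push_front(CLASSNAME* item) {                              \
      item->next = head_;                                           \
      head_ = item;                                                 \
      ++size_;                                                      \
    }                                                               \
    /* Unlinks the first element; the caller now owns it. */        \
    CLASSNAME* pop_front() {                                        \
      CLASSNAME* item = static_cast<CLASSNAME*>(head_);             \
      if (item != nullptr) {                                        \
        head_ = item->next;                                         \
        item->next = nullptr;                                       \
        --size_;                                                    \
      }                                                             \
      return item;                                                  \
    }                                                               \
    std::size_t size() const { return size_; }                      \
                                                                    \
   private:                                                         \
    SketchLink* head_;                                              \
    std::size_t size_;                                              \
  };

// Example element type: it becomes listable by deriving from the link.
struct Item : public SketchLink {
  explicit Item(int v) : value(v) {}
  int value;
};

SKETCH_LISTIZE(Item)  // defines Item_LIST
```

An element popped from one `Item_LIST` can be pushed straight onto another, so ownership is handed over explicitly rather than by copying the element.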
List Example
tordmain.cpp:

    float Textord::filter_noise_blobs(
        BLOBNBOX_LIST *src_list,      // original list
        BLOBNBOX_LIST *noise_list,    // noise list
        BLOBNBOX_LIST *small_list) {  // small blobs
      BLOBNBOX_IT src_it(src_list);   // iterators
      BLOBNBOX_IT noise_it(noise_list);
      BLOBNBOX_IT small_it(small_list);
      for (src_it.mark_cycle_pt(); !src_it.cycled_list(); src_it.forward()) {
        BLOBNBOX *blob = src_it.data();
        if (blob->bounding_box().height() < textord_max_noise_size)
          // Unlink the blob from src_list and hand it to noise_list.
          noise_it.add_after_then_move(src_it.extract());
        else if (blob->enclosed_area() >=
                 blob->bounding_box().area() * textord_noise_area_ratio)
          small_it.add_after_then_move(src_it.extract());
      }
      // … (the rest of the function is not shown on the slide)
    }

blobbox.h:

    class BLOBNBOX : public ELIST_LINK {…};
    // Defines classes:
    //   BLOBNBOX_LIST: a list of BLOBNBOX
    //   BLOBNBOX_IT: list iterator
    ELISTIZEH(BLOBNBOX)

blobbox.cpp:

    // Implementation of some of the list functions.
    ELISTIZE(BLOBNBOX)
TessBaseAPI : Simple example
Main API class provides initialization, image input, and text/hOCR/PDF output:

    TessBaseAPI api;
    api.Init(NULL, "eng");                // returns 0 on success, -1 on failure
    Pix* pix = pixRead("phototest.tif");  // Leptonica image reader
    api.SetImage(pix);
    char* text = api.GetUTF8Text();       // caller owns the returned buffer
    printf("%s\n", text);
    delete [] text;
    pixDestroy(&pix);
TessBaseAPI : Multipage example
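    TessBaseAPI api;
    api.Init(NULL, "eng");
    tesseract::TessResultRenderer* renderer =
        new tesseract::TessPDFRenderer(api.GetDatapath());
    // filename: path of the multipage input image, supplied by the caller.
    api.ProcessPages(filename, NULL, 0, renderer);
    const char* data;
    inT32 data_len;
    // fout: a FILE* the caller has already opened for binary writing.
    if (renderer->GetOutput(&data, &data_len)) {
      fwrite(data, 1, data_len, fout);
      fclose(fout);
    }
    delete renderer;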
ResultIterator for getting the real details
    ResultIterator* it = api.GetIterator();
    if (it != NULL) {  // NULL if nothing has been recognized yet
      do {
        // Word box in image coordinates (the API's top-down system).
        int left, top, right, bottom;
        if (it->BoundingBox(RIL_WORD, &left, &top, &right, &bottom)) {
          char* text = it->GetUTF8Text(RIL_WORD);
          printf("%s %d %d %d %d\n", text, left, top, right, bottom);
          delete [] text;
        }
      } while (it->Next(RIL_WORD));
      delete it;
    }
Thanks for Listening!
Questions?