Data Capture | ABBYY FlexiCapture
ABBYY FlexiCapture is highly accurate and scalable document imaging and data extraction software that automatically transforms documents of any structure, language or content into usable and accessible business-ready data.
Intelligent self-learning classification and state-of-the-art recognition technologies enable FlexiCapture to replace error-prone manual processes with automatic document classification and processing.
Flexible and customizable, FlexiCapture can handle virtually all document processing scenarios and can be tailored to any company’s workflows and regulations.
Why ABBYY FlexiCapture?
Software for Document-driven Business Processes
One system for processing all kinds of paper documents in any industry
Intelligent Auto-learning Technology Makes Setup Easy
Interactive training technology simplifies system implementation and setup.
Mobile Document Capture
FlexiCapture’s mobile capture client provides an alternative entry point for documents, usable at any time, from anywhere.
Take the data. Leave the paper.
Product Highlights
Auto Document Classification
- Automatically separate and classify documents regardless of how they were imported into the system
- No limitation on classification rules
- Advanced scalability for high-volume data and document capture across enterprises
Accurate Data Extraction with Table Extraction Capability
- Automatic column and row identification and data extraction
- Simple setup through point-and-click and auto-regex
- Ability to span pages and extract multi-page table or invoice data
- Ability to extract line data that is not in a table format
- Tolerance for movement of the table on the page due to differing DPI and/or page registration issues on scanning
- Ability to fine tune extraction methods using custom regex for value pattern matching
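As a rough illustration of value pattern matching with custom regular expressions, the sketch below pulls an invoice number and an amount out of a line of recognized text. The patterns and the `extract_fields` helper are hypothetical examples written in plain Python; they are not FlexiCapture's built-in expressions or API.

```python
import re

# Hypothetical patterns for illustration only; real projects would tune
# these to the layouts and numbering schemes of their own documents.
INVOICE_NUMBER = re.compile(r"\bINV-\d{6}\b")
AMOUNT = re.compile(r"\b\d{1,3}(?:,\d{3})*\.\d{2}\b")


def extract_fields(line_text: str) -> dict:
    """Return the first invoice number and amount found in a line of OCR text."""
    number = INVOICE_NUMBER.search(line_text)
    amount = AMOUNT.search(line_text)
    return {
        "invoice_number": number.group(0) if number else None,
        "amount": amount.group(0) if amount else None,
    }


if __name__ == "__main__":
    print(extract_fields("Invoice INV-004217 total 1,250.00 USD"))
```

A field that does not match its pattern comes back as `None`, which is a natural signal to route the document to manual verification.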
Workflow Auto-Processing
- Automatic processing of documents, including import, classification, recognition, data extraction, and export
- Flexible workflow that can be easily adjusted to specific business processes
- Support for double verification by two independent operators
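One way to picture double verification is as a comparison of two independently keyed versions of the same document. The sketch below is an assumption about how such a comparison could work, not a description of FlexiCapture's internal logic; the `reconcile` function and the field names are hypothetical.

```python
def reconcile(first: dict, second: dict) -> tuple[dict, list]:
    """Compare two operators' entries for the same document.

    Fields on which both operators agree are accepted; everything else
    is listed as disputed so it can be reviewed again.
    """
    accepted, disputed = {}, []
    for field in sorted(first.keys() | second.keys()):
        a, b = first.get(field), second.get(field)
        if a is not None and a == b:
            accepted[field] = a
        else:
            disputed.append(field)
    return accepted, disputed


if __name__ == "__main__":
    operator_a = {"invoice_number": "INV-004217", "total": "1250.00"}
    operator_b = {"invoice_number": "INV-004217", "total": "1250.80"}
    print(reconcile(operator_a, operator_b))
```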
Easy Integration with Existing Business Applications
Import Options
FlexiCapture provides import from:
- Scanning device (TWAIN, ISIS, WIA)
- Watched folder (local or LAN)
- FTP server
- E-mail attachments from MS Exchange or POP3 mail servers
Supported file formats on import:
- PDF, BMP, JPEG, JPEG 2000, TIFF, DjVu, PNG, PCX, DCX
Export Options
FlexiCapture provides export to:
- Files
- SharePoint 2003/2007/2010/2013
- ODBC-compatible databases
- Any ERP system and invoice approval workflow
- Any external application by using custom script modules
Supported file formats on export:
Data Output Formats: .XML, .TXT, .XLS, .DBF, .CSV.
Image Output Formats: PDF (Image only, text under image), PDF/A (Image only, text under image), TIFF, JPEG, JPEG2000, PCX, BMP, PNG, DCX.
A Single Solution for All Document Types
Speed up business processes by using automated data entry software and eliminating time- and resource-consuming manual data entry. The intelligent capture algorithms enable the system to process any kind of document: invoices, contracts, registration forms and more.
Capture data from any document, from structured forms to unstructured text-heavy papers.
How It Works
1. Flexible Import Options
- Scanning device (TWAIN, ISIS, WIA)
- Watched folder (local or LAN)
- FTP server
- E-mail attachments from MS Exchange or POP3 mail servers
2. Supported File Formats on Import
- PDF, BMP, JPEG, JPEG 2000, TIFF, DjVu, PNG, PCX, DCX
3. Scanning Station
FlexiCapture Scanning Station enables easy scanning with any TWAIN-, ISIS- and WIA-enabled device. Available in thick and thin client versions.
4. Scanning Profiles
Scanning Station features scanning profiles, which enable pre-defined settings for applications to be applied to specific batches. When scanning a new set of documents, the user needs only to choose the right profile from a drop-down menu.
5. Image Improvement
Pre-loaded or scanned images can be improved before processing using features that include rotation, deskewing, hiding of sensitive data, and more.
1. Automatic Assembly of Multi-page Documents from a Mix of Pages
Assembly may rely on separators (blank pages inserted between two documents), page counters, or advanced ABBYY classification algorithms that automatically detect pages belonging to different documents.
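The separator-based approach can be sketched in a few lines. The example below assumes each page is already available as recognized text and treats a page with no text as a blank separator; the `assemble_documents` function is a hypothetical illustration, not part of the product.

```python
from typing import Iterable


def assemble_documents(pages: Iterable[str]) -> list[list[str]]:
    """Group a stream of page texts into documents, splitting on blank pages."""
    documents, current = [], []
    for page_text in pages:
        if not page_text.strip():
            # Blank separator page: close the current document, if any.
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page_text)
    if current:
        documents.append(current)
    return documents


if __name__ == "__main__":
    scanned = ["invoice p1", "invoice p2", "", "contract p1", " ", "", "form p1"]
    print(assemble_documents(scanned))
```

Consecutive separators are collapsed, so a double blank page does not produce an empty document.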
2. Automated Image-based Classification
- Content-based classification
- Rule-based classification
- Any combination of the above
3. Highly Accurate OCR/ICR/OMR and Barcode Recognition
- Optical character recognition of printed text in up to 190 languages
- Intelligent character recognition for hand-printed text in over 110 languages
- Barcode recognition for a variety of 1D and 2D barcodes
- Optical mark recognition for a wide range of checkmarks
4. Automatic Validation
- Comparison against databases
- Conformity with built-in validation rules
- Compliance with format
- Data normalization
- Application of other user-defined checks
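The kinds of checks listed above can be illustrated with a small sketch: a lookup against known values, a format check, and date normalization. Everything here is an assumption made for illustration, including the `KNOWN_VENDORS` set (standing in for a database), the field names, and the accepted date formats.

```python
import re
from datetime import datetime

# Stand-in for a database of known vendors (hypothetical values).
KNOWN_VENDORS = {"ACME-001", "GLOBEX-042"}


def normalize_date(raw: str):
    """Normalize a date to ISO 8601, or return None if no format matches."""
    for fmt in ("%d.%m.%Y", "%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def validate(record: dict) -> list:
    """Return a list of validation errors; an empty list means the record passed."""
    errors = []
    if record.get("vendor_id") not in KNOWN_VENDORS:
        errors.append("unknown vendor")
    if not re.fullmatch(r"\d+\.\d{2}", record.get("total", "")):
        errors.append("total has wrong format")
    if normalize_date(record.get("date", "")) is None:
        errors.append("unparseable date")
    return errors


if __name__ == "__main__":
    print(validate({"vendor_id": "ACME-001", "total": "99.50", "date": "31.12.2015"}))
```

Records that return a non-empty error list would typically be sent to an operator for verification rather than exported directly.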
5. Support for Many Recognition Languages
- 43 main languages with dictionary support: Arabic (Saudi Arabia), Armenian (Eastern), Armenian (Grabar), Armenian (Western), Azeri (Latin), Bashkir, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, Dutch (Belgian), English, Estonian, Finnish, French, German, German (new spelling), Greek, Hebrew, Hungarian, Indonesian, Italian, Latvian, Lithuanian, Norwegian, Norwegian (Bokmal), Norwegian (Nynorsk), Polish, Portuguese, Portuguese (Brazilian), Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tatar, Thai, Turkish, Ukrainian, Vietnamese;
- 133 additional languages without dictionary support: Abkhaz, Adyghe, Afrikaans, Agul, Albanian, Altai, Avar, Aymara, Azerbaijani (Cyrillic), Basque, Belarusian, Bemba, Blackfoot, Breton, Bugotu, Buryat, Cebuano, Chamorro, Chechen, Chukchee, Chuvash, Corsican, Crimean Tatar, Crow, Dargwa, Dungan, Eskimo (Cyrillic), Eskimo (Latin), Even, Evenki, Faroese, Fijian, Frisian, Friulian, Gagauz, Galician, Ganda, German (Luxembourg), Guarani, Hani, Hausa, Hawaiian, Icelandic, Indonesian, Ingush, Irish, Jingpo, Kabardian, Kalmyk, Karachay-balkar, Karakalpak, Kasub, Kawa, Kazakh, Khakass, Khanty, Kikuyu, Kirghiz, Kongo, Koryak, Kpelle, Kumyk, Kurdish, Lak, Latin, Lezgi, Luba, Macedonian, Malagasy, Malay (Malaysian), Malinke, Maltese, Mansi, Maori, Mari, Maya, Miao, Minangkabau, Mohawk, Moldavian, Mongol, Mordvin, Nahuatl, Nenets, Nivkh, Nogay, Nyanja, Ojibway, Ossetian, Papiamento, Provencal, Quechua, Rhaeto-Romanic, Romany, Rundi, Russian (Old Spelling), Rwanda, Sami (Lappish) , Samoan, Scottish Gaelic, Selkup, Serbian (Cyrillic), Serbian (Latin), Shona, Sioux (Dakota), Somali, Sorbian, Sotho, Sunda, Swahili, Swazi, Tabasaran, Tagalog, Tahitian, Tajik, Tok Pisin, Tongan, Tswana, Tun, Turkmen, Tuvinian, Udmurt, Uigur (Cyrillic), Uigur (Latin), Uzbek (Cyrillic), Uzbek (Latin), Welsh, Wolof, Xhosa, Yakut, Yiddish, Zapotec, and Zulu;
- 5 East Asian languages: Chinese (Traditional, Simplified), Japanese, Korean and Hangul (Korean);
- 6 languages for recognition of old European documents and Gothic fonts in books printed in the 18th to 20th centuries: English, French, German, Italian, Spanish, and Latvian;
- 4 artificial languages: Esperanto, Ido, Interlingua, and Occidental;
- Digits
- 1D barcodes: Code 39, Check Code 39, Interleaved 25, Check Interleaved 25, EAN 13, EAN 8, Code 128, Codabar, Code 93, IATA 25, UCC-128, UPC-A, UPC-E, Matrix 2 of 5, Industrial 2 of 5, PostNet, Patch code (1, 2, 3, 4, T/Transfer, 6)
- 2D barcodes: PDF 417, Aztec, Datamatrix, QR code
- Multiple text types: Typographic, Handprinted, Typewriter, Matrix printer, Index, OCR-A, OCR-B, MICR (E13B), MICR (CMC7)
1. Group Verification
Group verification for checkmarks and digits is applied across documents in form recognition projects. Identical figures (signs) from an entire document batch are displayed together.
2. Field Verification
Field verification checks data fields one by one.
3. Verification in Document Window
Recognition results of all required data fields are viewed simultaneously and compared with the original image. Information that is not successfully recognized, such as handwritten text or notes, can be typed manually into the fields.
Verification is available in thick and thin client versions. Verification is an optional stage and can be skipped.
1. Multi-Export Destinations
ABBYY FlexiCapture enables multiple destinations for data and images as well as generation of searchable PDFs.
2. Flexible Export Options
FlexiCapture provides export to:
- Files
- SharePoint 2003/2007/2010/2013
- ODBC-compatible databases
- Any ERP system and invoice approval workflow
- Any external application by using custom script modules
FlexiCapture supports export to the following file formats:
Data Output Formats: .XML, .TXT, .XLS, .DBF, .CSV.
Image Output Formats: PDF (Image only, text under image), PDF/A (Image only, text under image), TIFF, JPEG, JPEG2000, PCX, BMP, PNG, DCX.
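To give a sense of what the CSV and XML data outputs might look like, the sketch below serializes a list of captured records with Python's standard library. The record fields and the `documents`/`document` element names are hypothetical; the actual export schema is defined in the FlexiCapture project settings.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(records: list, fields: list) -> str:
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records: list, fields: list) -> str:
    """Serialize records to a simple XML string, one element per field."""
    root = ET.Element("documents")
    for record in records:
        doc = ET.SubElement(root, "document")
        for name in fields:
            ET.SubElement(doc, name).text = str(record[name])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    captured = [{"invoice_number": "INV-004217", "total": "1250.00"}]
    columns = ["invoice_number", "total"]
    print(to_csv(captured, columns))
    print(to_xml(captured, columns))
```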
1. Web-based Administration and Monitoring Console
FlexiCapture includes a web-based Administration and Monitoring Console that enables 24/7 supervision from any location. An administrator can easily manage user rights, check event logs, view standard reports or generate custom performance reports.
2. E-mail Alerts
Administrators can opt to receive e-mail alerts for important events like errors, license expiration and page count limits. Administrators can also be notified about imminent database overflow, running out of disk space, requests for access rights, or failed attempts to log in.
- Custom processing stages
- Connection to additional OCR/ICR engines
- Use of third-party image enhancement tools
- Use of custom verification clients
- Connection to signature comparison and other external modules
Plus, the Web Service API ensures easy integration of FlexiCapture into many business applications and workflows as an automatic document classification and data capture service.