The Document Recognition activity lets you extract data from documents using the Smart Engines method. It supports files in .pdf and .jpg formats. You can assign context variables that store the recognized text and then use them in the process. Read more about context variables in the "Process context" article.
To open the settings window, click on the activity on the process diagram.
The Parameters tab displays the basic activity parameters.
- Name. The activity name on the process diagram. It is set by the template when adding the activity. If you want to change the name, you can do it in this field;
- Recognition method. Select a recognition method.
Document recognition using Smart Engines
To extract data from official documents (passport, birth certificate, visa, etc.), you can use the Smart Engines module. It is a tool for recognizing passports and other identification documents in Russia and other countries.
By default, the module is available as a trial version for evaluation only, and some characters of the recognized data are hidden. To activate the full version, you need to purchase a license. For more information, contact us: firstname.lastname@example.org.
After selecting the Smart Engines method, fill in the following fields.
- Document. The document you want to extract data from. This field supports only File variables;
- Country. Select the document's country;
- Recognized document type. Select a document type (visa, passport, birth certificate, TIN, etc.).
To choose which data is extracted from the document and which variables store it, click the Assign Variables button. The attributes of the selected document type will be displayed. Each document type has its own set of attributes.
For each attribute, select a process context variable to save its data. You can also create a new variable by selecting Create parameter from the drop-down list. To delete a variable, click the delete icon.
The window also shows the recognition accuracy setting for the selected attribute. It sets the minimum acceptable accuracy (confidence) of the recognized value. The accuracy actually achieved depends on many factors, and one of the most important is the quality of the document image. For example, suppose you set 90 percent (0.9). If a value is recognized with an accuracy of 90 percent or more, it is accepted and saved to the context variable. If the accuracy is less than 90 percent, the value is not accepted and the context variable is not filled.
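The acceptance rule described above can be sketched in a few lines of Python. This is only an illustration of the threshold comparison, not the Smart Engines API or the platform's implementation. The function name `accept_value` and its parameters are hypothetical.

```python
from typing import Optional


def accept_value(value: str, confidence: float, min_accuracy: float = 0.9) -> Optional[str]:
    """Return the recognized value if it meets the minimum accuracy.

    Hypothetical helper: returning None stands for "the context
    variable is not filled".
    """
    if not 0.0 <= min_accuracy <= 1.0:
        raise ValueError("min_accuracy must be between 0 and 1")
    # A value recognized at exactly the threshold is accepted ("90 percent or more").
    return value if confidence >= min_accuracy else None


accepted = accept_value("IVANOV", confidence=0.93)  # meets the 0.9 threshold
rejected = accept_value("IVANOV", confidence=0.89)  # below the 0.9 threshold
```

Here `accepted` holds the recognized string, while `rejected` is `None`, which mirrors a context variable that stays empty.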
The lower the required accuracy, the more likely you are to get incorrect data, so choose this parameter carefully. Pay attention primarily to the document quality. If you are sure that it is high, you can set an accuracy of 97 percent or more. If the quality is slightly lower, it is better to set 94 percent. If it is low, set about 90 percent or less.
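As a rough illustration, the recommendations above can be written as a lookup of starting values. The quality labels `high`, `medium`, and `low` and the function name are assumptions made for this sketch. The numbers are starting points taken from the text, not fixed rules.

```python
def suggested_min_accuracy(image_quality: str) -> float:
    """Map an informal image quality label to a starting accuracy threshold.

    The labels are hypothetical; the values follow the guidance above
    (97 percent or more, 94 percent, about 90 percent or less).
    """
    starting_points = {"high": 0.97, "medium": 0.94, "low": 0.90}
    try:
        return starting_points[image_quality.lower()]
    except KeyError:
        raise ValueError(f"unknown image quality: {image_quality!r}") from None


threshold = suggested_min_accuracy("medium")
```

In practice, you would adjust the suggested value after checking how well a few real documents are recognized.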
If the data is not recognized, an error occurs and the process does not move further. To handle this case, create an escalation for this activity. Read more about escalation in the "Creating escalation" section of the "Execution flow" article.
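Conceptually, an escalation acts as a fallback branch that the process takes when the activity fails. The sketch below shows that idea in plain Python, under the assumption that a recognition failure surfaces as an exception. `RecognitionError`, `run_with_escalation`, and the handler functions are hypothetical names and are not part of the platform.

```python
from typing import Callable, Dict


class RecognitionError(Exception):
    """Hypothetical error raised when document data is not recognized."""


def run_with_escalation(
    recognize: Callable[[], Dict[str, str]],
    escalate: Callable[[Exception], Dict[str, str]],
) -> Dict[str, str]:
    """Run the recognition step and switch to the escalation branch on failure."""
    try:
        return recognize()
    except RecognitionError as error:
        # Without this branch the error would stop the process.
        return escalate(error)


def failing_recognition() -> Dict[str, str]:
    # Simulates an activity that cannot recognize the document.
    raise RecognitionError("document could not be recognized")


def manual_entry(error: Exception) -> Dict[str, str]:
    # Simulates an escalation that routes the document to manual data entry.
    return {"route": "manual_entry", "reason": str(error)}


result = run_with_escalation(failing_recognition, manual_entry)
```

In this sketch the failed recognition ends up on the manual entry route instead of stopping the process.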
You can search for document attributes by name. To do this, start typing a name in the search bar. The search results are displayed in the table immediately.
The Extracted data block displays all the selected attributes and context variables.
Read more about the Conditions tab in the "Basic activity settings principles" article.