Form Recognizer is one of Microsoft’s Cognitive Services. It applies advanced machine learning to accurately extract text, key/value pairs, and table data from typed or handwritten documents.
Functions of Form Recognizer:
With composed models, we can create a single model ID by combining multiple models, each corresponding to a specific type of form. When we submit a document to the composed model, the Form Recognizer service performs a classification step to determine the type of the form and routes it to the corresponding custom model.
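To make the routing step concrete, here is a minimal Python sketch that reads which component model handled a submitted document. It assumes the v2.1 REST result shape, in which each entry of `analyzeResult.documentResults` carries `docType`, `modelId`, and `docTypeConfidence`. The sample payload, the model IDs, and the `routed_models` helper are invented for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical, trimmed analyze result for a composed model (assumed v2.1 shape).
SAMPLE_RESULT: Dict[str, Any] = {
    "status": "succeeded",
    "analyzeResult": {
        "documentResults": [
            {
                "docType": "custom:invoice-model",
                "modelId": "1111-aaaa",
                "docTypeConfidence": 0.93,
                "fields": {"Total": {"type": "string", "text": "120.00"}},
            }
        ]
    },
}


def routed_models(result: Dict[str, Any]) -> List[Tuple[str, str, float]]:
    """Return (docType, modelId, confidence) for each document in the result."""
    documents = result.get("analyzeResult", {}).get("documentResults", [])
    return [
        (
            doc.get("docType", ""),
            doc.get("modelId", ""),
            doc.get("docTypeConfidence", 0.0),
        )
        for doc in documents
    ]


for doc_type, model_id, confidence in routed_models(SAMPLE_RESULT):
    print(f"{doc_type} handled by model {model_id} (confidence {confidence:.2f})")
```

A low classification confidence is a reasonable signal that the document should be reviewed manually before its extracted fields are trusted.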
Mesh 3.0, an employee experience platform, uses the capabilities of Form Recognizer to provide a Document Management System where users can submit any type of form. The application categorizes the document type and extracts the associated key information from the form, allowing users to search for information across documents.
Layout API:
The Layout API is used as part of custom models to detect and identify text, tables, selection marks, and structural information in documents (PDF, TIFF) and images (JPG, PNG, BMP).
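As a rough illustration, the sketch below builds the URL and headers for a Layout call without sending anything. It assumes the v2.1 REST route `/formrecognizer/v2.1/layout/analyze` and the `Ocp-Apim-Subscription-Key` header. The endpoint, the key placeholder, and the `layout_request` helper are hypothetical.

```python
from pathlib import PurePath
from typing import Dict, Tuple

# File types named in the text, mapped to the Content-Type sent with the request.
CONTENT_TYPES: Dict[str, str] = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".bmp": "image/bmp",
}


def layout_request(endpoint: str, api_key: str, file_name: str) -> Tuple[str, Dict[str, str]]:
    """Build the URL and headers for a Layout analyze call (assumed v2.1 route)."""
    suffix = PurePath(file_name).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or file_name}")
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/layout/analyze"
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": CONTENT_TYPES[suffix],
    }
    return url, headers


# Hypothetical endpoint and key placeholder; nothing is sent over the network.
url, headers = layout_request(
    "https://example.cognitiveservices.azure.com/", "<api-key>", "form.PDF"
)
print(url)
print(headers["Content-Type"])
```

Rejecting unsupported extensions before the call avoids a round trip to the service for files it would refuse anyway.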
The JSON returned by the Layout API contains the following nodes:
Prebuilt models:
The prebuilt models support sales receipts from Australia, Canada, Great Britain, India, and the United States. They also support business cards, identity documents, and invoices in various formats, and can extract key information from passports worldwide and US driver’s licenses.
Let’s look at an example of a prebuilt model that extracts information from a US driver’s license.
1. Use the open-source labelling tool, part of the Form OCR Test Toolset (FOTT)
2. To work with prebuilt models, click on “Use prebuilt model to get data”
3. Go to the Form Recognizer resource created in the Azure portal and copy the service endpoint and API key from the Keys and Endpoint tab.
4. Provide the Form Recognizer service endpoint, the API key, and the form type to analyze (in our case, ID), then choose the file for analysis.
5. Once we click on Run analysis, the data is extracted in the form of key-value pairs.
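The key-value output of the last step can also be read programmatically. The sketch below flattens an ID analysis result into a plain dictionary. It assumes the v2.1 response shape, where `analyzeResult.documentResults[*].fields` maps field names to objects with `text` and `confidence`. The sample values and the `key_value_pairs` helper are made up for illustration.

```python
from typing import Any, Dict

# Hypothetical, trimmed result for a US driver's license (assumed v2.1 shape).
SAMPLE_ID_RESULT: Dict[str, Any] = {
    "status": "succeeded",
    "analyzeResult": {
        "documentResults": [
            {
                "docType": "prebuilt:idDocument:driverLicense",
                "fields": {
                    "FirstName": {"type": "string", "text": "JANE", "confidence": 0.98},
                    "LastName": {"type": "string", "text": "DOE", "confidence": 0.55},
                    "DateOfBirth": {"type": "date", "text": "01/06/1990", "confidence": 0.99},
                },
            }
        ]
    },
}


def key_value_pairs(result: Dict[str, Any], min_confidence: float = 0.0) -> Dict[str, str]:
    """Flatten recognized fields into {name: text}, dropping low-confidence ones."""
    if result.get("status") != "succeeded":
        raise ValueError(f"Analysis not finished: {result.get('status')!r}")
    pairs: Dict[str, str] = {}
    for document in result["analyzeResult"]["documentResults"]:
        for name, field in document.get("fields", {}).items():
            if field is None:  # a field the service could not find
                continue
            if field.get("confidence", 0.0) >= min_confidence:
                pairs[name] = field.get("text", "")
    return pairs


print(key_value_pairs(SAMPLE_ID_RESULT))
print(key_value_pairs(SAMPLE_ID_RESULT, min_confidence=0.9))
```

Filtering on confidence is optional; it is shown here because extracted ID data often needs a manual review threshold.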
Custom models
With the help of custom models, we can analyze the forms that matter to us. We need only five sample forms of the same type to train a custom model, using either labelled or unlabelled data.
Train a Custom model
1. Use the open-source labelling tool, part of the Form OCR Test Toolset (FOTT), to label custom forms, train a model, and analyze forms with that model.
2. Enable cross-origin resource sharing (CORS) for the storage account: open the CORS tab and fill in the values as follows.
3. Create an Azure Blob Storage container, upload the forms, and generate a SAS URI for the container, selecting all the permissions as shown.
4. To create a connection to the Azure storage container, go to the Connections page and click the add icon. Provide a display name and the SAS URL generated in the previous step.
5. Create a new project.
6. Once the project is created, the forms are displayed in the Tags editor tab. Click “Run OCR on all files” on the left side to detect the text and tables in all the documents.
7. Label the forms by creating tags and assigning the associated text to each one. Specify the tag type and format to get better results.
8. Train the custom model by navigating to the Train page and clicking the “Train” button. Give the model a name so that it is easy to identify in the Compose models tab, and examine the average accuracy. If it is low, train with additional forms.
9. Compose a model: click the “Compose” icon, select the model IDs you want to combine into a single model, click “Compose” in the upper left corner, and give the composed model a name.
When the operation completes, your newly composed model will appear in the list.
10. Any form submitted to the composed model goes through a classification step that matches it to the corresponding model ID.
Tip: To reduce classification failures, label the forms with a larger number of tags.
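The storage, training, and compose steps above can also be sketched against the REST API. The snippet below checks a SAS URL for permissions and builds the JSON bodies for training and composing, without sending any request. It assumes the v2.1 routes `POST /formrecognizer/v2.1/custom/models` and `POST /formrecognizer/v2.1/custom/models/compose`, and it assumes that "all the permissions" means read, write, delete, and list (`rwdl` in the SAS `sp` parameter). The URLs, model IDs, and helper names are hypothetical.

```python
import json
from typing import Any, Dict, List, Set
from urllib.parse import parse_qs, urlsplit

# Assumption: the labelling tool needs read, write, delete, and list access.
REQUIRED_PERMISSIONS: Set[str] = set("rwdl")


def missing_sas_permissions(sas_url: str) -> Set[str]:
    """Return the required permission letters absent from the SAS 'sp' parameter."""
    query = parse_qs(urlsplit(sas_url).query)
    granted = set(query.get("sp", [""])[0])
    return REQUIRED_PERMISSIONS - granted


def train_body(sas_url: str, model_name: str, use_labels: bool = True) -> Dict[str, Any]:
    """Body for the train call; use_labels selects labelled training."""
    missing = missing_sas_permissions(sas_url)
    if missing:
        raise ValueError(f"SAS URL lacks permissions: {sorted(missing)}")
    return {"source": sas_url, "useLabelFile": use_labels, "modelName": model_name}


def compose_body(model_ids: List[str], model_name: str) -> Dict[str, Any]:
    """Body for the compose call; this sketch insists on at least two models."""
    if len(model_ids) < 2:
        raise ValueError("Compose needs at least two model IDs")
    return {"modelIds": list(model_ids), "modelName": model_name}


# Hypothetical container URL; the signature is a placeholder, not a real token.
SAS_URL = "https://myaccount.blob.core.windows.net/forms?sv=2020-08-04&sp=rwdl&sig=PLACEHOLDER"

print(json.dumps(train_body(SAS_URL, "invoice-model"), indent=2))
print(json.dumps(compose_body(["1111-aaaa", "2222-bbbb"], "all-forms"), indent=2))
```

Checking the SAS permissions up front catches a common setup mistake before any training time is spent.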
Share a project
1. Click the share project icon at the top right.
2. Create a connection to the same blob storage container to restore the project.
3. Go to the main page, click “Open Cloud Project”, and paste the shared project token.
Now that we have covered the steps to create custom and composed models, we can use them in Form Recognizer applications to build automated data-processing software.
Kovida Vegi is a Software Engineer at Acuvate. She is part of the Mesh and SharePoint development team. With her determined focus on our mission and her readiness to take on challenges, she has made outstanding contributions to the development of the Document Management System.