Scanning and Text Recognition (OCR)

You can add paper documents to M-Files with a scanner. To use the scanning features in the classic M-Files Desktop, press Alt and select Operations > Scanning and Text Recognition (OCR). When a scan is complete, M-Files suggests converting the scanned file to a searchable PDF with optical character recognition (OCR).

M-Files suggests text recognition automatically when you drag and drop an image file to M-Files. You can also convert non-searchable PDFs and images to searchable PDFs manually from the context menu of the file.

You can use optical character recognition with these file formats:

TIF
TIFF
JPG
JPEG
BMP
PNG
PDF

TIFF files that use an alpha channel or JPEG compression are not supported.
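
If you want to sort files before you bring them to M-Files, the format list above can be expressed as a simple extension check. The following Python sketch is only an illustration and does not use any M-Files API. The names OCR_EXTENSIONS, is_ocr_candidate, and split_by_ocr_support are hypothetical. The check looks at the file extension only, so it does not detect TIFF files that use an alpha channel or JPEG compression.

```python
from pathlib import Path

# Extensions copied from the list of supported formats above.
# Hypothetical helper names; this is not part of M-Files.
OCR_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".bmp", ".png", ".pdf"}


def is_ocr_candidate(filename: str) -> bool:
    """Return True if the file extension is in the supported list.

    The comparison is case-insensitive. Only the extension is checked,
    so unsupported TIFF variants are not detected here.
    """
    return Path(filename).suffix.lower() in OCR_EXTENSIONS


def split_by_ocr_support(filenames):
    """Split file names into (supported, unsupported) lists, keeping order."""
    supported = [name for name in filenames if is_ocr_candidate(name)]
    unsupported = [name for name in filenames if not is_ocr_candidate(name)]
    return supported, unsupported
```

For example, split_by_ocr_support(["invoice.TIFF", "notes.docx"]) puts invoice.TIFF in the first list and notes.docx in the second.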

Important information for admins

Note: When you use the OCR feature in M-Files on a signed PDF, the entire document is rewritten. Because a digital signature validates the content of the file, the changes made by OCR invalidate the existing signature. This can result in the removal of the signature.

The OCR features in M-Files do not support mass operations. They are meant for conversions of a small number of files at a time.
For information about network scanning, see Scanner Sources.
The scanner integration uses TWAIN and WIA technologies. Only scanners with a TWAIN or WIA driver are supported.
System administrators can change settings for scanning and optical character recognition in Advanced Vault Settings. The settings are in the section Configuration > Scanning & OCR.
If text recognition is done on an image that is not yet saved to M-Files, the file is saved to M-Files as a PDF. If the image is already saved to M-Files, you can find the original image file in the version history of the object.

Importing Image Files as Searchable PDFs

To import an image file to the vault as a searchable PDF:

Drag and drop an image file to the classic M-Files Desktop.
Optional: In the Conversion to Searchable PDF dialog, check the Use automatic language detection checkbox to set M-Files to automatically detect the document language.
Optional: In the Conversion to Searchable PDF dialog, click Advanced to improve the quality of the text recognition by selecting primary and secondary language options to match the language used in the image.
Opening the advanced options disables the option to use automatic language detection.
Click Convert to start the conversion.
Once the conversion is complete, the New Document dialog appears. Finish importing the image by filling in the metadata and clicking Create.

The image file is imported to the vault as a searchable PDF, so you can find it with the M-Files search functions.

Converting an Image File Stored in M-Files to a Searchable PDF

In M-Files, locate the image file that you want to convert to a searchable PDF.
Right-click the file and select Scanning and Text Recognition (OCR) > Convert to Searchable PDF from the context menu.
Optional: In the Conversion to Searchable PDF dialog, check the Use automatic language detection checkbox to set M-Files to automatically detect the document language.
Optional: In the Conversion to Searchable PDF dialog, click Advanced to improve the quality of the text recognition by selecting primary and secondary language options to match the language used in the image.
Opening the advanced options disables the option to use automatic language detection.
Click Convert to start the conversion.

The image file is converted to a searchable PDF, and any text in the image can be found with the M-Files search functions.