Scanning and Text Recognition (OCR)

Paper documents can be added to M-Files with a scanner. For more information on network scanning, see Scanner Sources. To access the scanning functions, press the Alt key and then open the Operations menu.

Note: Scanner integration in M-Files Desktop uses the TWAIN and WIA technologies. Only scanners that can be equipped with a TWAIN or WIA driver are supported.

When scanning is complete, M-Files suggests converting the scanned file to a searchable PDF with optical character recognition (OCR). You can also specify advanced settings for the character recognition.

You can also convert an image file to a searchable PDF. Optical character recognition is performed on the image file to enable full-text searching across the file. After the conversion, you can find, for example, a contract that was converted from an image by searching for the names of the contracting parties or any other text included in the original image file.

M-Files also automatically suggests character recognition when you drag an image file to M-Files. M-Files does not suggest character recognition for PDF files, because performing optical character recognition on an already searchable PDF reduces the quality and increases the size of the PDF file. You can convert non-searchable PDF files into searchable PDF files manually from the context menu of the PDF file.

Optical character recognition can be performed on the following file formats:
  • TIF
  • TIFF
  • JPG
  • JPEG
  • BMP
  • PNG
  • PDF
TIFF files using an alpha channel or JPEG compression are not supported.
Note: If text recognition is performed on an image file that was not saved and returned to M-Files, the file is saved only as a PDF. Otherwise, the original image file can be found in the document version history.
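
If you prepare files for import with a script, the format list and the suggestion behavior described above can be sketched as a simple extension check. The following Python example is only an illustration and is not part of M-Files or its API; the names OCR_EXTENSIONS, can_run_ocr, and is_ocr_suggested_on_import are hypothetical. It looks at the file extension only, so it cannot detect TIFF files that use an alpha channel or JPEG compression.

```python
from pathlib import Path

# Extensions listed in this section as valid input for text recognition.
# Hypothetical helper data, not an M-Files setting.
OCR_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".bmp", ".png", ".pdf"}


def can_run_ocr(filename: str) -> bool:
    """Return True if the file extension is in the supported format list."""
    return Path(filename).suffix.lower() in OCR_EXTENSIONS


def is_ocr_suggested_on_import(filename: str) -> bool:
    """Return True if, according to this section, character recognition
    is suggested automatically on drag and drop (image files, not PDFs)."""
    suffix = Path(filename).suffix.lower()
    return suffix in OCR_EXTENSIONS and suffix != ".pdf"
```

For example, can_run_ocr("scan.pdf") is true because PDF files can be converted manually, whereas is_ocr_suggested_on_import("scan.pdf") is false because M-Files does not suggest the conversion for PDF files.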

Importing Image Files as Searchable PDFs

To import an image file to the vault as a searchable PDF:

  1. Drag and drop an image file to M-Files.
  2. Optional: In the Conversion to Searchable PDF dialog, check the Use automatic language detection checkbox to set M-Files to automatically detect the document language.
  3. Optional: In the Conversion to Searchable PDF dialog, click Advanced to improve the quality of the text recognition by selecting primary and secondary language options to match the language used in the image.
    Opening the advanced options disables the option to use automatic language detection.
  4. Click Convert to start the conversion.
  5. Once the conversion is complete, the New Document dialog appears. Finish importing the image by filling in the metadata and clicking Create.
The image file is imported to the vault as a searchable PDF, allowing you to locate it by using the M-Files search functions.

Converting an Image File Stored in M-Files to a Searchable PDF

  1. In M-Files, locate the image file that you want to convert to a searchable PDF.
  2. Right-click the file and select Scanning and Text Recognition (OCR) > Convert to Searchable PDF from the context menu.
  3. Optional: In the Conversion to Searchable PDF dialog, check the Use automatic language detection checkbox to set M-Files to automatically detect the document language.
  4. Optional: In the Conversion to Searchable PDF dialog, click Advanced to improve the quality of the text recognition by selecting primary and secondary language options to match the language used in the image.
    Opening the advanced options disables the option to use automatic language detection.
  5. Click Convert to start the conversion.
The image file is converted into a searchable PDF and any textual content in the image can be found using the search functions of M-Files.