Server

Use Server to run continuous recognition on a dedicated computer. Server comes with a built-in web interface for configuring individual box projects.

Run the rserver program to start the server and gain access to the web interface.

The server program is a command-line utility called rserver. It accepts a few parameters, shown in the screenshot below. You can get the full list of parameters by running rserver with --help. The default port number for the web interface is 80. This is a common port number and may conflict with other web applications on the computer. You can supply the -port parameter to use a different port, such as 8000.

The server binds to the local address by default. This is the safe option, since the web interface is then not accessible over the network. If you want to access the server configuration over the network, set the parameter -local=false when starting the server.
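As an illustration of the two parameters described above, here is a minimal Python sketch that assembles an rserver command line. The function name build_rserver_command is hypothetical, and the -port=8000 spelling is an assumption modeled on the -local=false form shown in the text; check rserver --help for the exact syntax.

```python
from typing import List, Optional


def build_rserver_command(port: Optional[int] = None,
                          local: Optional[bool] = None,
                          executable: str = "rserver") -> List[str]:
    """Assemble an argument list for starting rserver (hypothetical helper).

    Parameters left as None are omitted, so rserver keeps its defaults
    (port 80, bound to the local address).
    """
    command = [executable]
    if port is not None:
        # Assumption: -port takes its value in the same -name=value form
        # that the text shows for -local=false.
        command.append(f"-port={port}")
    if local is not None:
        command.append(f"-local={'true' if local else 'false'}")
    return command
```

For example, build_rserver_command(port=8000, local=False) describes a server whose web interface listens on port 8000 and is reachable over the network.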

Server run parameters.

Processing log messages are written to standard output. If you want the log output saved to a file, use the standard output redirection available on all operating systems.
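If the server is started from a script rather than a shell, the same redirection can be done in code. The sketch below is one possible way to do it in Python; run_with_log and the log file name are hypothetical, and only standard output is redirected, as described above.

```python
import subprocess
from typing import List


def run_with_log(command: List[str], log_path: str) -> int:
    """Run a command and append its standard output to log_path.

    Returns the exit code of the process. The call blocks until the
    process exits, which for a continuously running server means until
    the server is stopped.
    """
    with open(log_path, "a", encoding="utf-8") as log_file:
        completed = subprocess.run(command, stdout=log_file)
    return completed.returncode
```

A call such as run_with_log(["rserver"], "rserver.log") would then collect the processing messages in rserver.log.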

When the server starts, it reads all the project files at the path set via the -config parameter. On Windows this defaults to the Documents\forms folder. The server creates this folder if it does not exist.
Project box files are also loaded at that time. If you have any projects created with Boxes Editor, simply copy them into the forms folder. For each project file, the server creates an input subfolder with the same name as the project file, without the .box extension.

Server processing and configuration path.

In the example screenshot above, the project box files are cms1500.box, dentist.box and german.box. The input subfolders for images are cms1500, dentist and german. All recognition output is written to the _output subfolder.
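The naming rule can be expressed in a few lines of Python. This is only a sketch: input_folders is a hypothetical helper, and it assumes the input subfolders sit directly inside the configuration folder next to the .box files, as the example suggests.

```python
from pathlib import Path
from typing import Dict


def input_folders(config_dir: str) -> Dict[str, Path]:
    """Map each project file name to the input subfolder expected for it.

    Assumption: the subfolder lives inside config_dir and is named after
    the project file without its .box extension.
    """
    root = Path(config_dir)
    return {box.name: root / box.stem for box in sorted(root.glob("*.box"))}
```

With the three projects from the example, the result maps cms1500.box to the cms1500 folder, dentist.box to dentist, and german.box to german.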

The directory structure is created and maintained by the server. Use the web interface to set up recognitions, and copy your images into the appropriate subfolder for processing. Once an image has been copied into an input folder, it may take up to 15 seconds for the server to pick it up. Input images are deleted once processed.
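Because input images are deleted once processed, a calling program can treat the disappearance of the input file as a sign that the server has handled it. The following Python sketch polls for that; wait_until_processed and its timeout values are hypothetical and should be tuned to your documents.

```python
import time
from pathlib import Path


def wait_until_processed(image_path: str,
                         timeout: float = 120.0,
                         poll_interval: float = 1.0) -> bool:
    """Wait until the input image no longer exists.

    Returns True if the file disappeared within the timeout, which is
    taken to mean the server picked it up and processed it, and False
    otherwise. The timeout should comfortably exceed the pickup delay of
    up to 15 seconds plus the recognition time.
    """
    path = Path(image_path)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not path.exists():
            return True
        time.sleep(poll_interval)
    return not path.exists()
```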

Server setup for specific recognitions.

Here we access the server configuration by entering "localhost" in the web browser. This implies port 80. If you have the -port parameter set to 8000, enter "localhost:8000" in the web browser to access the configuration.

File Copy

When an image file is copied into a folder for processing, the server waits a few seconds for the copy to complete and then starts OCR on the input file. If you are using a web server to upload image files, or copying files over the network directly into the input folder, you may run into a data race: a situation where one process is still copying the file while the server process has already started OCR.
This is common when the incoming file upload goes over a slow Internet connection.

There is a simple way to avoid data races: upload network files into a temporary folder on a local disk of the server computer, and then move them into rserver's input folder for OCR. Moving local files takes a split second and will not cause data races.
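The staging approach might look like the following Python sketch. The function receive_and_submit and the folder arguments are hypothetical. It assumes the staging folder is on the same volume as the input folder, so that the final move is a quick rename rather than a second copy.

```python
import os
from pathlib import Path
from typing import Iterable


def receive_and_submit(chunks: Iterable[bytes], filename: str,
                       staging_dir: str, input_dir: str) -> Path:
    """Write an upload to a staging folder, then move it to the input folder.

    The slow part (receiving the data) happens in staging_dir, where the
    server is not watching. Only the complete file is moved into input_dir.
    """
    staged = Path(staging_dir) / filename
    with open(staged, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
    destination = Path(input_dir) / filename
    # os.replace renames the file; it raises OSError if the two folders
    # are on different volumes, which would call for a different approach.
    os.replace(staged, destination)
    return destination
```

Using a rename here is a deliberate choice: the file appears in the input folder only once it is complete.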

Archive

The server utility does not archive incoming image files. Once a file is picked up and processed, it is deleted. If input files were not deleted, the server would keep reprocessing the same file again and again.
If you want to archive image files, copy them before you send them to the server's input folders.
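A minimal Python sketch of archiving before submission follows. The helper archive_then_submit and the archive folder are hypothetical; the only point is the order of operations, copy first and move second.

```python
import shutil
from pathlib import Path


def archive_then_submit(image_path: str, archive_dir: str, input_dir: str) -> Path:
    """Keep a copy of the image in archive_dir, then hand it to the server.

    The archive copy is made first, because the server deletes the input
    file once it has been processed.
    """
    source = Path(image_path)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, archive / source.name)
    destination = Path(input_dir) / source.name
    shutil.move(str(source), str(destination))
    return destination
```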

Multiple Servers

If you are processing very large volumes of documents, it is possible to run multiple instances of rserver on one computer.
Both the -config and -port parameters should be different for each rserver process. If you pass the same -config and -port parameters, the second rserver instance will fail to open the port for the web interface.
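One way to keep the parameters distinct is to derive them from an instance number. The Python sketch below only builds the command lines and does not start anything. plan_instances is a hypothetical helper, the folder naming is arbitrary, and the -name=value spelling of the parameters is an assumption to verify against rserver --help.

```python
from typing import List


def plan_instances(base_config: str, base_port: int, count: int) -> List[List[str]]:
    """Build one rserver command line per instance.

    Each instance gets its own configuration folder and its own port, so
    no two instances share a web interface port or a set of input folders.
    """
    commands = []
    for index in range(count):
        commands.append([
            "rserver",
            f"-config={base_config}{index + 1}",  # e.g. forms1, forms2, ...
            f"-port={base_port + index}",         # e.g. 8000, 8001, ...
        ])
    return commands
```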

Document Classifier

Can Alpha Forms be "trained" to recognize different consistent input file layouts (e.g. "PO from customer A", "PO from customer B", etc.), and then process all images from a single input folder?

There is no document classifier at this time. It might be added in the future.

While it sounds like a great feature to have, there is a fundamental problem with auto-classification: sometimes forms are very close in their features and cannot be auto-classified properly. OCR engines may also give incorrect results due to the poor quality of incoming images. We feel that adding a document classifier into the mix simply reduces accuracy with little benefit.

Usually, when images come in, their origin and type are known. Images simply have to be copied into the proper folder for processing by rserver.

If you are building a front end for the recognition server, the simplest solution is to have a drop-down list with all document types next to the upload button.
Based on the selected value of the drop-down list, simply copy the image to the proper folder for the recognition server to pick it up. Then there is no need for image auto-classification, and there is less chance of critical errors.
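On the back end of such a front end, the routing step can be very small. The Python sketch below is illustrative only: route_upload is a hypothetical function, the document types reuse the example project names from earlier in this page, and the uploaded file is assumed to be complete and on the local disk already.

```python
import shutil
from pathlib import Path
from typing import Iterable


def route_upload(image_path: str, document_type: str,
                 forms_dir: str, known_types: Iterable[str]) -> Path:
    """Move an uploaded image into the input folder for the chosen type.

    document_type is the value selected in the drop-down list. Values not
    in known_types are rejected rather than guessed.
    """
    if document_type not in set(known_types):
        raise ValueError(f"unknown document type: {document_type!r}")
    source = Path(image_path)
    destination = Path(forms_dir) / document_type / source.name
    shutil.move(str(source), str(destination))
    return destination
```

For instance, route_upload("upload.png", "dentist", forms_dir, ["cms1500", "dentist", "german"]) places the image in the dentist input folder.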
