OCR text and handwritten forms using Captricity. Captricity's big advantage over Abbyy Cloud OCR is that it lets the user easily specify the position of the text blocks they want to OCR through a simple web-based UI. The quality of the OCR can be checked using compare_txt from recognize.
To get the latest version on CRAN:
install.packages("captr")
To get the current development version from GitHub:
install.packages("devtools")
devtools::install_github("soodoku/captr", build_vignettes = TRUE)
Read the vignette:
vignette("using_captr", package = "captr")
or follow the overview below.
Start by getting an application token and setting it using:
set_token("token")
Then, create a batch using:
create_batch("batch_name")
Once you have created a batch, you need a template ID. Captricity requires a template, which tells it what data to pull from where. Templates can be created using the web UI. Set the template ID using:
set_template_id("id")
Next, assign the template ID to a batch:
set_batch_template("batch_id", "template_id")
Next, upload image(s) to the batch:
upload_image(batch_id="batch_id", path_to_image="image_path")
Next, check whether the batch is ready to be processed:
test_readiness(batch_id="batch_id")
You may also want to find out how much processing the batch would cost:
batch_price(batch_id="batch_id")
Once you are ready, submit the batch:
submit_batch(batch_id="batch_id")
Captricity excels in nomenclature confusion: once a batch is submitted, it is called a job. The ID for the job can be obtained from the list returned by submit_batch; the field name is related_job_id.
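The job ID lookup described above can be sketched as follows. This is a minimal illustration rather than captr's own code. The `submitted` list is a hand-built stand-in for whatever submit_batch returns: its `id` and `name` fields and all of its values are made up, and only the related_job_id field name comes from the description above. get_job_id is a hypothetical helper, not a captr function.

```r
# Hypothetical stand-in for the list returned by submit_batch().
# Only the field name related_job_id is taken from the text above;
# the other fields and all values are invented for illustration.
submitted <- list(id = 101, name = "batch_name", related_job_id = 2024)

# Hypothetical helper: pull the job ID out of a submit_batch()-style list,
# failing loudly if the field is missing.
get_job_id <- function(submit_result) {
  job_id <- submit_result[["related_job_id"]]
  if (is.null(job_id)) {
    stop("No related_job_id found in the submit_batch() result.")
  }
  job_id
}

job_id <- get_job_id(submitted)
```

With a real batch, the same helper would presumably be applied to the value returned by submit_batch(batch_id="batch_id"), and the resulting ID passed on as the job_id argument of the functions below.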
To track progress of a job, use:
track_progress(job_id="job_id")
List all forms (instance sets) associated with a job:
list_instance_sets(job_id="job_id")
If you want to download data from a particular form, use list_instance_sets to get the form (instance_set) ID and run:
get_instance_set(instance_set_id="instance_set_id")
Get a CSV of all the results from a job:
get_all(job_id="job_id")
Scripts are released under the MIT License.
The project welcomes contributions from everyone! In fact, it depends on it. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.