Release 10.x.x

Description

 

Current release: 10.4.0 / Apr 17, 2023

Initial release date: Mar 27, 2023

 

Release 10 of the Acodis IDP Platform adds:

  • Per-Category OCR engine setting

  • Improved transformations for data fields

  • Dynamic metadata schema items

  • Global data retention policy setting

 

New features

  • Support for Single Sign-On using OpenID Connect

  • Support for configurable data retention strategies (via API)

  • Exporting confidence scores for shared model predictions

  • Retrainable figure step

Usability Improvements

  • Improved support for long-running publish operations

  • Improved user experience for exhaustive dropdowns

  • Status filtering in production listing

  • Improved support for nested listing representations

New Features

Configure Import Module OCR per Category

It is now possible to override the OCR configuration at category level:

Options

OCR-on-Demand - If no glyphs are found in the document, OCR is performed by the platform using the configured Azure Cognitive OCR service.

Always OCR - Existing glyphs are discarded and a forced OCR is performed by the platform using the configured Azure Cognitive OCR service.
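
The two options can be read as a simple decision rule. The following is a minimal sketch of that rule, not the platform's implementation: `OcrMode`, `resolve_glyphs` and the `run_ocr` callback are hypothetical names, and a document is reduced to a list of glyph strings.

```python
from enum import Enum
from typing import Callable, List


class OcrMode(Enum):
    """Hypothetical names for the two category-level options."""
    ON_DEMAND = "OCR-on-Demand"
    ALWAYS = "Always OCR"


def resolve_glyphs(
    mode: OcrMode,
    existing_glyphs: List[str],
    run_ocr: Callable[[], List[str]],
) -> List[str]:
    """Return the glyphs to use for a document under the given mode."""
    if mode is OcrMode.ALWAYS:
        # Existing glyphs are discarded and OCR is forced.
        return run_ocr()
    # OCR-on-Demand: OCR only runs when the document contains no glyphs.
    return existing_glyphs if existing_glyphs else run_ocr()
```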


Updated data field transformations

To make data field transformations possible without prior knowledge of regular expressions, two new transformation types were added:

  • Numeric Correction

  • Text Replace

Furthermore, the existing Regex transformation functionality was extended to support discarding parts from the input that don’t match the specified Regex.

Multiple transformations of these three types can be created and chained. They run on the input in the order in which they are set up, so the effect of the chain is cumulative.

Numeric Correction transformation

Numeric Correction removes non-numeric characters and/or lets you specify a custom decimal separator. The latter improves parsing accuracy, especially in cases where OCR has a hard time telling dots from commas.

Ex: Remove currency from value

Ex: Define comma (,) as decimal separator (instead of the dot '.')
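
The effect of both options can be approximated in a few lines. This is a rough sketch under assumptions, not the platform's algorithm: `numeric_correction` and its `decimal_separator` parameter are hypothetical names, sign handling is left out, and the result is normalized to a dot as decimal separator.

```python
def numeric_correction(text: str, decimal_separator: str = ".") -> str:
    """Drop non-numeric characters and normalize the decimal separator to '.'.

    Every character that is neither a digit nor the configured decimal
    separator is removed (currency codes, spaces, thousands separators).
    """
    allowed = set("0123456789" + decimal_separator)
    kept = "".join(ch for ch in text if ch in allowed)
    return kept.replace(decimal_separator, ".")
```

Under these assumptions, `numeric_correction("CHF 3254.89")` yields `"3254.89"`, and `numeric_correction("1.234,56 EUR", decimal_separator=",")` yields `"1234.56"`.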

Text Replace transformation

Performs a simple find and replace operation on the input text.

Regex transformation

Performs a Regex selection and replacement on the input text with the possibility to discard non-matching parts.

Transformations chaining

All operations mentioned above can be chained and repeated to achieve a complex transformation goal in simple steps. They will be executed in the same order in which they are placed in the schema editor and will pass the transformed value from one to the other.
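
The cumulative behavior of a chain can be illustrated with plain functions. The sketch below is only an illustration: `text_replace`, `regex_transform`, `run_chain` and the `discard_non_matching` flag are hypothetical names, not the schema editor's configuration format.

```python
import re
from typing import Callable, Iterable

Transformation = Callable[[str], str]


def text_replace(find: str, replacement: str) -> Transformation:
    """Simple find and replace on the input text."""
    return lambda text: text.replace(find, replacement)


def regex_transform(
    pattern: str, replacement: str, discard_non_matching: bool = False
) -> Transformation:
    """Regex selection and replacement, optionally dropping unmatched parts."""
    compiled = re.compile(pattern)

    def apply(text: str) -> str:
        if discard_non_matching:
            # Keep only the (rewritten) matches; everything else is discarded.
            return "".join(m.expand(replacement) for m in compiled.finditer(text))
        return compiled.sub(replacement, text)

    return apply


def run_chain(value: str, chain: Iterable[Transformation]) -> str:
    """Apply each transformation in order, passing the result along."""
    for step in chain:
        value = step(value)
    return value


chain = [
    text_replace("O", "0"),  # fix a typical OCR confusion
    regex_transform(r"(\d+[.,]\d+)", r"\1", discard_non_matching=True),
    text_replace(",", "."),  # normalize the decimal separator
]
result = run_chain("Total: 1O5,5O CHF", chain)  # "105.50"
```

Each step only sees the output of the previous one, which is why reordering the steps can change the result.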

Ex:


Dynamic Metadata Schema Items

Dynamic metadata is data that does not originate in the document itself but is produced dynamically while the document is being processed. It includes the Transaction ID, Original Filename, Workflow ID, Workflow Name and Category ID. The new functionality lets you use this metadata when configuring the schema, so that transaction metadata can be included in the export.

To specify which metadata appears where, add a regular field to the schema and set its default value to the metadata you need. This field behaves as a regular field during processing, except that the selected metadata value is filled in dynamically if the field is left blank.
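
The fill-in rule can be sketched as follows. This is a sketch under assumptions: `resolve_field` is a hypothetical helper, and the metadata is modeled as a plain dictionary keyed by names taken from the export preview; how the default value references metadata in the schema editor is not shown here.

```python
from typing import Dict, Optional


def resolve_field(
    value: Optional[str], metadata_key: str, metadata: Dict[str, str]
) -> str:
    """Return the field value, falling back to metadata when it is blank."""
    if value is not None and value.strip():
        # A value set during processing or review always wins.
        return value
    # Blank field: fill in the metadata selected as default value.
    return metadata.get(metadata_key, "")


metadata = {"File_Name": "101-Invoice.pdf", "Workflow_Id": "64"}
filled = resolve_field("", "File_Name", metadata)
kept = resolve_field("renamed.pdf", "File_Name", metadata)
```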

Available Metadata Fields and Schema Configuration example

The following example adds a field “Reviewed By” with the export key Reviewed_By to the schema. Its value for each transaction in the export will be the first and last name of the user who reviewed the corresponding transaction.

Training and Production Schema Preview

JSON export preview

{
  "Total": 3254.89,
  "Metadata": {
    "Workflow_Id": "64",
    "Worflow_Name": "Insights Meeting 20.03.2023",
    "Category_Id": "c0",
    "Category_Name": "Category A",
    "Transaction_Id": "tx:20230320092009885",
    "File_Name": "101-Invoice.pdf",
    "Created": "2023-03-20T09:20:09.8759498+00:00",
    "Reviewed": "2023-03-20T09:21:18.0000000+00:00",
    "Reviewed_By": "Goncalo Trindade"
  }
}
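
A downstream consumer could read the metadata from such an export as sketched below. The snippet assumes the export arrives as a JSON string shaped like the preview, shortened here to a few keys.

```python
import json

# Shortened copy of the export preview; a real export carries more keys.
export_text = """
{
  "Total": 3254.89,
  "Metadata": {
    "Transaction_Id": "tx:20230320092009885",
    "File_Name": "101-Invoice.pdf",
    "Reviewed_By": "Goncalo Trindade"
  }
}
"""

export = json.loads(export_text)
metadata = export["Metadata"]

# Fields configured with a metadata default appear under their export key.
reviewed_by = metadata["Reviewed_By"]
file_name = metadata["File_Name"]
```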

Setting the Global Data Retention Policy via UI

A new settings tab named “Data retention” is now available to configure the Global Data Retention Policy. It allows administrators to define for how long and in which state transactions are kept on the platform. For instance, the default configuration of this setting specifies that all processed transactions will be removed after 30 days:

The user can specify the following parameters:

  • The retention period can be set to between 1 and 180 days.

  • Selectable transaction states are Exported, Failed, In Review and Processed.

Transactions are deleted by a periodic background task, which by default runs twice a day.
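
The selection made by each run of the background task can be pictured as a filter over transactions. This is a minimal sketch, not the platform's code: `Transaction`, `select_for_deletion` and the `last_updated` timestamp are assumptions, and the point in time from which the retention period is counted is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Set

ALLOWED_STATES = {"Exported", "Failed", "In Review", "Processed"}


@dataclass
class Transaction:
    id: str
    state: str
    last_updated: datetime  # assumed reference point for the retention period


def select_for_deletion(
    transactions: Iterable[Transaction],
    retention_days: int,
    states: Set[str],
    now: datetime,
) -> List[Transaction]:
    """Return transactions in the given states that exceed the retention period."""
    if not 1 <= retention_days <= 180:
        raise ValueError("retention_days must be between 1 and 180")
    unknown = set(states) - ALLOWED_STATES
    if unknown:
        raise ValueError(f"unknown transaction states: {sorted(unknown)}")
    cutoff = now - timedelta(days=retention_days)
    return [
        t for t in transactions
        if t.state in states and t.last_updated < cutoff
    ]


now = datetime(2023, 4, 17, tzinfo=timezone.utc)
transactions = [
    Transaction("old", "Processed", now - timedelta(days=31)),
    Transaction("recent", "Processed", now - timedelta(days=2)),
    Transaction("old-review", "In Review", now - timedelta(days=90)),
]
# Default policy from the text: processed transactions, 30 days.
expired = select_for_deletion(transactions, 30, {"Processed"}, now)
```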


Export confidence scores in Expert Mode

In Expert Mode, the Export Step now contains an option to export confidence scores for predicted annotations when available.

When set to Yes, the field’s confidence is added in the export XML as an attribute named confidence.

Example:

<section level="7" page="1" left="0.69931" top="0.24183" width="0.09032" height="0.01595" confidence="0.52">
  <header>Header row </header>
  <p page="1" left="0.79381" top="0.24183" width="0.00937" height="0.01595">3 </p>
</section>
<section level="7" page="1" left="0.12851" top="0.25838" width="0.00992" height="0.01595" confidence="0.99">
  <header>b </header>
</section>
<section level="7" page="1" left="0.50899" top="0.25838" width="0.00970" height="0.01595" confidence="0.86">
  <header>d </header>
  <p page="1" left="0.69931" top="0.25838" width="0.00564" height="0.01595">f </p>
</section>
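
Reading the attribute back from an export could look like the sketch below. It assumes the fragment is wrapped in a single root element (the hypothetical `<export>` wrapper exists only to make the fragment parseable), and `extract_confidences` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def extract_confidences(xml_fragment: str) -> List[Tuple[str, float]]:
    """Return (header text, confidence) for every section that has a score."""
    # The wrapper element is an assumption; a fragment needs a single root.
    root = ET.fromstring(f"<export>{xml_fragment}</export>")
    scores = []
    for section in root.iter("section"):
        confidence = section.get("confidence")
        if confidence is None:
            continue  # no score available for this annotation
        header = section.findtext("header", default="").strip()
        scores.append((header, float(confidence)))
    return scores


fragment = """
<section level="7" page="1" confidence="0.52">
  <header>Header row </header>
  <p page="1">3 </p>
</section>
<section level="7" page="1" confidence="0.99">
  <header>b </header>
</section>
"""
scores = extract_confidences(fragment)
# Annotations below a chosen threshold could be routed to manual review.
low_confidence = [name for name, score in scores if score < 0.6]
```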

Watermark Correction in Expert Mode

In Expert Mode, the Azure Cognitive OCR Resource now contains an option to select a Watermark Correction type together with a correction value for the parameter associated with the selected type.

There is currently no preview available that shows the result of the correction.

Available Correction Types:

Each correction type is listed with the allowed values of its correction parameter, followed by notes on its effect.

Gamma (allowed values: > 0)

  • Values below 1 whiten the image.

  • Values above 1 make colors purer.

  • Values below zero make the OCR fail.

Grayscale Gamma (allowed values: > 0)

  • Values below 1 whiten the image.

  • Values above 1 darken the image.

  • Values below zero make the OCR fail.

Grayscale Global Binary Threshold (allowed values: > 0 and < 255)

  • Values below 0 will do nothing.

  • Values above 255 will turn the image full white.

Grayscale Mean Global Binary Threshold (allowed values: > 0)

  • Values below 0 will do nothing.

  • Values far greater than 1 will perform an extreme whitening of the page.

  • As long as grayish text does not need to be extracted, this transformation, which uses the average brightness of the page, has proven the most appropriate for blanket usage, with a parameter between 0.5 and 0.8.
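
For the Grayscale Gamma type, the described behavior (values below 1 whiten, values above 1 darken) matches a standard power-law gamma curve. The sketch below shows that textbook formula on 8-bit grayscale values. It is an assumption that the platform uses this exact formula, and `grayscale_gamma` is a hypothetical name.

```python
from typing import List


def grayscale_gamma(pixels: List[int], gamma: float) -> List[int]:
    """Apply a power-law gamma curve to 8-bit grayscale values (0 to 255)."""
    if gamma <= 0:
        # Mirrors the allowed range stated for the parameter (> 0).
        raise ValueError("gamma must be greater than 0")
    # Normalize to [0, 1], raise to the power gamma, scale back to 8 bit.
    return [round(255 * (p / 255) ** gamma) for p in pixels]


faint_watermark = [0, 128, 255]
lighter = grayscale_gamma(faint_watermark, 0.5)  # mid-gray moves toward white
darker = grayscale_gamma(faint_watermark, 2.0)   # mid-gray moves toward black
```

Pure black and pure white are unchanged by this curve. Only the intermediate gray levels, where watermarks typically sit, are shifted.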


User Experience and User Interface Improvements

Multiple page selection for Page Exclusion

In the page thumbnail view to the left of the page view, it is now possible to enter the multi-page selection mode by pressing the Shift key and clicking on a page.

Keeping Shift pressed selects a range of sequential pages. After releasing Shift, multiple non-sequential pages can be selected.

This feature speeds up page exclusion considerably.


Improvements and Shortcuts to Interacting with Annotations

Creating annotations

In the page view, annotations can be created by combining clicking, dragging and modifier keys. The options are listed below.

Depending on the selection mode, the selectable elements may be words or paragraphs.


Document element annotation

  • CLICKing on a document element:
    An area annotation (selecting elements based on their overlap with the selection rectangle) will be drawn around the clicked word or paragraph.

  • Holding the ALT key and CLICKing on an element:
    A character annotation (selecting elements based on their occurrence between a start and end character) will be drawn.

 

 

Freely drawn annotation

  • Drawing a DRAG SELECTION by pressing the mouse button down, moving the pointer and releasing the button:
    Creates an area selection.

  • Holding the ALT key while drawing a DRAG SELECTION:
    Creates a character selection from the characters closest to the start and end point of the selection.
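
The difference between the two selection types can be made concrete with a small model. This is a simplified sketch, not the platform's data model: `Element`, `Rect`, `area_selection` and `character_selection` are hypothetical names, any overlap counts as a hit, and the character range is approximated by positions of whole elements in reading order.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float


@dataclass(frozen=True)
class Element:
    text: str
    index: int  # position in reading order
    box: Rect


def overlaps(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share any area."""
    return (
        a.left < b.left + b.width and b.left < a.left + a.width
        and a.top < b.top + b.height and b.top < a.top + a.height
    )


def area_selection(elements: List[Element], rect: Rect) -> List[Element]:
    """Area annotation: elements that overlap the selection rectangle."""
    return [e for e in elements if overlaps(e.box, rect)]


def character_selection(elements: List[Element], start: int, end: int) -> List[Element]:
    """Character annotation: elements between a start and end position."""
    lo, hi = sorted((start, end))
    return [e for e in elements if lo <= e.index <= hi]


# Two lines with two words each: "A B" above "C D".
words = [
    Element("A", 0, Rect(0.0, 0.0, 0.4, 0.05)),
    Element("B", 1, Rect(0.5, 0.0, 0.4, 0.05)),
    Element("C", 2, Rect(0.0, 0.1, 0.4, 0.05)),
    Element("D", 3, Rect(0.5, 0.1, 0.4, 0.05)),
]
left_column = area_selection(words, Rect(0.0, 0.0, 0.3, 0.2))
from_a_to_c = character_selection(words, 0, 2)
```

In this model, a rectangle over the left column picks A and C only, while a character selection from A to C follows the reading order and also includes B.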

 

Creating multiple annotations with the same label

  • By using one of the above methods AND pressing the CTRL key, additional annotations can be added (using a common label).

  • The process ends when a label is selected and modifications are Saved.

 

Creating table annotations

  • A table annotation is drawn like an area annotation. Assigning a table label then converts the annotation to a table annotation.

  • Hot-corners may be used to convert recognized tables in the document to table annotations.

 

Interacting with annotations

  • An annotation can be selected by CLICKing on it.

  • Using SHIFT + CLICK, additional annotations can be added to the selection.

  • Holding SHIFT and drawing a DRAG SELECTION will add all annotations under the selection rectangle to the annotation selection.

  • Annotations have a quasi-layering (z-order). When an annotation is hovered, it is moved to the top,
    so that the annotations below it do not hinder interaction with it.

  • Annotations in the background that are fully covered by another annotation may not be accessible directly, but SHIFT + DRAG SELECTION can still help to select them.

Deleting annotations

  • When a single annotation is selected, clicking on the delete button will remove that annotation.

  • When multiple annotations are selected, a number badge next to the delete button warns that deletion will affect more than one annotation.

  • Clicking on the button will change its shape and text and require an additional click to confirm the multi-deletion.

  • If the button is not clicked (or hovered) for 3 seconds, it returns to its normal state (with the bin icon).

  • If one or more annotations are selected, pressing SHIFT + DELETE will remove them without the extra confirmation step, which helps advanced users who wish to delete multiple annotations quickly.

 

 

Modifying annotations

  • Area annotations can be moved and resized.

  • Character annotations can be modified by moving their start and end point.

  • Only one annotation can be modified in such a way at a time.

  • The label of multiple annotations can be changed at once using multi-selection.

  • Multiple annotations can be confirmed at once using multi-selection.


Direct navigation to Resources in Expert Mode

It is now possible to navigate directly to resources through the new Expert Mode navigator entry named “All Resources” located above the “Collection Entry”.


Persistent Step and Page selection when navigating through Collection documents in Expert Mode

The selected Step and Page will remain the same when navigating through the documents of a collection.

If the next selected document has fewer pages than the currently selected document, the first page is selected.
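
Read literally, the rule compares the page counts of the two documents. A minimal sketch of that reading follows. `page_after_navigation` is a hypothetical helper, and 1-based page numbers are an assumption.

```python
def page_after_navigation(
    selected_page: int, current_page_count: int, next_page_count: int
) -> int:
    """Return the 1-based page to show after switching documents."""
    if next_page_count < current_page_count:
        # The next document is shorter: fall back to its first page.
        return 1
    # Otherwise the page selection persists.
    return selected_page
```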


Views as panels in Expert Mode

The following changes have been introduced to the expert mode view:

The view on the right side has been split into tabs. Their visibility depends on the currently active step.

The structure and export views now share one panel, while elements, tables, evaluation and any other new view each have their own tab:

Multiple panels can be open at once: