Reviewing linkage quality

LinXmart allows the operator to manually override links created by the system, either through the web UI or by uploading a quality review envelope. This allows a group of records with a known error to be manually changed. It also allows groups which fail grouping rules to be manually updated. All quality review grouping changes are stored in the system, together with the full history of quality review changes.

For each Linkage Project, individual groups can be viewed through the Quality Review link on the Linkage Project page.

Quality Reviews

The Quality Review page is split into three sections:

  • Project Quality Summary - a count of the groups in the project and some high level quality indicators
  • Data Quality Query - these parameterised queries can be run across the groups (linkage map) in the project
  • Active Quality Reviews - a list of any active quality reviews will be displayed here, including those groups that failed one or more Grouping Rules, and any other manually altered group

Data Quality Queries

A number of queries are available to run. Clicking on any of the queries from the Quality Review page will take you to the quality query page with that particular query pre-selected. Changing the selected query in the query dropdown will automatically update the parameters below it.

Quality Query

Update the values of the parameters as required for your query and click the Run button to show the results.

There are two types of results, depending on the query being run. The first type returns a list of groups with aggregate information for each group. The second type returns groups together with record-level information.

Group Summary Results

The results that show aggregate information for each group do not show any identifiable information. Only summary information for each group is displayed, such as the number of events. An example is shown below.

Aggregate Group Results

You can visualise an individual group by clicking on the Matching Group ID. This will take you to another page that lists all record data combinations and displays the group as a node-edge graph. More on this below.

Group Detail Results

This type of result will show a list of the identifiers within each group. There is no summary information shown.

List Group Results

As with the aggregate group results, you can visualise this group by clicking on the Matching Group ID.

Manual quality review

Visualising a group

There are two ways to view particular groups:

  1. Groups which have failed Grouping Rules (if any) are listed under the Active Quality Reviews section, along with the reason for the failure. These can be viewed by clicking the View button.
  2. From the results of a data quality query, the user can click on the Matching Group ID.

Once a group is selected, detailed information about the group is displayed.

View matching Group

A group displayed on the View Group screen

The top panel is a diagram representing the group. Each circle (node) represents a unique set of identifier fields. The colour 'chips' within the circles represent the proportion of records of each Event Type that have this set of identifier values. The number in each circle corresponds to the ID listed in the lower panel. Each line between two nodes represents a pair found during the matching process. The thickness of the line corresponds to the weight (total score) for that pair. Although not visible in the diagram here, the actual weight value of any pair can be seen by hovering the mouse over the pair line.

The second panel shows a table of all the records in the group, along with their personal identifiers. As in the group visualisation, only unique sets of identifiers are shown. The number of individual events (records) with these identifier values is shown in the Events column. Clicking on the value in the Events or Pairs columns will show more details of each record and the records to which this record is directly linked. The table can be sorted on any column by clicking on the column header. To sort on multiple columns, hold down Shift while clicking.

Changing a group

To edit a group, click Edit in the Event Data panel. An editable Target Group column will appear in the table below. By changing the number in this column, an operator can reconfigure the records into different groups.

The Target Group column accepts numbers between 1 and 999. The actual number used is irrelevant; only which records share the same number matters. The system will re-group these records, putting all the records with the same target group number into a new group and assigning new Group IDs to all the groups formed through the re-grouping.
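The effect of the Target Group numbers can be illustrated with a minimal Python sketch. It is not LinXmart's implementation: the function name `regroup`, the record IDs and the way new Group IDs are allocated (a simple counter starting at `next_group_id`) are all assumptions made for illustration.

```python
from itertools import count


def regroup(target_groups, next_group_id):
    """Map each record to a new Group ID based on its target group number.

    target_groups: dict of record ID -> target group number entered by the
                   operator (1 to 999).
    next_group_id: first new Group ID to hand out (hypothetical allocation
                   scheme; the real system assigns its own IDs).
    """
    new_ids = count(next_group_id)
    group_id_for_target = {}
    assignment = {}
    for record_id, target in target_groups.items():
        if not 1 <= target <= 999:
            raise ValueError(f"target group {target} is outside 1-999")
        # The first record seen with a given target number creates the group;
        # later records with the same number join it.
        if target not in group_id_for_target:
            group_id_for_target[target] = next(new_ids)
        assignment[record_id] = group_id_for_target[target]
    return assignment


# Records r1 and r2 share a target number, so they end up together.
print(regroup({"r1": 5, "r2": 5, "r3": 9}, next_group_id=1000))
```

Entering 1, 1, 2 instead of 5, 5, 9 produces the same grouping, which is why the actual number used does not matter.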

Edit matching Group

You will also see a field for entering notes, which allows you to record decisions for the regrouping. These notes will be displayed for the group you are editing, and for any future group these events may be put in.

In some cases, operators may need to join together records that are currently in two separate groups. This can be accomplished by clicking Add in the Event Data panel while in 'edit' mode. From here another group can be included in this manual quality review, and both groups can be re-grouped simultaneously. The Group ID of the group to be added must be known.

Re-grouping

Once an edited group is saved, the group is locked from any further quality reviews until a re-grouping process is started. Re-grouping effectively saves these changes into the system. The re-grouping process does not happen automatically after every quality review change, but is triggered through the Regroup button/link below the Active Quality Reviews list on the Quality Review page. A number of group quality reviews can be carried out before a re-grouping is started.

Clicking Regroup causes the system to initiate a new Group Pairs and Events job.

Viewing quality reviewed groups

Pairs that have been created through the quality review process will be shown if viewed again through the web UI. A quality pair created to link two records together will be red and labelled as Q (for quality). If there is also a probabilistic pair between the two records (created through the matching process) it will be darker and marked as B (for both). A link between two records will be a dashed red line if previously these two records had been split into different groups, but have since been brought back together.

Batch quality review

Overview

The batch quality review process allows an operator to perform many quality reviews at once.

These quality reviews are performed externally to LinXmart and then fed back in to LinXmart as an Envelope.

The batch quality review process provides a mechanism for operators to apply their own quality checks outside of LinXmart and then feed the results back into the system. Information on records, pairs and groups in a Linkage Project can be accessed through queries run directly on the LinXmart project database. This can be developed into an automatic quality review process or a large-scale clerical review process. The results of a batch quality review are fed back into LinXmart as a Quality Review envelope, processed, and saved in the system as quality review changes.


The Batch Quality Review process allows external quality review decisions to be fed back into LinXmart

Access to the LinXmart database for Batch Quality Review purposes can be established by your LinXmart administrator. The batch quality review process requires whole groups of records to be reviewed in a similar process to the 'one group at a time' method used on the web UI. Every record in a modified group must be placed into a new group for processing to occur.
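Because every record in a modified group must be given a new group, it can be worth checking a batch for omissions before uploading it. The following Python sketch shows one way to do this. The function name and the two input structures are hypothetical; in practice the current group membership would come from queries against the project database, whose schema is not described here.

```python
def find_unassigned_records(current_groups, review):
    """Return records left out of a batch review, keyed by their current group.

    current_groups: dict of current Group ID -> set of record IDs in it.
    review:         dict of record ID -> target group for each reviewed record.
    """
    record_to_group = {
        record: group
        for group, records in current_groups.items()
        for record in records
    }
    # A group is "modified" as soon as one of its records appears in the review.
    modified = {record_to_group[r] for r in review if r in record_to_group}

    reviewed = set(review)
    missing = {}
    for group in modified:
        left_out = set(current_groups[group]) - reviewed
        if left_out:
            missing[group] = left_out
    return missing


groups = {"G1": {"a", "b", "c"}, "G2": {"d"}}
# Record "c" belongs to the modified group G1 but has no target group.
print(find_unassigned_records(groups, {"a": 1, "b": 2}))
```

An empty result means every modified group is fully covered by the review.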

Batch quality review file

A batch quality review can be uploaded either as an Envelope or as a data file uploaded directly to the Linkage Project.

The data file must conform to the required data format.

An example of the Envelope format for a quality review is also available.

Batch quality review jobs

After performing the usual validation checks, the Load Quality Review Request job kicks off the batch changes.

Load batch quality review job

The Load Batch Quality Review job parses the data file, and loads valid records into the database.

All the records in the file are first parsed. Records will fail parsing if:

  • They do not contain exactly four fields separated by commas
  • The record is not identifiable in the project
  • Any field is blank

If the number of records which fail parsing is greater than 5% of the total, the entire data file is rejected and marked as failed.
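The parsing rules and the 5% threshold can be sketched in Python as follows. This is an illustration only: the meaning of the four fields is defined by the required data format and is treated as opaque here, the `is_known_record` callback stands in for the project lookup, and a plain comma split is assumed (no quoting rules).

```python
MAX_FAIL_PERCENT = 5  # the file is rejected when more than 5% of records fail


def load_review_file(lines, is_known_record):
    """Split review lines into valid and failed records.

    lines:           iterable of raw text lines from the data file.
    is_known_record: hypothetical callback taking the list of four fields and
                     returning True if the record is identifiable in the project.
    Returns (valid_records, failed_lines, rejected).
    """
    valid, failed = [], []
    for line in lines:
        fields = [field.strip() for field in line.rstrip("\r\n").split(",")]
        if (
            len(fields) != 4                      # not exactly four fields
            or any(field == "" for field in fields)  # a blank field
            or not is_known_record(fields)        # record not in the project
        ):
            failed.append(line)
        else:
            valid.append(fields)

    total = len(valid) + len(failed)
    # Integer comparison avoids floating-point rounding at exactly 5%.
    rejected = len(failed) * 100 > MAX_FAIL_PERCENT * total
    return valid, failed, rejected
```

With 20 records, one failure is exactly 5% and the file is still accepted; two failures reject the whole file.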

Quality review pair creation

This job checks each new group's validity (i.e. whether all the records contained in each group are valid).

  • Records which have been end-dated by the system are not valid
  • Records which are not part of the Linkage Project are not valid
  • Records that are part of a group containing invalid or missing records are not valid
  • Target groups containing an invalid record are themselves invalid, and all of the records in the group become invalid. These newly invalidated records in turn cause all other records in their original groups to be marked as invalid.
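The way invalidity spreads between target groups and original groups can be sketched in Python as below. The function and its inputs are hypothetical, and the sketch assumes the cascade is repeated until no further records are invalidated; the source describes the first steps of the cascade but not where it stops.

```python
def propagate_invalid(original_members, target_of, seed_invalid):
    """Return the set of reviewed records that end up invalid.

    original_members: dict of original Group ID -> set of all its record IDs.
    target_of:        dict of record ID -> target group, for reviewed records.
    seed_invalid:     records that are invalid in their own right
                      (end-dated, or not part of the Linkage Project).
    """
    reviewed = set(target_of)
    original_of = {r: g for g, members in original_members.items() for r in members}
    target_members = {}
    for record, target in target_of.items():
        target_members.setdefault(target, set()).add(record)

    invalid = set(seed_invalid)
    for record in reviewed:
        group = original_of.get(record)
        if group is None:
            invalid.add(record)  # not part of the project
        elif original_members[group] - reviewed:
            invalid.add(record)  # original group has a record missing from the file

    changed = True
    while changed:
        changed = False
        for record in list(invalid):
            related = set()
            if record in target_of:
                # Everything sharing a target group with an invalid record.
                related |= target_members[target_of[record]]
            if record in original_of:
                # Everything sharing an original group with an invalid record.
                related |= original_members[original_of[record]] & reviewed
            newly_invalid = related - invalid
            if newly_invalid:
                invalid |= newly_invalid
                changed = True
    return invalid & reviewed
```

For example, if record a is end-dated, b shares a's original group, and c shares b's target group, then a, b and c are all invalid, along with every other reviewed record in c's original group.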

After the Quality Review Pair Creation job, the Group Pairs and Events job is run, similarly to the manual Regroup trigger described previously.