Version: Current (1.x)

Data file formats

Specific file formats are required for the different types of data that can be loaded into a Linkage Project.

Linkage and Probability Estimation

The file formats for Linkages and Probability Estimation will differ for each Data Source. The Import Format specified for the Data Source will determine the exact format of the file.

Deletion

The data file is a plain text file with a single value per line. Each value is the Source Unique ID of a record to be deleted.

An example of the data file is shown here.

data.csv
suid_0001
suid_0002
suid_0003
suid_0004

The same Event Type applies to the entire data set and is defined either by the Envelope manifest, or manually by the user on upload.
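As a minimal sketch of producing such a file, the following Python writes one Source Unique ID per line with no header row. The function name `write_deletion_file` and the rejection of blank or multi-line values are assumptions made for illustration; they are not part of the product.

```python
import io


def write_deletion_file(suids, out):
    """Write one Source Unique ID per line, with no header row.

    `out` is any writable text stream (hypothetical helper, not a product API).
    """
    for suid in suids:
        suid = suid.strip()
        # Assumption: a blank or multi-line value cannot identify a record.
        if not suid or "\n" in suid or "\r" in suid:
            raise ValueError(f"invalid Source Unique ID: {suid!r}")
        out.write(suid + "\n")


buffer = io.StringIO()
write_deletion_file(["suid_0001", "suid_0002", "suid_0003", "suid_0004"], buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead, pass an open file object, for example `open("data.csv", "w", newline="")`, in place of the `StringIO` buffer.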

Batch Quality Review

The data file must be a comma-delimited file with four fields per row and no header row. It lists the records that are to be re-grouped and identifies which of them should be grouped together.

An example of the data file is shown here.

data.csv
CORE,suid_0001,WAMORB,1
CORE,suid_0002,WAMORB,1
CORE,suid_0003,WAMORB,2
CORE,suid_0004,WAMORB,2

The above example puts four existing records in the CORE linkage project into two groups: suid_0001 and suid_0002 in one group, and suid_0003 and suid_0004 in another. All four records are in the WAMORB event type.
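To make the grouping in the example concrete, here is a small Python sketch that reads the four-field rows and collects the records under each Target Group value. The function name `groups_from_text` is hypothetical, and the sketch assumes every non-blank row has exactly four fields.

```python
import csv
import io
from collections import defaultdict


def groups_from_text(text):
    """Map each Target Group value to the records assigned to it."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        project_code, suid, event_type, target_group = row
        groups[target_group].append((project_code, suid, event_type))
    return dict(groups)


sample = (
    "CORE,suid_0001,WAMORB,1\n"
    "CORE,suid_0002,WAMORB,1\n"
    "CORE,suid_0003,WAMORB,2\n"
    "CORE,suid_0004,WAMORB,2\n"
)
for target_group, records in groups_from_text(sample).items():
    print(target_group, [suid for _, suid, _ in records])
```

The Target Group values are kept as strings here because they act only as labels within the file.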

The columns are described in more detail below.

# | Column | Description
1 | Linkage Project Code | The Code of the Linkage Project in which the event record was originally defined. In most cases, all rows in the file will have the same value for this field. For Project-to-Project linkage, the Code of the originating Linkage Project for that event type must be chosen.
2 | Source Unique ID | The source record identifier provided in each record, which is unique within the Event Type.
3 | Event Type Code | The Code of the Event Type for the record, as determined during ingestion.
4 | Target Group | The new group to which this record will belong. The exact value is not important, but all records that are to belong to the same group must share the same number, and records that are not to belong to that group must have a different number. These numbers do not correspond to the internal Group IDs that the system assigns. For example, if an internal Group ID of 3 already exists in the database and an operator assigns a Target Group of 3 to one or more records, those records will enter the system with a new Group ID (not 3), and the existing records with Group ID 3 will remain unaffected.
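A pre-upload check of these four columns could look like the following Python sketch. The function name `validate_rows` and the specific checks (exactly four non-empty fields, and no record assigned to two different Target Group values) are assumptions for illustration; the system's own validation rules may differ.

```python
import csv
import io


def validate_rows(text):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    seen = {}  # (project code, source unique id, event type) -> target group
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:  # skip blank lines
            continue
        if len(row) != 4:
            problems.append(f"line {line_no}: expected 4 fields, got {len(row)}")
            continue
        fields = [field.strip() for field in row]
        if any(field == "" for field in fields):
            problems.append(f"line {line_no}: empty field")
            continue
        project_code, suid, event_type, target_group = fields
        key = (project_code, suid, event_type)
        # Assumption: one record cannot be placed in two target groups.
        if key in seen and seen[key] != target_group:
            problems.append(
                f"line {line_no}: {suid} assigned to target groups "
                f"{seen[key]} and {target_group}"
            )
        seen.setdefault(key, target_group)
    return problems


print(validate_rows("CORE,suid_0001,WAMORB,1\nCORE,suid_0002,WAMORB\n"))
```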
Info

If one record from a group is being modified through a Batch Quality Review request, then all records in that group must be included in the request. For example, if an Operator wants to move record1 from group A to group B, they would need to submit a data file in the required format that:

  • Lists all records in group A other than record1, and assigns them one target group value
  • Lists all records in group B, along with record1, and assigns them a different target group value
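The two bullet points above can be sketched in Python as follows. The function name `build_move_request` is hypothetical, and the sketch assumes that every record involved shares the same Linkage Project Code and Event Type Code and that the caller already knows the current members of both groups.

```python
import csv
import io


def build_move_request(project_code, event_type, group_a, group_b, moved):
    """Build file content that moves `moved` from group A to group B.

    `group_a` and `group_b` list the Source Unique IDs currently in each group.
    """
    if moved not in group_a:
        raise ValueError(f"{moved} is not in group A")
    remaining_a = [suid for suid in group_a if suid != moved]
    new_b = list(group_b) + [moved]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # The target values 1 and 2 are arbitrary labels; they only need to differ.
    for target_group, members in ((1, remaining_a), (2, new_b)):
        for suid in members:
            writer.writerow([project_code, suid, event_type, target_group])
    return out.getvalue()


print(
    build_move_request(
        "CORE", "WAMORB", ["record1", "record2"], ["record3"], "record1"
    ),
    end="",
)
```

Every remaining record of group A and every record of group B appears in the output, which is what the requirement to include all records of an affected group asks for.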