Data file formats
Specific file formats are required for the different types of data that can be loaded into a Linkage Project.
Linkage and Probability Estimation
The file formats for Linkages and Probability Estimation will differ for each Data Source. The Import Format specified for the Data Source will determine the exact format of the file.
Deletion
The data file is a plain text file with one value per line. Each value is the Source Unique ID of a record to be deleted.
An example of the data file is shown here.
suid_0001
suid_0002
suid_0003
suid_0004
The same Event Type applies to the entire data set and is defined either by the Envelope manifest, or manually by the user on upload.
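As a rough illustration, a deletion file in the format described above could be produced from a list of Source Unique IDs with a few lines of Python. The function name `format_deletion_file` and the checks for empty values and embedded line breaks are assumptions made for this sketch; they are not part of the system itself.

```python
def format_deletion_file(source_unique_ids):
    """Return deletion-file text: one Source Unique ID per line.

    Hypothetical helper for illustration only. The validation rules
    below are assumptions, not documented system behaviour.
    """
    lines = []
    for suid in source_unique_ids:
        value = str(suid).strip()
        if not value:
            raise ValueError("empty Source Unique ID")
        if "\n" in value or "\r" in value:
            raise ValueError(f"Source Unique ID contains a line break: {value!r}")
        lines.append(value)
    # One value per line, with a trailing newline after the last value.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(format_deletion_file(["suid_0001", "suid_0002"]), end="")
```

The returned text can be written to a file and uploaded; the Event Type is not part of the file because it is supplied by the Envelope manifest or by the user on upload.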
Batch Quality Review
The data file should be a comma-delimited file with four fields and no header row. It lists the records that are to be re-grouped and identifies which records should be grouped together.
An example of the data file is shown here.
CORE,suid_0001,WAMORB,1
CORE,suid_0002,WAMORB,1
CORE,suid_0003,WAMORB,2
CORE,suid_0004,WAMORB,2
The above example will put four existing records in the CORE linkage project into two groups. All four records are in the WAMORB event type.
The columns are described in more detail below.
Column | Field | Description |
---|---|---|
1 | Linkage Project Code | The Linkage Project Code refers to the project in which the event record was originally defined. In most cases, all rows in the file will have the same value for this field. For Project-to-Project linkage, the Code of the originating Linkage Project for that event type must be chosen. |
2 | Source Unique ID | The source record identifier provided in each record that is unique to the Event Type. |
3 | Event Type Code | The Code of the Event Type for the record, as determined during ingestion. |
4 | Target Group | The new group to which this record will belong. The exact value chosen is not important, but the same number must be assigned to all records that are to belong to the same group, and a different number to records that are not to belong to that group. These numbers do not correspond to the internal Group IDs that the system will assign. For example, if an internal Group ID of 3 already exists in the database and an operator assigns a Target Group of 3 to one or more records, those records will enter the system with a new Group ID (not 3), and any existing records with Group ID 3 will remain unaffected. |
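Before submitting a Batch Quality Review file, it can help to check its shape locally. The following is a minimal sketch, assuming only what the column descriptions above state: four comma-delimited fields per row and no header. The function name `group_review_rows`, the skipping of blank lines, and the rejection of empty fields are assumptions for illustration.

```python
import csv
import io
from collections import defaultdict


def group_review_rows(text):
    """Parse Batch Quality Review rows into {target_group: [suid, ...]}.

    Hypothetical helper: checks that each row has exactly four
    non-empty fields, then collects Source Unique IDs by Target Group.
    """
    groups = defaultdict(list)
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # assumption: blank lines are ignored
        if len(row) != 4:
            raise ValueError(f"line {line_no}: expected 4 fields, got {len(row)}")
        project_code, suid, event_type, target_group = (f.strip() for f in row)
        if not all((project_code, suid, event_type, target_group)):
            raise ValueError(f"line {line_no}: empty field")
        groups[target_group].append(suid)
    return dict(groups)


if __name__ == "__main__":
    example = (
        "CORE,suid_0001,WAMORB,1\n"
        "CORE,suid_0002,WAMORB,1\n"
        "CORE,suid_0003,WAMORB,2\n"
        "CORE,suid_0004,WAMORB,2\n"
    )
    print(group_review_rows(example))
```

The Target Group values are treated as opaque labels here, consistent with the note that they do not correspond to internal Group IDs.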
If one record from a group is being modified through a Batch Quality Review request, then all records in that group must be included in the request. For example, if an Operator wants to move record1 from group A to group B, they would need to submit a data file in the required format that:
- Lists all records in group A other than record1, and assigns them one target group value
- Lists all records in group B, along with record1, and assigns them a different target group value
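The move described above can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `build_move_request` is hypothetical, the Target Group values 1 and 2 are arbitrary labels, and the caller is assumed to already know the current members of both groups.

```python
import csv
import io


def build_move_request(project_code, event_type, group_a, group_b, moved):
    """Build Batch Quality Review rows that move `moved` from group A to B.

    Hypothetical helper. `group_a` and `group_b` are lists of Source
    Unique IDs currently in each group; `moved` must be in `group_a`.
    """
    if moved not in group_a:
        raise ValueError(f"{moved!r} is not in group A")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # All records left in group A share one target group value.
    for suid in group_a:
        if suid != moved:
            writer.writerow([project_code, suid, event_type, 1])
    # All records in group B, plus the moved record, share a different value.
    for suid in list(group_b) + [moved]:
        writer.writerow([project_code, suid, event_type, 2])
    return out.getvalue()


if __name__ == "__main__":
    print(
        build_move_request(
            "CORE", "WAMORB",
            group_a=["suid_0001", "suid_0002"],
            group_b=["suid_0003"],
            moved="suid_0001",
        ),
        end="",
    )
```

Every record from both affected groups appears in the output, which reflects the requirement that modifying one record of a group means including all records of that group in the request.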