Detecting Duplicate Records

Custom and Industry Data connectors whose load pattern supports a primary key detect duplicate records (Dupes) automatically. When Nitro detects a duplicate record, it does not merge the record into the ODS; it writes the record to a file instead. Both the file and the total count of duplicate records are available in the admin console.
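The behavior described above can be illustrated with a minimal sketch. This is not Nitro's implementation: the function names, the CSV output format, and the choice to keep the first record seen for each primary key are all assumptions made for illustration.

```python
import csv
import io


def split_duplicates(rows, key_fields):
    """Split rows into (unique, dupes) by primary key.

    Assumption: the first row seen for a key is kept and later rows
    with the same key are treated as duplicates. The actual rule
    Nitro applies may differ.
    """
    seen = set()
    unique, dupes = [], []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key in seen:
            dupes.append(row)      # held back, not merged
        else:
            seen.add(key)
            unique.append(row)     # eligible for merging
    return unique, dupes


def write_duplicates(dupes, fieldnames, out):
    """Write duplicate rows to a file-like object and return their count."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(dupes)
    return len(dupes)


# Hypothetical sample data with a single-column primary key "id".
rows = [
    {"id": "1", "name": "a"},
    {"id": "2", "name": "b"},
    {"id": "1", "name": "c"},
]
unique, dupes = split_duplicates(rows, ["id"])
report = io.StringIO()
dupe_count = write_duplicates(dupes, ["id", "name"], report)
```

Here `unique` holds the rows that would proceed to the merge step, while `report` and `dupe_count` stand in for the duplicates file and the total count that the admin console exposes.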

Unintended duplicate records in the ODS can cause major discrepancies when the records are effective-dated and when the data is further processed into the star schema in the DDS. Keeping duplicates out of the ODS preserves the quality of both the ODS and downstream processing.

Load patterns covered by the duplicate check include:

  • pt_ftp_ods_replace_ed_history__v
  • pt_ftp_ods_replace_ed_latest__v
  • pt_ftp_ods_upsert_ed_history__v
  • pt_ftp_ods_upsert_ed_latest__v
  • pt_ftp_ods_upsertload__v