Quality Check / Referential Integrity Check intro

Referential integrity errors occur when a foreign key value in a dependent table does not have a corresponding entry in the parent table. This error occurs when Analysis Services processes a dimension in which the fact table references a foreign key value that does not exist in the dimension table for that dimension.

When processing fact tables, earlier versions of Analysis Services skipped rows that contained an undefined member for a dimension. When rows were skipped, the total members in the cube did not match the expected total for the data warehouse, causing Analysis Services to display incorrectly aggregated information.

By default, Analysis Services 2008 will ‘play it safe’. I.e. when referential errors are encountered, the processing of the data will simply stop. Already at this point, the user or the administrator of the system is given a clear indication that something is wrong, or at least needs to be examined, before continuing.

Data inconsistency may occur for various reasons: The ERP system may be inconsistent by design; data may originate from several, separate and inconsistent source systems; data may be lost due to technical issues during the data flow from source to destination; or data may be inconsistent due to tables being updated with different frequencies.

One way to fix the problem is to have a look at the tables, and their relations, in the intermediate Stage database. Using a proper SQL script the foreign key / primary key relationship may be checked and corrected if necessary.

TARGIT has developed a custom SSIS task, the Quality Check Task, which performs the referential integrity check.

Foreign Keys

These are the keys, typically in a fact table, that for each transaction identifies which product has been sold to which customer etc. So Foreign Keys are in reality dimension values. But often they are just IDs with limited or no meaning to the end-user. E.g. product with ID ASW-1001-32456, customer number 103 etc. In the Data Model we have modelled so far No_, Sell-to Customer No_ and Country_Region Code etc. are all Foreign Keys. Foreign Keys are related to Primary Keys in dimension tables.

Primary Keys

These are the keys in a dimension table that are related to the corresponding Foreign Keys in fact tables or other, related dimension tables. Primary Keys are unique keys. Only one occurrence of each Primary Key should be present in the dimension table. Primary Keys are related to Foreign Keys in fact tables in a one-to-many relationship. E.g. even though we may have thousands of transactions in our fact table for product with Foreign Key ID ASW-1001-23456, it only has one occurrence of Primary Key ID ASW-1001-23456 in the dimension table.

Data Inconsistency

Data Inconsistency occurs when Foreign Keys in one table cannot be matched with Primary Keys in a related table. If we ignore the missing Primary Keys when processing our cubes, we will lose the data in the transactions related to these keys.

Referential Integrity

This is the opposite of Data Inconsistency. Full Referential Integrity exists when all foreign keys have a matching primary key.

Was this article helpful?
0 out of 0 found this helpful

Comments

0 comments

Please sign in to leave a comment.