Wrong interpretation of column data types in Data Discovery because first rows in data source are misleading

When you load data into TARGIT Data Discovery, a column that you expected to be a date, an integer, a decimal number, etc. may be interpreted as a different data type.


There are a number of reasons this can happen - let's look at some:

In this Excel file - the column Amount is clearly a decimal number:


However, when we import it into Data Discovery and preview the data, the top row looks like this (Amount=Integer):



Now - you can override this by simply changing the data type from integer to float (decimal number) - but why was it interpreted this way?

In this case, the reason is that the Amount values in the first rows are whole numbers without decimals.

This may just be a coincidence - but it influences the interpretation of the column, because Data Discovery uses these first rows to decide the data type of the column.
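The effect can be illustrated with a small Python sketch. This is not TARGIT's actual detection logic, which is not documented here; the function name `infer_column_type` and its `detection_row_count` parameter are hypothetical and only mimic the idea of deciding a column's type from its first rows.

```python
def infer_column_type(values, detection_row_count=100):
    """Guess 'integer' or 'float' by looking only at the first rows.

    Hypothetical illustration, not the real Data Discovery algorithm.
    """
    sample = values[:detection_row_count]
    # If every sampled value is a whole number, the column looks like an integer.
    if all(float(v).is_integer() for v in sample):
        return "integer"
    return "float"


# 150 whole-number amounts followed by one amount with decimals.
amounts = [float(n) for n in range(1, 151)] + [19.95]

print(infer_column_type(amounts))                            # integer
print(infer_column_type(amounts, detection_row_count=1000))  # float
```

With the default of 100 rows, the sample never reaches the value with decimals, so the column is guessed to be an integer. With a larger sample, the decimal value is seen and the guess changes to float.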

To avoid this in the future, click Advanced Settings when you do the import and change the number of rows that are read to detect the data type to a much higher number than the default (100).

First click Advanced Settings:

Then change the default number (100) for Detection row count:


By setting this to a much higher number, you make it much more likely that Data Discovery gets the data types right.
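To choose a Detection row count that is high enough, it can help to know how far down the first decimal value sits in the source. The following is a minimal sketch that runs outside Data Discovery. It assumes the data has been exported to CSV text with a header row and a dot as the decimal separator; the helper name `first_fractional_row` and the sample data are made up for illustration.

```python
import csv
import io


def first_fractional_row(csv_text, column):
    """Return the 1-based data row where `column` first holds a value
    with decimals, or None if every value is a whole number."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        if not float(row[column]).is_integer():
            return row_number
    return None


# Made-up sample: 250 whole-number amounts, then one amount with decimals.
sample_csv = (
    "Item,Amount\n"
    + "".join(f"item{i},{i}\n" for i in range(1, 251))
    + "item251,19.95\n"
)

print(first_fractional_row(sample_csv, "Amount"))  # 251
```

In this example, the Detection row count would need to be at least 251 for the decimal value to be included in the rows used for detection.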



