Wrong interpretation of column data type in Data Discovery because date is invalid

When you load data into Data Discovery, they are automatically interpreted  - and sometimes they will give you an error.

In this case we are loading in an excel sheet with an invalid date:


Notice that someone accidentally reversed month and day in one row which will give a problem when loading the excel sheet into Data Discovery. (based on the first rows data discovery is expecting the same format through all the following rows).


Here we've added this data source to Data Discovery and it shows a red exclamation mark


which we can hover with the mouse to get some extra info:


So we have a problem with a date - but which one there are multiple in the dataset.


Here are 2 ways of clearing it up fast:

Investigate from within Data Discovery:

Do a preview of the dateset and change the column type of dates to string (1 at the time). 
Each time you have change a data type you right click data source and do a reload.

Once you've found the culprit, your data source will stop giving you error.

(note you still have to open the excel file and correct the mistake - but now you know which column to look in).


Investigate with Excel:

Open the excel sheet and click the data tab and click filter.

Now you have a drop-down for each column - and when you use it for a date column it is very fast (like shown below) to identify an invalid date:


The 06-22-2015 is clearly not recognized as a date (excel is not able to put inside a hierarchy of year month).

Now you just need to correct the date - close excel and do an update in Data Discovery.

Was this article helpful?
0 out of 0 found this helpful



Please sign in to leave a comment.