This doubt comes from the DQ mission linked above, but the concept is outside the scope of that mission, so I am discussing it here.
In the mission linked above, we are processing the “Date” column of the “moma” dataset.
An example of the date format we are trying to process is 1990 - 1991.
Now assume the column also includes values such as 1990 - 1995 - 2000 or 1990 - 1995 - 2000 - 2005, or anything else with more than the two years the “Date” column is expected to hold.
We don’t know for sure what the column contains, and we can’t delete the data. I know that any format with more than two years is unacceptable, because it would go against logic. My doubt is about the uncertainty in the column’s contents: we can’t check the values one by one in a column with, say, a million data points. I hope my doubt is clear.
How can we find out how many such cases there are?
How should we process the column in that case?
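To make the first question concrete, here is a minimal sketch of one possible way to count such cases without reading the values one by one. It is only an illustration under assumptions: the list `dates` is made-up sample data standing in for the “Date” column, and `count_years` and `unexpected` are hypothetical names, not anything from the mission. The idea is to count the four-digit years in each value and tally how many values have one, two, three or more years.

```python
import re
from collections import Counter

# Hypothetical sample standing in for the values of the "Date" column.
dates = [
    "1990 - 1991",
    "1985",
    "1990 - 1995 - 2000",
    "1990 - 1995 - 2000 - 2005",
    "1970 - 1972",
]

# A "year" is assumed here to be a standalone run of exactly four digits.
YEAR = re.compile(r"\b\d{4}\b")

def count_years(value):
    """Return how many four-digit years appear in one date value."""
    return len(YEAR.findall(str(value)))

# Tally how many values contain 1, 2, 3, ... years.
year_counts = Counter(count_years(d) for d in dates)

# Values with more than two years, collected for review rather than deleted.
unexpected = [d for d in dates if count_years(d) > 2]

print(year_counts)
print(unexpected)
```

If the data is in a pandas DataFrame (assumed here to be called `moma`), the same tally could probably be obtained with `moma["Date"].str.count(r"\b\d{4}\b").value_counts()`. How the values with more than two years should then be processed is still the open part of the question.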