I imported the .csv and passed the parse_dates parameter for the ‘DRAW DATE’ column, using the following code:
lottery = pd.read_csv('649.csv', parse_dates = ['DRAW DATE'])
Then I created a function called extract_numbers using the following code:
def extract_numbers(row):
    row = row[4:10]
    row = set(row.values)
    return row
This function is the same as the one provided in the Dataquest solution.
Then I applied it to the dataframe using .apply() as follows:
winning_numbers = lottery.apply(extract_numbers, axis=1)
winning_numbers
Instead of extracting one set per row, this copied the set of 6 numbers into every value of the dataframe.
After many hours trying to find the problem, I solved it by removing the parse_dates parameter from .read_csv.
Do you know why this happens? It seems strange that a column not used by the function passed to .apply affects its result.
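In case it helps anyone reproduce or sidestep this, here is a minimal sketch of a workaround that keeps the parsed dates and does not go through .apply at all. I have not confirmed the cause, so this only avoids the problem rather than explaining it. The small dataframe below is a made-up stand-in for 649.csv (the column names and values are invented for illustration); the only thing assumed about the real file is that the six drawn numbers sit in column positions 4 to 9, as in extract_numbers.

```python
import pandas as pd

# Made-up stand-in for 649.csv: names and values are illustrative only.
lottery = pd.DataFrame({
    "PRODUCT": [649, 649, 649],
    "DRAW NUMBER": [1, 2, 3],
    "SEQUENCE NUMBER": [0, 0, 0],
    "DRAW DATE": pd.to_datetime(["1982-06-12", "1982-06-19", "1982-06-26"]),
    "NUMBER DRAWN 1": [3, 8, 1],
    "NUMBER DRAWN 2": [11, 33, 6],
    "NUMBER DRAWN 3": [12, 36, 23],
    "NUMBER DRAWN 4": [14, 37, 24],
    "NUMBER DRAWN 5": [41, 39, 27],
    "NUMBER DRAWN 6": [43, 41, 39],
    "BONUS NUMBER": [13, 9, 34],
})

# Select the six number columns by position, the same slice as row[4:10].
number_columns = lottery.iloc[:, 4:10]

# Build one set per row with a plain list comprehension, so the result
# does not depend on how .apply assembles what the function returns.
winning_numbers = pd.Series(
    [set(values) for values in number_columns.to_numpy().tolist()],
    index=lottery.index,
)

print(winning_numbers)
```

The result is a Series with one set of six numbers per draw, while ‘DRAW DATE’ stays a datetime column in lottery.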
Here is a picture of the dataframe with parse_dates