Cannot find YEAR column in World Happiness CSV

Hi there,

https://app.dataquest.io/m/347/working-with-missing-and-duplicate-data/4/visualizing-missing-data

I am getting this error because my dataset does not have the YEAR column.
I downloaded the files from Kaggle
https://www.kaggle.com/unsdsn/world-happiness?select=2016.csv
which may be why there is a difference.
Please advise, since otherwise I cannot follow along with this part.

We can learn more about where these missing values are located by visualizing them with a heatmap, a graphical representation of our data in which values are represented as colors. We’ll use the seaborn library to create the heatmap.

Note below that we first reset the index to be the YEAR column so that we’ll be able to see the corresponding year on the left side of the heatmap:

Error -
None of ['YEAR'] are in the columns
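
For anyone hitting the same message, here is a minimal sketch of how it can arise. It assumes the lesson code calls `set_index('YEAR')`; the small DataFrame is a made-up stand-in for the Kaggle data, not the real file.

```python
import pandas as pd

# Hypothetical stand-in for a Kaggle file, which has no YEAR column.
kaggle_like = pd.DataFrame(
    {"Country": ["Norway", "Denmark"], "Happiness Score": [7.5, 7.4]}
)

# Check for the column before trying to use it as the index.
print("YEAR" in kaggle_like.columns)

try:
    # set_index raises KeyError when the named column does not exist.
    kaggle_like.set_index("YEAR")
except KeyError as err:
    print(f"KeyError: {err}")
```

Checking `df.columns` first is a quick way to confirm whether the downloaded file matches the one the lesson expects.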

I see that a column could be added to the DataFrame, but then I would have to write out the year 300+ times to match the number of rows, which would be time-consuming. Are there any other suggestions?
https://www.geeksforgeeks.org/adding-new-column-to-existing-dataframe-in-pandas/

The data may have been modified. Use the Dataquest files, which you can download from the mission:

  • click on the CSV file name
  • then click on the download icon to download the file.

Hi @jamesberentsen:

As @DishinGoyani said, the Dataquest version of the CSV has one additional column, YEAR, compared with the Kaggle version (13 columns instead of 12), so be sure to download the Dataquest version.

Cheers!

Thanks, but I wondered whether there is an efficient way to add a YEAR column to the existing Kaggle 2015, 2016, and 2017 CSV files?

Hi @jamesberentsen:

Maybe you could try this method from Stack Overflow.

You can add it like this:

happiness2015["YEAR"] = 2015
happiness2016["YEAR"] = 2016
happiness2017["YEAR"] = 2017
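
Assigning a single value to a new column fills every row with it, so there is no need to type the year 300+ times. If you would rather not repeat the line for each file, the same idea can be sketched as a loop. The helper name `add_year_and_combine` and the tiny sample frames are made up for illustration; with the real files you would build the dict from `pd.read_csv("2015.csv")` and so on.

```python
import pandas as pd


def add_year_and_combine(frames_by_year):
    """Tag each DataFrame with its year, then stack them into one."""
    # assign() returns a copy with YEAR set to the same value on every row.
    tagged = [df.assign(YEAR=year) for year, df in frames_by_year.items()]
    return pd.concat(tagged, ignore_index=True)


# Tiny stand-in frames, not the real Kaggle data.
frames = {
    2015: pd.DataFrame({"Country": ["Switzerland", "Iceland"]}),
    2016: pd.DataFrame({"Country": ["Denmark"]}),
    2017: pd.DataFrame({"Country": ["Norway"]}),
}

combined = add_year_and_combine(frames)
print(combined)
```

Note that the Kaggle files may not use identical column names across years, so columns that differ will end up with missing values after `pd.concat`.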

Thanks DishinGoyani

That worked. Thanks for your help.
Great job.

Regards,
JB
