Help with course 5: Data Cleaning Walkthrough

The aim is to read each file and add it to the data dictionary. This is mission 4 of course 5, Data Cleaning Walkthrough. I read the files individually but couldn't think of a way to add them to the dictionary, and when I saw the answer I was still lost. Could someone please explain the code to me, as I don't completely understand it? Thank you.

data_files = [
data = {}

#answer code starts here
for f in data_files:
    d = pd.read_csv("schools/{0}".format(f))
    key_name = f.replace(".csv", "")
    data[key_name] = d


Hi @chosum.tashi ,

Here we are loading all the CSV files listed in the data_files list into pandas DataFrames, and each of those DataFrames is stored in a dictionary named data. The code does this in the following steps:

For each file name in the data_files list, create a DataFrame from that file:
d = pd.read_csv("schools/{0}".format(f))

Our files are stored in the schools folder. In the code above, {0} is replaced by the value of the argument f passed to the .format() method (the Python documentation for str.format() explains the method in more detail). The value of f is the current file name from the data_files list, so across the iterations the line becomes:

d = pd.read_csv("schools/ap_2010.csv")
d = pd.read_csv("schools/class_size.csv")
d = pd.read_csv("schools/demographics.csv")
d = pd.read_csv("schools/graduation.csv")
d = pd.read_csv("schools/hs_directory.csv")
d = pd.read_csv("schools/sat_results.csv")
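
If you want to see the substitution on its own, here is a minimal sketch that only builds the paths and does not read any files. The contents of data_files are assumed from the six file names listed above, and the paths variable is just a name picked for this illustration.

```python
# File names assumed from the six iterations listed above.
data_files = [
    "ap_2010.csv",
    "class_size.csv",
    "demographics.csv",
    "graduation.csv",
    "hs_directory.csv",
    "sat_results.csv",
]

# {0} is a placeholder for the first argument passed to .format().
paths = ["schools/{0}".format(f) for f in data_files]

for p in paths:
    print(p)
```

Each printed line is the string that pd.read_csv() receives on that iteration.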

On every iteration, d holds a new DataFrame, which we need to save to the data dictionary. To do that, we need to specify a key. For easy reference, we can use the file name as the key, so we create the key from the file name after removing the .csv extension from it.

key_name = f.replace(".csv", "")
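
As a quick check of what that line produces, here is a small sketch using two of the file names from above. The file_names and key_names variables are only names chosen for this example.

```python
file_names = ["ap_2010.csv", "sat_results.csv"]

# str.replace returns a new string with ".csv" replaced by an empty string.
key_names = [f.replace(".csv", "") for f in file_names]

print(key_names)  # ['ap_2010', 'sat_results']
```

Note that replace() removes every occurrence of ".csv" in the string, not only the one at the end. That is fine for these file names, since ".csv" appears in each of them only once.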

After that, we add the DataFrame d to the data dictionary as the value for the key key_name.

data[key_name] = d
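
Putting the steps together, here is a sketch you can run anywhere to watch the dictionary fill up. It assumes pandas is installed. Since the real schools folder may not be available, it writes two tiny CSV files with made-up contents (the id and value columns are invented for this example) into a temporary directory and reads those instead.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the real schools folder: two tiny CSV files
# with made-up contents, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
schools_dir = os.path.join(tmp_dir, "schools")
os.makedirs(schools_dir)

data_files = ["ap_2010.csv", "sat_results.csv"]
for f in data_files:
    with open(os.path.join(schools_dir, f), "w") as handle:
        handle.write("id,value\nA,1\nB,2\n")

# Same pattern as the answer code: read each file, derive the key, store it.
data = {}
for f in data_files:
    d = pd.read_csv(os.path.join(schools_dir, f))
    key_name = f.replace(".csv", "")
    data[key_name] = d

print(list(data.keys()))          # ['ap_2010', 'sat_results']
print(data["sat_results"].shape)  # (2, 2)
```

After the loop you can pull any DataFrame back out by its key, for example data["ap_2010"].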

Hope it helps :slightly_smiling_face:

