Hi, a quick question about this code from “Learn data science with Python and R projects”:
def open_dataset(file_name='AppleStore.csv', header=True):
    opened_file = open(file_name)
    from csv import reader
    read_file = reader(opened_file)
    data = list(read_file)
I don’t understand why this answer is correct. I mean, the header isn’t defined anywhere as a row in the opened file/list of lists, and the author didn’t explain something as important as this header parameter.
Any answers will help, thanks!
I can understand the confusion, because they expect you to be familiar with the AppleStore.csv dataset by this point, since you have worked with it in previous Missions.
In the last instruction on this Step, they do point this out: "recall that the AppleStore.csv dataset has a header row".
Since we know that it’s a list of lists and that there is a header row, you can infer that data[0] (the first list/row) is the header based on the information provided.
At this point, I would also highly encourage you to start exploring the dataset based on the code given. You can add print statements, or try to print the entire thing, just to get a better understanding of the data yourself. This will come in handy later on as well, so try to do that from time to time for your own benefit.
For example, in Step 3 of this Mission, you wrote the code to extract the data from the CSV to apps_data. In one of the instructions, you were also asked to explore the first few rows. So, printing out those rows would have shown you that the first row (the first list) inside the list of lists is the header row.
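Here is a minimal sketch of that kind of exploration. Since AppleStore.csv isn’t available here, it reads a small in-memory CSV instead; the column names and values are made up purely for illustration and are not the real dataset.

```python
import csv
import io

# Hypothetical stand-in for AppleStore.csv (invented columns and values).
sample_csv = "id,track_name,price\n1,Example App,0.0\n2,Another App,1.99\n"

# Same pattern as in the question: reader -> list of lists.
read_file = csv.reader(io.StringIO(sample_csv))
data = list(read_file)

# The first inner list is the header row.
print(data[0])

# Everything after it is the actual data.
print(data[1:3])

header_row = data[0]
```

Printing a small slice like this is usually enough to see which row holds the column names.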
Let me know if this helps!
I think this is actually an error on their part: "If the dataset has a header, the function returns separately both the header and the rest of the data set."
However, the return statement in their code will only return the data without the header row, so you can’t actually index into the returned data to get the header row since it’s been omitted. If you go to the next screen you’ll see they’ve updated the function to actually return both:
return data[1:], data[0]
So, you could always call open_dataset(header=False) to return the full data, including the header row, and then index it yourself from there.
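To make the two behaviours concrete, here is a rough sketch of a function that returns both pieces. The if/else branch is my assumption about how the header flag is used (the snippet in the question stops before any return), and the temporary sample file with its made-up columns only stands in for AppleStore.csv.

```python
import csv
import os
import tempfile

def open_dataset(file_name='AppleStore.csv', header=True):
    # "with" closes the file for us; the original snippet used a bare open().
    with open(file_name, newline='', encoding='utf8') as opened_file:
        data = list(csv.reader(opened_file))
    # Assumed branch: header=True means "the file has a header row".
    if header:
        return data[1:], data[0]   # (rows without header, header row)
    return data                    # full list of lists, untouched

# Hypothetical sample file so the sketch runs without the real dataset.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 newline='', encoding='utf8') as tmp:
    tmp.write("id,track_name,price\n1,Example App,0.0\n")
    sample_path = tmp.name

rows, header_row = open_dataset(sample_path)          # header split off
full_data = open_dataset(sample_path, header=False)   # header still at index 0
os.remove(sample_path)

print(header_row)
print(full_data[0])
```

With header=False the header row is simply the first element of whatever comes back, so full_data[0] and header_row hold the same list.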
That’s an instruction for a different Mission Step and not for the one in the question.
Hi, thanks for your responses.
So, just to clarify regarding the header parameter in the def: whatever name is used in place of “header”, the True/False value indicates whether a header row exists in the file or not, right? (Specifically when we call the function.)
def open_dataset(file_name='AppleStore.csv', header=True):