Function intermediate

def open_dataset(file_name='AppleStore.csv', header=True):
    opened_file = open(file_name)
    from csv import reader
    read_file = reader(opened_file)
    data = list(read_file)
    if header:
        return data[1:]
    return data

How does it know what a header is? Is “header” predefined in the system as the first row?

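header is just a parameter name used in the function definition, so every header inside def open_dataset refers to the same object: whatever value was passed in for that parameter (True by default). Nothing in Python predefines it as “the first row”; the function body decides what to do with it, and here it skips data[0] when header is truthy. You can check object sameness in the strictest sense with id(any_variable), which returns an integer that is unique to the object for its lifetime (in CPython this happens to be the memory address).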
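As a minimal sketch of that point (show_identity is a made-up helper, not part of the original code), id() gives the same number every time header is mentioned inside the function:

```python
def show_identity(rows, header=True):
    # Every mention of `header` in this body is the same local name,
    # bound to whatever object the caller passed in.
    first_id = id(header)
    if header:
        rows = rows[1:]          # drop the first row, as open_dataset does
    second_id = id(header)
    return rows, first_id == second_id


rows, same = show_identity([["id", "track_name"], ["1", "Facebook"]], header=True)
print(rows)   # [['1', 'Facebook']]
print(same)   # True
```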

You could have called header anything, as long as you use the same name when you refer to it again inside the function. Proper naming is done for human readability, so that code becomes “self-documenting” and needs fewer comments. It’s both a matter of style and good practice to know which verbs and nouns to use as variable/class/function names, and which levels of abstraction and organization to use, so the code tells an easily readable story of what it does. For example, properly abstracted and named processing workflows work well with the code-collapsing capabilities of IDEs, which let a new reader follow the high-level story first before expanding the collapsed parts to get the details.
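To see that the name itself carries no special meaning, here is a rough sketch with two versions of the function that differ only in the parameter name. To keep it runnable without AppleStore.csv, both versions take an already opened file-like object instead of a file name; that change, the SAMPLE text, and the name skip_first_row are assumptions made for the demo.

```python
import io
from csv import reader

# Tiny stand-in for the real CSV file (made-up rows).
SAMPLE = "id,track_name\n1,Facebook\n2,Instagram\n"


def open_dataset(file_obj, header=True):
    data = list(reader(file_obj))
    if header:
        return data[1:]
    return data


def open_dataset_renamed(file_obj, skip_first_row=True):
    # Same logic; only the parameter name changed.
    data = list(reader(file_obj))
    if skip_first_row:
        return data[1:]
    return data


a = open_dataset(io.StringIO(SAMPLE))
b = open_dataset_renamed(io.StringIO(SAMPLE))
print(a == b)   # True
```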

To address your point about “predefined”: everything I said above was about using header inside the function definition. Now I turn to calling the open_dataset function and, more generally, to calling any function from code written to use it.
If we think about any library, e.g. pandas, the arguments and their order are indeed strictly predefined.
To use the library functions, you either give the arguments in the correct order, or you provide them by keyword if you only want to use some of the features of the function. These keywords are predefined in the sense that the library designer chose them, and Python’s keyword-argument mechanism requires the keywords you supply to match the parameter names in def func.
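A small sketch of the matching rule, using a stub with the same signature as open_dataset (the stub and the misspelled keyword has_header are invented for illustration):

```python
def open_dataset_stub(file_name="AppleStore.csv", header=True):
    # Just echo the arguments so we can see what was received.
    return file_name, header


# Keyword matches a parameter name in the def line: works.
result = open_dataset_stub(header=False)
print(result)   # ('AppleStore.csv', False)

# Keyword does not match any parameter name: Python raises TypeError.
try:
    open_dataset_stub(has_header=False)
except TypeError as err:
    message = str(err)
    print(message)
```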

To demonstrate what “use some of the features” means, take pd.read_excel.
If you want to use io, sheet_name and header, older pandas versions let you call it as pd.read_excel('my_excel.xlsx', 2, 3) without giving any keywords; these are positional arguments when called like that. (Recent pandas versions, 2.0 onward as far as I know, only accept io and sheet_name positionally, so there header has to be given by keyword anyway.)
But if you want to skip the 2nd input, sheet_name, you should call pd.read_excel('my_excel.xlsx', header=3) (note that the keyword is specified now),
because if you don’t give header=, Python treats the 3 as the input to the 2nd parameter, sheet_name, which is not what you intended.
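The same binding behaviour can be tried without pandas or an Excel file. The function below is a hypothetical stand-in, not the real pd.read_excel: it only borrows the first three parameter names and reports which value landed in which parameter.

```python
def read_table(io, sheet_name=0, header=0):
    # Hypothetical stand-in: returns the bindings instead of reading a file.
    return {"io": io, "sheet_name": sheet_name, "header": header}


# All three given positionally: bound in definition order.
all_positional = read_table("my_excel.xlsx", 2, 3)

# Skip sheet_name by naming header explicitly.
keyword_header = read_table("my_excel.xlsx", header=3)

# Forgetting header= sends the 3 to the 2nd parameter, sheet_name.
accidental = read_table("my_excel.xlsx", 3)

print(all_positional)   # {'io': 'my_excel.xlsx', 'sheet_name': 2, 'header': 3}
print(keyword_header)   # {'io': 'my_excel.xlsx', 'sheet_name': 0, 'header': 3}
print(accidental)       # {'io': 'my_excel.xlsx', 'sheet_name': 3, 'header': 0}
```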

The whole positional vs keyword and required vs optional arguments topic is rather complex. It takes some time to get used to, and the best way to learn it is to actively play around and run into more error messages in Python.

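Scoping is an important thing to learn in any language. Python’s rule is LEGB (taught in dataquest too): a name is looked up in the Local scope first, then any Enclosing function scopes, then the Global (module) scope, then the Built-ins.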
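A compact sketch of the LEGB lookup order (all function names here are made up for the demo):

```python
x = "global"


def outer():
    x = "enclosing"

    def inner_local():
        x = "local"      # L: found in the local scope first
        return x

    def inner_enclosing():
        return x         # E: not local, so the enclosing function's x is used

    return inner_local(), inner_enclosing()


def reads_global():
    return x             # G: no local or enclosing x, so the module-level x


def uses_builtin():
    return len([1, 2, 3])  # B: len is not defined here, so the built-in is found


print(outer())           # ('local', 'enclosing')
print(reads_global())    # global
print(uses_builtin())    # 3
```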

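After LEGB, when you move on to more complicated objects, turn to ICPO (Instance, Class, Parent, Object), the order in which Python looks for an attribute on an object.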
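A simplified sketch of the ICPO lookup order (the Parent and Child classes are invented for the demo, and details such as descriptors are ignored):

```python
class Parent:
    greeting = "Parent.greeting"
    shared = "Parent.shared"


class Child(Parent):
    shared = "Child.shared"      # shadows Parent.shared


c = Child()
c.name = "instance.name"         # stored on the instance itself

print(c.name)        # I: found on the instance -> instance.name
print(c.shared)      # C: not on the instance, found on Child -> Child.shared
print(c.greeting)    # P: not on instance or Child, found on Parent -> Parent.greeting
print(type(c).__str__ is object.__str__)  # O: falls through to object -> True

# Assigning on the instance shadows the class attribute for this object only.
c.shared = "instance.shared"
print(c.shared)      # instance.shared
print(Child.shared)  # Child.shared
```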