header is just a name used in the function definition, so every header inside
def open_dataset refers to the same object. You can check object sameness in the strictest sense using
id(any_variable), which returns the object's identity (in CPython, its memory address), so two names that give the same id refer to the same object.
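A minimal sketch of that identity check. The function body and the sample rows are made up for illustration (this is not the course's open_dataset); the only point is that each mention of header inside the function is the same object as the one the caller passed in.

```python
def open_dataset(rows, header):
    # Hypothetical body: `header` appears several times below,
    # and every mention refers to one and the same object.
    id_first_use = id(header)
    data = rows[1:] if header else rows
    id_second_use = id(header)
    return data, id_first_use, id_second_use


rows = [["name", "price"], ["app_a", "0.99"], ["app_b", "0.0"]]
flag = True
data, id_first_use, id_second_use = open_dataset(rows, flag)

print(id_first_use == id_second_use)  # same name inside the function, same object
print(id_first_use == id(flag))       # and it is the object the caller passed in
print(data)                           # the rows without the header row
```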
You could have called
header anything, as long as you use the same name when you refer to it again in the function. Good names are chosen for human readability, so the code becomes “self-documenting” and you often don’t even need to write comments. It’s both a matter of style and good practice to know which verbs and nouns to use for variable/class/function names AND which levels of abstraction to organize the code into, so that it conveys an easily readable story of what the code does. For example, properly abstracted and well-named processing workflows work well with the code-collapsing features of IDEs: a new reader can follow the high-level story first, then expand a collapsed part to get the details.
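To see that the name itself carries no special meaning, here is a sketch with two made-up versions of the same function that differ only in the parameter name. Both function names and the body are assumptions for illustration.

```python
def drop_header_v1(rows, header=True):
    # `header` is used consistently inside this function.
    return rows[1:] if header else rows


def drop_header_v2(rows, first_row_is_column_names=True):
    # Same logic, different (longer, more descriptive) parameter name.
    return rows[1:] if first_row_is_column_names else rows


rows = [["name", "price"], ["app_a", "0.99"]]
result_v1 = drop_header_v1(rows, True)
result_v2 = drop_header_v2(rows, True)
print(result_v1 == result_v2)  # the behaviour does not depend on the name
```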
To address your point about “predefined”: everything I said above was about using
header within the function definition. Now I turn to calling the
open_dataset function or, more generally, calling any function from code written to use it.
If we think about any library, e.g. pandas, the parameters and their order are indeed strictly predefined.
To use a library function, you either give the arguments in the correct order, or provide them by keyword if you only want to use some of the function's features. These keywords are predefined: they are whatever the library designer chose to call the parameters, and Python’s mechanism for supplying keyword arguments requires that the keywords you provide match the parameter names in the function's definition.
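A small sketch of both calling styles. load_table and its parameter names are hypothetical stand-ins for a library function, not a real API; the function only reports which value ended up in which parameter.

```python
def load_table(path, sheet=0, header_row=0):
    """Hypothetical stand-in for a library function; it only reports its inputs."""
    return {"path": path, "sheet": sheet, "header_row": header_row}


# Positional: values are matched to parameters by order.
by_position = load_table("data.xlsx", 2, 3)

# Keyword: values are matched by name, so the order is free.
by_keyword = load_table("data.xlsx", header_row=3, sheet=2)

# A keyword that the definition does not contain is rejected.
try:
    load_table("data.xlsx", header=3)
    wrong_keyword_error = None
except TypeError as exc:
    wrong_keyword_error = exc

print(by_position == by_keyword)
print(wrong_keyword_error)
```

The last call fails because the definition names the parameter header_row, not header: the caller has to use the designer's names.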
To demonstrate what “use some of the features” means, take pd.read_excel.
If you want to use
io, sheet_name and header, you can call it as
pd.read_excel('my_excel.xlsx', 2, 3) without giving any keywords; these are
positional arguments when passed like that. (Recent pandas releases, 2.0 onward as far as I know, make header and everything after sheet_name keyword-only, so there the third value has to be written as header=3.)
But if you want to skip the 2nd input,
sheet_name, and leave it at its default, you should call
pd.read_excel('my_excel.xlsx', header=3) (note that the keyword is specified now),
because if you don’t give
header=, Python treats the 3 as the input to the 2nd parameter,
sheet_name, which is not what you intended.
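Reading a real spreadsheet needs pandas and a file on disk, so the sketch below uses a made-up stand-in, fake_read_excel, that copies only the first three parameter names of pd.read_excel and reports where each value landed. It is not the pandas implementation, and the default values are assumptions for illustration.

```python
def fake_read_excel(io, sheet_name=0, header=0):
    """Made-up stand-in: returns the arguments instead of reading a file."""
    return {"io": io, "sheet_name": sheet_name, "header": header}


# Without the keyword, the 3 fills the 2nd parameter, sheet_name.
unintended = fake_read_excel("my_excel.xlsx", 3)

# With the keyword, sheet_name keeps its default and header gets the 3.
intended = fake_read_excel("my_excel.xlsx", header=3)

print(unintended)
print(intended)
```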
The whole positional vs keyword arguments and required vs optional arguments topic is rather complex and takes some time to get used to; the way through is to actively play around and hit more error messages in Python.
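One way to play around is to trigger the common mistakes on purpose and read the messages. The function summarize and the helper error_message below are hypothetical, written only for this experiment.

```python
def summarize(values, ndigits=2, *, skip_zeros=False):
    # values     : required (no default)
    # ndigits    : optional, may be given by position or by keyword
    # skip_zeros : optional and keyword-only, because it comes after the bare *
    if skip_zeros:
        values = [v for v in values if v != 0]
    return round(sum(values) / len(values), ndigits)


def error_message(call):
    """Run a zero-argument callable and return the TypeError text, if any."""
    try:
        call()
    except TypeError as exc:
        return str(exc)
    return None


missing_required = error_message(lambda: summarize())
too_many_positional = error_message(lambda: summarize([1, 2], 2, True))
unknown_keyword = error_message(lambda: summarize([1, 2], digits=1))

for message in (missing_required, too_many_positional, unknown_keyword):
    print(message)

print(summarize([1, 0, 2], skip_zeros=True))  # a valid call, for comparison
```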
Scoping is important to learn in any language, and Python's version of it is the LEGB rule (local, enclosing, global, built-in; taught in Dataquest too): https://realpython.com/python-scope-legb-rule/.
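A compact sketch of the four scopes that LEGB stands for (local, enclosing, global, built-in). All names here are invented for the demonstration.

```python
label = "global"


def outer():
    label = "enclosing"

    def inner_with_local():
        label = "local"      # found first: the local scope
        return label

    def inner_without_local():
        return label         # no local name, so the enclosing scope is used

    return inner_with_local(), inner_without_local()


def read_global():
    return label             # no local or enclosing name, so the global one


def use_builtin():
    return len("abc")        # `len` is not defined in this file: built-in scope


print(outer())
print(read_global())
print(use_builtin())
```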
After LEGB, when you move on to more complicated objects, turn to ICPO (instance, class, parents, object), the order in which Python searches for attributes: https://lerner.co.il/2019/09/10/legb-meet-icpo-pythons-search-strategy-for-attributes/
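A rough sketch of that attribute search, assuming the expansion used in the linked article (instance, class, parents, object). The class and attribute names are invented, and the sketch ignores finer points such as descriptors and multiple inheritance.

```python
class Parent:
    greeting = "from parent"
    shared = "from parent"


class Child(Parent):
    shared = "from class"    # shadows Parent.shared


c = Child()
c.own = "from instance"      # stored on the instance itself

found_on_instance = c.own
found_on_class = c.shared
found_on_parent = c.greeting
# Neither Child nor Parent defines __str__, so the search ends at object.
found_on_object = Child.__str__ is object.__str__

# A name that is nowhere along the search path raises AttributeError.
try:
    c.missing
    missing_error = None
except AttributeError as exc:
    missing_error = exc

print(found_on_instance, found_on_class, found_on_parent, found_on_object)
print(missing_error)
```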