Hi Leonel,
If you read the error message and recognize that tuple and list are Python types, you will get a sense that somewhere the type of a variable is wrong.
I can see some points of confusion that made you write all_data[1:] to get the data.
To answer this, think about the difference between data inside the function and all_data outside the function. What is the function returning? Verify not only the values but also the types, which are even more important. Be extremely thoughtful, both when reading others’ code and when writing your own, at two interfaces: from the calling code to the function definition, and from the return statement inside the function definition to the variable(s) assigned in the calling code. These are a major source of mistakes. Even if the code works now, it is still important to keep in mind what is going on, because future changes to the code can break things again.
Different types allow different ways of indexing, different getting/setting patterns, and different predefined methods for data manipulation. Each type also works differently with plotting libraries like matplotlib or manipulation libraries like pandas. You may think you are getting one type when you are actually getting another. The code may look the same and the variable name may make perfect sense, but what happens under the hood could be very different from what you expect. For a start, print(type(any_variable)) is a good debugging tool.
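As a minimal sketch of that check (the names and values below are made up for illustration and are not taken from your code), printing the type before and after an operation shows when a value is not what its name suggests:

```python
# Hypothetical values, only to illustrate type checking.
rows = [["id", "name"], ["1", "Facebook"], ["2", "Instagram"]]
pair = (rows[0], rows[1:])   # values separated by a comma form a tuple

print(type(rows))      # <class 'list'>
print(type(pair))      # <class 'tuple'>

# Slicing keeps the type of whatever is being sliced.
print(type(rows[1:]))  # <class 'list'>
print(type(pair[1:]))  # <class 'tuple'>
```

The last two lines look identical in code, yet they produce different types, which is exactly the kind of surprise print(type(...)) catches early.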
(Spoiler alert! The following paragraphs contain the direct answer to your question, so try to think through the paragraphs above and debug it yourself first.)
To improve your code, you can use multiple assignment (I don’t mean a=b=c), more precisely called tuple unpacking, with header, apps_data = all_data. This side-by-side layout mirrors the multiple values returned inside the if header block in the function, which gives the reader some consistency.
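Here is a rough sketch of tuple unpacking. I am assuming your function looks roughly like the open_dataset below and returns the header row first; the in-memory SAMPLE_CSV text is a stand-in I made up so the example runs without a file on disk.

```python
import csv
import io

# Hypothetical stand-in for the CSV file contents.
SAMPLE_CSV = "id,track_name\n1,Facebook\n2,Instagram\n"

def open_dataset(csv_text=SAMPLE_CSV, header=True):
    """Assumed shape of the function: header row first, data rows second."""
    data = list(csv.reader(io.StringIO(csv_text)))
    if header:
        # Two values separated by a comma are returned as ONE tuple.
        return data[0], data[1:]
    return data

all_data = open_dataset()
print(type(all_data))   # <class 'tuple'>

# Tuple unpacking: one name per returned value, in the order of the return statement.
header, apps_data = open_dataset()
print(header)           # ['id', 'track_name']
print(apps_data)        # [['1', 'Facebook'], ['2', 'Instagram']]
```

If your function returns the values in the opposite order, swap the names on the left-hand side to match.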
Did you think about which value to return in the first position, and why? Also, this question assumes you are using the default header=True parameter. What happens when someone passes header=False when calling your function? How does that affect how you assign the return values from the call?
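To make the header=False question concrete, here is a sketch using a hypothetical split_rows function of the same assumed shape (two values with a header, one list without). The two-name unpacking then either fails or, worse, quietly succeeds when the list happens to contain exactly two rows:

```python
def split_rows(rows, header=True):
    """Hypothetical function: returns two values or one, depending on the flag."""
    if header:
        return rows[0], rows[1:]
    return rows

three_rows = [["1", "Facebook"], ["2", "Instagram"], ["3", "Clash of Clans"]]

try:
    header, apps_data = split_rows(three_rows, header=False)
except ValueError as error:
    # Three rows cannot be unpacked into two names.
    print("unpacking failed:", error)

# With exactly two rows the unpacking "works", but the names are now misleading.
header, apps_data = split_rows(three_rows[:2], header=False)
print(header)      # ['1', 'Facebook']  (a data row, not a header)
print(apps_data)   # ['2', 'Instagram'] (a single row, not a list of rows)
```

One possible way around this, if it suits your use case, is to return the same number of values in both branches, for example None in place of the missing header.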
If you want to ignore position, you can explore https://docs.python.org/3/library/collections.html#collections.namedtuple to use names to index values in a collection (such as a tuple).
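A small sketch of that idea follows. The names Dataset, header, rows, and split_rows are hypothetical choices for this example, not something from your code:

```python
from collections import namedtuple

# Hypothetical type and field names chosen for this sketch.
Dataset = namedtuple("Dataset", ["header", "rows"])

def split_rows(all_rows):
    """Return the header and data rows under explicit names."""
    return Dataset(header=all_rows[0], rows=all_rows[1:])

result = split_rows([["id", "track_name"], ["1", "Facebook"]])
print(result.header)   # ['id', 'track_name']
print(result.rows)     # [['1', 'Facebook']]

# A namedtuple is still a tuple, so positional unpacking keeps working.
header, apps_data = result
```

The caller no longer has to remember which value comes first, because result.header and result.rows say what they contain.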
For your from csv import reader, it is usually better to collect import statements like this at the top of the .py file so the reader immediately knows which tools are used and where they come from.
For intermediate variables that are used only once, like read_file, directly assigning a = func2(func1(c)) could be more convenient than
b = func1(c)
a = func2(b)
Nevertheless, you may want to name the intermediate value for self-documenting code (the variable name tells you what it contains), or when you start using pandas, where it is convenient to store intermediate processed dataframes in new variables so they provide a shorthand reference that multiple downstream functions can use.
As you get better (coding in your own .ipynb or .py files), you can add type hints to your code. This can be seen as a faster form of documentation (https://realpython.com/python-type-checking/) because it makes you specify which types every function takes in and gives out. It also lets IDEs help with autocompletion when you write classes, and lets you run static type checking tools like mypy.
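As a rough illustration of what type hints could look like here, the sketch below uses a hypothetical split_rows function that always returns two values (None standing in for a missing header). That design is my assumption, not your original function.

```python
from typing import List, Optional, Tuple

Row = List[str]  # one CSV row: a list of strings

def split_rows(rows: List[Row], header: bool = True) -> Tuple[Optional[Row], List[Row]]:
    """Hypothetical function: always returns (header_or_None, data_rows)."""
    if header:
        return rows[0], rows[1:]
    return None, rows

header, apps_data = split_rows([["id", "track_name"], ["1", "Facebook"]])
print(header)      # ['id', 'track_name']
print(apps_data)   # [['1', 'Facebook']]
```

The signature alone now tells the caller that two values come back and that the first one may be None, and a tool like mypy can flag call sites that forget to handle that.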