Screen Link: https://app.dataquest.io/m/136/data-cleaning-walkthrough/11/inserting-dbn-fields
padded_csd = col.astype(str)
if len(padded_csd) == 2:
What I expected to happen: I should be able to apply the function.
What actually happened:
AttributeError Traceback (most recent call last)
<ipython-input-24-461cbed20d38> in <module>
----> 1 padded_csd = CSD.apply(pad_csd)
3 data["class_size"]["DBN"] = padded_csd + data["class_size"]["SCHOOL CODE"]
~/anaconda/lib/python3.6/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
3590 values = self.astype(object).values
-> 3591 mapped = lib.map_infer(values, f, convert=convert_dtype)
3593 if len(mapped) and isinstance(mapped, Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-23-7d3255e9f536> in pad_csd(col)
1 # The mistake that i did
2 def pad_csd(col):
----> 3 padded_csd = col.astype(str)
4 if len(padded_csd) == 2:
5 return padded_csd
AttributeError: 'int' object has no attribute 'astype'
Other details: From the answer, I found out that it works if I change it to the code below. But can I ask why padded_csd = col.astype(str) doesn’t work?
col = str(col)
if len(col) == 2:
The difference here is that .astype(str) is a method of a pandas Series, whereas str() is a built-in function.
In the mission you are using the .apply() method to apply your pad_csd() function to each value in the column (Series). Because of this, pad_csd() is not called on the entire column (Series) but on each value in the Series, one at a time, and those values are integers. Integers do not have an .astype() method, but you can use the function str(), which accepts an integer as an argument.
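Here is a minimal sketch of that behaviour. The Series `csd` and its values are made up for illustration (they stand in for the CSD column from the mission), and the zero-padding logic is only an assumption about what the full pad_csd() is meant to do:

```python
import pandas as pd

# Hypothetical stand-in for the CSD column; the real data comes from the mission
csd = pd.Series([1, 2, 10, 32])

# .astype(str) is a Series method, so it works on the whole column at once
as_strings = csd.astype(str)

def pad_csd(value):
    # .apply() calls this once per element, so `value` is a single integer,
    # not a Series. str() accepts an integer; value.astype(str) would fail
    # with the AttributeError shown in the traceback above.
    text = str(value)
    if len(text) == 2:
        return text
    # Assumed padding rule: prefix one-digit values with a zero
    return "0" + text

padded_csd = csd.apply(pad_csd)
print(padded_csd.tolist())
```

So .astype(str) belongs outside the function, on the Series itself, while str() belongs inside the function, where only one value is visible at a time.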
Sorry, can you elaborate on that a bit further? An example would help, if possible.
Or if you can point me to any article on this area, I would much appreciate it.
A more detailed answer does get a little complicated when you are first learning Python. This is because it involves something called object-oriented programming (OOP), which is a more advanced programming paradigm. While you don’t need to understand all the minutiae of OOP to understand the difference between a method and a function, some familiarity is necessary. Dataquest does include some missions on OOP in the Data Scientist path.
Here is a post that breaks down the difference by showing some examples that I think will help.
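As a rough illustration of the same distinction in plain Python, here is a small sketch. The `Column` class and its `astype_str` method are hypothetical names invented for this example; they only imitate the idea of a Series method and are not part of pandas:

```python
# A function is called on its own and receives the value as an argument
seven = str(7)            # str() accepts an int

# A method belongs to a type and is called with dot syntax on an object
padded = "7".zfill(2)     # zfill is a method that str objects have

# Hypothetical toy class, for illustration only
class Column:
    def __init__(self, values):
        self.values = values

    def astype_str(self):
        # Defined on Column, so only Column objects have this method
        return [str(v) for v in self.values]

col = Column([1, 2, 10])
converted = col.astype_str()

# An int was never given this method, so looking it up raises AttributeError
try:
    (1).astype_str()
except AttributeError as err:
    message = str(err)

print(seven, padded, converted)
print(message)
```

The error in the last step has the same shape as the one in the traceback: the method exists on the container type, not on the individual values inside it.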
This might still be a little confusing, but if you are really motivated to understand the difference I would suggest learning more about OOP. This blog post is a little advanced, but is a good overview of OOP in Python.
If you are still pretty new to learning Python, the best course of action would be to remember that there is a difference between a function (such as str()) and a method (such as .astype(str)) and keep learning the basics of Python. After you have learned the basics really well, come back to this concept and learn it in more detail. I know this recommendation is probably unsatisfying to the curious mind, but (in my experience) getting bogged down in the advanced concepts of Python when you are just starting out will slow down your progress and learning. Don’t stop asking these great questions and don’t stop trying to find the answers.
Great, thanks for the direction @bvalgard. Let me check it out.