I downloaded the Hacker News dataset to practice in Jupyter Notebook, and on the 7th page,
when you try to get a frequency count for the tags, it doesn’t work in Jupyter (it says that DataFrame has no attribute value_counts()), but it works fine on the website.
You have to use value_counts() on a Series object!
In the screen you’re talking about, we see that the step_1 object is a Series. It was derived by isolating the title column in the original hn file you read in at the start of that mission.
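To make the Series vs. DataFrame distinction concrete, here is a minimal sketch. The tiny `hn` frame and its column values are made up for illustration and only stand in for the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the hn dataset read in at the start of the mission.
hn = pd.DataFrame({
    "title": ["Ask HN: A question", "Show HN: A project", "Ask HN: Another question"],
    "num_points": [1, 2, 3],
})

step_1 = hn["title"]       # single brackets select one column as a Series
as_frame = hn[["title"]]   # double brackets keep a one-column DataFrame

print(type(step_1).__name__)    # Series
print(type(as_frame).__name__)  # DataFrame

# value_counts() is called on a Series here: count the text before the colon.
prefixes = step_1.str.split(":").str[0]
print(prefixes.value_counts())
```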
My current understanding is that str.extract always returns a DataFrame (even if only 1 capturing group is used, which produces only 1 column and so would make sense to store as a Series) unless expand=False is passed.
Whenever I print the type of the variable it gives me Series, but as soon as I include the str.extract command it turns everything into a DataFrame.
I have never seen this before; maybe you were printing the wrong variable? In a method chain A.B.C, print A.B if you want to test the effect of C, rather than creating your own A.B in another way.
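A rough sketch of that debugging approach is below. The sample titles and the regex are assumptions, not the mission's actual data or pattern. The idea is to check the type of each intermediate step before calling the last method in the chain.

```python
import pandas as pd

# Hypothetical sample titles; the real notebook would use hn["title"].
titles = pd.Series(["Guide to X [pdf]", "Conference talk [video]", "Plain title", "Paper [pdf]"])
pattern = r"\[(\w+)\]$"  # assumed pattern: a tag in square brackets at the end

step_a = titles                       # A
step_ab = step_a.str.extract(pattern) # A.B, default expand depends on the pandas version

# Inspect the intermediate result instead of guessing what it is.
print(type(step_a).__name__)
print(type(step_ab).__name__)

# If A.B came back as a one-column DataFrame, take its first column as a Series.
if isinstance(step_ab, pd.DataFrame):
    step_ab = step_ab.iloc[:, 0]

counts = step_ab.value_counts()       # A.B.C, now always called on a Series
print(counts)
```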
At this juncture I’d be forced to conclude that there was a problem much earlier with your previous steps, perhaps as early as when you first read in the data.
It would help more if you uploaded the .ipynb file, perhaps to a GitHub repo just for sharing purposes. At this point I’m really curious as to what is going on too!! And are you also positive that the file is being read in correctly? When you read the csv file in via pd.read_csv() and then preview it in the Jupyter notebook, does it look the same as the dataframe you preview on the DQ site?
Our platform is using pandas version 0.22.0. In this version, the expand parameter of str.extract() is set to False by default, and when expand is False, it returns a Series if there is only 1 capture group. Newer pandas versions default to expand=True, so the same code run locally can return a DataFrame instead.
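Here is a small sketch of the difference, assuming a made-up set of titles and a made-up tag pattern. Passing expand explicitly makes the result the same regardless of which pandas version is installed.

```python
import pandas as pd

# Check which pandas version the local notebook is running.
print(pd.__version__)

# Hypothetical sample titles and pattern, for illustration only.
titles = pd.Series(["Guide to X [pdf]", "Conference talk [video]", "Plain title", "Paper [pdf]"])
pattern = r"\[(\w+)\]$"

as_frame = titles.str.extract(pattern, expand=True)    # one column per capture group
as_series = titles.str.extract(pattern, expand=False)  # single group -> Series

print(type(as_frame).__name__)   # DataFrame
print(type(as_series).__name__)  # Series

# value_counts() on the Series gives the tag frequencies (NaN rows are dropped by default).
tag_freq = as_series.value_counts()
print(tag_freq)
```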
I just want to thank everyone who found the time and tried to help me) Finally, I was able to proceed further.
The issue was that I didn’t set the parameter expand=False; when I did, it resolved the issue)