
Series.str.extract() and Series.value_counts() in 354-7 don't work in Google Colab Jupyter notebook

Screen Link: https://app.dataquest.io/m/354/regular-expression-basics/7/accessing-the-matching-text-with-capture-groups

Your Code:

pattern = r"\[(\w+)\]"

tag_freq = titles.str.extract(pattern).value_counts()

print(tag_freq)

What I expected to happen:
pdf 276
video 111
2015 3
audio 3
slides 2

crash 1
coffee 1
map 1
JavaScript 1
comic 1
Name: title, Length: 52, dtype: int64

What actually happened:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-fdc284e8b56e> in <module>()
      6 tag_freq = pd.Series()
      7 tag_freq = titles.str.extract(pattern)
----> 8 tag_freq.value_counts()
      9 print(tag_freq)

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   5177             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5178                 return self[name]
-> 5179             return object.__getattribute__(self, name)
   5180 
   5181     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'value_counts'

Other details: It works fine in the Dataquest script.py, but somehow it doesn't work in the notebook.
Thanks.


Hey.

This is due to different versions of pandas being used in app.dataquest.io and in Google Colab.

Basically, Series.str.extract used to return a Series by default (hence Series.value_counts being applicable in app.dataquest.io). That default was eventually changed so that Series.str.extract returns a DataFrame, which has no value_counts method in the pandas version Colab is running, hence the error in Google Colab.
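To see the difference, here is a minimal sketch (the sample titles below are made up; the real data comes from the mission's Hacker News dataset):

import pandas as pd

# Hypothetical sample standing in for the mission's titles Series.
titles = pd.Series([
    "Analysing US Economic Data [pdf]",
    "Building a data pipeline [video]",
    "Introduction to pandas [pdf]",
])

pattern = r"\[(\w+)\]"

# With the newer default (expand=True), extract() returns a DataFrame,
# and older pandas DataFrames have no value_counts() method.
print(type(titles.str.extract(pattern)))

# With expand=False, extract() returns a Series, so value_counts() works.
print(titles.str.extract(pattern, expand=False).value_counts())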

Sahil explores this here and here.


Thanks Bruno!
I added expand=False when using extract and it worked.
Appreciate your time and help.
Devang.
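For anyone landing here later, the corrected version of the original code would look roughly like this (assuming titles is the same Series as in the mission):

pattern = r"\[(\w+)\]"
tag_freq = titles.str.extract(pattern, expand=False).value_counts()
print(tag_freq)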


Thank you so much; I was having the same issue in my Jupyter notebook. Problem solved.