Creating multiple filters for a criterion

Hi, I am trying to filter on multiple columns with multiple criteria. I am looking for reports from people who suffered any cardiovascular-related issue, so I put together keywords that relate to cardiovascular conditions. A keyword doesn't have to match the description or value in a cell verbatim; the cell just needs to contain that word or those letters. This is the code I used:

filter_list = ['myo','peri', 'aortic', 'heart', 'artery', 'veins', 'strokes', 'heartbeat', 'electrocardiogram', 'coronary', 'arrhythmias', 'fibrillation', 'tachycardia', 'bradycardia', 'thrombosis', 'endocardiogram', 'blood pressure', 'palpitations', 'tricuspid', 'atherosclerosis', 'stroke', 'white blood cell', 'fibrin']
covid_vaers_cardio = covid_vaers[[covid_vaers.SYMPTOM1.isin(filter_list)], [covid_vaers.SYMPTOM2.isin(filter_list)], [covid_vaers.SYMPTOM3.isin(filter_list)], [covid_vaers.SYMPTOM4.isin(filter_list)], [covid_vaers.SYMPTOM5.isin(filter_list)], [covid_vaers.SYMPTOM_TEXT.isin(filter_list)]]

This is my error message:

TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_8156/1443622594.py in <module>
      1 filter_list = ['myo','peri', 'aortic', 'heart', 'artery', 'veins', 'strokes', 'heartbeat', 'electrocardiogram', 'coronary', 'arrhythmias', 'fibrillation', 'tachycardia', 'bradycardia', 'thrombosis', 'endocardiogram', 'blood pressure', 'palpitations', 'tricuspid', 'atherosclerosis', 'stroke', 'white blood cell', 'fibrin']
----> 2 covid_vaers_cardio = covid_vaers[[covid_vaers.SYMPTOM1.isin(filter_list)], [covid_vaers.SYMPTOM2.isin(filter_list)], [covid_vaers.SYMPTOM3.isin(filter_list)], [covid_vaers.SYMPTOM4.isin(filter_list)], [covid_vaers.SYMPTOM5.isin(filter_list)], [covid_vaers.SYMPTOM_TEXT.isin(filter_list)]]

~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3456             if self.columns.nlevels > 1:
   3457                 return self._getitem_multilevel(key)
-> 3458             indexer = self.columns.get_loc(key)
   3459             if is_integer(indexer):
   3460                 indexer = [indexer]

~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3359             casted_key = self._maybe_cast_indexer(key)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
   3363                 raise KeyError(key) from err

~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '([0          False
1          False
2          False
3          False
4          False
           ...  
1220892    False
1220893    False
1220894    False
1220895    False
1220896    False
Name: SYMPTOM1, Length: 1220897, dtype: bool], [0          False
1          False
2          False
3          False
4          False
           ...  
1220892    False
1220893    False
1220894    False
1220895    False
1220896    False
Name: SYMPTOM2, Length: 1220897, dtype: bool], [0          False
1          False
2          False
3          False
4          False
           ...  
1220892    False
1220893    False
1220894    False
1220895    False
1220896    False
Name: SYMPTOM3, Length: 1220897, dtype: bool], [0          False
1          False
2          False
3          False
4          False
           ...  
1220892    False
1220893    False
1220894    False
1220895    False
1220896    False
Name: SYMPTOM4, Length: 1220897, dtype: bool], [0          False
1          False
2          False
3          False
4          False
           ...  
1220892    False
1220893    False
1220894    False
1220895    False
1220896    False
Name: SYMPTOM5, Length: 1220897, dtype: bool], [0          False
1          False
2          False
3          False
4          False
           ...  
1220892    False
1220893    False
1220894    False
1220895    False
1220896    False
Name: SYMPTOM_TEXT, Length: 1220897, dtype: bool])' is an invalid key

What do you suggest I do?

Hello @MfonobongAmana
Welcome to the DataQuest Community! :wave:

I have formatted your code so that it is easier for us to read and spot the error.

You have run into a TypeError in pandas. Separating the bracketed masks with commas builds a tuple of lists, and pandas tries to look that tuple up as a single column label, which is why it reports an invalid key.

Since you are trying to apply multiple filters, you need logical operators to combine the expressions.
You can use:

  • &: for and
  • |: for or
  • ~: for not

For example, if I need SYMPTOM1, SYMPTOM2, SYMPTOM3, SYMPTOM4, SYMPTOM5 and SYMPTOM_TEXT to all hold only items from filter_list, I combine the filters with & (use | instead if a match in at least one of the columns is enough).

covid_vaers_cardio = covid_vaers[covid_vaers.SYMPTOM1.isin(filter_list) & covid_vaers.SYMPTOM2.isin(filter_list) & covid_vaers.SYMPTOM3.isin(filter_list) & covid_vaers.SYMPTOM4.isin(filter_list) & covid_vaers.SYMPTOM5.isin(filter_list) & covid_vaers.SYMPTOM_TEXT.isin(filter_list)]
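
To see how the three operators behave, here is a minimal sketch on a made-up four-row DataFrame. The data, the shortened filter_list and the names in_1, in_2, both, either and neither are assumptions for illustration only, not the real covid_vaers data.

```python
import pandas as pd

# Hypothetical toy data standing in for covid_vaers (not the real VAERS file).
df = pd.DataFrame({
    "SYMPTOM1": ["stroke", "fever", "stroke", "cough"],
    "SYMPTOM2": ["fibrin", "stroke", "headache", "rash"],
})
filter_list = ["stroke", "fibrin"]

# Each isin call returns one boolean Series (a mask) aligned with df's rows.
in_1 = df["SYMPTOM1"].isin(filter_list)
in_2 = df["SYMPTOM2"].isin(filter_list)

both = df[in_1 & in_2]        # rows where every column matches
either = df[in_1 | in_2]      # rows where at least one column matches
neither = df[~(in_1 | in_2)]  # rows where no column matches

print(both.index.tolist(), either.index.tolist(), neither.index.tolist())
```

Note that & is strict: a row is kept only when all of the combined masks are True, so the more columns you chain with &, the fewer rows survive.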

Additionally, you can learn more about filtering data frames here: :point_down:

I noticed the changes you made and ran the code. This is the error it gave me:

File "C:\Users\18608\AppData\Local\Temp/ipykernel_8156/1108663572.py", line 2
covid_vaers_cardio = covid_vaers[covid_vaers.SYMPTOM1.isin(filter_list) & [covid_vaers.SYMPTOM2.isin(filter_list) & [covid_vaers.SYMPTOM3.isin(filter_list) & covid_vaers.SYMPTOM4.isin(filter_list) & covid_vaers.SYMPTOM5.isin(filter_list) & covid_vaers.SYMPTOM_TEXT.isin(filter_list)]
^
SyntaxError: unexpected EOF while parsing

Update: I corrected the code

filter_list = ['myo', 'peri', 'aortic', 'heart rate', 'artery', 'vein', 'stroke', 'heartbeat', 'electrocardiogram', 'coronary', 'arrhythmias', 'fibrillation', 'tachycardia', 'bradycardia', 'thrombosis', 'endocardiogram', 'blood pressure', 'palpitation', 'tricuspid', 'atherosclerosis', 'stroke', 'white blood cell', 'fibrin']

covid_vaers_cardio = covid_vaers[covid_vaers.SYMPTOM1.isin(filter_list) & covid_vaers.SYMPTOM2.isin(filter_list) & covid_vaers.SYMPTOM3.isin(filter_list) & covid_vaers.SYMPTOM4.isin(filter_list) & covid_vaers.SYMPTOM5.isin(filter_list) & covid_vaers.SYMPTOM_TEXT.isin(filter_list)]

It gave me no errors, but when I try to view the first 10 rows, it shows me nothing.
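
A likely reason for the empty result: isin only matches when the whole cell value equals one of the list items, and & keeps a row only if every one of the six columns matches. Since the goal stated at the top of the thread is "the cell contains the keyword", a substring search combined with "or" is probably closer to what is wanted. Below is a minimal sketch under those assumptions, using Series.str.contains on a made-up DataFrame. The rows, the shortened keywords list and the names pattern, mask and cardio are hypothetical.

```python
import re
import pandas as pd

# Hypothetical toy rows standing in for covid_vaers (not the real data).
df = pd.DataFrame({
    "SYMPTOM1": ["Myocarditis", "Pyrexia", "Headache", None],
    "SYMPTOM2": ["Chest pain", "Tachycardia", "Nausea", "Fatigue"],
    "SYMPTOM_TEXT": ["chest pain after dose", "fast heart rate",
                     "mild headache", "tired, blood pressure high"],
})
keywords = ["myo", "tachycardia", "blood pressure"]
columns = ["SYMPTOM1", "SYMPTOM2", "SYMPTOM_TEXT"]

# isin compares whole values, so a partial keyword never matches.
exact = df["SYMPTOM1"].isin(keywords)

# Build one regular expression "myo|tachycardia|blood\ pressure";
# re.escape guards against keywords that contain regex metacharacters.
pattern = "|".join(re.escape(k) for k in keywords)

# Start with all False, then OR in a substring match for each column.
mask = pd.Series(False, index=df.index)
for col in columns:
    # case=False ignores capitalisation; na=False treats missing cells as no match.
    mask |= df[col].str.contains(pattern, case=False, na=False)

cardio = df[mask]
print(cardio)
```

One caveat with this approach: very short keywords such as 'peri' will also match unrelated words that happen to contain them (for example "period"), so it is worth checking a sample of the matched rows by eye.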