Creating a Pie Chart

Hi. I am trying to create a pie chart where I group those who have been Covid19 vaccine injured according to their region in the US. The first step is to define the function. This is what I did.

## Create a pie chart representing the state where the COVID19 vaccinated lived following their Adverse Reaction

## Define parameters

### define function to create custom intervals
def region(state):
    if state == 'CT' & 'DE' & 'GA' & 'ME' & 'MD'& 'MA' & 'NH' & 'NJ' & 'NY' & 'NC' & 'OH' & 'PA' & 'RI' & 'SC' & 'VA' & 'WV' & 'VT':
        return "EST"
    elif state == 'FL' & 'IN' & 'KY' & 'MI' & 'TN':
        return "Mix of EST & CST"
    elif state == 'AL'& 'AK' & 'IL' & 'IA' & 'LA' & 'MN' & 'MS' & 'MO' & 'OK' & 'WI':
        return "CST"
    elif state == 'TX' & 'KS' & 'NE' & 'ND' & 'SD':
        return "Mix of CST & MST"
    elif state == 'CO' & 'MT' & 'NM' & 'UT' & 'WY' & 'AZ':
        return "MST"
    elif state == 'ID' & 'OR':
        return "Mix of MST & PST"
    elif state == 'CA' & 'WA':
        return "PST"
    else:
        return "All other states & US territories"

### create new column with customized intervals    
covid_vaers_cardio["time_zones"] = covid_vaers_cardio["STATE"].apply(region)

covid_vaers_cardio["time_zones"].value_counts()

Error message: unsupported operand type(s) for &: ‘str’ and ‘str’

It seems that I cannot compare against multiple strings like this inside the function.

Okay, I see a couple of issues here with your function:

 if state == 'CT' & 'DE' & 'GA' & 'ME' & 'MD'& 'MA' & 'NH' & 'NJ' & 'NY' & 'NC' & 'OH' & 'PA' & 'RI' & 'SC' & 'VA' & 'WV' & 'VT'
return "EST"

So first, you cannot combine strings with &. In Python, & binds more tightly than ==, so 'CT' & 'DE' is evaluated before the comparison, and & is not defined for two strings; that is where the error comes from. To combine conditions with & or | like this, each operand needs to be a Boolean. One fix for this would look like this (note, I’m using a different line here):

elif (state == 'ID') & (state=='OR'):

This way state == ‘ID’ evaluates to either True or False and state == ‘OR’ also evaluates to either True or False.
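To see the precedence issue in isolation, here is a small sketch. The variable names `raised`, `message`, and `both` are only for illustration and do not come from the original code.

```python
state = "ID"

# `&` binds more tightly than `==`, so Python tries 'ID' & 'OR' first
# and fails before any comparison with `state` happens.
try:
    state == 'ID' & 'OR'
    raised = False
    message = ""
except TypeError as err:
    raised = True
    message = str(err)

# With parentheses, each comparison produces a bool, and bools support `&`.
both = (state == 'ID') & (state == 'OR')

print(raised, message)
print(both)
```

The parenthesised version runs without an error, although whether & is the right operator here is a separate question.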

Now, I think an easier way to do this might be to put each of the states into lists labelled “PST”, “MST”, “CST” etc. and then use:

if state in EST:

to assign the labels.
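As a rough sketch of that idea, one option is a dictionary that maps each label to a set of states, so a single loop replaces the long if/elif chain. The name `ZONES` is made up for this example, and the groupings are copied as-is from the question, not checked against actual time zone boundaries.

```python
# Hypothetical lookup table; the groupings are copied from the question.
ZONES = {
    "EST": {'CT', 'DE', 'GA', 'ME', 'MD', 'MA', 'NH', 'NJ', 'NY',
            'NC', 'OH', 'PA', 'RI', 'SC', 'VA', 'WV', 'VT'},
    "Mix of EST & CST": {'FL', 'IN', 'KY', 'MI', 'TN'},
    "CST": {'AL', 'AK', 'IL', 'IA', 'LA', 'MN', 'MS', 'MO', 'OK', 'WI'},
    "Mix of CST & MST": {'TX', 'KS', 'NE', 'ND', 'SD'},
    "MST": {'CO', 'MT', 'NM', 'UT', 'WY', 'AZ'},
    "Mix of MST & PST": {'ID', 'OR'},
    "PST": {'CA', 'WA'},
}


def region(state):
    """Return the label whose set contains `state`, or "Others"."""
    for label, states in ZONES.items():
        if state in states:
            return label
    # Anything not listed (other states, territories, missing values)
    return "Others"


print(region('CT'), region('OR'), region('HI'))
```

The function can then be used with `.apply(region)` exactly as in the question, and adding or moving a state only means editing one set.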

Second, I don’t think you want to use the AND operator for this. Looking at one of the branches of your if statement (with the issues from above fixed):

elif (state == 'CA') & (state == 'WA'):
        return "PST"

This will only return “PST” if the state is both CA and WA at the same time, which can never be true for a single value. Instead it should read:

elif (state == 'CA') | (state == 'WA'):
        return "PST"
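For plain Python strings like these, the keywords `and` / `or` (or a membership test with `in`) are the more usual spelling; & and | are mainly needed for element-wise comparisons on pandas or NumPy objects. A minimal sketch, with function names invented for this example:

```python
def is_pst(state):
    # `or` is the usual way to combine conditions on plain Python values.
    return state == 'CA' or state == 'WA'


def is_pst_membership(state):
    # Same result, and easier to extend with more states.
    return state in ('CA', 'WA')


def never_true(state):
    # A single string cannot equal both values, so this is always False.
    return state == 'CA' and state == 'WA'


print(is_pst('CA'), is_pst_membership('WA'), never_true('CA'))
```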

Hope all of this helps!!


The key here is that in Python, the ‘&’ operator works on Booleans rather than strings.

Now that the custom filter works, I’m trying to create a pie chart with the earlier code that you helped me correct. This is what I did afterwards. (Note: I’d still like to keep categories such as ‘Mix of EST & CST’, because not every state falls under a single time zone.)

## Create a pie chart representing the state where the COVID19 vaccinated lived following their Adverse Reaction

## Define parameters

### define function to create custom intervals
def region(state):
    if (state == 'CT') | (state == 'DE') | (state == 'GA') | (state == 'ME') | (state == 'MD') | (state == 'MA') | (state == 'NH') | (state == 'NJ') | (state == 'NY') | (state == 'NC') | (state == 'OH') | (state == 'PA') | (state == 'RI') | (state == 'SC') | (state == 'VA') | (state == 'WV') | (state == 'VT'):
        return "EST"
    elif (state == 'FL') | (state == 'IN') | (state == 'KY') | (state == 'MI') | (state == 'TN'):
        return "Mix of EST & CST"
    elif (state == 'AL') | (state == 'AK') | (state == 'IL') | (state == 'IA') | (state == 'LA') | (state == 'MN') | (state == 'MS') | (state == 'MO') | (state == 'OK') | (state == 'WI'):
        return "CST"
    elif (state == 'TX') | (state == 'KS') | (state == 'NE') | (state == 'ND') | (state == 'SD'):
        return "Mix of CST & MST"
    elif (state == 'CO') | (state == 'MT') | (state == 'NM') | (state == 'UT') | (state == 'WY') | (state == 'AZ'):
        return "MST"
    elif (state == 'ID') | (state == 'OR'):
        return "Mix of MST & PST"
    elif (state == 'CA') | (state == 'WA'):
        return "PST"
    else:
        return "Others"

### create new column with customized intervals    
covid_vaers_cardio["time_zones"] = covid_vaers_cardio["STATE"].apply(region)

covid_vaers_cardio["time_zones"].value_counts()

## Group the custom filter
covid_vaers_cardio_grouped = covid_vaers_cardio.groupby(["time_zones"], dropna = False, as_index = False).agg({"STATE" : np.size})

## time_zones to be used as labels
labels = covid_vaers_cardio["time_zones"].unique()

## create an x to save typing time
x = covid_vaers_cardio_grouped

## Create pie chart
fig, ax = plt.subplots()
ax.pie(x, labels, radius=3, center=(4, 4))

## Title
plt.title('Distribution of Cardiovascular Adverse Events from the COVID19 Vaccinated by Location', loc='right',color='b', fontsize=16)

## Add legend
plt.legend(labels=diet, loc='left')

plt.show() 

ValueError: could not convert string to float: ‘CST’ (error message)

How is this rectified?

Error:

ValueError: could not convert string to float: ‘CST’ (error message)

Bug:

ax.pie(x, labels, radius=3, center=(4, 4))

To resolve:

Reading from help(ax.pie)

Help on method pie in module matplotlib.axes._axes:

pie(x, explode=None, labels=None, colors=None, autopct=None, pctdistance=0.6, shadow=False, labeldistance=1.1, startangle=0, radius=1, counterclock=True, wedgeprops=None, textprops=None, center=(0, 0), frame=False, rotatelabels=False, *, normalize=True, data=None) method of matplotlib.axes._subplots.AxesSubplot instance
    Plot a pie chart.
    
    Make a pie chart of array *x*.  The fractional area of each wedge is
    given by ``x/sum(x)``.
    
    The wedges are plotted counterclockwise, by default starting from the
    x-axis.
    
    Parameters
    ----------
    x : 1D array-like
        The wedge sizes.

x needs to be a 1D array-like. (This x is the parameter named in help(ax.pie), not the variable x = covid_vaers_cardio_grouped in your code.)

From here on, x means your variable, x = covid_vaers_cardio_grouped. That x is a <class 'pandas.core.frame.DataFrame'>, not a 1D array; you can verify this with type(x). The DataFrame also contains the string column time_zones, which is why matplotlib complains that it cannot convert ‘CST’ to a float.

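To fix the bug, pass the column of counts instead of the whole DataFrame, and pass the labels by keyword (the second positional parameter of pie is explode, not labels). Taking the labels from the grouped DataFrame keeps them in the same order as the counts:

ax.pie(x['STATE'], labels=x['time_zones'], radius=3, center=(4, 4))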

Or follow the similar code below:

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
timezones = ['CST', 'CST', 'CST', 'EST', 'PST']
states = ['AL', 'AK', 'AK', 'CT', 'CA']
df = pd.DataFrame.from_dict({'states': states, 'timezones': timezones})
x = df.groupby('timezones', dropna=False, as_index=False).agg({'states': np.size})
labels = x['timezones']
# Inspecting the grouped result in the interpreter gives:
# >>> type(x)
# <class 'pandas.core.frame.DataFrame'>
#
# >>> x
#   timezones  states
# 0       CST       3
# 1       EST       1
# 2       PST       1
#
# >>> x['states']
# 0    3
# 1    1
# 2    1
fig, ax = plt.subplots()
ax.pie(x['states'].values, labels=labels, radius=3, center=(4,4))
plt.show()
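
A slightly shorter route, sketched here on made-up data, is to use value_counts(), since its index (the labels) and its values (the wedge sizes) come back in matching order. The sample Series below is hypothetical and only stands in for covid_vaers_cardio["time_zones"]. The sketch also adds a legend: in the question's code, plt.legend(labels=diet, loc='left') would likely fail next, because diet is never defined and 'left' is not one of the location strings that legend accepts ('center left' is).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for covid_vaers_cardio["time_zones"]
time_zones = pd.Series(
    ["CST", "EST", "CST", "PST", "CST", "Others"], name="time_zones"
)

# value_counts() returns counts indexed by label, so both stay aligned.
counts = time_zones.value_counts()
labels = counts.index.tolist()
sizes = counts.to_numpy()

fig, ax = plt.subplots()
ax.pie(sizes, labels=labels)  # labels passed by keyword, sizes are 1D
ax.set_title("Records by time zone (made-up data)")
# Each wedge carries its label, so legend() can pick them up on its own.
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
```

In an interactive session, plt.show() at the end displays the figure; with the Agg backend used here, fig.savefig(...) would be the way to get an image.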

Ok sweet. Thanks. I can look up what to do to modify the pie chart