Axis in Matplotlib

Hi DQ-Community,

I’m working on a small project to boost my data visualization skills with Python, and now I’m stuck.
I hope somebody can answer my question:

I want to visualize the number of e-mails during a specific period, and I decided to use Matplotlib for this project. On the x-axis I put the calendar week, and on the y-axis I want to show the number of e-mails that have not been read yet. I have a column with “read Mails” and “Not-read Mails”.
Now my question is:

Is it possible to count the number of “Not-Read Mails” with Pandas and assign that value to the y-axis? How can I do this in Pandas?

Thanks for your help.

Simo

@Simo: Do you mind uploading your dataset and a screenshot of what your data looks like in Excel? Thanks

Assumption: your data is a time series (a Pandas Series), your index holds the dates in a proper datetime format, and the values are the number of unread e-mails. Then:

# One slot per weekday (0 = Monday ... 6 = Sunday)
ar = np.zeros(shape=(7,), dtype=np.float32)
for i in range(len(data)):
    d = data.index[i].weekday()  # weekday of the i-th date
    ar[d] += data.iloc[i]        # add that day's count to its weekday slot
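
The same per-weekday totals can also be sketched without an explicit loop. This is only an illustration under the assumption stated above (a Series with a datetime index); the sample values and dates are made up.

```python
import pandas as pd

# Hypothetical example data: one value per day, indexed by date.
# 2021-01-04 is a Monday, so the 8 days cover Monday to the next Monday.
data = pd.Series(
    [3, 1, 4, 1, 5, 9, 2, 6],
    index=pd.date_range("2021-01-04", periods=8, freq="D"),
)

# Sum the values per weekday (0 = Monday ... 6 = Sunday).
per_weekday = data.groupby(data.index.weekday).sum()

# Make sure all seven weekdays are present, even those without data.
per_weekday = per_weekday.reindex(range(7), fill_value=0)
print(per_weekday)
```

Here Monday ends up with 3 + 6 = 9, because two Mondays fall into the sample range.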

Hi @masterryan.prof @WilfriedF, I uploaded the anonymized data, because I see that I’m not even able to create a basic time series :confused: The structure is the same.

I wanted to put the total count of mails with the label “New” in the column State_Now on the y-axis and the calendar week on the x-axis, but it seems that this is not working with my solution either.

Here is my code:

import pandas as pd
from pandas import DataFrame, Series
from pathlib import Path
import matplotlib.pyplot as plt
pd.set_option("display.max_rows", None, "display.max_columns", None)

input_file = Path.cwd() / 'data' / 'raw' / 'Pandas_Example.xlsx'
df = pd.read_excel(input_file)
df['Week_of_Year'] = pd.to_datetime(df['Week_of_Year'])

#plt.plot()
#plt.show()
#first_twelve = df[0:12]
plt.plot(df['Week_of_Year'], df['State_Now'] =="New")
plt.show()

As you can see from the commented-out lines, I tried to adapt what I learned at DQ to my own dataset, but it doesn’t work.

What do I have to do to create a nice time series over the weeks?

Thanks as always, guys
Simo

Example_Pandas.xlsx (193.0 KB)

df = df.set_index("Send_Time")
df.index = pd.to_datetime(pd.to_datetime(df.index).date)

Does it work?

Hi @WilfriedF,

I have to use the last column, Week_of_Year. Your solution refers to the send time of each individual mail, right?
I’m not sure whether pd.to_datetime was necessary in my case; I only did it because I had seen it in the DQ course.

Sorry, I first thought you wanted the weekday on the x-axis, but if you want the week number of the year, it’s indeed even easier:

data = df.set_index('Week_of_Year')
ar = np.zeros(shape=(52,),dtype=np.float32)
for i in range(len(data)):        
     d=data.index[i]
     if data["State_Now"].iloc[i] == "New":
         ar[d]+=1

What does the np. in np.zeros and np.float32 refer to?
Do I have to import something?
It raises an error here: NameError: name 'np' is not defined

Sorry, this is indeed numpy:

import numpy as np

Then you plot the ar array.

Sure, you can do it without creating a numpy array, but this is how I usually proceed.

The float32 dtype is not necessary here since we are dealing with integers, but it doesn’t matter.
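
For comparison, a version without a numpy array could look roughly like the sketch below. It only assumes the two column names mentioned in the thread (Week_of_Year and State_Now); the small DataFrame and the "Read" label are made-up stand-ins for the Excel file.

```python
import pandas as pd

# Hypothetical stand-in for the Excel file; only the two columns
# mentioned in the thread are used, and "Read" is an invented label.
df = pd.DataFrame({
    "Week_of_Year": [1, 1, 2, 2, 2, 53],
    "State_Now": ["New", "Read", "New", "New", "Read", "New"],
})

# Keep only the "New" rows, then count how many fall into each week.
new_per_week = (
    df.loc[df["State_Now"] == "New"]
      .groupby("Week_of_Year")
      .size()
      .reindex(range(1, 54), fill_value=0)  # weeks 1..53, zero where no mail
)
print(new_per_week.loc[[1, 2, 53]])
```

Because the result is indexed by week number, calling new_per_week.plot() should put the calendar week on the x-axis and the count on the y-axis directly.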

Thanks, I imported numpy :slight_smile: But the result doesn’t make sense.

[Screenshot of the resulting plot]

:confused:

Wait, looking at your Excel file, I see you will have a problem, since there is a week number 53.
So you need to adapt the code:

d=data.index[i]-1
if data["State_Now"].iloc[i] == "New":
   ar[d]+=1

When you plot:
x = range(0,52)
y = ar

plt.plot(x,y)

Is it still not making any sense?

Your array should have values > 1.
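
A side note on the week 53 issue: ISO calendar weeks run from 1 to 52 or 53 depending on the year, so an array with 52 slots can be one short. The sketch below uses made-up week numbers and np.bincount, which is a swapped-in alternative to the loop above, not the code from this thread.

```python
import datetime as dt
import numpy as np

# Some years have 53 ISO weeks; 31 December 2020 falls into week 53.
last_week_2020 = dt.date(2020, 12, 31).isocalendar()[1]
print(last_week_2020)

# Hypothetical week numbers of the "New" mails.
weeks = np.array([1, 1, 2, 52, 53])

# bincount counts how often each integer occurs; minlength=54 gives
# slots 0..53, so the week number can be used directly as the index.
counts = np.bincount(weeks, minlength=54)
print(counts[1], counts[52], counts[53])
```

Slot 0 simply stays empty here, which avoids the "- 1" shift at the cost of one unused entry.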

I did that, but the chart is still confusing.
Do I have to do something manually to the axis?
I expected a line going from left to right over the calendar weeks, showing the trend in the number of new mails. I thought that my raw data was good enough for this requirement. In the tutorials it looks so easy :slight_smile:

EDIT:

Let me check with the plots.

[Screenshot]

Check it: there are very few “New” values.

Only 7.

Can you please post the whole code? It seems that I got the order of the individual statements we discussed wrong. My result still looks like the screenshot, very confusing.

If you want the accumulation of “New” values over time, that is another story: you will need to use cumsum, for example.

data = df.set_index('Week_of_Year')
ar = np.zeros(shape=(52,),dtype=np.float32)
for i in range(len(data)):        
    d=data.index[i]-1
    if data["State_Now"].iloc[i] == "New":
        ar[d]+=1
plt.plot(ar)

Here it is with cumsum:

data = df.set_index('Week_of_Year')
ar = np.zeros(shape=(52,),dtype=np.float32)
for i in range(len(data)):        
    d=data.index[i]-1
    if data["State_Now"].iloc[i] == "New":
        ar[d]+=1
plt.plot(np.cumsum(ar))
plt.xlabel("week")

[Screenshot of the plot with cumsum]
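
To make the difference between the two plots concrete, here is a tiny sketch of what np.cumsum does to per-week counts. The numbers are invented for illustration.

```python
import numpy as np

# Hypothetical per-week counts of "New" mails.
ar = np.array([0, 2, 0, 1, 3])

# cumsum replaces each entry with the sum of all entries up to and
# including it, i.e. a running total.
running_total = np.cumsum(ar)
print(running_total)  # [0 2 2 3 6]
```

Plotting ar shows how many new mails arrived in each week, while plotting running_total shows a line that never goes down and ends at the overall total.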

Yeah, that’s great. Let me try it with cumsum on my side.
So the whole code should be:

import pandas as pd
import numpy as np
from pandas import DataFrame, Series
from pathlib import Path
import matplotlib.pyplot as plt
pd.set_option("display.max_rows", None, "display.max_columns", None)

input_file = Path.cwd() / 'data' / 'raw' / 'Pandas_Example.xlsx'
df = pd.read_excel(input_file)
data = df.set_index('Week_of_Year')
ar = np.zeros(shape=(52,),dtype=np.float32)
for i in range(len(data)):        
    d=data.index[i]-1
    if data["State_Now"].iloc[i] == "New":
        ar[d]+=1
plt.plot(np.cumsum(ar))
plt.xlabel("week")

Where exactly do I have to put these plotting lines in my code?

x = range(0,52)
y = ar

plt.plot(x,y)

The code you posted above is fine as it is. I checked it, so there is no need to set x = range(0,52) or y; Matplotlib will automatically use the array positions as x values when you call plt.plot(np.cumsum(ar)) or plt.plot(ar).
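
A small sketch of that default behaviour, with made-up counts: when only y values are passed, Matplotlib fills in x = 0, 1, 2, ... on its own.

```python
import matplotlib.pyplot as plt
import numpy as np

ar = np.array([0, 2, 0, 1, 3])  # hypothetical counts

fig, ax = plt.subplots()
(line,) = ax.plot(ar)  # only y values are given

# Matplotlib used the positions 0, 1, 2, ... as x values.
x_used = line.get_xdata()
print(x_used)
```

With d = week - 1 as in the loop above, position 0 on the x-axis therefore corresponds to week 1. If the axis should show the real week numbers, one option is to pass them explicitly, for example ax.plot(range(1, len(ar) + 1), ar).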

@WilfriedF, thank you so much for your help, it works now.
Now I can dig deeper and deeper into the analysis options :).

Thanks again, mate!

Simo

EDIT:
One more question came to my mind. Can I somehow change the tick labels on the x-axis? I would like to see every single calendar week there (not only every tenth week, as it is now).
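
One possible way to get a tick for every calendar week is to set the tick positions explicitly. This is a sketch with placeholder data, not a tested answer for the real dataset; the figure size and label size are arbitrary choices to keep 52 labels readable.

```python
import matplotlib.pyplot as plt
import numpy as np

ar = np.zeros(52)                    # placeholder counts, one slot per week
weeks = np.arange(1, len(ar) + 1)    # week numbers 1..52

fig, ax = plt.subplots(figsize=(14, 4))  # wide figure so the labels fit
ax.plot(weeks, np.cumsum(ar))
ax.set_xticks(weeks)                     # one tick per calendar week
ax.tick_params(axis="x", labelsize=7, labelrotation=90)
ax.set_xlabel("week")
```

In a script, plt.show() would follow as the last line. With the pyplot-style code from earlier in the thread, plt.xticks(weeks) should have the same effect as ax.set_xticks(weeks).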
