Coloring area between lines with matplotlib

Hello, everyone,

I’m currently on the Z-Scores mission of the statistics course, and at some point we are asked to draw a KDE plot to check whether the price of a house in a given sample is reasonable or not.

Dataquest asks us to plot a chart that looks a bit like this:

However, I wanted to highlight the area between the red and the yellow lines, which mark the lower and upper limits of one standard deviation around the mean.

I know I can fill the whole area under the KDE line with plt.fill_between(). I also know I can shade a vertical band with plt.axvspan(). But I can’t find a way to fill the area under the KDE line while restricting the shading to the part of the plot between the red and yellow lines.

The furthest I got is the following plot:

Does anyone have any idea how to highlight the area under the curve only between those two lines?

Thanks in advance.

Hi @celioxf,

I tried to create a sample plot of what you described and here is how I made it:

import math

import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

# Standard normal curve over mu +/- 3 sigma
mu = 0
variance = 1
sigma = math.sqrt(variance)
x = np.linspace(mu - 3*sigma, mu + 3*sigma, 100)
y = stats.norm.pdf(x, mu, sigma)
plt.plot(x, y)

# Vertical lines marking the two limits
plt.axvline(-1, color='red')
plt.axvline(1, color='green')

# Shade under the curve only where -1 <= x <= 1
mask = (x >= -1) & (x <= 1)
plt.fill_between(x[mask], 0, y[mask], color='grey', hatch='//', alpha=0.2)

# Start the y-axis at 0 so the shading reaches the x-axis
plt.ylim([0, y.max() * 1.2])
plt.show()
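
As a possible alternative, here is a minimal sketch on the same normal curve that uses the where= argument of fill_between() instead of slicing the arrays by hand. The variable names (inside, shaded) are just for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

mu, sigma = 0, 1
x = np.linspace(mu - 3 * sigma, mu + 3 * sigma, 100)
y = stats.norm.pdf(x, mu, sigma)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.axvline(-1, color='red')
ax.axvline(1, color='green')

# where= takes a boolean array the same length as x;
# only the stretches where it is True get filled
inside = (x >= -1) & (x <= 1)
shaded = ax.fill_between(x, 0, y, where=inside, color='grey', alpha=0.2)

# Start the y-axis at 0 so the shading reaches the x-axis
ax.set_ylim(0, y.max() * 1.2)
```

Call plt.show() afterwards to display it. The edges of the shaded region land on the sample points nearest to the limits, so a denser x grid brings them closer to the vertical lines.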

I hope this helps you figure it out. 🙂

Best,
Sahil

Hey, Sahil. Thanks for the reply. It’s been quite some time since I posted this, and I didn’t expect to get an answer anymore. However, your code did give me some insight into how to filter the arrays passed as the x and y arguments of matplotlib’s fill_between() function.

Finding the data for the arrays was tricky, though, and I had to search for help on Stack Overflow.

Here is how I did it:

  1. Got the coordinates of the lines from the Axes object
  2. Transformed this into a DataFrame
  3. Filtered the DataFrame to find the coordinates I wanted
  4. Passed the arrays to the fill_between function.
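
Roughly, those steps can be sketched like this. It is a reconstruction with made-up data, not my exact code: prices, lower and upper are placeholder names, and the curve is drawn with scipy's gaussian_kde as a stand-in for the actual KDE plotting call.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# Hypothetical stand-in for the course's house prices
rng = np.random.default_rng(0)
prices = rng.normal(180_000, 40_000, 500)
lower = prices.mean() - prices.std()
upper = prices.mean() + prices.std()

# Draw a KDE curve first, so that it is the first line on the Axes
grid = np.linspace(prices.min(), prices.max(), 300)
fig, ax = plt.subplots()
ax.plot(grid, gaussian_kde(prices)(grid))
ax.axvline(lower, color='red')
ax.axvline(upper, color='yellow')

# 1. Coordinates of the curve, read back from the Axes object
xs, ys = ax.lines[0].get_data()
# 2. Into a DataFrame
curve = pd.DataFrame({'x': xs, 'y': ys})
# 3. Keep only the points between the two limits
between = curve[curve['x'].between(lower, upper)]
# 4. Pass the filtered arrays to fill_between (it fills down to y = 0)
ax.fill_between(between['x'], between['y'], alpha=0.3)
```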

In the end, it looked like this:

Finding the coordinates was the most difficult part, and I’d like to use this opportunity to ask if there is a neater way to do it. Also, while I’m happy enough with the plot as it is, I couldn’t get the shaded area to go all the way down to the x-axis. Maybe it has to do with the fact that I’m using a kernel density plot and not a line plot like you did.

Thanks again for the help, and send me a reply if you can come up with a simpler way to get the x and y arrays for the function. [Edit, to make it more explicit: I mean the example.lines[0].get_data() call.]

Hi @celioxf, you can fix this by setting the lower y-axis limit to 0, as I did in my plot with plt.ylim().
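
Here is a minimal sketch of that fix on a KDE curve, with made-up data standing in for your prices (prices, lower and upper are placeholder names). It also evaluates the density directly with scipy's gaussian_kde instead of reading the coordinates back from the Axes, which may be a simpler way to get the x and y arrays. I am assuming the result is close to what your KDE plot draws, but the bandwidth settings may differ.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical sample standing in for the house prices
rng = np.random.default_rng(1)
prices = rng.normal(180_000, 40_000, 500)
lower = prices.mean() - prices.std()
upper = prices.mean() + prices.std()

# Evaluate the density on the full range and on the band of interest
kde = gaussian_kde(prices)
full_x = np.linspace(prices.min(), prices.max(), 300)
band_x = np.linspace(lower, upper, 100)

fig, ax = plt.subplots()
ax.plot(full_x, kde(full_x))
ax.axvline(lower, color='red')
ax.axvline(upper, color='yellow')
ax.fill_between(band_x, kde(band_x), alpha=0.3)

# Pin the bottom of the y-axis to 0 so the shading reaches the x-axis
ax.set_ylim(bottom=0)
```

Because band_x starts and ends exactly at the two limits, the shaded region lines up with the vertical lines without any filtering.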

Best,
Sahil