I’m currently on the Z-Scores mission of the statistics course, and at some point we are asked to draw a KDE plot to check whether the price of a house in a given sample is reasonable or not.
Dataquest asks us to plot a chart that looks a bit like this:
However, I wanted to highlight the area between the red and the yellow lines, which mark the lower and upper limits of the standard deviation.
I know I can fill the whole area under the KDE curve with plt.fill_between(). I also know I can draw a vertical band with plt.axvspan(). But I can’t find a way to fill the area under the KDE curve while restricting the shading to the part of the plot between the red and yellow lines.
Hey, Sahil. Thanks for the reply. It’s been quite some time since I posted this, and I didn’t expect to get an answer anymore. However, your code did give me some insight into how to filter the arrays passed as the x and y arguments of matplotlib’s fill_between() function.
Finding the data for the arrays was tricky, though, and I had to search for help on Stack Overflow.
Here is how I did it:
Got the coordinates of the lines from the Axes object
Transformed this into a DataFrame
Filtered the DataFrame to find the coordinates I wanted
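The three steps above can be sketched roughly as follows. This is only an illustration, not the original code from the post: the prices are made-up data, the KDE is computed by hand with NumPy instead of a plotting library's KDE function, and the red and yellow lines are assumed to sit one standard deviation below and above the mean.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical stand-in for the house prices used in the mission
rng = np.random.default_rng(0)
prices = rng.normal(loc=180_000, scale=40_000, size=500)

# Hand-rolled Gaussian KDE (assumption: replaces the mission's KDE plot call)
grid = np.linspace(prices.min() - 50_000, prices.max() + 50_000, 400)
bandwidth = 1.06 * prices.std() * len(prices) ** (-1 / 5)  # rule-of-thumb bandwidth
z = (grid[:, None] - prices[None, :]) / bandwidth
density = np.exp(-0.5 * z**2).sum(axis=1) / (len(prices) * bandwidth * np.sqrt(2 * np.pi))

fig, ax = plt.subplots()
ax.plot(grid, density)

# Assumed meaning of the red and yellow lines: mean -/+ one standard deviation
lower = prices.mean() - prices.std()
upper = prices.mean() + prices.std()
ax.axvline(lower, color="red")
ax.axvline(upper, color="yellow")

# Step 1: get the coordinates of the KDE line from the Axes object
x, y = ax.lines[0].get_data()

# Step 2: turn the coordinates into a DataFrame
curve = pd.DataFrame({"x": x, "y": y})

# Step 3: keep only the points between the two vertical lines
between = curve[(curve["x"] >= lower) & (curve["x"] <= upper)]

# Shade the area under the curve for the filtered points only
ax.fill_between(between["x"], between["y"], alpha=0.3)
```

Note that ax.lines[0] is the KDE curve here only because it was the first line drawn on the Axes; the two vertical lines end up at ax.lines[1] and ax.lines[2].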
Finding the coordinates was the most difficult part, and I’d like to use this opportunity to ask if there is a neater way to do it. Also, while I’m happy enough with the plot as it is, I couldn’t get the shaded area to go all the way down to the x-axis. Maybe that has to do with the fact that I’m using a kernel density plot and not a line plot like you did.
Thanks again for the help, and send me a reply if you can come up with a simpler way to get the x and y arrays for the function. (Edit, to be more explicit: I mean the example.lines[0].get_data() call.)
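One possibly simpler route, sketched below under assumptions, is to skip the DataFrame and pass a boolean mask to the where parameter of fill_between(). The curve and the limits here are hypothetical placeholders, and example is assumed to be the Axes object holding the KDE line. The set_ylim(bottom=0) call is a guess at why the shading might stop short of the x-axis: by default the axes leave a small margin below y = 0, so the fill ends at zero while the axis sits slightly lower.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical curve standing in for the KDE line (a standard normal density)
x_all = np.linspace(-4, 4, 401)
fig, example = plt.subplots()
example.plot(x_all, np.exp(-0.5 * x_all**2) / np.sqrt(2 * np.pi))

# Hypothetical limits standing in for the red and yellow lines
lower, upper = -1.0, 1.0

# get_data() returns NumPy arrays, so they can be masked directly
x, y = example.lines[0].get_data()
mask = (x >= lower) & (x <= upper)

# Fill between the curve and y = 0, but only where the mask is True
example.fill_between(x, y, 0, where=mask, alpha=0.3)

# Assumption: pinning the bottom of the y-axis to 0 closes the gap below the shading
example.set_ylim(bottom=0)
```

Slicing the arrays first, as in example.fill_between(x[mask], y[mask]), should give the same picture; where= just keeps the full arrays intact.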