I’d like to plot a graph where all values below the average are red and all values above the average are green.
I’ve tried slicing the dataset into an above-average part and a below-average part.
But naturally it then plots horizontal lines wherever there are no values (because they were sliced out of the dataset) until values appear again.
Does anyone have an idea how to solve this in a way that a beginner like me can understand?
I’ve never tried anything like this, but maybe Line Collections could be useful here. Make sure the documentation version matches the Matplotlib version used in the Classroom, just in case. Additional reference.
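To make the Line Collections idea a bit more concrete, here is a minimal sketch. The data is made up (it stands in for a column such as dft['SalePrice']), and coloring each segment by its midpoint is just one possible rule, not something taken from the linked docs. Note that a segment crossing the mean still gets a single color, so this alone does not fix the crossing problem.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# Hypothetical stand-in data; swap in your own values,
# e.g. dft['SalePrice'].to_numpy()
y = np.array([3.0, 8.0, 5.0, 1.0, 9.0, 4.0])
x = np.arange(len(y))
mean = y.mean()

# One (start point, end point) pair per line segment: shape (n - 1, 2, 2)
points = np.column_stack([x, y])
segments = np.stack([points[:-1], points[1:]], axis=1)

# Color each segment by whether its midpoint lies above the mean
midpoints = (y[:-1] + y[1:]) / 2
colors = np.where(midpoints > mean, "green", "red")

fig, ax = plt.subplots(figsize=(8, 4))
ax.add_collection(LineCollection(segments, colors=colors.tolist()))
ax.axhline(mean, color="black", linewidth=1)
ax.autoscale_view()  # collections do not always rescale the axes on their own
fig.savefig("line_collection.png")  # or plt.show() in a notebook
```

Each segment is drawn separately, so there are no gaps to bridge, but the color changes at data points rather than exactly at the mean line.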
Or, maybe something like this. There could be easier ways, perhaps, but this one seems convenient.
The bottom solution is the easiest one, but it has a huge drawback: a connection between two points on either side of the hline is drawn in the wrong color or not drawn at all:
import matplotlib.pyplot as plt
# dft is the DataFrame from this thread; it needs a 'SalePrice' column
price = dft['SalePrice']
mean = price.mean()
fig, ax = plt.subplots(figsize=(16, 8))
# this line is optional - it draws the whole line underneath, so the parts that cross the hline stay visible:
ax.plot(price, color="grey")
# Series.where keeps the matching values and replaces the rest with NaN, which matplotlib skips
ax.plot(price.where(price > mean), color="green", label="above mean")
ax.plot(price.where(price <= mean), color="red", label="below mean")
ax.axhline(mean, color="black")
ax.legend()
plt.show()
Maybe just draw a thick black hline (representing the average value) on your plot, covering both the green and red hlines? It will hide the flat green/red line, show the average value, and look like it was meant to be there.
Thanks, I will plot the horizontal line for now (even though I think it’s a bit ugly, as it has to be a very thick line in order to cover the red and green).
The Line Collections solution that @the_doctor proposes seems too difficult for me at this stage.
I tried to use the ma.masked_less solution, but I get an error:
‘<’ not supported between instances of ‘SingleBlockManager’ and ‘float’
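The failing call isn’t shown, so this is a guess: the error message mentions a pandas internal, which suggests that ma.masked_less was handed a pandas object. One thing worth trying is to convert the column to a plain NumPy array first. A minimal sketch with made-up prices (the Series below is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a column such as dft['SalePrice']
prices = pd.Series([100.0, 250.0, 180.0, 90.0, 300.0])
mean = prices.mean()

# Work on a plain NumPy array rather than on the pandas object
values = prices.to_numpy()

# Masked entries are skipped when plotting, e.g. with ax.plot(above)
above = np.ma.masked_less(values, mean)           # hides values below the mean
below = np.ma.masked_greater_equal(values, mean)  # hides values at or above the mean

print(above)
print(below)
```

This only addresses the error. The masked arrays still leave gaps wherever the line crosses the mean.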
I actually got curious and tried my best for an hour or two today, and couldn’t make it work on a line chart like yours (I’m missing the proper vocabulary here, zigzaggy?). Every solution I found uses some sinusoidal line as an example, so there’s always a smooth curve with lots of points, but as soon as you apply it to, say, a currency line chart (where two neighboring points are sometimes on either side of the average value), it doesn’t work (well, it works, but the same way my two charts work: with missing bits or the wrong color).
I also tried plotly, expecting some easier solution, and… nada. Every solution on the web said that matplotlib doesn’t have an easy answer for this one, so it’s going to be a hard one.
I wanted to try this for practice and came up with using a for loop to solve the problem.
I’m very close, but I’m running into a bug. For some reason I can highlight everything below the mean, but if I try to highlight everything above the mean, it peters out at a certain point. Very strange! But maybe it’s enough to get you going in the right direction.
Everything Below the mean highlighted:
Everything Above the mean highlighted (bug):
You can see it kind of works, but decides not to at a certain point. Let me know if this is helpful and if and how you’re able to work it!
This generates the same plot as the solution above (some parts of the line are not colored correctly when two neighboring points are on either side of the horizontal line).
You’ve used a dataset that’s more gentle (a rolling mean) and has many more entries, but if you zoom in on a more varied dataset, you can see it still doesn’t work (this is all your code, just with data from the housing ML project):
(Not sure if it’s doable. Every solution to this problem on the internet uses some sinusoidal line; no one uses a plot line like ours.)
This one may be even more painful:
Like @leonhekkert said: slice the dataset, which opens up a Pandora’s box:
You’re going to have to plot multiple lines (the plot goes above and below the horizontal line a few times).
Most of those lines will end well below or above the horizontal line (the delimiter value), so you will have to write a function that keeps drawing the line in one color up to the hline (horizontal line), stops, and then continues with a new line in a different color on the other side.
You know the y-axis values of those missing points (they sit on the hline), BUT calculating their x-axis positions sounds… ‘fun’.
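For what it’s worth, the x position of a crossing can be found with linear interpolation between the two neighboring points. The sketch below is one possible way to do it, not a tested recipe for the housing data: insert_crossings is a hypothetical helper name, the data is made up, and it assumes numeric x values (dates would need converting to numbers first).

```python
import numpy as np
import matplotlib.pyplot as plt

def insert_crossings(x, y, level):
    """Return copies of x and y with an extra point wherever the line crosses `level`."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    new_x, new_y = [x[0]], [y[0]]
    for x0, y0, x1, y1 in zip(x[:-1], y[:-1], x[1:], y[1:]):
        # The segment crosses when its endpoints lie strictly on opposite sides
        if (y0 - level) * (y1 - level) < 0:
            t = (level - y0) / (y1 - y0)  # fraction of the way from point 0 to point 1
            new_x.append(x0 + t * (x1 - x0))
            new_y.append(level)
        new_x.append(x1)
        new_y.append(y1)
    return np.array(new_x), np.array(new_y)

# Hypothetical stand-in data; swap in your own values
y = np.array([2.0, 6.0, 4.0, 0.0])
x = np.arange(len(y))
level = y.mean()

xs, ys = insert_crossings(x, y, level)

# Points exactly on the level stay visible in both arrays, so the two colors meet there
above = np.ma.masked_less(ys, level)
below = np.ma.masked_greater(ys, level)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(xs, above, color="green")
ax.plot(xs, below, color="red")
ax.axhline(level, color="black", linewidth=1)
fig.savefig("split_at_mean.png")  # or plt.show() in a notebook
```

Because every crossing becomes a real data point, each colored piece ends exactly on the hline and no segment is left uncolored.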
Maybe the ugliest way to do it so far, but thought I would share since there was good discussion.
Creating a shape and using overlay rules may have merit. I tried to see if there was a way to crop a chart to do the same, but cleaner, and didn’t have luck.
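On cropping: one option that might be worth trying is matplotlib’s clip paths. The idea in this sketch is to draw the same line twice, once in each color, and clip each copy to a rectangle on one side of the mean. The data is made up, and sizing the rectangles at twice the data range is an arbitrary choice that only needs to be big enough to cover the line.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Hypothetical stand-in data; swap in your own values
y = np.array([2.0, 6.0, 4.0, 0.0, 7.0, 1.0])
x = np.arange(len(y))
mean = y.mean()
span = y.max() - y.min()

fig, ax = plt.subplots(figsize=(8, 4))
(above_line,) = ax.plot(x, y, color="green")  # only its upper part will be shown
(below_line,) = ax.plot(x, y, color="red")    # only its lower part will be shown

# Blended transform: x in axes fractions (0 to 1), y in data units
trans = ax.get_yaxis_transform()
upper = Rectangle((0, mean), 1, 2 * span, transform=trans)
lower = Rectangle((0, mean - 2 * span), 1, 2 * span, transform=trans)

# Each copy of the line is only drawn inside its rectangle
above_line.set_clip_path(upper)
below_line.set_clip_path(lower)

ax.axhline(mean, color="black", linewidth=1)
fig.savefig("clipped_at_mean.png")  # or plt.show() in a notebook
```

Since the clipping happens at draw time, the color changes exactly at the mean without any need to compute crossing points.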
What do you think of the solutions in the notebook? Both are similar ways of doing the same thing. The only thing that looks off to me is some minor grid spacing issues around the mean.