How to change colour of a line plot based on a condition (below/above average)

https://app.dataquest.io/c/96/m/529/guided-project%3A-storytelling-data-visualization-on-exchange-rates/7/next-steps

Hi all,

I’d like to plot a graph where all values below the average are in red and all values above the average are in green.

I’ve tried slicing the dataset into an above-average part and a below-average part.
But naturally it then plots straight connecting lines across the gaps where there are no values (because they were sliced out of the dataset) until values appear again.

Does anyone have an idea how to solve this in a way that a beginner like me can understand?

GP5 Euro daily exchange rates.ipynb (481.5 KB)


I’ve never tried anything like this, but maybe Line Collections could be useful here. Make sure the documentation version matches the Matplotlib version used in the Classroom, just in case. Additional reference.
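For reference, a minimal sketch of the Line Collections idea, assuming numeric x values (dates could be converted with matplotlib.dates.date2num) and array-like input. The function names are made up for this illustration. Each segment between two neighbouring points gets one colour, chosen by whether its midpoint is above the threshold, so a segment that crosses the average is still drawn in a single colour.

```python
import numpy as np


def colored_segments(x, y, threshold):
    """Split a line into 2-point segments and pick one colour per segment."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    points = np.column_stack([x, y]).reshape(-1, 1, 2)
    # Pair each point with its successor: shape (n - 1, 2, 2).
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    midpoints = (y[:-1] + y[1:]) / 2
    colors = np.where(midpoints > threshold, "green", "red").tolist()
    return segments, colors


def add_colored_line(ax, x, y, threshold):
    """Draw the coloured segments on an existing Matplotlib Axes."""
    # Imported here so colored_segments can be used without Matplotlib.
    from matplotlib.collections import LineCollection

    segments, colors = colored_segments(x, y, threshold)
    ax.add_collection(LineCollection(segments, colors=colors))
    ax.autoscale()  # add_collection does not rescale the view by itself
    return ax
```

It could be called as add_colored_line(ax, x, y, y.mean()) on an Axes created with plt.subplots().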

Or, maybe something like this. There could be easier ways, perhaps, but this one seems convenient.

The bottom solution is the easiest one, but it has a huge drawback: the connections between two points on either side of the hline are either in the wrong colour or not visible:

import numpy as np
import matplotlib.pyplot as plt

mean_price = dft['SalePrice'].mean()
fig, ax = plt.subplots(figsize=(16, 8))
# this line is optional - it connects the parts that are below and above the hline:
ax.plot(dft['SalePrice'])
# np.nan leaves a gap wherever the condition is False
ax.plot(np.where(dft['SalePrice'] > mean_price, dft['SalePrice'], np.nan), color="red", label="above mean")
ax.plot(np.where(dft['SalePrice'] <= mean_price, dft['SalePrice'], np.nan), color="blue", label="at or below mean")
ax.axhline(mean_price, color='red')
plt.show()

Maybe just draw a thick black hline (representing the avg value) on your plot, covering both the green and red hlines? It will cover the green/red flat line, give info about the avg value, and look like it was meant to be there.

Thanks, I will plot the horizontal line for now (even though I think it’s a bit ugly, as it will have to be a very thick line in order to cover the red and green).
The line collections solution that @the_doctor proposes seems too difficult for me at this stage.

I tried to use the ma.masked_less solution, but I get an error:

‘<’ not supported between instances of ‘SingleBlockManager’ and ‘float’
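One possible cause, which is a guess since the failing call is not shown: the np.ma function was handed a pandas Series, and with some pandas versions NumPy then ends up comparing the Series' internal block manager instead of its values. Converting to a plain NumPy array first (np.asarray, or the Series' own to_numpy()) should avoid that. A minimal sketch; split_by_mean is a hypothetical helper name:

```python
import numpy as np


def split_by_mean(values):
    """Return (above, below) masked arrays, split at the mean of values."""
    # np.asarray accepts lists, NumPy arrays and pandas Series alike.
    arr = np.asarray(values, dtype=float)
    mean = arr.mean()
    above = np.ma.masked_less(arr, mean)           # hides values < mean
    below = np.ma.masked_greater_equal(arr, mean)  # hides values >= mean
    return above, below
```

Both results can be passed to plt.plot, which leaves masked entries out as gaps. The segments that cross the average are still missing with this approach.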

I actually got curious and tried my best for an hour or two today, and couldn’t make it work on a line chart like yours (I’m missing the proper vocabulary here, zigzaggy?). Every solution I found uses a sinusoidal line as its example, so there’s always a smooth, continuous curve. As soon as you apply it to, say, a currency line chart (where two consecutive points are sometimes on either side of the avg value), it doesn’t work. Well, it works, but only the way my 2 charts work: with missing bits or the wrong colour.

I also tried plotly, expecting some easier solution and… nada. Every solution on the web said that matplotlib doesn’t have an easy answer for this one, so it’s going to be a hard one.

Hi leon,

I wanted to try this for practice and came up with using a for loop to solve the problem.

I’m very close but I’m running into a bug. For some reason I can highlight everything below the mean, but if I try to highlight everything above the mean then it peters out at a certain point. Very strange! But maybe it’s enough to get you going in the right direction.

Everything Below the mean highlighted:
image

Everything Above the mean highlighted (bug):
image

You can see it kind of works, but decides not to at a certain point. Let me know if this is helpful and if and how you’re able to work it!

Here’s the notebook. This is a clone of the storytelling visualization project.
Highlighting Above Mean.ipynb (135.9 KB)


EDIT : I figured it out - there needed to be a counter so that when the for-loop ended it created one last plot. This should work for your purposes.

image
Highlighting Above Mean.ipynb (135.7 KB)
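The notebook's loop is not reproduced in this thread, so the following is only a guess at the pattern the EDIT describes: a loop that collects consecutive above-mean points into runs has to flush the last run after the loop ends, otherwise the final stretch is never plotted. runs_above is a hypothetical name.

```python
def runs_above(values, threshold):
    """Group the indices of consecutive values above threshold into runs."""
    runs, current = [], []
    for i, value in enumerate(values):
        if value > threshold:
            current.append(i)       # still inside a run
        elif current:
            runs.append(current)    # the run just ended
            current = []
    if current:
        runs.append(current)        # flush the run that reaches the end
    return runs
```

With NumPy arrays x and y, each run could then be drawn with something like ax.plot(x[run], y[run], color="green").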


ignore this in the solution - this was from another angle that didn’t pan out.

This generates the same plot as the solution above (some of the line parts are not coloured correctly when two points are on either side of the horizontal line).
You’ve used a dataset that’s more gentle (rolling mean) and has many more entries, but if you zoom in on a more varied dataset, you can see it still doesn’t work (this is all your code, just with data from the housing ML project):

Can you share the notebook you’re using? I’m not at the ML part of the course yet and I’d like to see what’s going on.

Let’s start with the dataset

import pandas as pd
# how to read this dataset (it is tab-separated):
df = pd.read_csv('AmesHousing.tsv', delimiter="\t")

notebook:
plot_battle.ipynb (112.9 KB)


I see the limitations in my solution now - if there are violent swings in the data from one day to the next then it does not plot properly.

You win this plot battle!

Not sure if it’s worth the fight. I can imagine 2 solutions to this problem, both of them painful:

  1. someone cracks the code on this one: Multicolored lines — Matplotlib 3.5.0 documentation

(not sure if it’s doable, because it’s a sinusoidal line, every solution to this problem on the internet has some sinusoidal line, no one uses a plot line like ours)

  2. this one may be even more painful:

like @leonhekkert said: slice the dataset, which opens up a Pandora’s box:

  • you’re going to have to plot multiple lines (the plot goes above and below the horizontal line a few times)

  • most of those lines will end way below/above the horizontal line (the delimiter value), so you will have to write a function that continues drawing the line in one colour up to the hline (horizontal line), stops, and then continues with a new line from the other side in a different colour

  • you know the y-axis values of those missing points (they sit on the hline), BUT calculating their x-axis positions sounds… ‘fun’
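The x position of a crossing point follows from linear interpolation between the two neighbouring points: for a segment from (x0, y0) to (x1, y1) that crosses the level, x = x0 + (level - y0) / (y1 - y0) * (x1 - x0). A rough sketch of that idea, assuming numeric x values (dates could be converted with matplotlib.dates.date2num first); insert_crossings and plot_above_below are names invented for this example:

```python
import numpy as np


def insert_crossings(x, y, level):
    """Return x, y with an extra point wherever a segment crosses level."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    new_x, new_y = [x[0]], [y[0]]
    for x0, y0, x1, y1 in zip(x[:-1], y[:-1], x[1:], y[1:]):
        if (y0 - level) * (y1 - level) < 0:   # strictly on opposite sides
            t = (level - y0) / (y1 - y0)      # fraction of the way along
            new_x.append(x0 + t * (x1 - x0))
            new_y.append(level)
        new_x.append(x1)
        new_y.append(y1)
    return np.array(new_x), np.array(new_y)


def plot_above_below(ax, x, y, level):
    """Draw the line in green above level and red below it."""
    xs, ys = insert_crossings(x, y, level)
    # Crossing points equal level, so they belong to both lines
    # and the two colours meet exactly on the hline.
    ax.plot(xs, np.where(ys >= level, ys, np.nan), color="green")
    ax.plot(xs, np.where(ys <= level, ys, np.nan), color="red")
    ax.axhline(level, color="black", linewidth=1)
    return ax
```

Because the inserted points lie exactly on the level, every segment ends up entirely on one side of it, so the NaN masking no longer drops or miscolours the connections.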

Maybe the ugliest way to do it so far, but thought I would share since there was good discussion.

Creating a shape and using overlay rules may have merit. I tried to see if there was a way to crop a chart to do the same, but cleaner, and didn’t have luck.

image
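On the cropping idea: Matplotlib artists have a set_clip_path method, so one option might be to draw the same line twice and clip each copy to one side of the average. This is a sketch under that assumption, not what the notebook does; clip_rectangles and plot_clipped are names invented here.

```python
import numpy as np


def clip_rectangles(y, level):
    """Return (x, y, width, height) boxes above and below level.

    x and width are in axes fractions (0..1), y and height in data units.
    """
    y = np.asarray(y, dtype=float)
    # Generous height so the boxes comfortably cover the data.
    height = 2 * max(y.max() - level, level - y.min())
    if height == 0:
        height = 1.0
    return (0.0, level, 1.0, height), (0.0, level - height, 1.0, height)


def plot_clipped(ax, x, y, level):
    """Draw y twice and clip each copy to one side of level."""
    from matplotlib.patches import Rectangle  # only needed for plotting

    above, below = clip_rectangles(y, level)
    # Blended transform: x in axes coordinates, y in data coordinates.
    trans = ax.get_yaxis_transform()
    (upper,) = ax.plot(x, y, color="green")
    (lower,) = ax.plot(x, y, color="red")
    upper.set_clip_path(Rectangle(above[:2], above[2], above[3], transform=trans))
    lower.set_clip_path(Rectangle(below[:2], below[2], below[3], transform=trans))
    return upper, lower
```

Since the clipping happens at draw time, no crossing points need to be calculated and the colour changes exactly at the average.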

I got my plotting practice in for the week.

What do you think of the solutions in the notebook - both are similar ways to do the same thing. Only thing I see as off here is some minor grid spacing issues around the mean.

plot_battle.ipynb (151.9 KB)



haha, awesome I actually wanted to do it that way as soon as I saw your ‘zorder’ trick, you beat me to it, nice one!
