I’m trying to plot a bar chart that compares the first ten rows and the last ten rows of a column (unemployment rate).
The dataframe is named recent_grads.
My code is:
first = recent_grads[:10]["Unemployment_rate"]
last = recent_grads[-10:]["Unemployment_rate"]
df = pd.DataFrame(first, last)
df.plot.bar(x = first, y = last, legend = False)
What I expected to happen: I expected it to return a bar chart.
What actually happened: Python keeps telling me that the last 10 values, the ones in the last variable, are not in the index:
KeyError: '[0.047584 0.04010498 0.10711573 0.07754113 0.08174221 0.04632028\n 0.06511219 0.1490482 0.05362065 0.10494572] not in index'
Please help. I tried Google and StackOverflow, but nothing made sense.
Look at what the examples are putting into their x and y inputs.
As a start, you have to care about the type, value, and sometimes id() (when learning copying/iterators/OOP) of every Python variable you use.
A KeyError happens when something is used to index another object, usually a collection of items stored in a key-value structure. You don’t have to know what the collection is, as long as you can see what was used as the key and why it is wrong. Here the error message shows a list of numbers separated by spaces and \n. Does it look familiar? Try to guess what it is and where it comes from.
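To make that concrete, here is a minimal sketch with a small made-up dataframe (the values are invented and stand in for recent_grads). It prints the type and values of a slice like last, then uses that slice as a key. The exact wording of the KeyError differs between pandas versions, so no message text is shown here.

```python
import pandas as pd

# Made-up stand-in for recent_grads; only the column name matches the question.
df = pd.DataFrame({"Unemployment_rate": [0.02, 0.03, 0.05, 0.11]})
last = df[-2:]["Unemployment_rate"]

print(type(last))       # a pandas Series, not a column name
print(last.to_numpy())  # the same kind of numbers that appear in the error text

caught = None
try:
    # The Series values (0.05 and 0.11) are looked up as column labels.
    df[last]
except KeyError as err:
    caught = err
    print("KeyError:", err)
```

The numbers in the message are the values that were used as the key, which is the clue to trace back to the variable that held them.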
Besides thinking about how the developers of Matplotlib may have implemented something, you can abstract away from the low-level details and think about API design: what do the Matplotlib/Pandas developers intend for you to provide to make a function work?
(The function in this case being bar plotting.)
Maybe the way you are thinking could work, but is there an easier API design that lightens the burden on users, so they have to provide less information and type less? (This is the hint for answering your question.)
first and last are variables. "first" and "last" are strings. They represent different things. The x and y arguments of a dataframe plot expect column names, not the data itself. Pandas takes what you give it and passes it on to matplotlib, doing some checking along the way. That check raised the KeyError because it couldn’t find the column names you asked for in the dataframe you provided.
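One possible way to get the chart, sketched under assumptions: recent_grads is replaced by made-up data, and the column names "first" and "last" are chosen here purely for illustration. The data goes into the dataframe as named columns, and the plot call is left to pick them up by itself.

```python
import pandas as pd

# Made-up stand-in for recent_grads (25 rows of invented rates).
recent_grads = pd.DataFrame(
    {"Unemployment_rate": [i / 100 for i in range(1, 26)]}
)

# .to_numpy() drops the original row labels, so both groups line up on
# positions 0..9 instead of being matched by their old row numbers.
compare = pd.DataFrame(
    {
        "first": recent_grads["Unemployment_rate"].head(10).to_numpy(),
        "last": recent_grads["Unemployment_rate"].tail(10).to_numpy(),
    }
)

# No x or y given: every numeric column is drawn against the index.
ax = compare.plot.bar()
ax.set_ylabel("Unemployment_rate")
```

In a plain script, call matplotlib.pyplot.show() afterwards to display the figure. To plot a single column, pass its name as a string, for example compare.plot.bar(y="last").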
It’s hard to google in this case because every application can generate a different KeyError message depending on what wrong inputs were given. But when you get a standardized error message (like those the pandas developers wrote for the validate parameter of pd.merge), googling it can lead you to the place in the source code that raises it. Reading the surrounding code and doing some tracing will quickly teach you what causes the error.
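As a small illustration of a standardized message, the sketch below triggers the check behind the validate parameter of pd.merge. The two dataframes are made up for the example.

```python
import pandas as pd
from pandas.errors import MergeError

# Made-up frames: the key "a" appears twice on the right side.
left = pd.DataFrame({"key": ["a", "b"], "x": [1, 2]})
right = pd.DataFrame({"key": ["a", "a"], "y": [3, 4]})

merge_error = None
try:
    # validate="one_to_one" asks pandas to check that keys are unique on both sides.
    pd.merge(left, right, on="key", validate="one_to_one")
except MergeError as err:
    merge_error = err
    print("MergeError:", err)
```

The message text comes from pandas itself rather than from the data, so searching for it tends to lead to the relevant documentation or source.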