Strange Error when trying to convert series to int


autos["price"] = autos["price"].str.replace("$","").str.replace(",","").astype(int)    
autos["odometer"] = autos["odometer"].str.replace("km","").str.replace(",","").astype(int)


TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-7-85c720d2cc2e> in <module>()
----> 1 autos["price"] = autos["price"].str.replace("$","").str.replace(",","").astype(int)
      2 autos["odometer"] = autos["odometer"].str.replace("km","").str.replace(",","").astype(int)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/ in __getitem__(self, key)
    621         key = com._apply_if_callable(key, self)
    622         try:
--> 623             result = self.index.get_value(self, key)
    625             if not is_scalar(result):

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexes/ in get_value(self, series, key)
   2558         try:
   2559             return self._engine.get_value(s, k,
-> 2560                                           tz=getattr(series.dtype, 'tz', None))
   2561         except KeyError as e1:
   2562             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: 'price'

Which mission is this from?

First, check if the column price exists in autos or not.

I would also recommend printing out the result of -

autos["price"] = autos["price"].str.replace("$","").str.replace(",","")

to see whether the output from the above can be converted into an integer or not. Based on the error it seems the output from the above might contain some strings which can’t be converted to integer type.
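
A minimal sketch of that check, assuming a small made-up sample in place of the real price column (the `prices` values and the `bad_values` name are illustrative, not taken from the dataset). It lists the cleaned strings that would not parse as numbers, which are the ones astype(int) would choke on:

```python
import pandas as pd

# Hypothetical sample standing in for the real price column.
prices = pd.Series(["$5,000", "$8,500", "free", "$1,250"])

# regex=False treats "$" as a literal character rather than a regex anchor.
cleaned = prices.str.replace("$", "", regex=False).str.replace(",", "", regex=False)

# to_numeric with errors="coerce" turns unparseable entries into NaN,
# so the mask below selects the values that cannot be converted.
bad_values = cleaned[pd.to_numeric(cleaned, errors="coerce").isna()]
print(bad_values)
```

If `bad_values` comes back empty, the conversion itself is probably not the problem.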

It seems you don’t have a column named price. First check the column names using autos.columns.
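
As a rough illustration of that check, here is a sketch with a made-up two-column frame in which the price header has a trailing space (an assumption for the example, not something known about the real data). Printing the `repr` of each name makes stray whitespace visible:

```python
import pandas as pd

# Hypothetical frame whose price column has a trailing space in its name.
autos = pd.DataFrame({"price ": ["$5,000"], "odometer": ["150,000km"]})

# Exact-match test: this is what autos["price"] relies on.
has_price = "price" in autos.columns
print(has_price)

# repr() wraps each name in quotes, so extra whitespace shows up.
visible_names = [repr(name) for name in autos.columns]
print(visible_names)
```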

The price column definitely exists in autos, although I didn’t include it when renaming the columns. The mission is Guided Project: Exploring Ebay Car Sales Data.

I ran this beforehand:

                      "monthOfRegistration" : "registration_month",
                      "notRepairedDamage" : "unrepaired_damage",
                      "dateCreated" : "ad_created",
                      "offerType" : "offer_type",
                      "vehicleType" : "vehicle_type",
                      "odometer" : "odo_meter",
                      "fuelType" : "fuel_type",
                      "nrOfPictures" : "num_of_pics",
                      "postalCode" : "postal_code",
                      "lastseen" : "last_seen",
                      "gearbox" : "gear_box",
                      "powerPS" : "power_ps",
                      "dateCrawled" : "date_crawled",
                      "lastSeen" : "last_seen"}, inplace=True)

What I don’t get is that when renaming the columns (as shown above), I left out price because I do not want to rename it. price is still in autos after the renaming, as I intended, so I’m confused as to why it’s throwing an error.
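
For what it's worth, here is a small sketch of how a rename mapping behaves, using a made-up two-column frame rather than the real dataset. Columns not listed in the mapping keep their names, so price should indeed survive. The mapping above does rename odometer to odo_meter, though, so a later autos["odometer"] lookup would likely need the new name:

```python
import pandas as pd

# Hypothetical two-column frame using the original column names.
autos = pd.DataFrame({"price": ["$5,000"], "odometer": ["150,000km"]})

# Only the keys in the mapping are renamed; "price" is not listed, so it stays.
autos.rename(columns={"odometer": "odo_meter"}, inplace=True)
print(list(autos.columns))

# The old name no longer exists, so looking it up raises KeyError.
try:
    autos["odometer"]
except KeyError as err:
    print("KeyError:", err)
```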

If price is in autos, did you also try the second part of my comment above?

Yes, I tried to drop astype(int) but with no luck. It’s a bit of an odd issue!

Do you get the error even when you don’t include astype(int)?

If you still get the same error, then it might be better if you shared a link to your Github repo with your Notebook. It seems you have performed operations on the dataframe or columns before this that are causing issues with this particular operation.

Before you share the link, I would highly recommend that you do the following -

  1. Click on Kernel in the toolbar of the Notebook, and select the option to Restart the Kernel and Clean all Outputs.
  2. Run all of your code (just once, not multiple times) including the one you shared above that throws that error.

So I’ve dropped the astype(int) and it works fine now, ha. That seems to have done the job thankfully.

However, if I want to convert the column(s) to int, will I be able to use autos["price"].astype(int) without any issues, or must I do something else?

Thanks for this

As long as every string in your price column can be converted to an integer, you can use astype(int). That’s why I suggested printing out the values without astype(int), to see whether any of the strings could throw an error.

I haven’t tested it myself, but it is possible that NaN values in price caused that error too, since NaN can’t be converted to an integer. I will have to test this to be sure.
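
A quick sketch of the NaN case, using a made-up series of already-cleaned price strings (the data and variable names are assumptions for illustration). Plain astype(int) fails when a missing value is present; one possible workaround is to parse with pd.to_numeric and then use the nullable "Int64" dtype, which can hold missing values:

```python
import numpy as np
import pandas as pd

# Hypothetical cleaned price strings with one missing value.
cleaned = pd.Series(["5000", np.nan, "1250"])

failed = False
try:
    cleaned.astype(int)
except (ValueError, TypeError) as err:
    failed = True
    print("astype(int) failed:", err)

# Parse to numbers first, then switch to the nullable integer dtype,
# which stores the missing entry as <NA> instead of raising.
as_nullable_int = pd.to_numeric(cleaned, errors="coerce").astype("Int64")
print(as_nullable_int)
```

Dropping the rows with missing prices before calling astype(int) is another option, depending on what the analysis needs.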

Good to know it’s working for you now!

What if your “price” name has non-printable characters?

I suggest renaming the column completely, not with autos.rename(columns=mapper_dict), which you’ve already done, but by assigning to the columns property directly:
autos.columns = ['price' if 'price' in item else item for item in list(autos.columns)]

This first renames the price column if it exists but has non-printable characters. Then you can run the normal autos.rename(columns=mapper).

Looks kinda complicated, but it takes away any non-printable characters in that column name.
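
One caveat with the one-liner above: it replaces any name that merely contains "price", so a column such as a hypothetical price_range would be renamed too. A slightly more general sketch, assuming a made-up frame whose price header carries a zero-width space and a trailing space, cleans every column name instead:

```python
import pandas as pd

# Hypothetical frame: the price header hides a zero-width space and a trailing space.
autos = pd.DataFrame({"price\u200b ": ["$5,000"], "odometer": ["150,000km"]})

# Keep only printable characters in each name, then trim surrounding whitespace.
autos.columns = [
    "".join(ch for ch in name if ch.isprintable()).strip()
    for name in autos.columns
]
print(list(autos.columns))
```

After this, autos["price"] resolves by exact name again, and the usual rename mapping can be applied afterwards.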