Guided Project: Clean And Analyze Employee Exit Surveys - Part 9

In this part, the first step is to convert the datatype of the column combined_updated['institute_service'] from 'object' to 'str'. I tried the code below, but in vain: I get the message shown below. Any help?

Code I tried
combined_updated['institute_service'] = combined_updated['institute_service'].astype('str')

Error Msg:
/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/main.py:7: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

In this case, you need to do what the message suggests:

combined_updated.loc[:,'institute_service'] = combined_updated['institute_service'].astype('str')

Note that this is a warning rather than an error, so the assignment still runs. Because this column already exists, you need to state explicitly both the row and the column index where you want the change to take effect.

Also there is this blog post with more details about this error/warning

Thanks for the response.
I modified the code as you suggested, but this time I got a new error, as shown below.

Code I tried
combined_updated.iloc[:,'institute_service'] = combined_updated['institute_service'].astype('str')

Error Msg
ValueError                                Traceback (most recent call last)
in ()
5 # combined.notnull().sum()
6 combined_updated = combined.dropna(axis=1,thresh=500)
----> 7 combined_updated.iloc[:,'institute_service'] = combined_updated['institute_service'].astype('str')
8 combined_updated['institute_service']
9 # print(combined_updated)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
191 else:
192 key = com._apply_if_callable(key, self.obj)
--> 193 indexer = self._get_setitem_indexer(key)
194 self._setitem_with_indexer(indexer, value)
195

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _get_setitem_indexer(self, key)
169 if isinstance(key, tuple):
170 try:
--> 171 return self._convert_tuple(key, is_setter=True)
172 except IndexingError:
173 pass

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _convert_tuple(self, key, is_setter)
240 if i >= self.obj.ndim:
241 raise IndexingError('Too many indexers')
--> 242 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
243 keyidx.append(idx)
244 return tuple(keyidx)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
1848
1849 raise ValueError("Can only index by location with a [%s]" %
-> 1850 self._valid_types)
1851
1852

ValueError: Can only index by location with a [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array]

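My suggestion used loc, not iloc.

It makes a difference: loc selects by label (such as a column name), while iloc selects by integer position, which is why passing the column name to iloc raised the ValueError.

Make sure you get those two right, as they are very important when working with pandas.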

Cheers

Pedro
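
To make the loc/iloc distinction above concrete, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for combined_updated (the real data is not shown in this thread), and the "age" column is invented purely for illustration.

```python
import pandas as pd

# Hypothetical stand-in for combined_updated; not the real survey data.
df = pd.DataFrame({
    "institute_service": ["5-6", "3-4", "11-20"],
    "age": [41, 36, 52],
})

# .loc selects by label, so the column name works directly.
by_label = df.loc[:, "institute_service"]

# .iloc selects by integer position, so look up the column's position first.
col_position = df.columns.get_loc("institute_service")
by_position = df.iloc[:, col_position]

# Passing a label to .iloc fails, as in the traceback earlier in the thread.
# Several exception types are caught because the exact one may vary by version.
try:
    df.iloc[:, "institute_service"]
    iloc_error = None
except (ValueError, IndexError, TypeError) as exc:
    iloc_error = exc

print(by_label.equals(by_position))
print(type(iloc_error).__name__)
```

Both selections return the same column; only the way the column is addressed differs.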

I corrected my code and that solved the error.
But the goal is not yet fulfilled: the datatype still shows 'object' instead of 'str', as can be seen in the output below.

output
648 nan
649 5-6
650 3-4
Name: institute_service, Length: 651, dtype: object

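In pandas, string columns are stored with the object dtype, so dtype: object is the expected result here. The conversion to strings did work.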
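
One way to confirm this is to look at the elements rather than the reported dtype. The sketch below uses a small made-up Series, not the project data. On the older pandas shown in this thread the dtype prints as object; newer pandas versions that ship a dedicated string dtype may print something else, so no particular dtype is assumed here.

```python
import pandas as pd

# Made-up numeric values, used only to illustrate the conversion.
years = pd.Series([1, 3, 7], name="institute_service")

as_str = years.astype(str)

# The reported dtype depends on the pandas version (object on older releases).
print(as_str.dtype)

# Whatever the dtype says, each element is a Python str after the conversion.
element_types = {type(value).__name__ for value in as_str}
print(element_types)

# pandas also offers a helper that answers "is this a string column?".
looks_like_strings = pd.api.types.is_string_dtype(as_str)
print(looks_like_strings)
```

If the elements are str, the string methods under .str (such as .str.extract) will work on the column.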

I am also working on Clean and Analyze Employee Exit Surveys, and I get an error that I am having trouble solving.
In part 9, the task is to change the column "institute_service" to string, extract the number, and change it back to float. Below is the code I used:
combined_updated["institute_service"] = combined["institute_service"].astype(str).str.extract(r"(\d+)")

The error I received,

/dataquest/system/env/python3/lib/python3.4/site-packages/ipykernel/main.py:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Then I tried,

combined_updated.loc[:,"institute_service"] = combined_updated["institute_service"].astype(str).str.extract(r"(\d+)", expand = False)

Then I received the warning below:

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/indexing.py:537: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation:

self.obj[item] = s

Thanks

Hi @hasini0213,

This article will help you to understand this warning message:

Best,
Sahil
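
For reference, one common way to avoid this warning is to make combined_updated an explicit copy right after dropna, then assign as usual. This is only a sketch under assumptions: the tiny combined DataFrame, its "mostly_empty" column, and thresh=3 are hypothetical stand-ins for the project data and its thresh=500.

```python
import pandas as pd

# Hypothetical stand-in for the project's combined DataFrame.
combined = pd.DataFrame({
    "institute_service": ["5-6", "3-4", None, "Less than 1 year", "11-20"],
    "mostly_empty": [None, None, None, None, "x"],
})

# .copy() makes combined_updated an independent DataFrame, so later
# assignments are not treated as writes to a slice of combined.
combined_updated = combined.dropna(axis=1, thresh=3).copy()

# Convert to string, pull out the first run of digits, then cast to float.
combined_updated["institute_service"] = (
    combined_updated["institute_service"]
    .astype(str)
    .str.extract(r"(\d+)", expand=False)
    .astype(float)
)

print(combined_updated["institute_service"])
```

Rows with no digits to extract end up as NaN, which is why the final cast is to float rather than int.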

Thanks, I will check it out.
