applymap(): how to write the function

When we define the function update_vals(col), should we write "for value in col" and then put the if conditions inside the loop? My understanding is that the function takes a column as its parameter, and then Python checks each value in that column.

My code is different from the answer. Why shouldn't we use a for loop in this function? Thank you in advance!

Screen Link: https://app.dataquest.io/m/348/guided-project%3A-clean-and-analyze-employee-exit-surveys/8/combine-the-data

My Code:

def update_vals(col):
    for value in col:
        if pd.isnull(val):
            return np.NaN
        elif value == '-':
            return False
        else:
            return True

What actually happened:


TypeError                                 Traceback (most recent call last)
<ipython-input-112-a144b1d58549> in <module>()
     17 'work_life_balance',
     18 'workload',]   
---> 19 dete_resignations['dissatisfied']=dete_resignations[dete_col].applymap(update_vals).any(axis=1,skipna=False)
     20 
     21 tafe_col=['contributing_factors._dissatisfaction','contributing_factors._job_dissatisfaction']

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in applymap(self, func)
   5066             return lib.map_infer(x.asobject, func)
   5067 
-> 5068         return self.apply(infer)
   5069 
   5070     # ----------------------------------------------------------------------

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4875                         f, axis,
   4876                         reduce=reduce,
-> 4877                         ignore_failures=ignore_failures)
   4878             else:
   4879                 return self._apply_broadcast(f, axis)

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4971             try:
   4972                 for i, v in enumerate(series_gen):
-> 4973                     results[i] = func(v)
   4974                     keys.append(v.name)
   4975             except Exception as e:

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/frame.py in infer(x)
   5064             if x.empty:
   5065                 return lib.map_infer(x, func)
-> 5066             return lib.map_infer(x.asobject, func)
   5067 
   5068         return self.apply(infer)

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

<ipython-input-112-a144b1d58549> in update_vals(val)
      1 #Write a function make values change
      2 def update_vals(val):
----> 3     for row in val:
      4         if pd.isnull(val):
      5             return np.NaN

TypeError: ("'bool' object is not iterable", 'occurred at index job_dissatisfaction')

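Not always. With .apply() or .applymap(), pandas already does the iterating for you. .apply() hands your function one column (or row) at a time, while .applymap() calls your function once for every single value in the selected columns. So inside update_vals the argument is one value (here a bool), not a column, which is why the for loop fails with "'bool' object is not iterable" and why you don't need a loop in this case. If you are going to call the function yourself on a plain Python list, then you pass the list as the argument and the for loop goes inside the function.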
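To see this for yourself, here is a small sketch that records what pandas passes to the function on each call. The DataFrame is a made-up stand-in for two of the boolean columns, and record_type, seen_types and loop_error are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for two boolean columns of dete_resignations.
df = pd.DataFrame({
    'job_dissatisfaction': [True, False],
    'workload': [False, False],
})

seen_types = []

def record_type(val):
    # Note what kind of object pandas hands to the function on each call.
    seen_types.append(type(val).__name__)
    return val

# applymap() was renamed DataFrame.map() in pandas 2.1; use whichever exists.
elementwise = df.map if hasattr(df, 'map') else df.applymap
elementwise(record_type)

print(set(seen_types))  # single values only, never a whole column

# This is what a loop inside the function ends up attempting on each value.
loop_error = None
try:
    for item in True:
        pass
except TypeError as err:
    loop_error = str(err)

print(loop_error)
```

Because every call receives a single value, the if/elif/else checks can work on the argument directly.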

So if I want to use a for loop here, should the code look like this?

    for val in col:
        if pd.isnull(val):
            return np.NaN
        elif val== '-':
            return False
        else:
            return True

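Yeah, but it won't work on a column through applymap(), because applymap() passes single values. A loop like that only makes sense when you call the function yourself on a list, and even then a return inside the loop exits on the first item, so you would need to collect a result for every item instead of returning right away. To make sure that it changes the values in the column it's better to use a function that handles one value, like this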
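A minimal sketch of such a function, written for a single value, is below. It is not the course's official solution. The update_list wrapper is a hypothetical helper that only shows where a loop would belong if you called the function on a plain list yourself. The NaN check is done without pandas here so the snippet runs on its own; with pandas imported, pd.isnull(val) does the same job.

```python
import math

def update_vals(val):
    # One value in, one value out. NaN is the only value not equal to itself.
    if val is None or val != val:
        return math.nan
    elif val == '-':
        return False
    else:
        return True

def update_list(values):
    # Hypothetical helper: loop over a plain list and keep one result per item.
    results = []
    for val in values:
        results.append(update_vals(val))
    return results

result = update_list(['-', 'Workload', None])
print(result)
```

With a DataFrame you would skip update_list entirely and pass update_vals to applymap(), which does the looping for you.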