I need to clean the following data: data=[2-3, 4.0, 10-13, 40,…]. Each entry is a number of years or a range of years, so in the first case the person stayed at the company between 2 and 3 years. I need to convert each entry to a single value. I could do it with a for loop, but I wonder whether there is a vectorized way.
For example, if I took the lowest value of each range instead of the mean (2 for the first entry), it could be:
teste = combined_updated['institute_service'].astype('str').str.replace('Less than 1 year', '1').str.replace('More than 20 years', '20').str.split('-').str[0].astype('float')
So I use the .str accessor to take the first value, because the split gives me a list like [2,3] for the first entry. Is there some way to convert the elements inside each list to floats and then take the mean, in a vectorized way? …
Not even sure I was clear, sorry.
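To make the question concrete, here is a minimal sketch of one possible vectorized approach. It assumes the column holds strings shaped like the examples above. The sample Series `service` and the names `cleaned`, `parts`, and `mean_years` are made up for illustration and stand in for combined_updated['institute_service']. The idea is to split with `expand=True` so each part of a range lands in its own column, convert the columns to numbers, and take the row-wise mean.

```python
import pandas as pd

# Hypothetical sample standing in for combined_updated['institute_service']
service = pd.Series(
    ["2-3", "4.0", "10-13", "40", "Less than 1 year", "More than 20 years", None]
)

# Replace the two text labels with plain numbers (literal match, not regex)
cleaned = (
    service.astype("str")
    .str.replace("Less than 1 year", "1", regex=False)
    .str.replace("More than 20 years", "20", regex=False)
)

# expand=True returns a DataFrame with one column per part of the range;
# single values leave the second column empty
parts = cleaned.str.split("-", expand=True)

# Convert every column to numbers; anything unparseable becomes NaN
parts = parts.apply(pd.to_numeric, errors="coerce")

# Row-wise mean skips NaN by default, so "4.0" stays 4.0 and "2-3" becomes 2.5
mean_years = parts.mean(axis=1)

print(mean_years)
```

Missing entries end up as NaN in the result rather than raising an error, which may or may not be what is wanted here.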