TypeError: string indices must be integers error when trying to set the 468-8 function in the opposite direction

Screen Link: https://app.dataquest.io/m/468/business-metrics/8/churn-rate

My Code:

def count_customers(yearmonth):
    row_counter = 0
# here I extract the month from the function's input
    m = int(str(yearmonth)[4:])
    for r in subs:
# now, I extract the start and end months from each row and compare those numbers to the inputted month
        r_end_m = r["end_date"].dt.month
        r_start_m = r["start_date"].dt.month
        if (r_start_m < m) & (m < r_end_m):
            row_counter += 1
    return row_counter
# the row_counter would output the number of rows that meet the criteria requested on the mission screen

churn["total_customers"] = churn["yearmonth"].apply(count_customers)

What I expected to happen:

So, I know the function given in the solution creates a date from the yearmonth data and applies it as a date to compare on the subs DataFrame through vectors :nerd_face:, but initially, I thought of the function in a different manner and would kindly ask if you could help me understand the error Traceback that I’m getting when looping through the subs DataFrame:

What actually happened:

TypeError                                 Traceback (most recent call last)
<ipython-input-1-ee6816adda68> in <module>()
     13     return row_counter
---> 15 churn["total_customers"] = churn["yearmonth"].apply(count_customers)
     17 # def get_customers(yearmonth):

/dataquest/system/env/python3/lib/python3.4/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   2549             else:
   2550                 values = self.asobject
-> 2551                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   2553         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

<ipython-input-1-ee6816adda68> in count_customers(yearmonth)
      5     m = int(str(yearmonth)[4:])
      6     for r in subs:
----> 7         r_end_m = r["end_date"].dt.month
      8         r_start_m = r["start_date"].dt.month
      9         if (r_start_m < m) & (m < r_end_m):

TypeError: string indices must be integers

subs is a DataFrame, and iterating over a DataFrame with a plain for loop does not give you its rows. It yields the column labels instead.
So when you loop like this, r is a string on each pass (a column name such as "start_date"). r["end_date"] then tries to index that string with another string, which is invalid because a string index must be an integer, as in r[2].
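Here is a minimal sketch that reproduces the error on a toy DataFrame. The column names mirror the ones in the question, but the values are made up for illustration:

```python
import pandas as pd

# Toy stand-in for subs; only the column names come from the question.
subs = pd.DataFrame({
    "start_date": pd.to_datetime(["2011-01-05", "2011-03-10"]),
    "end_date": pd.to_datetime(["2011-06-05", "2011-09-10"]),
})

# A plain for loop over a DataFrame yields its column labels, not its rows.
labels = [r for r in subs]
print(labels)

# Each label is a str, so indexing it with "end_date" is string indexing
# with a non-integer key, which raises the TypeError from the traceback.
try:
    labels[0]["end_date"]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The exact wording of the message can vary slightly between Python versions, but it starts with "string indices must be integers".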
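DataFrame.iterrows is a generator which yields both the index and the row (as a Series). Note that inside such a loop each date is a single Timestamp, so you would read the month with row["end_date"].month rather than with .dt.month: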

import pandas as pd

df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})

for index, row in df.iterrows():
    print(row['c1'], row['c2'])

10 100
11 110
12 120
df.iteritems() iterates over columns, not rows. To iterate over rows with it, you first transpose the DataFrame (the "T"), which swaps rows and columns (a reflection over the diagonal). df.T.iteritems() therefore effectively iterates over the rows of the original DataFrame. In pandas 2.0 and later, iteritems() has been removed and items() is the equivalent method.


for date, row in df.T.iteritems():
    print(date, row['c1'], row['c2'])

Thank you for your feedback. The approach does not work though, it turns out that in this way, I’m leaving out information from the subs DataFrame. Still, I learned that I couldn’t iterate on a df like that, thanks :+1:
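To make the lost information concrete, here is a rough sketch comparing a month-only check with a full-date check. The data is made up, and the boundary conditions (start strictly before the first day of the month, end on or after it) are an assumption about what the mission asks for, not a copy of the official solution:

```python
import datetime as dt
import pandas as pd

# Made-up subscriptions; only the column names come from the question.
subs = pd.DataFrame({
    "start_date": pd.to_datetime(
        ["2011-01-05", "2011-11-20", "2012-02-01", "2011-02-01"]),
    "end_date": pd.to_datetime(
        ["2011-06-05", "2012-03-15", "2012-08-01", "2011-05-01"]),
})

def count_customers(yearmonth):
    # Split e.g. 201203 into year 2012 and month 3.
    year, month = divmod(int(yearmonth), 100)
    first_day = dt.datetime(year, month, 1)
    # Compare whole dates so the year is taken into account (assumed criteria).
    active = (subs["start_date"] < first_day) & (first_day <= subs["end_date"])
    return int(active.sum())

# Month-only comparison, as in the original attempt: the year is ignored.
m = 3
month_only = int(((subs["start_date"].dt.month < m)
                  & (m < subs["end_date"].dt.month)).sum())

print(count_customers(201203), month_only)
```

With this toy data the two counts differ, because the month-only check accepts subscriptions that ended in 2011 and rejects one that runs from November into the following March.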


@jithins123, I think you can help @estebanalfaroorozco out; I’m a novice in this world. Could you please have a look here?

A TypeError means that you are trying to perform an operation on a value whose type does not support that operation. Sequences such as strings, lists, and tuples hold elements that can be accessed by position, and those positions are integers. When you index a string with another string or with a float, Python raises TypeError: string indices must be integers. In other words, a string must be indexed with an integer value.

For example, if s is a string, s["hello"] and s[2.1] use indexes that are not integers, so a TypeError exception is raised. If you are accessing items from a dictionary, make sure that you are indexing the dictionary itself and not one of its keys.
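A small sketch of both cases, using a hypothetical string and dictionary chosen only for illustration:

```python
s = "hello"
print(s[1])  # an integer index works

# A str index and a float index are both rejected with a TypeError.
failed = []
for bad_index in ("e", 2.1):
    try:
        s[bad_index]
    except TypeError:
        failed.append(bad_index)
print(failed)

# Hypothetical record, used only for illustration.
customer = {"name": "Ana", "plan": "basic"}
print(customer["name"])  # indexing the dictionary itself works

# Looping over a dict yields its keys, which here are plain strings,
# so key["name"] inside such a loop would raise the same TypeError.
key_types = [type(key).__name__ for key in customer]
print(key_types)
```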

Python supports slice notation for any sequence type, such as lists, strings, tuples, bytes, bytearrays, and ranges. When working with strings and slice notation, a TypeError: string indices must be integers can still be raised even though every number inside the brackets is an integer. This typically happens when the bounds are separated by a comma instead of a colon, for example s[0,3] rather than s[0:3], because the comma passes a tuple as the index.
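As a quick illustration of the comma-versus-colon case (the string here is arbitrary):

```python
s = "hello"

# A colon builds a slice, which strings accept.
good = s[0:3]
print(good)

# A comma builds the tuple (0, 3), which is not a valid string index.
try:
    s[0, 3]
except TypeError as exc:
    message = str(exc)
    print(message)
```

Newer Python versions append the offending type to the message, so the text after "string indices must be integers" may differ.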
