Different approach to finding the churn rate in Business Metrics isn't accepted as the answer

Screen Link:
I tried a different approach to find the churn rate in the Business Metrics mission. I took yearmonth in yyyymm format as a string, appended '01' to get the first day of the month, and compared that date with the start and end dates.

My Code:

import pandas as pd

def fun(ym):
    # Build the first day of the month from a yyyymm value, e.g. 201103 -> 2011-03-01
    day01 = pd.to_datetime(str(ym) + '01', format='%Y%m%d')
    # Count subscriptions that started before that day and end after it
    customers = ((day01 > subs['start_date']) &
                 (day01 < subs['end_date'])).sum()
    return customers

churn['total_customers'] = churn['yearmonth'].apply(fun)
churn['churn_rate'] = churn['total_churned'] / churn['total_customers']
churn['yearmonth'] = churn['yearmonth'].astype(str)

What I expected to happen:

I couldn't find any errors in the code. As far as I can tell, it gives the same answer as the provided solution.

What actually happened:

However, the answer checker reports that the actual output and the required output are not the same. I believe the values, the decimal places, and the datatypes all match, although the approach used in the solution code is a bit different from mine. I have added the solution code below.

import datetime as dt

def get_customers(yearmonth):
    year = yearmonth//100
    month = yearmonth-year*100
    date = dt.datetime(year, month, 1)
    
    return ((subs["start_date"] < date) & (date <= subs["end_date"])).sum()

churn["total_customers"] = churn["yearmonth"].apply(get_customers)
churn["churn_rate"] = churn["total_churned"] / churn["total_customers"]
churn["yearmonth"] = churn["yearmonth"].astype(str)

Here is the output screenshot

The solution code gets the green light to move on to the next mission, but my code isn't accepted as the answer.

It would be great if anyone could tell me what could have gone wrong here. Thanks in advance.

In cases like this, when it's difficult to figure out where the problem is (and the platform isn't always able to help, as you can see), use your solution and the provided solution together to your advantage.

  1. Create two copies of the original dataframe
  2. Apply the given solution to one copy and your solution to the other copy
  3. Compare the columns with a simple boolean check

For example, after doing the above, I checked the two with a simple print statement -

print(original_churn['total_customers']==student_churn['total_customers'])

If the two columns were identical, every value would be True. Instead, three of them don't match.

Start by using those three as a reference to debug your function.
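To go from "some values differ" to "these rows differ", the boolean check can be reused as a mask. Below is a minimal sketch with made-up numbers; original_churn and student_churn are stand-ins for the two copies described above, and the values are not the real mission data.

```python
import pandas as pd

# Hypothetical stand-ins for the two result dataframes; the values are made up.
original_churn = pd.DataFrame({
    "yearmonth": ["201101", "201102", "201103", "201104"],
    "total_customers": [10, 12, 15, 18],
})
student_churn = pd.DataFrame({
    "yearmonth": ["201101", "201102", "201103", "201104"],
    "total_customers": [10, 11, 15, 17],
})

# Boolean mask: True where the two columns disagree.
mismatch = original_churn["total_customers"] != student_churn["total_customers"]

# Keep only the disagreeing rows, with both values side by side.
diff = pd.DataFrame({
    "yearmonth": original_churn.loc[mismatch, "yearmonth"],
    "expected": original_churn.loc[mismatch, "total_customers"],
    "actual": student_churn.loc[mismatch, "total_customers"],
})

print(diff)
print("Number of mismatches:", int(mismatch.sum()))
```

Once you know which yearmonth values disagree, you can call your function on just those values and inspect the subscriptions that are counted differently.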

Additional Tip -

In pandas, you can specify how many rows or columns are displayed when you print a Series or DataFrame. This helps in cases like this one, where the default output doesn't show all the results. Adding the following at the top of your code should help as well -

pd.set_option('display.max_rows', None)

That None allows it to display all the rows instead of a few. Unfortunately, this doesn’t work with DQ’s grader when you submit, but you can use it for running the code and printing things out.
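If you would rather not change the setting for the whole script, pandas also provides pd.option_context, which applies an option only inside a with block. The sketch below uses a made-up Series named checks; I haven't verified how DQ's grader reacts to it, so treat it as something to try while debugging rather than a guaranteed workaround.

```python
import pandas as pd

# Hypothetical Series, long enough that pandas truncates it by default.
checks = pd.Series([True] * 100)

before = pd.get_option("display.max_rows")

# Inside the with block every row is rendered; the option reverts afterwards.
with pd.option_context("display.max_rows", None):
    full_text = repr(checks)
    print(full_text)

after = pd.get_option("display.max_rows")

# Outside the block the default (truncated) display is back.
short_text = repr(checks)
```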


Hi @the_doctor,
Thanks a lot for the guidance! I ran the check the way you explained and found the differences. On further inspection, it became clear that end_date can fall exactly on the first day of the month, and my strict comparison excluded those subscriptions. By changing the comparison to

customers =  ((day01 > subs['start_date']) & 
            (day01<=subs['end_date'])).sum()

things got sorted. Because the top and bottom rows matched those shown in the solution, I assumed the error was elsewhere! I should have tried harder to spot it. Thanks again for your quick help.
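For anyone who lands here with the same problem, the boundary case can be reproduced on a tiny example. This is only a sketch with made-up subscriptions (the subs dataframe below is hypothetical, not the mission data); the second subscription ends exactly on the first day of the month.

```python
import pandas as pd

# Hypothetical subscriptions; the second one ends exactly on 2011-04-01.
subs = pd.DataFrame({
    "start_date": pd.to_datetime(["2011-01-15", "2011-02-10", "2011-03-20"]),
    "end_date": pd.to_datetime(["2011-06-30", "2011-04-01", "2011-05-15"]),
})

day01 = pd.to_datetime("20110401", format="%Y%m%d")

# Strict comparison: a subscription ending on day01 is not counted.
strict = ((day01 > subs["start_date"]) & (day01 < subs["end_date"])).sum()

# Inclusive comparison: a subscription ending on day01 is still counted.
inclusive = ((day01 > subs["start_date"]) & (day01 <= subs["end_date"])).sum()

print(strict, inclusive)  # 2 3
```

The two counts differ only for months in which some end_date equals the first day of that month, which is why most rows, including the top and bottom ones, can still match.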
