Another approach: using strftime for date conversion

My Code:

import pandas as pd

subs = pd.read_csv("muscle_labs.csv", parse_dates=["end_date", "start_date"])

# Format each end date as a "yyyymm" string, then cast it to int.
subs["churn_month"] = subs["end_date"].apply(lambda x: x.strftime("%Y%m"))
subs["churn_month"] = subs["churn_month"].astype("int")

# Count the churned subscriptions in each month.
monthly_churn = pd.DataFrame(subs.groupby(["churn_month"]).size())
monthly_churn.rename(columns={0: "total_churned"}, inplace=True)
monthly_churn["total_churned"] = monthly_churn["total_churned"].astype("int")

What I expected to happen:
I used strftime for the conversion. The result was the same, but to pass the Dataquest test case the churn_month column had to be converted to int.

What actually happened:


I need to find out which way is faster.
The Dataquest solution should be fast since it uses parallel operations,
but I kept the column as datetime only.

Hi @eashwary
If you want to find which one is faster, you can put %time in front of a statement in your Jupyter notebook to see how long that statement takes to run.

PS: to time a whole cell instead, put %%time on the first line of the cell.
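Outside a notebook, the same comparison can be sketched with the standard timeit module. This is only a rough illustration: the end_date series below is synthetic, the helper names with_apply and with_dt_accessor are made up for this example, and the dt-accessor version is one possible vectorized alternative, not necessarily what the Dataquest solution does.

```python
import timeit

import pandas as pd

# Synthetic stand-in for the end_date column (assumption: no missing dates).
end_date = pd.Series(pd.date_range("2011-01-01", periods=20_000, freq="D"))


def with_apply(dates):
    # Row-by-row strftime, as in the original post.
    return dates.apply(lambda x: x.strftime("%Y%m")).astype("int")


def with_dt_accessor(dates):
    # Vectorized arithmetic on the datetime components.
    return dates.dt.year * 100 + dates.dt.month


# Both approaches should produce the same yyyymm integers.
same_result = (with_apply(end_date) == with_dt_accessor(end_date)).all()

# Average seconds per call over a few runs; the numbers vary by machine.
apply_seconds = timeit.timeit(lambda: with_apply(end_date), number=3) / 3
dt_seconds = timeit.timeit(lambda: with_dt_accessor(end_date), number=3) / 3
print(f"apply + strftime: {apply_seconds:.4f}s, dt accessor: {dt_seconds:.4f}s")
```

Timings on a small dataset can be noisy, so it may help to raise number or the series length before drawing conclusions.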

If anyone is on this mission screen, what do you think about this approach to finding the date for comparison?

day01 = str(yyyymm) + '01'
day01 = pd.to_datetime(day01, format='%Y%m%d')

This returns the first day of each month from churn['yearmonth'], which can then be compared with start_date and end_date as they are.
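A small self-contained sketch of that idea follows. The sample rows, the helper name first_day_of_month, and the exact comparison (started before the month began and not yet ended) are assumptions for illustration; the mission may define the comparison differently.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from muscle_labs.csv.
subs = pd.DataFrame({
    "start_date": pd.to_datetime(["2011-01-15", "2011-02-10", "2011-03-05"]),
    "end_date": pd.to_datetime(["2011-02-20", "2011-04-01", "2011-03-25"]),
})


def first_day_of_month(yyyymm):
    # e.g. 201103 -> "20110301" -> Timestamp for 2011-03-01
    return pd.to_datetime(str(yyyymm) + "01", format="%Y%m%d")


day01 = first_day_of_month(201103)

# Subscriptions that started before the month began and had not yet ended.
active = subs[(subs["start_date"] < day01) & (subs["end_date"] >= day01)]
print(day01.date(), len(active))
```

Because day01 is a Timestamp, it compares directly against the datetime columns without any further conversion.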