Screen Link: https://app.dataquest.io/m/466/fuzzy-language-in-data-science/4/churned-customers
My Code:
import pandas as pd
import datetime as dt
data = pd.read_csv("rfm_xmas19.txt", parse_dates=["trans_date"])
group_by_customer = data.groupby("customer_id")
last_transaction = group_by_customer["trans_date"].max()
best_churn = pd.DataFrame(last_transaction)
def has_churned(row):
    cutoff_day = dt.datetime(2019,10,16)
    if row['trans_date'] > cutoff_day:
        row['churned'] = 0
    else:
        row['churned'] = 1
    return row
best_churn = best_churn.apply(has_churned,1)
The Solution:
cutoff_day = dt.datetime(2019, 10, 16)
best_churn["churned"] = best_churn["trans_date"].apply(
    lambda date: 1 if date < cutoff_day else 0
)
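After playing around with this a bit, I realized that the error in my code was comparing with > relative to cutoff_day rather than <. With >, a customer whose last transaction falls exactly on October 16 is marked as churned, whereas the solution's < marks them as not churned. Personally, I feel it would be helpful if the instructions made it more explicit that "since" means no purchases on or after October 16.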
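To see where the two comparisons disagree, here is a minimal sketch. The three dates and customer IDs are made up for illustration (they are not from rfm_xmas19.txt), and the column names churned_mine and churned_solution are hypothetical. It assumes my if/else is equivalent to a <= check, which holds as long as there are no missing dates.

```python
import datetime as dt

import pandas as pd

cutoff_day = dt.datetime(2019, 10, 16)

# Hypothetical last-transaction dates, one per customer (illustration only)
best_churn = pd.DataFrame(
    {"trans_date": pd.to_datetime(["2019-10-15", "2019-10-16", "2019-10-17"])},
    index=pd.Index([1, 2, 3], name="customer_id"),
)

# My rule: churned unless the last purchase is strictly after the cutoff,
# i.e. churned when trans_date <= cutoff_day
best_churn["churned_mine"] = (best_churn["trans_date"] <= cutoff_day).astype(int)

# The solution's rule: churned only when the last purchase is before the cutoff
best_churn["churned_solution"] = (best_churn["trans_date"] < cutoff_day).astype(int)

# Rows where the two rules give different labels
mismatch = best_churn[best_churn["churned_mine"] != best_churn["churned_solution"]]
print(mismatch)
```

Only the customer whose last transaction is on the cutoff day itself ends up in mismatch, so the two versions differ only at that boundary.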