Hi! I have 2 questions about this screen.
Screen Link: https://app.dataquest.io/m/468/business-metrics/7/date-wrangling
1. right_index=True seems to reset the index for a table, but which table is it applied to here, churn or monthly_churn? It looks like monthly_churn, but we never told Python which one, so how does it know?
2. We learned pd.merge() in the format below before, but here it raises an error about 'yearmonth'. Why?
churn = pd.merge(left=churn, right=monthly_churn, how="left", on="yearmonth", right_index=True)
You are getting an error for that line of code because yearmonth is not a common column in churn and monthly_churn. As the column is only present in churn, you will have to use the left_on option instead of on.
right_index=True does not reset the index for the table. It merely tells the merge function to use the index of the right dataframe (monthly_churn in this case) as the key to join with the left dataframe (churn in this case). Since we cannot use the on option in this scenario (there are no common columns in the two dataframes), we have to specify the merge key for both sides: the left (churn, using left_on) and the right (monthly_churn, using right_index).
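Here is a rough sketch of that merge on made-up data. Only the names churn, monthly_churn, yearmonth and churn_month come from this thread; the values and the total_churned column are assumptions for illustration, not the course dataset.

```python
import pandas as pd

# Hypothetical toy data; the real course tables are larger.
churn = pd.DataFrame({"yearmonth": [201101, 201102, 201103]})

# monthly_churn has no "yearmonth" column; its month values live in the index.
# The column name total_churned is made up for this example.
monthly_churn = pd.DataFrame(
    {"total_churned": [5, 8]},
    index=pd.Index([201101, 201102], name="churn_month"),
)

# This is why on="yearmonth" cannot work: the right table has no such column.
print("yearmonth" in monthly_churn.columns)  # False

# left_on names the key column of the left table (churn);
# right_index=True says the key of the right table (monthly_churn) is its index.
merged = pd.merge(
    left=churn,
    right=monthly_churn,
    how="left",
    left_on="yearmonth",
    right_index=True,
)
print(merged)

# The right table itself is untouched: churn_month is still its index.
print(monthly_churn.index.name)
```

Because right_index always refers to the dataframe passed as right, there is nothing extra to tell pandas about which table it applies to.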
Hope this helps! Let me know if you have any more questions regarding this.
So in the pd.merge() function, if the columns of the 2 datasets hold the same values and only the column names differ, we have to specify which key to use from each dataset. In this case we use left_on for the column of the left dataset, churn, and right_index for the right dataset, which has the same values as the column of the left dataset. Does it mean that we have to use left_on and right_index as a pair to merge 2 datasets that have no common column (I mean the column names are not the same but the values are)?
Yes, you are correct. You have to specify keys for both dataframes when they don’t have the same column names. If churn_month were a column in monthly_churn instead of the index (which is the case here), then you would have used the right_on option instead of right_index.
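To illustrate the right_on variant, here is a minimal sketch that first turns the index into a regular column with reset_index(). The data and the total_churned column are assumptions made up for the example; only the dataframe and key names come from this thread.

```python
import pandas as pd

# Hypothetical toy data, same shape as described in the thread.
churn = pd.DataFrame({"yearmonth": [201101, 201102, 201103]})
monthly_churn = pd.DataFrame(
    {"total_churned": [5, 8]},
    index=pd.Index([201101, 201102], name="churn_month"),
)

# reset_index() moves churn_month from the index into a regular column,
# so the right-hand key can now be named with right_on.
monthly_as_column = monthly_churn.reset_index()

merged = pd.merge(
    left=churn,
    right=monthly_as_column,
    how="left",
    left_on="yearmonth",
    right_on="churn_month",
)
print(merged)
```

Note that with right_on the key column churn_month is carried into the result next to yearmonth, whereas with right_index=True it is not.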
Interesting that both options worked for me when I ran the code locally, but only right_index worked on Dataquest…
That is interesting, @maksym001. Perhaps one of the @moderators can throw some light on it.