Guided Project: Analyzing NYC High School Data: Missing Data


I’m working on the Analyzing NYC High School Data guided project and have a question about left and inner joins. The instructions assume that the ap_2010 and graduation datasets have many missing values in the DBN (school code) column, which is why we use a left join to preserve as much information as possible when combining these datasets with sat_results. I checked both datasets, and neither has any missing values in the DBN column.

My question is: do we need a left join to preserve the information in the sat_results dataset in case some DBNs do not match across the datasets, or is there some other explanation?
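
To illustrate the distinction I mean, here is a minimal sketch with made-up data. The DBN values, scores, and the "AP Test Takers" column are placeholders for illustration, not the project's real data. It shows a case where no DBN is missing in either table, yet a left join and an inner join still give different results because one DBN in sat_results has no match in ap_2010.

```python
import pandas as pd

# Toy stand-ins for the project's datasets (all values are invented).
sat_results = pd.DataFrame({
    "DBN": ["01M292", "01M448", "01M450"],
    "sat_score": [1122, 1172, 1149],
})
ap_2010 = pd.DataFrame({
    "DBN": ["01M448", "01M450", "02M001"],
    "AP Test Takers": [39, 19, 25],
})

# Check 1: are any DBN values actually missing (NaN) in ap_2010?
missing_dbn = ap_2010["DBN"].isnull().sum()

# Check 2: which DBNs in sat_results have no counterpart in ap_2010?
unmatched = set(sat_results["DBN"]) - set(ap_2010["DBN"])

# A left join keeps every row of sat_results; an inner join keeps
# only the rows whose DBN appears in both tables.
left = sat_results.merge(ap_2010, on="DBN", how="left")
inner = sat_results.merge(ap_2010, on="DBN", how="inner")

print("Missing DBNs in ap_2010:", missing_dbn)
print("DBNs only in sat_results:", unmatched)
print("Rows after left join:", len(left))
print("Rows after inner join:", len(inner))
print("NaN AP values after left join:", left["AP Test Takers"].isnull().sum())
```

Running the same two checks on the real ap_2010 and graduation DataFrames should show whether the left join is guarding against unmatched DBNs rather than missing ones.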

