Kindly explain the second strategy on this link. It's very confusing.

Screen Link:

Hi @joshi.ananya.joshi1

Sorry to say, but the question itself is also very confusing. Could you please explain what exactly is causing the confusion?

The screen link provided mentions two strategies:

  • For rows with location values but missing values in either borough or the street name columns, we used geocoding APIs to look up the location coordinates to find the missing data.
  • For rows with values in the street name columns missing borough and/or location data, we used geocoding APIs to look up the address to find the missing data.

I am unable to understand the second one. Kindly explain it.


This describes how they found the missing values, called the supplemental data, using the GeoPy package. I'm sure you figured that out.
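To make the difference between the two strategies concrete, here is a minimal sketch of the logic. It is not the mission's actual code: `fill_row` and the two stub lookups are hypothetical names, and the stubs stand in for real geocoding calls. With GeoPy, the first lookup would roughly correspond to a geocoder's `reverse()` method (coordinates in, address out) and the second to its `geocode()` method (address in, coordinates out).

```python
def fill_row(row, reverse_lookup, forward_lookup):
    """Return a copy of `row` with missing fields filled where possible.

    `reverse_lookup` and `forward_lookup` are hypothetical callables that
    stand in for real geocoding API calls.
    """
    filled = dict(row)
    if filled.get("location") is not None:
        # Strategy 1: we know the coordinates, so look them up
        # to find the missing borough and/or street name.
        found = reverse_lookup(filled["location"])
        for key in ("borough", "on_street"):
            if filled.get(key) is None:
                filled[key] = found.get(key)
    elif filled.get("on_street") is not None:
        # Strategy 2: we know the street name, so look up the address
        # to find the missing borough and/or coordinates.
        found = forward_lookup(filled["on_street"])
        for key in ("borough", "location"):
            if filled.get(key) is None:
                filled[key] = found.get(key)
    return filled


# Stub "APIs" with canned answers, for illustration only.
def fake_reverse(coords):
    answers = {
        (40.591755, -73.9083): {"borough": "BROOKLYN", "on_street": "BELT PARKWAY"},
    }
    return answers.get(coords, {})


def fake_forward(street):
    answers = {
        "WEST 15 STREET": {"borough": "MANHATTAN", "location": (40.742832, -74.00771)},
    }
    return answers.get(street, {})


# Row with a location but no borough: handled by strategy 1.
row_a = {"borough": None, "location": (40.591755, -73.9083), "on_street": "BELT PARKWAY"}
# Row with a street name but no borough or location: handled by strategy 2.
row_b = {"borough": None, "location": None, "on_street": "WEST 15 STREET"}

filled_a = fill_row(row_a, fake_reverse, fake_forward)
filled_b = fill_row(row_b, fake_reverse, fake_forward)
print(filled_a)
print(filled_b)
```

So the second strategy is simply the reverse direction of the first: instead of starting from coordinates, it starts from the street name.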

If you go back, you will see a table like this. There are missing values in borough, location, etc.

   borough    location                on_street       off_street           cross_street
0  MANHATTAN  (40.742832, -74.00771)  WEST 15 STREET  NaN                  10 AVENUE
1  BROOKLYN   (40.623714, -73.99314)  16 AVENUE       NaN                  62 STREET
2  NaN        (40.591755, -73.9083)   BELT PARKWAY    NaN                  NaN
3  QUEENS     (40.73602, -73.87954)   GRAND AVENUE    NaN                  VANLOON STREET
4  BRONX      (40.884727, -73.89945)  NaN             208 WEST 238 STREET  NaN

To fill in this data, they created a file called supplemental_data.csv.
The two strategies you mentioned were used to create this file. This .csv file in turn helps us fill the missing values in the data frame.
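As a rough illustration of that last step, here is a small pandas sketch. The supplemental values below are made up for the example, and it assumes the supplemental data has the same index and column names as the main data frame; the real supplemental_data.csv may be laid out differently.

```python
import numpy as np
import pandas as pd

# A tiny stand-in for the main data frame, with some missing values.
df = pd.DataFrame({
    "borough": ["MANHATTAN", np.nan, "BRONX"],
    "on_street": ["WEST 15 STREET", "BELT PARKWAY", np.nan],
})

# Hypothetical supplemental data: a value only where a lookup found one.
supplemental = pd.DataFrame({
    "borough": [np.nan, "BROOKLYN", np.nan],
    "on_street": [np.nan, np.nan, np.nan],
})

# For each column, replace a value only where the original is missing.
for col in ["borough", "on_street"]:
    df[col] = df[col].mask(df[col].isnull(), supplemental[col])

print(df)
```

Values that were already present are left alone, and a cell stays missing if the supplemental data has nothing for it either.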

In case you were asking how they found these values to create supplemental_data.csv, I'll have to answer like a bad teacher who doesn't want to admit that they don't know the answer: "You don't need to know that now to complete this mission" :wink:

As the mission screen states, “You can learn more about working with APIs in our APIs and Web Scraping course.”

I hope this answers your question.
