Personal Python Project; Help Needed

Hi,

I have a large CSV file that I have read into a pandas DataFrame, and I now need to use the data in one column (the name of a river) to look up the latitude and longitude via geopy.

Can anyone point me in the right direction on how to write the following pseudo-code in Python?
For each null value in columns 12 and 13:
take the name of the river in column 8 and search geopy for the lat and long.
then overwrite the null values in columns 12 and 13 with the lat and long returned from geopy.

Thank you for pointing me in the right direction!

Tim

Hi Tim,

I think the code below might work for you. I am making a few assumptions about your DataFrame: (1) column 12 is latitude and column 13 is longitude, (2) you have already run pip install geopy, (3) your DataFrame is named df.

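Note: If you call Nominatim() without a user_agent argument, older geopy versions emit a DeprecationWarning, and as far as I know geopy 2.x refuses to run at all. Passing a user_agent string that identifies your application, e.g. Nominatim(user_agent='rivers-lookup'), avoids both.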

I made a fake df to test this out on. It looks like the table below:

col_8                     col_12   col_13
river1                    50       100
Oldman, Alberta           NaN      NaN
Sturgeon River, Alberta   NaN      NaN
river2                    178      234
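For anyone who wants to reproduce this test table without a CSV file, here is a minimal sketch that builds it directly. The column names col_8, col_12 and col_13 are just the placeholders used in this thread, not names from Tim's real file.

```python
import pandas as pd

# Placeholder column names from this thread; the real file will differ.
nan = float("nan")
df = pd.DataFrame({
    "col_8": ["river1", "Oldman, Alberta", "Sturgeon River, Alberta", "river2"],
    "col_12": [50, nan, nan, 178],   # latitude, missing for two rows
    "col_13": [100, nan, nan, 234],  # longitude, missing for the same rows
})

# Show which rows still need coordinates.
print(df[df["col_12"].isna() | df["col_13"].isna()])
```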

# Import
import pandas as pd
from geopy.geocoders import Nominatim

# import the river csv (I used the name rivers.csv)
df = pd.read_csv('rivers.csv')

# create an instance of the Nominatim geocoder
# (the user_agent value is only an example; use a string that identifies your app)
nom = Nominatim(user_agent='rivers-lookup')

# retrieve the coordinates (lat & long) for the river
# and store them in the new column 'coordinates'
df['coordinates'] = df['col_8'].apply(nom.geocode)

# create columns with just latitude and longitude from the
# coordinates column we created. we want to apply the lambda function
# that assigns the row the lat (or long) value if the coordinate is not None
df['lat'] = df['coordinates'].apply(lambda x: x.latitude if x is not None else None)
df['long'] = df['coordinates'].apply(lambda x: x.longitude if x is not None else None)

# Fill missing values in columns 12 & 13 with the lat and long values
# (assigning the result back avoids the chained inplace=True pattern)
df['col_12'] = df['col_12'].fillna(df['lat'])
df['col_13'] = df['col_13'].fillna(df['long'])

Hi, I thought I would add a screenshot of the jupyter notebook:

Hi Tim,

I’ve noticed that when you copy code from the responses, you get curly quotation marks (‘ ’) instead of straight ones, and Python does not accept them. Try replacing the quotation marks around rivers.csv with plain ' or " characters, and make sure that you use the name of your own CSV file.