Extract Lat from dataset

After calling re.findall, we get back everything inside the ( ), like [(40.8276026690005, -73.90447525699966)]. But when we do lat=coords[0].split(',')[0].replace('(',''), I don't understand why coords[0] corresponds to 40.8276026690005 only. As I understand it, each ( ) is one element, so for [(a,b),(c,d)], index 0 should be (a,b).

My Code:

import re

def find_lat(loc):
    pattern=r'\(.+\)'
    coords=re.findall(pattern,loc)
    lat=coords[0].split(',')[0].replace('(','')
    return lat

data['hs_directory']['lat']=data['hs_directory']['Location 1'].apply(find_lat)

print(data['hs_directory']['lat'].head())

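Your understanding of how zero-indexing works for [(a,b),(c,d)] is correct: index 0 holds the item (a,b). Note that re.findall returns a list of strings, so coords[0] is the whole string '(a,b)', not just the first number. The split(',') turns '(a,b)' into ['(a', 'b)']. The [0] after it picks '(a' from that list, and replace('(','') removes the (.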

Kindly see below

import re

coords = [(40.8276026690005, -73.90447525699966)]

print(coords[0])

Output: (40.8276026690005, -73.90447525699966)

lat=str(coords[0]).split(',')
lat

Output: ['(40.8276026690005', ' -73.90447525699966)']

lat=str(coords[0]).split(',')[0]
lat

Output: (40.8276026690005

lat=str(coords[0]).split(',')[0].replace('(','')
lat

Output: 40.8276026690005
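
The walkthrough above starts from a tuple of floats, which is why it needs str(). In find_lat itself, re.findall already returns strings, so no conversion is needed. Here is a minimal sketch of the same steps on a string. The sample value of loc is hypothetical; it only assumes that 'Location 1' holds an address followed by the coordinates in parentheses.

```python
import re

# Hypothetical sample value (an assumption, not taken from the dataset):
# an address followed by the coordinates in parentheses.
loc = "123 Example St\nBronx, NY 10456\n(40.8276026690005, -73.90447525699966)"

# findall returns a list of matched substrings, here a list with one string.
coords = re.findall(r'\(.+\)', loc)
print(coords)
print(type(coords[0]))  # coords[0] is a str, not a tuple

# Split the string on the comma, take the first piece, drop the '('.
parts = coords[0].split(',')
lat = parts[0].replace('(', '')
print(lat)
```

So coords[0] is the full "(lat, lon)" text, and it is the second [0], applied after split(','), that narrows it down to the latitude.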

Thank you for explaining it. I got it!!!
