I need to calculate some ‘gravity’ variables, for example remoteness:
1 / SUM ((GDP_trade_partner)/(Distance))
My data has exporter-importer pairs with the flow of trade from one country to the other. But to calculate the remoteness for a given country in a given year, I need to know all its trading partners: all countries that country ‘A’ traded with, as an importer or as an exporter, without counting a country B twice if there were both imports and exports with it. For example, to have a DataFrame like this:
Or is there some other way to calculate remoteness without creating new columns or a new DataFrame? For example, I can calculate the weighted distance for each row in the DataFrame with importer GDP and exporter GDP (plus two columns). But how can I sum up this weighted distance only for unique exporter-importer pairs?
Thank you so much! I will try that.
I am okay with the new column, I didn’t know how to approach the task at all.
I meant that I am looking for any solution creating new columns/ dataframe or not.
I tried the ‘check_string’ solution and it worked: it dropped rows with duplicate importer-exporter pairs. But I still could not figure out how to calculate remoteness for each country (weighted distance to all trade partners). So I did this and it worked:
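import pandas as pd

# the original dataframe had the columns 'year', 'exporter', 'importer'
# df2 is the same data with the two country columns swapped
df2 = df[['year', 'importer', 'exporter']]
df.columns = ['year','country','partner']
df2.columns = ['year','country','partner']
# DataFrame.append returned a new frame (and was removed in pandas 2.0),
# so concatenate and assign the result back
df = pd.concat([df, df2], ignore_index=True)
df.drop_duplicates(inplace=True)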
Then I added the partner’s GDP and the distance between countries, and this is how I calculated remoteness:
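For anyone following along, a minimal sketch of this step could look like the following. It is not necessarily the exact code used here; the column names `partner_gdp` and `distance` and the toy values are assumptions for illustration only. It takes one row per unique (year, country, partner) link, divides the partner's GDP by the distance, sums that per country and year, and inverts the sum.

```python
import pandas as pd

# Toy data: one row per unique (year, country, partner) link.
# 'partner_gdp' and 'distance' are hypothetical column names.
links = pd.DataFrame({
    "year": [2000, 2000, 2000, 2000],
    "country": ["A", "A", "B", "C"],
    "partner": ["B", "C", "A", "A"],
    "partner_gdp": [100.0, 300.0, 200.0, 200.0],
    "distance": [50.0, 100.0, 50.0, 100.0],
})

# Partner GDP divided by distance, one value per link
links["gdp_over_dist"] = links["partner_gdp"] / links["distance"]

# remoteness = 1 / SUM(GDP_trade_partner / Distance), per country and year
remoteness = (
    (1 / links.groupby(["year", "country"])["gdp_over_dist"].sum())
    .rename("remoteness")
    .reset_index()
)

print(remoteness)
```

Because duplicates were already dropped, each partner contributes to the sum exactly once, whether the link came from an export or an import row.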