Challenge: Calculating Return on Assets by Country

Screen Link: Learn data science with Python and R projects

So I was doing the challenge and it's pretty straightforward.
However, for the second part of it (getting the name of the company with the highest roa in every sector) I decided to explore the pandas documentation and see if I could solve it some other way than the loop method, using built-in functions and utilities.
So I stumbled upon the pandas.DataFrame.groupby function, and using it I managed to get a Series containing the sectors and their corresponding highest roa.
My Code:

grouped_by_sector = f500.groupby(["sector"])["roa"].max()
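To make it clearer what that line returns, here is a small sketch on made-up data. The `toy` DataFrame and its values are hypothetical stand-ins for `f500`, not the real dataset; only the column names `company`, `sector` and `roa` are taken from the post.

```python
import pandas as pd

# Hypothetical toy data standing in for f500 (values invented for illustration).
toy = pd.DataFrame({
    "company": ["A", "B", "C", "D"],
    "sector": ["Energy", "Energy", "Retail", "Retail"],
    "roa": [0.05, 0.10, 0.20, 0.02],
})

# Same pattern as above: one maximum roa per sector.
max_roa = toy.groupby("sector")["roa"].max()

# The result is a Series indexed by sector whose values are roa numbers.
# The company column is not carried along, which is the gap described below.
print(max_roa)
```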

However, what we want is obviously the name of the company with the highest roa per sector, not just the roa itself. Does anyone have an idea how to adapt this to get the needed result?
Thanks.

f500["roa"] = f500["profits"] / f500["assets"]

idx = f500.groupby(["sector"])["roa"].transform("max") == f500["roa"]

top_roa_by_sector_df = f500[idx][["company", "sector"]]


top_roa_by_sector_tuples = zip(top_roa_by_sector_df.sector, top_roa_by_sector_df.company)

top_roa_by_sector = dict(top_roa_by_sector_tuples)

This was the final solution I came up with after googling and checking the documentation. Does anyone know how I can optimise it further?
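One possible shortening, offered as a sketch rather than the definitive answer, is to ask groupby for the row label of each sector's maximum with `idxmax` and then look those rows up with `.loc`. The `toy` DataFrame below is hypothetical data standing in for `f500`; the column names follow the post.

```python
import pandas as pd

# Hypothetical toy data standing in for f500 (values invented for illustration).
toy = pd.DataFrame({
    "company": ["A", "B", "C", "D"],
    "sector": ["Energy", "Energy", "Retail", "Retail"],
    "profits": [5.0, 10.0, 20.0, 2.0],
    "assets": [100.0, 100.0, 100.0, 100.0],
})
toy["roa"] = toy["profits"] / toy["assets"]

# idxmax gives, per sector, the index label of the row with the highest roa.
top_rows = toy.loc[toy.groupby("sector")["roa"].idxmax()]

# Turn the selected rows into a {sector: company} dictionary.
top_roa_by_sector = top_rows.set_index("sector")["company"].to_dict()
print(top_roa_by_sector)
```

One difference to be aware of: if two companies in a sector tie for the highest roa, `idxmax` keeps the first one it encounters, while the boolean-mask version keeps every tied row and `dict()` then silently retains the last of them.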
