Just need an explanation

Screen Link: https://app.dataquest.io/m/292/exploring-data-with-pandas%3A-intermediate/12/challenge-calculating-return-on-assets-by-country

My Code:

f500["roa"] = f500["profits"] / f500["assets"]

top_roa_by_sector = {}
for sector in f500["sector"].unique():
    is_sector = f500["sector"] == sector  # <-- Can somebody tell me what this line of code is doing?
    sector_companies = f500.loc[is_sector]
    
Are we assigning the sector column a new name for the loop?

Hi @troc25,

As far as I understand, this line assigns True or False to is_sector for each row, depending on whether that row's sector matches.

It generates a Boolean Series by comparing each row of f500["sector"] with the sector string.

For example:

# Note: taking only the first five rows of `f500` for this example
f500_ = f500.head()
print(f500_["sector"])
0                 Retailing
1                    Energy
2                    Energy
3                    Energy
4    Motor Vehicles & Parts
Name: sector, dtype: object

# and if sector is "Energy"
sector = "Energy"

is_sector = f500_["sector"] == sector    # Returns `True` for every row where the sector matches, otherwise `False`
print(is_sector)
0    False
1     True
2     True
3     True
4    False
Name: sector, dtype: bool

# later we can use `is_sector` for boolean indexing
# (the mask was built from `f500_`, so it is applied to that same frame)
f500_.loc[is_sector]
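
To tie this back to the original loop, here is a minimal, self-contained sketch of the whole pattern: build the mask, filter with it, then pick the top company. The toy DataFrame, its company names, and its numbers are made up for illustration and are not the real f500 data; sorting by `roa` is one possible way to finish the loop, not necessarily what the exercise expects.

```python
import pandas as pd

# Hypothetical stand-in for f500: made-up companies and figures.
f500 = pd.DataFrame({
    "company": ["A", "B", "C", "D", "E"],
    "sector": ["Retailing", "Energy", "Energy", "Energy", "Motor Vehicles & Parts"],
    "profits": [10.0, 5.0, 8.0, 2.0, 6.0],
    "assets": [100.0, 100.0, 40.0, 50.0, 60.0],
})
f500["roa"] = f500["profits"] / f500["assets"]

top_roa_by_sector = {}
for sector in f500["sector"].unique():
    # Boolean mask: True where the row's sector equals the current sector.
    is_sector = f500["sector"] == sector
    # Keep only the rows where the mask is True.
    sector_companies = f500.loc[is_sector]
    # Sort by roa, highest first, and take the first row's company name.
    top_row = sector_companies.sort_values("roa", ascending=False).iloc[0]
    top_roa_by_sector[sector] = top_row["company"]

print(top_roa_by_sector)
```

So `is_sector` does not rename anything. It is a new Series of True/False values that is rebuilt on every pass through the loop.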

Dishin,
I appreciate how you broke this down. Thank you!
-T

Hi, this was my approach. It worked for me, but I don't know if it is quite right:

    top_roa_by_sector = {}

    sector_companies = f500["sector"].unique()
    # i = 0
    for sector_name in sector_companies:
        # I use the next line as a reducer (it is really a Boolean array; see the
        # explanation in the previous post)
        reducer = f500["sector"] == sector_name
        data = f500.loc[reducer,["company", "sector", "roa"]]
        # viewing each block in the console
        # print(data)

        # ordering data and viewing each block in the console
        data_ordered = data.sort_values("roa",ascending=False)
        # iloc[row, column] avoids chained integer lookups on a labeled row
        print(data_ordered.iloc[0, 0], data_ordered.iloc[0, 2])

        # taking the values
        company_name = data_ordered.iloc[0, 0]
        top_value = data_ordered.iloc[0, 2]

        # creating the dictionary
        # top_roa_by_sector[sector_name] = {company_name : top_value}
        top_roa_by_sector[sector_name] = company_name
        # i += 1
        
    # In the end, the loop ran only 21 times (once per unique sector)
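
One way to check whether a loop like this is right is to compare it against a different route to the same answer. The sketch below is only a cross-check, not the exercise's intended solution: it swaps in `groupby` with `idxmax`. It runs on a made-up toy DataFrame (hypothetical companies and numbers, not the real f500 data) with no ties in `roa`.

```python
import pandas as pd

# Hypothetical stand-in for f500: made-up companies and figures.
f500 = pd.DataFrame({
    "company": ["A", "B", "C", "D", "E"],
    "sector": ["Retailing", "Energy", "Energy", "Energy", "Motor Vehicles & Parts"],
    "profits": [10.0, 5.0, 8.0, 2.0, 6.0],
    "assets": [100.0, 100.0, 40.0, 50.0, 60.0],
})
f500["roa"] = f500["profits"] / f500["assets"]

# Loop version, following the approach in the post above.
loop_result = {}
for sector_name in f500["sector"].unique():
    reducer = f500["sector"] == sector_name
    data_ordered = f500.loc[reducer].sort_values("roa", ascending=False)
    loop_result[sector_name] = data_ordered.iloc[0]["company"]

# Cross-check: idxmax gives the index label of the highest-roa row per sector.
top_rows = f500.loc[f500.groupby("sector")["roa"].idxmax()]
groupby_result = dict(zip(top_rows["sector"], top_rows["company"]))

print(loop_result == groupby_result)
```

If two companies in a sector share the same top `roa`, the two versions may pick different rows, so a mismatch on real data is worth inspecting rather than treating as an error straight away.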