Screen Link: https://app.dataquest.io/m/308/measures-of-variability/1/the-range
Hello everyone, this isn’t the first time someone here has used a groupby instead of the more loop-oriented solutions DQ proposes.
Doing so, you still need to convert the groupby result into a dictionary with “to_dict()”, but no matter what I pass as a parameter (below I chose “index”), I always end up with a nested dictionary.
To get around that I still need a for loop. Is there a way to get rid of this last step and turn this into a clean one-liner?
Here’s my code:
import pandas as pd
houses = pd.read_table('AmesHousing_1.txt')
def range(arr):
    return max(arr) - min(arr)
range_by_year = houses.groupby(["Yr Sold"]).agg({"SalePrice":range}).to_dict("index")
As a result, it gives me this nested dictionary:
{2006: {'SalePrice': 590000},
2007: {'SalePrice': 715700},
2008: {'SalePrice': 601900},
2009: {'SalePrice': 575100},
2010: {'SalePrice': 598868}}
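The nesting seems to come from the fact that agg with a dict returns a DataFrame, and to_dict on a DataFrame maps two levels (row label and column) for both the "index" and the default orientation. A minimal sketch on made-up toy data (the values and the name value_range are invented for illustration, they are not taken from AmesHousing_1.txt):

```python
import pandas as pd

# Toy data, invented for illustration only
toy = pd.DataFrame({
    "Yr Sold": [2006, 2006, 2007, 2007],
    "SalePrice": [100, 250, 300, 900],
})

def value_range(arr):
    # Same idea as the range function in the post, renamed to avoid
    # shadowing Python's built-in range
    return max(arr) - min(arr)

# agg with a dict returns a DataFrame with one column per dict key
as_frame = toy.groupby("Yr Sold").agg({"SalePrice": value_range})

nested = as_frame.to_dict("index")  # {row label: {column: value}}
by_column = as_frame.to_dict()      # {column: {row label: value}}
```

Either way there are two levels, because a DataFrame has both rows and columns.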
And here’s the loop to fix it:
for k, v in range_by_year.items():
    range_by_year[k] = range_by_year[k]["SalePrice"]
Which gives me the dictionary as requested:
{2006: 590000, 2007: 715700, 2008: 601900, 2009: 575100, 2010: 598868}
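One possible way to skip the loop, as a sketch rather than a definitive answer: select the column before aggregating, so the result is a Series instead of a DataFrame. Series.to_dict() then maps each index label straight to its value. The toy data below is invented for illustration; on the real data the same chain would presumably start from houses instead of toy.

```python
import pandas as pd

# Toy data, invented for illustration only
toy = pd.DataFrame({
    "Yr Sold": [2006, 2006, 2007, 2007],
    "SalePrice": [100, 250, 300, 900],
})

# Selecting ["SalePrice"] first makes the aggregation return a Series,
# so to_dict() yields a flat {year: range} mapping
flat = (
    toy.groupby("Yr Sold")["SalePrice"]
    .agg(lambda prices: prices.max() - prices.min())
    .to_dict()
)

# Alternative: flatten an already nested dict with a comprehension
nested = {2006: {"SalePrice": 150}, 2007: {"SalePrice": 600}}
flat_from_nested = {year: inner["SalePrice"] for year, inner in nested.items()}
```

The lambda also avoids defining a function named range, which would shadow the Python built-in.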
Thanks in advance!