
Alternate solution for finding z-scores by neighborhood

Screen link: Learn data science with Python and R projects

I thought I would try my own way to solve this screen, using pandas groupby and dictionary comprehensions. Thanks to Stack Overflow for the code to “rename” dictionary keys to their more verbose representations. I’m finding more and more uses for pop() all the time!

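import numpy as np

def z_score(value, array, bessel=0):
    # bessel is passed to numpy's ddof:
    # 0 = population standard deviation, 1 = sample standard deviation
    mean = sum(array) / len(array)
    st_dev = np.std(array, ddof=bessel)

    # Signed distance from the mean, in units of standard deviation
    distance = value - mean
    z = distance / st_dev

    return z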
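
To sanity-check the z-score arithmetic without the housing data, here is a minimal sketch that uses only the standard library. The function name `z_score_stdlib` and the `toy_prices` list are made up for illustration; they are not part of the lesson. It assumes `pstdev` plays the role of `ddof = 0` and `stdev` the role of `ddof = 1`.

```python
from statistics import mean, pstdev, stdev

def z_score_stdlib(value, values, bessel=0):
    # pstdev divides by n (like ddof=0); stdev divides by n - 1 (like ddof=1)
    spread = stdev(values) if bessel else pstdev(values)
    return (value - mean(values)) / spread

# Made-up prices, chosen so the mean is exactly 200000
toy_prices = [100000, 150000, 200000, 250000, 300000]

z_pop = z_score_stdlib(300000, toy_prices)               # population st. dev.
z_sample = z_score_stdlib(300000, toy_prices, bessel=1)  # sample st. dev.
z_at_mean = z_score_stdlib(200000, toy_prices)           # value equal to the mean
```

Bessel's correction makes the standard deviation larger, so `z_sample` ends up smaller in magnitude than `z_pop`, and a value equal to the mean gives a z-score of zero either way.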

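# houses is the DataFrame from the lesson; iterating a groupby yields
# (neighborhood, SalePrice Series) pairs, which dict() turns into a lookup
all_prices = dict(list(houses.groupby('Neighborhood')['SalePrice']))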
hoods = ['NAmes', 'CollgCr', 'OldTown', 'Edwards', 'Somerst']
neighborhood_prices = {hood: all_prices[hood] for hood in hoods}
neighborhood_z_scores = {hood: z_score(200000, neighborhood_prices[hood]) 
                         for hood in neighborhood_prices}

# "Rename" dictionary keys to match the lesson
new_hoods = ['North Ames', 
             'College Creek', 
             'Old Town', 
             'Edwards', 
             'Somerset'
            ]
for old, new in zip(hoods, new_hoods):
    neighborhood_z_scores[new] = neighborhood_z_scores.pop(old)
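
For anyone curious how the pop() trick behaves, here is a small sketch on a toy dictionary, next to one possible alternative that builds a new dictionary from a name mapping. The z-score values are made up, and the names `verbose`, `in_place` and `rebuilt` are only for illustration.

```python
# Made-up z-scores keyed by the short neighborhood codes
z_scores = {'NAmes': 1.72, 'CollgCr': -0.03, 'OldTown': 1.87}
verbose = {'NAmes': 'North Ames',
           'CollgCr': 'College Creek',
           'OldTown': 'Old Town'}

# Option 1: "rename" in place. pop() removes the old key and returns its
# value, which is then stored under the new key.
in_place = dict(z_scores)  # copy so the original stays untouched
for old, new in verbose.items():
    in_place[new] = in_place.pop(old)

# Option 2: build a fresh dictionary with a comprehension
rebuilt = {verbose[short]: z for short, z in z_scores.items()}
```

Both options end up with the same keys and values. The in-place version also works when the old and new names are identical (as with 'Edwards'), because pop() runs before the assignment.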

# Assign key with lowest absolute value of z-score
best_investment = min(neighborhood_z_scores, 
                      key=lambda x: np.abs(neighborhood_z_scores[x]))
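
As a quick illustration of why the absolute value matters in that last step, here is a sketch with made-up z-scores (not the lesson's results). Without abs(), min() would pick the most negative z-score rather than the one closest to zero.

```python
# Made-up z-scores for illustration only
toy_z_scores = {'North Ames': 1.72,
                'College Creek': -0.03,
                'Old Town': 1.87,
                'Edwards': -1.50}

# Closest to the neighborhood average: smallest |z|
closest = min(toy_z_scores, key=lambda hood: abs(toy_z_scores[hood]))

# Without abs(): the most negative z-score wins instead
lowest = min(toy_z_scores, key=toy_z_scores.get)

# Full ranking from closest to farthest from the mean
ranked = sorted(toy_z_scores, key=lambda hood: abs(toy_z_scores[hood]))
```

Here `closest` is 'College Creek' while `lowest` is 'Edwards', so dropping abs() would change the answer for this toy data.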