# Explanation required

Hi,

I’ve been really enjoying doing the practice problems, but I’ve come across one that I haven’t fully been able to understand.

This is the question:

1. Create a function, `avg_group`, with the following features:
2. The first argument is a dictionary that follows the schema `{column_name: {index: value}}`.
3. The second argument is one of the string keys in the first argument.
4. Returns a new dictionary where:
• The keys are the values in the entry of the first argument at the second argument.
• The values are lists containing the averages for `total_bill`, `tip` and `size`, in this order.

``````
def get_avgs(d, value, indices):  # what is the purpose of the indices parameter
    sums = [0 for _ in range(3)]
    for idx in indices:
        sums[0] += d["total_bill"][idx]
        sums[1] += d["tip"][idx]
        sums[2] += d["size"][idx]
    return list(map(lambda x: x / len(indices), sums))


def avg_group(d, col):
    data = d[col]                        # | What is the purpose of these 3 lines of code
    groups = list(set(d[col].values()))  # |
    groups.sort()                        # |

    group_by = {}

    for key in groups:
        indices = [k for k, v in d[col].items() if v == key]  # what is the purpose of this
        group_by[key] = get_avgs(d, key, indices)

    return group_by
``````

For a dictionary, `{column_name: {index: value}}`, how would you access `value`?

Given the following -

Think about what’s happening above given the question I asked you above.
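As a hint for that question, here is a minimal sketch using a made-up two-column dictionary (the sample rows and values are assumptions for illustration, not the exercise's real data). A nested value is reached with two lookups, `d[column_name][index]`, and the list comprehension collects every index whose value in `col` matches a given group:

```python
# Hypothetical sample data following the {column_name: {index: value}} schema.
d = {
    "sex": {0: "Female", 1: "Male", 2: "Male"},
    "tip": {0: 1.0, 1: 2.0, 2: 4.0},
}

# Two lookups: first the column, then the row index.
first_tip = d["tip"][0]

col, key = "sex", "Male"
# Row indices whose value in the chosen column equals the group key.
indices = [k for k, v in d[col].items() if v == key]

# Those indices can then be used to pull the matching rows from other columns.
male_tips = [d["tip"][idx] for idx in indices]

print(first_tip)  # 1.0
print(indices)    # [1, 2]
print(male_tips)  # [2.0, 4.0]
```

So `indices` is what ties a group such as `'Male'` back to the rows in `total_bill`, `tip` and `size` that belong to it.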

The author stored `d[col]` in a variable `data` but never used `data` anywhere, and kept using `d[col]` directly. You can ignore this line.

``````
groups = list(set(d[col].values()))
``````

This is a concise way to extract the unique values from the nested dictionary. For example,

``````
d = {
    'sex': {69: 'Male', 103: 'Female', 84: 'Male', 207: 'Male', 0: 'Female'}
}
``````

`d[col]` where `col` is `sex` would return -

``````
{69: 'Male', 103: 'Female', 84: 'Male', 207: 'Male', 0: 'Female'}
``````

Then, since we want -

returns a new dictionary where the keys are the values in the given column name

we need to extract the `values` from `d[col]` so that they can be used as `keys`. That’s what the following does -

``````
d[col].values()
``````

The above returns -

`dict_values(['Male', 'Female', 'Male', 'Male', 'Female'])`

Now, how do you get the unique values from above (since you need them as keys)? That’s where `set` comes in. A `set` doesn’t store duplicate values, so adding items to a `set` or converting any container to a `set` removes the duplicates.

So, `set(d[col].values())` returns -

`{'Female', 'Male'}`

And then you convert the above `set` into a `list`, `list(set(d[col].values()))` -

`['Female', 'Male']`
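Putting these steps together on the same sample dictionary, as a small sketch. Because a set has no guaranteed order, the order of the resulting list can vary, so the checks below avoid depending on it:

```python
d = {
    "sex": {69: "Male", 103: "Female", 84: "Male", 207: "Male", 0: "Female"}
}
col = "sex"

values = d[col].values()  # every value, duplicates included
unique = set(values)      # duplicates removed; order is not guaranteed
groups = list(unique)     # a list, so it can be sorted afterwards

print(list(values))                  # ['Male', 'Female', 'Male', 'Male', 'Female']
print(unique == {"Female", "Male"})  # True
print(len(groups))                   # 2
```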

`.sort()` sorts the list in place in ascending order, which for strings means lexicographic (alphabetical) order. Based on the instructions, I don’t actually see a particular need for sorting them.
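One thing sorting does give you is a predictable key order in the result, since a list built from a set can come out in any order. A short sketch (the `groups` list here is typed by hand for illustration):

```python
groups = ["Male", "Female"]  # e.g. one possible order from list(set(...))

result = groups.sort()       # sorts the list in place and returns None

print(groups)  # ['Female', 'Male']
print(result)  # None

# sorted() is the alternative that returns a new list instead.
print(sorted({"Male", "Female"}))  # ['Female', 'Male']
```

Since dictionaries keep insertion order, looping over the sorted `groups` means `group_by` is always built with its keys in the same order.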
