Help me write a function to get top 'model' per 'brand'

I am trying to get the top 3 most common models for each of the most common brands as asked in Step 9 of the Mission.

I’ve written code that creates a sorted frequency table dictionary for any given brand:

import operator
vw = autos['brand'] == 'volkswagen'
vw_models = autos[vw]['model']

model_freq_vw = {}

for m in vw_models:
    if m not in model_freq_vw:
        model_freq_vw[m] = 1
    else:
        model_freq_vw[m] += 1
sorted_vw = sorted(model_freq_vw.items(), key=operator.itemgetter(1), reverse=True)
sorted_vw[:3]

The output I got was:
[('golf', 3684), ('polo', 1592), ('passat', 1345)]

This seems to be working and I could easily repeat this and change the brand over and over. I’d like to take it to the next level and write a function and just pass each brand, and have the function give me the top 3 models for that brand (as above). I tried to write a function but am getting very long error messages that are beyond my experience at this point.

Here is my function attempt:

def brand_model_freq(b):
    brand_filter = autos[b] == b
    brand_models = autos[brand_filter]['model']
    
    model_freq = {}
    
    for m in brand_models:
        if m not in model_freq:
            model_freq[m] = 1
        else:
            model_freq[m] += 1
    
    sorted_brand = sorted(model_freq.items(), key=operator.itemgetter(1), reverse=True)
    
    return sorted_brand[:3]

How can I fix my function to achieve this?

I know this might be beyond what the mission is asking at this point but I thought I’d ask since this seems like it would be super useful in practice.

Thank you very much for your time!



The issue is most likely this line in your brand_model_freq() function:

brand_filter = autos[b] == b

b is supposed to be the actual brand value, for example 'volkswagen' as in your earlier code.

So autos[b] evaluates to autos['volkswagen'], which raises a KeyError: you are trying to select a column, and 'volkswagen' is a value in the brand column, not a column name.

You most likely wanted autos['brand'] == b instead.
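As a minimal sketch of that fix: the function below is the one from the question with only the filter line changed. The small autos DataFrame here is a made-up stand-in for the real dataset, included only so the example runs on its own.

```python
import operator

import pandas as pd

# Hypothetical stand-in for the real autos DataFrame (illustration only).
autos = pd.DataFrame({
    "brand": ["volkswagen"] * 10 + ["bmw"] * 3,
    "model": ["golf"] * 4 + ["polo"] * 3 + ["passat"] * 2 + ["touran"]
             + ["3er"] * 2 + ["5er"],
})


def brand_model_freq(b):
    # Select the 'brand' column and compare it to the brand value b.
    # (The original autos[b] looked for a column literally named b.)
    brand_filter = autos["brand"] == b
    brand_models = autos[brand_filter]["model"]

    # Count how often each model appears for this brand.
    model_freq = {}
    for m in brand_models:
        if m not in model_freq:
            model_freq[m] = 1
        else:
            model_freq[m] += 1

    # Sort by count, highest first, and keep the top 3.
    sorted_brand = sorted(model_freq.items(), key=operator.itemgetter(1), reverse=True)
    return sorted_brand[:3]


print(brand_model_freq("volkswagen"))
# [('golf', 4), ('polo', 3), ('passat', 2)]
```

You would then call it once per brand, e.g. brand_model_freq('bmw').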

In the Data Cleaning and Analysis Course you will learn some more advanced approaches that help with what you are trying to do. I recommend coming back to this problem once you have gone through that course and re-implementing the function. You will learn a lot from it.
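For a rough idea of what a more compact version could look like, here is a sketch using pandas' Series.value_counts(). This is my assumption about the kind of approach meant, not necessarily what the course teaches; the top_models name and the sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the real autos DataFrame (illustration only).
autos = pd.DataFrame({
    "brand": ["volkswagen"] * 10 + ["bmw"] * 3 + ["opel"],
    "model": ["golf"] * 4 + ["polo"] * 3 + ["passat"] * 2 + ["touran"]
             + ["3er"] * 2 + ["5er"] + ["corsa"],
})


def top_models(df, brand, n=3):
    # Keep only the 'model' values for rows of the given brand,
    # then count them; value_counts() sorts by count, highest first.
    models = df.loc[df["brand"] == brand, "model"]
    return models.value_counts().head(n)


# The most common brands can be found the same way.
top_brands = autos["brand"].value_counts().head(2).index

for brand in top_brands:
    print(brand, top_models(autos, brand).to_dict())
```

Passing the DataFrame in as an argument, rather than reading a global autos, makes the function easier to reuse on a filtered or cleaned copy of the data.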

Note
For future reference, when asking about your own code, make sure you include all relevant information. In this case, it would have helped to include:

  • The actual function call (like brand_model_freq('volkswagen')), to show how you are trying to run your code.
  • The entire error message. Without it, most people will find it difficult to help you out.

Thank you that worked! I’ll definitely write future questions to the community better. I’ll upload my project soon. Thank you very much!


Here is a link to my finished project that includes the above function if anyone would like to see it:

Guided Project: Ebay Car Sales Data Analysis
