Hi @tusharsingh00. The first code block won’t automatically print the values from the loop as it goes. To see the results, you need to tell Python to print them, as in print(autos[cols].value_counts(normalize=True, dropna=False).describe()).
I’m not sure why you received an error from the line autos[['date_crawled','ad_created','last_seen']][0:5]; it worked just fine for me. What kind of error did you receive?
If you’re not able to figure it out, please share your .ipynb notebook file so members of the community can have a look and help you troubleshoot.
I just had a look at the .ipynb file you attached. The line autos[['date_crawled','ad_created','last_seen']][0:5] isn’t running because those columns don’t exist yet. On the 2nd screen there are instructions to rename some columns and convert the rest from camelcase to snakecase. For example, you still have dateCrawled as a column name (camelcase), so it hasn’t been changed to snakecase (date_crawled).
Hello! Is it possible for us to download the autos.csv file somewhere? The only file I found was the one you provided on Kaggle, but that one is the complete dataset; I would like the one used in the project.
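A quick way to confirm this is to look at the actual column names before indexing. The snippet below is only a sketch: it uses a tiny made-up DataFrame in place of autos, and the mapping covers just these three columns (the dateCreated and lastSeen source names are my assumption, not something taken from your notebook).

```python
import pandas as pd

# Tiny stand-in for the autos DataFrame (values are made up for illustration).
autos = pd.DataFrame({
    "dateCrawled": ["2016-03-26 17:47:46"],
    "dateCreated": ["2016-03-26 00:00:00"],
    "lastSeen": ["2016-04-06 06:45:54"],
})

# Still camelcase at this point, so 'date_crawled' is not a valid key yet.
print(autos.columns.tolist())

# Assumed mapping for these three columns only (not the project's full list).
new_names = {
    "dateCrawled": "date_crawled",
    "dateCreated": "ad_created",
    "lastSeen": "last_seen",
}
autos = autos.rename(columns=new_names)

# Now the double-bracket selection finds the columns.
subset = autos[["date_crawled", "ad_created", "last_seen"]][0:5]
print(subset)
```

If autos.columns still shows camelcase names after your rename step, the rename result probably wasn’t assigned back to autos.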
In the DQ interface for Jupyter notebook, there’s a download link at the top. Clicking this will allow you to download a .tar file, which contains the dataset(s) used for the project along with your project notebook file.
Thank you, it worked. I don’t have any coding background, which is why it’s always a panic situation for me. I always forget a few things while coding, and sometimes it’s really hard to understand the terms used in the question area. Thank you, April.g, for guiding me.
The error (KeyError: ('date_crawled', 'ad_created', 'last_seen')) results from passing multiple column names inside single brackets. You’d need to use double brackets (autos[['date_crawled', 'ad_created', 'last_seen']]) to take care of that error.
If we fix that, we get another error: AttributeError: 'DataFrame' object has no attribute 'value_counts'. The reason is that .value_counts() is only available on a Series object here (DataFrame.value_counts was only added in pandas 1.1.0). When you select multiple columns, you get a DataFrame object, so value_counts fails.
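To make the difference between the two kinds of brackets concrete, here is a small sketch on made-up data (the values are invented; only the column names come from this thread). It avoids calling value_counts on the DataFrame, since whether that works depends on the pandas version.

```python
import pandas as pd

# Made-up stand-in for autos with the three date columns.
autos = pd.DataFrame({
    "date_crawled": ["2016-03-26", "2016-03-26", "2016-04-01"],
    "ad_created": ["2016-03-26", "2016-03-25", "2016-04-01"],
    "last_seen": ["2016-04-06", "2016-04-06", "2016-04-07"],
})

# Single brackets with several names: pandas looks for ONE column whose
# label is the tuple ('date_crawled', 'ad_created', 'last_seen').
try:
    autos["date_crawled", "ad_created", "last_seen"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Double brackets pass a list of labels and return a DataFrame.
multi = autos[["date_crawled", "ad_created", "last_seen"]]

# Single brackets with one label return a Series.
single = autos["date_crawled"]

print(raised_key_error)
print(type(multi).__name__)
print(type(single).__name__)
```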
You can only use value_counts on a Series (one column), and autos[cols] is a DataFrame with 3 columns. That’s why you’re getting that error. You can do one column at a time, or write a loop that passes in one column at a time, like this:
for col in cols:
    print(autos[col].value_counts(normalize=True, dropna=False).describe(include='all'))
Not sure what you need it for (results didn’t seem that interesting), but I hope that helps.
Hmm, I think I need this result for the following instruction: "Use the workflow we just described to calculate the distribution of values in the date_crawled, ad_created, and last_seen columns (all string columns) as percentages."
Is there a method we can call to add spaces between words? Are the only options to change the column names manually or to define a function to do so? I got the error 'list' object has no attribute 'replace' when I tried the following:
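For that instruction, something along these lines may be closer to what is being asked than describe(). This is a sketch on made-up data; slicing with .str[:10] to keep only the date part and ordering with sort_index() are my assumptions about the workflow, not something quoted in this thread.

```python
import pandas as pd

# Made-up stand-in for autos; the real columns hold timestamp strings like these.
autos = pd.DataFrame({
    "date_crawled": ["2016-03-26 17:47:46", "2016-03-26 18:57:24",
                     "2016-04-01 14:38:50", "2016-04-01 09:00:00"],
    "ad_created": ["2016-03-26 00:00:00", "2016-03-25 00:00:00",
                   "2016-04-01 00:00:00", "2016-04-01 00:00:00"],
    "last_seen": ["2016-04-06 06:45:54", "2016-04-06 20:15:37",
                  "2016-04-07 01:00:00", "2016-04-07 03:16:17"],
})

cols = ["date_crawled", "ad_created", "last_seen"]
distributions = {}

for col in cols:
    # Keep the first 10 characters (the date), then count each date as a
    # share of all rows and sort by date rather than by frequency.
    dist = (autos[col]
            .str[:10]
            .value_counts(normalize=True, dropna=False)
            .sort_index())
    distributions[col] = dist
    print(col)
    print(dist)
```

Each printed Series holds one row per date, and the shares for a column add up to 1.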
col = col.split()
col = col.replace(" ", "_")
col = col.lower()
Hey @vroomvroom. When you run the loop over autos_c.columns, c for each iteration is a string. When split() is applied, it creates a list from the string and splits (by default) at a space. So if your string was 'python is awesome', you would end up with ['python', 'is', 'awesome']. Camelcase doesn’t have the spaces, so all you’re doing with this bit of code is turning each string into a list with one element; for example, 'dateCrawled' becomes ['dateCrawled']. Then when the function tries to apply replace(), it can’t because c is no longer a string but a list. That’s why you’re getting the error there.
I found this article on GeeksforGeeks that you might be interested in. There are a few different options for creating functions for this purpose. I forget exactly what you would know so far in the course, but the first one at least should be pretty easy to follow. Hopefully that helps you achieve the results you’re looking for!
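To make the split() behaviour concrete, and as one possible version of such a function, here is a small sketch. The camel_to_snake name and the regular expression are my own illustration, not necessarily what the article or the course uses.

```python
import re

name = "dateCrawled"

# split() with no arguments splits on whitespace. Camelcase has none,
# so the result is a one-element list, and lists have no replace() method.
parts = name.split()
print(parts)


def camel_to_snake(text):
    """Insert an underscore before each capital letter (except at the start), then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", text).lower()


print(camel_to_snake("dateCrawled"))
print(camel_to_snake("yearOfRegistration"))
```

You could then apply it to every name with something like autos.columns = [camel_to_snake(c) for c in autos.columns]. Note that this simple pattern puts an underscore before every capital, so a name with consecutive capitals would be split letter by letter, and columns that need a different name altogether still have to be renamed by hand.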