Attribute error in Guided Project: Popular Data Science Questions

I copied and pasted these lines

associations.fillna(0, inplace=True)

for tags in questions["Tags"]:
    associations.loc[tags, tags] += 1

from here

but I get an error. Everything worked fine up to this point.

Hey @jamesberentsen

Please follow the post "Introducing guidelines for all technical questions in our Community" and re-post your question.

You have not given any context about your own work, such as:

  • the structure of associations: is it a pandas.Series, a pandas.DataFrame, or something else? Does it match between the solution and your work?
  • why in the first place you had to resort to copy-paste
  • what exactly do you think the for loop is doing in the solution
  • why do you think it should be applied in a similar manner to your own code work and that it should work correctly
    etc. etc.

I have so many questions about your one question.
The community can't really help with your queries/issues unless you first provide context and essential info.

associations is a dataframe. Please refer to the link; it has everything in there,
but I can paste an image here.

The code is exactly the same as in the link I provided.
Why would you need more context?

You can see everything that came before it there.

The for loop does what is explained in the notes in the link provided:

We will now fill this dataframe with zeroes and then, for each list of tags in questions["Tags"], we will increment the intervening tags by one. The end result will be a dataframe that, for each pair of tags, tells us how many times they were used together.
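As a minimal sketch of what that loop does, here is the same pattern on a tiny made-up dataset. It assumes every entry of questions["Tags"] is a Python list of tag strings; the three toy rows and the way all_tags is built are illustrative assumptions, not taken from the project data or the solution notebook.

```python
import pandas as pd

# Toy stand-in for the project data (hypothetical values):
# each entry of "Tags" is a list of tag strings.
questions = pd.DataFrame({
    "Tags": [
        ["python", "pandas"],
        ["python", "machine-learning"],
        ["pandas"],
    ]
})

# Every distinct tag becomes both a row label and a column label.
all_tags = sorted({tag for tags in questions["Tags"] for tag in tags})

# Square dataframe of zeros, one cell per pair of tags.
associations = pd.DataFrame(0, index=all_tags, columns=all_tags)

# For each question, add one to every cell whose row and column
# are both tags of that question.
for tags in questions["Tags"]:
    associations.loc[tags, tags] += 1

print(associations)
```

The diagonal ends up holding how many questions use each tag, and each off-diagonal cell holds how many questions use both tags of the pair.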

Sorry, I am not sure what you mean here:

why do you think it should be applied in a similar manner to your own code work and that it should work correctly

I have read the community guidelines for asking questions and I am not sure what is wrong with the info I provided. Could you please explain more?

It was tagged and titled correctly as far as I can tell, and all the context is provided in the link.

why in the first place you had to resort to copy-paste

Because I was unsure how to code this.

Running the solution code itself doesn’t produce the same error. So that can mostly only mean one of a couple of things:

  1. You have something differently defined in your code related to either associations or related to questions.
  2. Some issue with the library version you are working with.

As per your error, you seem to be working with Python 3.7. I tried the solution with Python 3.7 and Pandas version 1.0.5 and it didn’t throw an error.

Point 1 is likely the cause. You can cross-check your code in relation to the solution to see where you might have made a mistake. This could be a simple oversight somewhere in your implementation.

If you copied all of the code from the solution and didn’t make any changes, then I would recommend making sure your libraries are installed properly, or trying a different Pandas version.
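To compare your environment with the one mentioned above (Python 3.7, Pandas 1.0.5), you can print the versions from inside the notebook. This is only a small sketch; the variable names are made up for illustration.

```python
import sys

import pandas as pd

# Version of the Python interpreter running this notebook, e.g. "3.7.x".
python_version = ".".join(str(part) for part in sys.version_info[:3])

# Version string of the installed pandas package.
pandas_version = pd.__version__

print("Python:", python_version)
print("pandas:", pandas_version)
```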

Also, while this has been brought up before, if I am not mistaken, some of your questions don’t contain the Mission/Mission Step Link or they don’t have the appropriate tags corresponding to the Mission and Mission Step. I would recommend making sure you add that information to your posts. It helps the community as well.


Hi,

Yes, sorry about that, I did forget the link. It is here:
https://app.dataquest.io/m/469/guided-project%3A-popular-data-science-questions/8/relations-between-tags

I have gone through the code but do not see any oversight I have made.

I have uploaded the file to GitHub.

You seem to be working with the dataset QueryResults.csv, which is different from the one Dataquest uses, 2019_questions.csv.

I can’t run your code since your dataset wasn’t included in your GitHub repo. But if I change the dataset from the one you use to the one Dataquest uses, then your code runs without error.

So, this is likely related to some values in the dataset that you have. It is possible that your dataset needs a bit more cleaning. You will have to explore your dataset for this and see what might be the cause for this problem.
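One way to start exploring is to check what kind of value each row of the Tags column actually holds. The sketch below uses made-up rows, since I don't have your file; the missing value and the raw "<python><pandas>" string are only assumptions about what uncleaned data could look like, not something I have seen in QueryResults.csv.

```python
import pandas as pd

# Hypothetical stand-in for a Tags column that still needs cleaning.
questions = pd.DataFrame({
    "Tags": [
        ["python", "pandas"],   # already a list of tags
        float("nan"),           # missing value
        "<python><pandas>",     # raw string that was never split
        ["r"],
    ]
})

# Count how many rows hold each Python type.
type_counts = questions["Tags"].map(lambda value: type(value).__name__).value_counts()
print(type_counts)

# Keep only the rows whose Tags value is not a list, for a closer look.
is_list = questions["Tags"].map(lambda value: isinstance(value, list))
bad_rows = questions[~is_list]
print(bad_rows)
```

If every row turns out to be a list, the next thing to check would be whether all the tags in those lists are present in the index and columns of associations.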

I don’t wish to discourage you from asking questions, nor do I want you to stop working on projects the way you currently are. It’s absolutely great that you are going for datasets different than the ones Dataquest provides (which are mostly cleaner). I want you to continue with this approach. But, as @Rucha mentioned above, additional context specified in the initial post itself can be helpful and makes it easier to provide responses. In the future, please try to point out where and how you deviate from the Dataquest instructions/source material so that others can help out appropriately.

Good luck with this!


Hi @jamesberentsen

It’s not just about a missing link. You started your post like this:

To most readers, this would seem like you only copied this part. How are we supposed to know you have downloaded the entire solution and are trying to dismantle it?

While going through the code, did you skip the top part and only focus on this block of code? Then how can this double-check be called thorough? And as @the_doctor has highlighted, the likely error causing factor here seems to be the dataset.
How can you avoid putting that information here?

Not just for reference, but please go through this entire post once, along with the chained one: Double `True` counts and Pivot Table for Guided Project: Clean and Analyze Employee Exit Surveys.

Even if you exclude Bruno and me, all the participants must have experimented so much to reach a certain level of understanding of the issue, before coming here to raise a topic.

As @the_doctor also mentioned, this is not to discourage you from raising a question. But wouldn’t it be better if you also told us how and what you started with, what your expected output is, why you think it deviates, how much coding and re-coding you have done, and what experiments/tests you subjected your code to?

You copy-pasted Bruno’s solution, which has no error. Why should I bother myself with that solution? If you are getting an error, you could have uploaded your notebook, mentioning that a different dataset is being used. Perhaps then any one of us would have tested the solution out and tried to work along with you on this.
This working together is based on the assumption that you already tried running your code with the same dataset and the code works fine. If this is not true, then please first work on some code debugging at your end and scale.

In the real world, things aren’t so easy! You will make something good and people will try to hack some of your credits… one missing colon in the syntax and the whole department will raise questions about your skills and credibility. Before you claim “I have done all possible testing”, be prepared for even those experiments you couldn’t think of.

It’s not just about a missing link. You started your post like this:

To most readers, this would seem like you only copied this part. How are we supposed to know you have downloaded the entire solution and are trying to dismantle it?

How did you reach that conclusion?
I am not sure if my writing is so bad that I misguided you to think that I downloaded the workbook (if so please point out where), but I did not download it – I copied and pasted each cell.

While going through the code, did you skip the top part and only focus on this block of code?
Then how can this double-check be called thorough?

You posed a hypothetical question about what you believe to be true and answered it with an unwarranted assumption. Could you be more specific about what the 'top part' is?
I do not see what part I skipped and whether you actually saw a part I skipped or are only making another assumption. The latter seems to be the case.

And as @the_doctor has highlighted, the likely error causing factor here seems to be the dataset.
How can you avoid putting that information here?

The tone of that question is accusatory.

Hi
Yes, that was the problem, thanks.

I want you to continue with this approach. But, as @Rucha mentioned above, additional context specified in the initial post itself can be helpful and makes it easier to provide responses. In the future, please try to point out where and how you deviate from the Dataquest instructions/source material so that others can help out appropriately.

Yes, I shall do that in future. I was under the false impression that, since I had run the exact same SQL queries, the files read in would be the same.