Length error with DQ exercise

Hey folks! I’m having a problem getting through this part of the Calculating Artists ages mission in the Python Data Analysis course:

https://app.dataquest.io/m/331/python-data-analysis-basics/2/calculating-artist-ages

This is my code:

ages = []

for row in moma:
    date = row[6]
    birth = row[3]
    if type(birth) == int:
        age = date - birth
    else:
        age = 0
    ages.append(age)
    
final_ages = []

for age in ages:
    if age > 20:
        final_age = age
    else:
        final_age = "Unknown"
    final_ages.append(final_age)
   

print(len(ages))
print(len(final_ages))
print(len(moma))

The code runs fine and I get the same lengths for all of them, but when I try to submit the answer I get this error from DataQuest:

  • ages is shorter than we expected.
  • final_ages is shorter than we expected.

I can’t seem to pin down what I’m doing wrong, or whether I’m doing anything wrong at all. My code matches the answer as well. Any guesses?

Hi @kaitlyn.celine, welcome to the community!

There’s something weird with this screen. Your code looks good. I reran my own code that previously passed, and it comes up with the same error now too. :sweat:

I’ll tag @Sahil so the issue can be logged. For now, you can go ahead and copy and paste the provided solution, and that should allow you to bypass the screen.

Thanks! It worked for that page, but it’s looking like the same issue reappears in the following modules in the mission. For example, the following “Decades” list also shows up as too short. The first time I get a different error is when I reach the frequency table module:

decade_frequency dict (<class 'dict'>)
- actual + expected

  {'100s': 3,
   '110s': 3,
   '20s': 1886,
   '30s': 4722,
   '40s': 4081,
   '50s': 2434,
   '60s': 1357,
   '70s': 559,
   '80s': 364,
   '90s': 253,
-  'Unknown': 1063}
+  'Unknown': 1067}

I get this error even when I copy and paste the solution directly.
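
For anyone comparing the two tables by hand, here is a minimal sketch that reports only the keys whose counts differ. The helper name `diff_counts` is made up for illustration; the numbers are the ones from the checker output above, where the "-" line is the actual value and the "+" line is the expected one.

```python
def diff_counts(actual, expected):
    """Return {key: (actual_value, expected_value)} for every key whose values differ."""
    differences = {}
    for key in sorted(set(actual) | set(expected)):
        if actual.get(key) != expected.get(key):
            differences[key] = (actual.get(key), expected.get(key))
    return differences


# Counts copied from the checker output quoted in this post.
actual = {
    "100s": 3, "110s": 3, "20s": 1886, "30s": 4722, "40s": 4081,
    "50s": 2434, "60s": 1357, "70s": 559, "80s": 364, "90s": 253,
    "Unknown": 1063,
}
# The expected table is identical except for the "Unknown" count.
expected = dict(actual, Unknown=1067)

print(diff_counts(actual, expected))  # {'Unknown': (1063, 1067)}
```

Only the "Unknown" bucket differs, by four rows, which fits the two sides having been computed from slightly different versions of the dataset.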

Hey, Kaitlyn.

There’s something going on in the backend. I’ve informed the engineering team of this bug. The dataset was modified recently, and my suspicion is that answer checking is still looking at the old version of the dataset.

Can you please go back to the second screen and run the following code? It’s downloading the old version of the dataset so that it matches the answer checking version, and then running the solution code.

I’m unable to test it thoroughly for reasons that would bore you, so please let me know if this fixes it.

url = "https://dq-content.s3.amazonaws.com/331/artworks_clean.csv"

r = __import__("requests").get(url)

with open("artworks_clean.csv", "w") as f:
    f.write(r.content.decode("UTF-8"))

ages = []
for row in moma:
    birth = row[3]
    date = row[6]
    if type(birth) == int:
        age = date - birth
    else:
        age = 0
    ages.append(age)
final_ages = []
for age in ages:
    if age > 20:
        final_age = age
    else:
        final_age = "Unknown"
    final_ages.append(final_age)
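
To see which version of the dataset is actually on disk, one option is to count the rows in the CSV and compare that number with `len(moma)`. This is only a sketch: `count_data_rows` is a hypothetical helper, it assumes the file has a single header row, and the demo below runs on a tiny made-up file rather than the real dataset.

```python
import csv
import os
import tempfile


def count_data_rows(path, has_header=True):
    """Count the rows in a CSV file, skipping one header row if has_header is True."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    if has_header and rows:
        return len(rows) - 1
    return len(rows)


# Demo on a small made-up file (not the MoMA data).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows([["Title", "Artist"], ["A", "X"], ["B", "Y"]])
    sample_rows = count_data_rows(sample_path)

print(sample_rows)  # 2
```

On the mission screen, the equivalent check would be comparing `count_data_rows("artworks_clean.csv")` with `len(moma)`. A mismatch would suggest the list in memory and the file on disk come from different versions.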

Just ran it and it all works! Thanks for your help.

I should probably also mention that the issue appears with the preceding lesson as well. It’s the same data set so that makes sense.

The same length error appears in Cleaning And Preparing Data In Python: 8. Parsing Numbers from Complex Strings, Part Two.

Same issue for https://app.dataquest.io/m/351/cleaning-and-preparing-data-in-python/4/cleaning-the-nationality-and-gender-columns.

Hey, Phil.

Thanks for letting us know about this. To fix it, please prepend the following code snippet to your solution.

url = "https://dq-content.s3.amazonaws.com/351/artworks.csv"

r = __import__("requests").get(url)

with open("artworks.csv", "w") as f:
    f.write(r.content.decode("UTF-8"))

If it doesn’t work the first time (and the solution code is actually correct), please try modifying the code somewhere without changing what it does. For example, you can delete one of the blank lines or add a comment somewhere.

The reason why I ask this is that I think answer checking will keep the previous version of the file in memory even after downloading the new file, but it only does so for one run, so it should work the second time you run it.

The problem is that the second time you run it, if you don’t change the code at all, it will store the results from the previous run without rerunning the code, because the code is the same.
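
To illustrate why an edit that changes nothing (like adding a comment) forces a rerun, here is a toy model of a runner that caches results by the exact source text. This is purely an assumption for illustration and not how the Dataquest code runner is actually implemented; `run_cached` and `fake_runner` are made-up names.

```python
import hashlib

# Toy model only: results are cached under a hash of the exact source text.
_cache = {}


def run_cached(source, runner):
    """Run `source` through `runner` unless this exact text was already run."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = runner(source)
    return _cache[key]


calls = []


def fake_runner(source):
    """Stand-in for real execution: record the call and return the run number."""
    calls.append(source)
    return len(calls)


first = run_cached("ages = []", fake_runner)
second = run_cached("ages = []", fake_runner)          # identical text: cached result, no rerun
third = run_cached("ages = []  # rerun", fake_runner)  # added comment changes the text: reruns

print(first, second, third)  # 1 1 2
```

In this model the second submission returns the stored result without running anything, while the third one, which differs only by a comment, is executed again.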

I realize this is asking a lot. Apologies for the experience, and I hope this helps.

Thank you very much, Kaitlyn.

Yeah, that’s not surprising. We actually changed the dataset in the previous mission first, as there was something wrong with it.

Then, to keep things consistent, we changed it in the following mission (the one you asked about) as well.

In another reply I added some instructions for people to deal with errors in the previous mission.

Apologies for this hiccup!

thank you @Bruno, it’s working.

Hi guys, I was experiencing a similar issue where my final_ages list was shorter than expected. I found the solution provided by Bruno that pulls in a revised data set, but I now get an error saying the data set is too long. I’ve also reset the code and attempted to use the original solution, but I still receive an error on the length.

Any thoughts on how to get this mission completed?

Is this in a screen after the one where you first ran into the issue? I assume resetting the code runner will take you back to the new dataset, but I don’t know the best way to reset the code runner.

Just logging off, waiting for 20 minutes and coming back should work, so by the time you read this, the issue may have fixed itself.

Sahil, can you help out here? @Sahil

Hey Bruno,

The code is now working. I believe it was simply logging off and waiting a bit. Thanks for the support here!

Hey Bruno,

I am still getting the data set too long error. I have logged off for several hours, but the issue persists. Any thoughts? See below for my code.

url = "https://dq-content.s3.amazonaws.com/351/artworks.csv"

r = __import__("requests").get(url)

with open("artworks.csv", "w") as f:
    f.write(r.content.decode("UTF-8"))
    
def remove_parantheses(dataset,index):
    for row in dataset:
        value = row[index]
        value = value.replace("(","")
        value = value.replace(")","")
        row[index] = value
    return dataset
moma = remove_parantheses(moma,2)
moma = remove_parantheses(moma,5)

Do you get this error both with and without the fix I provided?

Hello Bruno,

First, I ran my code without the fix. I was getting the error about the data set being too long.

Second, I ran my code with the code you provided added. I followed your steps for the fix and was still getting the error about the data set being too long.

Third, I tried all the exercises from the beginning. I was still getting the data set too long error.

Lastly, I ran “ls” in the console. There were artworks.csv and artworks_clean.csv files.
I removed the artworks_clean.csv file using os.remove("artworks_clean.csv").

I ran all the exercises from the beginning and ran the code without your fix. Now it works!

I am not sure whether removing “artworks_clean.csv” resolved the issue or whether something else fixed it.
I don’t get the error anymore.
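
In case it helps anyone else, here is roughly what the cleanup step looks like as code. It is a minimal sketch: `remove_if_present` is a made-up helper name, and it assumes the stale file sits in the current working directory under the name artworks_clean.csv.

```python
import os


def remove_if_present(path):
    """Delete the file at `path` if it exists; return True if something was removed."""
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


# List the CSV files in the working directory (similar to running "ls").
csv_files = sorted(name for name in os.listdir(".") if name.endswith(".csv"))
print(csv_files)

# Remove the possibly stale copy; nothing happens if it is not there.
removed = remove_if_present("artworks_clean.csv")
print(removed)
```

Checking for the file first avoids the FileNotFoundError that a bare os.remove raises when the file is missing.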

Bizarre!