Guided Project: Exploring Hacker News Posts, Step 4: I'm having a problem converting the number of comments from string to integer using int()

Screen Link: https://app.dataquest.io/m/356/guided-project%3A-exploring-hacker-news-posts/4/calculating-the-average-number-of-comments-for-ask-hn-and-show-hn-posts

Your Code:
```
total_ask_comments = 0

for post in ask_posts:
    total_ask_comments += int(post[4])

avg_ask_comments = total_ask_comments / len(ask_posts)
print(avg_ask_comments)
```

What I expected to happen: The average number of comments would be calculated.

What actually happened:
```
ValueError                                Traceback (most recent call last)
in ()
      2
      3 for post in ask_posts:
----> 4     total_ask_comments += int(post[4])
      5
      6 avg_ask_comments = total_ask_comments / len(ask_posts)

ValueError: invalid literal for int() with base 10: 'h'
```

Other details: My understanding is that the post[4] string can’t be converted to an integer because some of the data contains the letter ‘h’. Am I correct? If so, how do I go about fixing this? My code is the same as the one in the solution…

Hey, Thaihoan. Your diagnosis is on point!

To help us see what’s going on here, please share your notebook.

Hi Bruno,

I’m new to Dataquest and the community forum, and I’m barely navigating this space… I included the screen link. Is that what you meant by “sharing the notebook”, or is it something else? How can I do it?

Thanks.

A notebook is the file created by Jupyter (the one with the cells and the output).

If you’re working in the app, you can download it. If you’re working locally, you’ll need to locate it. Notebook file names typically end with .ipynb.


I managed to spot my mistake and fixed it. Thanks @Bruno for the notebook advice! I might ask for your help soon again :slightly_smiling_face:

Hello Nguyen, I am having a similar problem. How did you sort it out?

Find link to my notebook here

http://localhost:8888/notebooks/documents/dataquest/hacker%20news%20guided%20project%20issue.ipynb#

Peter


Hi @jamoko77.poo, here is the solution: this link


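You can avoid the "invalid literal for int() with base 10" error by using Python's str.isdigit() method to check whether the value is a number before converting it. The method returns True if the string is non-empty and all of its characters are digits, otherwise False. Note that a string such as "55.55" is rejected because of the decimal point.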

num = "55.55"
if num.isdigit():
  print(int(num))
else:
  print("String value is not a digit : " , num)
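Applied to the loop from the question, the same check can point at the entries that fail to convert instead of stopping at the first one. This is only a sketch: the `ask_posts` data below is made up for illustration, and it assumes one entry is a bare title string that was appended by mistake.

```python
# Hypothetical data: two proper rows and one title string appended by mistake.
# The column layout (title at index 1, comment count at index 4) follows the
# indices used in this thread; the other columns are assumptions.
ask_posts = [
    ["1", "Ask HN: First question", "", "10", "6", "alice", "8/16/2016 9:55"],
    "ask hn: second question",
    ["3", "Ask HN: Third question", "", "4", "2", "carol", "8/18/2016 11:30"],
]

total_ask_comments = 0
bad_entries = []  # (position, offending value) pairs to inspect afterwards

for i, post in enumerate(ask_posts):
    value = post[4]
    if value.isdigit():
        total_ask_comments += int(value)
    else:
        # Record the problem instead of raising ValueError
        bad_entries.append((i, value))

print(total_ask_comments)
print(bad_entries)
```

Skipping bad values only hides the real problem, so the list of offending entries is there to help trace where they came from, not to replace fixing the list itself.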

The error might have come from the second line of your loop. Compare it with this:

total_ask_comments = 0
for row in ask_posts:
    num_comments = row[4]
    num_comments = int(num_comments)
    total_ask_comments += num_comments
avg_ask_comments = total_ask_comments / len(ask_posts)
print(avg_ask_comments)

Or the error might have occurred when filling the empty lists. From what I have seen, most users who hit this error appended the title instead of the whole row. See below:


# empty list creation
ask_posts = []
show_posts = []
other_posts = []
# looping over hn
for row in hn:
    title = row[1]
    
    if 'ask hn' in title.lower():
        ask_posts.append(row)
    elif 'show hn' in title.lower():
        show_posts.append(row)
    else:
        other_posts.append(row)

# number of posts in each category
print(len(hn))
print(len(ask_posts))
print(len(show_posts))
print(len(other_posts))

Hi, I think it is a logic problem. In the last step, we put only the titles that start with lowercase ‘show hn’ or ‘ask hn’ into the ask_posts and show_posts lists. We did not put the whole row into the list… If you try print(ask_posts), you see titles only… That is why we can’t convert to a number at this step. Is that correct?
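A minimal sketch of that failure mode, using a made-up two-row `hn` list (the rows and their values are hypothetical; only the title at index 1 and the comment count at index 4 come from the code in this thread). If the lowercased title is appended instead of the row, `post[4]` is the fifth character of the title, not the comment count:

```python
# Hypothetical sample rows; assumed layout:
# [id, title, url, num_points, num_comments, author, created_at]
hn = [
    ["1", "Ask HN: How do I learn Python?", "", "10", "6", "alice", "8/16/2016 9:55"],
    ["2", "Show HN: My weekend project", "http://example.com", "25", "3", "bob", "8/17/2016 10:12"],
]

# Buggy version: only the lowercased title string is stored
titles_only = []
for row in hn:
    title = row[1].lower()
    if title.startswith("ask hn"):
        titles_only.append(title)

fifth_char = titles_only[0][4]  # indexing a string returns a single character
print(repr(fifth_char))

try:
    int(fifth_char)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Fixed version: store the whole row, so index 4 is the comment count
ask_posts = []
for row in hn:
    if row[1].lower().startswith("ask hn"):
        ask_posts.append(row)

num_comments = int(ask_posts[0][4])
print(num_comments)
```

A quick way to check which case applies in a notebook is to print `ask_posts[0]`: a list means whole rows were stored, a string means only titles were.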