Working with a bunch of lists and loops in Python

I’m working on a silly side project and have encountered an issue with lists and loops that I haven’t been able to wrap my head around.

I think it’s easier to explain with some pseudocode, so here’s what I’m starting with:

source_list = ['a', 'e', 'f', 'g']
list1 = ['a', 'b', 'c']
list2 = []

counter = 0

What I’m trying to do:

  1. While counter is less than 3, pick a random item from source_list
  2. Check if that item is in either list1 or list2
  3. If it IS in either list, go back to number 1 and start again
  4. If it is NOT in either list already, add it to list2, counter +=1, and then go back to number 1

or to express that in pseudocode:

while counter < 3:
	source_list_choice = random.choice(source_list)
	check and be sure source_list_choice is not already in list1 or list2
		if it IS in either one of those lists:
			start again by picking a new source_list_choice
		if it is NOT in either list:
			add it to list2
			counter +=1
			start again by picking a new source_list_choice
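
For reference, the pseudocode above could be written as runnable Python roughly like this. It is a minimal sketch using the placeholder lists from the top of the post, and it assumes source_list holds at least three items that are not already in list1 (otherwise the loop would never finish).

```python
import random

source_list = ['a', 'e', 'f', 'g']
list1 = ['a', 'b', 'c']   # items that were already used
list2 = []                # items picked during this run

counter = 0

# Only 'e', 'f' and 'g' are eligible here, so exactly three picks are possible.
while counter < 3:
    source_list_choice = random.choice(source_list)
    if source_list_choice in list1 or source_list_choice in list2:
        continue  # already used or already picked: draw again
    list2.append(source_list_choice)
    counter += 1

print(list2)
```

The `in` operator does the "is it in either list" check directly, so no inner for loop is needed.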

Why I want to do this: I’m building a Twitter bot that’ll tweet messages from a source list. I want to choose three items from that list at random, check that they haven’t already been tweeted before (list1), check the same tweet hasn’t been randomly picked twice by checking what’s already on the “today’s tweets” list (list2), and then come out of this bit of code with list2, a list of three tweets that (1) are all unique and (2) haven’t been posted before.

What actual code do I have right now:

while counter < 3:
    print(counter)
    potential_tweet = random.choice(generated['Text'])
    for row in posted_tweets:
        if potential_tweet == row[0]:
            break
        else:
            continue
        todays_tweets.append(potential_tweet)
        print("Added " + potential_tweet)
        counter += 1

What does this code do right now?

It seems to be counting by twos, and adding each randomly-selected tweet to todays_tweets (i.e. list2) twice. So I’m ending up with an output that’s like ['b', 'b', 'c', 'c'], but what I’m looking for is an output like ['b', 'c', 'd'], where b, c, and d are unique, randomly-selected items from generated['Text'] (i.e. source_list) that aren’t in the posted_tweets (i.e. list1) list.

Any ideas? It feels like there’s some simple thing that I’m missing here. I also suspect there’s probably a more efficient way to do this than with for loops, which might become important as the list of already-posted tweets gets longer…
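
On the efficiency point, one possible approach (a sketch, not the only way) is to filter out everything that was already posted first and then draw all three items in one go with random.sample, which never returns the same position twice. The helper name pick_new_items and its parameters are made up for illustration; the lists are the placeholder ones from the pseudocode.

```python
import random

def pick_new_items(source, already_used, k=3):
    """Return up to k distinct items from source that are not in already_used."""
    used = set(already_used)  # set lookups stay fast as the history grows
    # dict.fromkeys drops duplicates in source while keeping the original order
    candidates = [item for item in dict.fromkeys(source) if item not in used]
    # Never ask for more items than are available, so this cannot loop forever
    return random.sample(candidates, min(k, len(candidates)))

source_list = ['a', 'e', 'f', 'g']
list1 = ['a', 'b', 'c']

list2 = pick_new_items(source_list, list1)
print(list2)
```

Because the filtering happens before the draw, there is no retry loop, and a source list with fewer than three unused items simply yields a shorter result.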

It seems to me that instead of
if potential_tweet == row[0]: (this only ever compares against the first element of row, never the second or third)
it would be better to use
if potential_tweet in row: (this compares against every element of row)
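
To illustrate the membership check across all rows, here is a small sketch. The shape of posted_tweets (a list of rows, each a list of fields) and the sample values are assumptions, since the original post does not show the data; already_posted is a hypothetical helper name.

```python
# Hypothetical rows, e.g. as read from a CSV of posted tweets.
# The second column is invented for this example.
posted_tweets = [
    ['hello world', '2020-01-01'],
    ['second tweet', '2020-01-02'],
]

def already_posted(tweet, rows):
    """Return True if tweet equals any element of any row."""
    return any(tweet in row for row in rows)

print(already_posted('second tweet', posted_tweets))
print(already_posted('brand new', posted_tweets))
```

One caveat: if each row is a plain string rather than a list, `in` performs a substring test instead of an element comparison, so the check would behave differently.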