In Operator and Double Loops

Screen Link: Algorithm Complexity > Time Complexity of Algorithms > Space Complexity > Screen 5 “Precomputing to Reduce Time Complexity”

I have a question of about how the in operator works as a True/False check when placed in a loop or nested loop. There’s a lot of text but it’s all simple code and print output to be thorough.

The solution to this screen utilizes a loop and the in operator set a dictionary key:value pair to True or False.

test_values = [1, 2, 5]
target_sums = [2, 3, 7, 8]
# possible_sums = {2, 3, 4, 6, 7, 10}

def find_sums_precompute(values, target_sums):
    possible_sums = set()
    for i in range(len(values)):
        for j in range (i, len(values)):
            possible_sums.add(values[i] + values[j])
    sums = {}
    for target in target_sums:
        sums[target] = target in possible_sums
    return sums, possible_sums

print(find_sums_precompute(test_values, test_targets))

# correct output:
# {2: True, 3: True, 7: True, 8: False}

So I was curious about the previous solution it improved upon.

test_values = [1, 2, 5]
target_sums = [2, 3, 7, 8]
# test_values possible sums = 2, 3, 4, 6, 7, 10

def find_sums(values, target_sums):
    sums = {}
    for target in target_sums:
        sums[target] = False
        for i in range(len(values)):
            for j in range(i, len(values)):
                if values[i] + values[j] == target:
                    sums[target] = True
    return sums

print(find_sums(test_values, test_targets))

# same correct output:
# {2: True, 3: True, 7: True, 8: False}

In particular, why do we set the sums[target] = False then later do this check if values[i] + values[j] == target and switch to True if true. So, I tried to implement it with the in operator like the better solution find_sums_precompute(), which failed.

def find_sums_with_in(values, target_sums):
    sums = {}
    for target in target_sums:
        for i in range(len(values)):
            for j in range(i, len(values)):
                sums[target] = values[i] + values[j] in target_sums
    return sums

print(find_sums_with_in(test_values, test_targets))

# Incorrect output returned
# {2: False, 3: False, 7: False, 8: False}

I printed out target, i's value, and j’s value during the loops. Here is the result of the nested loop iteration through the first target value, 2, which should result in True.

test_values = [1, 2, 5] used for loop 1 and 2
target_sums = [2, 3, 7, 8]
test_values possible sums = 2, 3, 4, 6, 7, 10

target is 2
loop 1 value = 1
loop 2 value = 1
{2: True}
loop 1 value = 1
loop 2 value = 2
{2: True}
loop 1 value = 1
loop 2 value = 5
{2: False}
loop 1 value = 2
loop 2 value = 2
{2: False}
loop 1 value = 2
loop 2 value = 5
{2: True}
loop 1 value = 5
loop 2 value = 5
{2: False}
target is 3

The in operator is working but it’s continuing through the nested loop in its entirety, therefore returning the last check, which happens to be False. It does this for all following target values as well. Why does using in fail to return the correct results in this case but not find_sums_precompute() ?

Out of curiosity I did a print test of the in operator check from find_sums_precompute(). This is the full output, which ends in the correct answer.

# The "in" operator loop from find_sums_precompute()
for target in target_sums:
    sums[target] = target in possible_sums

# target = [2, 3, 7, 8]
# possible_sums = {2, 3, 4, 6, 7, 10}

target value= 2
{2: True}
target value= 3
{2: True, 3: True}
target value= 7
{2: True, 3: True, 7: True}
target value= 8
{2: True, 3: True, 7: True, 8: False}

Here, it appears the in operator ends the loop when it finds a True. If it didn’t, wouldn’t the check on target = 2 return False as it continued through the list and eventually hit possible_sums = 10?

I also did some simple tests with addition, which didn’t seem to break in the same way. Is it something about double loops??

This whole issue has nothing to do with in operator behaving wrongly or what quirks it may have when used with other programming constructs (no it doesn’t have any quirks at all, no gotchas to watch for just because of the existence of other programming constructs).

A membership test operator (in) has no ability to end a loop. If a loop ends prematurely, it’s because you written a break, continue or return statement in the body of the loop, or your loop used some exception handling try-except construct, or you used some flag variable to indicate on what conditions to end loop early. When i say end, i refer to both meanings of 1 iteration of the loop ending early to continue with next iteration of loop, and the entire looping construct ending.

That’s a general problem solving pattern. It’s like in court where the accused is “assumed innocent until proven guilty”. In programming, we set something to be 1 state (doesn’t even have to be False/True like in this example), then flip the state to another when some conditions are met. If conditions are not met, it stays in the 1st state. At the end of data processing (loop in this case) we interpret physical meanings from the 2 states we chose to represent meaning.

In this problem, False means “there are NO values that can be paired to sum to this target value”. False/True in sums dictionary directly answer the task. That’s why we must program the correct logic to make sure after all the work, the correct boolean value ends up for each key in sums.
It’s natural to set it to False until you find a single example that proves it right, which is when you set it to True. It is unnatural to set every target to True and then if can’t find any pairs to sum to target value, flip it to False. The former way allows you to short-circuit (Short Circuiting Techniques in Python - GeeksforGeeks), latter does not. (Only for this problem, because for some other problems, False is the one that causes short-circuit)

The above idea is like proving the statement “There are green swans in the world”. This statement is easier to be proven True than False. Relating to programming, assuming default False, then if anyone sees one green swan, it’s proven True, end of work (short-circuit). Assuming default True, people have to comb the entire earth before being reasonably certain there are no green swans, and be able to flip True to False. Fortunately in the above example “the entire earth” is just a for loop, which doesn’t take forever, but still longer than if you tried to prove True given default False.
My point is to introduce you to this general problem solving pattern, some logic concepts, and short-circuiting as a thing in programming.

find_sums_with_in is wrong because sums[target] = values[i] + values[j] in target_sums is not nested inside a if conditional. This line does not care if the values sum to target or not. If it sums to, then a True is assigned into Dictionary, if it doesn’t then a False is assigned. The final state of sums depends completely on the final pair of values, because every loop overwrites the boolean from evaluating previous pairs of values.

You seem to think your loop continues to the end that’s why False was assigned everywhere, andfind_sums_precompute ends loop early.

Every loop does go to the end.
The difference is find_sums_with_in blindly overwrites the previous loop’s values (I can’t remember any situation where this is a good problem solving method), while find_sums checks if == before flipping the boolean.
We should only overwrite, if some condition is met.(Even the max_value = max(max_value, curr_value) you may find online has an implicit if contained.). If that condition is met, we can end the loop iteration for that target early with continue, since we already know there is a pair, and don’t care what other pairs or how many pairs can sum to that target. (None of the examples provided in this thread did any short-circuiting).

You may also be confused why find_sums_precompute did not have any if conditional wrapping the assignment with membership check. That’s because it’s doing completely different logic from find_sums.
One is finding Apples in Oranges, another is finding Orange in Apples (as a rough analogy).

Besides the issue i mentioned above on final pair of values determine sums state, another wrong thing with find_sums_with_in is checking in target_sums. This is checking more than you should check, leading to higher chance of getting True. Instead of checking if any pair can sum to the current target in the current outermost loop, that is checking if it can some to any target (not just current target).
If target_sums, contained a 10 inside (eg. target_sums = [2, 3, 7, 8, 10], every value in sums will become True (eg {2: True, 3: True, 7: True, 8: True, 10: True}), because every target (key) in sums will be influenced by the pair of (5,5) which sums to 10.
So to summarize the 2 issues of find_sums_with_in

  1. Unconditional assignment leads to sums final state of each key being completely determined by last pair of values (5,5)
  2. sums[target] = values[i] + values[j] in target_sums checks for more than just the current target from
    outermost loop for target in target_sums:, looks ahead/backwards into the full target_sums which it shouldn’t. (example demonstrating how this breaks is given above)

Here’s how you can do it as you get better with python:

from itertools import combinations_with_replacement as cwr

test_values = [1, 2, 5]
target_sums = [2, 3, 7, 8]

true_keys = set(map(sum,cwr(test_values,2))).intersection(target_sums)

sums = dict.fromkeys(target_sums,False)
sums.update(dict.fromkeys(true_keys,True))
print(sums)
1 Like

Thanks so much for your very thorough response. I believe I understood most of what you explained. Short circuiting in particular is very interesting. However, when I look back at the correct membership test used in find_sums_precompute() I get confused again. I’ll write it out but if you don’t have the time to address this further I understand.

My confusion is the same for both issue summaries 1 and 2. Why does my membership test using in rewrite every True/False possibility until to the end of the container but the correct implementation does not? Especially since, as you point out, nothing is being short circuited and every loop does go to the end (assuming you mean the loops I included in code formatting).

Regarding issue 1, unconditional assignment, you say find_sums_with_in() needs its membership check inside an if conditional. But then you say:

I know find_sums_precompute() is checking for a value inside a set while find_sums_with_in() is adding two values and checking inside a list. But beyond that, I can’t see the difference. To my beginner’s eye, both are in for loops, take a value, and check to see if it’s a member of a container.

Regarding issue 2, checking the whole list vs. a single target, I can’t see how my incorrect version differs from the correct. If I pseudocode the correct execution’s implementation I don’t understand why it isn’t unconditional.

# The "in" operator loop from find_sums_precompute()

# test_targets  = [2, 3, 7, 8]
# possible_sums = {2, 3, 4, 6, 7, 10}

for target in target_sums:
    sums[target] = target in possible_sums

target = 2
possible_sums = 2
{2: True} <— why does the first loop iteration hold the correct answer here
target = 2
possible_sums = 3
{3: False} <— why doesn’t the first loop continue, and flip to False here?
loop continues
target = 2
possible_sums = 10
{8: False} <— And why doesn’t it continue to the end, flipping False like mine did?

Because your code is not nested in if, but find_sums assigns based on result of if. If true, overwrite, else do nothing so most recently assigned value remains. find_sums_precompute is not even checking in the same thing so it’s not comparable in this discussion.

Try to step through code with Python Tutor - Visualize Python, Java, JavaScript, C, C++, Ruby code execution. This saves you from writing prints and helps visualize at any moment what variable holds what value in which function stack.

Just checking in a set vs a list is big enough difference for other problems (although it’s not relevant here because the set in find_sums_precompute could have been implemented as a list too for slower O(n) instead of O(1) checking and more space wastage but still give same final result).

Step back to why use a for-loop. It’s because you want to do something with EACH ELEMENT of a container.
In your code, you did for target in target_sums:, so a reader expects you to do something with target, but the body of the loop has nothing to do with target so that’s already a problem, regardless whether or how membership checking occurs.

It is implicitly conditional. You just don’t see if word explicitly.
Imagine there are 4 target holes to throw marbles at (to reference squid game), all at different distances. There are also two 3-sided dice with faces (1,2,5). The sum of dice roll is how how far you throw. Throw either too far or too close and you miss the hole.
The logic of find_sums is given 4 targets, try to check if they can be hit 1 by 1. There are 4 iterations happening in outer loop. Step in 1 iteration and imagine you are aiming for just 1 of the targets.
Now it’s time to try to throw dice. Everytime you throw a pair, sum them and check (if values[i] + values[j] == target:!) whether it will land in the hole. If yes, turn the (assumed during initialization) can’t hit this target into can. No matter this throw can or cannot hit the hole (calculations and checks occur unconditionally, saving values and assigning variables occur conditionally, they do not step on each other’s toes at all), just keep throwing dice until all combinations of pairs are found, then you have completed the process for 1st target. Repeat all of above for next 3 target.

The logic of find_sums_precompute is not to try to throw holes into a given target, but to take a given target and try to match it to a list(in english sense, so set to be precise here) of all possible combinations of dice. The implicit if is embedded in a membership check. The way to interpret 2 in [1,2,3] is “if 2 is in the list of 3 numbers, return True, else False” (See the if there?). Let me write membership check in a way assuming this feature doesn’t exist in python.

to_check = 2
possible_values = [1,2,3]
found = False
for item in possible_values:
    if to_check == item:
        found = True

Do you now see how the above code block contains the missing if you are searching for, and how it’s exactly the same as found = 2 in [1,2,3] if you consider what found will end up with (True/False) for various inputs of to_check and possible_values? Make sure you understand how this code block is the same as in, then read my first answer again, or rethink through the whole problem yourself.(Understand the 2 provided functions, then understand the problems with your function that i mentioned above and fix it)

I don’t understand where these outputs came from so can’t comment.

Why do membership test instead of loop, if, equality check?
Because membership check on sets are way faster than on list. On list it goes element by element for O(n) time complexity. On sets, it does a hash lookup (not element by element) for O(1) time complexity.
Here’s article on set vs list performance: Python Lists vs Sets. When to use lists and when to use sets? | by Jacob Toftgaard Rasmussen | Towards Data Science.

When you get advanced, you can read about things like __contains__ (what in acts as syntactic sugar for), and __eq__ and __hash__.

Anyway interesting question, i felt i made a breakthrough explaining that implicit if. Had to imagine how someone who had no idea what a membership test is would think of implementing/understanding it. Definitely a good lesson in teaching for me.

1 Like

I think I do finally get it thanks to your explanations and the step-by-step code visualizer. Again, thank you very much for the huge amount of effort you’ve put into the answers. Because I’m so inexperienced, I don’t think there was an easier or straightforward way to answer it. It’s all been extremely interesting to read and it’s a huge relief to have someone answer a question that was really messing with my head even though it’s not super important to the course.