Question about variable assignment during the for loop (351-8)

I have a question about updating values in a for loop. Here is my example code:

Screen Link: https://app.dataquest.io/m/351/cleaning-and-preparing-data-in-python/8/parsing-numbers-from-complex-strings-part-two

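def process_date(data):
    # A range such as '1913-1923' becomes the rounded midpoint of its two years.
    if '-' in data:
        before, after = data.split('-')
        data = round((int(before) + int(after)) / 2)
    else:
        # A single year such as '1912' is converted directly.
        data = int(data)
    return data
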
processed_test_data = []
for row in stripped_test_data:
    Try_test = row
    Try_test = process_date(Try_test)
    processed_test_data.append(Try_test)
    row = Try_test
    
for row in moma:
    Date = row[6]
    Date = strip_characters(Date)
    Date = process_date(Date)
    row[6] = Date

What I expected to happen:
In this example, I have two for loops: one loops over the moma dataset and the other over stripped_test_data. I expected the values in both stripped_test_data and moma to be updated after the loops run.

What actually happened:
The moma dataset was updated by its for loop, but stripped_test_data was not. After execution, the values in stripped_test_data are still strings, not integers. Can anybody explain why?

stripped_test_data: list (<class 'list'>)
['1912',
 '1929',
 '1913-1923',
 '1951',
 '1994',
 '1934',
 '1915',
 '1995',
 '1912',
 '1988',
 '2002',
 '1957-1959',
 '1955',
 '1970',
 '1990-1999']
 
processed_test_data: list (<class 'list'>)
[1912,
 1929,
 1918,
 1951,
 1994,
 1934,
 1915,
 1995,
 1912,
 1988,
 2002,
 1958,
 1955,
 1970,
 1994]

This happens because your loop never assigns the processed date back into the stripped_test_data list, and it should not. The right approach is the one you already take: append each processed value to the processed_test_data list. As you can see, the values in that list are correct.
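
As a minimal sketch of that approach, the new list can also be built in one step with a list comprehension. The process_date function below repeats the logic from the question, and the three sample values are taken from the stripped_test_data output above; the loop variable name value is only illustrative.

```python
def process_date(data):
    # Same logic as the function in the question.
    if '-' in data:
        before, after = data.split('-')
        return round((int(before) + int(after)) / 2)
    return int(data)


# Shortened sample of the values shown in the question.
stripped_test_data = ['1912', '1913-1923', '1957-1959']

# Build a new list; the original strings are left untouched.
processed_test_data = [process_date(value) for value in stripped_test_data]

print(stripped_test_data)   # ['1912', '1913-1923', '1957-1959']
print(processed_test_data)  # [1912, 1918, 1958]
```

Keeping the raw strings in one list and the converted integers in another makes it easy to compare the two when checking the function.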

Thanks for your answer.

I noticed the problem as well. In my first for loop, row is just a name that refers to the current element of the list. When I assign 'row = Try_test', row is rebound to the new value Try_test, and the element stored in the list is left untouched. That's why stripped_test_data is unchanged when we print it.

In my second for loop, row is also just a name, but each row of moma is itself a list, so 'row[6] = Date' changes the content of that inner list directly.
I found a pretty good visual explanation of this on Stack Overflow. If you have a similar question, go and have a look.
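
The difference between the two loops can be sketched in a few lines. The sample lists here are made up for illustration and are not the real moma data; enumerate is shown as one common way to update a flat list in place, which is not something the original code does.

```python
# Case 1: rebinding the loop variable leaves the list unchanged.
dates = ['1912', '1929']
for row in dates:
    row = int(row)  # 'row' now names a new int; the list slot still holds the str
after_rebinding = list(dates)

# Case 2: assigning through an index replaces the element in the list.
for i, value in enumerate(dates):
    dates[i] = int(value)

# Case 3: each row is itself a list, so row[1] = ... mutates that inner list.
rows = [['Untitled', '1912'], ['Study', '1929']]  # made-up sample rows
for row in rows:
    row[1] = int(row[1])

print(after_rebinding)  # ['1912', '1929']
print(dates)            # [1912, 1929]
print(rows)             # [['Untitled', 1912], ['Study', 1929]]
```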