When to use "return" in a loop? Can I loop over the column?

I was writing code to create a function that cleans date data, and I was confused about when to use “return” and when it isn’t needed.
The function is meant to convert a year range like “1970-1990” to the average of the two years. Here are the steps:

  • Splits the string into two strings, before and after the dash character.
  • Converts the two numbers to the integer type and then averages them by adding them together and dividing by two.
  • Uses the round() function to round the average, so values like 1964.5 become 1964.

Here is the given code (prepared for the next step):

test_data = ["1912", "1929", "1913-1923",
             "(1951)", "1994", "1934",
             "c. 1915", "1995", "c. 1912",
             "(1988)", "2002", "1957-1959",
             "c. 1955.", "c. 1970's", 
             "C. 1990-1999"]

bad_chars = ["(",")","c","C",".","s","'", " "]

def strip_characters(string):
    for char in bad_chars:
        string = string.replace(char,"")
    return string

stripped_test_data = ['1912', '1929', '1913-1923',
                      '1951', '1994', '1934',
                      '1915', '1995', '1912',
                      '1988', '2002', '1957-1959',
                      '1955', '1970', '1990-1999']

Here is my code:

def process_date(date):
    if "-" in date:
        date.split("-")
        date = (int(date[1])+int(date[2])) / 2 
        date = round(date)
    else:
        date = int(date) 
    return date 

processed_test_data = []

for d in stripped_test_data:
    d = process_date(d)
    processed_test_data.append(d)
    return processed_test_data


for column in moma:
    date = column[6]
    for t in date: 
        t = strip_characters(t)
        t = process_date(t)
        date.append(t)
        column[6] = date

But here is the answer:

def process_date(date):
    if "-" in date:
        split_date = date.split("-")
        date_one = split_date[0]
        date_two = split_date[1]       
        date = (int(date_one) + int(date_two)) / 2
        date = round(date)
    else:
        date = int(date)
    return date

processed_test_data = []

for d in stripped_test_data:
    date = process_date(d)
    processed_test_data.append(date)

for row in moma:
    date = row[6]
    date = strip_characters(date)
    date = process_date(date)
    row[6] = date

I have three questions:

  1. In the first def, can I use
    date = (int(date[1])+int(date[2])) / 2
    to replace:
    date_one = split_date[0]
    date_two = split_date[1]
    date = (int(date_one) + int(date_two)) / 2
    in the answer?

  2. In the second loop, why can’t I use “return”?

  3. In the final loop, why shouldn’t I loop over each element in a column?

I know there is a lot of confusion here. I’d appreciate it if anyone can help me! Thank you 🙂

Hi @stellayou1126,
I will try to answer your questions clearly.
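1) You can’t. The split() method does not turn your string into a list in place; it returns a new list of strings, and your code throws that list away because the result of date.split("-") is never assigned to a variable.

After calling date.split("-"), the variable date is still a string, so date[1] and date[2] are only the second and third characters of that string. Note also that list indexes start at 0, so even with the list saved, the two years would be at indexes 0 and 1, not 1 and 2.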
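
A minimal sketch of the difference, using one value from stripped_test_data (the names unchanged, chars and average are only for illustration):

```python
date = "1913-1923"

# split() returns a new list; the original string is left untouched.
date.split("-")
unchanged = date            # still the string "1913-1923"
chars = (date[1], date[2])  # single characters of the string, not years

# Saving the result gives a list whose items start at index 0.
split_date = date.split("-")
average = round((int(split_date[0]) + int(split_date[1])) / 2)
```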
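2) Because that loop is not inside a function. return can only be used inside a function body (a def); at the top level of a script Python rejects it with a SyntaxError. At the top level you simply let the loop finish, and processed_test_data is already filled in.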
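
As a rough illustration, if you do want a return, the loop has to be wrapped in a function, and the return belongs after the loop rather than inside it. The helper name process_all is hypothetical, and process_date is repeated from the answer code only so that the snippet runs on its own:

```python
def process_date(date):
    # Same logic as the answer code in the question.
    if "-" in date:
        split_date = date.split("-")
        date = round((int(split_date[0]) + int(split_date[1])) / 2)
    else:
        date = int(date)
    return date


def process_all(dates):
    # Hypothetical wrapper: return is legal here because we are inside a def.
    processed = []
    for d in dates:
        processed.append(process_date(d))
    # Dedented out of the loop, so it runs once every item has been handled.
    # Indented inside the loop, it would exit after the first item.
    return processed


result = process_all(["1912", "1913-1923", "1957-1959"])
```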
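3) In short, because each item you get from looping over moma is a row, and the columns of a row hold different kinds of data. row[6] is a single date string, so there is nothing to loop over: iterating over it would visit its characters one at a time.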
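
To see why, here is a small sketch with a made-up two-row list standing in for moma. The real data is not shown in this thread, so the row contents are assumptions; only the date at index 6 matters:

```python
# Hypothetical stand-in for moma: a list of rows, date string at index 6.
rows = [
    ["a", "b", "c", "d", "e", "f", "1913-1923"],
    ["a", "b", "c", "d", "e", "f", "1995"],
]

# Looping over one date string visits its characters, not dates.
pieces = [t for t in rows[1][6]]

# Looping over the rows and updating index 6 handles one date per row.
for row in rows:
    date = row[6]
    if "-" in date:
        first, second = date.split("-")
        date = round((int(first) + int(second)) / 2)
    else:
        date = int(date)
    row[6] = date
```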
