# Cleaning and Preparing Data in Python Practice Problems - Cleaning House Listings 3

My Code:

``````
import csv
import random

def write_csv(rows):
    f = open('listings_clean.csv', mode='w', newline='')
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)
    f.close()

def clean_id_col():
    f = open('listings.csv')
    listings = list(csv.reader(f))   # assumed from context: load every row into a list
    f.close()
    id_set = set()               # Create a set because a set does not allow duplicates
    for row in listings:         # assumed from context: loop over the rows
        random_num = random.sample(range(1000,9999), 1)

        # If 'id' is blank, we set it to a random number, then add it to the set.
        # If it is not blank, we just add the 'id' number to the set.
        if row[0] == '':
            row[0] = str(random_num[0])
            if row[0] in id_set:              # This step is to make sure there are no duplicates
                row[0] = str(random_num[0])   # if there is one, we will generate another random number
            id_set.add(row[0])
        elif row[0] != '':
            id_set.add(row[0])

    write_csv(listings)

clean_id_col()
f = open('listings_clean.csv')
rows = list(csv.reader(f))
f.close()
for i in range(30):
    print(rows[i])
``````

I used a random number generator, although no prior lesson covered it. Since I did not fully understand the answer given by the assignment, I came up with my own, and I think it works better for this problem.

The code above was marked as correct, but I would love to know people's thoughts on it. Do you find this approach easier to understand? What code would you replace or add?

You can use `random.randint(1000, 9999)` instead of `random.sample(range(1000,9999), 1)`, although you'll need to change `random_num[0]` to `random_num`.
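Here is a quick sketch of how the two calls differ, using only the standard `random` module. The seed and the variable names `sampled` and `picked` are just for illustration and are not part of the original solution.

```python
import random

random.seed(0)  # fixed seed only so the illustration is repeatable

# random.sample returns a list, even when asked for a single item,
# which is why the original code has to index it with [0]
sampled = random.sample(range(1000, 9999), 1)
print(type(sampled), len(sampled))

# random.randint returns a plain int, so no indexing is needed
picked = random.randint(1000, 9999)
print(type(picked))

# One small difference: randint includes its upper bound (9999),
# while range(1000, 9999) stops at 9998
print(1000 <= sampled[0] <= 9998, 1000 <= picked <= 9999)
```

For this exercise the one-number difference in the upper bound should not matter, but it is worth knowing when swapping one call for the other.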

In. . .

``````
        if row[0] == '':
            row[0] = str(random_num[0])
            if row[0] in id_set:              # This step to make sure there is no duplicates
                row[0] = str(random_num[0])   # if there is, we will generate another random number
        elif row[0] != '':
``````

. . . the. . .

``````
if row[0] in id_set:              # This step to make sure there is no duplicates
    row[0] = str(random_num[0])   # if there is, we will generate another random number
``````

. . . part doesn't do what you think it does. The second line doesn't fetch a new random number; it reuses the one that was generated at the start of the current `for` loop iteration. Consequently, whenever we enter the first `if`, `row[0]` ends up with that same value whether or not the inner check fires. You can remove these two lines and your solution will work just the same.
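To see this in isolation, here is a minimal sketch with made-up values: `id_set` and `random_num` are set by hand to force a collision, which is an assumption for the demonstration rather than something taken from the real data.

```python
id_set = {"1234"}      # hypothetical: an id that is already taken
random_num = [1234]    # pretend the draw at the top of the loop collided

row = [""]             # a row whose id is blank
row[0] = str(random_num[0])
if row[0] in id_set:
    # Same list, same element: nothing new is drawn here
    row[0] = str(random_num[0])

print(row[0], row[0] in id_set)  # 1234 True
```

The duplicate is still there after the inner `if`, because nothing between the two assignments calls the random module again.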

It was marked as correct because, on that run, none of the random numbers collided with an existing `id`. Run it enough times and a collision will eventually make it fail.
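One way to rule out collisions is to keep drawing until the number is unused. The sketch below is only one possible approach, not the assignment's official answer. The helper name `new_unique_id` and the small in-memory `rows` list are hypothetical stand-ins for the real `listings.csv` data.

```python
import random

def new_unique_id(id_set, low=1000, high=9999):
    """Draw random ids until one is not already in id_set (hypothetical helper)."""
    while True:
        candidate = str(random.randint(low, high))
        if candidate not in id_set:
            id_set.add(candidate)
            return candidate

# Tiny in-memory stand-in for the rows of listings.csv
rows = [["1001", "a"], ["", "b"], ["1002", "c"], ["", "d"]]

# Collect the existing ids first so a new id cannot clash with a later row
id_set = {row[0] for row in rows if row[0] != ""}
for row in rows:
    if row[0] == "":
        row[0] = new_unique_id(id_set)

ids = [row[0] for row in rows]
print(len(ids) == len(set(ids)))  # True
```

Collecting the existing ids before the loop matters: checking only the ids seen so far would still let a random id clash with one that appears further down the file. The `while True` loop assumes there are free numbers left in the range; if every value were taken it would never finish.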