# Cleaning and Preparing Data in Python Practice Problems - Slide 7

The function I wrote below should assign a unique 4-digit id to the rows that are missing ids. It first computes all 4-digit permutations of the digits 0 through 9, then builds a list of the ids already in the dataset. Next it finds the rows without ids. For each such row, we loop through the permutations, and when we find one that is not already in the dataset we assign that permutation to the row as its id. However, when I submit my answer, I see that some rows have duplicate ids, for example: "The column `id` has duplicated values: Example rows `1` and `10` with value `9876`."

```python
def clean_id_col():

    import itertools
    permutation = list(itertools.permutations([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 4))

    string_permutation = []

    for perm in permutation:
        number = ''
        for integer in perm:
            number = number + str(integer)
        string_permutation.append(number)

    unique_ids = []

    for row in listings:
        ids = row
        if ids:
            unique_ids.append(str(ids))

    for row in listings[1:]:
        ids = row
        if not ids:
            for number in string_permutation:
                if number not in unique_ids:
                    row = number

    write_csv(listings)
```

You loop through all numbers in `string_permutation` above. This is what happens -

• `ids` is empty
  • First iteration of `for` loop
    • `number` not in `unique_ids`
    • `row` set as `number`
  • Second iteration of `for` loop
    • `number` not in `unique_ids`
    • `row` set as `number`
  • Third iteration of `for` loop
    • `number` not in `unique_ids`
    • `row` set as `number`

And so on.

For the same empty `ids`, your `row` keeps getting overwritten with every `number` in `string_permutation` that is not in `unique_ids`.
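If it helps to see this in isolation, here is a small sketch of the same loop pattern outside your function. The names `candidates`, `taken` and `assigned` are made up for illustration, and `taken` is stand-in data, not the real dataset:

```python
import itertools

# All 4-digit strings built from permutations of the digits 0-9,
# in the order itertools.permutations yields them.
candidates = ["".join(map(str, p)) for p in itertools.permutations(range(10), 4)]

# Stand-in for the ids already present in the dataset (made-up value).
taken = ["0123"]

assigned = None
for number in candidates:
    if number not in taken:
        # Overwritten on every pass: nothing stops the loop,
        # and nothing records that this number has been used.
        assigned = number

print(assigned)  # 9876
```

Whatever `taken` contains, `assigned` ends up as the last candidate that is not in it, which here is the last permutation.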

Do you notice the problems (yes, plural) here?

Hey Doctor!

Thank you for your fast reply, it was very informative for me. Yes, I noticed the problems. I modified the code as below, so that when a unique id is found I assign it to that row and also append it to the unique ids list. When this happens, I stop the loop:

```python
if number not in unique_ids:
    row = number
    unique_ids.append(number)
    break
```
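For anyone reading later, a minimal self-contained sketch of the whole fix might look like the following. It is not the exercise's reference solution. It assumes `listings` is a list of rows with a header in the first row and the id in column 0; the function name `fill_missing_ids`, the `id_col` parameter and the sample rows are all hypothetical.

```python
import itertools


def fill_missing_ids(rows, id_col=0):
    """Give each data row with an empty id an unused 4-digit id (rows[0] is the header)."""
    # Ids already present in the data; a set makes the membership test cheap.
    used = {str(row[id_col]) for row in rows[1:] if row[id_col]}

    # Lazily generate candidate ids. The generator is shared across rows, so
    # candidates that were already skipped or assigned are never revisited.
    candidates = ("".join(map(str, p)) for p in itertools.permutations(range(10), 4))

    for row in rows[1:]:
        if not row[id_col]:
            for number in candidates:
                if number not in used:
                    row[id_col] = number   # modify the row in place
                    used.add(number)       # remember the id so it is not reused
                    break                  # stop at the first free id
    return rows


# Made-up sample data for illustration only.
listings = [
    ["id", "name"],
    ["0123", "a"],
    ["", "b"],
    ["0124", "c"],
    ["", "d"],
]

fill_missing_ids(listings)
print([row[0] for row in listings[1:]])  # ['0123', '0125', '0124', '0126']
```

The two lines that matter are the same ones as in the snippet above: record the id as used, then `break`.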