Hi there,
Just a quick question about the provided solution vs my own. The mission suggests adding every word from each row in the SMS column to a list called vocabulary. When that's done, it says to convert the vocabulary list to a set using the set() function, then convert it back to a list using the list() function. This removes duplicates, since a set can only hold each value once.
Before reading the whole question (a mistake I often make!), I iterated over each word, but instead of using set() and list() at the end, I used an if statement within the for loop to skip words that were already in the list.
I believe it resulted in the same list (once sorted), but is my way inefficient/slow in comparison? Is using an if statement in this manner frowned upon in some way?
Apologies if this is a dumb question or answered elsewhere in the course!
Code below.
My Code:
trainingset['SMS'] = trainingset['SMS'].str.split()
vocabulary = []
for row in trainingset['SMS']:
    for word in row:
        if word not in vocabulary:
            vocabulary.append(word)
Solution Code (using my variable names):
trainingset['SMS'] = trainingset['SMS'].str.split()
vocabulary = []
for row in trainingset['SMS']:
    for word in row:
        vocabulary.append(word)
vocabulary = list(set(vocabulary))
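
To make the comparison concrete, here is a small self-contained sketch of both approaches side by side. The rows are made up for illustration and stand in for trainingset['SMS'] after the split; vocab_if and vocab_set are just names I picked to keep the two results apart.

```python
# Made-up rows standing in for trainingset['SMS'] after .str.split()
rows = [
    ["free", "prize", "call", "now"],
    ["call", "me", "later"],
    ["free", "call", "now"],
]

# My approach: check membership before appending
vocab_if = []
for row in rows:
    for word in row:
        if word not in vocab_if:  # searches the list on every check
            vocab_if.append(word)

# Solution approach: collect every word, then de-duplicate with set()
vocab_set = []
for row in rows:
    for word in row:
        vocab_set.append(word)
vocab_set = list(set(vocab_set))

# Both contain the same words once sorted
print(sorted(vocab_if) == sorted(vocab_set))  # True

# The if-statement version keeps words in first-seen order
print(vocab_if)  # ['free', 'prize', 'call', 'now', 'me', 'later']
```

As far as I understand it, `word not in vocab_if` has to search through the list each time, whereas a set does not, so I would expect my version to slow down as the vocabulary grows. The one difference I can see in the results is ordering: my version keeps the order in which words first appear, while the order coming out of set() is not guaranteed.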