I answered correctly and then looked up Dataquest's solution to compare. I then realized that you don't have to iterate over columns in pandas; you can do something much better, like this:
The main point is that:
true_avengers[cols] == "YES"
returns a DataFrame, so we can call any DataFrame method on it directly. Pretty useful with boolean operations!
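For anyone who wants to try the idea, here is a minimal sketch of how the boolean DataFrame can be turned into a per-row count. The sample rows are made up, and I am assuming `cols` holds the five Death column names; the real `true_avengers` data will differ.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for true_avengers (made-up rows, not the real data).
true_avengers = pd.DataFrame({
    "Death1": ["YES", "NO", "YES", np.nan],
    "Death2": ["YES", np.nan, "NO", np.nan],
    "Death3": ["NO", np.nan, np.nan, np.nan],
    "Death4": ["YES", np.nan, np.nan, np.nan],
    "Death5": ["NO", np.nan, np.nan, np.nan],
})
# Assumption: cols lists the five Death columns.
cols = ["Death1", "Death2", "Death3", "Death4", "Death5"]

# Comparing a DataFrame with a scalar gives a boolean DataFrame;
# missing values compare as False, so they contribute nothing to the count.
is_yes = true_avengers[cols] == "YES"

# Summing booleans across each row (axis=1) counts the True values.
true_avengers["Deaths"] = is_yes.sum(axis=1)
print(true_avengers["Deaths"].tolist())
```

No explicit loop over columns or rows is needed, because the comparison and the sum both work on the whole selection at once.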
I too came up with a different solution than DQ's; however, mine doesn't pass the DQ error checks. Here's what I did:
deaths = {"YES":1, "NO":0}
death_cols = ["Death1", "Death2","Death3","Death4","Death5"]
for col in death_cols:
    true_avengers[col] = true_avengers[col].map(deaths)
true_avengers["Deaths"] = true_avengers[death_cols].sum(axis=1, skipna=True).astype(int)
true_avengers.Death1+true_avengers.Death2+true_avengers.Death3+true_avengers.Death4+true_avengers.Death5
true_avengers.Deaths.sum() #total deaths: 88
true_avengers.Deaths.describe() #confirmed the numbers to be the same as in the DQ solution
count    159.000000
mean       0.553459
std        0.768426
min        0.000000
25%        0.000000
50%        0.000000
75%        1.000000
max        5.000000
Name: Deaths, dtype: float64
Am I missing something, or is DQ very picky about the path you choose to solve the problem? If that's the case, it's very frustrating… Just for reference, here's the original DQ solution:
def clean_deaths(row):
    num_deaths = 0
    columns = ['Death1', 'Death2', 'Death3', 'Death4', 'Death5']
    for c in columns:
        death = row[c]
        if pd.isnull(death) or death == 'NO':
            continue
        elif death == 'YES':
            num_deaths += 1
    return num_deaths
true_avengers['Deaths'] = true_avengers.apply(clean_deaths, axis=1)
.sum() and .describe() produce the same results as my code above. However, to pass the mission I had to use DQ's code :-/
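One thing that stands out in the mapping version: the loop assigns the mapped values back to Death1 through Death5, so those columns end up holding 1/0 instead of "YES"/"NO". Whether that is what the DQ checker objects to is only a guess, since I don't know how it compares results. Here is a minimal sketch that keeps the same mapping idea but leaves the original columns alone. The sample rows and the `mapped` and `original` names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for true_avengers (made-up rows, not the real data).
true_avengers = pd.DataFrame({
    "Death1": ["YES", "NO", "YES", np.nan],
    "Death2": ["YES", np.nan, "NO", np.nan],
    "Death3": ["NO", np.nan, np.nan, np.nan],
    "Death4": ["YES", np.nan, np.nan, np.nan],
    "Death5": ["NO", np.nan, np.nan, np.nan],
})
deaths = {"YES": 1, "NO": 0}
death_cols = ["Death1", "Death2", "Death3", "Death4", "Death5"]

# Snapshot of the Death columns, only used to show they are left unchanged.
original = true_avengers[death_cols].copy()

# Map into a temporary frame instead of assigning back to the Death columns.
mapped = true_avengers[death_cols].apply(lambda col: col.map(deaths))

# NaN entries are skipped by the sum, so an all-NaN row counts as 0.
true_avengers["Deaths"] = mapped.sum(axis=1, skipna=True).astype(int)

# Death1..Death5 still hold "YES"/"NO"/NaN after the calculation.
assert true_avengers[death_cols].equals(original)
print(true_avengers["Deaths"].tolist())
```

The only difference from the loop version is that the 1/0 values live in `mapped`, so the rest of `true_avengers` looks the same as it would after running the DQ solution.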