Guided Project: Prison Break. Review Please

Hello,

I have just completed the Prison Break project.

In addition to the project, I answered the question “In which countries do helicopter prison breaks have a higher chance of success?” Constructive criticism is very welcome.

Basics.ipynb (124.9 KB)
helper.py (1.9 KB)



Hello,

Good job on completing the project.

I’m trying to understand how you calculated the helicopter escape chance of success for each country. It looks to me as if you’re just adding the frequency of escapes for a country to the integer 33.

For example, France has 15 escapes, so the value would be 15 + 33 which is 48.

I was expecting a division between number of successful attempts and total number of attempts per country.

Maybe I’m reading your notebook wrong.


Thank you for your feedback @wanzulfikri. It is much appreciated.

What I did was count the successful escapes per prison. The prison with the most successful prison breaks became my result.

However, I think the default dataset changed. See my screenshots below:

Thanks.

Let me walk through my current understanding step by step:

The initial attempt_prison_countries has duplicate countries. This is because you’ve appended the same country multiple times in the following loop.

prison_countries = []
for row in data:
    prison_countries.append(row[2])

One way to prevent that is to check that prison_countries doesn’t already contain the country before appending.

prison_countries = []
for row in data:
    if row[2] not in prison_countries:  # only append a country that is not in prison_countries yet
        prison_countries.append(row[2])
prison_countries  # leave this at the end of the cell and Jupyter will print it automatically

I also realised that your double for loop is in the wrong order, hence the weird output: you’re actually adding 1 to every country in attempt_prison_countries each time there’s a Yes.

You’ll need to loop over each country in attempt_prison_countries first, and then loop over data inside it. Any row that has the same country and a Yes increments the count stored in prison_country.

prison_countries = []
for row in data:
    if row[2] not in prison_countries:  # only append a country that is not in prison_countries yet
        prison_countries.append(row[2])

attempt_prison_countries = []
for country in prison_countries:
    attempt_prison_countries.append([country, 0])  # [country, number of successful escapes]

for prison_country in attempt_prison_countries:
    for row in data:
        if prison_country[0] == row[2]:  # the row has the same country as prison_country
            if row[3] == 'Yes':          # these two checks can also be combined with 'and'
                prison_country[1] += 1

attempt_prison_countries  # leave this at the end of the cell and Jupyter will print it automatically

You can also consider modifying the above loop to find the successful attempts immediately without multiple for loops. This is where a dictionary can be very helpful:

successful_attempts = {}
for row in data:
    if row[2] not in successful_attempts: # check if country is already in successful_attempts
        successful_attempts[row[2]] = 0   # if not, add it as a new country with value of 0
        
    if row[3] == "Yes":                   # if success equals to "Yes"    
        successful_attempts[row[2]] += 1  # increment success count for the country the row is related to

successful_attempts #leave this at the end of the cell and Jupyter will print it automatically

No matter which method you choose, what you get is only the number of successful attempts per country, not the chance of success, which is the share of successful attempts out of all attempts. To get the latter, you’ll have to divide the successful attempts per country by the total number of attempts per country that you computed earlier.
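To make that last step concrete, here is a minimal sketch. It assumes the country is in row[2] and the outcome in row[3], as in the loops above. The sample rows and the success_rates helper are made up for illustration and are not taken from your notebook or the real dataset.

```python
# Made-up sample rows in the assumed layout: [date, prison, country, succeeded]
data = [
    ["date 1", "Prison A", "Mexico", "Yes"],
    ["date 2", "Prison B", "France", "Yes"],
    ["date 3", "Prison C", "France", "No"],
    ["date 4", "Prison D", "France", "Yes"],
    ["date 5", "Prison E", "Greece", "No"],
]


def success_rates(rows):
    """Return {country: successful attempts / total attempts} (hypothetical helper)."""
    attempts = {}   # total attempts per country
    successes = {}  # successful attempts per country
    for row in rows:
        country = row[2]
        attempts[country] = attempts.get(country, 0) + 1
        if row[3] == "Yes":
            successes[country] = successes.get(country, 0) + 1
    # Every country in attempts has at least one attempt, so the division is safe.
    return {c: successes.get(c, 0) / attempts[c] for c in attempts}


rates = success_rates(data)
rates  # leave this at the end of the cell and Jupyter will print it automatically
```

Keep in mind that a country with a single successful attempt ends up at 100%, so it may be worth showing the number of attempts next to each rate.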

Another thing to be aware of: if the output looks weird, it’s worth restarting the kernel and running all the cells again.

Feel free to ask any questions if you don’t understand my explanations, or more likely, if I misunderstood your code.
