I’m starting to think about optimizing code for speed in general, and am curious whether anyone has resources on conceptual guidelines, and/or would welcome feedback on the two approaches to the same problem below. I timed both, and mine was negligibly faster, but I am wondering what tradeoffs each approach makes. Which would be faster if the dataset were 1000X bigger, and why? Thanks!

Screen Link:

https://app.dataquest.io/m/89/introduction-to-decision-trees/9/overview-of-dataset-entropy

My Code:

```
import math

possibilities = income['high_income'].unique()
outcomes = len(income['high_income'])
entropies = []
for e in possibilities:
    # Count rows in this class and convert to a probability
    occurrences = len(income[income['high_income'] == e])
    ratio = occurrences / outcomes
    # Log base = number of classes, so this generalizes beyond binary targets
    P = ratio * math.log(ratio, len(possibilities))
    entropies.append(P)
income_entropy = -(sum(entropies))
```

DQ code:

```
prob_0 = income[income["high_income"] == 0].shape[0] / income.shape[0]
prob_1 = income[income["high_income"] == 1].shape[0] / income.shape[0]
income_entropy = -(prob_0 * math.log(prob_0, 2) + prob_1 * math.log(prob_1, 2))
```