source page:

https://app.dataquest.io/m/89/introduction-to-decision-trees/11/information-gain

```
import math

import numpy

def calc_entropy(column):
    """
    Calculate entropy given a pandas series, list, or numpy array.
    """
    # Compute the counts of each unique value in the column
    counts = numpy.bincount(column)
    # Divide by the total column length to get a probability
    probabilities = counts / len(column)
    # Initialize the entropy to 0
    entropy = 0
    # Loop through the probabilities, and add each one to the total entropy
    for prob in probabilities:
        if prob > 0:
            entropy += prob * math.log(prob, 2)
    return -entropy

# Verify that our function matches our answer from earlier
entropy = calc_entropy([1,1,0,0,1])
print(entropy)
information_gain = entropy - ((.8 * calc_entropy([1,1,0,0])) + (.2 * calc_entropy([1])))
print(information_gain)
#end original code
```

Current recommended answer:

```
income_entropy = calc_entropy(income["high_income"])
median_age = income["age"].median()
left_split = income[income["age"] <= median_age]
right_split = income[income["age"] > median_age]
age_information_gain = income_entropy - ((left_split.shape[0] / income.shape[0]) * calc_entropy(left_split["high_income"]) + ((right_split.shape[0] / income.shape[0]) * calc_entropy(right_split["high_income"])))
```

I recommend changing the recommended answer for readability:

```
med_age = income['age'].median()
left_split = income[income['age'] <= med_age]
right_split = income[income['age'] > med_age]
left_prob = left_split.shape[0] / income.shape[0]
right_prob = right_split.shape[0] / income.shape[0]
income_entropy = calc_entropy(income['high_income'])
left_entropy = calc_entropy(left_split['high_income'])
right_entropy = calc_entropy(right_split['high_income'])
age_information_gain = income_entropy - (left_prob*left_entropy+right_prob*right_entropy)
age_information_gain
```

The above is my current answer to that problem. I recommend switching to it because each intermediate value (the share of rows in each split and the entropy of each split) gets its own name, which makes it clear what goes into the information gain calculation. With the current answer, you have to untangle one long nested expression before you can see what is actually happening.
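To check that the two versions really compute the same thing, here is a minimal sketch that runs both on a small made-up table. The `income` DataFrame below is hypothetical toy data standing in for the exercise's dataset, and `one_line_gain` is a name I introduced for the original answer's result; `calc_entropy` is a condensed version of the function from the exercise.

```python
import math

import numpy
import pandas

def calc_entropy(column):
    """Base-2 entropy of a column of non-negative integer labels."""
    counts = numpy.bincount(column)
    probabilities = counts / len(column)
    # Skip zero probabilities, since log(0) is undefined
    return -sum(p * math.log(p, 2) for p in probabilities if p > 0)

# Hypothetical toy data, NOT the real income dataset from the exercise
income = pandas.DataFrame({
    "age": [22, 25, 30, 35, 40, 45, 50, 60],
    "high_income": [0, 0, 0, 1, 0, 1, 1, 1],
})

# Current recommended answer: everything in one expression
income_entropy = calc_entropy(income["high_income"])
median_age = income["age"].median()
left_split = income[income["age"] <= median_age]
right_split = income[income["age"] > median_age]
one_line_gain = income_entropy - ((left_split.shape[0] / income.shape[0]) * calc_entropy(left_split["high_income"]) + ((right_split.shape[0] / income.shape[0]) * calc_entropy(right_split["high_income"])))

# Suggested answer: name each piece of the weighted sum first
left_prob = left_split.shape[0] / income.shape[0]
right_prob = right_split.shape[0] / income.shape[0]
left_entropy = calc_entropy(left_split["high_income"])
right_entropy = calc_entropy(right_split["high_income"])
age_information_gain = income_entropy - (left_prob * left_entropy + right_prob * right_entropy)

# Both forms should agree up to floating-point rounding
print(math.isclose(one_line_gain, age_information_gain))
```

The result is the same either way, so the only difference between the two answers is how easy the calculation is to follow.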