Screen Link:
What I expected to happen:
In step 4, it was determined that Marital Status is the best column to split on.
What actually happened:
When running the ID3 algorithm, it kept splitting on Age.
Hi,
My code is basically the same as the solution. The question is more about the concept.
def print_with_depth(string, depth):
    # Add space before a string
    prefix = " " * depth
    # Print a string, and indent it appropriately
    print("{0}{1}".format(prefix, string))

def print_node(tree, depth):
    # Check for the presence of "label" in the tree
    if "label" in tree:
        # If found, then this is a leaf, so print it and return
        print_with_depth("Leaf: Label {0}".format(tree["label"]), depth)
        # This is critical -- without it, you'll get infinite recursion
        return
    # Print information about what the node is splitting on
    print_with_depth("{0} > {1}".format(tree["column"], tree["median"]), depth)
    # Create a list of tree branches
    branches = [tree["left"], tree["right"]]
    # Recursively call print_node on each branch.
    # Pass depth + 1 instead of doing depth += 1 inside the loop,
    # otherwise the right branch is printed one level deeper than the left
    for branch in branches:
        print_node(branch, depth + 1)

print_node(tree, 0)
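In step 4, we run the find_best_column() function on the entire income dataframe.
In step 7, we run the id3() function on the smaller, pretend dataframe called data, which was created in step 5.
The two steps work on different data, so the best column to split on need not be the same in both.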
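Here is a rough sketch of how the same "best column" logic can pick different columns on different data. It assumes a median split scored by information gain. The rows and the helper names (entropy, information_gain, best_column) are made up for illustration; this is not the mission's income data or its find_best_column() implementation.

```python
import math
from statistics import median

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / total
        result -= p * math.log2(p)
    return result

def information_gain(rows, column, target):
    """Gain from splitting rows at the median of the given column."""
    split = median(row[column] for row in rows)
    left = [row[target] for row in rows if row[column] <= split]
    right = [row[target] for row in rows if row[column] > split]
    remainder = sum(len(side) / len(rows) * entropy(side) for side in (left, right))
    return entropy([row[target] for row in rows]) - remainder

def best_column(rows, columns, target):
    """Return the column with the highest information gain."""
    return max(columns, key=lambda c: information_gain(rows, c, target))

# Hypothetical toy rows: (age, marital_status, high_income)
raw = [
    (20, 0, 0), (25, 1, 0), (40, 0, 1), (45, 1, 1),
    (22, 1, 1), (28, 1, 1), (50, 0, 0), (55, 0, 0),
]
full_rows = [
    {"age": a, "marital_status": m, "high_income": y} for a, m, y in raw
]
# A smaller dataset, standing in for the pretend dataframe
small_rows = full_rows[:4]

columns = ["age", "marital_status"]
print(best_column(full_rows, columns, "high_income"))   # marital_status
print(best_column(small_rows, columns, "high_income"))  # age
```

On the eight toy rows, marital_status has the higher gain. On the first four rows alone, age separates the labels perfectly and marital_status tells you nothing, so a tree built on that smaller data would split on age.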