Building a Decision Tree

Screen Link:

What I expected to happen:
In step 4, it was determined that Marital Status is the best column to split on.

What actually happened:
When running the ID3 algorithm, it kept splitting on Age.

Hello @eykei, welcome to the community!

Please share your code so people can help you.

Hi,
My code is basically the same as the solution. The question is more about the concept.

def print_with_depth(string, depth):
    # Add space before a string
    prefix = "    " * depth
    # Print a string, and indent it appropriately
    print("{0}{1}".format(prefix, string))
    
    
def print_node(tree, depth):
    # Check for the presence of "label" in the tree
    if "label" in tree:
        # If found, then this is a leaf, so print it and return
        print_with_depth("Leaf: Label {0}".format(tree["label"]), depth)
        # This is critical -- without it, you'll get infinite recursion
        return
    # Print information about what the node is splitting on
    print_with_depth("{0} > {1}".format(tree["column"], tree["median"]), depth)
    
    # Create a list of tree branches
    branches = [tree["left"], tree["right"]]
    
    # Recursively print each branch one level deeper than this node.
    # Pass depth + 1 instead of incrementing depth inside the loop,
    # otherwise the right branch is indented one level too far.
    for branch in branches:
        print_node(branch, depth + 1)

print_node(tree, 0)

In step 4, we run the find_best_column() function on the entire income dataframe.

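In step 7, we run the id3() function on the smaller, pretend dataframe called data, which was created in step 5. Because the two steps use different data, the best column to split on does not have to be the same, which is why id3() can split on Age even though step 4 picked Marital Status.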
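To make this concrete, here is a minimal sketch in plain Python showing how the best split column depends on the rows you pass in. It is not the course's code: entropy(), information_gain() and best_split_column() are hypothetical stand-ins, and both datasets are made-up toy data, assuming numeric columns split at their median.

```python
import math
from statistics import median


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / total
        result -= p * math.log2(p)
    return result


def information_gain(rows, column, target):
    """Entropy reduction from splitting rows at the column's median."""
    cutoff = median(row[column] for row in rows)
    left = [row[target] for row in rows if row[column] <= cutoff]
    right = [row[target] for row in rows if row[column] > cutoff]
    gain = entropy([row[target] for row in rows])
    for side in (left, right):
        if side:  # skip an empty side to avoid dividing by zero
            gain -= len(side) / len(rows) * entropy(side)
    return gain


def best_split_column(rows, columns, target):
    """Return the column with the highest information gain."""
    return max(columns, key=lambda c: information_gain(rows, c, target))


# Made-up "full" data: marital_status separates the labels, age does not.
full_rows = [
    {"age": 22, "marital_status": 0, "high_income": 0},
    {"age": 50, "marital_status": 0, "high_income": 0},
    {"age": 25, "marital_status": 0, "high_income": 0},
    {"age": 60, "marital_status": 0, "high_income": 0},
    {"age": 30, "marital_status": 1, "high_income": 1},
    {"age": 55, "marital_status": 1, "high_income": 1},
    {"age": 28, "marital_status": 1, "high_income": 1},
    {"age": 65, "marital_status": 1, "high_income": 1},
]

# Made-up "small" data: here age separates the labels instead.
small_rows = [
    {"age": 20, "marital_status": 0, "high_income": 0},
    {"age": 25, "marital_status": 1, "high_income": 0},
    {"age": 50, "marital_status": 0, "high_income": 1},
    {"age": 60, "marital_status": 1, "high_income": 1},
]

columns = ["age", "marital_status"]
print(best_split_column(full_rows, columns, "high_income"))   # marital_status
print(best_split_column(small_rows, columns, "high_income"))  # age
```

The same function gives a different answer for each dataset, so a result computed on one dataframe says nothing about which column an algorithm will pick on another.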