While learning how to visualize null values and the correlation between columns that contain null values, my entire attention went into understanding lines of code that I was seeing for the very first time, such as creating a triangular mask:
# create a triangular mask to avoid repeated values and make
# the plot easier to read
missing_corr = missing_corr.iloc[1:, :-1]
mask = np.triu(np.ones_like(missing_corr), k=1)
Can someone please help me understand how I should go about grasping the core concept here, and whether concepts like a triangular mask will be covered in the future?
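This is a mistake on our end; we haven’t introduced this term before. I will get this issue logged. A triangular mask basically hides the repeated values in a correlation table. Because the table is symmetric, every pair of columns appears twice, once above and once below the diagonal, so the mask hides one of the two copies.
We also exclude the first row and the last column to avoid plotting the correlation of each column with itself (1.0). Here is what the plot will look like if we don’t exclude the first row and last column.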
To make it easier to understand what is going on, let’s print a subset of the correlation dataframe:
print(missing_corr.iloc[0:5, 0:5])
           vehicle_1  vehicle_2  vehicle_3  vehicle_4  vehicle_5
vehicle_1   1.000000   0.151516   0.019972   0.008732   0.004425
vehicle_2   0.151516   1.000000   0.131813   0.057631   0.029208
vehicle_3   0.019972   0.131813   1.000000   0.437214   0.221585
vehicle_4   0.008732   0.057631   0.437214   1.000000   0.506810
vehicle_5   0.004425   0.029208   0.221585   0.506810   1.000000
In the table above, you will notice that the first row is the same as the first column, and the last row is the same as the last column. If we remove the first row and the last column, that problem is solved.
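If it helps to see the two steps working together, here is a minimal sketch that rebuilds the 5x5 table above by hand and applies the same trimming and masking. The names corr, trimmed and visible are only for illustration, and using DataFrame.mask to blank out the hidden cells is my own choice for printing the result; it is not part of the course code.

```python
import numpy as np
import pandas as pd

# Rebuild the 5x5 subset printed above (values copied from the table).
cols = [f"vehicle_{i}" for i in range(1, 6)]
values = [
    [1.000000, 0.151516, 0.019972, 0.008732, 0.004425],
    [0.151516, 1.000000, 0.131813, 0.057631, 0.029208],
    [0.019972, 0.131813, 1.000000, 0.437214, 0.221585],
    [0.008732, 0.057631, 0.437214, 1.000000, 0.506810],
    [0.004425, 0.029208, 0.221585, 0.506810, 1.000000],
]
corr = pd.DataFrame(values, index=cols, columns=cols)

# Step 1: drop the first row and the last column.
# The 1.0 self-correlations now sit just above the diagonal of the trimmed table.
trimmed = corr.iloc[1:, :-1]

# Step 2: build the triangular mask.
# True marks every cell strictly above the diagonal (k=1), which is where
# the 1.0 values and the repeated correlations ended up.
mask = np.triu(np.ones_like(trimmed, dtype=bool), k=1)

# For illustration only: replace the masked cells with NaN so we can print
# what a plot would still show.
visible = trimmed.mask(mask)
print(visible)
```

What remains is a lower triangle with one value for each pair of different columns and none of the 1.0 self-correlations. When plotting, the mask would normally be handed to the plotting function instead of being applied by hand. For example, seaborn's heatmap accepts a mask argument, though I'm assuming here that seaborn is the plotting library in use.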