Each one of those represents a row of the dataset and is colored according to whether its values are null or non-null. When creating the sorted dataframe, we made the REGION column the index, so you’ll see each region multiple times. If you don’t pass the yticklabels=20 parameter to the heatmap, every row gets its own label, and the labels overlap so much that they become unreadable. With yticklabels=20, only one label is printed for every 20 rows, but we’ll still see duplicates because each region covers many rows. I hope that makes sense!
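In case it helps to see it in isolation, here is a minimal sketch of the idea. The data is made up for illustration: the region names, the POPULATION and INCOME columns, and the row counts are all assumptions, not the dataset from the tutorial. Only the REGION index and yticklabels=20 come from the discussion above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: 3 regions x 40 rows each (not the tutorial's dataset).
df = pd.DataFrame({
    "REGION": np.repeat(["East", "North", "West"], 40),
    "POPULATION": np.arange(120, dtype=float),
    "INCOME": np.arange(120, dtype=float),
})
df.loc[df.index % 7 == 0, "INCOME"] = np.nan  # knock out every 7th INCOME value

# Sort by region and make REGION the index, so each region labels many rows.
sorted_df = df.sort_values("REGION").set_index("REGION")

# One heatmap row per dataset row, colored by null / non-null.
# yticklabels=20 draws a label only for every 20th row.
ax = sns.heatmap(sorted_df.isnull(), yticklabels=20, cbar=False)

# Collect the labels that were actually drawn on the y-axis.
shown_labels = [t.get_text() for t in ax.get_yticklabels()]
print(shown_labels)
```

With 120 rows and one label per 20 rows, six labels are drawn, and because each made-up region spans 40 rows, every region name shows up twice. That is the same duplication you are seeing, just on a smaller scale.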