As far as I know, correlation is calculated between two numerical columns. However, columns like vehicle_1, vehicle_2… and cause_vehicle_1 all contain nominal data. How, then, is it possible to calculate a correlation for nominal data?

Also, what is a null correlation? I’m familiar with “correlation” but have never heard of “null correlation”.

Update: While searching the web for “null correlation”, I came across the “nullity correlation matrix”. Can someone please explain what it means and how to interpret it?

Hi @prateek
I am not @Sahil, by the way. Similarly, we can say this is not a nullity correlation.

If you can understand it from this, great! If not, no worries, me neither. Please take it up again once you understand matrices and vectors; they should come up later in the DS track.

It’s just to help understand the correlation table better using visualization.

A dark blue box indicates a high +ve correlation, whereas a dark red box indicates a high -ve correlation. Light-colored boxes indicate a weaker correlation (+ve or -ve, depending on the color).

It’s just a visual aid to understand if the columns are somehow related to each other.

Consider the off_street and on_street columns, which show a -0.99 correlation for missing values.
The dark red box helps us understand that if the collision happened off the street, we won’t find data for the on_street column. The value will be present for the off_street column but missing for on_street.
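
To make this concrete, here is a rough sketch of how a nullity correlation can be computed with pandas. It assumes the table in the thread was built the usual way: each column is first turned into a missing / not-missing flag, and the correlation is then taken between those flags. That is why nominal columns are not a problem, since the original text values are never correlated. The column names are borrowed from the thread, but the rows are invented toy data, not the real collision dataset.

```python
import pandas as pd

# Hypothetical toy data: column names come from the thread, values are made up.
df = pd.DataFrame({
    "off_street": ["12 MAIN ST", None, None, "5 OAK AVE", None, None],
    "on_street":  [None, "BROADWAY", "5 AVENUE", None, "PARK AVE", "CANAL ST"],
    "vehicle_1":  ["SEDAN", "TAXI", None, "BUS", "SEDAN", "VAN"],
})

# Step 1: replace every cell with a flag that is True where the value is missing.
is_missing = df.isnull()

# Step 2: treat the flags as 0/1 numbers and take the ordinary (Pearson)
# correlation between the columns of flags.
nullity_corr = is_missing.astype(int).corr()

print(nullity_corr.round(2))
```

In this toy data, exactly one of off_street and on_street is filled in on every row, so their nullity correlation comes out at -1. In real data a few rows usually break the pattern, which would explain a value like -0.99 instead.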