Categorical and Quantitative data

Please, I need help with how to differentiate categorical data from quantitative data.

@mbahstanley77, categorical data can be thought of as “non-numeric” data. Examples of categorical variables include the model of your car, the color of your hair, the country where you were born, etc. Quantitative data includes variables that are numeric (e.g. height, weight, shoe size). These variables are represented as numbers, and arithmetic on them is meaningful (you can compare differences or take an average). However, categorical variables can also be represented as numbers (e.g. on surveys where you are asked for a rating: 5 = Extremely Likely, 4 = Likely, and so on). Although these responses are numeric, the rating is actually a categorical variable.
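
As a rough sketch of the survey example, here is one way to tell pandas that numeric rating codes are really ordered categories. The data is made up, and only the labels for 5 and 4 come from the post above; the other labels are assumptions for illustration.

```python
import pandas as pd

# Hypothetical survey responses coded 1-5 (5 = Extremely Likely).
ratings = pd.Series([5, 4, 4, 2, 5, 1])

# Labels for 5 and 4 are from the post; the rest are assumed for this sketch.
labels = {
    1: "Extremely Unlikely",
    2: "Unlikely",
    3: "Neutral",
    4: "Likely",
    5: "Extremely Likely",
}

# Build an ordered categorical so pandas treats the values as ranked labels,
# not as numbers to be averaged.
rating_series = pd.Series(
    pd.Categorical(
        ratings.map(labels),
        categories=[labels[k] for k in sorted(labels)],
        ordered=True,
    )
)

# Counting and ordering make sense for this variable.
counts = rating_series.value_counts()
at_least_likely = int((rating_series >= "Likely").sum())
print(counts)
print(at_least_likely)
```

Counts and order comparisons work on the ordered categorical, while taking a mean of the labels would not be meaningful.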

So can df.describe() be helpful in differentiating them in a CSV file? @samson.john

It can, but there are exceptions. Columns without numeric values are probably categorical. However, a column like the “rating” variable described above would still get a numeric summary from df.describe() if its values are stored as int or float, even though the rating is a categorical variable. If you have a data dictionary, it is easier to determine whether each column is categorical or quantitative.
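
A minimal sketch of that caveat, using a small made-up DataFrame (the column names and values are hypothetical). The unique-value threshold is only a rough heuristic I am assuming for illustration; it does not replace a data dictionary.

```python
import pandas as pd

# Hypothetical data standing in for a CSV loaded with pd.read_csv(...).
df = pd.DataFrame({
    "car_model": ["Civic", "Corolla", "Civic", "Focus",
                  "Focus", "Civic", "Corolla", "Civic"],
    "height_cm": [170.2, 165.0, 180.5, 175.3, 168.8, 182.1, 159.4, 177.0],
    "rating": [5, 4, 4, 2, 5, 1, 4, 5],
})

# By default, describe() summarizes only the numeric columns,
# so "rating" shows up next to "height_cm".
summary = df.describe()
print(summary)

# Split columns by storage type, not by meaning.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
non_numeric_cols = df.select_dtypes(exclude="number").columns.tolist()

# Rough heuristic (an assumption): numeric columns with very few distinct
# values may be coded categories and deserve a closer look.
suspect_cols = [c for c in numeric_cols if df[c].nunique() <= 5]
print(non_numeric_cols, suspect_cols)
```

Here "rating" is flagged as a possible coded category even though df.describe() reports a mean and standard deviation for it.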

Thanks @samson.john for the reply

Just to add to the already great answer: dates are made of numbers but are often treated as categorical data, specifically ordinal categorical data, because we can order dates. The interesting thing with dates is that, depending on how they are used, they can be nominal, ordinal, interval, or ratio. Only the nominal and ordinal uses are categorical, though; interval and ratio uses are quantitative.
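
To illustrate how the same date column can be used in different ways, here is a small sketch with made-up dates. Which treatment is appropriate depends on the analysis; this only shows that pandas supports each one.

```python
import pandas as pd

# Hypothetical dates, parsed into datetime values.
dates = pd.to_datetime(pd.Series(["2023-01-15", "2023-03-02", "2024-01-20"]))

# Ordinal-style use: dates can be ordered.
is_sorted = dates.is_monotonic_increasing

# Interval-style use: the difference between two dates is meaningful.
gap_days = (dates.iloc[1] - dates.iloc[0]).days

# Nominal-style use: the month name acts as a label to group by.
month_names = dates.dt.month_name()

print(is_sorted, gap_days)
print(month_names.tolist())
```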