Yeah, it can be confusing.
When working with strings made up of letters, lexicographic order is just alphabetical order.
However, when working with numbers, it's the order you get by comparing the digits of the numbers from left to right.
So, if you have the following numbers -
1, 2, 11, 3, 23, 22, 15, 31
Lexicographic order will start with -
1, 11, 15, 2
Notice that we first look at the smallest leading digit, which is 1. A number with no further digits comes first, so 1 leads the list. Then we compare the second digit of the numbers that have one; the smallest of those second digits is 1, which is why the second number above is 11.
Once we have exhausted all numbers starting with 1, we move on to the next smallest leading digit, which is 2.
Based on that, this is the final outcome -
1, 11, 15, 2, 22, 23, 3, 31
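One way to check the ordering above is the short Python sketch below. It assumes the built-in sorted function and uses key=str so that each number is compared by its digits as text; the variable names are only for illustration.

```python
numbers = [1, 2, 11, 3, 23, 22, 15, 31]

# Compare each number by its decimal digits (as text) rather than by value.
lexicographic = sorted(numbers, key=str)

print(lexicographic)  # [1, 11, 15, 2, 22, 23, 3, 31]
```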
Now, if there was a 3-digit number like 112, then that would be taken into account as well. 112 would be "higher" than 11, because 11 is a prefix of it, but would be smaller than 15, because its second digit 1 is smaller than 5. So the order of numbers in that case would be -
1, 11, 112, 15, 2, 22, 23, 3, 31
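The digit-by-digit rule can also be written out by hand. The sketch below is only an illustration, assuming non-negative whole numbers; comes_before is a made-up helper name, not a standard function.

```python
def comes_before(a: int, b: int) -> bool:
    """Return True if a sorts strictly before b in lexicographic (digit) order."""
    digits_a, digits_b = str(a), str(b)
    # Walk both numbers from the leftmost digit.
    for da, db in zip(digits_a, digits_b):
        if da != db:
            return da < db  # the first differing digit decides
    # All shared digits match: the shorter number (a prefix) comes first.
    return len(digits_a) < len(digits_b)

print(comes_before(11, 112))  # True: 11 is a prefix of 112
print(comes_before(112, 15))  # True: second digit 1 is smaller than 5
print(comes_before(2, 15))    # False: first digit 2 is larger than 1
```

This is the same comparison that sorting the numbers as text performs, just spelled out step by step.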
There is more to it, but the above should suffice. In most cases, you will have functions that sort in the order above for you, so it's not something to worry about. Honestly, I have not encountered this enough myself to comment on its usefulness in data science/analysis.
But it should be noted that when numbers are represented as strings, like -
["1", "11", "15", "2", "22", "23", "3", "31"]
Sorting a list of strings like the one above with the commonly available sorting functions will sort them in lexicographic order. Sorting them as normal numerical values -
[1, 2, 11, 3, 23, 22, 15, 31]
will sort them in the numerical order you expect.
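As a quick illustration of the difference, here is a small Python sketch using the built-in sorted function. The variable names are made up for the example, and key=int is one possible way to get numerical order from strings, assuming every string is a valid whole number.

```python
as_strings = ["1", "2", "11", "3", "23", "22", "15", "31"]
as_numbers = [1, 2, 11, 3, 23, 22, 15, 31]

# Strings are compared character by character, so this is lexicographic.
sorted_strings = sorted(as_strings)
# Integers are compared by value, so this is numerical.
sorted_numbers = sorted(as_numbers)
# Converting each string to int for comparison gives numerical order
# while keeping the items as strings.
sorted_strings_by_value = sorted(as_strings, key=int)

print(sorted_strings)           # ['1', '11', '15', '2', '22', '23', '3', '31']
print(sorted_numbers)           # [1, 2, 3, 11, 15, 22, 23, 31]
print(sorted_strings_by_value)  # ['1', '2', '3', '11', '15', '22', '23', '31']
```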