What is the exact meaning of lexicographical order? How it is different from alphabetical order?
What is lexicographical order?
252k views. Asked by NDesai. There are 7 answers.
I want to add an answer that is more related to the programming side of the term rather than the mathematical side of it.
Lexicographical order is not always equivalent to "dictionary order"; at least, that definition is incomplete in the realm of programming. There, it more generally refers to "an ordering based on multiple criteria".
For example, almost all popular programming languages provide standard tools for sorting collections of objects. But what if you want to sort a collection based on more than one thing? For instance, say you want to sort some items by price first and then by popularity. This is an example of lexicographical order.
For example in Java (8+), you could do something like this:
// Sorts items from the cheapest to the most expensive;
// items with equal prices go from the most popular to the least popular.
// (A trailing .reversed() would flip the whole ordering, price included.)
Collections.sort(items,
    Comparator.comparing(Item::price)
        .thenComparing(Item::popularity, Comparator.reverseOrder())
);
The Java documentation also uses this term for this type of ordering when describing the "thenComparing()" method:
Returns a lexicographic-order comparator with another comparator.
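The same "multiple criteria" idea can be sketched in Python, where tuples already compare lexicographically. This is only an illustration: the Item class, its fields, and the sample data are made up for the example and are not from the answer above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:  # hypothetical type, for illustration only
    name: str
    price: float
    popularity: int


items = [
    Item("mug", 5.0, 10),
    Item("pen", 2.0, 3),
    Item("cap", 5.0, 40),
    Item("bag", 2.0, 8),
]

# Tuples are compared element by element: price decides first,
# and popularity is consulted only when two prices are equal.
# Negating popularity puts the most popular item first within a price.
ordered = sorted(items, key=lambda item: (item.price, -item.popularity))
names = [item.name for item in ordered]
print(names)
```

Here the two cheapest items come first, and within each price the more popular item wins the tie.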
Lexicographical order is nothing but dictionary order, that is, the order in which words appear in a dictionary. For example, let's take three strings, "short", "shorthand" and "small". In the dictionary, "short" comes before "shorthand" and "shorthand" comes before "small". This is lexicographical order.
Alphabetical order is a specific kind of lexicographical ordering. The term lexicographical often refers to the mathematical rules of sorting. These include, for example, proving logically that sorting is possible. Read more about lexicographical order on Wikipedia.
Alphabetical ordering includes variants that differ in how they handle spaces, uppercase characters, numerals, and punctuation. Purists believe that allowing characters other than a-z makes the sort not "alphabetic", so it must fall into the larger class of "lexicographic". Again, Wikipedia has additional details.
In computer programming, a related question is dictionary order versus ASCII code order. In dictionary order, the uppercase "A" sorts adjacent to the lowercase "a". However, in many computer languages, the default string comparison uses ASCII codes. With ASCII, all uppercase letters come before any lowercase letters, which means that "Z" will sort before "a". This is sometimes called ASCIIbetical order.
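A small Python sketch of the difference, using a made-up word list. Note that str.casefold as a sort key only approximates dictionary order; real dictionary (locale-aware) collation has more rules than this.

```python
words = ["apple", "Zebra", "banana", "Apple"]  # sample data for illustration

# Default comparison uses code points: every uppercase letter sorts
# before every lowercase letter ("ASCIIbetical" order).
code_point_order = sorted(words)

# Comparing case-folded keys puts "A" and "a" next to each other.
# sorted() is stable, so "apple" stays ahead of "Apple" on the tie.
case_insensitive = sorted(words, key=str.casefold)

print(code_point_order)
print(case_insensitive)
```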
This simply means "dictionary order", i.e., the way in which words are ordered in a dictionary. If you were to determine which one of two words would come before the other in a dictionary, you would compare the words letter by letter, starting from the first position. For example, the word "children" will appear before (and can be considered smaller than) the word "chill" because the first four letters of the two words are the same but the letter at the fifth position in "children" (i.e. d) comes before (or is smaller than) the letter at the fifth position in "chill" (i.e. l). Observe that lengthwise, the word "children" is bigger than "chill", but length is not the criterion here. For the same reason, an array containing 12345 will appear before an array containing 1235. (Deshmukh, OCP Java SE 11 Programmer I 1Z0815 Study guide 2019)
Something that can help in understanding lexicographical ordering of strings is the following example.
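The position-by-position comparison described in that quote can be written out by hand. This is a minimal sketch; lexicographic_less is a name invented for this example, and Python's built-in < on strings and lists already behaves this way.

```python
def lexicographic_less(a, b):
    """Return True if sequence a comes strictly before sequence b."""
    for x, y in zip(a, b):
        if x != y:
            # The first position where the sequences differ decides.
            return x < y
    # No difference in the shared positions: one is a prefix of the
    # other, and the shorter one comes first.
    return len(a) < len(b)


print(lexicographic_less("children", "chill"))
print(lexicographic_less([1, 2, 3, 4, 5], [1, 2, 3, 5]))
print(lexicographic_less("short", "shorthand"))
```

Length only matters in the prefix case, which is why "short" comes before "shorthand" even though "children" comes before the shorter "chill".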
Given the following Python script:
words = ['apple', 'Banana', 'Cherry', 'Date', 'applepie']
max_word = max(words)
print(max_word)
The result will, perhaps surprisingly, be 'applepie'.
The reason is that in the Unicode sequence, the uppercase letters come before the lowercase letters.
To be more precise, the uppercase letters have Unicode code points from 65 to 90 (A-Z) and the lowercase letters have code points from 97 to 122 (a-z).
So 'apple' and 'applepie', which start with a lowercase letter, compare greater than every word starting with an uppercase letter. Between those two, 'apple' is a prefix of 'applepie', so the longer string is the greater one, and that is what the max built-in function returns.
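To check this reasoning step by step, here is a short sketch that reuses the same word list. The key=str.lower variant at the end is one possible way to get a case-insensitive maximum, not the only one.

```python
words = ['apple', 'Banana', 'Cherry', 'Date', 'applepie']

# Code points of the first and last letters of each case.
code_points = [ord(c) for c in "AZaz"]
print(code_points)

# Any word starting with an uppercase letter is smaller than 'apple'.
uppercase_first_are_smaller = all(w < 'apple' for w in ['Banana', 'Cherry', 'Date'])

# 'apple' is a prefix of 'applepie', so the longer string is greater.
prefix_is_smaller = 'apple' < 'applepie'

# Comparing lowercased keys ignores case when picking the maximum.
max_ignoring_case = max(words, key=str.lower)
print(uppercase_first_are_smaller, prefix_is_smaller, max_ignoring_case)
```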
Lexicographical order is alphabetical order. The other type is numerical ordering. Consider the following values,
Those values are in lexicographical order.
In numerical order, 10 comes after 2,
but 10 comes before 2 in "alphabetical" - aka: lexicographical - order.
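A quick way to see both orderings side by side in Python. The sample values below are chosen only for illustration.

```python
values = ["10", "2", "1"]  # numbers stored as text, sample data

# Strings are compared character by character, so "10" sorts before "2"
# because "1" comes before "2".
as_text = sorted(values)

# Converting each value to an int for comparison gives numerical order.
as_numbers = sorted(values, key=int)

print(as_text)
print(as_numbers)
```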