Common name for index for lexicographical sorting

78 views Asked by At

Is there some common name for index for lexicographical sorting?

I have text and index to it sorted by lexicographical order of all substrings which start from specific position and last till the end of the text.

Let's consider the code:

#include <algorithm>
#include <iostream>
#include <ranges>
#include <vector>

auto make_index_projection_for_subrange = []<std::ranges::random_access_range R>(R & range) {
    return [&range](std::ranges::range_difference_t<R> i) -> decltype(auto) {
        return std::ranges::subrange(std::ranges::begin(range)+i, std::ranges::end(range));
        };
};

int main()
{
    const std::string s1 = "abdbc";
    std::vector i1{0,1,2,3,4};
    
    //Sort by substring from the char to the end
    auto proj = make_index_projection_for_subrange(s1);

    std::ranges::sort(i1, std::ranges::lexicographical_compare,proj);
    std::ranges::copy(i1, std::ostream_iterator<int>(std::cout));
    std::cout << std::endl;

    for(auto i: i1){
        std::cout << std::string(s1.begin()+i, s1.end()) << std::endl;
    }
}

See the demo; it provides the following results:

03142
abdbc
bc
bdbc
c
dbc

All the substrings are now sorted alphabetically and search as fast as at least binary search. If one need to find similar strings, this is just passing through adjacent lines.

I call this "adjacent index" because all similar substrings are adjacent to each other, but I guess this this is a classic structure with a common name.

1

There are 1 answers

0
Damir Tenishev On BEST ANSWER

The structure is Suffix_array

While the Trie is closely related and could be helpful, as well.

(Just to have it formally answered and easier to find the answer, based on the comments above)