cabin

Source code for personal website
git clone git://git.dimitrijedobrota.com/cabin.git
Log | Files | Refs | README | LICENSE

commit 88df6fa82d8f803f78e918fa59f0102a2ced5225
parent 7626b75684d7b1d6e1d1585c19ab25cf9245f6a7
Author: Dimitrije Dobrota <mail@dimitrijedobrota.com>
Date:   Sun, 12 Jan 2025 14:26:54 +0100

Index sort article

Diffstat:
Ablog/IndexSort.md | 131+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 131 insertions(+), 0 deletions(-)

diff --git a/blog/IndexSort.md b/blog/IndexSort.md @@ -0,0 +1,131 @@ +@title: Index sort +@date: 2025-01-12 +@language: en +@categories: c++, algorithms, leetcode + + +# Index sort + +Today I am going to share my all-time favorite algorithm with a lot of +flexibility and use cases. Say for example you need to sort an array but also +need to keep track of the original index for each element, or there are +multiple arrays that need to be sorted on the same criteria. This is where +index sorts steps in. + + +## Algorithm + +```c++ +#include <vector> // vector +#include <algorithm> // sort +#include <numeric> // iota + +void do_something(const std::vector<int>& vec) { + const int n = vec.size(); + + static int idxs[100001]; + std::iota(idxs, idxs + n, 0); + std::sort(idxs, idxs + n, [&](int i, int j) { + return vec[i] < vec[j]; + }); + + ... +} + +void do_something(const std::vector<int>& vec1, std::vector<int>& vec2) { + const int n = vec1.size(); // vec1.size() == vec2.size() + + static int idxs[100001]; + std::iota(idxs, idxs + n, 0); + std::sort(idxs, idxs + n, [&](int i, int j) { + return vec1[i] != vec1[j] ? vec1[i] < vec1[j] : vec2[i] < vec2[j]; + }); // sort based on vec1, or on vec2 if equal + + ... +} + +``` + +The main idea of the algorithm is that instead of sorting the array itself, we +make a separate array that will represent all of the indices into the original +array, and then we sort them instead. Here are the steps: + +1) In the examples above I've created a static array idxs, since I know the +maximum value of N. If that isn't the case a dynamic vector can be used +instead. +2) With the std::iota I fill the idxs array with numbers from 0 to n - 1 +3) Use std::sort on idxs array with a custom comparator function, that indexes +into the original array(s). The result is an array of indices that is +rearranged in a way that represents the order that the original array +should be in order for it to be sorted. + +The following snippets can be used to index both original and sorted array in +the way you see fit. + +```C++ +for (int i = 0; i < n; i++) { + const int j = idxs[i]; + // i - position in the sorted array + // j - position in the original array +} +``` + +```C++ +#include <span> // C++20 + +for (const int i: std::span(idxs, n)) { + // iterate over indices in original array, in sorted order +} +``` + + +## Interesting Usage + +### Minimum number of swaps to make array sorted + +```C++ + +... + +int res = 0; +for (int i = 0; i < n; i++) { + while (idxs[i] != i) { + swap(idxs[i], idx[idx[i]]); + res++; + } +} + +return res; +``` + +This cute algorithm can be used to determine the number of swaps needed to sort +an array. It's complexity is O(n), but since we had to sort the index array +first, overall complexity is O(sort). + +It leverages the fact that in one swap we can put the element in it's place, +since we know where it belongs. Element at position i, belongs at position +idxs[i], so we swap it with the element at position idxs[idxs[i]], to get it in +place. We repeat this procedure until the current position is satisfied, then +we go to the next position. + + +### Range compression + +```C++ + +... + +int cnt = 0; +vector<int> compressed(n); +for (int i = 1; i < n; i++) { + if (idxs[i] != idxs[i - 1]) cnt++; + compressed[idxs[i]] = cnt; +} + +``` + +Sometimes we don't care about the values themselves, but about their relation +to one another (greater, less and equal). We can use this trick to compress the +range of values of any size, to the one sized at most N, for later use with some +special data structure like Segment Tree. +