' MSD RadixSort. No sort based on comparisons can be faster than O(N log N). ' RadixSort makes no comparisons and can therefore achieve O(N) behavior. To ' do this, it examines keys one byte at a time, counting the number of keys ' that have each possible byte value. The counts are then used to build an ' offset table specifying the sorted order of the keys. It's easiest to work ' backwards from the least significant digit of the keys (LSD Radix),' since order based on more significant digits is not disturbed by less ' significant digits. Unfortunately, the LSD approach requires padding short ' keys if key length is variable, and guarantees that all digits will be ' examined even if the first 3-4 digits contain all the information needed to ' achieve sorted order. Most significant digit (MSD) RadixSort takes a lot ' more bookkeeping -- the list must repeatedly be split into sublists for each ' value of the last digit processed -- but the pay-off is that only as many ' digits will be examined as are needed. As in QuickSort and MergeSort,' it's worthwhile to hand off the sublists to InsertionSort when they get ' short enough.'' MSD RadixSort is stable and runs in linear O(N) time. It is fairly memory ' intensive, needing space for some extra counting arrays and stack space for ' recursive calls, but still uses less memory than MergeSort. This version is ' set up for strings and would take some work to adapt to other data types. ' Integers and longs could be handled by converting them to strings (and ' limiting the arrays CNT and IND to the ten numerical digit values plus the ' minus sign); this is probably worthwhile only if you have hundreds of ' thousands of integers or longs to sort. Doubles are more of a challenge,' requiring conversion to an array of bytes by type casting. I have not ' pursued this, since VBA on the Mac does not include the CopyMemory function ' that enables type casting on Windows systems. '' NOTE: This version is set up to count byte values from 1 to 127. If your ' keys use (for instance) only lowercase or only uppercase alphabetic ' characters or only numerical digits, you can trim the CNT and IND arrays and ' your sorts will run correspondingly faster.'' Reference: P. M. McIlroy, K. Bostic and M. D. McIlroy,' "Engineering Radix Sort", Computing Systems 6(1):5-27 (1993).'' Speed: MSDRadixSortS sorts 500,000 random strings in 28.3 sec; sorts 100186 ' library call numbers in 21.3 sec; sorts 25479 dictionary words in 1.8 sec ' (random order), 1.5 sec (presorted) or 1.9 sec (reverse sorted). Timed in ' Excel 2001 on an 800 mhz PowerBook.'' Bottom line: complex and best suited to strings, but there's nothing faster ' for really long lists.' Usage: Dim S1(L To R) As StringsDim B1(1 To nChars) As ByteDim P1(L To R) As LongFor I = L To R S1(I) = GetRandomString()Next IStrsToBytes S1, B1, P1 'a routine that stores the strings in 0 terminated ' byte arrays with 'P1() holding pointers to the start of each byte series.MSDRadixSortS L, R, B1, P1' CODE:Sub MSDRadixSortS(N As Long) Dim CH() As Integer ReDim CH(1 To N) 'See below for type PILE (stack records) ReDim STACK(1 To 1000) StackPtr = 1 With STACK(StackPtr) .L = 1 .R = N .D = 0 End With StackPtr = StackPtr + 1 RadixS CHEnd SubSub RadixS(CH() As Integer) Dim L As Long Dim R As Long Dim D As Integer Dim I As Long Dim J As Long Dim TMP As Long Dim C As Integer Dim NextC As Integer Dim CNT(-1 To 127) As Long Dim IND(-1 To 127) As Long Dim NuCnt As Long Dim BigCnt As Long Dim OldSp As Long Dim BigSp As Long While StackPtr > 1 'Pop a PILE off the stack. StackPtr = StackPtr - 1 'Get left and right limits of sublist and depth of byte to examine With STACK(StackPtr) L = .L R = .R D = .D End With 'Sublists of <= 24 keys will be finished by InsertionSort If R - L > 24 Then 'Clear the count array. For I = -1 To 127 CNT(I) = 0 Next I 'Get the byte at depth D in each key and count the number of each byte ' value. For I = L To R C = CInt(B(P(I) + D)) CH(I) = C CNT(C) = CNT(C) + 1 Next I 'We will add the counts to create sorted addresses in array IND(). IND(-1) = L 'At the same time we'll push the sublists for each byte value onto the ' stack. OldSp = StackPtr 'We'll track the biggest sublist and make sure it comes off the stack ' last; 'this ensures the stack will not get more than logarithmically deep. BigCnt = 0 'Now we build the addresses out of the counts. For I = 0 To 127 J = I - 1 NuCnt = CNT(J) IND(I) = IND(J) + NuCnt 'If there is a sublist of keys starting with the current byte value, ' stack it. If NuCnt > 1 Then With STACK(StackPtr) .L = IND(J) .R = IND(I) - 1 .D = D + 1 End With StackPtr = StackPtr + 1 'Keep track of the largest count / sublist. If NuCnt > BigCnt Then BigCnt = NuCnt BigSp = StackPtr End If End If Next I 'Swap the biggest sublist down into the stack so it comes off last. TMP = BigSp BigSp = OldSp OldSp = TMP 'Now use the counts to move the pointers to their sorted positions ' based on the 'bytes examined so far; doing this in place this gets a bit ugly. For I = L To R 'We will use the byte at I to find what address P(I) should be mapped ' to. C = CH(I) 'We use -1 to flag pointers already moved. CH(I) = -1 'If C = -1 we skip the loop and increment I until we find a pointer ' not already moved. Do While C >= 0 'We go to IND(C) to get the destination address for P(I). J = IND(C) 'We swap the current pointer for the one at that address. TMP = P(I) P(I) = P(J) P(J) = TMP 'Now we determine where to map the pointer we just displaced. 'We get the byte value from its key. NextC = CH(J) 'We flag it as moved. CH(J) = -1 'Once we've used each address we increment it unless we've hit R. If J < R Then IND(C) = J + 1 Else Exit Do 'We set C to the byte of the key of the displaced pointer; ready ' to loop! C = NextC Loop 'If a series of displacements circles back to a pointer already moved, ' we 'increment I until we find a pointer not yet moved or until we end at ' R. Next I Else DeepInsertS B, P, L, 1 + R - L, D End If WendEnd SubType PILE L As Long R As Long D As IntegerEnd TypeDim STACK() As PILEDim StackPtr As IntegerSub DeepInsertS(B() As Byte, P() As Long, L As Long, N As Long, D As Integer) Dim LP As Long Dim RP As Long Dim TMP As Long Dim I As Long Dim J As Long For RP = L + 1 To L + N - 1 TMP = P(RP) For LP = RP To L + 1 Step -1 I = TMP + D J = P(LP - 1) + D Do While B(I) = B(J) If B(I) = 0 Or B(J) = 0 Then Exit Do I = I + 1 J = J + 1 Loop If CInt(B(I)) - CInt(B(J)) < 0 Then P(LP) = P(LP - 1) Else Exit For Next LP P(LP) = TMP Next RPEnd Sub


Data Observability Explained
Data is the lifeblood of any successful business, as it is the driving force behind critical decision-making, insight generation, and strategic development. However, due to its intricate nature, ensuring the