local function fShallowCopy(tData)
local tOutput = {}
for k,v in ipairs(tData) do
tOutput[k] = v
end
return tOutput
end
local function fLexTblSort(tA,tB) --sorter for tables
for i=1,#tA do
if tA[i]~=tB[i] then
return tA[i]<tB[i]
end
end
return false
end
function fBWT(tData)
--setup--
local iSize = #tData
local tSolution = {}
local tSolved = {}
--key table--
for n=1,iSize do
tData[iSize] = fRemove(tData,1)
tSolution[n] = fShallowCopy(tData)
end
table.sort(tSolution,fLexTblSort)
--encode output--
for i=1,iSize do
tSolved[i] = tSolution[i][iSize]
end
--finalize--
for i=1,iSize do
if fIsEqual(tSolution[i],tData) then
return i,tSolved
end
end
return false
end
Above is my current code for achieving BWT encoding in Lua. The issue is because of the size of the tables and lengths of loops it takes a long time to run. For a 1000 character input the average encoding time is about 1.15 seconds. Does anyone have suggestions for making a faster BWT encoding function?
the biggest slowdowns appear to be in fLexTblSort and fShallowCopy. I have included both above the BWT function as well.
If I see right, your algorithm has complexity
O(n^2 log n)
, if the sort is quicksort. The comparator functionfLexTblSort
takesO(n)
itself for each pair of values you compare.As I checked with my implementation from few years back, I see possible space to improve. You create all the possible rotations of the
tData
, which takes also a lot of time. I used only single data block and I stored only starting positions of particular rotations. You also use a lot of loops which can shrink into less.Mine implementation was in C, but the concept can be used also in Lua. The idea in some hybrid pseudocode between your Lua and C.
You will also need your own sort function, because the standard does not offer enough flexibility for this magic. Quicksort is a good idea (you might avoid some of the arguments, but I pasted just the C version I was using):
The last step is your own comparator function (similar to your original, but working on the rotations, again in C):
This is the basic idea I came up with few years ago and which worked for me. Let me know if there is something not clear or some mistake.