Yesterday I added the LZEXEDAT compression method to inicomp. It currently uses a plain LZEXE format data stream, without any metadata.
The compression is done by lzexedat.exe, a DOS tool that generates the headerless compressed stream. I added the lzexedat.sh script, which deals with leading dotdots and then calls dosemu2. (The redirected drive does not allow using a dotdot to go above its root directory, so the script changes the directory that the dosemu2 drive is rooted at and strips the leading dotdots from the path.)
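The dotdot handling described above can be sketched as follows. This is a minimal illustration, not the contents of lzexedat.sh: the function names strip_dotdots and drive_root_for are hypothetical, and how the resulting directory is handed to dosemu2 is left out because it is not described here.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from lzexedat.sh.

# Split a relative path into the number of leading "../" components
# and the remaining path. Prints "<count> <rest>".
strip_dotdots() {
    path=$1
    count=0
    while :; do
        case $path in
            ../*) path=${path#../}; count=$((count + 1)) ;;
            *) break ;;
        esac
    done
    printf '%s %s\n' "$count" "$path"
}

# Walk up from a starting directory once per stripped "..", giving the
# directory that the redirected drive would have to be rooted at.
drive_root_for() {
    dir=$1
    n=$2
    while [ "$n" -gt 0 ]; do
        dir=$(dirname "$dir")
        n=$((n - 1))
    done
    printf '%s\n' "$dir"
}
```

For example, `strip_dotdots ../../tmp/file.bin` prints `2 tmp/file.bin`, and `drive_root_for /a/b/c 2` prints `/a`. The sketch only handles dotdots at the very start of the path and ignores a bare `..` without a trailing slash.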
The depacker worked on the first attempt. There is a subsequent change to the file, but it only changes a comment.
I implemented the depacker based on my description of the format in the LZEXE fork's documentation. Unlike the original online depacker, this uses the inicomp overlap detection and also checks for all sorts of overflows.
Despite the additional checks, the LZEXEDAT depacker is still reasonably fast. The following compares almost all inicomp methods when compressing the current lCDebugX as a triple-mode executable. The test was done on our amd64 Debian server, without KVM.
Scriptlet used to run the build, compression, and speed tests:
INICOMP_SPEED_TEST=128 use_build_help_external=1 use_build_help_compressed=1 INICOMP_METHOD="brieflz lz4 snappy exodecr x heatshrink lzd lzo lzsa2 apl bzp zerocomp mvcomp lzexedat" use_build_compress_only=0 build_name=cdebugx time -p ./mak.sh -D_PM=1 -D_DEBUG -D_DEBUG_COND -D_DEFAULTSHOWSIZE
lDebug revision: 0d6619e128e6
The results:
-rw-r--r-- 1 98816 ../bin/lcdebugx.com
real 490.89
user 484.39
sys 5.73
ldebug/source$ LC_ALL=C sort ../tmp/cdebugx.siz
98816 bytes ( 68.19%), method lzd
102400 bytes ( 70.67%), method exodecr
104448 bytes ( 72.08%), method apl
105984 bytes ( 73.14%), method lzsa2
113664 bytes ( 78.44%), method lzo
114176 bytes ( 78.79%), method lzexedat
117248 bytes ( 80.91%), method lz4
117760 bytes ( 81.27%), method bzp
121344 bytes ( 83.74%), method heatshrink
121856 bytes ( 84.09%), method mvcomp
122368 bytes ( 84.45%), method brieflz
131072 bytes ( 90.45%), method snappy
140800 bytes ( 97.17%), method zerocomp
144896 bytes (100.00%), method none
ldebug/source$ LC_ALL=C sort ../tmp/cdebugx.spd
1.17s for 128 runs ( 9ms / run), method zerocomp
4.65s for 128 runs ( 36ms / run), method snappy
11.21s for 128 runs ( 87ms / run), method lz4
11.22s for 128 runs ( 87ms / run), method bzp
11.60s for 128 runs ( 90ms / run), method mvcomp
11.63s for 128 runs ( 90ms / run), method lzsa2
11.66s for 128 runs ( 91ms / run), method lzexedat
14.43s for 128 runs ( 112ms / run), method lzo
32.72s for 128 runs ( 255ms / run), method brieflz
35.71s for 128 runs ( 279ms / run), method exodecr
36.34s for 128 runs ( 283ms / run), method apl
58.47s for 128 runs ( 456ms / run), method heatshrink
132.34s for 128 runs ( 1033ms / run), method lzd
ldebug/source$ grep -F 'warning: ini' ../tmp/*/pcdebugx.lst | perl -ne 's/^.*warning: (.*) \[-w\+user\]$/$1/; /^([a-zA-Z0-9]+): ([0-9]+)/; printf("%6u $1\n", $2)' | LC_ALL=C sort
1456 inibzp
1488 inizero
1504 inimv
1552 inilzexe
1648 inihs
1744 iniapl
1840 iniexo
1840 inilzsa2
1936 inilz4
1952 inisz
2032 iniblz
2592 inilzo
3424 inilz
ldebug/source$
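The current LZEXEDAT has a smaller depacker than LZSA2 (1552 versus 1840 bytes), is almost as fast (91 ms versus 90 ms per run), and compresses about 5 percentage points worse (78.79% versus 73.14% of the uncompressed size). Compared to heatshrink, its depacker is smaller (1552 versus 1648 bytes), it is much faster (91 ms versus 456 ms per run), and its output is about 5 percentage points smaller (78.79% versus 83.74%).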
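The size comparison in the preceding paragraph can be recomputed from the cdebugx.siz listing. The sketch below is only a convenience for checking the arithmetic. The helper names pp_diff and rel_diff are made up for this example, and the figures are copied by hand from the listing rather than parsed from the file.

```shell
#!/bin/sh
# Hypothetical helpers for checking the size comparison by hand.

# Difference between two size ratios, in percentage points.
pp_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a - b }'
}

# Relative size difference of two files in percent, from their byte counts.
rel_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a / b - 1) * 100 }'
}

pp_diff 78.79 73.14      # lzexedat vs lzsa2, ratios: 5.65
pp_diff 83.74 78.79      # heatshrink vs lzexedat, ratios: 4.95
rel_diff 114176 105984   # lzexedat file relative to lzsa2 file: 7.7
rel_diff 114176 121344   # lzexedat file relative to heatshrink file: -5.9
```

The first two calls give the differences in percentage points of the uncompressed size. The last two express the same comparison relative to the other method's file size, which gives slightly different numbers.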
We might use LZEXEDAT for compressing the debugger's online help pages or the packed ELD library. It remains to be seen whether it will compress equally well for these tasks. However, the window is currently a fixed 8 KiB, whereas we use 4 KiB for some heatshrink uses.