====== June work ======

**2025-06-29**

===== lDOS boot =====

  * [[https://hg.pushbx.org/ecm/ldosboot/rev/e5db6da2f0a4|Update attribution year]] of testboot.asm
  * doc: [[https://hg.pushbx.org/ecm/ldosboot/rev/d02bfd954b6f|List FreeDOS protocol file size limits]] (128 KiB up to 134 KiB)
  * doc: [[https://hg.pushbx.org/ecm/ldosboot/rev/17bc239aa60b|Update attribution year]]
  * [[https://hg.pushbx.org/ecm/ldosboot/rev/8153a92faf53|Add mak script]] to build testpl.com and testboot.bin
  * doc: [[https://hg.pushbx.org/ecm/ldosboot/rev/9deb11439396|Mention]] the advantage of segment 200h load, mention lkernpl

===== lDebug =====

  * doc: [[https://hg.pushbx.org/ecm/ldebug/rev/d11cd2d802f3|Add references/links to each section]] in the quick start

===== insref =====

[[https://hg.pushbx.org/ecm/insref/rev/9deb2b712b69|Add the section iref-prefix]] listing all prefixes (8086, 286, 386). Additionally, [[https://hg.pushbx.org/ecm/insref/rev/e0fbf77072fe|add the REPZ and REPNZ synonyms]] for REPE and REPNE.

===== wwwecm.scr =====

[[https://hg.pushbx.org/ecm/wwwecm.scr/rev/c801dafbc54f|Add lzexe]] to the automatic current release builds and to the documentation builds, and add a toolchain for the FreePascal i8086 DOS target. (The FreePascal build of lzexe isn't used by the current builds yet.)

===== webecm =====

[[https://hg.pushbx.org/ecm/webecm/rev/c51eb34cafca|List LZEXE in a section on the page]], and add a news item about it.

===== LZEXE =====

Robert pointed out the recent source release of Fabrice Bellard's LZEXE executable packer to me (on 2025-05-24). Initially it seemed the release wasn't accompanied by a free software license, so I didn't want to look into it. However, a while later [[https://clownacy.wordpress.com/2025/05/24/the-original-lzexe-a-k-a-kosinski-compressor-source-code-has-been-released/|someone else blogged about it]] and noted that the MIT license applies to the sources. That's when I picked up the sources and started porting and extending them.
(It seems that Robert initially downloaded an archive without the license, and that the license was silently added later.)

Building the initial source release proved difficult. [[https://hg.pushbx.org/ecm/lzexe/file/38524a7a5664/exepack.asm|The A86 source for the depacker stub]] was missing a segmentation directive and a label with a public directive. It's a riddle how this was supposed to build. I tried out several A86 versions and none of them worked with the file as is. Next, [[https://hg.pushbx.org/ecm/lzexe/file/38524a7a5664/lzexe.pas#l8|the stub size was hardcoded]] in lzexe.pas, which is a bizarre choice. I fixed this later as well.

It also became clear that the provided sources do not exactly match any of the three binaries of LZEXE. This may be in part due to the choice of compiler and its version. (I use TurboPascal 5.0 to build.) But the v0.91 lzexe.exe file also contains a "chemin" keyword in its help display that doesn't match the provided sources. The documentation does exactly match v0.91 though. This revision of the documentation was all in French, much like most comments and messages in the sources.

==== The new stub format ====

I iterated on the stub format several times, producing the LZE0 to LZE6 formats and finally LZX0. In the final format:

  * Instead of placing 7 word variables at the start of the buffer, [[https://hg.pushbx.org/ecm/lzexe/rev/b70b9d8e3021|they're inlined into instruction immediates]].
  * The variables [[https://hg.pushbx.org/ecm/lzexe/file/db078afe8f09/unlzexe.c#l51|are found by unlzexe using search patterns]] to match the code snippets.
  * There are 8 different variants of the stub:
    * With/without relocation table. If [[https://hg.pushbx.org/ecm/lzexe/file/db078afe8f09/exepack2.nas#l209|the relocation table would be empty]], it can be omitted from the stub.
    * With/without the segment change handling. If the decompressed image is smaller than 40 KiB (A000h bytes) [[https://hg.pushbx.org/ecm/lzexe/file/db078afe8f09/exepack2.nas#l175|this handling can be omitted]] from the stub.
    * With getbit inlined [[https://hg.pushbx.org/ecm/lzexe/file/db078afe8f09/exepack2.nas#l136|or not]]. This is purely a speed vs size tradeoff.
  * The choice of the variants can be influenced somewhat by six switches to LZEXE:
    * [[https://hg.pushbx.org/ecm/lzexe/rev/b6882d255033|/1 (use v0.91 alike stub) vs /2 (use LZX0 stub)]]
    * /S (stop optimisations) vs /O ([[https://hg.pushbx.org/ecm/lzexe/rev/9fc8727340cb|optimise reloc table]] and [[https://hg.pushbx.org/ecm/lzexe/rev/7767211d1b34|seg change]])
    * [[https://hg.pushbx.org/ecm/lzexe/rev/d90b1a97f7fc|/I (inline getbit) vs /J (do not inline getbit)]]
    * The default is /2 /O /J

The /1 option won't produce output exactly identical to that of v0.91, because of some assembly encoding choices and several ''or reg, reg'' to ''test reg, reg'' [[https://hg.pushbx.org/ecm/lzexe/rev/18b6d64ecd1a|changes I made to the original stub]]. As [[http://files.shikadi.net/malv/files/unlzexe.c|unlzexe 0.9G2]] ([[https://cosmodoc.org/topics/exe-file-format/#fn:1|via]]) uses part of the stub to detect the format, it very likely will fail to depack files written by our LZEXE with the /1 switch.

==== Relocation table format ====

The compressed data format for the LZX0 image did not change, much like between LZEXE v0.90 and v0.91. [[https://hg.pushbx.org/ecm/lzexe/rev/9fa0f74387af|The relocation table format changed slightly]] from v0.91: Instead of a first byte equal to 0 being the encoding for a longer table entry, now a first byte equal to 255 is used for this purpose. This allows encoding zero-difference entries. These include duplicate relocation entries (more than one pointing at the same address) as well as a relocation entry at the very beginning of the executable image. The former is arguably invalid, although DOS supports it. But the latter could very well occur in a valid file. LZEXE v0.91 would corrupt its relocation table upon encountering a zero-difference relocation. In our fork, selecting the v0.91 format with the /1 switch [[https://hg.pushbx.org/ecm/lzexe/rev/f44d806209b1|will instead detect and reject files with such zero-difference relocation entries]].

==== Porting ====

The A86 to NASM port was trivial with fixmem and ident86. ([[https://hg.pushbx.org/ecm/lzexe/rev/71faa361b0f4|exepack.nas]], [[https://hg.pushbx.org/ecm/lzexe/rev/c82ea83e8931|lzss.nas fixmem]] and [[https://hg.pushbx.org/ecm/lzexe/rev/4e5c1008ee8f|identicalising]].) The Pascal files originally built with TurboPascal 5.0, and I ported them to also build with FreePascal 3.2.2 using its i8086 msdos target. Needed changes:

  * [[https://hg.pushbx.org/ecm/lzexe/rev/b7b79b8c6145|Repeating function parameter and return types]] in the definition of functions, instead of relying only on the declaration. (There seems to be a FreePascal option to accept the old style but it doesn't harm us to change these.)
  * [[https://hg.pushbx.org/ecm/lzexe/rev/46af8ca84bfd|Marking variables for use by the assembly language object files]] using '';export'', as advised [[https://forum.lazarus.freepascal.org/index.php/topic,71520.0.html|on the FreePascal forum]]. (This is not accepted by TurboPascal.)
  * Selecting the large memory model to make some far pointers work. Notably the copy from PSP:80h in cmdline.pas (as I [[https://forum.lazarus.freepascal.org/index.php/topic,71532.0.html|queried on the FreePascal forum]]) and the function pointers in lzutil.pas. This also allows getmem to allocate more than 64 KiB in total (across multiple getmem calls).
  * [[https://hg.pushbx.org/ecm/lzexe/rev/ecfac7d0f1dd|Changing the section directives]] in lzss.nas to accommodate FreePascal's linker. This involved the repeated warning message ''Warning: Encountered an OMF reference to a section, that has been removed by smartlinking: DATA||''. I also [[https://forum.lazarus.freepascal.org/index.php/topic,71536.0.html|asked about this on the FreePascal forum]] but ultimately figured it out by myself. This also turned out to be incompatible with TurboPascal's linker, so [[https://hg.pushbx.org/ecm/lzexe/rev/88c5a59e3c18|we now create two object files]], lzss.obj for TurboPascal and lzss.o for FreePascal.

The [[https://hg.pushbx.org/ecm/lzexe/rev/46af8ca84bfd#l1.7|added export keywords]] as well as [[https://hg.pushbx.org/ecm/lzexe/rev/63c7c17e47da|some compiler options that generate warnings in FreePascal]] are gated using ''%%{{$IFDEF FPC}}%%'' constructs that fortunately are supported by both compilers.

==== Other changes ====

  * UPACKEXE [[https://hg.pushbx.org/ecm/lzexe/rev/f52e122ce475|won't create an invalid relocation table entry]] if the offset in the packed relocation table is equal to FFFFh.
  * COMTOEXE [[https://hg.pushbx.org/ecm/lzexe/rev/3ea0c0775b20|will add a stub to push a zero word on top of the stack]]. It will also limit the input file size.
  * Everything is translated to English.
  * unlzexe is [[https://hg.pushbx.org/ecm/lzexe/rev/cee5e3ac146f|added from v0.5 shipped with LZEXE v0.91e]], with one change picked from v0.7 (extracted from [[http://files.shikadi.net/malv/files/unlzexe.c|v0.9G2]]). It can detect and unpack any LZX0 variant created by our LZEXE.
  * unlzexe has more optional outputs, selected using the DEBUG variable. It can also be [[https://hg.pushbx.org/ecm/lzexe/rev/54d0e297eae6|run in dot mode]] with the second parameter equal to a lone dot, in which case it doesn't write its output anywhere.
  * unlzexe [[https://hg.pushbx.org/ecm/lzexe/shortlog/d186b0462753|can be compiled for amd64 Linux target]] or [[https://hg.pushbx.org/ecm/lzexe/rev/0a9a37d4fedb|ia16 DOS target]].
  * The LZX0 stub is optimised for size in several more ways than just embedding the variables. Its maximum size dropped from 330 bytes to 305 bytes.
  * [[https://hg.pushbx.org/ecm/lzexe/rev/61d991e6a7b2|/L switch to LZEXE keeps the output file]] even if it isn't smaller than the input file, for testing purposes.
  * [[https://hg.pushbx.org/ecm/lzexe/rev/aff79054b982|/# switch to LZEXE allows hacking the two signature bytes]] after the "LZ" signature. Only printable ASCII is allowed.
  * [[https://hg.pushbx.org/ecm/lzexe/rev/dfce3e09f1f4|LZEXE]] and [[https://hg.pushbx.org/ecm/lzexe/rev/0f48409eec7c|unlzexe]] will display an unknown signature as is if it consists of printable ASCII, instead of displaying "??" or "not LZEXE file".
  * The stub objects include a preparation function each. This is called by the host Pascal code, and it sets some variables exported by the Pascal code. The prepare function [[https://hg.pushbx.org/ecm/lzexe/rev/52d1f8005498|also returns a far pointer to its associated depacker stub]], avoiding the need to dispatch for two statements.
  * In several places either a ZM or MZ signature for the EXE header is allowed.
  * [[https://hg.pushbx.org/ecm/lzexe/rev/a18c1831d7b3|The lzexedat tool]] demonstrates how to use lzss.nas to build a plain compression format. No depacker exists for this format yet.

{{tag>ldosboot ldebug insref wwwecm.scr webecm lzexe freepascal turbopascal}}

~~DISCUSSION~~