====== Enhanced DR-DOS single-file load ======

**2024-01-07**

As I had previously [[https://github.com/FDOS/fdisk/issues/72#issuecomment-1843453333|mentioned on the FreeDOS FDISK bug tracker on github]], I went and combined the EDR-DOS kernel files to create a single-file load experience.

I'd still be interested in learning how to combine the BIO and the DOS file into a single kernel file to then combine this with my iniload and inicomp stages, changing the kernel to use the native lDOS / RxDOS.3 load protocol. This protocol is described some in the lDOS boot docs: https://pushbx.org/ecm/doc/ldosboot.htm#protocol-sector-iniload

One of the advantages to booting off of lDOS iniload is that it can be used as a number of different format kernels (unfortunately not as the current EDR-DOS DRBIO load protocol), quoth https://pushbx.org/ecm/doc/ldebug.htm#buildingprocess

The bootable executables can be used as MS-DOS 6 protocol IO.SYS, MS-DOS 7/8 IO.SYS, PC-DOS 6/7 IBMBIO.COM, FreeDOS KERNEL.SYS, RxDOS.3 RXDOS.COM, or as a Multiboot specification or Multiboot2 specification kernel. In any kernel load protocol case, the root FS that is being loaded from should be a valid FAT12, FAT16, or FAT32 file system on an unpartitioned (super)floppy diskette (unit number up to 127) or MBR-partitioned hard disk (unit number above 127). In addition, the bootable executables also are valid 86-DOS application programs that can be loaded in EXE mode either as application or as device driver. (Internally, all the .com files are MZ executables with a header, but they are named with a .COM file name extension for compatibility.)

I also described some possible advantages to a single-file load in the BTTR Software forum ([[https://www.bttr-software.de/forum/forum_entry.php?id=20762&page=1&category=0&order=time|1]], [[https://www.bttr-software.de/forum/forum_entry.php?id=20763&page=1&category=0&order=time|2]]). Not all of these are utilised by my single-file load as yet, as the build still creates two separate files for the build. Also, each file is still created from linking a number of object files rather than building the entire module in one big assembly.

I wrote about why to prefer a single-file kernel load in the forum, as well: https://www.bttr-software.de/forum/forum_entry.php?id=20762&page=1&category=0&order=time

I don't think it is required to combine the file. It is good for other reasons though.

What are the other reasons?

Several:
* BIO and DOS file can get out of sync, potentially resulting in failures to boot or even data corruption.
* One more file to keep track of.
* Need a FAT FS read implementation in the BIO that is only ever used to read the DOS file, this is in [[https://hg.pushbx.org/ecm/edrdos/file/e1dbcad5136e/drbio/bdosldr.a86|bdosldr.a86]] for EDR-DOS, [[https://hg.pushbx.org/ecm/edrdos/file/e1dbcad5136e/drbio/biosinit.a86#l302|called by the BIO init routines here]].
* The BIO kernel has to locate the DOS file, meaning it will have to include a directory scanner. (Arguably, lDOS iniload contains just as much of a FAT FS reader (to read the remainder of the kernel file) as EDR-DOS's BIO file, but lDOS iniload certainly does not need a directory scanner.)
* Compression of kernel files also needs two bespoke solutions, one for each file, whereas lDOS iniload + inicomp has a single compression stage which depacks the entire remaining kernel. This can make for better compression ratio than compressing two files, too.

Back in the day I discussed this with Udo in the EDR-DOS forums. However, that thread is likely lost to time. I recall that Udo brought up you can update one of the files without the other, and vendors could get away with only providing a BIO file sharing a common DOS file that they wouldn't have to know much about. At the time I already noted that this last advantage is minor if the entire kernel is available as free software.

[[https://hg.pushbx.org/ecm/edrdos/file/e1dbcad5136e/drbio/biosinit.a86#l302|The second link]] that I included in my list also hints it is possible for the DOS to be resident somewhere already, and not have the BIO load it from a file. This means part of the work towards a single-file kernel may already be done.

Another advantage is you may be able to share more code between the entire kernel than if you split it into two files. And some build time calculations may be possible to optimise things that a two-file kernel cannot do. (In lRxDOS's single-assembly build, NASM can potentially calculate things that even a single-file kernel build cannot if you use a linker to link multiple object files into one executable.)

===== Intention =====

As described in the quoted text, a single-file load has some advantages:
* File can be smaller when compressed
* Updates can never install an incompatible set of kernel files
* No disk read, directory scan, and file system read needed to load additional file

===== Implementation =====

==== drkernpl ====

[[https://hg.pushbx.org/ecm/ldosboot/file/6e436c848f79/drkernpl.asm|The drkernpl stage]] of the lDOS boot ecosystem is a payload to lDOS's iniload. It is [[https://hg.pushbx.org/ecm/ldosboot/file/6e436c848f79/fdkernpl.asm|based on fdkernpl]], the payload that passes control to an embedded FreeDOS kernel. In the case of drkernpl, two payload files are included, corresponding to the DRBIO and DRDOS modules.

==== Changes in the EDR-DOS repo's sources ====

The DRDOS module is in fact unchanged from the prior double-file load protocol. DRBIO [[https://hg.pushbx.org/ecm/edrdos/rev/b28a432679f9|contains some patches]] which concern keeping track of the position (segment address) and length of the loaded DRDOS module, as well as relocating it to the position it would be loaded to by the ''read_dos'' function.

[[https://hg.pushbx.org/ecm/edrdos/rev/b28a432679f9#l2.12|Some additional code in DRBIO]] takes care of temporarily relocating the DOS module to segment 1070h first. This is only used if the compbios compression is used, as the depacker for this uses memory up to 70h:FFF0h. The compbios compression [[https://github.com/SvarDOS/edrdos/issues/7#issuecomment-1880041831|encodes consecutive zeroes using a simple run-length encoding]].
When compressing the entire kernel file, also enabling compbios is at best neutral but can actively increase the size of the final file.

Other than the additional DRDOS payload, the protocol of drkernpl to DRBIO [[https://pushbx.org/ecm/doc/ldosboot.htm#protocol-sector-edrdos|mimics that of the original EDR-DOS load protocol]]: DRBIO is fully loaded at linear address 700h, entered at cs:ip = 70h:0, and passed a boot sector with (E)BPB at ds:bp == ss:bp. The boot load unit is passed in dl == bl, but can also be read from ''byte [ss:bp + 24h]'' (FAT12/FAT16) or ''byte [ss:bp + 40h]'' (FAT32). The DOS allocation is passed in ax => DOS file and si = length of DOS file in bytes. The stack [[https://pushbx.org/ecm/doc/ldosboot.htm#protocol-iniload-payload-stack|is allocated up to 8 KiB]] in a segment behind the DOS file.

=== Code dropped from DRBIO ===

The compbios compression [[https://hg.pushbx.org/ecm/edrdos/rev/0cf29efb060b|has been disabled]] to simplify the handling of the DRDOS module, and to avoid compressing the already-compressed stretch of data. The entire ''read_dos'' [[https://hg.pushbx.org/ecm/edrdos/rev/feda14976c32|function is dropped]], including its file system login, FAT chain walking, directory scanning, and file reading. Several multiplication and division functions were also dropped. A DOS version check function remains in the bdosldr source file, and the DOS-related error message also remains as it is used by this function. The DRDOS module's [[https://hg.pushbx.org/ecm/edrdos/rev/465be5fcbb65|filename in FCB format also got dropped]].

==== iniload ====

The iniload stage is mostly unchanged from the recent revisions of lDOS boot.
It [[https://pushbx.org/ecm/doc/ldosboot.htm#iniload-modes|can be loaded as a (first) kernel file]] for the following load protocols:
* lDOS / RxDOS.3
* FreeDOS
* Enhanced DR-DOS
* MS-DOS v6 / IBM-DOS
* MS-DOS v7
* Multiboot v1
* Multiboot v2

The only change to iniload is for [[https://hg.pushbx.org/ecm/ldosboot/rev/82071e8903ee|the addition of the EDR-DOS load protocol]]. This requires the full kernel file [[https://hg.pushbx.org/ecm/ldosboot/rev/e72fab017c25|to be at least 32 KiB in size]], to allow distinguishing EDR-DOS load from MS-DOS v6 / IBM-DOS load, all of which use an entrypoint of 70h:0. (Secretly, [[https://hg.pushbx.org/ecm/ldosboot/rev/2ec1c30d39f2|the change also allows to load the kernel]] using the NTLDR, BOOTMGR, or DOS-C IPL.SYS load protocols. But these are not documented.)

==== inicomp ====

[[https://hg.pushbx.org/ecm/inicomp/|The inicomp stage]] allows to compress a kernel at build time to depack it at runtime. In the lDOS boot ecosystem, inicomp is used to build a compressed kernel. For building the compressed EDR-DOS kernel, inicomp is assembled for single-mode operation, allowing to enter it only as a kernel. (When enabled, inicomp can also be entered as a DOS device driver or DOS application.) Literally no changes to inicomp were added related to EDR-DOS single-file load.

==== drload ====

[[https://hg.pushbx.org/ecm/ldosboot/file/6e436c848f79/drload.asm|The drload stage]] is a replacement for iniload. The advantage of drload is purely to optimise the resulting file for size. The disadvantage is that the file can only be used as a kernel for the FreeDOS or EDR-DOS load protocols. This limitation allows to drop all disk read, FAT chain walking, and file read code from drload, and most file system related code.

The initial, already working revision of drload [[https://hg.pushbx.org/ecm/ldosboot/rev/b12892e666c3|was created almost exclusively by dropping lines]] from the copied iniload source file.
All that remains is mostly dedicated to setting up the stack and a few variables for the next stage. As described in the ldosboot manual, [[https://pushbx.org/ecm/doc/ldosboot.htm#protocol-drload-payload|drload uses a subset of the iniload-to-payload protocol]] to run its own payload. This subset suffices to run a kernel-mode-only inicomp as well as drkernpl.

==== Second payload executable ====

The single-file (uncompressed) kernel that uses iniload is named ''edrdos.com''. To avoid [[https://retrocomputing.stackexchange.com/questions/14663/why-do-pc-dos-kernel-files-have-the-com-extension-even-though-they-are-not-exec|the odd choice of a file named .COM without being a valid DOS application executable]], this kernel [[https://hg.pushbx.org/ecm/edrdos/rev/5d4753bd7ca3|includes the second payload executable option of the iniload stage]]. Basically, the MZ executable header is mostly valid and refers to an image separate from the kernel payload image. In the first revision of the single-file load, this executable accepts an empty command line tail or one starting with the word "version" as a request to display the version string of the kernel. (This revision of the kernel [[https://hg.pushbx.org/ecm/edrdos/rev/5d02ab518bb2|identifies itself as the "2023 December" revision]].)

==== Compression ====

Aside from the uncompressed files (''edrdos.com'' for iniload, ''edrdos.sys'' for drload), the mak.sh script of the single-file kernel [[https://hg.pushbx.org/ecm/edrdos/rev/465247a2713a|also builds two compressed files]]. These are named ''edrpack.com'' (for iniload) and ''edrpack.sys'' (for drload). The files are compressed [[https://www.nongnu.org/lzip/manual/lzip_manual.html#Stream-format|in the LZMA-lzip format]]. (The term "LZMA-lzip" is from the lzip version 1.21 manual, more recently the format is only called "LZMA-302eos".) From experience with lDebug, this likely results in the smallest output file of all the supported formats.
However, depacking on a low-end machine (e.g. NEC V20) may take several minutes to complete.

The drload-using compressed file is smaller than the sum of the original double-file kernel files compressed [[https://pushbx.org/ecm/download/edrdos/pack101.zip|using the pack101 method]], supporting the idea of an advantage of the single-file load:

<code>
pack$ du drbio.sys.* drdos.sys.* --bytes --total
35575 drbio.sys.unpacked
36973 drdos.sys.unpacked
72548 total
pack$ du drbio.sys drdos.sys --bytes --total
18414 drbio.sys
27799 drdos.sys
46213 total
pack$
</code>

<code>
$ du * --bytes
80896 edrdos.com
76080 edrdos.sys
50176 edrpack.com
45680 edrpack.sys
$
</code>

Of course, two files are also worse in that they use
at least two directory entries and
more cluster slack space in the file system.
For example, with the smallest cluster size on a file system
that uses 512 Bytes/sector:
<code>
pack$ echo "$(( (18414 + 511) / 512 * 512 ))"
18432
pack$ echo "$(( (27799 + 511) / 512 * 512 ))"
28160
pack$ echo "$(( 18432 + 28160 ))"
46592
pack$
</code>

<code>
$ echo "$(( (45680 + 511) / 512 * 512 ))"
46080
$
</code>

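
The same rounding can be repeated for other cluster sizes. The following is only a sketch: the helper name ''alloc_size'' and the list of cluster sizes are my own choices for illustration, while the three file sizes are the ones from the du listings above.

```shell
# Round a file size up to a whole number of clusters.
# Usage: alloc_size SIZE_BYTES CLUSTER_BYTES
# (alloc_size is a hypothetical helper, not part of any build script.)
alloc_size() {
    echo "$(( ($1 + $2 - 1) / $2 * $2 ))"
}

# File sizes taken from the du listings above.
drbio=18414
drdos=27799
edrpack_sys=45680

# The cluster sizes below are arbitrary examples.
for cluster in 512 1024 2048 4096; do
    two=$(( $(alloc_size "$drbio" "$cluster") + $(alloc_size "$drdos" "$cluster") ))
    one=$(alloc_size "$edrpack_sys" "$cluster")
    echo "cluster=$cluster two-file=$two single-file=$one"
done
```

How much slack space the single file saves depends on the cluster size, so the difference seen at 512 bytes per cluster should not be assumed to hold for every file system.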
===== Future =====
Dropping [[https://hg.pushbx.org/ecm/edrdos/file/465be5fcbb65/drbio/init.asm#l692|the early initialisation of sp to C000h]] (48 KiB within the stack segment) should be acceptable when compbios compression is not used.
[[https://hg.pushbx.org/ecm/edrdos/file/465be5fcbb65/drbio/biosinit.a86#l330|The support for a UPX-packed DRDOS module in DRBIO]] can be dropped for the single-file kernel, as there is no advantage to using UPX in this way.
[[https://hg.pushbx.org/ecm/inicomp/file/9725324008b6/inicomp.asm#l131|The dots progress display (_COUNTER)]] of the inicomp stage may be enabled to indicate the progress of the depacker, especially for slower machines.
Building using different inicomp compression methods may be added. The support exists within inicomp; using it is merely a question of scripting. ([[https://hg.pushbx.org/ecm/ldebug/file/0779e962fa82/source/mak.sh#l462|The canonical implementation of all of the methods]] is in lDebug's mak.sh script.)
===== Set up =====
Currently to use the provided mak.sh script to build the four single-file kernels, [[https://hg.pushbx.org/ecm/edrdos/rev/465247a2713a|some setup is needed]]:
* ldosboot repo accessible as ../ldosboot/
* lmacros likewise
* scanptab likewise
* inicomp likewise
* lzip and nasm must be found in the path
The script will default to building the entire kernel using dosemu2 and the mak.bat script. Passing the first parameter as the unquoted string "onlypl" allows to skip the dosemu2 call.
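
The requirements listed above can be checked before running the build. This is a minimal sketch and not part of mak.sh: the function names are made up for illustration, and it assumes that each sibling directory carries the same name as its repo (''../lmacros/'', ''../scanptab/'', ''../inicomp/''), which is how I read "likewise".

```shell
# Print the name of each sibling repo that is missing below PARENT_DIR.
# Usage: missing_repos PARENT_DIR
# (hypothetical helper; directory names are assumed to match the repo names)
missing_repos() {
    for repo in ldosboot lmacros scanptab inicomp; do
        [ -d "$1/$repo" ] || echo "$repo"
    done
}

# Print the name of each required tool that is not found in the path.
missing_tools() {
    for tool in lzip nasm; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Run from within the edrdos checkout, the siblings live one level up.
missing_repos ..
missing_tools
```

If both functions print nothing, the script can be run, either with no parameters for the full dosemu2 build or with ''onlypl'' as the first parameter to skip the dosemu2 call.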
===== Reliability =====
Currently the kernel does some things that may not be reliable:
* [[https://hg.pushbx.org/ecm/edrdos/file/465be5fcbb65/drbio/init.asm#l692|Relocates sp to C000h early]] without relocating ss (could have been a problem before too)
* [[https://hg.pushbx.org/ecm/edrdos/file/465be5fcbb65/drbio/biosinit.a86#l253|Copies DRDOS module]] using forward ''rep movsb'' (Direction Flag UP) unconditionally, assumes that source and destination do not overlap
* Assumes that the DRDOS module fits in the memory layout without checking that this is true (could have been a problem before too, in fact due to sector size the problem was probably worse before)
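
To make the overlap assumption concrete, here is a small arithmetic sketch. A forward copy (ascending addresses) only reads correct data if the destination starts at or below the source, or at or beyond the end of the source. The helper names and the example destination are hypothetical; only the segment 1070h, the address 70h:FFF0h, and the unpacked DRDOS size are taken from this page.

```shell
# Convert a real-mode segment:offset pair to a linear address.
# Usage: linear SEGMENT OFFSET
linear() {
    echo "$(( $1 * 16 + $2 ))"
}

# Succeed if copying LEN bytes forward from SRC to DST cannot
# overwrite source bytes before they have been read.
# Usage: forward_copy_safe SRC DST LEN   (linear addresses, byte count)
forward_copy_safe() {
    [ "$(( $2 <= $1 || $2 >= $1 + $3 ))" -eq 1 ]
}

# 70h:FFF0h plus one paragraph (16 bytes) is exactly segment 1070h.
echo "$(( $(linear 0x70 0xFFF0) + 16 )) $(linear 0x1070 0)"

# Example only: source at segment 1070h, a made-up lower destination
# segment, and the size of drdos.sys.unpacked as the length.
src=$(linear 0x1070 0)
dst=$(linear 0x0A00 0)   # hypothetical destination segment
len=36973
if forward_copy_safe "$src" "$dst" "$len"; then
    echo "forward copy is safe"
else
    echo "a backward copy would be needed"
fi
```

A check along these lines would also be the natural place to verify that the module fits in the memory layout, which the third bullet notes is currently not done.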
On the other hand, in some ways the single-file kernel is definitely more reliable:
* The BDOS loader, which is now dropped, had [[https://hg.pushbx.org/ecm/edrdos/rev/feda14976c32#l1.233|a bug in which]] it may have calculated an off-by-one number of sectors to read, though this was unlikely to cause any problems
* The BDOS loader did not support 256 sectors/cluster, which iniload and [[https://github.com/FDOS/kernel/commit/2efe4ab9f4a645eb3b44a178c99eb03c4cfa6ead|the new revisions of the FreeDOS boot sector loaders]] do support
===== Use =====
Any one of the four kernel files suffices to load the kernel. FreeDOS SYS or EDR-DOS SYS may both be used to install the boot sector loader, using their /K switch to set the appropriate filename for the kernel. (EDR-DOS SYS is so old it does not contain loaders that support 256 sectors per cluster.) Other loaders that support either EDR-DOS or FreeDOS are usable as well.
When using the ''edrdos.com'' or ''edrpack.com'' file (iniload), lDOS's instsect may also be used, in which case the /F= switch is used to set the filename. A build of instsect (with the lDOS protocol loaders) is included in lDebug builds. Renaming either of the .COM (iniload) files to ''io.sys'' or ''ibmbio.com'' also allows to load them using boot loaders for MS-DOS v6/v7 or PC-DOS v6/v7.
{{tag>edrdos ldosboot inicomp lzip}}
~~DISCUSSION~~