
DR-DOS v9 trial

I recently tried out the latest build of DR-DOS v9. In case you're unaware, DR-DOS v9 is a new project that provides a proprietary, closed-source DOS that is "a faithful clean-room reimplementation", coupled with the original DR-DOS trademark. It is "free to download and use for personal, educational, and hobbyist purposes", but does not appear to permit redistribution.

The most recent revision was "9.0 rev 648 - 2026-05-02" when I tried it. I ended up submitting a bug report to their tracker, which appears to be private.

Since then, version "9.0 rev 692 - 2026-05-16" has been released, but there has been no response to my bug report yet, and the changelog doesn't appear to list a change addressing this bug.

The bug report

Subject

DOS corrupts .exe file during load after the image's first 38912 (9800h) bytes

Version

DR DOS (R) 9.0 rev 648

Hardware/Virtualization

QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-11+deb11u5)

Provide as much detail as possible about the bug

It appears that large MZ .exe files aren't fully loaded by DOS, or corrupted in some way. A test program is provided in https://pushbx.org/ecm/test/20260513/ including source text and executable. If need be, the executable can be rebuilt like so:

nasm -fobj testlong.asm && ~/proj/ldos/warplink.sh /mx /as:1 testlong.obj,testlong.exe,testlong.map\;

This requires NASM, dosemu2 and a DOS, WarpLink, and the warplink.sh script from https://hg.pushbx.org/ecm/msdos4/file/846cdd0dc7d3/warplink.sh

The expected return is no output, return code 0. On DR-DOS v9 rev 648, instead the address 0948:382 (corresponding to linear 9802h = 38914, subtract 2 as repe scasw points after the different word) is displayed, and the return code is 255. This indicates that the constant pattern in the program executable was corrupted at least starting at this address.
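
The address arithmetic in the paragraph above can be checked with a short Python sketch. It is only an illustration: the helper names are made up here, and it assumes the displayed segment is relative to the program's CS (the test program prints ES minus CS), so the linear value is an offset from the start of the code section.

```python
def parse_seg_off(text):
    """Split a hexadecimal 'SSSS:OOOO' string into (segment, offset)."""
    seg, off = text.split(":")
    return int(seg, 16), int(off, 16)


def linear(seg, off):
    """86 Mode address arithmetic: segment * 16 + offset."""
    return seg * 16 + off


seg, off = parse_seg_off("0948:382")
after = linear(seg, off)   # DI points past the word that differed
first_bad = after - 2      # so the differing word starts 2 bytes earlier

print(f"{after:X}h = {after}")          # 9802h = 38914
print(f"{first_bad:X}h = {first_bad}")  # 9800h = 38912
```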

Debugging using application mode lDebug wasn't possible as the debugger itself is too large and runs afoul of this bug. Debugging using bootloadable lDebug, with a loader extension that reserves memory at segment 8000h used by drbios.sys, later setting a breakpoint on the int 21h handler for function 4Bh, then tracing, shows that DOS does appear to load and enter the debugger executable, however its first far branch to the init section ends up jumping into uninitialised code (all-zeroes seen).

Log of test:

...
DR DOS (R) 9.0 rev 648
Copyright 2022-2026 Whitehorn Ltd. Co., All Rights Reserved


C:\>a:testlong

0948:0382

C:\>a:quit

Entire test case source text (mangled by this text input box):

; Public Domain

	cpu 8086

	section CODE
..start:
	mov ax, FIRST
	mov es, ax
	mov dx, REPEAT
loop:
	mov ax, 2626h
	mov cx, BLOCKSIZE / 2
	xor di, di
	repe scasw
	jne bad
	dec dx
	jz good
	mov ax, es
	add ax, BLOCKSIZE / 16
	mov es, ax
	jmp loop

bad:
	int3
	mov ax, es
	mov bx, cs
	sub ax, bx
	call disp_ax_hex
	mov dl, ':'
	mov ah, 02h
	int 21h
	mov ax, di
	call disp_ax_hex
	mov dl, 13
	mov ah, 02h
	int 21h
	mov dl, 10
	mov ah, 02h
	int 21h
	mov ax, 4CFFh
	int 21h

good:
	mov ax, 4C00h
	int 21h

disp_ax_hex:
	xchg al, ah
	call .al
	xchg al, ah
.al:
	push cx
	mov cl, 4
	rol al, cl
	call .nybble
	rol al, cl
	pop cx
.nybble:
	push ax
	push dx
	and al, 15
	add al, '0'
	cmp al, '9'
	jbe .got
	add al, 7
.got:
	xchg dx, ax
	mov ah, 02h
	int 21h
	pop dx
	pop ax
	retn

REPEAT equ 64
BLOCKSIZE equ 1024

	section FIRST align=16
%assign id 0
%rep REPEAT
	section SECTION%+id align=16
	times BLOCKSIZE db 26h

%assign id id+1
%endrep

	section STACK stack
	resb 512
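
For readers who prefer not to trace the assembly, here is a rough Python model of the scan that the test program performs. It is a sketch under assumptions, not a port. FIRST_PARA = 8 (the FIRST section starting 8 paragraphs after CS) is inferred from the reported segment 0948h, which leaves 8 after removing a multiple of 40h. The corruption model (all zeroes from offset 9800h onwards) is likewise an assumption; the report only establishes that the pattern is wrong starting at that offset.

```python
REPEAT = 64
BLOCKSIZE = 1024
PATTERN = 0x26


def scan(image, first_para):
    """Return None if all blocks hold the pattern, else the
    (segment, offset) pair that the test program would display."""
    for block in range(REPEAT):
        seg = first_para + block * (BLOCKSIZE // 16)  # ES - CS
        base = seg * 16
        for di in range(0, BLOCKSIZE, 2):
            word = image[base + di:base + di + 2]
            if word != bytes([PATTERN, PATTERN]):
                return seg, di + 2  # scasw leaves DI past the bad word
    return None


FIRST_PARA = 8  # assumption, inferred from the reported output

# Build an intact image, then zero everything from 9800h onwards
# (assumed corruption model).
image = bytearray([PATTERN]) * (FIRST_PARA * 16 + REPEAT * BLOCKSIZE)
image[0x9800:] = bytes(len(image) - 0x9800)

result = scan(image, FIRST_PARA)
print("%04X:%04X" % result)  # 0948:0382
```

Under these assumptions the model reproduces the address shown in the log, which is consistent with the damage beginning exactly 9800h bytes into the loaded image.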

lDebug and loader updates

DRDOS9 load protocol

I examined the boot sector loader and parts of the drbios.sys initial loader to figure out their interaction. The update adds a new field to the LOADSETTINGS structure, lsCopyBPB, as well as the COPYBPB parameter to the boot protocol command. This is used by DRDOS9 to copy the boot sector (512 Bytes) from 0:7C00h to 0:500h. The initial loader expects to find the boot sector (particularly BPB) at this address, but it is too low an address to store the main instance of the BPB and stack there.

The boot sector loader stores an LBA flag at 7DFDh (the last byte before the 0AA55h boot sector signature). It needlessly writes to this byte to initialise it, even though the byte is part of the boot sector that the prior stage already loaded.
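
To make the flag location concrete, the following Python sketch maps the memory address to an offset within the 512-byte sector and reads the flag next to the signature. The function names are invented for this example, and the meaning of individual flag values is not known to me, so the sketch only extracts the raw byte.

```python
LOAD_ADDRESS = 0x7C00      # where the boot sector is loaded
SECTOR_SIZE = 512
LBA_FLAG_ADDRESS = 0x7DFD  # address named in the text above


def sector_offset(address):
    """Translate a memory address into an offset within the boot sector."""
    return address - LOAD_ADDRESS


def inspect(sector):
    """Return (raw LBA flag byte, signature word) of a boot sector image."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected a 512-byte sector")
    flag = sector[sector_offset(LBA_FLAG_ADDRESS)]
    signature = int.from_bytes(sector[510:512], "little")
    return flag, signature


# Synthetic sector for demonstration only, not a real DR-DOS boot sector.
demo = bytearray(SECTOR_SIZE)
demo[0x1FD] = 1
demo[510:512] = b"\x55\xAA"
print(hex(sector_offset(LBA_FLAG_ADDRESS)))  # 0x1fd
print(inspect(demo))                         # (1, 43605)
```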

The loader uses 186+ instructions.

The loader also seems to load the FAT to 07E00h, which it doesn't use, and the root directory to 10000h (corrupting the FAT, and assumed to be <= 64 KiB) to find the drbios.sys file. However, it doesn't seem like the next stage makes use of either of these in memory.

If the drbios.sys file isn't found, the loader enters a cli \ hlt loop without any error indicator to the user.

The next stage is entered with DS = ES = CS = 70h, SS:SP = 60h:FFFEh, DL = boot unit, AX = 0060h. However, it appears to be sufficient to pass CS:IP = 70h:0 and DL = boot unit, so that's what lDebug currently does.

The next stage (first kernel file in lDebug terminology) may be as small as 1024 Bytes (1469 Bytes seen), but the boot sector loader will load 8 consecutive sectors. The next stage relocates itself using a 1000h word move (8 KiB), so it seems to expect that it may be as large as 8 KiB. This revision of lDebug allows loading between 1024 and 8192 Bytes.

Debugging the early kernel load

The drbios.sys stage unconditionally relocates to 8000h:0 (512 KiB) and sets up SS:SP = 8000h:FFFEh (just below 576 KiB). This would corrupt parts of the bootloaded resident debugger, leading to funny results such as a loaded quit.eld not reacting to the QUIT command, and EXT commands failing to run Extensions for lDebug with some quote-related error.

Something that's good (for us) is that the kernel at no point initialises the int 1 and int 3 vectors in the IVT. So the resident debugger is easily accessible by running a breakpoint instruction from a small file (eg int3.com) after the kernel has finished booting.

To avoid the debugger corruption, a workaround was to load another instance of the debugger (resident at a lower address), then run the DOS under that. But that's not very satisfactory. So I added the nreserve Extension for lDebug (actually an Extension for Loader), which allows reserving memory at pre-boot time. Running it as ext nreserve.eld install 8000 ensures that the memory between segment 8000h (512 KiB) and the current end of the Low Memory Area is reserved.
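
The addresses involved can be worked out with a small Python sketch. It assumes that the relocated stage stays within the 64 KiB addressed through segment 8000h, which is my reading of the relocation target and stack pointer and not something confirmed beyond that. The LMA end of 639 KiB is a hypothetical figure chosen for the example; the real value depends on the machine.

```python
KIB = 1024


def linear(seg, off=0):
    """86 Mode address arithmetic: segment * 16 + offset."""
    return seg * 16 + off


reloc_start = linear(0x8000)              # relocation target
stack_top = linear(0x8000, 0xFFFE)        # initial SS:SP
segment_end = linear(0x8000, 0xFFFF) + 1  # one past the 64 KiB segment

LMA_END = 639 * KIB  # hypothetical end of the Low Memory Area


def covers(res_start, res_end, start, end):
    """True if [start, end) lies entirely inside [res_start, res_end)."""
    return res_start <= start and end <= res_end


print(reloc_start // KIB)  # 512
print(segment_end // KIB)  # 576
print(covers(linear(0x8000), LMA_END, reloc_start, segment_end))  # True
```

With a reservation starting at segment 8000h, the area that the kernel overwrites lies inside the reserved region, so the debugger is placed below it.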

News update

The last changeset to the debugger is the update to the news-r11 section of the manual.

How to find the bug

I tried loading another instance of the debugger, in application mode, while running the kernel under the bootloaded mode of the debugger with the memory starting at 512 KiB reserved. To do so, I set up a diskette image like so:

nasm ~/proj/ldosboot/boot.asm -D_LOAD_NAME="'LDEBUG'" -I ~/proj/lmacros/ -o boot.bin && nasm ~/proj/bootimg/bootimg.asm -D_BOOTPATCHFILE=boot.bin -I ~/proj/lmacros/ -o diskette.img -D_PAYLOADFILE=ldebug.com,fdbplace.eld,nreserve.eld,ldebug.sld,loader.com,quit.com,int3.com,int19.com,extlib.eld,testlong.exe && stty sane && qemu-system-i386 -fda diskette.img -boot order=a -display curses -chardev serial,id=serial2,path=/tmp/vptty-dos -serial null -serial chardev:serial2 -hda ~/test/20260512/boot16.img; stty sane

I added the following to a startup Script for lDebug:

:bootstartup
@r ysf or= 4000
ext extlib.eld alias.eld install #2048
alias add res rc.execute boot protocol ldos loader.com . move bottom\; ext nreserve.eld install 8000\; move top\; boot protocol ldos cmdline=0 ldebug.com\; g; q
alias add dr9 boot protocol drdos9 hda1 .
alias add seri install serial timer
ext extlib.eld dm.eld install
ext extlib.eld rcexec.eld install
rc.abort
ext extlib.eld quit.eld install
install quickrun
goto :eof

@:applicationstartup
@:devicestartup
@?version
@install getinput

The res alias sets up nreserve.eld to reserve memory, then reloads the debugger. dr9 just loads the drbios.sys initial loader from the first partition of the first hard disk. (As DR-DOS v9 doesn't allow redistribution, this server directory doesn't include the hard disk image with the actual DOS files.)

Running the following allows tracing DOS's EXEC call:

res
dr9
g
a:int3
bp at ptr ri21p when ah == 4B
g
a:ldebug /b
seri

In my tests I actually tried loading lCDebugX as lcdebugx.com or the shim executable cdebugx.com (without iniload). However, I don't expect the difference to matter. At first I would run bw 0 to reset the WHEN condition for the breakpoint, then run g, trying to figure out how the application mode debugger failed to load by breaking on the debugger's first int 21h call. This led to crashes or hangs rather than a break at the interrupt handler entrypoint.

Eventually I redid the setup, then trace-proceeded with repeated tp commands. After a crash I looked through the serial terminal's history until I came across the first wrong instruction, which was the all-zeroes add [bx+si],al. The instructions immediately before that turned out to be the application mode entrypoint of the debugger, which set up a far pointer on the stack to branch to the init section. I found that the registers and memory at the entrypoint were fine; it appeared that just the init section hadn't been loaded properly.

That led me to write the testlong test case program, proving that the DOS doesn't load a 64 KiB+ executable image properly.
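
To judge whether a given executable is large enough to run into the problem, the size of its load image can be computed from the MZ header. The Python sketch below uses the standard header fields (bytes in last page, page count, header size in paragraphs). The function name is made up, and the demonstration header is synthetic; its numbers are not read from testlong.exe.

```python
import struct


def mz_image_size(header):
    """Size in bytes of the load image described by an MZ header."""
    signature, last_page_bytes, pages, _relocs, header_paras = \
        struct.unpack_from("<2sHHHH", header, 0)
    if signature not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    size = pages * 512
    if last_page_bytes:
        size -= 512 - last_page_bytes  # last page is only partly used
    return size - header_paras * 16    # the header is not part of the image


# Synthetic header: 130 pages, 128 bytes in the last page, 512-byte header.
demo = struct.pack("<2sHHHH", b"MZ", 128, 130, 0, 32)
size = mz_image_size(demo)
print(size)           # 65664
print(size > 0x9800)  # True
```

An image larger than 9800h bytes is a candidate for the corruption seen here, though whether that offset is an actual threshold in the loader is not established by my tests.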

blog/pushbx/2026/0516_dr-dos_v9_trial.txt · Last modified: 2026-05-16 20:40:11 +0200 May Sat by ecm