August / September work

2023-09-03

Some work on EDR-DOS and lDebug happened. A manual for TESTHOOK has been added. Additionally, some experience using lDebug was gained.

EDR-DOS

During testing of building Enhanced DR-DOS from within itself (running in dosemu2, either on MFS-redirected drives or on a FAT32 hard disk image) it turned out that EDR-DOS's command.com does not support the /Y switch to the COPY command. I had used that switch for FreeCOM in order to copy the version.inc file from the repo root into the three subdirectories of the build tree. Lacking support for this switch, I went and simply added it to EDR-DOS's command.com. The COPY command already defaulted to overwriting files, so usually the /Y switch is a no-op.

It was pointed out in the forum that FreeCOM doesn't require the /Y switch when invoked from a batch file. That means my addition wasn't needed after all; I could simply have used the COPY commands without any switches.

TESTHOOK manual

During testing of the lDebug optimisations for the interrupt handling, I unearthed the more advanced TESTHOOK program that had been given its own repo. I found I had to read up on its operation to figure out how to use it. To simplify this in the future, I added a manual that describes all parameters and switches of the application, as well as some details on its operation. I also added it to the current release builds created by the wwwecm scripts.

lDebug

List parameters accepting multiple sizes

Since lDebug's release 6, the list parameter has accepted the AS WORDS and AS DWORDS keywords before the list contents to change the list item size. Now the size can be changed after every item, which allows mixing different sizes in one list. With AS BYTES the size can also be changed back to the default byte size.
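To illustrate the effect of switching sizes within one list, here is a small Python model. It is only a sketch and not the debugger's parser: the token splitting, the hexadecimal default for numbers, and the little-endian storage of multi-byte items are assumptions for the sake of the example, and encode_list is a made-up name.

```python
def encode_list(tokens):
    """Expand list tokens into bytes, honouring AS BYTES/WORDS/DWORDS.

    Illustrative model only; the real list syntax may differ in detail.
    """
    sizes = {"BYTES": 1, "WORDS": 2, "DWORDS": 4}
    size = 1  # the default item size is one byte
    out = bytearray()
    i = 0
    while i < len(tokens):
        token = tokens[i].upper()
        if token == "AS":
            # A size keyword applies to all following items
            # until the next size keyword.
            size = sizes[tokens[i + 1].upper()]
            i += 2
            continue
        value = int(token, 16)  # assumption: numbers are hexadecimal
        out += value.to_bytes(size, "little")  # assumption: little-endian
        i += 1
    return bytes(out)


example = encode_list("AS WORDS 1234 AS BYTES 56 AS DWORDS 1".split())
```

In this model the example list yields the word 1234h, then the single byte 56h, then the dword 1, in that order.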

IVT access disables Interrupt Flag

When neither in Protected Mode nor having DOS available, the debugger does direct IVT access to read or modify interrupt vectors. These accesses will now clear the Interrupt Flag during the access to help harden against invalid use of a half-modified vector. (The vector read can use a les dx instruction to read 32 bits at once, but the vector modification must happen with two instructions to stay 8086-compatible.)
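What a "half-modified vector" looks like can be shown with a small Python model of the IVT. This is a sketch, not the debugger's code: the order of the two stores (offset first, then segment) and the helper names are assumptions. The between callback stands in for an interrupt that arrives while the Interrupt Flag is still set.

```python
import struct


def read_vector(ivt, intnum):
    """Return (segment, offset) of an interrupt vector."""
    offset, segment = struct.unpack_from("<HH", ivt, intnum * 4)
    return segment, offset


def set_vector(ivt, intnum, segment, offset, between=None):
    """Modify a vector with two 16-bit stores, as on an 8086.

    `between` models an interrupt occurring between the two stores.
    Passing None models the access with the Interrupt Flag cleared.
    """
    struct.pack_into("<H", ivt, intnum * 4, offset)
    if between is not None:
        between()
    struct.pack_into("<H", ivt, intnum * 4 + 2, segment)


ivt = bytearray(1024)  # 256 vectors of 4 bytes each
set_vector(ivt, 8, 0x0070, 0x0100)  # old handler at 0070:0100

seen = []
set_vector(ivt, 8, 0x1234, 0x5678,
           between=lambda: seen.append(read_vector(ivt, 8)))
# seen now holds the mixed vector 0070:5678, which is neither
# the old handler nor the new one.
```

With the Interrupt Flag cleared during the access, nothing can observe the vector between the two stores.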

Optimisations and one bugfix

Function to set up 86 Mode direct IVT access is shared
Preserving IF across IVT access is always done using pushf and popf, as these are only used in 86 Mode. So no need for the PM helpers for pushing and popping the Interrupt Flag.
After detecting that the debugger is to call for Update IISP Header, re-use the value still left in the ax register rather than reloading the same value from the variable.
When using heatshrink to compress the online help pages or the debugger executable, try most permissible parameters. The -w 15 choice is avoided due to a bug in the packer. (The help pages often end up using a lower -w parameter than the prior minimum of 10.)
The $INICOMP_HEATSHRINK variable gets a default value early enough if none was specified.
The UnhookInterruptSim function's "harder" search will push and pop cx once rather than for every multiplex number.
The update_inttab_optional end marker was changed twice: Once to detect the marker of 0FFFFh using test ax, ax followed by js, and another time to change it to a zero word as marker, checked only using jcxz. This latter change was possible by making all valid entries contain a 1 in their high bytes, which are never used otherwise.
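The heatshrink parameter search from the list above can be sketched as a simple sweep. This Python version is not the build script itself. The compress callable is a hypothetical stand-in for invoking the packer, and the parameter ranges (window bits 4 to 14, lookahead bits from 3 up to one less than the window) are my reading of heatshrink's limits with -w 15 left out, so treat them as assumptions.

```python
def pick_smallest(data, compress, windows=range(4, 15)):
    """Try every (window, lookahead) pair and keep the smallest result.

    `compress(data, window_bits, lookahead_bits)` is a hypothetical
    callable that returns the packed bytes for the given parameters.
    """
    best = None
    for window in windows:  # range(4, 15) stops at 14, so -w 15 is skipped
        for lookahead in range(3, window):  # lookahead stays below window
            packed = compress(data, window, lookahead)
            if best is None or len(packed) < len(best[2]):
                best = (window, lookahead, packed)
    return best


def fake_compress(data, window, lookahead):
    """Stand-in packer whose output is smallest at window 7, lookahead 4."""
    return b"x" * (abs(window - 7) + abs(lookahead - 4) + 1)


window, lookahead, packed = pick_smallest(b"help page text", fake_compress)
```

Small inputs like the help pages do not benefit from a large window, which fits the observation that they often end up below the prior minimum of 10.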

The DIL bugfix

When the variable auxbuff size was added (set using the /A switch), all comparisons to the size were replaced by instructions that access one of two variables. In particular, the DIL command's uses of the auxbuff want to check that at least 24 bytes remain in the buffer.

However, as it turned out during testing of the interrupt handling changes, some of these variable accesses used the default ds segment where this is not valid. The error usually manifests as an "Error!" message when at least one hidden chain exists. (The check for the IVT-reachable chain happened to run with ds still equal to ss.) In the fix, we changed all three of these cmp instructions to use an ss: segment override just to be certain.

This is a very unfortunate bug to discover one week after a release. To mitigate this a little, there's a patch that will fix the uncompressed executables of the release. Naturally, the patch is provided as a Script for lDebug. You simply load the debugger with the uncompressed executable to patch, as in ldebug ldebugu.com, then run a y patch6.sld command.

The script will take care to reload the executable file in FLAT mode, which is needed for patching an MZ .EXE executable. Then it will search the last 64 KiB of the binary for a short sequence found in the init, which is used to learn the exact offset of the variable which is wrongly used.

Next, the accesses to this variable are searched for in the code section. To make this work for both lDebug and lDebugX, the search starts at 16 KiB into the binary, searching a 64 KiB chunk. The script verifies that this instruction is matched exactly three times.
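The two searches described above can be modelled in Python on a plain byte buffer. This is only a sketch of the logic, not a replacement for the script: find_patch_sites and find_all are made-up names, offsets are relative to the start of the loaded image, and the script's additional check (disassembling to confirm that a 4-byte instruction ends right before the init sequence) is left out.

```python
# mov cl, 4 / mov bx, ax / shr bx, cl: the short sequence from the init
INIT_PATTERN = bytes.fromhex("B1 04 89 C3 D3 EB")


def find_all(haystack, needle):
    """Return the offsets of all occurrences of needle in haystack."""
    hits, pos = [], haystack.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = haystack.find(needle, pos + 1)
    return hits


def find_patch_sites(image):
    """Return (variable offset, list of the three instruction offsets)."""
    paragraphs = len(image) >> 4
    if paragraphs <= 0x1000:
        start, length = 0, len(image)  # small file: search all of it
    else:
        start, length = (paragraphs - 0x1000) << 4, 0x10000  # last 64 KiB
    hits = find_all(image[start:start + length], INIT_PATTERN)
    if len(hits) != 1:
        raise ValueError("init sequence not found exactly once")
    at = start + hits[0]
    # The two bytes right before the sequence hold the variable's offset.
    var = int.from_bytes(image[at - 2:at], "little")
    # cmp di, word [var] with the default ds segment: 3B 3E lo hi
    needle = bytes([0x3B, 0x3E, var & 0xFF, var >> 8])
    code_start = 0x4000  # 16 KiB into the binary
    sites = [code_start + hit for hit in
             find_all(image[code_start:code_start + 0x10000], needle)]
    if len(sites) != 3:
        raise ValueError("expected exactly three accesses")
    return var, sites
```

Requiring exactly one and exactly three matches respectively is what keeps the patch from touching an executable that does not look like the affected release.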

Finally comes the time to patch these accesses. The proper fix adds an ss: segment override prefix, which grows each 4-byte instruction to 5 bytes. As it would be nontrivial to patch a 4-byte instruction to a 5-byte instruction in place, we cheat a little: The patch restores the prior behaviour of comparing against a fixed auxbuff size. Thus the compare instructions need only 4 bytes each. The size to compare to is fixed at (8 * #1024 + #16 - #24), which corresponds to the minimum auxbuff size less the 24 bytes that must remain. That means the patched DIL command won't make use of an enlarged auxbuff, but it will work again and can never overflow the buffer.
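The byte counts can be checked with a few lines of Python that build the 8086 encodings by hand. The helper names are made up for this illustration, and it is an assumption on my part that the debugger's assembler picks exactly the immediate encoding shown here.

```python
FIXED_LIMIT = 8 * 1024 + 16 - 24  # the size the patch compares against


def cmp_di_mem(var_offset, ss_override=False):
    """Encode cmp di, word [var_offset], optionally with an ss: prefix."""
    code = bytes([0x3B, 0x3E]) + var_offset.to_bytes(2, "little")
    # The ss: segment override is the single prefix byte 36h.
    return b"\x36" + code if ss_override else code


def cmp_di_imm(value):
    """Encode cmp di, imm16 (opcode 81h /7 with a 16-bit immediate)."""
    return bytes([0x81, 0xFF]) + value.to_bytes(2, "little")


buggy = cmp_di_mem(0x1234)                    # 4 bytes, wrong segment
fixed = cmp_di_mem(0x1234, ss_override=True)  # 5 bytes, does not fit
patched = cmp_di_imm(FIXED_LIMIT)             # 4 bytes, fits in place
```

The variable offset 1234h is only a placeholder. The fixed limit works out to 8184, or 1FF8h, which is too large for the sign-extended 8-bit immediate form, so the compare takes a full 16-bit immediate and comes to the same 4 bytes as the original instruction.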

I think it's a good compromise to allow the uncompressed files to be patched like this.

Here is the complete content of the patch6.sld file:

install flat
l
r v1 := bxcx >> 4
r v2 := 0
r v3 := bxcx
if (v1 <= 1000) then goto :small
r v2 := v1 - 1000
r v3 := 10000
:small
s cs+10+v2:0 l v3 B1 04 89 C3 D3 EB
if (src != 1) then goto :error
u srs:sro - 4 l 1
if (auo != sro) then goto :error
r v0 := word [srs:sro - 2]
s cs+10+400:0 l 10000 3B 3E byte v0 byte (v0 >> 8)
if (src != 3) then goto :error
a srs:sro0
 cmp di, (8 * #1024 + #16 - #24)
 .
a srs:sro1
 cmp di, (8 * #1024 + #16 - #24)
 .
a srs:sro2
 cmp di, (8 * #1024 + #16 - #24)
 .
w
goto :eof

:error
; Error detected

Notes from debugging lDebug

Do not put a breakpoint where hook2F will run into it

The first note is this: Running lDebugX, do not put a breakpoint on the int 2Dh entrypoint initially. Why? The hook interrupt 2Fh function is called deep within run. At that point, breakpoints are already written. (This may be subject to change. The int 2Fh and int 21h/int 10h output services are also called there, which is not very desirable.)

So when using lcdebugx /d- or ldebugx and putting a breakpoint on int 2Dh, the AMIS callouts in hook2F will break back into the debugger in an invalid way. The result is a half crash; while the debugger command line comes up again, the debugger will not work correctly any longer.

The workaround is to run a single run command (T, TP, P, or G – but not TTEST) before enabling the breakpoint on the interrupt 2Dh entrypoint. This will give the debugger the chance to hook the DPMI entrypoint before the breakpoint is enabled.

Another choice would be to uninstall int2F before setting the breakpoint. In particular, if no DPMI host is installed the debugger may retry the DPMI entrypoint hook on every run command. However, it seems that this won't actually work because the DCO4 flag is checked too late into the function, where the int 2Dh and int 2Fh calls already happened.

Install two debuggers as TSR within the context of the inner debugger

I used this trick during testing of the interrupt handling changes, which also turned up the DIL command bug.

It goes like this:

Run lcdebugx /d- lcdebugx.com
(In outer debugger:)
Enter an INSTALL AMIS command
Enter a G command
(In inner debugger:)
Enter an INSTALL TIMER command
Enter a TSR command
Enter a G command
(In outer debugger:)
Enter a TSR command
Enter a G command
(On the DOS command line:)
Run a keephook 8 command
Run a testhook 8 /n x command
Run an int3 command

The interrupt 3 will break into the inner debugger. This is because the TSR then G command to the inner debugger will leave the inner debugger's mandatory interrupt hooks installed, but the process termination will branch to the Parent Return Address set up by the outer debugger.

It is important that the outer debugger not be in debuggable mode during this operation. This way, it will preserve the state of being in the inner debugger's run context. In particular, the non-debuggable outer debugger won't try to uninstall then re-install its own mandatory interrupt hooks.

This way, the BU command can be used in the inner debugger as usual to break into the outer debugger in its own debuggee context.

A similar trick was used to debug the FreeDOS kernel bug in order to load the second debugger near the top of the Low Memory Area. This also had the second debugger run a TSR command then a G command, returning to the PRA of the first debugger in the debuggee context of the second debugger. In that case, the first debugger was simply quit to leave only the second debugger resident.

edrdos, testhook, ldebug, heatshrink
