2024-11-10
This week some work happened on Enhanced DR-DOS, lDebug, MSDebug, and MS-DOS. I was also involved in commenting on several stackoverflow questions.
A user attempted to assemble some code at 1000h:0 using the command a 1000:0. They noticed that the third instruction would cause an error and be assembled to the wrong opcode. They identified that this only happened on one system (MS-DOS v6.22 running in VirtualBox) and did not happen on another system (NTVDM of MSWindows 2000).
The first several comments discussed whether the question is appropriate for stackoverflow. I was the first to suggest a reason for the observed behaviour: The address 1000h:0 (linear 10000h, 64 KiB) may be in use by the kernel or the debugger. Thus, overwriting it may be the cause of the misbehaviour.
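For reference, the linear address quoted above follows from ordinary real-mode address arithmetic. The following is a minimal sketch; the function name is made up for illustration and the 1 MiB wraparound of the 8086 is ignored.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: the segment counts 16-byte paragraphs.

    Illustrative helper only; ignores the 1 MiB wraparound of the 8086.
    """
    return (segment << 4) + offset


addr = linear_address(0x1000, 0x0000)
print(f"{addr:05X}h = {addr // 1024} KiB")  # prints "10000h = 64 KiB"
```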
Another user (whose comments are now deleted) suggested that the written address would not be a problem, but was unable to answer why the error occurred, simply stating that the debugger application must be broken.
Eventually yet another user was able to recreate the wrong case and, along with references to the 2024 free software release of the Debug sources, confirmed that the erroneous write would corrupt a part of the debugger's assembler table. My diagnosis was spot on.
Another question was from a user writing a Debug script to be used in DOSBox-X. Their resulting program misbehaved. This turned out to be due to their use of the instruction mov cx, al which the debugger silently assembled to a different encoding (mov cx, ax or 89h C1h).
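The 89h C1h encoding can be reproduced with a small sketch of the ModR/M byte for the register-to-register form of opcode 89h (mov r/m16, r16). The helper name is hypothetical and only the 16-bit register case is modelled. Note that al and ax share register number 0, which may be why the invalid operand ends up encoded as ax.

```python
# 16-bit general registers in encoding order (register numbers 0 to 7).
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def encode_mov_r16_r16(dest: str, src: str) -> bytes:
    """Encode `mov dest, src` with opcode 89h /r (illustrative helper)."""
    # mod=3 selects a register operand; reg holds the source, r/m the destination.
    modrm = (0b11 << 6) | (REG16.index(src) << 3) | REG16.index(dest)
    return bytes([0x89, modrm])


print(encode_mov_r16_r16("cx", "ax").hex(" "))  # prints "89 c1"
```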
I first suggested this incorrect instruction may be among those accepted by MS-DOS Debug. Trying in lDebug, the instruction was correctly flagged as an error. The user, however, stated that the debugger they used was the one shipped with DOSBox-X. I didn't think they'd ship a (formerly nonfree) MS-DOS Debug.
After searching a while, I came across the repo's src/builtin/debug_exe.cpp that contained a hexdump of a file called debug.exe. Using a small perl scriptlet (perl -pe 's/[^0-9A-Fa-f]*(0[Xx])?([0-9A-Fa-f]+)[,\r\n]*/sprintf("%c", hex($2))/ge' debug.txt > debug.exe) I extracted the file. Turns out it is (Free)DOS Debug v1.25. Indeed, the binary exactly matches DEBUG.COM from that version's release. The choice of an .exe filename extension in the DOSBox-X sources is misleading, as the executable of this version of DOS Debug is a flat-format (.com) executable.
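The same kind of extraction can be sketched in Python. This is not a translation of the perl scriptlet: it assumes every byte in the dump is written as a 0x-prefixed literal, and the sample text below is invented rather than taken from the DOSBox-X sources.

```python
import re


def extract_bytes(dump_text: str) -> bytes:
    """Collect every 0xNN literal in the text, in order, as raw bytes."""
    tokens = re.findall(r"0[xX]([0-9A-Fa-f]{1,2})\b", dump_text)
    return bytes(int(tok, 16) for tok in tokens)


# Invented sample in the style of a C array initialiser.
sample = "  0xE9, 0x00, 0x01,\n  0x90, 0xC3\n"
print(extract_bytes(sample).hex(" "))  # prints "e9 00 01 90 c3"
```

Writing the result to a file would then be a matter of opening the target in binary mode and writing the returned bytes.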
I did wonder whether the application is properly credited in DOSBox-X. Its source tree contains a file named CREDITS.md but the only relevant entry is "FreeDOS utilities as binary blobs (FreeDOS; no license)". Turns out for the original DOS Debug branch this is probably fine. Maybe not so much for other FreeDOS applications included there.
Eventually I confirmed that the most recent release of DOS Debug/X still accepts the incorrect mov cx, al instruction. (ETA: I reported the bug to the DOS Debug/X repo.)
I reviewed some lDebug releases but did not gain much information there, as the daily builds did not reach back far enough (only up to 2021-10-19) to accept the incorrect instruction.
Therefore I set up a build environment to build older lDebug revisions. This entailed several steps:
Clone lDebug, lmacros, ldosboot, scanptab, inicomp, crc16-t, and symsnip repos.
Try to update lDebug repo using a command like hg up -r 'branch(default) and date("<2018-02-19")'.
The lDebug repo would update to a changeset from much later that was backdated to 2010 as it added an old doc file. Strip that changeset using hg strip -r .
Retry updating as before.
Update all the other repos to era-appropriate revisions, using this scriptlet:
cd -
for d in *
do
  if [ $d != ldebug ]
  then
    echo -ne "$d: "
    hg up -R $d -r "date('<"$(hg log -R ldebug -r . --template '{date|shortdate}')"')"
  fi
done
cd -
(On review, the quoting seems wrong but with the shortdate format it doesn't matter and works anyway.)
Build the debugger, using either INICOMP_METHOD=none ./make -w-pp-macro-params-legacy -D_BOOTLDR
or INICOMP_METHOD=none ./mak.sh -w-pp-macro-params-legacy -D_BOOTLDR
The NASM warning disabled there pops up when assembling older source trees with a recent NASM. It is emitted often enough to be distracting.
The explicit enabling of the _BOOTLDR define is needed for certain revisions (2018-02-19) that will fail to build if it isn't present. This bug was eventually fixed (2018-02-26), or can be worked around by commenting out the LOADSETTINGS istruc uses in msg.asm.
The output files produced will be either debug.com or ldebugu.com.
This setup allowed me to identify the revision using a manual "bisect" style process (a roughly binary search to find the exact changeset). It turned out to be a happy coincidence; the changeset (2019-10-11) was only concerned with rejecting invalid combinations of the pushw and pushd push immediate instructions.
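The manual bisect amounts to a binary search over an ordered list of revisions. The sketch below is a generic illustration, not the process actually run against the lDebug repo; the revision list and the predicate are placeholders, and it assumes the behaviour flips exactly once.

```python
from typing import Callable, Sequence


def first_rejecting(revisions: Sequence[int], rejects: Callable[[int], bool]) -> int:
    """Return the earliest revision for which rejects() is true.

    Assumes revisions are ordered oldest to newest, the oldest one accepts
    the instruction, the newest one rejects it, and the flip happens once.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known accepting, hi: known rejecting
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if rejects(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Placeholder data: 100 numbered revisions, rejection starting at number 37.
print(first_rejecting(range(100), lambda rev: rev >= 37))  # prints 37
```

In the manual process, each call to the predicate corresponds to building one revision and trying the instruction in it, so about seven builds suffice for a hundred candidate changesets.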
The relevant change is that in the line assembler's ao15 the opsize variable is no longer set unconditionally. Instead, the control flow is transferred to ao09. This part of the code checks that either the size in al matches opsize, or opsize used to hold a zero (later named SIZ_NONE) and is then updated with the size in al.
It turns out that in case of the invalid mov cx, al instruction the ao15 code is called twice, once with the size being passed as 2 (SIZ_WORD) and once with it as 1 (SIZ_BYTE). With the unconditional overwriting, the instruction is incorrectly accepted.
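The difference between the two behaviours can be modelled in a few lines. This is a simplified sketch and not the debugger's actual code: the function and its flag are made up for illustration, and only the accept or reject outcome is modelled, not the resulting encoding.

```python
SIZ_NONE, SIZ_BYTE, SIZ_WORD = 0, 1, 2


def accepts(operand_sizes, check_sizes: bool) -> bool:
    """Model of the operand size bookkeeping (illustrative only).

    With check_sizes=False every operand overwrites opsize unconditionally.
    With check_sizes=True a size is stored only if opsize is still SIZ_NONE,
    otherwise it has to match the stored size.
    """
    opsize = SIZ_NONE
    for size in operand_sizes:
        if check_sizes and opsize not in (SIZ_NONE, size):
            return False  # sizes disagree: reject the instruction
        opsize = size
    return True


mov_cx_al = [SIZ_WORD, SIZ_BYTE]
print(accepts(mov_cx_al, check_sizes=False))  # prints True
print(accepts(mov_cx_al, check_sizes=True))   # prints False
```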
In this question, a user asked about the 8086 segmentation. At first I was puzzled by the "4 memory segments" framing until I parsed that these correspond to the four segregs that the 8086 has available. In effect, they stated that only up to 4 times 64 KiB can be accessed "at a time".
In my comments I stated that in practice what you do is use segment arithmetic and change the segment registers around to access more than that limit. I also addressed some statements about segment overlap. A different user eventually prepared a proper answer that's also good.
In the comments of that answer, the original poster suggested that a difference of 100h in the segment registers should lead to a unique prefix of offsets 0000h to 00FFh in the lower segment. I corrected this, stating that the answer's offset 0000h to 0FFFh is correct, as the segment address is specified in paragraphs of 16 bytes each.
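The paragraph arithmetic behind that correction can be checked with a short sketch. The segment values 2000h and 2100h are arbitrary examples, and the helper name is invented.

```python
PARAGRAPH = 16  # bytes per segment unit
SEGMENT_SIZE = 0x10000  # 64 KiB addressable through one segment


def unique_prefix(lower_seg: int, higher_seg: int) -> range:
    """Offsets in the lower segment that lie below the start of the higher one."""
    length = min((higher_seg - lower_seg) * PARAGRAPH, SEGMENT_SIZE)
    return range(0, length)


r = unique_prefix(0x2000, 0x2100)
print(f"{r.start:04X}h to {r[-1]:04X}h")  # prints "0000h to 0FFFh"
```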
This project's updates primarily concern the micro-optimisations suggested to the SvarDOS repo's tracker.
Two older commits of the SvarDOS repo that affect the shell were apparently forgotten in 2024 January. This came up during applying the optimisations. I picked these changes separately.
Finally, I updated the lDOS version to 2024 November.
public directives (needed for MS-DOS)
Last week had a change not uploaded then, describing the dbitmap Extension for lDebug some more in the manual.
The only recent change to the debugger was to enable _DEBUG4 builds by referring to the getc.raw label rather than getc.rawnext. However, _DEBUG4 builds with dual code sections may still be broken.
The change was needed to try and debug a problem with an older lDDebug, which occurred during the bisecting and examining of the change that prohibited mov cx, al. It turned out that that revision of lDebug (2019-10-11) unconditionally called the serial port cleanup function in its qq command. This, of course, led to a hang in the outer lDebug (a recent revision) when it was connected to serial I/O.
I looked into reconfiguring the old lDDebug to avoid this, but it turns out back then the IRQ and port number were hardcoded. Another interesting aspect was that this lDDebug would also reset the IRQ's interrupt vector unconditionally, clearing it to zero if it was never hooked. This would lead to problems if serial I/O was reconnected later in the same VM.
This bug, of course, was fixed later (2020-05-15).
The first changeset simply adds some changes to the manual that weren't listed yet, and updates a reference to a feature now partially done (sector access to drives > 32 MiB, but not FAT32 drives).
The subsequent changes concern the P command. It didn't proceed past indirect call instructions (FF /2 and FF /3) if their mod=0 or (near only) mod=3. Further, it didn't parse segment prefixes before the opcodes, which may be valid for repeated string instructions or memory-indirect call instructions.
It turns out that the mod=3 case and the segment prefixes were handled by MS-DOS v6.22 Debug. The mod=0 cases, however, were not. Thus these changes are documented partially as recreating the later MS-DOS revisions and partially as new.
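To illustrate which byte patterns are involved, here is a minimal sketch that recognises an indirect call behind optional segment override prefixes. It is not MSDebug's code; the function name is invented, and it does not reject the invalid combination of FF /3 with mod=3.

```python
SEGMENT_PREFIXES = {0x26, 0x2E, 0x36, 0x3E}  # es, cs, ss, ds overrides


def is_indirect_call(code: bytes) -> bool:
    """True if code is optional segment prefixes followed by FF /2 or FF /3."""
    i = 0
    while i < len(code) and code[i] in SEGMENT_PREFIXES:
        i += 1
    if i + 1 >= len(code) or code[i] != 0xFF:
        return False
    reg = (code[i + 1] >> 3) & 0b111  # the /digit field of the ModR/M byte
    return reg in (2, 3)


print(is_indirect_call(bytes([0xFF, 0x17])))        # call near [bx], mod=0: True
print(is_indirect_call(bytes([0xFF, 0xD0])))        # call ax, mod=3: True
print(is_indirect_call(bytes([0x2E, 0xFF, 0x1F])))  # call far [cs:bx]: True
print(is_indirect_call(bytes([0xFF, 0xE0])))        # jmp ax: False
```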
Further, the release number was updated to release 3.
Dropping the msdisp.asm and disp.asm files last week didn't work initially, breaking the build. I fiddled with it some until I was able to have the build succeed, and use the new NASM sources.
The subsequent changes are all towards the next module, mscode.obj.
ifssym.inc could define IFSR_MODE to either a byte variable or a word variable. A possibly differing expansion (address) for the same define is fine, but fixmem.pl really wants them to all have the same size. Therefore I edited the uses of IFSR_MODE to instead refer to IFSR_MODE_BYTE (lseek) and IFSR_MODE_WORD (open).
I also manually ported the multi-line macros in ifssym.mac. I put these macros into the "NASM original macros" block. Before that I added a commented out list of all IFSR_ defines using the labelsize macro. Although the addresses (expansions) are all invalid, fixmem.pl will read the correct sizes from these lines to process uses of these defines.
The mmacros will first undefine all IFSR_ smacros to avoid interference, as labelsize cannot deal with defines that are already defined as single-line macros.
Some additional definite values in structure definitions were dropped to avoid the "attempt to assemble code in absolute space" NASM errors. As before we may eventually need to handle instances of these structures specifically.
In dosmac.mac, I_NEED macros now accept a single parameter and will default to byte size. (Numeric sizes were already accepted to mean byte size by accident.) Further, some macros are newly defined with %imacro directives to match caps-insensitively.
In one instance, a TABLE ENDS directive was used to switch the current section back to the CODE section. The NASM macros do not yet contain a mechanism to make this work, so for now I inserted a section CODE directive.
Finally, I pushed an incorrect changeset to the repo yesterday. Rather than deleting the original (.ASM) files I deleted the new (.nas) files. This predictably made the build fail. (I did think to manually delete the mscode.obj and mscode.fob files before the build ran.) I hope today's fix will make the build succeed.