blog:pushbx:2023:0529_msdebug_changes_and_ldebug [2023-05-29 16:33:00 +0200 May Mon] (current)
ecm created
 +====== MSDebug changes, and lDebug ======
 +
 +**2023-05-28**
 +
 +This week I finally got back to MSDebug, my fork of the debugger included in [[https://github.com/microsoft/MS-DOS/|the 2018 free software release of MS-DOS version 2]]. I also did some work on lDebug, my debugger based on FreeDOS Debug/X.
 +
 +
 +===== lCDebug tracing into int 21h =====
 +
 +First off, during some testing that I did on the DOS machine, I ran into a crash when trying to trace into an int 21h instruction using the T command with Trace Mode (TM) set to 1.
 +
 +Turns out I didn't bother switching lCDebug into non-debuggable mode, nor did I instruct it to force InDOS mode. So when it tried to hook the debugger interrupts it used int 21h to do that, which of course led to running the breakpoint.
 +
 +As mentioned, switching off debuggable mode (clear DCO6 100h) or enabling InDOS mode (set DCO 8) fixed this.
 +
 +
 +===== MASM word keyword problem =====
 +
 +During development a certain instruction would always access a word two bytes higher than expected. What happened? I forgot to add the ''ptr'' keyword to the access. It came out as ''mov es, word [user_proc_pdb]''. The ''word'' keyword in this case is interpreted by MASM (at least the version I am using for MSDebug) as a "plus two" term in the address expression. Bad!
 +
 +
 +===== lDebug changes =====
 +
 +The first changes are to lDebug's N command. First, it should [[https://hg.pushbx.org/ecm/ldebug/rev/33d95c948f90|not allow creating command line tails]] longer than 128 bytes. (1 byte count field, 126 bytes of text, 1 byte for the Carriage Return terminator.) This is not a problem in MSDebug because it only ever creates command line tails from its own command line tail (limited to 128 bytes) or its line input buffer (limited to 80 bytes). The fix is to use a register for the end check, and set that register to the lower of the limits to enforce both a 128-byte limit and not overflowing the N buffer. There was also an off-by-one error which would result in the correct data but emit an unneeded error message.
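
The size limit described above can be modelled in a few lines. This is only a sketch in Python, not lDebug's actual code: the names ''build_tail'' and ''buffer_room'' are made up for illustration, and ''buffer_room'' stands in for whatever space is left in the N buffer.

```python
MAX_TAIL = 128  # whole tail: 1 count byte + up to 126 text bytes + 1 CR byte


def build_tail(text: bytes, buffer_room: int = MAX_TAIL) -> bytes:
    """Return count byte + text + CR, or raise ValueError if it does not fit.

    buffer_room is a hypothetical stand-in for the space left in the buffer.
    """
    # Use the lower of the two limits so that one check enforces both.
    limit = min(MAX_TAIL, buffer_room)
    if len(text) + 2 > limit:
        raise ValueError("command line tail too long")
    return bytes([len(text)]) + text + b"\r"
```

With the default limit, 126 bytes of text are accepted and 127 bytes are rejected.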
 +
 +[[https://hg.pushbx.org/ecm/ldebug/rev/e7d9e9835d33|When copying the command line tail explicitly]], the two unopened FCBs are also copied into the debuggee PSP.
 +
 +There are two options to switch the N command into MS-DOS Debug compatible mode. [[https://hg.pushbx.org/ecm/ldebug/rev/d05cfcc36438|The first causes two changes]]: The entire N parameter text is used as the command line tail, including the initial program load filename. And the PSP fields (two unopened FCBs and the initialised part of the command line tail) are copied to a debuggee PSP assumed to be at DS:0. This feature obviously can corrupt data if the debuggee DS does not point to a PSP.
 +
 +[[https://hg.pushbx.org/ecm/ldebug/rev/5e111a5b0668|The second option]] causes the N command to allcaps its command line tail. This is also what the MS-DOS Debug N command does, as in MS-DOS Debug [[https://hg.pushbx.org/ecm/msdebug/file/89b419291325/debug.asm#l450|all line input is run through a loop]] that capitalises all text except that which occurs between quote marks.
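
The capitalisation rule can be sketched as follows. This is a Python model and not a transcription of the linked assembly loop. It assumes that either a single or a double quote mark opens a quoted run, that only the same mark closes it again, and that only ASCII letters are affected. The function name is made up for illustration.

```python
def capitalise_outside_quotes(line: str) -> str:
    """Uppercase ASCII letters except those between quote marks."""
    out = []
    quote = None  # the quote mark that is currently open, if any
    for ch in line:
        if quote is None:
            if ch in "'\"":
                quote = ch  # a quoted run starts here
            elif "a" <= ch <= "z":
                ch = chr(ord(ch) - 32)  # ASCII-only uppercase
        elif ch == quote:
            quote = None  # the matching mark ends the quoted run
        out.append(ch)
    return "".join(out)
```

For example, ''e 100 "Hello" abc'' becomes ''E 100 "Hello" ABC'' under these assumptions.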
 +
 +In the same set of changes, the K command was added. This does exactly what the N command used to do. It does not observe the MS-DOS compatibility options for N, however.
 +
 +[[https://hg.pushbx.org/ecm/ldebug/rev/4aa28ec81f7e|The next change was]] to add .HEX file read support to the program-loading L command. This was [[https://hg.pushbx.org/ecm/fddebug/file/version-1.13/DEBUG.ASM#l46|a longstanding missing feature]] in FreeDOS Debug/X. The first iteration was based on a description of the MSDebug implementation, but with the EOF buffer overflow bugs fixed. Subsequently, the handling was changed [[https://hg.pushbx.org/ecm/ldebug/rev/59f2e6925ac4|to only read data from type 00h records]]. This was in preparation for the next change, which was [[https://hg.pushbx.org/ecm/ldebug/rev/67d617243ce2|to support type 04h records to change the high word]] of the target offset. (Each .HEX file record contains a 16-bit target offset.) Type 04h records are emitted by NASM when an ''-f ith'' output file nears or exceeds 64 KiB.
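
To illustrate how the record types interact, here is a minimal Intel HEX reader in Python. It is a sketch of the file format, not of lDebug's loader: the function name ''load_hex'' and the choice to return a dictionary of linear addresses are assumptions made for the example. It takes data only from type 00h records, lets type 04h records set the high word of the target address, stops at a type 01h (end of file) record, and skips all other types.

```python
def load_hex(text: str) -> dict[int, int]:
    """Parse Intel HEX text into a {linear address: byte value} mapping."""
    memory = {}
    high = 0  # high word of the target address, set by type 04h records
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("record does not start with a colon")
        record = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, must sum to 0 modulo 256.
        if len(record) < 5 or sum(record) & 0xFF != 0:
            raise ValueError("bad record length or checksum")
        count = record[0]
        offset = int.from_bytes(record[1:3], "big")
        rtype = record[3]
        data = record[4:-1]
        if len(data) != count:
            raise ValueError("count field does not match record length")
        if rtype == 0x00:    # data record
            base = (high << 16) + offset
            for index, value in enumerate(data):
                memory[base + index] = value
        elif rtype == 0x01:  # end of file record
            break
        elif rtype == 0x04:  # extended linear address record
            if count != 2:
                raise ValueError("type 04h record must carry two bytes")
            high = int.from_bytes(data, "big")
        # Any other record type carries no data here and is skipped.
    return memory
```

Feeding it a type 04h record with the value 0001h followed by a data record at offset 0010h places that data at linear address 10010h.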
 +
 +During development another problem cropped up: The BC command did not accept a comma as separator between the ''BC'' part and an ''ALL'' keyword. [[https://hg.pushbx.org/ecm/ldebug/rev/c0da5607f556|The bb.asm code was changed]] to use ''skipcomma'' or ''skipcomm0'' in a number of places.
 +
 +Also, the ''_REGSREADABLEFLAGS'' option [[https://hg.pushbx.org/ecm/ldebug/rev/4e32cd122d28|got enabled by default]]. It comes with two DCO6 options to turn on style 2 or style 3 alternative flag state displays. (Style 1 of [[https://sourceforge.net/p/freedos/feature-requests/93/|the original feature request]] was never implemented because it would require overlapping the flags display with the disassembly line.) (This feature request came to my attention through [[https://github.com/Baron-von-Riedesel/DOS-debug/issues/14|a report to the current Debug/X repo]].)
 +
 +Finally, some documentation updates were added. This includes [[https://hg.pushbx.org/ecm/ldebug/rev/94703001ba26|the IF EXISTS R command]], as well as [[https://hg.pushbx.org/ecm/ldebug/rev/85d7b685b6e3|the lDebug advertisement section from the MSDebug manual]], just with the beginning extended and slightly reworded.
 +
 +
 +===== MSDebug changes =====
 +
 +I had wanted to change MSDebug ever since August 2022, particularly to fix the process model to something sensible that would work on all compatible DOS kernels. However, back then I ran into a roadblock when I discovered how severely broken the N command was (as described above).
 +
 +This time I went the other way around: First, I [[https://hg.pushbx.org/ecm/msdebug/rev/8f77f880c61a|added the K command]] to behave like lDebug's K command (or its default N command). This required introducing a separate buffer called ''programloadname''. All the "XENIX" file access functions, using the common label ''oc_file'', use this buffer now. The buffer overlaps 128 bytes of the debugger's initialisation code, which was previously unused, so the memory use was not increased. This buffer is located at offset 100h of the process, mirroring the N buffer of lDebug somewhat. (However, lDebug stores the ASCIZ pathname first, then the command line tail. MSDebug's ''programloadname'' actually cheats: it may have leading blanks and isn't ASCIZ; the NUL terminator is added dynamically before the buffer content is used as a pathname. This was left as is to simplify the change.)
 +
 +To enable the K command to not force its command line tail allcaps, [[https://hg.pushbx.org/ecm/msdebug/file/89b419291325/debcom2.asm#l939|it does some pointer arithmetic]]. The debugger actually has two line buffers, the ''linebuf'' which is passed to the DOS int 21h function 0Ah, and the ''bytebuf'' which receives the capitalised contents copied from ''linebuf''. (I assume this is intended to allow the line input recall to reproduce the original input rather than the capitalised variant.) By doing the pointer arithmetic, the K command can access the ''linebuf'' to read the original input line. The pointer arithmetic is done using two instructions, one ''sub'' and then an ''add''. This is because I assume the linker doesn't allow us to use two labels in a single numeric expression that would be required to do it in one instruction.
 +
 +(Meanwhile, in lDebug land we assemble the entire debugger in one assembler run without involving a linker at all, which is why we can do all sorts of label arithmetic. Another example in MSDebug is [[https://hg.pushbx.org/ecm/msdebug/file/89b419291325/debug.asm#l149|the calculation of the size in paragraphs]] for the int 21h function 4Ah (resize) call, which does only the "plus 15" at build time and does the "shift right by 4" at run time instead. lDebug can do things like this all during build time.)
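
The paragraph calculation itself is simple arithmetic, shown here as a Python sketch with made-up names. A paragraph is 16 bytes, so adding 15 before shifting right by 4 rounds the byte size up to whole paragraphs. The two steps are kept separate to mirror the split between build time and run time described above.

```python
def add_rounding(size_in_bytes: int) -> int:
    """The "plus 15" step, which MSDebug has done at build time."""
    return size_in_bytes + 15


def to_paragraphs(rounded_size: int) -> int:
    """The "shift right by 4" step, which MSDebug does at run time."""
    return rounded_size >> 4


def paragraphs(size_in_bytes: int) -> int:
    """Both steps together: byte size rounded up to 16-byte paragraphs."""
    return to_paragraphs(add_rounding(size_in_bytes))
```

A size of 16 bytes needs one paragraph, while 17 bytes already need two.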
 +
 +The PSP FCBs and command line tail of the debugger are now used to hold exactly the same fields of data used for the process creation. (This had to wait until a subsequent changeset for the process model fix to be fully implemented.)
 +
 +As a boon to developers, [[https://hg.pushbx.org/ecm/msdebug/rev/e5f5b5d33cd3|the BU command was added]]. It stands for Break Upwards. It causes the debugger to run an ''int3'' in its command dispatcher. This is like lDDebug's BU command. Unlike lDDebug, the EOL is not checked and no message is emitted. Unlike lDebug, the interrupt 1 and 3 handlers are not hooked during processing of all commands, only when running the debuggee (just like lDDebug). Therefore, the MSDebug BU command is not conditional on a debuggable build or mode (as it is in lDDebug or lCDebug).
 +
 +The next change was [[https://hg.pushbx.org/ecm/msdebug/rev/f497579cbb81|to fix the process model of the debugger]]. In particular, the PSP created during initialisation was used to source the EXEC buffers for unopened FCBs and the command line tail. This proved incompatible with the intended operation of the K command. It is also completely unnecessary. The fix actually reduced the memory use of the debugger. (There was also the problem that apparently with repeated L commands the debugger memory use would grow dynamically.)
 +
 +This change also made it possible to re-create a child process whenever the debuggee terminates. This can fail if no memory is available, though. Unlike lDebug, MSDebug will not attempt to allocate a process repeatedly. Rather, [[https://hg.pushbx.org/ecm/msdebug/rev/69de8d4133a3|a new command was added]] to explicitly retry this operation, which always displays a message indicating what it did. This is the comma command. The command checks that a linebreak immediately follows the comma, which leaves room for other potential comma commands with non-empty parameters.
 +
 +The process creation was also changed to use int 21h function 55h rather than function 26h. This appears to be more compatible across DOS versions. The memory size fields at ''word [PSP:2]'' and ''word [PSP:6]'' are written manually to avoid DOS incompatibilities. (Note: lDebug needs to be fixed to do this too.) This includes recalculation of the CALL 5 entrypoint, which was changed to use a formula similar to that of FreeDOS and more recent MS-DOS versions. The initial stack was fixed to normally use an SP equal to 0FFFEh, when more than 64 KiB are allocated to the process block.
 +
 +There is also [[https://hg.pushbx.org/ecm/msdebug/rev/f6f42b54569a|a set of changesets]] to align data [[https://hg.pushbx.org/ecm/msdebug/rev/36a9faeac601|on word boundaries]]. In particular the stack happened to be on an odd address sometimes. Aligning **all** data is arguably not very useful, but the stack should be aligned. When using ''xchg'' instructions on word memory variables, alignment can also avoid a possible split lock fault. (All ''xchg'' instructions with memory operands are implicitly locked, even when no explicit ''lock'' prefix is used.)
 +
 +The [[https://hg.pushbx.org/ecm/msdebug/rev/be5f65f34612|header displayed during startup was changed]] to read "MSDebug release 0 by ecm". A similar header was [[https://hg.pushbx.org/ecm/msdebug/rev/98c98d4d8283|added to the command help]] and online help. The command help [[https://hg.pushbx.org/ecm/msdebug/rev/65220281f85d|was moved to overlap the stack space]], reducing the memory use and executable size a bit.
 +
 +Speaking of which, I found that data initialised to question marks in the source was actually emitted as zeroes in the executable, even when this data was at the end of the program image. I was afraid that my uses of the ''even'' directive for alignment would cause more zeroed data to be emitted than otherwise needed. I found that this was not the case, as all the zeroed data was emitted either way. I don't know if there is a way to omit this data from the executable image.
 +
 +The process creation [[https://hg.pushbx.org/ecm/msdebug/rev/90cc4c0cc1b3|was also changed to write]] a ''retn'' instruction (0C3h) at the entrypoint, so that running just a G command causes the process to terminate.
 +
 +[[https://hg.pushbx.org/ecm/msdebug/rev/85dc95af709e|A manual was added]], based on lDebug's manual but greatly shortened to only the parts relevant to MSDebug. It started out referencing lDebug as an alternative. This was later extended with [[https://hg.pushbx.org/ecm/msdebug/rev/739f46a8353f|an lDebug advertisement section]] at the end of the manual, listing many advantages of lDebug.
 +
 +The K command code was largely [[https://hg.pushbx.org/ecm/msdebug/rev/20aa26142b2f|used to replace the bespoke command line tail processing]] done at the startup of the debugger. This processing was once the example on which the K command was based, and I suspect also the reason FreeDOS Debug's N command worked the way it did. This was because this code did what I expected: It parsed the program load pathname, then parsed the remaining parameters into the command line tail. The re-use of most K command code reduces the amount of redundant code. This change involved a re-use of the ''programloadname'' buffer [[https://hg.pushbx.org/ecm/msdebug/rev/b53badd387ca|for parsing the FCBs]], as the original source may be the debugger's PSP command line tail which is rewritten during the processing of the tail.
 +
 +Subsequent changesets merged things so that the N and K commands largely share code, which is called or chained to in order to implement either command: [[https://hg.pushbx.org/ecm/msdebug/rev/3e1737f32612|1]], [[https://hg.pushbx.org/ecm/msdebug/rev/5448f54dc3e5|2]], [[https://hg.pushbx.org/ecm/msdebug/rev/79b0eced47c4|3]], [[https://hg.pushbx.org/ecm/msdebug/rev/a3badb2f3d55|4]].
 +
 +The .HEX file reader [[https://hg.pushbx.org/ecm/msdebug/rev/eb9d13a53bdc|got several bugfixes]]. These do what apparently was intended to be done: Upon reading a NUL (0) byte, an EOF (1Ah) byte, or after having processed the entire file (reached the End Of File), the read should end. What it used to do instead was overflow the buffer and potentially loop infinitely. For example, passing a completely empty .HEX file could lock up the debugger.
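
The three end conditions can be expressed as one bounded scan. The following Python sketch only models the intended behaviour and does not follow the MSDebug source. The name ''usable_length'' is made up for illustration. It returns how many bytes of a buffer should be processed, and it can never run past the end of the buffer, even if the buffer is empty.

```python
NUL = 0x00
EOF_MARK = 0x1A  # the DOS text file end marker (Ctrl-Z)


def usable_length(buffer: bytes) -> int:
    """Return the number of bytes before a NUL, an EOF byte, or the end."""
    for index, byte in enumerate(buffer):
        if byte in (NUL, EOF_MARK):
            return index  # stop at the first terminator byte
    return len(buffer)    # no terminator: the whole buffer is usable
```

An empty buffer yields a length of zero, so a reader built on this simply has nothing to do.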
 +
 +An extension is to check that the .HEX file data [[https://hg.pushbx.org/ecm/msdebug/rev/da1732f9abc0|does not wrap around the end of the segment]].
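
Such a wraparound check comes down to one comparison, sketched here in Python with made-up names. A segment spans offsets 0 to 0FFFFh, so a record fits if its start offset plus its data length does not exceed 10000h. How MSDebug reports a record that does not fit is not modelled here.

```python
SEGMENT_SIZE = 0x10000  # a real mode segment holds 64 KiB


def fits_in_segment(offset: int, length: int) -> bool:
    """True if length bytes starting at offset stay within one segment."""
    return 0 <= offset and 0 <= length and offset + length <= SEGMENT_SIZE
```

A single byte at offset 0FFFFh still fits, while two bytes at that offset would wrap around to offset 0.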
 +
 +Unlike lDebug, reading .HEX files beyond the first segment is not supported.
 +
 +Another bugfix is to close the file handle that is used for .HEX file reading. This leak may have affected other program-loading L commands and program-saving W commands, as well. Instead of hunting down all the bugsites, I [[https://hg.pushbx.org/ecm/msdebug/rev/390c1e21f610|added a close call]] in the ''command'' loop and made sure to only write to the ''handle'' variable when an opened file handle was returned in the "XENIX" function handling.
 +
 +I found two instructions [[https://hg.pushbx.org/ecm/msdebug/rev/149f1f711269|that were completely unreachable]] during the prior change. I dropped those.
 +
 +All in all, the MSDebug repo seems to be ready for a release. I did all the changes that I wanted to, and it should be more usable now.
 +
 +{{tag>ldebug msdebug masm}}
 +
 +
 +~~DISCUSSION~~
  
blog/pushbx/2023/0529_msdebug_changes_and_ldebug.txt · Last modified: 2023-05-29 16:33:00 +0200 May Mon by ecm