Late April work on the debugger

2023-04-30

This week was busy for the debugger's development, but not for any other projects.

MMX variable read changes

The MMX variables were among the last holdovers to still implement their own check in the if ladders of the isvariable? function. That was moved into the var_mm_setup function, largely re-using the "special setup" and "separator special" facilities added to the isvariable? structures for the likes of RIxxy variables.

Surprisingly, this turned out to be a slight pessimisation in terms of code space. But moving things into the structures does buy us a little speed, because the common repne scasw scan can quickly skip candidate variable names that do not start with "MM".
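The speed argument can be illustrated with a rough Python model. It assumes that each structure entry carries a 16-bit word built from the first two characters of the variable name, so one scan over that word table rejects any candidate whose prefix matches no entry. The table contents and helper names are hypothetical, not taken from the debugger's sources.

```python
# Hypothetical model of a two-character prefix scan, in the spirit of
# repne scasw over a table of 16-bit words. The entries are illustrative only.

def prefix_word(name):
    """Pack the first two characters into a 16-bit little-endian word."""
    padded = name.upper().encode("ascii").ljust(2, b"\0")
    return padded[0] | (padded[1] << 8)

# Assumed table: one word per variable prefix known to the structures.
PREFIX_TABLE = [prefix_word(p) for p in ("MM", "RI", "V0")]

def candidate_entries(text):
    """Return indices of table entries whose two-byte prefix matches the text."""
    key = prefix_word(text)
    return [i for i, word in enumerate(PREFIX_TABLE) if word == key]
```

Only the entries returned by the scan need their slower per-entry checks, such as the special setup function, to run.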

Some optimisations were still done after this. The only parts of isvariable? that are still handled in code instead of using the structures are the debuggee registers (8-bit, 16-bit, 32-bit) as well as the status flags. The registers include compound registers, either two 16-bit register names or one name with a "00" suffix for the "compound with zero" variables. The many choices for compound registers do not lend themselves to the data structures easily. The remaining 8 + 16 + 10 full registers could be encoded in the structures.

Other expression changes

The big loop in isvariable? now clears some flags that should be match-specific, for the remote possibility of processing a match that turns out to reject the input, followed by processing another match.

IF EXISTS R changes

The IF command's variable name check is somewhat smarter now. In particular, if a variable name is ended by an "L" separator then the IF code won't find the "THEN" keyword that it expects. ("L" is a separator because Debug allows it to mark the end of a hexadecimal number, indicating in a range parameter that a range length follows.) This would lead to an error before. Now, the code backs up and tries to skip the candidate variable name as if no variable was recognised.
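The fallback can be sketched roughly as follows. This is a simplified model, not the debugger's parser: the separator set, the helper names, and the set of known variable names are all assumptions made for illustration.

```python
# Hypothetical sketch of the IF EXISTS R fallback described above.

SEPARATORS = " \t,L"  # assumed: "L" ends a name just like it ends a hex number

def parse_variable(text, known):
    """Return the length of a known variable name at the start of text, else 0."""
    upper = text.upper()
    for name in sorted(known, key=len, reverse=True):
        following = upper[len(name):len(name) + 1]
        if upper.startswith(name) and (following == "" or following in SEPARATORS):
            return len(name)
    return 0

def skip_name(text):
    """Skip a blank-delimited token, as done when no variable is recognised."""
    end = 0
    while end < len(text) and not text[end].isspace():
        end += 1
    return end

def if_exists(text, known):
    """Return (exists, rest) where rest starts at the THEN keyword."""
    length = parse_variable(text, known)
    rest = text[length:].lstrip()
    if length and rest.upper().startswith("THEN"):
        return True, rest
    # Back up and treat the whole token as an unrecognised name.
    rest = text[skip_name(text):].lstrip()
    if not rest.upper().startswith("THEN"):
        raise ValueError("THEN keyword expected")
    return False, rest
```

With a known name "DI", the input "DIL THEN R" first matches "DI" ended by the "L" separator, fails to find THEN, and then falls back to skipping "DIL" as a whole.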

A minor bug was fixed: If no variable match occurred, there was an off-by-one as the skip logic would first run a lodsb despite already having the beginning of the text in the al register. This only affected particularly malformed inputs like just "IF EXISTS R" which could lead to an overflow in the line_in buffer.

ATTACH command

The ATTACH command is the opposite of the TSR command. Both are valid in DOS application and DOS device driver mode. If you squint, running ATTACH in device mode leads to something similar to the default state of things for application mode. Likewise the reverse with the TSR command.

The QC "device Quit from Container" command will now set up the PSP created during device initialisation as the owner of the regular MCB it creates for the debugger. Finally, QC will route things to the dispatcher that handles either attached or detached (TSR) termination for the application mode. Before, it immediately jumped to the detached termination.

Symbol table troubles

If a symbol table was allocated in DOS memory and the QD command ran, it would not free the symbol table memory.

This bug was actually fixed in two ways: First, the table memory is freed explicitly. Second, QD quit now also terminates its PSP, borrowing some code from the detached quit handler. Either would be sufficient on its own, but combining both solutions hardens things. For instance, if we ever allocate more memory to the debugger then this would be freed by the process termination.

Device loader difficulties with MCBs

A QD quit with a symbol table allocated in DOS memory is actually quite unlikely. Running the FreeDOS kernel (without an UMB provider), during device initialisation there are only two MCBs: An SD system block presumably containing earlier drivers that stayed resident, and a very large free block. Unfortunately, the free block overlaps both the driver currently loading at its beginning and apparently some of DOS's memory in use at the far end. Allocating memory with the DOS interface is doomed to fail, silently, in these circumstances. Silently, that is, until the system crashes.

I do not recall exactly how MS-DOS handles this, but I do believe it fails in a better way than FreeDOS.

Code and entry sections shuffling

Patterned after the second code section's support for them, the entry section is now covered by 386-related patch tables as well. This allowed moving some code in run.asm into the entry section. The initial exception handler that is the DPMI entrypoint into the resident debugger was easier to move, as it didn't contain any patchsites.

All of these shufflings are to make more space in the first code section, which is too close to full for comfort. In fact, adding the ATTACH command required the DPMI entrypoints be moved in order to build the full lCDebugX build (symbolic, immasm, DPMI, conditionally debuggable, triple mode). After the help pages were moved into the "message segment", the entry section boasted more than 20 kB of space left over.

Initialisation clarifications

The various image positioning and layout defines are now commented much more thoroughly. The "NONBOOT" defines have been renamed to "APP" instead to clarify that they aren't used for device mode. The DATAENTRYTABLESIZE define was moved up from its device mode origins to the main sources and used in several more spots. The early init section relocation in app mode and device mode is now checked at build time to ensure that no overlap occurs. Such an overlap is very unlikely but could happen if the data stack buffers, history segment, and auxiliary buffer were all nonexistent or small.
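The kind of overlap test involved can be sketched as a check on two half-open address ranges. The function name and the example offsets are hypothetical; the real check is expressed in the build itself using the layout defines.

```python
# Hypothetical overlap check for two regions given as (start, size).

def regions_overlap(start_a, size_a, start_b, size_b):
    """True if the half-open ranges [start, start + size) intersect."""
    if size_a <= 0 or size_b <= 0:
        return False  # an empty region cannot overlap anything
    return start_a < start_b + size_b and start_b < start_a + size_a
```

A build-time check would fail the build when the source and destination of the relocation satisfy this condition.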

Device attribute change

The device attribute 800h (bit 11) is now set. It indicates support for the Device Open and Device Close requests. This will lead to a critical error upon opening the "LDEBUG$$" device, instead of only when trying to read or write to an open handle to this device. We want to avoid open handles referencing our device because they will be left dangling if we uninstall the debugger later.
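As a small illustration of the bit arithmetic, here is the attribute word modelled in Python. The constant and function names are made up, and the example value is an assumption: the post does not state the debugger's complete attribute word.

```python
# DOS device header attribute bits used in this illustration.
ATTR_CHARACTER_DEVICE = 1 << 15   # 8000h: character device
ATTR_OPEN_CLOSE = 1 << 11         # 800h: Device Open / Device Close supported

def supports_open_close(attribute_word):
    """Check whether bit 11 is set in a device attribute word."""
    return bool(attribute_word & ATTR_OPEN_CLOSE)

# Assumed example value, not the debugger's actual attribute word.
example_attributes = ATTR_CHARACTER_DEVICE | ATTR_OPEN_CLOSE
```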

I noticed a particularly odd situation when submitting an "Ignore" response to the critical error prompt on opening the device. It returned a successfully opened file handle. But upon termination of the attached process (Q or QA command), the machine would appear to hang.

I believe I understand why this happens: The kernel will go and close the process file handles from the Process Handle Table (PHT) in ascending order. Thus it will close all the standard handles first, successfully. Then it gets to our "LDEBUG$$" handle. This close would cause a critical error. However, FreeCOM's critical error handler presumably uses the stderr handle to write its prompt and read a response. But that handle is already closed! Thus the hang.

There are four ways to avoid this:

  • Close the handle explicitly before terminating the process. This will cause a normal critical error prompt, where the user can choose "Ignore" again.
  • The shell could detect this situation and then use its own handle instead of the user process's.
  • The kernel could close the handles in descending order.
  • The device can support Device Close calls and return successfully.

I implemented the last option for now.
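The suspected hang and the effect of the close order can be reproduced in a toy model. This is not actual kernel behaviour: it assumes that handle 2 is stderr, that handle 5 refers to the "LDEBUG$$" device, and that the critical error prompt only works while stderr is still open.

```python
# Toy model of process termination closing the process handles.

STDERR = 2  # assumed handle used by the critical error prompt

def terminate(open_handles, failing, order):
    """Close handles in the given order; report how each failing close ends."""
    still_open = set(open_handles)
    outcomes = []
    for handle in sorted(open_handles, reverse=(order == "descending")):
        if handle in failing:
            # A failing close raises a critical error that prompts via stderr.
            outcomes.append("prompt shown" if STDERR in still_open else "hang")
        still_open.discard(handle)
    return outcomes
```

In this model, ascending order with a failing device close yields a hang, descending order yields a working prompt, and a device that accepts Device Close (an empty failing set) yields no critical error at all.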

Idea: The device could support both Device Open and Device Close, and keep track of how often it was opened. Then the QC quit would only be allowed if the counter is zero. However, this would imply that Device Open would not cause an error any longer.
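A minimal sketch of that idea, with a hypothetical class and method names. Nothing like this is implemented in the debugger at this point.

```python
# Hypothetical open-count tracking for the device.

class DeviceModel:
    def __init__(self):
        self.open_count = 0

    def device_open(self):
        """Count the open and succeed, so opening no longer causes an error."""
        self.open_count += 1
        return True

    def device_close(self):
        """Count the close; never drop below zero on unbalanced closes."""
        if self.open_count > 0:
            self.open_count -= 1
        return True

    def may_quit_container(self):
        """QC would only be allowed while no handle references the device."""
        return self.open_count == 0
```

The trade-off stated above is visible in device_open: it has to succeed for the counter to be meaningful.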

Device init inheriting handles

The device mode debugger inherits its handles from the DOS device installer's PSP. (This is at segment 60h for FreeDOS.) The original CON, AUX, and PRN handles will stay around due to this. However, testing found that a sixth process handle was open in the debugger's PSP. And after attaching to a process and running a QA command, the sixth handle would be inherited by the debuggee process as well.

This handle was the one referring to the FDCONFIG.SYS file. To avoid this, the debugger will now close any process handles numbered 5 and higher upon the debugger's initialisation.
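The cleanup can be modelled on a 20-entry Process Handle Table in which 0FFh marks an unused slot. This is only a sketch: the table contents are invented, and real code would presumably close each handle through the DOS close function (interrupt 21h function 3Eh) rather than edit the table directly.

```python
# Model of closing every process handle numbered 5 and higher.

UNUSED = 0xFF
FIRST_EXTRA_HANDLE = 5  # handles 0..4 are stdin, stdout, stderr, stdaux, stdprn

def close_extra_handles(pht):
    """Mark open handles from number 5 upwards as closed; return their numbers."""
    closed = []
    for handle in range(FIRST_EXTRA_HANDLE, len(pht)):
        if pht[handle] != UNUSED:
            pht[handle] = UNUSED
            closed.append(handle)
    return closed
```

With a table whose sixth entry (handle 5) refers to the configuration file, only that entry is closed and the five standard handles are left alone.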

The device init target mystery

I spent many a minute developing a test case and thinking through the details of why the "device init target" define included a PSP's worth of address space twice.

Once, in the INITSECTIONOFFSET which addresses the init section's position in the initial program image. This is addressed from the pseudo PSP position's ds so that ds:100h points to the program image.

And a second time in the DEVICEADJUST, the size of the device shim that is eventually installed for the resident debugger plus the pseudo MCB and the actual PSP that will be allocated to the debugger.

It turns out that both reservations are actually needed for proper operation. Building with -D_BOOTLDR_DISCARD=0 -D_BOOTLDR_DISCARD_HELP=0 makes it so one of the buffers (probably the history segment) is initialised immediately in front of the relocated init section. If the PSP size was incorrectly double-booked, then a PSP-sized gap would be observed here.

I spent more time on this wild goose chase than I'd like to admit. Even if this was a bug, it would be very benign.

blog/pushbx/2023/0501_late_april_work_on_the_debugger.txt · Last modified: 2023-05-01 10:14:45 +0200 May Mon by ecm