2024-03-24
This week I worked some on instsect, ldosboot, and lDebug. Some development on Debug/X happened as well (1, 2), part of which was in response to my bug reports (1, 2).
While using instsect to install a Stack Overflow user's boot sector loader I noticed that it misdetected two filenames in the error messages. This was because their error messages happened to be in all caps. It turned out to be sufficient in this case to detect when a blank in either candidate field (base name or extension) is followed by a nonblank in the same field, which never occurs for valid filenames.
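The blank check can be sketched roughly as follows. This is only a Python model of the idea, not the instsect code: the function names are made up for illustration, and it assumes the candidate is an 11-byte field (8-byte base name followed by a 3-byte extension) that has already passed whatever other checks instsect applies.

```python
def blank_then_nonblank(field: bytes) -> bool:
    """Return True if any blank in the field is followed by a nonblank."""
    seen_blank = False
    for byte in field:
        if byte == 0x20:
            seen_blank = True
        elif seen_blank:
            return True
    return False


def may_be_filename(candidate: bytes) -> bool:
    """Apply the blank check to both fields of an 11-byte candidate.

    Hypothetical helper: only the blank rule from the text is modelled here.
    """
    if len(candidate) != 11:
        return False
    base, ext = candidate[:8], candidate[8:]
    return not (blank_then_nonblank(base) or blank_then_nonblank(ext))
```

With this, a candidate such as `b"KERNEL  SYS"` is kept, whereas an all-caps message fragment such as `b"DISK ERROR "` is rejected because the blank in the base name field is followed by more letters.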
The test payload shares the filename detection of instsect. Therefore I picked the blank detection change from instsect.
Several updates to the manual happened. In particular, the description of the ELD (multi-purpose) puts handler was updated and had the puts_ext_next entry added. The ELD puts copyoutput and puts getline handlers were added too.
During the updates to the manual, I noticed that transfer_ext_cx did not preserve the flags in the _PM=1 build. This was initially documented as a bug. It happened to be a benign bug though.
The internal LINKCALL ELD (not loaded and installed like a regular ELD, but behaves like one otherwise) prompted me to add a new variable: pm_2_86m_0 is a word variable that is equal to 2 in Protected Mode and equal to 0 otherwise. (In _PM=0 builds it is always equal to 0, of course.)
This simple variable allows optimising the Protected Mode dependent dispatch, dropping a function call or test instruction as well as a conditional jump. The dispatch can happen either directly for variables, or for code paths instead by utilising jump tables.
The biggest win is likely that we need not run pushf and popf instructions any longer, as the dispatch can occur completely using just mov, push, pop, or lea instructions. These never affect the arithmetic status flags.
However, the new dispatch does have to use an additional register, usually di. This is a minor disadvantage though.
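To illustrate both flavours of the dispatch, here is a small Python model. It is a sketch under assumptions, not the debugger's code: the name code_segsel, the example values, and the layout (segment word at +0, selector word at +2) are made up for illustration. The point is that the value of pm_2_86m_0 is used directly as a byte offset, so no comparison or branch is needed.

```python
import struct

# Hypothetical values; the real debugger fills these in at run time.
CODE_SEG = 0x1234  # segment used in 86 Mode
CODE_SEL = 0x00A7  # selector used in Protected Mode

# Assumed layout: segment word at +0, selector word at +2.
code_segsel = struct.pack("<HH", CODE_SEG, CODE_SEL)


def pm_2_86m_0(protected_mode: bool) -> int:
    """Model of the word variable: 2 in Protected Mode, 0 otherwise."""
    return 2 if protected_mode else 0


def load_segsel(protected_mode: bool) -> int:
    """Variable dispatch, akin to loading di and reading word [code_segsel + di]."""
    di = pm_2_86m_0(protected_mode)
    return struct.unpack_from("<H", code_segsel, di)[0]


def in_86_mode() -> str:
    return "86 Mode path"


def in_protected_mode() -> str:
    return "Protected Mode path"


# Jump table with one word-sized slot per mode, hence the index di // 2 here.
JUMP_TABLE = (in_86_mode, in_protected_mode)


def dispatch_code(protected_mode: bool) -> str:
    """Code path dispatch, akin to an indirect near jump through [table + di]."""
    di = pm_2_86m_0(protected_mode)
    return JUMP_TABLE[di // 2]()
```

In the model, `load_segsel(True)` picks the selector and `load_segsel(False)` picks the segment, using nothing but an indexed read.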
To actually apply this to the link call handlers, I multiplied the offsets passed in bl by 2 for the _PM=1 build. In the _PM=0 build they are passed as 0 (entry section), 2 (first code section), or 4 (second code section). In _PM=1 builds they are now passed as 0, 4, or 8. The extcall_table was re-ordered to accommodate the new offsets. This involves unused padding words between the .ret entries, and interleaving the references to the selector variables with the references to the segment variables. This enabled the dispatch to not need any addition or multiplication instructions.
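The effect of the doubled offsets can be modelled like this. The slot names and their exact order are assumptions for illustration only (the real extcall_table holds references to variables and also contains the .ret entries with their padding, which are left out here). The sketch shows just the interleaving: the section offset from bl and the mode offset from pm_2_86m_0 together select a slot.

```python
# Hypothetical slot names standing in for references to the real variables.
EXTCALL_SLOTS = (
    "entry segment", "entry selector",              # byte offsets 0, 2
    "first code segment", "first code selector",    # byte offsets 4, 6
    "second code segment", "second code selector",  # byte offsets 8, 10
)

# Offsets passed in bl for the _PM=1 build, as described in the text.
SECTION_OFFSETS_PM_BUILD = {"entry": 0, "first code": 4, "second code": 8}


def lookup(bl: int, pm_2_86m_0: int) -> str:
    """Select a slot from the section offset (bl) and the mode offset (0 or 2).

    In assembly the sum can live in the effective address of a single
    memory operand, so no separate add instruction is needed.
    """
    byte_offset = bl + pm_2_86m_0
    return EXTCALL_SLOTS[byte_offset // 2]
```

For example, `lookup(SECTION_OFFSETS_PM_BUILD["first code"], 2)` yields the first code section's selector slot, and the same call with 0 yields its segment slot.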
After the link call handlers in the LINKCALL ELD, the next dispatch to adjust was transfer_ext_cx. This one required a jump table because of how the di register needs to be preserved but the dispatch needs to push a mode-dependent variable to the stack. I initially pondered a trampoline in the ELD code section that would consist of the instructions pop di and jmp cx. However, this would make it more difficult to initialise the ELD infrastructure and in particular to not initialise it when the ELD code section is too small to contain the LINKCALL ELD.
Another bug was that the multi-purpose and copyoutput puts handlers were not always entered with NC. This is not a problem in practice, but could be if a handler directly chained to its downlink rather than processing the message. If it preserves the flags then it should be entered with NC to let the message be written, in case the downlink consists of a branch back into the debugger.
The code_ret_to_ext function was adjusted to use the new dispatch mechanism next. (In the _PM=0 build this function is trivial as it consists of a single retf instruction.)
A new function entry_to_code_segsel was added, also using the new dispatch mechanism. This function is now used in two places: To return from the entry section to the first code section and subsequently to the ELD code section, and to transfer from the entry section's interrupt return handler into the first code section.
An unrelated change was to add a comment indicating that ispm changes flags to NC, which the code already did. It wasn't documented for the _PM=0 replacement code however. (This replacement is a function consisting of only test sp, sp and retn. It is assumed that our stack pointer can never be 0000h. The _PM=1 function already assumes we are running on our stack.)
To test the new entry via first code to ELD code path, I added the entry_retn function in serialp.asm which does nothing other than returning near. I also added the code link to this function, the first code link actually referencing section 2 (the entry section). Then I created eldtesth.asm to try out calling into the entry section from an ELD. It worked flawlessly on the first attempt.
Next I adjusted the dispatch in the two _PM=1 dual call helpers (1, 2, 3). These are actually assembled into four helpers: One each for calling into the same code section as the caller, and one each for calling into the other code section.
Next up was the dual call return helper. This likewise is used only for _PM=1 builds and gets assembled twice, one per code section. This uses a small dispatch table to convert from the word values 1 or 0 to appropriate offsets from the code_seg variable. In practice, this entails multiplying by four. However, by using the table, changing the flags is avoided so that pushf and popf can be dropped.
After that, the near call helper was next. This is used unless ! _PM && _DUALCODENEARDUAL is true. For _PM=1 builds it is always used. For _PM=0 builds it is used only if near calls to the other section do not use per-function trampolines. (The default is to use the trampolines.)
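The condition is easier to read as a truth table. The following Python model simply negates the quoted expression. It assumes that a nonzero _DUALCODENEARDUAL is what selects the per-function trampolines, which is inferred from the description above rather than checked against the build options.

```python
from itertools import product


def near_call_helper_used(pm: bool, dualcodeneardual: bool) -> bool:
    """The helper is used unless (not _PM and _DUALCODENEARDUAL) holds."""
    return not (not pm and dualcodeneardual)


# One row per combination: (_PM, _DUALCODENEARDUAL, helper used)
TRUTH_TABLE = [
    (pm, neardual, near_call_helper_used(pm, neardual))
    for pm, neardual in product((False, True), repeat=2)
]
```

The only row in which the helper is not used is the _PM=0 build with _DUALCODENEARDUAL enabled.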
The _doscall_return_es and _doscall functions were adjusted next (1, 2). Like the transfer_ext_cx dispatch, these use a jump table.
Finally, the use of the ax register in several of the dual code helpers was avoided, preferring to use bx and di only.
A small change to the get_messagesegsel function optimises the _PM=1 build to always set ds to the selector value first, before testing for Protected Mode.
The debugger init was adjusted to allow allocations before the LINKCALL ELD. This is normally never the case. To test the support, an option was added that does just such an allocation before allocating the LINKCALL ELD.