The Protected Mode int 21h handler mystery

Following the bug report on the forum, I changed the PM int 21h handler of lDebugX to first switch stacks and then transfer control to the code section, because the transfer function still requires a 16-bit stack (either the ss B bit or esph, the high word of esp, must be zero).
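The 16-bit stack condition can be written down as a small predicate. This is only a sketch in C, not code from lDebugX: the function name and its parameters are hypothetical, and it models nothing beyond the condition stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the condition stated in the text:
 * the stack counts as 16-bit compatible if the B bit of the
 * ss descriptor is clear, or if the high word of esp is zero. */
static bool stack_is_16bit_compatible(bool ss_b_bit, uint32_t esp)
{
    uint16_t esph = (uint16_t)(esp >> 16);  /* high word of esp */
    return !ss_b_bit || esph == 0;
}
```

A client running on a 32-bit stack segment with esp above 64 KiB fails this check, which is the case the stack switch is meant to cover.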

This involved moving some patch sites into the entry section. This is not a problem any longer as the 386-related patches now store separate tables for the entry section. However, it was a problem before this change which is why the original int 21h entry in the entry section used all manual patches not using the tables.

To test that the entry section patches work, I rebuilt lDebugX with _CATCHPMINT214C=0. It required some changes to build. For instance, with _DEBUG=0 the excsave table wasn't included in the build. To avoid using this table I tried to build using _DEVICE=0 _TSR=0. This failed due to a recent change to inicomp involving the lCFG block, which didn't allow building with _IMAGE_EXE=1 _DEVICE=0.

After clearing these hurdles, tracing into Protected Mode led to a crash that killed the dosemu2 process upon terminating the DPMI client. Using dosemu2's dosdebug was not particularly helpful. bpintd 21 4C00 did skip all the tedious tracing otherwise required, but it immediately led to the crash rather than tracing into the PM int 21h handler or the subsequent 86 Mode execution of the Parent Return Address.

Aside: dosemu2's dosdebug doesn't come with a P command and its breakpoints don't seem to work particularly well. I also had to manually remove stale files to make it find its dosemu process. This involved deleting either ~/.dosemu/run/* (older dosemu2) or /var/run/user/1001/dosemu2/dosemu.dbg* (newer dosemu2) before starting the dosemu2 process to attach to.

Eventually, I tried out patching the debugger built with _CATCHPMINT214C to nop out parts of its PM int 21h handler. This led to some progress. Patching away the call to pm_reset_handlers did not lead to a crash. Patching away the entire call to pmint21_4C_code (a near call plus its word data), or patching the dispatcher so it never executes its own handler (conditional jne to unconditional jmp), led to the crash. Nopping out parts of pmint21_4C_code revealed that clearing the canswitchmode flag caused the machine to not crash.

This led me to investigate all uses of this flag. I found out in run.asm what was causing the crash: reset_interrupts (_DEBUG only) and handle_mode_changed will both try to switch back into Protected Mode if the debugger is re-entered in 86 Mode and the canswitchmode flag is set. The reset function swaps out the lDDebugX / lCDebugX PM handlers for the outer debugger's. The mode changed function takes care of, in this case, converting stored selectors to corresponding segments if they have an 86 Mode compatible base. (There was an unrelated bug if a selector did not match a segment base, introduced much earlier in 2021.)
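To illustrate the selector conversion, here is a rough C sketch of what "86 Mode compatible base" could mean. It assumes that a base is compatible when it is paragraph-aligned and reachable by a 16-bit segment value. The function name is hypothetical and the exact rule used by handle_mode_changed may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert a selector's linear base address
 * to an 86 Mode segment. Assumption: the base is compatible if it
 * is a multiple of 16 and no higher than FFFF0h, the base of
 * segment FFFFh. Returns false and leaves *segment untouched if
 * no matching segment exists. */
static bool base_to_segment(uint32_t base, uint16_t *segment)
{
    if (base > 0xFFFF0u || (base & 0xFu) != 0) {
        return false;
    }
    *segment = (uint16_t)(base >> 4);
    return true;
}
```

Under this assumption a base of 12340h maps to segment 1234h, while a base of 12345h or of 100000h has no corresponding segment.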

Manually clearing this flag in Protected Mode by running the commands s dentrysel:0 l 2000 as dwords dif then r dword [srs:sro] clr= 40000 also avoids the crash. Patching the _DEBUG=0 lDebugX to assume this flag is not set in handle_mode_changed also works.
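In terms of bit operations, the workaround amounts to masking one bit out of a flags dword. The following C sketch assumes that the 40000 given to clr= is a hexadecimal mask for the canswitchmode flag. The names are hypothetical, and the decision function only restates the condition described for reset_interrupts and handle_mode_changed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mask taken from "clr= 40000", read as hexadecimal. */
#define CANSWITCHMODE_MASK UINT32_C(0x40000)

/* Clear the flag, leaving all other bits of the dword unchanged. */
static uint32_t clear_canswitchmode(uint32_t flags)
{
    return flags & ~CANSWITCHMODE_MASK;
}

/* Model of the condition from the text: a switch back into
 * Protected Mode is attempted only if the debugger is re-entered
 * in 86 Mode while the flag is still set. */
static bool would_try_switch_to_pm(bool entered_in_86_mode, uint32_t flags)
{
    return entered_in_86_mode && (flags & CANSWITCHMODE_MASK) != 0;
}
```

With the flag cleared before the client terminates, the modeled condition is false on re-entry in 86 Mode, which matches the observation that the crash goes away.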

I also tried to move the call to handle_mode_changed to after the call to reset_interrupts in lDDebugX so as to allow tracing handle_mode_changed and observing the flag being set, to allow a workaround at the exact position of the check. However, this failed because reset_interrupts also uses the canswitchmode flag and (naturally) cannot be traced by lDebugX. (If lDDebugX first reset its 86 Mode handlers and then the Protected Mode handlers, the latter could be traced after the former had already run. But this is not currently how it is done.) Further, I undid this move because I am unsure whether calling handle_mode_changed only after reset_interrupts would work correctly, or whether the latter requires some of the housekeeping done by handle_mode_changed. So they should stay in the original order.

In conclusion, this is another reason that the _CATCHPMINT214C build option must be set. There is already an additional option, _OVERRIDE_BUILD_PM_DEBUG, which must be used to enable building with _PM=1 _DEBUG=1 _CATCHPMINT214C=0; otherwise an %error is emitted. I added a similar error to the build if the handle_mode_changed function in its current state is used along with _CATCHPMINT214C=0. I added another error if the exception handler uses the entry_to_code_sel transfer function, which assumes a 16-bit stack.

blog/pushbx/2024/0314_the_protected_mode_int_21h_handler_mystery.txt · Last modified: 2024-03-14 09:51:52 +0100 Mar Thu by ecm