lDebug work in mid early April

2024-04-14

Documentation or comment only changes

Code changes

In detail

Push or pop memory without size

I noticed that MSDebug doesn't require a specification of the operand size for a push or pop with an explicit memory operand. While it does accept the "word" or "word ptr" size specifier, it assembles the same without it. This is likely because it has no notion of 32-bit operands so the size can always be implied as word.

I noticed that Debug/X changed its call, jmp, and push instructions so as to do the equivalent of our OP_1632_DEFAULT, but only in the assembler. This is done with a hack in ao13, which activates only for a (hardcoded) operand list index of 40. This is the "OP_1632+OP_RM" operand list.

There are several problems with this approach in Debug/X:

  • The operand list index number is hardcoded, thus fragile
  • The hack is only in the assembler, thus the disassembler will always display the size (I found this desirable for push in lDebug, but not for call or jmp)
  • The pop mem instruction uses a different operand list, hence its size must be specified
  • The push imm instruction can make use of lDebug's OP_1632_DEFAULT as well

The lDebug approach to push and pop with a memory operand is instead to use another new operand size type, OP_1632_DEFAULT_ASM. This allows a default size in the assembler if none is specified. However, the disassembler will always display the size. By contrast, the disassembler omits the default size for call and jmp (far or near) as well as for push imm.

MSDebug also omits the size for call and jmp (like lDebug) but also for push and pop (unlike lDebug). This is because there is no non-default size for push and pop in MSDebug. In lDebug, disassembly of push and pop will continue to display a word size even where the disassembler could omit it. (Perhaps a new Debugger Assembler Option flag is in order?)
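The difference between the size types can be summarised in a small model. This is only a sketch in Python, not lDebug's code: the table layout and the function names assemble_size and disassemble are hypothetical, and the assumption that plain OP_1632 requires an explicit size is inferred from the description above.

```python
# Hypothetical model of the operand size types discussed above.
# asm_default: the assembler may fill in a default size.
# disasm_omits_default: the disassembler hides a size equal to the default.
SIZE_TYPES = {
    "OP_1632": {"asm_default": False, "disasm_omits_default": False},
    "OP_1632_DEFAULT": {"asm_default": True, "disasm_omits_default": True},
    "OP_1632_DEFAULT_ASM": {"asm_default": True, "disasm_omits_default": False},
}


def assemble_size(size_type, explicit_size=None, default_size="word"):
    """Return the operand size the assembler would use."""
    if explicit_size is not None:
        return explicit_size
    if SIZE_TYPES[size_type]["asm_default"]:
        return default_size
    raise ValueError("operand size must be specified")


def disassemble(mnemonic, operand, size_type, size, default_size="word"):
    """Return the text the disassembler would show for one memory operand."""
    flags = SIZE_TYPES[size_type]
    if flags["disasm_omits_default"] and size == default_size:
        return f"{mnemonic} {operand}"
    return f"{mnemonic} {size} {operand}"


print(disassemble("push", "[bx]", "OP_1632_DEFAULT_ASM", assemble_size("OP_1632_DEFAULT_ASM")))
print(disassemble("jmp", "[bx]", "OP_1632_DEFAULT", assemble_size("OP_1632_DEFAULT")))
```

In this model, push [bx] assembles without a size but is displayed with one, while jmp [bx] both assembles and disassembles without it.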

Optimising 16-bit immediates and displacements

The FreeDOS Debug/X and lDebug assemblers would not optimise positive 16-bit immediates to negative sign-extended 8-bit immediates when the values would allow this. The problem was that the assembler checked for sign extension to 32 bits, requiring either a negative value (-1) or a 32-bit positive value (FFFF_FFFF). A plain 16-bit value such as FFFF did not qualify.

Reported on Stack Exchange, in a comment by user3840170:

"I got misled by FreeDOS DEBUG, which didn’t use that encoding when I wrote adc dx, ffff. Now that I checked, adc dx, -1 does trigger it, and the former does as well in MS-DOS DEBUG."

MSDebug never had this problem because it only ever supported 16-bit immediates, so its sign extension check presumably always operates on 16 bits.
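To illustrate the missed optimisation, here is a minimal sketch in Python. It is not the debugger's actual code (which is assembly). The function names are hypothetical, and it assumes the parsed immediate is held as a 32-bit value. The opcode bytes are the standard encodings of adc dx with an 8-bit sign-extended immediate (83 D2 ib) and with a 16-bit immediate (81 D2 iw).

```python
MASK32 = 0xFFFFFFFF


def is_sign_extended_32(value):
    """Check as described for the old code: byte sign-extended to 32 bits."""
    value &= MASK32
    return value <= 0x7F or value >= 0xFFFFFF80


def is_sign_extended_16(value):
    """Check that operates on 16 bits only, as MSDebug presumably does."""
    value &= MASK32
    if value > 0xFFFF:
        return False
    return value <= 0x7F or value >= 0xFF80


def encode_adc_dx(value, use_short_form):
    """Encode 'adc dx, value' in 16-bit code, short or long form."""
    if use_short_form:
        return bytes([0x83, 0xD2, value & 0xFF])
    return bytes([0x81, 0xD2, value & 0xFF, (value >> 8) & 0xFF])


for text, value in (("ffff", 0xFFFF), ("-1", -1)):
    short = is_sign_extended_32(value)
    print(text, encode_adc_dx(value, short).hex())
```

With the 32-bit check, ffff selects the long form and only -1 selects the short form. With the 16-bit check, ffff qualifies for the short form as well.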

Four different classes of this missed optimisation were identified:

  • ALU instructions with a sign-extended 8-bit immediate
  • imul with a sign-extended 8-bit immediate
  • push with a sign-extended 8-bit immediate
  • Displacements in ModR/M memory operands

Several changesets addressed each class in turn. This seems to handle all cases of the problematic sign extension checks.

Some problems remain: The ALU, imul, and push cases will accept FFFF_FFFF the same as if -1 were specified. The displacement case correctly rejects FFFF_FFFF while allowing -1. This requires more study. (Do note that numbers can be specified using the expression evaluator by surrounding the expression with round parentheses. It may be desirable to accept +(-1) the same as -1.)

The actual implementation of each of the checks is done as follows:

If OP_IMMS8 is not the first operand, and the first operand is known to have a SIZ_WORD, then accept sign-extension if the high word is zero (thus allowing positive 16-bit values). (This catches the ALU and imul cases.)

If OP_IMMS8 is the first operand, and there is a W suffix to the mnemonic (detected from OP_SHOSIZ's setting of byte [opsize]), likewise accept sign-extension if the high word is zero. (This catches the pushw case.)

If OP_IMMS8 is the first operand, and there is no D suffix, and the CS D-bit is clear (16-bit CS), accept sign-extension if the high word is zero. (This catches the push case without a suffix in a 16-bit CS.)

If there is a ModR/M memory addressing mode being assembled, and there are one or two 16-bit address registers (no a32), then accept sign-extension if the high word of the displacement is zero.

In all of these cases, sign-extension with the sign bit cleared automatically passes the new check as well because in that case the high word must be zero too.
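The four conditions can be sketched as predicates that decide whether the relaxed "high word is zero" rule applies. This is an illustrative Python model only. ImmContext, its fields, and all function names are hypothetical and do not correspond to lDebug's source. The model also does not distinguish -1 from FFFF_FFFF, so it does not capture the remaining difference between the immediate and displacement cases noted above.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass
class ImmContext:
    imms8_is_first_operand: bool
    first_operand_is_word: bool = False  # models a SIZ_WORD first operand
    suffix: str = ""                     # "", "W" or "D" mnemonic suffix
    cs_is_32bit: bool = False            # models the CS D-bit


def word_rule_for_immediate(ctx):
    """True if a zero high word should be accepted for an OP_IMMS8 operand."""
    if not ctx.imms8_is_first_operand:
        return ctx.first_operand_is_word          # ALU and imul cases
    if ctx.suffix == "W":
        return True                               # pushw case
    return ctx.suffix != "D" and not ctx.cs_is_32bit  # push in a 16-bit CS


def word_rule_for_displacement(num_16bit_address_regs, a32=False):
    """True if a zero high word should be accepted for a displacement."""
    return not a32 and num_16bit_address_regs in (1, 2)


def accepts_signed_byte(value, word_rule):
    """Accept value if it is a byte sign-extended to 32 bits, or,
    when word_rule applies, a byte sign-extended to 16 bits with a
    zero high word."""
    value &= MASK32
    low = value & 0xFF
    ext32 = low | (0xFFFFFF00 if low & 0x80 else 0)
    if value == ext32:
        return True
    return word_rule and value == (ext32 & 0xFFFF)


push16 = ImmContext(imms8_is_first_operand=True)
push32 = ImmContext(imms8_is_first_operand=True, cs_is_32bit=True)
print(accepts_signed_byte(0xFFFF, word_rule_for_immediate(push16)))
print(accepts_signed_byte(0xFFFF, word_rule_for_immediate(push32)))
```

Values from 0 to 7F pass the first comparison regardless of the rule, which matches the observation that a cleared sign bit implies a zero high word.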

Also, the OP_IMM8 operand type (e.g. int, out, in, shifts and rotates) only wants to check for zero-extension. This does not need to be handled specifically: at worst the special casing redundantly checks for a zero high word, which always handles the zero-extension case correctly.

blog/pushbx/2024/0414_ldebug_work_in_mid_early_april.txt · Last modified: 2024-04-14 18:21:16 +0200 Apr Sun by ecm