blog:pushbx:2023:0707_msdebug_ldebug_x2b2_warplink_2023_june_work [2023-07-07 11:23:16 +0200 Jul Fri] (current)
ecm created
 +====== MSDebug, lDebug, x2b2, Warplink (2023 June work) ======
 +
 +**2023-06-25**
 +
 +I spent part of the last three weeks on vacation, but that doesn't mean we didn't get any work done.
 +
 +I don't know when I will be able to upload this blog post, as our main desktop (Linux) box is experiencing some difficulties. RAM has already been swapped, and a new PSU has arrived by post.
 +
 +As usual, some work happened on two of the debuggers, as well as on X2B (a public domain "EXE to binary" tool) and on a port of WarpLink from the original TASM / MASM sources to sources that can be assembled with a recent NASM.
 +
 +
 +===== MSDebug =====
 +
 +Other than some documentation updates, a bug in the debugger's M command was fixed. This bug affected overlapping moves in which the segments compared in the opposite order to the linear addresses. Previously, the comparison that decides whether to move forwards or backwards was done on the full 32 bits of the segmented address. That does not adequately deal with the fact that one linear address can be addressed by up to 4096 different segmented addresses.
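
The mismatch between the two orderings can be illustrated with a small Python sketch. This is not MSDebug's code: the helper names are hypothetical, and the sketch assumes plain 8086 real-mode addressing without the wrap at 1 MiB.

```python
def linear(segment, offset):
    """Linear address of a real-mode segment:offset pair (no 1 MiB wrap)."""
    return (segment << 4) + offset


def seg32(segment, offset):
    """The segmented address read as one 32-bit number, segment in the high word."""
    return (segment << 16) | offset


def count_aliases(address):
    """Count the segment:offset pairs that reach the given linear address."""
    return sum(
        1
        for segment in range(0x10000)
        if 0 <= address - (segment << 4) <= 0xFFFF
    )


# Hypothetical example addresses: the segments compare one way,
# the linear addresses compare the other way.
src = (0x1000, 0x0100)  # linear 0x10100
dst = (0x0FFF, 0x0120)  # linear 0x10110

src_below_dst_segmented = seg32(*src) < seg32(*dst)
src_below_dst_linear = linear(*src) < linear(*dst)
aliases = count_aliases(linear(*src))
```

Here the source lies below the destination in linear terms, so an overlapping move has to copy backwards, while the 32-bit comparison of the segmented addresses suggests the opposite direction.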
 +
 +The other code change makes most of the debugger's code run with interrupts enabled (EI). The ''saveints'' function disables interrupts during its IVT access; the solution is to enable them again after some calls to this function.
 +
 +
 +===== wwwecm.scr =====
 +
 +The only change was to add the x2b2 current release to the build script.
 +
 +
 +===== lDebug =====
 +
 +The debugger will now always write the "ALASAP" and CALL 5 fields of PSPs that it creates to ensure valid and compatible values.
 +
 +The parser now accepts quote marks as separators, allowing commands such as ''E 100 0"foo"0'' to write 5 bytes. This improves compatibility with MSDebug.
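
As an illustration of why that command yields 5 bytes, here is a minimal Python sketch of a byte-list parser in which a quote mark both ends a number and starts a string. It is not lDebug's parser; the function name is hypothetical, and doubled quote marks inside strings are deliberately not handled.

```python
import string


def parse_byte_list(text):
    """Parse hexadecimal byte values and quoted strings into bytes."""
    out = bytearray()
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch in " \t,":
            # Ordinary separators between list items.
            i += 1
        elif ch in "\"'":
            # A quote mark starts a string and also acts as a separator.
            end = text.find(ch, i + 1)
            if end < 0:
                raise ValueError("unterminated string")
            out += text[i + 1:end].encode("ascii")
            i = end + 1
        else:
            # Collect hexadecimal digits up to the next non-digit.
            j = i
            while j < n and text[j] in string.hexdigits:
                j += 1
            if j == i:
                raise ValueError(f"unexpected character {ch!r}")
            value = int(text[i:j], 16)
            if value > 0xFF:
                raise ValueError("byte value out of range")
            out.append(value)
            i = j
    return bytes(out)


data = parse_byte_list('0"foo"0')
```

The result holds a zero byte, the three characters of the string, and another zero byte.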
 +
 +Some bugs in the F RANGE command were fixed, particularly for the case where the second range is in a small segment (limit < 64 KiB) and no length is specified for the second range.
 +
 +Several options for MSDebug compatibility were added:
 +
 +  * Treat 0-length ranges as 64 KiB, except U which treats them as 1 B
 +  * Display additional blanks in 80-column 16-bit R dump
 +  * Display variable prompts for ''R F'' and ''R reg'' commands in the same way as MSDebug
 +  * Shorten disassembly opcode field width
 +
 +
 +===== x2b2 =====
 +
 +In response to a Stack Overflow question I recommended using either the ''exe2bin'' command of the 2018 free software MS-DOS v2 release (which is free but comes without sources) or the Public Domain X2B (which comes with sources). With my attention on that, I improved a variant of the latter, which I called X2B2.
 +
 +Instead of a separate build that ignores some error conditions, I added parsing for some switches that select which specific conditions to ignore. I also fixed the IP ignore part to handle a wrong IP as if an IP of zero had been specified. (Open question: Should we check for CS equal to +0 as well?)
 +
 +The program also checks for a short read of the MZ EXE header now. Additionally, a short read of the image is only allowed if a switch is passed. Without the switch this short read is also handled as an error.
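
The kind of checks described here can be sketched in Python as follows. This is not X2B2's code: the function and parameter names are hypothetical stand-ins for the switches, and accepting only an initial IP of zero is a simplification. Only the layout of the MZ header is taken as given.

```python
import struct

# Signature plus 13 little-endian words: the first 28 bytes of an MZ header.
MZ_HEADER_FORMAT = "<2s13H"
MZ_HEADER_SIZE = struct.calcsize(MZ_HEADER_FORMAT)


def extract_image(data, allow_short_image=False, ignore_ip=False):
    """Return the load image of an MZ executable held in data."""
    if len(data) < MZ_HEADER_SIZE:
        raise ValueError("short read of MZ EXE header")
    fields = struct.unpack_from(MZ_HEADER_FORMAT, data)
    signature = fields[0]
    last_page_bytes, pages, _relocations, header_paragraphs = fields[1:5]
    initial_ip = fields[10]
    if signature not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    # File size as recorded in the header, in 512-byte pages.
    file_size = pages * 512
    if last_page_bytes:
        file_size -= 512 - last_page_bytes
    header_size = header_paragraphs * 16
    if file_size < header_size:
        raise ValueError("inconsistent header sizes")
    image = data[header_size:file_size]
    if len(image) < file_size - header_size and not allow_short_image:
        raise ValueError("short read of image")
    if initial_ip != 0 and not ignore_ip:
        raise ValueError("initial IP is not zero")
    return image


# A tiny hand-made example: 32-byte header followed by a 4-byte image.
header = struct.pack(MZ_HEADER_FORMAT, b"MZ", 36, 1, 0, 2, 0, 0xFFFF,
                     0, 0, 0, 0, 0, 0x1C, 0)
example = header + bytes(4) + b"\x90\x90\x90\xc3"
image = extract_image(example)
```

Passing a truncated copy of the example raises an error unless ''allow_short_image'' is set, which mirrors the idea of making the lenient behaviour opt-in.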
 +
 +The buffers and variables are aligned variously. The pathname buffers cannot overflow any longer. An initial ''cl'' other than zero is no longer a problem.
 +
 +Another switch allows files exceeding 64 KiB, with something similar to the prior limit implemented as an optional check.
 +
 +The pathnames may now contain dollar signs without corrupting the output. Further, spurious NUL bytes are no longer displayed after the pathnames.
 +
 +
 +===== WarpLink =====
 +
 +The WarpLink source release of version 2.70, released into the Public Domain as of 1999-11-05, was the basis for our repo. About 30 assembly source files are built together to form the main executable of the linker. These were written to target MASM or TASM.
 +
 +I created a script called ''fixmem.pl'' which initially was intended to fix only the memory accesses done without square brackets. I started out editing the first file manually, but soon found I could automate some of my edits.
 +
 +The core purpose of the script required parsing variable definitions as well as ''EXTRN'' directives to learn the sizes of symbols. For simplicity, the size must be learned before it is used, though running the script with the same input file specified twice could likely work around this requirement. The WarpLink sources needed here are well-behaved in this respect, however.
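
A rough idea of that size-learning pass, written in Python rather than the Perl of ''fixmem.pl'': the sketch below is an assumption about the approach, not the script's code. The names in the sample lines are invented, and splitting at the first semicolon ignores semicolons inside strings.

```python
import re

SIZE_KEYWORDS = {"BYTE": 1, "WORD": 2, "DWORD": 4}
DEFINE_DIRECTIVES = {"DB": 1, "DW": 2, "DD": 4}

EXTRN_RE = re.compile(r"^\s*EXTRN\s+(.*)$", re.IGNORECASE)
DEFINE_RE = re.compile(r"^\s*([A-Za-z_@$?][\w@$?]*)\s+(DB|DW|DD)\b",
                       re.IGNORECASE)


def learn_sizes(lines):
    """Map symbol names to element sizes in bytes, in source order."""
    sizes = {}
    for line in lines:
        code = line.split(";", 1)[0]  # drop the comment part
        match = EXTRN_RE.match(code)
        if match:
            for item in match.group(1).split(","):
                name, _, type_name = item.strip().partition(":")
                size = SIZE_KEYWORDS.get(type_name.strip().upper())
                if size is not None:  # NEAR, FAR and the like carry no data size
                    sizes[name.strip()] = size
            continue
        match = DEFINE_RE.match(code)
        if match:
            sizes[match.group(1)] = DEFINE_DIRECTIVES[match.group(2).upper()]
    return sizes


sample = [
    "EXTRN buffer_end:WORD, is_ddl:BYTE, exit_link:NEAR",
    "counter DW 0 ; a word variable",
    "name_buf db 128 dup (?)",
]
sizes = learn_sizes(sample)
```

Because the table is filled in source order, a symbol used before its definition would not have a size yet, which is the limitation mentioned above.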
 +
 +Some accesses that already had brackets still needed a size to be added.
 +
 +8086 instructions with one or two explicit operands needed to be handled. The one-operand form is simpler: If the operand is a single 8086 register name without ''SIZE PTR'' keywords and without brackets, then it is a register operand. Anything else requires brackets, if it doesn't have them yet.
 +
 +Two-operand forms are more difficult. If an operand is not a register name, and not an equate name, and does not come with a prepended ''OFFSET'' keyword, then it must be a memory access. If it comes with ''SIZE PTR'' or brackets already, it is also a memory access. (This is always true, except for ''JMP NEAR PTR'' which is a direct branch, despite the ''PTR'' keyword.)
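
The two-operand rule can be expressed as a small Python predicate. This is a sketch under assumptions, not the logic of ''fixmem.pl'': the function name is hypothetical, equate names are expected in lowercase, and the check for numeric literals is an addition of mine that the rule above does not spell out.

```python
import re

REGISTERS = {
    "ax", "bx", "cx", "dx", "si", "di", "bp", "sp",
    "al", "ah", "bl", "bh", "cl", "ch", "dl", "dh",
    "cs", "ds", "es", "ss",
}
SIZE_PTR_RE = re.compile(r"\b(?:BYTE|WORD|DWORD)\s+PTR\b", re.IGNORECASE)
NUMBER_RE = re.compile(r"^[0-9][0-9a-f]*[hbdoq]?$", re.IGNORECASE)


def is_memory_operand(operand, equates=frozenset(), mnemonic=""):
    """Decide whether a MASM-style operand denotes a memory access."""
    text = operand.strip()
    lowered = text.lower()
    # JMP NEAR PTR is a direct branch despite the PTR keyword.
    if mnemonic.lower() == "jmp" and re.match(r"near\s+ptr\b", lowered):
        return False
    # Brackets or SIZE PTR always mean a memory access.
    if "[" in text or SIZE_PTR_RE.search(text):
        return True
    if lowered in REGISTERS or lowered in equates:
        return False
    if re.match(r"offset\b", lowered):
        return False
    if NUMBER_RE.match(text):  # assumption: plain numbers are immediates
        return False
    # Anything left is taken to be a variable name, so a memory access.
    return True
```

For example, ''is_memory_operand("counter")'' is true while ''is_memory_operand("OFFSET counter")'' is false, so only the former would get brackets added.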
 +
 +One-operand forms always have to specify the access size of a memory operand. Two-operand forms of memory accesses do not have to specify a size if the other operand is a register. The exception is a shift or rotate instruction whose destination (first) operand is a memory operand and whose source operand is ''cl''; in that case the size of the memory operand also needs to be specified.
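
These size rules fit into a short Python function. It is only a sketch: the function name and the ''mem'' / ''reg'' / ''other'' classification are hypothetical, and it assumes the operands have already been classified.

```python
SHIFTS_AND_ROTATES = {"shl", "sal", "shr", "sar", "rol", "ror", "rcl", "rcr"}


def needs_explicit_size(mnemonic, operand_kinds, source_text=""):
    """Tell whether a memory operand needs an explicit size keyword.

    operand_kinds lists 'mem', 'reg' or 'other' per operand, with the
    destination first; source_text is the source operand as written.
    """
    if "mem" not in operand_kinds:
        return False
    if len(operand_kinds) == 1:
        # One-operand forms always need the size.
        return True
    destination, source = operand_kinds
    if (mnemonic.lower() in SHIFTS_AND_ROTATES
            and destination == "mem"
            and source_text.strip().lower() == "cl"):
        # The count register says nothing about the destination's size.
        return True
    other = source if destination == "mem" else destination
    # A register as the other operand implies the size.
    return other != "reg"
```

So ''mov [counter], ax'' needs no size keyword, whereas ''inc [counter]'' and ''shl [counter], cl'' do.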
 +
 +Soon the next hurdle came up: Structure definitions. Instead of editing all the definitions, we relied on recent NASM supporting the ''db ?'' style directives for reserving space, as well as the ''db NUMBER dup (?)'' directives to reserve multiple units of space. That left us with the ''STRUC'' and ''ENDS'' directives that are prefixed by the structure names. This is not very difficult to handle with the little-known ''%00'' parameter type to a NASM multi-line macro. The macros for that went into the ''nasm.mac'' macro file.
 +
 +This file also got other extensions, such as defining the ''OFFSET'' and ''PTR'' keywords as empty, as well as handling the ''PROC'', ''PROC NEAR'', ''ENDP'', or ''LABEL'' directives. Further, the ''DOSSEG'' style segment definitions went there, along with ''.DATA'', ''.DATA?'', ''.CONST'', and ''.CODE'' directives to switch to different segments.
 +
 +Additionally, ''PUBLIC'' and ''EXTRN'' directives were implemented as macros too. The ''EXTRN'' macro is nontrivial, as it has to parse and drop the size or type specifications appended to the symbol's label name with a colon. It was considered less work to handle this in a NASM macro than to change all the directives in the ported sources. The build time performance is probably not very good, but is acceptable.
 +
 +A particularly good part of ''nasm.mac'' is the EVEN directive. It checks whether it is used in the ''_BSS'' segment or not, and will automatically expand to either ''alignb 2'' or ''align 2''. This is better than TASM's implementation of the EVEN directive, because that one will emit a warning whenever it needs to reserve a byte in a BSS section. (I had this concern previously for MSDebug using EVEN directives in space-reserving parts of the sources.)
 +
 +Subsequently, ''fixmem.pl'' learned a lot of new tricks. ''IFDEF'' directives get a percent sign prepended to be handled by the NASM preprocessor. ''COMMENT'' directives work the same way as in TASM. (I found that the handling of these is very line-oriented even in TASM, and presumably likewise in MASM.) Structure definitions are parsed and remembered specifically to emit proper ''istruc'' uses and "local" labels when a structure instance is to be expanded. Some directives are commented out. ''INCLUDE'' directives are changed to NASM preprocessor directives, with the filenames uncapitalised and ''.inc'' replaced by ''.mac'' as the filename extension. Assignments of the form ''label = $'' are replaced by equates with ''equ''. ''DGROUP:'' and ''_TEXT:'' prefixes are replaced by ''wrt'' specifiers for NASM.
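
Three of these line rewrites are simple enough to sketch with regular expressions. The Python below illustrates the idea only; it is not taken from ''fixmem.pl'', the function name is hypothetical, and the file and symbol names in the usage lines are invented.

```python
import re

IFDEF_RE = re.compile(r"^(\s*)(IFDEF\b.*)$", re.IGNORECASE)
INCLUDE_RE = re.compile(r"^(\s*)INCLUDE\s+(\S+)(.*)$", re.IGNORECASE)
ASSIGN_HERE_RE = re.compile(r"^(\s*)(\S+)\s*=\s*\$\s*$")


def convert_line(line):
    """Rewrite one MASM/TASM source line towards NASM syntax."""
    match = IFDEF_RE.match(line)
    if match:
        # Prepend a percent sign so the NASM preprocessor handles it.
        return f"{match.group(1)}%{match.group(2)}"
    match = INCLUDE_RE.match(line)
    if match:
        # Lowercase the filename and swap the .inc extension for .mac.
        name = match.group(2).lower()
        if name.endswith(".inc"):
            name = name[:-4] + ".mac"
        return f'{match.group(1)}%include "{name}"{match.group(3)}'
    match = ASSIGN_HERE_RE.match(line)
    if match:
        # 'label = $' becomes an equate.
        return f"{match.group(1)}{match.group(2)} equ $"
    return line


converted = [
    convert_line("IFDEF DLLSUPPORT"),
    convert_line("INCLUDE MLEQUATE.INC"),
    convert_line("buffer_end = $"),
]
```

Lines that match none of the patterns pass through unchanged, which keeps such a converter safe to run over already ported files.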
 +
 +''jmp'' and ''call'' instructions with ''DWORD PTR'' keywords are changed to refer to ''far'' memory operands with brackets. Segment override prefixes in front of an opening bracket are moved into the brackets. ''lods'' instructions with explicit operands (usually to specify a segment override) are transformed into NASM-style instructions.
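
The first two transformations might look roughly like this in Python. Again this is a sketch with hypothetical function names rather than the script's Perl code, and it does not try to protect comments or strings on the line.

```python
import re

OVERRIDE_RE = re.compile(r"\b(cs|ds|es|ss):\s*\[", re.IGNORECASE)
FAR_BRANCH_RE = re.compile(r"^(\s*(?:jmp|call)\s+)DWORD\s+PTR\s+(.*?)\s*$",
                           re.IGNORECASE)


def move_override_into_brackets(line):
    """Turn 'es:[di+2]' into '[es:di+2]'."""
    return OVERRIDE_RE.sub(lambda m: f"[{m.group(1)}:", line)


def convert_far_branch(line):
    """Turn 'jmp DWORD PTR target' into 'jmp far [target]'."""
    match = FAR_BRANCH_RE.match(line)
    if not match:
        return line
    operand = move_override_into_brackets(match.group(2))
    if not operand.startswith("["):
        operand = f"[{operand}]"
    return f"{match.group(1)}far {operand}"


examples = [
    move_override_into_brackets("mov al,es:[di+2]"),
    convert_far_branch("call DWORD PTR ds:[bx]"),
    convert_far_branch("jmp DWORD PTR old_vector"),
]
```

The branch conversion reuses the override rewrite, so an operand such as ''ds:[bx]'' ends up as a single bracketed NASM operand.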
 +
 +One special case that came up late into the work was that the file ''mlddl2.nas'' had structure definitions matching those in ''mlddl1.nas''. But the global variables of these structure types were referenced in ''mlddl2.nas'' with simply a single symbol of ''byte'' type. Despite that, the sources referred to structure fields of these globals with the dot syntax. I invented a new directive for these cases, which dumps equates for all the structure fields of a given structure definition, prepended by a given label and a dot.
 +
 +The new directive also causes ''fixmem.pl'' to infer the symbol size for all of these equates. It may be the better choice to dump these symbol sizes in a recoverable form into the output; for now the sizes are only recorded internally by the script.
 +
 +Yesterday I finished the port of all source files required for the main linker executable. It can be built entirely in DOS (using the DPMI executable released by NASM) or mostly on the Linux host (using the host's native NASM). Only the last command, the linking, **must** be run inside DOS. This is the WarpLink command to link all OMF object files together to create a DOS .EXE file.
 +
 +The resulting file has a similar size to the original, and the final link step can be carried out by running the resulting file itself. This latter operation results in exactly the same file as created by the released WarpLink executable. So it seems to work.
 +
 +
 +{{tag>msdebug ldebug x2b2 warplink fixmem}}
 +
 +
 +~~DISCUSSION~~
  
blog/pushbx/2023/0707_msdebug_ldebug_x2b2_warplink_2023_june_work.txt · Last modified: 2023-07-07 11:23:16 +0200 Jul Fri by ecm