2024-04-28
I didn't upload the prior blog post yet as of the time of this writing. Oh well.
This week I worked some on ACEGALS, lDebug, and the MS-DOS v4.01 sources newly released as free software.
The CHG definition now states that CHGed registers are call-clobbered, whereas those listed in neither a CHG nor an OUT block (when a CHG block exists) are call-preserved. These terms are from Stack Overflow's Peter Cordes.
As opposed to the assembler, it is somewhat difficult for the disassembler to detect when it should display a MODRM keyword. The first trick was to add the OP_X operand type extension, so that we don't have to worry about overflowing the 63 allowed sizeless operand types.
Then it came down to inserting a number of (extended) operand types which check for certain conditions, such as the operand being for a word or dword sized register (for inc and dec), or the other operand (in the reg/spare field) being one of the accumulator registers (al, ax, eax) and the operand being a memory address that consists only of a disp16 or disp32 without any address registers (for mov).
Some of the checks only test whether the ModR/M operand encodes a register. This is for push, pop, and forms of mov and ALU instructions that wouldn't be selected by default when assembling with two registers. A mov with immediate source and ModR/M register destination also takes this OPX type.
Initially I thought I could get away without a check for a MODRM keyword already having been displayed. However, it turned out that xchg ax, modrm ax would have the disassembler display two MODRM keywords, because both the "this operand is ax" check and the "the other operand is ax" check for xchg matched. So I enabled the already-displayed check.
I had originally written this check when I added two different checks for the same form of mov. In that case, however, I noticed that only one of them can match at a time: one checks for a register and the other for a memory operand (just a disp16/disp32, with the other operand being al/ax/eax).
Two different operands can never both be flagged with a MODRM keyword because only one operand can be a ModR/M operand and the keyword is only ever displayed for ModR/M operands.
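The interplay of the two xchg checks and the already-displayed flag can be sketched roughly as follows. This is a minimal Python illustration, not lDebug's actual code (which is written in assembly). The function names, the dedupe parameter, and the modelling of checks as callables are all assumptions made up for this example.

```python
# Hypothetical sketch; none of these names exist in lDebug.

def format_modrm_operand(text, checks, dedupe=True):
    """Prefix the ModR/M operand text with the keyword for matching checks.

    With dedupe enabled, the keyword is emitted at most once, which
    models the "already displayed" check.
    """
    parts = []
    displayed = False
    for check in checks:
        if check() and not (dedupe and displayed):
            parts.append("modrm")
            displayed = True
    parts.append(text)
    return " ".join(parts)


def disassemble_xchg(reg, rm, dedupe=True):
    """Render an xchg whose second operand is a ModR/M register operand."""
    checks = [
        lambda: rm == "ax",   # "this operand is ax"
        lambda: reg == "ax",  # "the other operand is ax"
    ]
    return f"xchg {reg}, {format_modrm_operand(rm, checks, dedupe)}"


print(disassemble_xchg("ax", "ax"))                # xchg ax, modrm ax
print(disassemble_xchg("ax", "ax", dedupe=False))  # xchg ax, modrm modrm ax
```

With dedupe=False both checks fire for the same operand and the keyword is doubled. With the flag enabled, the second match is suppressed.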
This week, the MS-DOS version 4.01 sources were released as free software, along with all the tools (as binaries only) needed to rebuild the entire release. Of course I immediately played with it some.
ecm says: April 26, 2024 at 11:24 pm
The line endings actually were problematic for nosrvbld.exe and for exe2bin which has its stdin redirected to supply a default answer to a prompt for a relocation segment (done using int 21h service 0Ah). https://www.os2museum.com/wp/how-not-to-release-historic-source-code/#comment-381254
ecm says: April 28, 2024 at 11:21 am
@felsqualle
That’s me. The commands don’t really “fix” things, especially not select. They just make it so the build can finish. The sed avoids the too long assembly source lines. The unix2dos fixes the files that come out of git with LF line endings (at least on Linux). When I made these adjustments I didn’t even know how broken select is. Probably someone will have to identicalise select using a known good build to restore the correct text. https://www.os2museum.com/wp/how-not-to-release-historic-source-code/#comment-381457
I had similar errors in mapper/getmsg.asm, select/select2.asm, and select/usa.inf.
For me these turned out to be errors involving a single-byte character being expanded to a three-byte UTF-8 encoded value. When nearly a whole line (of 80 columns) is filled with these EF BF BD sequences, MASM (the version shipped with the repo) doesn't like the total line length in bytes. After first working around the problem by manually deleting a few lines, I eventually fixed it by running:
sed -i -re 's/\xEF\xBF\xBD|\xC4\xBF|\xC4\xB4/#/g' FILENAMES…
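To illustrate why the affected lines grew too long, here is a small Python sketch of the same substitution working on raw bytes. It is an illustration only: the helper name shrink_line and the sample line are made up, and the byte patterns are copied from the sed command above.

```python
import re

# Same alternatives as in the sed expression, matched as raw bytes.
PATTERN = re.compile(rb"\xEF\xBF\xBD|\xC4\xBF|\xC4\xB4")


def shrink_line(line: bytes) -> bytes:
    """Replace each listed multi-byte sequence with a single '#' byte."""
    return PATTERN.sub(b"#", line)


# Made-up sample: 78 single-byte characters that each became U+FFFD,
# which UTF-8 encodes as the three bytes EF BF BD.
sample = "\ufffd".encode("utf-8") * 78
print(len(sample), len(shrink_line(sample)))  # 234 78
```

Each replaced character goes back to occupying one byte, so a line that was 80 columns wide originally fits within 80 bytes again.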
I had many more errors, starting with nosrvbld.exe running on e.g. boot/boot.skl complaining it couldn't find something. The name of that something was displayed as a bunch of gibberish text, however. This, and some other problems, turned out to be because git and/or Microsoft spat out text files with LF line endings whereas many of the DOS tools expect CR LF line endings. The following command worked for me:
find -iname '*.bat' -o -iname '*.asm' -o -iname '*.skl' -o -iname 'zero.dat' -o -iname 'locscr' | xargs unix2dos -f
(The zero.dat and locscr files are used to redirect numbers into exe2bin's relocation number prompt. Failure to unix2dos those ended up hanging dosemu2 and/or ConnectBot, and spamming notifications to ConnectBot. Likely related to the old problem of int 21h service 0Ah not properly detecting and handling an EOF.)
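The line ending conversion itself is simple. As a rough Python sketch of the idea (only an approximation of what unix2dos does, with a made-up function name), every LF that isn't already preceded by a CR gets a CR inserted before it:

```python
import re


def to_crlf(data: bytes) -> bytes:
    """Turn every bare LF into CR LF, leaving existing CR LF pairs alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


print(to_crlf(b"line one\nline two\r\n"))  # b'line one\r\nline two\r\n'
```

Working on bytes rather than text avoids any dependence on the encoding of the files, which matters for sources that contain non-ASCII bytes.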
To successfully compile the C program parts, I had to fix the setenv.bat script (apart from using another drive) to point to the headers and libraries that actually ship with the repo:
$ cat src/e.bat
@echo off
echo setting up system to build the MS-DOS 4.01 SOURCE BAK...
set CL=
set LINK=
set MASM=
set COUNTRY=usa-ms
set BAKROOT=e:
rem BAKROOT points to the home drive/directory of the sources.
set LIB=%BAKROOT%\src\tools\bld\lib
set INIT=%BAKROOT%\src\tools
set INCLUDE=%BAKROOT%\src\tools\bld\inc
set PATH=%BAKROOT%\src\tools;%PATH%

To boot the kernel, I used lDebug's command BOOT PROTOCOL MSDOS6 hda1/ to boot off a file system that I created using my bootimg.asm script. I had to set -D_OEM_NAME="'IBM 3.1'" to make the kernel accept my file system; with my default it calculated a wrong (at least) Sectors per Cluster value of 4 and failed to load the DOS module. This is my command line to create the image:
nasm -I ../../lmacros/ -I ../../bootimg/ ../../bootimg/bootimg.asm -D_PAYLOADFILE="io.sys,msdos.sys,mem.exe,sys.com,../../ldebug/bin/callver.com,::rename,../../../.dosemu/drive_c/command.com,freecom.exe,command.com" -o disk.img -D_MEDIAID=0F8h -D_BPE=16 -D_ERROR_SMALL32=0 -D_SPF=256 -D_SPI=128000 -D_SPC=2 -D_MBR -D_MBR_PART_TYPE=fat16_chs -D_CHS_HEADS=128 -D_CHS_SECTORS=32 -D_OEM_NAME="'IBM 3.1'"
Regards, ecm https://sourceforge.net/p/freedos/mailman/message/58765259/