Early June work

2024-06-09

lDebug

The load_unit_flags equates are moved into iniload.mac, and a data link is added for load_unit_flags. The chstool Extension for lDebug will now use these flags.

instsect

Add flag _LBA_WORKAROUND to the LBA detection search strings.

Not yet uploaded: Detect duplicates in the LBA search strings, and eventually avoid writing them.

ldosboot

Add _LBA_WORKAROUND code to _LBA_SKIP_CHECK=0 handling. This will use the Book8088 Xi8088 BIOS bug workaround of setting ds = 40h for interrupt 13h function 41h.

Enhanced DR-DOS

UMBs not enabled on JemmEx

Today a report came in that EDR-DOS (single-file load loaded from lDebug) didn't allocate UMBs from JemmEx. I suggested moving the dos=umb line to after the device= line for JemmEx. This works. It appears that EDR-DOS processes these directives more linearly than other DOS kernels.

LBA not used properly (CD bug)

A user on the BTTR Software forum reported that CD commands fail for some directories. They guessed that the affected directories may have start clusters beyond 64 Ki (65,536) on a FAT32 drive.

This turned out to be wrong, but by creating a file system with a directory beyond 64 Ki clusters I was able to reproduce the problem proper on qemu. Confusingly enough, not moving the directory that far up made my tests succeed, so I initially guessed the assumption might be correct.
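
The reporter's guess was plausible on its face: a FAT32 directory entry stores the start cluster as two separate 16-bit words, so code that reads only the low word would break exactly at 64 Ki clusters. A minimal Python sketch of that layout follows. The function names are made up for illustration, and nothing here is taken from the EDR-DOS source.

```python
import struct

def first_cluster(dir_entry: bytes) -> int:
    """Return the FAT32 start cluster stored in a 32-byte directory entry."""
    if len(dir_entry) != 32:
        raise ValueError("a FAT directory entry is 32 bytes long")
    # Offset 20 holds the high 16 bits of the start cluster (FAT32 only).
    # Offset 26 holds the low 16 bits of the start cluster.
    (high,) = struct.unpack_from("<H", dir_entry, 20)
    (low,) = struct.unpack_from("<H", dir_entry, 26)
    return (high << 16) | low

def make_entry(cluster: int) -> bytes:
    """Hypothetical helper: build an otherwise empty entry for one cluster."""
    entry = bytearray(32)
    struct.pack_into("<H", entry, 20, cluster >> 16)
    struct.pack_into("<H", entry, 26, cluster & 0xFFFF)
    return bytes(entry)

# A directory just past the 64 Ki boundary needs the high word.
entry = make_entry(0x12345)
print(hex(first_cluster(entry)))           # full 32-bit value
print(hex(first_cluster(entry) & 0xFFFF))  # what a low-word-only reader sees
```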

This led me to fruitlessly trace the int 21h handler for function 3Bh (change directory) for a while, looking for unexpected behaviour. This was made more difficult by the fact that I do not have trace listing files for the kernel yet, so TracList cannot be used. That meant a lot of manual lookup in the source files while tracing.

Another problem was with my method of tracing: I would trace to a certain handler, then use proceed steps (P command in lDebug) to try processing a function call. If this caused the critical error, I would break on another breakpoint and Go to the last callsite before the error, then Trace into it.

This failed when the same function was called repeatedly during the change directory system call, but with the first call actually succeeding. So that wasted a bunch of time.

Another problem was that I needed to run int3.com between attempts to arm the conditional int 21h breakpoint again. This is due to an lDebug design fault. Using multiple permanent breakpoints can work around this problem.

Eventually I traced the problem into the disk read routines of the BIO's block device. For a moment I believed that the file system size was set wrongly, but I had misinterpreted the meaning of the 16-bit "total sectors" field. Then I continued tracing the read of the high sector and found that LBA was not in use.

Tracing the CHS routines, I found that the int 13h function 02h call used the wrong address. I determined this using a residently-installed chstool Extension for lDebug. In fact, using the chstool.eld with the input tuple FFFF_FFFF told me that only 20-odd MiB were accessible using CHS, yet my access attempt was far beyond that. Turns out the qemu VM detected the CHS geometry from my bootimg script so it used 2 CHS Heads and 18 CHS Sectors. (These are the defaults because a 1440 KiB diskette image uses this geometry.)
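
To get a feel for how small the CHS-addressable range is with this geometry, here is a rough Python sketch. It assumes 512-byte sectors and the usual int 13h ceiling of 1024 cylinders (the cylinder number in a CHS call is 10 bits wide). Which cylinder count chstool actually worked with is not recorded above, so the result need not match the quoted figure exactly.

```python
SECTOR_SIZE = 512      # assumed bytes per sector
MAX_CYLINDERS = 1024   # assumed: int 13h CHS calls carry a 10-bit cylinder number

def chs_capacity_bytes(heads, sectors_per_track,
                       cylinders=MAX_CYLINDERS, sector_size=SECTOR_SIZE):
    """Upper bound on the bytes reachable through CHS for one geometry."""
    return cylinders * heads * sectors_per_track * sector_size

# Geometry qemu picked up from the image: 2 heads, 18 sectors per track.
capacity = chs_capacity_bytes(heads=2, sectors_per_track=18)
print(capacity // (1024 * 1024), "MiB reachable via CHS")
```

Any sector past this bound simply has no valid CHS tuple under these assumptions, whatever the file system size says.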

However, both the size of the file system and the chosen MBR partition type should make the kernel choose LBA access. With this insight it did not take long to find the cause:

I booted into lDebug and ran y ii13.sld then created a conditional permanent breakpoint using bp at ptr ri13p when ah == 41. This is the "intercept interrupt 13h" Script for lDebug:

@if (vff) then goto :eof
@if (ri13p == 550) then goto :eof
@a 0:550
@ jmp (ri13s):(ri13o)
@ .
@r dword [0:13*4] = 550
@bp new ptr ri13p
@bl at ptr ri13p

(Did I forget setting VFF here? Very possible. Besides, putting a nop before the jump would be of use sometimes.)

Then I loaded EDR-DOS using boot protocol ldos edrdos.com. Running the kernel with g ended up hitting the breakpoint from lDOS's iniload once. I continued with another g command. Then the debugger detection was hit. Ran r f CY and another g to tell the kernel there is a debugger.

The next breakpoint hit was in the BIO's file system detection. So we use g ptr [ss:sp] to run until the int 13h call returns. Tracing a little farther we can observe a point at which the UDF_LBA flag is set in the UDSC. From there, it was a simple tp DDDDDD while (word [ssss:oooo] & 400) == 400 to trace until an instruction clears the LBA flag. (Where the ssss:oooo address is a placeholder for the address of the UDSC's flag word. The DDDDDD count is a literal however; the exact value doesn't matter but it should be high.)
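
The while condition of that tp command can be modelled outside the debugger. The Python sketch below walks over a list of flag-word snapshots, one per traced instruction, and stops at the first one where the masked bit is no longer set. The mask 0x400 is the 400 from the condition above (lDebug reads numbers as hexadecimal); the snapshot values are invented for illustration.

```python
UDF_LBA_MASK = 0x400  # taken from the tp condition; lDebug numbers are hex

def steps_while_flag_set(flag_words, mask=UDF_LBA_MASK):
    """Count the leading snapshots in which every bit of mask is still set."""
    steps = 0
    for word in flag_words:
        # Same test as: (word [ssss:oooo] & 400) == 400
        if (word & mask) != mask:
            break
        steps += 1
    return steps

# Invented snapshots: the flag survives two instructions, then gets cleared.
snapshots = [0x0401, 0x0C00, 0x0001, 0x0400]
print(steps_while_flag_set(snapshots))
```

The instruction traced right before the loop stops is the one that cleared the flag, which is what makes this kind of conditional trace useful for finding an unexpected write.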

This eventually pointed to the instruction that cleared the LBA flag, making the kernel fall back to trying to access the block device's unit using CHS. My quick patch changes this particular instruction to not clear UDF_LBA. However, it has been pointed out that other accesses to the UDSC flags may be problematic too.

Another idea is to check for CHS overflows before trying to read the sector. I believe the BIO does not do this yet.
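
Such a check could look roughly like the following Python sketch. It is only a model of the idea, not BIO code: the function and exception names are hypothetical, and the 1024-cylinder limit is the usual int 13h CHS ceiling, assumed here rather than read from the kernel.

```python
class ChsOverflow(Exception):
    """Raised when a sector number cannot be expressed as a CHS tuple."""

def lba_to_chs(lba, heads, sectors_per_track, max_cylinders=1024):
    """Convert a zero-based sector number to (cylinder, head, sector).

    The sector in the result is one-based, as int 13h expects. Instead of
    silently wrapping, an out-of-range cylinder raises ChsOverflow.
    """
    if lba < 0 or heads <= 0 or sectors_per_track <= 0:
        raise ValueError("lba must be >= 0 and the geometry must be positive")
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    if cylinder >= max_cylinders:
        raise ChsOverflow(
            f"sector {lba} needs cylinder {cylinder}, limit is {max_cylinders - 1}"
        )
    return cylinder, head, sector_index + 1

# With 2 heads and 18 sectors per track, sector 36864 is the first one
# that no longer fits.
print(lba_to_chs(36863, heads=2, sectors_per_track=18))
try:
    lba_to_chs(36864, heads=2, sectors_per_track=18)
except ChsOverflow as err:
    print("refused:", err)
```

Failing early like this would turn a read from the wrong address into a clean error, independent of whether the LBA flag was cleared by mistake.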

The lack of a trace listing file made me ask on the SvarDOS EDR-DOS tracker whether RASM86 can create listing files. Apparently it can, but I haven't tested them yet. If they are useful as inputs for a conversion script we may yet get trace listings for the kernel.

blog/pushbx/2024/0612_early_june_work.txt · Last modified: 2024-06-12 21:47:10 +0200 Jun Wed by ecm