====== Support of segment overrides for XLATB ======

Copied from a mail sent to freedos-devel:

===== =====

On 2024-06-30 at 23:04 +0200, Bernd Böckmann via Freedos-devel wrote:
> Hello,
>
> I have a question our assembly people: I stumbled upon an unusual instruction in the EDR-DOS source [1]. In RASM86 this is expressed as
>
> '' XLAT CS:AL''
>
> which gets encoded as 2E D7.
>
> So this is a CS segment override prefix for an otherwise usual XLAT instruction. I found no way to reproduce this instruction in JWasm other than doing the following:
>
> '' DB 2eh''
> '' XLAT''
>
> The Intel documentation for XLAT explains that the value to be added to AL is fetched from DS:BX, so probably the DS gets replaced by the CS override. But is the override prefix valid at this point, and does this work on all common CPUs? Intel documentation is silent about this.
>
> Thanks,
> Bernd
>
> [1] https://github.com/SvarDOS/edrdos/blob/d579db43a97554dcac35becf510d7be8f5a85aae/drdos/header.a86#L1250

Hello Bernd,

A simple web search [[https://duckduckgo.com/?t=ftsa&q=segment+override+xlatb&ia=web|for "segment override xlatb" without the quote marks]] does find some relevant results, in particular [[https://retrocomputing.stackexchange.com/questions/20031/undocumented-instructions-in-x86-cpu-prior-to-80386|this retrocomputing stackexchange thread]]. Quoting part of the question:

> XLAT with a segment override prefix: Manuals for 8086 and 80286 only mention that it uses the DS register. It was first mentioned in the 80386 manual that it accepts segment prefixes.

[[https://retrocomputing.stackexchange.com/questions/20031/undocumented-instructions-in-x86-cpu-prior-to-80386/24616#24616|One of the answers]] has this to say:

> ''XLAT'' with segment override - works the same as 8088, segment override for base address can be ''CS:[BX]'' or ''ES:[BX]''. ... segment overrides for ''XLAT'' on a V20 do seem to work consistent with the 808x. I just fired up my HP 95LX which runs a NEC V20 as well.
I can confirm that ''cs xlatb'', ''es xlatb'', and ''ss xlatb'' all work as expected.

[[https://retrocomputing.stackexchange.com/questions/20031/undocumented-instructions-in-x86-cpu-prior-to-80386/20034#20034|Another answer]] claims that MAME does not support XLATB with a segment override:

> MAME's 8086/88/186/188/286 emulator [[https://github.com/mamedev/mame/blob/master/src/devices/cpu/i86/i86.cpp|here]], V20/V30/V33/V33A emulator [[https://github.com/mamedev/mame/blob/master/src/devices/cpu/nec/necinstr.hxx|here]], and V30MZ emulator [[https://github.com/mamedev/mame/blob/master/src/devices/cpu/v30mz/v30mz.cpp|here]] all support 83/1, 83/4, and 83/6, and they all don't support a segment override for XLAT (the prefix is allowed but ignored). Search for 0x83 and 0xd7 to find the implementations.
>
> The fact that they all agree doesn't necessarily mean they're correct, since they all seem to have been forked from common code at some point. But I suppose that whoever implemented 83/x with no CPU-version test and put an explicit DS in the XLAT implementation probably knew what they were doing.

Consequently, I added a remark on this [[https://pushbx.org/ecm/doc/insref.htm#insXLATB|to my NASM v2.05 based instruction reference]]:

> On 386 or higher level machines, the segment register used to load from ''[BX+AL]'' or ''[EBX+AL]'' can be overridden by using a segment register name as a prefix (for example, ''ES XLATB''). It is reported that a segment override may be ignored by CPUs of a lower level than a 386.

I [[https://hg.pushbx.org/ecm/insref/rev/9360e3b5ca1c|added this remark on 2022-03-18]]. On 2021-06-06 I [[https://hg.pushbx.org/ecm/inicomp/rev/ed6df0b9056c|replaced a use of cs xlatb in inicomp's lzd.asm]], most directly by a sequence that costs 3 bytes more than cs xlatb, like so:

  push ds
  push cs
  pop ds
  xlatb
  pop ds

[[https://hg.pushbx.org/ecm/inicomp/rev/cd6a42afe7db|Later I replaced the use of xlatb]] by code that branches to get the next state number.
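
To make the two reported behaviours concrete, here is a minimal C++ sketch of a real-mode XLATB. It is a toy model and not taken from any emulator: the names ''Cpu'', ''xlatb'', and ''linear'' are made up for this illustration, and it assumes 8086-style addressing (segment shifted left by 4, 16-bit offset wraparound, 1 MiB address space).

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical toy model of a real-mode CPU, for illustration only.
enum Seg { ES, CS, SS, DS };

struct Cpu {
    std::array<uint16_t, 4> sregs{};  // segment registers, indexed by Seg
    uint16_t bx = 0;
    uint8_t al = 0;
    std::vector<uint8_t> mem = std::vector<uint8_t>(1u << 20);  // 1 MiB
};

// 8086-style linear address: segment * 16 + offset, wrapped to 20 bits.
inline uint32_t linear(uint16_t seg, uint16_t off) {
    return ((static_cast<uint32_t>(seg) << 4) + off) & 0xFFFFFu;
}

// XLATB: AL = byte at seg:[BX+AL], with DS as the default segment.
// If honour_prefix is false the prefix is accepted but ignored, which is
// the behaviour that was reported for some pre-386 implementations.
inline void xlatb(Cpu& cpu, std::optional<Seg> prefix, bool honour_prefix) {
    assert(cpu.mem.size() == (1u << 20));  // the model expects 1 MiB of memory
    Seg seg = (prefix && honour_prefix) ? *prefix : DS;
    uint16_t off = static_cast<uint16_t>(cpu.bx + cpu.al);  // wraps at 64 KiB
    cpu.al = cpu.mem[linear(cpu.sregs[seg], off)];
}
```

With different bytes stored at the same offset in the DS and CS segments, ''xlatb(cpu, CS, true)'' loads the CS copy, while ''xlatb(cpu, CS, false)'' loads the DS copy, exactly as an unprefixed ''xlatb'' would. The push/pop sequence above gets the CS result on either kind of implementation because it never relies on the prefix.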
---

However, looking at the current source of MAME ([[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/i86/i86.cpp#L1990|mame/src/devices/cpu/i86/i86.cpp]] and [[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/nec/necinstr.hxx#L625|.../cpu/nec/necinstr.hxx]]) it appears as if XLATB was changed to use a segment override prefix since [[https://retrocomputing.stackexchange.com/questions/20031/undocumented-instructions-in-x86-cpu-prior-to-80386/20034#20034|this answer]] was written (2021-06-04).

[[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/i86/i86.cpp#L1990|i86.cpp has]]:

  case 0xd7: // i_trans
      m_regs.b[AL] = GetMemB( DS, m_regs.w[BX] + m_regs.b[AL] );
      CLK(XLAT);
      break;

This calls [[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/i86/i86inline.h#L410|GetMemB in i86inline.h]]:

  inline uint8_t i8086_common_cpu_device::GetMemB(int seg, uint16_t offset)
  {
      return read_byte(calc_addr(seg, offset, 1, I8086_READ));
  }

This calls [[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/i86/i86.cpp#L679|calc_addr]]:

  uint32_t i8086_common_cpu_device::calc_addr(int seg, uint16_t offset, int size, int op, bool override)
  {
      if ( m_seg_prefix && (seg==DS || seg==SS) && override )
      {
          m_easeg = m_seg_prefix;
          return (m_sregs[m_prefix_seg] << 4) + offset;
      }
      else

A call to [[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/i86/i86.h#L155|calc_addr with only 4 parameters]] uses the default for the ''override'' parameter, which is ''true'':

  virtual uint32_t calc_addr(int seg, uint16_t offset, int size, int op, bool override = true);

So the 8086 sources for current MAME actually do support segment overrides for XLATB.
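
The chain above boils down to a default argument. As a minimal stand-alone sketch (a simplified model, not MAME's code: the ''Model'' struct, its members, the ''xlat_addr'' helper, and the segment numbering are made up for this illustration, and clock counting and access checks are left out), a four-argument call picks up ''override = true'' and so takes the prefix branch whenever a prefix is active:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical segment numbering, chosen only for this sketch.
enum { ES, CS, SS, DS };

struct Model {
    uint16_t sregs[4] = {0, 0, 0, 0};
    bool seg_prefix = false;  // true while a segment prefix is active
    int prefix_seg = DS;      // which segment register the prefix selected

    // Same shape as the declaration quoted above: override defaults to true.
    uint32_t calc_addr(int seg, uint16_t offset, int size, int op,
                       bool override = true) const {
        (void)size;
        (void)op;
        assert(seg >= 0 && seg < 4);
        if (seg_prefix && (seg == DS || seg == SS) && override)
            return (uint32_t(sregs[prefix_seg]) << 4) + offset;
        return (uint32_t(sregs[seg]) << 4) + offset;
    }

    // XLAT-style address: passes DS and only four arguments, so the
    // default override = true applies.
    uint32_t xlat_addr(uint16_t bx, uint8_t al) const {
        return calc_addr(DS, uint16_t(bx + al), 1, 0);
    }
};
```

In this model, ''xlat_addr'' resolves against DS while no prefix is active and against the prefixed segment once ''seg_prefix'' is set. Only a caller that explicitly passes ''false'' as the fifth argument keeps DS regardless of the prefix.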
[[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/nec/necinstr.hxx#L625|necinstr.hxx]] uses this line to implement XLATB:

  OP( 0xd7, i_trans ) { uint32_t dest = (Wreg(BW)+Breg(AL))&0xffff; Breg(AL) = GetMemB(DS0, dest); CLKS(9,9,5); }

I'm unsure which GetMemB definition is used here, but it passes "''DS0''", which, [[https://github.com/mamedev/mame/blob/3d357c07c0ca824868bbe7586839c8caae236571/src/devices/cpu/nec/necinstr.hxx#L455|looking at e.g. the A0h to A3h opcodes]], is what is known as DS on the 8086, used in a way that allows a segment override:

  OP( 0xa0, i_mov_aldisp ) { uint32_t addr; addr = fetchword(); Breg(AL) = GetMemB(DS0, addr); CLKS(10,10,5); }
  OP( 0xa1, i_mov_axdisp ) { uint32_t addr; addr = fetchword(); Wreg(AW) = GetMemW(DS0, addr); CLKW(14,14,7,14,10,5,addr); }
  OP( 0xa2, i_mov_dispal ) { uint32_t addr; addr = fetchword(); PutMemB(DS0, addr, Breg(AL)); CLKS(9,9,3); }
  OP( 0xa3, i_mov_dispax ) { uint32_t addr; addr = fetchword(); PutMemW(DS0, addr, Wreg(AW)); CLKW(13,13,5,13,9,3,addr); }

(''DS1'' appears to be 8086's ES.)

---

Moreover, I believe that the RC.SE answer is actually wrong. The git blame for the 0xd7 GetMemB in (then) src/emu/cpu/i86/i86.c points to [[https://github.com/mamedev/mame/commit/d41f2ae4f8a7d6e5c4afef51a52ccf1a89874c8f#diff-2d05f8414c3f0151da9561748d1b843e90d667ee0223c7b40d426d430368dc58R3218|a 2013-07-02 commit]]. In this commit GetMemB can already use a segment override if the first parameter is passed as "DS". This contradicts [[https://retrocomputing.stackexchange.com/questions/20031/undocumented-instructions-in-x86-cpu-prior-to-80386/20034#20034|the 2021-06-04 RC.SE answer]], which claimed that no segment override prefix was supported at that time. It appears that the user was wrong.

So actually it seems everyone supports segment overrides for XLATB!

Regards,
ecm

{{tag>xlatb 95lx insref inicomp freedos-devel}}

~~DISCUSSION~~