blog:pushbx:2022:0924_more_hp_95lx_adventures [2022-09-24 19:33:32 +0200 Sep Sat] (current)
ecm created
 +====== More HP 95LX adventures ======
 +
 +**2022-09-20**
 +
 +Yesterday I loaded a bunch of files onto the HP 95LX, intending to read on the device. I used some nondescript "PDF to text" website to extract plain text from two PDFs. Next, I attempted to convert the resulting Big Text File from the UTF-8 encoding to Code Page 850 using iconv. This repeatedly complained about invalid codepoints. I manually search-and-replaced all the smart quotes and friends (using Pluma, the text editor of the MATE Desktop Environment). After that I wrote a small scriptlet to make iconv point me to the next invalid codepoint, so that I could replace those manually. (These were mostly CJK, Russian, or Greek snippets in the text.) After having processed about one third of the Big Text File I figured out that iconv has a switch, -c, which will make it skip invalid codepoints. (I am not sure whether they are omitted or replaced by a placeholder.)
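
The difference between omitting a character and replacing it with a placeholder is easy to see on a tiny sample. The following is only a sketch: it uses Python's built-in cp850 codec as a stand-in for iconv (an assumption, since the actual conversion was done with iconv), and the sample string is made up. The "ignore" handler drops unconvertible characters, "replace" substitutes a question mark. As far as I can tell, glibc's iconv help text describes -c as omitting invalid characters from the output, which would correspond to the first variant.

```python
# Stand-in for the conversion step using Python's cp850 codec.
# Assumption: the post used iconv, not Python; the sample text is invented.
sample = "caf\u00e9 \u201cquote\u201d \u03b1\u03b2"

def to_cp850(text, errors):
    """Encode text to Code Page 850 with the given error handler."""
    return text.encode("cp850", errors=errors)

omitted = to_cp850(sample, "ignore")    # unconvertible characters are dropped
replaced = to_cp850(sample, "replace")  # unconvertible characters become "?"

print(omitted)
print(replaced)
```

The e-acute survives in both cases because Code Page 850 has it. The smart quotes and the Greek letters do not.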
 +
 +===== =====
 +
 +After finishing the conversion of the Big Text File I split it into smaller files using a Perl scriptlet. I had to break up the files not only for each chapter but also after every 1000 lines, so as to get resulting files smaller than 64 KiB. That's because I anticipated there would be a file size limit in the Memo application.
 +
 +Then, it was time to transfer the files to the 95LX. This was harder than it ought to be because there were several dozen files, and so far I have only mastered transferring from the Linux box connected to the serial port using the Datacomm application's XMODEM protocol. That protocol can only transfer one file at a time and requires entering the source and destination filename on both sides.
 +
 +I figured I could zip up the files and send the archive plus a DOS build of Info-ZIP's unzip tool. That wasn't entirely wrong. However, I initially started sending the whole distribution SFX file of unzip, which took way too long. So I unpacked the SFX using dosemu2 on the Linux host, yielding an unzip.exe that comes in at less than 60 KiB. (Judging from its performance this is also a packed executable.) Even with the sunk cost fallacy, I noticed it'd be faster to cancel the transfer now and send only unzip.exe instead of waiting for the whole SFX to arrive.
 +
 +I decided to pack the Bunch of Smallish Text Files into several archives, and then pack the archives into one main archive. This was because unpacking all the text files alongside the whole archive may have exceeded the disk size of the 2 MiB SRAM card.
 +
 +After transferring the archive, I noticed that unzip (after complaining about the lack of a TZ environment variable) errored out noting it did not have enough memory to unpack the first file that I attempted. I had to crank up the main memory from the 386 KiB I'd previously allocated to it. (On the 95LX, the internal memory is split with a configurable ratio between the main memory and the RAM disk that is used for the C: drive. The model that I have is equipped with 512 KiB of internal memory.) I have it set to 418 KiB of main memory now.
 +
 +However, the Memo application complained for some files that it couldn't open them because they were too large. These files were smaller than 64 KiB but exceeded 50 kB. I went and loaded them in the debugger, then saved each one as two separate halves. I didn't bother finding a linebreak or anything to split on, instead opting to simply divide the size in bytes by 2.
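
For illustration, here is the halving step as a minimal sketch (the actual split was done by hand in the debugger, not with a script). The second function, split_at_linebreak, is a hypothetical refinement that I did not use: it cuts after the last linebreak before the midpoint so that no line is torn in two.

```python
def split_in_half(data):
    """Split data at the byte midpoint, ignoring line boundaries."""
    middle = len(data) // 2
    return data[:middle], data[middle:]

def split_at_linebreak(data, newline=b"\n"):
    """Hypothetical variant: cut after the last newline before the midpoint.

    Falls back to the plain midpoint if the first half has no newline.
    """
    middle = len(data) // 2
    found = data.rfind(newline, 0, middle)
    cut = middle if found == -1 else found + len(newline)
    return data[:cut], data[cut:]

first, second = split_in_half(b"abcdefg")
print(first, second)
```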
 +
 +That got us almost all the way. However, I noticed that the linebreaks were missing in the Memo application. Loading a file in the debugger, it turned out it had Linux-style, LF-only linebreaks. So I had to create an implementation of the unix2dos tool. I opted for a "filter" style program, that is one which reads from its stdin and writes to stdout (much like the b16cat tool I entered with the help of echoify), thus avoiding the need for parsing filenames and opening the files in the tool. My first attempt appeared to work, but it was prohibitively slow (as shown by the progress indicator that I added for every 256th iteration). That was probably due to making two DOS syscalls for every input byte. So I created the second revision, which makes two syscalls per 8 KiB of input! This one is fast enough, needing only a few seconds to run to completion on the larger files.
 +
 +Unfortunately I overwrote the first program with the second one, noticing only later that I had used the filenames unix2dos.com and unix2dos2.com, which DOS conflated because of its 8.3 Short File Name limit. It wasn't much to write home about, just a loop reading one byte from interrupt 21h service 3Fh handle 0 (stdin), and writing either one or two bytes using service 40h handle 1 (stdout). The progress indicator was written to handle 2 (stderr).
 +
 +Here is the second program:
 +
 +<code>&r dco6 or= 1000
 +&u 100 l 100
 +36BE:0100 81FC0088  cmp     sp, 8800
 +36BE:0104 7301      jae     0107
 +36BE:0106 C3        retn
 +36BE:0107 BA0020    mov     dx, 2000
 +36BE:010A B90020    mov     cx, 2000
 +36BE:010D 31DB      xor     bx, bx
 +36BE:010F B43F      mov     ah, 3F
 +36BE:0111 CD21      int     21
 +36BE:0113 726B      jb      0180
 +36BE:0115 91        xchg    ax, cx
 +36BE:0116 E368      jcxz    0180
 +36BE:0118 89D6      mov     si, dx
 +36BE:011A BF0040    mov     di, 4000
 +36BE:011D 90        nop
 +36BE:011E 90        nop
 +36BE:011F 90        nop
 +36BE:0120 AC        lodsb
 +36BE:0121 3C0A      cmp     al, 0A
 +36BE:0123 741B      jz      0140
 +36BE:0125 AA        stosb
 +36BE:0126 E2F8      loop    0120
 +36BE:0128 BA0040    mov     dx, 4000
 +36BE:012B 89F9      mov     cx, di
 +36BE:012D 29D1      sub     cx, dx
 +36BE:012F B440      mov     ah, 40
 +36BE:0131 BB0100    mov     bx, 0001
 +36BE:0134 CD21      int     21
 +36BE:0136 EBC8      jmp     0100
 +36BE:0138 90        nop
 +36BE:0139 90        nop
 +36BE:013A 90        nop
 +36BE:013B 90        nop
 +36BE:013C 90        nop
 +36BE:013D 90        nop
 +36BE:013E 90        nop
 +36BE:013F 90        nop
 +36BE:0140 B80D0A    mov     ax, 0A0D
 +36BE:0143 AB        stosw
 +36BE:0144 EBE0      jmp     0126
 +36BE:0146 90        nop
 +...
 +36BE:017F 90        nop
 +36BE:0180 B8004C    mov     ax, 4C00
 +36BE:0183 CD21      int     21
 +36BE:0185 90        nop
 +...
 +36BE:01FF 90        nop
 +&q</code>
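
The same buffering idea, as a minimal Python sketch rather than a port of the listing: read up to 8 KiB at a time (the 2000h in cx), replace every LF with CR LF, and write the converted chunk in one go. The function name and the use of in-memory streams are my own choices for the example. Like the listing, it does not check whether an LF is already preceded by a CR, so input that already has DOS linebreaks would end up with CR CR LF.

```python
import io

CHUNK = 0x2000  # 8 KiB per read, the same size the listing loads into cx

def unix2dos(src, dst, chunk_size=CHUNK):
    """Copy src to dst, writing CR LF for every LF, one chunk at a time."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # nothing read: end of input, like the jcxz exit
            break
        dst.write(chunk.replace(b"\n", b"\r\n"))

# Usage with in-memory streams; a real filter would pass
# sys.stdin.buffer and sys.stdout.buffer instead.
source = io.BytesIO(b"first line\nsecond line\n")
target = io.BytesIO()
unix2dos(source, target)
print(target.getvalue())
```

In the worst case (input consisting only of LF bytes) the output chunk is twice the size of the input chunk, which is why the listing places the output buffer at 4000h with room for 16 KiB.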
 +
 +
 +**2022-09-21**
 +
 +Today I managed to nearly brick the 95LX when I experimented with how to set environment variables automatically in its DOS.
 +
 +I created a CONFIG.SYS file on the internal memory RAM disk drive C: and neglected to use the /P or /K switches to COMMAND.COM in the SHELL= directive. I did use a /C switch however. This resulted in the shell running my A:\AUTOEXEC.BAT file (as desired) but then exiting back to DOS's initial process. In this MS-DOS version (3.22) this apparently leads to an error about not finding a shell, without any input possible to retry the same or a different command.
 +
 +Luckily I was able to recover the device without a factory reset that would have led to the loss of all files on the RAM disk. Relevantly, just yesterday in fact the CR2325 battery I'd ordered for the SRAM card had arrived, which I had quickly inserted into the card. (As per the 95LX's manual you can replace the card battery while the card is inserted in the powered 95LX so as not to lose all its data.)
 +
 +So to fix it there would have been two ways: Either write an A:\CONFIG.SYS that takes precedence with a vanilla SHELL= directive, or modify the A:\AUTOEXEC.BAT to do something sensible, like starting an interactive shell as a child, or the $SYSMGR program which (as I learned) provides access to all of the applications, including the Memo text editor.
 +
 +Now, the modern day peripheral for accessing the 95LX style RAM card from a Linux desktop box host goes for beyond $500, so naturally I don't have one. But I do have something else: A second HP 95LX, though in slightly worse shape. (The screen hinges are too loose and the backup battery tray is missing. And I only have the one card, which actually came with this second device.) It was easy to insert the card into this computer. Even if the files had not permitted booting with the card inserted, I have found one can insert the card at any time after booting just as well. In any case, I took to Memo and modified the batch file to end in a line reading "command". Swapping the card back to the first 95LX I was able to boot to a usable shell and start the $SYSMGR program manually to get back to its Memo.
 +
 +Subsequently I tested a few more commands and settled on an A:\CONFIG.SYS with the line SHELL=C:\COMMAND.COM /P, deleting the file on the C: drive instead. That means I can boot without the card to directly load the system manager or with it to load the shell. (With this directive the shell automatically loads the A:\AUTOEXEC.BAT file without specifying it explicitly.)
 +
 +Other than that I have continued reading on the first 95LX and of course went to write this post today, late in the evening.
 +
 +
 +**2022-09-23**
 +
 +Today I added a 40-column friendly mode to the D commands (DB, DW, and DD). It will list up to 8 bytes' worth of data per line, instead of the default 16. Further, it will not list the segment on each line, so that listing in a 16-bit segment (limit < 64 KiB) will fit within the 40 columns.
 +
 +The segment can be displayed by enabling the header or the trailer, which will no longer include the "header" or "trailer" labels and will instead show the segment (or selector).
 +
 +There are two options (in DCO6) which are disabled by default: Indenting the data of the odd lines (starting with an address that modulo 16 equals 8) by an additional blank, and displaying the dash in the middle of the data. The indenting was planned earlier but considered not to be aesthetically pleasing. The dash was easy to add optionally.
 +
 +This marks the last of the three most common commands to be modified to respect the 40-column mode option. That's R, U, and D. It was easier than expected, not requiring much code to be copied or very much branching to be added. Next, we may revisit the U command to drop the indentation of the operands to the 8th column after the mnemonic offset. This would allow more instructions to be fully readable in 40 columns. Further, the default length of the D commands may be shortened to accommodate the 40x16 screen of the 95LX.
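
To show why 8 bytes per line without the segment fits a 40-column screen, here is a small sketch of such a dump. The exact column layout (4-digit offset, two blanks, hex bytes, two blanks, text) is invented for the example and is not the debugger's actual output format. With this layout a full line takes 4 + 2 + 23 + 2 + 8 = 39 columns.

```python
def dump_lines(data, start=0, per_line=8):
    """Yield dump lines: 4-digit offset, hex bytes, printable ASCII.

    The layout is a made-up illustration, not the debugger's real format.
    """
    for pos in range(0, len(data), per_line):
        row = data[pos:pos + per_line]
        # Pad short rows so the text column stays aligned.
        hex_part = " ".join(f"{b:02X}" for b in row).ljust(per_line * 3 - 1)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        yield f"{(start + pos) & 0xFFFF:04X}  {hex_part}  {text}"

lines = list(dump_lines(b"Hello, 95LX!\r\n", start=0x100))
for line in lines:
    print(line)
```

With 16 bytes per line the same layout would need 4 + 2 + 47 + 2 + 16 = 71 columns, and a segment prefix would add five more.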
 +
 +
 +===== Additional information (added on the desktop) =====
 +
 +Web service used to extract the text from a PDF:
 +
 +https://tools.pdfforge.org/en/extract-text
 +
 +Scriptlet used to get iconv to point to invalid codepoints:
 +
 +<code>$ tail --bytes +"$(iconv -f UTF-8 -t CP858 fic.txt -o ficcp.txt 2>&1 | sed -re 's/.*at position (.*)$/\1/g')" fic.txt | head</code>
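
The scriptlet lets iconv fail, pulls the reported position out of its error message with sed, and then uses tail and head to show the text starting at that byte offset. A rough equivalent of the "where is the next offender" step, sketched in Python under the assumption that Python's cp850 codec rejects the same characters as iconv does (the function name is made up):

```python
def first_unencodable(text, encoding="cp850"):
    """Return (char_index, utf8_byte_offset, char) for the first character
    the target encoding cannot represent, or None if everything fits."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as err:
        index = err.start
        # Convert the character index into a byte offset in the UTF-8 file.
        byte_offset = len(text[:index].encode("utf-8"))
        return index, byte_offset, text[index]
    return None

print(first_unencodable("caf\u00e9 \u03b1"))
```

The byte offset is counted from zero here, whereas tail --bytes +N counts from one.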
 +
 +Scriptlet used to split the big text file into several dozen smaller files:
 +
 +<code>$ perl -e 'my $OUTNR = 1; my $LINENR = 1; my $OUTOPEN = 0; while(<<>>) { if ($OUTOPEN == 0) { open($OUTHANDLE, ">", "fic".$OUTNR.".txt"); $OUTOPEN = 1; }; print $OUTHANDLE $_; if (/qqqq Chapter/ || $LINENR > 1000) { close($OUTHANDLE); $OUTOPEN = 0; $OUTNR = $OUTNR + 1; $LINENR = 0; }; $LINENR = $LINENR + 1; }' ficcp.txt</code>
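
The splitting rule of the Perl scriptlet, sketched in Python for readability: a chunk ends after a line containing the chapter marker or once it has grown to the line limit. This is not a line-for-line translation. It yields lists of lines instead of writing fic1.txt, fic2.txt and so on, and it cuts at exactly max_lines lines. The helper chunk_bytes is a made-up addition for checking a chunk against the 64 KiB target.

```python
def split_chunks(lines, marker="qqqq Chapter", max_lines=1000):
    """Yield lists of lines; a chunk ends after a marker line or at max_lines."""
    chunk = []
    for line in lines:
        chunk.append(line)
        if marker in line or len(chunk) >= max_lines:
            yield chunk
            chunk = []
    if chunk:  # leftover lines after the last cut
        yield chunk

def chunk_bytes(chunk, encoding="cp850"):
    """Size of a chunk in bytes once encoded (hypothetical helper)."""
    return sum(len(line.encode(encoding)) for line in chunk)

demo = ["a\n", "qqqq Chapter 2\n", "b\n", "c\n", "d\n"]
chunks = list(split_chunks(demo, max_lines=2))
print(chunks)
print([chunk_bytes(c) < 64 * 1024 for c in chunks])
```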
 +
 +b16cat and echoify:
 +
 +https://hg.pushbx.org/ecm/b16cat/
 +
 +{{tag>95lx reading ldebug assembly}}
 +
 +
 +~~DISCUSSION~~
  
blog/pushbx/2022/0924_more_hp_95lx_adventures.txt · Last modified: 2022-09-24 19:33:32 +0200 Sep Sat by ecm