blog:pushbx:2026:0328_comments_on_label_delta_arithmetic [2026-03-28 13:25:24 +0100 Mar Sat] (current) ecm created |
| | ====== Comments on label delta arithmetic ====== |
| | |
| | Today I posted a Stack Overflow question and answer on how to do arbitrary calculations on assembly language labels: [[https://stackoverflow.com/questions/79916187/how-can-i-do-arbitrary-calculations-on-assembly-language-labels/79916188#79916188|How can I do arbitrary calculations on assembly language labels?]] As a companion to that question, here is a collection of earlier comments in which I have referred to label delta arithmetic. |
| | |
| | |
| | ===== NASM report ===== |
| | |
| | ==== $$ fails to take origin (ORG) into account. ==== |
| | |
| | <blockquote>Your workaround is correct. A label that is segmented (not a scalar number) cannot be used for some calculations such as shifting, division, or masking. This is because a segmented label is passed to the linker, and the linker doesn't have relocation types for these calculations. |
| | |
| | NASM's ''-f bin'' format from the assembler's point of view is an object output format like any other, and the ''org'' setting is handled by the output format - not by the assembler. The ''-f bin'' format acts as a linker integrated into NASM. You can see from a simple example that NASM's listing output starts counting at ''00000000'' even if a nonzero ''org'' is given, and ''$$'' is (by design) a segmented label referring to the beginning of the section. That is, the point referenced as ''00000000'' in the listing file. The brackets indicate that a value is subject to relocation. Note how the relocated executable output evaluates the ''$$'' use to 0100h, not 0000h: |
| | |
| | <code>test/20260206$ nasm test.asm -l /dev/stderr && podhex test |
| | 1 |
| | 2 org 256 |
| | 3 00000000 [0000] dw $$ |
| | 000000 00 01 >..< |
| | 000002</code> |
| | |
| | To work around this, you have to work with label delta arithmetic. As you found, ''label - $$'' can be used for masking operations. This is because a delta (difference between two labels in the same section) is a scalar value, allowing all sorts of calculations. As you further found, ''$$'' refers to the equivalent of a vstart of 100h (the ''org'' value) when used in the first section. Therefore it is correct to add the (well-known) ''org'' value to the label deltas to calculate the final relocated address corresponding to the label. |
| | |
| | I make extensive use of label deltas in some of my source texts, sometimes chaining them like ''origin + section1_end - section1_start + section2_label - section2_start''. I'm also revising a draft that explains problems like these some more, which I can link here once it's up. |
| | |
| | <cite>2026-02-06 reply to a NASM issue, https://github.com/netwide-assembler/nasm/issues/197#issuecomment-3858298717</cite></blockquote> |
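
As a minimal sketch of the workaround described in this reply (not part of the original comment): ''ORIGIN'' is a name I am assuming here for the well-known ''org'' value, and ''target'' is a hypothetical label. It is meant to be assembled with ''nasm -f bin''.

<code>
%define ORIGIN 256              ; assumed name for the well-known org value

        org ORIGIN

        nop                     ; one byte, so target sits at section offset 1
target:
        ; dw target & 0FFF0h    ; rejected: target is relocatable, not a scalar
        dw (target - $$ + ORIGIN) & 0FFF0h  ; delta is scalar, then the origin is added back
</code>

Under these assumptions the masked word should come out as 0100h, because the relocated address of ''target'' is 0101h.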
| | |
| | |
| | ===== Stackoverflow questions ===== |
| | |
| | ==== how to get IDT handler address in assembly ==== |
| | |
| | <blockquote>Instead of ''<handler>'' use a delta, eg place ''start:'' at beginning (before ''cli'') then ''(handler - start + 7C00h)'' can be calculated to allow masking and shifting. |
| | |
| | <cite>2025-12-24 comment, https://stackoverflow.com/questions/79854512/how-to-get-idt-handler-address-in-assembly#comment140916679_79854512</cite></blockquote> |
| | |
| | |
| | ==== How to access global C variables from within standalone assembler modules with the GCC toolchain for Atmel AVR32? ==== |
| | |
| | <blockquote>Guessing from my NASM/8086 experience, you may be right that the label address is resolved at link time. But there isn't necessarily an object file relocation format for such calculations, so the assembler cannot encode your desired instruction as it cannot communicate it to the linker. |
| | |
| | <cite>2025-10-29 comment, https://stackoverflow.com/questions/79804147/how-to-access-global-c-variables-from-within-standalone-assembler-modules-with-t#comment140825649_79804147</cite></blockquote> |
| | |
| | |
| | ==== Overcoming "division operator may only be applied to scalar values" ==== |
| | |
| | <blockquote>@Joshua You can use ''absolute $'' [[https://pushbx.org/ecm/download/nasm/3392808%20%e2%80%93%20%5bSupport%20request%5d%20absolute%20with%20label_dollar%20sign%20to%20change%20from%20progbits%20section%20to%20a%20nobits%20section.html|to switch to a nobits part after a progbits part]]. While this isn't documented it does work, and you can then subtract a label at the start of the progbits section from one at the end of the nobits part to get the size. (You do still need the label delta arithmetic I think. It just works as a single section for these purposes then.) |
| | |
| | <cite>2024-11-30 comment, https://stackoverflow.com/questions/79238530/overcoming-division-operator-may-only-be-applied-to-scalar-values/79238627#comment139728681_79238530</cite></blockquote> |
| | |
| | <blockquote>@PeterCordes ''alignb 16'' in a nobits section would invoke ''sectalign'', I believe. You don't want that because your deltas won't know about this padding. However, you can replace ''alignb 16'' by equivalent use of ''times ... resb 1'' with ''section .bss nobits align=1'' that aligns to a paragraph but takes into account the exact length of the prior section. |
| | |
| | <cite>2024-11-30 comment, https://stackoverflow.com/questions/79238530/overcoming-division-operator-may-only-be-applied-to-scalar-values/79238627#comment139728511_79238627</cite></blockquote> |
| | |
| | |
| | ==== How does an assembler find the offset of a label without knowing the value of the segment register? ==== |
| | |
| | <blockquote>A label is a relocatable symbol within a given section. Relocatable just means that depending on the linker, the assembler's view of the section may be incomplete so the linker may apply an adjustment. (When you assemble with ''nasm -f bin'' then a linker internal to NASM is used.) Other than the relocatable part, the label just represents an offset (into a section/segment). In NASM no type, size, or segment information is stored alongside the label's name, offset (in a section), and section (to make it relocatable). You can misuse a label with a "wrong" segment and NASM will be none the wiser. |
| | |
| | <cite>2024-09-03 comment, https://stackoverflow.com/questions/78945635/how-does-an-assembler-find-the-offset-of-a-label-without-knowing-the-value-of-th#comment139192947_78945635</cite></blockquote> |
| | |
| | |
| | ==== NASM ORG doesn't like SECTION ==== |
| | |
| | <blockquote>Yeah, the ''-f bin'' format of NASM is said to involve a "linker internal to NASM" which is why some relocations will be processed not by the assembler proper but rather the linker. So you get a ''[00000000]'' style relocation marker in the listing file because the listing part doesn't communicate with the linker. I mentioned some of this in yesterday's blog post: https://pushbx.org/ecm/dokuwiki/blog/pushbx/2024/0714_early_mid_july_work |
| | |
| | <cite>2024-07-15 comment, https://stackoverflow.com/questions/78747126/nasm-org-doesnt-like-section#comment138846008_78747126</cite></blockquote> |
| | |
| | |
| | ==== Ways to perform bitwise operations with symbols inside Assembly constants? ==== |
| | |
| | <blockquote>I'm using NASM for 8086 code but perhaps my approach could work on a GNU toolchain as well. That is, you cannot do bitwise operations, shifts, or divides with relocatable addresses. But you can calculate deltas within a single assembly module, if they stay in the same section. These deltas need no relocation, ie they are simply scalar values, so you can do all sorts of calculations on them. However, you need to somehow pre-calculate where a section is placed and then manually add in this adjustment (as another scalar) to obtain the equivalent of doing calculations on a relocatable address. |
| | |
| | <cite>2024-06-25 comment, https://stackoverflow.com/questions/78669648/ways-to-perform-bitwise-operations-with-symbols-inside-assembly-constants#comment138700367_78669648</cite></blockquote> |
| | |
| | |
| | ==== Why does code in MS-DOS Debug does not run and makes the prompt disappear? ==== |
| | |
| | <blockquote>I prepared examples at https://pushbx.org/ecm/test/20240212/ - 1.txt: Move data to behind code, assembles ''mov dx'' again to fill in the offset only known after the entire code is assembled. 2.txt: Put the data at a considerable distance from the code so we can predict during the assembly a valid offset that we can place the data at. 3.txt: Skip over data using a ''jmp'' (size of the jump + length of the data must be predicted or fixed up afterwards again). 4.txt: Input that works only with lDebug, using lDebug's variables to fix up the ''mov dx'' instruction with the correct address automatically. |
| | |
| | <cite>2024-02-12 comment, https://stackoverflow.com/questions/77979198/why-does-code-in-ms-dos-debug-does-not-run-and-makes-the-prompt-disappear#comment137475437_77979198</cite></blockquote> |
| | |
| | |
| | ==== Why does pointer addition work in NASM, 32 bits programming? ==== |
| | |
| | <blockquote>@PeterCordes The NASM ''-f bin'' format is actually a linker internal to NASM. So the assembler doesn't actually know numeric values for symbols that are to be relocated. Still, ''a + b'' works in NASM ''-f bin'' too. |
| | |
| | <cite>2023-12-08 comment, https://stackoverflow.com/questions/77622288/why-does-pointer-addition-work-in-nasm-32-bits-programming#comment136855564_77622288</cite></blockquote> |
| | |
| | <blockquote>@PeterCordes Indeed, the ''-f bin'' listing file doesn't use [] or () around the byte written for the instruction immediate. And yes, the relocations are not arbitrary, which is why I often work with deltas (subtract one label from another). Negative relocations are another bit that's not supported: https://sourceforge.net/p/nasm/feature-requests/176/ |
| | |
| | <cite>2023-12-08 comment, https://stackoverflow.com/questions/77622288/why-does-pointer-addition-work-in-nasm-32-bits-programming#comment136856036_77622288</cite></blockquote> |
| | |
| | |
| | ==== I cannot execute my bootloader file in qemu emulator ==== |
| | |
| | <blockquote>Assuming ''ds'' = 0 is not the same as assuming ''ds'' = ''cs'' because ''cs'' can also be nonzero. Also, in the very next instruction after the one that changes ''ss'' you should also change ''sp''. Furthermore, using two sections is possible with a padding line like ''times 510 - ($ - $$) - (text_end - text_start) db 0'' if the proper labels are placed in the ''.text'' section and alignment is taken care of. (Either specify ''align=1'' as a section attribute, or place an appropriate ''align'' directive before the label ''text_end''.) |
| | |
| | <cite>2023-08-10 comment, https://stackoverflow.com/questions/76872613/i-cannot-execute-my-bootloader-file-in-qemu-emulator/76872703#comment135520707_76872703</cite></blockquote> |
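
A minimal sketch of the two-section padding line from this comment, assuming ''nasm -f bin'' and its default placement of ''.data'' directly behind ''.text''. The section contents are placeholders; only the label names ''text_start'' and ''text_end'' come from the comment.

<code>
        org 7C00h

section .text
text_start:
        cli
        hlt                     ; placeholder for the actual loader code
text_end:

section .data align=1           ; align=1 so that no padding hides between the sections
        db "data"               ; placeholder data
        ; $ - $$ only counts .data so far, so the .text length is subtracted as well
        times 510 - ($ - $$) - (text_end - text_start) db 0
        dw 0AA55h               ; boot signature, meant to land at file offset 510
</code>

Both subtractions are deltas within a single section, so the ''times'' count is a scalar. Without ''align=1'' the padding inserted in front of ''.data'' would not be covered by either delta.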
| | |
| | |
| | ==== How to use arbitrary arithmentic expressions on pointers in gas ==== |
| | |
| | <blockquote>I'm only particularly well versed in NASM, but it appears to be similar to gas in this regard. What I do when I want to shift or divide values derived from labels at build time is I do label arithmetic so that the assembler can operate on deltas. This is perhaps only really useful in a format like NASM's ''-f bin'' where you can reconstruct all section start/vstart/size parameters using deltas, eg ''(data_label - data_start + text_end - text_start + 256 + 15) / 16'' where just ''(data_label + 15) / 16'' would not be valid. |
| | |
| | <cite>2023-06-02 comment, https://stackoverflow.com/questions/76389786/how-to-use-arbitrary-arithmentic-expressions-on-pointers-in-gas#comment134705598_76389786</cite></blockquote> |
| | |
| | |
| | ==== What value we get when we subtract name of two labels and store it in a variable ==== |
| | |
| | <blockquote>You can take a delta between two labels (subtract one from the other) if they're in the same section. The result is a scalar numeric value, in NASM terms. Label arithmetic like that allows dividing or shifting label deltas, which is not allowed with labels that are relocatable (non-scalar) values. |
| | |
| | <cite>2023-02-06 comment, https://stackoverflow.com/questions/58082833/what-value-we-get-when-we-subtract-name-of-two-labels-and-store-it-in-a-variable/75359034#comment132974951_75359034</cite></blockquote> |
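
To illustrate the difference, here is a small sketch with hypothetical labels ''msg'' and ''msg_end'' and an assumed constant name ''half_len''. It is not taken from the linked question.

<code>
msg:    db "racecar"            ; 7 bytes of placeholder data
msg_end:

half_len equ (msg_end - msg) / 2 ; delta of two labels in one section: a scalar, may be divided

        ; dw msg / 2            ; rejected: msg alone is relocatable, not a scalar
        dw half_len             ; 7 / 2, truncated to 3
</code>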
| | |
| | |
| | ==== NASM: Define static IDT at compile time without using the linker ==== |
| | |
| | <blockquote>Do label arithmetic to gain a scalar value that NASM will allow to shift. For example, if your section starts with (say) ''org 7C00h'' then you can do arithmetic like ''dw (isr - $$ + 7C00h) >> 16''. In general, the delta between two labels that are in the same section results in a scalar value that you can then use in all kinds of calculations, including shifts and division. |
| | |
| | <cite>2022-10-02 comment, https://stackoverflow.com/questions/73927720/nasm-define-static-idt-at-compile-time-without-using-the-linker#comment130534145_73927720</cite></blockquote> |
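
As a sketch of how this could look for a single IDT entry, assuming ''nasm -f bin'', a single section, and a load address of 7C00h. The name ''ORIGIN'', the selector value 0008h, and the handler body are assumptions made for illustration.

<code>
%define ORIGIN 7C00h            ; assumed load address, the same value as given to org

        org ORIGIN

isr:                            ; placeholder handler
        jmp isr

idt_gate:                       ; one 32-bit interrupt gate (8 bytes)
        dw (isr - $$ + ORIGIN) & 0FFFFh ; handler offset, bits 0 to 15
        dw 0008h                        ; code segment selector (assumed GDT layout)
        db 0                            ; reserved
        db 8Eh                          ; present, DPL 0, 32-bit interrupt gate
        dw (isr - $$ + ORIGIN) >> 16    ; handler offset, bits 16 to 31
</code>

Writing ''isr & 0FFFFh'' or ''isr >> 16'' directly would be rejected, because ''isr'' on its own is not a scalar.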
| | |
| | |
| | ==== Calculating size in paragraphs from an address, as a NASM constant expression ==== |
| | |
| | <blockquote>NASM with the multi-section bin output format requires you to use proper label arithmetic to calculate things like this. This is required because NASM's internal linker doesn't allow for arbitrary calculations. For example, you need ''(stack_end - stack_start + code_end - code_start + 256 + 15) / 16'' if the length you want to include has the PSP (''org 256'') plus a code section (stretching from offset 0100h up to the stack) plus this stack section. [[https://hg.pushbx.org/ecm/callver/file/55bf60a4fea0/callver.asm#l16|Here's some code]] using label arithmetic (in a single section). |
| | |
| | <cite>2022-01-04 comment, https://stackoverflow.com/questions/70573150/calculating-size-in-paragraphs-from-an-address-as-a-nasm-constant-expression#comment124755973_70573150</cite></blockquote> |
| | |
| | <blockquote>[[https://hg.pushbx.org/ecm/tsr/file/daca203fa216/transien.asm#l972|Here's another example]], in this case all sections' sizes are paragraph aligned so we can just add up the individual sizes to get a valid length in paragraphs. [[https://hg.pushbx.org/ecm/ldebug/file/c2d96de77120/source/debug.asm#l127|Here's yet another example]], this one uses several sections (data entry, data stack, code, init) and there's more label arithmetic done overall. |
| | |
| | <cite>2022-01-04 comment, https://stackoverflow.com/questions/70573150/calculating-size-in-paragraphs-from-an-address-as-a-nasm-constant-expression#comment124756051_70573150</cite></blockquote> |
| | |
| | <blockquote>@Peter Cordes: You are right about using ''add'' with an immediate if you have a build time constant. However, do consider that back in the bad old days people often calculated the entire constant at run time, eg [[https://hg.pushbx.org/ecm/callver/file/09e933418542/callver.asm#l15|in the old callver]] before I changed it. |
| | |
| | <cite>2022-01-04 comment, https://stackoverflow.com/questions/70573150/calculating-size-in-paragraphs-from-an-address-as-a-nasm-constant-expression#comment124756229_70573150</cite></blockquote> |
| | |
| | <blockquote>@Peter Cordes: Yes, if you put it at the right spot and in the right section then what I called ''code_end - code_start'' can be replaced by ''$ - $$''. In a single section the ''$$'' will evaluate to a symbol at the offset equal to the ''org'' value, but (like a regular label) not as a scalar. |
| | |
| | <cite>2022-01-06 comment, https://stackoverflow.com/questions/70573150/calculating-size-in-paragraphs-from-an-address-as-a-nasm-constant-expression#comment124808875_70573150</cite></blockquote> |
| | |
| | |
| | ==== Load low byte of address only in x86 assembly? ==== |
| | |
| | <blockquote>In NASM I usually do label arithmetic, such as putting ''label1'' at the start of a section then ''label2'' later in the section. The difference ''label2 - label1'' can be computed by the assembler without involving the linker backend. |
| | |
| | <cite>2021-05-06 comment, https://stackoverflow.com/questions/67422867/load-low-byte-of-address-only-in-x86-assembly#comment119173048_67422867</cite></blockquote> |
| | |
| | |
| | ==== Checking if the user's input is a palindrome ==== |
| | |
| | <blockquote>I don't see ''string'' being defined anywhere. The other problem is that you want ''movzx ecx, byte [length]'' \ ''shr ecx, 1'' to get the string length (memory access with brackets) then divide by two. Your ''mov ecx, (length / 2)'' is wrong and would, if it was supported, just calculate half of the **address** of the ''length'' variable, not the content. (If you really want to do arithmetic on addresses in NASM source, you need to calculate deltas like ''label2 - label1'' from two labels in the same section. These can be used for assemble-time division, shifting, etc.) |
| | |
| | <cite>2020-12-14 comment, https://stackoverflow.com/questions/65295702/checking-if-the-users-input-is-a-palindrome#comment115436806_65295702</cite></blockquote> |
| | |
| | |
| | ==== What's the real meaning of $$ in nasm ==== |
| | |
| | <blockquote>"''$ - $$'' is a "scalar" (as is any difference between labels)" -- Correction: Any difference between two labels //in the same section// is a scalar. |
| | |
| | <cite>2020-09-04 comment, https://stackoverflow.com/questions/14928741/whats-the-real-meaning-of-in-nasm/14933178#comment112709144_14933178</cite></blockquote> |
| | |
| | |
| | ==== Getting section size with nasm -f bin ==== |
| | |
| | <blockquote>I use ''align 16'' + ''section_end'' or ''section_size'' style [[https://hg.ulukai.org/ecm/ldebug/file/c34f60ea5916/source/debug.asm#l8122|labels after the data to emit]], then [[https://hg.ulukai.org/ecm/ldebug/file/c34f60ea5916/source/debug.asm#l8242|sum all the respective deltas]]. Eg ''text_end - text_start'' to figure out the length of the "text" section. NASM allows some arithmetic on addresses like this, but none that crosses sections. So you need to list all sections separately, and put a label behind the end of the data. (NASM 2.10 was supposed to introduce ''%final'' but that was scrapped.) |
| | |
| | <cite>2020-07-13 comment, https://stackoverflow.com/questions/62877962/getting-section-size-with-nasm-f-bin#comment111202724_62877962</cite></blockquote> |
| | |
| | |
| | ==== YASM [symbol+$$] Effective Address is Too Complex in a flat binary ==== |
| | |
| | <blockquote>"Binary files could be an exception, but aren't" -- In the NASM architecture, flat binary output is just another output format for the frontend, which is linked internally by the backend. That is, there is a linker and an intermediate object format there, albeit both internal to the assembler. That is why bin output doesn't allow fancy relocations which could be uniquely supported by it, eg https://sourceforge.net/p/nasm/feature-requests/176/ |
| | |
| | <cite>2019-11-24 comment, https://stackoverflow.com/questions/59010047/yasm-symbol-effective-address-is-too-complex-in-a-flat-binary/59012976#comment104284764_59012976</cite></blockquote> |
| | |
| | |
| | ==== How do I properly hook Interrupt 28h in assembly for DOS, and restore it? ==== |
| | |
| | <blockquote>''(Setup-Start+15)/16'' is incorrect, you need to add 256 for the PSP. |
| | |
| | <cite>2019-09-03 comment, https://stackoverflow.com/questions/56403333/how-do-i-properly-hook-interrupt-28h-in-assembly-for-dos-and-restore-it/56418301#comment101967158_56418301</cite></blockquote> |
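
A minimal sketch of the corrected calculation in a ''.COM'' style TSR, assuming ''nasm -f bin'' and a single section. ''Start'' and ''Setup'' are the labels from the comment; the resident part is only a placeholder.

<code>
        org 256                 ; .COM program, loaded behind the 256-byte PSP

Start:
        jmp Setup

        times 64 db 0           ; placeholder for the resident code and data

Setup:                          ; everything from here on is not kept resident
        ; size in paragraphs: PSP (256 bytes) plus the part up to Setup, rounded up
        mov dx, (Setup - Start + 256 + 15) / 16
        mov ax, 3100h           ; DOS function 31h: terminate and stay resident
        int 21h
</code>

The delta ''Setup - Start'' only measures the program image, which is why the 256 bytes of the PSP have to be added as a separate scalar.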
| | |
| | |
| | ==== Adding together labels in Assembly Language? ==== |
| | |
| | <blockquote>There are two types of symbol in x86 architecture: 1. plain number (scalar) 2. address (vector of two components: segment:offset). While you can add plain numbers or offset parts of addresses, addition of two addresses is not permitted. You can only add/subtract a plain number to/from an address, or subtract two addresses if both belong to the same segment. |
| | |
| | <cite>2015-01-18 comment by [[https://stackoverflow.com/users/2581418/vitsoft|vitsoft]], https://stackoverflow.com/questions/27996727/adding-together-labels-in-assembly-language#comment44406788_28000532</cite></blockquote> |
| | |
| | |
| | ==== NASM: emit MSW of non-scalar (link-time) value ==== |
| | |
| | <blockquote>Well... as you probably know, Nasm will condescend to do a shift on the difference between two labels. The usual construct is something like: |
| | |
| | ''dw (int3 - $$) >> 16'' |
| | |
| | where ''$$'' refers to the beginning of the section. This calculates the "file offset". This is probably not the value you want to shift. |
| | |
| | ''dw (int3 - $$ + ORIGIN) >> 16'' |
| | |
| | may do what you want... where ''ORIGIN'' is... well, what we told Nasm for ''org'', if we were using flat binary. I ASSume you're assembling to ''-f elf32'' or ''-f elf64'', telling ld ''--oformat=binary'', and telling ld either in a linker script or on the command line where you want ''.text'' to be (?). This seems to work. I made an interesting discovery: if you tell ld ''-oformat=binary'' (one hyphen) instead of ''--oformat=binary'' (two hyphens), ld silently outputs nothing! Don't do this - you waste a lot of time! |
| | |
| | <cite>2013-05-03 answer by [[https://stackoverflow.com/users/2082698/frank-kotler|Frank Kotler]], https://stackoverflow.com/questions/16351889/nasm-emit-msw-of-non-scalar-link-time-value/16355350#16355350</cite> |
| | </blockquote> |
| | |
| | |
| | {{tag>nasm warplink stackoverflow linkspam}} |
| | |
| | |
| | ~~DISCUSSION~~ |