CP/M and MS-DOS Fat Binary

This article is not really a part of my series on CP/M and MS-DOS.

This comment on Raymond Chen's blog post on patching Microsoft Money sparked my interest.

"ChristW" wrote:

Nerdiest byte sequence that I ever thought up was: EB 03 C3 yy xx

If you create a .COM file with those 5 bytes as the first ones, and look at the disassembly (I'm assuming an x86 disassembler here..), you'll see 'JMP SHORT 3', followed by 3 garbage bytes. Nothing unusual here...

If you look at a Z80 disassembly of those same bytes, that translates to 'EX DE,HL; INC BC;'. Exchange the content of 2 registers, increment another one, nothing special. The 3rd byte is 'JUMP' followed by the 16-bit address specified as yy xx.

So, if you create a .COM file with the 5 bytes as above, followed by 8088 code, followed by Z80 code (at offset xxyy + 0x100 in the source), you'll have a .COM file that runs on MS-DOS and (i.e.) CP/M...

Now, I had to try that, didn't I?

Armed with a trustworthy x86 assembler for MS-DOS and two sites that list the 8086 and z80 opcodes in the right syntax, simple enough for me to follow, I figured I could do this.

I followed "ChristW"'s instructions to the byte.

This is the entire program in a hex editor. It is 45 bytes long.


You can see that it starts with his sequence of EB 03 C3.

And this is the assembler source I came up with.


(You can find a copyable version of the program text here.)

The first two lines are assembler directives. This is 16-bit code (officially) and it starts at address 100h. I use three labels, "msdos", "cpm" and "hello", but I only put the names there to make the source easier to read. The code itself never refers to them.

The next three lines are the 8086 short jump to 105h (an address I calculated later), a comment showing the z80 jump to 110h (also calculated later), and the individual bytes the z80 jump assembles to. (Note that the 8086 and the z80 are both little-endian, so the address 0110h is stored low byte first, as the bytes 10h, 01h.)

Counting the bytes used so far gives me 5 bytes. Adding 5 to 100h gives me 105h for the beginning of my MS-DOS code. The short jump itself occupies the first 2 bytes, so reaching 105h means skipping the 3 bytes of the z80 jump that follow it. That is why the jump assembles to EB 03.
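The address arithmetic for the first five bytes can be double-checked with a few lines of Python. This is only a sketch: the names LOAD_ADDR, x86_short_jump and z80_jump are made up for illustration and do not appear in my assembler source.

```python
LOAD_ADDR = 0x100  # .COM files are loaded at 100h on both MS-DOS and CP/M


def x86_short_jump(at, target):
    """Encode an 8086 JMP SHORT located at address `at` (hypothetical helper)."""
    # The displacement is relative to the address of the NEXT instruction,
    # and the short jump itself is 2 bytes long.
    rel = target - (at + 2)
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])


def z80_jump(target):
    """Encode a Z80 JP nn: opcode C3h plus a little-endian 16-bit address."""
    return bytes([0xC3]) + target.to_bytes(2, "little")


# The 8086 sees: JMP SHORT 105h, then three bytes it never executes.
# The Z80 sees:  EX DE,HL (EBh), INC BC (03h), JP 0110h.
header = x86_short_jump(LOAD_ADDR, 0x105) + z80_jump(0x110)
print(header.hex(" "))  # eb 03 c3 10 01
```

The same five bytes serve both processors because the z80 treats EB and 03 as two harmless one-byte instructions before it reaches its own jump.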

The next 11 bytes are the standard MS-DOS code for printing a string on the screen and exiting the program. The string to be printed is located at 11Dh (an address I calculated later). 16 bytes have been used in total since the beginning.

Following those 11 bytes, at address 110h (100h + 16), starts the z80 code for printing the same string. The comments show the z80 assembler and the lines below those show each individual command translated into individual bytes.

You can see that the process of printing the string is pretty much the same on both platforms:

  1. Write the address of the string into the data register (DX on the 8086, DE on the z80).
  2. Write the system call number 9 (print string) into the register used for system calls (AH on the 8086, C on the z80).
  3. Call the operating system.
  4. Write the system call number 0 (terminate program) into the register used for system calls.
  5. Call the operating system.
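To make the parallel concrete, here are the five steps written out as mnemonic and machine-code pairs for each processor. The byte values are hand-assembled from the standard 8086 and Z80 opcode tables and are meant as a sketch, not as a copy of my source listing. The exit sequence (function 0 through the same entry point) is an assumption that matches the list above and the byte counts in this article.

```python
# (mnemonic, machine code) pairs for the five steps on each CPU.
# Assumption: the string lives at 011Dh, the address used in the article.
MSDOS = [
    ("mov dx, 011Dh", bytes([0xBA, 0x1D, 0x01])),  # 1. string address -> DX
    ("mov ah, 9",     bytes([0xB4, 0x09])),        # 2. function 9: print string
    ("int 21h",       bytes([0xCD, 0x21])),        # 3. call MS-DOS
    ("mov ah, 0",     bytes([0xB4, 0x00])),        # 4. function 0: terminate
    ("int 21h",       bytes([0xCD, 0x21])),        # 5. call MS-DOS
]

CPM = [
    ("ld de, 011Dh",  bytes([0x11, 0x1D, 0x01])),  # 1. string address -> DE
    ("ld c, 9",       bytes([0x0E, 0x09])),        # 2. function 9: print string
    ("call 0005h",    bytes([0xCD, 0x05, 0x00])),  # 3. call CP/M (BDOS entry)
    ("ld c, 0",       bytes([0x0E, 0x00])),        # 4. function 0: system reset
    ("call 0005h",    bytes([0xCD, 0x05, 0x00])),  # 5. call CP/M
]


def size(program):
    """Total number of machine-code bytes in a list of (mnemonic, code) pairs."""
    return sum(len(code) for _, code in program)


for name, program in (("MS-DOS", MSDOS), ("CP/M", CPM)):
    print(f"{name}: {size(program)} bytes")
    for mnemonic, code in program:
        print(f"  {mnemonic:<14} {code.hex(' ')}")
```

A small curiosity falls out of the listing: the opcode CDh means INT on the 8086 and CALL on the z80, so both system calls start with the same byte.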

The z80 code took 13 bytes, two more than the MS-DOS code. (MS-DOS is called via an interrupt, and there are only 256 interrupts, so the call takes two bytes: one for the opcode and one for the interrupt number. CP/M is called via an address, and there are 65536 possible addresses, so the call takes three bytes: one for the opcode and two for the address, even though that address is just 5. I think.)

This brings me to address 11Dh (100h + 16 + 13). This is where my string is stored. I called the location "hello" but that doesn't matter since the label name is not used. I used absolute addresses because the labels would not exist in both worlds, 8086 and z80, anyway.
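The whole layout can be checked by gluing the pieces together in Python and comparing the start of each part with the hard-coded addresses. This is a sketch under assumptions: the byte values are hand-assembled from the instructions described above, and the text of the string is a guess. Any text works as long as it ends in "$", the terminator that the print-string call expects on both systems.

```python
LOAD_ADDR = 0x100  # .COM load address on both MS-DOS and CP/M

# Assumption: a 16-byte "$"-terminated greeting, chosen so that the file
# comes out at 45 bytes. The article does not show the actual string.
MESSAGE = b"Hello, world!\r\n$"

header = bytes.fromhex("EB 03 C3 10 01")                       # dual-CPU jump
msdos = bytes.fromhex("BA 1D 01 B4 09 CD 21 B4 00 CD 21")      # 8086 code
cpm = bytes.fromhex("11 1D 01 0E 09 CD 05 00 0E 00 CD 05 00")  # z80 code

image = header + msdos + cpm + MESSAGE

# Each part must start at the absolute address that the code refers to.
msdos_addr = LOAD_ADDR + len(header)
cpm_addr = msdos_addr + len(msdos)
hello_addr = cpm_addr + len(cpm)

print(f"msdos at {msdos_addr:X}h, cpm at {cpm_addr:X}h, hello at {hello_addr:X}h")
print(f"total size: {len(image)} bytes")
```

Writing the bytes in image to a file with a .COM extension would give a candidate binary to try on both systems. I have not verified this generated file myself, so treat it as a starting point.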


Ok, that was fun. Bye now.

 © Andrew Brehm 2016