This is a continuation of my series of articles on the history of personal computers. It follows the article on the MS-DOS zero page.
The series starts with my first CP/M article. If you have not read the previous articles, I recommend that you read them first.
In my article on the MS-DOS "Hello, world" program I described a program essentially identical to the CP/M "Hello, world" program for the 8-bit machine. Both programs live in a world where memory ranges from 0000h (0 B) to FFFFh (65535 B).
But the 8086 CPU is different in that it uses segmented memory. While programs running on the 8086, like those on the 8080 and Z80, can see 64 KB of memory at a time, the CPU itself can address 1024 KB (1 MB) of memory. To translate a 16-bit address used in a program into a 20-bit address used by the CPU, the CPU takes a 16-bit value from one of its four segment registers, shifts it 4 bits to the left and adds the address seen by the program to get a 20-bit address. (Note that a new segment can start every 16 bytes while each segment spans 64 KB, so the segments that can be described are not distinct and overlap; segment start addresses should therefore be chosen wisely.)
I'll leave out the actual maths to avoid making a silly mistake. We'll never need to see the actual numbers.
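For readers who would still like to see the numbers, here is a minimal sketch of the translation in Python, chosen purely for illustration. The function name `physical_address` is my own invention and not part of any DOS or FASM tooling, and the sketch assumes the behaviour of the original 8086, where the 20-bit result wraps around at 1 MB.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must fit in 16 bits")
    # Shift the segment 4 bits left (multiply by 16), add the offset,
    # and keep only 20 bits, because the 8086 has 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Two different segment:offset pairs that name the same byte.
    print(hex(physical_address(0x1234, 0x0010)))
    print(hex(physical_address(0x1235, 0x0000)))
```

Both calls describe the same byte, which is exactly the overlap mentioned above: moving the segment up by one is the same as moving the offset up by 16.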
I will try to demonstrate the multi-segment nature of MS-DOS programs using a "Hello, world" program that uses one code segment and two distinct data segments.
(Copyable text in mshello.asm)
As you can see, replacing the segment address in the data segment register makes the same instructions find different content at address zero.
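The same effect can be modelled outside of DOS. The following is only a rough Python simulation, not the assembly program itself: the segment values `DATA1` and `DATA2`, the two texts and all function names are hypothetical, and the strings are assumed to end with `$`, the terminator expected by the DOS print-string call (INT 21h, AH=09h).

```python
MEMORY = bytearray(1 << 20)  # 1 MB of simulated 8086 memory


def linear(segment: int, offset: int) -> int:
    """Translate segment:offset to a 20-bit address (segment * 16 + offset)."""
    return ((segment << 4) + offset) & 0xFFFFF


def load(segment: int, data: bytes) -> None:
    """Place data at offset 0 of the given segment."""
    start = linear(segment, 0)
    MEMORY[start:start + len(data)] = data


def read_dollar_string(ds: int, offset: int) -> str:
    """Read bytes at ds:offset up to (not including) the first '$'."""
    out = bytearray()
    for _ in range(0x10000):  # never scan more than one 64 KB segment
        byte = MEMORY[linear(ds, offset)]
        if byte == ord("$"):
            return out.decode("ascii")
        out.append(byte)
        offset = (offset + 1) & 0xFFFF  # offsets wrap within the segment
    raise ValueError("no '$' terminator found in this segment")


# Hypothetical segment addresses and texts, chosen only for this example.
DATA1, DATA2 = 0x2000, 0x2010
load(DATA1, b"Hello, world!$")
load(DATA2, b"Hello again!$")

if __name__ == "__main__":
    for ds in (DATA1, DATA2):
        # The offset is 0 both times; only the data segment value changes.
        print(read_dollar_string(ds, 0))
```

The reading code never changes its offset. Only the value standing in for the data segment register differs between the two calls, and that alone decides which text is found.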
We had better make a quick note of the fact that MS-DOS supports two different types of programs.
The first is the CP/M-style command file (program.com), which is a direct image of the program in memory, has no header and is simply loaded into memory at address 100h and run. Such a program has no idea that more than one segment of memory exists (in fact it doesn't even know that one segment exists) and just assumes that its 16-bit pointers cover all available memory. The Zero Page is located in memory between addresses 00h and FFh.
The second is the new-style executable file (program.exe), which is a program sorted into segments, has a header and is loaded into several segments by the operating system according to its wishes. The Zero Page is located in memory between addresses 00h and FFh relative to the address contained in the data segment register of the CPU.
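The presence or absence of a header is easy to check from outside DOS. The sketch below, again in Python and with a function name of my own choosing, tests whether a program image begins with the two bytes `MZ` (4Dh 5Ah) that open the header of an MS-DOS executable. A command file has no signature of its own, so this is only a heuristic: anything without the `MZ` bytes is merely assumed to be a plain memory image.

```python
def looks_like_mz_exe(image: bytes) -> bool:
    """Return True if the image starts with the 'MZ' signature of a DOS EXE header."""
    return image[:2] == b"MZ"


if __name__ == "__main__":
    # Hypothetical first bytes of two program files.
    print(looks_like_mz_exe(b"MZ\x90\x00"))   # starts with an MZ header
    print(looks_like_mz_exe(b"\xb4\x09\xba"))  # starts directly with machine code
```

To try it on a real file, read the first two bytes with `open(path, "rb").read(2)` and pass them to the function.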
Sources
A usable list of x86 opcodes: http://www.mathemainzel.info/files/x86asmref.html
Useful Software
flat assembler (FASM) for MS-DOS, Windows, and UNIX: http://flatassembler.net
FASM can actually create program image files (COM) and formatted program files (EXE) for MS-DOS (MZ header) and Windows NT (PE header) as well as for UNIX (COFF and ELF format).