Why is it said that all modern Intel processors of the x86 family descend from the Intel 8086 and not the Intel 8080? From the Wikipedia article on the Intel 8086:
The 8086 gave rise to the x86 architecture, which eventually became Intel's most successful line of processors.
But why start at the 8086 when the 8086 was source compatible with the 8080, which already had some 16-bit operations? What is the defining feature that set the two chips so far apart that the 8086 is said to start the architecture?
Evan Carroll
8086 was designed to make asm source porting from 8080 easy (not the other direction). It is not binary compatible with 8080, and not source-compatible either. 8080 is not an x86 CPU. 8080 is a more distant ancestor that had some influence on the design of 8086, but it's not the same architecture. As an analogy, all x86 CPUs are the same genus but different species, while 8080 is a different genus.
8080 itself has some ancestors like 8008, so if you're considering more-distant ancestors and not strict binary compat, then you definitely don't stop at 8080 as the earliest ancestor.
Modern x86 CPUs are binary compatible with 8086. You can literally run 8086 binaries on a modern PC, in real mode. (The species analogy is a stretch here, but works if you look at forward compat instead of backwards compat: old x86 chips can't run AVX / AVX2 / FMA / AVX512 code, so you could look at each ISA extension as a speciation event.)
The 86 in x86 comes from 8086 / 80186 / 80286 / ..., Intel's official CPU model numbers until they switched to names like Pentium (because you can't trademark a number).
Modern PC firmware usually still supports booting in legacy BIOS mode, supporting int 0x10 / int 0x13 etc. "BIOS" system calls for keyboard/screen input/output, reading disks, and so on. This is a PC software thing, going beyond 8086 binary compatibility, but it does mean you can still boot an 8086 kernel / bootloader on a modern PC.
8080 machine code is completely different from 8086: reg,reg instructions are 1 byte long (8080 opcode map), vs. most 8086 instructions specifying operands in a ModR/M byte. In 8080, the destination register is always part of the opcode (and ALU ops are mostly only available with A (the accumulator) as the destination).
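To make the encoding difference concrete, here is a minimal Python sketch that pulls the register fields out of a one-byte 8080 MOV opcode and splits an 8086 ModR/M byte into its three fields. The function names are made up for illustration, and the bit layouts are taken from the standard opcode maps rather than from anything stated in this answer.

```python
# 8080 register field values 0..7; M means "memory at the address in HL".
REGS_8080 = ["B", "C", "D", "E", "H", "L", "M", "A"]

def decode_8080_mov(opcode):
    """Decode a one-byte 8080 MOV, laid out as bits 01 DDD SSS."""
    if opcode >> 6 != 0b01 or opcode == 0x76:  # 0x76 is HLT, not MOV M,M
        raise ValueError("not an 8080 MOV opcode")
    dst = REGS_8080[(opcode >> 3) & 7]
    src = REGS_8080[opcode & 7]
    return f"MOV {dst},{src}"

def split_modrm(byte):
    """Split an 8086 ModR/M byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

print(decode_8080_mov(0x4E))  # the whole instruction is this one byte
print(split_modrm(0x0F))      # second byte of the 8086 encoding 8A 0F
```

On the 8080 both operands live inside the opcode byte itself, while on the 8086 the opcode byte (8Ah here) only selects the operation and the operands come from the separate ModR/M byte.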
8080 asm source is also not the same as 8086 asm source: the register names are different, and so are many of the mnemonics. e.g. ADI 123 for an add-immediate (implicitly to the accumulator, I think) or ORA E to do A |= E.
8086 has segmentation but 8080 doesn't (just a flat 16-bit address space).
You can write a program to mechanically translate from 8080 to 8086 asm source, but you can't just rebuild the same asm source for a different architecture. It's not even close to really being the same.
MichaelPetch says there were assemblers that could read 8080 source and output 8086 machine code (i.e. with the translation built in to the assembler, presumably with some fixed mapping between 8080 byte registers and 8086 AL/AH/BL/BH/...). IDK if they would ever have to emit multiple 8086 instructions for any 8080 mnemonics.
The manual for one such translator is XLT86™ 8080 to 8086 Assembly Language Translator USER'S GUIDE, from Digital Research, Inc.
This is not what I'd call "assembly-language compatible". It's close enough to enable translating single instructions separately (I think), but that's about it. You have to realize that by programming an 8086 using pure 8080 asm source, you're missing out on the power of 16-bit operations, and any 8086-specific optimizations.
Fun fact: Patrick Schlüter comments about chips that were binary compatible, not just source compatible. Contrast this with Intel chips which did not do this:
NEC V20/V30 were 80186 compatible CPUs that could explicitly execute 8080 binaries. They had 2 instructions that allowed to call and to trap into 8080 functions.
That's a similar idea to modern CPUs that support multiple machine-code formats, like ARM with Thumb vs. ARM, or early Itanium (IA-64) with hardware support for x86 machine code with some rules for mapping IA-32 register state onto the IA-64 registers. Or x86 protected (and long) mode with far calls between 16-bit and 32-bit (and 64-bit) code segments. (But not real mode; although it decodes the same as 16-bit protected mode, segment regs mean different things so real-mode code usually has to run inside vm86 mode, hardware virtualization under a 32-bit kernel.)
To supplement @PeterCordes's excellent answer, I thought it would be worth going into the details of exactly how close to source code compatible the two processors are -- for example, how easy would it be to use textual substitutions (e.g. macros) to automatically translate 8080 code to 8086 code, and what the limitations would be.
The first point would be to examine how the registers in the architecture can be mapped. Fortunately, the 8086 registers are effectively a superset of the 8080 registers, so we can map A to AL, BC to CX, DE to DX and HL to BX (this ends up with the registers in a non-intuitive order, as HL can be used for indirect memory addressing, which is better supported using BX than the other general purpose registers on the 8086 -- but note that this unusual ordering of the registers is actually reflected in their conversion to machine code, suggesting that while the mnemonics for the registers weren't named with 8080 compatibility in mind, the design of the instruction set was). Clearly SP and IP must map to the registers for the same purpose, as must the flag register, which conveniently has the bits with the same meanings in the same locations when it is stored elsewhere. But here we note the first incompatibility: the 8080 groups the A register with the flags register (referring to the combination as the 'processor status word') and handles them together as a unit (for example when pushing and popping to the stack), but in the 8086 both are expanded to a full 16 bits and handled individually.
This means that the following 8080 instructions have no single instruction that can perform the same operation on the 8086:
    PUSH PSW    ; "push af" for those who prefer Z80 syntax
    POP PSW     ; "pop af"
To emulate these operations on the 8086 you'd need multiple instructions:
    LAHF        ; Load AH from low-order 8 bits of flags
    PUSH AX

    POP AX
    SAHF        ; Store AH in low-order 8 bits of flags
Again, I have a suspicion that the LAHF and SAHF instructions were specifically designed to allow this translation -- they're a pretty unusual operation to support in most respects -- and the choice of AH (when AL would be the more usual target for such an operation) seems strongly to indicate that these instructions were added to make 8080 translation easier.
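As a rough check on the emulation sequence above, the following Python sketch models the 16-bit word that each approach leaves on the stack. It is a toy model under stated assumptions, not an emulator: the function names are hypothetical, AL stands in for the 8080's A register, and only the low byte of the 8086 flags is modelled.

```python
# Low byte of the flags, same bit layout on the 8080 and the 8086:
# S Z 0 AC 0 P 1 C  (bit 7 down to bit 0)

def push_psw_8080(a, flags):
    """Word an 8080 PUSH PSW places on the stack: A high, flags low."""
    return (a << 8) | flags

def lahf_push_ax(al, flags_low):
    """Word pushed by LAHF; PUSH AX on an 8086 (AL holds the 8080's A)."""
    ah = flags_low          # LAHF: AH <- low byte of flags
    return (ah << 8) | al   # PUSH AX: AH is the high byte of the word

def pop_ax_sahf(word):
    """POP AX; SAHF: returns (AL, restored low byte of flags)."""
    al, ah = word & 0xFF, word >> 8
    return al, ah           # SAHF: low byte of flags <- AH

a, flags = 0x12, 0x46       # 0x46 = 0100 0110: Z, P and the always-1 bit
word = lahf_push_ax(a, flags)
print(hex(word), hex(push_psw_8080(a, flags)))
print(pop_ax_sahf(word) == (a, flags))
```

A matched push/pop pair round-trips A and the flags, which is what most application code needs. The two bytes do sit in the opposite order on the stack compared with a real 8080, so code that inspects the saved word directly would need an extra byte swap.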
Looking through the table of instructions supported by the 8080, few others stand out as not being easy one-to-one translations, although as Peter Cordes points out many 1-byte instructions become 2-byte instructions on the 8086 (e.g. MOV C,M, or the Z80 equivalent ld c,(hl), which is 4Eh on the 8080, would convert to MOV CL,[BX], or 8Ah 0Fh on the 8086; likewise PCHL, which loads the program counter with HL, is equivalent to the 8086's JMP BX, or FFh E3h). Also slightly tricky are the RST n instructions, which could plausibly convert to INT nn, although there are subtle differences on the receiving end of the call ... but that would usually be system software, and I believe the intent was to allow easy application compatibility but that a complete rewrite of system software would have been expected. Another group of instructions that aren't supported are conditional calls and returns (e.g. CNZ addr / call nz, addr), which would need to be emulated on the 8086 by a conditional jump with the opposite condition skipping over the instruction.
There are two issues that are caused by the expansion of single byte instructions into two bytes:
It therefore seems reasonably simple to perform an automated one-to-one translation of code from 8080 to 8086; a macro assembler may well have been able to handle the translation, even if dedicated packages (as mentioned above) weren't available. It wouldn't work for all programs, but with a small increase in memory required, it should be reasonably simple to make most programs work successfully.
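To give a feel for what such a mechanical translation might look like, here is a small Python sketch covering only a handful of the mnemonics discussed above. It is not how XLT86 or any real translator worked; the `translate` function, the `skipN` label scheme and the subset of instructions are all assumptions made for illustration, and flag side effects are not checked for equivalence.

```python
import itertools

# Byte-register mapping implied by A->AL, BC->CX, DE->DX, HL->BX; M is (HL).
REG8 = {"A": "AL", "B": "CH", "C": "CL", "D": "DH", "E": "DL",
        "H": "BH", "L": "BL", "M": "[BX]"}

# Conditional calls become an inverted conditional jump around a CALL.
INVERSE_JUMP = {"CNZ": "JZ", "CZ": "JNZ", "CNC": "JC", "CC": "JNC"}

def translate(lines):
    """Translate a few 8080 mnemonics to 8086 source (illustrative subset)."""
    out = []
    labels = itertools.count()  # hypothetical generated labels skip0, skip1, ...
    for line in lines:
        code = line.split(";", 1)[0].strip()  # drop comments
        if not code:
            continue
        parts = code.split(None, 1)
        mnemonic = parts[0].upper()
        ops = [o.strip() for o in parts[1].split(",")] if len(parts) > 1 else []
        regs = [o.upper() for o in ops]
        if mnemonic == "MOV":
            out.append(f"MOV {REG8[regs[0]]},{REG8[regs[1]]}")
        elif mnemonic == "ADI":
            out.append(f"ADD AL,{ops[0]}")
        elif mnemonic == "ORA":
            out.append(f"OR AL,{REG8[regs[0]]}")
        elif mnemonic == "PUSH" and regs == ["PSW"]:
            out += ["LAHF", "PUSH AX"]
        elif mnemonic == "POP" and regs == ["PSW"]:
            out += ["POP AX", "SAHF"]
        elif mnemonic in INVERSE_JUMP:
            skip = f"skip{next(labels)}"
            out += [f"{INVERSE_JUMP[mnemonic]} {skip}", f"CALL {ops[0]}", f"{skip}:"]
        else:
            raise ValueError(f"not handled in this sketch: {code}")
    return out

for line in translate(["MOV C,M", "ORA E", "PUSH PSW ; save A and flags",
                       "CNZ target", "POP PSW"]):
    print(line)
```

Most lines map one-to-one, while PUSH PSW, POP PSW and the conditional calls each expand to two or three 8086 lines, which is where the growth in code size comes from.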
Another interesting question is to what extent the extended variants of the 8080, that is, the 8085 and the Z80, are also compatible.
The relevant 8085 extensions are:
RIM (read interrupt masks) and SIM (store interrupt masks) - the operations here are entirely unsupported by the 8086; machines using the 8086 typically use an external programmable interrupt controller that provides the same feature.
RDEL are undocumented 16-bit arithmetic instructions that are obviously well supported on the 8086.
LDHI, interestingly, is (an undocumented) equivalent to the 8086's LEA DX, [BX+n] instruction, which I'd previously thought to be entirely unique to the 8086.
LDSI is a parallel to LDHI but using SP as its source register: the 8086 equivalent would be LEA DX, [SP+n], except that the 8086 doesn't support using SP like that ... you'd have to wait for the 80386 to get support for an equivalent instruction, and then only with 32-bit registers. You'd probably encode this as MOV DI, SP; LEA DX, [DI+n] instead. Which is the first time I've seen a 4 byte instruction come out from a single byte instruction input...
LHLX are 16-bit (undocumented) indirect memory operations, of a kind quite natural on the 8086.
The Z80 is harder still: it extends the 8080's register set with not just a new pair of index registers (IX and IY, which could be mapped to DI and SI ... although the undocumented I[X/Y]H and I[X/Y]L instructions would have no direct equivalent there) but an entire duplicate set of registers (which cannot be mapped to anything, because we've run out of registers now). Any application using the Z80's ex af, af' instructions would be difficult to translate automatically. Other Z80 instructions are easier, e.g. djnz has no exact equivalent (the 8086's LOOP CX is a 16-bit equivalent, but there is no 8 bit version), and instructions like ldd, etc are broadly (although not precisely) equivalent to the 8086's string processing instructions (e.g. MOVSB) and repeat prefix (REP MOVSB being roughly equivalent to ldir) -- although the precise details are different, meaning some register remapping may be necessary to make them work. On the whole, the possibility of doing automatic translation of Z80 programs is a whole lot less convincing.
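To illustrate the kind of remapping this would involve, here is a hedged Python sketch that expands djnz and ldir into 8086 sequences, using the register mapping from earlier in this answer (B to CH, BC to CX, DE to DX, HL to BX). The `translate_z80` function is hypothetical, and the sequences are approximations: they assume ES equals DS, they clobber SI and DI, DEC changes flags that djnz leaves alone, and the BC = 0 case behaves differently (ldir copies 64K bytes, REP MOVSB copies none).

```python
def translate_z80(mnemonic, operand=None):
    """Expand two Z80-only instructions into approximate 8086 sequences."""
    m = mnemonic.lower()
    if m == "djnz":
        # 8-bit counter in B (mapped to CH); LOOP would count in all of CX.
        return ["DEC CH", f"JNZ {operand}"]
    if m == "ldir":
        # Z80: copy (HL) -> (DE), BC times. 8086: copy DS:SI -> ES:DI, CX times.
        return [
            "MOV SI,BX",   # source pointer: HL lives in BX
            "MOV DI,DX",   # destination pointer: DE lives in DX
            "CLD",         # copy upwards, as ldir does
            "REP MOVSB",   # count is already in CX (the 8080/Z80's BC)
            "MOV BX,SI",   # write the advanced pointers back
            "MOV DX,DI",
        ]
    raise ValueError(f"not handled in this sketch: {mnemonic}")

print(translate_z80("djnz", "loop_top"))
print(translate_z80("ldir"))
```

A single two-byte ldir turns into six 8086 instructions here, which gives some sense of why automatic Z80 translation looks less attractive than the 8080 case.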