The 80x86 Opcode Summary

Prepared by Tom Novelli, August 2003

Revised, August 2009

Caveat Emptor - Missing Pieces

The reason I wrote this was to show how the core 80x86 instruction set is organized, as concisely as possible, to help me write compilers and assemblers for use under Linux and other "modern" systems.

The non-core instructions are relatively straightforward, but you won't find anything about them here. What's missing:

FPU instructions

MMX, SSE, SSE2/3/4+, and other extensions

Supervisor mode instructions

Segmentation cruft

16-bit "Real Mode" cruft

Notation & Terminology

All numbers are in OCTAL unless otherwise indicated. Viewed in the proper radix, the x86 is a thing of beauty, almost :-)
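To see why octal is the natural radix here, this small sketch splits a byte into the 2-3-3 bit groups that most x86 bytes are built from. The helper names `octal_digits` and `as_octal` are mine, purely for illustration.

```python
def octal_digits(byte):
    """Split one byte into its 2-3-3 bit groups, i.e. three octal digits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    return (byte >> 6) & 0o3, (byte >> 3) & 0o7, byte & 0o7

def as_octal(byte):
    """Format a byte the way this summary writes it, e.g. 0x89 -> '211'."""
    return "%d%d%d" % octal_digits(byte)

# The two bytes of MOV EAX, ECX as a hex dump would show them (89 C8)
print(as_octal(0x89), as_octal(0xC8))
```

This is handy when reading a hex dump next to the tables in this summary.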

Here's the general format of machine code instructions, i.e. the instruction encoding:

prefix(es) opcode [xrm [sib]] disp imm

prefixes

Zero or more prefix bytes that affect the following opcode.

opcode

One or two opcode bytes encode the instruction, i.e. MOV, ADD, SUB, etc. Some simple instructions (e.g. PUSH, POP) encode their operands in the opcode.

xrm

Also called the ModRM or Mod-Reg-R/M byte: more on this in the next section.

sib

Scale-Index-Base byte: an "overflow byte" for XRM; more later.

disp

Displacement: a 32-bit absolute memory address, or, in certain forms of CALL/JMP instructions, a short (8-bit) or near (32-bit) relative offset.

imm

Immediate: i.e., a constant, a literal value. Used when the source operand is hard-coded in the instruction. Can be 8, 16, or 32 bits depending on prefixes and opcode.

A few more terms and abbreviations used below:

word

The machine's word size, 32-bit in 32-bit mode, and so forth. Default for most instructions; can be overridden by OPSIZ and ADRSIZ prefixes.

reg

Register operand

r/m

Register-or-Memory operand

Opcode Encodings

Here's an example:

Encoding Mnemonic Notes

210+dw xrm MOV r/m, reg

210 is the base opcode, in octal.

+dw means you may add a direction flag to bit 1 of the opcode, and you may add a word-size flag to bit 0.

xrm means an XRM byte must follow the opcode.

Anywhere there's a letter in place of an octal digit, substitute a 3-bit number (2-bit for the most significant digit, of course).

Most opcodes include one or two of the following flags, almost always in these bits:

+w bit 0 Word size: 0=byte, 1=word

+d bit 1 Direction: reverse src,dest (Applies to MOV, ALU)

+s bit 1 Sign-extend imm8 to word (Applies to PUSH, ALU, IMUL3)
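As a rough sketch of how these flags combine with a base opcode, the following enumerates the four variants of the MOV example above. The constant and function names are hypothetical, chosen only for this illustration.

```python
W_BIT = 0o1  # +w: bit 0, 0 = byte, 1 = word
D_BIT = 0o2  # +d: bit 1, reverses src and dest
S_BIT = 0o2  # +s: bit 1, sign-extend imm8 to word (same bit as +d)

def with_flags(base, d=0, w=0):
    """Add the optional +d and +w flags to a base opcode such as 0o210."""
    return base | (D_BIT if d else 0) | (W_BIT if w else 0)

# The four variants of "210+dw xrm  MOV r/m, reg"
for d in (0, 1):
    for w in (0, 1):
        print("d=%d w=%d -> %03o" % (d, w, with_flags(0o210, d, w)))

# "150+s imm  PUSH imm" with the sign-extend flag set
push_imm8 = 0o150 | S_BIT
print("%03o" % push_imm8)
```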

Operand Encodings

Most of the instruction set's apparent complexity can be factored out as follows.

Register Encodings

Reg No. 0 1 2 3 4 5 6 7

byte AL CL DL BL AH CH DH BH

word AX CX DX BX SP BP SI DI

dword EAX ECX EDX EBX ESP EBP ESI EDI

sreg ES CS SS DS FS GS

These registers all have names corresponding with their special purpose. Notice that they're sorted by their encoding, not alphabetically as in all the manuals. (As if machine code hacking wasn't confusing enough :-)

AX = Accumulator (for arithmetic, logic, etc.)

CX = Counter (for LOOP, REP, etc.)

DX = Data (e.g. in IN/OUT) or Double (e.g. in MUL/DIV)

BX = Base (for base+displacement addressing)

SP = Stack Pointer

BP = Base Pointer (aka Stack Frame Pointer)

SI = Source Index (for string operations)

DI = Destination Index (for string operations)

The segment registers are nearly obsolete. The CS (Code), SS (Stack), and DS (Data) segments are simply set to 0 under any Unix-like OS. This makes it a lot easier to do nifty things like JIT, "Cheney on the MTA" GC, etc.
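Going back to the register table, an assembler needs little more than a lookup from name to number. Here is a minimal sketch of one; the `REGS` dictionary and `reg_number` function are illustrative names, not part of any real tool.

```python
# Register names indexed by their encoding, copied from the table above.
REGS = {
    "byte":  ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"],
    "word":  ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"],
    "dword": ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"],
    "sreg":  ["ES", "CS", "SS", "DS", "FS", "GS"],
}

def reg_number(name):
    """Return (size class, register number) for a register name."""
    name = name.upper()
    for size, names in REGS.items():
        if name in names:
            return size, names.index(name)
    raise KeyError(name)

print(reg_number("ebx"))
print(reg_number("AH"))
```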

XRM (ModRM) Encoding

When an instruction involves two operands, they are usually encoded in the second byte, called the "XRM" byte (officially it's "ModRM" but that doesn't fit neatly into our little ASCII diagrams!) If a memory reference is involved (whether absolute or relative to a register), then we call it the displacement; it can be a full 32 bits, shown as disp32 below, or it can be a "short" 8 bits, shown as disp8.

xxrrrmmm (binary), best viewed as 3 octal digits:

x: modifier flag (indicates one of four cases)

r: register (normally the source operand)

m: reg/mem (normally the destination operand)

Encoding Reg/Mem Addressing Mode Notes

0rm DS:[base]

0r5 disp32 DS:[disp32] Can't use EBP as Base with no disp

1rm disp8 DS:[base + disp8]

2rm disp32 DS:[base + disp32]

xr4 sib SIB byte follows (see below) Can't use ESP as Base

3rm reg

Note: The 0r5 and xr4 rows are the special cases.
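The XRM table can be sketched in a few lines. This is only an illustration for 32-bit mode; `xrm` and `describe_rm` are hypothetical helper names.

```python
def xrm(x, r, m):
    """Pack the three octal digits of an XRM (ModRM) byte."""
    assert 0 <= x <= 3 and 0 <= r <= 7 and 0 <= m <= 7
    return (x << 6) | (r << 3) | m

def describe_rm(x, m):
    """Name the addressing mode selected by the x and m digits."""
    if x == 3:
        return "reg"
    if m == 4:
        return "SIB byte follows"
    if x == 0:
        return "[disp32]" if m == 5 else "[base]"
    return "[base + disp8]" if x == 1 else "[base + disp32]"

# MOV EAX, ECX: opcode 211 (MOV r/m, reg, word size),
# r = ECX (1) is the source, m = EAX (0) is the destination.
code = bytes([0o211, xrm(3, 1, 0)])
print(code.hex())
```

Note that the special cases are checked before the general ones, which is how a decoder has to treat them too.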

SIB (Scale*Index+Base) Encoding

More complex memory references are encoded in a second "SIB" byte, which always follows an "XRM" byte.

ssiiibbb (binary), again, best viewed as 3 octal digits:

s: Scale (multiplier for the Index register)

i: Index register

b: Base register

Encoding Reg/Mem Addressing Mode Notes

0r4 sib DS:[base + scale*index]

0r4 si5 disp32 DS:[scale*index + disp32] Can't use EBP as Base w/o disp

1r4 sib disp8 DS:[base + scale*index + disp8]

2r4 sib disp32 DS:[base + scale*index + disp32]

xr4 04b DS:[base] Can't use ESP as Index

xr4 s4b --- Undefined if s > 0
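Here is a hedged sketch of putting an XRM byte and a SIB byte together. It assumes the usual encoding of the scale field, where multipliers 1, 2, 4, 8 are stored as 0, 1, 2, 3; the `sib` helper and the register constants are names I made up for the example.

```python
def sib(scale, index, base):
    """Pack a SIB byte; scale is the multiplier 1, 2, 4 or 8."""
    s = {1: 0, 2: 1, 4: 2, 8: 3}[scale]
    if index == 4:
        raise ValueError("ESP (4) cannot be used as an index")
    return (s << 6) | (index << 3) | base

EAX, EBX, ESI = 0, 3, 6

# MOV EAX, [EBX + ESI*4 + 8]
#   213  = 210 + d + w  (MOV reg, r/m, word size)
#   1r4  = XRM with disp8 and a SIB byte, r = EAX
code = bytes([0o213, (1 << 6) | (EAX << 3) | 4, sib(4, ESI, EBX), 8])
print(code.hex())
```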

Instruction Encodings

MOV

MOV dest, src

Encoding Mnemonic Notes

210+dw xrm MOV r/m, reg

214+d xsm MOV r/m, sreg Segments not used in Unix

240+dw disp MOV acc, mem

26r imm8 MOV reg, imm8

27r imm32 MOV reg, imm32 bit 3 = 'W' bit

306+w xrm imm MOV r/m, imm
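The register-immediate forms are the easiest to assemble by hand, since the register number sits in the low octal digit of the opcode. A minimal sketch follows, assuming the usual little-endian byte order for immediates; the function names are illustrative.

```python
import struct

def mov_reg_imm32(reg, value):
    """Encode MOV reg32, imm32: opcode 27r, then a little-endian imm32."""
    return bytes([0o270 | reg]) + struct.pack("<I", value & 0xFFFFFFFF)

def mov_reg_imm8(reg, value):
    """Encode MOV reg8, imm8: opcode 26r, then one immediate byte."""
    return bytes([0o260 | reg, value & 0xFF])

print(mov_reg_imm32(0, 0x12345678).hex())  # MOV EAX, 0x12345678
print(mov_reg_imm8(1, 0x7F).hex())         # MOV CL, 0x7F
```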

LEA

LEA dest, src -- Load effective address (store address of src in dest)

215 xrm LEA reg, r/m

XCHG

206+w xrm XCHG reg, r/m

22r XCHG EAX, reg (XCHG EAX,EAX = 220 = NOP :-)

Arithmetic and Logic (ALU)

??? dest, src

Eight instructions following a pattern, with three different forms.

0p0+dw xrm ??? r/m, reg

0p4+w imm ??? acc, imm

200+sw xpm imm ??? r/m, imm 202 (extend word->byte) is invalid

... p=0 ADD

... p=1 OR

... p=2 ADC

... p=3 SBB

... p=4 AND

... p=5 SUB

... p=6 XOR

... p=7 CMP Read-only SUB; sets FLAGS only.
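Because the operation number p is just the middle octal digit, the first two forms reduce to a bit of arithmetic. The sketch below is illustrative only; `ALU_OPS`, `alu_rm_reg`, and `alu_acc_imm` are hypothetical names.

```python
# Index in this list is the operation number p.
ALU_OPS = ["ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP"]

def alu_rm_reg(op, d=0, w=1):
    """Opcode for the '0p0+dw xrm' form: ??? r/m, reg."""
    p = ALU_OPS.index(op)
    return (p << 3) | (d << 1) | w

def alu_acc_imm(op, w=1):
    """Opcode for the '0p4+w imm' form: ??? acc, imm."""
    return (ALU_OPS.index(op) << 3) | 0o4 | w

# XOR EAX, EAX: opcode 061, XRM 300 (x=3, r=EAX, m=EAX)
print("%03o %03o" % (alu_rm_reg("XOR"), 0o300))
```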

TEST dest, src

Read-only AND; sets FLAGS but discards its result.

204+w xrm TEST reg, r/m

250+w imm TEST acc, imm

366+w x0m imm TEST r/m, imm

??? src

Accumulator is the implicit destination for MUL, IMUL, DIV, and IDIV; NOT and NEG modify r/m in place.

366+w x2m NOT r/m

366+w x3m NEG r/m

366+w x4m MUL r/m

366+w x5m IMUL r/m

366+w x6m DIV r/m

366+w x7m IDIV r/m

IMUL dest, src (Integer Multiply)

IMUL dest, src, imm (Strange RISC-like form :-)

017 257 xrm IMUL reg, r/m

151+s xrm imm IMUL reg, r/m, imm (r/m * imm -> reg)

INC/DEC

10r INC reg32

11r DEC reg32

376+w x0m INC r/m

376+w x1m DEC r/m

PUSH/POP

Same pattern as INC/DEC.

12r PUSH reg32

13r POP reg32

150+s imm PUSH imm

377 x6m PUSH r/m

217 x0m POP r/m

140 PUSHA

141 POPA

Shift/Rotate

Eight instructions following a pattern, with three different forms.

300+w xpm imm8 ??? r/m, imm Rotate by a number (modulo opsize)

320+w xpm ??? r/m, 1 Rotate by one

322+w xpm ??? r/m, CL Rotate by value in CL register

... p=0 ROL

... p=1 ROR

... p=2 RCL

... p=3 RCR

... p=4 SHL/SAL

... p=5 SHR

... p=7 SAR
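In these forms the middle digit of the XRM byte selects the operation instead of naming a register. A small sketch of the immediate form with a register operand (x = 3) follows; the names are illustrative.

```python
# Operation number p, placed in the middle digit of the XRM byte.
SHIFT_OPS = {"ROL": 0, "ROR": 1, "RCL": 2, "RCR": 3,
             "SHL": 4, "SAL": 4, "SHR": 5, "SAR": 7}

def shift_reg_imm8(op, reg, count, w=1):
    """Encode '300+w xpm imm8' with a register operand (x = 3)."""
    p = SHIFT_OPS[op]
    return bytes([0o300 | w, (3 << 6) | (p << 3) | reg, count & 0xFF])

print(shift_reg_imm8("SHL", 0, 4).hex())  # SHL EAX, 4
```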

BCD Conversion

047 DAA

057 DAS

067 AAA

077 AAS

324 012 AAM 012 specifies base 10. Some 80x86 chips accept others.

325 012 AAD

Zero/Sign Extend

Load a byte into a full-width register.

017 266+w xrm MOVZX reg, r/m Zero-extend byte (w=0) or 16-bit word (w=1) to full width

017 276+w xrm MOVSX reg, r/m Sign-extend byte (w=0) or 16-bit word (w=1) to full width

Sign-extend the accumulator; typically used before IDIV.

230 CBW / CWDE Half-width to full-width (AX -> EAX)

231 CWD / CDQ Full-width to double-width (EAX -> EDX:EAX)

Control Transfer

160+cc disp8 Jcc (short)

017 200+cc disp32 Jcc (near)

017 220+cc x0m SETcc r/m8

340 disp8 LOOPNE

341 disp8 LOOPE

342 disp8 LOOP

343 disp8 JCXZ

350 disp CALL disp Relative displacement

377 x2m CALL r/m Absolute address

351 disp JMP disp Relative

377 x4m JMP r/m Absolute

303 RET

302 imm16 RET imm Drops N bytes of arguments from stack

310 imm16 imm8 ENTER locals, nesting Considered obsolete

311 LEAVE (ditto)

313 RET FAR Pops CS:IP

312 imm16 RET FAR imm ... and drops N bytes of arguments
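The relative forms measure their displacement from the end of the instruction, i.e. from the address of the next one, which is easy to get wrong. Here is a rough sketch for 32-bit mode; the function names and addresses are made up for illustration.

```python
import struct

def call_rel32(at, target):
    """Encode CALL (350 disp32) placed at address `at`, calling `target`."""
    length = 5                     # 1 opcode byte + 4 displacement bytes
    rel = target - (at + length)   # relative to the next instruction
    return bytes([0o350]) + struct.pack("<i", rel)

def jcc_short(cc, at, target):
    """Encode Jcc (160+cc disp8); raises struct.error if out of range."""
    rel = target - (at + 2)        # 1 opcode byte + 1 displacement byte
    return bytes([0o160 | cc]) + struct.pack("<b", rel)

print(call_rel32(0x1000, 0x1000).hex())       # a call to itself
print(jcc_short(0o04, 0x1000, 0x1000).hex())  # JE to itself
```

An assembler would fall back to the near form (017 200+cc disp32) when the short displacement doesn't fit.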

Condition Codes (note that most Jcc instructions have several aliases):

cc Mnemonics Flags Operation Long-Winded Name

00 o OF=1 Overflow

01 no OF=0 Not Overflow

02 c b nae CF=1 < unsigned Carry / Below / Not Above/Equal

03 nc nb ae CF=0 >= unsigned Not Carry / Not Below / Above/Equal

04 z e ZF=1 == Zero / Equal

05 nz ne ZF=0 != Not Zero / Not Equal

06 be na CF=1 | ZF=1 <= unsigned Below/Equal / Not Above

07 nbe a CF=0 & ZF=0 > unsigned Above / Not Below/Equal

10 s SF=1 < 0 Sign bit (Negative)

11 ns SF=0 >= 0 Not Sign (Positive)

12 p pe PF=1 Parity (Even)

13 np po PF=0 No Parity (Odd)

14 l nge SF<>OF < Less / Not Greater/Equal

15 nl ge SF==OF >= Not Less / Greater/Equal

16 le ng ZF=1 | SF<>OF <= Less/Equal / Not Greater

17 nle g ZF=0 & SF==OF > Not Less/Equal / Greater
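The condition codes have a pattern of their own: the upper three bits pick a test, and bit 0 negates it. The following sketch evaluates a condition from flag values on that basis; `condition` is a hypothetical helper, and the flags are passed as plain 0/1 arguments.

```python
def condition(cc, CF=0, ZF=0, SF=0, OF=0, PF=0):
    """Evaluate condition code cc (0..0o17) against the given flag values."""
    base = [
        OF,                  # 00 o
        CF,                  # 02 c / b
        ZF,                  # 04 z / e
        CF or ZF,            # 06 be
        SF,                  # 10 s
        PF,                  # 12 p
        SF != OF,            # 14 l
        ZF or (SF != OF),    # 16 le
    ][cc >> 1]
    # Odd codes are the negation of the even code just before them.
    return bool(base) != bool(cc & 1)

# After CMP of two equal values ZF is set, so "e" holds and "a" does not.
print(condition(0o04, ZF=1), condition(0o07, ZF=1))
```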

Flags

234 PUSHF Push full FLAGS register

235 POPF Pop to FLAGS; certain flags protected for security

236 SAHF Store AH -> FLAGS (only affects SF,ZF,AF,PF,CF)

237 LAHF Load low byte of FLAGS -> AH

365 CMC Complement CF (carry flag)

370 CLC Clear CF

371 STC Set CF

372 CLI Clear IF (disable hardware interrupts)

373 STI Set IF (enable hardware interrupts)

374 CLD Clear DF (string operations go forward)

375 STD Set DF (string operations go backward)

Strings

244+w MOVS

246+w CMPS

252+w STOS

254+w LODS

256+w SCAS

154+w INS Operands ES:[EDI], DX implied

156+w OUTS Operands DX, DS:[ESI] implied

IN/OUT

IN value, port

OUT port, value

Notice the redundant assembler mnemonics. Your options are really very limited with IN/OUT :-)

344+w imm8 IN acc, port

346+w imm8 OUT port, acc

354+w IN acc, DX

356+w OUT DX, acc

Prefixes

I'll just list the opcodes. Refer to a processor manual for usage details.

146 OPSIZ

147 ADRSIZ

360 LOCK

363 REP (same as REPE, REPZ)

362 REPNE (also REPNZ)

Miscellaneous

017 31r BSWAP reg

364 HLT

315 imm8 INT imm8

316 INTO

314 INT3

317 IRET

360 LOCK

220 NOP

233 WAIT

327 XLAT Equivalent to AL = [EBX+AL]

64-bit Extensions

This is known variously as x64, AMD64, Intel 64, IA-32e, EM64T. They're all the same, with minor exceptions. (Not to be confused with Itanium/IA-64.)

Main changes to the core opcodes:

RIP-relative addressing for easy position-independent code (PIC)

The disp32-only ModRM encoding (Mod=00, R/M=101, i.e. 0r5) is RIP-relative in 64-bit mode. No REX prefix is needed.

SIB encodings are still absolute, always.

64-bit registers: RAX, RBX, etc. (expanded versions of EAX, etc.) These can still be accessed in 32, 16, or 8 bit slices as before. They're still special-purpose.

Eight new general-purpose registers: R8..R15. These generally behave the same as the first eight, but aren't implicitly used by any instructions (I don't think they are). R12 and R13 have quirks corresponding to ESP and EBP.

Use the REX prefix byte to access the new registers.

REX prefix encoding: 0100wrxb (binary)

w: Sets 64-bit operand size (default is still 32-bit)

r: Bit 3 of Reg in ModRM

x: Bit 3 of Index in SIB

b: Bit 3 of Base in ModRM or SIB

The REX prefix replaces the single-byte INC/DEC opcodes (10r/11r).
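To make the REX bit layout concrete, here is a minimal sketch that builds the prefix and uses it for a register-to-register MOV. The function names are illustrative, registers are numbered 0..15 with R8..R15 as 8..15, and only the register-direct form (x = 3) is handled.

```python
def rex(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte: 0100wrxb in binary."""
    return 0b0100_0000 | (w << 3) | (r << 2) | (x << 1) | b

def mov_r64_r64(dest, src):
    """Encode MOV dest, src for 64-bit registers numbered 0..15."""
    # Bit 3 of each register number goes into REX, the low 3 bits into XRM.
    prefix = rex(w=1, r=src >> 3, b=dest >> 3)
    xrm = (3 << 6) | ((src & 7) << 3) | (dest & 7)
    return bytes([prefix, 0o211, xrm])

print(mov_r64_r64(0, 8).hex())  # MOV RAX, R8
```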

Other changes:

Sixteen 128-bit SIMD registers: XMM0..XMM15

Eight 80-bit FPU registers: FPR0..FPR7, aka ST(0)..ST(7). (Same registers, different instructions?)

NX (No eXecute) page bit

Cruft removed (it's still there, but there are no opcodes for it in 64-bit mode):

Segment registers have no effect (except FS and GS which were retained out of pity for Microsoft)

TSS (Task State Segments, intended for multitasking but not actually needed)

V86 mode (for virtualizing ancient 16-bit programs... use an emulator now.)
