Instruction Sets

 home  Bib Algorithms Bioinfo FP Logic MML Prog.Lang and the mmlist

Prog'Langs
glossary
Instn sets
#Reg-file
Block Str.

Below are the rudiments of typical computer instruction-sets as they affect code-generation by a compiler. The main kinds of machines are 0-address, 1-address, 2-address, and 3-address machines. A particular computer may have instructions of only one kind, or it may have a mixture of instructions of two or more kinds.

General

Typical operation-code mnemonics:

<opcd> ::= load | sto (store) | add | sub | rev_sub | mult | div | rev_div | and | or | compare |...
<op> ::= + | - | rev- | * | % | rev% | & | or | compare |...
<cond> ::= eq | ne | le | lt | ge | gt |...

note that (x `rev-` y) is (y-x),  and that (x `rev%` y) is (y%x).

Division (div) typically leaves the result and the remainder in a pair of registers on a 2-address machine, possibly an even-odd pair.

There may be instructions for
bit-shifting (left | right) and (arithmetic | logical | circular).

As well is general-purpose integer-|index-registers, there may also be separate registers for floating-point arithmetic (possibly single and double precision).

The following ignores byte- | half-word- | word- | double-word |... addressing. It assumes that memory is in units all of one size -- the word. In fact many machines use byte-addressing where a word is probably 4 bytes (?maybe 8 bytes on a 64-bit machine?) and is probably aligned on boundaries that are multiples of that size. In that case an op-code would also indicate if it stands for a half-word, word or double-word operation where necessary.

where the processor has a stack and some supporting hardware, at least a top of stack (TOS) pointer.

operation e.g. or comment
TOS:=TOS+1; stack[TOS]:=<int>
load a constant onto the top of stack; this can be used in arithmetic or to get an address onto the stack for use by a load or a store instruction later (it is splitting hairs to argue whether the literal is an address or a constant which might happen to be used as an address elsewhere)
stack[TOS]:=memory[stack[TOS]]
take the top-of-stack as an address, replace the top-of-stack with the contents of that address.
sto effect:
memory[stack[TOS-1]]:=stack[TOS]; TOS:=TOS-2
store contents of top of stack at the address in stack[TOS-1] then pop the value and the address
<opcd> where <opcd> is add | sub |...
effect:
stack[TOS-1] := stack[TOS-1] <op> stack[TOS];
TOS:=TOS-1
possible code generated for   x := y + z:

sto

where @x is the address of x; the compiler retrieves the address of a variable from the compiler's lookup-table, e.g., @x might be 36, say.

Some possible extensions to the instruction set include:
instructions (such as permute, or duplicate) to operate on elements near the top of the stack (which can make up for the lack of reverse-subtract or reverse-divide)

where the processor has a special accumulator which is implicitly used in each arithmetic operation. The accumulator could be the only register, or there could also be one or more index-registers in which case there would very probably be accumulator-to-indexReg transfer operations. In the former case an address is limited to being an integer constant; in the latter case indexed addresses would be allowed.

operation alternative notation e.g. or comment
load_indirect1 <addr> acc := [<addr>] use the word at the address as a pointer to the word to be loaded into acc; something like this is necessary if there are no index registers, or...
load_indirect2 acc := [acc] ... use the contents of the acc as a pointer to the word to be loaded into the acc; something like this is necessary if there are no index registers
store indirect, similar considerations to load indirect
possible code generated for   x := y + z:

sto @x   --x := acc

the compiler retrieves the address of a variable from the compiler's lookup-table.

where
<reg> ::= R0 ... R15, say.
general-purpose index- | integer- registers.
if present the register is an index and it and the offset are added to give the address.

In some machines only the early registers, e.g., R0-R3, can be used for indexing.

In some 2-address machines indexing with R0 is used to denote do not index, in which case <offset>[R0] is equivalent to <offset>.
operation alternative notation e.g. or comment
R7:=R3
transfer from one register to another
<opcd> <reg>,<reg> <reg> <op>= <reg> add R7, R3, or
R7 += R3,
arithmetic operation on two registers
R7 := 36[R2].
load the register with the contents of the word at the given address
36[R2] := R2,
store the contents of the register in the word at the given address
R3+=36[R2]
jmp <reg> jmp <reg> unconditional jump to the address held in the register
save the program counter in the register (e.g., for procedure return) and then jump to the address
possible code generated for   x := y + z:

load R5, @y   --R5 := y
add R5, @z   --R5 += z
sto R5, @x   --x := R5

the compiler retrieves the address of a variable from the compiler's lookup-table. Note that local, non-local and global variables in a block-structured language require different, perhaps more complex, code sequences to access them.

Each instruction above applies to a register and a "full" memory address. Arguably a register is a restricted kind of address, so you might call them 1.5-address instructions. There may also be some 2-full-address instructions. These are in fact necessary for any operation on arbitrarily long operands such as strings or decimal numbers; there is also the matter of specifying an operand's length -- either in the instruction, or in the operand, or in a register.

10[R3] += 20[R6]

where three-address instructions act on values in memory, not in registers.

operation alternative notation e.g. or comment
10[R1]:=20[R2]+30[R3]
Combine the second and third operands using the operator (e.g., +) and store the result in the location indicated by the first address.
possible code generated for   x := y + z:

operation alternative notation e.g. or comment
<opcd> <reg><reg><reg> <reg> := <reg><op><reg> add R2 R3 R4, or
R2:=R3+R4
etc.
possible code generated for   x := y + z:
 load R1, @y --R1 := y load R2, @z --R2 := z add R6, R1, R2 --R6 := R1 + R2 sto R6, @x --x := R3

On the CDC 6600 machines, address-registers, A0-A7, were paired with general arithmetic registers, R0-R7 (floating-point values, or integer values stored as unnormalised floats). Loading an address into A0-A5 (A6-A7) caused the corresponding general arithmetic register to be loaded from (stored into) the memory word having that address. There were 3-address instructions on the general arithmetic registers:

possible code generated for   x := y + z:
A1 :=literal @y   --R1 := y
A2 :=literal @z   --R2 := z
R6 := R1 + R2
A6 :=literal @x   --x := R6 (!)

Note that this form of instruction is efficient for stepping through an array because adding one to an A-register causes the corresponding R-register to be loaded from (stored in) the array.

The zero-address machine needs more instructions for x := y + z than the 3-address machine, say, but they are shorter instructions so the 0-address machine does not necessarily have a performance disadvantage, and the 3-address machine does not necessarily have a performance advantage, because of this.

-- LA, Dept. Comp. Sci., U.W.A., & Dept. Comp. Sci., Monash U..
window on the wide world:
 The Darwin Awards V: Next Evolution

 Linux  Ubuntu free op. sys. OpenOffice free office suite, ver 3.4+ The GIMP ~ free photoshop Firefox web browser FlashBlock like it says!

 © L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated), Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University, was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.) Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Thursday, 20-Jun-2019 19:01:04 AEST.