History of Intel processors and architectures

The content is omitted here

Compiling Into Assembly

C Code (sum.c)

long plus(long x, long y);

void sumstore(long x, long y, long *dest) {
    long t = plus(x, y);
    *dest = t;
}

Generated x86-64 Assembly

sumstore:
    pushq   %rbx
    movq    %rdx, %rbx
    call    plus
    movq    %rax, (%rbx)
    popq    %rbx
    ret

Warning: You will get very different results on different machines (Linux, Mac OS) because of different versions of gcc and different compiler settings.

Assembly Characteristics: Data Types

Integer data of 1, 2, 4 or 8 bytes

Integer data is not marked as signed or unsigned at this level. Even an address or pointer is stored as a plain number and has no special significance to the machine.
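As a minimal sketch of this point in C: the helper names `bits_of` and `round_trip` below are hypothetical, chosen only for illustration. The first copies the raw bytes of a signed value into an unsigned one, and the second converts a pointer to an integer and back.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 8 bytes of a signed value as unsigned without changing any bit. */
uint64_t bits_of(int64_t v) {
    uint64_t u;
    memcpy(&u, &v, sizeof u);   /* copy raw bytes, no conversion */
    return u;
}

/* An address is just a number: convert a pointer to an integer and back. */
long *round_trip(long *p) {
    uintptr_t n = (uintptr_t)p;   /* pointer viewed as a plain integer */
    return (long *)n;             /* same integer viewed as a pointer again */
}
```

The same 64 bits read as -1 when treated as signed and as the largest 64-bit value when treated as unsigned. Which reading applies depends on the instructions used, not on the data.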

Floating point data of 4, 8, or 10 bytes

Floating point data is handled in a very different way and uses a different set of registers.

Code

Code: Byte sequences encoding series of instructions.

A program in x86 is just a series of bytes.

There are no aggregate types such as arrays or structures. They don't exist at the machine level: the compiler constructs them artificially out of contiguously allocated bytes in memory.
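A rough C sketch of what "contiguously allocated bytes" means for an array follows. The function names `element_stride` and `read_at` are assumptions made up for this example. The second function reaches an element using nothing but a base address and a byte offset.

```c
#include <stddef.h>

/* Byte distance between two consecutive elements of a long array. */
size_t element_stride(void) {
    long a[4] = {10, 20, 30, 40};
    const unsigned char *first  = (const unsigned char *)&a[0];
    const unsigned char *second = (const unsigned char *)&a[1];
    return (size_t)(second - first);   /* equals sizeof(long) */
}

/* Read element i using only a base address and a byte offset. */
long read_at(const long *base, size_t i) {
    const unsigned char *bytes = (const unsigned char *)base;
    const long *p = (const long *)(bytes + i * sizeof(long));
    return *p;   /* same value as base[i] */
}
```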

Assembly Characteristics: Operations

Each instruction is very limited in what it can do.

Perform arithmetic function on register or memory data

Transfer data between memory and register

Load data from memory into register

Store register data into memory

Linker

Resolves references between files

Combines with static run-time libraries (e.g., code for malloc, printf)

Some libraries are dynamically linked.

Machine Instructions Example

C Code

*dest = t;

Store the value t at the location designated by dest. (The star in front of dest dereferences the pointer, so this statement stores the value at the place the pointer points to.)

Assembly

movq %rax, (%rbx)

Move 8-byte value to memory

Operands: t: Register %rax, dest: Register %rbx, *dest: Memory M[%rbx]

Object Code

0x40059e: 48 89 03

3-byte instruction, stored at address 0x40059e
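For readers curious how three bytes can name both operands, here is a small sketch that goes slightly beyond the notes. In the x86-64 encoding, 0x48 is a prefix selecting 64-bit operand size, 0x89 is the opcode for a register-to-register/memory move, and 0x03 is a ModRM byte that packs the operand fields. The struct and function names are hypothetical. The sketch handles only the field split, not the special cases of the full encoding.

```c
#include <stdint.h>

/* Fields of an x86-64 ModRM byte: mod (2 bits), reg (3 bits), rm (3 bits). */
struct modrm { unsigned mod, reg, rm; };

struct modrm split_modrm(uint8_t b) {
    struct modrm m;
    m.mod = (b >> 6) & 0x3;   /* 0 here: rm names a memory operand, no displacement */
    m.reg = (b >> 3) & 0x7;   /* source register number */
    m.rm  = b & 0x7;          /* register holding the address */
    return m;
}

/* Names for register numbers 0..7 in the hardware's encoding order. */
const char *reg_name(unsigned n) {
    static const char *names[8] = {
        "%rax", "%rcx", "%rdx", "%rbx", "%rsp", "%rbp", "%rsi", "%rdi"
    };
    return names[n & 0x7];
}
```

Splitting 0x03 gives mod = 0, reg = 0 (%rax), and rm = 3 (%rbx), which matches the assembly form above.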

Disassembly

The content is omitted here

x86-64 Integer Registers

%rsp, %rdi, %rsi…

If you use the %r name of a register, you get all 64 bits; the %e name (for example %eax) gives the low-order 32 bits.

%rsp is the stack pointer, and it has a very specific role.
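The relationship between the two names can be modeled in C as a 64-bit value and its low-order half. This is only an analogy under that assumption; `low32` is a made-up helper, not a real register access.

```c
#include <stdint.h>

/* Model of the %e view: keep only the low-order 32 bits of a 64-bit value. */
uint32_t low32(uint64_t r) {
    return (uint32_t)(r & 0xFFFFFFFFu);
}
```

For instance, if the 64-bit value is 0x1122334455667788, the 32-bit view is 0x55667788.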

Moving Data

movq Source, Dest

Operand Types

Immediate: Constant integer data

e.g., $0x400, $-533

Like a C constant, but prefixed with $. Encoded with 1, 2, or 4 bytes.

Register: One of 16 integer registers

e.g., %rax, %r13

But %rsp is reserved for special use

Others have special uses for particular instructions

Memory: 8 consecutive bytes of memory at address given by register.

Simplest example: (%rax)

movq Operand Combinations

Cannot do memory-memory transfer with a single instruction.
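In practice a memory-to-memory copy is split into a load followed by a store. The sketch below spells out those two steps in C. The function name `copy_long` and the register choices in the comments are illustrative assumptions, not output from a particular compiler.

```c
/* Copy one long from *src to *dst the way the hardware must: load, then store. */
void copy_long(long *dst, const long *src) {
    long tmp = *src;   /* load:  memory -> register, like movq (%rsi), %rax */
    *dst = tmp;        /* store: register -> memory, like movq %rax, (%rdi) */
}
```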

Simple Memory Addressing Modes

movq (%rcx), %rax

Register R specifies memory address.

movq 8(%rbp), %rdx

Displacement D(R) Mem[Reg[R] + D]

Register R specifies start of memory region, constant displacement D specifies offset.

That is useful for accessing fields at fixed offsets within data structures.
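To make the link to data structures concrete, here is a sketch with a hypothetical two-field record. Reading the second field amounts to reading memory at the base address plus a constant displacement, which is what the D(R) form expresses.

```c
#include <stddef.h>

/* Hypothetical record used only for this example. */
struct pair {
    long first;
    long second;
};

/* Fetch the field at a constant displacement from the base address. */
long second_via_offset(const struct pair *p) {
    const unsigned char *base = (const unsigned char *)p;
    const long *field = (const long *)(base + offsetof(struct pair, second));
    return *field;   /* same effect as p->second */
}
```

On a typical x86-64 system long is 8 bytes, so a compiler could read `p->second` with an operand of the form 8(%rdi), assuming p is in %rdi.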

Example of Simple Addressing Modes

void swap(long *xp, long *yp) {
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap:
    movq    (%rdi), %rax    # t0 = *xp
    movq    (%rsi), %rdx    # t1 = *yp
    movq    %rdx, (%rdi)    # *xp = t1
    movq    %rax, (%rsi)    # *yp = t0
    ret
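One way to read the assembly is to rewrite the C function so that each statement corresponds to one movq. In the sketch below the C variables are named after the registers, purely as a reading aid. The function name `swap_steps` is made up for this example.

```c
/* swap written so each statement mirrors one movq; names follow the registers. */
void swap_steps(long *rdi, long *rsi) {
    long rax = *rdi;   /* movq (%rdi), %rax */
    long rdx = *rsi;   /* movq (%rsi), %rdx */
    *rdi = rdx;        /* movq %rdx, (%rdi) */
    *rsi = rax;        /* movq %rax, (%rsi) */
}
```

The two loads come first and the two stores last, so no value is overwritten before it has been read.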

Address Computation Instruction

leaq Src, Dst

Src is an address mode expression

Set Dst to address denoted by expression

The destination has to be a register, and the source should be one of the memory-reference forms.

The instruction does not read memory. It writes the computed address itself, not the value stored at that address, directly to the destination register.

Example

long m12(long x) {
    return x * 12;
}

leaq    (%rdi,%rdi,2), %rax    # t <- x + x*2
salq    $2, %rax               # return t << 2

The leaq computes three times %rdi by adding %rdi + %rdi*2. The shift left by 2 then multiplies that by 4, giving 12 times x.
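The same two-step arithmetic can be written out in C as a quick check. The name `m12_steps` is hypothetical. The shift is written as a multiplication by 4 so that the sketch stays well defined for negative inputs in C.

```c
/* x * 12 computed with the same two steps as the assembly. */
long m12_steps(long x) {
    long t = x + x * 2;   /* leaq (%rdi,%rdi,2), %rax : t = 3x */
    return t * 4;         /* salq $2, %rax : shifting left by 2 multiplies by 4 */
}
```

For x = 5 this gives t = 15 and a result of 60, the same as 5 * 12.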