Instruction Pointer

%eip or %rip

It contains the address of the currently executing instruction, we can’t access it in a normal way, but there are tricks what we can find out what the value of that is.

So it just tells us where in the program, what part of the program is currently being executed.

Condition Codes

Implicit Setting

CF Carry Flag (for unsigned)

If adding two unsigned numbers, and do the binary arithmetic, sometimes an extra one pops out of the left side,

and it can’t be contained in the 32, 64, 16, or 8-bit result.

SFSign Flag (for signed)

If the signed value computed is minus then the SF will be 1.

ZF Zero Flag

OF Overflow Flag(for signed)

Those four flags get set as a sort of normal activity by many of instructions, except lea instruction

Explicit Setting: Compare

cmpq Src2, Src1

cmpq b a like computing a - b without setting destination

The Conditions Codes will be set in the same way.

The comparison is like a subtraction except it don’t do anything with the result, but it will set there four condition flags.

Explicit Setting: Test

testq Src2 Src1

testq b a like computing a & b without setting destination.

Sets condition codes based on value of Src1 & Src2

Src2 can equal to Src1.

ZF set when a & b == 0

SF set when a & b < 0

Reading Condition Codes

Usually we not read the condition codes directly, we use the set instruction to do it.

The set instruction will sets a single byte of a single register to either 1 or 0.

register low-order byte
%rax %al
%rbx %bl
%rsp %spl

The l means low in there.

x86-64 Integer register can directly set the lowest order byte of it to either 1 or 0,

and it doesn’t affect any of the other 7 bytes of that register.

Cont

Does not alter remaining bytes.

Typically use movzbl to finish job, 32-bit instructions also set upper 32 bits to 0.

1
2
3
int get(long x, long y) {
return x > y;
}
1
2
3
4
cmpq	%rsi,	%rdi #Compare x:y
setg %al #Set when >
movzbl %al, %eax #Zero rest of %rax
ret
Register Use(s)
%rdi Argument x
%rsi Argument y
%rax Return Value

Conditional branches

JX Instructions

Jump to different part of code depending on condition codes

Example(Old Style)

Generation: gcc -Og -S -fno-if-conversion xxx.c

1
2
3
4
5
6
7
8
long absdiff(long x, long y) {
long result;
if (x > y)
result = x - y;
else
result = y - x;
return result;
}
1
2
3
4
5
6
7
8
9
10
absdiff:
cmpq %rsi, %rdi # x:y
jle .L4
movq %rdi, %rax
subq %rsi, %rax
ret
.L4:
movq %rsi, %rax
subq %rdi, %rax
ret
Register Use(s)
%rdi Argument x
%rsi Argument y
%rax Return value

Expressing with Goto Code

c allows goto statement, jump to position designated by label

Use this way of presenting code, just so that we can look at and understand what these control structures look like without having to having to scrutinize the low-level assembly code instructions.

1
2
3
4
5
6
7
8
long absdiff(long x, long y) {
long result;
if (x > y)
result = x - y;
else
result = y - x;
return result;
}
1
2
3
4
5
6
7
8
9
10
11
long absdiff_j(long x, long y) {
long result;
int ntest = x <= y;
if (ntest) goto Else;
result = x - y;
goto Done;
Else:
result = y - x;
Done:
return result;
}

General Conditional Expression Translation

Using Branches

C Code:val = Test ? Then_Expr : Else_Expr

Goto Version:

1
2
3
4
5
6
7
8
	ntest = !Test;
if (ntest) goto Else;
val = Then_Expr;
goto Done;
Else:
val = Else_Expr;
Done:
...
  • Create separate code regions for then & else expressions
  • Execute appropriate one

Using Conditional Moves

Instruction supports: if (Test)Dest ⬅ Src

gcc tries to use them only when known to be safe.

C Code:val = Test ? Then_Expr : Else_Expr

Goto Version:

1
2
3
4
5
result = Then_Expr;
eval = Else_Expr;
nt = !Test;
if (nt) result = eval;
return result;

Why:

  • Branches are very disruptive to instruction flow through pipelines
  • Conditional moves do not require control transfer

A detailed explanation: https://www.zhihu.com/question/441518636/answer/1701252133

Example
1
2
3
4
5
6
7
8
long absdiff(long x, long y) {
long result;
if (x > y)
result = x - y;
else
result = y - x;
return result;
}
1
2
3
4
5
6
7
8
absdiff:
movq %rdi, %rax #x
subq %rsi, %rax #result = x - y
movq %rsi, %rdx
subq %rdi, %rdx #eval = y - x
cmpq %rsi, %rdi #x:y
cmovle %rdx, %rax #if <=, result = eval
ret
Register Use(s)
%rdi Argument x
%rsi Argument y
%rax Return value
Bad Cases for Conditional Move
Expensive Computations:

val = Test(x) ? Hard1(x) : Hard2(x)

  • Both values get computed
  • Only makes sense when computations are very simple
Risky Computation

val = p ? *p : 0

  • Both values get computed
  • May have undesirable effects(pointer)
Computations with side effects

val = x > 0 ? x *= 7 : x += 3

  • Both values get computed
  • Must be side-effect free

Do-While Loop Example

Count number of 1’s in argument x(“popcount”)

C Code

1
2
3
4
5
6
7
8
long pcount_do(unsigned long x) {
long result = 0;
do {
result += x & 0x1;
x >>= 1;
} while(x);
return result
}

Goto Version

1
2
3
4
5
6
7
8
long pcount_goto(unsigned long x) {
long result = 0;
loop:
result += x & 0x1;
x >>= 1;
if (x) goto loop;
return result;
}

Assembly:

Register Use(s)
%rdi Argument x
%rax result
1
2
3
4
5
6
7
8
	move	$0,		%eax	#result = 0
.L2: #loop:
movq %rdi, %rdx
andl $1, %edx #t = x & 0x1 (000...01) so t = 0 or 1
addq %rdx, %rax #result += t
shrq %rdi #x >>= 1
jump .L2 #if(x) goto loop
ret

General “Do-while” Translation

C Code

1
2
3
do
Body
while (test)

Goto Version

1
2
3
4
loop:
Body
if (Test)
goto loop

General “while” Translation #1

“Jump-to-middle” translation

Used with -Og

While version

1
2
while (Test)
Body

Goto Version

1
2
3
4
5
6
7
	goto test;
loop:
Body
test:
if (Test)
goto loop;
done:

While Loop Example #1

C Code

1
2
3
4
5
6
7
8
long pcount_while(unsigned long x) {
long result = 0;
while (x) {
result += x & 0x1;
x >>= 1;
}
return result;
}

Jump to Middle

1
2
3
4
5
6
7
8
9
10
long pcount_goto_jtm(unsigned long x) {
long result = 0
goto Test;
Loop:
result += x & 0x1;
x >>= 1;
Test:
if (x) goto Loop;
return result;
}

General “while” Translation #2

“Do-while” conversion.

The idea of that is to do a essentially to take a while loop and turn it into a do-while loop, But introducing a conditional before.

Used with -O1

While version:

1
2
While (Test)
Body

Do-While Version

1
2
3
4
5
6
if (!Test)
goto done;
do
Body
while(Test);
done

Goto Version

1
2
3
4
5
6
7
	if (!Test)
goto done;
loop:
Body
if (Test)
goto loop;
done:

While Loop Example #2

C Code

1
2
3
4
5
6
7
8
long pcount_while(unsigned long x) {
long result = 0;
while (x) {
result += x & 0x1;
x >>= 1;
}
return result;
}

Do-While Version

1
2
3
4
5
6
7
8
9
10
long pcount_goto_dw(unsigned long x) {
long result = 0;
if (!x) goto done;
loop:
result += x & 0x1;
x >>= 1;
if (x) goto loop;
done:
return result;
}

“For” Loop Form

1
2
3
4
5
6
7
8
9
10
#define WSIZE 8*sizeof(int)
long pcount_for(unsigned long x) {
size_t i;
long result = 0;
for (i = 0; i < WSIZE; i++) {
unsigned bit = (x >> i) & 0x1;
result += bit;
}
return result;
}

The “for” loop has four components.

It has an initialization, has a test, has a rule to update.

1
2
for (Init; Test; Update)
Body

Init: i = 0

Test: i < WSIZE

Update: i++

“For” Loop to While Loop

For Version

1
2
for (Init; Test; Update)
Body

While Version

1
2
3
4
5
Init;
while (Test) {
Body
Update;
}

For-While Conversion

Init: i = 0

Test: i < WSIZE

Update: i++

1
2
3
4
5
6
7
8
9
10
11
12
13
long pcount_for_while(unsigned long x) {
//init
size_t i;
long result = 0;
i = 0;
while (i < WSIZE) { //test
//body
unsigned bit = (x >> i) & 0x1;
result += bit;
i++; //update
}
return result;
}

“For” Loop DO-While Conversion

C Code

1
2
3
4
5
6
7
8
9
long pcount_for(unsigned long x) {
size_t i;
long result = 0;
for (i = 0; i < WSIZE; i++) {
unsigned bit = (x >> i) & 0x1;
result += bit;
}
return result;
}

Goto Version

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
long pcount_for_goto_dw(unsigned long x) {
size_t i;
long result = 0;
i = 0; //Init
if (!(i < WSIZE)) //delete
goto done; //delete !Test
loop:
{
//Body
unsigned bit = (x >> i) & 0x1;
result += bit;
}
i++; //Update
if (i < WSIZE)
goto loop; //Test
done:
return result;
}

Initial test can be optimized optimized.

When compiling, the compiler will find i < WSIZE is true, so it will delete !Test.

Switch Statement Example

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
long switch_eg(long x, long y, long z) {
long w = 1;
switch(x) {
case 1:
w = y * z;
break;
case 2:
w = y / z;
/*
Fall Through
*/
case 3:
w += z;
break;
case 5:
case 6:
w -= z;
break;
default:
w = 2;
}
return w;
}

Multiple case labels: 5 & 6.

Fall through cases: 2.

Missing cases: 4.

Jump Table Structure

Normally if you were told you should not use switch statements anymore, What you’d probably do is write this big long chain of if else, if else, if else, and you’d expect that to be the machine code, but it’s not.

Think of the general form of it as being some blocks of code,

The entry points of which are labeled by these case values,

and then the box you know string together in various different ways and do various things,

what we’re going to do is compile a code for all of those blocks, and store them away in some part of memory load up memory load up memory to contain these code blocks.

and then build a table, each entry of this table describes the starting location of one of these code blocks, and we’ll put them in order of case labels if have.

Switch Statement Example

1
2
3
4
5
6
7
long switch_eg(long x, long y, long z) {
long w = 1;
switch(x) {
... //6 was the largest value of any of cases
}
return w;
}
1
2
3
4
5
switch_eg:
moveq %rdx, %rcx
cmpq $6, %rdi # x:6
ja .L8 # default
jmp *.L4(, %rdi, 8)
Register Use(s)
%rdi Argument x
%rsi Argument y
%rdx Argument z
%rax Return value

Jump table:

1
2
3
4
5
6
7
8
9
10
.section	.rodata
.align 8
.L4
.quad .L8 # x = 0
.quad .L3 # x = 1
.quad .L5 # x = 2
.quad .L9 # x = 3
.quad .L8 # x = 4
.quad .L7 # x = 5
.quad .L7 # x = 6

about ja:

ja means jump above, that’s unsigned comparison

if a number is negative, if think of it as an unsigned value it becomes very large positive value,

so by doing the ja instead of jg(jump greater), if x greater than 6, jump to the default, but also it will cause it to jump if x is less than 0.

about jmp:

jmp index into a table and extract out of that an address, then jump to that address.

about Jump table:

it’s constructed is specified in assembly code, and it’s the job of the assembler to fill in the contents of this table.

.quad is just a declaration to say I need an 8 byte value here, and that value should match whatever address.

.L4 covers both cases five and six.

Code Blokes (x == 1)

1
2
3
4
case 1:
w = y * z;
break;
...
1
2
3
4
.L3:
movq %rsi, %rax #y
imulq %rdx, %rax #Y*Z
ret
Register Use(s)
%rdi Argument x
%rsi Argument y
%rdx Argument z
%rax Return value

Handling Fall-Through

The compiler is never exactly what you’d expect

it patched together this fall through case by two blocks of code.

If the x is minus, it will add a bias.

and if the case value is sparse, it will use if-else.