# Writing a C Compiler, Part 4

*This is the fourth post in a series. Read part 1 here.*

This week we’re adding some boolean operators (`&&`

, `||`

) and a whole bunch of relational operators (`<`

, `==`

, etc.). Since we already know how to handle binary operators, this week will be pretty straightforward. As always, you can find the accompanying tests here.

# Week 4: Even More Binary Operators

We’re adding eight new operators this week:

- Logical AND
`&&`

- Logical OR
`||`

- Equal to
`==`

- Not equal to
`!=`

- Less than
`<`

- Less than or equal to
`<=`

- Greater than
`>`

- Greater than or equal to
`>=`

As usual, we’ll update our lexing, parsing, and code generation passes to support these operations.

## Lexing

Each new operator corresponds to a new token. Here’s the full list of tokens we need to support, with old tokens at the top and new tokens in bold at the bottom:

- Open brace
`{`

- Close brace
`}`

- Open parenthesis
`(`

- Close parenthesis
`)`

- Semicolon
`;`

- Int keyword
`int`

- Return keyword
`return`

- Identifier
`[a-zA-Z]\w*`

- Integer literal
`[0-9]+`

- Minus
`-`

- Bitwise complement
`~`

- Logical negation
`!`

- Addition
`+`

- Multiplication
`*`

- Division
`/`

**AND**`&&`

**OR**`||`

**Equal**`==`

**Not Equal**`!=`

**Less than**`<`

**Less than or equal**`<=`

**Greater than**`>`

**Greater than or equal**`>=`

#### ☑ Task:

Update the *lex* function to handle the new tokens. It should work for all stage 1-4 examples in the test suite, including the invalid ones.

## Parsing

Last week, we found that we needed one production rule in our grammar for each operator precedence level. This week we have a lot more precedence levels, which means our grammar will grow a lot. However, our parsing strategy hasn’t changed at all; we’ll handle our new production rules exactly the same way as the old rules for `exp`

and `term`

. Honestly, this is going to be pretty tedious, but I hope it will help solidify all the stuff about parsing from last week.

Here are our all binary operators, from highest to lowest precedence^{1}:

- Multiplication & division (
`*`

,`/`

) - Addition & subtraction (
`+`

,`-`

) - Relational less than/greater than/less than or equal/greater than or equal (
`<`

,`>`

,`<=`

,`>=`

) - Relational equal/not equal (
`==`

,`!=`

) - Logical AND (
`&&`

) - Logical OR (
`||`

)

We handled the first two bullet points last week; the last four are new. We’ll add a production rule for each of the last four bullet points. The new grammar is below, with changed/added rules bolded.

<program> ::= <function> <function> ::= "int" <id> "(" ")" "{" <statement> "}" <statement> ::= "return" <exp> ";"<exp> ::= <logical-and-exp> { "||" <logical-and-exp> }<logical-and-exp> ::= <equality-exp> { "&&" <equality-exp> }<equality-exp> ::= <relational-exp> { ("!=" | "==") <relational-exp> }<relational-exp> ::= <additive-exp> { ("<" | ">" | "<=" | ">=") <additive-exp> }<additive-exp> ::= <term> { ("+" | "-") <term> }<term> ::= <factor> { ("*" | "/") <factor> } <factor> ::= "(" <exp> ")" | <unary_op> <factor> | <int> <unary_op> ::= "!" | "~" | "-"

`<additive-exp>`

is the same as `<exp>`

from last week. We had to rename it because `<exp>`

now refers to logical OR expressions, which now have lowest precedence.

Last week you wrote `parse_exp`

and `parse_term`

; now you’ll need `parse_relational_exp`

, `parse_equality_exp`

, etc. Other than handling different operators, these functions will all be identical.

And for the sake of completeness, here’s our AST definition:

```
program = Program(function_declaration)
function_declaration = Function(string, statement) //string is the function name
statement = Return(exp)
exp = BinOp(binary_operator, exp, exp)
| UnOp(unary_operator, exp)
| Constant(int)
```

This is identical to last week, except we’ve added more possible values of `binary_operator`

.

#### ☑ Task:

Update your expression-parsing code to handle this week’s new binary operators. It should successfully parse all valid stage 1-4 examples in the test suite, and fail on all invalid stage 1-4 examples. The test suite doesn’t directly verify that your program generates the correct AST, so you’ll need to manually inspect the AST for each example to make sure it’s right.

## Code Generation

Our general approach to code generation for binary operations is the same as last week:

- Calculate
`e1`

- Push it onto the stack
- Calculate
`e2`

- Pop
`e1`

from the stack back into a register - Perform the operation on
`e1`

and`e2`

.

All the new stuff will be in step 5.

### Relational Operators

Let’s handle the relational operators first. Like the logical NOT operator (`!`

) in week 2, these return 1 for true results and 0 for false results. These operators are almost identical to `!`

except that they compare two expressions to each other, instead of comparing an expression to zero.

Here’s the assembly we generated for `!`

in week 2:

```
<CODE FOR exp GOES HERE>
cmpl $0, %eax ;set ZF on if exp == 0, set it off otherwise
movl $0, %eax ;zero out EAX (doesn't change FLAGS)
sete %al ;set AL register (the lower byte of EAX) to 1 iff ZF is 1
```

We can modify this slightly to implement `==`

:

```
<CODE FOR e1 GOES HERE>
push %eax ; save value of e1 on the stack
<CODE FOR e2 GOES HERE>
pop %ecx ; pop e1 from the stack into ecx - e2 is already in eax
cmpl %eax, %ecx ;set ZF on if e1 == e2, set it off otherwise
movl $0, %eax ;zero out EAX (doesn't change FLAGS)
sete %al ;set AL register (the lower byte of EAX) to 1 iff ZF is on
```

The `sete`

instruction is just one of a whole slew of conditional set instructions. There’s also `setne`

(set if not equal), `setge`

(set if greater than or equal), and so on. To implement `<`

, `>`

, and the other relational operators, we can generate exactly the same assembly as we used for `==`

above, just replacing `sete`

with the appropriate conditional set instruction. Easy!

In week 2, we talked about testing for equality with the zero flag (ZF). But we can’t use ZF to determine which operand is larger. For that, we need the sign flag (SF), which is set if the result of an operation is negative, like so:

```
movl $0, %eax ;zero out EAX
movl $2, %ecx ;ECX = 2
cmpl $3, %ecx ;compute 2 - 3, set flags
setl %al ;set AL if 2 < 3, i.e. if 2 - 3 is negative
```

Now let’s talk about `&&`

and `||`

. I’ll use `&`

and `|`

to indicate bitwise AND and OR, respectively.

### Logical OR

x86 doesn’t have a logical OR instruction, only bitwise OR. To implement logical OR, we just need to realize that `e1 || e2 == 0`

only if `e1 | e2 == 0`

. Then we can take the following steps to calculate `e1 || e2`

:

- Calculate
`e1 | e2`

- Compare the result to 0
- If the result is 0, set EAX to 0. Otherwise set EAX to 1.

Here’s the assembly:

```
<CODE FOR e1 GOES HERE>
push %eax ;save value of e1 on the stack
<CODE FOR e2 GOES HERE>
pop %ecx ;pop e1 from the stack into ecx
orl %ecx, %eax ;compute e1 | e2, set ZF
movl $0, %eax ;zero out EAX without changing ZF
setne %al ;set AL register (the lower byte of EAX) to 1 iff e1 | e2 != 0
```

### Logical AND

x86 has a bitwise AND instruction too, but we can’t handle it exactly the same way as OR. For example, `1 && 2`

is true, but:

`1 & 2`

would be represented in binary as`01 & 10`

, which is`00`

Here are the steps we can take instead:

- Set
`register1`

to`(a != 0)`

(i.e. 0 if a is 0, 1 otherwise) - Set a
`register2`

to`(b != 0)`

- Set
`EAX = register_1 & register_2`

We’re basically forcing bitwise AND to act like logical AND by only giving it 1 and 0 as operands. There are still some details to work out around allocating registers and zeroing out the upper bits of EAX. There’s more than one way to handle those details, but here’s how I did it:

```
<CODE FOR e1 GOES HERE>
push %eax ;save value of e1 on the stack
<CODE FOR e2 GOES HERE>
pop %ecx ;pop e1 from the stack into ECX
; Step 1: SET CL = 1 iff e1 != 0
cmpl $0, %ecx ;compare e1 to 0
setne %cl ;set CL to 1 iff e1 != 0
; Step 2: SET AL = 1 iff e2 != 0
cmpl $0, %eax ;compare e2 to 0
movl $0, %eax ;zero EAX register before storing result
setne %al ;set AL to 1 iff e2 != 0
; Step 3: compute al & cl
andb %cl, %al ;store AL & CL in AL
```

#### ☑ Task:

Update your code-generation pass to emit correct code for `&&`

, `||`

, `==`

, `!=`

, `<`

, `<=`

, `>`

, and `>=`

. It should succeed on all valid examples and fail on all invalid examples for stages 1-4.

## Other Binary Operators

We still haven’t implemented all the binary operators! We can’t implement assignment operators yet (like `+=`

and `-=`

), because we don’t have support for local variables. But there are other operators you should be able to implement on your own now:

- Modulo
`%`

- Bitwise AND
`&`

- Bitwise OR
`|`

- Bitwise XOR
`^`

- Bitwise shift left
`<<`

- Bitwise shift right
`>>`

This week’s tests don’t cover these, so it’s up to you whether to implement them or skip them.

## Up Next

Next week we’ll add local variables! That means we’ll finally be able to write programs that aren’t just return statements. See you then!

*If you have any questions, corrections, or other feedback, you can email me or open an issue.*