Compare commits
10 Commits
43821be7c9...bb0b3ddcc3

| Author | SHA1 | Date |
|---|---|---|
| | bb0b3ddcc3 | |
| | 51e7bb9401 | |
| | e94f43ab27 | |
| | 6f7d023408 | |
| | 80c6d14249 | |
| | cf28a22ddf | |
| | 112b3b19c8 | |
| | 79f862f941 | |
| | 24c26fd8b0 | |
| | e04d421f70 | |
@ -170,18 +170,24 @@ Using some if-statement and loops, this implementation goes through each `i` cha
|
|||||||
A visualization of how this code would lex the expression `+ 12 34` could look like this:
|
A visualization of how this code would lex the expression `+ 12 34` could look like this:
|
||||||
|
|
||||||
```
|
```
|
||||||
text i state tokens
|
text i state tokens
|
||||||
|
|
||||||
+ 12 0 make Plus []
|
+ 12 34 0 make Plus []
|
||||||
^
|
^
|
||||||
+ 12 1 skip whitespace [Plus]
|
+ 12 34 1 skip whitespace [Plus]
|
||||||
^
|
^
|
||||||
+ 12 2 make Int [Plus]
|
+ 12 34 2 make Int [Plus]
|
||||||
^
|
^
|
||||||
+ 12 3 make Int [Plus]
|
+ 12 34 3 make Int [Plus]
|
||||||
^
|
^
|
||||||
+ 12 4 done [Plus Int(12)]
|
+ 12 34 4 skip whitespace [Plus Int(12)]
|
||||||
^
|
^
|
||||||
|
+ 12 34 5 make Int [Plus Int(12)]
|
||||||
|
^
|
||||||
|
+ 12 34 6 make Int [Plus Int(12)]
|
||||||
|
^
|
||||||
|
+ 12 34 7 done [Plus Int(12) Int(34)]
|
||||||
|
^
|
||||||
```
|
```
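
The trace above can be reproduced with a small standalone sketch. Note that `lexSketch`, `SketchToken` and `showToken` are hypothetical names made up for this illustration, not the tutorial's `Lexer` class, and the sketch only handles `+`, integers and whitespace.

```ts
// Hypothetical, simplified token type used only in this sketch.
type SketchToken =
    | { type: "Plus" }
    | { type: "Int"; value: number };

// Goes through `text` one character at a time, like the trace above.
function lexSketch(text: string): SketchToken[] {
    const tokens: SketchToken[] = [];
    let i = 0;
    while (i < text.length) {
        const ch = text.charAt(i);
        if (ch === "+") {
            // state: make Plus
            tokens.push({ type: "Plus" });
            i += 1;
        } else if (/[ \t\n]/.test(ch)) {
            // state: skip whitespace
            i += 1;
        } else if (/[0-9]/.test(ch)) {
            // state: make Int, consuming all consecutive digits
            let digits = "";
            while (i < text.length && /[0-9]/.test(text.charAt(i))) {
                digits += text.charAt(i);
                i += 1;
            }
            tokens.push({ type: "Int", value: parseInt(digits, 10) });
        } else {
            throw new Error(`illegal character '${ch}' at index ${i}`);
        }
    }
    return tokens;
}

// Formats a token the same way the trace's `tokens` column does.
function showToken(token: SketchToken): string {
    return token.type === "Int" ? `Int(${token.value})` : token.type;
}

console.log(lexSketch("+ 12 34").map(showToken).join(" "));
// prints "Plus Int(12) Int(34)"
```

The printed token list matches the last row of the trace.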
|
||||||
|
|
||||||
#### Exercises
|
#### Exercises
|
||||||
|
@ -68,7 +68,7 @@ I'll add 3 functions for iterating through characters of the text:
|
|||||||
class Lexer {
|
class Lexer {
|
||||||
// ...
|
// ...
|
||||||
private step() { /*...*/ }
|
private step() { /*...*/ }
|
||||||
private done(): bool { return this.index >= this.text.length; }
|
private done(): boolean { return this.index >= this.text.length; }
|
||||||
private current(): string { return this.text[this.index]; }
|
private current(): string { return this.text[this.index]; }
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -127,11 +127,8 @@ And a method for creating valueless tokens:
|
|||||||
class Lexer {
|
class Lexer {
|
||||||
// ...
|
// ...
|
||||||
private token(type: string, pos: Pos): Token {
|
private token(type: string, pos: Pos): Token {
|
||||||
return {
|
const length = this.index - pos.index;
|
||||||
index: this.index,
|
return { type, pos, length };
|
||||||
line: this.line,
|
|
||||||
col: this.col,
|
|
||||||
};
|
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -142,11 +139,11 @@ And a method for testing/matching the `.current()` against a regex pattern or st
|
|||||||
```ts
|
```ts
|
||||||
class Lexer {
|
class Lexer {
|
||||||
// ...
|
// ...
|
||||||
private test(pattern: RegExp | string): Token {
|
private test(pattern: RegExp | string): boolean {
|
||||||
if (typeof pattern === "string")
|
if (typeof pattern === "string")
|
||||||
return this.current === pattern;
|
return this.current() === pattern;
|
||||||
else
|
else
|
||||||
return pattern.test(this.current);
|
return pattern.test(this.current());
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -164,7 +161,7 @@ class Lexer {
|
|||||||
// ...
|
// ...
|
||||||
console.error(`Lexer: illegal character '${this.current()}' at ${pos.line}:${pos.col}`);
|
console.error(`Lexer: illegal character '${this.current()}' at ${pos.line}:${pos.col}`);
|
||||||
this.step();
|
this.step();
|
||||||
return next();
|
return this.next();
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -182,21 +179,13 @@ We don't need to know anything about whitespace, so we'll skip over it without m
|
|||||||
class Lexer {
|
class Lexer {
|
||||||
// ...
|
// ...
|
||||||
public next(): Token | null {
|
public next(): Token | null {
|
||||||
if (this.done())
|
// ...
|
||||||
return null;
|
|
||||||
const pos = this.pos();
|
|
||||||
if (this.test(/[ \t\n]/)) {
|
if (this.test(/[ \t\n]/)) {
|
||||||
while (!this.done() && this.test(/[ \t\n]/))
|
while (!this.done() && this.test(/[ \t\n]/))
|
||||||
this.step();
|
this.step();
|
||||||
return next();
|
return this.next();
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
console.error(
|
|
||||||
`Lexer: illegal character '${this.current()}'`
|
|
||||||
+ ` at ${pos.line}:${pos.col}`,
|
|
||||||
);
|
|
||||||
this.step();
|
|
||||||
return next();
|
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -216,7 +205,7 @@ class Lexer {
|
|||||||
if (this.test("#")) {
|
if (this.test("#")) {
|
||||||
while (!this.done() && !this.test("\n"))
|
while (!this.done() && !this.test("\n"))
|
||||||
this.step();
|
this.step();
|
||||||
return next();
|
return this.next();
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -408,7 +397,8 @@ const text = `
|
|||||||
const lexer = new Lexer(text);
|
const lexer = new Lexer(text);
|
||||||
let token = lexer.next();
|
let token = lexer.next();
|
||||||
while (token !== null) {
|
while (token !== null) {
|
||||||
console.log(`Lexed ${token}`);
|
const value = token.identValue ?? token.intValue ?? token.stringValue ?? "";
|
||||||
|
console.log(`Lexed ${token.type}(${value})`);
|
||||||
token = lexer.next();
|
token = lexer.next();
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
@ -1,7 +1,7 @@
|
|||||||
|
|
||||||
# 3 Parser
|
# 3 Parser
|
||||||
|
|
||||||
In this chaper I'll show how I would make a parser.
|
In this chapter I'll show how I would make a parser.
|
||||||
|
|
||||||
A parser, together with our lexer, transforms the input program from text, meaning an unstructured sequence of characters, into a structured representation. Structured means the representation tells us about the different constructs, such as if statements and expressions.
|
A parser, together with our lexer, transforms the input program from text, meaning an unstructured sequence of characters, into a structured representation. Structured means the representation tells us about the different constructs, such as if statements and expressions.
|
||||||
|
|
||||||
@ -9,7 +9,7 @@ A parser, in addition to our lexer, transforms the input program as text, meanin
|
|||||||
|
|
||||||
The result of parsing is a tree structure representing the input program.
|
The result of parsing is a tree structure representing the input program.
|
||||||
|
|
||||||
This structure is a recursive acyclic structure storing the different parts of the program.
|
This structure is a recursive structure storing the different parts of the program.
|
||||||
|
|
||||||
This is how I would define an AST data type.
|
This is how I would define an AST data type.
|
||||||
|
|
||||||
@ -23,7 +23,7 @@ type Stmt = {
|
|||||||
type StmtKind =
|
type StmtKind =
|
||||||
| { type: "error" }
|
| { type: "error" }
|
||||||
// ...
|
// ...
|
||||||
| { type: "let", ident: string, value: Expr }
|
| { type: "return", expr?: Expr }
|
||||||
// ...
|
// ...
|
||||||
;
|
;
|
||||||
|
|
||||||
@ -62,7 +62,7 @@ class Parser {
|
|||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
private step() { this.currentToken = this.lexer.next() }
|
private step() { this.currentToken = this.lexer.next() }
|
||||||
private done(): bool { return this.currentToken == null; }
|
private done(): boolean { return this.currentToken == null; }
|
||||||
private current(): Token { return this.currentToken!; }
|
private current(): Token { return this.currentToken!; }
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -95,14 +95,14 @@ class Parser {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
The parser does not need to keep track of `index`, `line` and `col` as those are stored in the tokens. The token's position is prefered to the lexer's.
|
The parser does not need to keep track of `index`, `line` and `col` as those are stored in the tokens. The token's position is preferred to the lexer's.
|
||||||
|
|
||||||
Also like the lexer, we'll have a `.test()` method in the parser, which will test for token type rather than strings or regex.
|
Also like the lexer, we'll have a `.test()` method in the parser, which will test for token type rather than strings or regex.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
class Parser {
|
class Parser {
|
||||||
// ...
|
// ...
|
||||||
private test(type: string): bool {
|
private test(type: string): boolean {
|
||||||
return !this.done() && this.current().type === type;
|
return !this.done() && this.current().type === type;
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
@ -151,7 +151,7 @@ class Parser {
|
|||||||
|
|
||||||
## 3.3 Operands
|
## 3.3 Operands
|
||||||
|
|
||||||
Operands are the individual parts of an operation. For example, in the math expression `a + b`, (would be `+ a b` in the input language), `a` and `b` are the *operands*, while `+` is the *operator*. In the expression `a + b * c`, the operands are `a`, `b` and `c`. But in the expression `a * (b + c)`, the operands of the multiply operation are `a` and `(b + c)`. `(b + c)` is an operands, because it is enclosed on both sides. This is how we'll define operands.
|
Operands are the individual parts of an operation. For example, in the math expression `a + b` (which would be `+ a b` in the input language), `a` and `b` are the *operands*, while `+` is the *operator*. In the expression `a + b * c`, the operands are `a`, `b` and `c`. But in the expression `a * (b + c)`, the operands of the multiply operation are `a` and `(b + c)`. `(b + c)` is a single operand, because it is enclosed on both sides. This is how we'll define operands.
|
||||||
|
|
||||||
We'll make a public method in `Parser` called `parseOperand`.
|
We'll make a public method in `Parser` called `parseOperand`.
|
||||||
|
|
||||||
@ -189,17 +189,17 @@ class Parser {
|
|||||||
public parseOperand(): Expr {
|
public parseOperand(): Expr {
|
||||||
// ...
|
// ...
|
||||||
if (this.test("ident")) {
|
if (this.test("ident")) {
|
||||||
const value = this.current().identValue;
|
const value = this.current().identValue!;
|
||||||
this.step();
|
this.step();
|
||||||
return this.expr({ type: "ident", value }, pos);
|
return this.expr({ type: "ident", value }, pos);
|
||||||
}
|
}
|
||||||
if (this.test("int")) {
|
if (this.test("int")) {
|
||||||
const value = this.current().intValue;
|
const value = this.current().intValue!;
|
||||||
this.step();
|
this.step();
|
||||||
return this.expr({ type: "int", value }, pos);
|
return this.expr({ type: "int", value }, pos);
|
||||||
}
|
}
|
||||||
if (this.test("string")) {
|
if (this.test("string")) {
|
||||||
const value = this.current().stringValue;
|
const value = this.current().stringValue!;
|
||||||
this.step();
|
this.step();
|
||||||
return this.expr({ type: "string", value }, pos);
|
return this.expr({ type: "string", value }, pos);
|
||||||
}
|
}
|
||||||
@ -327,7 +327,7 @@ class Parser {
|
|||||||
this.report("expected ident");
|
this.report("expected ident");
|
||||||
return this.expr({ type: "error" }, pos);
|
return this.expr({ type: "error" }, pos);
|
||||||
}
|
}
|
||||||
const value = this.current().identValue;
|
const value = this.current().identValue!;
|
||||||
this.step();
|
this.step();
|
||||||
subject = this.expr({ type: "field", subject, value }, pos);
|
subject = this.expr({ type: "field", subject, value }, pos);
|
||||||
continue;
|
continue;
|
||||||
@ -365,7 +365,7 @@ class Parser {
|
|||||||
if (this.test("[")) {
|
if (this.test("[")) {
|
||||||
this.step();
|
this.step();
|
||||||
const value = this.parseExpr();
|
const value = this.parseExpr();
|
||||||
if (!this.test("]") {
|
if (!this.test("]")) {
|
||||||
this.report("expected ']'");
|
this.report("expected ']'");
|
||||||
return this.expr({ type: "error" }, pos);
|
return this.expr({ type: "error" }, pos);
|
||||||
}
|
}
|
||||||
@ -405,7 +405,7 @@ class Parser {
|
|||||||
if (this.test("(")) {
|
if (this.test("(")) {
|
||||||
this.step();
|
this.step();
|
||||||
let args: Expr[] = [];
|
let args: Expr[] = [];
|
||||||
if (!this.test(")") {
|
if (!this.test(")")) {
|
||||||
args.push(this.parseExpr());
|
args.push(this.parseExpr());
|
||||||
while (this.test(",")) {
|
while (this.test(",")) {
|
||||||
this.step();
|
this.step();
|
||||||
@ -414,7 +414,6 @@ class Parser {
|
|||||||
args.push(this.parseExpr());
|
args.push(this.parseExpr());
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
const value = this.parseExpr();
|
|
||||||
if (!this.test(")")) {
|
if (!this.test(")")) {
|
||||||
this.report("expected ')'");
|
this.report("expected ')'");
|
||||||
return this.expr({ type: "error" }, pos);
|
return this.expr({ type: "error" }, pos);
|
||||||
@ -431,10 +430,10 @@ class Parser {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Similarly to index epxressions, if we find a `(`-token, we step over it, parse the arguments, check for a `)` and replace `subject` with a call expression containing the previous `subject`.
|
Similarly to index expressions, if we find a `(`-token, we step over it, parse the arguments, check for a `)` and replace `subject` with a call expression containing the previous `subject`.
|
||||||
|
|
||||||
When parsing the arguments, we start by testing if we've reached a `)` to check if there are any arguments. If not, we parse the first argument.
|
When parsing the arguments, we start by testing if we've reached a `)` to check if there are any arguments. If not, we parse the first argument.
|
||||||
The consecutive arguments are all preceded by a `,`-token. There we test or `,`, to check if we should keep parsing arguments. After checking for a seperating `,`, we check if we've reached a `)` and break if so. This is to allow for trailing comma.
|
The subsequent arguments are each preceded by a `,`-token, so we test for `,` to check whether we should keep parsing arguments. After stepping over a separating `,`, we check if we've reached a `)` and break if so. This allows for a trailing comma.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
func(
|
func(
|
||||||
@ -445,7 +444,7 @@ func(
|
|||||||
|
|
||||||
## 3.5 Prefix expressions
|
## 3.5 Prefix expressions
|
||||||
|
|
||||||
Contrasting postfix expressions, prefix expression are operations where the operator comes first, then the operands are listed. In some languages, operations such as negation (eg. `-value`) and not-operations (eg. `!value`) are prefix operations. In the language we're making, all binary and unary arithmetic operations are prefix. This includes both expressions with a single operand, such as not (eg. `not value`), but also expressions with 2 operands, such ass addition (eg. `+ a b`) and equation (eg. `== a b`).
|
In contrast to postfix expressions, prefix expressions are operations where the operator comes first, followed by the operands. In some languages, operations such as negation (e.g. `-value`) and not-operations (e.g. `!value`) are prefix operations. In the language we're making, all binary and unary arithmetic operations are prefix. This includes both expressions with a single operand, such as not (e.g. `not value`), and expressions with 2 operands, such as addition (e.g. `+ a b`) and equality (e.g. `== a b`).
|
||||||
|
|
||||||
This is because infix operators (e.g. `a + b`) make parsing more complicated, as they require reasoning about operator precedence, e.g. why `2 + 3 * 4 != (2 + 3) * 4`.
|
This is because infix operators (e.g. `a + b`) make parsing more complicated, as they require reasoning about operator precedence, e.g. why `2 + 3 * 4 != (2 + 3) * 4`.
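
To make the point concrete, here is a minimal sketch of evaluating prefix expressions. In prefix form the grouping is given by the order of the tokens, so no precedence rules are needed. The `evalPrefix` function is a hypothetical helper made up for this illustration; it splits on whitespace instead of using the lexer, and it only supports integers, `+` and `*`.

```ts
// Evaluates a whitespace-separated prefix expression of integers, `+` and `*`.
// Hypothetical helper for illustration only; not part of the tutorial's parser.
function evalPrefix(source: string): number {
    const tokens = source.trim().split(/\s+/);
    let index = 0;

    // Reads one complete expression starting at `index`.
    function next(): number {
        if (index >= tokens.length)
            throw new Error("unexpected end of input");
        const token = tokens[index];
        index += 1;
        if (token === "+" || token === "*") {
            // An operator is always followed by exactly two operands,
            // each of which may itself be a whole prefix expression.
            const left = next();
            const right = next();
            return token === "+" ? left + right : left * right;
        }
        if (!/^[0-9]+$/.test(token))
            throw new Error(`unexpected token '${token}'`);
        return parseInt(token, 10);
    }

    const result = next();
    if (index !== tokens.length)
        throw new Error("unexpected trailing input");
    return result;
}

console.log(evalPrefix("+ 2 * 3 4")); // 14, the same as 2 + (3 * 4)
console.log(evalPrefix("* + 2 3 4")); // 20, the same as (2 + 3) * 4
```

The two infix expressions `2 + 3 * 4` and `(2 + 3) * 4` become two differently ordered token sequences, and the evaluator never has to decide which operator binds tighter.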
|
||||||
|
|
||||||
@ -642,7 +641,7 @@ class Parser {
|
|||||||
public parseBreak(): Stmt {
|
public parseBreak(): Stmt {
|
||||||
const pos = this.pos();
|
const pos = this.pos();
|
||||||
this.step();
|
this.step();
|
||||||
if (!this.test(";")) {
|
if (this.test(";")) {
|
||||||
return this.stmt({ type: "break" }, pos);
|
return this.stmt({ type: "break" }, pos);
|
||||||
}
|
}
|
||||||
const expr = this.parseExpr();
|
const expr = this.parseExpr();
|
||||||
@ -672,7 +671,7 @@ class Parser {
|
|||||||
public parseReturn(): Stmt {
|
public parseReturn(): Stmt {
|
||||||
const pos = this.pos();
|
const pos = this.pos();
|
||||||
this.step();
|
this.step();
|
||||||
if (!this.test(";")) {
|
if (this.test(";")) {
|
||||||
return this.stmt({ type: "return" }, pos);
|
return this.stmt({ type: "return" }, pos);
|
||||||
}
|
}
|
||||||
const expr = this.parseExpr();
|
const expr = this.parseExpr();
|
||||||
@ -715,27 +714,25 @@ class Parser {
|
|||||||
this.report("expected ident");
|
this.report("expected ident");
|
||||||
return this.stmt({ type: "error" }, pos);
|
return this.stmt({ type: "error" }, pos);
|
||||||
}
|
}
|
||||||
const ident = this.current().identValue;
|
const ident = this.current().identValue!;
|
||||||
this.step();
|
this.step();
|
||||||
if (!this.test("(")) {
|
if (!this.test("(")) {
|
||||||
this.report("expected '('");
|
this.report("expected '('");
|
||||||
return this.stmt({ type: "error" }, pos);
|
return this.stmt({ type: "error" }, pos);
|
||||||
}
|
}
|
||||||
const params = this.parseFnParams();
|
const params = this.parseFnParams();
|
||||||
if (!params.ok)
|
|
||||||
return this.stmt({ type: "error" }, pos);
|
|
||||||
if (!this.test("{")) {
|
if (!this.test("{")) {
|
||||||
this.report("expected block");
|
this.report("expected block");
|
||||||
return this.stmt({ type: "error" }, pos);
|
return this.stmt({ type: "error" }, pos);
|
||||||
}
|
}
|
||||||
const body = this.parseBlock();
|
const body = this.parseBlock();
|
||||||
return this.stmt({ type: "fn", ident, params: params.value, body }, pos);
|
return this.stmt({ type: "fn", ident, params, body }, pos);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
We first step over the initial `fn`-token. Then we grap the value of an `ident`-token. Then we check for a `(` and call `.parseFnParams()` to parse the parameters, including the encapsulating `(` and `)`. Then we check for and parse a block. And then we return the statement.
|
We first step over the initial `fn`-token. Then we grab the value of an `ident`-token. Then we check for a `(` and call `.parseFnParams()` to parse the parameters, including the encapsulating `(` and `)`. Then we check for and parse a block. And then we return the statement.
|
||||||
|
|
||||||
Then we define the `.parseFnParams()` method.
|
Then we define the `.parseFnParams()` method.
|
||||||
|
|
||||||
@ -783,7 +780,7 @@ class Parser {
|
|||||||
public parseParam(): { ok: true, value: Param } | { ok: false } {
|
public parseParam(): { ok: true, value: Param } | { ok: false } {
|
||||||
const pos = this.pos();
|
const pos = this.pos();
|
||||||
if (this.test("ident")) {
|
if (this.test("ident")) {
|
||||||
const ident = self.current().value;
|
const ident = this.current().identValue!;
|
||||||
this.step();
|
this.step();
|
||||||
return { ok: true, value: { ident, pos } };
|
return { ok: true, value: { ident, pos } };
|
||||||
}
|
}
|
||||||
@ -830,7 +827,7 @@ class Parser {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
We step over the first `let`-token. Then we parse a parameter using the `.parseParam()` method. If it fails, we return an error statement. Then we check for and step over a `=`-token. We then parse an expressions. And lastly return a let statement with the `ident` and `value`.
|
We step over the first `let`-token. Then we parse a parameter using the `.parseParam()` method. If it fails, we return an error statement. Then we check for and step over a `=`-token. We then parse an expression. And lastly return a let statement with the `ident` and `value`.
|
||||||
|
|
||||||
## 3.14 Assignment and expression statements
|
## 3.14 Assignment and expression statements
|
||||||
|
|
||||||
@ -915,6 +912,7 @@ class Parser {
|
|||||||
// ...
|
// ...
|
||||||
while (!this.done()) {
|
while (!this.done()) {
|
||||||
if (this.test("}")) {
|
if (this.test("}")) {
|
||||||
|
this.step();
|
||||||
return this.expr({ type: "block", stmts }, pos);
|
return this.expr({ type: "block", stmts }, pos);
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -966,6 +964,7 @@ class Parser {
|
|||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
private parseSingleLineBlockStmt(): Stmt {
|
private parseSingleLineBlockStmt(): Stmt {
|
||||||
|
const pos = this.pos();
|
||||||
if (this.test("let"))
|
if (this.test("let"))
|
||||||
return this.parseLet();
|
return this.parseLet();
|
||||||
if (this.test("return"))
|
if (this.test("return"))
|
||||||
@ -1011,6 +1010,7 @@ class Parser {
|
|||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
private parseMultiLineBlockExpr(): Expr {
|
private parseMultiLineBlockExpr(): Expr {
|
||||||
|
const pos = this.pos();
|
||||||
if (this.test("{"))
|
if (this.test("{"))
|
||||||
return this.parseBlock();
|
return this.parseBlock();
|
||||||
if (this.test("if"))
|
if (this.test("if"))
|
||||||
@ -1042,8 +1042,10 @@ class Parser {
|
|||||||
this.eatSemicolon();
|
this.eatSemicolon();
|
||||||
stmts.push(this.stmt({ type: "assign", subject: expr, value }, pos));
|
stmts.push(this.stmt({ type: "assign", subject: expr, value }, pos));
|
||||||
} else if (this.test(";")) {
|
} else if (this.test(";")) {
|
||||||
|
this.step();
|
||||||
stmts.push(this.stmt({ type: "expr", expr }, expr.pos));
|
stmts.push(this.stmt({ type: "expr", expr }, expr.pos));
|
||||||
} else if (this.test("}")) {
|
} else if (this.test("}")) {
|
||||||
|
this.step();
|
||||||
return this.expr({ type: "block", stmts, expr }, pos);
|
return this.expr({ type: "block", stmts, expr }, pos);
|
||||||
} else {
|
} else {
|
||||||
this.report("expected ';' or '}'");
|
this.report("expected ';' or '}'");
|
||||||
@ -1118,7 +1120,7 @@ class Parser {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Then we test, if we've reached a single line statement, meaning it should end with a `;`, ishc as let, return and break.
|
Then we test whether we've reached a single-line statement, meaning one that should end with a `;`, such as let, return and break.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
class Parser {
|
class Parser {
|
||||||
@ -1129,7 +1131,7 @@ class Parser {
|
|||||||
if (this.test("fn")) {
|
if (this.test("fn")) {
|
||||||
// ...
|
// ...
|
||||||
} else if (this.test("{") || this.test("if") || this.test("loop")) {
|
} else if (this.test("{") || this.test("if") || this.test("loop")) {
|
||||||
let expr = this.parseMultiLineBlockExpr();
|
const expr = this.parseMultiLineBlockExpr();
|
||||||
stmts.push(this.stmt({ type: "expr", expr }, expr.pos));
|
stmts.push(this.stmt({ type: "expr", expr }, expr.pos));
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -1152,6 +1154,7 @@ class Parser {
|
|||||||
// ...
|
// ...
|
||||||
} else {
|
} else {
|
||||||
stmts.push(this.parseAssign());
|
stmts.push(this.parseAssign());
|
||||||
|
this.eatSemicolon();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
return stmts;
|
return stmts;
|
||||||
@ -1162,7 +1165,7 @@ class Parser {
|
|||||||
|
|
||||||
If none of the above match, we call `.parseAssign()`, which parses either an assignment statement or an expression statement.
|
If none of the above match, we call `.parseAssign()`, which parses either an assignment statement or an expression statement.
|
||||||
|
|
||||||
## 3 Exercises
|
## Exercises
|
||||||
|
|
||||||
1. Implement boolean literals: `true` and `false` and null literal: `null`.
|
1. Implement boolean literals: `true` and `false` and null literal: `null`.
|
||||||
2. Implement the binary operators: `-`, `*`, `/`, `!=`, `<`, `>`, `<=`, `>=`, `or` and `and`.
|
2. Implement the binary operators: `-`, `*`, `/`, `!=`, `<`, `>`, `<=`, `>=`, `or` and `and`.
|
||||||
|
@ -85,12 +85,12 @@ function valueToString(value: Value): string {
|
|||||||
return value.value ? "true" : "false";
|
return value.value ? "true" : "false";
|
||||||
}
|
}
|
||||||
if (value.type === "array") {
|
if (value.type === "array") {
|
||||||
const valueStrings = result.values
|
const valueStrings = value.values
|
||||||
.map(value => valueToString(value));
|
.map(value => valueToString(value));
|
||||||
return `[${valueStrings.join(", ")}]`;
|
return `[${valueStrings.join(", ")}]`;
|
||||||
}
|
}
|
||||||
if (value.type === "struct") {
|
if (value.type === "struct") {
|
||||||
const fieldStrings = Object.entries(result.fields)
|
const fieldStrings = Object.entries(value.fields)
|
||||||
.map(([key, value]) => `${key}: ${valueToString(value)}`);
|
.map(([key, value]) => `${key}: ${valueToString(value)}`);
|
||||||
return `struct { ${fieldStrings.join(", ")} }`;
|
return `struct { ${fieldStrings.join(", ")} }`;
|
||||||
}
|
}
|
||||||
@ -137,12 +137,12 @@ type SymMap = { [ident: string]: Sym }
|
|||||||
class Syms {
|
class Syms {
|
||||||
private syms: SymMap = {};
|
private syms: SymMap = {};
|
||||||
|
|
||||||
public constructor(private parent?: SymMap) {}
|
public constructor(private parent?: Syms) {}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
The `Sym` structure represents a symbol, and contains it's details such as the value and the position where the symbol is declared. The `SymMap` type is a key value map, which maps identifiers to their definition. To keep track of symbols in regard to scopes, we also define a `Syms` class. An instance of `Syms` is a node in a tree structure.
|
The `Sym` structure represents a symbol, and contains its details such as the value and the position where the symbol is declared. The `SymMap` type is a key value map, which maps identifiers to their definition. To keep track of symbols in regard to scopes, we also define a `Syms` class. An instance of `Syms` is a node in a tree structure.
|
||||||
|
|
||||||
We'll define a method for defining symbols.
|
We'll define a method for defining symbols.
|
||||||
|
|
||||||
@ -184,11 +184,11 @@ class Syms {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
If the symbol is defined locally, return the symbol. Else if a the parent node is defined, defer to the parent. Otherwise, return a not-found result.
|
If the symbol is defined locally, return the symbol. Else if the parent node is defined, defer to the parent. Otherwise, return a not-found result.
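
As a rough illustration of this lookup order, here is a self-contained sketch. `ScopeSketch`, `SketchSym` and `SketchGetResult` are hypothetical stand-ins for `Syms`, `Sym` and the lookup result; symbols hold plain numbers, and a `Map` is used in place of the object-based `SymMap`, both of which are simplifications of this sketch.

```ts
// Hypothetical stand-ins for the tutorial's `Sym` and lookup result types.
type SketchSym = { ident: string; value: number };
type SketchGetResult = { ok: true; sym: SketchSym } | { ok: false };

class ScopeSketch {
    // A Map is used here instead of a plain object; that is a choice of this sketch.
    private syms = new Map<string, SketchSym>();
    private parent?: ScopeSketch;

    public constructor(parent?: ScopeSketch) {
        this.parent = parent;
    }

    public define(ident: string, value: number): void {
        this.syms.set(ident, { ident, value });
    }

    public get(ident: string): SketchGetResult {
        // 1. Defined locally: return the symbol.
        const sym = this.syms.get(ident);
        if (sym !== undefined)
            return { ok: true, sym };
        // 2. Otherwise defer to the parent node, if there is one.
        if (this.parent !== undefined)
            return this.parent.get(ident);
        // 3. Otherwise report not-found.
        return { ok: false };
    }
}

function describe(result: SketchGetResult): string {
    return result.ok ? `${result.sym.ident} = ${result.sym.value}` : "not found";
}

const rootScope = new ScopeSketch();
rootScope.define("a", 1);
const innerScope = new ScopeSketch(rootScope);
innerScope.define("b", 2);

console.log(describe(innerScope.get("b"))); // "b = 2", found locally
console.log(describe(innerScope.get("a"))); // "a = 1", deferred to the parent
console.log(describe(rootScope.get("b")));  // "not found", a parent cannot see its children
```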
|
||||||
|
|
||||||
## 4.3 Control flow
|
## 4.3 Control flow
|
||||||
|
|
||||||
Most code will run with unbroken control flow, but some code will 'break' control flow. This is the case for return statements in functions and break statements in loops. To keep track of, if a return or break statement has been run, we'll define a data structure representing the control flow action of evaluted code.
|
Most code will run with unbroken control flow, but some code will 'break' control flow. This is the case for return statements in functions and break statements in loops. To keep track of whether a return or break statement has been run, we'll define a data structure representing the control flow action of evaluated code.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
type Flow = {
|
type Flow = {
|
||||||
@ -204,7 +204,7 @@ The 3 implemented options for control flow is breaking in a loop, returning in a
|
|||||||
For ease of use, we'll add some functions to create the commonly used flow types and values.
|
For ease of use, we'll add some functions to create the commonly used flow types and values.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
function flowWalue(value: Value): Flow {
|
function flowValue(value: Value): Flow {
|
||||||
return { type: "value", value };
|
return { type: "value", value };
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
@ -255,8 +255,6 @@ We'll want a *root* symbol table, which stores all the predefined symbols. We al
|
|||||||
class Evaluator {
|
class Evaluator {
|
||||||
private root = new Syms();
|
private root = new Syms();
|
||||||
// ...
|
// ...
|
||||||
public defineBuiltins() { /*...*/ }
|
|
||||||
// ...
|
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@ -271,10 +269,11 @@ type FnDef = {
|
|||||||
params: Param[],
|
params: Param[],
|
||||||
body: Expr,
|
body: Expr,
|
||||||
id: number,
|
id: number,
|
||||||
|
syms: Syms,
|
||||||
};
|
};
|
||||||
```
|
```
|
||||||
|
|
||||||
The parameters are needed, so that we can verify when calling, that we call with the correct amount of arguments. The body is the AST expression to be evaluated. And an identifier, so that we can refer to the definition by it's id `fnDefId`.
|
The parameters are needed so that we can verify, when calling, that we call with the correct number of arguments. The body is the AST expression to be evaluated. The id lets us refer to the definition by its id `fnDefId`. And the symbol table is the one at the time and place of the definition, as opposed to the caller's time and place.
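
The following sketch shows why it matters which symbol table a function looks names up in. Everything in it is a simplified, hypothetical model: `ScopeModel`, `FnDefModel`, `lookupModel` and `callModel` are made-up names, the "body" of the function just reads a single variable, and creating the call scope as a child of the stored table is an assumption of this sketch.

```ts
// Hypothetical, simplified scope: a map of numbers with an optional parent.
type ScopeModel = { vars: Map<string, number>; parent?: ScopeModel };

function lookupModel(scope: ScopeModel, ident: string): number | undefined {
    const value = scope.vars.get(ident);
    if (value !== undefined)
        return value;
    return scope.parent ? lookupModel(scope.parent, ident) : undefined;
}

// Hypothetical function definition whose body only reads the variable `reads`.
// `syms` is the symbol table captured where the function was defined.
type FnDefModel = { reads: string; syms: ScopeModel };

function callModel(fn: FnDefModel): number | undefined {
    // Assumption: the call gets a fresh scope whose parent is the
    // definition's symbol table, not the caller's.
    const callScope: ScopeModel = { vars: new Map(), parent: fn.syms };
    return lookupModel(callScope, fn.reads);
}

const definitionScope: ScopeModel = { vars: new Map([["x", 1]]) };
const fnModel: FnDefModel = { reads: "x", syms: definitionScope };

// The caller has its own `x`, which the function body should not see.
const callerScope: ScopeModel = { vars: new Map([["x", 2]]), parent: definitionScope };

console.log(callModel(fnModel));            // 1, `x` from the definition's symbol table
console.log(lookupModel(callerScope, "x")); // 2, what the caller itself sees
```

If the function looked names up in the caller's table instead, the same call could give different results depending on where it was made from.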
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
class Evaluator {
|
class Evaluator {
|
||||||
@ -283,7 +282,7 @@ class Evaluator {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
We'll also add an array of function definitions to the evaluator class. The index of a function definition will also be it's id.
|
We'll also add an array of function definitions to the evaluator class. The index of a function definition will also be its id.
|
||||||
|
|
||||||
## 4.5 Expressions
|
## 4.5 Expressions
|
||||||
|
|
||||||
@ -292,12 +291,12 @@ Let's make a function `evalExpr` for evaluating expressions.
|
|||||||
```ts
|
```ts
|
||||||
class Evaluator {
|
class Evaluator {
|
||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
if (expr.type === "error") {
|
if (expr.kind.type === "error") {
|
||||||
throw new Error("error in AST");
|
throw new Error("error in AST");
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
throw new Error(`unknown expr type "${expr.type}"`);
|
throw new Error(`unknown expr type "${expr.kind.type}"`);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -312,10 +311,10 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "ident") {
|
if (expr.kind.type === "ident") {
|
||||||
const result = syms.get(expr.value);
|
const result = syms.get(expr.kind.value);
|
||||||
if (!result.ok)
|
if (!result.ok)
|
||||||
throw new Error(`undefined symbol "${expr.value}"`);
|
throw new Error(`undefined symbol "${expr.kind.value}"`);
|
||||||
return flowValue(result.sym.value);
|
return flowValue(result.sym.value);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
@ -331,17 +330,17 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "null") {
|
if (expr.kind.type === "null") {
|
||||||
return flowValue({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
if (expr.type === "int") {
|
if (expr.kind.type === "int") {
|
||||||
return flowValue({ type: "int", value: expr.value });
|
return flowValue({ type: "int", value: expr.kind.value });
|
||||||
}
|
}
|
||||||
if (expr.type === "string") {
|
if (expr.kind.type === "string") {
|
||||||
return flowValue({ type: "string", value: expr.value });
|
return flowValue({ type: "string", value: expr.kind.value });
|
||||||
}
|
}
|
||||||
if (expr.type === "bool") {
|
if (expr.kind.type === "bool") {
|
||||||
return flowValue({ type: "int", value: expr.value });
|
return flowValue({ type: "bool", value: expr.kind.value });
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -358,8 +357,8 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "group") {
|
if (expr.kind.type === "group") {
|
||||||
return this.evalExpr(expr.expr, syms);
|
return this.evalExpr(expr.kind.expr, syms);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -376,15 +375,15 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "field") {
|
if (expr.kind.type === "field") {
|
||||||
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.subject, syms));
|
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.kind.subject, syms));
|
||||||
if (!subject)
|
if (!subject)
|
||||||
return subjectFlow;
|
return subjectFlow;
|
||||||
if (subject.type !== "struct")
|
if (subject.type !== "struct")
|
||||||
throw new Error(`cannot use field operator on ${subject.type} value`);
|
throw new Error(`cannot use field operator on ${subject.type} value`);
|
||||||
if (!(expr.value in subject.fields))
|
if (!(expr.kind.value in subject.fields))
|
||||||
throw new Error(`field ${expr.value} does not exist on struct`);
|
throw new Error(`field ${expr.kind.value} does not exist on struct`);
|
||||||
return subject.fields[expr.value];
|
return flowValue(subject.fields[expr.kind.value]);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -401,11 +400,11 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "index") {
|
if (expr.kind.type === "index") {
|
||||||
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.subject, syms));
|
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.kind.subject, syms));
|
||||||
if (!subject)
|
if (!subject)
|
||||||
return subjectFlow;
|
return subjectFlow;
|
||||||
const [value, valueFlow] = expectValue(this.evalExpr(expr.value, syms));
|
const [value, valueFlow] = expectValue(this.evalExpr(expr.kind.value, syms));
|
||||||
if (!value)
|
if (!value)
|
||||||
return valueFlow;
|
return valueFlow;
|
||||||
if (subject.type === "struct") {
|
if (subject.type === "struct") {
|
||||||
@ -431,11 +430,11 @@ class Evaluator {
|
|||||||
if (subject.type === "string") {
|
if (subject.type === "string") {
|
||||||
if (value.type !== "int")
|
if (value.type !== "int")
|
||||||
throw new Error(`cannot index into string with ${value.type} value`);
|
throw new Error(`cannot index into string with ${value.type} value`);
|
||||||
if (value.value >= subject.values.length)
|
if (value.value >= subject.value.length)
|
||||||
throw new Error("index out of range");
|
throw new Error("index out of range");
|
||||||
if (value.value < 0) {
|
if (value.value < 0) {
|
||||||
const negativeIndex = subject.values.length + value.value;
|
const negativeIndex = subject.value.length + value.value;
|
||||||
if (negativeIndex < 0 || negativeIndex >= subject.values.length)
|
if (negativeIndex < 0 || negativeIndex >= subject.value.length)
|
||||||
throw new Error("index out of range");
|
throw new Error("index out of range");
|
||||||
return flowValue({ type: "int", value: subject.value.charCodeAt(negativeIndex) });
|
return flowValue({ type: "int", value: subject.value.charCodeAt(negativeIndex) });
|
||||||
}
|
}
|
||||||
@ -451,7 +450,7 @@ class Evaluator {
|
|||||||
|
|
||||||
The index operator can be evaluated on a subject of either struct, array or string type. If evaluated on the struct type, we expect a string containing the field name. If the field does not exist, we return a null value. This is in contrast to the field operator, which throws an error, if no field is found. If the subject is instead an array, we expect a value of type int. We check if either the int value index or negative index is in range of the array values. If so, return the value at the index or the negative index. If the subject is a string, evaluation will behave similarly to an array, evaluating to an int value representing the value of the text character at the index or negative index.
|
The index operator can be evaluated on a subject of either struct, array or string type. If evaluated on the struct type, we expect a string containing the field name. If the field does not exist, we return a null value. This is in contrast to the field operator, which throws an error, if no field is found. If the subject is instead an array, we expect a value of type int. We check if either the int value index or negative index is in range of the array values. If so, return the value at the index or the negative index. If the subject is a string, evaluation will behave similarly to an array, evaluating to an int value representing the value of the text character at the index or negative index.
|
||||||
|
|
||||||
The negative index is when a negative int value is passed as index, where the index will start at the end of the array. Given an array `vs` containing the values `["a", "b", "c"]` in listed order, the indices `0`, `1` and `2` will evalute to the values `"a"`, `"b"` and `"c"`, whereas the indices `-1`, `-2`, `-3` will evaluate to the values `"c"`, `"b"` and `"a"`. A negative index implicitly starts at the length of the array and subtracts the absolute index value.
|
The negative index is when a negative int value is passed as index, where the index will start at the end of the array. Given an array `vs` containing the values `["a", "b", "c"]` in listed order, the indices `0`, `1` and `2` will evaluate to the values `"a"`, `"b"` and `"c"`, whereas the indices `-1`, `-2`, `-3` will evaluate to the values `"c"`, `"b"` and `"a"`. A negative index implicitly starts at the length of the array and subtracts the absolute index value.
|
||||||
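The negative-index rule described above can be sketched in isolation. This is a minimal illustration, not the tutorial's evaluator code: `resolveIndex` is a hypothetical helper name, and it works on plain numbers rather than on the evaluator's value types.

```ts
// Hypothetical helper (not part of the Evaluator class): maps a possibly
// negative index onto a position in a sequence of the given length.
function resolveIndex(length: number, index: number): number {
    // A negative index starts at the length and subtracts the absolute value.
    const resolved = index < 0 ? length + index : index;
    if (resolved < 0 || resolved >= length)
        throw new Error("index out of range");
    return resolved;
}

const vs = ["a", "b", "c"];
console.log(vs[resolveIndex(vs.length, 0)]);  // "a"
console.log(vs[resolveIndex(vs.length, -1)]); // "c"
console.log(vs[resolveIndex(vs.length, -3)]); // "a"
```

Resolving the index once and range-checking the result covers both the positive and the negative case with a single check.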
|
|
||||||
### 4.5.6 Call expressions
|
### 4.5.6 Call expressions
|
||||||
|
|
||||||
@ -460,18 +459,18 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "call") {
|
if (expr.kind.type === "call") {
|
||||||
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.subject, syms));
|
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.kind.subject, syms));
|
||||||
if (!subject)
|
if (!subject)
|
||||||
return subjectFlow;
|
return subjectFlow;
|
||||||
const args: Value[] = [];
|
const args: Value[] = [];
|
||||||
for (const arg of expr.args) {
|
for (const arg of expr.kind.args) {
|
||||||
const [value, valueFlow] = expectValue(this.evalExpr(expr.value, syms));
|
const [value, valueFlow] = expectValue(this.evalExpr(arg, syms));
|
||||||
if (!value)
|
if (!value)
|
||||||
return valueFlow;
|
return valueFlow;
|
||||||
args.push(value);
|
args.push(value);
|
||||||
}
|
}
|
||||||
if (subject.type === "builtin") {
|
if (subject.type === "builtin_fn") {
|
||||||
return this.executeBuiltin(subject.name, args, syms);
|
return this.executeBuiltin(subject.name, args, syms);
|
||||||
}
|
}
|
||||||
if (subject.type !== "fn")
|
if (subject.type !== "fn")
|
||||||
@ -481,8 +480,8 @@ class Evaluator {
|
|||||||
const fnDef = this.fnDefs[subject.fnDefId];
|
const fnDef = this.fnDefs[subject.fnDefId];
|
||||||
if (fnDef.params.length !== args.length)
|
if (fnDef.params.length !== args.length)
|
||||||
throw new Error("incorrect amount of arguments in call to function");
|
throw new Error("incorrect amount of arguments in call to function");
|
||||||
let fnScopeSyms = new Syms(this.root);
|
let fnScopeSyms = new Syms(fnDef.syms);
|
||||||
for (const [i, param] in fnDef.params.entries()) {
|
for (const [i, param] of fnDef.params.entries()) {
|
||||||
fnScopeSyms.define(param.ident, { value: args[i], pos: param.pos });
|
fnScopeSyms.define(param.ident, { value: args[i], pos: param.pos });
|
||||||
}
|
}
|
||||||
const flow = this.evalExpr(fnDef.body, fnScopeSyms);
|
const flow = this.evalExpr(fnDef.body, fnScopeSyms);
|
||||||
@ -498,20 +497,20 @@ class Evaluator {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
The first thing we do is evaluate the subject expression of the call (`subject(...args)`). If that yeilds a value, we continue. Then we evaluate each of the arguments in order. If evaluation of an argument doesn't yeild a value, we return immediately. Then, if the subject evaluated to a builtin value, we call `executeBuiltin`, which we will define later, with the builtin name, call arguments and symbol sable. Otherwise, we assert that the subject value is a function and that a function definition with the id exists. We then check that the correct amount of arguments are passed. Then, we make a new symbol table with the root table as parent, which will be the called functions symbols. We assign each argument value to the corrosponding parameter name, dictated by argument order. We then evaluate the function body. Finally, we check that the control flow results in either a value, which we simply return, or a return flow, which we convert to a value.
|
The first thing we do is evaluate the subject expression of the call (`subject(...args)`). If that yields a value, we continue. Then we evaluate each of the arguments in order. If evaluation of an argument doesn't yield a value, we return immediately. Then, if the subject evaluated to a builtin value, we call `executeBuiltin`, which we will define later, with the builtin name, call arguments and symbol table. Otherwise, we assert that the subject value is a function and that a function definition with that id exists. We then check that the correct number of arguments is passed. Then, we make a new symbol table with the function definition's symbol table as parent, which will hold the called function's symbols. We assign each argument value to the corresponding parameter name, dictated by argument order. We then evaluate the function body. Finally, we check that the control flow results in either a value, which we simply return, or a return flow, which we convert to a value.
|
||||||
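To see why the parent of the call's symbol table matters, here is a small sketch. It uses a simplified, hypothetical `Scope` class holding plain numbers; it stands in for the tutorial's `Syms` class and is not its actual definition.

```ts
// Hypothetical, simplified stand-in for a symbol table with a parent link.
class Scope {
    private values = new Map<string, number>();
    private parent?: Scope;

    constructor(parent?: Scope) {
        this.parent = parent;
    }

    define(name: string, value: number): void {
        this.values.set(name, value);
    }

    // Looks in this scope first, then walks up the parent chain.
    get(name: string): number | undefined {
        if (this.values.has(name))
            return this.values.get(name);
        return this.parent ? this.parent.get(name) : undefined;
    }
}

const root = new Scope();
const defScope = new Scope(root); // the scope in which a function was defined
defScope.define("x", 10);

// Call scope parented on the definition scope: "x" is visible in the body.
const callScope = new Scope(defScope);
callScope.define("a", 1);
console.log(callScope.get("x")); // 10

// Call scope parented on the root instead: "x" cannot be found.
const rootedCallScope = new Scope(root);
rootedCallScope.define("a", 1);
console.log(rootedCallScope.get("x")); // undefined
```

With the definition scope as parent, a function body can refer to symbols that were visible where the function was declared, not only to its parameters and global symbols.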
|
|
||||||
### 4.5.7 Unary expressions
|
### 4.5.7 Unary expressions
|
||||||
|
|
||||||
Next, we will implement evaluation of unary expressions, meaning postfix expressions with one operand such as when using the `not` operator.
|
Next, we will implement evaluation of unary expressions, meaning prefix expressions with one operand, such as when using the `not` operator.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
class Evaluator {
|
class Evaluator {
|
||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "unary") {
|
if (expr.kind.type === "unary") {
|
||||||
if (expr.unaryType === "not") {
|
if (expr.kind.unaryType === "not") {
|
||||||
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.subject, syms));
|
const [subject, subjectFlow] = expectValue(this.evalExpr(expr.kind.subject, syms));
|
||||||
if (!subject)
|
if (!subject)
|
||||||
return subjectFlow;
|
return subjectFlow;
|
||||||
if (subject.type === "bool") {
|
if (subject.type === "bool") {
|
||||||
@ -519,7 +518,7 @@ class Evaluator {
|
|||||||
}
|
}
|
||||||
throw new Error(`cannot apply not operator on type ${subject.type}`);
|
throw new Error(`cannot apply not operator on type ${subject.type}`);
|
||||||
}
|
}
|
||||||
throw new Error(`unhandled unary operation ${expr.unaryType}`);
|
throw new Error(`unhandled unary operation ${expr.kind.unaryType}`);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -538,23 +537,23 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "binary") {
|
if (expr.kind.type === "binary") {
|
||||||
const [left, leftFlow] = expectValue(this.evalExpr(expr.left, syms));
|
const [left, leftFlow] = expectValue(this.evalExpr(expr.kind.left, syms));
|
||||||
if (!left)
|
if (!left)
|
||||||
return leftFlow;
|
return leftFlow;
|
||||||
const [right, rightFlow] = expectValue(this.evalExpr(expr.right, syms));
|
const [right, rightFlow] = expectValue(this.evalExpr(expr.kind.right, syms));
|
||||||
if (!right)
|
if (!right)
|
||||||
return rightFlow;
|
return rightFlow;
|
||||||
if (expr.binaryType === "+") {
|
if (expr.kind.binaryType === "+") {
|
||||||
if (left.type === "int" && right.type === "int") {
|
if (left.type === "int" && right.type === "int") {
|
||||||
return flowValue({ type: "int", value: left.value + right.value });
|
return flowValue({ type: "int", value: left.value + right.value });
|
||||||
}
|
}
|
||||||
if (left.type === "string" && right.type === "string") {
|
if (left.type === "string" && right.type === "string") {
|
||||||
return flowValue({ type: "string", value: left.value + right.value });
|
return flowValue({ type: "string", value: left.value + right.value });
|
||||||
}
|
}
|
||||||
throw new Error(`cannot apply ${expr.binaryType} operator on types ${left.type} and ${right.type}`);
|
throw new Error(`cannot apply ${expr.kind.binaryType} operator on types ${left.type} and ${right.type}`);
|
||||||
}
|
}
|
||||||
if (expr.binaryType === "==") {
|
if (expr.kind.binaryType === "==") {
|
||||||
if (left.type === "null" && right.type === "null") {
|
if (left.type === "null" && right.type === "null") {
|
||||||
return flowValue({ type: "bool", value: true });
|
return flowValue({ type: "bool", value: true });
|
||||||
}
|
}
|
||||||
@ -564,12 +563,13 @@ class Evaluator {
|
|||||||
if (left.type !== "null" && right.type === "null") {
|
if (left.type !== "null" && right.type === "null") {
|
||||||
return flowValue({ type: "bool", value: false });
|
return flowValue({ type: "bool", value: false });
|
||||||
}
|
}
|
||||||
if (["int", "string", "bool"].includes(left.type) && left.type === right.type) {
|
if ((left.type === "int" || left.type === "string" || left.type === "bool")
|
||||||
|
&& left.type === right.type) {
|
||||||
return flowValue({ type: "bool", value: left.value === right.value });
|
return flowValue({ type: "bool", value: left.value === right.value });
|
||||||
}
|
}
|
||||||
throw new Error(`cannot apply ${expr.binaryType} operator on types ${left.type} and ${right.type}`);
|
throw new Error(`cannot apply ${expr.kind.binaryType} operator on types ${left.type} and ${right.type}`);
|
||||||
}
|
}
|
||||||
throw new Error(`unhandled binary operation ${expr.unaryType}`);
|
throw new Error(`unhandled binary operation ${expr.kind.binaryType}`);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -577,7 +577,7 @@ class Evaluator {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Add operation (`+`) is straight forward. Evaluate the left expressions, evaluate the right expressions and return a value with the result of adding left and right. Addition should work on integers and strings. Add string two strings results in a new string consisting of the left and right values concatonated.
|
The add operation (`+`) is straightforward. Evaluate the left expression, evaluate the right expression and return a value with the result of adding left and right. Addition should work on integers and strings. Adding two strings results in a new string consisting of the left and right values concatenated.
|
||||||
|
|
||||||
The equality operator (`==`) is a bit more complicated. It only results in values of type bool. You should be able to check if any value is null. Otherwise, comparison should only be allowed on two values of same type.
|
The equality operator (`==`) is a bit more complicated. It only results in values of type bool. You should be able to check if any value is null. Otherwise, comparison should only be allowed on two values of same type.
|
||||||
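The equality rules can be summarized as a standalone function. This is a sketch under assumptions: the `Value` type below is a reduced, hypothetical version of the evaluator's value type, and `valuesEqual` is an illustrative name that does not appear in the evaluator.

```ts
// Reduced value type for illustration only.
type Value =
    | { type: "null" }
    | { type: "int"; value: number }
    | { type: "string"; value: string }
    | { type: "bool"; value: boolean };

function valuesEqual(left: Value, right: Value): boolean {
    // Any value may be compared against null; only null equals null.
    if (left.type === "null" || right.type === "null")
        return left.type === right.type;
    // Otherwise both operands must have the same type.
    if (left.type !== right.type)
        throw new Error(`cannot apply == operator on types ${left.type} and ${right.type}`);
    return left.value === right.value;
}

console.log(valuesEqual({ type: "int", value: 1 }, { type: "int", value: 1 })); // true
console.log(valuesEqual({ type: "int", value: 1 }, { type: "null" }));          // false
console.log(valuesEqual({ type: "null" }, { type: "null" }));                   // true
```

Handling the null cases first means the type-mismatch error is only raised for two non-null values of different types.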
|
|
||||||
@ -594,16 +594,16 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "if") {
|
if (expr.kind.type === "if") {
|
||||||
const [condition, conditionFlow] = expectValue(this.evalExpr(expr.condition, syms));
|
const [cond, condFlow] = expectValue(this.evalExpr(expr.kind.cond, syms));
|
||||||
if (!condition)
|
if (!cond)
|
||||||
return conditionFlow;
|
return condFlow;
|
||||||
if (condition.type !== "bool")
|
if (cond.type !== "bool")
|
||||||
throw new Error(`cannot use value of type ${subject.type} as condition`);
|
throw new Error(`cannot use value of type ${cond.type} as condition`);
|
||||||
if (condition.value)
|
if (cond.value)
|
||||||
return this.evalExpr(expr.truthy, syms);
|
return this.evalExpr(expr.kind.truthy, syms);
|
||||||
if (expr.falsy)
|
if (expr.kind.falsy)
|
||||||
return this.evalExpr(exor.falsy, syms);
|
return this.evalExpr(expr.kind.falsy, syms);
|
||||||
return flowValue({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
@ -616,16 +616,16 @@ We start by evaluating the condition expression. The condition value should be a
|
|||||||
|
|
||||||
### 4.5.10 Loop expressions
|
### 4.5.10 Loop expressions
|
||||||
|
|
||||||
Next, we'll implement the loop expression. The loop expression will repeatedly evaluate the body expression while throwing away the resulting values, until it results in breaking control flow. If the control flow is of type break, the loop expression itself will evalute to the break's value.
|
Next, we'll implement the loop expression. The loop expression will repeatedly evaluate the body expression while throwing away the resulting values, until it results in breaking control flow. If the control flow is of type break, the loop expression itself will evaluate to the break's value.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
class Evaluator {
|
class Evaluator {
|
||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "loop") {
|
if (expr.kind.type === "loop") {
|
||||||
while (true) {
|
while (true) {
|
||||||
const flow = this.evaluate(expr.body, syms);
|
const flow = this.evalExpr(expr.kind.body, syms);
|
||||||
if (flow.type === "break")
|
if (flow.type === "break")
|
||||||
return flowValue(flow.value);
|
return flowValue(flow.value);
|
||||||
if (flow.type !== "value")
|
if (flow.type !== "value")
|
||||||
@ -638,7 +638,7 @@ class Evaluator {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
First, start an infinite loop. In each iteration, evalute the loop body. If the resulting control flow is breaking, return the break value. If the control flow is not a value, meaning return or other unimplemented control flow, just return the control flow. Otherwise, discard the value and repeate.
|
First, start an infinite loop. In each iteration, evaluate the loop body. If the resulting control flow is breaking, return the break value. If the control flow is not a value, meaning return or other unimplemented control flow, just return the control flow. Otherwise, discard the value and repeat.
|
||||||
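The loop rules above can be tried out without the rest of the evaluator. In this sketch the `Value` and `Flow` types are reduced, hypothetical versions of the tutorial's types, and the loop body is passed in as a callback (`runLoop` is an illustrative name) instead of being an expression to evaluate.

```ts
// Reduced types for illustration only.
type Value = { type: "null" } | { type: "int"; value: number };
type Flow =
    | { type: "value"; value: Value }
    | { type: "break"; value: Value }
    | { type: "return"; value: Value };

function runLoop(body: () => Flow): Flow {
    while (true) {
        const flow = body();
        // A break ends the loop; the loop itself evaluates to the break's value.
        if (flow.type === "break")
            return { type: "value", value: flow.value };
        // Any other non-value flow (e.g. return) is passed on unchanged.
        if (flow.type !== "value")
            return flow;
        // Otherwise the value is discarded and the body runs again.
    }
}

let counter = 0;
const loopResult = runLoop(() => {
    counter += 1;
    if (counter === 3)
        return { type: "break", value: { type: "int", value: counter } };
    return { type: "value", value: { type: "null" } };
});
console.log(loopResult); // { type: "value", value: { type: "int", value: 3 } }
```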
|
|
||||||
|
|
||||||
## 4.5.11 Block expressions
|
## 4.5.11 Block expressions
|
||||||
@ -650,15 +650,15 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalExpr(expr: Expr, syms: Syms): Flow {
|
public evalExpr(expr: Expr, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (expr.type === "block") {
|
if (expr.kind.type === "block") {
|
||||||
let scopeSyms = new Syms(syms);
|
let scopeSyms = new Syms(syms);
|
||||||
for (const stmt of block.stmts) {
|
for (const stmt of expr.kind.stmts) {
|
||||||
const flow = this.evalStmt(stmt, scopeSyms);
|
const flow = this.evalStmt(stmt, scopeSyms);
|
||||||
if (flow.type !== "value")
|
if (flow.type !== "value")
|
||||||
return flow;
|
return flow;
|
||||||
}
|
}
|
||||||
if (expr.expr)
|
if (expr.kind.expr)
|
||||||
return this.evalExpr(expr.expr, scopeSyms);
|
return this.evalExpr(expr.kind.expr, scopeSyms);
|
||||||
return flowValue({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
@ -682,11 +682,11 @@ For evaluating statements, we'll make a function called `evalStmt` .
|
|||||||
class Evaluator {
|
class Evaluator {
|
||||||
// ...
|
// ...
|
||||||
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
||||||
if (stmt.type === "error") {
|
if (stmt.kind.type === "error") {
|
||||||
throw new Error("error in AST");
|
throw new Error("error in AST");
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
throw new Error(`unknown stmt type "${expr.type}"`);
|
throw new Error(`unknown stmt type "${stmt.kind.type}"`);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -703,10 +703,10 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (stmt.type === "break") {
|
if (stmt.kind.type === "break") {
|
||||||
if (!stmt.expr)
|
if (!stmt.kind.expr)
|
||||||
return { type: "break" };
|
return { type: "break", value: { type: "null" } };
|
||||||
const [value, valueFlow] = expectValue(this.evalExpr(stmt.expr));
|
const [value, valueFlow] = expectValue(this.evalExpr(stmt.kind.expr, syms));
|
||||||
if (!value)
|
if (!value)
|
||||||
return valueFlow;
|
return valueFlow;
|
||||||
return { type: "break", value };
|
return { type: "break", value };
|
||||||
@ -728,10 +728,10 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (stmt.type === "return") {
|
if (stmt.kind.type === "return") {
|
||||||
if (!stmt.expr)
|
if (!stmt.kind.expr)
|
||||||
return { type: "return" };
|
return { type: "return", value: { type: "null" } };
|
||||||
const [value, valueFlow] = expectValue(this.evalExpr(stmt.expr));
|
const [value, valueFlow] = expectValue(this.evalExpr(stmt.kind.expr, syms));
|
||||||
if (!value)
|
if (!value)
|
||||||
return valueFlow;
|
return valueFlow;
|
||||||
return { type: "return", value };
|
return { type: "return", value };
|
||||||
@ -753,8 +753,8 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (stmt.type === "expr") {
|
if (stmt.kind.type === "expr") {
|
||||||
return this.evalExpr(stmt.expr);
|
return this.evalExpr(stmt.kind.expr, syms);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -771,41 +771,42 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (stmt.type === "assign") {
|
if (stmt.kind.type === "assign") {
|
||||||
const [value, valueFlow] = expectValue(this.evalExpr(stmt.value, syms));
|
const [value, valueFlow] = expectValue(this.evalExpr(stmt.kind.value, syms));
|
||||||
if (!value)
|
if (!value)
|
||||||
return valueFlow;
|
return valueFlow;
|
||||||
if (stmt.subject.type === "ident") {
|
if (stmt.kind.subject.kind.type === "ident") {
|
||||||
const ident = stmt.subject.value;
|
const ident = stmt.kind.subject.kind.value;
|
||||||
const { ok: found, sym } = syms.get(ident);
|
const getResult = syms.get(ident);
|
||||||
if (!found)
|
if (!getResult.ok)
|
||||||
throw new Error(`cannot assign to undefined symbol "${ident}"`);
|
throw new Error(`cannot assign to undefined symbol "${ident}"`);
|
||||||
|
const { sym } = getResult;
|
||||||
if (sym.value.type !== value.type)
|
if (sym.value.type !== value.type)
|
||||||
throw new Error(`cannot assign value of type ${value.type} to symbol originally declared ${sym.value.type}`);
|
throw new Error(`cannot assign value of type ${value.type} to symbol originally declared ${sym.value.type}`);
|
||||||
sym.value = value;
|
sym.value = value;
|
||||||
return valueFlow({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
if (stmt.subject.type === "field") {
|
if (stmt.kind.subject.kind.type === "field") {
|
||||||
const [subject, subjectFlow] = expectValue(this.evalExpr(stmt.subject.subject, syms));
|
const [subject, subjectFlow] = expectValue(this.evalExpr(stmt.kind.subject.kind.subject, syms));
|
||||||
if (!subject)
|
if (!subject)
|
||||||
return subjectFlow;
|
return subjectFlow;
|
||||||
if (subject.type !== "struct")
|
if (subject.type !== "struct")
|
||||||
throw new Error(`cannot use field operator on ${subject.type} value`);
|
throw new Error(`cannot use field operator on ${subject.type} value`);
|
||||||
subject.fields[stmt.subject.value] = value;
|
subject.fields[stmt.kind.subject.kind.value] = value;
|
||||||
return valueFlow({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
if (stmt.subject.type === "index") {
|
if (stmt.kind.subject.kind.type === "index") {
|
||||||
const [subject, subjectFlow] = expectValue(this.evalExpr(stmt.subject.subject, syms));
|
const [subject, subjectFlow] = expectValue(this.evalExpr(stmt.kind.subject.kind.subject, syms));
|
||||||
if (!subject)
|
if (!subject)
|
||||||
return subjectFlow;
|
return subjectFlow;
|
||||||
const [index, indexFlow] = expectValue(this.evalExpr(stmt.subject.value, syms));
|
const [index, indexFlow] = expectValue(this.evalExpr(stmt.kind.subject.kind.value, syms));
|
||||||
if (!index)
|
if (!index)
|
||||||
return valueFlow;
|
return indexFlow;
|
||||||
if (subject.type === "struct") {
|
if (subject.type === "struct") {
|
||||||
if (index.type !== "string")
|
if (index.type !== "string")
|
||||||
throw new Error(`cannot index into struct with ${index.type} value`);
|
throw new Error(`cannot index into struct with ${index.type} value`);
|
||||||
subject.fields[index.value] = value;
|
subject.fields[index.value] = value;
|
||||||
return valueFlow({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
if (subject.type === "array") {
|
if (subject.type === "array") {
|
||||||
if (index.type !== "int")
|
if (index.type !== "int")
|
||||||
@ -813,19 +814,19 @@ class Evaluator {
|
|||||||
if (index.value >= subject.values.length)
|
if (index.value >= subject.values.length)
|
||||||
throw new Error("index out of range");
|
throw new Error("index out of range");
|
||||||
if (index.value >= 0) {
|
if (index.value >= 0) {
|
||||||
subject.value[index.value] = value;
|
subject.values[index.value] = value;
|
||||||
} else {
|
} else {
|
||||||
const negativeIndex = subject.values.length + index.value;
|
const negativeIndex = subject.values.length + index.value;
|
||||||
if (negativeIndex < 0 || negativeIndex >= subject.values.length)
|
if (negativeIndex < 0 || negativeIndex >= subject.values.length)
|
||||||
throw new Error("index out of range");
|
throw new Error("index out of range");
|
||||||
subject.value[negativeIndex] = value;
|
subject.values[negativeIndex] = value;
|
||||||
return valueFlow({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
return valueFlow({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
throw new Error(`cannot use field operator on ${subject.type} value`);
|
throw new Error(`cannot use field operator on ${subject.type} value`);
|
||||||
}
|
}
|
||||||
throw new Error(`cannot assign to ${stmt.subject.type} expression`);
|
throw new Error(`cannot assign to ${stmt.kind.subject.kind.type} expression`);
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -839,7 +840,7 @@ For assigning to identifiers, eg. `a = 5`, we start by finding the symbol. If no
|
|||||||
|
|
||||||
For assigning to fields, eg. `a.b = 5`, we evaluate the inner (field expression) subject expression, `a` in this case. Then we reassign the field value or assign to a new field, if it doesn't exist.
|
For assigning to fields, eg. `a.b = 5`, we evaluate the inner (field expression) subject expression, `a` in this case. Then we reassign the field value or assign to a new field, if it doesn't exist.
|
||||||
|
|
||||||
And then, for assigning to indeces, eg. `a[b] = 5`, we evalute the inner (index expression) subject `a` and index value `b` in that order. If `a` is a struct, we check that `b` is a string and assign to the field, the string names. Else, if `a` is an array, we check that `b` is an int and assign to the index or negative index (see 4.5.5 Index expressions).
|
And then, for assigning to indices, eg. `a[b] = 5`, we evaluate the inner (index expression) subject `a` and index value `b` in that order. If `a` is a struct, we check that `b` is a string and assign to the field, the string names. Else, if `a` is an array, we check that `b` is an int and assign to the index or negative index (see 4.5.5 Index expressions).
|
||||||
|
|
||||||
### 4.6.5 Let statements
|
### 4.6.5 Let statements
|
||||||
|
|
||||||
@ -850,14 +851,14 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (stmt.type === "let") {
|
if (stmt.kind.type === "let") {
|
||||||
if (syms.definedLocally(stmt.param.ident))
|
if (syms.definedLocally(stmt.kind.param.ident))
|
||||||
throw new Error(`cannot redeclare symbol "${stmt.param.ident}"`);
|
throw new Error(`cannot redeclare symbol "${stmt.kind.param.ident}"`);
|
||||||
const [value, valueFlow] = expectValue(this.evalExpr(stmt.value, syms));
|
const [value, valueFlow] = expectValue(this.evalExpr(stmt.kind.value, syms));
|
||||||
if (!value)
|
if (!value)
|
||||||
return valueFlow;
|
return valueFlow;
|
||||||
syms.define(stmt.param.ident, value);
|
syms.define(stmt.kind.param.ident, { value: value, pos: stmt.pos });
|
||||||
return valueFlow({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -874,10 +875,10 @@ class Evaluator {
|
|||||||
// ...
|
// ...
|
||||||
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
public evalStmt(stmt: Stmt, syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (stmt.type === "fn") {
|
if (stmt.kind.type === "fn") {
|
||||||
if (syms.definedLocally(stmt.ident))
|
if (syms.definedLocally(stmt.kind.ident))
|
||||||
throw new Error(`cannot redeclare function "${stmt.ident}"`);
|
throw new Error(`cannot redeclare function "${stmt.kind.ident}"`);
|
||||||
const { params, body } = stmt;
|
const { params, body } = stmt.kind;
|
||||||
|
|
||||||
let paramNames: string[] = [];
|
let paramNames: string[] = [];
|
||||||
for (const param of params) {
|
for (const param of params) {
|
||||||
@ -887,9 +888,9 @@ class Evaluator {
|
|||||||
}
|
}
|
||||||
|
|
||||||
const id = this.fnDefs.length;
|
const id = this.fnDefs.length;
|
||||||
this.fnDefs.push({ params, body, id });
|
this.fnDefs.push({ params, body, id, syms });
|
||||||
this.syms.define(stmt.ident, { type: "fn", fnDefId: id });
|
syms.define(stmt.kind.ident, { value: { type: "fn", fnDefId: id }, pos: stmt.pos });
|
||||||
return flowValue({ type: "none" });
|
return flowValue({ type: "null" });
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
@ -906,9 +907,9 @@ We'll want a function for evaluating the top-level statements.
|
|||||||
```ts
|
```ts
|
||||||
class Evaluator {
|
class Evaluator {
|
||||||
// ...
|
// ...
|
||||||
public evalStmts(stmts: Stmt[], syms: Syms) {
|
public evalStmts(stmts: Stmt[]) {
|
||||||
let scopeSyms = new Syms(syms);
|
let scopeSyms = new Syms(this.root);
|
||||||
for (const stmt of block.stmts) {
|
for (const stmt of stmts) {
|
||||||
const flow = this.evalStmt(stmt, scopeSyms);
|
const flow = this.evalStmt(stmt, scopeSyms);
|
||||||
if (flow.type !== "value")
|
if (flow.type !== "value")
|
||||||
throw new Error(`${flow.type} on the loose!`);
|
throw new Error(`${flow.type} on the loose!`);
|
||||||
@ -1063,13 +1064,13 @@ class Evaluator {
|
|||||||
private executeBuiltin(name: string, args: Value[], syms: Syms): Flow {
|
private executeBuiltin(name: string, args: Value[], syms: Syms): Flow {
|
||||||
// ...
|
// ...
|
||||||
if (name === "println") {
|
if (name === "println") {
|
||||||
if (args.length < 1)
|
if (args.length < 1 || args[0].type !== "string")
|
||||||
throw new Error("incorrect arguments");
|
throw new Error("incorrect arguments");
|
||||||
let msg = args[0];
|
let msg = args[0].value;
|
||||||
for (const arg of args.slice(1)) {
|
for (const arg of args.slice(1)) {
|
||||||
if (!msg.includes("{}"))
|
if (!msg.includes("{}"))
|
||||||
throw new Error("incorrect arguments");
|
throw new Error("incorrect arguments");
|
||||||
msg.replace("{}", valueToString(arg));
|
msg = msg.replace("{}", valueToString(arg));
|
||||||
}
|
}
|
||||||
console.log(msg);
|
console.log(msg);
|
||||||
return flowValue({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
@ -1080,7 +1081,7 @@ class Evaluator {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
This function takes a format-string as the first argument, and, corrosponding to the format-string, values to be formattet in the correct order.
|
This function takes a format-string as the first argument, and, corresponding to the format-string, values to be formatted in the correct order.
|
||||||
|
|
||||||
Examples of how to use the function follows.
|
Examples of how to use the function follows.
|
||||||
|
|
||||||
@ -1095,7 +1096,7 @@ println("{} + {} = {}", 1, 2, 1 + 2);
|
|||||||
|
|
||||||
### 4.8.5 Exit
|
### 4.8.5 Exit
|
||||||
|
|
||||||
Normally, the evaluator will return a zero-exit code, meanin no error. In case we program should result in an error code, we'll need an exit function.
|
Normally, the evaluator will return a zero-exit code, meaning no error. In case the program should result in an error code, we'll need an exit function.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
class Evaluator {
|
class Evaluator {
|
||||||
@ -1151,7 +1152,7 @@ class Evaluator {
|
|||||||
if (args.length !== 1 || args[0].type !== "string")
|
if (args.length !== 1 || args[0].type !== "string")
|
||||||
throw new Error("incorrect arguments");
|
throw new Error("incorrect arguments");
|
||||||
const value = parseInt(args[0].value);
|
const value = parseInt(args[0].value);
|
||||||
if (value === NaN)
|
if (isNaN(value))
|
||||||
return flowValue({ type: "null" });
|
return flowValue({ type: "null" });
|
||||||
return flowValue({ type: "int", value });
|
return flowValue({ type: "int", value });
|
||||||
}
|
}
|
||||||
@ -1171,14 +1172,20 @@ Finally, we need a way to define the builtin functions in the symbol table.
|
|||||||
class Evaluator {
|
class Evaluator {
|
||||||
// ...
|
// ...
|
||||||
public defineBuiltins() {
|
public defineBuiltins() {
|
||||||
this.root.define("array", { type: "builtin_fn", name: "array" });
|
function defineBuiltin(syms: Syms, name: string) {
|
||||||
this.root.define("struct", { type: "builtin_fn", name: "struct" });
|
syms.define(name, { value: { type: "builtin_fn", name } });
|
||||||
this.root.define("array_push", { type: "builtin_fn", name: "array_push" });
|
}
|
||||||
this.root.define("array_len", { type: "builtin_fn", name: "array_len" });
|
|
||||||
this.root.define("string_concat", { type: "builtin_fn", name: "string_concat" });
|
defineBuiltin(this.root, "array");
|
||||||
this.root.define("string_len", { type: "builtin_fn", name: "string_len" });
|
defineBuiltin(this.root, "struct");
|
||||||
this.root.define("println", { type: "builtin_fn", name: "println" });
|
defineBuiltin(this.root, "array_push");
|
||||||
this.root.define("exit", { type: "builtin_fn", name: "exit" });
|
defineBuiltin(this.root, "array_len");
|
||||||
|
defineBuiltin(this.root, "string_concat");
|
||||||
|
defineBuiltin(this.root, "string_len");
|
||||||
|
defineBuiltin(this.root, "println");
|
||||||
|
defineBuiltin(this.root, "exit");
|
||||||
|
defineBuiltin(this.root, "to_string");
|
||||||
|
defineBuiltin(this.root, "string_to_int");
|
||||||
}
|
}
|
||||||
// ...
|
// ...
|
||||||
}
|
}
|
||||||
|
250
compiler/chapter_5.md
Normal file
@ -0,0 +1,250 @@
|
|||||||
|
|
||||||
|
# 5 Make a test program
|
||||||
|
|
||||||
|
To test the evaluator and any future runtimes, we'll remake the calculator from chapter 1.
|
||||||
|
|
||||||
|
## 5.1 Hello world
|
||||||
|
|
||||||
|
First, we'll need to set up the parser and evaluator.
|
||||||
|
|
||||||
|
I'll show this example using Deno.
|
||||||
|
|
||||||
|
```ts
|
||||||
|
const filename = Deno.args[0];
|
||||||
|
const text = await Deno.readTextFile(filename);
|
||||||
|
const parser = new Parser(new Lexer(text));
|
||||||
|
const ast = parser.parseStmts();
|
||||||
|
const evaluator = new Evaluator();
|
||||||
|
evaluator.defineBuiltins();
|
||||||
|
evaluator.evalStmts(ast);
|
||||||
|
```
|
||||||
|
|
||||||
|
Now we can write source code in a file.
|
||||||
|
```rs
|
||||||
|
println("hello world");
|
||||||
|
```
|
||||||
|
Save it to a file, eg. `program.txt`, and run it with the following command.
|
||||||
|
```sh
|
||||||
|
deno run -A main.ts program.txt
|
||||||
|
```
|
||||||
|
And it should output the following.
|
||||||
|
```
|
||||||
|
hello world
|
||||||
|
```
|
||||||
|
|
||||||
|
## 5.2 Representation in code
|
||||||
|
|
||||||
|
We'll use the same representation in code as we did in chapter 1. The following is how it was done in JavaScript.
|
||||||
|
|
||||||
|
```js
|
||||||
|
const expr = {
|
||||||
|
type: "add",
|
||||||
|
left: { type: "int", value: 1 },
|
||||||
|
right: {
|
||||||
|
type: "multiply",
|
||||||
|
left: { type: "int", value: 2 },
|
||||||
|
right: { type: "int", value: 3 },
|
||||||
|
},
|
||||||
|
};
|
||||||
|
```
|
||||||
|
|
||||||
|
To do this in our language (unless you've implemented struct literals), we'll need to use the `struct` builtin and field expression assignment.
|
||||||
|
|
||||||
|
```rs
|
||||||
|
let expr = struct();
|
||||||
|
expr["type"] = "add";
|
||||||
|
expr["left"] = struct();
|
||||||
|
expr.left["type"] = "int";
|
||||||
|
expr.left["value"] = 1;
|
||||||
|
expr["right"] = struct();
|
||||||
|
expr.right["type"] = "multiply";
|
||||||
|
expr.right["left"] = struct();
|
||||||
|
expr.right.left["type"] = "int";
|
||||||
|
expr.right.left["value"] = 2;
|
||||||
|
expr.right["right"] = struct();
|
||||||
|
expr.right.right["type"] = "int";
|
||||||
|
expr.right.right["value"] = 3;
|
||||||
|
```
|
||||||
|
|
||||||
|
If we print this value, we should expect an output like the following.
|
||||||
|
```
|
||||||
|
{ type: "add", left: { type: "int", value: 1 }, right: { type: "multiply", left: { type: "int", value: 2 }, right: { type: "int", value: 3 } } }
|
||||||
|
```
|
||||||
|
|
||||||
|
## 5.3 Evaluating expressions
|
||||||
|
|
||||||
|
Exactly like in chapter 1, we need a function for evaluating the expression structure described above.
|
||||||
|
|
||||||
|
```rs
|
||||||
|
fn eval_expr(node) {
|
||||||
|
if == node.type "int" {
|
||||||
|
return node.value;
|
||||||
|
}
|
||||||
|
if == node.type "add" {
|
||||||
|
let left = eval_expr(node.left);
|
||||||
|
let right = eval_expr(node.right);
|
||||||
|
return + left right;
|
||||||
|
}
|
||||||
|
if == node.type "multiply" {
|
||||||
|
let left = eval_expr(node.left);
|
||||||
|
let right = eval_expr(node.right);
|
||||||
|
return * left right;
|
||||||
|
}
|
||||||
|
println("unknown expr type");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Exercises
|
||||||
|
|
||||||
|
1. Implement `subtract` and `divide`.
|
||||||
|
|
||||||
|
|
||||||
|
## 5.4 Parsing source code
|
||||||
|
|
||||||
|
### 5.4.1 Lexer
|
||||||
|
|
||||||
|
```rs
|
||||||
|
fn char_in_string(ch, val) {
|
||||||
|
let i = 0;
|
||||||
|
let val_len = string_len(val);
|
||||||
|
loop {
|
||||||
|
if >= i val_len {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
if == ch val[i] {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
i = + i 1;
|
||||||
|
}
|
||||||
|
false
|
||||||
|
}
|
||||||
|
|
||||||
|
fn lex(text) {
|
||||||
|
let i = 0;
|
||||||
|
let tokens = array();
|
||||||
|
let text_length = string_len(text);
|
||||||
|
loop {
|
||||||
|
if >= i text_length {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
loop {
|
||||||
|
if not char_in_string(text[i], " \t\n") {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
i = + i 1;
|
||||||
|
}
|
||||||
|
if char_in_string(text[i], "1234567890") {
|
||||||
|
let text_val = "";
|
||||||
|
loop {
|
||||||
|
if not char_in_string(text[i], "1234567890") {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
text_val = string_concat(text_val, text[i]);
|
||||||
|
i = + i 1;
|
||||||
|
}
|
||||||
|
let token = struct();
|
||||||
|
token["type"] = "int";
|
||||||
|
token["value"] = string_to_int(text_val);
|
||||||
|
array_push(tokens, token);
|
||||||
|
} else if == text[i] "+"[0] {
|
||||||
|
i = + i 1;
|
||||||
|
let token = struct();
|
||||||
|
token["type"] = "+";
|
||||||
|
array_push(tokens, token);
|
||||||
|
} else if == text[i] "*"[0] {
|
||||||
|
i = + i 1;
|
||||||
|
let token = struct();
|
||||||
|
token["type"] = "*";
|
||||||
|
array_push(tokens, token);
|
||||||
|
} else {
|
||||||
|
println("illegal character");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
tokens
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Exercises
|
||||||
|
|
||||||
|
1. Implement `-` and `/` for subtraction and division.
|
||||||
|
2. \* Add position (line and column number) to each token.
|
||||||
|
3. \*\* Rewrite lexer into an iterator (eg. use the OOP iterator pattern).
|
||||||
|
|
||||||
|
### 5.4.2 Parser
|
||||||
|
|
||||||
|
```rs
|
||||||
|
fn parser_new(tokens) {
|
||||||
|
let self = struct();
|
||||||
|
self["tokens"] = tokens;
|
||||||
|
self["i"] = 0;
|
||||||
|
self
}
|
||||||
|
|
||||||
|
fn parser_step(self) { self.i = + self.i 1; }
|
||||||
|
fn parser_done(self) { >= self.i array_len(self.tokens) }
|
||||||
|
fn parser_current(self) { self.tokens[self.i] }
|
||||||
|
|
||||||
|
fn parser_parse_expr(self) {
|
||||||
|
if parser_done(self) {
|
||||||
|
println("expected expr, got end-of-file");
|
||||||
|
exit(1);
|
||||||
|
} else if == parser_current(self).type "+" {
|
||||||
|
parser_step(self);
|
||||||
|
let left = parser_parse_expr(self);
|
||||||
|
let right = parser_parse_expr(self);
|
||||||
|
let node = struct();
|
||||||
|
node["type"] = "add";
|
||||||
|
node["left"] = left;
|
||||||
|
node["right"] = right;
|
||||||
|
node
|
||||||
|
} else if == parser_current(self).type "*" {
|
||||||
|
parser_step(self);
|
||||||
|
let left = parser_parse_expr(self);
|
||||||
|
let right = parser_parse_expr(self);
|
||||||
|
let node = struct();
|
||||||
|
node["type"] = "multiply";
|
||||||
|
node["left"] = left;
|
||||||
|
node["right"] = right;
|
||||||
|
node
|
||||||
|
} else if == parser_current(self).type "int" {
|
||||||
|
let value = parser_current(self).value;
|
||||||
|
parser_step(self);
|
||||||
|
let node = struct();
|
||||||
|
node["type"] = "int";
|
||||||
|
node["value"] = value;
|
||||||
|
node
|
||||||
|
} else {
|
||||||
|
println("expected expr");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
#### Exercises
|
||||||
|
|
||||||
|
1. Implement subtraction and division.
|
||||||
|
2. \* Add position (line and column number) to each expression.
|
||||||
|
|
||||||
|
## 5.5 Putting it together
|
||||||
|
|
||||||
|
```rs
|
||||||
|
let text = "+ 1 2";
|
||||||
|
|
||||||
|
let tokens = lex(text);
|
||||||
|
let parser = parser_new(tokens);
|
||||||
|
let expr = parser_parse_expr(parser);
|
||||||
|
let result = eval_expr(expr);
|
||||||
|
println("result of {} is {}", text, result);
|
||||||
|
```
|
||||||
|
|
||||||
|
Running the code, we should expect console output like the following.
|
||||||
|
```
|
||||||
|
result of + 1 2 is 3
|
||||||
|
```
|
||||||
|
|
||||||
|
## Exercises
|
||||||
|
|
||||||
|
1. Make a performance benchmark.
|
||||||
|
|
64
compiler/chapter_6.md
Normal file
@ -0,0 +1,64 @@
|
|||||||
|
|
||||||
|
# 6 Summary theory
|
||||||
|
|
||||||
|
What's the next step?
|
||||||
|
|
||||||
|
To answer that question, we'll have to understand a bit of theory, to know where we are and how we got here.
|
||||||
|
|
||||||
|
## 6.1 Parsing source code into AST
|
||||||
|
|
||||||
|
We started in chapters 2 and 3 by making a parser consisting of the `Lexer` and `Parser` classes. The parser takes source code and translates, or *parses*, it into a structured representation called an AST. AST is short for **A**bstract **s**yntax **t**ree, which means that the original program, or code, is represented as a tree-structure.
|
||||||
|
|
||||||
|
We defined the *structure* of the AST (*code as tree-structure*) through the `Stmt`, `Expr`, ... types.
|
||||||
|
|
||||||
|
We converted the source code from text to AST, because AST is easier to work with in the step that followed.
|
||||||
|
|
||||||
|
## 6.2 Evaluating AST
|
||||||
|
|
||||||
|
In chapter 4 we made an evaluator consisting primarily of the `Value` type, `Syms` class and `Evaluator` class. The evaluator, or AST evaluator, takes a program represented in AST (*code as tree-structure*), goes through the tree-structure and calculates the resulting values.
|
||||||
|
|
||||||
|
Execution using the evaluator is a top-down, outside-in process, where we start with the root node, and then call `evalStmt` and `evalExpr` for each child node recursively. We then use the values obtained by evaluating the child nodes to calculate the result for a node.
|
||||||
|
|
||||||
|
The only upside of implementing an AST evaluator is that it is simple to implement and think about. An AST evaluator operates on an AST, which is a comparatively close representation of the original source code. Humans understand programs from the point of view of the source code. Therefore, an AST evaluator executes the code, or *thinks about* the program, the same way we humans think about programs. The human brain functions efficiently using layers of abstraction.
|
||||||
|
|
||||||
|
Take the math expression `2 * (3 + 4)`. We start by examining the entire expression. Then we split up the expression into its components, that being a multiplication of `2` by the addition of `3` and `4`. We then, to calculate the result of the outer expression, calculate the result of the inner expression: `3 + 4 = 7`. Then we use that result and calculate the multiplication: `2 * 7 = 14`. The evaluator functions exactly this way.
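
The same walk-through can be written as a small recursive function. This is only a sketch for illustration: the `MathExpr` type and `evalMathExpr` function are hypothetical, cut-down stand-ins for the `Expr` type and `evalExpr` method from chapter 4.

```ts
// Hypothetical, minimal node type for illustration; the real `Expr` type is richer.
type MathExpr =
    | { type: "int"; value: number }
    | { type: "add"; left: MathExpr; right: MathExpr }
    | { type: "multiply"; left: MathExpr; right: MathExpr };

// Outside-in: to evaluate a node, first evaluate its child nodes recursively.
function evalMathExpr(node: MathExpr): number {
    switch (node.type) {
        case "int":
            return node.value;
        case "add":
            return evalMathExpr(node.left) + evalMathExpr(node.right);
        case "multiply":
            return evalMathExpr(node.left) * evalMathExpr(node.right);
    }
}

// 2 * (3 + 4)
const example: MathExpr = {
    type: "multiply",
    left: { type: "int", value: 2 },
    right: {
        type: "add",
        left: { type: "int", value: 3 },
        right: { type: "int", value: 4 },
    },
};

console.log(evalMathExpr(example)); // 14
```

Note that the inner addition is only reached by first visiting the outer multiplication, which is the tree-traversal discussed below.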
|
||||||
|
|
||||||
|
There are multiple downsides to AST evaluation.
|
||||||
|
|
||||||
|
One downside is that some features of the source language are *ugly* to implement. While expression evaluation is conceptually simple to implement using function calls, other features are not. Control flow related features such as break and return cannot be evaluated using only function calls. This is because function calls follow the (control) flow, while break and return *break* the control flow.
|
||||||
|
|
||||||
|
But the primary downside of AST evaluation is performance. While humans are most efficient when using layers of abstraction, computers are not. For various reasons, repeatedly calling functions recursively and jumping through tree-structures is very inefficient for computers, both in terms of memory footprint and execution time. Computers are much more efficient with sequential execution.
|
||||||
|
|
||||||
|
Take the expression defined above. Now imagine we're describing the instructions for how to get the result. We would of course look at the whole expression, then break it down. We could then formulate *instructions* such as *add 3 to 4, then multiply by 2*. Now if we execute the instructions, we don't start by examining the entire expression, we just execute the instructions in order. We have now translated the expression into linear execution, which computers are very good at running.
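
As a rough sketch, the instructions *add 3 to 4, then multiply by 2* could be written down as a plain list and executed in order. The `Step` type and `runSteps` function are made up for this illustration and are not the instruction format designed later.

```ts
// Hypothetical instruction shape, for illustration only.
type Step =
    | { op: "load"; value: number }
    | { op: "add"; value: number }
    | { op: "multiply"; value: number };

const steps: Step[] = [
    { op: "load", value: 4 },
    { op: "add", value: 3 }, // add 3 to 4
    { op: "multiply", value: 2 }, // then multiply by 2
];

// Executes the steps strictly in order, keeping one running result.
function runSteps(steps: Step[]): number {
    let acc = 0;
    for (const step of steps) {
        if (step.op === "load") acc = step.value;
        else if (step.op === "add") acc += step.value;
        else acc *= step.value;
    }
    return acc;
}

console.log(runSteps(steps)); // 14
```

There is no recursion and no tree here; the loop never needs to look at the expression as a whole.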
|
||||||
|
|
||||||
|
## 6.3 Instructions
|
||||||
|
|
||||||
|
The 2 main pros of using instructions counter the cons of AST evaluation.
|
||||||
|
|
||||||
|
A lesser point, pertaining to implementing control flow, is that everything is done using the sequential instructions. This means special control flow, such as break, is handled in the same manner as regular control flow like loops.
|
||||||
|
|
||||||
|
The primary upside compared to AST evaluation is performance. Running instructions is a sequential operation, which could for example look like an array of instructions, where the location of the current instruction is stored, and execution loops over the array with a for loop. This is A LOT faster compared to AST evaluation with a tree-structure and recursive function calls. The technical details of why this is faster are hard to explain both simply and accurately, so I'll spare the explanation. You can think about it like this: the parser generates a tree; to evaluate the program, we need to do tree-traversal; tree-traversal is slow; therefore we should minimize the amount of tree-traversal. In the AST evaluator, tree-traversal is used on an expression every time it is run, in a loop or function call for instance. Instead, we'd rather do tree-traversal once for the entire program and generate these instructions, which do not require tree-traversal to run, ie. we do the costly tree-traversal once instead of multiple times.
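
To make the *traverse once, run many times* idea concrete, here is a minimal sketch. It assumes a stack-based instruction format, which is just one possible choice and not necessarily the architecture chosen later; the `Tree`, `Instr`, `compile` and `run` names are hypothetical.

```ts
// Hypothetical tree and instruction types, for illustration only.
type Tree =
    | { type: "int"; value: number }
    | { type: "add"; left: Tree; right: Tree }
    | { type: "multiply"; left: Tree; right: Tree };

type Instr =
    | { op: "push"; value: number }
    | { op: "add" }
    | { op: "multiply" };

// The tree-traversal happens here, once: operands first, then the operator.
function compile(tree: Tree, out: Instr[] = []): Instr[] {
    switch (tree.type) {
        case "int":
            out.push({ op: "push", value: tree.value });
            break;
        case "add":
        case "multiply":
            compile(tree.left, out);
            compile(tree.right, out);
            out.push({ op: tree.type });
            break;
    }
    return out;
}

// Running is a plain loop over the array; `pc` is the current instruction's location.
function run(program: Instr[]): number {
    const stack: number[] = [];
    for (let pc = 0; pc < program.length; pc++) {
        const instr = program[pc];
        if (instr.op === "push") {
            stack.push(instr.value);
            continue;
        }
        const right = stack.pop()!;
        const left = stack.pop()!;
        stack.push(instr.op === "add" ? left + right : left * right);
    }
    return stack.pop()!;
}

// 2 * (3 + 4)
const tree: Tree = {
    type: "multiply",
    left: { type: "int", value: 2 },
    right: {
        type: "add",
        left: { type: "int", value: 3 },
        right: { type: "int", value: 4 },
    },
};

const program = compile(tree); // push 2, push 3, push 4, add, multiply
console.log(run(program)); // 14
```

`compile` is called once, while `run` can be called on the resulting `program` as many times as needed without touching the tree again.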
|
||||||
|
|
||||||
|
The primary downside of this approach compared to AST evaluation is the effort required. AST evaluation is both conceptually simple and relatively simple to implement, as it executes the code in just the form the parser outputs, which is also close to the source language. To run instructions instead, we need to translate the program from the AST into runnable sequential instructions. To evaluate using instructions instead of AST evaluation, we need to do the following conceptual steps (implemented separately in our case):
|
||||||
|
|
||||||
|
- Parse source code into AST.
|
||||||
|
- ~~Evaluate AST.~~
|
||||||
|
- Resolve symbols. Like how we used the symbol table in the evaluator.
|
||||||
|
- Check semantic code structure. We can often parse source code that cannot be run. In the evaluator we had checks in different places that would throw errors. This is the same concept. This will also include type checking in our case.
|
||||||
|
- Translate (or compile) resolved and checked AST into instructions.
|
||||||
|
- Execute instructions.
|
||||||
|
|
||||||
|
As can be seen, some of the needed steps are steps which are combined in the evaluator. Symbol resolution (resolving) is comparable to how we resolved symbol values (variables) in the evaluator. Semantic code checking is comparable to how we check the AST at different places in the evaluator, such as checking that the `+`-operator is only applied to either 2 ints or 2 strings.
|
||||||
|
|
||||||
|
Translation into instructions and separate execution are new steps. These are also conceptually more advanced than AST evaluation, in the sense that AST evaluation operates on a high-level representation of the code, meaning it's close to the original source code, whereas instructions are further away, meaning more low-level. This means we need to make some information needed for executing instructions explicit, which may be implicit in the AST representation because the tree-structure in and of itself contains some of the information.
|
||||||
|
|
||||||
|
We also need to design the instructions, meaning we need to choose which instructions should be available (the instruction set) and some details of how they should be run. The design decisions in this step are essentially arbitrary, meaning there often is not a *correct* decision, whereas the evaluator is designed exactly to evaluate the AST, which in some sense is designed to exactly represent the source code. These design decisions require trade-offs, eg. of performance, simplicity, ease of implementation, portability, versatility.
|
||||||
|
|
||||||
|
## 6.4 Virtual machine
|
||||||
|
|
||||||
|
The component executing the program is called a virtual machine, VM for short, when we're working with instructions. We'll make the distinction like this: an evaluator is synonymous with an AST evaluator, whereas a virtual machine runs instructions.
|
||||||
|
|
||||||
|
The design decisions of a virtual machine, ie. how it runs and how the program should be represented, are called the architecture.
|
||||||
|
|
||||||
|
|
||||||
|
|