Add to chapter 6, 7

Simon From Jakobsen 2024-10-29 14:31:27 +00:00
parent 45a6d9f8a6
commit 815cde19ae
2 changed files with 221 additions and 6 deletions

@ -60,5 +60,19 @@ The component executing the program is called a virtual machine, VM for short, w
The design decisions of a virtual machine, ie. how it runs and what form the program should take, are called the architecture.
When running a program, we often need features which operate *outside of* the program. An example is the builtin functions in chapter 4. These functionalities are called the *runtime*. The runtime is typically implemented to a certain degree in the virtual machine. In that sense the VM also functions as an interface between the program and the host computer.
## 6.5 Types
All values in a program are of certain types (value types). The type determines how the value is handled, eg. the `+` operator on two integers is addition, while the same operator on two strings is concatenation. Types in a program are either known in advance, *at compile time*, or first known when the program runs, *at runtime*. The evaluator implemented in chapter 4 figured out the types as the evaluation went along. This was an example of determining types at runtime, also called *dynamic typing*. When all types are determined at compile time, this is called *static typing*. The language implemented thus far offers the programmer no way of explicitly specifying types for variables or functions.
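As a minimal sketch of what determining types at runtime can look like, the function below picks the meaning of `+` by inspecting its operands while the program runs. The `Value` type and `evalAdd` are hypothetical names made up for this illustration, they are not the evaluator from chapter 4.

```ts
// Hypothetical value representation, for illustration only.
type Value =
    | { type: "int"; value: number }
    | { type: "string"; value: string };

// With dynamic typing, the operand types are first inspected when the
// operation is executed.
function evalAdd(left: Value, right: Value): Value {
    if (left.type === "int" && right.type === "int") {
        // Two integers: `+` means addition.
        return { type: "int", value: left.value + right.value };
    }
    if (left.type === "string" && right.type === "string") {
        // Two strings: `+` means concatenation.
        return { type: "string", value: left.value + right.value };
    }
    // Any other combination is only discovered to be an error at runtime.
    throw new Error(`cannot apply '+' to ${left.type} and ${right.type}`);
}
```

With static typing, the same decision would be made once, before the program runs, and a mismatch would be reported without executing anything.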
There are three important factors in choosing between *static* and *dynamic* typing.
The first is language design and developer experience. Types may help the programmer write clear and correct programs, but may also hinder them, because there is more work to do when everything requires a determined type.
The second factor pertains to target, runtime and performance. A program can only be compiled down to, and run at, a certain level without types being determined. With types predetermined, we can determine the size of values and which operations to execute, without needing to run the program to figure these things out. The lower we can compile a program before execution, the less the theoretical overhead. The higher the level of the program at runtime, the bigger the *runtime* theoretically needs to be.
The third factor is about effort. Static types require more tooling, eg. parsing types and type checking.
We could also delve into type annotations in dynamically typed languages, and runtime compilation and optimizations in JIT compilers, to explain languages like TypeScript, but that's beyond this course.

@ -91,14 +91,14 @@ We'll start by defining a way to resolve identifiers.
```ts
class Resolver {
// ...
private resolveIdentExpr(expr: Expr, syms: Syms): { ok: boolean } {
private resolveIdentExpr(expr: Expr, syms: Syms) {
if (expr.kind.type !== "ident")
throw new Error("expected ident");
const ident = expr.kind.ident;
const { sym, ok: symFound } = syms.get(ident);
if (!symFound) {
this.reportUseOfUndefined(ident, expr.pos, syms);
return { ok: false };
return;
}
expr.kind = {
type: "sym",
@ -109,17 +109,218 @@ class Resolver {
expr.kind.stmt = sym.stmt;
if (sym.param)
expr.kind.param = sym.param;
return { ok: true };
}
// ...
}
```
When resolving an identifier, we essentially convert the identifier expression in-AST into a symbol expression. Therefore we need to take the expression, so we can mutate it. Because the `Expr` type is unspecific, we have to assert we've gotten an identifier.
When resolving an identifier, we essentially convert the identifier expression in-AST into a symbol expression. Therefore we need to take the expression, so we can mutate it. Because the `Expr` type is unspecific, we have to assert we've gotten an identifier. Then we try and find the symbol in the symbol table. And then we do the mutation, ie. converting the identifier expression into a symbol expression. We have to check that `sym.stmt` and `sym.param` are present, before we assign them.
Then we try and find the symbol in the symbol table, and if we don't, we report an error and return a non-ok result.
## 7.6 Resolving expressions
And then we do the mutation, ie. converting the identifier expression into a symbol expression. We have to check that `sym.stmt` and `sym.param` are present, before we assign them.
```ts
class Resolver {
    // ...
    private resolveExpr(expr: Expr, syms: Syms) {
        // There is nothing to resolve in an error expression.
        if (expr.kind.type === "error") {
            return;
        }
        if (expr.kind.type === "ident") {
            this.resolveIdentExpr(expr, syms);
            return;
        }
        // ...
        if (expr.kind.type === "binary") {
            this.resolveExpr(expr.kind.left, syms);
            this.resolveExpr(expr.kind.right, syms);
            return;
        }
        // ...
        if (expr.kind.type === "block") {
            // A block gets its own scope, ie. a child symbol table.
            const childSyms = new Syms(syms);
            for (const stmt of expr.kind.stmts) {
                this.resolveStmt(stmt, childSyms);
            }
            if (expr.kind.expr) {
                this.resolveExpr(expr.kind.expr, childSyms);
            }
            return;
        }
        // ...
        throw new Error(`unknown expression ${expr.kind.type}`);
    }
    // ...
}
```
We traverse the tree, meaning we call `resolveExpr` on each child node recursively. This way, we reach all identifiers in the AST that need to be resolved.
Binary expressions are resolved by resolving the left and right operands.
Block expressions are resolved by making a child symbol table, and resolving each statement and the expression if present.
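To illustrate why a child symbol table gives blocks their own scope, here is a minimal sketch of a symbol table with a parent link. It only mirrors the method names used in this chapter (`define`, `locallyDefined`, `get`). The `Sym` fields and the internals are simplified assumptions, not necessarily the course's actual `Syms` class.

```ts
// Simplified, hypothetical symbol type; the real one carries more fields.
type Sym = { ident: string; type: string };

class Syms {
    private syms = new Map<string, Sym>();
    private parent: Syms | undefined;

    constructor(parent?: Syms) {
        this.parent = parent;
    }

    define(ident: string, sym: Sym) {
        this.syms.set(ident, sym);
    }

    // Only looks in this table, not in any parent.
    locallyDefined(ident: string): boolean {
        return this.syms.has(ident);
    }

    // Looks in this table first, then walks up through the parents.
    get(ident: string): { ok: boolean; sym?: Sym } {
        const sym = this.syms.get(ident);
        if (sym) return { ok: true, sym };
        if (this.parent) return this.parent.get(ident);
        return { ok: false };
    }
}

const outer = new Syms();
outer.define("a", { ident: "a", type: "let" });

// The block's table sees everything in the enclosing table...
const block = new Syms(outer);
block.define("b", { ident: "b", type: "let" });

// ...but symbols defined inside the block do not leak out of it.
const seenFromBlock = block.get("a").ok;
const seenFromOuter = outer.get("b").ok;
```

In this sketch `seenFromBlock` is `true` and `seenFromOuter` is `false`, and `block.locallyDefined("a")` is `false`, which is what allows a block to shadow a symbol from an enclosing scope.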
### Exercises
1. Implement the rest of the expressions.
## 7.7 Resolving let statements
```ts
class Resolver {
    // ...
    private resolveLetStmt(stmt: Stmt, syms: Syms) {
        if (stmt.kind.type !== "let")
            throw new Error("expected let statement");
        // Resolve the value before defining the symbol.
        this.resolveExpr(stmt.kind.value, syms);
        const ident = stmt.kind.param.ident;
        if (syms.locallyDefined(ident)) {
            this.reportAlreadyDefined(ident, stmt.pos, syms);
            return;
        }
        syms.define(ident, {
            ident,
            type: "let",
            pos: stmt.kind.param.pos,
            stmt,
            param: stmt.kind.param,
        });
    }
    // ...
}
```
To resolve a let statement, we resolve the value expression, then we check that the symbol has not been defined already, and then we define the symbol as a let-symbol.
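The order of these steps matters. The following sketch strips the idea down to plain sets and strings to show the consequence: since the value is resolved before the symbol is defined, the value cannot refer to the symbol being defined. `resolveLet` and its parameters are hypothetical stand-ins for this illustration, not part of the resolver.

```ts
// Hypothetical stand-in: `scope` holds the defined names, and `usedInValue`
// lists the identifiers appearing in the let statement's value.
function resolveLet(
    name: string,
    usedInValue: string[],
    scope: Set<string>,
): string[] {
    const errors: string[] = [];
    // 1. Resolve the value. The new name is not visible yet.
    for (const ident of usedInValue) {
        if (!scope.has(ident)) {
            errors.push(`use of undefined symbol '${ident}'`);
        }
    }
    // 2. Check that the name is not already defined in this scope.
    if (scope.has(name)) {
        errors.push(`symbol already defined '${name}'`);
        return errors;
    }
    // 3. Define the symbol.
    scope.add(name);
    return errors;
}

const scope = new Set<string>();
// let a = a;  -> the `a` in the value is not defined yet
const first = resolveLet("a", ["a"], scope);
// let b = a;  -> fine, `a` is defined by now
const second = resolveLet("b", ["a"], scope);
// let a = b;  -> `a` is already defined in this scope
const third = resolveLet("a", ["b"], scope);
```

Here `first` contains a use-of-undefined error, `second` is empty and `third` contains an already-defined error.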
## 7.8 Resolving function definition statements
```ts
class Resolver {
    // ...
    private resolveFnStmt(stmt: Stmt, syms: Syms) {
        if (stmt.kind.type !== "fn")
            throw new Error("expected fn statement");
        if (syms.locallyDefined(stmt.kind.ident)) {
            this.reportAlreadyDefined(stmt.kind.ident, stmt.pos, syms);
            return;
        }
        // Define the function before resolving its body.
        syms.define(stmt.kind.ident, {
            ident: stmt.kind.ident,
            type: "fn",
            pos: stmt.pos,
            stmt,
        });
        const fnScopeSyms = new Syms(syms);
        for (const param of stmt.kind.params) {
            if (fnScopeSyms.locallyDefined(param.ident)) {
                // The previous definition lives in the function scope.
                this.reportAlreadyDefined(param.ident, param.pos, fnScopeSyms);
                continue;
            }
            fnScopeSyms.define(param.ident, {
                ident: param.ident,
                type: "fn_param",
                pos: param.pos,
                param,
            });
        }
        this.resolveExpr(stmt.kind.body, fnScopeSyms);
    }
    // ...
}
```
To resolve a function definition we first check that the symbol is not already defined, then we define it. Then we make a child symbol table, define all the parameters and lastly resolve the function body.
Parameters must not have the same name. To that end, we check that each parameter's identifier is not already defined.
Contrary to resolving the let statement, we define the function symbol before resolving the body expression. This is so that the function body is able to call the function recursively.
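The effect of defining the function first can be sketched with plain sets and strings. `resolveFn` and its parameters are hypothetical stand-ins for this illustration, and the scope handling is flattened compared to the real symbol tables.

```ts
// Hypothetical stand-in: `usedInBody` lists the identifiers appearing in the
// function body, and `scope` holds the names defined in the enclosing scope.
function resolveFn(
    name: string,
    params: string[],
    usedInBody: string[],
    scope: Set<string>,
): string[] {
    const errors: string[] = [];
    if (scope.has(name)) {
        errors.push(`symbol already defined '${name}'`);
        return errors;
    }
    // Define the function before looking at its body.
    scope.add(name);
    // Parameters go into their own scope, and must be unique within it.
    const paramScope = new Set<string>();
    for (const param of params) {
        if (paramScope.has(param)) {
            errors.push(`symbol already defined '${param}'`);
            continue;
        }
        paramScope.add(param);
    }
    // The body sees the parameters and the enclosing scope, which by now
    // includes the function itself.
    for (const ident of usedInBody) {
        if (!paramScope.has(ident) && !scope.has(ident)) {
            errors.push(`use of undefined symbol '${ident}'`);
        }
    }
    return errors;
}

// A recursive function: the body refers to `fact` itself.
const recursive = resolveFn("fact", ["n"], ["n", "fact"], new Set<string>());
// Two parameters with the same name.
const duplicate = resolveFn("f", ["x", "x"], ["x"], new Set<string>());
```

Here `recursive` is empty, because `fact` is already defined when its body is resolved, while `duplicate` contains one already-defined error for `x`.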
## 7.9 Resolving statements
```ts
class Resolver {
    // ...
    private resolveStmt(stmt: Stmt, syms: Syms) {
        // There is nothing to resolve in an error statement.
        if (stmt.kind.type === "error") {
            return;
        }
        if (stmt.kind.type === "let") {
            this.resolveLetStmt(stmt, syms);
            return;
        }
        if (stmt.kind.type === "fn") {
            this.resolveFnStmt(stmt, syms);
            return;
        }
        if (stmt.kind.type === "return") {
            // The returned expression is optional.
            if (stmt.kind.expr)
                this.resolveExpr(stmt.kind.expr, syms);
            return;
        }
        // ...
        throw new Error(`unknown statement ${stmt.kind.type}`);
    }
    // ...
}
```
Just like with expressions, we traverse the AST and resolve every sub-statement and expression.
### Exercises
1. Implement the rest of the statements.
## 7.10 Resolving AST
```ts
class Resolver {
    // ...
    private resolveStmts(stmts: Stmt[]) {
        const scopeSyms = new Syms(this.root);
        for (const stmt of stmts) {
            this.resolveStmt(stmt, scopeSyms);
        }
    }
    // ...
}
```
We'll make a child symbol table of the root and resolve each statement.
## 7.11 Errors
### 7.11.1 Use of undefined
```ts
class Resolver {
    // ...
    private reportUseOfUndefined(ident: Ident, pos: Pos, syms: Syms) {
        console.error(`use of undefined symbol '${ident}' at ${pos.line}:${pos.col}`);
    }
    // ...
}
```
### 7.11.2 Already defined
```ts
class Resolver {
    // ...
    private reportAlreadyDefined(ident: Ident, pos: Pos, syms: Syms) {
        console.error(`symbol already defined '${ident}', at ${pos.line}:${pos.col}`);
        const prev = syms.get(ident);
        if (!prev.ok)
            throw new Error("expected to be defined");
        // Not every symbol has a source position.
        if (!prev.sym.pos)
            return;
        const { line: prevLine, col: prevCol } = prev.sym.pos;
        console.error(`previous definition of '${ident}' at ${prevLine}:${prevCol}`);
    }
    // ...
}
```
We also print the position of the previous definition, to help the user.
## Exercises
1. \* Implement `reportUseOfUndefined` so that it searches for similar symbols, eg. using Levenshtein distance.