# Compiler Tokens

1. Lexing

“Lex” refers to lexical analysis, the first stage of a compiler or interpreter,
which breaks source text into tokens.

A “lex token” is one classified unit of source code that the lexer has
recognized and tagged with a type.

In code, a lex token is typically one variant of an enum (here `LexTok`)
produced by that stage. Each token represents the smallest unit of meaning, such
as a keyword, identifier, operator, or literal.

```text
use foo as bar
```

```rust
// Tokens the lexer produces for the source line above.
[
    LexTok::Use,
    LexTok::Identifier("foo".into()),
    LexTok::Alias,
    LexTok::Identifier("bar".into()),
]
```

2. Parsing

Parsing is the step where a program takes a stream of tokens (like `Use`, `Fn`,
identifiers, `(`, `)`, etc.) and checks whether they form a valid structure
according to the language’s grammar.

After lexing has turned raw text into tokens, parsing:

- Reads the tokens in order and matches them against grammar rules (like “a
  function definition is `fn` + name + params + body”).

- Builds a tree structure (often called a parse tree or syntax tree) that
  represents how the program is organized: expressions, statements, blocks,
  functions, etc.

- Reports syntax errors if the token sequence doesn’t fit the grammar (a missing
  `)`, an extra `;`, wrong keyword order, and so on).

In this pipeline:

- Lexing: characters → `LexTok` sequence

- Parsing: `LexTok` sequence → syntax tree (AST) that later stages (like type
  checking or code generation) will use

3. Evaluating

Evaluating is the step where you actually run an expression or program and
compute its meaning.

- After lexing: you have tokens.

- After parsing: you have a syntax tree (often an AST).

- Evaluating that AST means:

  - Walking the tree,

  - Applying operators (`+`, `*`, `==`, etc.),

  - Looking up variable values, calling functions, handling control flow (`if`,
    `while`, etc.),

  - And producing a result (like `42`, `"hello"`, or some side effect like
    printing or updating state).

```
+---------------------------+
| Source code               |
| (text, .ling)             |
+---------------------------+
            |
            |  1. Lexing (tokenization)
            v
+---------------------------+
| Token stream              |
| [LexTok::Use,             |
|  LexTok::Fn,              |
|  Identifier, ...]         |
+---------------------------+
            |
            |  2. Parsing (syntax analysis)
            v
+---------------------------+
| AST (syntax tree)         |
| e.g. FunctionDef(         |
|   name, params, body )    |
+---------------------------+
            |
            |  3. Evaluating / Executing
            v
+---------------------------+
| Program behavior          |
| - results / return vals   |
| - printed output          |
| - changed state, etc.     |
+---------------------------+
```

### Question: What is the difference between an interpreter and a compiler?

- Compiler: Translates the whole program into machine code (or bytecode) before
  you run it, usually producing a separate executable or binary file.

- Interpreter: Reads your source code and executes it directly, typically
  statement by statement, without producing a standalone executable.

- Compiled programs usually run faster because the translation work is done
  ahead of time and the compiler can optimize the resulting machine code.

- Interpreted programs usually run slower because the work of analyzing and
  dispatching each operation happens while the code is executing.
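
To make the three stages (and the interpreter approach) more concrete, here is a minimal Rust sketch that lexes, parses, and evaluates the `use foo as bar` line from the lexing example. Only `LexTok::Use`, `LexTok::Alias`, and `LexTok::Identifier` come from the text above, and their exact definitions are assumed. `UseDecl`, `lex`, `parse`, `eval`, `run`, and the `HashMap` environment are hypothetical names chosen for this illustration, not part of any real implementation.

```rust
use std::collections::HashMap;

// Token type mirroring the `LexTok` variants shown earlier (assumed shape).
#[derive(Debug, Clone, PartialEq)]
pub enum LexTok {
    Use,
    Alias,
    Identifier(String),
}

// Hypothetical AST node for a `use <target> as <alias>` declaration.
#[derive(Debug, PartialEq)]
pub struct UseDecl {
    pub target: String,
    pub alias: String,
}

// 1. Lexing: characters -> LexTok sequence.
// Simplification: splits on whitespace only, so punctuation is not handled.
pub fn lex(src: &str) -> Vec<LexTok> {
    src.split_whitespace()
        .map(|word| match word {
            "use" => LexTok::Use,
            "as" => LexTok::Alias,
            other => LexTok::Identifier(other.to_string()),
        })
        .collect()
}

// 2. Parsing: LexTok sequence -> AST node, or a syntax error.
// The single grammar rule here is: Use Identifier Alias Identifier.
pub fn parse(tokens: &[LexTok]) -> Result<UseDecl, String> {
    match tokens {
        [LexTok::Use, LexTok::Identifier(target), LexTok::Alias, LexTok::Identifier(alias)] => {
            Ok(UseDecl {
                target: target.clone(),
                alias: alias.clone(),
            })
        }
        _ => Err(format!(
            "syntax error: expected `use <name> as <name>`, got {:?}",
            tokens
        )),
    }
}

// 3. Evaluating: act on the AST node by updating program state.
// Here the "state" is a map from alias name to the name it refers to.
pub fn eval(decl: &UseDecl, env: &mut HashMap<String, String>) {
    env.insert(decl.alias.clone(), decl.target.clone());
}

// Runs all three stages on one line of source, interpreter-style.
pub fn run(src: &str, env: &mut HashMap<String, String>) -> Result<(), String> {
    let tokens = lex(src);
    let decl = parse(&tokens)?;
    eval(&decl, env);
    Ok(())
}
```

Calling `run("use foo as bar", &mut env)` on an empty map records `bar` as an alias for `foo`, while an input such as `use as bar` is rejected by `parse` as a syntax error. A real lexer would scan character by character and a real parser would cover many grammar rules; the sketch only shows how the output of each stage feeds the next.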