# Compiler Tokens

1. Lexing

“Lex” refers to lexical analysis, the first stage of a compiler or interpreter,
which breaks source text into tokens.

A “lex token” is one classified unit of source code that the lexer has
recognized and tagged with a type. Each token represents a smallest unit of
meaning, such as a keyword, identifier, operator, or literal, and is stored as
one variant of the `LexTok` enum. For example, the following line of source
text lexes to the token sequence shown after it:

```text
use foo as bar
```

```rust
[
    LexTok::Use,
    LexTok::Identifier("foo".into()),
    LexTok::Alias,
    LexTok::Identifier("bar".into()),
]
```

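A lexer that produces the sequence above could be sketched as follows. This is a minimal illustration, not the actual implementation: the `LexTok` definition is reconstructed from the three variants shown, the `lex` function is hypothetical, and the sketch assumes tokens are separated by whitespace and that `use` and `as` are the only keywords.

```rust
// Hypothetical token type, reconstructed from the variants used in the notes.
#[derive(Debug, Clone, PartialEq)]
enum LexTok {
    Use,
    Alias,
    Identifier(String),
}

/// Splits the source on whitespace and classifies each word.
/// Assumption: `use` and `as` are the only keywords; anything else
/// is treated as an identifier.
fn lex(source: &str) -> Vec<LexTok> {
    source
        .split_whitespace()
        .map(|word| match word {
            "use" => LexTok::Use,
            "as" => LexTok::Alias,
            other => LexTok::Identifier(other.to_string()),
        })
        .collect()
}
```

Calling `lex("use foo as bar")` yields the four tokens listed above. A real lexer would scan character by character so that it can also split operators and punctuation that are not surrounded by spaces.
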
2. Parsing

Parsing is the step where a program takes a stream of tokens (such as `Use`,
`Fn`, identifiers, `(`, and `)`) and checks whether they form a valid structure
according to the language’s grammar.

After lexing has turned raw text into tokens, the parser:

- Reads the tokens in order and matches them against grammar rules (for
  example, “a function definition is `fn` + name + params + body”).

- Builds a tree structure (often called a parse tree or syntax tree) that
  represents how the program is organized: expressions, statements, blocks,
  functions, and so on.

- Reports syntax errors if the token sequence doesn’t fit the grammar (a
  missing `)`, an extra `;`, wrong keyword order, and so on).

In this pipeline:

- Lexing: characters → `LexTok` sequence

- Parsing: `LexTok` sequence → syntax tree (AST), which later stages (such as
  type checking or code generation) will use

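The three parser duties described above (matching a grammar rule, building a tree node, reporting errors) can be sketched for the `use foo as bar` example. Everything here is an assumption made for illustration: the `UseDecl` node, the `parse_use` function, and the grammar rule `use_decl := Use Identifier [ Alias Identifier ]` do not come from the original code.

```rust
// Hypothetical token type, reconstructed from the variants used in the notes.
#[derive(Debug, Clone, PartialEq)]
enum LexTok {
    Use,
    Alias,
    Identifier(String),
}

// Hypothetical AST node for a `use` statement.
#[derive(Debug, PartialEq)]
struct UseDecl {
    path: String,
    alias: Option<String>,
}

/// Matches the token slice against the assumed rule
/// `use_decl := Use Identifier [ Alias Identifier ]`.
fn parse_use(tokens: &[LexTok]) -> Result<UseDecl, String> {
    match tokens {
        // `use foo`
        [LexTok::Use, LexTok::Identifier(path)] => Ok(UseDecl {
            path: path.clone(),
            alias: None,
        }),
        // `use foo as bar`
        [LexTok::Use, LexTok::Identifier(path), LexTok::Alias, LexTok::Identifier(alias)] => {
            Ok(UseDecl {
                path: path.clone(),
                alias: Some(alias.clone()),
            })
        }
        // Starts with `use` but the rest does not fit the rule.
        [LexTok::Use, ..] => {
            Err("expected `use <name>` or `use <name> as <alias>`".to_string())
        }
        // Does not start with `use` at all.
        _ => Err("expected `use` keyword".to_string()),
    }
}
```

Given the tokens for `use foo as bar`, this returns a `UseDecl` with `path` set to `"foo"` and `alias` set to `Some("bar")`. A sequence such as `[Use, Alias]` is rejected with a syntax error instead of producing a tree.
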
3. Evaluating

Evaluating is the step where you actually run the program or compute the value
of an expression.

- After lexing, you have tokens.

- After parsing, you have a syntax tree (often an AST).

- Evaluating that AST means:

  - walking the tree,

  - applying operators (`+`, `*`, `==`, etc.),

  - looking up variable values, calling functions, and handling control flow
    (`if`, `while`, etc.),

  - and producing a result (a value like `42` or `"hello"`, or a side effect
    such as printing or updating state).

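The evaluation steps above can be sketched as a small tree-walking evaluator. This is an illustration under stated assumptions rather than the real evaluator: the `Expr` node types and the `eval` function are hypothetical, values are limited to `i64`, and an `If` condition counts as true when it is nonzero.

```rust
use std::collections::HashMap;

// Hypothetical expression AST; the node names are illustrative only.
enum Expr {
    Num(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    // Assumption: a nonzero condition selects the first branch.
    If(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// Walks the tree recursively and returns the computed value,
/// or an error message if a variable is not defined in `env`.
fn eval(expr: &Expr, env: &HashMap<String, i64>) -> Result<i64, String> {
    match expr {
        Expr::Num(n) => Ok(*n),
        // Look up a variable value.
        Expr::Var(name) => env
            .get(name)
            .copied()
            .ok_or_else(|| format!("undefined variable `{}`", name)),
        // Apply operators to the evaluated operands.
        Expr::Add(l, r) => Ok(eval(l, env)? + eval(r, env)?),
        Expr::Mul(l, r) => Ok(eval(l, env)? * eval(r, env)?),
        // Control flow: only the selected branch is evaluated.
        Expr::If(cond, then, otherwise) => {
            if eval(cond, env)? != 0 {
                eval(then, env)
            } else {
                eval(otherwise, env)
            }
        }
    }
}
```

With `x` bound to `4` in `env`, evaluating the tree for `2 + x * 10` returns `Ok(42)`, while evaluating an unbound variable returns an `Err`.
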
```
+------------------+
|   Source code    |
|  (text, .ling)   |
+------------------+
          |
          |  1. Lexing (tokenization)
          v
+------------------+
|   Token stream   |
| [LexTok::Use,    |
|  LexTok::Fn,     |
|  Identifier, ...]|
+------------------+
          |
          |  2. Parsing (syntax analysis)
          v
+---------------------------+
|  AST (syntax tree)        |
|  e.g. FunctionDef(        |
|    name, params, body )   |
+---------------------------+
          |
          |  3. Evaluating / Executing
          v
+---------------------------+
|  Program behavior         |
|  - results / return vals  |
|  - printed output         |
|  - changed state, etc.    |
+---------------------------+
```

### Question: What is the difference between an interpreter and a compiler?

- Compiler: Translates the whole program into machine code (or bytecode) before
  you run it, typically producing a separate executable or binary file.

- Interpreter: Reads your source code and executes it directly, usually
  statement by statement, without producing a standalone executable.

- Compiled programs usually run faster because the translation work is done
  ahead of time and the compiler can optimize the resulting machine code.

- Interpreted programs usually run slower because the work of analyzing and
  dispatching each operation happens while the code executes.
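
The distinction can be made concrete with a toy example that handles the same tree both ways. This is only a sketch: the `Expr` and `Instr` types and the `interpret`, `compile`, and `run` functions are hypothetical, and the compile target is a made-up stack-machine instruction list standing in for bytecode, not real machine code.

```rust
// Hypothetical AST and instruction set, for illustration only.
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

#[derive(Debug, PartialEq)]
enum Instr {
    Push(i64),
    Add,
    Mul,
}

/// Interpreter: computes the value directly while walking the tree.
fn interpret(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(l, r) => interpret(l) + interpret(r),
        Expr::Mul(l, r) => interpret(l) * interpret(r),
    }
}

/// Compiler: translates the tree into instructions without running anything.
fn compile(expr: &Expr, out: &mut Vec<Instr>) {
    match expr {
        Expr::Num(n) => out.push(Instr::Push(*n)),
        Expr::Add(l, r) => {
            compile(l, out);
            compile(r, out);
            out.push(Instr::Add);
        }
        Expr::Mul(l, r) => {
            compile(l, out);
            compile(r, out);
            out.push(Instr::Mul);
        }
    }
}

/// Runs compiled instructions on a stack machine.
/// Returns `None` if the stack underflows or ends up empty.
fn run(code: &[Instr]) -> Option<i64> {
    let mut stack = Vec::new();
    for instr in code {
        match instr {
            Instr::Push(n) => stack.push(*n),
            Instr::Add => {
                let r = stack.pop()?;
                let l = stack.pop()?;
                stack.push(l + r);
            }
            Instr::Mul => {
                let r = stack.pop()?;
                let l = stack.pop()?;
                stack.push(l * r);
            }
        }
    }
    stack.pop()
}
```

For the tree of `2 + 4 * 10`, `interpret` returns `42` immediately, while `compile` first emits `Push(2), Push(4), Push(10), Mul, Add`, which `run` can then execute any number of times without revisiting the tree.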