notes


thread.md (5997B)


      1 # Thread
      2 
      3 ### Question: What exactly is a thread?
      4 
      5 A thread is the smallest unit of execution that the OS can schedule onto a CPU:
      6 an independent sequence of instructions within a program. The OS tells threads
      7 apart by their saved state and ID; the CPU does not “feel” them electrically 😜.
      8 
      9 #### Every thread has a small bundle of CPU-related data called its context:
     10 
     11 - Program counter (which instruction to run next)
     12 
     13 - CPU registers (temporary values)
     14 
     15 - Stack pointer (where its stack lives)
     16 
     17 - A thread ID the OS uses to refer to it
     18 
     19 This context is stored in memory in a per-thread data structure managed by the
     20 OS (often called a TCB, “thread control block”).
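To make the "thread ID" part of the context concrete, here is a minimal Rust sketch. It assumes nothing beyond the standard library; the helper name `spawned_thread_ids` is made up for illustration. Note that `ThreadId` is assigned by Rust's standard library and is not the kernel's own thread ID, but it plays the same role of telling threads apart.

```rust
use std::thread;

/// Spawns `n` threads and returns the ID each one reports for itself.
/// (`spawned_thread_ids` is a hypothetical helper, not a std function.)
fn spawned_thread_ids(n: usize) -> Vec<thread::ThreadId> {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| thread::current().id()))
        .collect();
    // Joining in spawn order keeps the result order deterministic.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let caller = thread::current().id();
    let ids = spawned_thread_ids(3);
    println!("caller: {:?}, spawned: {:?}", caller, ids);
}
```

Each spawned thread reports a different ID, and none of them matches the ID of the thread that did the spawning.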
     21 
     22 #### When the OS scheduler decides to run a different thread:
     23 
     24 - It saves the running thread's context to its TCB, then fetches the context of the next thread (from memory).
     25 
     26 - It loads that context into the CPU’s registers and program counter.
     27 
     28 - From the CPU’s perspective, it is now executing that thread.
     29 
     30 So “which thread am I running?” = “which context did the OS most recently load
     31 into my registers and program counter?”
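The switch described above can be modeled as a toy simulation. This is only a sketch: `Context` and `Cpu` are hypothetical types invented here, real TCBs hold far more state, and a real switch happens in kernel code rather than in safe Rust.

```rust
/// Hypothetical, heavily simplified thread context.
#[derive(Clone, Debug, PartialEq)]
struct Context {
    thread_id: u32,
    program_counter: usize,
    stack_pointer: usize,
    registers: [u64; 4],
}

/// Toy CPU: it only knows the context currently loaded into it.
struct Cpu {
    current: Context,
}

impl Cpu {
    /// Loads `next` and returns the context that was running, so the
    /// "scheduler" can store it back in memory for later.
    fn switch_to(&mut self, next: Context) -> Context {
        std::mem::replace(&mut self.current, next)
    }
}

fn main() {
    let t1 = Context { thread_id: 1, program_counter: 0x1000, stack_pointer: 0x7000, registers: [1, 2, 3, 4] };
    let t2 = Context { thread_id: 2, program_counter: 0x2000, stack_pointer: 0x8000, registers: [0; 4] };

    let mut cpu = Cpu { current: t1 };
    cpu.current.program_counter += 4; // thread 1 executes one instruction
    let saved_t1 = cpu.switch_to(t2); // scheduler switches to thread 2

    println!("now running thread {}", cpu.current.thread_id);
    println!("saved for later: {:?}", saved_t1);
}
```

The CPU model has no notion of "threads" at all. Which thread is running is determined entirely by the context that was loaded last.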
     32 
     33 ### Question: Why This Matters for Arc?
     34 
     35 This is why Arc exists in Rust: threads in one process share an address space,
     36 so an Rc-style reference count would be read and written by several threads at
     37 once. Two unsynchronized increments or decrements could lose an update, leading
     38 to use-after-free bugs or memory leaks. Arc updates its count with atomic
     39 operations, which prevents this without needing a lock.
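A minimal sketch of Arc in use, assuming only the standard library. The function name `count_with_arc` and the thread and iteration counts are arbitrary choices for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `n` threads that each hold their own `Arc` clone of one counter
/// and increment it `per_thread` times; returns the final total.
fn count_with_arc(n: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Cloning bumps the reference count atomically; no data is copied.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
                // The clone is dropped here, atomically decrementing the count.
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    // Every clone has been dropped, so only the original handle remains.
    assert_eq!(Arc::strong_count(&counter), 1);
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("total = {}", count_with_arc(4, 1_000));
}
```

Arc only makes the reference count thread-safe. The shared value itself still needs its own synchronization, which is why this sketch stores an `AtomicUsize` inside the Arc (a `Mutex` would be the usual alternative).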
     40 
     41 ### Question: What is the relationship between a thread and a process and a program?
     42 
     43 ```
     44              ,----------------,              ,---------,
     45         ,-----------------------,          ,"        ,"|
     46       ,"                      ,"|        ,"        ,"  |
     47      +-----------------------+  |      ,"        ,"    |
     48      |  .-----------------.  |  |     +---------+      |
     49      |  |                 |  |  |     | -==----'|      |
     50      |  |  I LOVE DOS!    |  |  |     |         |      |
     51      |  |  Bad command or |  |  |/----|`---=    |      |
     52      |  |  C:\>_          |  |  |   ,/|==== ooo |      ;
     53      |  |                 |  |  |  // |(((( [33]|    ,"
     54      |  `-----------------'  |," .;'| |((((     |  ,"
     55      +-----------------------+  ;;  | |         |,"
     56         /_)______________(_/  //'   | +---------+
     57    ___________________________/___  `,
     58   /  oooooooooooooooo  .o.  oooo /,   \,"-----------
     59  / ==ooooooooooooooo==.o.  ooo= //   ,`\--{)B     ,"
     60 /_==__==========__==_ooo__ooo=_/'   /___________,"
     61 `-----------------------------'
     62                v
     63 +--------------------------------------------------------------+
     64 |                          PROGRAM                             |
     65 |                 (code on disk, not running)                  |
     66 +--------------------------------------------------------------+
     67         |                                     |
     68         | OS loads program into memory, twice |
     69         v                                     v
     70 +----------------------+        +----------------------+
     71 |      PROCESS 1       |        |      PROCESS 2       |
     72 |  (instance of prog)  |        |  (instance of prog)  |
     73 |  Own address space   |        |  Own address space   |
     74 +----------------------+        +----------------------+
     75     |   |   |                           |   |   |
     76     |   |   +---------------+           |   |   +---------------+
     77     |   +-------+           |           |   +-------+           |
     78     v           v           v           v           v           v
     79 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
     80 |THREAD 1|  |THREAD 2|  |THREAD 3|  |THREAD A|  |THREAD B|  |THREAD C|
     81 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
     82 
     83 - PROGRAM: one set of instructions on disk
     84 - PROCESS 1 and PROCESS 2: two running instances of the same program
     85 - THREADs: multiple execution flows inside each process (1-3 in PROCESS 1, A-C in PROCESS 2)
     86 ```
     87 
     88 The three concepts form a clear hierarchy: a program becomes a process when run,
     89 and a process contains one or more threads.
     90 
     91 Program → Process → Thread
     92 
     93 Think of it like this:
     94 
     95 - A program is a static set of instructions stored on disk — it's just a file,
     96   doing nothing
     97 
     98 - A process is what a program becomes when the OS loads it into memory and
     99   starts executing it: a living, running instance of the program
    100 
    101 - A thread is the actual unit of execution inside a process — the sequence of
    102   instructions the CPU is actively running
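One way to observe the process/thread relationship from inside a program is to ask several threads which process they belong to. The sketch below uses only the standard library; `pids_seen_by_threads` is a name made up for this example.

```rust
use std::process;
use std::thread;

/// Returns the process ID observed from `n` freshly spawned threads.
fn pids_seen_by_threads(n: usize) -> Vec<u32> {
    // `process::id` is passed directly as the function each thread runs.
    let handles: Vec<_> = (0..n).map(|_| thread::spawn(process::id)).collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("process id seen by main: {}", process::id());
    println!("process ids seen by threads: {:?}", pids_seen_by_threads(3));
}
```

All threads report the same process ID, because they are execution flows inside a single process rather than separate instances of the program.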
    103 
    104 ### Question: Are threads allocated automatically to a process? How do I know how many threads a process runs? And how do I know how many processes a program runs?
    105 
    106 Every process automatically gets exactly one thread when it starts: the main
    107 thread. This is the thread that begins executing at main(). Any additional
    108 threads must be created explicitly by code running in the process (the program
    109 itself, or a runtime or library it uses); the OS does not add threads on its
    110 own. So the thread count is determined by the code, not by the OS.
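As a sketch of what "explicitly created" looks like in Rust, the example below creates named worker threads with `std::thread::Builder`. The helper name `spawn_named_workers` and the `worker-N` naming scheme are assumptions made for illustration.

```rust
use std::thread;

/// Explicitly creates `n` named worker threads and collects the name each
/// worker reports for itself.
fn spawn_named_workers(n: usize) -> Vec<String> {
    let handles: Vec<_> = (0..n)
        .map(|i| {
            thread::Builder::new()
                .name(format!("worker-{i}"))
                .spawn(|| thread::current().name().unwrap_or("unnamed").to_string())
                .expect("the OS refused to create a thread")
        })
        .collect();
    // Joining in spawn order keeps the result order deterministic.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    // The thread running `main` came with the process; the workers are ours.
    println!("started on: {:?}", thread::current().name());
    println!("spawned: {:?}", spawn_named_workers(3));
}
```

Nothing in this program would run on a second thread unless `spawn` were called. The three workers exist only because the code asked for them.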
    111 
    112 There's no fixed limit on how many threads a process can run; it depends on
    113 system resources. On Linux, for example, a rough per-process estimate is
    114 
    115 ```text
    116 max threads ≈ virtual memory size (bytes) ÷ (stack size (MB) × 1024 × 1024)
    117 ```
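A worked instance of the formula above, as a small Rust function. The inputs are illustrative assumptions, not measurements: 8 GiB of virtual memory and an 8 MiB per-thread stack. The function name is made up for this sketch.

```rust
/// Rough estimate from the formula above. Assumes `virtual_memory_bytes` is in
/// bytes and `stack_size_mb` is the per-thread stack size in MiB (non-zero).
fn estimated_max_threads(virtual_memory_bytes: u64, stack_size_mb: u64) -> u64 {
    virtual_memory_bytes / (stack_size_mb * 1024 * 1024)
}

fn main() {
    let eight_gib = 8 * 1024 * 1024 * 1024_u64;
    // 8 GiB of virtual memory with 8 MiB stacks leaves room for 1024 stacks.
    println!("{}", estimated_max_threads(eight_gib, 8));
}
```

Halving the stack size doubles the estimate, which is why programs that need very many threads often request smaller stacks.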
    118 
    119 In practice, a Linux system can support tens of thousands of threads (e.g.
    120 ~63,704 on a typical kernel). On Windows, the limit is also very high and
    121 practically constrained by available memory rather than a hard cap.
    122 
    123 A program typically runs as one process by default. It can spawn additional
    124 processes explicitly in code (e.g. using fork() in Unix or spawning subprocesses
    125 in Rust/Python) — but this is always a deliberate programmer choice, not
    126 automatic.
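Spawning a subprocess from Rust can be sketched with `std::process::Command`. This example assumes an `echo` executable is available on the PATH (true on typical Unix systems); `run_echo` is a hypothetical helper name.

```rust
use std::io;
use std::process::{Command, Stdio};

/// Spawns `echo <message>` as a child process and returns the child's PID
/// together with what it printed.
fn run_echo(message: &str) -> io::Result<(u32, String)> {
    let child = Command::new("echo")
        .arg(message)
        .stdout(Stdio::piped()) // capture the child's output instead of inheriting it
        .spawn()?;
    let pid = child.id();
    let output = child.wait_with_output()?; // waits for the child to exit
    let text = String::from_utf8_lossy(&output.stdout).trim_end().to_string();
    Ok((pid, text))
}

fn main() -> io::Result<()> {
    let (child_pid, text) = run_echo("hello from a child process")?;
    println!("parent pid {}, child pid {}: {}", std::process::id(), child_pid, text);
    Ok(())
}
```

The child gets its own PID and its own address space, so unlike threads it shares no memory with the parent. Here the only channel between them is the pipe attached to the child's standard output.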
    127 
    128 ### How to Check in Practice
    129 
    130 On Linux (terminal; macOS has no /proc, and its ps and top take different options):
    131 
    132 - See threads for a specific process:
    133 
    134 ```bash
    135 ps -o nlwp= -p <pid>              # prints the number of threads (NLWP)
    136 grep Threads /proc/<pid>/status   # same count, read from procfs
    137 ```
    138 
    139 - See all processes from a program:
    140 
    141 ```bash
    142 ps aux | grep <program_name>   # note: the grep process itself also matches
    143 ```
    144 
    145 - Live view with threads: run top, then press H to toggle thread view
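
The same thread count can be read programmatically. The sketch below is Linux-only, since it relies on `/proc/self/status`. The function names are made up for illustration, and the parser assumes the `Threads:` line format that the grep command above matches.

```rust
use std::fs;

/// Extracts the value of the `Threads:` line from the text of a
/// `/proc/<pid>/status` file.
fn parse_thread_count(status: &str) -> Option<usize> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("Threads:"))
        .and_then(|rest| rest.trim().parse().ok())
}

/// Linux only: thread count of the current process, or `None` if
/// `/proc/self/status` cannot be read or parsed.
fn own_thread_count() -> Option<usize> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_thread_count(&status)
}

fn main() {
    match own_thread_count() {
        Some(n) => println!("this process has {n} thread(s)"),
        None => println!("could not read /proc/self/status (not Linux?)"),
    }
}
```

Keeping the parsing separate from the file read means the parser can be checked against a sample string on any platform.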