# Thread

### Question: What exactly is a thread?

A thread is the smallest unit of execution that the OS schedules onto a CPU:
essentially, an independent sequence of instructions within a program. The OS
tells threads apart by their saved state and ID; the CPU does not “feel” them
electrically 😜.

#### Every thread has a small bundle of CPU-related data called its context:

- Program counter (which instruction to run next)

- CPU registers (temporary values)

- Stack pointer (where the top of its stack currently is)

- A thread ID the OS uses to refer to it

While the thread is not running, this context is stored in memory in a
per-thread data structure managed by the OS (often called a TCB, “thread
control block”).

#### When the OS scheduler decides to run a different thread:

- It saves the context of the currently running thread into that thread's TCB.

- It takes the context of the next thread (from memory).

- It loads that context into the CPU’s registers and program counter.

- From the CPU’s perspective, it is now executing that thread.

So “which thread am I running?” = “which context did the OS most recently load
into my registers and program counter?”

### Question: Why does this matter for Arc?

This is exactly why Arc exists in Rust. Threads in the same process share one
address space, so when multiple threads share the same data, they all access
the same memory. Arc tracks how many owners that data has with a reference
count. Without atomic operations on the reference count, two threads updating
it simultaneously could corrupt it, leading to use-after-free bugs or memory
leaks. Arc's atomic reference counting prevents this without needing a lock.
Note that this protects only the count: mutating the shared data itself still
needs a Mutex, an atomic type, or similar.

### Question: What is the relationship between a program, a process, and a thread?

```
             ,----------------,              ,---------,
        ,-----------------------,          ,"        ,"|
      ,"                      ,"|        ,"        ,"  |
     +-----------------------+  |      ,"        ,"    |
     |  .-----------------.  |  |     +---------+      |
     |  |                 |  |  |     | -==----'|      |
     |  |  I LOVE DOS!    |  |  |     |         |      |
     |  |  Bad command or |  |  |/----|`---=    |      |
     |  |  C:\>_          |  |  |   ,/|==== ooo |      ;
     |  |                 |  |  |  // |(((( [33]|    ,"
     |  `-----------------'  |," .;'| |((((     |  ,"
     +-----------------------+  ;;  | |         |,"
        /_)______________(_/  //'   | +---------+
   ___________________________/___  `,
  /  oooooooooooooooo  .o.  oooo /,   \,"-----------
 / ==ooooooooooooooo==.o.  ooo= //   ,`\--{)B     ,"
/_==__==========__==_ooo__ooo=_/'   /___________,"
 `-----------------------------'
                                  v
+--------------------------------------------------------------------+
|                              PROGRAM                               |
|                    (code on disk, not running)                     |
+--------------------------------------------------------------------+
           |                                     |
           | OS loads program into memory, twice |
           v                                     v
+----------------------+              +----------------------+
|      PROCESS 1       |              |      PROCESS 2       |
|  (instance of prog)  |              |  (instance of prog)  |
|  Own address space   |              |  Own address space   |
+----------------------+              +----------------------+
    |   |   |                             |   |   |
    |   |   +-------------+               |   |   +-------------+
    |   +------+          |               |   +------+          |
    v          v          v               v          v          v
+--------+ +--------+ +--------+      +--------+ +--------+ +--------+
|THREAD 1| |THREAD 2| |THREAD 3|      |THREAD A| |THREAD B| |THREAD C|
+--------+ +--------+ +--------+      +--------+ +--------+ +--------+

- PROGRAM: one set of instructions on disk
- PROCESS 1 and PROCESS 2: two running instances of the same program
- THREADs: multiple execution flows inside each process (1-3 in PROCESS 1, A-C in PROCESS 2)
```

The three concepts form a clear hierarchy: a program becomes a process when run,
and a process contains one or more threads.
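
To make the hierarchy concrete, here is a minimal Rust sketch that uses only the
standard library. `spawn_workers` is a hypothetical helper name chosen for this
example. It starts a few threads inside one process and records the process ID
and thread ID that each one observes. Assuming nothing else interferes, every
worker should report the same process ID but a different thread ID.

```rust
use std::process;
use std::thread::{self, ThreadId};

/// Spawns `n` worker threads and returns the (process ID, thread ID) pair
/// that each worker observed. Hypothetical helper, for illustration only.
fn spawn_workers(n: usize) -> Vec<(u32, ThreadId)> {
    // Collecting first makes sure every thread is started before any join.
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| (process::id(), thread::current().id())))
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    println!(
        "main thread: pid={} tid={:?}",
        process::id(),
        thread::current().id()
    );
    for (pid, tid) in spawn_workers(3) {
        println!("worker:      pid={} tid={:?}", pid, tid);
    }
}
```

Starting the same binary twice at the same time would give two different
process IDs, which corresponds to PROCESS 1 and PROCESS 2 in the diagram.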

Program → Process → Thread

Think of it like this:

- A program is a static set of instructions stored on disk: it's just a file,
  doing nothing

- A process is what a program becomes when the OS loads it into memory and
  starts executing it: a living, running instance of the program

- A thread is the actual unit of execution inside a process: the sequence of
  instructions the CPU is actively running

### Question: Are threads allocated automatically to a process? How do I know how many threads a process runs? And how do I know how many processes a program runs?

Every process automatically gets exactly one thread when it starts: the main
thread. This is the thread that begins executing at main(). Any additional
threads beyond that must be explicitly created by the program itself; the OS
does not add more threads on its own. So the number of threads a process has is
determined by the code it runs, including any runtime or libraries it pulls in.

There is no fixed limit on how many threads a process can run; it depends on
system resources. On Linux, for example, a commonly cited rough estimate (with
the virtual memory size in bytes and the per-thread stack size in MiB) is

```text
max threads = virtual memory size ÷ (stack size × 1024 × 1024)
```

In practice, a Linux system can support tens of thousands of threads (e.g.
~63,704 on a typical kernel); the system-wide cap can be read from
/proc/sys/kernel/threads-max. On Windows, the limit is also very high and
practically constrained by available memory rather than a hard cap.

A program typically runs as one process by default. It can spawn additional
processes explicitly in code (e.g. using fork() in Unix or spawning subprocesses
in Rust/Python), but this is always a deliberate programmer choice, not
automatic.
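
Since extra threads only exist when the program creates them, a small Rust
sketch of that explicit creation may help. It also ties back to the earlier Arc
discussion: each worker gets its own `Arc` clone pointing at one shared counter.
This is only an illustration under stated assumptions: `shared_count` is a
hypothetical helper name, and an `AtomicUsize` is used for the shared data so
that no lock is needed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `n_threads` threads that each add 1 to a shared counter
/// `per_thread` times, then returns the final total.
/// Hypothetical helper, for illustration only.
fn shared_count(n_threads: usize, per_thread: usize) -> usize {
    // One allocation; every clone of the Arc points at the same counter.
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            // Cloning bumps the reference count atomically; no data is copied.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Joining waits for each worker and makes its increments visible here.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    counter.load(Ordering::Relaxed)
}

fn main() {
    // The main thread exists automatically; the four workers do not.
    println!("total = {}", shared_count(4, 1_000));
}
```

The `Arc` keeps the counter alive until the last owner is dropped, while the
atomic type makes the increments themselves safe. With a plain integer inside
the `Arc`, the compiler would reject the mutation.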

## How to Check in Practice

On Linux/macOS (terminal):

- See threads for a specific process:

```bash
ps -o nlwp <pid>                  # shows number of threads (NLWP column)
grep Threads /proc/<pid>/status   # Linux only: macOS has no /proc
```

- See all processes from a program:

```bash
ps aux | grep <program_name>
```

- Live view with threads: run top, then press H to toggle thread view
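
The same `Threads:` field that the `grep` command reads can also be read by a
program about itself. The following is a rough Rust sketch that assumes a Linux
system, where `/proc/self/status` refers to the current process.
`parse_thread_count` is a hypothetical helper name, not a standard API.

```rust
use std::fs;

/// Extracts the value of the `Threads:` field from the text of a Linux
/// `/proc/<pid>/status` file. Hypothetical helper, for illustration only.
fn parse_thread_count(status: &str) -> Option<usize> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("Threads:"))
        .and_then(|value| value.trim().parse::<usize>().ok())
}

fn main() {
    // /proc only exists on Linux, so a failed read is reported, not fatal.
    match fs::read_to_string("/proc/self/status") {
        Ok(status) => match parse_thread_count(&status) {
            Some(count) => println!("this process has {} thread(s)", count),
            None => println!("no Threads: field found"),
        },
        Err(err) => println!("could not read /proc/self/status: {}", err),
    }
}
```

Keeping the parsing separate from the file read means the helper can be
exercised with a sample string on any platform.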