      1 # Vec
      2 
      3 # `Vec<u8>`
      4 
      5 Getting the size of a `Vec<u8>` in Rust is straightforward: since each element is
      6 exactly 1 byte, `.len()` gives you the byte count directly.
      7 
      8 ## `len()` vs `capacity()`
      9 
     10 A `Vec<u8>` internally holds three things: a pointer to heap memory, a length,
     11 and a capacity.
     12 
     13 - `len()`: the number of bytes of data actually present. If the vector holds a file's contents, this is the file size.
     14 
     15 - `capacity()`: the total memory reserved on the heap, which may be larger than
     16   `len()` to avoid frequent reallocations as the vector grows. This is an internal
     17   memory management detail.
     18 
     19   For example, a `Vec<u8>` with 500 bytes of real data might have a capacity of
     20   512. Those extra 12 bytes are spare slots reserved ahead of time for future
     21   growth. They contain no real data.
     22   [rust-lang](https://doc.rust-lang.org/std/vec/struct.Vec.html)
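The difference can be checked directly. This is a minimal sketch: the helper name `len_and_capacity` is made up for illustration, and it assumes the buffer is created with `Vec::with_capacity`, which only promises *at least* the requested capacity.

```rust
/// Copies `data` into a buffer allocated with room for at least
/// `reserved` bytes and returns its (len, capacity).
fn len_and_capacity(data: &[u8], reserved: usize) -> (usize, usize) {
    let mut buf: Vec<u8> = Vec::with_capacity(reserved);
    buf.extend_from_slice(data);
    (buf.len(), buf.capacity())
}

fn main() {
    let data = [0xABu8; 500];
    let (len, cap) = len_and_capacity(&data, 512);

    // `len` is the amount of real data; `cap` is how much the allocation can hold.
    println!("len = {len}, capacity = {cap}");
    assert_eq!(len, 500);
    assert!(cap >= 512);
}
```

Only `len` is tied to the data. The capacity depends on how the vector was built, so code that needs a byte count should never read it from `capacity()`.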
     23 
     24 ```
     25  Stack                        Heap
     26 ─────────────────            ──────────────────────────────────────────────────
     27 ┌───────────────┐            ┌─────────────────────────────┬──────────────────┐
     28 │   ptr         │──────────► │  REAL DATA (500 bytes)      │  EMPTY (12 bytes)│
     29 ├───────────────┤            │                             │                  │
     30 │   len = 500   │            │  [0x01][0xFF][0x3A]...      │  [ ][ ][ ]...    │
     31 ├───────────────┤            │                             │                  │
     32 │   cap = 512   │            │  ◄────── len = 500 ────────►│◄─── cap-len ────►│
     33 └───────────────┘            └─────────────────────────────┴──────────────────┘
     34                             ◄──────────────── cap = 512 ──────────────────────►
     35 ```
     36 
     37 - `ptr` points to the start of the heap block
     38 
     39 - `len = 500` marks how much of the block holds real, initialized data; this is
     40   your file size
     41 
     42 - `cap = 512` is the total reserved memory; the trailing 12 bytes are allocated
     43   but uninitialized, and safe code never reads them
     44   [stackoverflow](https://stackoverflow.com/questions/54889521/whats-the-difference-between-len-and-capacity)
     45 
     46 
     47 - When you call `.len()`, you get `500`: exactly the number of bytes fetched from
     48   the database, nothing more
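The three parts of the diagram can be inspected from safe code. The sketch below is illustrative only: `spare_slots` is a hypothetical helper, and the printed pointer and handle size depend on the platform.

```rust
use std::mem::size_of;

/// Number of reserved-but-unused slots at the end of the allocation.
fn spare_slots(v: &Vec<u8>) -> usize {
    v.capacity() - v.len()
}

fn main() {
    let mut v: Vec<u8> = Vec::with_capacity(512);
    v.extend_from_slice(&[0x01, 0xFF, 0x3A]);

    // The handle itself (pointer, length, capacity) is separate from the heap block.
    println!("handle size: {} bytes", size_of::<Vec<u8>>());
    println!("ptr:   {:p}", v.as_ptr());
    println!("len:   {}", v.len());
    println!("cap:   {}", v.capacity());
    println!("spare: {}", spare_slots(&v));

    // Safe indexing stops at `len`; the spare region is not reachable this way.
    assert_eq!(v, [0x01, 0xFF, 0x3A]);
    assert!(v.get(3).is_none());
}
```

Slicing, iteration, and indexing are all bounded by `len`, which is why the spare bytes never leak into the data you read back.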
     49 
     50 ### Question: Why `512`?
     51 
     52 512 was just an illustrative, round-number example, not a special Rust rule.
     53 
     54 In reality:
     55 
     56 - Rust does not guarantee “500 bytes → capacity 512”. `Vec` decides how much
     57   capacity to request from the allocator and typically grows by some factor
     58   (often ~2×), but the growth strategy is an unspecified implementation detail.
     59 
     60 - When a `Vec` needs more space (because `len == capacity` and you push again),
     61   it reallocates to a larger capacity to reduce how often it has to reallocate
     62   in the future. That larger capacity might be 512, 640, 1000, etc., depending
     63   on the previous capacity and the growth strategy.
     64 
     65 - So “500 data, 512 capacity” was just to show the idea: **capacity ≥ len**, and
     66   the extra part is reserved space for future pushes, not real data.
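One way to see the growth behaviour on a given toolchain is to push bytes one at a time and record every capacity change. This is a sketch, with `capacity_steps` as an illustrative name; the exact numbers it prints are implementation-defined and may differ between Rust versions.

```rust
/// Pushes `n` bytes one at a time and records each distinct capacity
/// the vector passes through, starting with the empty vector.
fn capacity_steps(n: usize) -> Vec<usize> {
    let mut v: Vec<u8> = Vec::new();
    let mut steps = vec![v.capacity()];
    for i in 0..n {
        v.push(i as u8); // the value is irrelevant; truncation is fine here
        if v.capacity() != *steps.last().unwrap() {
            // A reallocation happened: `len == capacity` before this push.
            steps.push(v.capacity());
        }
    }
    steps
}

fn main() {
    let steps = capacity_steps(500);
    println!("capacities seen while pushing 500 bytes: {steps:?}");
}
```

Whatever sequence appears, two things hold: the capacity only changes when the vector is full, and it always ends up at least as large as `len`.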
     67 
     68 ### Question: What's the difference between "padding" and a "pre-allocated empty slot"?
     69 
     70 “Padding” refers to extra bytes inserted between or after the fields of a
     71 struct to satisfy alignment requirements (e.g., to align a `u64` on an
     72 8-byte boundary). However, in a `Vec<T>`:
     73 
     74 - `len` is how many initialized elements you have.
     75 
     76 - `capacity` is how many elements the heap allocation can hold.
     77 
     78 - The bytes between `len` and `capacity` are logically **uninitialized storage
     79   for future elements**, not alignment padding.
     80 
     81   So for a `Vec<u8>` with `len = 500` and `capacity = 512`, those 12 bytes are
     82   just unused, reserved space that Rust can fill later if you push more bytes;
     83   they’re not considered padding in the usual memory-layout sense.
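The two concepts can be put side by side. In this sketch the `Padded` struct and both helper functions are made up for illustration, and `#[repr(C)]` is used so that the field order is fixed; the amount of padding depends on the target's alignment for `u64` (7 bytes where that alignment is 8).

```rust
use std::mem::size_of;

// `repr(C)` keeps the fields in declaration order so the padding is predictable.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    tag: u8,    // 1 byte, followed by padding inserted by the compiler
    value: u64, // 8 bytes, must start at a multiple of its alignment
}

/// Bytes in `Padded` that belong to no field: alignment padding.
fn padding_bytes() -> usize {
    size_of::<Padded>() - (size_of::<u8>() + size_of::<u64>())
}

/// Reserved slots after the last element: spare capacity, not padding.
fn spare_bytes(v: &Vec<u8>) -> usize {
    v.capacity() - v.len()
}

fn main() {
    println!("padding inside Padded: {} bytes", padding_bytes());

    let mut v: Vec<u8> = Vec::with_capacity(512);
    v.extend_from_slice(&[0u8; 500]);
    println!("spare capacity: {} bytes", spare_bytes(&v));

    // Pushing fills a spare slot without reallocating; padding can never be used this way.
    let before = v.capacity();
    v.push(0x2A);
    assert_eq!(v.capacity(), before);
}
```

Padding is fixed by the type's layout and can never hold data, while spare capacity shrinks by one slot with every push until the vector has to reallocate.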