docs: append sections S7-S8 (render loop, resize)

2026-05-30 17:53:15 -05:00
parent de38f526b9
commit 4e8c4be649
1 changed files with 374 additions and 0 deletions
--- a/docs/01-rainbow-triangle.md
+++ b/docs/01-rainbow-triangle.md
@@ -885,3 +885,377 @@ stereoscopic (VR) or multi-viewport single-pass rendering. Not used here.
 **`cache: None`** — no pipeline cache. A pipeline cache stores compiled shader
 binaries to speed up subsequent pipeline creation. Useful when creating many
 pipelines dynamically; for a single pipeline, caching has no practical benefit.
 ## S7: The Render Loop — Recording and Submitting Commands
 New concept: **command buffers are scripts, not function calls.** You cannot call
 GPU operations directly from CPU code. Instead, you record commands into a
 [command buffer](concepts/GLOSSARY.md#command-buffer) — a script that the GPU
 queue executes asynchronously. Think of it like building an assembly listing:
 each recording method appends an instruction. When the script is complete, you
 submit it atomically to the [queue](concepts/GLOSSARY.md#queue). The GPU executes
 all instructions in parallel, in whatever order it determines is optimal. There
 is no `.await` on a draw call. The CPU returns immediately after submission and
 continues the next frame while the GPU works in the background.
 > **Key insight #4 — Command buffers are scripts, not function calls:**
 > `create_command_encoder()` opens a recording session. `begin_render_pass()`
 > starts a scoped drawing block. `render_pass.draw()` appends a draw command.
 > `encoder.finish()` seals the script. `queue.submit()` dispatches it. The GPU
 > executes it later, in parallel. There is no `.await` on a draw call.
 ### The `render(&mut self)` Method Signature
 ```rust
 fn render(&mut self) {
    // ...
 }
 ```
 This is a **fully synchronous** method. It runs on the winit event loop thread
 (triggered by `RedrawRequested`), has no `async` keyword, no `.await`, and takes
 no tokio handle. All wgpu recording and submission operations are synchronous
 and fast — they only encode instructions and push them to the queue; they do not
 wait for GPU completion.
 ### Acquiring a Back Buffer from the Swapchain
 ```rust
 let status = self.surface.get_current_texture();
 ```
 `get_current_texture()` is how you acquire a back buffer from the
 [swapchain](concepts/GLOSSARY.md#swapchain). This is the framebuffer you render
 into for this frame. In a triple-buffered swapchain (`PresentMode::Mailbox`),
 there are up to two spare back buffers waiting for you. `get_current_texture()`
 hands you the next available one.
 In wgpu 29+, this method returns a `CurrentSurfaceTexture` **enum** — not a
 `Result`. The swapchain can be in seven distinct states, and every state is a
 valid, non-error condition:
 > **Key insight #5 — 7 swapchain states you must handle:** `Success(buf)` —
 > render normally. `Suboptimal(buf)` — render but reconfig is advisable.
 > `Timeout` — skip frame (GPU late). `Occluded` — skip frame (window behind
 > another). `Outdated` — `self.resize()` to reconfigure. `Lost` — skip frame
 > (display server restarted). `Validation` — skip frame (API misuse; check
 > logs).
 WHY `match` on 7 variants: `get_current_texture()` does not return a `Result`.
 All 7 states are valid and the match must be exhaustive. The Rust compiler
 enforces this — you cannot miss a variant.
 ### The Complete `render` Implementation
 ```rust
 fn render(&mut self) {
    let status = self.surface.get_current_texture();
    match status {
        wgpu::SurfaceStatus::Success(surface_texture)
        | wgpu::SurfaceStatus::Suboptimal(surface_texture) => {
            // Drive GPU work: shader compilation, memory allocation, fence signaling
            if let Err(e) = self.device.poll(wgpu::Maintain::Wait) {
                log::error!("Device poll failed: {e}");
                return;
            }
            let texture_view = surface_texture.texture.create_view(&Default::default());
            let mut encoder = self.device.create_command_encoder(
                &wgpu::CommandEncoderDescriptor {
                    label: Some("Main Command Encoder"),
                },
            );
            {
                let mut render_pass = encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
                    label: Some("Main Render Pass"),
                    color_attachments: &[Some(wgpu::RenderPassColorAttachment {
                        view: &texture_view,
                        depth_slice: None,
                        resolve_target: None,
                        ops: wgpu::Operations {
                            load: wgpu::LoadOp::Clear(wgpu::Color {
                                r: 0.1,
                                g: 0.1,
                                b: 0.1,
                                a: 1.0,
                            }),
                            store: wgpu::StoreOp::Store,
                        },
                    })],
                    depth_stencil_attachment: None,
                    timestamp_writes: None,
                    occlusion_query_set: None,
                    multiview_mask: None,
                });
                render_pass.set_pipeline(&self.pipeline);
                render_pass.set_vertex_buffer(0, self.vertex_buffer.slice(..));
                render_pass.draw(0..3, 0..1);
            } // render_pass drops here — render pass ends automatically
            self.queue.submit(std::iter::once(encoder.finish()));
            surface_texture.present();
        }
        wgpu::SurfaceStatus::Timeout => {
            // GPU took too long to finish previous work. Skip this frame.
            log::warn!("Surface status: Timeout — skipping frame");
        }
        wgpu::SurfaceStatus::Occluded => {
            // Window is fully occluded by another window. Skip rendering.
            log::debug!("Surface status: Occluded — skipping frame");
        }
        wgpu::SurfaceStatus::Outdated => {
            // Swapchain resolution no longer matches window. Reconfigure.
            log::warn!("Surface status: Outdated — resizing");
            if let Some(window) = &self.window {
                self.resize(window.inner_size());
            }
        }
        wgpu::SurfaceStatus::Lost => {
            // Display server restarted or GPU lost. Fatal without re-init.
            log::error!("Surface status: Lost — cannot recover without re-creating State");
        }
        wgpu::SurfaceStatus::Validation { source, description } => {
            // wgpu validated your descriptor and found it invalid.
            log::error!("Surface validation: {source} — {description}");
        }
    }
 }
 ```
 ### Step by Step
 **`surface.get_current_texture()`** — Acquires the next available back buffer
 from the [swapchain](concepts/GLOSSARY.md#swapchain). The swapchain cycles through
 2–3 pre-allocated back buffers. This call returns immediately if a buffer is
 available; it does not block on the GPU.
 **`device.poll(wgpu::Maintain::Wait)`** — **Synchronous** call that drives
 in-flight GPU work to completion: shader compilation fences, memory allocation,
 and queue signaling. Without this, resources accumulate because the device does
 not reclaim finished work. Called once per frame. Returns
 `Result<(), MaintainError>` — if the device is lost, you recover by
 re-creating the device.
 WHY this is synchronous: `poll()` does not spawn a task or use `.await`. It
 runs a small internal loop checking Vulkan fence objects until all in-flight
 work is done, then returns. On a busy GPU this can take a few milliseconds per
 frame — that is normal.
 **`texture.create_view(&Default::default())`** — A [texture view](concepts/GLOSSARY.md#texture-view)
 is how wgpu references a texture's memory inside a render pass. The GPU does
 not accept raw texture handles in render pass attachments — it requires a view
 that describes the mip level range, aspect, and dimension format.
 `Default::default()` creates a full-view covering all mip levels and all aspects.
 **`device.create_command_encoder(&desc)`** — Opens a recording session. The
 [command encoder](concepts/GLOSSARY.md#command-buffer) is where you append
 instructions. Think of it as building a function body: you add statements, then
 `finish()` closes the function and returns the compiled buffer.
 **`encoder.begin_render_pass(&desc)`** — Starts a scoped drawing block. The
 [render pass](concepts/GLOSSARY.md#render-pass) descriptor defines the target
 attachments (color, depth, stencil). The returned `RenderPass` is a scoped
 guard — when it drops, the render pass ends automatically.
 ### Render Pass Color Attachment
 ```rust
 color_attachments: &[Some(wgpu::RenderPassColorAttachment {
    view: &texture_view,
    depth_slice: None,
    resolve_target: None,
    ops: wgpu::Operations {
        load: wgpu::LoadOp::Clear(wgpu::Color { r: 0.1, g: 0.1, b: 0.1, a: 1.0 }),
        store: wgpu::StoreOp::Store,
    },
 })],
 ```
 **`RenderPassColorAttachment` has exactly 4 fields:**
 - **`view: &texture_view`** — the framebuffer we draw into. Must match the
  color target format in the [render pipeline](concepts/GLOSSARY.md#render-pipeline).
 - **`depth_slice: None`** — only used for 3D texture slices. Not applicable
  to 2D rendering.
 - **`resolve_target: None`** — only used for MSAA resolve. When multisampling
  is active, the render pass writes to a multisampled buffer and resolves into
  this target. We have no MSAA, so `None`.
 - **`ops`** — [operations](concepts/GLOSSARY.md#render-pass) controlling load
  and store behavior. Two sub-fields:
  - **`load: LoadOp::Clear(color)`** — before drawing, fill the entire
    framebuffer with this color. **This IS your background color.** Dark gray.
    `LoadOp::Load` keeps existing pixels (used in UI compositing where you
    draw on top of previous content).
  - **`store: StoreOp::Store`** — after drawing, keep what was written. The
    GPU writes the result back to the texture so the swapchain can present it.
    `StoreOp::Discard` throws away the result — used for offscreen renders
    where only the depth/stencil result matters.
 **`depth_stencil_attachment: None`** — No depth or stencil buffer. When you
 have a depth texture, it goes here.
 **`timestamp_writes: None`** — GPU hardware timestamps for profiling. Not used
 in production rendering; requires a query set.
 **`occlusion_query_set: None`** — hardware occlusion queries (count fragments
 that pass the depth test). Useful for visibility-based culling.
 **`multiview_mask: None`** — multiview rendering mask for VR / multi-viewport.
 ### Binding State and Drawing
 **`render_pass.set_pipeline(&self.pipeline)`** — Tells the GPU which
 [render pipeline](concepts/GLOSSARY.md#render-pipeline) to use for subsequent
 draw calls. The pipeline encapsulates the shader programs, vertex format,
 primitive topology, and output configuration. Must be set before any draw call
 in a render pass. Switching pipelines mid-pass is expensive and should be
 minimized.
 WHY this is necessary: the GPU hardware does not store pipeline state between
 frames. Every render pass starts with no pipeline bound. You must set it every
 frame.
 **`render_pass.set_vertex_buffer(0, self.vertex_buffer.slice(..))`** — Binds the
 [vertex buffer](concepts/GLOSSARY.md#vertex-buffer) to slot 0.
 `buffer.slice(..)` creates a [buffer slice](concepts/GLOSSARY.md#buffer-slice)
 covering the full buffer (equivalent to `buffer.slice(0..)`). Slot 0 corresponds
 to the first layout in the pipeline's vertex buffer layouts array. If you had
 multiple vertex buffers (e.g., separate position and instance buffers), you'd
 bind them to slots 0, 1, etc.
 **`render_pass.draw(0..3, 0..1)`** — The draw command. Two `Range<u32>`
 arguments:
 - First range `0..3` — vertex range. Draw vertices 0, 1, 2 (three vertices
  forming one triangle).
 - Second range `0..1` — instance range. Draw instance 0 (one instance).
 WHY two ranges: the vertex range controls which vertices from the buffer are
 read. The instance range controls instanced rendering — the same geometry drawn
 multiple times with different instance-data attributes. For a single triangle,
 one draw call with `0..1` instances is correct.
 **Render pass scope drop** — When the `render_pass` variable goes out of scope
 (the closing `}` in the block), the drop implementation ends the render pass
 and performs validation. If you forgot to set the pipeline or bind a required
 buffer, wgpu reports the error at drop time, not at draw time.
 **`encoder.finish()`** — Seals the command encoder. Returns the finished
 [command buffer](concepts/GLOSSARY.md#command-buffer) ready for submission.
 After `finish()`, the encoder cannot be used again.
 **`queue.submit(iter)`** — Dispatches one or more command buffers to the GPU.
 Takes an iterator of command buffers. We submit exactly one: the frame's command
 buffer. This is a fire-and-forget call — it queues the work and returns
 immediately. The GPU executes it asynchronously, in parallel with your next
 frame's CPU work.
 **`surface_texture.present()`** — Queues the rendered back buffer for display.
 This tells the swapchain: "this buffer is done, show it on screen." **If you
 forget this, you render to a buffer nobody sees.** The swapchain cycles the
 buffer from "render target" to "front buffer" on the next vsync.
 ### Why the Match Arms Differ
 - **`Success` / `Suboptimal`** — both deliver a `SurfaceTexture` you can render
  into. The difference: `Suboptimal` means the current swapchain configuration
  is not ideal for the GPU (e.g., format mismatch). You render normally but
  should consider reconfiguring the surface during idle time.
 - **`Timeout`** — the GPU exceeded the wait threshold for a back buffer. Skip
  the frame. The GPU will catch up.
 - **`Occluded`** — another fully covers your window. Skip rendering entirely —
  the display server will not show your output. Saves GPU work.
 - **`Outdated`** — the swapchain was created for a resolution that no longer
  matches the window. Reconfigure the surface to match.
 - **`Lost`** — the GPU or display server has been reset. Without re-creating
  the device and surface, you cannot recover. In a real application, you'd
  trigger a full re-initialization.
 - **`Validation`** — wgpu rejected the surface configuration due to API misuse.
  Check logs for the description. This is a programming error, not a runtime
  condition.
 Note: `pre_present_notify()` does **not** exist in wgpu 29. Do not call it. The
 device polling via `device.poll()` is the only frame synchronization mechanism
 you need.
 ## S8: Handling Window Resize
 WHY `surface.configure()` on resize: The swapchain allocates back buffers at a
 fixed dimension. When the window size changes, the old back buffers no longer
 match the window's display surface. Presenting a mismatched-size buffer causes
 undefined behavior — the display server clips, stretches, or rejects it.
 `surface.configure()` allocates new back buffers matching the new dimensions and
 discards the old ones.
 WHY `width.max(1)`: On some display servers, minimizing a window briefly
 reports `0 × 0` size before restoring. A zero-dimension surface allocation
 panics. Clamping to 1 ensures the swapchain always has valid dimensions.
 WHY `std::mem::take(&mut self.config.view_formats)`: The `view_formats` field
 of `SurfaceConfiguration` is an owned `Vec<TextureFormat>`. When constructing
 the new configuration, you move the vector out of the old config rather than
 cloning it. `mem::take` replaces the field with `Vec::new()` (zero allocation)
 and returns the original vector. This avoids a heap allocation for what is
 typically a 1-element vec.
 ```rust
 fn resize(&mut self, size: wgpu::dpi::PhysicalSize<u32>) {
    if size.width > 0 && size.height > 0 {
        let config = wgpu::SurfaceConfiguration {
            usage: self.config.usage,
            format: self.config.format,
            width: size.width.max(1),
            height: size.height.max(1),
            present_mode: self.config.present_mode,
            desired_maximum_frame_latency: self.config.desired_maximum_frame_latency,
            alpha_mode: self.config.alpha_mode,
            view_formats: std::mem::take(&mut self.config.view_formats),
        };
        self.surface.configure(&self.device, &config);
        self.config = config;
    }
 }
 ```
 FIELD BY FIELD:
 **`usage` / `format` / `present_mode` / `alpha_mode`** — carried over from the
 old config unchanged. These properties are negotiated once at init time
 and do not change on resize.
 **`width` / `height`** — the new dimensions, clamped to at least 1.
 **`desired_maximum_frame_latency`** — swapchain back-pressure setting. Kept from
 the old config. This value controls how many frames the swapchain buffers
 between CPU submission and GPU presentation. A value of 2 (triple buffering)
 provides smooth frame pacing under variable CPU/GPU load. See S3 init step 5.
 **`view_formats`** — additional texture formats the surface can create views
 with. `std::mem::take()` moves the owned vector from the old config into the
 new config. After `take()`, the old config's `view_formats` is an empty `Vec`.
 This avoids a `clone()` of the vector. Since the old config is about to be
 overwritten by `self.config = config`, the emptied field is irrelevant.
 **`surface.configure(&self.device, &config)`** — takes a reference to the
 `Device` and the new `SurfaceConfiguration`. This is not async. It allocates the
 new swapchain buffers and replaces the old ones. Any in-flight renders using
 old buffers complete normally; the new buffers are available after this call
 returns.
 ### When `resize` Is Called
 In our `App::window_event` handler (S2), the `WindowEvent::Resized(size)` arm
 calls `state.resize(window, size)`. The resize fires once for every dimension
 change. On fast window resizing, you may receive dozens of resize events in
 succession. `surface.configure()` is fast enough to handle this — each call
 discards old buffers and allocates new ones. The GPU continues processing
 in-flight frames with the old buffer dimensions; there is no visual glitch
 because the swapchain handles the transition seamlessly.