docs: append sections S7-S8 (render loop, resize)
This commit is contained in:
@@ -885,3 +885,377 @@ stereoscopic (VR) or multi-viewport single-pass rendering. Not used here.
|
||||
**`cache: None`** — no pipeline cache. A pipeline cache stores compiled shader
|
||||
binaries to speed up subsequent pipeline creation. Useful when creating many
|
||||
pipelines dynamically; for a single pipeline, caching has no practical benefit.
|
||||
|
||||
## S7: The Render Loop — Recording and Submitting Commands
|
||||
|
||||
New concept: **command buffers are scripts, not function calls.** You cannot call
|
||||
GPU operations directly from CPU code. Instead, you record commands into a
|
||||
[command buffer](concepts/GLOSSARY.md#command-buffer) — a script that the GPU
|
||||
queue executes asynchronously. Think of it like building an assembly listing:
|
||||
each recording method appends an instruction. When the script is complete, you
|
||||
submit it atomically to the [queue](concepts/GLOSSARY.md#queue). The GPU executes
|
||||
all instructions in parallel, in whatever order it determines is optimal. There
|
||||
is no `.await` on a draw call. The CPU returns immediately after submission and
|
||||
continues the next frame while the GPU works in the background.
|
||||
|
||||
> **Key insight #4 — Command buffers are scripts, not function calls:**
|
||||
> `create_command_encoder()` opens a recording session. `begin_render_pass()`
|
||||
> starts a scoped drawing block. `render_pass.draw()` appends a draw command.
|
||||
> `encoder.finish()` seals the script. `queue.submit()` dispatches it. The GPU
|
||||
> executes it later, in parallel. There is no `.await` on a draw call.
|
||||
|
||||
### The `render(&mut self)` Method Signature
|
||||
|
||||
```rust
|
||||
fn render(&mut self) {
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
This is a **fully synchronous** method. It runs on the winit event loop thread
|
||||
(triggered by `RedrawRequested`), has no `async` keyword, no `.await`, and takes
|
||||
no tokio handle. All wgpu recording and submission operations are synchronous
|
||||
and fast — they only encode instructions and push them to the queue; they do not
|
||||
wait for GPU completion.
|
||||
|
||||
### Acquiring a Back Buffer from the Swapchain
|
||||
|
||||
```rust
|
||||
let status = self.surface.get_current_texture();
|
||||
```
|
||||
|
||||
`get_current_texture()` is how you acquire a back buffer from the
|
||||
[swapchain](concepts/GLOSSARY.md#swapchain). This is the framebuffer you render
|
||||
into for this frame. In a triple-buffered swapchain (`PresentMode::Mailbox`),
|
||||
there are up to two spare back buffers waiting for you. `get_current_texture()`
|
||||
hands you the next available one.
|
||||
|
||||
In wgpu 29+, this method returns a `CurrentSurfaceTexture` **enum** — not a
|
||||
`Result`. The swapchain can be in seven distinct states, and every state is a
|
||||
valid, non-error condition:
|
||||
|
||||
> **Key insight #5 — 7 swapchain states you must handle:** `Success(buf)` —
|
||||
> render normally. `Suboptimal(buf)` — render but reconfig is advisable.
|
||||
> `Timeout` — skip frame (GPU late). `Occluded` — skip frame (window behind
|
||||
> another). `Outdated` — `self.resize()` to reconfigure. `Lost` — skip frame
|
||||
> (display server restarted). `Validation` — skip frame (API misuse; check
|
||||
> logs).
|
||||
|
||||
WHY `match` on 7 variants: `get_current_texture()` does not return a `Result`.
|
||||
All 7 states are valid and the match must be exhaustive. The Rust compiler
|
||||
enforces this — you cannot miss a variant.
|
||||
|
||||
### The Complete `render` Implementation
|
||||
|
||||
```rust
|
||||
fn render(&mut self) {
|
||||
let status = self.surface.get_current_texture();
|
||||
|
||||
match status {
|
||||
wgpu::SurfaceStatus::Success(surface_texture)
|
||||
| wgpu::SurfaceStatus::Suboptimal(surface_texture) => {
|
||||
// Drive GPU work: shader compilation, memory allocation, fence signaling
|
||||
if let Err(e) = self.device.poll(wgpu::Maintain::Wait) {
|
||||
log::error!("Device poll failed: {e}");
|
||||
return;
|
||||
}
|
||||
|
||||
let texture_view = surface_texture.texture.create_view(&Default::default());
|
||||
|
||||
let mut encoder = self.device.create_command_encoder(
|
||||
&wgpu::CommandEncoderDescriptor {
|
||||
label: Some("Main Command Encoder"),
|
||||
},
|
||||
);
|
||||
|
||||
{
|
||||
let mut render_pass = encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
|
||||
label: Some("Main Render Pass"),
|
||||
color_attachments: &[Some(wgpu::RenderPassColorAttachment {
|
||||
view: &texture_view,
|
||||
depth_slice: None,
|
||||
resolve_target: None,
|
||||
ops: wgpu::Operations {
|
||||
load: wgpu::LoadOp::Clear(wgpu::Color {
|
||||
r: 0.1,
|
||||
g: 0.1,
|
||||
b: 0.1,
|
||||
a: 1.0,
|
||||
}),
|
||||
store: wgpu::StoreOp::Store,
|
||||
},
|
||||
})],
|
||||
depth_stencil_attachment: None,
|
||||
timestamp_writes: None,
|
||||
occlusion_query_set: None,
|
||||
multiview_mask: None,
|
||||
});
|
||||
|
||||
render_pass.set_pipeline(&self.pipeline);
|
||||
render_pass.set_vertex_buffer(0, self.vertex_buffer.slice(..));
|
||||
render_pass.draw(0..3, 0..1);
|
||||
} // render_pass drops here — render pass ends automatically
|
||||
|
||||
self.queue.submit(std::iter::once(encoder.finish()));
|
||||
surface_texture.present();
|
||||
}
|
||||
|
||||
wgpu::SurfaceStatus::Timeout => {
|
||||
// GPU took too long to finish previous work. Skip this frame.
|
||||
log::warn!("Surface status: Timeout — skipping frame");
|
||||
}
|
||||
|
||||
wgpu::SurfaceStatus::Occluded => {
|
||||
// Window is fully occluded by another window. Skip rendering.
|
||||
log::debug!("Surface status: Occluded — skipping frame");
|
||||
}
|
||||
|
||||
wgpu::SurfaceStatus::Outdated => {
|
||||
// Swapchain resolution no longer matches window. Reconfigure.
|
||||
log::warn!("Surface status: Outdated — resizing");
|
||||
if let Some(window) = &self.window {
|
||||
self.resize(window.inner_size());
|
||||
}
|
||||
}
|
||||
|
||||
wgpu::SurfaceStatus::Lost => {
|
||||
// Display server restarted or GPU lost. Fatal without re-init.
|
||||
log::error!("Surface status: Lost — cannot recover without re-creating State");
|
||||
}
|
||||
|
||||
wgpu::SurfaceStatus::Validation { source, description } => {
|
||||
// wgpu validated your descriptor and found it invalid.
|
||||
log::error!("Surface validation: {source} — {description}");
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Step by Step
|
||||
|
||||
**`surface.get_current_texture()`** — Acquires the next available back buffer
|
||||
from the [swapchain](concepts/GLOSSARY.md#swapchain). The swapchain cycles through
|
||||
2–3 pre-allocated back buffers. This call returns immediately if a buffer is
|
||||
available; it does not block on the GPU.
|
||||
|
||||
**`device.poll(wgpu::Maintain::Wait)`** — **Synchronous** call that drives
|
||||
in-flight GPU work to completion: shader compilation fences, memory allocation,
|
||||
and queue signaling. Without this, resources accumulate because the device does
|
||||
not reclaim finished work. Called once per frame. Returns
|
||||
`Result<(), MaintainError>` — if the device is lost, you recover by
|
||||
re-creating the device.
|
||||
|
||||
WHY this is synchronous: `poll()` does not spawn a task or use `.await`. It
|
||||
runs a small internal loop checking Vulkan fence objects until all in-flight
|
||||
work is done, then returns. On a busy GPU this can take a few milliseconds per
|
||||
frame — that is normal.
|
||||
|
||||
**`texture.create_view(&Default::default())`** — A [texture view](concepts/GLOSSARY.md#texture-view)
|
||||
is how wgpu references a texture's memory inside a render pass. The GPU does
|
||||
not accept raw texture handles in render pass attachments — it requires a view
|
||||
that describes the mip level range, aspect, and dimension format.
|
||||
`Default::default()` creates a full-view covering all mip levels and all aspects.
|
||||
|
||||
**`device.create_command_encoder(&desc)`** — Opens a recording session. The
|
||||
[command encoder](concepts/GLOSSARY.md#command-buffer) is where you append
|
||||
instructions. Think of it as building a function body: you add statements, then
|
||||
`finish()` closes the function and returns the compiled buffer.
|
||||
|
||||
**`encoder.begin_render_pass(&desc)`** — Starts a scoped drawing block. The
|
||||
[render pass](concepts/GLOSSARY.md#render-pass) descriptor defines the target
|
||||
attachments (color, depth, stencil). The returned `RenderPass` is a scoped
|
||||
guard — when it drops, the render pass ends automatically.
|
||||
|
||||
### Render Pass Color Attachment
|
||||
|
||||
```rust
|
||||
color_attachments: &[Some(wgpu::RenderPassColorAttachment {
|
||||
view: &texture_view,
|
||||
depth_slice: None,
|
||||
resolve_target: None,
|
||||
ops: wgpu::Operations {
|
||||
load: wgpu::LoadOp::Clear(wgpu::Color { r: 0.1, g: 0.1, b: 0.1, a: 1.0 }),
|
||||
store: wgpu::StoreOp::Store,
|
||||
},
|
||||
})],
|
||||
```
|
||||
|
||||
**`RenderPassColorAttachment` has exactly 4 fields:**
|
||||
|
||||
- **`view: &texture_view`** — the framebuffer we draw into. Must match the
|
||||
color target format in the [render pipeline](concepts/GLOSSARY.md#render-pipeline).
|
||||
- **`depth_slice: None`** — only used for 3D texture slices. Not applicable
|
||||
to 2D rendering.
|
||||
- **`resolve_target: None`** — only used for MSAA resolve. When multisampling
|
||||
is active, the render pass writes to a multisampled buffer and resolves into
|
||||
this target. We have no MSAA, so `None`.
|
||||
- **`ops`** — [operations](concepts/GLOSSARY.md#render-pass) controlling load
|
||||
and store behavior. Two sub-fields:
|
||||
- **`load: LoadOp::Clear(color)`** — before drawing, fill the entire
|
||||
framebuffer with this color. **This IS your background color.** Dark gray.
|
||||
`LoadOp::Load` keeps existing pixels (used in UI compositing where you
|
||||
draw on top of previous content).
|
||||
- **`store: StoreOp::Store`** — after drawing, keep what was written. The
|
||||
GPU writes the result back to the texture so the swapchain can present it.
|
||||
`StoreOp::Discard` throws away the result — used for offscreen renders
|
||||
where only the depth/stencil result matters.
|
||||
|
||||
**`depth_stencil_attachment: None`** — No depth or stencil buffer. When you
|
||||
have a depth texture, it goes here.
|
||||
|
||||
**`timestamp_writes: None`** — GPU hardware timestamps for profiling. Not used
|
||||
in production rendering; requires a query set.
|
||||
|
||||
**`occlusion_query_set: None`** — hardware occlusion queries (count fragments
|
||||
that pass the depth test). Useful for visibility-based culling.
|
||||
|
||||
**`multiview_mask: None`** — multiview rendering mask for VR / multi-viewport.
|
||||
|
||||
### Binding State and Drawing
|
||||
|
||||
**`render_pass.set_pipeline(&self.pipeline)`** — Tells the GPU which
|
||||
[render pipeline](concepts/GLOSSARY.md#render-pipeline) to use for subsequent
|
||||
draw calls. The pipeline encapsulates the shader programs, vertex format,
|
||||
primitive topology, and output configuration. Must be set before any draw call
|
||||
in a render pass. Switching pipelines mid-pass is expensive and should be
|
||||
minimized.
|
||||
|
||||
WHY this is necessary: the GPU hardware does not store pipeline state between
|
||||
frames. Every render pass starts with no pipeline bound. You must set it every
|
||||
frame.
|
||||
|
||||
**`render_pass.set_vertex_buffer(0, self.vertex_buffer.slice(..))`** — Binds the
|
||||
[vertex buffer](concepts/GLOSSARY.md#vertex-buffer) to slot 0.
|
||||
`buffer.slice(..)` creates a [buffer slice](concepts/GLOSSARY.md#buffer-slice)
|
||||
covering the full buffer (equivalent to `buffer.slice(0..)`). Slot 0 corresponds
|
||||
to the first layout in the pipeline's vertex buffer layouts array. If you had
|
||||
multiple vertex buffers (e.g., separate position and instance buffers), you'd
|
||||
bind them to slots 0, 1, etc.
|
||||
|
||||
**`render_pass.draw(0..3, 0..1)`** — The draw command. Two `Range<u32>`
|
||||
arguments:
|
||||
- First range `0..3` — vertex range. Draw vertices 0, 1, 2 (three vertices
|
||||
forming one triangle).
|
||||
- Second range `0..1` — instance range. Draw instance 0 (one instance).
|
||||
|
||||
WHY two ranges: the vertex range controls which vertices from the buffer are
|
||||
read. The instance range controls instanced rendering — the same geometry drawn
|
||||
multiple times with different instance-data attributes. For a single triangle,
|
||||
one draw call with `0..1` instances is correct.
|
||||
|
||||
**Render pass scope drop** — When the `render_pass` variable goes out of scope
|
||||
(the closing `}` in the block), the drop implementation ends the render pass
|
||||
and performs validation. If you forgot to set the pipeline or bind a required
|
||||
buffer, wgpu reports the error at drop time, not at draw time.
|
||||
|
||||
**`encoder.finish()`** — Seals the command encoder. Returns the finished
|
||||
[command buffer](concepts/GLOSSARY.md#command-buffer) ready for submission.
|
||||
After `finish()`, the encoder cannot be used again.
|
||||
|
||||
**`queue.submit(iter)`** — Dispatches one or more command buffers to the GPU.
|
||||
Takes an iterator of command buffers. We submit exactly one: the frame's command
|
||||
buffer. This is a fire-and-forget call — it queues the work and returns
|
||||
immediately. The GPU executes it asynchronously, in parallel with your next
|
||||
frame's CPU work.
|
||||
|
||||
**`surface_texture.present()`** — Queues the rendered back buffer for display.
|
||||
This tells the swapchain: "this buffer is done, show it on screen." **If you
|
||||
forget this, you render to a buffer nobody sees.** The swapchain cycles the
|
||||
buffer from "render target" to "front buffer" on the next vsync.
|
||||
|
||||
### Why the Match Arms Differ
|
||||
|
||||
- **`Success` / `Suboptimal`** — both deliver a `SurfaceTexture` you can render
|
||||
into. The difference: `Suboptimal` means the current swapchain configuration
|
||||
is not ideal for the GPU (e.g., format mismatch). You render normally but
|
||||
should consider reconfiguring the surface during idle time.
|
||||
- **`Timeout`** — the GPU exceeded the wait threshold for a back buffer. Skip
|
||||
the frame. The GPU will catch up.
|
||||
- **`Occluded`** — another fully covers your window. Skip rendering entirely —
|
||||
the display server will not show your output. Saves GPU work.
|
||||
- **`Outdated`** — the swapchain was created for a resolution that no longer
|
||||
matches the window. Reconfigure the surface to match.
|
||||
- **`Lost`** — the GPU or display server has been reset. Without re-creating
|
||||
the device and surface, you cannot recover. In a real application, you'd
|
||||
trigger a full re-initialization.
|
||||
- **`Validation`** — wgpu rejected the surface configuration due to API misuse.
|
||||
Check logs for the description. This is a programming error, not a runtime
|
||||
condition.
|
||||
|
||||
Note: `pre_present_notify()` does **not** exist in wgpu 29. Do not call it. The
|
||||
device polling via `device.poll()` is the only frame synchronization mechanism
|
||||
you need.
|
||||
|
||||
## S8: Handling Window Resize
|
||||
|
||||
WHY `surface.configure()` on resize: The swapchain allocates back buffers at a
|
||||
fixed dimension. When the window size changes, the old back buffers no longer
|
||||
match the window's display surface. Presenting a mismatched-size buffer causes
|
||||
undefined behavior — the display server clips, stretches, or rejects it.
|
||||
`surface.configure()` allocates new back buffers matching the new dimensions and
|
||||
discards the old ones.
|
||||
|
||||
WHY `width.max(1)`: On some display servers, minimizing a window briefly
|
||||
reports `0 × 0` size before restoring. A zero-dimension surface allocation
|
||||
panics. Clamping to 1 ensures the swapchain always has valid dimensions.
|
||||
|
||||
WHY `std::mem::take(&mut self.config.view_formats)`: The `view_formats` field
|
||||
of `SurfaceConfiguration` is an owned `Vec<TextureFormat>`. When constructing
|
||||
the new configuration, you move the vector out of the old config rather than
|
||||
cloning it. `mem::take` replaces the field with `Vec::new()` (zero allocation)
|
||||
and returns the original vector. This avoids a heap allocation for what is
|
||||
typically a 1-element vec.
|
||||
|
||||
```rust
|
||||
fn resize(&mut self, size: wgpu::dpi::PhysicalSize<u32>) {
|
||||
if size.width > 0 && size.height > 0 {
|
||||
let config = wgpu::SurfaceConfiguration {
|
||||
usage: self.config.usage,
|
||||
format: self.config.format,
|
||||
width: size.width.max(1),
|
||||
height: size.height.max(1),
|
||||
present_mode: self.config.present_mode,
|
||||
desired_maximum_frame_latency: self.config.desired_maximum_frame_latency,
|
||||
alpha_mode: self.config.alpha_mode,
|
||||
view_formats: std::mem::take(&mut self.config.view_formats),
|
||||
};
|
||||
self.surface.configure(&self.device, &config);
|
||||
self.config = config;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
FIELD BY FIELD:
|
||||
|
||||
**`usage` / `format` / `present_mode` / `alpha_mode`** — carried over from the
|
||||
old config unchanged. These properties are negotiated once at init time
|
||||
and do not change on resize.
|
||||
|
||||
**`width` / `height`** — the new dimensions, clamped to at least 1.
|
||||
|
||||
**`desired_maximum_frame_latency`** — swapchain back-pressure setting. Kept from
|
||||
the old config. This value controls how many frames the swapchain buffers
|
||||
between CPU submission and GPU presentation. A value of 2 (triple buffering)
|
||||
provides smooth frame pacing under variable CPU/GPU load. See S3 init step 5.
|
||||
|
||||
**`view_formats`** — additional texture formats the surface can create views
|
||||
with. `std::mem::take()` moves the owned vector from the old config into the
|
||||
new config. After `take()`, the old config's `view_formats` is an empty `Vec`.
|
||||
This avoids a `clone()` of the vector. Since the old config is about to be
|
||||
overwritten by `self.config = config`, the emptied field is irrelevant.
|
||||
|
||||
**`surface.configure(&self.device, &config)`** — takes a reference to the
|
||||
`Device` and the new `SurfaceConfiguration`. This is not async. It allocates the
|
||||
new swapchain buffers and replaces the old ones. Any in-flight renders using
|
||||
old buffers complete normally; the new buffers are available after this call
|
||||
returns.
|
||||
|
||||
### When `resize` Is Called
|
||||
|
||||
In our `App::window_event` handler (S2), the `WindowEvent::Resized(size)` arm
|
||||
calls `state.resize(window, size)`. The resize fires once for every dimension
|
||||
change. On fast window resizing, you may receive dozens of resize events in
|
||||
succession. `surface.configure()` is fast enough to handle this — each call
|
||||
discards old buffers and allocates new ones. The GPU continues processing
|
||||
in-flight frames with the old buffer dimensions; there is no visual glitch
|
||||
because the swapchain handles the transition seamlessly.
|
||||
|
||||
Reference in New Issue
Block a user