docs: append sections S7-S8 (render loop, resize)
@@ -885,3 +885,377 @@ stereoscopic (VR) or multi-viewport single-pass rendering. Not used here.
**`cache: None`** — no pipeline cache. A pipeline cache stores compiled shader
binaries to speed up subsequent pipeline creation. Useful when creating many
pipelines dynamically; for a single pipeline, caching has no practical benefit.

## S7: The Render Loop — Recording and Submitting Commands

New concept: **command buffers are scripts, not function calls.** You cannot call
GPU operations directly from CPU code. Instead, you record commands into a
[command buffer](concepts/GLOSSARY.md#command-buffer) — a script that the GPU
queue executes asynchronously. Think of it like building an assembly listing:
each recording method appends an instruction. When the script is complete, you
submit it as a single unit to the [queue](concepts/GLOSSARY.md#queue). The GPU
executes the recorded commands asynchronously with respect to the CPU. It may
overlap independent work internally, but the results are as if the commands ran
in the order you recorded them. There is no `.await` on a draw call:
`queue.submit()` returns as soon as the work is queued, not when the GPU has
finished it.

> **Key insight #4 — Command buffers are scripts, not function calls:**
> `create_command_encoder()` opens a recording session. `begin_render_pass()`
> starts a scoped drawing block. `render_pass.draw()` appends a draw command.
> `encoder.finish()` seals the script. `queue.submit()` dispatches it. The GPU
> executes it later, asynchronously. There is no `.await` on a draw call.
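
The record, seal, submit flow can be modeled without a GPU. The following is a minimal std-only sketch, not wgpu code: `Command`, `Encoder`, `CommandBuffer`, and `spawn_queue` are hypothetical names invented for illustration, and a worker thread stands in for the GPU queue.

```rust
use std::sync::mpsc;
use std::thread;

/// One recorded instruction. Illustrative variants, not wgpu types.
#[derive(Debug, PartialEq)]
enum Command {
    BeginPass,
    Draw { vertices: u32 },
    EndPass,
}

/// Recording session: each method appends to the script and executes nothing.
#[derive(Default)]
struct Encoder {
    commands: Vec<Command>,
}

/// A sealed script, ready to hand to the queue.
struct CommandBuffer(Vec<Command>);

impl Encoder {
    fn begin_pass(&mut self) {
        self.commands.push(Command::BeginPass);
    }
    fn draw(&mut self, vertices: u32) {
        self.commands.push(Command::Draw { vertices });
    }
    fn end_pass(&mut self) {
        self.commands.push(Command::EndPass);
    }
    /// Takes `self` by value, so the encoder cannot be recorded into afterwards.
    fn finish(self) -> CommandBuffer {
        CommandBuffer(self.commands)
    }
}

/// Stand-in for the GPU: a worker thread that replays submitted scripts in order.
fn spawn_queue() -> (mpsc::Sender<CommandBuffer>, thread::JoinHandle<Vec<Command>>) {
    let (tx, rx) = mpsc::channel::<CommandBuffer>();
    let worker = thread::spawn(move || {
        let mut executed = Vec::new();
        for buffer in rx {
            executed.extend(buffer.0);
        }
        executed
    });
    (tx, worker)
}

/// Records one frame, submits it, and returns what the worker executed.
fn record_and_submit() -> Vec<Command> {
    let (queue, worker) = spawn_queue();
    let mut encoder = Encoder::default();
    encoder.begin_pass();
    encoder.draw(3);
    encoder.end_pass();
    // `send` returns as soon as the script is queued, not when it has run.
    queue.send(encoder.finish()).expect("worker is alive");
    drop(queue); // close the channel so the worker loop ends
    worker.join().expect("worker did not panic")
}

fn main() {
    println!("{:?}", record_and_submit());
}
```

Nothing runs while the encoder methods are called; the worker only sees the commands after `send`, and it replays them in recorded order.
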

### The `render(&mut self)` Method Signature

```rust
fn render(&mut self) {
    // ...
}
```

This is a **fully synchronous** method. It runs on the winit event loop thread
(triggered by `RedrawRequested`), has no `async` keyword, no `.await`, and takes
no tokio handle. The wgpu recording and submission calls are synchronous and
cheap: they only encode instructions and push them to the queue, and they do not
wait for GPU completion. The one call in this method that does block is
`device.poll()`, covered in the step-by-step walkthrough.

### Acquiring a Back Buffer from the Swapchain

```rust
let status = self.surface.get_current_texture();
```

`get_current_texture()` is how you acquire a back buffer from the
[swapchain](concepts/GLOSSARY.md#swapchain). This is the framebuffer you render
into for this frame. In a triple-buffered swapchain (`PresentMode::Mailbox`),
there are up to two spare back buffers waiting for you. `get_current_texture()`
hands you the next available one.

In wgpu 29+, this method returns a `CurrentSurfaceTexture` **enum** — not a
`Result`. The swapchain can be in seven distinct states, and each one is an
ordinary return value that you handle with `match`, not an `Err`:

> **Key insight #5 — 7 swapchain states you must handle:** `Success(buf)` —
> render normally. `Suboptimal(buf)` — render, but reconfiguring is advisable.
> `Timeout` — skip frame (no buffer became available in time). `Occluded` —
> skip frame (window hidden behind another). `Outdated` — call `self.resize()`
> to reconfigure. `Lost` — log and skip; recovery requires re-creating the
> surface state. `Validation` — skip frame (API misuse; check logs).

WHY `match` on 7 variants: because the return type is an enum rather than a
`Result`, there is no `?` shortcut. A `match` without a wildcard arm must name
every variant, so the compiler rejects the code if you miss one.
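
To see the exhaustiveness check in isolation, here is a small std-only sketch. `Acquire` is a hypothetical local enum that only mirrors the seven variant names from the text (its `Validation` is a unit variant for brevity), and `FrameAction` and `decide` are illustrative names, not part of wgpu.

```rust
/// Local stand-in for the seven states described above. Not the wgpu type.
enum Acquire<T> {
    Success(T),
    Suboptimal(T),
    Timeout,
    Occluded,
    Outdated,
    Lost,
    Validation,
}

/// What the render loop decides to do for each state.
#[derive(Debug, PartialEq)]
enum FrameAction {
    Render,
    RenderThenReconfigure,
    SkipFrame,
    Reconfigure,
    Reinitialize,
}

/// No wildcard arm: deleting any line below turns into a compile error.
fn decide<T>(status: &Acquire<T>) -> FrameAction {
    match status {
        Acquire::Success(_) => FrameAction::Render,
        Acquire::Suboptimal(_) => FrameAction::RenderThenReconfigure,
        Acquire::Timeout | Acquire::Occluded | Acquire::Validation => FrameAction::SkipFrame,
        Acquire::Outdated => FrameAction::Reconfigure,
        Acquire::Lost => FrameAction::Reinitialize,
    }
}

fn main() {
    let status: Acquire<&str> = Acquire::Outdated;
    println!("{:?}", decide(&status));
}
```

Avoiding a `_ =>` arm is a deliberate choice here: if a later version of the enum gains a variant, the compiler points at this `match` instead of silently skipping frames.
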

### The Complete `render` Implementation

```rust
fn render(&mut self) {
    let status = self.surface.get_current_texture();

    match status {
        wgpu::CurrentSurfaceTexture::Success(surface_texture)
        | wgpu::CurrentSurfaceTexture::Suboptimal(surface_texture) => {
            // Block until previously submitted GPU work has finished, so wgpu
            // can run pending callbacks and release finished resources.
            if let Err(e) = self.device.poll(wgpu::Maintain::Wait) {
                log::error!("Device poll failed: {e}");
                return;
            }

            let texture_view = surface_texture.texture.create_view(&Default::default());

            let mut encoder = self.device.create_command_encoder(
                &wgpu::CommandEncoderDescriptor {
                    label: Some("Main Command Encoder"),
                },
            );

            {
                let mut render_pass = encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
                    label: Some("Main Render Pass"),
                    color_attachments: &[Some(wgpu::RenderPassColorAttachment {
                        view: &texture_view,
                        depth_slice: None,
                        resolve_target: None,
                        ops: wgpu::Operations {
                            load: wgpu::LoadOp::Clear(wgpu::Color {
                                r: 0.1,
                                g: 0.1,
                                b: 0.1,
                                a: 1.0,
                            }),
                            store: wgpu::StoreOp::Store,
                        },
                    })],
                    depth_stencil_attachment: None,
                    timestamp_writes: None,
                    occlusion_query_set: None,
                    multiview_mask: None,
                });

                render_pass.set_pipeline(&self.pipeline);
                render_pass.set_vertex_buffer(0, self.vertex_buffer.slice(..));
                render_pass.draw(0..3, 0..1);
            } // render_pass drops here — render pass ends automatically

            self.queue.submit(std::iter::once(encoder.finish()));
            surface_texture.present();
        }

        wgpu::CurrentSurfaceTexture::Timeout => {
            // No back buffer became available in time. Skip this frame.
            log::warn!("Surface status: Timeout — skipping frame");
        }

        wgpu::CurrentSurfaceTexture::Occluded => {
            // Window is fully occluded by another window. Skip rendering.
            log::debug!("Surface status: Occluded — skipping frame");
        }

        wgpu::CurrentSurfaceTexture::Outdated => {
            // Swapchain resolution no longer matches window. Reconfigure.
            log::warn!("Surface status: Outdated — resizing");
            if let Some(window) = &self.window {
                self.resize(window.inner_size());
            }
        }

        wgpu::CurrentSurfaceTexture::Lost => {
            // Display server restarted or GPU lost. Fatal without re-init.
            log::error!("Surface status: Lost — cannot recover without re-creating State");
        }

        wgpu::CurrentSurfaceTexture::Validation { source, description } => {
            // wgpu rejected the surface usage as invalid (API misuse).
            log::error!("Surface validation: {source} — {description}");
        }
    }
}
```

### Step by Step

**`surface.get_current_texture()`** — Acquires the next available back buffer
from the [swapchain](concepts/GLOSSARY.md#swapchain). The swapchain cycles through
2–3 pre-allocated back buffers. This call returns immediately if a buffer is
available. If none is free, it waits for the presentation engine to release
one, which is where the `Timeout` state comes from.

**`device.poll(wgpu::Maintain::Wait)`** — **Synchronous** call that blocks until
the GPU work submitted so far has completed. While polling, wgpu runs pending
callbacks (buffer-mapping callbacks, for example) and releases resources that
finished work no longer needs. Called once per frame. Returns
`Result<(), MaintainError>` — if the device is lost, you recover by
re-creating the device.

WHY this is synchronous: `poll()` does not spawn a task or use `.await`. It
waits on the backend's fence objects (Vulkan fences, for example) until all
in-flight work is done, then returns. On a busy GPU this can take a few
milliseconds per frame, and that is normal. The trade-off is that the CPU does
not start recording the next frame until the GPU has finished the previous one.
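
The blocking wait can be pictured as a fence: a counter of completed submissions that the waiting thread sleeps on. This is a std-only sketch of that idea, not how wgpu implements `poll()`. `Fence` and `submit_and_wait` are hypothetical names, and a sleeping thread plays the GPU.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Highest completed submission index, guarded by a mutex.
struct Fence {
    completed: Mutex<u64>,
    signal: Condvar,
}

impl Fence {
    fn new() -> Self {
        Fence { completed: Mutex::new(0), signal: Condvar::new() }
    }

    /// Called by the "GPU" when submission `index` has finished.
    fn complete(&self, index: u64) {
        let mut done = self.completed.lock().unwrap();
        *done = (*done).max(index);
        self.signal.notify_all();
    }

    /// Blocks the calling thread until submission `index` has finished.
    fn wait(&self, index: u64) -> u64 {
        let mut done = self.completed.lock().unwrap();
        while *done < index {
            done = self.signal.wait(done).unwrap();
        }
        *done
    }
}

/// Starts fake GPU work lasting `work`, then blocks until it is done.
fn submit_and_wait(work: Duration) -> u64 {
    let fence = Arc::new(Fence::new());
    let gpu_fence = Arc::clone(&fence);
    let gpu = thread::spawn(move || {
        thread::sleep(work); // pretend to execute a command buffer
        gpu_fence.complete(1);
    });
    let seen = fence.wait(1); // synchronous: no task, no `.await`
    gpu.join().unwrap();
    seen
}

fn main() {
    println!("completed submission {}", submit_and_wait(Duration::from_millis(5)));
}
```

The `while` loop around `Condvar::wait` guards against spurious wakeups, so `wait` only returns once the counter has actually reached the requested index.
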

**`texture.create_view(&Default::default())`** — A [texture view](concepts/GLOSSARY.md#texture-view)
is how wgpu references a texture's memory inside a render pass. Render pass
attachments do not accept a raw texture handle. They require a view that
describes the format, dimension, aspect, and the range of mip levels and array
layers to use. `Default::default()` creates a view of the whole texture: all mip
levels, all array layers, and all aspects.

**`device.create_command_encoder(&desc)`** — Opens a recording session. The
[command encoder](concepts/GLOSSARY.md#command-buffer) is where you append
instructions. Think of it as building a function body: you add statements, then
`finish()` closes the function and returns the compiled buffer.

**`encoder.begin_render_pass(&desc)`** — Starts a scoped drawing block. The
[render pass](concepts/GLOSSARY.md#render-pass) descriptor defines the target
attachments (color, depth, stencil). The returned `RenderPass` is a scoped
guard — when it drops, the render pass ends automatically.

### Render Pass Color Attachment

```rust
color_attachments: &[Some(wgpu::RenderPassColorAttachment {
    view: &texture_view,
    depth_slice: None,
    resolve_target: None,
    ops: wgpu::Operations {
        load: wgpu::LoadOp::Clear(wgpu::Color { r: 0.1, g: 0.1, b: 0.1, a: 1.0 }),
        store: wgpu::StoreOp::Store,
    },
})],
```

**`RenderPassColorAttachment` has exactly 4 fields:**

- **`view: &texture_view`** — the framebuffer we draw into. Its format must
  match the color target format in the
  [render pipeline](concepts/GLOSSARY.md#render-pipeline).
- **`depth_slice: None`** — selects one depth slice when the view is of a 3D
  texture. Not applicable to 2D rendering.
- **`resolve_target: None`** — only used for MSAA resolve. When multisampling
  is active, the render pass writes to a multisampled buffer and resolves into
  this target. We have no MSAA, so `None`.
- **`ops`** — [operations](concepts/GLOSSARY.md#render-pass) controlling load
  and store behavior. Two sub-fields:
  - **`load: LoadOp::Clear(color)`** — before drawing, fill the entire
    framebuffer with this color. **This IS your background color.** Dark gray.
    `LoadOp::Load` keeps existing pixels (used in UI compositing where you
    draw on top of previous content).
  - **`store: StoreOp::Store`** — after drawing, keep what was written. The
    GPU writes the result back to the texture so the swapchain can present it.
    `StoreOp::Discard` throws away the result. It is used for attachments whose
    contents are not needed after the pass, such as a multisampled color buffer
    that has already been resolved.
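
The load and store operations can be simulated on a two-pixel "texture". This is a std-only sketch: `Rgba`, `LoadOp`, `StoreOp`, and `run_pass` are local illustrative definitions that borrow the names from the text, not the wgpu types. On real hardware the contents after a discard are unspecified; this model assumes the old pixels survive.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rgba(f32, f32, f32, f32);

enum LoadOp {
    Clear(Rgba),
    Load,
}

enum StoreOp {
    Store,
    Discard,
}

/// Runs one "pass" over `target`: apply the load op, let `draw` write pixels,
/// then apply the store op. Returns what the texture holds afterwards.
fn run_pass(
    target: &[Rgba],
    load: LoadOp,
    store: StoreOp,
    draw: impl FnOnce(&mut [Rgba]),
) -> Vec<Rgba> {
    // Load op: start from the clear color or from the existing pixels.
    let mut working: Vec<Rgba> = match load {
        LoadOp::Clear(color) => vec![color; target.len()],
        LoadOp::Load => target.to_vec(),
    };
    draw(&mut working);
    match store {
        StoreOp::Store => working,
        // Assumption of this model only: a discarded pass leaves the old pixels.
        StoreOp::Discard => target.to_vec(),
    }
}

fn main() {
    let old = [Rgba(1.0, 0.0, 0.0, 1.0); 2];
    let gray = Rgba(0.1, 0.1, 0.1, 1.0);
    let white = Rgba(1.0, 1.0, 1.0, 1.0);
    let cleared = run_pass(&old, LoadOp::Clear(gray), StoreOp::Store, |px| px[0] = white);
    println!("{cleared:?}");
}
```

With `Clear`, the untouched pixel ends up gray (the background color); with `Load`, it keeps its previous red.
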

**`depth_stencil_attachment: None`** — No depth or stencil buffer. When you
have a depth texture, it goes here.

**`timestamp_writes: None`** — writes GPU timestamps at the start and end of the
pass for profiling. Requires a query set; not needed for this example.

**`occlusion_query_set: None`** — hardware occlusion queries (report whether any
fragments passed the depth test). Useful for visibility-based culling.

**`multiview_mask: None`** — multiview rendering mask for VR / multi-viewport.

### Binding State and Drawing

**`render_pass.set_pipeline(&self.pipeline)`** — Tells the GPU which
[render pipeline](concepts/GLOSSARY.md#render-pipeline) to use for subsequent
draw calls. The pipeline encapsulates the shader programs, vertex format,
primitive topology, and output configuration. Must be set before any draw call
in a render pass. Switching pipelines mid-pass is comparatively expensive and
should be minimized.

WHY this is necessary: the bound pipeline and buffers are state of the render
pass, not of the device. Every render pass starts with nothing bound, so you
must set the pipeline in every pass, which here means every frame.

**`render_pass.set_vertex_buffer(0, self.vertex_buffer.slice(..))`** — Binds the
[vertex buffer](concepts/GLOSSARY.md#vertex-buffer) to slot 0.
`buffer.slice(..)` creates a [buffer slice](concepts/GLOSSARY.md#buffer-slice)
covering the full buffer (equivalent to `buffer.slice(0..)`). Slot 0 corresponds
to the first layout in the pipeline's vertex buffer layouts array. If you had
multiple vertex buffers (e.g., separate position and instance buffers), you'd
bind them to slots 0, 1, etc.

**`render_pass.draw(0..3, 0..1)`** — The draw command. Two `Range<u32>`
arguments:
- First range `0..3` — vertex range. Draw vertices 0, 1, 2 (three vertices
  forming one triangle).
- Second range `0..1` — instance range. Draw instance 0 (one instance).

WHY two ranges: the vertex range controls which vertices from the buffer are
read. The instance range controls instanced rendering — the same geometry drawn
multiple times with different instance-data attributes. For a single triangle,
one draw call with `0..1` instances is correct.
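
One way to read the two ranges: the vertex shader runs once for every (instance, vertex) pair. The sketch below enumerates those pairs with plain Rust ranges. `vertex_invocations` is a hypothetical helper for illustration, and the instance-major ordering is an assumption of this model, not something the GPU promises.

```rust
use std::ops::Range;

/// Lists the (instance_index, vertex_index) pairs a draw call would cover.
fn vertex_invocations(vertices: Range<u32>, instances: Range<u32>) -> Vec<(u32, u32)> {
    let mut pairs = Vec::new();
    for instance in instances {
        // The same vertex range is replayed for every instance.
        for vertex in vertices.clone() {
            pairs.push((instance, vertex));
        }
    }
    pairs
}

fn main() {
    // Mirrors `draw(0..3, 0..1)`: three vertices, one instance.
    println!("{:?}", vertex_invocations(0..3, 0..1));
    // Two instances of the same triangle double the vertex shader invocations.
    println!("{}", vertex_invocations(0..3, 0..2).len());
}
```
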

**Render pass scope drop** — When the `render_pass` variable goes out of scope
(the closing `}` in the block), the drop implementation ends the render pass
and releases its mutable borrow of the encoder. If you forgot to set the
pipeline or bind a required buffer, wgpu may report the error only when the pass
ends, not at the `draw()` call that caused it.
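
The scoped-guard pattern is ordinary Rust RAII and can be shown without wgpu. In this sketch `Encoder` and `Pass` are hypothetical stand-ins: the guard borrows the encoder mutably, and its `Drop` implementation appends the end-of-pass marker.

```rust
/// Toy encoder that records a readable log instead of GPU commands.
struct Encoder {
    log: Vec<&'static str>,
}

/// Guard that holds the encoder mutably for the duration of the pass.
struct Pass<'a> {
    encoder: &'a mut Encoder,
}

impl Encoder {
    fn begin_pass(&mut self) -> Pass<'_> {
        self.log.push("begin pass");
        Pass { encoder: self }
    }

    /// Consumes the encoder; only callable once no `Pass` borrows it.
    fn finish(self) -> Vec<&'static str> {
        self.log
    }
}

impl Pass<'_> {
    fn draw(&mut self) {
        self.encoder.log.push("draw");
    }
}

impl Drop for Pass<'_> {
    fn drop(&mut self) {
        self.encoder.log.push("end pass");
    }
}

fn record_frame() -> Vec<&'static str> {
    let mut encoder = Encoder { log: Vec::new() };
    {
        let mut pass = encoder.begin_pass();
        pass.draw();
    } // `pass` drops here: "end pass" is appended and the borrow is released
    encoder.finish()
}

fn main() {
    println!("{:?}", record_frame());
}
```

Removing the inner braces makes `encoder.finish()` a compile error, because the guard would still hold the mutable borrow. That is the reason the render pass lives in its own block.
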

**`encoder.finish()`** — Seals the command encoder. Returns the finished
[command buffer](concepts/GLOSSARY.md#command-buffer) ready for submission.
`finish()` takes the encoder by value, so the encoder cannot be used again.

**`queue.submit(iter)`** — Dispatches one or more command buffers to the GPU.
Takes an iterator of command buffers. We submit exactly one: the frame's command
buffer. This is a fire-and-forget call — it queues the work and returns
immediately. The GPU executes it asynchronously, in parallel with whatever the
CPU does next.

**`surface_texture.present()`** — Queues the rendered back buffer for display.
This tells the swapchain: "this buffer is done, show it on screen." **If you
forget this, you render to a buffer nobody sees.** The swapchain moves the
buffer from "render target" to "front buffer" at the next opportunity the
present mode allows (the next vsync, for vsynced modes).

### Why the Match Arms Differ

- **`Success` / `Suboptimal`** — both deliver a `SurfaceTexture` you can render
  into. The difference: `Suboptimal` means the surface still works but no longer
  matches the window exactly (e.g., after a resize or rotation). You render
  normally but should reconfigure the surface soon.
- **`Timeout`** — no back buffer became available within the wait threshold.
  Skip the frame. The GPU will catch up.
- **`Occluded`** — another window fully covers your window. Skip rendering
  entirely — the display server will not show your output. Saves GPU work.
- **`Outdated`** — the swapchain was created for a resolution that no longer
  matches the window. Reconfigure the surface to match.
- **`Lost`** — the GPU or display server has been reset. Without re-creating
  the device and surface, you cannot recover. In a real application, you'd
  trigger a full re-initialization.
- **`Validation`** — wgpu rejected the surface configuration due to API misuse.
  Check logs for the description. This is a programming error, not a runtime
  condition.

Note: `pre_present_notify()` does **not** exist in wgpu 29. Do not call it on
wgpu types. (winit's `Window` has a method with that name; it is a separate
API.) In this render loop, `device.poll()` is the only explicit synchronization
point you need.

## S8: Handling Window Resize

WHY `surface.configure()` on resize: The swapchain allocates back buffers at a
fixed dimension. When the window size changes, the old back buffers no longer
match the window's display surface. Presenting a mismatched-size buffer gives
platform-dependent results: the display server clips, stretches, or rejects it.
`surface.configure()` allocates new back buffers matching the new dimensions and
discards the old ones.

WHY `width.max(1)`: On some display servers, minimizing a window briefly
reports `0 × 0` size before restoring. Configuring a surface with a zero
dimension is invalid and panics. The `resize` method guards against this twice:
it skips reconfiguration entirely when either dimension is zero, and it clamps
with `.max(1)` as a second line of defense in case the guard is ever removed.

WHY `std::mem::take(&mut self.config.view_formats)`: The `view_formats` field
of `SurfaceConfiguration` is an owned `Vec<TextureFormat>`. When constructing
the new configuration, you move the vector out of the old config rather than
cloning it. `mem::take` replaces the field with `Vec::new()` (zero allocation)
and returns the original vector. This avoids a clone, and the heap allocation
the clone would need, for what is typically a 1-element vec.
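
The move-instead-of-clone behavior is easy to check with a plain struct. In this std-only sketch, `Config` is a hypothetical stand-in for `SurfaceConfiguration` (with `u32` placeholders instead of texture formats) and `resized` is an illustrative helper.

```rust
/// Minimal stand-in for a surface configuration.
#[derive(Debug)]
struct Config {
    width: u32,
    height: u32,
    view_formats: Vec<u32>,
}

/// Builds the new config, moving `view_formats` out of the old one.
fn resized(old: &mut Config, width: u32, height: u32) -> Config {
    Config {
        width: width.max(1),
        height: height.max(1),
        // Leaves an empty Vec behind; the heap buffer itself is not copied.
        view_formats: std::mem::take(&mut old.view_formats),
    }
}

fn main() {
    let mut old = Config { width: 800, height: 600, view_formats: vec![7] };
    let buffer_before = old.view_formats.as_ptr();
    let new = resized(&mut old, 0, 720);
    // Same heap buffer, now owned by the new config.
    println!("same buffer: {}", new.view_formats.as_ptr() == buffer_before);
    println!("old: {old:?}, new: {new:?}");
}
```

Comparing `as_ptr()` before and after shows that the vector's heap buffer changed owners without being reallocated, and the zero width was clamped to 1.
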

```rust
fn resize(&mut self, size: winit::dpi::PhysicalSize<u32>) {
    // Ignore zero-sized events (e.g., while the window is minimized).
    if size.width > 0 && size.height > 0 {
        let config = wgpu::SurfaceConfiguration {
            usage: self.config.usage,
            format: self.config.format,
            width: size.width.max(1),
            height: size.height.max(1),
            present_mode: self.config.present_mode,
            desired_maximum_frame_latency: self.config.desired_maximum_frame_latency,
            alpha_mode: self.config.alpha_mode,
            // Move the Vec out of the old config instead of cloning it.
            view_formats: std::mem::take(&mut self.config.view_formats),
        };
        self.surface.configure(&self.device, &config);
        self.config = config;
    }
}
```

FIELD BY FIELD:

**`usage` / `format` / `present_mode` / `alpha_mode`** — carried over from the
old config unchanged. These properties are negotiated once at init time
and do not change on resize.

**`width` / `height`** — the new dimensions, clamped to at least 1.

**`desired_maximum_frame_latency`** — swapchain back-pressure setting. Kept from
the old config. This value controls how many frames the swapchain buffers
between CPU submission and GPU presentation. A value of 2 (triple buffering)
provides smooth frame pacing under variable CPU/GPU load. See S3 init step 5.

**`view_formats`** — additional texture formats the surface can create views
with. `std::mem::take()` moves the owned vector from the old config into the
new config. After `take()`, the old config's `view_formats` is an empty `Vec`.
This avoids a `clone()` of the vector. Since the old config is about to be
overwritten by `self.config = config`, the emptied field is irrelevant.

**`surface.configure(&self.device, &config)`** — takes a reference to the
`Device` and the new `SurfaceConfiguration`. This is not async. It allocates the
new swapchain buffers and replaces the old ones. Frames that were already
submitted and presented complete normally; the new buffers are available after
this call returns. Do not hold on to a `SurfaceTexture` from the old
configuration across this call.

### When `resize` Is Called

In our `App::window_event` handler (S2), the `WindowEvent::Resized(size)` arm
calls `state.resize(size)`. The resize fires once for every dimension
change. On fast window resizing, you may receive dozens of resize events in
quick succession, and each call discards the old buffers and allocates new
ones. For a single window this is normally cheap enough. Frames that were
already submitted finish against the old buffer dimensions. During a drag the
compositor may briefly stretch or clip a frame, which is expected and settles
as soon as the next frame is rendered at the new size.
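
If reconfiguring on every event ever shows up as a measurable cost, one option is to remember only the latest size and reconfigure once before the next frame. This is a sketch of that alternative, not what the tutorial's `resize` does: `Size` and `ResizeCoalescer` are hypothetical names, and `reconfigures` merely counts where a `surface.configure()` call would go.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Size {
    width: u32,
    height: u32,
}

/// Keeps only the most recent resize request until the next frame.
#[derive(Default)]
struct ResizeCoalescer {
    pending: Option<Size>,
    reconfigures: u32,
}

impl ResizeCoalescer {
    /// Call from the `Resized` event arm. Zero-sized events are ignored.
    fn on_resized(&mut self, size: Size) {
        if size.width > 0 && size.height > 0 {
            self.pending = Some(size);
        }
    }

    /// Call at the start of `render`. Returns the size to configure, if any.
    fn before_render(&mut self) -> Option<Size> {
        let size = self.pending.take()?;
        self.reconfigures += 1; // a real loop would reconfigure the surface here
        Some(size)
    }
}

fn main() {
    let mut coalescer = ResizeCoalescer::default();
    for width in [800, 900, 1024] {
        coalescer.on_resized(Size { width, height: 768 });
    }
    println!("{:?}", coalescer.before_render());
    println!("reconfigures: {}", coalescer.reconfigures);
}
```

Three events collapse into a single reconfiguration at the final size. The cost is one frame of latency between the event and the new buffers.
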