docs: update tutorial to wgpu 29 / winit 0.30 APIs and pollster async pattern

2026-05-31 22:29:13 -05:00
parent b557ea8a1e
commit 5a7cb22bf2


@@ -27,22 +27,18 @@ code and the display server (X11 or Wayland on Linux). Think of it like `epoll`
or `kqueue` but for windows, input, and display lifecycle events instead of file
descriptors.
The entire program runs on the tokio async runtime — wgpu's [adapter](concepts/GLOSSARY.md#adapter)
queries and [device](concepts/GLOSSARY.md#device) creation are async, and the
runtime is the natural home for the main event loop.
GPU initialization (adapter and device queries) is async, while the winit event loop runs synchronously. We bridge these two execution models once at startup using `pollster::block_on`.
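To make the bridge concrete, here is a minimal sketch of what a single-threaded executor does when it blocks on a future. It uses only the standard library and is not `pollster`'s actual implementation; `block_on_sketch`, `ThreadWaker`, and `init_sketch` are hypothetical names introduced for illustration, with `init_sketch` standing in for an async initializer such as `State::new`.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Unparks the blocked thread when the future signals it can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical stand-in for a `block_on` function: drives one future to
/// completion on the calling thread, with no background runtime.
fn block_on_sketch<F: Future>(future: F) -> F::Output {
    // Pin the future on the stack so it can be polled.
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until the waker unparks this thread, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical async initializer (an assumption, not the tutorial's `State::new`).
async fn init_sketch() -> Result<u32, String> {
    Ok(29)
}
```

Calling `block_on_sketch(init_sketch())` from a synchronous function returns the `Result` directly, which is the same shape of call the tutorial makes from `resumed()`.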
### Architecture Overview
- **`main()` is `#[tokio::main] async fn`** — the entry point runs on the tokio
runtime, giving us access to tokio's task scheduler and I/O facilities.
- **`tokio::spawn_blocking`** — winit's `event_loop.run_app()` is synchronous
and owns the display server connection. Blocking the tokio runtime thread with
an indefinite sync call would starve other tasks. We offload the blocking event
loop to a dedicated thread, then await the join handle.
- **`Handle::block_on()` in `resumed()`** — wgpu initialization (adapter and
- **`main()` is synchronous** — the entry point initializes logging, creates
the event loop, and calls `run_app()`. The entire program runs on a single
thread. No async runtime is needed for the main loop.
- **`pollster::block_on()` in `resumed()`** — wgpu initialization (adapter and
device queries) is async, but winit's `resumed()` handler is synchronous. We
bridge the two execution models exactly once at startup. This initial GPU
setup takes ~50ms of wall time.
bridge the two execution models exactly once at startup using `pollster`,
a minimal single-threaded async executor. This initial GPU setup takes
~50ms of wall time.
- **`Arc<Window>`** — shared reference count to the window, needed because both
winit event handlers and wgpu [surface](concepts/GLOSSARY.md#surface) state
must hold a reference to the same window object across the event loop
@@ -60,17 +56,16 @@ Add these to your `Cargo.toml`:
```toml
wgpu = "29"
winit = "0.30"
tokio = { version = "1", features = ["rt", "macros"] }
pollster = "0.4"
bytemuck = { version = "1", features = ["derive"] }
log = "0.4"
simple_logger = "5"
```
- `wgpu` — the GPU abstraction layer. Manages device lifecycles, shaders, buffers,
pipelines, and command encoding.
- `winit` — cross-platform window creation and event dispatch. Owns the display
server connection.
- `tokio` — async runtime for the main loop and all GPU queries.
- `pollster` — minimal single-threaded async executor. Bridges wgpu's async GPU queries with synchronous winit callbacks. Polls futures to completion (~50ms) during initial GPU setup, then returns.
- `bytemuck` — zero-copy casting between Rust structs and byte slices. Required
for uploading vertex data to GPU buffers without manual serialization.
- `log` / `simple_logger` — structured logging. wgpu and winit emit diagnostic
@@ -79,6 +74,7 @@ simple_logger = "5"
### Complete Code
```rust
use pollster::block_on;
use std::sync::Arc;
use winit::application::ApplicationHandler;
use winit::dpi::LogicalSize;
@@ -86,26 +82,19 @@ use winit::event::WindowEvent;
use winit::event_loop::{ActiveEventLoop, ControlFlow, EventLoop};
use winit::window::{Window, WindowId};
#[tokio::main]
async fn main() {
fn main() {
simple_logger::init_with_level(log::Level::Debug).unwrap();
let event_loop = EventLoop::new().unwrap();
let handle = tokio::Handle::current();
tokio::spawn_blocking(move || {
event_loop.run_app(&mut App {
handle,
window: None,
state: None,
})
event_loop.run_app(&mut App {
window: None,
state: None,
})
.await
.unwrap();
}
struct App {
handle: tokio::Handle,
window: Option<Arc<Window>>,
state: Option<State>,
}
@@ -125,11 +114,10 @@ impl ApplicationHandler<()> for App {
self.window = Some(window.clone());
self.state = Some(
self.handle
.block_on(async {
State::new(window.clone()).await.expect("Failed to create wgpu State")
})
.expect("Failed to create wgpu State"),
block_on(State::new(window.clone()))
.expect("Failed to create wgpu State"),
);
}
@@ -159,13 +147,9 @@ impl ApplicationHandler<()> for App {
}
```
> **WHY: `spawn_blocking` for winit**
> **WHY: `pollster::block_on` for async GPU init**
>
> The display server event loop must run to completion and cannot be interrupted. If we ran `run_app()` on the tokio runtime thread, no other async tasks could execute. By spawning it on a blocking thread, the tokio runtime remains free for GPU queries, driver I/O, and future background tasks.
> **WHY: `Handle::block_on` for async GPU init**
>
> wgpu's `request_adapter` and `request_device` query the driver over async D-Bus/Wayland/Vulkan entrypoints. These futures must be polled by a runtime executor. `block_on` attaches temporarily to the runtime thread via its handle, polls the future to completion (~50ms), then returns the result.
> wgpu's `request_adapter` and `request_device` are async functions, and the futures they return must be polled by an executor. We use `pollster`, a minimal single-threaded async executor, to bridge wgpu's async GPU initialization with winit's synchronous `resumed()` callback. `pollster::block_on` polls the future to completion (~50ms) on the current thread, then returns the result: no background runtime, no spawn overhead, no cross-thread communication.
> **WHY: `ControlFlow::Poll` for the render loop**
>
@@ -307,7 +291,7 @@ const VERTICES: &[Vertex] = &[
impl State {
async fn new(window: Arc<Window>) -> Result<Self, String> {
// Step 1: Instance — connection to the graphics driver
let instance = wgpu::Instance::default();
let instance = wgpu::Instance::new(wgpu::InstanceDescriptor::new_without_display_handle());
// Step 2: Surface — binds our window to the GPU's swapchain
let surface = instance
@@ -375,12 +359,12 @@ impl State {
attributes: &[
wgpu::VertexAttribute {
offset: 0,
format: wgpu::VertexFormat::F32x3,
format: wgpu::VertexFormat::Float32x3,
shader_location: 0,
},
wgpu::VertexAttribute {
offset: std::mem::size_of::<[f32; 3]>() as u64,
format: wgpu::VertexFormat::F32x3,
format: wgpu::VertexFormat::Float32x3,
shader_location: 1,
},
],
@@ -392,7 +376,7 @@ impl State {
vertex: wgpu::VertexState {
module: &shader_module,
entry_point: Some("vs_main"),
buffers: &[vertex_buffer_layout],
buffers: &[Some(&vertex_buffer_layout)],
compilation_options: Default::default(),
},
primitive: wgpu::PrimitiveState {
@@ -415,7 +399,7 @@ impl State {
entry_point: Some("fs_main"),
targets: &[Some(wgpu::ColorTargetState {
format: config.format,
blend: None,
blend: Some(wgpu::BlendState::REPLACE),
write_mask: wgpu::ColorWrites::ALL,
})],
compilation_options: Default::default(),
@@ -751,7 +735,7 @@ the master `State::new()` block (S3, Step 8):
the vertex shader processes. The other option is `Instance`, which advances
per draw instance in instanced rendering. For a single triangle, `Vertex` is
correct: each of the three vertices has its own position and color.
- **First attribute — `shader_location: 0`**: reads 3 floats (`F32x3`) at byte
- **First attribute — `shader_location: 0`**: reads 3 floats (`Float32x3`) at byte
offset 0 of each vertex. These 3 floats map to the
[shader location](concepts/GLOSSARY.md#shader-location) `@location(0)` in the
vertex shader — the `position` parameter. The GPU delivers `[x, y, z]` to
@@ -789,7 +773,7 @@ must provide a `RenderPipelineLayout` created with `device.create_render_pipelin
- **`entry_point: Some("vs_main")`** — selects which function in the module is
the vertex shader entry point. Must match the `@vertex fn vs_main(...)`
declaration exactly.
- **`buffers: &[vertex_buffer_layout]`** — array of vertex buffer layouts.
- **`buffers: &[Some(&vertex_buffer_layout)]`** — array of optional vertex buffer layouts. Each layout is wrapped in `Some` to indicate it is present.
Multiple layouts are rarely needed (multi-mesh, GPU instancing with separate
instance buffers). For a single vertex buffer, one layout suffices.
- **`compilation_options: Default::default()`** — shader compilation backend
@@ -846,10 +830,11 @@ draws at the same pixel. For a single triangle this is not a concern.
`SurfaceConfiguration`. The pipeline writes in this format; the surface
reads in this format. A mismatch at render time produces an error. If
you change the surface format, you must recreate the pipeline.
- **`blend: None`** — disables blending. Without blending, every fragment
color replaces the existing framebuffer pixel (`REPLACE` mode). With
blending, new and existing colors are combined according to a blend
equation (useful for transparency).
- **`blend: Some(wgpu::BlendState::REPLACE)`** — explicitly replaces every
covered framebuffer pixel with the fragment's output color. `None` disables
blending and gives the same result, but we make it explicit for clarity.
With a custom blend state, new and
existing colors can be combined according to a blend equation
(useful for transparency).
- **`write_mask: ColorWrites::ALL`** — write all four RGBA channels.
You can mask out individual channels (e.g., write only R and G) if you
need to preserve certain framebuffer channels across draw calls.
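
The blend equation mentioned above can be sketched on the CPU. This is an illustrative model, not wgpu code: it assumes an additive blend of the form `src * src_factor + dst * dst_factor` applied to the color channels, and the names `blend_rgb`, `replace_blend`, and `alpha_over` are hypothetical.

```rust
/// RGB color with components in [0, 1].
type Rgb = [f32; 3];

/// Additive blend: out = src * src_factor + dst * dst_factor, per channel.
fn blend_rgb(src: Rgb, dst: Rgb, src_factor: f32, dst_factor: f32) -> Rgb {
    let mut out = [0.0; 3];
    for i in 0..3 {
        out[i] = src[i] * src_factor + dst[i] * dst_factor;
    }
    out
}

/// Replace: source factor 1, destination factor 0, so the fragment wins.
fn replace_blend(src: Rgb, dst: Rgb) -> Rgb {
    blend_rgb(src, dst, 1.0, 0.0)
}

/// Conventional source-over transparency: weight the fragment by its alpha
/// and the existing pixel by the remainder.
fn alpha_over(src: Rgb, src_alpha: f32, dst: Rgb) -> Rgb {
    blend_rgb(src, dst, src_alpha, 1.0 - src_alpha)
}
```

With a red fragment over a blue pixel, `replace_blend` yields pure red, while `alpha_over` at 50% alpha yields an even mix of red and blue.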
@@ -903,8 +888,8 @@ fn render(&mut self) {
```
This is a **fully synchronous** method. It runs on the winit event loop thread
(triggered by `RedrawRequested`), has no `async` keyword, no `.await`, and takes
no tokio handle. All wgpu recording and submission operations are synchronous
(triggered by `RedrawRequested`), has no `async` keyword, no `.await`, and requires
no async runtime. All wgpu recording and submission operations are synchronous
and fast — they only encode instructions and push them to the queue; they do not
wait for GPU completion.
@@ -997,12 +982,7 @@ fn render(&mut self) {
depth_slice: None,
resolve_target: None,
ops: wgpu::Operations {
load: wgpu::LoadOp::Clear(wgpu::Color {
r: 0.1,
g: 0.1,
b: 0.1,
a: 1.0,
}),
load: wgpu::LoadOp::Clear(wgpu::Color::BLACK),
store: wgpu::StoreOp::Store,
},
})],
@@ -1076,7 +1056,7 @@ color_attachments: &[Some(wgpu::RenderPassColorAttachment {
depth_slice: None,
resolve_target: None,
ops: wgpu::Operations {
load: wgpu::LoadOp::Clear(wgpu::Color { r: 0.1, g: 0.1, b: 0.1, a: 1.0 }),
load: wgpu::LoadOp::Clear(wgpu::Color::BLACK),
store: wgpu::StoreOp::Store,
},
})],
@@ -1094,7 +1074,7 @@ color_attachments: &[Some(wgpu::RenderPassColorAttachment {
- **`ops`** — [operations](concepts/GLOSSARY.md#operations) controlling load
and store behavior. Two sub-fields:
- **`load: LoadOp::Clear(color)`** — before drawing, fill the entire
framebuffer with this color. **This IS your background color.** Dark gray.
framebuffer with this color. **This IS your background color.** Black.
`LoadOp::Load` keeps existing pixels (used in UI compositing where you
draw on top of previous content).
- **`store: StoreOp::Store`** — after drawing, keep what was written. The
@@ -1209,7 +1189,7 @@ and returns the original vector. This avoids a heap allocation for what is
typically a 1-element vec.
```rust
fn resize(&mut self, size: wgpu::dpi::PhysicalSize<u32>) {
fn resize(&mut self, size: winit::dpi::PhysicalSize<u32>) {
if size.width > 0 && size.height > 0 {
let config = wgpu::SurfaceConfiguration {
usage: self.config.usage,
@@ -1300,7 +1280,7 @@ cargo run
module compilation log, pipeline creation messages, and the `simple_logger`
debug lines from surface status and device polling.
**Expected visual:** A dark gray background (from `LoadOp::Clear`) with a
**Expected visual:** A black background (from `LoadOp::Clear(wgpu::Color::BLACK)`) with a
rainbow triangle spanning most of the window. Red at the bottom-left corner,
blue at the bottom-right corner, green at the top vertex. Colors blend smoothly
across the triangle surface via hardware interpolation.
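
The smooth color gradient can be approximated on the CPU as a weighted average of the three vertex colors. The sketch below is a simplified model, assuming plain barycentric weights that sum to 1 and ignoring the perspective correction real hardware applies; `interpolate_color` and the color constants are hypothetical names, not part of the tutorial's code.

```rust
/// RGB color with components in [0, 1].
type Rgb = [f32; 3];

const RED: Rgb = [1.0, 0.0, 0.0];
const BLUE: Rgb = [0.0, 0.0, 1.0];
const GREEN: Rgb = [0.0, 1.0, 0.0];

/// Weighted average of three vertex colors. `weights` are the barycentric
/// coordinates of the pixel and are assumed to sum to 1.
fn interpolate_color(weights: [f32; 3], colors: [Rgb; 3]) -> Rgb {
    let mut out = [0.0; 3];
    for (weight, color) in weights.iter().zip(colors.iter()) {
        for channel in 0..3 {
            out[channel] += weight * color[channel];
        }
    }
    out
}
```

A pixel exactly on a vertex takes that vertex's color, and a pixel halfway along the bottom edge is an even mix of red and blue.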