diff --git a/docs/01-rainbow-triangle.md b/docs/01-rainbow-triangle.md new file mode 100644 index 0000000..8a78303 --- /dev/null +++ b/docs/01-rainbow-triangle.md @@ -0,0 +1,440 @@ +# Building a Rainbow Triangle + +## S1: What We're Building + +We're creating a window containing a single triangle with smoothly blended colors: + +Red at the bottom-left corner, blue at the bottom-right corner, and green at the +top vertex. The gradient between each pair of vertices is not computed by you — +it is interpolated automatically by the GPU rasterizer in hardware. You provide +three vertices, each carrying a position and a color. The rasterizer determines +every pixel covered by the triangle and computes the color for that pixel as a +weighted average of the three vertex colors. The weights are the pixel's +barycentric coordinates: the closer a pixel lies to a vertex, the more that +vertex's color contributes. The result +is a smooth rainbow gradient across a single primitive. We do not need a texture, +a colormap, or a fragment shader with any branching — just three colored +vertices and the default linear interpolation the [rasterizer](concepts/GLOSSARY.md#rasterizer) +applies to every [varying](concepts/GLOSSARY.md#varying). + +If you haven't read the [concept overview](concepts/graphics-pipeline.md), do so +now. [Coordinate systems](concepts/coordinate-systems.md) explains how the GPU +positions geometry. [Shader basics](concepts/shader-basics.md) covers the GPU +programs that drive rendering. + +## S2: The winit Application and Event Loop + +New concept: **event-driven windowing.** winit is the bridge between your Rust +code and the display server (X11 or Wayland on Linux). Think of it like `epoll` +or `kqueue` but for windows, input, and display lifecycle events instead of file +descriptors. + +The entire program runs on the tokio async runtime: wgpu's [adapter](concepts/GLOSSARY.md#adapter) +queries and [device](concepts/GLOSSARY.md#device) creation are async, and the +runtime also hosts the thread that runs the winit event loop. 
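The callback-driven shape described above can be modelled without any windowing library. The sketch below is a toy that uses only the standard library: `ToyEvent`, `ToyHandler`, `ToyApp`, and `run_toy_loop` are hypothetical names invented for illustration and are not part of winit. It shows one idea only: the loop owns the event source and calls into your code, never the other way around.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a windowing event type (not winit's real enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyEvent {
    Resumed,
    Resized(u32, u32),
    RedrawRequested,
    CloseRequested,
}

/// Hypothetical handler trait: the loop calls us, we never call the loop.
trait ToyHandler {
    /// Returns `false` to ask the loop to stop.
    fn handle(&mut self, event: ToyEvent) -> bool;
}

#[derive(Default)]
struct ToyApp {
    frames: u32,
    size: (u32, u32),
}

impl ToyHandler for ToyApp {
    fn handle(&mut self, event: ToyEvent) -> bool {
        match event {
            ToyEvent::Resumed => self.size = (800, 600),
            ToyEvent::Resized(w, h) => self.size = (w, h),
            ToyEvent::RedrawRequested => self.frames += 1,
            ToyEvent::CloseRequested => return false,
        }
        true
    }
}

/// Dispatches queued events in order and returns how many were delivered.
/// A real event loop blocks on the display server instead of a finite queue.
fn run_toy_loop(mut queue: VecDeque<ToyEvent>, handler: &mut impl ToyHandler) -> usize {
    let mut dispatched = 0;
    while let Some(event) = queue.pop_front() {
        dispatched += 1;
        if !handler.handle(event) {
            break;
        }
    }
    dispatched
}

fn main() {
    let queue = VecDeque::from(vec![
        ToyEvent::Resumed,
        ToyEvent::Resized(1024, 768),
        ToyEvent::RedrawRequested,
        ToyEvent::RedrawRequested,
        ToyEvent::CloseRequested,
        ToyEvent::RedrawRequested, // never delivered: the loop has already stopped
    ]);
    let mut app = ToyApp::default();
    let dispatched = run_toy_loop(queue, &mut app);
    println!("dispatched={dispatched} frames={} size={:?}", app.frames, app.size);
}
```

The handler keeps all of its state in a struct and reacts to one event at a time, which is the same shape the real application code in this section takes.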
+ +### Architecture Overview + +- **`main()` is `#[tokio::main] async fn`** — the entry point runs on the tokio + runtime, giving us access to tokio's task scheduler and I/O facilities. +- **`tokio::task::spawn_blocking`** — winit's `event_loop.run_app()` is synchronous + and owns the display server connection. Blocking a tokio worker thread with + an indefinite sync call would starve other tasks. We offload the blocking event + loop to a dedicated thread, then await the join handle. +- **`Handle::block_on()` in `resumed()`** — wgpu initialization (adapter and + device queries) is async, but winit's `resumed()` handler is synchronous. We + bridge the two execution models exactly once at startup. This one-time GPU + setup blocks the event-loop thread only until the device is ready. +- **`Arc<Window>`** — shared reference count to the window, needed because both + winit event handlers and wgpu [surface](concepts/GLOSSARY.md#surface) state + must hold a reference to the same window object across the event loop + boundary. +- **`ControlFlow::Poll`** — continuous mode. winit keeps the event loop + iterating instead of sleeping until the next event arrives. Combined with the + `request_redraw()` call in our redraw handler, this gives us a tight render + loop without a separate timer. The display [present mode](concepts/GLOSSARY.md#present-mode) + controls the actual vsync behavior. + +### Dependencies + +Add these to your `Cargo.toml`: + +```toml +wgpu = "29" +winit = "0.30" +tokio = { version = "1", features = ["rt-multi-thread", "macros"] } +bytemuck = { version = "1", features = ["derive"] } +log = "0.4" +simple_logger = "5" +``` + +- `wgpu` — the GPU abstraction layer. Manages device lifecycles, shaders, buffers, + pipelines, and command encoding. +- `winit` — cross-platform window creation and event dispatch. Owns the display + server connection. +- `tokio` — async runtime for the main loop and all GPU queries. The default + `#[tokio::main]` flavor needs the `rt-multi-thread` feature. +- `bytemuck` — zero-copy casting between Rust structs and byte slices. 
Required + for uploading vertex data to GPU buffers without manual serialization. +- `log` / `simple_logger` — logging. wgpu and winit emit diagnostic + messages via `log` when misconfigurations or driver issues are detected. + +### Complete Code + +```rust +use std::sync::Arc; +use tokio::runtime::Handle; +use winit::application::ApplicationHandler; +use winit::dpi::LogicalSize; +use winit::event::WindowEvent; +use winit::event_loop::{ActiveEventLoop, ControlFlow, EventLoop}; +// Linux-only builder extension (an X11 twin lives in `winit::platform::x11`). +use winit::platform::wayland::EventLoopBuilderExtWayland; +use winit::window::{Window, WindowId}; + +#[tokio::main] +async fn main() { + simple_logger::init_with_level(log::Level::Debug).unwrap(); + + let handle = Handle::current(); + + tokio::task::spawn_blocking(move || { + // `EventLoop` is not `Send`, so it is built on the thread that runs it. + // `with_any_thread(true)` permits an event loop off the main thread. + let event_loop = EventLoop::builder().with_any_thread(true).build().unwrap(); + event_loop + .run_app(&mut App { + handle, + window: None, + state: None, + }) + .expect("event loop exited with an error"); + }) + .await + .unwrap(); +} + +struct App { + handle: Handle, + window: Option<Arc<Window>>, + state: Option<State>, +} + +impl ApplicationHandler<()> for App { + fn resumed(&mut self, event_loop_ctl: &ActiveEventLoop) { + let window = Arc::new( + event_loop_ctl + .create_window( + Window::default_attributes() + .with_inner_size(LogicalSize::new(800.0, 600.0)) + .with_title("Rainbow Triangle"), + ) + .unwrap(), + ); + event_loop_ctl.set_control_flow(ControlFlow::Poll); + self.window = Some(window.clone()); + + // Drive the async wgpu setup to completion from this sync callback. + self.state = Some( + self.handle + .block_on(State::new(window.clone())) + .expect("Failed to create wgpu State"), + ); + } + + fn window_event( + &mut self, + event_loop_ctl: &ActiveEventLoop, + _window_id: WindowId, + event: WindowEvent, + ) { + let Some(state) = self.state.as_mut() else { return }; + let Some(window) = self.window.as_ref() else { return }; + + match event { + WindowEvent::Resized(size) => state.resize(window, size), + WindowEvent::CloseRequested { .. 
} => event_loop_ctl.exit(), + WindowEvent::RedrawRequested => { + state.render(); + window.request_redraw(); + } + _ => {} + } + } + + fn exiting(&mut self, _event_loop_ctl: &ActiveEventLoop) { + // Release GPU state before the window it renders to. + self.state = None; + self.window = None; + } +} +``` + +**Why `spawn_blocking`:** `run_app()` does not return until the application +exits. If we called it directly inside the async `main`, it would occupy a +tokio worker thread for the life of the program. By running it on a +blocking-pool thread, the runtime's workers stay free for other async work and +future background tasks. + +**Why `Handle::block_on`:** wgpu's `request_adapter` and `request_device` return +futures, and a future only makes progress while something polls it. `resumed()` +is a synchronous callback, so it cannot `.await` them. `Handle::block_on` blocks +the calling thread (here, the blocking-pool thread that runs the event loop), +drives the future until it completes, then returns the result. + +**Why `ControlFlow::Poll`:** winit supports `ControlFlow::Poll` (keep iterating +the loop even when no events are pending) and `ControlFlow::Wait` (sleep until +the next event). A graphics application needs a steady render loop, so we use +`Poll` to keep the loop awake. The `RedrawRequested` events themselves come from +the `window.request_redraw()` call we make at the end of each frame. + +**Why `request_redraw()`:** After presenting a frame to the display, we ask +winit to schedule the next `RedrawRequested` frame. This creates an explicit +render loop: render → present → request redraw → render → repeat. The rate is +governed by the [swapchain](concepts/GLOSSARY.md#swapchain) [present mode](concepts/GLOSSARY.md#present-mode). + +**Why `exiting()`:** This is the last callback winit delivers before the event +loop shuts down. `CloseRequested` only asks the loop to exit; `exiting()` is +where the exit actually lands, so it is our one clean opportunity to release GPU +resources, which we do by dropping the wgpu state and then the window. 
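To make the sync-to-async bridge from the `Handle::block_on` note concrete, here is a minimal sketch of what "driving a future to completion from a blocking call" means, written with the standard library only. `toy_block_on`, `ThreadWaker`, `YieldOnce`, `fake_gpu_init`, and `sync_callback` are hypothetical names for this illustration. This is not how tokio implements `Handle::block_on` (tokio also drives I/O and timers); it only shows the poll, park, and wake cycle.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread blocked inside `toy_block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Polls `future` on the current thread until it is ready, sleeping in between.
fn toy_block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(), // sleep until the waker fires
        }
    }
}

/// Future that is pending exactly once, standing in for a driver round trip.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

/// Hypothetical async initializer, playing the role of the wgpu setup.
async fn fake_gpu_init() -> Result<&'static str, String> {
    YieldOnce(false).await;
    Ok("device ready")
}

/// A synchronous callback (like `resumed()`) that needs an async result.
fn sync_callback() -> Result<&'static str, String> {
    toy_block_on(fake_gpu_init())
}

fn main() {
    println!("{:?}", sync_callback());
}
```

The calling thread does nothing else while it waits, which is why the text above keeps this bridge on the blocking-pool thread rather than on a runtime worker.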
+ +## S3: Connecting to the GPU — The Init Chain + +New concept: **5-layer GPU connection.** Each layer adds a capability: + +1. **[Instance](concepts/GLOSSARY.md#instance)** — opens a connection to the + graphics driver. On Vulkan this loads the Vulkan loader and registers + instance-level extensions. On WebGL this picks the browser GPU context. +2. **[Surface](concepts/GLOSSARY.md#surface)** — binds the instance to a + specific window's swapchain. The surface is the wgpu representation of the + window's display buffer. +3. **[Adapter](concepts/GLOSSARY.md#adapter)** — selects the physical GPU + hardware. An adapter wraps the actual driver + silicon pair (e.g., Mesa RADV + on AMD, NVIDIA driver on NVIDIA silicon). +4. **[Device](concepts/GLOSSARY.md#device) + [Queue](concepts/GLOSSARY.md#queue)** — the + device owns all GPU resources (buffers, textures, shaders, pipelines). The + queue is the submission channel: you encode work into command buffers and + submit them to the queue. +5. **[SurfaceConfiguration](concepts/GLOSSARY.md#surface-configuration)** — + allocates the swapchain [framebuffers](concepts/GLOSSARY.md#framebuffer) for + this window at a specific resolution and pixel format. + +### The State Struct + +```rust +struct State { + surface: wgpu::Surface<'static>, + device: wgpu::Device, + queue: wgpu::Queue, + config: wgpu::SurfaceConfiguration, + pipeline: wgpu::RenderPipeline, + vertex_buffer: wgpu::Buffer, +} +``` + +- **`surface`** — connects to the window's display buffer. The `'static` lifetime + is possible because the surface is created from an `Arc<Window>` and keeps its + own clone, so the window cannot be dropped while the surface is alive. The + surface mediates all [swapchain](concepts/GLOSSARY.md#swapchain) + operations. +- **`device`** — owns all GPU resources. Every buffer, texture, shader module, + and pipeline created in this guide is created through the device. wgpu handles + are reference counted, so a resource is released once its last handle is + dropped. +- **`queue`** — the command submission channel. 
You encode a frame's worth of + work into a [command buffer](concepts/GLOSSARY.md#command-buffer), then submit + that buffer to the queue. The queue pushes work to the GPU hardware. +- **`config`** — holds the surface's current width, height, pixel format, and + [present mode](concepts/GLOSSARY.md#present-mode). When the window is resized, + we reconfigure the surface with updated dimensions. +- **`pipeline`** — the compiled [render pipeline](concepts/GLOSSARY.md#render-pipeline). + A render pipeline is an immutable configuration combining a shader, a vertex + buffer layout, a primitive topology, and a [color target](concepts/GLOSSARY.md#color-target) + setup. Switching pipelines has a cost, so applications typically keep the + number of pipelines small and group their draw calls by pipeline. +- **`vertex_buffer`** — GPU memory holding our vertex data. The GPU reads + position and color data directly from this buffer during the vertex shader + stage. + +### Complete `State::new()` Implementation + +```rust +// --- Vertex type and data --- + +#[repr(C)] +#[derive(Clone, Copy, bytemuck::Pod, bytemuck::Zeroable)] +struct Vertex { + position: [f32; 3], + color: [f32; 3], +} + +const VERTICES: &[Vertex] = &[ + Vertex { position: [-0.5, -0.5, 0.0], color: [1.0, 0.0, 0.0] }, // red + Vertex { position: [ 0.5, -0.5, 0.0], color: [0.0, 0.0, 1.0] }, // blue + Vertex { position: [ 0.0, 0.5, 0.0], color: [0.0, 1.0, 0.0] }, // green +]; + +impl State { + async fn new(window: Arc<Window>) -> Result<Self, String> { + // Step 1: Instance — connection to the graphics driver + let instance = wgpu::Instance::default(); + + // Step 2: Surface — binds our window to the GPU's swapchain. + // The surface keeps its own `Arc` clone; `window` is still used in Step 5. + let surface = instance + .create_surface(window.clone()) + .map_err(|e| format!("Failed to create surface: {:?}", e))?; + + // Step 3: Adapter — selects the physical GPU + let adapter = instance + .request_adapter(&wgpu::RequestAdapterOptions { + power_preference: wgpu::PowerPreference::HighPerformance, + 
force_fallback_adapter: false, + compatible_surface: None, + }) + .await + .map_err(|e| format!("No GPU adapter found ({:?}). Ensure Vulkan drivers are installed.", e))?; + + // Step 4: Device + Queue — resource owner + command submission + let (device, queue) = adapter + .request_device(&wgpu::DeviceDescriptor::default()) + .await + .map_err(|e| format!("Failed to request device: {:?}", e))?; + + // Step 5: SurfaceConfiguration — allocates swapchain framebuffers + let size = window.inner_size(); + let surface_caps = surface.get_capabilities(&adapter); + let format = surface_caps.formats.iter() + .find(|f| f.is_srgb()) + .copied() + .unwrap_or(surface_caps.formats[0]); + + let config = wgpu::SurfaceConfiguration { + usage: wgpu::TextureUsages::RENDER_ATTACHMENT | wgpu::TextureUsages::TEXTURE_BINDING, + format, + width: size.width.max(1), + height: size.height.max(1), + present_mode: wgpu::PresentMode::Mailbox, + desired_maximum_frame_latency: 2, + alpha_mode: surface_caps.alpha_modes[0], + view_formats: vec![format.add_srgb_suffix()], + }; + surface.configure(&device, &config); + + // Step 6: Compile the shader module + let shader_module = device.create_shader_module( + wgpu::ShaderModuleDescriptor { + label: Some("Rainbow Triangle Shader"), + source: wgpu::ShaderSource::Wgsl(include_str!("shader.wgsl").into()), + } + ); + + // Step 7: Upload vertex data to GPU memory + use wgpu::util::DeviceExt; + let vertex_buffer = device.create_buffer_init( + &wgpu::util::BufferInitDescriptor { + label: Some("Vertex Buffer"), + contents: bytemuck::cast_slice(VERTICES), + usage: wgpu::BufferUsages::VERTEX, + } + ); + + // Step 8: Create the render pipeline + let vertex_buffer_layout = wgpu::VertexBufferLayout { + // One vertex = position (3 x f32) followed by color (3 x f32). + array_stride: std::mem::size_of::<Vertex>() as u64, + step_mode: wgpu::VertexStepMode::Vertex, + attributes: &[ + wgpu::VertexAttribute { + offset: 0, + format: wgpu::VertexFormat::F32x3, + shader_location: 0, + }, + wgpu::VertexAttribute { + offset: std::mem::size_of::<[f32; 3]>() as u64, + 
format: wgpu::VertexFormat::F32x3, + shader_location: 1, + }, + ], + }; + + let pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor { + label: Some("Triangle Pipeline"), + layout: None, + vertex: wgpu::VertexState { + module: &shader_module, + entry_point: Some("vs_main"), + buffers: &[vertex_buffer_layout], + compilation_options: Default::default(), + }, + primitive: wgpu::PrimitiveState { + topology: wgpu::PrimitiveTopology::TriangleList, + strip_index_format: None, + front_face: wgpu::FrontFace::Ccw, + cull_mode: Some(wgpu::Face::Back), + unclipped_depth: false, + polygon_mode: wgpu::PolygonMode::Fill, + conservative: false, + }, + depth_stencil: None, + multisample: wgpu::MultisampleState { + count: 1, + mask: !0, + alpha_to_coverage_enabled: false, + }, + fragment: Some(wgpu::FragmentState { + module: &shader_module, + entry_point: Some("fs_main"), + targets: &[Some(wgpu::ColorTargetState { + format: config.format, + blend: None, + write_mask: wgpu::ColorWrites::ALL, + })], + compilation_options: Default::default(), + }), + multiview_mask: None, + cache: None, + }); + + Ok(Self { + surface, + device, + queue, + config, + pipeline, + vertex_buffer, + }) + } +} +``` + +### Init Steps Explained + +**Step 1 — Instance:** `Instance::default()` opens a connection to the graphics +driver on the current platform. On Linux with Vulkan, this loads `libvulkan.so` +and creates a Vulkan `VkInstance`. On Windows, the default backend set also +includes Direct3D 12, so Vulkan (`vulkan-1.dll`) is only one of the candidates. +The instance is the foundational wgpu object — every other wgpu operation +requires it. + +**Step 2 — Surface:** `instance.create_surface(window)` binds the wgpu instance +to the winit `Window`. This tells the GPU: "the pixels of *this* window will be +the output of my rendering." In Vulkan terms, this creates a `VkSurfaceKHR`; the +`VkSwapchainKHR` itself is created later, when the surface is configured in +Step 5. wgpu derives the platform surface type (X11, Wayland, Windows, macOS, +etc.) from the window's raw handles. 
+ +**Step 3 — Adapter:** `request_adapter()` queries available GPUs and returns the +best match for the given options. With +`PowerPreference::HighPerformance`, wgpu prefers a discrete GPU over an +integrated one on hybrid systems (e.g., NVIDIA + Intel Optimus). Passing +`compatible_surface: None` skips the check that the chosen adapter can present +to our surface. That usually works on a single-GPU Linux machine running Vulkan, +but it is not guaranteed; passing `Some(&surface)` instead asks wgpu to return +only adapters that can render to this surface. + +**Step 4 — Device + Queue:** `request_device()` allocates the logical GPU +resource manager and its submission queue. The device tracks all GPU memory and +validates API calls. The queue is the submission endpoint — every rendered frame +becomes a [command buffer](concepts/GLOSSARY.md#command-buffer) that is submitted +to this queue. On Vulkan, the device corresponds to `VkDevice` and the queue +to a `VkQueue`. + +**Step 5 — SurfaceConfiguration:** This allocates the +[swapchain](concepts/GLOSSARY.md#swapchain) [framebuffers](concepts/GLOSSARY.md#framebuffer). +We negotiate the pixel format with the driver (preferring an +[sRGB](concepts/GLOSSARY.md#srgb) format for correct color display), pick the +window dimensions (clamped to at least 1x1, because a zero-sized surface is +invalid and a minimized window can report a size of zero on some platforms), and +select the [present mode](concepts/GLOSSARY.md#present-mode). +`PresentMode::Mailbox` presents the most recently completed frame at each +vertical blank, so it avoids tearing without stalling the renderer on the +display's refresh rate. It is not available on every platform; +`PresentMode::Fifo` is the only mode wgpu documents as supported everywhere. +`desired_maximum_frame_latency: 2` is a hint for how many frames may be queued +for presentation ahead of the display. + +Steps 6 through 8 — shader module compilation, vertex buffer upload, and render +pipeline assembly — will be explored in detail in the next sections.
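The selection logic from Step 5 can be exercised on its own, without a GPU. The sketch below is a standard-library-only model under stated assumptions: `ToyFormat`, `ToyPresentMode`, `choose_format`, `choose_present_mode`, and `clamp_size` are hypothetical stand-ins invented here, not wgpu types or functions. It mirrors the "prefer sRGB, else take the first format" rule and the 1x1 clamp. It also adds a fallback to a FIFO-style mode, which is an assumption of this sketch and not something the code above does.

```rust
/// Hypothetical stand-in for a surface texture format.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyFormat {
    Bgra8Unorm,
    Bgra8UnormSrgb,
    Rgba8Unorm,
}

impl ToyFormat {
    fn is_srgb(self) -> bool {
        matches!(self, ToyFormat::Bgra8UnormSrgb)
    }
}

/// Hypothetical stand-in for a present mode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyPresentMode {
    Fifo,
    Mailbox,
}

/// Prefers the first sRGB format; otherwise falls back to the first format.
/// Returns `None` for an empty list instead of panicking on index 0.
fn choose_format(supported: &[ToyFormat]) -> Option<ToyFormat> {
    supported
        .iter()
        .copied()
        .find(|f| f.is_srgb())
        .or_else(|| supported.first().copied())
}

/// Uses the preferred mode when it is supported, otherwise falls back to `Fifo`.
fn choose_present_mode(supported: &[ToyPresentMode], preferred: ToyPresentMode) -> ToyPresentMode {
    if supported.contains(&preferred) {
        preferred
    } else {
        ToyPresentMode::Fifo
    }
}

/// Clamps each dimension to at least 1 so a minimized window stays configurable.
fn clamp_size(width: u32, height: u32) -> (u32, u32) {
    (width.max(1), height.max(1))
}

fn main() {
    let formats = [ToyFormat::Rgba8Unorm, ToyFormat::Bgra8Unorm, ToyFormat::Bgra8UnormSrgb];
    let modes = [ToyPresentMode::Fifo];
    println!("format: {:?}", choose_format(&formats));
    println!("present mode: {:?}", choose_present_mode(&modes, ToyPresentMode::Mailbox));
    println!("size: {:?}", clamp_size(0, 0));
}
```

Returning an `Option` from the format choice and falling back on the present mode are design choices of this sketch; they trade a panic at configure time for a decision the caller can inspect.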