Building a Rainbow Triangle

S1: What We're Building

We're creating a window containing a single triangle with smoothly blended colors:

Red at the bottom-left corner, blue at the bottom-right corner, and green at the top vertex. You do not compute the gradient between the vertices yourself; the GPU rasterizer interpolates it in hardware. You provide three vertices, each carrying a position and a color. The rasterizer determines every pixel covered by the triangle and computes that pixel's color as a weighted average of the three vertex colors. The weights are the pixel's barycentric coordinates: the closer a pixel lies to a vertex, the more that vertex's color contributes. The result is a smooth rainbow gradient across a single primitive. We do not need a texture, a colormap, or any branching in the fragment shader, just three colored vertices and the default interpolation the rasterizer applies to every value passed from the vertex shader to the fragment shader.
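
To make the interpolation concrete, here is a minimal CPU-side sketch of what the rasterizer does for a single pixel. The names (Point, Color, edge, barycentric, shade, TRIANGLE, COLORS) are illustrative and not part of wgpu. Real hardware also applies perspective correction, which changes nothing for this flat triangle.

```rust
/// 2D position in normalized device coordinates (illustrative alias).
type Point = [f32; 2];
/// RGB color with components in 0.0..=1.0 (illustrative alias).
type Color = [f32; 3];

/// The triangle and vertex colors used throughout this guide.
const TRIANGLE: [Point; 3] = [[-0.5, -0.5], [0.5, -0.5], [0.0, 0.5]];
const COLORS: [Color; 3] = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]];

/// Twice the signed area of triangle (a, b, c); positive when counter-clockwise.
fn edge(a: Point, b: Point, c: Point) -> f32 {
    (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
}

/// Barycentric weights of `p` relative to triangle (a, b, c).
/// Each weight is the area of the sub-triangle opposite a vertex,
/// divided by the area of the whole triangle, so the weights sum to 1.
fn barycentric(p: Point, a: Point, b: Point, c: Point) -> [f32; 3] {
    let area = edge(a, b, c);
    [
        edge(b, c, p) / area,
        edge(c, a, p) / area,
        edge(a, b, p) / area,
    ]
}

/// Color at `p`: the weighted average of the three vertex colors.
fn shade(p: Point, tri: [Point; 3], colors: [Color; 3]) -> Color {
    let w = barycentric(p, tri[0], tri[1], tri[2]);
    let mut out = [0.0; 3];
    for ch in 0..3 {
        out[ch] = w[0] * colors[0][ch] + w[1] * colors[1][ch] + w[2] * colors[2][ch];
    }
    out
}
```

For example, shade([0.0, -0.5], TRIANGLE, COLORS) samples the midpoint of the bottom edge and yields [0.5, 0.0, 0.5], an even mix of red and blue with no green.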

If you haven't read the concept overview, do so now. Coordinate systems explains how the GPU positions geometry. Shader basics covers the GPU programs that drive rendering.

S2: The winit Application and Event Loop

New concept: event-driven windowing. winit is the bridge between your Rust code and the display server (X11 or Wayland on Linux). Think of it like epoll or kqueue but for windows, input, and display lifecycle events instead of file descriptors.

The program starts on the tokio async runtime because wgpu's adapter and device requests are async. The winit event loop itself is synchronous, so it runs on a dedicated blocking thread managed by the runtime.

Architecture Overview

main() is #[tokio::main] async fn — the entry point runs on the tokio runtime, giving us access to tokio's task scheduler and I/O facilities.
tokio::task::spawn_blocking — winit's event_loop.run_app() is synchronous and does not return until the event loop exits. Calling it directly inside an async task would tie up a runtime worker thread for the life of the program, so we offload the event loop to a dedicated blocking thread and await its join handle.
Handle::block_on() in resumed() — wgpu initialization (adapter and device queries) is async, but winit's resumed() handler is synchronous. We bridge the two execution models exactly once at startup. This initial GPU setup takes ~50ms of wall time.
Arc<Window> — shared reference count to the window, needed because both winit event handlers and wgpu surface state must hold a reference to the same window object across the event loop boundary.
ControlFlow::Poll — the event loop never sleeps; it keeps iterating even when no events are pending. Poll does not generate RedrawRequested on its own. Those events come from window.request_redraw(), which our handler calls after every frame. Together they give us a tight render loop without a separate timer. The surface's present mode controls the actual vsync behavior.
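
The shared ownership behind Arc<Window> can be sketched with standard-library types only. FakeWindow and FakeSurface below are hypothetical stand-ins for winit's Window and wgpu's Surface; the point is only that the window stays alive as long as any holder keeps its Arc.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for winit's Window.
struct FakeWindow {
    title: String,
}

/// Hypothetical stand-in for GPU-side state that must keep the window alive.
struct FakeSurface {
    window: Arc<FakeWindow>,
}

/// Returns (count while shared, count after the app drops its handle, title).
fn share_window() -> (usize, usize, String) {
    let window = Arc::new(FakeWindow {
        title: "Rainbow Triangle".to_string(),
    });

    // The surface takes its own reference; both holders now share one window.
    let surface = FakeSurface {
        window: Arc::clone(&window),
    };
    let while_shared = Arc::strong_count(&window);

    // The application side lets go first; the window is still alive
    // because the surface holds the remaining reference.
    drop(window);
    let after_drop = Arc::strong_count(&surface.window);

    (while_shared, after_drop, surface.window.title.clone())
}
```

Calling share_window() reports two owners while both handles exist and one after the application's handle is dropped, and the title is still readable through the surface's reference.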

Dependencies

Add these to your Cargo.toml:

wgpu = "29"
winit = "0.30"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
bytemuck = { version = "1", features = ["derive"] }
log = "0.4"
simple_logger = "5"

wgpu — the GPU abstraction layer. Manages device lifecycles, shaders, buffers, pipelines, and command encoding.
winit — cross-platform window creation and event dispatch. Owns the display server connection.
tokio — async runtime for the main loop and all GPU queries.
bytemuck — zero-copy casting between Rust structs and byte slices. Required for uploading vertex data to GPU buffers without manual serialization.
log / simple_logger — structured logging. wgpu and winit emit diagnostic messages via log when misconfigurations or driver issues are detected.

Complete Code

use std::sync::Arc;
use winit::application::ApplicationHandler;
use winit::dpi::LogicalSize;
use winit::event::WindowEvent;
use winit::event_loop::{ActiveEventLoop, ControlFlow, EventLoop};
use winit::window::{Window, WindowId};

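#[tokio::main]
async fn main() {
    simple_logger::init_with_level(log::Level::Debug).unwrap();

    let handle = tokio::runtime::Handle::current();

    tokio::task::spawn_blocking(move || {
        // EventLoop is neither Send nor Sync, so it cannot be moved into this
        // closure; it has to be created on the thread that runs it. On Linux,
        // winit refuses to build an event loop off the main thread unless
        // with_any_thread(true) is set. This extension trait is Linux-only
        // (winit::platform::x11 has an equivalent); macOS requires the main thread.
        use winit::platform::wayland::EventLoopBuilderExtWayland;

        let event_loop = EventLoop::builder()
            .with_any_thread(true)
            .build()
            .expect("Failed to create event loop");

        let mut app = App {
            handle,
            window: None,
            state: None,
        };
        event_loop.run_app(&mut app).expect("Event loop error");
    })
    .await
    .unwrap();
}

struct App {
    handle: tokio::runtime::Handle,
    window: Option<Arc<Window>>,
    state: Option<State>,
}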

impl ApplicationHandler<()> for App {
    fn resumed(&mut self, event_loop_ctl: &ActiveEventLoop) {
        let window = Arc::new(
            event_loop_ctl
                .create_window(
                    Window::default_attributes()
                        .with_inner_size(LogicalSize::new(800.0, 600.0))
                        .with_title("Rainbow Triangle"),
                )
                .unwrap(),
        );
        event_loop_ctl.set_control_flow(ControlFlow::Poll);
        self.window = Some(window.clone());

        // Bridge sync -> async once: run the async GPU setup to completion here.
        self.state = Some(
            self.handle
                .block_on(State::new(window))
                .expect("Failed to create wgpu State"),
        );
    }

    fn window_event(
        &mut self,
        event_loop_ctl: &ActiveEventLoop,
        _window_id: WindowId,
        event: WindowEvent,
    ) {
        let Some(state) = self.state.as_mut() else { return };
        let Some(window) = self.window.as_ref() else { return };

        match event {
            WindowEvent::Resized(size) => state.resize(window, size),
            WindowEvent::CloseRequested => event_loop_ctl.exit(),
            WindowEvent::RedrawRequested => {
                state.render();
                window.request_redraw();
            }
            _ => {}
        }
    }

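    fn exiting(&mut self, _event_loop_ctl: &ActiveEventLoop) {
        // The loop is already shutting down, so calling exit() again is redundant.
        // Release the GPU state (which holds the surface) before the window.
        self.state = None;
        self.window = None;
    }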
}

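Why spawn_blocking: run_app() blocks its thread until the event loop exits. Calling it directly from async code would pin a tokio worker thread for the whole session and, on a single-threaded runtime, stop every other task from making progress. spawn_blocking moves it to a thread from tokio's blocking pool, which leaves the async workers free for future background tasks. One caveat: winit requires the event loop to be created on the thread that runs it, and some platforms (macOS in particular) insist that this is the main thread, so this layout only works where an off-main-thread event loop is permitted.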

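Why Handle::block_on: request_adapter and request_device return futures, and resumed() is a synchronous callback that cannot .await them. Handle::block_on runs a future to completion on the calling thread, with access to the runtime's timers and I/O drivers, then returns the result. The wait is short (the ~50ms quoted above); the async signatures exist largely because the same API also targets WebGPU in the browser, where these calls are genuinely asynchronous. Calling block_on is safe here because the event loop runs on a blocking thread rather than inside an async task; calling it from within an async context panics.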
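
The sync-to-async bridge can be illustrated without tokio. The sketch below is a standard-library stand-in, not tokio's implementation: block_on polls a future on the current thread and parks until woken, FakeGpuInit is a hypothetical future that is pending on its first poll (as a real adapter request might be), and run_on_blocking_thread plays the role of spawn_blocking plus resumed().

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the parked thread when the future can make progress again.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal block_on: poll the future on this thread until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until the waker calls unpark(), then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical stand-in for State::new(): pending once, then ready.
struct FakeGpuInit {
    polled_once: bool,
}

impl Future for FakeGpuInit {
    type Output = Result<&'static str, String>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.polled_once {
            Poll::Ready(Ok("device ready"))
        } else {
            self.polled_once = true;
            // Ask the executor to poll us again, as a driver callback would.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Runs the async setup from a dedicated, synchronous thread and joins it.
fn run_on_blocking_thread() -> Result<&'static str, String> {
    thread::spawn(|| block_on(FakeGpuInit { polled_once: false }))
        .join()
        .expect("blocking thread panicked")
}
```

The design point is that the blocking thread owns the wait. The async side never blocks on the window system, and the synchronous callback gets a finished value rather than a future it cannot await.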

Why ControlFlow::Poll: winit offers ControlFlow::Poll (keep iterating even when no events are pending), ControlFlow::Wait (sleep until the next event arrives), and a timed variant, ControlFlow::WaitUntil. Poll by itself does not emit RedrawRequested. The steady stream of redraws comes from calling window.request_redraw() at the end of every frame; Poll only guarantees that the loop never goes to sleep between frames. The surface's present mode paces how fast frames are actually shown.

Why request_redraw(): After presenting a frame to the display, we ask winit to schedule the next RedrawRequested event. This creates an explicit render loop: render → present → request redraw → render → repeat. The rate is governed by the surface's present mode.
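
The self-requeuing loop can be modeled with a plain event queue. This is a toy simulation under stated assumptions, not winit's implementation: Event and simulate are hypothetical names, and the close_after parameter simply stands in for the user closing the window after that many frames.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the two window events this guide handles.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    RedrawRequested,
    CloseRequested,
}

/// Runs the toy loop and returns how many frames were rendered.
/// `rerequest` controls whether the handler asks for another redraw
/// after each frame, which is the job of window.request_redraw().
fn simulate(close_after: u32, rerequest: bool) -> u32 {
    // The window system delivers one initial redraw when the window appears.
    let mut queue = VecDeque::from([Event::RedrawRequested]);
    let mut frames = 0;

    while let Some(event) = queue.pop_front() {
        match event {
            Event::RedrawRequested => {
                frames += 1; // render + present would happen here
                if frames == close_after {
                    queue.push_back(Event::CloseRequested); // user closes the window
                } else if rerequest {
                    queue.push_back(Event::RedrawRequested); // request_redraw()
                }
            }
            Event::CloseRequested => break, // event_loop.exit()
        }
    }
    frames
}
```

With rerequest set to true, simulate(5, true) renders five frames before the close event arrives. With it set to false, simulate(5, false) renders only the initial frame, because nothing schedules another redraw.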

Why exiting(): This is the last callback winit delivers before the event loop shuts down. On some display servers, CloseRequested fires on the window but the event loop must still drain. exiting() gives us one last clean opportunity to release GPU resources, such as the surface and device, while the window still exists.

S3: Connecting to the GPU — The Init Chain

New concept: 5-layer GPU connection. Each layer adds a capability:

Instance — opens a connection to the graphics driver. On Vulkan this loads the Vulkan loader and registers instance-level extensions. On WebGL this picks the browser GPU context.
Surface — binds the instance to a specific window's swapchain. The surface is the wgpu representation of the window's display buffer.
Adapter — selects the physical GPU hardware. An adapter wraps the actual driver + silicon pair (e.g., Mesa RADV on AMD, NVIDIA driver on NVIDIA silicon).
Device + Queue — the device owns all GPU resources (buffers, textures, shaders, pipelines). The queue is the submission channel: you encode work into command buffers and submit them to the queue.
SurfaceConfiguration — allocates the swapchain framebuffers for this window at a specific resolution and pixel format.

The State Struct

struct State {
    surface: wgpu::Surface<'static>,
    device: wgpu::Device,
    queue: wgpu::Queue,
    config: wgpu::SurfaceConfiguration,
    pipeline: wgpu::RenderPipeline,
    vertex_buffer: wgpu::Buffer,
}

surface — connects to the window's display buffer. The 'static lifetime is possible because create_surface takes ownership of an Arc<Window>, so the surface itself keeps the window alive for as long as it needs it. The surface mediates all swapchain operations.
device — creates and tracks all GPU resources. Every buffer, texture, shader module, and pipeline in this guide is created through the device. wgpu resources are reference-counted handles, so each one is released once its last handle is dropped.
queue — the command submission channel. You encode a frame's worth of work into a command buffer, then submit that buffer to the queue. The queue pushes work to the GPU hardware.
config — holds the surface's current width, height, pixel format, and present mode. When the window is resized, we reconfigure the surface with updated dimensions.
pipeline — the compiled render pipeline. A render pipeline is an immutable configuration combining shaders, a vertex buffer layout, a primitive topology, and a color target setup. Creating a pipeline is expensive, so most applications build a small set up front and switch between them from one draw call to the next.
vertex_buffer — GPU memory holding our vertex data. The GPU reads position and color data directly from this buffer during the vertex shader stage.
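
The byte layout the GPU sees in the vertex buffer can be checked on the CPU. The sketch below uses only the standard library. to_bytes is a hand-written stand-in for what bytemuck::cast_slice produces for this struct, and read_attribute mimics how a vertex attribute is located from a stride and an offset; both function names are illustrative.

```rust
use std::mem::size_of;

/// Same layout as the Vertex used in this guide.
#[repr(C)]
#[derive(Clone, Copy)]
struct Vertex {
    position: [f32; 3],
    color: [f32; 3],
}

/// Bytes from the start of one vertex to the start of the next.
const STRIDE: usize = size_of::<Vertex>();
/// The color attribute starts right after the three position floats.
const COLOR_OFFSET: usize = size_of::<[f32; 3]>();

/// Flattens vertices into the raw bytes that get uploaded to the GPU.
fn to_bytes(vertices: &[Vertex]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(vertices.len() * STRIDE);
    for v in vertices {
        for f in v.position.iter().chain(v.color.iter()) {
            bytes.extend_from_slice(&f.to_ne_bytes());
        }
    }
    bytes
}

/// Reads one three-float attribute the way the vertex fetch stage locates it:
/// vertex_index * stride + attribute offset.
fn read_attribute(bytes: &[u8], vertex_index: usize, offset: usize) -> [f32; 3] {
    let start = vertex_index * STRIDE + offset;
    let mut out = [0.0f32; 3];
    for (i, value) in out.iter_mut().enumerate() {
        let at = start + i * size_of::<f32>();
        *value = f32::from_ne_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]]);
    }
    out
}
```

For this struct the stride is 24 bytes and the color offset is 12 bytes. These are the values a vertex buffer layout needs for array_stride and for the offset of the attribute at shader location 1.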

Complete `State::new()` Implementation

use wgpu::Surface;

// --- Vertex type and data ---

#[repr(C)]
#[derive(Clone, Copy, bytemuck::Pod, bytemuck::Zeroable)]
struct Vertex {
    position: [f32; 3],
    color: [f32; 3],
}

const VERTICES: &[Vertex] = &[
    Vertex { position: [-0.5, -0.5, 0.0], color: [1.0, 0.0, 0.0] }, // red
    Vertex { position: [ 0.5, -0.5, 0.0], color: [0.0, 0.0, 1.0] }, // blue
    Vertex { position: [ 0.0,  0.5, 0.0], color: [0.0, 1.0, 0.0] }, // green
];

impl State {
    async fn new(window: Arc<Window>) -> Result<Self, String> {
        // Step 1: Instance — connection to the graphics driver
        let instance = wgpu::Instance::default();

        // Step 2: Surface — binds our window to the GPU's swapchain
        let surface = instance
            .create_surface(window.clone()) // clone: `window` is used again in Step 5
            .map_err(|e| format!("Failed to create surface: {:?}", e))?;

        // Step 3: Adapter — selects the physical GPU
        let adapter = instance
            .request_adapter(&wgpu::RequestAdapterOptions {
                power_preference: wgpu::PowerPreference::HighPerformance,
                force_fallback_adapter: false,
                compatible_surface: Some(&surface), // only adapters that can present to our window
            })
            .await
            // Recent wgpu releases return a Result here, not an Option.
            .map_err(|e| format!("No GPU adapter found ({:?}). Ensure Vulkan drivers are installed.", e))?;

        // Step 4: Device + Queue — resource owner + command submission
        let (device, queue) = adapter
            // Recent wgpu releases take only the descriptor (no separate trace argument).
            .request_device(&wgpu::DeviceDescriptor::default())
            .await
            .map_err(|e| format!("Failed to request device: {:?}", e))?;

        // Step 5: SurfaceConfiguration — allocates swapchain framebuffers
        let size = window.inner_size();
        let surface_caps = surface.get_capabilities(&adapter);
        let format = surface_caps.formats.iter()
            .find(|f| f.is_srgb())
            .copied()
            .unwrap_or(surface_caps.formats[0]);

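        let config = wgpu::SurfaceConfiguration {
            // We only render into the surface; TEXTURE_BINDING is not guaranteed for surfaces.
            usage: wgpu::TextureUsages::RENDER_ATTACHMENT,
            format,
            width: size.width.max(1),
            height: size.height.max(1),
            // Mailbox is optional; Fifo is the only mode every platform supports.
            present_mode: if surface_caps.present_modes.contains(&wgpu::PresentMode::Mailbox) {
                wgpu::PresentMode::Mailbox
            } else {
                wgpu::PresentMode::Fifo
            },
            desired_maximum_frame_latency: 2,
            alpha_mode: surface_caps.alpha_modes[0],
            // No extra view formats: the pipeline renders in `format` directly.
            view_formats: vec![],
        };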
        surface.configure(&device, &config);

        // Step 6: Compile the shader module
        let shader_module = device.create_shader_module(
            wgpu::ShaderModuleDescriptor {
                label: Some("Rainbow Triangle Shader"),
                source: wgpu::ShaderSource::Wgsl(include_str!("shader.wgsl").into()),
            }
        );

        // Step 7: Upload vertex data to GPU memory
        use wgpu::util::DeviceExt;
        let vertex_buffer = device.create_buffer_init(
            &wgpu::util::BufferInitDescriptor {
                label: Some("Vertex Buffer"),
                contents: bytemuck::cast_slice(VERTICES),
                usage: wgpu::BufferUsages::VERTEX,
            }
        );

        // Step 8: Create the render pipeline
        let vertex_buffer_layout = wgpu::VertexBufferLayout {
            array_stride: std::mem::size_of::<Vertex>() as u64,
            step_mode: wgpu::VertexStepMode::Vertex,
            attributes: &[
                wgpu::VertexAttribute {
                    offset: 0,
                    format: wgpu::VertexFormat::F32x3,
                    shader_location: 0,
                },
                wgpu::VertexAttribute {
                    offset: std::mem::size_of::<[f32; 3]>() as u64,
                    format: wgpu::VertexFormat::F32x3,
                    shader_location: 1,
                },
            ],
        };

        let pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
            label: Some("Triangle Pipeline"),
            layout: None,
            vertex: wgpu::VertexState {
                module: &shader_module,
                entry_point: Some("vs_main"),
                buffers: &[vertex_buffer_layout],
                compilation_options: Default::default(),
            },
            primitive: wgpu::PrimitiveState {
                topology: wgpu::PrimitiveTopology::TriangleList,
                strip_index_format: None,
                front_face: wgpu::FrontFace::Ccw,
                cull_mode: Some(wgpu::Face::Back),
                unclipped_depth: false,
                polygon_mode: wgpu::PolygonMode::Fill,
                conservative: false,
            },
            depth_stencil: None,
            multisample: wgpu::MultisampleState {
                count: 1,
                mask: !0,
                alpha_to_coverage_enabled: false,
            },
            fragment: Some(wgpu::FragmentState {
                module: &shader_module,
                entry_point: Some("fs_main"),
                targets: &[Some(wgpu::ColorTargetState {
                    format: config.format,
                    blend: None,
                    write_mask: wgpu::ColorWrites::ALL,
                })],
                compilation_options: Default::default(),
            }),
            multiview_mask: None,
            cache: None,
        });

        Ok(Self {
            surface,
            device,
            queue,
            config,
            pipeline,
            vertex_buffer,
        })
    }
}

Init Steps Explained

Step 1 — Instance: Instance::default() creates the wgpu entry point and selects among the graphics backends available on the current platform. On Linux with Vulkan, this loads libvulkan.so and creates a VkInstance. On Windows, wgpu can use Direct3D 12 or Vulkan (through vulkan-1.dll), depending on which backends are enabled. The instance is the foundational wgpu object: surfaces and adapters are both created from it.

Step 2 — Surface: instance.create_surface(window) binds the wgpu instance to the winit Window. This tells the GPU: "the pixels of this window will be the output of my rendering." In Vulkan terms, this creates a VkSurfaceKHR; the VkSwapchainKHR itself is created later, when the surface is configured in Step 5. The surface must match the window platform type exactly (X11, Wayland, Windows, macOS, etc.).

Step 3 — Adapter: request_adapter() queries available GPUs and returns the best match for the given options. With PowerPreference::HighPerformance, wgpu prefers a discrete GPU over an integrated one on hybrid systems (e.g., NVIDIA + Intel Optimus). The compatible_surface option controls whether the search is restricted to adapters that can present to our window. Some(&surface) enforces that restriction, while None skips the check and, on multi-GPU systems, can return an adapter whose surface capabilities turn out to be empty. Passing Some(&surface) is the safer choice whenever the adapter will render to a window.

Step 4 — Device + Queue: request_device() allocates the logical GPU resource manager and its submission queue. The device tracks all GPU memory and validates API calls. The queue is the submission endpoint — every rendered frame becomes a command buffer that is submitted to this queue. On Vulkan, the device corresponds to VkDevice and the queue to a VkQueue.

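Step 5 — SurfaceConfiguration: This allocates the swapchain framebuffers. We negotiate the pixel format with the driver (preferring an sRGB format for correct color display), pick the window dimensions (clamped to at least 1x1, because a minimized window can report 0x0 and a zero-sized surface cannot be configured), and select the present mode. PresentMode::Mailbox keeps only the most recently rendered frame queued for the next vertical blank, which avoids tearing without making the renderer wait. It does not cap the frame rate, and not every platform supports it. PresentMode::Fifo (classic vsync) is the only mode guaranteed to be available everywhere. desired_maximum_frame_latency: 2 is a hint that lets up to two frames be queued ahead of presentation, which smooths out frame time spikes at the cost of a little latency.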
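
The selection logic in Step 5 can be isolated from the GPU and tested on its own. The sketch below uses hypothetical Format and PresentMode enums as stand-ins for the wgpu types, with only a few representative variants; the three functions mirror the choices described above.

```rust
/// Hypothetical stand-in for a handful of surface texture formats.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Format {
    Bgra8Unorm,
    Bgra8UnormSrgb,
    Rgba16Float,
}

impl Format {
    fn is_srgb(self) -> bool {
        matches!(self, Format::Bgra8UnormSrgb)
    }
}

/// Hypothetical stand-in for the present modes discussed in this guide.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PresentMode {
    Fifo,
    Mailbox,
    Immediate,
}

/// Prefer the first sRGB format; otherwise fall back to the first format
/// listed. Returns None when the surface reports no formats at all.
fn choose_format(supported: &[Format]) -> Option<Format> {
    supported
        .iter()
        .copied()
        .find(|f| f.is_srgb())
        .or_else(|| supported.first().copied())
}

/// Use Mailbox when the surface offers it; otherwise fall back to Fifo.
fn choose_present_mode(supported: &[PresentMode]) -> PresentMode {
    if supported.contains(&PresentMode::Mailbox) {
        PresentMode::Mailbox
    } else {
        PresentMode::Fifo
    }
}

/// A surface cannot be configured with a zero dimension, so clamp to 1x1.
fn clamp_extent(width: u32, height: u32) -> (u32, u32) {
    (width.max(1), height.max(1))
}
```

Returning an Option from choose_format makes the "no compatible formats" case explicit, so the caller can report a clear error rather than index into an empty list.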

Steps 6 through 8 — shader module compilation, vertex buffer upload, and render pipeline assembly — will be explored in detail in the next sections.
