diff --git a/docs/01-rainbow-triangle.md b/docs/01-rainbow-triangle.md
new file mode 100644
index 0000000..8a78303
--- /dev/null
+++ b/docs/01-rainbow-triangle.md
@@ -0,0 +1,440 @@
+# Building a Rainbow Triangle
+
+## S1: What We're Building
+
+We're creating a window containing a single triangle with smoothly blended colors:
+
+Red at the bottom-left corner, blue at the bottom-right corner, and green at the
+top vertex. You do not compute the gradient between the vertices yourself; the
+GPU rasterizer interpolates it in hardware. You provide three vertices, each
+carrying a position and a color. The rasterizer determines every pixel covered
+by the triangle and computes that pixel's color as a weighted average of the
+three vertex colors, where the weights are the pixel's barycentric coordinates
+(the closer a pixel is to a vertex, the larger that vertex's weight). The result
+is a smooth rainbow gradient across a single primitive. We do not need a texture,
+a colormap, or a fragment shader with any branching, just three colored
+vertices and the default interpolation the [rasterizer](concepts/GLOSSARY.md#rasterizer)
+applies to every [varying](concepts/GLOSSARY.md#varying).
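+
+The blending the rasterizer performs can be sketched on the CPU. The following
+is a minimal illustration, not how any particular GPU implements it: the helper
+names (`edge`, `barycentric`, `interpolate`) are made up for this example, and
+it ignores perspective correction, which does not matter here because every
+vertex lies in the same plane.
+
+```rust
+/// A 2D position and an RGB color, both as plain arrays.
+type Point = [f32; 2];
+type Rgb = [f32; 3];
+
+/// Twice the signed area of the triangle (a, b, c).
+fn edge(a: Point, b: Point, c: Point) -> f32 {
+    (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
+}
+
+/// Barycentric weights of `p` relative to the three corners of `tri`.
+/// Each weight is the area of the sub-triangle opposite a corner,
+/// divided by the area of the whole triangle.
+fn barycentric(tri: [Point; 3], p: Point) -> [f32; 3] {
+    let area = edge(tri[0], tri[1], tri[2]);
+    [
+        edge(tri[1], tri[2], p) / area,
+        edge(tri[2], tri[0], p) / area,
+        edge(tri[0], tri[1], p) / area,
+    ]
+}
+
+/// Blend the three vertex colors with the barycentric weights of `p`.
+fn interpolate(tri: [Point; 3], colors: [Rgb; 3], p: Point) -> Rgb {
+    let w = barycentric(tri, p);
+    let mut out = [0.0; 3];
+    for channel in 0..3 {
+        out[channel] = w[0] * colors[0][channel]
+            + w[1] * colors[1][channel]
+            + w[2] * colors[2][channel];
+    }
+    out
+}
+
+fn main() {
+    // Same positions and colors as the triangle in this guide.
+    let tri = [[-0.5, -0.5], [0.5, -0.5], [0.0, 0.5]];
+    let colors = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]];
+    // The centroid is equally far from all corners, so it gets an even mix.
+    let centroid = [0.0, -0.5 / 3.0];
+    println!("{:?}", interpolate(tri, colors, centroid));
+}
+```
+
+At a corner one weight is 1 and the other two are 0, so the pixel takes that
+vertex's color exactly; everywhere else the three weights sum to 1.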
+
+If you haven't read the [concept overview](concepts/graphics-pipeline.md), do so
+now. [Coordinate systems](concepts/coordinate-systems.md) explains how the GPU
+positions geometry. [Shader basics](concepts/shader-basics.md) covers the GPU
+programs that drive rendering.
+
+## S2: The winit Application and Event Loop
+
+New concept: **event-driven windowing.** winit is the bridge between your Rust
+code and the display server (X11 or Wayland on Linux). Think of it like `epoll`
+or `kqueue` but for windows, input, and display lifecycle events instead of file
+descriptors.
+
+The entire program runs on the tokio async runtime — wgpu's [adapter](concepts/GLOSSARY.md#adapter)
+queries and [device](concepts/GLOSSARY.md#device) creation are async, and the
+runtime is the natural home for the main event loop.
+
+### Architecture Overview
+
+- **`main()` is `#[tokio::main] async fn`** — the entry point runs on the tokio
+  runtime, giving us access to tokio's task scheduler and I/O facilities.
+- **`tokio::task::spawn_blocking`** — winit's `event_loop.run_app()` is
+  synchronous and owns the display server connection. Calling it directly
+  inside `async fn main` would tie up a runtime worker thread for the life of
+  the program. We offload the blocking event loop to a dedicated thread, then
+  await the join handle.
+- **`Handle::block_on()` in `resumed()`** — wgpu initialization (adapter and
+  device requests) is async, but winit's `resumed()` handler is synchronous. We
+  bridge the two execution models exactly once at startup. This one-time GPU
+  setup takes on the order of 50 ms of wall time, so blocking here is acceptable.
+- **`Arc<Window>`** — shared reference count to the window, needed because both
+  winit event handlers and wgpu [surface](concepts/GLOSSARY.md#surface) state
+  must hold a reference to the same window object across the event loop
+  boundary.
+- **`ControlFlow::Poll`** — keeps the event loop iterating even when no events
+  are pending, instead of sleeping until the next one. `Poll` does not emit
+  `RedrawRequested` by itself; the continuous render loop comes from calling
+  `window.request_redraw()` after every frame. The surface's
+  [present mode](concepts/GLOSSARY.md#present-mode) controls the actual vsync
+  behavior.
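+
+The shared ownership behind `Arc<Window>` can be seen in isolation. This is a
+small sketch using only the standard library; `Window`, `App`, and `State` here
+are hypothetical stand-ins, not the winit and wgpu types used in the rest of
+this guide.
+
+```rust
+use std::sync::Arc;
+
+/// Hypothetical stand-in for `winit::window::Window`.
+struct Window {
+    title: String,
+}
+
+/// Hypothetical stand-ins for the two owners in this guide.
+struct App {
+    window: Arc<Window>,
+}
+struct State {
+    window: Arc<Window>,
+}
+
+/// Wrap the window once, then hand a reference-counted handle to each owner.
+fn share(window: Window) -> (App, State) {
+    let window = Arc::new(window);
+    let app = App { window: Arc::clone(&window) };
+    let state = State { window };
+    (app, state)
+}
+
+fn main() {
+    let (app, state) = share(Window { title: "Rainbow Triangle".into() });
+    // Both handles point at the same window object.
+    assert!(Arc::ptr_eq(&app.window, &state.window));
+    println!("{} has {} owners", app.window.title, Arc::strong_count(&app.window));
+    // Dropping one owner leaves the window alive for the other.
+    drop(state);
+    println!("{} has {} owner", app.window.title, Arc::strong_count(&app.window));
+}
+```
+
+Cloning an `Arc` copies a pointer and bumps a counter; the window itself is
+never duplicated, and it is destroyed only when the last handle is dropped.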
+
+### Dependencies
+
+Add these to your `Cargo.toml`:
+
+```toml
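+[dependencies]
+wgpu = "29"
+winit = "0.30"
+# "rt-multi-thread" is required by the default (multi-threaded) `#[tokio::main]`.
+tokio = { version = "1", features = ["rt-multi-thread", "macros"] }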
+bytemuck = { version = "1", features = ["derive"] }
+log = "0.4"
+simple_logger = "5"
+```
+
+- `wgpu` — the GPU abstraction layer. Manages device lifecycles, shaders, buffers,
+  pipelines, and command encoding.
+- `winit` — cross-platform window creation and event dispatch. Owns the display
+  server connection.
+- `tokio` — async runtime for the main loop and all GPU queries.
+- `bytemuck` — zero-copy casting between Rust structs and byte slices. Required
+  for uploading vertex data to GPU buffers without manual serialization.
+- `log` / `simple_logger` — structured logging. wgpu and winit emit diagnostic
+  messages via `log` when misconfigurations or driver issues are detected.
+
+### Complete Code
+
+```rust
+use std::sync::Arc;
+use winit::application::ApplicationHandler;
+use winit::dpi::LogicalSize;
+use winit::event::WindowEvent;
+use winit::event_loop::{ActiveEventLoop, ControlFlow, EventLoop};
+use winit::window::{Window, WindowId};
+
+#[tokio::main]
+async fn main() {
+    simple_logger::init_with_level(log::Level::Debug).unwrap();
+
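+    let event_loop = EventLoop::new().unwrap();
+    // `Handle` lives in `tokio::runtime`; the blocking thread uses it to reach the runtime.
+    let handle = tokio::runtime::Handle::current();
+
+    // `spawn_blocking` lives in `tokio::task`.
+    tokio::task::spawn_blocking(move || {
+        event_loop.run_app(&mut App {
+            handle,
+            window: None,
+            state: None,
+        })
+    })
+    .await
+    .unwrap() // the blocking task panicked or was cancelled
+    .unwrap(); // the event loop itself returned an error
+}
+
+struct App {
+    handle: tokio::runtime::Handle,
+    window: Option<Arc<Window>>,
+    state: Option<State>,
+}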
+
+impl ApplicationHandler<()> for App {
+    fn resumed(&mut self, event_loop_ctl: &ActiveEventLoop) {
+        let window = Arc::new(
+            event_loop_ctl
+                .create_window(
+                    Window::default_attributes()
+                        .with_inner_size(LogicalSize::new(800.0, 600.0))
+                        .with_title("Rainbow Triangle"),
+                )
+                .unwrap(),
+        );
+        event_loop_ctl.set_control_flow(ControlFlow::Poll);
+        self.window = Some(window.clone());
+
+        // Bridge sync -> async once: drive the async GPU setup to completion here.
+        // `State::new` returns a `Result`, so a single `expect` is enough.
+        self.state = Some(
+            self.handle
+                .block_on(State::new(window))
+                .expect("Failed to create wgpu State"),
+        );
+    }
+
+    fn window_event(
+        &mut self,
+        event_loop_ctl: &ActiveEventLoop,
+        _window_id: WindowId,
+        event: WindowEvent,
+    ) {
+        let Some(state) = self.state.as_mut() else { return };
+        let Some(window) = self.window.as_ref() else { return };
+
+        match event {
+            WindowEvent::Resized(size) => state.resize(window, size),
+            WindowEvent::CloseRequested { .. } => event_loop_ctl.exit(),
+            WindowEvent::RedrawRequested => {
+                state.render();
+                window.request_redraw();
+            }
+            _ => {}
+        }
+    }
+
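+    fn exiting(&mut self, _event_loop_ctl: &ActiveEventLoop) {
+        // The loop is already shutting down, so calling `exit()` again is redundant.
+        // Release the GPU state first, then our handle to the window.
+        self.state = None;
+        self.window = None;
+    }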
+}
+```
+
+**Why `spawn_blocking`:** `run_app()` does not return until the window closes.
+Calling it directly inside `async fn main` would occupy a runtime worker thread
+for the whole session and, on a single-threaded runtime, prevent every other
+task from running. By moving it to a blocking thread, the tokio runtime remains
+free for GPU setup and future background tasks.
+
+**Why `Handle::block_on`:** wgpu's `request_adapter` and `request_device` return
+futures so that the same API also works in browsers, where WebGPU exposes these
+calls as promises. A future does nothing until something polls it.
+`Handle::block_on` polls the future to completion on the current (blocking)
+thread, using the runtime behind the handle for any timers or I/O, then returns
+the result (~50ms).
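+
+What "polling a future to completion from synchronous code" means can be shown
+without tokio. The sketch below is a hand-rolled `block_on` built only on the
+standard library; it is an illustration of the idea, not tokio's
+implementation, and `init_gpu` and `resumed` are hypothetical stand-ins for
+`State::new` and the winit callback.
+
+```rust
+use std::future::Future;
+use std::pin::pin;
+use std::sync::Arc;
+use std::task::{Context, Poll, Wake, Waker};
+use std::thread::{self, Thread};
+
+/// Wakes the blocked thread when the future can make progress again.
+struct ThreadWaker(Thread);
+
+impl Wake for ThreadWaker {
+    fn wake(self: Arc<Self>) {
+        self.0.unpark();
+    }
+}
+
+/// Poll `fut` on the current thread until it completes.
+fn block_on<F: Future>(fut: F) -> F::Output {
+    let mut fut = pin!(fut);
+    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
+    let mut cx = Context::from_waker(&waker);
+    loop {
+        match fut.as_mut().poll(&mut cx) {
+            Poll::Ready(output) => return output,
+            // Sleep until the waker unparks this thread, then poll again.
+            Poll::Pending => thread::park(),
+        }
+    }
+}
+
+/// Hypothetical stand-in for the async `State::new`.
+async fn init_gpu() -> Result<&'static str, String> {
+    Ok("device ready")
+}
+
+/// Synchronous, like winit's `resumed()` callback.
+fn resumed() -> &'static str {
+    block_on(init_gpu()).expect("Failed to create wgpu State")
+}
+
+fn main() {
+    println!("{}", resumed());
+}
+```
+
+In the real program `Handle::block_on` plays the role of this loop, with the
+added benefit that futures depending on tokio's timers or I/O still work.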
+
+**Why `ControlFlow::Poll`:** winit supports `ControlFlow::Poll` (run the loop
+continuously) and `ControlFlow::Wait` (sleep until the next event). A graphics
+application needs a steady render loop, so we choose `Poll` to keep the loop
+from going idle between frames. The `RedrawRequested` events themselves come
+from `window.request_redraw()`, which we call inside the handler after every
+frame.
+
+**Why `request_redraw()`:** After presenting a frame to the display, we ask
+winit to schedule the next `RedrawRequested` event. This creates an explicit
+render loop: render → present → request redraw → render → repeat. The rate is
+governed by the [swapchain](concepts/GLOSSARY.md#swapchain) [present mode](concepts/GLOSSARY.md#present-mode).
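+
+The self-re-queueing pattern can be modelled with a plain event queue. This is
+a toy model under stated assumptions, not winit: the `Event` enum and the `run`
+function are invented for the example, the first `RedrawRequested` stands in
+for the one the windowing system sends when the window first appears, and a
+frame limit replaces the user closing the window.
+
+```rust
+use std::collections::VecDeque;
+
+/// Hypothetical stand-in for the two winit events that matter here.
+enum Event {
+    RedrawRequested,
+    CloseRequested,
+}
+
+/// Runs the toy loop and returns how many frames were rendered.
+fn run(max_frames: u32) -> u32 {
+    // The first redraw arrives without us asking for it.
+    let mut queue = VecDeque::from([Event::RedrawRequested]);
+    let mut frames = 0;
+
+    while let Some(event) = queue.pop_front() {
+        match event {
+            Event::RedrawRequested => {
+                frames += 1; // stands in for render + present
+                if frames < max_frames {
+                    // Equivalent of `window.request_redraw()`: queue the next frame.
+                    queue.push_back(Event::RedrawRequested);
+                } else {
+                    queue.push_back(Event::CloseRequested);
+                }
+            }
+            Event::CloseRequested => break,
+        }
+    }
+    frames
+}
+
+fn main() {
+    println!("rendered {} frames", run(3));
+}
+```
+
+If the handler ever skips the re-queue step, the queue drains and rendering
+stops, which is why the call sits at the end of every `RedrawRequested` arm.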
+
+**Why `exiting()`:** winit calls this once, after the event loop has been asked
+to stop and before `run_app()` returns. On some display servers,
+`CloseRequested` fires on the window but the event loop must still drain.
+`exiting()` is therefore the last hook in which the application can release
+GPU resources, for example by dropping `State` before the window goes away.
+
+## S3: Connecting to the GPU — The Init Chain
+
+New concept: **5-layer GPU connection.** Each layer adds a capability:
+
+1. **[Instance](concepts/GLOSSARY.md#instance)** — opens a connection to the
+   graphics driver. On Vulkan this loads the Vulkan loader and registers
+   instance-level extensions. On WebGL this picks the browser GPU context.
+2. **[Surface](concepts/GLOSSARY.md#surface)** — binds the instance to a
+   specific window's swapchain. The surface is the wgpu representation of the
+   window's display buffer.
+3. **[Adapter](concepts/GLOSSARY.md#adapter)** — selects the physical GPU
+   hardware. An adapter wraps the actual driver + silicon pair (e.g., Mesa RADV
+   on AMD, NVIDIA driver on NVIDIA silicon).
+4. **[Device](concepts/GLOSSARY.md#device) + [Queue](concepts/GLOSSARY.md#queue)** — the
+   device owns all GPU resources (buffers, textures, shaders, pipelines). The
+   queue is the submission channel: you encode work into command buffers and
+   submit them to the queue.
+5. **[SurfaceConfiguration](concepts/GLOSSARY.md#surface-configuration)** —
+   allocates the swapchain [framebuffers](concepts/GLOSSARY.md#framebuffer) for
+   this window at a specific resolution and pixel format.
+
+### The State Struct
+
+```rust
+struct State {
+    surface: wgpu::Surface<'static>,
+    device: wgpu::Device,
+    queue: wgpu::Queue,
+    config: wgpu::SurfaceConfiguration,
+    pipeline: wgpu::RenderPipeline,
+    vertex_buffer: wgpu::Buffer,
+}
+```
+
+- **`surface`** — connects to the window's display buffer. The `'static` lifetime
+  is possible because `create_surface` receives an `Arc<Window>`: the surface
+  keeps its own reference, so the window cannot be destroyed while the surface
+  is alive. The surface mediates all [swapchain](concepts/GLOSSARY.md#swapchain)
+  operations.
+- **`device`** — owns all GPU resources. Every buffer, texture, shader module,
+  and pipeline created in this guide is created through the device. wgpu
+  reference-counts these objects, so GPU memory is released once the last handle
+  to a resource is dropped.
+- **`queue`** — the command submission channel. You encode a frame's worth of
+  work into a [command buffer](concepts/GLOSSARY.md#command-buffer), then submit
+  that buffer to the queue. The queue pushes work to the GPU hardware.
+- **`config`** — holds the surface's current width, height, pixel format, and
+  [present mode](concepts/GLOSSARY.md#present-mode). When the window is resized,
+  we reconfigure the surface with updated dimensions.
+- **`pipeline`** — the compiled [render pipeline](concepts/GLOSSARY.md#render-pipeline).
+  A render pipeline is an immutable configuration combining a shader, a vertex
+  buffer layout, a primitive topology, and a [color target](concepts/GLOSSARY.md#color-target)
+  setup. Pipelines are expensive to create, so most applications build a small
+  set once at startup and switch among them between draw calls.
+- **`vertex_buffer`** — GPU memory holding our vertex data. The GPU reads
+  position and color data directly from this buffer during the vertex shader
+  stage.
+
+### Complete `State::new()` Implementation
+
+```rust
+use wgpu::Surface;
+
+// --- Vertex type and data ---
+
+#[repr(C)]
+#[derive(Clone, Copy, bytemuck::Pod, bytemuck::Zeroable)]
+struct Vertex {
+    position: [f32; 3],
+    color: [f32; 3],
+}
+
+const VERTICES: &[Vertex] = &[
+    Vertex { position: [-0.5, -0.5, 0.0], color: [1.0, 0.0, 0.0] }, // red
+    Vertex { position: [ 0.5, -0.5, 0.0], color: [0.0, 0.0, 1.0] }, // blue
+    Vertex { position: [ 0.0,  0.5, 0.0], color: [0.0, 1.0, 0.0] }, // green
+];
+
+impl State {
+    async fn new(window: Arc<Window>) -> Result<Self, String> {
+        // Step 1: Instance — connection to the graphics driver
+        let instance = wgpu::Instance::default();
+
+        // Step 2: Surface — binds our window to the GPU's swapchain
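+        // Clone the `Arc`: the surface keeps one handle, and we still need
+        // `window` below to read its size.
+        let surface = instance
+            .create_surface(window.clone())
+            .map_err(|e| format!("Failed to create surface: {:?}", e))?;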
+
+        // Step 3: Adapter — selects the physical GPU
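+        let adapter = instance
+            .request_adapter(&wgpu::RequestAdapterOptions {
+                power_preference: wgpu::PowerPreference::HighPerformance,
+                force_fallback_adapter: false,
+                // `None` skips the surface-compatibility check;
+                // `Some(&surface)` is the stricter option.
+                compatible_surface: None,
+            })
+            .await
+            // Recent wgpu releases return a `Result` here, not an `Option`.
+            .map_err(|e| format!("No GPU adapter found ({:?}). Ensure Vulkan drivers are installed.", e))?;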
+
+        // Step 4: Device + Queue — resource owner + command submission
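+        // Recent wgpu releases take only the descriptor; the old trace-path
+        // argument moved into `DeviceDescriptor`.
+        let (device, queue) = adapter
+            .request_device(&wgpu::DeviceDescriptor::default())
+            .await
+            .map_err(|e| format!("Failed to request device: {:?}", e))?;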
+
+        // Step 5: SurfaceConfiguration — allocates swapchain framebuffers
+        let size = window.inner_size();
+        let surface_caps = surface.get_capabilities(&adapter);
+        let format = surface_caps.formats.iter()
+            .find(|f| f.is_srgb())
+            .copied()
+            .unwrap_or(surface_caps.formats[0]);
+
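+        let config = wgpu::SurfaceConfiguration {
+            // RENDER_ATTACHMENT is all we need; other usages are not guaranteed
+            // to be supported by every surface.
+            usage: wgpu::TextureUsages::RENDER_ATTACHMENT,
+            format,
+            width: size.width.max(1),
+            height: size.height.max(1),
+            // Mailbox is optional; Fifo is supported on every surface.
+            present_mode: if surface_caps.present_modes.contains(&wgpu::PresentMode::Mailbox) {
+                wgpu::PresentMode::Mailbox
+            } else {
+                wgpu::PresentMode::Fifo
+            },
+            desired_maximum_frame_latency: 2,
+            alpha_mode: surface_caps.alpha_modes[0],
+            view_formats: vec![format.add_srgb_suffix()],
+        };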
+        surface.configure(&device, &config);
+
+        // Step 6: Compile the shader module
+        let shader_module = device.create_shader_module(
+            wgpu::ShaderModuleDescriptor {
+                label: Some("Rainbow Triangle Shader"),
+                source: wgpu::ShaderSource::Wgsl(include_str!("shader.wgsl").into()),
+            }
+        );
+
+        // Step 7: Upload vertex data to GPU memory
+        use wgpu::util::DeviceExt;
+        let vertex_buffer = device.create_buffer_init(
+            &wgpu::util::BufferInitDescriptor {
+                label: Some("Vertex Buffer"),
+                contents: bytemuck::cast_slice(VERTICES),
+                usage: wgpu::BufferUsages::VERTEX,
+            }
+        );
+
+        // Step 8: Create the render pipeline
+        let vertex_buffer_layout = wgpu::VertexBufferLayout {
+            array_stride: std::mem::size_of::<Vertex>() as u64,
+            step_mode: wgpu::VertexStepMode::Vertex,
+            attributes: &[
+                wgpu::VertexAttribute {
+                    offset: 0,
+                    format: wgpu::VertexFormat::F32x3,
+                    shader_location: 0,
+                },
+                wgpu::VertexAttribute {
+                    offset: std::mem::size_of::<[f32; 3]>() as u64,
+                    format: wgpu::VertexFormat::F32x3,
+                    shader_location: 1,
+                },
+            ],
+        };
+
+        let pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
+            label: Some("Triangle Pipeline"),
+            layout: None,
+            vertex: wgpu::VertexState {
+                module: &shader_module,
+                entry_point: Some("vs_main"),
+                buffers: &[vertex_buffer_layout],
+                compilation_options: Default::default(),
+            },
+            primitive: wgpu::PrimitiveState {
+                topology: wgpu::PrimitiveTopology::TriangleList,
+                strip_index_format: None,
+                front_face: wgpu::FrontFace::Ccw,
+                cull_mode: Some(wgpu::Face::Back),
+                unclipped_depth: false,
+                polygon_mode: wgpu::PolygonMode::Fill,
+                conservative: false,
+            },
+            depth_stencil: None,
+            multisample: wgpu::MultisampleState {
+                count: 1,
+                mask: !0,
+                alpha_to_coverage_enabled: false,
+            },
+            fragment: Some(wgpu::FragmentState {
+                module: &shader_module,
+                entry_point: Some("fs_main"),
+                targets: &[Some(wgpu::ColorTargetState {
+                    format: config.format,
+                    blend: None,
+                    write_mask: wgpu::ColorWrites::ALL,
+                })],
+                compilation_options: Default::default(),
+            }),
+            multiview_mask: None,
+            cache: None,
+        });
+
+        Ok(Self {
+            surface,
+            device,
+            queue,
+            config,
+            pipeline,
+            vertex_buffer,
+        })
+    }
+}
+```
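+
+The stride and attribute offsets passed to `VertexBufferLayout` in Step 8 follow
+directly from the memory layout of `Vertex`. The check below is a small
+standalone sketch using only the standard library; the `layout` helper is
+introduced here for illustration, and the `bytemuck` derives are left out
+because they do not affect the layout.
+
+```rust
+use std::mem::{offset_of, size_of};
+
+/// Same field layout as the `Vertex` type above.
+#[repr(C)]
+#[derive(Clone, Copy)]
+struct Vertex {
+    position: [f32; 3],
+    color: [f32; 3],
+}
+
+/// Returns (array_stride, position offset, color offset) in bytes.
+fn layout() -> (u64, u64, u64) {
+    (
+        size_of::<Vertex>() as u64,
+        offset_of!(Vertex, position) as u64,
+        offset_of!(Vertex, color) as u64,
+    )
+}
+
+fn main() {
+    let (stride, position, color) = layout();
+    // Six f32 values of 4 bytes each, with color starting after three of them.
+    println!("stride={stride} position={position} color={color}");
+    // prints "stride=24 position=0 color=12"
+}
+```
+
+`#[repr(C)]` is what makes these numbers dependable: without it the compiler is
+free to reorder fields, and the offsets given to the GPU could be wrong.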
+
+### Init Steps Explained
+
+**Step 1 — Instance:** `Instance::default()` opens a connection to the graphics
+driver on the current platform. On Linux with Vulkan, this loads `libvulkan.so`
+and creates a Vulkan `VkInstance`. On Windows, wgpu can use Direct3D 12 as well
+as Vulkan (loaded from `vulkan-1.dll`). The instance is the root wgpu object:
+surfaces and adapters are both created from it.
+
+**Step 2 — Surface:** `instance.create_surface(window)` binds the wgpu instance
+to the winit `Window`. This tells the GPU: "the pixels of *this* window will be
+the output of my rendering." In Vulkan terms, this creates the `VkSurfaceKHR`;
+the swapchain itself is created later, when the surface is configured in
+Step 5. The surface must match the window platform type exactly (X11, Wayland,
+Windows, macOS, etc.).
+
+**Step 3 — Adapter:** `request_adapter()` queries available GPUs and returns the
+best match for the given options. With
+`PowerPreference::HighPerformance`, wgpu prefers a discrete GPU over an
+integrated one on hybrid systems (e.g., NVIDIA + Intel Optimus). We pass
+`compatible_surface: None`, which skips the check that the chosen adapter can
+present to our surface. On a typical Linux Vulkan setup the selected adapter
+can still present to the window, but passing `Some(&surface)` makes that
+guarantee explicit.
+
+**Step 4 — Device + Queue:** `request_device()` allocates the logical GPU
+resource manager and its submission queue. The device tracks all GPU memory and
+validates API calls. The queue is the submission endpoint — every rendered frame
+becomes a [command buffer](concepts/GLOSSARY.md#command-buffer) that is submitted
+to this queue. On Vulkan, the device corresponds to `VkDevice` and the queue
+to a `VkQueue`.
+
+**Step 5 — SurfaceConfiguration:** This allocates the
+[swapchain](concepts/GLOSSARY.md#swapchain) [framebuffers](concepts/GLOSSARY.md#framebuffer).
+We negotiate the pixel format with the driver (preferring an
+[sRGB](concepts/GLOSSARY.md#srgb) format for correct color display), pick the
+window dimensions (clamped to at least 1x1, because a minimized window can
+report a size of zero and a zero-sized surface is invalid), and select the
+[present mode](concepts/GLOSSARY.md#present-mode).
+`PresentMode::Mailbox` presents without tearing and keeps latency low: the GPU
+may render faster than the display refreshes, and each vertical blank shows the
+most recently completed frame. It is not available on every platform;
+`PresentMode::Fifo` (classic vsync) is the only mode every surface supports.
+`desired_maximum_frame_latency: 2` hints that at most two frames should be
+queued ahead of the display, which smooths out frame time spikes at the cost of
+a little latency.
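+
+The two small decisions in Step 5, format preference and size clamping, can be
+tried out without a GPU. This sketch uses a hypothetical `Format` enum in place
+of `wgpu::TextureFormat`, and the helper names `choose_format` and
+`clamp_extent` are invented for the example. Unlike the `formats[0]` indexing
+in `State::new()`, it returns `None` for an empty list instead of panicking.
+
+```rust
+/// Hypothetical stand-in for `wgpu::TextureFormat`.
+#[derive(Clone, Copy, Debug, PartialEq)]
+enum Format {
+    Bgra8Unorm,
+    Bgra8UnormSrgb,
+    Rgba16Float,
+}
+
+impl Format {
+    fn is_srgb(self) -> bool {
+        matches!(self, Format::Bgra8UnormSrgb)
+    }
+}
+
+/// Prefer the first sRGB format; otherwise fall back to the first format listed.
+fn choose_format(supported: &[Format]) -> Option<Format> {
+    supported
+        .iter()
+        .copied()
+        .find(|f| f.is_srgb())
+        .or(supported.first().copied())
+}
+
+/// A surface must be at least 1x1, even if the window reports a size of zero.
+fn clamp_extent(width: u32, height: u32) -> (u32, u32) {
+    (width.max(1), height.max(1))
+}
+
+fn main() {
+    let supported = [Format::Bgra8Unorm, Format::Bgra8UnormSrgb];
+    println!("{:?}", choose_format(&supported));
+    println!("{:?}", choose_format(&[Format::Rgba16Float]));
+    println!("{:?}", clamp_extent(0, 0));
+}
+```
+
+The same clamping applies whenever the surface is reconfigured, since a window
+can report a zero size at any point while it is minimized.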
+
+Steps 6 through 8 — shader module compilation, vertex buffer upload, and render
+pipeline assembly — will be explored in detail in the next sections.