# Coordinate Systems

## The Problem

Your window is a grid of pixels: 800×600 in our configuration. The 3D scene you want to render spans from -∞ to +∞ in every direction. The GPU cannot reason in window pixels because every window has a different size. It cannot reason in world space because that is application-defined. The GPU needs a standard intermediate coordinate space.

That space is [[ndc]](GLOSSARY.md#ndc), Normalized Device Coordinates.

## NDC Definition

NDC is a fixed, standardized cube:

| Axis | Min | Max | Meaning |
|------|-----|-----|---------|
| X | -1.0 | +1.0 | Left to right |
| Y | -1.0 | +1.0 | Bottom to top |
| Z | 0.0 | 1.0 | Near to far |

Any geometry inside this cube, at or beyond the near plane (Z ≥ 0) and at or before the far plane (Z ≤ 1), is visible. Anything outside is clipped away by the GPU hardware before rasterization.
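The bounds in the table can be written as a simple point test. This is a minimal sketch for illustration only: the `Ndc` struct and `in_ndc_cube` function are hypothetical names, and real hardware clips whole primitives rather than testing single points.

```rust
/// A point in Normalized Device Coordinates (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Ndc {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Returns true when the point lies inside the NDC cube from the table:
/// x and y in [-1, 1], z in [0, 1]. Points on the boundary count as inside.
pub fn in_ndc_cube(p: Ndc) -> bool {
    (-1.0..=1.0).contains(&p.x)
        && (-1.0..=1.0).contains(&p.y)
        && (0.0..=1.0).contains(&p.z)
}
```

A point at Z = 0 sits on the near plane and still passes the test, while any negative Z fails it.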

### Visual Map

```
  (-1,+1) ────────── (+1,+1)
      │                  │
      │       (0,0)      │  ← origin = center of screen
      │                  │
  (-1,-1) ────────── (+1,-1)
```

Notice the origin is at the center of the screen, not the top-left. This is deliberate: 3D scenes are easier to reason about when (0,0) is the center. Depth runs along Z: the viewer looks into the screen, with Z increasing from 0 at the near plane to 1 at the far plane.

## Our Triangle In NDC

Because this is the simplest possible renderer, our triangle vertices are specified directly in NDC. No projection matrix. No camera transform. No model matrix. Just three points in GPU-native space:

| Corner | X | Y | Z |
|--------|-----|-----|-----|
| Bottom-left | -0.5 | -0.5 | 0.0 |
| Bottom-right | +0.5 | -0.5 | 0.0 |
| Top-center | 0.0 | +0.5 | 0.0 |

The triangle is centered on the screen. The base runs from left to right along Y=-0.5. The peak sits on the center axis at Y=+0.5. All three vertices are at Z=0, sitting exactly on the near plane.

Plot this in the NDC box above and you will see how much of the screen the triangle covers. It spans 50% of the X axis (from -0.5 to +0.5) and 50% of the Y axis (from -0.5 to +0.5), so its bounding box is a quarter of the screen and the triangle itself fills half of that box.
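The extent of the triangle can be checked numerically. The sketch below stores the three vertices from the table and computes the triangle's area in the XY plane with the shoelace formula. The names `TRIANGLE`, `triangle_area_xy`, and `screen_fraction` are illustrative and are not taken from the renderer's source.

```rust
/// The three triangle vertices from the table, as [x, y, z] in NDC.
pub const TRIANGLE: [[f32; 3]; 3] = [
    [-0.5, -0.5, 0.0], // bottom-left
    [0.5, -0.5, 0.0],  // bottom-right
    [0.0, 0.5, 0.0],   // top-center
];

/// Area of the triangle in the XY plane (shoelace formula), in NDC units.
pub fn triangle_area_xy(t: &[[f32; 3]; 3]) -> f32 {
    let [a, b, c] = *t;
    0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])).abs()
}

/// Fraction of the visible NDC square covered by the triangle.
/// The square spans [-1, 1] on both axes, so its area is 2 * 2 = 4.
pub fn screen_fraction(t: &[[f32; 3]; 3]) -> f32 {
    triangle_area_xy(t) / 4.0
}
```

With a base of 1.0 and a height of 1.0, the area is 0.5 NDC units, which is one eighth of the 2 by 2 visible square.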

In a real application, vertices live in arbitrary world units and you apply a series of matrix transformations to bring them into clip space, from which the GPU produces NDC. Here we skip all of that and place the vertices directly in NDC. The vertex shader still outputs `vec4<f32>` and the pipeline is structurally identical.

## Homogeneous Coordinates

The GPU vertex shader outputs a `vec4<f32>`, not a `vec3<f32>`. The fourth component `w` is the [[homogeneous coordinates]](GLOSSARY.md#homogeneous-coordinates) value that enables the clip space → NDC conversion.

When the vertex shader outputs `vec4<f32>(x, y, z, w)`, the GPU performs a step called **perspective division**: it divides `x`, `y`, and `z` by `w`. The result, `(x/w, y/w, z/w)`, is what lands in NDC.

For our triangle, we set `w = 1.0`:

```
vec4<f32>(position, 1.0)  =  vec4<f32>(position.x, position.y, position.z, 1.0)
```

Division by 1.0 is the identity — the position passes through unchanged. But why four components?

A `w` value of 1.0 means "this is a point in space." A `w` value of 0.0 would mean "this is a direction vector." This encoding lets the GPU handle both positions and directions with the same data type. More importantly, when you use a perspective projection matrix, the matrix produces a different `w` for each vertex, typically equal to the vertex's depth (its distance from the camera along the viewing direction). Dividing by that `w` shrinks the X and Y of distant vertices more than those of near ones, which is the foreshortening effect that makes distant objects appear smaller. That is how perspective works on the GPU.
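To see the effect of `w` in numbers, here is a minimal sketch of perspective division on the CPU. The function name `perspective_divide` is hypothetical; on the GPU this step is fixed-function and happens after clipping, so the sketch simply assumes `w` is non-zero.

```rust
/// Perspective division: clip-space (x, y, z, w) -> NDC (x/w, y/w, z/w).
/// Assumes w != 0; the GPU clips geometry before it performs this step.
pub fn perspective_divide(clip: [f32; 4]) -> [f32; 3] {
    let [x, y, z, w] = clip;
    [x / w, y / w, z / w]
}
```

With `w = 1.0` the input `[0.5, 0.5, 0.0, 1.0]` comes back as `[0.5, 0.5, 0.0]`. The same X and Y with `w = 2.0` land at `[0.25, 0.25, 0.0]`: a vertex twice as deep is drawn half as far from the center of the screen.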

Our triangle uses `w = 1.0` because we have no camera and no perspective, just vertices placed directly in NDC. The value exists because the pipeline requires clip-space `vec4` output, not because we need perspective.

## Clip Space

Before NDC, there is [[clip space]](GLOSSARY.md#clip-space). This is the coordinate space the vertex shader outputs into. The region the GPU clips against corresponds to a truncated pyramid (a frustum) of the scene for perspective projection, or a box for orthographic projection. Geometry outside that region is removed by hardware before perspective division, and triangles that straddle the boundary are cut at it. Our triangle lies entirely inside the region (with `w = 1.0` it is the NDC cube itself), so nothing is clipped.
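Because clipping happens before the division by `w`, the test is expressed in terms of `w` rather than fixed bounds. The sketch below assumes the clip volume implied by the NDC table above (multiply each NDC bound by a positive `w`): `-w ≤ x ≤ w`, `-w ≤ y ≤ w`, `0 ≤ z ≤ w`. The function name `in_clip_volume` is hypothetical, and it tests a single vertex, whereas the hardware clips whole primitives.

```rust
/// Returns true when a clip-space vertex lies inside the clip volume:
/// -w <= x <= w, -w <= y <= w, 0 <= z <= w (boundaries included).
/// These bounds are the NDC cube scaled by w, an assumption of this sketch.
pub fn in_clip_volume(clip: [f32; 4]) -> bool {
    let [x, y, z, w] = clip;
    -w <= x && x <= w && -w <= y && y <= w && 0.0 <= z && z <= w
}
```

All three triangle vertices pass with `w = 1.0`. A vertex at `x = 2.0` fails with `w = 1.0` but passes with `w = 4.0`, because after division it would land at `x = 0.5`.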

## Viewport Transform (Automatic)

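After perspective division produces NDC coordinates, the GPU maps them to the actual window dimensions. This is the viewport transform. Pixel coordinates start at the top-left corner with Y growing downward, while NDC Y grows upward, so the Y axis is flipped:

```
screen_x = (ndc_x + 1.0) / 2.0 * window_width
screen_y = (1.0 - ndc_y) / 2.0 * window_height
```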

This step is automatic. You never write it in code. By default the [[viewport transform]](GLOSSARY.md#viewport-transform) covers the whole render target, whose size comes from the `width` and `height` values in your `SurfaceConfiguration`. When the surface configuration says 800×600, the GPU maps NDC X `[-1, +1]` onto `[0, 800]` and NDC Y `[-1, +1]` onto `[0, 600]`.
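Although the GPU does this for you, working through the mapping once on the CPU shows where the triangle ends up. The sketch below assumes a viewport that covers the full window and the WebGPU convention that pixel (0, 0) is the top-left corner with Y growing downward, so NDC Y is flipped. The function name `ndc_to_pixels` is hypothetical.

```rust
/// Maps NDC x/y to pixel coordinates for a viewport covering the whole window.
/// Assumes pixel (0, 0) is the top-left corner and Y grows downward,
/// so NDC Y (which grows upward) is flipped.
pub fn ndc_to_pixels(ndc_x: f32, ndc_y: f32, width: f32, height: f32) -> (f32, f32) {
    let px = (ndc_x + 1.0) * 0.5 * width;
    let py = (1.0 - ndc_y) * 0.5 * height;
    (px, py)
}
```

For the 800×600 window, the bottom-left vertex (-0.5, -0.5) maps to pixel (200, 450), the bottom-right vertex (+0.5, -0.5) to (600, 450), and the top-center vertex (0.0, +0.5) to (400, 150).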

You do write code to update the viewport transform — but only when the window size changes. At that point, you create a new `SurfaceConfiguration` with the new dimensions and configure the surface. The GPU then uses the updated mapping on subsequent frames.

## Summary: The Coordinate Journey

For our triangle, every vertex follows this path:

1. **Vertex data:** Stored as `vec3<f32>` in the vertex buffer. Values are already in NDC.
2. **Vertex shader:** Wraps in `vec4<f32>` by appending `w = 1.0`. This is clip space (which, for `w = 1.0`, equals NDC).
3. **Perspective division:** GPU divides by `w = 1.0` → identity. Vertex is now in [[ndc]](GLOSSARY.md#ndc).
4. **Viewport transform (automatic):** GPU scales NDC to window pixel coordinates. The triangle appears on screen.

In a real 3D application, this journey includes model, view, and projection matrices before clip space. For the rainbow triangle, the first three steps leave the coordinates unchanged, and only the viewport transform alters them. The hardware pipeline stages are the same regardless.