13 KiB
Database Utilities – Grade Snapshot System
Overview
This module implements a ClickHouse-backed grade snapshot and diffing system. It ingests grade data from an external API, persists immutable snapshots, tracks stable entities (users, classes, assignments), and computes changes over time (new / updated / removed grades).
The design emphasizes:
- Idempotent ingestion
- Historical accuracy
- Efficient change detection
- Append-only semantics (ClickHouse-friendly)
All functionality lives in the database_utils namespace.
Core Concepts
1. Stable Entities vs Snapshots
| Concept | Description |
|---|---|
| User | A logical account |
| Class | A course belonging to a user (stable across time) |
| Assignment | A specific graded item within a class (stable across time) |
| Snapshot (response) | A point-in-time capture of all grades returned by the API |
| Grade history | Per-assignment grades linked to a snapshot |
| Diffs | Computed changes between two snapshots |
Stable entities are created once and reused. Snapshots are immutable and time-ordered.
UUID Utilities
parse_uuid(string) → UUID
Parses a standard UUID string (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) into ClickHouse’s UUID { high, low } format.
- Validates format
- Throws on malformed input
- Used everywhere UUIDs enter the DB
uuid_to_string(UUID) → string
Converts ClickHouse UUIDs back into standard string format.
Date Handling
parse_date_to_clickhouse(string) → uint16_t
Converts an API date string (YYYY-MM-DD) into ClickHouse Date format (days since Unix epoch).
- Empty or invalid dates → epoch (
0) - Logs parsing failures instead of throwing
Database Handle
using CHClient = std::shared_ptr<clickhouse::Client>;
All functions accept a shared ClickHouse client to allow:
- Connection reuse
- Thread-safe sharing
- Easy dependency injection
User Operations
get_all_users()
Returns all users with:
user_idusernamepassword
Primarily for administration/debugging.
register_user(username, password)
Inserts a new user row.
⚠️ Note: Passwords are currently stored in plaintext. Hashing should be added before production use.
authenticate_user(username, password)
Returns true if a matching user exists.
- Uses
count()for minimal payload - Simple boolean authentication check
get_user_uuid(username)
Returns the user’s UUID if found.
Snapshot Insertion Flow
insert_grade_snapshot(user_id, api_response) → response_id
This is the main ingestion pipeline.
Step-by-Step Flow
-
Insert
grade_responses- One row per API fetch
- Contains metadata (
success,total_classes, timestamp)
-
Fetch generated
response_id- Most recent response for that user
-
Process each class
get_or_create_class()- Ensures a stable
class_id - Links class to the response (
response_classes)
-
Process assignments
get_or_create_assignment()- Ensures stable
assignment_id
-
Insert grade history
-
Batched inserts into
assignment_grade_history -
Each row ties:
- response
- assignment
- score
- attempts
-
Snapshots are never updated, only appended.
Stable Entity Management
get_or_create_class(user_id, class_data) → class_id
-
Searches by
(user_id, class_name) -
If found:
- Updates metadata (teacher, period, category)
-
If not found:
- Inserts a new class record
-
Returns the stable
class_id
get_or_create_assignment(user_id, class_id, assignment_data) → assignment_id
- Searches by
(user_id, class_id, assignment_name) - Updates due date / major flag if it exists
- Otherwise inserts a new assignment
- Returns stable
assignment_id
Snapshot Loading
load_latest_snapshot(user_id)
Loads the most recent snapshot for a user.
Returns std::nullopt if none exists.
load_snapshot_by_id(response_id)
Loads a fully hydrated snapshot, including:
- User
- Classes
- Assignments
- Grades
Result Structure
GradeSnapshot
├── response_id
├── user_id
├── classes[class_name] -> ClassRecord
├── assignments[class::assignment] -> AssignmentRecord
└── grades[assignment_id] -> GradeRecord
Used for:
- Diffing
- UI display
- Historical comparisons
Change Detection
has_changes(user_id, new_api_response) → bool
Fast pre-check before inserting a new snapshot.
Detects:
- New assignments
- Removed assignments
- Score changes
- Attempt changes
If no prior snapshot exists → changes detected.
Snapshot Diffing
diff_snapshots(old, new) → vector<AssignmentDiff>
Computes semantic differences between two snapshots.
Change Types
| Type | Meaning |
|---|---|
NEW |
Assignment did not exist before |
UPDATED |
Score or attempts changed |
REMOVED |
Assignment disappeared |
Each diff includes:
- Assignment ID
- Class name
- Assignment name
- Old grade (optional)
- New grade
Grade Update Logging
insert_grade_updates(user_id, old_response_id, new_response_id, diffs)
Persists diffs into grade_updates.
Features
- Nullable old values for new assignments
- Placeholder values for removed assignments
- Compact enum encoding for change type
- Batched insert for efficiency
This table provides a clear audit trail of grade changes over time.
Assignment Key Strategy
"class_name::assignment_name"
Used as a human-stable lookup key when comparing snapshots.
- Avoids relying on database IDs during diffing
- Keeps logic resilient to ID reuse or refactors
Design Guarantees
- ✅ Snapshots are immutable
- ✅ Stable IDs persist across time
- ✅ Changes are explicitly logged
- ✅ Efficient ClickHouse-friendly inserts
- ⚠️ SQL string concatenation is used (should be parameterized later)
- ⚠️ Password hashing not implemented
database_utils – Function Documentation
UUID Utilities
clickhouse::UUID parse_uuid(const std::string& str)
-
Purpose: Converts a UUID string (
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) into a ClickHouseUUIDobject. -
Input:
str– UUID string. -
Output:
clickhouse::UUID(high/low 64-bit parts). -
Behavior:
- Throws
std::runtime_errorif string is not 36 characters or malformed. - Removes hyphens and parses hex into two 64-bit integers.
- Throws
-
Usage: Any place UUID strings need to be stored in ClickHouse.
std::string uuid_to_string(const clickhouse::UUID& u)
- Purpose: Converts a ClickHouse
UUIDback into a human-readable UUID string. - Input:
u– ClickHouse UUID. - Output: Standard UUID string.
- Behavior: Formats
highandlow64-bit integers as a zero-padded 36-character string with hyphens.
Date Utilities
uint16_t parse_date_to_clickhouse(const std::string& date_str)
-
Purpose: Converts a date string from the API into ClickHouse
Dateformat. -
Input:
date_str– string inYYYY-MM-DDformat. -
Output:
uint16_trepresenting days since 1970-01-01. -
Behavior:
- Empty or invalid strings return
0(epoch). - Logs warnings for parse failures.
- Empty or invalid strings return
-
Usage: Assignments’
due_dateconversion.
User Operations
std::vector<UserRecord> get_all_users(const CHClient& client)
- Purpose: Retrieves all users from the database.
- Input:
client– ClickHouse client. - Output: Vector of
UserRecord(includesuser_id,logininfo). - Behavior: Logs batch size and total retrieved.
bool register_user(const CHClient& client, const std::string& username, const std::string& password)
- Purpose: Inserts a new user into the database.
- Input:
username,password. - Output:
trueon success. - Behavior: Currently stores passwords in plaintext; logs success.
bool authenticate_user(const CHClient& client, const std::string& username, const std::string& password)
- Purpose: Checks if a user exists with given credentials.
- Input:
username,password. - Output:
trueif valid,falseotherwise. - Behavior: Uses
SELECT count()for boolean check. Logs results.
std::optional<clickhouse::UUID> get_user_uuid(const CHClient& client, const std::string& username)
- Purpose: Retrieves a user’s UUID based on their username.
- Input:
username. - Output:
optional<UUID>; empty if user not found. - Behavior: Uses
LIMIT 1for efficiency.
Class & Assignment Management
std::string get_or_create_class(const CHClient& client, const std::string& user_id, const api_utils::ClassGrades& class_data)
-
Purpose: Ensures a stable
class_idexists for a user. -
Input:
user_id,class_data(name, teacher, period, category). -
Output:
class_idas string. -
Behavior:
- Searches for existing class.
- Updates metadata if found.
- Inserts new record if not found.
- Links class to the user in
user_classes.
std::string get_or_create_assignment(const CHClient& client, const std::string& user_id, const std::string& class_id, const api_utils::AssignmentGrade& assignment_data)
-
Purpose: Ensures a stable
assignment_idexists within a class. -
Input:
user_id,class_id,assignment_data(name, dueDate, isMajorGrade). -
Output:
assignment_idas string. -
Behavior:
- Updates existing assignment if found.
- Inserts a new assignment if not found.
- Uses
parse_date_to_clickhouse()for due date.
Snapshot Insertion
std::string insert_grade_snapshot(const CHClient& client, const std::string& user_id, const api_utils::GradesResponse& api_response)
-
Purpose: Inserts a complete snapshot of grades from the API.
-
Input:
user_id,api_response(success flag, total classes, grades per class/assignment). -
Output:
response_idof the inserted snapshot; empty string on failure. -
Behavior:
-
Inserts metadata into
grade_responses. -
Retrieves
response_id. -
Processes each class:
- Calls
get_or_create_class() - Links to response in
response_classes
- Calls
-
Processes each assignment:
- Calls
get_or_create_assignment() - Inserts grades into
assignment_grade_history.
- Calls
-
-
Notes: Immutable snapshots; append-only.
Snapshot Loading
std::optional<GradeSnapshot> load_latest_snapshot(const CHClient& client, const std::string& user_id)
- Purpose: Loads the most recent snapshot for a user.
- Output: Fully populated
GradeSnapshotornulloptif none exists. - Behavior: Uses
fetched_at DESC LIMIT 1.
std::optional<GradeSnapshot> load_snapshot_by_id(const CHClient& client, const std::string& response_id)
-
Purpose: Loads a snapshot by
response_id. -
Output:
GradeSnapshotincluding:- Classes
- Assignments
- Grades
-
Behavior: Joins
user_classes,user_assignments,assignment_grade_history,response_classes.
Diffing & Change Detection
bool has_changes(const CHClient& client, const std::string& user_id, const api_utils::GradesResponse& new_api_response)
-
Purpose: Detects if new API response differs from the latest snapshot.
-
Output:
trueif changes exist,falseotherwise. -
Checks for:
- New assignments
- Removed assignments
- Score/attempt changes
std::vector<AssignmentDiff> diff_snapshots(const GradeSnapshot& old_snapshot, const GradeSnapshot& new_snapshot)
- Purpose: Returns detailed differences between two snapshots.
- Output: Vector of
AssignmentDiff. - Change Types:
NEW,UPDATED,REMOVED. - Behavior: Compares old and new grades per assignment key.
void insert_grade_updates(const CHClient& client, const std::string& user_id, const std::string& old_response_id, const std::string& new_response_id, const std::vector<AssignmentDiff>& diffs)
-
Purpose: Inserts diffs into
grade_updatestable. -
Behavior:
- Maps
AssignmentDiffto ClickHouse columns. - Handles nullable old values for new assignments.
- Uses placeholder values for removed assignments.
- Logs number of inserted updates.
- Maps