Week 7 – Experiments Manager and Dashboard

Author

Luiz Garcia

Published

January 19, 2026

Abstract

Built an experiments management system and a SvelteKit dashboard for real-time monitoring of optimization runs.

Summary

With overnight runs producing hundreds of experiment results, the need for proper experiment management became acute. This week introduced a complete tooling stack: experiments manager, REST API, and a real-time SvelteKit dashboard.

Experiments Manager

Built a system with Flow/Checkpoint abstractions:

  • Flow: A named sequence of optimization phases (e.g., “bitwise v3”)
  • Checkpoint: Serialized state at any point, enabling resume after crashes
  • CLI tool: Command-line interface for creating, listing, and managing flows

This replaced ad-hoc JSON files with a structured approach to experiment tracking.
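
A minimal Rust sketch of what the Flow/Checkpoint abstractions might look like; the type names, fields, and on-disk layout here are illustrative, not the actual manager's schema:

```rust
use std::fs;
use std::path::Path;

/// A named sequence of optimization phases, e.g. "bitwise v3".
struct Flow {
    name: String,
    phases: Vec<String>,
}

/// Serialized state at one point in a flow, enabling resume after a crash.
struct Checkpoint {
    flow_name: String,
    phase_index: usize,
    payload: Vec<u8>, // opaque serialized optimizer state
}

impl Checkpoint {
    /// Persist under "<flow>.<phase>.ckpt"; a real implementation would
    /// likely use a structured, versioned format instead of raw bytes.
    fn save(&self, dir: &Path) -> std::io::Result<()> {
        fs::write(
            dir.join(format!("{}.{}.ckpt", self.flow_name, self.phase_index)),
            &self.payload,
        )
    }
}
```

Resuming is then just locating the latest checkpoint file for a flow and deserializing it, which is what replaces the ad-hoc JSON files.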

SvelteKit Dashboard

Built a web dashboard (SvelteKit, Svelte 4) for monitoring experiments:

  • Flows CRUD: Create, view, edit, delete optimization flows
  • Real-time updates: WebSocket connection for live progress
  • Phase comparison tables: Side-by-side metrics across optimization phases
  • Checkpoint management: Resume interrupted experiments from the UI

The dashboard connects to a FastAPI backend that wraps the experiments manager.

GPU Sparse Evaluation

Extended the Metal GPU accelerator to handle sparse memory groups (neurons with more than 12 address bits):

  • Training uses DashMap on CPU (a sharded concurrent hash map)
  • Evaluation exports to sorted arrays for GPU binary search
  • O(log n) lookups with coalesced memory access on Metal

This was necessary because high-bit neurons have address spaces too large for dense storage: a 20-bit neuron alone would need 2^20 ≈ 1M dense entries.
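
The export-then-search scheme can be sketched in std-only Rust (a plain `HashMap` stands in for DashMap here, and the lookup is a CPU reference for what the Metal kernel does per thread; function names are illustrative):

```rust
use std::collections::HashMap;

/// Export a neuron's sparse memory (address -> weight) into parallel
/// key/value arrays sorted by address -- the flat layout uploaded to the GPU.
fn export_sorted(mem: &HashMap<u32, f32>) -> (Vec<u32>, Vec<f32>) {
    let mut entries: Vec<(u32, f32)> = mem.iter().map(|(&k, &v)| (k, v)).collect();
    entries.sort_unstable_by_key(|&(k, _)| k);
    entries.into_iter().unzip()
}

/// O(log n) lookup over the sorted keys; on Metal each thread runs the
/// same binary search, with neighboring threads reading nearby keys.
fn lookup(keys: &[u32], vals: &[f32], addr: u32) -> Option<f32> {
    keys.binary_search(&addr).ok().map(|i| vals[i])
}
```

Addresses absent from the map simply return `None` (an untrained location), so only visited addresses ever consume memory.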

Parallel Hybrid Genome Evaluation

Implemented evaluate_genomes_parallel_hybrid() for maximum GA/TS throughput:

  • Memory pool with 8 reusable instances to avoid OOM
  • Multiple genomes train concurrently using the pool
  • Dense groups (bits <= 12) evaluated on CPU, sparse groups (bits > 12) on GPU
  • Pipelining: CPU trains batch N+1 while GPU evaluates batch N

This achieved 4–8x speedup over sequential genome evaluation.
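
The bounded-pool mechanism that caps peak memory can be sketched with the standard library alone; this shows only the checkout/return discipline, not the real evaluator types or the CPU/GPU pipelining:

```rust
use std::sync::{Arc, Condvar, Mutex};

/// A fixed pool of reusable evaluator instances. `acquire` blocks when all
/// instances are checked out, so at most `n` genomes hold memory at once.
struct Pool<T> {
    items: Mutex<Vec<T>>,
    available: Condvar,
}

impl<T> Pool<T> {
    fn new(items: Vec<T>) -> Arc<Self> {
        Arc::new(Pool { items: Mutex::new(items), available: Condvar::new() })
    }

    fn acquire(&self) -> T {
        let mut guard = self.items.lock().unwrap();
        loop {
            if let Some(item) = guard.pop() {
                return item;
            }
            // Wait until release() returns an instance; the loop guards
            // against spurious wakeups.
            guard = self.available.wait(guard).unwrap();
        }
    }

    fn release(&self, item: T) {
        self.items.lock().unwrap().push(item);
        self.available.notify_one();
    }
}
```

With the pool sized to 8, any number of genome-evaluation threads can be spawned; throughput comes from concurrency while the pool bounds resident memory.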

Phase Comparison Tables

Added structured reporting: after each optimization phase, the system fully evaluates the top genomes on validation data and prints a comparison table of CE, accuracy, and improvement across all phases.
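
A sketch of such a table renderer (struct and column names are illustrative; here improvement is measured against the first phase):

```rust
/// Per-phase metrics from full evaluation on validation data.
struct PhaseResult {
    name: String,
    ce: f64,       // cross-entropy
    accuracy: f64, // fraction in [0, 1]
}

/// Render a fixed-width comparison table; the last column is accuracy
/// improvement relative to the first phase.
fn comparison_table(phases: &[PhaseResult]) -> String {
    let mut out = format!("{:<12} {:>8} {:>10} {:>8}\n", "phase", "CE", "accuracy", "Δacc");
    let base = phases.first().map(|p| p.accuracy).unwrap_or(0.0);
    for p in phases {
        out += &format!(
            "{:<12} {:>8.4} {:>9.2}% {:>+7.2}%\n",
            p.name,
            p.ce,
            p.accuracy * 100.0,
            (p.accuracy - base) * 100.0,
        );
    }
    out
}
```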

Next

The dashboard is functional but basic. Next week: full CRUD, charts, and an iOS companion app.

Reuse

CC-BY-NC-SA-4.0