Data flows

How data moves through the system.

Data lifecycle

How training data flows from chess platforms to the player's practice sessions.

Data lifecycle flow

flowchart LR
    subgraph Sources
        LI[Lichess API]
        CC[Chess.com API]
    end

    subgraph "Phase 1 — Collection"
        IMP[Importer<br/>fetch games]
        SF[Stockfish 18<br/>N-1 threads, 1GB hash]
        TB[Lichess Tablebase<br/>≤7 pieces]
        OE[Lichess Opening Explorer<br/>theory detection]
    end

    subgraph Storage
        AD[analysis_data.json<br/>all moves, max granularity]
        TD[training_data.json<br/>filtered mistakes]
        LS[localStorage<br/>SRS state per position]
    end

    subgraph "Phase 2 — Derivation"
        DER[annotate_and_derive<br/>filter + explain]
    end

    subgraph "PWA — Training mode"
        SEL[Session selector<br/>SM-2 priority]
        QUIZ[Quiz interface<br/>board + feedback]
    end

    subgraph "PWA — Analysis mode"
        GSEL[Game selector]
        REV[Game review<br/>eval bar + score chart<br/>+ classifications]
    end

    LI --> IMP
    CC --> IMP
    IMP --> SF
    IMP --> TB
    IMP --> OE
    SF --> AD
    TB --> AD
    OE --> AD
    AD --> DER
    DER --> TD
    TD --> SEL
    LS --> SEL
    SEL --> QUIZ
    QUIZ --> LS
    AD --> GSEL
    GSEL --> REV

Two-layer data model

| File | Content | Used by |
|------|---------|---------|
| analysis_data.json | All moves, all evals, per game | Phase 2 derivation + Analysis mode (game review UI) |
| training_data.json | Filtered mistakes (unchanged schema) | App + Demo |

Phase 2 can be re-run cheaply without re-running Stockfish (chess-self-coach train --derive).
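A minimal sketch of what this derivation pass could look like, assuming the field names from the analysis_data.json structure described below; the cp_loss cutoff and output fields here are illustrative, not the actual filter logic:

```python
import json

CP_LOSS_THRESHOLD = 50  # hypothetical cutoff; the real filter is configurable


def derive(analysis_path: str, training_path: str) -> None:
    """Filter analysis_data.json down to training positions without
    re-running Stockfish."""
    with open(analysis_path) as f:
        analysis = json.load(f)

    positions = []
    for url, game in analysis["games"].items():
        for move in game["moves"]:
            # Keep only the player's own moves that lost enough centipawns
            if (move["side"] == game["player_color"]
                    and (move.get("cp_loss") or 0) >= CP_LOSS_THRESHOLD):
                positions.append({
                    "id": f"{url}#{move['ply']}",
                    "fen": move["fen_before"],
                    "player_move": move["move_san"],
                    "cp_loss": move["cp_loss"],
                })

    with open(training_path, "w") as f:
        json.dump({"positions": positions}, f, indent=2)
```

Because this step only reads JSON, iterating over every analyzed game is cheap compared with the engine analysis in Phase 1.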

analysis_data.json structure (per game, per move)

{
  version, player,
  games: {
    "<game_url>": {
      headers, player_color, analyzed_at, analysis_duration_s, settings,
      moves: [
        { ply, fen_before, fen_after, move_san, move_uci, side,
          eval_source, in_opening, eval_before: {score_cp, is_mate, depth, seldepth, nodes, nps, time_ms, pv_san, ...},
          eval_after: {...}, eval_after_best: {score_cp, is_mate, mate_in},
          tablebase_before, tablebase_after,
          opening_explorer: {opening: {eco, name}, moves: [{san, white, draws, black}]},
          cp_loss, board: {piece_count, is_check, is_capture, ...},
          clock: {player, opponent, time_spent} }
      ]
    }
  }
}
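To illustrate how the nested structure is traversed, here is a small sketch that ranks moves by centipawn loss; the function name and tuple shape are hypothetical, but the keys (`games`, `moves`, `cp_loss`, `ply`, `move_san`) come from the structure above:

```python
def worst_moves(analysis: dict, n: int = 5) -> list:
    """Return the n moves with the highest centipawn loss across all games,
    as (cp_loss, game_url, ply, san) tuples."""
    moves = []
    for url, game in analysis["games"].items():
        for m in game["moves"]:
            # cp_loss can be absent/None (e.g. tablebase or book positions)
            if m.get("cp_loss") is not None:
                moves.append((m["cp_loss"], url, m["ply"], m["move_san"]))
    return sorted(moves, reverse=True)[:n]
```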

training_data.json structure (unchanged)

{
  version, generated, player: {lichess, chesscom},
  positions: [
    { id, fen, player_color, player_move, best_move,
      context, score_before, score_after, cp_loss, category,
      explanation, acceptable_moves, pv,
      game: { id, source, opponent, date, result },
      clock: { player, opponent },
      srs: { interval, ease, next_review, history } }
  ],
  analyzed_game_ids: [...]
}

localStorage SRS state

train_srs: {
  "<position_id>": {
    interval, ease, repetitions, next_review,
    history: [{ date, correct, dismissed? }]
  }
}
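After each answer, the quiz writes the result back into this blob. A sketch of recording one review, with the JSON string standing in for the localStorage value; the default field values for a brand-new position (ease 2.5, interval 0) are assumptions, and rescheduling itself is handled by the SM-2 step described next:

```python
import json
from datetime import date


def record_review(srs_json: str, position_id: str, correct: bool) -> str:
    """Append one review to a position's history in the train_srs blob and
    return the updated JSON string (scheduling fields updated elsewhere)."""
    state = json.loads(srs_json) if srs_json else {}
    entry = state.setdefault(position_id, {
        "interval": 0, "ease": 2.5, "repetitions": 0,
        "next_review": None, "history": [],
    })
    entry["history"].append({"date": date.today().isoformat(),
                             "correct": correct})
    return json.dumps(state)
```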

SRS (Spaced Repetition) algorithm

The SM-2 variant used for scheduling position reviews.

SRS algorithm flow

stateDiagram-v2
    [*] --> New: Position created
    New --> Learning: First review (interval=1d)
    Learning --> Learning: Wrong (interval=1d, ease↓)
    Learning --> Learning: Correct (interval×ease)
    Learning --> Mastered: interval ≥ 7d
    Mastered --> Learning: Overdue + wrong
    New --> Dismissed: "Give up"
    Learning --> Dismissed: "Give up"
    Dismissed --> [*]: interval=99999d<br/>Never shown again

| Outcome | Effect |
|---------|--------|
| Correct (1st rep) | interval = 1 day |
| Correct (2nd rep) | interval = 3 days |
| Correct (3rd+ rep) | interval = interval × ease |
| Wrong | interval = 1 day, repetitions = 0 |
| Ease adjustment | ease += 0.1 − (5−q)(0.08 + (5−q)×0.02), min 1.3 |
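Putting the table together as one update step; this sketch maps correct/wrong answers onto SM-2 quality grades (assumed here as q=5 for correct, q=2 for wrong, since the source does not give the mapping) and applies the new ease before multiplying the interval:

```python
MIN_EASE = 1.3


def sm2_update(interval: float, ease: float, repetitions: int,
               correct: bool) -> tuple:
    """One review step of the SM-2 variant in the table above.

    Returns (interval_days, ease, repetitions).
    """
    q = 5 if correct else 2  # assumed grade mapping
    # Standard SM-2 ease adjustment, floored at 1.3
    ease = max(MIN_EASE, ease + 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02))
    if not correct:
        return 1, ease, 0  # wrong: back to a 1-day interval, reps reset
    repetitions += 1
    if repetitions == 1:
        interval = 1
    elif repetitions == 2:
        interval = 3
    else:
        interval = interval * ease
    return interval, ease, repetitions
```

A dismissed position is simply stored with interval = 99999 days, which pushes next_review far enough into the future that the selector never surfaces it again.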