# solstone Journal Guide

This document describes the layout of a **journal** directory where all captures, extracts, and insights are stored. Each dated `YYYYMMDD` folder is referred to as a **day**, and within each day captured content is organized into **segments** (timestamped duration folders). Each segment folder uses the format `HHMMSS_LEN/` where `HHMMSS` is the start time and `LEN` is the duration in seconds. This folder name serves as the **segment key**, uniquely identifying the segment within a given day.

## The Three-Layer Architecture

solstone transforms raw recordings into actionable understanding through a three-layer pipeline:

```
┌──────────────────────────────────────────┐
│  LAYER 3: AGENT OUTPUTS                  │  Narrative summaries
│  (Markdown files)                        │  "What it means"
│  - agents/*.md (daily outputs)           │
│  - *.md (segment outputs)                │
└──────────────────────────────────────────┘
                    ↑ synthesized from
┌──────────────────────────────────────────┐
│  LAYER 2: EXTRACTS                       │  Structured data
│  (JSON/JSONL files)                      │  "What happened"
│  - audio.jsonl, *_audio.jsonl            │
│  - screen.jsonl, *_screen.jsonl          │
│  - events/*.jsonl (per-facet)            │
└──────────────────────────────────────────┘
                    ↑ derived from
┌──────────────────────────────────────────┐
│  LAYER 1: CAPTURES                       │  Raw recordings
│  (Binary media files)                    │  "What was recorded"
│  - *.flac, *.ogg, *.opus, *.wav (audio)  │
│  - *.webm (video)                        │
└──────────────────────────────────────────┘
```

### Vocabulary Quick Reference

**Pipeline Layers**

| Term | Definition | Examples |
|------|------------|----------|
| **Capture** | Raw audio/video recording | `*.flac`, `*.ogg`, `*.opus`, `*.wav`, `*.webm` |
| **Extract** | Structured data from captures | `*.jsonl` |
| **Agent Output** | AI-generated narrative summary | `agents/*.md`, `HHMMSS_LEN/*.md` |

**Organization**

| Term | Definition | Examples |
|------|------------|----------|
| **Day** | 24-hour activity directory | `20250119/` |
| **Segment** | 5-minute time window | `143022_300/` (14:30:22, 5 min) |
| **Span** | Sequential segment group | Import creating 3 segments |
| **Facet** | Project/context scope | `#work`, `#personal` |

**Extracted Data**

| Term | Definition | Examples |
|------|------------|----------|
| **Entity** | Tracked person/project/concept | People, companies, tools |
| **Occurrence** | Time-based event | Meetings, messages, files |

## Top-Level Directory Structure

| Directory/File | Purpose |
|----------------|---------|
| `YYYYMMDD/` | Daily capture folders containing segments, extracts, and agent outputs |
| `entities/` | Journal-level entity identity records (`/entity.json`) |
| `facets/` | Facet-specific data: entity relationships, todos, events, news, action logs |
| `agents/` | Agent run logs in per-agent subdirectories (`/.jsonl`), day indexes (`.jsonl`), and latest-run symlinks (`.log`) |
| `apps/` | App-specific storage (distinct from codebase `apps/`) |
| `streams/` | Per-stream state files (`.json`) tracking segment chains and sequence numbers |
| `imports/` | Imported audio files and processing artifacts |
| `tokens/` | Token usage logs from AI model calls, organized by day |
| `indexer/` | Search index (`journal.sqlite` FTS5 database) |
| `health/` | Service health logs (`.log` files) |
| `config/` | Configuration files and journal-level action logs |
| `task_log.txt` | Optional log of utility runs in `[epoch]\tmessage` format |
| `summary.md` | Journal-wide statistics summary (generated by `sol journal-stats`) |
| `stats.json` | Detailed journal statistics in JSON format (generated by `sol journal-stats`) |

### Config directory

- `config/journal.json` – owner configuration for the journal (optional, see [Owner configuration](#owner-configuration)).
- `config/convey.json` – Convey UI preferences (facet/app ordering, selected facet).
- `config/actions/` – journal-level action logs (see [Action Logs](#action-logs)).
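As a rough illustration of how the `YYYYMMDD/` day folders and `HHMMSS_LEN` segment keys fit together, the sketch below walks a journal directory and decodes segment keys into start and end times. The helper names (`parse_segment_key`, `iter_segments`) are hypothetical and not part of solstone; the code assumes only the folder naming described in this guide.

```python
import re
from datetime import datetime, timedelta
from pathlib import Path

DAY_RE = re.compile(r"^\d{8}$")            # YYYYMMDD day folder
SEGMENT_RE = re.compile(r"^(\d{6})_(\d+)$")  # HHMMSS_LEN segment key


def parse_segment_key(day: str, key: str) -> tuple[datetime, datetime]:
    """Return (start, end) datetimes for a segment key within a day."""
    match = SEGMENT_RE.match(key)
    if match is None:
        raise ValueError(f"not a segment key: {key!r}")
    start = datetime.strptime(day + match.group(1), "%Y%m%d%H%M%S")
    return start, start + timedelta(seconds=int(match.group(2)))


def iter_segments(journal: Path):
    """Yield (day, segment_key) pairs for every segment folder in a journal."""
    for day_dir in sorted(journal.iterdir()):
        # Skip non-day entries such as entities/, facets/, or config/
        if not (day_dir.is_dir() and DAY_RE.match(day_dir.name)):
            continue
        for seg_dir in sorted(day_dir.iterdir()):
            # Skip non-segment entries such as a day's agents/ folder
            if seg_dir.is_dir() and SEGMENT_RE.match(seg_dir.name):
                yield day_dir.name, seg_dir.name
```

With the example from the vocabulary table, `parse_segment_key("20250119", "143022_300")` yields a start of 14:30:22 and an end five minutes later.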
## Owner configuration

The optional `config/journal.json` file allows customization of journal processing and presentation based on owner preferences. This file lives in the `config/` directory under the journal root and contains personal settings that affect how the system processes and interprets journal data.

### Identity configuration

The `identity` block contains information about the journal owner that helps tools correctly identify the owner in transcripts, meetings, and other captured content:

```json
{
  "identity": {
    "name": "Jeremie Miller",
    "preferred": "Jer",
    "pronouns": {
      "subject": "he",
      "object": "him",
      "possessive": "his",
      "reflexive": "himself"
    },
    "aliases": ["Jer", "jeremie"],
    "email_addresses": ["jer@example.com"],
    "timezone": "America/Los_Angeles"
  }
}
```

Fields:
- `name` (string) – Full legal or formal name of the journal owner
- `preferred` (string) – Preferred name or nickname to be used when addressing the owner
- `pronouns` (object) – Structured pronoun set for template usage with fields:
  - `subject` – Subject pronoun (e.g., "he", "she", "they")
  - `object` – Object pronoun (e.g., "him", "her", "them")
  - `possessive` – Possessive adjective (e.g., "his", "her", "their")
  - `reflexive` – Reflexive pronoun (e.g., "himself", "herself", "themselves")
- `aliases` (array of strings) – Alternative names, nicknames, or usernames that may appear in transcripts
- `email_addresses` (array of strings) – Email addresses associated with the owner for participant detection
- `timezone` (string) – IANA timezone identifier (e.g., "America/New_York", "Europe/London") for timestamp interpretation

This configuration helps meeting extraction identify the owner as a participant, enables personalized agent interactions, and ensures timestamps are interpreted correctly across the journal.
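To make the participant-detection idea concrete, here is a minimal sketch of how a tool might load the `identity` block and check whether a name or email seen in a transcript refers to the owner. The functions `load_identity` and `is_owner` are hypothetical, and the case-insensitive exact match is an assumption; solstone's actual matching logic may differ.

```python
import json
from pathlib import Path


def load_identity(journal: Path) -> dict:
    """Read the optional identity block; return {} when the file or block is absent."""
    config_path = journal / "config" / "journal.json"
    if not config_path.exists():
        return {}
    return json.loads(config_path.read_text(encoding="utf-8")).get("identity", {})


def is_owner(identity: dict, participant: str) -> bool:
    """Case-insensitive match against name, preferred name, aliases, and emails."""
    candidates = [identity.get("name"), identity.get("preferred")]
    candidates += identity.get("aliases", [])
    candidates += identity.get("email_addresses", [])
    # Assumption: exact match after trimming and lowercasing, no fuzzy matching
    known = {c.strip().lower() for c in candidates if c}
    return participant.strip().lower() in known
```

Because every field is optional, both helpers tolerate a missing file or an empty block rather than raising.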
### Convey configuration

The `convey` block contains settings for the web application:

```json
{
  "convey": {
    "password_hash": ""
  }
}
```

Fields:
- `password_hash` (string) – Hashed password for accessing the convey web application. Set via `sol password set`.

**UI Preferences:** The separate `config/convey.json` file stores UI/UX personalization (facet/app ordering, selected facet). All fields optional:

```json
{
  "facets": {"order": ["work", "personal"], "selected": "work"},
  "apps": {"order": ["home", "calendar", "todos"], "starred": ["home", "todos"]}
}
```

- `facets.order` – Custom facet ordering.
- `facets.selected` – Currently selected facet (auto-synced with browser).
- `apps.order` – Custom app ordering in menu bar.
- `apps.starred` – Apps to show in the quick-access starred section.

### Retention configuration

The `retention` block controls automatic cleanup of layer 1 raw media (audio recordings, video captures, screen diffs) while preserving all layer 2 extracts and layer 3 agent outputs. Three modes control when raw media is deleted:

- `"keep"` – retain raw media indefinitely (default)
- `"days"` – delete raw media after `raw_media_days` days, once the segment has finished processing
- `"processed"` – delete raw media as soon as the segment has finished processing

```json
{
  "retention": {
    "raw_media": "days",
    "raw_media_days": 30,
    "per_stream": {
      "plaud": {
        "raw_media": "days",
        "raw_media_days": 7
      },
      "archon": {
        "raw_media": "processed"
      }
    }
  }
}
```

Fields:
- `raw_media` (string) – Retention mode: `"keep"`, `"days"`, or `"processed"`. Default: `"keep"`.
- `raw_media_days` (integer or null) – Number of days to retain raw media when mode is `"days"`. Required when `raw_media` is `"days"`, ignored otherwise.
- `per_stream` (object) – Per-stream overrides keyed by stream name. Each entry supports `raw_media` and `raw_media_days`. Omitted fields inherit from the global retention settings.
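The inheritance rule for `per_stream` overrides can be sketched as follows. This is an illustrative reading of the fields above, not solstone's implementation: `resolve_retention` and `should_purge` are hypothetical names, and treating "after `raw_media_days` days" as a strict greater-than comparison is an assumption.

```python
from datetime import date
from typing import Optional


def resolve_retention(retention: dict, stream: str) -> tuple[str, Optional[int]]:
    """Merge global retention settings with any per-stream override."""
    override = retention.get("per_stream", {}).get(stream, {})
    # Omitted override fields inherit from the global settings
    mode = override.get("raw_media", retention.get("raw_media", "keep"))
    days = override.get("raw_media_days", retention.get("raw_media_days"))
    if mode == "days" and days is None:
        raise ValueError("raw_media_days is required when raw_media is 'days'")
    # raw_media_days is ignored for every mode other than "days"
    return mode, days if mode == "days" else None


def should_purge(mode: str, days: Optional[int], segment_day: date,
                 today: date, processed: bool) -> bool:
    """Decide whether a segment's raw media is eligible for deletion."""
    if not processed or mode == "keep":
        return False  # unfinished segments are never purged
    if mode == "processed":
        return True
    # Assumption: "after N days" means strictly more than N days old
    return (today - segment_day).days > days
```

With the example configuration, the `plaud` stream resolves to `("days", 7)`, `archon` to `("processed", None)`, and any other stream falls back to the global `("days", 30)`.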
"Raw media" means layer 1 capture files only: audio files (`.flac`, `.opus`, `.ogg`, `.m4a`, `.wav`), video files (`.webm`, `.mov`, `.mp4`), and screen diffs (`monitor_*_diff.png`). All layer 2 and layer 3 content is always preserved regardless of retention policy: transcripts (`audio.jsonl`, `screen.jsonl`), agent outputs (`agents/*.md`), speaker labels (`agents/speaker_labels.json`), facet events (`events/*.jsonl`), entity data, segment metadata (`stream.json`), and search index entries. Raw media is never deleted from segments that haven't finished processing. A segment is considered complete only when all four checks pass: - No `_active.jsonl` files in `agents/` (no running agents) - `audio.jsonl` (or `*_audio.jsonl`) exists if audio raw media was captured - `screen.jsonl` (or `*_screen.jsonl`) exists if video raw media was captured - `agents/speaker_labels.json` exists if voice embeddings (`.npz`) are present Purged segments remain fully navigable in convey. Transcripts, entities, speaker labels, and summaries are all intact. The only difference is that audio/video playback is unavailable. ### Environment variables The `env` block provides fallback values for environment variables. These are loaded at CLI startup and used when the corresponding variable is not set in the shell or `.env` file: ```json { "env": { "GOOGLE_API_KEY": "your-google-api-key", "ANTHROPIC_API_KEY": "your-anthropic-api-key", "OPENAI_API_KEY": "your-openai-api-key", "REVAI_ACCESS_TOKEN": "your-revai-token", "PLAUD_ACCESS_TOKEN": "your-plaud-token" } } ``` **Precedence order** (highest to lowest): 1. Shell environment variables 2. `.env` file in project root 3. Journal config `env` section This allows storing API keys in the journal config as an alternative to `.env`, which can be useful when the journal is synced across machines or when you want to keep all configuration in one place. 
#### Template usage examples

The structured pronoun format enables proper pronoun usage in generated text and agent responses:

```python
# In templates or generated text:
f"{identity.pronouns.subject} joined the meeting"  # "he joined the meeting"
f"I spoke with {identity.pronouns.object}"  # "I spoke with him"
f"That is {identity.pronouns.possessive} desk"  # "That is his desk"
f"{identity.pronouns.subject} did it {identity.pronouns.reflexive}"  # "he did it himself"
```

For complete documentation of the prompt template system including all variable categories, composition patterns, and how to add new variables, see [PROMPT_TEMPLATES.md](PROMPT_TEMPLATES.md).

### Transcribe configuration

The `transcribe` block configures audio transcription settings for `sol transcribe`:

```json
{
  "transcribe": {
    "backend": "whisper",
    "enrich": true,
    "preserve_all": false,
    "whisper": {
      "device": "auto",
      "model": "medium.en",
      "compute_type": "default"
    },
    "revai": {
      "model": "fusion"
    }
  }
}
```

**Top-level fields:**
- `backend` (string) – STT backend to use: `"whisper"` (local processing) or `"revai"` (cloud with speaker diarization). Default: `"whisper"`.
- `enrich` (boolean) – Enable LLM enrichment for topic extraction and transcript correction. Default: `true`.
- `preserve_all` (boolean) – Keep audio files even when no speech is detected. When `false`, silent recordings are deleted to save disk space. Default: `false`.

**Whisper backend settings** (`transcribe.whisper`):
- `device` (string) – Device for inference: `"auto"` (detect GPU, fall back to CPU), `"cpu"`, or `"cuda"`. Default: `"auto"`.
- `model` (string) – Whisper model to use (e.g., `"tiny.en"`, `"base.en"`, `"small.en"`, `"medium.en"`, `"large-v3-turbo"`, `"distil-large-v3"`). Default: `"medium.en"`.
- `compute_type` (string) – Compute precision: `"default"` (auto-select optimal for platform), `"float32"` (most compatible), `"float16"` (faster on CUDA GPUs), `"int8"` (fastest on CPU). Default: `"default"`.
**Rev.ai backend settings** (`transcribe.revai`):
- `model` (string) – Rev.ai transcriber model: `"fusion"` (best quality), `"machine"` (fast automated), or `"low_cost"`. Default: `"fusion"`.

**Platform auto-detection** (Whisper): When `compute_type` is `"default"`, optimal settings are automatically selected:
- **CUDA GPU**: Uses `float16` for GPU-optimized inference
- **CPU (including Apple Silicon)**: Uses `int8` for ~2x faster inference and significantly faster model loading

Voice embeddings (resemblyzer) also auto-detect the best device: MPS on Apple Silicon (~16x faster), CUDA when available, or CPU fallback.

CLI flags can override settings: `--backend` selects the backend, `--cpu` forces CPU mode with int8 (Whisper only), `--model MODEL` overrides the Whisper model.

### Describe configuration

The `describe` block configures screen analysis settings for `sol describe`:

```json
{
  "describe": {
    "max_extractions": 20,
    "categories": {
      "code": {
        "importance": "high",
        "extraction": "Extract when viewing different repositories or files"
      },
      "gaming": {
        "importance": "ignore"
      }
    }
  }
}
```

**Fields:**
- `max_extractions` (integer) – Maximum number of frames to run detailed content extraction on per video. The first qualified frame is always extracted regardless of this limit. When more frames are eligible, selection uses AI-based prioritization (falling back to random selection). Default: `20`.
- `categories` (object) – Per-category overrides for importance and extraction guidance.

#### Category overrides

Each category (e.g., `code`, `meeting`, `browsing`) can have:

| Field | Values | Description |
|-------|--------|-------------|
| `importance` | `high`, `normal`, `low`, `ignore` | Advisory priority hint for AI frame selection. `high` prioritizes these frames, `low` deprioritizes unless unique, `ignore` suggests skipping unless categorization seems wrong. Default: `normal`. |
| `extraction` | string | Custom guidance for when to extract content from this category. Overrides the default from the category's `.json` file. |

Importance levels are advisory hints passed to the AI selection process, not hard filters. The AI may still select frames from `ignore` categories if it determines the content is valuable or the categorization may be incorrect.

### Providers configuration

The `providers` block enables fine-grained control over which LLM provider and model is used for different contexts. This supports a tier-based system where you can specify capability levels (pro/flash/lite) rather than specific model names.

```json
{
  "providers": {
    "default": {
      "provider": "google",
      "tier": 2
    },
    "contexts": {
      "observe.*": {"provider": "google", "tier": 3},
      "talent.system.*": {"tier": 1},
      "talent.system.meetings": {"provider": "anthropic", "disabled": true},
      "talent.entities.observer": {"tier": 2, "extract": false}
    },
    "models": {
      "google": {
        "1": "gemini-3-pro-preview",
        "2": "gemini-3-flash-preview",
        "3": "gemini-2.5-flash-lite"
      }
    }
  }
}
```

#### Tier system

Tiers provide a provider-agnostic way to specify model capability levels:

| Tier | Name | Description |
|------|-------|-------------|
| 1 | pro | Highest capability, best for complex reasoning |
| 2 | flash | Balanced performance and cost (default) |
| 3 | lite | Fastest and cheapest, for simple tasks |

System defaults map tiers to models for each provider. See `think/models.py` for current tier-to-model mappings (`PROVIDER_DEFAULTS` constant). If a requested tier is unavailable for a provider, the system falls back to more capable tiers (e.g., tier 3 → tier 2 → tier 1).

#### Context matching

Contexts are matched in order of specificity:

1. **Exact match** – `"talent.system.meetings"` matches only that exact context
2. **Glob pattern** – `"observe.*"` matches any context starting with `observe.`
3. **Default** – Falls back to the `default` configuration

#### Context naming convention

Talent configs (agents and generators) use the pattern `talent.{source}.{name}`:
- System configs: `talent.system.{name}` (e.g., `talent.system.meetings`, `talent.system.default`)
- App configs: `talent.{app}.{name}` (e.g., `talent.entities.observer`, `talent.support.support`)

Other contexts follow the pattern `{module}.{feature}[.{operation}]`:
- Observe pipeline: `observe.describe.frame`, `observe.enrich`, `observe.transcribe.gemini`

#### Configuration options

**default** – Global defaults applied when no context matches:
- `provider` (string) – Provider name: `"google"`, `"openai"`, or `"anthropic"`. Default: `"google"`.
- `tier` (integer) – Tier number (1-3). Default: `2` (flash).
- `model` (string) – Explicit model name (overrides tier if specified).

**contexts** – Context-specific overrides. Each key is a context pattern, value is:
- `provider` (string) – Override provider (optional, inherits from default).
- `tier` (integer) – Tier number (optional).
- `model` (string) – Explicit model name (optional, overrides tier).
- `disabled` (boolean) – Disable this talent config (optional, talent contexts only).
- `extract` (boolean) – Enable/disable event extraction for generators with occurrence/anticipation hooks (optional).

**models** – Per-provider tier overrides. Maps provider name to tier-model mappings:

```json
{
  "google": {"1": "gemini-3-pro-preview", "2": "gemini-3-flash-preview"},
  "openai": {"2": "gpt-5-mini-custom"}
}
```

Note: Tier keys in JSON must be strings (`"1"`, `"2"`, `"3"`) since JSON doesn't support integer keys.

## Facet folders

The `facets/` directory provides a way to organize journal content by scope or focus area. Each facet represents a cohesive grouping of related activities, projects, or areas of interest.

### Facet structure

Each facet is organized as `facets//` where `` is a descriptive short unique name.
When referencing facets in the system, use hashtags (e.g., `#personal` for the "Personal Life" facet, `#ml_research` for "Machine Learning Research").

Each facet folder contains:

- `facet.json` – metadata file with facet title and description.
- `activities/` – configured activities and completed activity records (see [Activity Records](#activity-records)).
- `entities/` – entity relationships and detected entities (see [Facet Entities](#facet-entities)).
- `todos/` – daily todo lists (see [Facet-Scoped Todos](#facet-scoped-todos)).
- `events/` – extracted events per day (see [Event extracts](#event-extracts)).
- `news/` – daily news and updates relevant to the facet (optional).
- `logs/` – action audit logs for tool calls (optional, see [Action Logs](#action-logs)).

### Facet metadata

The `facet.json` file contains basic information about the facet:

```json
{
  "title": "Machine Learning Research",
  "description": "AI/ML research projects, experiments, and related activities",
  "color": "#4f46e5",
  "emoji": "🧠"
}
```

Optional fields:
- `color` – hex color code for the facet card background in the web UI
- `emoji` – emoji icon displayed in the top-left of the facet card
- `muted` – boolean flag to mute/hide the facet from views (default: false)

### Facet Entities

Entities in solstone use a two-tier architecture with **journal-level entities** (canonical identity) and **facet relationships** (per-facet context). There are also **detected entities** (daily discoveries) that can be promoted to attached status.
#### Entity Storage Structure

```
entities/
└── {entity_id}/
    └── entity.json              # Journal-level entity (canonical identity)

facets/{facet}/
└── entities/
    ├── YYYYMMDD.jsonl           # Daily detected entities
    └── {entity_id}/
        ├── entity.json          # Facet relationship
        ├── observations.jsonl   # Durable facts (optional)
        └── voiceprints.npz      # Voice recognition data (optional)
```

**Journal-level entities** (`entities//entity.json`) store the canonical identity: name, type, aliases (aka), and principal flag. These are shared across all facets.

**Facet relationships** (`facets//entities//entity.json`) store per-facet context: description, timestamps, and custom fields specific to that facet.

**Entity memory** (observations, voiceprints) is stored alongside facet relationships.

#### Journal-Level Entities

Journal entities represent the canonical identity record:

```json
{
  "id": "alice_johnson",
  "name": "Alice Johnson",
  "type": "Person",
  "aka": ["Ali", "AJ"],
  "is_principal": false,
  "created_at": 1704067200000
}
```

**Standard fields:**
- `id` (string) – Stable slug identifier derived from name via `entity_slug()` in `think/entities/` (lowercase, underscores, e.g., "Alice Johnson" → "alice_johnson"). Used for folder paths, URLs, and tool references.
- `name` (string) – Display name for the entity.
- `type` (string) – Entity type (e.g., "Person", "Company", "Project", "Tool"). Types are flexible and owner-defined; must be alphanumeric with spaces, minimum 3 characters.
- `aka` (array of strings) – Alternative names, nicknames, or acronyms. Used in audio transcription and fuzzy matching.
- `is_principal` (boolean) – When `true`, identifies this entity as the journal owner. Auto-flagged when name/aka matches identity config.
- `blocked` (boolean) – When `true`, entity is hidden from all facets and excluded from agent context.
- `created_at` (integer) – Unix timestamp in milliseconds when entity was created.
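The slug and type rules above can be approximated with a couple of regular expressions. This is only a sketch under stated assumptions: `slugify_name` is a hypothetical stand-in and not the real `entity_slug()` from `think/entities/`, whose handling of punctuation and non-ASCII characters is not documented here; `is_valid_type` likewise encodes just the "alphanumeric with spaces, minimum 3 characters" rule.

```python
import re


def slugify_name(name: str) -> str:
    """Lowercase the name and join runs of ASCII alphanumerics with underscores."""
    # Assumption: punctuation and non-ASCII characters act as separators
    return "_".join(re.findall(r"[a-z0-9]+", name.lower()))


def is_valid_type(entity_type: str) -> bool:
    """Accept types made of letters, digits, and spaces, at least 3 characters long."""
    return len(entity_type) >= 3 and re.fullmatch(r"[A-Za-z0-9 ]+", entity_type) is not None
```

Under these assumptions, "Alice Johnson" maps to `alice_johnson`, matching the example in the field list, while a two-letter type such as "AI" is rejected.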
#### Facet Relationships

Facet relationships link journal entities to specific facets with context:

```json
{
  "entity_id": "alice_johnson",
  "description": "Lead engineer on the API project",
  "attached_at": 1704067200000,
  "updated_at": 1704153600000,
  "last_seen": "20260115"
}
```

**Relationship fields:**
- `entity_id` (string) – Links to the journal entity.
- `description` (string) – Facet-specific description.
- `attached_at` (integer) – Unix timestamp in milliseconds when attached to this facet.
- `updated_at` (integer) – Unix timestamp in milliseconds of last modification.
- `last_seen` (string) – Day (YYYYMMDD) when last mentioned in journal content.
- `detached` (boolean) – When `true`, soft-deleted from this facet but data preserved.
- Custom fields (any) – Additional facet-specific metadata (e.g., `tier`, `status`, `priority`).

#### Detected Entities

Daily detection files (`facets//entities/YYYYMMDD.jsonl`) contain entities automatically discovered by agents from journal content:

```jsonl
{"type": "Person", "name": "Charlie Brown", "description": "Mentioned in standup meeting"}
{"type": "Tool", "name": "React", "description": "Used in UI development work"}
```

#### Entity Lifecycle

1. **Detection**: Daily agents scan journal content and record entities in `facets//entities/YYYYMMDD.jsonl`
2. **Aggregation**: Review agent tracks detection frequency across recent days
3. **Promotion**: Entities with 3+ detections are auto-promoted to attached, or owners manually promote via UI
4. **Persistence**: Creates journal entity + facet relationship; remains active until detached
5. **Detachment**: Sets `detached: true` on facet relationship, preserving all data
6. **Re-attachment**: Clears detached flag, restoring the entity with preserved history
7. **Blocking**: Sets `blocked: true` on journal entity and detaches from all facets

#### Cross-Facet Behavior

The same entity can be attached to multiple facets with independent descriptions and timestamps. When loading entities across all facets, the alphabetically-first facet wins for duplicates during aggregation.

### Facet News

The `news/` directory provides a chronological record of news, updates, and external developments relevant to the facet. This allows tracking of industry news, research updates, regulatory changes, or any external information that impacts the facet's focus area.

#### News organization

News files are organized by date as `news/YYYYMMDD.md` where each file contains the day's relevant news items. Only create files for days that have news to record; sparse population is expected.

#### News file format

Each `YYYYMMDD.md` file is a markdown document with a consistent structure:

```markdown
# 2025-01-18 News - Machine Learning Research

## OpenAI Announces New Model Architecture
**Source:** techcrunch.com | **Time:** 09:15

Summary of the announcement and its relevance to current research projects...

## Paper: "Efficient Attention Mechanisms in Transformers"
**Source:** arxiv.org | **Time:** 14:30

Key findings from the paper and potential applications...

## Google Research Updates Dataset License Terms
**Source:** blog.google | **Time:** 16:45

Changes to dataset licensing that may affect ongoing experiments...
```

#### News entry structure

Each news entry should include:
- **Title** – concise headline as a level 2 heading
- **Source** – origin of the news (website, journal, etc.)
- **Time** – optional time of publication or discovery (HH:MM format)
- **Summary** – brief description focusing on relevance to the facet
- **Impact** – optional notes on how this affects facet work

#### News metadata

Optionally, a `news.json` file can be maintained at the root of the news directory to track metadata:

```json
{
  "last_updated": "2025-01-18",
  "sources": ["arxiv.org", "techcrunch.com", "nature.com"],
  "auto_fetch": false,
  "keywords": ["transformer", "attention", "llm", "research"]
}
```

This allows for future automation of news gathering while maintaining manual curation quality.

### Activity Records

The `activities/` directory within each facet stores both the configured activity types (`activities.jsonl`) and completed activity records organized by day (`{day}.jsonl`). Activity records represent completed spans of activity: periods where a specific activity type was continuously tracked across one or more recording segments.

**File path pattern:**

```
facets/personal/activities/activities.jsonl   # Configured activity types
facets/personal/activities/20260209.jsonl     # Completed records for the day
facets/work/activities/20260209.jsonl
facets/work/activities/20260209/coding_095809_303/session_review.md   # Generated output
```

Each day file contains one JSON object per line, where each record represents a completed activity span:

```jsonl
{"id": "coding_095809_303", "activity": "coding", "segments": ["095809_303", "100313_303", "100816_303", "101320_302"], "level_avg": 0.88, "description": "Developed extraction prompts using Claude Code and VS Code", "active_entities": ["Claude Code", "VS Code", "sunstone"], "created_at": 1770435619415}
{"id": "meeting_090953_303", "activity": "meeting", "segments": ["090953_303", "091457_303", "092001_304", "092506_304", "093010_304"], "level_avg": 1.0, "description": "Sprint planning meeting with the engineering team", "active_entities": ["Alice", "Bob"], "created_at": 1770435619420}
```

#### Record ID scheme

Activity record IDs follow the format `{activity_type}_{segment_key}` where `segment_key` is the segment in which the activity started. This is unique within a facet+day because only one activity of a given type can start in a given segment for one facet.

#### Record fields

- `id` (string) – Unique identifier: `{activity}_{start_segment_key}` (e.g., `coding_095809_303`)
- `activity` (string) – Activity type ID from the facet's configured activities
- `segments` (array of strings) – Ordered list of segment keys where this activity was active
- `level_avg` (float) – Average engagement level across all segments (high=1.0, medium=0.5, low=0.25)
- `description` (string) – AI-synthesized description of the full activity span
- `active_entities` (array of strings) – Merged and deduplicated entity names from all segments
- `created_at` (integer) – Unix timestamp in milliseconds when the record was created

#### Lifecycle

Activity records are created by the `activities` segment agent when it detects that an activity has ended:

1. The `activity_state` agent tracks per-segment, per-facet activity states with continuity via `since` fields. Each entry includes an `id` field (`{activity}_{since}`) that uniquely identifies the activity span, and `activity.live` events are emitted for active entries.
2. The `activities` agent runs after `activity_state` and compares previous vs. current segment states
3. When an activity ends (explicitly, implicitly, or via timeout), the agent walks the segment chain to collect all data
4. A record is written to the facet's day file with preliminary description
5. An LLM synthesizes all per-segment descriptions into a unified narrative
6. The record description is updated with the synthesized version

**Segment flush:** If no new segments arrive for an extended period (1 hour), the supervisor triggers `sol dream --flush` on the last segment.
Agents that declare `hook.flush: true` (like `activities`) run with `flush=True` in their context, treating all remaining active activities as ended. This ensures activities are recorded promptly even when the owner stops working, and prevents cross-day data loss.

Records are written idempotently: duplicate IDs are skipped on re-runs.

#### Generated output

Activity-scheduled agents (`schedule: "activity"`) produce output that is stored alongside the activity records, organized by day and record ID:

```
facets/{facet}/activities/{day}/{activity_id}/{agent}.{ext}
```

For example, a `session_review` agent processing a coding activity would write to:

```
facets/work/activities/20260209/coding_095809_303/session_review.md
```

These output directories are only created when activity-scheduled agents run. The path is computed by `get_activity_output_path()` in `think/activities.py` and passed as `output_path` in the agent request. Output files are indexed for search via the `facets/*/activities/*/*/*.md` formatter pattern.

## Facet-Scoped Todos

Todos are organized by facet in `facets/{facet}/todos/{day}.jsonl` where each file stores todo items as JSON Lines. Todos belong to a specific facet (e.g., "personal", "work", "research") and are completely separated by scope.

**File path pattern:**

```
facets/personal/todos/20250110.jsonl
facets/work/todos/20250110.jsonl
facets/research/todos/20250112.jsonl
```

Each file contains one JSON object per line, with the line number (1-indexed) serving as the stable todo ID.
```jsonl
{"text": "Draft standup update"}
{"text": "Review PR #1234 for indexing tweaks", "time": "14:30"}
{"text": "Morning planning session notes", "completed": true}
{"text": "Cancel meeting with vendor", "cancelled": true}
```

### Format Specification

**JSONL structure:** Each line is a JSON object with the following fields:
- `text` (required) – Task description
- `time` (optional) – Scheduled time in `HH:MM` format (e.g., `"14:30"`)
- `completed` (optional) – Set to `true` when task is done
- `cancelled` (optional) – Set to `true` for soft-deleted tasks
- `created_at` (optional) – Unix timestamp in milliseconds when todo was created
- `updated_at` (optional) – Unix timestamp in milliseconds of last modification

**Facet context:**
- Facet is determined by the file location, not inline tags
- Each facet has its own independent todo list for each day
- Work todos (`facets/work/todos/`) are completely separate from personal todos (`facets/personal/todos/`)

**Rules:**
- Line number is the stable todo ID (1-indexed); todos are never removed, only cancelled
- Append new todos at the end of the file to maintain stable line numbering
- Mark completed items with `"completed": true`
- Cancel items with `"cancelled": true` (soft delete preserves line numbers)

**Tool Access:** All todo operations require both `day` and `facet` parameters:
- `todo_list(day, facet)` – view numbered checklist for a specific facet
- `todo_add(day, facet, text)` – append new todo
- `todo_done(day, facet, line_number)` – mark complete
- `todo_cancel(day, facet, line_number)` – cancel entry (soft delete)
- `todo_upcoming(limit, facet=None)` – view upcoming todos (optionally filtered by facet)

This facet-scoped structure provides true separation of concerns while enabling automated tools to manage tasks deterministically.

## Action Logs

Action logs record an audit trail of owner-initiated actions and agent tool calls.
There are two types:

- **Journal-level logs** (`config/actions/`) – actions not tied to a specific facet (settings changes, observer management)
- **Facet-scoped logs** (`facets/{facet}/logs/`) – actions within a specific facet (todos, entities)

### Journal Action Logs

The `config/actions/` directory records journal-level actions. Logs are organized by day as `config/actions/YYYYMMDD.jsonl`.

```json
{
  "timestamp": "2025-12-16T07:33:05.135587+00:00",
  "source": "app",
  "actor": "settings",
  "action": "identity_update",
  "params": {
    "changed_fields": {"name": {"old": "John", "new": "John Doe"}}
  }
}
```

### Facet Action Logs

The `logs/` directory within each facet records facet-scoped actions. Logs are organized by day as `facets/{facet}/logs/YYYYMMDD.jsonl`.

```json
{
  "timestamp": "2025-12-16T07:33:05.135587+00:00",
  "source": "tool",
  "actor": "todos:todo",
  "action": "todo_add",
  "params": {
    "text": "Review project proposal"
  },
  "facet": "work",
  "agent_id": "1765870373972"
}
```

### Log Entry Fields

Both log types share the same structure:

- `timestamp` – ISO 8601 timestamp of the action
- `source` – Origin type: "app" for web UI, "tool" for agent tools
- `actor` – App or tool name that performed the action
- `action` – Action name (e.g., "todo_add", "identity_update")
- `params` – Action-specific parameters
- `facet` – Facet name (only present in facet-scoped logs)
- `agent_id` – Agent ID (only present for agent tool actions)

These logs enable auditing, debugging, and potential rollback of automated actions.

## Token Usage

The `tokens/` directory tracks token usage from all AI model calls across the system. Usage data is organized by day as `tokens/YYYYMMDD.jsonl` where each file contains JSON Lines entries for that day's API calls.
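Because each day's usage lives in a single JSON Lines file, per-model totals can be computed with a short script. The sketch below is illustrative: `summarize_tokens` is a hypothetical helper, and it relies only on the `model` and `usage` fields documented in the token log format section of this guide.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_tokens(journal: Path, day: str) -> dict[str, Counter]:
    """Sum usage counters per model for tokens/<day>.jsonl; missing files yield {}."""
    totals: dict[str, Counter] = {}
    log_path = journal / "tokens" / f"{day}.jsonl"
    if not log_path.exists():
        return totals
    with log_path.open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines
            entry = json.loads(line)
            counter = totals.setdefault(entry["model"], Counter())
            # Usage fields are optional, so add whatever counts are present
            for field, count in entry.get("usage", {}).items():
                counter[field] += count
    return totals
```

Calling `summarize_tokens(Path("/path/to/journal"), "20250113")` would return one `Counter` per model, keyed by usage field name.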
### Token log format

Each line in a token log file is a JSON object with the following structure:

```json
{
  "timestamp": 1736812345000,
  "model": "gemini-2.5-flash",
  "context": "agent.default.20250113_143022",
  "segment": "143022_300",
  "usage": {
    "input_tokens": 1500,
    "output_tokens": 500,
    "total_tokens": 2000,
    "cached_tokens": 800,
    "reasoning_tokens": 200
  }
}
```

Required fields:
- `timestamp` – Unix timestamp in milliseconds (13 digits)
- `model` – Model identifier (e.g., "gemini-2.5-flash", "gpt-5", "claude-sonnet-4-5")
- `context` – Calling context (e.g., "agent.name.agent_id" or "module.function:line")
- `usage` – Token counts dictionary with normalized field names

Optional fields:
- `segment` – Recording segment key (e.g., "143022_300") when token usage is attributable to a specific observation window

Usage fields (all optional depending on model capabilities):
- `input_tokens` – Tokens in the prompt/input
- `output_tokens` – Tokens in the response/output
- `total_tokens` – Total tokens consumed
- `cached_tokens` – Tokens served from cache (reduces cost)
- `reasoning_tokens` – Tokens used for extended thinking/reasoning
- `requests` – Number of API requests made (for batch operations)

The logging system normalizes provider-specific formats (OpenAI, Gemini, Anthropic) into this unified schema for consistent cost tracking and analysis across all models.

## Agent Event Logs

The `agents/` directory stores event logs for all AI agent sessions managed by Cortex. Each agent session produces a JSONL file containing the complete event history.

**Directory layout:**
- `/` – per-agent subdirectory (e.g., `default/`, `entities--observer/`)
- `/_active.jsonl` – currently running agent (renamed when complete)
- `/.jsonl` – completed agent session
- `.log` – symlink to the latest completed run for each agent name
- `.jsonl` – day index with one summary line per agent that completed on that day

The `agent_id` is a Unix timestamp in milliseconds that uniquely identifies the session.

**Event format (JSONL):**

Each line is a JSON object with an `event` field indicating the event type:

```jsonl
{"event": "start", "ts": 1755450767962, "name": "helper", "prompt": "Help me with...", "facet": "work"}
{"event": "text", "ts": 1755450768000, "content": "I'll help you with that."}
{"event": "tool_call", "ts": 1755450769000, "tool": "search", "params": {"query": "example"}}
{"event": "tool_result", "ts": 1755450770000, "tool": "search", "result": "..."}
{"event": "finish", "ts": 1755450771000, "result": "Here's what I found..."}
```

**Common event types:**
- `start` – agent session started, includes name, prompt, and facet
- `text` – streaming text output from the agent
- `tool_call` – agent invoked a tool
- `tool_result` – result returned from tool execution
- `error` – error occurred during execution
- `finish` – agent session completed, includes final result

See [CORTEX.md](CORTEX.md) for agent architecture and spawning details.

## App Storage

The `apps/` directory provides storage space for Convey apps to persist configuration, data, and artifacts specific to this journal. Each app has its own directory at `apps//` where it can maintain app-specific state independent of the application codebase.

Apps typically use `config.json` for journal-specific settings and create subdirectories for data storage (e.g., `cache/`, `data/`, `logs/`). This is distinct from the app metadata file (`apps//app.json` in the codebase) which defines icon, label, and facet support across all journals.

See [APPS.md](APPS.md) for storage utilities (`get_app_storage_path`, `load_app_config`, `save_app_config`).

## Search Index

The `indexer/` directory contains the full-text search index built from journal content.

**Files:**
- `indexer/journal.sqlite` – FTS5 SQLite database containing indexed chunks from agent outputs, events, entities, todos, and action logs

The indexer converts content to markdown chunks via the formatters framework, then indexes with metadata fields (day, facet, agent) for filtering. Raw audio/screen transcripts are formattable but not indexed, because agent outputs provide more useful search results. Use `get_journal_index()` from `think/indexer/journal.py` to access the database programmatically.

Which content gets indexed is controlled by the `FORMATTERS` registry in `think/formatters.py`. Each entry maps a glob pattern to a formatter function and an `indexed` flag. The registry patterns must be specific enough to use as `Path.glob()` arguments from the journal root, so adding a new content location requires a new entry.

Run `sol indexer` to rebuild the index from current journal content.

## Service Health

The `health/` directory contains log files for long-running services.

**Files:**
- `health/.log` – log output for each service (e.g., `observe.log`, `cortex.log`, `convey.log`)
- `health/retention.log` – JSONL log of retention purge operations with timestamps, files deleted, bytes freed, and per-segment details

These logs are useful for debugging service issues. See [DOCTOR.md](DOCTOR.md) for diagnostics and troubleshooting guidance.

## Imported Audio

The `imports/` directory stores audio files imported via the import app, along with their processing artifacts.
Each import is organized by detected timestamp:

```
imports/
└── YYYYMMDD_HHMMSS/           # Import directory (detected or owner-specified timestamp)
    ├── import.json            # Import metadata and processing status
    ├── {original_filename}    # Original uploaded audio file
    ├── imported.json          # Processed transcript in standard format
    └── segments.json          # List of segment keys created for this import
```

### Import metadata

The `import.json` file tracks the import process:

```json
{
  "original_filename": "meeting_recording.m4a",
  "upload_timestamp": 1755034698276,
  "upload_datetime": "2025-08-12T15:38:18.276000",
  "detection_result": {
    "day": "20250630",
    "time": "143256",
    "confidence": "high",
    "source": "Date/Time Original"
  },
  "detected_timestamp": "20250630_143256",
  "user_timestamp": "20250630_143256",
  "file_size": 13950943,
  "mime_type": "audio/x-m4a",
  "facet": "work",
  "processing_completed": "2025-08-12T15:41:42.970189"
}
```

Once processed, imports are linked into the appropriate day's segment via `imported_audio.jsonl` files that reference the original import location.

## Day folder contents

Within each day, captured content is organized into **segments** (timestamped duration folders). The folder name is the **segment key**, which uniquely identifies the segment within the day and follows this format:

- `HHMMSS_LEN/` – Start time and duration in seconds (e.g., `143022_300/` for a 5-minute segment starting at 14:30:22)

Each segment progresses through the three-layer pipeline: captures are recorded, extracts are generated, and agent outputs are synthesized.

#### Stream identity

Every segment belongs to a **stream**, a named series of segments from a single source. Streams provide navigable chains linking each segment to its predecessor.

- `stream.json` – Per-segment stream marker containing:
  - `stream` – stream name (e.g., `"archon"`, `"import.apple"`)
  - `prev_day` – day of the previous segment in this stream (null for first)
  - `prev_segment` – segment key of the predecessor (null for first)
  - `seq` – sequence number within the stream

Stream names follow the convention: `{hostname}` for local observers, `{observer_name}` for observers, `import.{type}` for imports (e.g., `import.apple`, `import.text`). Global stream state is tracked in the top-level `streams/` directory as `{name}.json` files.

Pre-stream segments (created before stream identity was added) have no `stream.json` and are handled gracefully as `None` throughout the pipeline.

### Layer 1: Captures

Captures are the original binary media files recorded by observation tools.

#### Audio captures

Audio files are initially written to the day root with the segment key prefix (Linux) or directly to segment folders (macOS):

- **Linux**: `HHMMSS_LEN_*.flac` – audio files in day root (e.g., `143022_300_audio.flac`)
- **macOS**: `HHMMSS_LEN/audio.m4a` – audio files written directly to segment folder

After transcription, audio files are moved into their segment folder:

- `HHMMSS_LEN/*.flac`, `*.m4a`, `*.ogg`, `*.opus`, or `*.wav` – audio files moved here after processing, preserving descriptive suffix (e.g., `audio.flac`, `audio.m4a`, `imported_audio.opus`)

Note: The descriptive portion after the segment key (e.g., `_audio`, `_recording`) is preserved when files are moved into segment directories. Processing tools match files by extension only, ignoring the descriptive suffix.
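
To illustrate extension-only matching, here is a minimal sketch that lists the audio captures in a segment folder. The helper name `find_segment_audio` and the `AUDIO_EXTENSIONS` constant are hypothetical; the extension set is taken from the list above, and the actual processing tools may apply additional rules.

```python
from pathlib import Path

# Extensions listed above for audio captures (assumed to be the full set).
AUDIO_EXTENSIONS = {".flac", ".m4a", ".ogg", ".opus", ".wav"}


def find_segment_audio(segment_dir: Path) -> list[Path]:
    """Return audio files in a segment folder, matched by extension only.

    Hypothetical helper for illustration; the descriptive part of the
    filename (e.g. "audio", "imported_audio") is deliberately ignored.
    """
    return sorted(
        path
        for path in segment_dir.iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
```

Because only the suffix is inspected, `audio.flac` and `imported_audio.opus` are both returned, while `audio.jsonl` and screen recordings in the same folder are skipped.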
#### Screen captures

Screen recordings use per-monitor files with position and connector/displayID in the filename:

- **Linux**: `HHMMSS_LEN___screen.webm` – screencast video files in day root (e.g., `143022_300_center_DP-3_screen.webm`)
- **macOS**: `HHMMSS_LEN/__screen.mov` – video files written directly to segment folder (e.g., `center_1_screen.mov`)

After analysis, files are in their segment folder:

- `HHMMSS_LEN/__screen.webm` or `*.mov` – video files (e.g., `center_DP-3_screen.webm`, `center_1_screen.mov`)

For multi-monitor setups, each monitor produces a separate file. Position labels include: `center`, `left`, `right`, `top`, `bottom`, and combinations like `left-top`.

### Layer 2: Extracts

Extracts are structured data files (JSON/JSONL) derived from captures through AI analysis.

#### Audio transcript extracts

The transcript file (`audio.jsonl`) contains a metadata line followed by one JSON object per transcript segment.

Example transcript file:

```jsonl
{"raw": "audio.flac"}
{"start": "00:00:01", "source": "mic", "text": "So we need to finalize the authentication module today."}
{"start": "00:00:15", "source": "sys", "text": "I agree. Let's make sure we have proper unit tests."}
```

**Metadata line (first line):**
- `raw` – path to processed audio file (required)
- `backend` – STT backend used (e.g., "whisper", "revai")
- `model` – model used for transcription (e.g., "medium.en", "revai-fusion")
- `device` – device used for inference (e.g., "cuda", "cpu", "cloud")
- `compute_type` – compute precision used (e.g., "float16", "int8", "api")
- `observer` – observer name if transcribed from an observer source (optional)
- `imported` – object with import metadata for external files (optional):
  - `id` – unique import identifier
  - `facet` – facet name for entity extraction
  - `setting` – contextual setting description

**Transcript statements (subsequent lines):**
- `start` – timestamp in HH:MM:SS format (required)
- `text` – transcribed text (required)
- `source` – audio source: "mic" or "sys" (optional)
- `speaker` – speaker identifier, numeric or string (optional, not currently populated)
- `corrected` – LLM-corrected version of text (optional, added during enrichment)
- `description` – tone or delivery description, e.g., "enthusiastic", "questioning" (optional, added during enrichment)

#### Screen frame extracts

Screen analysis files use per-monitor naming: `__screen.jsonl` (e.g., `center_DP-3_screen.jsonl`, `left_HDMI-1_screen.jsonl`). For single-monitor setups, the file is simply `screen.jsonl`.

Each file contains one JSON object per qualified frame. Frames qualify when they show significant visual change (≥5% RMS difference) compared to the previous qualified frame.
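
The qualification rule can be sketched as follows. This is only an illustration under stated assumptions: frames are modelled as flat sequences of 8-bit grayscale values, the RMS difference is normalized by the maximum pixel value (255) so that 5% corresponds to `0.05`, and the first frame is assumed to qualify. The function names are hypothetical, and the actual analyzer may compute the difference differently.

```python
import math
from typing import Sequence


def rms_difference(frame_a: Sequence[int], frame_b: Sequence[int], max_value: int = 255) -> float:
    """Root-mean-square pixel difference as a fraction of max_value (assumed normalization)."""
    if len(frame_a) != len(frame_b) or not frame_a:
        raise ValueError("frames must be non-empty and the same size")
    mean_square = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    return math.sqrt(mean_square) / max_value


def qualifying_frames(frames: Sequence[Sequence[int]], threshold: float = 0.05) -> list[int]:
    """Return indices of frames that differ enough from the last qualified frame."""
    qualified: list[int] = []
    last = None
    for index, frame in enumerate(frames):
        # Assumption: the first frame always qualifies, since there is nothing to compare against.
        if last is None or rms_difference(frame, last) >= threshold:
            qualified.append(index)
            last = frame  # later frames are compared against this one
    return qualified
```

Comparing against the previous *qualified* frame, not the immediately preceding frame, means a slow drift eventually crosses the threshold even when no single step does.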

Example frame record:

```json
{
  "frame_id": 123,
  "timestamp": 45.67,
  "requests": [
    {"type": "describe", "model": "gemini-2.5-flash-lite", "duration": 0.5},
    {"type": "category", "category": "reading", "model": "gemini-3-flash", "duration": 1.2}
  ],
  "analysis": {
    "visual_description": "Documentation page showing API reference.",
    "primary": "reading",
    "secondary": "none",
    "overlap": true
  },
  "content": {
    "reading": "# API Reference\n\n## Authentication\n\nUse Bearer tokens..."
  }
}
```

**Common fields:**
- `frame_id` – sequential frame number in the video
- `timestamp` – time in seconds from video start
- `requests` – list of vision API requests made for this frame (type: "describe" for initial, "category" for follow-ups)
- `analysis` – categorization result with `primary`, `secondary`, `overlap`, and `visual_description`
- `content` – object containing category-specific extracted content (see below)
- `error` – present when processing failed after retries

**Category-specific content (inside `content` object):**
- `messaging` – markdown content when frame contains chat/email apps
- `browsing` – markdown content when frame contains web browsing
- `reading` – markdown content when frame contains documents/articles
- `productivity` – markdown content when frame contains spreadsheets/slides/calendars
- `meeting` – JSON object when frame contains video conferencing, includes participant detection and bounding boxes

The vision analysis uses multi-stage conditional processing:
1. Initial categorization determines content type (e.g., `code`, `meeting`, `browsing`, `reading`). See `observe/categories/` for the full list of categories.
2. Category-specific follow-up prompts are discovered from `observe/categories/*.md` files
3. Follow-ups are triggered for categories that have extraction content in their `.md` file (currently: messaging, browsing, reading, productivity output markdown; meeting outputs JSON)

#### Event extracts

Generator output processing extracts time-based events from the day's transcripts: meetings, messages, follow-ups, file activity, and more. Events are stored per-facet in JSONL files at `facets/{facet}/events/{day}.jsonl`.

There are two types of events:
- **Occurrences** – events that happened on the capture day (`occurred: true`)
- **Anticipations** – future scheduled events extracted from calendar views (`occurred: false`)

```jsonl
{"type": "meeting", "start": "09:00:00", "end": "09:30:00", "title": "Team stand-up", "summary": "Status update with the engineering team", "work": true, "participants": ["Jeremie Miller", "Alice", "Bob"], "facet": "work", "agent": "meetings", "occurred": true, "source": "20250101/agents/meetings.md", "details": "Sprint planning discussion"}
{"type": "deadline", "date": "2025-01-15", "start": null, "end": null, "title": "Project milestone", "summary": "Q1 deliverable due", "work": true, "participants": [], "facet": "work", "agent": "schedule", "occurred": false, "source": "20250101/agents/schedule.md", "details": "Final review before release"}
```

**Common fields:**
- **type** – event kind: `meeting`, `message`, `file`, `followup`, `documentation`, `research`, `media`, `deadline`, `appointment`, etc.
- **start** and **end** – HH:MM:SS timestamps (or `null` for anticipations without specific times)
- **date** – ISO date YYYY-MM-DD (anticipations only, indicates scheduled date)
- **title** and **summary** – short text for display and search
- **facet** – facet name the event belongs to (required)
- **agent** – source generator type (e.g., "meetings", "schedule", "flow")
- **occurred** – `true` for occurrences, `false` for anticipations
- **source** – path to the output file that generated this event
- **work** – boolean, work vs. personal classification
- **participants** – optional list of people or entities involved
- **details** – free-form string with additional context

This structure allows the indexer to collect and search events across all facets and days.

### Layer 3: Agent Outputs

Agent outputs are AI-generated markdown files that provide human-readable narratives synthesized from captures and extracts.

#### Segment outputs

After captures are processed, segment-level outputs are generated within each segment folder as `HHMMSS_LEN/*.md` files. Available segment output types are defined by templates in `talent/` with `"schedule": "segment"` in their metadata JSON.

#### Daily outputs

Post-processing generates day-level outputs in the `agents/` directory that synthesize all segments.

**Generator discovery:** Available generator types are discovered at runtime from:
- `talent/*.md` – system generator templates (files with `schedule` field but no `tools` field)
- `apps/{app}/talent/*.md` – app-specific generator templates

Each template is a `.md` file with JSON frontmatter containing metadata (title, description, schedule, output format). The `schedule` field is required and must be `"segment"` or `"daily"`; generators with a missing or invalid schedule are skipped. Use `get_talent_configs(has_tools=False)` from `think/talent.py` to retrieve all available generators, or `get_talent_configs(has_tools=False, schedule="daily")` to get generators filtered by schedule.

**Output naming:**
- System outputs: `agents/{agent}.md` (e.g., `agents/flow.md`, `agents/meetings.md`)
- App outputs: `agents/_{app}_{agent}.md` (e.g., `agents/_entities_observer.md`)
- JSON output: `agents/{agent}.json` when metadata specifies `"output": "json"`

Each generator type has a corresponding template file (`{name}.md`) that defines how the AI synthesizes extracts into narrative form.
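
The output naming rules above can be expressed as a small path helper. This is a sketch only: `daily_output_path` is a hypothetical name, not a function in `think/talent.py`, and combining the app prefix with a `.json` extension is an assumption, since only the system-generator case is documented for JSON output.

```python
from pathlib import Path
from typing import Optional


def daily_output_path(
    day_dir: Path,
    agent: str,
    app: Optional[str] = None,
    output: str = "md",
) -> Path:
    """Build the daily output path for a generator (hypothetical helper).

    - System generators: agents/{agent}.md
    - App generators:    agents/_{app}_{agent}.md
    - output == "json":  extension becomes .json (assumed to apply to app outputs too)
    """
    extension = "json" if output == "json" else "md"
    stem = f"_{app}_{agent}" if app else agent
    return day_dir / "agents" / f"{stem}.{extension}"
```

For example, `daily_output_path(Path("20250101"), "observer", app="entities")` yields `20250101/agents/_entities_observer.md`, matching the app output example above.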