A Rust implementation of skywatch-phash

Metrics Module#

Purpose#

This module provides a centralized, thread-safe mechanism for tracking application-wide metrics. It uses lock-free atomic counters to monitor job flow, cache performance, and moderation actions, so worker threads never contend on a lock to record an event.

Key Components#

mod.rs#

  • Metrics: The public handle to the counters. It wraps the shared state in an Arc, so it can be cloned cheaply and passed safely to every asynchronous task and worker; all clones update the same counters.
  • MetricsInner: The internal struct that holds the AtomicU64 counters themselves.
  • MetricsSnapshot: A struct of plain values copied from the counters at a given moment. Each counter is read with its own atomic load, so a snapshot taken while workers are running is approximate across counters rather than a single atomic view of all of them.
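The relationship between these three types can be sketched as follows. This is a minimal illustration rather than the module's actual source: only three of the counters are shown, and the field layout, the derives, and the snapshot() accessor name are assumptions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Internal storage: one atomic counter per metric (only three shown here).
#[derive(Default)]
struct MetricsInner {
    jobs_received: AtomicU64,
    cache_hits: AtomicU64,
    cache_misses: AtomicU64,
}

/// Cheaply cloneable handle; every clone points at the same counters.
#[derive(Clone, Default)]
struct Metrics {
    inner: Arc<MetricsInner>,
}

/// Plain-integer copy of the counters, safe to log or compare.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MetricsSnapshot {
    jobs_received: u64,
    cache_hits: u64,
    cache_misses: u64,
}

impl Metrics {
    fn new() -> Self {
        Self::default()
    }

    fn inc_jobs_received(&self) {
        self.inner.jobs_received.fetch_add(1, Ordering::Relaxed);
    }

    /// Hypothetical accessor: reads each counter with its own relaxed load.
    fn snapshot(&self) -> MetricsSnapshot {
        MetricsSnapshot {
            jobs_received: self.inner.jobs_received.load(Ordering::Relaxed),
            cache_hits: self.inner.cache_hits.load(Ordering::Relaxed),
            cache_misses: self.inner.cache_misses.load(Ordering::Relaxed),
        }
    }
}
```

Cloning a Metrics value only bumps the Arc reference count, so an increment made through one clone is visible through every other clone.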

Tracked Metrics#

The module tracks several key areas:

  • Job Processing: jobs_received, jobs_processed, jobs_failed, jobs_retried.
  • Blob Processing: blobs_processed, blobs_downloaded.
  • Phash Matching: matches_found.
  • Cache Performance: cache_hits, cache_misses.
  • Moderation Actions: posts_labeled, posts_reported, accounts_labeled, accounts_reported.
  • Deduplication: posts_already_labeled, accounts_already_reported, etc., which count actions skipped because they had already been taken.

Key Methods#

  • new(): Creates a new Metrics instance with all counters initialized to zero.
  • inc_*(): A suite of methods (e.g., inc_jobs_processed()) to atomically increment a specific counter.
  • log_stats(): Prints a formatted, human-readable summary of all current metric values to the log.
  • cache_hit_rate(): A computed metric that returns the cache hit percentage, derived from cache_hits and cache_misses.
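A rough sketch of how these methods might fit together is shown below. Several details are assumptions made for illustration: the names inc_cache_hits and inc_cache_misses follow the inc_* pattern but are not confirmed, format_stats is a hypothetical helper, println! stands in for whatever logging facility the project uses, and returning 0.0 when no lookups have been recorded is a guess at the edge-case behavior.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

#[derive(Default)]
struct MetricsInner {
    jobs_processed: AtomicU64,
    cache_hits: AtomicU64,
    cache_misses: AtomicU64,
}

#[derive(Clone, Default)]
struct Metrics {
    inner: Arc<MetricsInner>,
}

impl Metrics {
    /// All counters start at zero.
    fn new() -> Self {
        Self::default()
    }

    fn inc_jobs_processed(&self) {
        self.inner.jobs_processed.fetch_add(1, Ordering::Relaxed);
    }

    // Assumed names, following the inc_* pattern.
    fn inc_cache_hits(&self) {
        self.inner.cache_hits.fetch_add(1, Ordering::Relaxed);
    }

    fn inc_cache_misses(&self) {
        self.inner.cache_misses.fetch_add(1, Ordering::Relaxed);
    }

    /// Hit percentage in the range 0.0..=100.0.
    /// Assumption: 0.0 is returned when there have been no lookups.
    fn cache_hit_rate(&self) -> f64 {
        let hits = self.inner.cache_hits.load(Ordering::Relaxed);
        let misses = self.inner.cache_misses.load(Ordering::Relaxed);
        let total = hits + misses;
        if total == 0 {
            0.0
        } else {
            hits as f64 / total as f64 * 100.0
        }
    }

    /// Hypothetical helper that builds the summary line.
    fn format_stats(&self) -> String {
        format!(
            "jobs_processed={} cache_hits={} cache_misses={} cache_hit_rate={:.1}%",
            self.inner.jobs_processed.load(Ordering::Relaxed),
            self.inner.cache_hits.load(Ordering::Relaxed),
            self.inner.cache_misses.load(Ordering::Relaxed),
            self.cache_hit_rate(),
        )
    }

    /// println! is a stand-in for the real logging call.
    fn log_stats(&self) {
        println!("{}", self.format_stats());
    }
}
```

With three hits and one miss recorded, cache_hit_rate() in this sketch evaluates to 75.0.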

Design & Performance#

This module uses std::sync::atomic::AtomicU64 with Ordering::Relaxed for all counter operations. This choice is intentional:

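  • Performance: An atomic increment is a single non-blocking read-modify-write operation. Unlike a contended mutex, it never parks the thread or requires a system call, which suits high-throughput, multi-threaded applications.
  • Relaxed Ordering: Since the counters are independent of each other and are not used to synchronize access to other data, we don't need strict memory ordering guarantees. Relaxed still guarantees that each individual operation is atomic (no increment is lost), but imposes no ordering relative to other memory accesses, which makes it the cheapest ordering available.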
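The claim that Relaxed is sufficient for independent counters can be checked with a small standalone experiment. The function name count_concurrently is made up for this sketch and is not part of the module.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `threads` threads that each perform `increments_per_thread`
/// relaxed increments on one shared counter, then returns the final value.
fn count_concurrently(threads: u64, increments_per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_thread {
                    // Atomic read-modify-write: concurrent increments
                    // cannot overwrite each other, even with Relaxed.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Joining a thread synchronizes with its completion, so the load
    // below observes every increment made by the joined threads.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    counter.load(Ordering::Relaxed)
}
```

Calling count_concurrently(8, 10_000) always yields 80_000. What Relaxed does not provide is any ordering between different counters: a reader that loads two counters while writers are active may see a newer value of one and an older value of the other.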

Usage Pattern#

  1. A single Metrics instance is created in main.rs at application startup.
  2. This instance is cloned and passed to the Jetstream consumer, the job receiver, and each worker in the worker pool.
  3. As events occur (e.g., a job is processed or a cache lookup hits), the relevant component calls the appropriate inc_* method.
  4. A dedicated task in main.rs calls log_stats() periodically (e.g., every 60 seconds) to provide a running summary of the application's status.
  5. A final log_stats() call is made during graceful shutdown to report the final totals.
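The five steps above can be sketched end to end as follows. To keep the example self-contained it uses std::thread instead of asynchronous tasks, a single counter, and println! in place of real logging; the run function, the jobs_processed() getter, the shutdown flag, and the report_every parameter are all assumptions made for illustration, not the project's actual wiring.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

#[derive(Default)]
struct MetricsInner {
    jobs_processed: AtomicU64,
}

#[derive(Clone, Default)]
struct Metrics {
    inner: Arc<MetricsInner>,
}

impl Metrics {
    fn new() -> Self {
        Self::default()
    }

    fn inc_jobs_processed(&self) {
        self.inner.jobs_processed.fetch_add(1, Ordering::Relaxed);
    }

    /// Hypothetical getter used by this sketch.
    fn jobs_processed(&self) -> u64 {
        self.inner.jobs_processed.load(Ordering::Relaxed)
    }

    fn log_stats(&self) {
        println!("jobs_processed={}", self.jobs_processed());
    }
}

/// Runs a toy worker pool and returns the final jobs_processed total.
fn run(worker_count: usize, jobs_per_worker: u64, report_every: Duration) -> u64 {
    // Step 1: a single instance is created at startup.
    let metrics = Metrics::new();
    let shutdown = Arc::new(AtomicBool::new(false));

    // Step 4: a dedicated reporter logs a summary periodically.
    let reporter = {
        let metrics = metrics.clone();
        let shutdown = Arc::clone(&shutdown);
        thread::spawn(move || {
            while !shutdown.load(Ordering::Relaxed) {
                thread::sleep(report_every);
                metrics.log_stats();
            }
        })
    };

    // Step 2: each worker receives its own clone of the handle.
    let workers: Vec<_> = (0..worker_count)
        .map(|_| {
            let metrics = metrics.clone();
            thread::spawn(move || {
                for _ in 0..jobs_per_worker {
                    // Step 3: record the event as it happens.
                    metrics.inc_jobs_processed();
                }
            })
        })
        .collect();

    for worker in workers {
        worker.join().expect("worker thread panicked");
    }

    // Graceful shutdown: stop the reporter, then log the final totals (step 5).
    shutdown.store(true, Ordering::Relaxed);
    reporter.join().expect("reporter thread panicked");
    metrics.log_stats();
    metrics.jobs_processed()
}
```

For example, run(4, 1_000, Duration::from_millis(5)) returns 4_000; the number of intermediate summaries printed depends on timing.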