commits
createIdentityResolver() requires a handleResolver option to know HOW
to resolve handles. Passing only didCache/handleCache (the cache layer)
is not enough. Uses https://bsky.social as the handle resolution endpoint.
Fixes startup crash: "TypeError: identityResolver or handleResolver option is required"
* feat(oauth): add Valkey-backed DID and handle caching to OAuth client
Passes didCache (1h TTL) and handleCache (30min TTL) to NodeOAuthClient,
eliminating uncached plc.directory lookups on every restore() and authorize().
The cache fails open: Valkey errors and corrupted JSON are caught and
treated as cache misses, so identity resolution falls through to the
network resolver. This is intentionally different from ValkeyStateStore's
fail-closed behavior.
Addresses plc.directory rate-limiting vulnerability from 2026-03-23 incident.
Ref: feedback from @thisismissem.social
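
The fail-open behavior described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `KeyValueClient` and `createFailOpenCache` are hypothetical names, and the client interface is a stand-in for a real Valkey client.

```typescript
// Hypothetical stand-in for a Valkey client (assumption, not a real client API).
interface KeyValueClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Fail-open cache: every read or parse failure is reported as a miss (undefined),
// so the caller falls through to the network resolver.
function createFailOpenCache<T>(client: KeyValueClient, prefix: string, ttlSeconds: number) {
  return {
    async get(key: string): Promise<T | undefined> {
      try {
        const raw = await client.get(prefix + key);
        if (raw === null) return undefined;
        return JSON.parse(raw) as T; // throws on corrupted JSON
      } catch {
        return undefined; // Valkey error or bad JSON: treat as a cache miss
      }
    },
    async set(key: string, value: T): Promise<void> {
      try {
        await client.set(prefix + key, JSON.stringify(value), ttlSeconds);
      } catch {
        // A failed write only costs a future cache hit; never fail the request.
      }
    },
  };
}
```

A store that guards security-sensitive state would rethrow instead of swallowing, which is the fail-closed contrast the message draws with ValkeyStateStore.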
* refactor(oauth): validate handle before authorize, remove authorize loop
Uses oauthClient.oauthResolver.resolveFromIdentity() to resolve identity
and check server capabilities before calling authorize(). Replaces the
previous pattern of looping authorize() with try/catch for each candidate.
Key improvements:
- Distinguishes upstream errors (503) from 'handle not found' (400)
- Detects granular scope support via repo:* in scopes_supported
- Keeps scope fallback if authorize() rejects despite metadata support
- Checks for handle.invalid from bidirectional verification
- Network/rate-limit errors short-circuit instead of trying next candidate
Ref: feedback from @thisismissem.social
* refactor: replace direct plc.directory calls with cached IdentityResolver
All DID/handle resolution now goes through the shared cached IdentityResolver
backed by Valkey. Eliminates direct HTTP calls to plc.directory and bsky.social.
Changes:
- src/lib/identity.ts: shared resolver with init/get singleton + standalone
factory for cron scripts
- src/lib/pds-provider.ts: resolvePdsHost/resolvePdsEndpoint now delegate to
the cached resolver instead of raw fetch()
- src/services/bluesky-bot.ts: bot PDS resolution uses cached resolver
- src/scripts/refresh-pds-hosts.ts: creates its own resolver instance
- src/server.ts: initializes shared resolver on startup
The shared resolver uses the same Valkey cache keys as the OAuth client's
internal resolver (identity-did: and identity-handle:), so cache hits
benefit both code paths.
Ref: feedback from @thisismissem.social
* feat(registry): add namespace ownership fields and getDisplayName function
* feat(activity): use namespace-aware display names in activity items
* fix(lint): replace non-null assertions with guard checks
* fix(activity): resolve typecheck error for blueskyEntry narrowing
Replace fragile string matching in isPermanentSessionError() with
instanceof checks against TokenInvalidError, TokenRevokedError, and
TokenRefreshError exported by @atproto/oauth-client.
Addresses review feedback from @ThisIsMissEm on #168.
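
A minimal sketch of the instanceof-based classification. The three classes are declared locally as stand-ins so the example is self-contained; the real code would import them from @atproto/oauth-client as the message says.

```typescript
// Local stand-ins for the classes exported by @atproto/oauth-client (assumption:
// the real code imports them instead of declaring them here).
class TokenInvalidError extends Error {}
class TokenRevokedError extends Error {}
class TokenRefreshError extends Error {}

// Only known-fatal token errors count as permanent. Anything else, including
// unknown error types, is treated as transient so the session cookie is kept.
function isPermanentSessionError(err: unknown): boolean {
  return (
    err instanceof TokenInvalidError ||
    err instanceof TokenRevokedError ||
    err instanceof TokenRefreshError
  );
}
```

Unlike string matching on error messages, this keeps working if the library rewords a message.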
* feat(db): add latitude/longitude columns to profiles table
* feat: add geocodeCity helper using GeoNames API
* feat: geocode profile locations during Jetstream indexing
* feat(admin): use stored coordinates for user location map
Replace simpleHash-based offsets from country centroids with actual
latitude/longitude stored on profiles. Falls back to country centroid
when coordinates are not yet geocoded.
* feat: add backfill script for profile coordinates
* feat(events): add event insights data types and cache schema
* feat(events): add event attendee resolver with Smoke Signal + speaker merge
* feat(events): add PDS, DID method, and account age insight collectors
* feat(events): add ATProto ecosystem roles collector
Counts feed generator authors, labeler operators, starter pack creators,
and list creators among event attendees using the Bluesky profile
associated field.
* feat(events): add connection graph collector with batched getRelationships
Builds a force-directed graph dataset of attendee follow relationships
using app.bsky.graph.getRelationships with batches of 30, concurrency
limit of 10 workers, and 100ms rate-limit delay. Edges are deduplicated
with mutual follow detection. Includes tests for runWithConcurrency,
buildDeduplicatedEdges, and collectConnectionGraph.
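
For illustration, batching and edge deduplication could look roughly like this. The `Follow` and `Edge` shapes and the `chunk` helper are assumptions; only the name buildDeduplicatedEdges and the batch size of 30 come from the message, and the real signatures may differ.

```typescript
interface Follow { from: string; to: string; }               // hypothetical input shape
interface Edge { source: string; target: string; mutual: boolean; }

// Split a list into fixed-size batches (e.g. 30 DIDs per getRelationships call).
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Collapse directed follows into one edge per pair; mark it mutual when both
// directions have been seen.
function buildDeduplicatedEdges(follows: Follow[]): Edge[] {
  const edges = new Map<string, Edge>();
  for (const { from, to } of follows) {
    if (from === to) continue; // ignore self-follows
    // Order-independent key so A->B and B->A land on the same entry.
    const key = from < to ? `${from}|${to}` : `${to}|${from}`;
    const existing = edges.get(key);
    if (existing === undefined) {
      edges.set(key, { source: from, target: to, mutual: false });
    } else if (existing.source === to && existing.target === from) {
      existing.mutual = true; // reverse direction seen
    }
  }
  return Array.from(edges.values());
}
```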
* feat(events): add post activity timeline and client diversity collectors
* feat(events): add insights background job and API endpoint
Background job orchestrates all insight collectors hourly and caches
results in Valkey. GET /api/events/:slug/insights serves cached data.
* style: fix non-null assertion lint errors in insights code
* fix(server): enable trustProxy and restrict rate limiter to loopback only
External requests through Caddy appeared as Docker gateway IPs (172.20.0.1),
bypassing the rate limiter entirely. A bot sending 267 req/s to the activity
endpoint caused plc.directory to rate-limit our VPS, breaking all DID
resolution including oauthClient.restore() — logging everyone out.
- Enable Fastify trustProxy so req.ip reflects X-Forwarded-For
- Only exempt loopback IPs (127.0.0.1, ::1) from rate limiting
- Remove Docker subnet exemption (172.x, 10.x)
* fix(auth): distinguish transient errors from session expiry in restore()
oauthClient.restore() throws on ANY error including plc.directory rate
limiting (429 Too Many Requests). Previously ALL errors cleared the
session cookie, logging the user out until they signed in again.
Now transient errors (rate limits, network timeouts, DID resolution
failures) return 503 and keep the cookie intact. Only permanent errors
(revoked tokens, missing sessions) clear the cookie.
Also adds error logging to both catch blocks — previously errors were
silently swallowed, making debugging impossible.
* fix(auth): narrow restore() try-catch scope and complete causeMsg checks
- Separate the oauthClient.restore() try-catch from the profile fetch
logic in /api/auth/session. Previously both were in one catch block,
so a Bluesky API failure could be misclassified as a transient
restore error. Now restore errors return 503/401 appropriately, and
profile fetch failures return a minimal authenticated response.
- Add missing causeMsg checks for econnreset, enotfound, fetch failed
in isTransientRestoreError().
* fix(oauth): return UpstreamError instead of HandleNotFound for rate-limit failures
When oauthClient.authorize() fails due to plc.directory rate limiting or
network errors, the login endpoint returned HandleNotFound — misleading
the user into thinking their handle was wrong. Now detects upstream errors
and returns 503 UpstreamError with an accurate message.
Also improves error logging: serializes error messages instead of
logging the raw error object.
* fix(auth): address review findings — trustProxy, error default, dedup
Incorporates feedback from security, sysadmin, and devil's advocate reviews:
1. trustProxy: true → trustProxy: 1 (trust only last proxy hop)
Prevents X-Forwarded-For spoofing to bypass rate limiter.
2. Rename isTransientRestoreError → isPermanentSessionError and flip
the default. Unknown errors now default to transient (keep cookie)
instead of permanent (clear cookie). Only known-fatal errors
(invalid_grant, token_revoked, HTTP 401) clear the session.
This prevents future library updates from causing mass logouts.
3. Deduplicate error classification in login route — reuses
isPermanentSessionError instead of inline string matching.
4. Profile fetch fallback returns 503 instead of DID-as-handle,
so the frontend keeps the cached session with proper display data.
* fix(auth): report permanent session errors to GlitchTip
Send permanent session errors (revoked tokens, invalid grants) to
GlitchTip via Sentry.captureException() so they appear in the error
dashboard. Transient errors (rate limiting, network) are only logged
to avoid noise.
* build: initialize OpenSpec for spec-driven development
Add OpenSpec v1.2.0 core profile with Claude Code integration.
Creates the openspec/ directory structure for specs and change
tracking, plus 4 slash commands (/opsx:propose, apply, explore,
archive) and matching skills for spec-driven development workflow.
* style: format OpenSpec markdown files with Prettier
Fix CI formatting check failure on all 8 new .md files.
* style: reformat with project-pinned Prettier 3.5.3
Previous commit used globally-installed Prettier 3.8.1 which
produces different output. Now formatted with the project's
pinned version to match CI.
PDS data uses null for absent optional fields, but Zod's .optional()
only accepts undefined. Add .nullable() to all optional string and
array fields across position, education, certification, project,
volunteering, publication, course, and honor schemas.
This was causing 400 ValidationErrors when toggling skill-position
links on positions with null endDate (and potentially other fields).
- Log the actual error (at warn level) when OAuth authorize fails per
candidate, not just at debug level which is filtered in production
- Log the granular scope rejection error too (was silently swallowed)
- Serialize lastError as message + stack instead of raw object (Error
objects serialize as {} with JSON.stringify)
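
A short sketch of why the raw object logged as {} and one way to serialize it. `serializeError` is a hypothetical helper name, not necessarily what the code uses.

```typescript
// Error's message and stack are non-enumerable, so JSON.stringify skips them.
const raw = JSON.stringify(new Error("authorize failed")); // "{}"

// Hypothetical helper: pull out the fields worth logging explicitly.
function serializeError(err: unknown): { message: string; stack: string | undefined } {
  if (err instanceof Error) {
    return { message: err.message, stack: err.stack };
  }
  // Non-Error throwables (strings, plain objects) still get a readable message.
  return { message: String(err), stack: undefined };
}
```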
The teaser endpoint fetches `perAppLimit` items from Bluesky's
getAuthorFeed, then filters out reposts post-fetch. When the user has
reposts interspersed, this reduces the result count below the limit
(e.g. requesting 2 items, 1 is a repost → only 1 item shown).
Fix by requesting limit*3 items from Bluesky and slicing back to the
requested limit after filtering.
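
The over-fetch-then-slice fix can be sketched as below. The `FeedItem` shape and the helper names are assumptions for illustration; the factor of 3 is from the message.

```typescript
interface FeedItem { uri: string; isRepost: boolean; } // hypothetical item shape

const OVERFETCH_FACTOR = 3;

// How many items to request upstream so that filtering still leaves enough.
function upstreamLimit(limit: number): number {
  return limit * OVERFETCH_FACTOR;
}

// Drop reposts, then cut back to the limit the caller actually asked for.
function takeOwnPosts(items: FeedItem[], limit: number): FeedItem[] {
  return items.filter((item) => !item.isRepost).slice(0, limit);
}
```

A fixed factor is a heuristic: a feed that is more than two-thirds reposts can still come up short.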
Positions with no location set store `location: null` in the frontend.
When the skill-link toggle sends the full position back to the API,
Zod's `.optional()` rejects null (it only accepts undefined), causing
a 400 ValidationError. Add `.nullable()` to locationSchema so both
null and undefined are accepted.
LinkedIn exports sometimes contain invalid credential URLs (e.g. partial
URLs or non-URL text). Previously, one bad URL in any certification,
project, or publication would cause the entire import to fail with a 400
"Validation failed" error.
The optionalUrl() Zod helper now validates URLs at parse time and
silently drops invalid ones to undefined, allowing the rest of the
import to proceed.
Closes #170
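
The drop-invalid-URL behavior, sketched without Zod so the example stays dependency-free. The real optionalUrl() is a Zod helper; this plain function only mirrors the described outcome, and the http/https restriction is an assumption.

```typescript
// Returns the URL unchanged when it parses, otherwise undefined, so a single
// bad value does not fail validation of the whole import.
function optionalUrl(value: unknown): string | undefined {
  if (typeof value !== "string" || value.trim() === "") return undefined;
  try {
    const url = new URL(value); // throws on partial URLs and non-URL text
    // Assumption: only web URLs are meaningful for credentials.
    return url.protocol === "http:" || url.protocol === "https:" ? value : undefined;
  } catch {
    return undefined;
  }
}
```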
Call com.atproto.repo.describeRepo once per user to get the list of
collections that actually exist in their repo. Filter app collections
against this set before calling listRecords, avoiding wasted API calls
for apps the user has never used. Falls back to scanning all collections
if describeRepo is unavailable.
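
A sketch of the filtering step, assuming the describeRepo result has already been fetched and its collections list is passed in (undefined when the call failed). `collectionsToScan` is a hypothetical name.

```typescript
// appCollections: collections the app registry wants to scan.
// repoCollections: collections reported by describeRepo, or undefined if the
// call was unavailable.
function collectionsToScan(
  appCollections: string[],
  repoCollections: string[] | undefined,
): string[] {
  // Fallback: without describeRepo data, scan everything as before.
  if (repoCollections === undefined) return appCollections;
  const present = new Set(repoCollections);
  return appCollections.filter((collection) => present.has(collection));
}
```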
When the heatmap hits maxPages without reaching the requested time
window, log a warning and send it to GlitchTip via Sentry.captureMessage.
This gives visibility into users whose activity exceeds our pagination
limits, so we can adjust before they notice incomplete heatmaps.
The Bluesky AppView's getAuthorFeed mixes in reposts from others,
wasting pagination pages on items that get filtered out. For patak.cat
(~28 posts/day), only ~40% of raw feed items are own posts, so 30 pages
only covered 43 days instead of 6 months.
Switching to PDS listRecords for the heatmap fetches only the user's
own records, making every page count. Also bumped maxPages to 50
(5,000 records) to comfortably cover 6 months for prolific users.
* feat(oauth): seed profile about from Bluesky bio on first sign-up
Import the user's Bluesky bio (description) as the initial "about" text
when creating their Sifa profile. Only set on insert (new users), never
updated on subsequent logins, so user edits are preserved.
* feat(scripts): add one-off backfill for empty about from Bluesky bio
For existing users with no about text, fetches their Bluesky bio and
populates the field. Run once with:
DATABASE_URL=... tsx src/scripts/backfill-about-from-bsky.ts
The heatmap endpoint only fetched 5 pages (500 records) from the Bluesky
AppView, which covered only ~3 weeks for prolific posters like patak.cat
(~6 posts/day). Increasing to 30 pages (3,000 records) covers the full
6-month heatmap window even for very active users.
* feat(auth): add isNewUser field to session endpoint response
Returns isNewUser: true when the authenticated user has zero positions,
education records, and skills in the database, enabling the frontend to
route new users to the onboarding/import flow.
* feat(db): add email_subscriptions table for welcome page email collection
* feat(api): add POST /api/email-subscription endpoint
* feat(oauth): redirect new users to /welcome instead of /import
- Filter topApps to only active apps with recentCount > 0
(previously included inactive apps with 0 activity)
- Distribute per-app fetch limit: 5/N items per app instead of
hardcoded 2 (single-app users now get 5 items, not 2)
- Don't cache empty results when active apps exist — empty
responses from transient fetch failures were getting cached
for 15 minutes, hiding real activity
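
The per-app limit distribution might look like this. The message only states "5/N" and that a single-app user gets 5 items; rounding up with a floor of 1 is an assumption made for this sketch.

```typescript
const TEASER_TOTAL = 5;

// Share the teaser budget across active apps.
// Assumption: round up, and never go below 1 item per app.
function perAppLimit(activeAppCount: number): number {
  if (activeAppCount <= 0) return 0;
  return Math.max(1, Math.ceil(TEASER_TOTAL / activeAppCount));
}
```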
* feat(activity): add Popfeed to app registry
- Register Popfeed with social.popfeed and app.popsky prefixes
- Scan social.popfeed.feed.review, feed.post, and feed.list collections
- Exclude social.popfeed.feed.like, challenge.participation, and
actor.profile from activity feed
* fix(test): update excluded collections count for Popfeed entries
* fix(activity): filter empty categories from activity feed tabs
Only include categories in availableCategories when the user has apps
with recentCount > 0. Previously, the scanner created rows for all
registered apps (including those with zero records), causing empty
category tabs like "Photos" to appear with no content.
* fix(activity): pass resolved Bluesky embeds with thumb URLs
The Bluesky AppView API returns resolved embeds (with CDN thumb URLs)
on item.post.embed, but we were only passing item.post.record which
has raw blob refs. Merge the resolved embed into the record so the
frontend card can display image thumbnails and link preview images.
* feat(api): add heatmap endpoint for day-level activity aggregation
GET /api/activity/:handleOrDid/heatmap?days=180
Returns day-level counts per app with quantile thresholds.
Cached in Valkey for 4 hours.
* fix(api): address heatmap code review findings
- Purge heatmap cache keys on GDPR suppression
- Cap app fetch to 10 to prevent DoS amplification
- Break PDS pagination on old records instead of continue
- Fix days=0 silently becoming 180
- Simplify to Promise.all since fetches never reject
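
The days=0 item is typical of a falsy-default bug. A sketch of the failure mode and one way to parse the parameter; `parseDays` is a hypothetical name, and falling back to the default on invalid input is an assumption (the route may reject it instead).

```typescript
const DEFAULT_DAYS = 180;

// The bug pattern: 0 is falsy, so `||` replaces it with the default.
const buggy = Number("0") || DEFAULT_DAYS; // 180, not 0

// Only fall back when the parameter is missing or not a non-negative integer.
function parseDays(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_DAYS;
  const days = Number(raw);
  if (!Number.isInteger(days) || days < 0) return DEFAULT_DAYS;
  return days; // an explicit 0 is preserved
}
```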
* fix(api): resolve lint errors in heatmap route
- Remove unused type imports from test file
- Replace non-null assertions with nullish coalescing in quantile function
Profiles imported from LinkedIn often have location_country as free
text (e.g. "Deutschland", "Schweiz", "United States") without a
country_code. Add a name-to-code reverse lookup covering common
English names plus local-language variants found in production data.
Also parse comma-separated patterns like "Oslo, Oslo, Norway".
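
A sketch of the reverse lookup with comma-separated parsing. The table here holds only a few sample entries, and `resolveCountryCode` is a hypothetical name; the production list is larger.

```typescript
// Sample entries only; keys are lowercased names.
const COUNTRY_NAME_TO_CODE = new Map<string, string>([
  ["germany", "DE"],
  ["deutschland", "DE"],
  ["switzerland", "CH"],
  ["schweiz", "CH"],
  ["united states", "US"],
  ["norway", "NO"],
]);

// Accepts either a bare country name or "City, Region, Country" text.
function resolveCountryCode(locationText: string): string | undefined {
  const parts = locationText
    .split(",")
    .map((part) => part.trim().toLowerCase())
    .filter((part) => part !== "");
  // The country is usually the last segment, so search from the end.
  for (let i = parts.length - 1; i >= 0; i--) {
    const part = parts[i];
    if (part === undefined) continue;
    const code = COUNTRY_NAME_TO_CODE.get(part);
    if (code !== undefined) return code;
  }
  return undefined;
}
```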
Add GET /api/admin/stats/user-locations that aggregates profiles by
country and city, returning lat/lng coordinates with user counts.
Uses a static country centroids lookup for coordinate mapping.
* feat(registry): map community.lexicon.calendar to Smoke Signal
* feat(activity): add RSVP event enrichment with cross-PDS fetch
* feat(activity): wire RSVP enrichment into feed and teaser endpoints
* fix(tests): remove non-null assertions in activity enrichment tests
* fix: replace non-null assertions with guard checks in parseAtUri
KipClip (kipclip.com) uses the community bookmark lexicon
(community.lexicon.bookmarks.bookmark) as its main record type,
with com.kipclip.annotation for enrichment metadata (title,
description, favicon). Verified against real PDS records.
Extend the create-or-update pattern from #143 to all remaining PUT
endpoints: position, education, skill, generic records, and external
accounts (including primary toggle).
If the local DB has a record but the PDS doesn't (e.g. after external
PDS reset or race with Jetstream delete), the blind update would fail.
Now all writes check pdsRecordExists() first and fall back to create.
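
The create-or-update pattern in outline. `RecordStore` and `writeRecord` are hypothetical; in the real code the existence check is pdsRecordExists() and the writes go to the PDS.

```typescript
// Hypothetical abstraction over the PDS calls involved.
interface RecordStore {
  exists(collection: string, rkey: string): Promise<boolean>;
  create(collection: string, rkey: string, record: object): Promise<void>;
  update(collection: string, rkey: string, record: object): Promise<void>;
}

// Check first, then pick the write that cannot fail on a missing record.
async function writeRecord(
  store: RecordStore,
  collection: string,
  rkey: string,
  record: object,
): Promise<"create" | "update"> {
  if (await store.exists(collection, rkey)) {
    await store.update(collection, rkey, record);
    return "update";
  }
  await store.create(collection, rkey, record);
  return "create";
}
```

The check and the write are not atomic, so a delete that lands between them can still make the update fail; the pattern narrows the window rather than closing it.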
Register Keytrace (dev.keytrace.claim) in the ATProto app registry so
claims appear in the cross-app activity feed. Extend the external
accounts endpoint to fetch Keytrace claims from the user's PDS, merge
them with Sifa-declared accounts using URL normalization and
platform/username matching, and surface cryptographic verification
status. Claims are cached in Valkey with a 2-hour TTL, invalidated
on profile sync.
* feat(activity): include availableCategories in feed response
Add an availableCategories string array to the activity feed response,
derived from visible active app stats. This lets the frontend hide
category filter tabs when a user has content in only one category.
Closes singi-labs/sifa-web#327
* fix(activity): only show Load More when sources have more items
Discard upstream cursors when a source returned fewer items than the
requested page size, so hasMore is false when all sources are exhausted.
Previously the button appeared even with 4 or 8 total items because
PDS listRecords and Bluesky getAuthorFeed return cursors on every page.
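
A sketch of the cursor handling, with hypothetical names and shapes: a short page means the source is exhausted, so its cursor is dropped.

```typescript
interface SourcePage<T> { items: T[]; cursor: string | undefined; } // hypothetical

// Keep the upstream cursor only when the page came back full.
function effectiveCursor<T>(page: SourcePage<T>, pageSize: number): string | undefined {
  return page.items.length < pageSize ? undefined : page.cursor;
}

// Load More is shown only while at least one source can still page.
function hasMore(cursors: Array<string | undefined>): boolean {
  return cursors.some((cursor) => cursor !== undefined);
}
```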
PUT /api/profile/self always used applyWrites#update, which fails with
a 500 from the PDS when the record doesn't exist yet. This affects any
user who edits their profile without first importing via LinkedIn.
Check whether the record exists on the PDS before writing, and use
create or update accordingly — same pattern already used by the import
route.
Fixes SIFAID-3 (34 occurrences today).
Fetch Standard app publications from the user's PDS and merge them with
Sifa-native publications on the profile detail page. Standard publications
are read-only, deduplicated by URL or title (Sifa entries take precedence),
and tagged with source: 'standard' vs 'sifa'. Results are cached in Valkey
for 15 minutes.
Also refactors pds-provider.ts to extract getDidDocUrl helper and adds
resolvePdsEndpoint for resolving full PDS endpoint URLs from DIDs.
Closes #131
The PDS scanner creates rows in user_app_stats for each detected app
but doesn't always populate recentCount. Use the presence of a row
(meaning the scanner found the collection in the PDS) rather than
filtering by recentCount.
The isActive flag from the PDS scanner uses a different threshold than
what the feed endpoint uses for fetching items, causing availableCategories
to return empty even when items exist. Use recentCount > 0 instead.
Users followed on both Sifa and Bluesky appeared twice in the
/api/following response. Deduplicate by subjectDid, preferring the
sifa source row so the Unfollow button is available.
Fixes singi-labs/sifa-web#317
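
The deduplication rule, sketched on an assumed row shape (the real rows carry more fields):

```typescript
interface FollowRow {
  subjectDid: string;
  source: "sifa" | "bluesky" | "tangled";
}

// One row per followed DID; a sifa row replaces a row from any other source.
function dedupeFollows(rows: FollowRow[]): FollowRow[] {
  const byDid = new Map<string, FollowRow>();
  for (const row of rows) {
    const existing = byDid.get(row.subjectDid);
    if (existing === undefined || (row.source === "sifa" && existing.source !== "sifa")) {
      byDid.set(row.subjectDid, row);
    }
  }
  // Map iteration keeps first-seen order, so result ordering stays stable.
  return Array.from(byDid.values());
}
```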
Smoke Signal RSVPs use the community.lexicon.calendar.rsvp collection,
not the events.smokesignal namespace. Update the app registry to scan
the correct collection so RSVP activity appears in user feeds.
- Add community.lexicon.calendar to Smoke Signal collectionPrefixes
- Replace events.smokesignal.calendar.event with community.lexicon.calendar.rsvp
and events.smokesignal.profile in scanCollections
- Remove events.smokesignal.calendar.rsvp from EXCLUDED_COLLECTIONS
(was an incorrect NSID that never existed)
Closes #127
Add three new AT Protocol apps to the activity registry:
- Standard (site.standard.document, site.standard.publication)
- Aetheros (computer.aetheros.page)
- Roomy (space.roomy.space.personal)
Also add net.alternativeproto.vote to excluded collections
since votes are low-signal activity.
* feat(follow): add GET /api/following endpoint (#317)
Returns paginated list of everyone the authenticated user follows,
with LEFT JOIN on profiles for claimed status. Supports source
filter and cursor-based pagination.
* fix(follow): validate source/cursor params, handle enrichment errors
- Validate source against whitelist (sifa, bluesky, tangled)
- Validate cursor is a valid ISO date
- Wrap fetchProfilesFromBluesky in try/catch for graceful degradation
- Rename response field to avatarUrl for consistency with suggestions
Expands the seed file from ~100 to 230 canonical skills covering all
unresolved skills in production. Changes:
- Add ~65 new canonical skills (WordPress, jQuery, Django, CAD, Adobe
tools, Design Thinking, IoT, languages, etc.)
- Add ~50 new aliases to existing canonicals (variant names, foreign
language translations, common abbreviations)
- Change seed upsert from onConflictDoNothing to onConflictDoUpdate
so re-running the seed updates aliases on existing entries
- Add resolveUnresolvedSkills() step that re-matches all pending
unresolved_skills against the updated canonical list after seeding
Only 4 entries remain intentionally unresolved (test junk data and
overly vague terms like "create" and "computer literacy").
Closes singi-labs/sifa-web#316
Verified against real PDS records. Three namespaces were wrong:
- Frontpage: com.frontpage.* -> fyi.unravel.frontpage.*
- Picosky: blue.picosky.* -> social.psky.* (also: category Chat, not Pages)
- PasteSphere: com.pastesphere.* -> link.pastesphere.*
Also:
- Added verified scanCollections for all apps (Tangled repos/issues/PRs,
Smoke Signal events, Flashes posts, Frontpage posts, etc.)
- Added URL patterns for Tangled, Smoke Signal, Frontpage, Picosky,
Linkat, PasteSphere
- Added cross-app excluded collections (Tangled follows/stars,
Smoke Signal RSVPs, Frontpage votes)
feat: cross-app activity API
- GET /api/activity/:handleOrDid/teaser — returns 3-5 recent items
across top apps for the profile activity card, with Valkey caching
(15 min TTL)
- GET /api/activity/:handleOrDid — full paginated activity feed with
category filtering, composite cursor pagination, and per-collection
Valkey caching (5 min TTL)
- POST /api/privacy/suppress — GDPR erasure endpoint that suppresses
a DID and clears all cached activity data
All PDS/AppView calls use 3s timeouts and Promise.allSettled for
graceful partial failure handling. Bluesky content fetched via public
AppView API; other apps via direct PDS listRecords.
Add a blocking PDS scan at the end of the /api/profile/sync endpoint
so that cross-app activity badges are available immediately after a
user claims or syncs their profile. The scan is wrapped in a try/catch
so failures do not affect the overall profile sync.
- Add activeApps array to claimed profile response (cross-app activity from
user_app_stats, filtered by visibility and suppression status)
- Add activeApps: [] to unclaimed profile response (GDPR: no cross-app data)
- Fire-and-forget triggerRefreshIfStale on claimed profile views when valkey
and pdsHost are available
- New PUT /api/profile/activity-visibility endpoint for toggling per-app
visibility (authenticated, Zod-validated)
- Register activity routes in server.ts
- getAppStatsForDid / getVisibleAppStats: query userAppStats by DID
- upsertScanResults: upsert scan results with onConflictDoUpdate, prune stale apps
- triggerRefreshIfStale: fire-and-forget background scan with Valkey NX lock (120s TTL)
- isDidSuppressed / suppressDid: suppression list with stats + cache cleanup
- 8 unit tests covering all functions
fix(security): address CodeQL alerts
- Replace hostname.includes() with hostname.endsWith() + exact match
to prevent spoofed YouTube hostnames (e.g. evil-youtube.com)
- Add contents: read to security job in ci.yml (needed for checkout)
- Add test for hostname spoofing rejection
Resolves CodeQL alerts #5, #11.
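
The hostname check in miniature. `isYouTubeHost` is a hypothetical name; the point is the difference between a substring test and an exact-or-subdomain test.

```typescript
// includes("youtube.com") would accept "evil-youtube.com". Requiring an exact
// match or a ".youtube.com" suffix accepts only the domain and its subdomains.
function isYouTubeHost(hostname: string): boolean {
  const host = hostname.toLowerCase();
  return host === "youtube.com" || host.endsWith(".youtube.com");
}
```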
Charts no longer start before the site existed. For all-time (days=0)
queries, the date range always begins at the launch date.
* fix(admin): fill missing dates in chart data with zero-count entries
Admin dashboard charts (Daily Signups, Cumulative Users, DAU, LinkedIn
Imports) were skipping dates with no data, making the X-axis misleading.
Added fillDateGaps helper that generates all dates in the query range
and fills missing ones with zero values.
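
A sketch of a gap-filling helper along these lines. The row shape and signature are assumptions; only the name fillDateGaps comes from the message. Dates are handled in UTC as YYYY-MM-DD strings.

```typescript
interface DailyCount { date: string; count: number; } // date as YYYY-MM-DD

// Emit one entry per day in [startDate, endDate], using 0 where no row exists.
function fillDateGaps(rows: DailyCount[], startDate: string, endDate: string): DailyCount[] {
  const counts = new Map<string, number>();
  for (const row of rows) counts.set(row.date, row.count);

  const filled: DailyCount[] = [];
  const cursor = new Date(`${startDate}T00:00:00Z`);
  const end = new Date(`${endDate}T00:00:00Z`);
  while (cursor.getTime() <= end.getTime()) {
    const date = cursor.toISOString().slice(0, 10);
    filled.push({ date, count: counts.get(date) ?? 0 });
    cursor.setUTCDate(cursor.getUTCDate() + 1); // advance one day in UTC
  }
  return filled;
}
```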
* fix(admin): replace non-null assertion with safe check
The /api/suggestions/count endpoint was counting ALL cross-network
follows (5000+), not just those who are on Sifa. Add session check
so the banner shows the correct number.
The profile resolver was inserting profile rows for ALL Bluesky
follows (5000+), polluting the profiles table and inflating user
counts on the admin page.
Now only persists profiles for DIDs that have sessions (actual Sifa
users). For "Not on Sifa" suggestion cards, resolves Bluesky
profile data on-the-fly without persisting.
The suggestions endpoint was fetching 20 connections ordered by
createdAt, then splitting into onSifa/notOnSifa. With thousands of
follows, the few "On Sifa" users were buried and never appeared.
Now queries them separately: all "On Sifa" suggestions (uncapped),
then fills with paginated "Not on Sifa" up to the limit.
createIdentityResolver() requires a handleResolver option to know HOW
to resolve handles. Passing only didCache/handleCache (the cache layer)
is not enough. Uses https://bsky.social as the handle resolution endpoint.
Fixes startup crash: "TypeError: identityResolver or handleResolver option is required"
* feat(oauth): add Valkey-backed DID and handle caching to OAuth client
Passes didCache (1h TTL) and handleCache (30min TTL) to NodeOAuthClient,
eliminating uncached plc.directory lookups on every restore() and authorize().
The cache fails open: Valkey errors and corrupted JSON are caught and
treated as cache misses, so identity resolution falls through to the
network resolver. This is intentionally different from ValkeyStateStore's
fail-closed behavior.
Addresses plc.directory rate-limiting vulnerability from 2026-03-23 incident.
Ref: feedback from @thisismissem.social
* refactor(oauth): validate handle before authorize, remove authorize loop
Uses oauthClient.oauthResolver.resolveFromIdentity() to resolve identity
and check server capabilities before calling authorize(). Replaces the
previous pattern of looping authorize() with try/catch for each candidate.
Key improvements:
- Distinguishes upstream errors (503) from 'handle not found' (400)
- Detects granular scope support via repo:* in scopes_supported
- Keeps scope fallback if authorize() rejects despite metadata support
- Checks for handle.invalid from bidirectional verification
- Network/rate-limit errors short-circuit instead of trying next candidate
Ref: feedback from @thisismissem.social
* refactor: replace direct plc.directory calls with cached IdentityResolver
All DID/handle resolution now goes through the shared cached IdentityResolver
backed by Valkey. Eliminates direct HTTP calls to plc.directory and bsky.social.
Changes:
- src/lib/identity.ts: shared resolver with init/get singleton + standalone
factory for cron scripts
- src/lib/pds-provider.ts: resolvePdsHost/resolvePdsEndpoint now delegate to
the cached resolver instead of raw fetch()
- src/services/bluesky-bot.ts: bot PDS resolution uses cached resolver
- src/scripts/refresh-pds-hosts.ts: creates its own resolver instance
- src/server.ts: initializes shared resolver on startup
The shared resolver uses the same Valkey cache keys as the OAuth client's
internal resolver (identity-did: and identity-handle:), so cache hits
benefit both code paths.
Ref: feedback from @thisismissem.social
* feat(db): add latitude/longitude columns to profiles table
* feat: add geocodeCity helper using GeoNames API
* feat: geocode profile locations during Jetstream indexing
* feat(admin): use stored coordinates for user location map
Replace simpleHash-based offsets from country centroids with actual
latitude/longitude stored on profiles. Falls back to country centroid
when coordinates are not yet geocoded.
* feat: add backfill script for profile coordinates
* feat(events): add event insights data types and cache schema
* feat(events): add event attendee resolver with Smoke Signal + speaker merge
* feat(events): add PDS, DID method, and account age insight collectors
* feat(events): add ATProto ecosystem roles collector
Counts feed generator authors, labeler operators, starter pack creators,
and list creators among event attendees using the Bluesky profile
associated field.
* feat(events): add connection graph collector with batched getRelationships
Builds a force-directed graph dataset of attendee follow relationships
using app.bsky.graph.getRelationships with batches of 30, concurrency
limit of 10 workers, and 100ms rate-limit delay. Edges are deduplicated
with mutual follow detection. Includes tests for runWithConcurrency,
buildDeduplicatedEdges, and collectConnectionGraph.
* feat(events): add post activity timeline and client diversity collectors
* feat(events): add insights background job and API endpoint
Background job orchestrates all insight collectors hourly and caches
results in Valkey. GET /api/events/:slug/insights serves cached data.
* style: fix non-null assertion lint errors in insights code
* fix(server): enable trustProxy and restrict rate limiter to loopback only
External requests through Caddy appeared as Docker gateway IPs (172.20.0.1),
bypassing the rate limiter entirely. A bot sending 267 req/s to the activity
endpoint caused plc.directory to rate-limit our VPS, breaking all DID
resolution including oauthClient.restore() — logging everyone out.
- Enable Fastify trustProxy so req.ip reflects X-Forwarded-For
- Only exempt loopback IPs (127.0.0.1, ::1) from rate limiting
- Remove Docker subnet exemption (172.x, 10.x)
* fix(auth): distinguish transient errors from session expiry in restore()
oauthClient.restore() throws on ANY error including plc.directory rate
limiting (429 Too Many Requests). Previously ALL errors cleared the
session cookie, logging the user out permanently until they re-sign-in.
Now transient errors (rate limits, network timeouts, DID resolution
failures) return 503 and keep the cookie intact. Only permanent errors
(revoked tokens, missing sessions) clear the cookie.
Also adds error logging to both catch blocks — previously errors were
silently swallowed, making debugging impossible.
* fix(auth): narrow restore() try-catch scope and complete causeMsg checks
- Separate the oauthClient.restore() try-catch from the profile fetch
logic in /api/auth/session. Previously both were in one catch block,
so a Bluesky API failure could be misclassified as a transient
restore error. Now restore errors return 503/401 appropriately, and
profile fetch failures return a minimal authenticated response.
- Add missing causeMsg checks for econnreset, enotfound, fetch failed
in isTransientRestoreError().
* fix(oauth): return UpstreamError instead of HandleNotFound for rate-limit failures
When oauthClient.authorize() fails due to plc.directory rate limiting or
network errors, the login endpoint returned HandleNotFound — misleading
the user into thinking their handle was wrong. Now detects upstream errors
and returns 503 UpstreamError with an accurate message.
Also improves error logging: serializes error messages instead of
logging the raw error object.
* fix(auth): address review findings — trustProxy, error default, dedup
Incorporates feedback from security, sysadmin, and devil's advocate reviews:
1. trustProxy: true → trustProxy: 1 (trust only last proxy hop)
Prevents X-Forwarded-For spoofing to bypass rate limiter.
2. Rename isTransientRestoreError → isPermanentSessionError and flip
the default. Unknown errors now default to transient (keep cookie)
instead of permanent (clear cookie). Only known-fatal errors
(invalid_grant, token_revoked, HTTP 401) clear the session.
This prevents future library updates from causing mass logouts.
3. Deduplicate error classification in login route — reuses
isPermanentSessionError instead of inline string matching.
4. Profile fetch fallback returns 503 instead of DID-as-handle,
so the frontend keeps the cached session with proper display data.
* fix(auth): report permanent session errors to GlitchTip
Send permanent session errors (revoked tokens, invalid grants) to
GlitchTip via Sentry.captureException() so they appear in the error
dashboard. Transient errors (rate limiting, network) are only logged
to avoid noise.
* build: initialize OpenSpec for spec-driven development
Add OpenSpec v1.2.0 core profile with Claude Code integration.
Creates the openspec/ directory structure for specs and change
tracking, plus 4 slash commands (/opsx:propose, apply, explore,
archive) and matching skills for spec-driven development workflow.
* style: format OpenSpec markdown files with Prettier
Fix CI formatting check failure on all 8 new .md files.
* style: reformat with project-pinned Prettier 3.5.3
The previous commit used a globally installed Prettier 3.8.1, which
produces different output. Now formatted with the project's
pinned version to match CI.
PDS data uses null for absent optional fields, but Zod's .optional()
only accepts undefined. Add .nullable() to all optional string and
array fields across position, education, certification, project,
volunteering, publication, course, and honor schemas.
This was causing 400 ValidationErrors when toggling skill-position
links on positions with null endDate (and potentially other fields).
- Log the actual error at warn level when OAuth authorize fails for a
candidate; debug-level logs are filtered out in production
- Also log the granular scope rejection error (was silently swallowed)
- Serialize lastError as message + stack instead of the raw object (Error
objects serialize as {} with JSON.stringify)
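
The serialization point can be shown in a few lines. This is a sketch: `serializeError` is a hypothetical helper name, not necessarily what the codebase uses.

```typescript
// An Error's `message` and `stack` are non-enumerable, so JSON.stringify skips them.
function serializeError(err: unknown): { message: string; stack?: string } {
  if (err instanceof Error) {
    return { message: err.message, stack: err.stack };
  }
  // Non-Error throwables (strings, plain objects) are stringified as-is.
  return { message: String(err) };
}

const failure = new Error("authorize failed");
console.log(JSON.stringify(failure)); // prints {}
console.log(JSON.stringify(serializeError(failure))); // includes "authorize failed" and the stack
```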
The teaser endpoint fetches `perAppLimit` items from Bluesky's
getAuthorFeed, then filters out reposts post-fetch. When the user has
reposts interspersed, this reduces the result count below the limit
(e.g. requesting 2 items, 1 is a repost → only 1 item shown).
Fix by requesting limit*3 items from Bluesky and slicing back to the
requested limit after filtering.
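
A sketch of the over-fetch, filter, and slice approach described above. The `FeedItem` shape and the helper names are assumptions; only the 3x factor comes from the commit message.

```typescript
// Simplified stand-in for a getAuthorFeed item (assumed shape).
interface FeedItem {
  uri: string;
  isRepost: boolean;
}

const OVERFETCH_FACTOR = 3; // "limit*3" from the commit message

// How many items to request upstream so filtering still leaves enough.
function upstreamLimit(limit: number): number {
  return limit * OVERFETCH_FACTOR;
}

// Drop reposts, then cut back to what the caller asked for.
function keepOwnPosts(items: FeedItem[], limit: number): FeedItem[] {
  return items.filter((item) => !item.isRepost).slice(0, limit);
}
```

A fixed multiplier is a heuristic: if more than two thirds of the fetched items are reposts, the result can still come up short.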
Positions with no location set store `location: null` in the frontend.
When the skill-link toggle sends the full position back to the API,
Zod's `.optional()` rejects null (it only accepts undefined), causing
a 400 ValidationError. Add `.nullable()` to locationSchema so both
null and undefined are accepted.
LinkedIn exports sometimes contain invalid credential URLs (e.g. partial
URLs or non-URL text). Previously, one bad URL in any certification,
project, or publication would cause the entire import to fail with a 400
"Validation failed" error.
The optionalUrl() Zod helper now validates URLs at parse time and
silently converts invalid ones to undefined, allowing the rest of the
import to proceed.
Closes #170
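
The "invalid URL becomes undefined" behavior can be sketched without Zod. `toOptionalUrl` is a hypothetical stand-in for the transform inside optionalUrl(), and restricting to http(s) is an assumption rather than something the commit states.

```typescript
// Returns the trimmed URL when it parses as http(s); otherwise undefined.
function toOptionalUrl(value: string | null | undefined): string | undefined {
  if (!value) return undefined;
  const trimmed = value.trim();
  try {
    const parsed = new URL(trimmed); // throws on partial URLs and free text
    const isWeb = parsed.protocol === "http:" || parsed.protocol === "https:";
    return isWeb ? trimmed : undefined;
  } catch {
    // Swallow the parse error so one bad field does not fail the whole import.
    return undefined;
  }
}
```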
Call com.atproto.repo.describeRepo once per user to get the list of
collections that actually exist in their repo. Filter app collections
against this set before calling listRecords, avoiding wasted API calls
for apps the user has never used. Falls back to scanning all collections
if describeRepo is unavailable.
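
A sketch of the filtering step, assuming the caller has already fetched the `collections` array from describeRepo and passes null when that call failed. The function name is illustrative.

```typescript
// appCollections: collections the app registry wants to scan.
// repoCollections: `collections` from com.atproto.repo.describeRepo, or null if unavailable.
function collectionsToScan(
  appCollections: string[],
  repoCollections: string[] | null,
): string[] {
  // Fallback: without describeRepo data, scan everything as before.
  if (repoCollections === null) return appCollections;
  const present = new Set(repoCollections);
  return appCollections.filter((collection) => present.has(collection));
}
```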
The Bluesky AppView's getAuthorFeed mixes in reposts from others,
wasting pagination pages on items that get filtered out. For patak.cat
(~28 posts/day), only ~40% of raw feed items are own posts, so 30 pages
only covered 43 days instead of 6 months.
Switching to PDS listRecords for the heatmap fetches only the user's
own records, making every page count. Also bumped maxPages to 50
(5,000 records) to comfortably cover 6 months for prolific users.
* feat(oauth): seed profile about from Bluesky bio on first sign-up
Import the user's Bluesky bio (description) as the initial "about" text
when creating their Sifa profile. Only set on insert (new users), never
updated on subsequent logins, so user edits are preserved.
* feat(scripts): add one-off backfill for empty about from Bluesky bio
For existing users with no about text, fetches their Bluesky bio and
populates the field. Run once with:
DATABASE_URL=... tsx src/scripts/backfill-about-from-bsky.ts
* feat(auth): add isNewUser field to session endpoint response
Returns isNewUser: true when the authenticated user has zero positions,
education records, and skills in the database, enabling the frontend to
route new users to the onboarding/import flow.
* feat(db): add email_subscriptions table for welcome page email collection
* feat(api): add POST /api/email-subscription endpoint
* feat(oauth): redirect new users to /welcome instead of /import
- Filter topApps to only active apps with recentCount > 0
(previously included inactive apps with 0 activity)
- Distribute per-app fetch limit: 5/N items per app instead of
hardcoded 2 (single-app users now get 5 items, not 2)
- Don't cache empty results when active apps exist — empty
responses from transient fetch failures were getting cached
for 15 minutes, hiding real activity
* feat(activity): add Popfeed to app registry
- Register Popfeed with social.popfeed and app.popsky prefixes
- Scan social.popfeed.feed.review, feed.post, and feed.list collections
- Exclude social.popfeed.feed.like, challenge.participation, and
actor.profile from activity feed
* fix(test): update excluded collections count for Popfeed entries
* fix(activity): filter empty categories from activity feed tabs
Only include categories in availableCategories when the user has apps
with recentCount > 0. Previously, the scanner created rows for all
registered apps (including those with zero records), causing empty
category tabs like "Photos" to appear with no content.
* fix(activity): pass resolved Bluesky embeds with thumb URLs
The Bluesky AppView API returns resolved embeds (with CDN thumb URLs)
on item.post.embed, but we were only passing item.post.record which
has raw blob refs. Merge the resolved embed into the record so the
frontend card can display image thumbnails and link preview images.
* feat(api): add heatmap endpoint for day-level activity aggregation
GET /api/activity/:handleOrDid/heatmap?days=180
Returns day-level counts per app with quantile thresholds.
Cached in Valkey for 4 hours.
* fix(api): address heatmap code review findings
- Purge heatmap cache keys on GDPR suppression
- Cap app fetch to 10 to prevent DoS amplification
- Break out of PDS pagination on old records instead of continuing
- Fix days=0 silently becoming 180
- Simplify to Promise.all since fetches never reject
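
The commit does not say how the days=0 bug arose or how the route now responds. As an illustration only, one common way to get this symptom is a `||` default, which treats 0 as missing. The sketch below contrasts that with a stricter parser; both function names and the null-for-invalid convention are assumptions.

```typescript
const DEFAULT_DAYS = 180;

// Loose variant: Number("0") is falsy, so 0 silently becomes the default.
function parseDaysLoose(raw: string | undefined): number {
  return Number(raw) || DEFAULT_DAYS;
}

// Stricter variant: only a missing parameter gets the default.
// Returns null for invalid input so the caller can decide how to respond.
function parseDays(raw: string | undefined): number | null {
  if (raw === undefined) return DEFAULT_DAYS;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed >= 1 ? parsed : null;
}
```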
* fix(api): resolve lint errors in heatmap route
- Remove unused type imports from test file
- Replace non-null assertions with nullish coalescing in quantile function
Profiles imported from LinkedIn often have location_country as free
text (e.g. "Deutschland", "Schweiz", "United States") without a
country_code. Add a name-to-code reverse lookup covering common
English names plus local-language variants found in production data.
Also parse comma-separated patterns like "Oslo, Oslo, Norway".
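
A sketch of the reverse lookup with comma-separated parsing. The table here holds only a handful of entries for illustration, and checking segments from the end (where the country usually sits) is an assumption about the approach.

```typescript
// Tiny illustrative subset; the production table covers many more names.
const COUNTRY_NAME_TO_CODE = new Map<string, string>([
  ["germany", "DE"],
  ["deutschland", "DE"],
  ["switzerland", "CH"],
  ["schweiz", "CH"],
  ["united states", "US"],
  ["norway", "NO"],
]);

function resolveCountryCode(freeText: string): string | undefined {
  const segments = freeText
    .split(",")
    .map((segment) => segment.trim().toLowerCase())
    .filter((segment) => segment.length > 0);
  // In "Oslo, Oslo, Norway" the country is the last segment, so search backwards.
  for (const segment of [...segments].reverse()) {
    const code = COUNTRY_NAME_TO_CODE.get(segment);
    if (code !== undefined) return code;
  }
  return undefined;
}
```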
* feat(registry): map community.lexicon.calendar to Smoke Signal
* feat(activity): add RSVP event enrichment with cross-PDS fetch
* feat(activity): wire RSVP enrichment into feed and teaser endpoints
* fix(tests): remove non-null assertions in activity enrichment tests
* fix: replace non-null assertions with guard checks in parseAtUri
Extend the create-or-update pattern from #143 to all remaining PUT
endpoints: position, education, skill, generic records, and external
accounts (including primary toggle).
If the local DB has a record but the PDS doesn't (e.g. after an external
PDS reset or a race with a Jetstream delete), the blind update would fail.
Now all writes check pdsRecordExists() first and fall back to create.
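
A sketch of the create-or-update decision, assuming the result of pdsRecordExists() is passed in as a boolean (its real signature is not shown in this log). `buildWriteOp` and the example collection name are hypothetical.

```typescript
type WriteOp = {
  $type:
    | "com.atproto.repo.applyWrites#create"
    | "com.atproto.repo.applyWrites#update";
  collection: string;
  rkey: string;
  value: Record<string, unknown>;
};

// existsOnPds: outcome of the existence check performed before writing.
function buildWriteOp(
  existsOnPds: boolean,
  collection: string,
  rkey: string,
  value: Record<string, unknown>,
): WriteOp {
  return {
    // Update only when the PDS already has the record; otherwise create it.
    $type: existsOnPds
      ? "com.atproto.repo.applyWrites#update"
      : "com.atproto.repo.applyWrites#create",
    collection,
    rkey,
    value,
  };
}
```

The check and the write are separate requests, so a delete that lands in between can still make the update fail; the pattern narrows that window but does not close it.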
Register Keytrace (dev.keytrace.claim) in the ATproto app registry so
claims appear in the cross-app activity feed. Extend the external
accounts endpoint to fetch Keytrace claims from the user's PDS, merge
them with Sifa-declared accounts using URL normalization and
platform/username matching, and surface cryptographic verification
status. Claims are cached in Valkey with a 2-hour TTL, invalidated
on profile sync.
* feat(activity): include availableCategories in feed response
Add an availableCategories string array to the activity feed response,
derived from visible active app stats. This lets the frontend hide
category filter tabs when a user has content in only one category.
Closes singi-labs/sifa-web#327
* fix(activity): only show Load More when sources have more items
Discard upstream cursors when a source returned fewer items than the
requested page size, so hasMore is false when all sources are exhausted.
Previously the button appeared even with 4 or 8 total items because
PDS listRecords and Bluesky getAuthorFeed return cursors on every page.
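
A sketch of the cursor handling described above, assuming each source returns a page of items plus an optional cursor. The type and function names are illustrative.

```typescript
interface SourcePage<T> {
  items: T[];
  cursor?: string;
}

// A short page means the source is exhausted, even if it still returned a cursor.
function effectiveCursor<T>(page: SourcePage<T>, pageSize: number): string | undefined {
  return page.items.length < pageSize ? undefined : page.cursor;
}

// Load More is only warranted while at least one source can still continue.
function hasMore<T>(pages: SourcePage<T>[], pageSize: number): boolean {
  return pages.some((page) => effectiveCursor(page, pageSize) !== undefined);
}
```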
PUT /api/profile/self always used applyWrites#update, which fails with
a 500 from the PDS when the record doesn't exist yet. This affects any
user who edits their profile without first importing via LinkedIn.
Check whether the record exists on the PDS before writing, and use
create or update accordingly — same pattern already used by the import
route.
Fixes SIFAID-3 (34 occurrences today).
Fetch Standard app publications from the user's PDS and merge them with
Sifa-native publications on the profile detail page. Standard publications
are read-only, deduplicated by URL or title (Sifa entries take precedence),
and tagged with source: 'standard' vs 'sifa'. Results are cached in Valkey
for 15 minutes.
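
The merge rule (deduplicate by URL or title, Sifa entries win) might look like the sketch below. The record shape and the case-insensitive, trimmed comparison are assumptions; the commit only mentions URL or title matching.

```typescript
interface PublicationInput {
  title: string;
  url?: string;
}

interface Publication extends PublicationInput {
  source: "sifa" | "standard";
}

const normalize = (text: string): string => text.trim().toLowerCase();

function mergePublications(
  sifa: PublicationInput[],
  standard: PublicationInput[],
): Publication[] {
  const seenUrls = new Set<string>();
  const seenTitles = new Set<string>();
  const merged: Publication[] = [];

  // Sifa entries go first so they take precedence over Standard duplicates.
  for (const pub of sifa) {
    if (pub.url !== undefined) seenUrls.add(normalize(pub.url));
    seenTitles.add(normalize(pub.title));
    merged.push({ ...pub, source: "sifa" });
  }
  for (const pub of standard) {
    const sameUrl = pub.url !== undefined && seenUrls.has(normalize(pub.url));
    const sameTitle = seenTitles.has(normalize(pub.title));
    if (sameUrl || sameTitle) continue;
    if (pub.url !== undefined) seenUrls.add(normalize(pub.url));
    seenTitles.add(normalize(pub.title));
    merged.push({ ...pub, source: "standard" });
  }
  return merged;
}
```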
Also refactors pds-provider.ts to extract getDidDocUrl helper and adds
resolvePdsEndpoint for resolving full PDS endpoint URLs from DIDs.
Closes #131
Smoke Signal RSVPs use the community.lexicon.calendar.rsvp collection,
not the events.smokesignal namespace. Update the app registry to scan
the correct collection so RSVP activity appears in user feeds.
- Add community.lexicon.calendar to Smoke Signal collectionPrefixes
- Replace events.smokesignal.calendar.event with community.lexicon.calendar.rsvp
and events.smokesignal.profile in scanCollections
- Remove events.smokesignal.calendar.rsvp from EXCLUDED_COLLECTIONS
(was an incorrect NSID that never existed)
Closes #127
* feat(follow): add GET /api/following endpoint (#317)
Returns paginated list of everyone the authenticated user follows,
with LEFT JOIN on profiles for claimed status. Supports source
filter and cursor-based pagination.
* fix(follow): validate source/cursor params, handle enrichment errors
- Validate source against whitelist (sifa, bluesky, tangled)
- Validate cursor is a valid ISO date
- Wrap fetchProfilesFromBluesky in try/catch for graceful degradation
- Rename response field to avatarUrl for consistency with suggestions
Expands the seed file from ~100 to 230 canonical skills covering all
unresolved skills in production. Changes:
- Add ~65 new canonical skills (WordPress, jQuery, Django, CAD, Adobe
tools, Design Thinking, IoT, languages, etc.)
- Add ~50 new aliases to existing canonicals (variant names, foreign
language translations, common abbreviations)
- Change seed upsert from onConflictDoNothing to onConflictDoUpdate
so re-running the seed updates aliases on existing entries
- Add resolveUnresolvedSkills() step that re-matches all pending
unresolved_skills against the updated canonical list after seeding
Only 4 entries remain intentionally unresolved (test junk data and
overly vague terms like "create" and "computer literacy").
Closes singi-labs/sifa-web#316
Verified against real PDS records. Three namespaces were wrong:
- Frontpage: com.frontpage.* -> fyi.unravel.frontpage.*
- Picosky: blue.picosky.* -> social.psky.* (also: category Chat, not Pages)
- PasteSphere: com.pastesphere.* -> link.pastesphere.*
Also:
- Added verified scanCollections for all apps (Tangled repos/issues/PRs,
Smoke Signal events, Flashes posts, Frontpage posts, etc.)
- Added URL patterns for Tangled, Smoke Signal, Frontpage, Picosky,
Linkat, PasteSphere
- Added cross-app excluded collections (Tangled follows/stars,
Smoke Signal RSVPs, Frontpage votes)
- GET /api/activity/:handleOrDid/teaser — returns 3-5 recent items
across top apps for the profile activity card, with Valkey caching
(15 min TTL)
- GET /api/activity/:handleOrDid — full paginated activity feed with
category filtering, composite cursor pagination, and per-collection
Valkey caching (5 min TTL)
- POST /api/privacy/suppress — GDPR erasure endpoint that suppresses
a DID and clears all cached activity data
All PDS/AppView calls use 3s timeouts and Promise.allSettled for
graceful partial failure handling. Bluesky content fetched via public
AppView API; other apps via direct PDS listRecords.
- Add activeApps array to claimed profile response (cross-app activity from
user_app_stats, filtered by visibility and suppression status)
- Add activeApps: [] to unclaimed profile response (GDPR: no cross-app data)
- Fire-and-forget triggerRefreshIfStale on claimed profile views when valkey
and pdsHost are available
- New PUT /api/profile/activity-visibility endpoint for toggling per-app
visibility (authenticated, Zod-validated)
- Register activity routes in server.ts
- getAppStatsForDid / getVisibleAppStats: query userAppStats by DID
- upsertScanResults: upsert scan results with onConflictDoUpdate, prune stale apps
- triggerRefreshIfStale: fire-and-forget background scan with Valkey NX lock (120s TTL)
- isDidSuppressed / suppressDid: suppression list with stats + cache cleanup
- 8 unit tests covering all functions
* fix(admin): fill missing dates in chart data with zero-count entries
Admin dashboard charts (Daily Signups, Cumulative Users, DAU, LinkedIn
Imports) were skipping dates with no data, making the X-axis misleading.
Added fillDateGaps helper that generates all dates in the query range
and fills missing ones with zero values.
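
A sketch of what a gap-filling helper like this could look like. The real fillDateGaps signature is not shown in this log, so the row shape and the inclusive `YYYY-MM-DD` start/end parameters are assumptions.

```typescript
interface DailyCount {
  date: string; // YYYY-MM-DD
  count: number;
}

function fillDateGaps(rows: DailyCount[], start: string, end: string): DailyCount[] {
  const countsByDate = new Map<string, number>();
  for (const row of rows) countsByDate.set(row.date, row.count);

  const filled: DailyCount[] = [];
  // Work in UTC so daylight-saving shifts cannot skip or repeat a day.
  const cursor = new Date(`${start}T00:00:00Z`);
  const last = new Date(`${end}T00:00:00Z`);
  while (cursor.getTime() <= last.getTime()) {
    const date = cursor.toISOString().slice(0, 10);
    filled.push({ date, count: countsByDate.get(date) ?? 0 });
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return filled;
}
```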
* fix(admin): replace non-null assertion with safe check
The profile resolver was inserting profile rows for ALL Bluesky
follows (5000+), polluting the profiles table and inflating user
counts on the admin page.
Now only persists profiles for DIDs that have sessions (actual Sifa
users). For "Not on Sifa" suggestion cards, resolves Bluesky
profile data on the fly without persisting.
The suggestions endpoint was fetching 20 connections ordered by
createdAt, then splitting into onSifa/notOnSifa. With thousands of
follows, the few "On Sifa" users were buried and never appeared.
Now queries them separately: all "On Sifa" suggestions (uncapped),
then fills with paginated "Not on Sifa" up to the limit.