+124
-8
README.md
···
11
11
## scripts
12
12
13
13
- [`check-files-for-bad-links`](#check-files-for-bad-links)
14
+
- [`dm-me-when-a-flight-passes-over`](#dm-me-when-a-flight-passes-over)
14
15
- [`find-longest-bsky-thread`](#find-longest-bsky-thread)
15
16
- [`kill-processes`](#kill-processes)
16
17
- [`predict-github-stars`](#predict-github-stars)
···
39
40
40
41
---
41
42
43
+
### `dm-me-when-a-flight-passes-over`
44
+
45
+
Monitor flights passing overhead and send Bluesky DMs.
46
+
47
+
Usage:
48
+
# Single user mode (backward compatible)
49
+
./dm-me-when-a-flight-passes-over
50
+
51
+
# Multi-subscriber mode with JSON file
52
+
./dm-me-when-a-flight-passes-over --subscribers subscribers.json
53
+
54
+
# Multi-subscriber mode with stdin
55
+
echo '[{"handle": "user1.bsky.social", "latitude": 41.8781, "longitude": -87.6298, "radius_miles": 5}]' | ./dm-me-when-a-flight-passes-over --subscribers -
56
+
57
+
This script monitors flights within a configurable radius and sends DMs on Bluesky
58
+
when flights pass overhead. It supports multiple subscribers, each with their own location and radius.
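A minimal sketch of how the `--subscribers` JSON from the usage example could be parsed, assuming only the four fields shown there (`handle`, `latitude`, `longitude`, `radius_miles`). It uses a standard-library dataclass as a stand-in for the script's own model, and `parse_subscribers` is a hypothetical helper name used only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscriber:
    """Illustrative stand-in for one subscriber record (fields from the usage example)."""

    handle: str
    latitude: float
    longitude: float
    radius_miles: float = 5.0  # the script's model also defaults to 5 miles


def parse_subscribers(raw: str) -> list[Subscriber]:
    """Parse a JSON array of subscriber objects (hypothetical helper)."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of subscriber objects")
    # Unknown or missing keys surface as TypeError from the dataclass constructor.
    return [Subscriber(**item) for item in items]


if __name__ == "__main__":
    example = (
        '[{"handle": "user1.bsky.social", "latitude": 41.8781, '
        '"longitude": -87.6298, "radius_miles": 5}]'
    )
    for sub in parse_subscribers(example):
        print(sub)
```

Omitting `radius_miles` from an entry falls back to the 5-mile default, so a subscriber record needs only a handle and a position.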
59
+
60
+
## Future Architecture Ideas
61
+
62
+
### Web App Deployment Options
63
+
64
+
1. **FastAPI + Fly.io/Railway/Render**
65
+
- REST API with endpoints:
66
+
- POST /subscribe - Register user with BlueSky handle
67
+
- DELETE /unsubscribe - Remove subscription
68
+
- POST /update-location - Update user's location
69
+
- GET /status - Check subscription status
70
+
- Background worker using Celery/RQ/APScheduler
71
+
- PostgreSQL/SQLite for subscriber persistence
72
+
- Redis for caching flight data & deduplication
73
+
74
+
2. **Vercel/Netlify Edge Functions**
75
+
- Serverless approach with scheduled cron jobs
76
+
- Use Vercel KV or Upstash Redis for state
77
+
- Challenge: Long-running monitoring needs workarounds
78
+
- Solution: Trigger checks via cron every minute
79
+
80
+
3. **Self-Hosted with ngrok/Cloudflare Tunnel**
81
+
- Quick prototype option
82
+
- Run this script as daemon
83
+
- Expose simple Flask/FastAPI wrapper
84
+
- Security concerns: rate limiting, auth required
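The four endpoints listed under option 1 map onto a small set of store operations. The sketch below is framework-free and keeps everything in memory; `SubscriptionStore` and its method names are hypothetical, and a real deployment would put a web framework and a database in front of something like it.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    handle: str
    latitude: float
    longitude: float
    radius_miles: float = 5.0


class SubscriptionStore:
    """In-memory stand-in for the persistence layer (hypothetical, not part of the script)."""

    def __init__(self) -> None:
        self._subs: dict[str, Subscription] = {}

    def subscribe(
        self, handle: str, latitude: float, longitude: float, radius_miles: float = 5.0
    ) -> Subscription:
        # Would back POST /subscribe; re-subscribing overwrites the old record.
        sub = Subscription(handle, latitude, longitude, radius_miles)
        self._subs[handle] = sub
        return sub

    def unsubscribe(self, handle: str) -> bool:
        # Would back DELETE /unsubscribe; returns False if the handle was unknown.
        return self._subs.pop(handle, None) is not None

    def update_location(self, handle: str, latitude: float, longitude: float) -> Subscription:
        # Would back POST /update-location; raises KeyError for unknown handles.
        sub = self._subs[handle]
        sub.latitude, sub.longitude = latitude, longitude
        return sub

    def status(self, handle: str) -> dict:
        # Would back GET /status.
        sub = self._subs.get(handle)
        return {"subscribed": sub is not None, "subscription": sub}
```

Keeping the store separate from the HTTP layer means the same operations could be reused by the serverless or self-hosted options without change.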
85
+
86
+
### Mobile/Browser Integration
87
+
88
+
1. **Progressive Web App (PWA)**
89
+
- Service worker for background location updates
90
+
- Geolocation API for current position
91
+
- Push notifications instead of/alongside DMs
92
+
- IndexedDB for offline capability
93
+
94
+
2. **iOS Shortcuts Integration**
95
+
- Create shortcut that gets location
96
+
- Calls webhook with location + BlueSky handle
97
+
- Could run automatically based on focus modes
98
+
99
+
3. **Browser Extension**
100
+
- Background script polls location
101
+
- Lighter weight than full app
102
+
- Cross-platform solution
103
+
104
+
### Architecture Components
105
+
106
+
1. **Location Services Layer**
107
+
- Browser Geolocation API
108
+
- IP-based geolocation fallback
109
+
- Manual location picker UI
110
+
- Privacy: Only send location when checking flights
111
+
112
+
2. **Notification Options**
113
+
- BlueSky DMs (current)
114
+
- Web Push Notifications
115
+
- Webhooks to other services
116
+
- Email/SMS via Twilio/SendGrid
117
+
118
+
3. **Subscription Management**
119
+
- OAuth with BlueSky for auth
120
+
- User preferences: radius, notification types
121
+
- Quiet hours/Do Not Disturb
122
+
- Rate limiting per user
123
+
124
+
4. **Data Optimization**
125
+
- Cache FlightRadar API responses
126
+
- Batch location updates
127
+
- Aggregate nearby users for efficiency
128
+
- WebSocket for real-time updates
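One way the caching and "aggregate nearby users" ideas could fit together is a short-lived cache keyed by a rounded location, so subscribers in roughly the same place share one API call. This is a sketch under assumptions: `TTLCache`, `area_key`, and the two-decimal rounding are illustrative choices, not anything the script currently does.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Tiny time-based cache; the injectable clock makes it easy to test."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[Hashable, tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the upstream call
        value = fetch()
        self._entries[key] = (now, value)
        return value


def area_key(
    latitude: float, longitude: float, radius_miles: float, precision: int = 2
) -> tuple[float, float, float]:
    """Group nearby subscribers by rounding coordinates (assumed granularity)."""
    return (round(latitude, precision), round(longitude, precision), radius_miles)
```

Rounding to two decimals groups points within roughly a kilometre of each other; flights would still need a per-subscriber distance check after the shared fetch.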
129
+
130
+
### Implementation Approach
131
+
132
+
Phase 1: Web API Wrapper
133
+
- FastAPI with /subscribe endpoint
134
+
- SQLite for subscribers
135
+
- Run monitoring in background thread
136
+
- Deploy to Fly.io free tier
137
+
138
+
Phase 2: Web UI
139
+
- Simple React/Vue form
140
+
- Geolocation permission request
141
+
- Show nearby flights on map
142
+
- Subscription management
143
+
144
+
Phase 3: Mobile Experience
145
+
- PWA with service workers
146
+
- Background location updates
147
+
- Local notifications
148
+
- Offline support
149
+
150
+
### Security Considerations
151
+
- Rate limit FlightRadar API calls
152
+
- Authenticate BlueSky handles
153
+
- Validate location bounds
154
+
- Prevent subscription spam
155
+
- GDPR compliance for location data
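The "validate location bounds" item could look something like the following. It is a sketch only: `validate_location` is a hypothetical helper, and the 50-mile radius cap is an assumed limit chosen for illustration.

```python
import math


def validate_location(
    latitude: float,
    longitude: float,
    radius_miles: float,
    max_radius_miles: float = 50.0,  # assumed cap, not taken from the script
) -> None:
    """Raise ValueError if a subscriber's location or radius is out of bounds."""
    values = (latitude, longitude, radius_miles)
    if not all(isinstance(v, (int, float)) and math.isfinite(v) for v in values):
        raise ValueError("latitude, longitude and radius must be finite numbers")
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    if not 0.0 < radius_miles <= max_radius_miles:
        raise ValueError(f"radius must be in (0, {max_radius_miles}] miles")
```

Rejecting oversized radii at subscription time also limits how much upstream API quota a single subscriber can consume.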
156
+
157
+
---
158
+
42
159
### `find-longest-bsky-thread`
43
160
44
161
Find the longest reply thread from a Bluesky post.
···
76
193
Predict when a GitHub repository will reach a target number of stars.
77
194
78
195
Usage:
79
-
80
-
```bash
81
-
./predict-github-stars anthropics/claude-dev 10000
82
-
```
196
+
./predict-github-stars owner/repo 10000
83
197
84
198
Details:
85
-
- uses github api to fetch star history
86
-
- uses polynomial regression to predict future star growth
87
-
- shows confidence intervals based on historical variance
88
-
- requires `GITHUB_TOKEN` in environment for higher rate limits (optional)
199
+
- Uses GitHub REST API to fetch star history (with timestamps).
200
+
- Fits polynomial regression (degree 1–3) to full history.
201
+
- Falls back to recent‑trend linear extrapolation if the polynomial
202
+
cannot reach the target within ten years.
203
+
- Shows recent growth rate and a caution for long‑range estimates.
204
+
- Optionally reads `GITHUB_TOKEN` from the environment for higher rate limits.
89
205
90
206
---
91
207
+487
dm-me-when-a-flight-passes-over
···
1
+
#!/usr/bin/env -S uv run --script --quiet
2
+
# /// script
3
+
# requires-python = ">=3.12"
4
+
# dependencies = ["atproto", "pydantic-settings", "geopy", "httpx"]
5
+
# ///
6
+
"""
7
+
Monitor flights passing overhead and send Bluesky DMs.
8
+
9
+
Usage:
10
+
# Single user mode (backward compatible)
11
+
./dm-me-when-a-flight-passes-over
12
+
13
+
# Multi-subscriber mode with JSON file
14
+
./dm-me-when-a-flight-passes-over --subscribers subscribers.json
15
+
16
+
# Multi-subscriber mode with stdin
17
+
echo '[{"handle": "user1.bsky.social", "latitude": 41.8781, "longitude": -87.6298, "radius_miles": 5}]' | ./dm-me-when-a-flight-passes-over --subscribers -
18
+
19
+
This script monitors flights within a configurable radius and sends DMs on Bluesky
20
+
when flights pass overhead. It supports multiple subscribers, each with their own location and radius.
21
+
22
+
## Future Architecture Ideas
23
+
24
+
### Web App Deployment Options
25
+
26
+
1. **FastAPI + Fly.io/Railway/Render**
27
+
- REST API with endpoints:
28
+
- POST /subscribe - Register user with BlueSky handle
29
+
- DELETE /unsubscribe - Remove subscription
30
+
- POST /update-location - Update user's location
31
+
- GET /status - Check subscription status
32
+
- Background worker using Celery/RQ/APScheduler
33
+
- PostgreSQL/SQLite for subscriber persistence
34
+
- Redis for caching flight data & deduplication
35
+
36
+
2. **Vercel/Netlify Edge Functions**
37
+
- Serverless approach with scheduled cron jobs
38
+
- Use Vercel KV or Upstash Redis for state
39
+
- Challenge: Long-running monitoring needs workarounds
40
+
- Solution: Trigger checks via cron every minute
41
+
42
+
3. **Self-Hosted with ngrok/Cloudflare Tunnel**
43
+
- Quick prototype option
44
+
- Run this script as daemon
45
+
- Expose simple Flask/FastAPI wrapper
46
+
- Security concerns: rate limiting, auth required
47
+
48
+
### Mobile/Browser Integration
49
+
50
+
1. **Progressive Web App (PWA)**
51
+
- Service worker for background location updates
52
+
- Geolocation API for current position
53
+
- Push notifications instead of/alongside DMs
54
+
- IndexedDB for offline capability
55
+
56
+
2. **iOS Shortcuts Integration**
57
+
- Create shortcut that gets location
58
+
- Calls webhook with location + BlueSky handle
59
+
- Could run automatically based on focus modes
60
+
61
+
3. **Browser Extension**
62
+
- Background script polls location
63
+
- Lighter weight than full app
64
+
- Cross-platform solution
65
+
66
+
### Architecture Components
67
+
68
+
1. **Location Services Layer**
69
+
- Browser Geolocation API
70
+
- IP-based geolocation fallback
71
+
- Manual location picker UI
72
+
- Privacy: Only send location when checking flights
73
+
74
+
2. **Notification Options**
75
+
- BlueSky DMs (current)
76
+
- Web Push Notifications
77
+
- Webhooks to other services
78
+
- Email/SMS via Twilio/SendGrid
79
+
80
+
3. **Subscription Management**
81
+
- OAuth with BlueSky for auth
82
+
- User preferences: radius, notification types
83
+
- Quiet hours/Do Not Disturb
84
+
- Rate limiting per user
85
+
86
+
4. **Data Optimization**
87
+
- Cache FlightRadar API responses
88
+
- Batch location updates
89
+
- Aggregate nearby users for efficiency
90
+
- WebSocket for real-time updates
91
+
92
+
### Implementation Approach
93
+
94
+
Phase 1: Web API Wrapper
95
+
- FastAPI with /subscribe endpoint
96
+
- SQLite for subscribers
97
+
- Run monitoring in background thread
98
+
- Deploy to Fly.io free tier
99
+
100
+
Phase 2: Web UI
101
+
- Simple React/Vue form
102
+
- Geolocation permission request
103
+
- Show nearby flights on map
104
+
- Subscription management
105
+
106
+
Phase 3: Mobile Experience
107
+
- PWA with service workers
108
+
- Background location updates
109
+
- Local notifications
110
+
- Offline support
111
+
112
+
### Security Considerations
113
+
- Rate limit FlightRadar API calls
114
+
- Authenticate BlueSky handles
115
+
- Validate location bounds
116
+
- Prevent subscription spam
117
+
- GDPR compliance for location data
118
+
"""
119
+
120
+
import argparse
121
+
import time
122
+
import math
123
+
import json
124
+
import sys
125
+
from datetime import datetime
126
+
from concurrent.futures import ThreadPoolExecutor, as_completed
127
+
128
+
import httpx
129
+
from atproto import Client
130
+
from geopy import distance
131
+
from pydantic import BaseModel, Field
132
+
from pydantic_settings import BaseSettings, SettingsConfigDict
133
+
134
+
135
+
class Settings(BaseSettings):
136
+
"""App settings loaded from environment variables"""
137
+
138
+
model_config = SettingsConfigDict(env_file=".env", extra="ignore")
139
+
140
+
bsky_handle: str = Field(...)
141
+
bsky_password: str = Field(...)
142
+
flightradar_api_token: str = Field(...)
143
+
144
+
145
+
class Subscriber(BaseModel):
146
+
"""Subscriber with location information"""
147
+
148
+
handle: str
149
+
latitude: float
150
+
longitude: float
151
+
radius_miles: float = 5.0
152
+
153
+
154
+
class Flight(BaseModel):
155
+
"""Flight data model"""
156
+
157
+
hex: str
158
+
latitude: float
159
+
longitude: float
160
+
altitude: float | None = None
161
+
ground_speed: float | None = None
162
+
heading: float | None = None
163
+
aircraft_type: str | None = None
164
+
registration: str | None = None
165
+
origin: str | None = None
166
+
destination: str | None = None
167
+
callsign: str | None = None
168
+
distance_miles: float
169
+
170
+
171
+
def get_flights_in_area(
172
+
settings: Settings, latitude: float, longitude: float, radius_miles: float
173
+
) -> list[Flight]:
174
+
"""Get flights within the specified radius using FlightRadar24 API."""
175
+
lat_offset = radius_miles / 69 # 1 degree latitude ≈ 69 miles
176
+
lon_offset = radius_miles / (69 * max(abs(math.cos(math.radians(latitude))), 1e-6))  # clamp avoids division by zero at the poles
177
+
178
+
bounds = {
179
+
"north": latitude + lat_offset,
180
+
"south": latitude - lat_offset,
181
+
"west": longitude - lon_offset,
182
+
"east": longitude + lon_offset,
183
+
}
184
+
185
+
headers = {
186
+
"Authorization": f"Bearer {settings.flightradar_api_token}",
187
+
"Accept": "application/json",
188
+
"Accept-Version": "v1",
189
+
}
190
+
191
+
url = "https://fr24api.flightradar24.com/api/live/flight-positions/full"
192
+
params = {
193
+
"bounds": f"{bounds['north']},{bounds['south']},{bounds['west']},{bounds['east']}"
194
+
}
195
+
196
+
try:
197
+
with httpx.Client() as client:
198
+
response = client.get(url, headers=headers, params=params, timeout=10)
199
+
response.raise_for_status()
200
+
data = response.json()
201
+
202
+
flights_in_radius = []
203
+
center = (latitude, longitude)
204
+
205
+
if isinstance(data, dict) and "data" in data:
206
+
for flight_data in data["data"]:
207
+
lat = flight_data.get("lat")
208
+
lon = flight_data.get("lon")
209
+
210
+
if lat is not None and lon is not None:  # 0.0 is a valid coordinate
211
+
flight_pos = (lat, lon)
212
+
dist = distance.distance(center, flight_pos).miles
213
+
if dist <= radius_miles:
214
+
flight = Flight(
215
+
hex=flight_data.get("fr24_id", ""),
216
+
latitude=lat,
217
+
longitude=lon,
218
+
altitude=flight_data.get("alt"),
219
+
ground_speed=flight_data.get("gspeed"),
220
+
heading=flight_data.get("track"),
221
+
aircraft_type=flight_data.get("type"),
222
+
registration=flight_data.get("reg"),
223
+
origin=flight_data.get("orig_iata"),
224
+
destination=flight_data.get("dest_iata"),
225
+
callsign=flight_data.get("flight"),
226
+
distance_miles=round(dist, 2),
227
+
)
228
+
flights_in_radius.append(flight)
229
+
230
+
return flights_in_radius
231
+
except httpx.HTTPStatusError as e:
232
+
print(f"HTTP error fetching flights: {e}")
233
+
print(f"Response status: {e.response.status_code}")
234
+
print(f"Response content: {e.response.text[:500]}")
235
+
return []
236
+
except Exception as e:
237
+
print(f"Error fetching flights: {e}")
238
+
return []
239
+
240
+
241
+
def format_flight_info(flight: Flight) -> str:
242
+
"""Format flight information for a DM."""
243
+
parts = ["✈️ Flight passing overhead!\n"]
244
+
245
+
parts.append(f"Flight: {flight.callsign or 'Unknown'}")
246
+
parts.append(f"Distance: {flight.distance_miles} miles")
247
+
248
+
if flight.altitude is not None:
249
+
parts.append(f"Altitude: {flight.altitude:,.0f} ft")
250
+
if flight.ground_speed is not None:
251
+
parts.append(f"Speed: {flight.ground_speed:.0f} kts")
252
+
if flight.heading is not None:  # a heading of 0° (due north) is valid
253
+
parts.append(f"Heading: {flight.heading:.0f}°")
254
+
if flight.aircraft_type:
255
+
parts.append(f"Aircraft: {flight.aircraft_type}")
256
+
257
+
if flight.origin or flight.destination:
258
+
route = f"{flight.origin or '???'} → {flight.destination or '???'}"
259
+
parts.append(f"Route: {route}")
260
+
261
+
parts.append(f"\nTime: {datetime.now().strftime('%H:%M:%S')}")
262
+
263
+
return "\n".join(parts)
264
+
265
+
266
+
def send_dm(client: Client, message: str, target_handle: str) -> bool:
267
+
"""Send a direct message to the specified handle on BlueSky."""
268
+
try:
269
+
resolved = client.com.atproto.identity.resolve_handle(
270
+
params={"handle": target_handle}
271
+
)
272
+
target_did = resolved.did
273
+
274
+
chat_client = client.with_bsky_chat_proxy()
275
+
276
+
convo_response = chat_client.chat.bsky.convo.get_convo_for_members(
277
+
{"members": [target_did]}
278
+
)
279
+
280
+
if not convo_response or not convo_response.convo:
281
+
print(f"Could not create/get conversation with {target_handle}")
282
+
return False
283
+
284
+
recipient = None
285
+
for member in convo_response.convo.members:
286
+
if member.did != client.me.did:
287
+
recipient = member
288
+
break
289
+
290
+
if not recipient or recipient.handle != target_handle:
291
+
print(
292
+
f"ERROR: About to message wrong person! Expected {target_handle}, but found {recipient.handle if recipient else 'no recipient'}"
293
+
)
294
+
return False
295
+
296
+
chat_client.chat.bsky.convo.send_message(
297
+
data={
298
+
"convoId": convo_response.convo.id,
299
+
"message": {"text": message, "facets": []},
300
+
}
301
+
)
302
+
303
+
print(f"DM sent to {target_handle}")
304
+
return True
305
+
306
+
except Exception as e:
307
+
print(f"Error sending DM to {target_handle}: {e}")
308
+
return False
309
+
310
+
311
+
def process_subscriber(
312
+
client: Client,
313
+
settings: Settings,
314
+
subscriber: Subscriber,
315
+
notified_flights: dict[str, set[str]],
316
+
) -> None:
317
+
"""Process flights for a single subscriber."""
318
+
try:
319
+
flights = get_flights_in_area(
320
+
settings, subscriber.latitude, subscriber.longitude, subscriber.radius_miles
321
+
)
322
+
323
+
if subscriber.handle not in notified_flights:
324
+
notified_flights[subscriber.handle] = set()
325
+
326
+
subscriber_notified = notified_flights[subscriber.handle]
327
+
328
+
for flight in flights:
329
+
flight_id = flight.hex
330
+
331
+
if flight_id not in subscriber_notified:
332
+
message = format_flight_info(flight)
333
+
print(f"\n[{subscriber.handle}] {message}\n")
334
+
335
+
if send_dm(client, message, subscriber.handle):
336
+
print(f"DM sent to {subscriber.handle} for flight {flight_id}")
337
+
subscriber_notified.add(flight_id)
338
+
else:
339
+
print(
340
+
f"Failed to send DM to {subscriber.handle} for flight {flight_id}"
341
+
)
342
+
343
+
current_flight_ids = {f.hex for f in flights}
344
+
notified_flights[subscriber.handle] &= current_flight_ids
345
+
346
+
if not flights:
347
+
print(
348
+
f"[{subscriber.handle}] No flights in range at {datetime.now().strftime('%H:%M:%S')}"
349
+
)
350
+
351
+
except Exception as e:
352
+
print(f"Error processing subscriber {subscriber.handle}: {e}")
353
+
354
+
355
+
def load_subscribers(subscribers_input: str | None) -> list[Subscriber]:
356
+
"""Load subscribers from JSON file or stdin."""
357
+
if subscribers_input:
358
+
with open(subscribers_input, "r") as f:
359
+
data = json.load(f)
360
+
else:
361
+
print("Reading subscriber data from stdin (provide JSON array)...")
362
+
data = json.load(sys.stdin)
363
+
364
+
return [Subscriber(**item) for item in data]
365
+
366
+
367
+
def main():
368
+
"""Main monitoring loop."""
369
+
parser = argparse.ArgumentParser(
370
+
description="Monitor flights overhead and send BlueSky DMs"
371
+
)
372
+
373
+
parser.add_argument(
374
+
"--subscribers",
375
+
type=str,
376
+
help="JSON file with subscriber list, or '-' for stdin",
377
+
)
378
+
parser.add_argument(
379
+
"--latitude", type=float, default=41.8781, help="Latitude (default: Chicago)"
380
+
)
381
+
parser.add_argument(
382
+
"--longitude", type=float, default=-87.6298, help="Longitude (default: Chicago)"
383
+
)
384
+
parser.add_argument(
385
+
"--radius", type=float, default=5.0, help="Radius in miles (default: 5)"
386
+
)
387
+
parser.add_argument(
388
+
"--handle",
389
+
type=str,
390
+
default="alternatebuild.dev",
391
+
help="BlueSky handle to DM (default: alternatebuild.dev)",
392
+
)
393
+
parser.add_argument(
394
+
"--interval",
395
+
type=int,
396
+
default=60,
397
+
help="Check interval in seconds (default: 60)",
398
+
)
399
+
parser.add_argument(
400
+
"--once", action="store_true", help="Run once and exit (for testing)"
401
+
)
402
+
parser.add_argument(
403
+
"--max-workers",
404
+
type=int,
405
+
default=5,
406
+
help="Max concurrent workers for processing subscribers (default: 5)",
407
+
)
408
+
args = parser.parse_args()
409
+
410
+
try:
411
+
settings = Settings()
412
+
except Exception as e:
413
+
print(f"Error loading settings: {e}")
414
+
print(
415
+
"Ensure .env file exists with BSKY_HANDLE, BSKY_PASSWORD, and FLIGHTRADAR_API_TOKEN"
416
+
)
417
+
return
418
+
419
+
client = Client()
420
+
try:
421
+
client.login(settings.bsky_handle, settings.bsky_password)
422
+
print(f"Logged in to BlueSky as {settings.bsky_handle}")
423
+
except Exception as e:
424
+
print(f"Error logging into BlueSky: {e}")
425
+
return
426
+
427
+
if args.subscribers:
428
+
if args.subscribers == "-":
429
+
subscribers_input = None
430
+
else:
431
+
subscribers_input = args.subscribers
432
+
433
+
try:
434
+
subscribers = load_subscribers(subscribers_input)
435
+
print(f"Loaded {len(subscribers)} subscriber(s)")
436
+
except Exception as e:
437
+
print(f"Error loading subscribers: {e}")
438
+
return
439
+
else:
440
+
subscribers = [
441
+
Subscriber(
442
+
handle=args.handle,
443
+
latitude=args.latitude,
444
+
longitude=args.longitude,
445
+
radius_miles=args.radius,
446
+
)
447
+
]
448
+
print(
449
+
f"Monitoring flights within {args.radius} miles of ({args.latitude}, {args.longitude}) for {args.handle}"
450
+
)
451
+
452
+
print(f"Checking every {args.interval} seconds...")
453
+
454
+
notified_flights: dict[str, set[str]] = {}
455
+
456
+
while True:
457
+
try:
458
+
with ThreadPoolExecutor(max_workers=args.max_workers) as executor:
459
+
futures = []
460
+
for subscriber in subscribers:
461
+
future = executor.submit(
462
+
process_subscriber,
463
+
client,
464
+
settings,
465
+
subscriber,
466
+
notified_flights,
467
+
)
468
+
futures.append(future)
469
+
470
+
for future in as_completed(futures):
471
+
future.result()
472
+
473
+
if args.once:
474
+
break
475
+
476
+
time.sleep(args.interval)
477
+
478
+
except KeyboardInterrupt:
479
+
print("\nStopping flight monitor...")
480
+
break
481
+
except Exception as e:
482
+
print(f"Error in monitoring loop: {e}")
483
+
time.sleep(args.interval)
484
+
485
+
486
+
if __name__ == "__main__":
487
+
main()
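The area query in `get_flights_in_area` is a two-step filter: a coarse bounding box sent to the API, then an exact distance check per flight. The sketch below reproduces that idea with the standard library only. Note the swap: it uses the haversine great-circle formula where the script calls `geopy`, so results can differ slightly, and the function names and Earth-radius constant are assumptions made for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (assumed constant)
MILES_PER_DEGREE_LAT = 69.0  # same approximation the script uses


def bounding_box(latitude: float, longitude: float, radius_miles: float) -> dict[str, float]:
    """Approximate box that contains the circle of `radius_miles` around a point."""
    lat_offset = radius_miles / MILES_PER_DEGREE_LAT
    # Longitude degrees shrink with cos(latitude); clamp to avoid dividing by zero at the poles.
    cos_lat = max(abs(math.cos(math.radians(latitude))), 1e-6)
    lon_offset = radius_miles / (MILES_PER_DEGREE_LAT * cos_lat)
    return {
        "north": latitude + lat_offset,
        "south": latitude - lat_offset,
        "west": longitude - lon_offset,
        "east": longitude + lon_offset,
    }


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(center: tuple[float, float], point: tuple[float, float], radius_miles: float) -> bool:
    """Exact check applied after the coarse bounding-box query."""
    return haversine_miles(*center, *point) <= radius_miles
```

The box over-selects near its corners, which is why the per-flight distance check is still needed afterwards.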
+183
-218
predict-github-stars
···
7
7
Predict when a GitHub repository will reach a target number of stars.
8
8
9
9
Usage:
10
-
11
-
```bash
12
-
./predict-github-stars anthropics/claude-dev 10000
13
-
```
10
+
./predict-github-stars owner/repo 10000
14
11
15
12
Details:
16
-
- uses github api to fetch star history
17
-
- uses polynomial regression to predict future star growth
18
-
- shows confidence intervals based on historical variance
19
-
- requires `GITHUB_TOKEN` in environment for higher rate limits (optional)
13
+
- Uses GitHub REST API to fetch star history (with timestamps).
14
+
- Fits polynomial regression (degree 1–3) to full history.
15
+
- Falls back to recent‑trend linear extrapolation if the polynomial
16
+
cannot reach the target within ten years.
17
+
- Shows recent growth rate and a caution for long‑range estimates.
18
+
- Optionally reads `GITHUB_TOKEN` from the environment for higher rate limits.
20
19
"""
20
+
21
+
from __future__ import annotations
21
22
22
23
import argparse
23
24
import os
24
25
import sys
25
26
from datetime import datetime, timezone
26
-
from typing import Optional
27
+
28
+
import httpx
27
29
import numpy as np
28
-
from sklearn.preprocessing import PolynomialFeatures
30
+
import pandas as pd
31
+
from dateutil import parser as date_parser
32
+
from pydantic import Field
33
+
from pydantic_settings import BaseSettings, SettingsConfigDict
34
+
from rich.console import Console
35
+
from rich.panel import Panel
36
+
from rich.table import Table
29
37
from sklearn.linear_model import LinearRegression
30
38
from sklearn.metrics import r2_score
31
-
import httpx
32
-
from rich.console import Console
33
-
from rich.table import Table
34
-
from rich.panel import Panel
35
-
from dateutil import parser as date_parser
36
-
import pandas as pd
37
-
from pydantic_settings import BaseSettings, SettingsConfigDict
38
-
from pydantic import Field
39
+
from sklearn.preprocessing import PolynomialFeatures
39
40
40
41
console = Console()
41
42
42
43
43
44
class Settings(BaseSettings):
44
-
"""App settings loaded from environment variables"""
45
+
"""Load settings (e.g. GitHub token) from environment."""
45
46
46
47
model_config = SettingsConfigDict(
47
48
env_file=os.environ.get("ENV_FILE", ".env"), extra="ignore"
48
49
)
50
+
github_token: str = Field(default="", description="GitHub API token")
49
51
50
-
github_token: str = Field(default="")
51
52
53
+
# ──────────────────────────────── GitHub helpers ────────────────────────────────
52
54
53
-
GREY = "\033[90m"
54
-
GREEN = "\033[92m"
55
-
YELLOW = "\033[93m"
56
-
RED = "\033[91m"
57
-
_END = "\033[0m"
58
55
56
+
def _headers(token: str | None = None) -> dict[str, str]:
57
+
h = {"Accept": "application/vnd.github.v3+json"}
58
+
if token:
59
+
h["Authorization"] = f"token {token}"
60
+
return h
59
61
60
-
def get_repo_data(owner: str, repo: str, token: Optional[str] = None) -> dict:
61
-
"""fetch basic repository data from github api"""
62
-
headers = {"Accept": "application/vnd.github.v3+json"}
63
-
if token:
64
-
headers["Authorization"] = f"token {token}"
65
62
63
+
def get_repo_data(owner: str, repo: str, token: str | None = None) -> dict:
66
64
url = f"https://api.github.com/repos/{owner}/{repo}"
67
-
68
-
with httpx.Client() as client:
69
-
response = client.get(url, headers=headers)
70
-
response.raise_for_status()
71
-
return response.json()
65
+
with httpx.Client() as c:
66
+
r = c.get(url, headers=_headers(token))
67
+
r.raise_for_status()
68
+
return r.json()
72
69
73
70
74
71
def get_star_history(
75
-
owner: str, repo: str, token: Optional[str] = None, current_stars: int = 0
72
+
owner: str, repo: str, token: str | None, total_stars: int
76
73
) -> list[tuple[datetime, int]]:
77
-
"""fetch star history using github api stargazers endpoint"""
78
-
headers = {
79
-
"Accept": "application/vnd.github.v3.star+json" # includes starred_at timestamps
80
-
}
81
-
if token:
82
-
headers["Authorization"] = f"token {token}"
74
+
"""Return (timestamp, cumulative_star_count) pairs, sampled if repo is huge."""
75
+
hdrs = _headers(token)
76
+
hdrs["Accept"] = "application/vnd.github.v3.star+json" # need starred_at
83
77
84
-
star_history = []
78
+
history: list[tuple[datetime, int]] = []
85
79
86
-
# for repos with many stars, sample across the range
87
-
# instead of just getting the first ones
88
-
if current_stars > 10000:
89
-
# sample ~200 points across the star range for performance
80
+
if total_stars > 10_000:
81
+
# sample ~200 evenly‑spaced star indices
90
82
sample_points = 200
91
-
step = max(1, current_stars // sample_points)
83
+
step = max(1, total_stars // sample_points)
84
+
pages_needed: dict[int, list[int]] = {}
85
+
for s in range(1, total_stars, step):
86
+
pg = (s - 1) // 100 + 1
87
+
idx = (s - 1) % 100
88
+
pages_needed.setdefault(pg, []).append(idx)
92
89
93
-
# batch requests with a single client
94
-
with httpx.Client() as client:
95
-
# get samples at regular intervals
96
-
for target_star in range(1, current_stars, step):
97
-
page = (target_star // 100) + 1
98
-
position = (target_star % 100) - 1
90
+
# always include final star
91
+
last_pg = (total_stars - 1) // 100 + 1
92
+
last_idx = (total_stars - 1) % 100
93
+
pages_needed.setdefault(last_pg, []).append(last_idx)
99
94
100
-
url = f"https://api.github.com/repos/{owner}/{repo}/stargazers?page={page}&per_page=100"
101
-
response = client.get(url, headers=headers)
102
-
response.raise_for_status()
95
+
with httpx.Client() as c:
96
+
for pg, idxs in pages_needed.items():
97
+
url = f"https://api.github.com/repos/{owner}/{repo}/stargazers?page={pg}&per_page=100"
98
+
r = c.get(url, headers=hdrs)
99
+
r.raise_for_status()
100
+
data = r.json()
101
+
for i in sorted(set(idxs)):
102
+
if i < len(data) and "starred_at" in data[i]:
103
+
ts = date_parser.parse(data[i]["starred_at"])
104
+
history.append((ts, (pg - 1) * 100 + i + 1))
103
105
104
-
data = response.json()
105
-
if data and position < len(data) and "starred_at" in data[position]:
106
-
starred_at = date_parser.parse(data[position]["starred_at"])
107
-
star_history.append((starred_at, target_star))
106
+
console.print(f"[dim]sampled {len(history)} points across star history[/dim]")
108
107
109
-
console.print(
110
-
f"{GREY}sampled {len(star_history)} points across star history{_END}"
111
-
)
112
108
else:
113
-
# for smaller repos, get all stars
109
+
# fetch all pages
114
110
page = 1
115
-
per_page = 100
116
-
117
-
with httpx.Client() as client:
111
+
with httpx.Client() as c:
118
112
while True:
119
-
url = f"https://api.github.com/repos/{owner}/{repo}/stargazers?page={page}&per_page={per_page}"
120
-
response = client.get(url, headers=headers)
121
-
response.raise_for_status()
122
-
123
-
data = response.json()
113
+
url = f"https://api.github.com/repos/{owner}/{repo}/stargazers?page={page}&per_page=100"
114
+
r = c.get(url, headers=hdrs)
115
+
r.raise_for_status()
116
+
data = r.json()
124
117
if not data:
125
118
break
126
-
127
119
for i, star in enumerate(data):
128
120
if "starred_at" in star:
129
-
starred_at = date_parser.parse(star["starred_at"])
130
-
cumulative_stars = (page - 1) * per_page + i + 1
131
-
star_history.append((starred_at, cumulative_stars))
132
-
121
+
ts = date_parser.parse(star["starred_at"])
122
+
history.append((ts, (page - 1) * 100 + i + 1))
133
123
page += 1
134
124
135
-
return star_history
136
-
137
-
138
-
def predict_star_growth(
139
-
star_history: list[tuple[datetime, int]], target_stars: int, current_stars: int
140
-
) -> Optional[datetime]:
141
-
"""use polynomial regression to predict when repo will reach target stars"""
142
-
if len(star_history) < 10:
143
-
return None
125
+
# ensure order and anchor today’s count
126
+
history.sort(key=lambda t: t[0])
127
+
if history and history[-1][1] < total_stars:
128
+
history.append((datetime.now(timezone.utc), total_stars))
129
+
return history
144
130
145
-
# convert to days since first star
146
-
first_date = star_history[0][0]
147
-
X = np.array(
148
-
[(date - first_date).total_seconds() / 86400 for date, _ in star_history]
149
-
).reshape(-1, 1)
150
-
y = np.array([stars for _, stars in star_history])
151
131
152
-
# try different polynomial degrees and pick best fit
153
-
best_r2 = -float("inf")
154
-
best_model = None
155
-
best_poly = None
156
-
best_degree = 1
132
+
# ──────────────────────────────── modelling ─────────────────────────────────────
157
133
158
-
for degree in range(1, 4): # try linear, quadratic, cubic
159
-
poly = PolynomialFeatures(degree=degree)
160
-
X_poly = poly.fit_transform(X)
161
134
162
-
model = LinearRegression()
163
-
model.fit(X_poly, y)
135
+
def best_poly_fit(
136
+
X: np.ndarray, y: np.ndarray
137
+
) -> tuple[LinearRegression, PolynomialFeatures, int, float]:
138
+
best_r2 = -float("inf")  # r2_score has no lower bound, so start below any possible value
139
+
best_model: LinearRegression | None = None
140
+
best_poly: PolynomialFeatures | None = None
141
+
best_deg = 1
142
+
for deg in (1, 2, 3):
143
+
poly = PolynomialFeatures(degree=deg)
144
+
Xpoly = poly.fit_transform(X)
145
+
model = LinearRegression().fit(Xpoly, y)
146
+
r2 = r2_score(y, model.predict(Xpoly))
147
+
if r2 > best_r2:
148
+
best_r2, best_model, best_poly, best_deg = r2, model, poly, deg
149
+
return best_model, best_poly, best_deg, best_r2 # type: ignore
164
150
165
-
y_pred = model.predict(X_poly)
166
-
r2 = r2_score(y, y_pred)
167
151
168
-
if r2 > best_r2:
169
-
best_r2 = r2
170
-
best_model = model
171
-
best_poly = poly
172
-
best_degree = degree
173
-
174
-
console.print(
175
-
f"{GREY}best fit: degree {best_degree} polynomial (r² = {best_r2:.3f}){_END}"
152
+
def predict_date(history: list[tuple[datetime, int]], target: int) -> datetime | None:
153
+
if len(history) < 10:
154
+
return None
155
+
origin = history[0][0]
156
+
X = np.array([(t - origin).total_seconds() / 86400 for t, _ in history]).reshape(
157
+
-1, 1
176
158
)
159
+
y = np.array([s for _, s in history])
177
160
178
-
# predict future
179
-
# search for when we'll hit target stars
180
-
days_to_check = np.arange(0, 3650, 1) # check up to 10 years
161
+
model, poly, deg, r2 = best_poly_fit(X, y)
162
+
console.print(f"[dim]best fit: degree {deg} polynomial (r² = {r2:.3f})[/dim]")
181
163
182
-
for days_ahead in days_to_check:
183
-
current_days = X[-1][0]
184
-
future_days = current_days + days_ahead
185
-
X_future = best_poly.transform([[future_days]])
186
-
predicted_stars = best_model.predict(X_future)[0]
164
+
current_day = X[-1, 0]
165
+
for d in range(0, 3650): # up to 10 years
166
+
future = current_day + d
167
+
if model.predict(poly.transform([[future]]))[0] >= target:
168
+
return origin + pd.Timedelta(days=future)
169
+
return None
187
170
188
-
if predicted_stars >= target_stars:
189
-
predicted_date = first_date + pd.Timedelta(days=future_days)
190
-
return predicted_date
191
171
192
-
return None # won't reach target in 10 years
172
+
# ──────────────────────────────── utils ─────────────────────────────────────────
193
173
194
174
195
-
def format_timeframe(date: datetime) -> str:
196
-
"""format a future date as a human-readable timeframe"""
175
+
def timeframe_str(dt: datetime) -> str:
197
176
now = datetime.now(timezone.utc)
198
-
delta = date - now
199
-
200
-
if delta.days < 0:
177
+
if dt <= now:
201
178
return "already reached"
202
-
elif delta.days == 0:
179
+
days = (dt - now).days
180
+
if days == 0:
203
181
return "today"
204
-
elif delta.days == 1:
182
+
if days == 1:
205
183
return "tomorrow"
206
-
elif delta.days < 7:
207
-
return f"in {delta.days} days"
208
-
elif delta.days < 30:
209
-
weeks = delta.days // 7
210
-
return f"in {weeks} week{'s' if weeks > 1 else ''}"
211
-
elif delta.days < 365:
212
-
months = delta.days // 30
213
-
return f"in {months} month{'s' if months > 1 else ''}"
214
-
else:
215
-
years = delta.days // 365
216
-
return f"in {years} year{'s' if years > 1 else ''}"
184
+
if days < 7:
185
+
return f"in {days} days"
186
+
if days < 30:
187
+
return f"in {days // 7} week(s)"
188
+
if days < 365:
189
+
return f"in {days // 30} month(s)"
190
+
return f"in {days // 365} year(s)"
217
191
218
192
219
-
def main():
220
-
parser = argparse.ArgumentParser(
221
-
description="predict when a github repository will reach a target number of stars"
222
-
)
223
-
parser.add_argument("repo", help="repository in format owner/repo")
224
-
parser.add_argument("stars", type=int, help="target number of stars")
193
+
# ──────────────────────────────── main ──────────────────────────────────────────
225
194
226
-
args = parser.parse_args()
227
195
228
-
try:
229
-
settings = Settings() # type: ignore
230
-
except Exception as e:
231
-
console.print(f"{RED}error loading settings: {e}{_END}")
232
-
sys.exit(1)
196
+
def main() -> None:
197
+
p = argparse.ArgumentParser(
198
+
description="Predict when a GitHub repo will reach a target number of stars"
199
+
)
200
+
p.add_argument("repo", help="owner/repo")
201
+
p.add_argument("stars", type=int, help="target star count")
202
+
args = p.parse_args()
233
203
234
-
token = settings.github_token
204
+
if "/" not in args.repo:
205
+
console.print("[red]error: repo must be owner/repo[/red]")
206
+
sys.exit(1)
207
+
owner, repo = args.repo.split("/", 1)
235
208
236
209
try:
237
-
owner, repo = args.repo.split("/")
238
-
except ValueError:
239
-
console.print(f"{RED}error: repository must be in format owner/repo{_END}")
210
+
settings = Settings() # load token
211
+
except Exception as e: # pragma: no cover
212
+
console.print(f"[red]error loading settings: {e}[/red]")
240
213
sys.exit(1)
214
+
token = settings.github_token.strip() or None
241
215
242
-
# fetch current repo data
243
216
try:
244
217
repo_data = get_repo_data(owner, repo, token)
245
218
current_stars = repo_data["stargazers_count"]
···
247
220
248
221
console.print(
249
222
Panel.fit(
250
-
f"[bold cyan]{args.repo}[/bold cyan]\n"
251
-
f"[dim]current stars: {current_stars:,}\n"
252
-
f"created: {created_at.strftime('%Y-%m-%d')}[/dim]",
223
+
f"[bold cyan]{owner}/{repo}[/bold cyan]\n"
224
+
f"[dim]current stars: {current_stars:,}\ncreated: {created_at:%Y-%m-%d}[/dim]",
253
225
border_style="blue",
254
226
)
255
227
)
256
228
257
229
if current_stars >= args.stars:
258
-
console.print(f"\n{GREEN}✓ already has {current_stars:,} stars!{_END}")
230
+
console.print("\n[green]✓ already at or above target![/green]")
259
231
sys.exit(0)
260
232
261
-
console.print("\nfetching star history...")
262
-
star_history = get_star_history(owner, repo, token, current_stars)
263
-
264
-
if not star_history:
265
-
console.print(f"{RED}error: no star history available{_END}")
233
+
console.print("\nfetching star history…")
234
+
history = get_star_history(owner, repo, token, current_stars)
235
+
if not history:
236
+
console.print("[red]error: no star history[/red]")
266
237
sys.exit(1)
267
-
268
-
# sample the history if too large
269
-
if len(star_history) > 1000:
270
-
# take every nth star to get ~1000 data points
271
-
n = len(star_history) // 1000
272
-
star_history = star_history[::n]
273
-
274
-
console.print(f"{GREY}analyzing {len(star_history)} data points...{_END}")
275
-
276
-
predicted_date = predict_star_growth(star_history, args.stars, current_stars)
277
-
278
-
if predicted_date:
279
-
timeframe = format_timeframe(predicted_date)
280
-
281
-
# create results table
282
-
table = Table(show_header=True, header_style="bold magenta")
283
-
table.add_column("metric", style="cyan")
284
-
table.add_column("value", style="white")
238
+
if len(history) > 1000: # down‑sample for speed
239
+
step = len(history) // 1000
240
+
history = history[::step] + ([history[-1]] if (len(history) - 1) % step else [])
285
241
286
-
table.add_row("target stars", f"{args.stars:,}")
287
-
table.add_row("current stars", f"{current_stars:,}")
288
-
table.add_row("stars needed", f"{args.stars - current_stars:,}")
289
-
table.add_row("predicted date", predicted_date.strftime("%Y-%m-%d"))
290
-
table.add_row("timeframe", timeframe)
242
+
console.print(f"[dim]analysing {len(history)} data points…[/dim]")
243
+
poly_date = predict_date(history, args.stars)
291
244
292
-
# calculate current growth rate
293
-
if len(star_history) > 1:
294
-
recent_days = 30
295
-
recent_date = datetime.now(timezone.utc) - pd.Timedelta(
296
-
days=recent_days
297
-
)
298
-
recent_stars = [s for d, s in star_history if d >= recent_date]
299
-
if len(recent_stars) > 1:
300
-
daily_rate = (recent_stars[-1] - recent_stars[0]) / recent_days
301
-
table.add_row("recent growth", f"{daily_rate:.1f} stars/day")
245
+
def recent_rate(window: int = 30) -> float:
246
+
cutoff = datetime.now(timezone.utc) - pd.Timedelta(days=window)
247
+
pts = [s for t, s in history if t >= cutoff]
248
+
return (pts[-1] - pts[0]) / window if len(pts) >= 2 else 0.0
302
249
303
-
console.print("\n")
304
-
console.print(table)
250
+
rate = recent_rate() or recent_rate(90)
305
251
306
-
if "year" in timeframe and "1 year" not in timeframe:
307
-
console.print(
308
-
f"\n{YELLOW}⚠ prediction is far in the future and may be unreliable{_END}"
309
-
)
252
+
if poly_date:
253
+
out_date, tf = poly_date, timeframe_str(poly_date)
254
+
elif rate > 0:
255
+
days_needed = (args.stars - current_stars) / rate
256
+
out_date = datetime.now(timezone.utc) + pd.Timedelta(days=days_needed)
257
+
tf = timeframe_str(out_date)
258
+
console.print(
259
+
"[dim]no polynomial prediction; falling back to recent growth trend[/dim]"
260
+
)
310
261
else:
311
262
console.print(
312
-
f"\n{RED}✗ unlikely to reach {args.stars:,} stars in the next 10 years{_END}"
263
+
f"[red]✗ unlikely to reach {args.stars:,} stars in the next 10 years[/red]"
313
264
)
265
+
sys.exit(0)
266
+
267
+
table = Table(show_header=True, header_style="bold magenta")
268
+
table.add_column("metric")
269
+
table.add_column("value", style="white")
270
+
table.add_row("target stars", f"{args.stars:,}")
271
+
table.add_row("current stars", f"{current_stars:,}")
272
+
table.add_row("stars needed", f"{args.stars - current_stars:,}")
273
+
table.add_row("predicted date", out_date.strftime("%Y-%m-%d"))
274
+
table.add_row("timeframe", tf)
275
+
if rate:
276
+
table.add_row("recent growth", f"{rate:.1f} stars/day")
277
+
278
+
console.print()
279
+
console.print(table)
280
+
if tf.endswith("year(s)") and tf != "in 1 year(s)":
281
+
console.print("\n[dim]⚠ prediction far in future; uncertainty high[/dim]")
314
282
315
283
except httpx.HTTPStatusError as e:
316
284
if e.response.status_code == 404:
317
-
console.print(f"{RED}error: repository {args.repo} not found{_END}")
285
+
msg = "repository not found"
318
286
elif e.response.status_code == 403:
319
-
console.print(
320
-
f"{RED}error: rate limit exceeded. set GITHUB_TOKEN environment variable{_END}"
321
-
)
287
+
msg = "rate limit exceeded (set GITHUB_TOKEN)"
322
288
else:
323
-
console.print(
324
-
f"{RED}error: github api error {e.response.status_code}{_END}"
325
-
)
289
+
msg = f"GitHub API error {e.response.status_code}"
290
+
console.print(f"[red]error: {msg}[/red]")
326
291
sys.exit(1)
327
-
except Exception as e:
328
-
console.print(f"{RED}error: {e}{_END}")
292
+
except Exception as e: # pragma: no cover
293
+
console.print(f"[red]error: {e}[/red]")
329
294
sys.exit(1)
330
295
331
296