Cedarville Cybersecurity Textbook PDF Creator#
Automated tool to download and convert the Cedarville "Invitation to Cybersecurity" textbook to PDF format.
Features#
- Downloads all 340 pages (SVG text layers + high-res WebP images)
- Composites layers with proper font rendering
- Creates high-quality PDF (1045x1350 pixels per page)
- Optional: Add searchable text with OCR
Quick Start#
./build.sh
That's it! The script will:
- Create a Python virtual environment
- Install dependencies
- Download all page layers (~10-15 min)
- Create the PDF (~8-10 min)
- Optionally add OCR for selectable text (~30-60 min)
Manual Steps#
If you prefer to run steps individually:
1. Setup Environment#
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python -m playwright install chromium
2. Download Layers#
python download_layers.py
Downloads 340 pages:
- SVG layers (text, vector graphics) → svg_layers/
- High-res WebP images (1045x1350) → webp_highres/
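The mapping from page number to the two output folders can be sketched as below. This is only an illustration, not the contents of download_layers.py: the base URL is a placeholder, and the zero-padded file naming and the `layer_targets` helper are assumptions.

```python
from pathlib import Path

TOTAL_PAGES = 340  # page count stated in this README

# Hypothetical placeholder; the real server URL lives in download_layers.py.
BASE_URL = "https://example.invalid/book"


def layer_targets(page: int, base_url: str = BASE_URL) -> dict:
    """Return (source URL, local path) for each of one page's two layers."""
    name = f"{page:04d}"  # zero-padded file names are an assumption
    return {
        "svg": (f"{base_url}/{name}.svg", Path("svg_layers") / f"{name}.svg"),
        "webp": (f"{base_url}/{name}.webp", Path("webp_highres") / f"{name}.webp"),
    }


if __name__ == "__main__":
    # Print the planned downloads for the first three pages only.
    for page in range(1, 4):
        for kind, (url, path) in layer_targets(page).items():
            print(f"page {page} {kind}: {url} -> {path}")
```

Keeping the URL and path logic in one small function makes it easy to resume an interrupted run by skipping any page whose files already exist.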
3. Create PDF#
python create_pdf.py
Composites each page's SVG layer over its WebP image and writes Invitation_to_Cybersecurity.pdf.
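One way to composite the two layers with Playwright is to stack them in a small HTML page and take a screenshot. The sketch below assumes this approach; create_pdf.py may work differently, and `overlay_html` and `render_page` are hypothetical names. It also assumes the fonts are embedded inside the SVG, since an SVG loaded through an `<img>` tag cannot fetch external resources.

```python
from pathlib import Path

PAGE_WIDTH, PAGE_HEIGHT = 1045, 1350  # pixel size stated in this README


def overlay_html(webp_uri: str, svg_uri: str) -> str:
    """Build a page that stacks the SVG layer on top of the WebP image."""
    layer = (
        "position:absolute;top:0;left:0;"
        f"width:{PAGE_WIDTH}px;height:{PAGE_HEIGHT}px"
    )
    return (
        "<!DOCTYPE html><html><body style='margin:0'>"
        f"<img src='{webp_uri}' style='{layer}'>"  # background image first
        f"<img src='{svg_uri}' style='{layer}'>"   # text/vector layer on top
        "</body></html>"
    )


def render_page(webp: Path, svg: Path, out_png: Path) -> None:
    """Screenshot the stacked layers with Chromium (requires Playwright)."""
    from playwright.sync_api import sync_playwright  # imported lazily

    # Write the HTML to disk so Chromium loads the layers from file:// URIs.
    html_file = out_png.with_suffix(".html")
    html_file.write_text(
        overlay_html(webp.resolve().as_uri(), svg.resolve().as_uri()),
        encoding="utf-8",
    )
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(
            viewport={"width": PAGE_WIDTH, "height": PAGE_HEIGHT}
        )
        page.goto(html_file.resolve().as_uri())
        page.screenshot(path=str(out_png))
        browser.close()
```

Rendering in a real browser is what lets the SVG text use its embedded fonts, at the cost of speed.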
4. Add OCR (Optional)#
brew install ocrmypdf
ocrmypdf Invitation_to_Cybersecurity.pdf Invitation_to_Cybersecurity_OCR.pdf
Creates a version with selectable/searchable text.
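The same OCR step can be driven from Python and skipped cleanly when ocrmypdf is missing. This is a minimal sketch; the `ocr_command` and `add_ocr` helpers are hypothetical and not part of the repository.

```python
import shutil
import subprocess


def ocr_command(src: str, dst: str) -> list:
    """Build the ocrmypdf invocation shown above as an argument list."""
    return ["ocrmypdf", src, dst]


def add_ocr(
    src: str = "Invitation_to_Cybersecurity.pdf",
    dst: str = "Invitation_to_Cybersecurity_OCR.pdf",
) -> bool:
    """Run OCR if ocrmypdf is on PATH; return False when it is skipped."""
    if shutil.which("ocrmypdf") is None:
        print("ocrmypdf not found; skipping the optional OCR step")
        return False
    subprocess.run(ocr_command(src, dst), check=True)
    return True
```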
Requirements#
- Python 3.9+
- macOS (tested on Apple Silicon)
- Homebrew (for OCR step)
Output#
- Invitation_to_Cybersecurity.pdf - 340 pages, ~70-80 MB, high quality
- Invitation_to_Cybersecurity_OCR.pdf - Same as above + searchable text (optional)
File Structure#
cedrus/
├── build.sh # Main build script
├── requirements.txt # Python dependencies
├── download_layers.py # Download SVG + WebP
├── create_pdf.py # Composite and create PDF
├── svg_layers/ # Downloaded SVG files
├── webp_highres/ # Downloaded WebP files
├── merged_pages/ # Temporary composited PNGs
└── Invitation_to_Cybersecurity.pdf
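Before building the PDF, it can help to confirm that both layer folders hold all 340 files. The sketch below counts files by extension only, so it makes no assumption about file names; `count_layers` and `is_complete` are hypothetical helpers.

```python
from pathlib import Path

EXPECTED_PAGES = 340  # page count stated in this README


def count_layers(root: Path) -> dict:
    """Count downloaded layer files under the project root."""
    counts = {}
    for kind, folder, pattern in (
        ("svg", "svg_layers", "*.svg"),
        ("webp", "webp_highres", "*.webp"),
    ):
        directory = root / folder
        # A missing folder simply counts as zero files.
        counts[kind] = len(list(directory.glob(pattern))) if directory.is_dir() else 0
    return counts


def is_complete(counts: dict, expected: int = EXPECTED_PAGES) -> bool:
    """True when every layer type has exactly the expected number of files."""
    return all(count == expected for count in counts.values())


if __name__ == "__main__":
    counts = count_layers(Path("."))
    print(counts, "complete" if is_complete(counts) else "incomplete")
```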
Troubleshooting#
"Command not found: python3"
- Install Python 3:
brew install python3
"ocrmypdf not found"
- OCR step is optional. Install with:
brew install ocrmypdf
Fonts look wrong
- The script uses Playwright (Chromium), which renders the fonts embedded in the SVG layers
- If issues persist, check that the Playwright browser is installed:
python -m playwright install chromium
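The checks above can be combined into a small preflight script. This is a sketch with hypothetical helper names; note that it only detects whether the playwright Python package is importable, not whether the Chromium browser itself has been downloaded.

```python
import importlib.util
import shutil


def missing_tools(tools=("python3", "brew", "ocrmypdf"), which=shutil.which) -> list:
    """Return the command-line tools that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def playwright_package_installed() -> bool:
    """True if the playwright package can be imported in this environment."""
    return importlib.util.find_spec("playwright") is not None


if __name__ == "__main__":
    for tool in missing_tools():
        print(f"missing: {tool}")
    if not playwright_package_installed():
        print("missing: playwright (run: pip install -r requirements.txt)")
```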
Notes#
- Total time: ~20-30 minutes (without OCR)
- With OCR: ~50-90 minutes total
- Disk space needed: ~500 MB temporary files
- The script downloads from the official Cedarville publication server
- Be patient - high-quality rendering takes time!
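Since the build needs roughly 500 MB for temporary files, a quick free-space check before starting can save a failed run. A minimal sketch, treating 500 MB as 500 × 1024 × 1024 bytes; `has_free_space` is a hypothetical helper.

```python
import shutil

REQUIRED_BYTES = 500 * 1024 * 1024  # ~500 MB, the figure from the notes above


def has_free_space(path: str = ".", required: int = REQUIRED_BYTES) -> bool:
    """True if the filesystem holding `path` has at least `required` bytes free."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    if has_free_space():
        print("Enough free disk space for the build")
    else:
        print("Less than ~500 MB free; clear some space before running build.sh")
```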