CI/CD Pipeline
The CI/CD pipeline automates quality validation, testing and application deployment.
CI/CD: Continuous Integration / Continuous Deployment (Glossary).
Practical workflows: see end of page for concrete examples.
Infrastructure
The CI/CD pipeline relies on the infrastructure described in Technical Architecture:
Standalone VM dataia (virsh/KVM) hosting the entire stack
2 isolated Docker containers: preprod (port 8500) and prod (port 8501)
GitHub self-hosted runner orchestrating automatic deployments
This architecture enables deployment without manual VPN connection. The runner executes git reset --hard SHA + docker-compose restart directly on the VM.
Pipeline Architecture
Parallel Pipeline with Automatic Rollback
Push to main
│
├──────────────────────────────────────┐
│ │
▼ ▼
┌─────────────────────────────────┐ ┌──────────────────────────────────┐
│ 1. CI Pipeline │ │ 2. CD Preprod │
│ (GitHub-hosted runners) │ │ (Self-hosted runner) │
│ │ │ │
│ - PEP8, Black, Docstrings │ │ ⚡ DEPLOY FIRST (~40s) │
│ - Unit tests │ │ - Save rollback point │
│ - Coverage ≥90% │ │ - git reset --hard SHA │
│ - Infrastructure tests │ │ - docker-compose restart │
│ │ │ - Health check │
│ Duration: ~2-3 minutes │ │ - Launch CI watcher script │
└────────────┬────────────────────┘ │ │
│ │ 🔍 WATCH CI IN BACKGROUND │
│ │ - Poll CI status (5 min max) │
│ │ │
├───────────────────────┤ │
│ │ │
▼ ▼ │
✅ CI SUCCESS ❌ CI FAILURE │
│ │ │
│ ▼ │
│ 🔄 AUTOMATIC ROLLBACK │
│ - git reset --hard last-validated-sha │
│ - docker-compose restart │
│ - Discord notification with CI logs │
│ │
▼ │
Mark SHA validated │
Success notification │
└──────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ 3. CD Production - Manual with confirmation │
│ (Self-hosted runner on dataia VM) │
│ - Automatic backup before deploy │
│ - git reset --hard validated SHA │
│ - docker-compose restart │
│ - Health checks (3 attempts) │
│ - Discord notifications │
│ - URL: https://backtothefuturekitchen.lafrance.io/ │
└─────────────────────────────────────────────────────────┘
Key advantages:
⚡ Ultra-fast deployment: 40s instead of 3-5 min (no CI wait)
🔒 Guaranteed security: Automatic rollback if CI fails
🎯 Traceability: Each deployment corresponds exactly to tested SHA
🔄 Runner freed: Self-hosted runner available immediately
GitHub Actions Workflows
1. CI Pipeline
File: .github/workflows/ci.yml
Triggers:
on:
push:
branches: [main]
pull_request:
branches: [main]
Executed jobs:
Job 1: Quality Checks
PEP8 compliance (flake8) - max 88 characters per line
Code formatting (black)
Docstring validation (pydocstyle) - Google style
Type checking (mypy) - optional, warning mode
Job 2: Preprod Tests
Python 3.13.7 with uv
Minimum coverage: 90%
Command:
pytest tests/ -v --cov=src --cov-report=html --cov-fail-under=90Artifacts: HTML coverage report (30 days retention)
Job 3: Infrastructure Tests
S3, DuckDB, SQL tests (50_test/)
Mode: continue-on-error (requires credentials)
Job 4: Summary
Final summary of all jobs
2. CD Preprod
File: .github/workflows/cd-preprod.yml
Innovative architecture: Deploy First, Test in Parallel
Principle: Deploy immediately without waiting for CI, then automatic rollback if tests fail.
Advantages:
⚡ Ultra-fast deployment (~40 seconds instead of 3-5 minutes)
🔄 Self-hosted runner freed immediately
🛡️ Security guaranteed by automatic rollback
🎯 Each deployment corresponds exactly to tested SHA
Phase 1: Immediate Deployment (~40s)
💾 Save rollback point (
/var/app-state/last-validated-sha.txt)📢 Discord notification - Deployment started
📥 Fetch commits -
git fetch origin main🔄 Deploy exact SHA -
git reset --hard ${{ github.sha }}🐳 Restart container -
docker-compose -f docker-compose-preprod.yml restart🔍 Quick health check - 3 attempts on https://mangetamain.lafrance.io/
👀 Launch watcher - Background script monitoring CI
✅ Discord notification - Deployment complete, CI in progress
Phase 2: Background CI Monitoring
Watcher script (/tmp/watch-ci-SHA.sh):
⏳ Wait 30s - Wait for CI startup
🔍 Poll CI status - Check every 10s for 5 minutes max
If CI succeeds ✅: Mark SHA validated, success notification
If CI fails ❌: Automatic rollback to last validated SHA, notification with details
Watcher logs: /tmp/ci-watcher-SHA.log
Why ``git reset –hard SHA`` instead of ``git pull``?
# ❌ WRONG: git pull (takes latest commit from main)
git pull origin main
# ✅ CORRECT: reset to exact SHA that triggered this workflow
git fetch origin main
git reset --hard acfdb42... # Precise SHA
Guarantee: Deployed code = code tested by CI ✅
3. CD Production
File: .github/workflows/cd-prod.yml
Trigger: Manual only (workflow_dispatch)
Mandatory confirmation: Type « DEPLOY » to validate
Workflow:
📋 User confirmation - « DEPLOY » input required
📊 Environment status - Displays PREPROD vs PROD SHA
💾 Automatic backup - Save before deployment
📢 Discord notification - PROD deployment started
🔄 Deploy
deploy_preprod_to_prod.sh🐳 Restart container -
docker-compose -f docker-compose-prod.yml restart🔍 Health checks - 5 attempts with retry
✅ Discord notification - Success or failure with rollback instructions
Manual rollback if needed:
ssh dataia
cd ~/mangetamain/20_prod
git reset --hard PREVIOUS_SHA
docker-compose -f ~/mangetamain/30_docker/docker-compose-prod.yml restart
4. Health Check Monitoring
File: .github/workflows/health-check.yml
Frequency: Every hour (cron: 0 * * * *)
Verifications:
Performed checks:
HTTP status 200
Valid HTML content (presence of « Mangetamain »)
Timeout: 10 seconds
Discord notifications: Alerts if service down
5. Documentation Build
File: .github/workflows/documentation.yml
Triggers:
on:
push:
branches: [main]
paths:
- '90_doc/source/**'
- '90_doc/requirements.txt'
- '.github/workflows/documentation.yml'
workflow_dispatch:
Principle: Workflow completely isolated from preprod/prod CI/CD. A doc build failure never impacts application deployments.
Workflow:
📦 Setup Python 3.13.7
📥 Install Sphinx dependencies (sphinx, sphinx-rtd-theme, myst-parser)
🔨 Build documentation (
sphinx-build -b html source build/html)📄 Add
.nojekyllfile (disables Jekyll processing)🚀 Deploy to GitHub Pages (
gh-pagesbranch)📬 Discord notification (failures only)
Documentation URL: https://julienlafrance.github.io/backtothefuturekitchen/
Discord notifications: Failures only (like CI Pipeline)
Documentation architecture:
90_doc/
├── source/ # .rst files (tracked in Git)
│ ├── conf.py # Sphinx configuration
│ └── *.rst # Documentation pages
├── build/ # Generated HTML (ignored by Git)
└── Makefile # Sphinx commands
Update workflow:
cd 90_doc/source
# Modify .rst files
vim installation.rst
# Local build to test (optional)
cd ..
make html
firefox build/html/index.html
# Commit and push
git add source/
git commit -m "Doc: update installation"
git push
# → GitHub Actions builds automatically
# → Doc published in 2-3 minutes on GitHub Pages
Key points:
💾 HTML removed from main repo (38 MB saved)
⚡ Automatic build on push to main
🌐 Deployment on separate
gh-pagesbranch🔒 Isolated workflow: doc failure ≠ preprod/prod impact
📝 Only
.rstfiles are tracked in Git
GitHub Pages vs tracked HTML advantages:
Correct GitHub statistics (Python instead of HTML)
Lighter repo (-38 MB)
Professional and stable URL
No Git history pollution with generated HTML
Practical Commands
Local Verification Before Push
# Local verification script
./70_scripts/run_ci_checks.sh preprod # Test 10_preprod
./70_scripts/run_ci_checks.sh prod # Test 20_prod
Manual Workflow Trigger
# Via GitHub CLI
# CD Preprod (discouraged, normally automatic)
gh workflow run cd-preprod.yml
# CD Production
gh workflow run cd-prod.yml
# Health Check
gh workflow run health-check.yml
Check CI/CD Status
# List recent runs
gh run list --limit 10
# View logs of specific run
gh run view RUN_ID --log
# Watch run in real-time
gh run watch RUN_ID
Check PREPROD Watcher Logs
ssh dataia
ls -lh /tmp/ci-watcher-*.log
tail -f /tmp/ci-watcher-LATEST.log
Self-Hosted Runner
Configuration
Location: dataia VM (VPN network)
Advantage: Deployment without manual VPN connection
Labels: self-hosted, Linux, X64
Services: GitHub Actions Runner service
Check Runner Status
ssh dataia
sudo systemctl status actions.runner.*
# Runner logs
journalctl -u actions.runner.* -f
Discord Notifications
Configured Webhooks
CI Pipeline: Failures only
CD Preprod: All deployments + rollbacks
CD Prod: All deployments + rollbacks
Health Check: DOWN alerts only
Documentation: Failures only
Preprod Message Format
🚀 PREPROD - Deployment started
SHA: acfdb42
Author: @user
Message: Fix bug analysis
⏳ CI verification in progress...
Prod Message Format
🎯 PRODUCTION - Deployment successful ✅
SHA: acfdb42
PREPROD ✅ → PROD ✅
URL: https://backtothefuturekitchen.lafrance.io/
Troubleshooting
Error: flake8 not found
Solution: Install dev dependencies
cd ~/mangetamain/10_preprod
uv pip install -e ".[dev]"
Error: Coverage < 90%
Solution: Add tests or exclude non-testable code
# In code to exclude
def main(): # pragma: no cover
st.title("Application")
Error: Missing docstring
Solution: Add Google-style docstrings
def my_function():
"""Brief description of the function.
Detailed description if needed.
Args:
param: Description
Returns:
Description
"""
pass
CI fails but local works
Possible reasons:
Different Python versions (CI: 3.13.7, Local: other)
Uncommitted files
Missing dependencies in pyproject.toml
Solution:
# Check untracked files
git status
# Update dependencies
git add pyproject.toml
git commit -m "fix: update dependencies"
CD Preprod Blocked
Cause: CI failed, automatic rollback performed
Solution: Fix errors reported by CI, then push fix
Manual Production Rollback
If PROD deployment failed:
ssh dataia
cd ~/mangetamain/20_prod
# Find last validated SHA
git log --oneline -5
# Rollback
git reset --hard PREVIOUS_SHA
docker-compose -f ~/mangetamain/30_docker/docker-compose-prod.yml restart
# Verify
curl https://backtothefuturekitchen.lafrance.io/
Required Configuration
GitHub Secrets
DISCORD_WEBHOOK_URL: Discord webhook for notifications
Environment Variables
GITHUB_TOKEN: Automatic GitHub Actions token (provided)
Runner Labels
self-hosted: Runner on dataia VM
Metrics
Pipeline Performance
Phase |
Duration |
Runner |
|---|---|---|
CI Quality |
~2 minutes |
GitHub-hosted |
CI Tests |
~2 minutes |
GitHub-hosted |
CD Preprod |
~40 seconds |
Self-hosted |
CD Prod |
~1 minute |
Self-hosted |
Health Check |
~7 seconds |
Self-hosted |
Reliability
PREPROD Uptime: ~99.5%
PROD Uptime: ~99.9%
Automatic rollbacks: 100% success
False positive health checks: <1%
Concrete Workflow Examples
Feature Development
Scenario: Add new seasonal analysis
# 1. Create branch
git checkout -b feature/analyse-mensuelle
# 2. Develop
# Modify src/visualization/analyse_mensuelle.py
# Add tests in tests/unit/test_analyse_mensuelle.py
# 3. Check locally
uv run flake8 src/ tests/
uv run pytest tests/unit/ --cov=src --cov-fail-under=90
# 4. Commit and push
git add .
git commit -m "Add monthly analysis with tests"
git push origin feature/analyse-mensuelle
# 5. Create PR
gh pr create --title "Monthly analysis" --body "New analysis by month"
# → CI runs automatically on branch
# → If tests pass → Merge to main possible
# → After merge → CD PREPROD runs automatically
Timeline:
Push branch → CI (2min) → PR review → Merge → CD PREPROD (40s) → App live
↓
Tests OK/KO
↓
Blocks merge if KO
Production Hotfix
Scenario: Critical bug in production requires immediate fix
# 1. Identify problematic commit
gh run list --limit 10
# Find last successful PROD deploy
# 2. Create hotfix branch
git checkout -b hotfix/fix-rating-bug
# 3. Quick fix + test
# Modify src/visualization/analyse_ratings.py
# Add regression test
# 4. Push and quick merge
git add . && git commit -m "Fix critical ratings bug"
git push origin hotfix/fix-rating-bug
gh pr create --title "[HOTFIX] Fix ratings" --body "Fix 5-star ratings bug"
# 5. After merge → Wait for CD PREPROD (auto)
# 6. Verify PREPROD OK then manual PROD deploy
gh workflow run cd-prod.yml
# Type "DEPLOY" in confirmation
Total duration: ~5-10 minutes (CI + CD PREPROD + check + CD PROD)
Rollback After Error
Scenario: PROD deployment breaks app, need immediate rollback
Option 1 - Rollback via Git:
# On dataia VM
ssh dataia
cd ~/mangetamain/20_prod
# Identify stable commit
git log --oneline -10
# Ex: abc1234 Stable version before bug
# Rollback
git reset --hard abc1234
# Restart
cd ../30_docker
docker-compose -f docker-compose-prod.yml restart
Duration: ~1 minute
Option 2 - Rollback via Re-deploy:
# Locally, return to stable commit
git revert HEAD # Or git reset --hard <stable-sha>
git push origin main
# CI/CD PREPROD runs
# Verify PREPROD OK
# Deploy PROD
gh workflow run cd-prod.yml # Type DEPLOY
Duration: ~5 minutes (safer, goes through CI/CD)
Deployment Monitoring
Monitor in real-time:
# Option 1: gh CLI
gh run watch
# Option 2: SSH + Docker logs
ssh dataia "docker-compose -f 30_docker/docker-compose-preprod.yml logs -f --tail=50"
# Option 3: Discord webhook
# Automatic notifications in #deployments channel
Check health:
# PREPROD
curl -s https://mangetamain.lafrance.io/_stcore/health | jq
# PROD
curl -s https://backtothefuturekitchen.lafrance.io/_stcore/health | jq
Expected response:
{
"status": "ok",
"uptime": 12345.67
}
Best Practices
Commits
Message format:
<type>: <short description>
<optional detailed description>
Types: feat, fix, docs, test, refactor, perf, ci
Examples:
# Feature
git commit -m "feat: add season filter in trends analysis"
# Bugfix
git commit -m "fix: correct ratings average calculation"
# Tests
git commit -m "test: add weekend analysis tests (coverage +5%)"
# Documentation
git commit -m "docs: enrich visualization API with examples"
Pull Requests
PR Template:
## Description
Brief description of the change
## Changes
- [ ] Add feature X
- [ ] Tests coverage ≥ 90%
- [ ] Documentation updated
## Tests
```bash
pytest tests/unit/test_new_feature.py -v
```
## Screenshots (if UI)
 
Review checklist:
Code follows PEP8 (flake8 passes)
Tests added (coverage ≥ 90%)
Documentation updated
No committed credentials
Branch up to date with main
CI/CD
Avoid CI failures:
# Before each push, run locally
uv run flake8 src/ tests/
uv run black --check src/ tests/
uv run pytest tests/unit/ --cov=src --cov-fail-under=90
# Pre-push hook script (.git/hooks/pre-push)
#!/bin/bash
echo "Running pre-push checks..."
uv run flake8 src/ tests/ || exit 1
uv run pytest tests/unit/ --cov=src --cov-fail-under=90 || exit 1
echo "✓ All checks passed"
Optimize CI:
Use uv cache for dependencies
Parallelize independent tests
Skip CI if [skip ci] in commit message (docs only)
Deployment
Checklist before PROD deploy:
✅ PREPROD works correctly
✅ Manual tests performed on PREPROD
✅ No errors in PREPROD logs
✅ Acceptable performance (load time < 10s)
✅ Automatic backup performed (verified)
Optimal timing:
Avoid: Friday evening, just before weekend
Prefer: Tuesday-Thursday morning (time to monitor)
Communication:
Announce maintenance if downtime > 1 minute
Automatic Discord notifications
See Also
Tests and Coverage - Unit tests and coverage
Quality Standards - Academic compliance
Technical Architecture - Complete technical architecture
Quick Start Guide - Essential Git/CI/CD commands
FAQ - Frequently Asked Questions - CI/CD FAQ and troubleshooting
README_CI_CD.md (root) - Complete detailed documentation (982 lines)