Building Gatehouse: A REST API for OpenBSD's PF Firewall
Every operation on my home firewall used to start with ssh gw and end with pfctl -f /etc/pf.conf.
It works — but it doesn’t scale when you want to toggle YouTube blocking from your phone, give family members limited control, or automate device-level parental controls.
So I built Gatehouse: a REST API that wraps PF operations in authenticated, rate-limited HTTP endpoints.
The Setup
I run a multi-VLAN home network on OpenBSD 7.8 with six network segments, content filtering, device-level blocking, and a growing collection of PF anchors.
The configuration lives in /etc/pf.d/ across dozens of files, managed through a Makefile with 80+ targets.
It works well from a terminal — but I wanted something I could hit from a browser or a script.
Architecture Overview
Gatehouse sits between clients and the PF firewall.
It doesn’t replace pfctl — it wraps it.
Every operation ultimately calls a system command (pfctl, tcpdump, make, shell scripts) through a centralised executor, validates inputs, and returns structured JSON.
┌─────────────┐ HTTPS/TLS ┌────────────────────┐ HTTP ┌───────────────┐
│ Browser │ ──────────────► │ relayd │ ──────► │ Gatehouse │
│ Script │ :8443 (TLS) │ (TLS termination) │ :8080 │ (FastAPI) │
│ App │ │ │ │ │
└─────────────┘ └────────────────────┘ └───────┬───────┘
│
┌────────────┼────────────┐
│ │ │
┌────▼───┐ ┌────▼───┐ ┌────▼────┐
│ pfctl │ │tcpdump │ │ scripts │
│ (PF) │ │(pflog) │ │ (make) │
└────────┘ └────────┘ └─────────┘

TLS Termination with relayd
The API listens on 127.0.0.1:8080 — localhost only.
OpenBSD’s built-in relayd handles TLS termination and reverse proxying:
http protocol "gatehouse" {
match request header append "X-Forwarded-For" value "$REMOTE_ADDR"
match request header append "X-Forwarded-Port" value "$REMOTE_PORT"
match request header set "X-Forwarded-Proto" value "https"
match response header remove "Server"
tls keypair "gw.home.arpa"
tcp { nodelay, socket buffer 65536 }
websockets
}
relay "gatehouse-tls" {
listen on 192.168.1.2 port 8443 tls
forward to <gatehouse> port 8080
}

This gives TLS without baking certificate handling into the application, WebSocket proxying for live monitoring, and automatic X-Forwarded-For headers for audit logging.
Service Isolation
Gatehouse runs as the _gatehouse system user via OpenBSD’s rc.d framework:
daemon="/usr/local/share/gatehouse/venv/bin/python3"
daemon_flags="-m uvicorn app:app --host 127.0.0.1 --port 8080 --log-level info"
daemon_execdir="/usr/local/share/gatehouse"
daemon_user="_gatehouse"
daemon_logger="daemon.info"

The _gatehouse user has doas permissions for exactly the commands it needs — pfctl, tcpdump, rcctl, crontab, and make in /etc/pf.d.
Nothing else.
The Technology Stack
Python 3 + FastAPI — async-capable, automatic OpenAPI docs, Pydantic validation
Uvicorn — ASGI server
python-jose — JWT token creation and validation
passlib — bcrypt password hashing (original auth)
ctypes — BSD auth_userokay() integration (current auth)
relayd — TLS reverse proxy (native OpenBSD)
FastAPI was chosen because it generates Swagger documentation at /docs automatically.
Every endpoint, request body, and response model is documented without writing a single line of API docs.
Authentication: From JSON Files to BSD Auth
The authentication system went through an evolution during development.
Version 1: JSON User Database
The initial implementation stored users in a users.json file with bcrypt-hashed passwords:
{
"admin": {
"password_hash": "$2b$12$..."
}
}

A manage_users.py CLI handled user CRUD.
Simple, portable, but it meant maintaining a separate user database alongside the system’s own accounts.
Version 2: Native BSD Authentication
The current version uses OpenBSD’s auth_userokay(3) — the same function that validates passwords for SSH, console login, and every other authentication on the system.
One user database, one password policy, one source of truth.
import ctypes
_libc = ctypes.CDLL("libc.so")
_libc.auth_userokay.restype = ctypes.c_int
_libc.auth_userokay.argtypes = [
ctypes.c_char_p, # name
ctypes.c_char_p, # style
ctypes.c_char_p, # type
ctypes.c_char_p, # password (zeroed by callee)
]
def bsd_auth_verify(username: str, password: str) -> bool:
pw_buf = ctypes.create_string_buffer(password.encode("utf-8"))
result = _libc.auth_userokay(
username.encode("utf-8"),
None, # style: use default from login.conf
None, # type: default
pw_buf, # mutable buffer - auth_userokay zeroes it
)
return result != 0

The ctypes FFI call is straightforward — auth_userokay takes a username, optional style/type, and a mutable password buffer.
It returns non-zero on success and zeroes the password buffer for security.
No subprocess, no shelling out, just a direct C library call.
Role-Based Access Control
Roles map to OpenBSD system groups:
| Role | Access Level |
|---|---|
| Admin | Full control — reload PF, modify rules, manage users |
| Operator | Toggle features, manage tables and devices |
| Viewer | Read-only — inspection, analytics, status |
Adding a user to the appropriate group grants API access at that level. Standard Unix group management, no application-level user administration needed.
JWT Tokens
After authentication, clients receive a JWT token valid for 60 minutes:
{
"sub": "username",
"role": "admin",
"exp": 1708732800,
"iat": 1708729200
}

Every subsequent request includes Authorization: Bearer <token>.
The middleware validates the token, extracts the role, and enforces minimum role requirements per endpoint.
Rate Limiting
Two layers of rate limiting protect the login endpoint:
Per-IP: 5 attempts per 60 seconds — stops brute-force attacks from a single source
Per-account: 5 failures per 300 seconds — prevents credential stuffing against a known username
Both use in-memory sliding windows with automatic cleanup. No external dependencies, no Redis, no database — just timestamp lists in a dictionary.
The Command Executor
Every PF operation ultimately runs a system command. The executor centralises this with consistent error handling, timeout management, and logging:
class CommandResult:
success: bool # True if returncode == 0
stdout: str # Full stdout
stderr: str # Full stderr
returncode: int # Exit code (-1 on exception)
class PFExecutor:
@staticmethod
def run(cmd: list[str], timeout: int = 30) -> CommandResult:
# subprocess.run with capture, timeout, logging
@staticmethod
def pfctl(*args: str) -> CommandResult:
# Shorthand: pfctl("-sr") → run(["/sbin/pfctl", "-sr"])
@staticmethod
def run_script(script: str, *args: str) -> CommandResult:
# Run from /etc/pf.d/scripts/
@staticmethod
def make(target: str, **kwargs: str) -> CommandResult:
# Run make target in /etc/pf.d with variables

Every command is logged before execution.
Failures log stderr.
Exceptions are caught and returned as CommandResult with returncode=-1.
Nothing raises — callers always get a result they can inspect.
This pattern makes the API predictable: every mutation endpoint returns {success, message, output}, where output contains the raw command output for debugging.
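The run() contract can be sketched roughly like this — not the project's exact code, but the same never-raises shape:

```python
import logging
import subprocess
from dataclasses import dataclass

log = logging.getLogger("gatehouse.executor")

@dataclass
class CommandResult:
    success: bool
    stdout: str
    stderr: str
    returncode: int

def run(cmd: list[str], timeout: int = 30) -> CommandResult:
    """Run a command; failures come back as a result, never an exception."""
    log.info("exec: %s", " ".join(cmd))
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except Exception as exc:  # timeout, missing binary, permission error...
        log.error("exec failed: %s", exc)
        return CommandResult(False, "", str(exc), -1)
    if proc.returncode != 0:
        log.error("exec error: %s", proc.stderr.strip())
    return CommandResult(
        proc.returncode == 0, proc.stdout, proc.stderr, proc.returncode
    )
```

The pfctl(), run_script(), and make() helpers are then one-liners that build the argument list and delegate here.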
API Endpoints: 66 Routes Across 9 Modules
The API is organised into nine controller modules, each handling a domain of PF management.
Health Check
GET / → {"status": "ok", "service": "gatehouse", "version": "1.0.0"}

Core Operations (6 endpoints)
The "big red buttons" — PF control operations that affect the entire firewall:
POST /core/check # Syntax-check the production config
POST /core/reload # Check + reload PF rules
POST /core/restart # Full disable/enable cycle
POST /core/load-safe # Load emergency fallback rules
POST /core/flush-reload # Nuclear option: flush everything, reload
POST /core/backup # Backup /etc/pf.d to timestamped directory

load-safe is the panic button.
It loads a minimal ruleset that allows SSH from internal networks and basic outbound — enough to maintain access while you fix whatever went wrong.
Inspection (9 endpoints)
Read-only queries against the running PF state:
GET /inspection/rules # pfctl -sr (filter rules)
GET /inspection/nat # pfctl -sn (NAT rules)
GET /inspection/states # pfctl -ss (connection states)
GET /inspection/stats # pfctl -si (statistics)
GET /inspection/tables # pfctl -sT (table list)
GET /inspection/tables/{name} # pfctl -t <name> -T show
GET /inspection/anchors # Status of all 16 anchors
GET /inspection/anchors/{name} # pfctl -a <name> -sr
GET /inspection/device-block-status # All 4 device blocking types

The anchors endpoint queries all 16 configured anchors and reports which have rules loaded. A single-call dashboard of what features are active.
Anchor/Feature Management (5 endpoints per feature)
This is where Gatehouse really shines — toggling PF features without editing config files or remembering pfctl syntax:
GET /anchors/status # All features
GET /anchors/{feature}/status # Single feature
POST /anchors/{feature}/enable # Load anchor rules
POST /anchors/{feature}/disable # Clear anchor rules
POST /anchors/{feature}/update # Refresh IP lists

Sixteen features are mapped to their anchor names and config files:
- Block YouTube on guest/IoT VLANs
- Block Fortnite/Epic Games
- Manual IP blacklisting
- Security feed blocklists
- Transparent HTTP/S proxy
- Isolate IoT VLAN
- mDNS/Bonjour service discovery
- Per-device YouTube blocking
- Full internet blocking per device
- Service-specific blocking
The update endpoint runs IP resolution scripts — for example, resolving YouTube’s current IP ranges via DNS and updating the PF table.
Table Management (12 endpoints)
Direct manipulation of PF tables — the data structures that hold IP lists:
Blackhole blocking:
GET /tables/blackhole # List blackholed IPs
POST /tables/blackhole # Add IP (with RFC1918 safety check)
DELETE /tables/blackhole # Remove IP
POST /tables/blackhole/flush # Clear auto-detected IPs
POST /tables/blackhole/auto-scan # Scan pflog for nuisance IPs

FTSO RPC whitelist (dual-table system):
GET /tables/ftso-rpc # List permanent + temporary IPs
POST /tables/ftso-rpc/permanent # Add permanent (file-backed)
DELETE /tables/ftso-rpc/permanent # Remove permanent
POST /tables/ftso-rpc/temporary # Add temporary (memory only)
DELETE /tables/ftso-rpc/temporary # Remove temporary
POST /tables/ftso-rpc/flush-temp # Clear all temporary

The dual-table pattern — permanent (file-backed) plus temporary (memory-only) — appears throughout the API.
Permanent entries survive PF reloads because they’re written to .ips files that PF reads on startup.
Temporary entries exist only in kernel memory and vanish on reload, making them perfect for testing.
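Both halves of the pattern come down to the same pfctl table operation; the difference is whether the entry is also written to the table's .ips file. A sketch with hypothetical helper names (the pfctl flags themselves are standard):

```python
def pfctl_table(table: str, op: str, *args: str) -> list[str]:
    """Build a pfctl table command, e.g. op='add' or op='flush'."""
    return ["/sbin/pfctl", "-t", table, "-T", op, *args]

def add_temporary(table: str, ip: str) -> list[str]:
    # Kernel memory only: the entry vanishes on the next PF reload.
    return pfctl_table(table, "add", ip)

def permanent_entry(table: str, ip: str) -> tuple[str, list[str]]:
    """File-backed: return (line to append to the table's .ips file,
    live-add command so the entry also takes effect immediately)."""
    return (ip + "\n", pfctl_table(table, "add", ip))
```

On reload, PF re-reads the .ips file and repopulates the permanent entries; anything added only via `pfctl -T add` is gone.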
Device Blocking (10 endpoints per type)
Four independent device blocking modes, each with identical endpoint structure:
POST /devices/{type}/enable # Load blocking rules
POST /devices/{type}/disable # Unload rules
GET /devices/{type}/list # Show permanent + temporary IPs
POST /devices/{type}/add # Add device IP (permanent)
POST /devices/{type}/add-temp # Add device IP (temporary)
DELETE /devices/{type}/remove # Remove from permanent
DELETE /devices/{type}/remove-temp # Remove from temporary
POST /devices/{type}/flush-temp # Clear temporary list
POST /devices/{type}/kill-states # Kill active connections

Types: full-block, services-block, youtube-24x7, youtube-sched
Every add operation validates the IP address: it must be RFC1918 (private), properly formatted, and not a gateway IP.
You can’t accidentally block the router.
The kill-states endpoint is crucial — adding an IP to a block table doesn’t affect existing connections.
You need to explicitly kill the connection states for the block to take immediate effect.
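The two-step sequence — table add, then state kill — can be sketched as a pair of pfctl invocations (the table name here is illustrative; `pfctl -k <host>` kills states whose source matches the host):

```python
def block_device(table: str, ip: str) -> list[list[str]]:
    """Commands to block a device AND drop its live connections;
    the table insertion alone leaves existing states untouched."""
    return [
        ["/sbin/pfctl", "-t", table, "-T", "add", ip],  # future packets
        ["/sbin/pfctl", "-k", ip],                      # existing states
    ]
```

The API runs both through the executor, so a kill-states failure still surfaces as a structured result rather than a half-applied block.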
Connection States (4 endpoints)
POST /states/kill # Kill all states from an IP
POST /states/kill/youtube # Kill YouTube connection states
POST /states/kill/fortnite # Kill Fortnite states
POST /states/kill/netflix # Kill Netflix states

Analytics (6 endpoints)
Parse /var/log/pflog for traffic analysis:
GET /analytics/summary # Total blocked, protocol breakdown
GET /analytics/top-ips # Top 20 blocked source IPs
GET /analytics/top-ports # Top 20 blocked destination ports
GET /analytics/protocols # Protocol distribution with percentages
GET /analytics/hourly # 24-hour activity breakdown
GET /analytics/recent # Last 50 blocked connections (raw)

These endpoints run tcpdump -n -e -ttt -r /var/log/pflog and parse the output.
Not the fastest approach for large logs, but it works without any additional infrastructure and the pflog binary format ensures accurate packet data.
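The aggregation side is regex plus Counter. A minimal sketch of a top-blocked-IPs pass over tcpdump output lines (the regex is illustrative, matched against the sample line format shown later in this post):

```python
import re
from collections import Counter

# Matches the tcpdump rendering of a pflog record, e.g.
#   ... block in on igc1: 203.0.113.5.44231 > 192.168.1.2.22: S
PFLOG_RE = re.compile(
    r"block (?:in|out) on (?P<iface>\S+): "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+)"
)

def top_blocked(lines: list[str], n: int = 20) -> list[tuple[str, int]]:
    """Count blocked source IPs across tcpdump output lines."""
    counts: Counter[str] = Counter()
    for line in lines:
        m = PFLOG_RE.search(line)
        if m:
            counts[m.group("src")] += 1
    return counts.most_common(n)
```

The same captured groups feed the ports, protocols, and hourly breakdowns — one parse, several Counters.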
Cron Management (4 endpoints)
GET /cron/status # Status of all 5 scheduled jobs
GET /cron/{job}/status # Single job status
POST /cron/{job}/enable # Add cron entries
POST /cron/{job}/disable # Remove cron entries

Five managed cron jobs:
| Schedule | Purpose |
|---|---|
| 6pm on / 9am off | Time-based device blocking |
| Sundays 2:30 AM | Refresh YouTube IP ranges |
| Sundays 3:00 AM | Refresh Fortnite IPs |
| Daily 3:00 AM | Update security blocklists |
| Every 15 min | Auto-scan pflog for nuisance IPs |
Crontab management uses marker comments (# gatehouse: device-block) to identify managed entries.
The API reads the crontab, adds/removes marked lines, and writes it back.
No external cron libraries — just crontab -l and crontab -.
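The marker manipulation itself is pure text processing, which makes it easy to sketch (helper names and the script path are hypothetical; the marker format is the one the post describes). The real code feeds `crontab -l` output in and pipes the result to `crontab -`:

```python
MARKER = "# gatehouse:"

def remove_job(crontab_text: str, job: str) -> str:
    """Drop lines tagged '# gatehouse: <job>' from a crontab dump."""
    tag = f"{MARKER} {job}"
    kept = [l for l in crontab_text.splitlines() if not l.endswith(tag)]
    return "\n".join(kept) + "\n" if kept else ""

def add_job(crontab_text: str, job: str, schedule: str, command: str) -> str:
    """Append a managed entry, idempotently: any old copy goes first."""
    base = remove_job(crontab_text, job)
    return base + f"{schedule} {command} {MARKER} {job}\n"
```

Because add_job removes before appending, enabling an already-enabled job never duplicates entries, and unmanaged crontab lines pass through untouched.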
DHCP Management (6 endpoints)
GET /dhcp/status # Service running/enabled status
POST /dhcp/start # rcctl start dhcpd
POST /dhcp/stop # rcctl stop dhcpd
POST /dhcp/restart # rcctl restart dhcpd
GET /dhcp/leases # Parse dhcpd.leases
GET /dhcp/subnets # Subnet utilisation from dhcpd.conf

Input Validation and Safety
Every user-provided IP address goes through validation before it touches PF:
def validate_device_ip(ip: str) -> tuple[bool, str]:
if not validate_ip(ip):
return False, "Invalid IP address format"
if not is_rfc1918(ip):
return False, "Only RFC1918 (private) addresses allowed"
if is_gateway_ip(ip):
return False, "Cannot block gateway IP addresses"
return True, ""

Gateway IPs are hardcoded as protected — you physically cannot block your router through the API:
GATEWAY_IPS = {
"192.168.1.2", # Primary LAN gateway
"192.168.3.2", # Secondary LAN
"192.168.110.2", # Guest WiFi
"192.168.120.2", # IoT WiFi
"192.168.130.2", # Admin WiFi
}

The blackhole endpoint adds an extra layer: by default, it refuses to blackhole RFC1918 addresses.
You can override with force: true, but you have to be explicit about it.
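The helpers that validate_device_ip relies on are a good fit for the standard library's ipaddress module. A sketch of one way to implement them (the GATEWAY_IPS set is the one from the post; checking the three RFC1918 networks explicitly avoids the broader is_private semantics, which also covers loopback and link-local):

```python
import ipaddress

GATEWAY_IPS = {"192.168.1.2", "192.168.3.2", "192.168.110.2",
               "192.168.120.2", "192.168.130.2"}

RFC1918_NETS = [ipaddress.ip_network(n) for n in
                ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def validate_ip(ip: str) -> bool:
    try:
        ipaddress.IPv4Address(ip)
        return True
    except ValueError:
        return False

def is_rfc1918(ip: str) -> bool:
    addr = ipaddress.IPv4Address(ip)
    return any(addr in net for net in RFC1918_NETS)

def is_gateway_ip(ip: str) -> bool:
    return ip in GATEWAY_IPS
```

Letting ipaddress do the parsing means "999.1.1.1" and other malformed strings are rejected before any range check runs.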
Parsing PF Output
PF’s output is designed for humans, not machines. The parsers bridge that gap.
PF statistics (pfctl -si) produces lines like:
State Table Total Rate
current entries 47
searches 12847293 234.2/s
inserts 89432 1.6/s

The parser extracts key-value pairs with regex and returns structured JSON.
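A sketch of that extraction, matched against the sample above (illustrative of the pattern, not the project's exact parser):

```python
import re

# One "name  value [rate]" line of `pfctl -si` output; header lines
# start with capitals and fall through.
STAT_RE = re.compile(
    r"^\s*(?P<name>[a-z][a-z -]*[a-z])\s+(?P<value>\d+)"
    r"(?:\s+(?P<rate>[\d.]+)/s)?\s*$"
)

def parse_pf_stats(text: str) -> dict[str, dict]:
    """Turn pfctl -si text into {name: {'total': n, 'rate_per_s': r}}."""
    stats: dict[str, dict] = {}
    for line in text.splitlines():
        m = STAT_RE.match(line)
        if m:
            entry: dict = {"total": int(m.group("value"))}
            if m.group("rate"):
                entry["rate_per_s"] = float(m.group("rate"))
            stats[m.group("name")] = entry
    return stats
```

Lines that don't match the pattern are simply skipped, so headers and blank lines cost nothing.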
pflog analysis is more involved — tcpdump output contains timestamps, interfaces, protocols, IPs, and ports in a format that varies by protocol:
2024-01-15 14:23:45.123456 rule 0/(match) block in on igc1: 203.0.113.5.44231 > 192.168.1.2.22: S

The analytics parsers use regex to extract source IPs, ports, protocols, and hours, then aggregate with Python’s Counter.
The Modular Architecture
The codebase follows a controller-service pattern:
api/src/
├── auth/ # Authentication + JWT + rate limiting
├── core/ # PF check/reload/restart/backup
├── inspection/ # Read-only PF queries
├── anchors/ # Feature toggle management
├── tables/ # PF table operations
├── devices/ # Device-level blocking
├── states/ # Connection state management
├── analytics/ # pflog analysis
├── cron/ # Scheduled task management
├── dhcp/ # DHCP service management
└── common/
├── executor.py # Command execution
├── parsers.py # Output parsing
└── validators.py # Input validation

Controllers handle HTTP concerns — route definitions, request parsing, response formatting, authentication guards. They’re thin wrappers that delegate to services.
Services contain business logic — input validation, command construction, output parsing, file I/O. They use the executor for all system commands and return structured results.
Common modules are shared utilities. The executor, parsers, and validators have no HTTP awareness — they’re pure functions that could be used from a CLI tool or test harness.
What I Learned
OpenBSD is a great API platform
The base system includes everything you need for a secure API deployment: relayd for TLS termination, rc.d for service management, doas for privilege escalation, crontab for scheduling, and auth_userokay for authentication.
No third-party packages needed for infrastructure.
PF anchors are the perfect API primitive
PF anchors — named rule containers that can be loaded and unloaded at runtime — map perfectly to REST endpoints.
POST /anchors/youtube/enable loads rules into the youtube-block anchor.
POST /anchors/youtube/disable clears it.
The main ruleset never changes, so there’s no risk of a syntax error taking down the firewall.
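The two operations reduce to two pfctl invocations — `-a <anchor> -f <file>` loads rules into an anchor, and `-a <anchor> -F rules` flushes that anchor's rules without touching anything else. A sketch of the command builders (the config file path is illustrative; the youtube-block anchor name is the one above):

```python
def enable_cmd(anchor: str, rules_file: str) -> list[str]:
    """Load a feature's rules into its anchor."""
    return ["/sbin/pfctl", "-a", anchor, "-f", rules_file]

def disable_cmd(anchor: str) -> list[str]:
    """Flush only this anchor's rules; the main ruleset is untouched."""
    return ["/sbin/pfctl", "-a", anchor, "-F", "rules"]
```

Since the anchor file is syntax-checked in isolation, a broken feature config fails its own load and leaves the rest of the firewall running.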
Dual tables solve the testing problem
The permanent + temporary table pattern appeared organically. When testing device blocking, I wanted to add an IP temporarily, verify it worked, then either make it permanent or remove it. File-backed tables persist across reloads; memory-only tables don’t. Both are first-class concepts in the API.
State killing is as important as rule loading
The first version of device blocking added IPs to tables but didn’t kill existing connection states.
Blocked devices stayed connected for minutes until their TCP sessions naturally expired.
Adding kill-states endpoints made blocks take effect immediately.
Parsing shell output is fragile but practical
Ideally, pfctl would have a --json flag.
It doesn’t.
Parsing text output with regex is fragile — a format change in a future OpenBSD release could break things.
But it works today, the output formats have been stable for years, and the alternative (writing a C program that uses PF’s ioctl interface) is dramatically more complex for a home network API.
What’s Next
The Python implementation works, but it has deployment friction: a virtual environment, pip dependencies, Python runtime, and a multi-process ASGI server for what is fundamentally a single-binary problem.
The next step is a Go rewrite — same API surface, same endpoints, same JSON shapes — but compiled to a single static binary. No venv, no pip, no Python runtime. Just copy the binary and run it.
But that’s a story for another post.
Gatehouse is open-source and runs on OpenBSD 7.8. The PF configuration it manages handles six network segments with content filtering, device-level parental controls, security blocklists, and dynamic feature toggles — all controllable through 66 REST endpoints.