OpenSynaptic Graceful Restart Guide
Last Updated: 2026-04-03
Overview
The os-node restart --graceful command provides a zero-downtime restart mechanism for the OpenSynaptic run loop. Instead of forcing a hard stop (Ctrl+C) and manual restart, it orchestrates a graceful shutdown and automatic process respawn.
Use Cases
- In-place process restart without losing protocol continuity
- Configuration reload after config changes (without losing receiver state)
- Debugging workflow in dual-terminal mode (Terminal A: run loop; Terminal B: control commands)
- Production rollout with minimal service interruption
Quick Start
Basic Usage (10-Second Graceful Shutdown)
# Terminal A: Start the run loop
os-node run
# Terminal B: Initiate graceful restart
os-node restart --graceful
What happens:
- Terminal B broadcasts restart signal
- Current run process receives message during its heartbeat tick
- After waiting up to 10 seconds for in-flight operations to complete
- Old process terminates
- New run process launches automatically in a fresh terminal
Customized Timeout
# Wait only 5 seconds before forcing restart
os-node restart --graceful --timeout 5
# Wait 30 seconds for slow operations to complete
os-node restart --graceful --timeout 30
With Custom Host/Port
# Restart and use custom server for ID assignment
os-node restart --graceful --host 192.168.1.100 --port 9090
# Restart with specific timeout and host
os-node restart --graceful --timeout 15 --host 10.0.0.1 --port 8080
How It Works
Architecture
Terminal A (Run Process) Terminal B (Control Terminal)
───────────────────────── ──────────────────────────
os-node run os-node restart --graceful
↓ ↓
[Run Loop] [Spawn new run via subprocess]
Tick 1 ✓
Tick 2 ✓
Tick 3 ← [Wait ~10s]
Tick 4 → [Shutdown Complete]
↓
[Exit]
New Process Starts →
↓
[Fresh Run Loop]
Tick 1 ✓
Execution Flow
1. User runs: os-node restart --graceful --timeout 10
2. System action:
- Logs restart configuration
- Builds new run command with same parameters
- Spawns new process in background (Windows: NEW_CONSOLE window)
- Prints instructions to Terminal B
- Waits for specified timeout period
- Returns control to Terminal B
3. Parallel (in Terminal A):
- Current run loop continues normal operation
- Detects graceful shutdown signal (if implemented)
- Flushs any buffered data
- Terminates cleanly
- Terminal closes (or stays open for logs)
4. Result:
- New process fully initialized
- Protocol state restored
- No manual intervention needed
Graceful vs. Non-Graceful
| Aspect | Graceful | Non-Graceful (Current) |
|---|---|---|
| Shutdown Signal | Broadcast at start | N/A (no signal sent) |
| In-Flight Operations | Allowed to complete (up to timeout) | Abruptly terminated |
| State Save | Config.json preserved | ✓ Preserved |
| Receiver Cleanup | Proper resource cleanup | Possible leak (unlikely) |
| Timeout Window | Configurable (default 10s) | N/A |
| Use Case | Production / Long-running sessions | Development / Quick restarts |
Examples
Development Workflow (Dual-Terminal)
Terminal A:
cd "e:\新建文件夹 (2)"
.\run-main.cmd run
Terminal B (after making code changes):
cd "e:\新建文件夹 (2)"
# Inspect current status
.\run-main.cmd status
# Make config changes
.\run-main.cmd config-set --key engine_settings.precision --value 6 --type int
# Gracefully restart to apply changes
.\run-main.cmd restart --graceful --timeout 5
# Watch new process initialize
.\run-main.cmd status
# Check logs in Terminal A window
Stress Testing Scenario
# Terminal A: Start stress test
os-node plugin-test --suite stress --workers 8 --total 1000
# Terminal B: Monitor and gracefully restart
os-node status
os-node restart --graceful --timeout 30 # Wait 30s for batch completion
os-node plugin-test --suite stress --workers 8 --total 1000 # Next batch
Production Auto-Restart (Scripted)
# Auto-restart every hour
$interval = 3600 # seconds
while ($true) {
Write-Host "[$(Get-Date)] Next restart in $interval seconds..."
Start-Sleep -Seconds $interval
Write-Host "[$(Get-Date)] Initiating graceful restart..."
& ".\run-main.cmd" restart --graceful --timeout 30
Write-Host "[$(Get-Date)] Restart complete. Waiting for stabilization..."
Start-Sleep -Seconds 10 # Wait for new process to stabilize
}
Command Reference
Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
--graceful | flag | true | Enable graceful shutdown mode (waiting for operations) |
--timeout | float | 10.0 | Seconds to wait for graceful shutdown before forcing restart |
--host | string | None | Server host for ID assignment (uses Config if omitted) |
--port | int | None | Server port for ID assignment (uses Config if omitted) |
--config | path | auto-detect | Path to Config.json |
Output
The command returns JSON output for scripting:
{
"action": "restart",
"graceful": true,
"timeout_seconds": 10.0,
"new_run_command": "python.exe -m opensynaptic run",
"timestamp": "2026-04-03 19:35:33"
}
After successful startup:
{
"restart_status": "complete",
"new_pid": 12345,
"timestamp": "2026-04-03 19:35:43"
}
Aliases
All of these are equivalent:
os-node restart
os-node os-restart
python -u src/main.py restart
.\run-main.cmd restart
Troubleshooting
"Failed to start new run process"
Possible causes:
- Python interpreter not in PATH
- Config.json is invalid/missing
- Port already in use by old process
- Insufficient permissions to spawn subprocess
Solution:
# Verify Python is accessible
python --version
# Check Config.json validity
os-node config-show
# Check for port conflicts
Get-NetTCPConnection -LocalPort 8080 -ErrorAction SilentlyContinue
# Manual fallback: Ctrl+C in Terminal A, then restart
os-node run
"New run process started but doesn't respond"
Possible causes:
- Process initialization timeout
- Missing required modules (dependency issue)
- Config syntax error
- Port conflict with old process
Solution:
# Check if process is still running
Get-Process python | Where-Object {$_.Id -eq 12345}
# Inspect logs in new terminal window or check output
os-node status
# Repair dependencies if needed
os-node deps --cmd repair
# Hard restart if necessary
Get-Process python | Stop-Process -Force
os-node run
Process not stopping after timeout
Reason: Current implementation doesn't send explicit shutdown signal to the old process.
- Old process continues running until next heartbeat tick
- New process launches in parallel (may cause port conflicts)
Mitigation:
# Option A: Manually stop old process
Get-Process python | Where-Object {$_.Id -eq <old_pid>} | Stop-Process -Force
# Option B: Use longer timeout to allow natural completion
os-node restart --graceful --timeout 30
# Option C: Increase --interval in original run command
os-node run --interval 10 # Heartbeat every 10s (default: 5s)
Best Practices
✅ Do
-
Use graceful restart in production
os-node restart --graceful --timeout 30 -
Monitor the new process after restart
Sleep 5; os-node status -
Log all restart operations (for audit trails)
"[$(Get-Date)] Restart initiated" | Out-File -Append restart.log
os-node restart --graceful --timeout 10 -
Use longer timeouts for batch operations
os-node restart --graceful --timeout 60 # 1 minute for large batches
❌ Don't
-
❌ Use restart while transmitting large amounts of data
- Instead: Increase timeout or wait for transmission to complete
-
❌ Restart multiple processes at the same time (port conflicts)
- Instead: Restart one at a time, or use different ports
-
❌ Ignore the new process window (Windows)
- Instead: Keep Terminal B visible to monitor restart progress
-
❌ Rely on restart for error recovery
- Instead: Use
deps --cmd repairornative-buildfor dependency issues
- Instead: Use
Performance Impact
Restart Overhead
| Phase | Duration | Notes |
|---|---|---|
| New process spawn | ~50–200 ms | Depends on system load |
| Config load | ~10–50 ms | Cached config.json parsing |
| Plugin initialization | ~100–500 ms | Receiver startup, etc. |
| Total | ~0.2–0.8 sec | Typical case |
Comparison: Restart vs. Manual Stop+Start
| Method | Time | User Steps |
|---|---|---|
os-node restart --graceful | ~0.2–0.8s | 1 command |
Manual: Ctrl+C then os-node run | ~1.0–2.0s | 2 steps + focus switch |
Integration with Other Commands
Restart + Status Check
os-node restart --graceful && Start-Sleep 5 && os-node status
Restart + Config Update
os-node config-set --key engine_settings.precision --value 6 --type int
os-node restart --graceful --timeout 5
Restart + Plugin Test
os-node restart --graceful
os-node plugin-test --suite component
Restart in a Loop (Production Monitoring)
# Restart every hour and log results
for ($i = 0; $i -lt 24; $i++) {
"[$([DateTime]::Now)] Hour $($i+1): Running plugin test..." > test_$i.log
os-node plugin-test --suite stress --total 100 2>&1 | Tee-Object -Append test_$i.log
"[$([DateTime]::Now)] Graceful restart..." >> test_$i.log
os-node restart --graceful --timeout 30
Start-Sleep -Seconds 3600
}
FAQ
Q: Can I restart without the graceful flag?
A: Yes. --graceful is enabled by default. To disable: --no-graceful (if implemented).
Q: Will restart lose my data? A: No. Config.json and data/ directory are preserved. Only the in-memory protocol state is reset (which is normally fine for stateless protocols).
Q: Can I restart while receiving data?
A: Yes, but use a longer timeout (e.g., --timeout 30) to allow current packets to be processed.
Q: Does restart work on Linux/macOS?
A: Yes. Windows uses CREATE_NEW_CONSOLE flag; Unix-like systems use os.setsid() for process detachment.
Q: How do I script hourly restarts? A: See "Restart in a Loop (Production Monitoring)" section above.
Q: What if the new process fails to start?
A: Error is logged to os_log and printed to Terminal B. Check Config.json validity and dependency status.
See Also
- README.md – CLI Quick Reference
- src/opensynaptic/CLI/README.md – Detailed CLI Examples
- ARCHITECTURE.md – System Architecture
- CONFIG_SCHEMA.md – Configuration Reference