Alert Troubleshooting
This guide covers common alert scenarios and their resolutions.
Critical Alert Playbooks
Device Offline (CONN_001)
Immediate Actions:
- Check Console - Verify device shows as offline
- Ping device (if on same network)

      ping {device-ip}

- Check physical - Power, network cable, indicator lights
If device unreachable:
- Check network switch/router
- Verify VLAN configuration
- Check PoE power budget
If device reachable but still showing offline:
- Access camera web UI
- Check Anava ACAP status (Apps → Anava Agent)
- Restart ACAP if stopped
- Verify the MQTT configuration hasn't changed
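The branching in this playbook can be sketched as a small decision helper. This is only an illustration: the function name and its two inputs are hypothetical, and the returned strings simply repeat the steps listed above.

```python
from typing import List, Optional


def offline_next_steps(ping_ok: bool, acap_running: Optional[bool] = None) -> List[str]:
    """Return the CONN_001 playbook branch that applies.

    ping_ok: whether the device answered a ping.
    acap_running: ACAP state seen in the camera web UI, or None if not checked yet.
    """
    if not ping_ok:
        # Device unreachable: look at the network path and power first.
        return [
            "Check network switch/router",
            "Verify VLAN configuration",
            "Check PoE power budget",
        ]
    # Device reachable but still reported offline: look at the agent itself.
    steps = ["Check Anava ACAP status (Apps → Anava Agent)"]
    if acap_running is False:
        steps.append("Restart ACAP")
    steps.append("Check MQTT configuration hasn't changed")
    return steps


print(offline_next_steps(ping_ok=True, acap_running=False))
```

Keeping the reachable and unreachable branches separate mirrors the playbook: network and power checks are skipped once the device answers a ping.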
Certificate Validation Failed (SEC_001)
This is a security-critical alert. Investigate immediately.
- Check certificate expiry
  Console → Devices → [Device] → Certificates
- Verify CA chain
  - Device should trust Anava root CA
  - Check for CA mismatch
- Investigate network path
  - Could indicate MITM attempt
  - Check for proxy interception
  - Verify DNS resolution
- If legitimate expiry:
  - Initiate certificate rotation
  - Console → Devices → Actions → Rotate Certificate
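To decide whether an expiry is legitimate, it can help to compute how many days remain on the certificate. The sketch below assumes you already have the certificate's `notAfter` value in the text format that Python's `ssl` module reports for peer certificates; how you obtain that value from a device is not covered here, and the function name is hypothetical.

```python
import ssl

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now_epoch: float) -> float:
    """Days left before the certificate's notAfter time (negative if expired).

    not_after looks like "Jan 11 00:00:00 2026 GMT".
    now_epoch is the current time in seconds since the epoch (e.g. time.time()).
    """
    expires_epoch = ssl.cert_time_to_seconds(not_after)
    return (expires_epoch - now_epoch) / SECONDS_PER_DAY


# Fixed "now" so the example is reproducible; use time.time() in practice.
now = ssl.cert_time_to_seconds("Jan  1 00:00:00 2026 GMT")
print(days_until_expiry("Jan 11 00:00:00 2026 GMT", now))  # 10.0
```

A negative result points to a genuine expiry; a positive result alongside a SEC_001 alert suggests looking at the CA chain or the network path instead.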
Unauthorized Broker Attempt (SEC_002)
Security incident - treat as high priority.
- Check ConfigGuardian logs
  Console → Devices → [Device] → Logs → Filter: ConfigGuardian
- Identify the attempted broker
  - Alert context shows attempted hostname/IP
- Determine source:
  - Manual configuration change?
  - Network-level redirect?
  - Malicious tampering?
- Response:
  - Verify device is now connected to correct broker
  - Check for other affected devices
  - Review network security
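When checking for other affected devices, grouping SEC_002 alerts by the broker they attempted makes shared causes easier to spot. This is a minimal sketch over alert records you have already exported; the field names `device_id` and `attempted_broker` and the broker hostnames are assumptions for illustration, not the actual alert schema.

```python
from collections import defaultdict


def unexpected_brokers(alerts, expected_brokers):
    """Map each unexpected broker host to the sorted device IDs that tried it."""
    expected = {b.lower() for b in expected_brokers}
    hits = defaultdict(set)
    for alert in alerts:
        # Hostnames are case-insensitive, so normalise before comparing.
        host = alert["attempted_broker"].strip().lower()
        if host not in expected:
            hits[host].add(alert["device_id"])
    return {host: sorted(ids) for host, ids in hits.items()}


# Hypothetical sample data.
sample = [
    {"device_id": "B", "attempted_broker": "Rogue.example.net"},
    {"device_id": "A", "attempted_broker": "rogue.example.net"},
    {"device_id": "C", "attempted_broker": "mqtt.example.internal"},
]
print(unexpected_brokers(sample, ["mqtt.example.internal"]))
```

Several devices pointing at the same unexpected host suggests a network-level redirect or a scripted change rather than a one-off manual edit.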
Common Scenarios
Cluster of Devices Goes Offline
Pattern: Multiple devices in same location offline simultaneously
Likely causes:
- Network switch/router failure
- PoE switch overload
- ISP/WAN outage
- DHCP server issue
Investigation:
- Check if all devices are in same subnet/location
- Verify network infrastructure status
- Check DHCP lease availability
- Contact network team
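The subnet check can be done quickly from a list of offline device IPs. The sketch below uses Python's standard `ipaddress` module and assumes IPv4 addresses with a /24 prefix; adjust the prefix length to match your addressing plan. The function name is hypothetical.

```python
import ipaddress
from collections import Counter


def offline_by_subnet(offline_ips, prefix_len=24):
    """Count offline devices per subnet of the given prefix length."""
    counts = Counter()
    for ip in offline_ips:
        # strict=False lets us pass a host address and get its containing network.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


print(offline_by_subnet(["10.0.1.15", "10.0.1.20", "10.0.2.7"]))
```

If nearly all offline devices fall into one subnet, the shared switch, PoE budget, or DHCP scope for that segment is the first place to look.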
Repeated Configuration Conflicts (CFG_003)
Pattern: Same device shows CFG_002 (healed) followed by CFG_003 (conflict) repeatedly
Likely causes:
- Someone manually changing camera settings
- Third-party integration modifying MQTT config
- Script or automation fighting ConfigGuardian
Resolution:
- Identify who/what is making changes
- If legitimate: Update golden config through Console
- If unauthorized: Investigate access, review camera audit logs
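To confirm the pattern before chasing its source, you can count how often a heal is immediately followed by a conflict in a device's alert history. This sketch assumes you have the device's alert codes as a time-ordered list (for example, exported from the alert history); the function name is hypothetical.

```python
def count_heal_conflict_cycles(codes):
    """Count CFG_002 alerts immediately followed by CFG_003 in a time-ordered list."""
    return sum(
        1
        for prev, cur in zip(codes, codes[1:])
        if prev == "CFG_002" and cur == "CFG_003"
    )


history = ["CFG_002", "CFG_003", "CFG_002", "CFG_003", "RES_002"]
print(count_heal_conflict_cycles(history))  # 2
```

A count that keeps growing at regular intervals usually points to a script or integration rather than a person.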
Memory Slowly Increasing
Pattern: RES_002 alerts appearing on same device over days/weeks
Likely causes:
- Memory leak in ACAP
- Increasing number of concurrent operations
- Camera firmware issue
Resolution:
- Check ACAP version - upgrade if outdated
- Review skill configuration - reduce if excessive
- Schedule regular ACAP restart as workaround
- Report to support with device diagnostics
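A steady upward trend is easier to argue for with a number. The sketch below fits a least-squares line through memory samples and returns the slope. It assumes you have collected `(hours_since_start, memory_mb)` pairs yourself, for example from successive diagnostics; neither the sample format nor the function name comes from the product.

```python
def memory_slope(samples):
    """Least-squares slope of (hours, memory_mb) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one timestamp")
    return num / den


# Hypothetical readings taken one day apart.
readings = [(0, 100), (24, 112), (48, 124)]
print(memory_slope(readings))  # 0.5
```

A slope that stays positive across ACAP restarts of the same version is worth including when reporting to support.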
Certificate Expiry Wave
Pattern: Multiple SEC_003 alerts across fleet
Likely causes:
- Devices provisioned at same time, certs expiring together
- Certificate rotation not running
Resolution:
- Check Console → Settings → Certificates → Auto-rotation status
- If disabled, enable auto-rotation
- Manually rotate affected devices if urgent
- Review certificate lifecycle policy
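To see whether expiries are bunched together, count them per calendar month. This is a small sketch that assumes you have each device's certificate expiry date available as a `datetime.date`; the function name is hypothetical.

```python
from collections import Counter
from datetime import date


def expiries_per_month(expiry_dates):
    """Count certificate expiries per calendar month, keyed as 'YYYY-MM'."""
    return Counter(d.strftime("%Y-%m") for d in expiry_dates)


# Hypothetical expiry dates for three devices.
fleet = [date(2026, 3, 1), date(2026, 3, 15), date(2026, 4, 2)]
print(expiries_per_month(fleet))
```

A month with far more expiries than its neighbours usually corresponds to a batch of devices provisioned together, which is a candidate for staggered manual rotation.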
Diagnostic Commands
From Anava Console
| Action | Path |
|---|---|
| View device logs | Devices → [Device] → Logs |
| Download diagnostics | Devices → [Device] → Actions → Download Diagnostics |
| Check connectivity | Devices → [Device] → Connection Status |
| View alert history | Events → Filter by device |
| Restart ACAP | Devices → [Device] → Actions → Restart Application |
From Camera Web UI
| Action | Path |
|---|---|
| View ACAP status | Apps → Anava Agent |
| Check ACAP logs | System → Logs → Application Log |
| View network settings | System → Network |
| Check MQTT config | Settings → MQTT (if accessible) |
Direct API Queries
# Get device status
curl -H "Authorization: Bearer $TOKEN" \
"https://api.anava.ai/v1/devices/ACCC8EF12345/status"
# Get recent alerts for device
curl -H "Authorization: Bearer $TOKEN" \
"https://api.anava.ai/v1/devices/ACCC8EF12345/alerts?hours=24"
# Get alert details
curl -H "Authorization: Bearer $TOKEN" \
"https://api.anava.ai/v1/alerts/{alertId}"
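The same queries can be scripted with Python's standard library. The sketch below only builds the authenticated request, using the base URL and bearer-token header from the curl examples above; the helper name is hypothetical, and the response format is not assumed, so sending and parsing are left to the caller.

```python
import os
import urllib.request

API_BASE = "https://api.anava.ai/v1"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for an API path such as '/devices/<id>/status'."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )


# Reads the token from the TOKEN environment variable, as the curl examples do.
req = build_request("/devices/ACCC8EF12345/alerts?hours=24", os.environ.get("TOKEN", ""))
print(req.full_url)
# To send it: urllib.request.urlopen(req, timeout=10), then read the response body.
```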
Alert Noise Reduction
Too Many Alerts?
- Adjust thresholds
  Console → Settings → Alert Rules → Thresholds
  - Increase latency threshold if false positives
  - Adjust memory warning level for known high-usage devices
- Use alert rules

      rule:
        name: "Suppress INFO on test devices"
        condition:
          group: "test-lab"
          severity: "INFO"
        action:
          suppress: true

- Configure quiet hours
  Console → Settings → Notifications → Quiet Hours
  - Suppress non-critical alerts during off-hours
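Before deploying a suppression rule, it is worth checking which alerts it would actually match. The sketch below mirrors the example rule as a Python dictionary and assumes that all condition fields must match exactly (a logical AND); the real rule engine's matching semantics may differ, and the alert field names are hypothetical.

```python
RULE = {
    "name": "Suppress INFO on test devices",
    "condition": {"group": "test-lab", "severity": "INFO"},
    "action": {"suppress": True},
}


def is_suppressed(alert: dict, rule: dict) -> bool:
    """True if every condition field equals the alert's value and the rule suppresses."""
    matches = all(alert.get(key) == value for key, value in rule["condition"].items())
    return matches and bool(rule["action"].get("suppress"))


print(is_suppressed({"group": "test-lab", "severity": "INFO"}, RULE))      # True
print(is_suppressed({"group": "test-lab", "severity": "CRITICAL"}, RULE))  # False
```

Running a sample of recent alerts through a check like this also helps when alerts seem to be missing, since it shows whether a rule is matching more than intended.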
Not Enough Alerts?
- Check notification settings
  - Verify email/Slack channels configured
  - Check spam folders
- Verify alert rules
  - Ensure no rules suppressing needed alerts
- Test alert pipeline
  Console → Devices → [Device] → Actions → Send Test Alert
Escalation Guide
When to Escalate to Anava Support
| Scenario | Priority | Information to Include |
|---|---|---|
| Security incident (SEC_002) | P1 | Device ID, timestamps, network logs |
| Fleet-wide outage | P1 | Affected device list, network topology |
| Repeated ACAP crashes | P2 | Device diagnostics bundle, ACAP version |
| Unexplained alerts | P3 | Alert IDs, patterns observed |
How to Collect Diagnostics
- Via Console:
  Devices → [Device] → Actions → Download Diagnostics
- Via API:

      curl -H "Authorization: Bearer $TOKEN" \
        "https://api.anava.ai/v1/devices/ACCC8EF12345/diagnostics" \
        -o diagnostics.zip

- Include:
  - Device ID and firmware version
  - ACAP version
  - Timestamps of issues
  - Steps to reproduce (if applicable)
Related Documentation
- Overview - Alert system introduction
- Alert Codes Reference - Complete code list
- ConfigGuardian Alerts - Configuration-specific alerts
Last updated: December 2025