Verification

After installation, verify the agent is healthy:
  1. Check Pod Status:
    kubectl get pods -n nofire-system
    
  2. Check Logs:
    kubectl logs -l app=nofire-edge -n nofire-system
    
    Look for “Graph published successfully”.
  3. Check Metrics: Port-forward the agent's service to reach its metrics endpoint:
    kubectl port-forward svc/nofire-edge 8080:8080 -n nofire-system
    
    Visit http://localhost:8080/metrics.
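
The log check in step 2 can be scripted. This is a minimal sketch, assuming the success message appears verbatim in the agent's log output; the function name log_has_publish_success is illustrative and not part of the agent.

```shell
# Succeeds (exit 0) if the log text on stdin contains the success message.
log_has_publish_success() {
  grep -qF 'Graph published successfully'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl logs -l app=nofire-edge -n nofire-system --tail=200 | log_has_publish_success \
#     && echo "agent is publishing" || echo "no successful publish in recent logs"
```

When a label selector is used, kubectl logs returns only the last 10 lines per Pod by default, so passing --tail explicitly makes it less likely that the message is missed.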

Monitoring

Key metrics to monitor:
  • dnstap_active_connections: Should be > 0 (indicates CoreDNS is connected).
  • dnstap_frames_received_total: Should be increasing (indicates DNS traffic).
  • graph_nodes_total: Should roughly match the number of resources in your cluster.
  • publisher_errors_total: Should remain at 0 (any increase indicates failed publish attempts).
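
The expectations above can be checked in one pass over the metrics endpoint. The sketch below is illustrative: the function name check_metrics is hypothetical, it assumes the endpoint serves the Prometheus text format without trailing timestamps, and it treats a metric that is absent as 0.

```shell
# Reads Prometheus text-format metrics on stdin and checks them against
# the expectations listed above. Exits non-zero if any check fails.
# Assumes samples carry no trailing timestamp and label values contain no spaces.
check_metrics() {
  awk '
    /^#/ || NF < 2 { next }            # skip HELP/TYPE comments and blank lines
    {
      name = $1
      sub(/[{].*/, "", name)           # drop the label set, if any
      val[name] += $NF                 # sum a metric across its label sets
    }
    function report(name, ok, hint) {
      if (ok) {
        printf "OK %s=%s\n", name, val[name] + 0
      } else {
        printf "WARN %s=%s (%s)\n", name, val[name] + 0, hint
        failed = 1
      }
    }
    END {
      report("dnstap_active_connections", val["dnstap_active_connections"] > 0, "CoreDNS is not connected")
      report("graph_nodes_total", val["graph_nodes_total"] > 0, "graph is empty")
      report("publisher_errors_total", val["publisher_errors_total"] == 0, "expected 0")
      # A counter can only be judged across two scrapes, so just print it.
      printf "INFO dnstap_frames_received_total=%s\n", val["dnstap_frames_received_total"] + 0
      exit failed + 0
    }
  '
}

# Usage, with the port-forward from the Verification section running:
#   curl -s http://localhost:8080/metrics | check_metrics
```

Running it twice a minute or so apart and comparing the INFO line is a simple way to confirm that dnstap_frames_received_total is increasing.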

Troubleshooting

Common Issues

1. DNSTap Not Connected

Symptoms: dnstap_active_connections is 0 and no dependency edges appear in the graph.
Fix:
  • Verify that the CoreDNS config has the correct Edge IP.
  • Check whether a NetworkPolicy is blocking traffic from kube-system to nofire-system on port 6000.
  • Ensure CoreDNS was restarted after the config change.
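
These checks can be run from the command line. The sketch below rests on several assumptions that may not hold in every cluster: CoreDNS is configured through a ConfigMap named coredns (key Corefile) and runs as a Deployment named coredns in kube-system, as in kubeadm-style clusters, and the "Edge IP" is the ClusterIP of the nofire-edge service. The helper name dnstap_endpoints is hypothetical.

```shell
# Prints the endpoint(s) configured for the CoreDNS dnstap plugin
# in a Corefile read from stdin (plugin syntax: dnstap SOCKET [full]).
dnstap_endpoints() {
  awk '$1 == "dnstap" { print $2 }'
}

# Usage against a live cluster (requires kubectl access):
#   # Endpoint CoreDNS is sending to:
#   kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | dnstap_endpoints
#   # Address it should match (assumed to be the service ClusterIP):
#   kubectl get svc nofire-edge -n nofire-system -o jsonpath='{.spec.clusterIP}'
#   # Policies that could block port 6000:
#   kubectl get networkpolicy -n nofire-system
#   # Restart CoreDNS so it picks up a changed config:
#   kubectl rollout restart deployment/coredns -n kube-system
```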

2. Publisher Errors

Symptoms: Logs show “Failed to publish graph”.
Fix:
  • Check that the Pod has outbound internet access.
  • Verify that the API Key is correct.
  • Check whether the NOFire AI endpoint is reachable (curl -v https://api.nofire.ai/graph).
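
To separate a network problem from a credentials problem, the HTTP status returned by the curl check can be inspected. This is a rough sketch: classify_publish_status is a hypothetical helper, and the mapping of 401/403 to a bad API key is an assumption about how the endpoint behaves, not documented behavior.

```shell
# Maps an HTTP status code, as printed by curl's %{http_code},
# to a likely cause. curl prints 000 when it received no HTTP response.
classify_publish_status() {
  case "$1" in
    000)     echo "unreachable" ;;   # DNS, egress rules, or proxy settings
    401|403) echo "auth" ;;          # assumed: credentials were rejected
    *)       echo "reachable" ;;     # the endpoint answered
  esac
}

# Usage:
#   code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 https://api.nofire.ai/graph)
#   classify_publish_status "$code"
```

A request sent from a workstation only tests the workstation's network path; running the same check from inside the cluster gives a better picture of what the agent Pod can reach.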

3. High Memory Usage

Symptoms: The Pod is OOMKilled.
Fix:
  • Increase the memory limit in the Helm values.
  • Reduce graph.maxPruneAge to keep the graph smaller.
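
Before changing limits, it can help to confirm that the restarts were caused by the OOM killer. The sketch below lists agent Pods whose last container termination reason was OOMKilled; the helper name oom_killed_pods is hypothetical, and the app=nofire-edge label is taken from the log command in the Verification section.

```shell
# Reads "pod-name<TAB>termination-reason" lines on stdin and prints
# the names of Pods whose last termination reason was OOMKilled.
oom_killed_pods() {
  awk -F '\t' '$2 ~ /OOMKilled/ { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -l app=nofire-edge -n nofire-system \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' \
#     | oom_killed_pods
```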