Operations & Troubleshooting
Monitoring and Operations
CloudWatch Metrics
Key metrics to monitor:
# CPU utilization (triggers autoscaling)
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=AutoScalingGroupName,Value=<asg-name> \
--start-time 2024-01-01T00:00:00Z \
--end-time 2024-01-02T00:00:00Z \
--period 3600 \
--statistics Average
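The timestamps in the command above are fixed placeholders. As a minimal sketch, assuming GNU date (Linux) and a hypothetical helper name asg_cpu_last_hours, the same query can be wrapped so the window always covers the most recent hours:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the deployment): query average hourly
# CPU utilization for an Auto Scaling group over the last N hours.
# Assumes GNU date; BSD/macOS date uses different flags for relative times.
asg_cpu_last_hours() {
  local asg_name="$1" hours="${2:-24}"
  local start_time end_time
  end_time="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  start_time="$(date -u -d "${hours} hours ago" +%Y-%m-%dT%H:%M:%SZ)"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=AutoScalingGroupName,Value=${asg_name}" \
    --start-time "${start_time}" \
    --end-time "${end_time}" \
    --period 3600 \
    --statistics Average
}

# Usage: asg_cpu_last_hours <asg-name> 6
```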
# Network throughput
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name NetworkIn \
--dimensions Name=AutoScalingGroupName,Value=<asg-name> \
--start-time 2024-01-01T00:00:00Z \
--end-time 2024-01-02T00:00:00Z \
--period 3600 \
--statistics Sum
Lambda Function Logs
Monitor license assignment and lifecycle events:
# Stream Lambda logs
aws logs tail /aws/lambda/<function-name> --follow
# Search for errors
aws logs filter-log-events \
--log-group-name /aws/lambda/<function-name> \
--filter-pattern "ERROR"
# Search for license assignments
aws logs filter-log-events \
--log-group-name /aws/lambda/<function-name> \
--filter-pattern "license"
Auto Scaling Group Activity
# View scaling activities
aws autoscaling describe-scaling-activities \
--auto-scaling-group-name <asg-name> \
--max-records 20
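When the activity list is long, it can help to show only the entries that did not complete. The following is a sketch, assuming a hypothetical helper name failed_scaling_activities; it relies on the CLI's client-side --query filter and the StatusCode field returned by describe-scaling-activities:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list recent scaling activities whose status is not
# "Successful" (for example Failed, Cancelled, or still in progress).
failed_scaling_activities() {
  local asg_name="$1" max_records="${2:-20}"
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "${asg_name}" \
    --max-records "${max_records}" \
    --query "Activities[?StatusCode!='Successful'].[StartTime,StatusCode,StatusMessage]" \
    --output table
}

# Usage: failed_scaling_activities <asg-name>
```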
# View current capacity
aws autoscaling describe-auto-scaling-groups \
--auto-scaling-group-names <asg-name> \
--query 'AutoScalingGroups[0].[MinSize,DesiredCapacity,MaxSize]'
Troubleshooting
Issue: Instances Launch But Don’t Get Licensed
Symptoms:
- Instances running but showing unlicensed
- Throughput limited to 1 Mbps
- FortiGuard services not working
Causes and Solutions:
For BYOL:
Check license files exist in directory:
ls -la asg_license/
Check S3 bucket has licenses uploaded:
aws s3 ls s3://<bucket-name>/licenses/
Check Lambda CloudWatch logs for errors:
aws logs tail /aws/lambda/<function-name> --follow | grep -i error
Verify DynamoDB table has available licenses:
aws dynamodb scan --table-name <table-name>
For FortiFlex:
- Check Lambda CloudWatch logs for API errors
- Verify FortiFlex credentials are correct
- Check point balance in FortiFlex portal
- Verify configuration ID matches instance CPU count
- Check entitlements created in FortiFlex portal
For PAYG:
- Verify AWS Marketplace subscription is active
- Check instance profile has correct permissions
- Verify internet connectivity from FortiGate
Issue: Cannot Access FortiGate GUI
Symptoms:
- Timeout when accessing FortiGate IP
- Connection refused
Solutions:
Verify instance is running:
aws ec2 describe-instances --instance-ids <instance-id>
Check security groups allow your IP:
aws ec2 describe-security-groups --group-ids <sg-id>
Verify you’re using correct port (default 443):
https://<fortigate-ip>:443
Try alternate access methods:
# SSH to check if instance is responsive
ssh -i ~/.ssh/keypair.pem admin@<fortigate-ip>
# Check system status
get system status
If using dedicated management VPC:
- Ensure you’re accessing via correct IP (management interface)
- Check VPC peering or TGW attachment is working
- Verify route tables allow return traffic
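The two symptoms above usually point in different directions: a timeout tends to mean packets are being dropped on the way (security group, routing), while a refused connection means the host answered but nothing is listening on that port. As a rough sketch, assuming bash with /dev/tcp support and the coreutils timeout command, a hypothetical probe_port helper can tell them apart from a workstation:

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a TCP port as open, timeout, or refused.
# Requires bash (for /dev/tcp) and coreutils `timeout`.
probe_port() {
  local host="$1" port="$2" wait_s="${3:-5}"
  local rc=0
  timeout "${wait_s}" bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null || rc=$?
  case "${rc}" in
    0)   echo "open" ;;      # TCP handshake completed
    124) echo "timeout" ;;   # no reply within wait_s seconds (likely filtered)
    *)   echo "refused" ;;   # any other failure, most commonly connection refused
  esac
}

# Usage:
#   probe_port <fortigate-ip> 443
#   probe_port <fortigate-ip> 22
```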
Issue: Traffic Not Flowing Through FortiGate
Symptoms:
- No traffic visible in FortiGate logs
- Connectivity tests bypass FortiGate
- Sessions not appearing on FortiGate
Solutions:
Verify TGW routing (if using TGW):
# Check TGW route tables
aws ec2 describe-transit-gateway-route-tables \
--filters "Name=transit-gateway-id,Values=<tgw-id>"
# Verify routes point to inspection VPC attachment
aws ec2 search-transit-gateway-routes \
--transit-gateway-route-table-id <spoke-rt-id> \
--filters "Name=state,Values=active"
Check GWLB health checks:
aws elbv2 describe-target-health \
--target-group-arn <gwlb-target-group-arn>
Verify FortiGate firewall policies:
# SSH to FortiGate
ssh admin@<fortigate-ip>
# Check policies
get firewall policy
# Enable debug
diagnose debug flow trace start 10
diagnose debug enable
# Generate traffic and watch logs
Check spoke VPC route tables (for distributed architecture):
# Verify routes point to GWLB endpoints
aws ec2 describe-route-tables \
--filters "Name=vpc-id,Values=<spoke-vpc-id>"
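For the GWLB health check step, the full describe-target-health output can be noisy when many instances are registered. A minimal sketch, assuming a hypothetical helper name unhealthy_gwlb_targets, that prints only targets whose state is not healthy:

```shell
#!/usr/bin/env bash
# Hypothetical helper: show GWLB targets that are not in the "healthy" state,
# with the state and reason reported by the target group.
unhealthy_gwlb_targets() {
  local target_group_arn="$1"
  aws elbv2 describe-target-health \
    --target-group-arn "${target_group_arn}" \
    --query "TargetHealthDescriptions[?TargetHealth.State!='healthy'].[Target.Id,TargetHealth.State,TargetHealth.Reason]" \
    --output table
}

# Usage: unhealthy_gwlb_targets <gwlb-target-group-arn>
```

An empty result suggests every registered FortiGate is passing health checks, in which case routing or firewall policy is the more likely cause.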
Issue: Primary Election Issues
Symptoms:
- No primary instance elected
- Multiple instances think they’re primary
- HA sync not working
Solutions:
Check Lambda logs for election logic:
aws logs tail /aws/lambda/<function-name> --follow | grep -i primary
Verify enable_fgt_system_autoscale = true:
# On FortiGate
get system auto-scale
Check for network connectivity between instances:
# From one FortiGate, ping another
execute ping <other-fortigate-private-ip>
Manually verify auto-scale configuration:
# SSH to FortiGate
ssh admin@<fortigate-ip>
# Check auto-scale config
show system auto-scale
# Should show:
# set status enable
# set role primary (or secondary)
# set sync-interface "port1"
# set psksecret "..."
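To spot the "no primary" or "multiple primaries" symptoms quickly, the role line can be pulled out of each instance's auto-scale config. This is a sketch only: autoscale_role is a hypothetical helper, and it assumes the config output contains a "set role ..." line as shown above and that the FortiGate accepts a command passed over SSH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `show system auto-scale` output on stdin and
# print the configured role (primary or secondary).
# Carriage returns are stripped because SSH output may use CRLF line endings.
autoscale_role() {
  tr -d '\r' | awk '$1 == "set" && $2 == "role" { print $3 }'
}

# Usage (tally roles across instances; one primary is expected):
#   for ip in <fortigate-ip-1> <fortigate-ip-2>; do
#     ssh admin@"$ip" 'show system auto-scale' | autoscale_role
#   done | sort | uniq -c
```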
Issue: FortiManager Integration Not Working
Symptoms:
- FortiGate doesn’t appear in FortiManager device list
- Device shows as unauthorized but can’t authorize
- Connection errors in FortiManager
Solutions:
Verify FortiManager 7.6.3+ VM recognition enabled:
# On FortiManager CLI
show system global | grep fgfm-allow-vm
# Should show: set fgfm-allow-vm enable
Check network connectivity:
# From FortiGate
execute ping <fortimanager-ip>
# Check FortiManager reachability
diagnose debug application fgfmd -1
diagnose debug enable
Verify central-management config:
# On FortiGate
show system central-management
# Should show:
# set type fortimanager
# set fmg <fortimanager-ip>
# set serial-number <fmgr-sn>
Check FortiManager logs:
# On FortiManager CLI
diagnose debug application fgfmd -1
diagnose debug enable
# Watch for connection attempts from FortiGate
Verify only primary instance has central-management config:
# On primary: Should have config
show system central-management
# On secondary: Should NOT have config (or be blocked by vdom-exception)
show system vdom-exception