Self Troubleshooting
How to check logs
The first step in any troubleshooting is to look at the container logs. Here are the relevant commands:
QoS Agent:
docker compose -f lti_qos-agent.yml logs
Reflector:
docker compose -f lti_reflector.yml logs
Analyzer (run from inside the lti_analyzer directory):
docker logs lti_analyzer_kafka_1
docker logs lti_analyzer_influxdb_1
docker logs lti_analyzer_influx-writer_1
No data shown in the dashboard
If the dashboard at http://YOUR_ANALYZER_HOST_IP:12021 is accessible but showing no data, go through the following checklist.
1. Are the analyzer containers running?
From the lti_analyzer directory, check that Kafka, InfluxDB, influx-writer and Grafana containers are up. Check logs for errors.
2. Can the agent reach the analyzer?
The agent connects to the analyzer via the msgbus.latence.ca extra host entry in lti_qos-agent.yml. Make sure the IP set there is the correct IP of the analyzer host, and that port 12092 (Kafka) is open and reachable from the agent host.
To confirm traffic is actually reaching Kafka, run this on the analyzer host (replace docker0 with your bridge interface if different):
sudo tcpdump -i docker0 "dst port 12092" -n -c 20
If packets appear, the agent is successfully sending data. If nothing shows up within a few seconds, the agent cannot reach the analyzer. You can also run this on the agent host to verify it is sending:
# Replace ANALYZER_IP with the actual IP of your analyzer host
sudo tcpdump -i docker0 "dst ANALYZER_IP and dst port 12092" -n -c 20
Note on the network interface: The interface to use with tcpdump depends on how your Docker containers are configured. In bridge mode (the default), use docker0. In host network mode, use your main network interface, which you can find with ip route | grep default.
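If you are unsure which interface carries the traffic, the default-route interface can be printed directly. This is a sketch that parses standard iproute2 output:

```shell
# Print the interface that carries the default route (use this with
# tcpdump -i when the containers run in host network mode).
ip route | awk '/^default/ {print $5; exit}'
```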
3. Is the reflector IP correct?
Verify that LTI_reflector in lti_qos-agent.yml points to the correct IP or hostname of the reflector host.
4. Are the reflector ports open?
The following ports must be open on the reflector host depending on which protocols you are running:
| Protocol | Port(s) |
|---|---|
| HTTP | TCP 12080 |
| HTTPS | TCP 12443 |
| TCP | TCP 12023 |
| UDP | UDP 12024 |
| TWAMP | TCP 12862, UDP 12800–12819 |
| Traffic capacity | TCP/UDP 12501 |
| LIFBE | TCP/UDP 12550 |
| Packet Loss | TCP/UDP 12555 |
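As a complement to tcpdump, the TCP ports from the table can be probed from the agent host with a plain bash connect test. This is a sketch: REFLECTOR_IP is a placeholder, and this method cannot verify the UDP ports.

```shell
# REFLECTOR_IP is a placeholder for your reflector host's address.
# Probes each TCP port from the table above; UDP ports cannot be
# checked with a connect test.
for port in 12080 12443 12023 12862 12501 12550 12555; do
  if timeout 3 bash -c "echo > /dev/tcp/REFLECTOR_IP/$port" 2>/dev/null; then
    echo "TCP $port reachable"
  else
    echo "TCP $port blocked or closed"
  fi
done
```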
5. Is the license valid and does it cover the number of agents running?
Check that LTI_license_key is correct in both lti_qos-agent.yml and lti_reflector.yml. If you are running more agents than your license allows, some agents may not be able to send data.
6. Do all agents have unique IDs?
Each agent must have a distinct integer value for LTI_agent_id. Two agents sharing the same ID will cause data conflicts.
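A quick way to spot duplicate IDs is to compare the configs side by side. In this sketch, agent1.yml and agent2.yml are placeholders for the lti_qos-agent.yml files collected from each agent host:

```shell
# agent1.yml / agent2.yml stand in for the lti_qos-agent.yml files
# gathered from each agent host; prints any LTI_agent_id used twice.
grep -h "LTI_agent_id=" agent1.yml agent2.yml | tr -d ' ' | sort | uniq -d
```

Any line printed is an ID shared by more than one agent and must be changed.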
Some protocols are missing from the dashboard
Are those protocols disabled in the agent config?
A protocol is disabled by setting its interval to -1 in lti_qos-agent.yml. For example, LTI_iperf3_session_interval=-1 disables iPerf3. Check that the protocols you expect are not set to -1.
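To list every protocol currently disabled this way, a simple grep works (a sketch, assuming the lti_qos-agent.yml file name used throughout this guide):

```shell
# Print every protocol whose interval is set to -1 (disabled).
grep -n "interval=-1" lti_qos-agent.yml || echo "No protocols are disabled"
```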
Are the corresponding reflector ports open?
Each protocol requires specific ports to be open on the reflector host (see table above). If a port is blocked, the measurement for that protocol will fail silently.
You can verify whether traffic is actually being sent and received for each protocol. Run these on the agent host (replace REFLECTOR_IP and use the correct interface):
# Verify outgoing traffic to the reflector per protocol
sudo tcpdump -i docker0 "dst REFLECTOR_IP and dst port 12080" -n -c 20 # HTTP
sudo tcpdump -i docker0 "dst REFLECTOR_IP and dst port 12443" -n -c 20 # HTTPS
sudo tcpdump -i docker0 "dst REFLECTOR_IP and dst port 12023" -n -c 20 # TCP
sudo tcpdump -i docker0 "dst REFLECTOR_IP and dst port 12024" -n -c 20 # UDP
sudo tcpdump -i docker0 "dst REFLECTOR_IP and dst port 12862" -n -c 20 # TWAMP control
sudo tcpdump -i docker0 "dst REFLECTOR_IP and dst portrange 12800-12819" -n -c 20 # TWAMP data
# Verify responses are coming back from the reflector
sudo tcpdump -i docker0 "src REFLECTOR_IP" -n -c 20
You can also check from the reflector host that it is receiving the traffic:
# Replace INTERFACE with the correct interface on the reflector host
sudo tcpdump -i INTERFACE "dst port 12080 or dst port 12443 or dst port 12023 or dst port 12024" -n -c 20
sudo tcpdump -i INTERFACE "dst portrange 12800-12819" -n -c 20 # TWAMP data ports
If traffic arrives at the reflector but no responses come back to the agent, the issue is likely on the reflector side (container not running, port not published). If traffic never arrives at the reflector at all, the issue is network-level (firewall, routing, wrong IP).
Multiple network interfaces causing routing issues
If the VM running the reflector or analyzer has multiple network interfaces, the routing table may cause asymmetric routing: traffic arrives on one interface but the response is sent back out through a different one. The agent never receives the reply, so the measurement fails or times out.
To check whether this is happening, run tcpdump on each interface of the reflector or analyzer host and see which one is actually receiving the incoming traffic:
# List available interfaces
ip link show
# Monitor each interface for incoming measurement traffic
sudo tcpdump -i eth0 "dst port 12080 or dst port 12023 or dst port 12862" -n -c 10
sudo tcpdump -i eth1 "dst port 12080 or dst port 12023 or dst port 12862" -n -c 10
# Repeat for each interface
If traffic is arriving on eth1 but the default route sends replies out through eth0, replies will never reach the agent. The fix is to add a policy routing rule that forces replies to go out through the same interface the request came in on. This is typically done with ip rule and ip route on Linux.
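As a sketch of such a rule set (the interface name, addresses and table number below are placeholders, not values from your deployment):

```shell
# Placeholders: eth1 is the interface receiving the measurement traffic,
# 192.0.2.10 is this host's address on eth1, 192.0.2.1 is eth1's gateway.
# Route any packet sourced from eth1's address through table 100...
sudo ip rule add from 192.0.2.10 table 100
# ...and make table 100 send it back out through eth1's own gateway.
sudo ip route add default via 192.0.2.1 dev eth1 table 100
```

Note that ip rule and ip route changes do not survive a reboot; persist them through your distribution's network configuration once they are confirmed to work.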
iPerf3 result is 0 or shows an error
iPerf3 transfers large amounts of data in a single test, which means it is sensitive to MTU limitations. If the path between the agent and the reflector has a lower MTU than expected, large packets will be dropped or fragmented and the test will return 0 or fail entirely while other protocols (which use smaller packets) continue to work normally.
To check the effective MTU on the path, run a ping with the "do not fragment" flag and progressively larger packet sizes from the agent host:
# Try increasing sizes until packets start failing (-M do = don't fragment, -s = payload size)
ping -M do -s 1400 REFLECTOR_IP -c 3
ping -M do -s 1450 REFLECTOR_IP -c 3
ping -M do -s 1472 REFLECTOR_IP -c 3
The largest size that succeeds (plus 28 bytes for IP+ICMP headers) is the effective MTU on that path. If it is below 1500, there is an MTU restriction somewhere in the network.
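The arithmetic is simple enough to check by hand. For example, assuming 1372 is the largest payload size that succeeded:

```shell
# Largest -s payload that succeeded with ping -M do:
PAYLOAD=1372
# Add the 20-byte IPv4 header and the 8-byte ICMP header.
MTU=$((PAYLOAD + 20 + 8))
echo "Effective path MTU: $MTU"   # 1372 + 28 = 1400
```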
You can work around this by lowering the iPerf3 buffer length in lti_qos-agent.yml to fit within the effective MTU:
- LTI_iperf3_buffer_length=1200
Metadata is not up to date in the dashboard
Metadata is refreshed every week by default, and on each restart of the QoS Agent. To force an update, restart the agent container:
# Stop the container
docker compose -f lti_qos-agent.yml down
# Start the container
docker compose -f lti_qos-agent.yml up -d
No GPS/Radio data displayed
GPS and radio data are only available when the running agent is one of these two types:
- Mobile app agent
- Cradlepoint agent using the lti-cradlepoint-data container
Mobile app agent
Make sure the app is configured to send data to your own analyzer:
- Open the app Settings
- Fill in the LTI license key field with your license key
- Set the InfluxDB URL to your analyzer's address on port 12086: http://YOUR_ANALYZER_IP:12086/
- Press Save
Without the correct InfluxDB URL, the app will not send data to your analyzer and nothing will appear in the dashboard.
On Android, also verify that Foreground Service is enabled in the app settings. When disabled, the app may stop sending measurements as soon as it is minimized or the screen is locked.
If the app is crashing on Android:
1. Go to Settings > Apps > MobileLatency > Force stop
2. Go to Storage & cache > Clear cache
3. Restart the app
Cradlepoint agent
GPS and radio data require the contextual agent configuration, which includes both the lti-cradlepoint-data container and the qos-agent container in the same compose file. The standard single-container config does not collect GPS or radio data.
Check the following in your Cradlepoint compose config:
- You are using the qos-agent:cradlepoint image, not qos-agent:latest
- The lti-cradlepoint-data container is present in the compose file
- LTI_cradlepointapi_ip is set to lti-cradlepoint-data
- Both containers are on the same network (e.g. cradle-network)
- LTI_agent_id is the same in both containers
Also verify that GPS is enabled on the Cradlepoint device itself:
- In NetCloud Manager, go to DEVICES > Configuration > Edit > SYSTEM > GPS
- Check the Enable GPS option
Containers won't start
Is the license key correct?
Both the agent and the reflector require a valid LTI_license_key. An invalid or expired key will prevent containers from starting.
Are all required parameters set?
Check that no REPLACE_BY_* placeholder values remain in your config files. Required parameters for the agent are: LTI_agent_id, LTI_customer_id, LTI_license_key, LTI_reflector, and the msgbus.latence.ca IP. Required parameters for the reflector are: LTI_reflector_id and LTI_license_key.
LTI_agent_id, LTI_customer_id, and LTI_reflector_id must all be integers.
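One quick sanity check is to grep for leftover placeholders (a sketch, assuming the compose file names used throughout this guide):

```shell
# File names match the ones used throughout this guide.
grep -Hn "REPLACE_BY_" lti_qos-agent.yml lti_reflector.yml || echo "No placeholders left"
```

Any line printed still contains a placeholder and must be filled in before the containers will start.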
Can you pull the images from the registry?
If docker compose pull fails, verify that your host can reach registry.latence.ca.
Are you using Docker v2?
The correct command is docker compose (with a space, no dash). Using docker-compose (v1) may cause issues.
AI Features
Chatbot shows a "Bad Gateway" error
The chatbot depends on three things all being operational at the same time:
- Internet connectivity — the chatbot uses OpenAI and will fail without outbound internet access from the analyzer host.
- The MCP container — check that it is running and healthy.
- The LatenceTech API container — check that it is running and healthy.
Check the status of the relevant containers from the lti_analyzer directory:
docker ps
docker logs lti_analyzer_latencetech_mcp_1
docker logs lti_analyzer_latencetech_api_1
If a container is down, bring it back up:
docker compose up -d
If the containers are running but the chatbot still fails, confirm the analyzer host has outbound internet access:
curl -s https://api.openai.com
Reports are not generating
The report feature also uses OpenAI. If the analyzer host does not have outbound internet access, report generation will fail. Confirm connectivity the same way:
curl -s https://api.openai.com
Upgrading the analyzer
Before installing a new version of the analyzer, you must remove the downloaded and generated files from the previous installation to avoid incompatibilities. Consider installing each version in a separate directory.