What do “-2”, “-1”, and “0” agent response codes mean?
The -2, -1, and 0 response codes are error codes applied to requests that weren't processed correctly. There are a few reasons why this can happen, but they tend to fall into two major categories:
- The post/response couldn’t be matched to the request
- The module timed out waiting for a response from the agent
Request and response mismatch
Error response codes can occur when a post/response can't be matched to an actual request. This is typically the result of NGINX redirecting before the request is passed to the Signal Sciences module.
Specific server response codes
The following server response codes cause NGINX to skip the request-processing phases that normally run. As a result, NGINX finishes processing the request without passing it to the Signal Sciences module:
- 400 (Bad Request)
- 405 (Not Allowed)
- 408 (Request Timeout)
- 413 (Request Entity Too Large)
- 414 (Request URI Too Large)
- 494 (Request Headers Too Large)
- 499 (Client Closed Request)
- 500 (Internal Server Error)
- 501 (Not Implemented)
Look for NGINX return directives
Look for custom NGINX configurations or Lua code that could be redirecting the request. This is almost always due to return directives in an NGINX configuration file. There may be return directives that redirect specific pages to www, https, or a new URL. The return directive stops all processing, so the request is never passed to the Signal Sciences module. For example:
location /oldurl {
    return 302 https://example.com/newurl/;
}
These directives need to be updated so the request is processed by our agent first. Calling rewrite_by_lua_block directly allows the Signal Sciences module to run first and then perform the redirect for NGINX:
location /oldurl {
    rewrite_by_lua_block {
        -- run the Signal Sciences module before redirecting
        sigsci.prerequest()
        -- then issue the redirect from Lua instead of the return directive
        return ngx.redirect("https://example.com/newurl/", 302)
    }
    #return 302 https://example.com/newurl/;
}
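With this change, ngx.redirect() sends the 302 status and Location header only after sigsci.prerequest() has run, so the request is still inspected by the agent before the redirect is returned.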
Agent restarted
Request and response mismatches can also be due to restarting the agent. If the agent is restarted after the request is processed, but before the response is processed, the agent will not see the response and cannot attribute it to the request, resulting in an error response code.
Module timing out
When the module receives a request, it sends it to the agent for processing and waits a set amount of time (typically 100ms) for the agent's decision (whether or not to block). If the agent doesn't process the request within that time, the module times out and defaults to failing open, allowing the request through. Requests that failed open in this way have error response codes applied to them.
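The exact mechanism depends on the module in use, but the fail-open pattern can be sketched in OpenResty-style Lua as follows. This is illustrative only: the socket path, payload, and "block" response format are assumptions, not the real module protocol.

    -- Illustrative sketch only; not the actual Signal Sciences module code.
    -- Intended to be called from a phase handler such as rewrite_by_lua_block.
    local function agent_allows_request(payload)
        local sock = ngx.socket.tcp()
        sock:settimeout(100)  -- wait at most 100ms for the agent's decision

        -- assumed socket path for illustration
        local ok, err = sock:connect("unix:/var/run/sigsci.sock")
        if not ok then
            return true  -- agent unreachable: fail open and allow the request
        end

        sock:send(payload)
        local decision, recv_err = sock:receive("*l")
        sock:close()

        if not decision then
            -- recv_err is "timeout" if the agent took longer than 100ms
            return true  -- fail open: allow the request through
        end
        return decision ~= "block"
    end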
Module timeouts are most commonly due to insufficient resources allocated to the agent. This can be a result of host or agent misconfiguration, such as the agent being limited to too few CPU cores.
This can also be due to a high volume of traffic to the host. If requests are coming in faster than the agent can process them, subsequent requests will be queued for processing. If a queued request reaches the timeout limit, the module will fail open and allow the request through.
Similarly, certain rules designed specifically for penetration testing can take longer to run than traditional rules. This can result in requests queueing and timing out due to the increased processing time per request.
Look at Response Time
Requests that are timing out will have a high response time, exceeding the default timeout of 100ms.
Look at Agent metrics
Metrics for each agent can be viewed directly in the console:
- Click Agents in the navigation bar. The agents page appears.
- Click on the name of the agent. The agent metrics page appears.
Connections dropped
The “Connections dropped” metric indicates the number of requests that failed open and were allowed through because the connection between the module and the agent was dropped.
CPU usage
The CPU metrics can indicate the host is overloaded, preventing it from processing requests quickly enough.
- The “Host CPU” metric indicates the CPU percentage for all cores together (100% is maximum).
- The “Agent CPU” metric indicates the total CPU percentage for the number of cores in use by the agent. For example, if the agent were using 4 cores, then 400% would be the maximum.
CPU allocation and containerization
There are known issues with agents running within containers. It's possible for agents to have insufficient CPU to process requests due to a low number of CPUs (cores) allocated to the container by the cgroups feature.
We recommend giving the container running the agent at least 1 CPU. If both NGINX and the agent are running in the same container, allocate at least 1.5 CPUs.
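For example, when the agent runs under Docker, the CPU allocation can be set explicitly at container start. This is illustrative only: the image names are assumptions, and required agent configuration such as access keys and volume mounts is omitted.

    # Agent running in its own container: allocate at least 1 CPU
    docker run --cpus="1.0" signalsciences/sigsci-agent:latest

    # NGINX and the agent sharing one container: allocate at least 1.5 CPUs
    docker run --cpus="1.5" my-nginx-with-sigsci-agent:latest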
Further help
If you're unable to troubleshoot or resolve this issue yourself, generate an agent diagnostic package by running sigsci-agent-diag, which will output a .tar.gz archive with diagnostic information. Reach out to our support team to explain the issue in detail, including console links to the requests and agents affected, and provide the diagnostic .tar.gz archive.
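For example, on the affected host (whether sudo is required and where the archive is written depend on your installation):

    sudo sigsci-agent-diag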