Kubernetes Ambassador

Updated Jun 20, 2021

Installing with Ambassador Edge Stack (AES)

This example illustrates the integration of Signal Sciences with Ambassador Edge Stack, a cloud native API gateway and ingress controller for Kubernetes, built upon Envoy proxy.

Integrating the Signal Sciences Agent

The Signal Sciences Agent can be installed as a sidecar in each pod or, for some specialized needs, as a service. The recommended way to install the agent in Kubernetes is to integrate the sigsci-agent into a pod as a sidecar, which means adding the sigsci-agent as an additional container in the Kubernetes pod. As a sidecar, the agent scales with the app/service in the pod rather than having to be scaled separately. In some situations, however, it may make more sense to install the sigsci-agent container as a service and scale it separately from the application. The sigsci-agent container can be configured in various ways depending on the installation type and the module being used.

Getting and Updating the Signal Sciences Agent Container Image

The recommended source for the image is the official signalsciences/sigsci-agent container image, available from the Signal Sciences account on Docker Hub. If you want to build your own image or need to customize the image, follow the sigsci-agent build instructions.

The documentation references the latest version of the agent with imagePullPolicy: Always, which pulls the latest agent version even if one already exists locally. This keeps the documentation from falling out of date and ensures that anyone following it does not end up with a stagnant agent. However, this may not be what you want if you need to keep installations consistent or on a specific version of the agent. In that case, you should specify a version. Images on Docker Hub are tagged with their versions, and a list of versions is available on Docker Hub.

Whether you choose to use the latest image or a specific version, there are a few items to consider to keep the agent up-to-date:

Using the latest Signal Sciences Container Image

If you do choose to use the latest image, consider how you will keep the agent up-to-date. If you have used the imagePullPolicy: Always option, the latest image is pulled on each startup and your agent will continue to get updates. For more consistency, you may instead choose to update the local cache manually by periodically forcing a pull instead of always pulling on startup.

docker pull signalsciences/sigsci-agent:latest

Then, use latest with imagePullPolicy: Never set in the configuration so that pulls are never done on startup (only manually as above):

- name: sigsci-agent
  image: signalsciences/sigsci-agent:latest
  imagePullPolicy: Never
  ...

Using a Versioned Signal Sciences Container Image

To use a specific version of the agent, replace latest with the agent version. You may also want to change the pull policy to imagePullPolicy: IfNotPresent in this case, as the image should not change.

- name: sigsci-agent
  image: signalsciences/sigsci-agent:4.1.0
  imagePullPolicy: IfNotPresent
  ...

This will pull the specified agent version and cache it locally. If you use this method, then it is recommended that you parameterize the agent image, using Helm or similar, so that it is easier to update the agent images later on.
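
One way to parameterize the image is sketched below, assuming you maintain your own Helm chart for the deployment. The sigsciAgent values keys are hypothetical names chosen for illustration and are not defined by any published chart.

```yaml
# values.yaml (hypothetical keys in your own chart)
sigsciAgent:
  image:
    repository: signalsciences/sigsci-agent
    tag: "4.1.0"
    pullPolicy: IfNotPresent
---
# Container entry in your chart's deployment template, reading those values
- name: sigsci-agent
  image: "{{ .Values.sigsciAgent.image.repository }}:{{ .Values.sigsciAgent.image.tag }}"
  imagePullPolicy: "{{ .Values.sigsciAgent.image.pullPolicy }}"
```

With a layout like this, updating the agent later means changing the tag value (or overriding it with --set at upgrade time) rather than editing each manifest.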

Using a Custom Tag for the Signal Sciences Container Image

It is also possible to apply a custom tag to a local agent image. To do this, pull the agent image (by version or use the latest), apply a custom tag, then use that custom tag in the configuration. You will want to specify imagePullPolicy: Never so that local images are only updated manually. You will need to periodically update the local image to keep the agent up-to-date.

For example:

docker pull signalsciences/sigsci-agent:latest
docker tag signalsciences/sigsci-agent:latest signalsciences/sigsci-agent:testing

Then use this image tag in the configuration:

- name: sigsci-agent
  image: signalsciences/sigsci-agent:testing
  imagePullPolicy: Never
  ...

Configuring the Signal Sciences Agent Container

Agent configuration is normally done via the environment. Most configuration options are available as environment variables. An environment variable name is the configuration option name in all capitals, prefixed with SIGSCI_, with any dashes (-) changed to underscores (_). For example, the max-procs option becomes the SIGSCI_MAX_PROCS environment variable. For more details on what options are available, see the Agent Configuration documentation.
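
As an illustration of the naming rule, a container's env section could set the max-procs option as follows. The value shown is only a placeholder; consult the Agent Configuration documentation for what the option accepts.

```yaml
env:
  # Option "max-procs": uppercase it, change dashes to underscores,
  # and add the SIGSCI_ prefix
  - name: SIGSCI_MAX_PROCS
    value: "4"   # placeholder value, not a recommendation
```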

The sigsci-agent container has a few required options that need to be configured:

  • Agent credentials (ID and secret key)
  • A volume to write temporary files

Agent Credentials

The sigsci-agent credentials are configured with two environment variables. These variables must be set or the agent will not start.

  • SIGSCI_ACCESSKEYID: Identifies the site that the agent is configured against
  • SIGSCI_SECRETACCESSKEY: The shared secret key to authenticate and authorize the agent

The credentials can be found by following these steps:

  1. Log into the Signal Sciences console
  2. Go to Agents
  3. On the Agents page click “View Agent Keys”
  4. Copy down the Access Key and Secret Key, as these will be used later

Because of the sensitive nature of these values, it is recommended to use the built-in secrets functionality of Kubernetes. With this configuration, the agent pulls the values from the secrets data instead of reading values hardcoded into the deployment configuration. This also makes agent credential rotation easier to manage, because the credentials only need to be changed in one place.

Secrets are used via environment variables by specifying the valueFrom option instead of the value option, as follows:

env:
 - name: SIGSCI_ACCESSKEYID
   valueFrom:
     secretKeyRef:
       # Update "my-site-name-here" to the correct site name or similar identifier
       name: sigsci.my-site-name-here
       key: accesskeyid
 - name: SIGSCI_SECRETACCESSKEY
   valueFrom:
     secretKeyRef:
       # Update "my-site-name-here" to the correct site name or similar identifier
       name: sigsci.my-site-name-here
       key: secretaccesskey

The secrets functionality keeps secrets in various stores in Kubernetes. This documentation uses the generic secret store in its examples; however, any equivalent store can be used. Agent secrets can be added to the generic secret store with something like the following YAML:

apiVersion: v1
kind: Secret
metadata:
  name: sigsci.my-site-name-here
stringData:
  accesskeyid: 12345678-abcd-1234-abcd-1234567890ab
  secretaccesskey: abcdefg_hijklmn_opqrstuvwxy_z0123456789ABCD

This can also be created from the command line with kubectl such as with the following:

kubectl create secret generic sigsci.my-site-name-here \
  --from-literal=accesskeyid=12345678-abcd-1234-abcd-1234567890ab \
  --from-literal=secretaccesskey=abcdefg_hijklmn_opqrstuvwxy_z0123456789ABCD

See the documentation on secrets for more details.

Agent Temporary Volume

For added security, it is recommended that the sigsci-agent container be executed with the root filesystem mounted read-only. The agent, however, still needs to write some temporary files, such as the socket file for RPC communication, and some periodically updated files, such as GeoIP data. To accomplish this with a read-only root filesystem, a writeable volume must be mounted. This writeable volume can also be shared to expose the RPC socket file to other containers in the same pod. The recommended way of creating a writeable volume is to use the built-in emptyDir volume type. Typically, this is configured in the volumes section of a deployment:

volumes:
 - name: sigsci-tmp
   emptyDir: {}

Containers would then typically mount this volume at /sigsci/tmp:

volumeMounts:
 - name: sigsci-tmp
   mountPath: /sigsci/tmp
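
When the volume is shared to expose the RPC socket to another container, both containers mount the same volume. The following is a minimal sketch of a pod spec fragment. The my-app container name and example/my-app image are hypothetical placeholders for your application and its Signal Sciences module, and the agent credentials are omitted for brevity.

```yaml
containers:
  # Hypothetical application container that needs the agent's RPC socket
  - name: my-app
    image: example/my-app:latest
    volumeMounts:
      - name: sigsci-tmp
        mountPath: /sigsci/tmp
  # Agent container with a read-only root filesystem
  # (credential env entries omitted here)
  - name: sigsci-agent
    image: signalsciences/sigsci-agent:latest
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
      - name: sigsci-tmp
        mountPath: /sigsci/tmp
volumes:
  # Writeable scratch volume shared by both containers
  - name: sigsci-tmp
    emptyDir: {}
```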

The default in the official agent container image is to have the temporary volume mounted at /sigsci/tmp. If this needs to be moved for the agent container, then the following agent configuration options should also be changed from their defaults to match the new mount location:

  • rpc-address defaults to /sigsci/tmp/sigsci.sock
  • shared-cache-dir defaults to /sigsci/tmp/cache

Integrating the Signal Sciences Agent into Ambassador Edge Stack (AES)

The Signal Sciences Agent (as of v4.5.0) can be integrated with Datawire’s Ambassador Edge Stack (AES). This integration uses the underlying Envoy integration built into the agent. The agent is configured with an Envoy gRPC Listener and through AES’s Filter, FilterPolicy, and LogService Kubernetes resources. Deployment and configuration is flexible. As such, this document is designed so the information can be applied to your own methods of deployment.

Note that the examples in the documentation will refer to installing the “latest” agent version, but this is only so that the documentation examples do not fall behind. Refer to the docs on getting and updating the agent for more details on agent versioning and how to keep the agent up-to-date.

Namespaces

By default AES is installed into the ambassador Kubernetes namespace. The agent and any applications running behind AES do not have to run in this namespace, but some care must be taken during configuration to use the correct namespaces and this documentation may differ from your configuration. The following namespaces are used in this documentation.

ambassador

  • Used for the ambassador install
  • Used for all ambassador resources (Filter, FilterPolicy, LogService, Mapping, etc.)
  • Used for the sigsci-agent when running as a sidecar

default

  • Used for all applications and services running behind AES
  • Used for the agent when run in standalone mode

Agent: Standalone or Sidecar

The agent can run as a standalone deployment/service or as a sidecar container within the AES pod. Either is fine, but running as a sidecar is much easier if you are using Helm as this is directly supported in the Helm values file. Running as a sidecar has the distinct advantage of scaling with AES, so this is the recommended route if you are using scaling via replica counts or autoscaling.

Installation

Installation involves two tasks: deploying the agent configured in gRPC mode, and configuring AES to send traffic to the agent.

Deploying the Agent

Deploying the agent is done by deploying the signalsciences/sigsci-agent container as a sidecar to AES or as a standalone service. The agent must be configured with its ID and Secret Key. This is typically done via a Kubernetes secret. One important point about secrets is that the secret must be in the same namespace as the pod using the secret. So, if you are running as a sidecar in the ambassador namespace, then the secret must also reside in that namespace. Refer to the agent credentials docs for more details.

Example Secret in the ambassador namespace:

apiVersion: v1
kind: Secret
metadata:
  # Edit `my-site-name-here`
  # and change the namespace to match the one
  # the agent is deployed to
  name: sigsci.my-site-name-here
  namespace: ambassador
stringData:
  # Edit these `my-agent-*-here` values:
  accesskeyid: my-agent-access-key-id-here
  secretaccesskey: my-agent-secret-access-key-here

Sidecar with Helm

Configuring AES with Helm is the easiest way to deploy as the Ambassador values file already has direct support for this without having to modify an existing deployment YAML file. Refer to the AES docs for installing with helm.

To install the agent as a sidecar, you should add the following to your custom values file, then install or upgrade AES with this values file. Refer to the Ambassador helm chart docs for a reference on the values file. This will add the container with the correct configuration to the AES pod as a sidecar.

Add to the values YAML file:

sidecarContainers:
- name: sigsci-agent
  image: signalsciences/sigsci-agent:latest
  imagePullPolicy: IfNotPresent
  # Configure the agent to use envoy gRPC on port 9999
  env:
  - name: SIGSCI_ACCESSKEYID
    valueFrom:
      secretKeyRef:
        # This secret needs to be added (see docs on sigsci secrets)
        name: sigsci.my-site-name-here
        key: accesskeyid
  - name: SIGSCI_SECRETACCESSKEY
    valueFrom:
      secretKeyRef:
        # This secret needs to be added (see docs on sigsci secrets)
        name: sigsci.my-site-name-here
        key: secretaccesskey
  # Configure the envoy to expect response data
  - name: SIGSCI_ENVOY_EXPECT_RESPONSE_DATA
    value: "1"
  # Configure the envoy gRPC listener address on any unused port
  - name: SIGSCI_ENVOY_GRPC_ADDRESS
    value: localhost:9999
  ports:
  - containerPort: 9999
    name: grpc
  securityContext:
    # The sigsci-agent container should run with its root filesystem read only
    readOnlyRootFilesystem: true
    # Ambassador uses user 8888 by default, but the sigsci-agent container
    # needs to run as sigsci(100)
    runAsUser: 100
  volumeMounts:
  - name: sigsci-tmp
    mountPath: /sigsci/tmp
volumes:
- name: sigsci-tmp
  emptyDir: {}

Example upgrading AES with helm:

helm upgrade ambassador \
  --values /path/to/ambassador-sigsci_values.yaml \
  --namespace ambassador \
  datawire/ambassador

Alternatively, use Helm to render the manifest files. This makes adding the agent sidecar much easier than manually editing the YAML files. The modified deployment YAML will be in:

<output-dir>/ambassador/templates/deployment.yaml

Example rendering the manifests with helm and applying the results:

helm template \
  --output-dir ./manifests \
  --values ./ambassador-sigsci_values.yaml \
  --namespace ambassador \
  datawire/ambassador
kubectl apply \
  --recursive \
  --filename ./manifests/ambassador

Sidecar Manually

Adding the agent as a sidecar to the AES pod manually is a bit more involved. It is instead recommended to use Helm to render the manifests (see the Helm section above).

Refer to the AES installation guide for more details. You will need to modify the aes.yaml file (download here: https://www.getambassador.io/yaml/aes.yaml) and append the container and volumes described above in the helm docs to the ambassador deployment resource. Refer to the Kubernetes and envoy documentation for more details.

This is the correct resource to modify:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    product: aes
  name: ambassador
  namespace: ambassador
…
  containers:
  …
  volumes:
  …

The container will need to be added to the containers section and the volume to the volumes section.
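
To make the placement concrete, the following sketch shows where the additions sit inside the Deployment. It assumes the standard Deployment layout, in which containers and volumes live under spec.template.spec. The existing entries from aes.yaml are represented only by comments and must be left as they are; the agent container is abbreviated, with its remaining fields taken from the sidecarContainers entry in the Helm section.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ambassador
  namespace: ambassador
spec:
  template:
    spec:
      containers:
        # (existing AES container entries remain here, unchanged)
        - name: sigsci-agent
          image: signalsciences/sigsci-agent:latest
          # env, ports, securityContext: same as the Helm sidecar entry
          volumeMounts:
            - name: sigsci-tmp
              mountPath: /sigsci/tmp
      volumes:
        # (existing volume entries remain here, unchanged)
        - name: sigsci-tmp
          emptyDir: {}
```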

Standalone

For a standalone agent, you just need to add a Deployment and a Service resource for the agent, as follows. Refer to the Kubernetes and Envoy documentation for more details.

Example SigSci Agent Service and Deployment:

apiVersion: v1
kind: Service
metadata:
  name: sigsci-agent
  # You may want it running in the ambassador namespace
  #namespace: ambassador
  labels:
    service: sigsci-agent
spec:
  type: ClusterIP
  ports:
  - name: sigsci-agent
    port: 9999
    targetPort: grpc
  selector:
    service: sigsci-agent
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sigsci-agent
  # You may want it running in the ambassador namespace
  #namespace: ambassador
spec:
  replicas: 1
  selector:
    matchLabels:
      service: sigsci-agent
  template:
    metadata:
      labels:
        service: sigsci-agent
    spec:
      containers:
      - name: sigsci-agent
        image: signalsciences/sigsci-agent:latest
        imagePullPolicy: IfNotPresent
        # Configure the agent to use envoy gRPC on port 9999
        env:
        - name: SIGSCI_ACCESSKEYID
          valueFrom:
            secretKeyRef:
              # This secret needs to be added (see docs on sigsci secrets)
              name: sigsci.my-site-name-here
              key: accesskeyid
        - name: SIGSCI_SECRETACCESSKEY
          valueFrom:
            secretKeyRef:
              # This secret needs to be added (see docs on sigsci secrets)
              name: sigsci.my-site-name-here
              key: secretaccesskey
        # Configure the envoy to expect response data
        - name: SIGSCI_ENVOY_EXPECT_RESPONSE_DATA
          value: "1"
        # Configure the envoy gRPC listener address on any unused port
        - name: SIGSCI_ENVOY_GRPC_ADDRESS
          value: 0.0.0.0:9999
        ports:
        - containerPort: 9999
          name: grpc
        securityContext:
          # The sigsci-agent should run with its root filesystem read only
          readOnlyRootFilesystem: true
        volumeMounts:
        - name: sigsci-tmp
          mountPath: /sigsci/tmp
      volumes:
      - name: sigsci-tmp
        emptyDir: {}

Sending Traffic to the Agent

Three Ambassador resources need to be configured for AES to send data to the agent. Refer to the envoy configuration docs for more detailed information on what each of these configures in the underlying Envoy install. The following documentation uses the example quote service included with Ambassador.

Filter

The Filter resource is used to add the external authorization (ext_authz) filter to Envoy. This will inspect incoming requests that match the FilterPolicy (see below).

The Signal Sciences agent requires AuthService to be defined in the Ambassador configuration; otherwise, the agent will not receive request data. AuthService should be enabled by default. If requests are not being received by the agent, check that AuthService is enabled by running kubectl get authservice.

One item to note here is the namespace that needs to be used for the auth_service configuration. This is the namespace that the agent is deployed to. For this documentation we have used the ambassador namespace for sidecar agents and default namespace for standalone agents. The format for the auth_service URL should be:

agent-hostname[.namespace]:agent-port

Examples:

  • Sidecar: auth_service: localhost:9999
  • Standalone: auth_service: sigsci-agent.default:9999
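
Following the same format, a standalone agent deployed to a namespace other than default changes only the namespace part of the address. As a sketch, assuming the sigsci-agent Service from the standalone example were deployed into the ambassador namespace instead, the Filter spec would contain:

```yaml
spec:
  External:
    # <service-name>.<namespace>:<port>
    auth_service: sigsci-agent.ambassador:9999
```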

Example Filter YAML:

# Filter defines an external auth filter to send to the agent
kind: Filter
apiVersion: getambassador.io/v2
metadata:
  name: sigsci
  namespace: ambassador
  annotations:
    getambassador.io/resource-changed: "true"
spec:
  External:
    # Sidecar agent:
    auth_service: localhost:9999
    # Standalone "sigsci-agent" service in "default" namespace:
    #auth_service: sigsci-agent.default:9999
    path_prefix: ""
    tls: false
    proto: grpc
    include_body:
      max_bytes: 8192
      allow_partial: true
    failure_mode_allow: true
    timeout_ms: 100000

FilterPolicy

The FilterPolicy resource maps which paths will be inspected by the agent. This can be mapped to all traffic (path: /*) or to subsets (path: /app1/*). However, each subset MUST map to the same agent. This is because the LogService does not have a path-based filter like the FilterPolicy does, so the LogService MUST route all matching response data to the same agent that handled the request.

Example routing all traffic to the agent:

# FilterPolicy defines which requests go to sigsci
kind: FilterPolicy
apiVersion: getambassador.io/v2
metadata:
  namespace: ambassador
  name: sigsci-policy
  annotations:
    getambassador.io/resource-changed: "true"
spec:
  rules:
    - host: "*"
      # All traffic to the sigsci-agent
      path: "/*"
      filters:
        # Use the same name as the Filter above
        - name: sigsci
          namespace: ambassador
          onDeny: break
          onAllow: continue
          ifRequestHeader: null
          arguments: {}

Routing subsets of traffic to the agent is possible with multiple rules. However, every rule must go to the same agent due to the limitation described above.

Example routing subsets of traffic to the agent:

# FilterPolicy defines which requests go to the sigsci-agent
kind: FilterPolicy
apiVersion: getambassador.io/v2
metadata:
  namespace: ambassador
  name: sigsci-policy
  annotations:
    getambassador.io/resource-changed: "true"
spec:
  rules:
    # /app1/* and /app2/* to the sigsci-agent
    - host: "*"
      path: "/app1/*"
      filters:
        # Use the same name as the Filter above
        - name: sigsci
          namespace: ambassador
          onDeny: break
          onAllow: continue
          ifRequestHeader: null
          arguments: {}
    - host: "*"
      path: "/app2/*"
      filters:
        # Use the same name as the Filter above
        - name: sigsci
          namespace: ambassador
          onDeny: break
          onAllow: continue
          ifRequestHeader: null
          arguments: {}

LogService

The LogService resource is used to add the gRPC Access Log Service to Envoy. This will inspect the outgoing response data and record this data if there was a signal detected. It is also used for anomaly signals such as HTTP_4XX, HTTP_5XX, etc.

One item to note here is the namespace that needs to be used for the service configuration. This is the namespace that the agent is deployed to. For this documentation we have used the ambassador namespace for sidecar agents and default namespace for standalone agents. The format for the service URL should be:

agent-hostname[.namespace]:agent-port

Examples:

  • Sidecar: service: localhost:9999
  • Standalone: service: sigsci-agent.default:9999

Example:

# Configure the access log gRPC service for the response
# NOTE: There is no policy equiv here, so all requests are sent
apiVersion: getambassador.io/v2
kind: LogService
metadata:
  namespace: ambassador
  name: sigsci-agent
spec:
  # Sidecar agent
  service: localhost:9999
  # Standalone "sigsci-agent" service in "default" namespace:
  #service: sigsci-agent.default:9999
  driver: http
  driver_config:
    additional_log_headers:
    ### Request headers:
    # Required:
    - header_name: "x-sigsci-request-id"
      during_request: true
      during_response: false
      during_trailer: false
    - header_name: "x-sigsci-waf-response"
      during_request: true
      during_response: false
      during_trailer: false
    # Recommended:
    - header_name: "accept"
      during_request: true
      during_response: false
      during_trailer: false
    - header_name: "date"
      during_request: false
      during_response: true
      during_trailer: true
    - header_name: "server"
      during_request: false
      during_response: true
      during_trailer: true
    ### Both request/response headers:
    # Recommended
    - header_name: "content-type"
      during_request: true
      during_response: true
      during_trailer: true
    - header_name: "content-length"
      during_request: true
      during_response: true
      during_trailer: true
  grpc: true