Overview
For an overview of Gremlin’s security practices, see gremlin.com/security.
Gremlin’s “Failure as a Service” makes it easy to find weaknesses in your system before they cause problems for your customers. Gremlin is a simple, safe, and secure way to use Chaos Engineering to improve system resilience.
Gremlin attacks are generated on the control plane. Clients make outbound TLS calls to poll for attacks. Gremlin provides secure command execution, security auditing, multi-factor authentication (MFA), and SAML SSO.
Linux
Gremlin is installed on Linux with a least-privilege setup. When installed directly on the host, Gremlin does not require root privileges on any machine in your infrastructure. Gremlin operations run as a gremlin user created with default Linux privileges.
Gremlin needs the following Linux capabilities to perform the corresponding attacks.
capability | purpose |
---|---|
cap_sys_boot | used by shutdown to shut down (and optionally reboot) your hosts |
cap_sys_time | used by time travel to move your hosts forward and backward through time |
cap_net_admin | used by the network gremlins for all network attacks |
cap_kill | used by process killer to kill requested process(es) |
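As a rough way to audit the table above, the following sketch reports which of the four attack capabilities are absent from a capability string. It assumes the capabilities are granted as file capabilities readable with `getcap`, which this page does not state, and the binary path in the final comment is hypothetical. The function name `missing_caps` is introduced here for illustration only.

```shell
#!/bin/sh
# Capabilities listed in the table above.
REQUIRED_CAPS="cap_sys_boot cap_sys_time cap_net_admin cap_kill"

# missing_caps CAPSTRING
# Prints the required capabilities that do not appear in CAPSTRING,
# separated by spaces. Prints an empty line when none are missing.
missing_caps() (
  have="$1"
  missing=""
  for cap in $REQUIRED_CAPS; do
    case "$have" in
      *"$cap"*) ;;                     # capability present
      *) missing="$missing $cap" ;;    # capability absent
    esac
  done
  echo "${missing# }"
)

# Example with a literal capability string:
missing_caps "cap_kill,cap_net_admin+ep"

# Against a real binary (hypothetical path; adjust to your install):
# missing_caps "$(getcap /usr/sbin/gremlin)"
```

An empty result suggests that every attack type listed in the table has the capability it needs.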
When installed directly on the host and running as any non-root Linux user, Gremlin needs the following Linux capabilities to collect the process information required for Service Discovery features. The Service Discovery feature is opt-in for Host and Container based services. If you have not opted in, these capabilities are still granted to the gremlind process, but are not used.
capability | purpose |
---|---|
cap_dac_read_search | lets Gremlin read and search directories (list and traverse their contents) without access granted by the file owner/mode, in order to obtain sockets for process collection |
cap_sys_ptrace | used by process collection to read the absolute path of the process binary for host and container services, see proc(5) and ptrace(2) |
Windows
The Gremlin daemon is installed as a Windows service under the LocalSystem account. Attacks created from the user interface run as a child process of the daemon, so they too run under the LocalSystem account.
Gremlin configuration and work files are placed in the %ALLUSERSPROFILE%\Gremlin\Agent directory. By default, Windows places that location at C:\ProgramData\Gremlin\Agent. The Gremlin folders and files inherit permissions from the parent %ALLUSERSPROFILE% (C:\ProgramData) folder. Normally the permissions are read-write for administrators and read-only for all others. Those permissions prevent non-administrators from running attacks from the command line.
The Gremlin agent includes a kernel driver, which is used for latency attacks. Like the Gremlin daemon, the Gremlin kernel driver loads with the operating system.
Network Access
Gremlin never intercepts the content or payload of any network traffic. Gremlin only looks at routing information in order to apply its impact to the intended network traffic.
No Ingress ports required
All communication between the Gremlin daemon and our service is initiated by the Gremlin daemon. For this reason, the daemon must have an outbound network path to the Gremlin service (api.gremlin.com). Since all connections from the daemon are outbound, it is not necessary to open ports in your security groups or firewall to allow inbound communications.
Proxy support
The Gremlin client supports HTTP and HTTPS proxies via the environment variables http_proxy and https_proxy, which route HTTP and HTTPS traffic, respectively, through a proxy server. Values should take the form http[s]://[username:password@]address:port, such as export https_proxy=https://proxy.your_company.com:8080 or export https_proxy=https://your_username:your_password@proxy.your_company.com:8080.
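Because a malformed proxy value is easy to miss, a quick format check before exporting the variable may help. The sketch below tests a value against the http[s]://[username:password@]address:port form described above. The function name `is_valid_proxy` and the regular expression are illustrative assumptions, not part of Gremlin, and the check is deliberately loose: it does not verify that the proxy is reachable.

```shell
#!/bin/sh
# is_valid_proxy URL
# Succeeds when URL looks like http[s]://[username:password@]address:port.
is_valid_proxy() {
  printf '%s\n' "$1" |
    grep -Eq '^https?://([^:@/]+:[^@/]+@)?[^:@/]+:[0-9]+/?$'
}

for url in \
  "https://proxy.your_company.com:8080" \
  "https://your_username:your_password@proxy.your_company.com:8080" \
  "proxy.your_company.com:8080"
do
  if is_valid_proxy "$url"; then
    echo "ok:  $url"
  else
    echo "bad: $url"
  fi
done
```

The first two values follow the documented form; the third lacks a scheme and is reported as bad.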
For Linux, the Gremlin daemon, which is typically run as a service, requires these environment variables to be set in /etc/default/gremlind:

    echo "https_proxy=https://localhost:8888" | sudo tee -a /etc/default/gremlind
    sudo systemctl restart gremlind
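Appending with tee -a adds a new line each time it runs, so repeated runs leave duplicate assignments in the file. As one possible alternative, the sketch below replaces an existing assignment instead of appending another. The helper name `set_env_var` is hypothetical, and the demonstration writes to a scratch file rather than /etc/default/gremlind.

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE
# Ensures FILE contains exactly one KEY=VALUE line, replacing any
# earlier assignment of KEY instead of appending a duplicate.
set_env_var() (
  file="$1"
  key="$2"
  value="$3"
  tmp="$(mktemp)" || exit 1
  [ -f "$file" ] || : > "$file"
  # Keep every line that is not an assignment of KEY.
  # grep exits non-zero when it prints nothing, hence "|| true".
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its owner and mode.
  cat "$tmp" > "$file"
  rm -f "$tmp"
)

# Demonstration on a scratch file; running it twice leaves one line.
demo="$(mktemp)"
set_env_var "$demo" https_proxy "https://localhost:8888"
set_env_var "$demo" https_proxy "https://localhost:8888"
cat "$demo"
rm -f "$demo"
```

To apply this to the real file, the function would need to run as root, followed by the same sudo systemctl restart gremlind step shown above.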
For Windows, the environment variables can be set through Control Panel or with PowerShell commands.
Note that the Gremlin Service only functions via encrypted communication (HTTPS). Attempts to connect to it via unencrypted protocols (HTTP) are denied.
Secure command execution
The Gremlin daemon periodically communicates with our service over a TLS-protected channel which is authenticated using your organization's credentials. Once authenticated, the daemon sends heartbeat messages to the service and receives instructions from the service as responses to the heartbeat messages. If an attack has been scheduled, the daemon receives the instructions for executing that attack. Each instruction action is pre-defined within the daemon. Arbitrary instructions cannot be executed.
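The idea of pre-defined instructions can be illustrated with a general allowlist pattern. The sketch below is not Gremlin's implementation. It only shows how a receiver can map an incoming action name onto a fixed set of handlers and refuse anything else. The function name `run_instruction` and the action names are hypothetical, loosely borrowed from attack types mentioned on this page.

```shell
#!/bin/sh
# run_instruction ACTION
# Illustrative only: accepts an action when it is in a fixed allowlist
# and rejects every other input without evaluating it.
run_instruction() {
  case "$1" in
    cpu|latency|shutdown|time_travel|process_killer)
      echo "accepted: $1"
      ;;
    *)
      echo "rejected: $1" >&2
      return 1
      ;;
  esac
}

run_instruction cpu
run_instruction "echo pwned" || echo "arbitrary input was not executed"
```

The received string is only ever compared against known names and is never passed to eval or a shell, so unexpected input cannot run as a command.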
The service API only supports TLSv1.2 connections.
Security auditing
The Gremlin client, daemon, API, and website undergo regular security auditing, including penetration testing, by the external security auditor Bishop Fox. All identified vulnerabilities are remediated promptly and confirmed via remediation testing by our auditors. We can provide a Letter of Assessment from our auditors outlining our most recent audit findings and remediation results upon request.
Two Factor Authentication (MFA)
Gremlin offers Two Factor Authentication. See User Management.
SAML SSO
Gremlin supports SAML SSO. See User Management.
Docker (Linux)
User namespace isolation
Gremlin currently uses the host's file system to store temporary log and state information about attacks. When running Docker with user namespace remapping (userns-remap), Gremlin needs to assume the user namespace of the host. This applies both to the Gremlin daemon container and to running gremlin attack-container. Note that by assuming the user namespace of the host, we are creating an exception to namespace isolation for the Docker containers running Gremlin.
For running the Gremlin daemon in a container
    # docker run takes --userns=host; userns-remap is the daemon-level setting
    docker run -d \
        --userns=host \
        -e GREMLIN_BYPASS_USERNS_REMAP=1 \
        -v /var/lib/gremlin:/var/lib/gremlin \
        -v /var/log/gremlin:/var/log/gremlin \
        gremlin/gremlin daemon
For running the Gremlin daemon on the host
    echo "GREMLIN_BYPASS_USERNS_REMAP=1" | sudo tee -a /etc/default/gremlind
    sudo systemctl restart gremlind
For running a Gremlin attack from the command line
    export GREMLIN_BYPASS_USERNS_REMAP=1
    gremlin attack-container 38dbd9016529 cpu
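The bypass is only relevant when user namespace remapping is enabled, so it can help to check first. The sketch below inspects a security options string for a userns entry. It assumes that docker info lists name=userns among its security options when remapping is enabled, and the function name `userns_remap_enabled` is hypothetical.

```shell
#!/bin/sh
# userns_remap_enabled SECURITY_OPTIONS
# Succeeds when the given `docker info` security options mention userns.
userns_remap_enabled() {
  case "$1" in
    *name=userns*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with a captured value; on a live host you could pass
# "$(docker info --format '{{json .SecurityOptions}}')" instead.
opts='["name=apparmor","name=seccomp,profile=builtin","name=userns"]'
if userns_remap_enabled "$opts"; then
  echo "userns-remap is on: set GREMLIN_BYPASS_USERNS_REMAP=1"
else
  echo "userns-remap is off: no bypass needed"
fi
```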
SELinux and Gremlin in Containers
Gremlin performs some actions that are not allowed by the default SELinux process label for containers (container_t):
- Install and manipulate files on the host: /var/lib/gremlin, /var/log/gremlin
- Load kernel modules for manipulating network transactions during network attacks, such as net_sch
- Communicate with the container runtime socket (e.g. /var/run/docker.sock) to launch containers that carry out attacks
- Read files in /proc
Bypass container_t restrictions
It is possible to alleviate these restrictions on container_t by installing the following policy. However, this grants the privileges required by Gremlin to all other containers on your system that use container_t.
This is not ideal for secure environments. Gremlin recommends setting up a new process label type for Gremlin containers and granting privileges to this type only. You can find more information, including steps to configure this, at github.com/gremlin/selinux-policies.
If you wish to run Gremlin with the container_t process label and bypass its restrictions, add the following type enforcement rules to a new SELinux policy:
    # WARNING: This policy adds capabilities to all containers run under the default type: container_t
    # Gremlin needs access to /var/log/gremlin
    allow container_t container_log_t:dir { read write create getattr setattr unlink link add_name remove_name rmdir open };
    allow container_t container_log_t:file { read write create getattr setattr append unlink link open };
    allow container_t var_log_t:dir { write add_name };

    # Gremlin needs access to /var/run/docker.sock
    allow container_t container_runtime_t:unix_stream_socket connectto;

    # Gremlin needs access to /var/lib/gremlin
    allow container_t container_var_lib_t:dir { read write create getattr setattr unlink link add_name remove_name rmdir open };
    allow container_t container_var_lib_t:file { read write create getattr setattr append unlink link open };

    # Gremlin needs to load the kernel modules: net_sch
    allow container_t kernel_t:system module_request;
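Type enforcement rules like these are typically compiled into a policy module before they take effect. The sketch below prints the standard checkmodule, semodule_package, and semodule commands for a given module name so they can be reviewed before running. The module name `gremlin_container` and the helper `selinux_build_cmds` are hypothetical. The sketch also assumes the rules have been saved to a .te file that begins with a module declaration and a require block naming the types and classes used.

```shell
#!/bin/sh
# selinux_build_cmds NAME
# Prints the commands that compile NAME.te into NAME.pp and install it.
# NAME.te must start with a `module NAME 1.0;` line and a `require { ... }`
# block declaring the types and classes used by the rules.
selinux_build_cmds() (
  name="$1"
  echo "checkmodule -M -m -o ${name}.mod ${name}.te"
  echo "semodule_package -o ${name}.pp -m ${name}.mod"
  echo "semodule -i ${name}.pp"
)

# Review the commands first; run them as root once they look right.
selinux_build_cmds gremlin_container
```

Printing rather than executing keeps the step reviewable, which matters here because the policy widens what every container_t container may do.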