Test-Driven Security With Chef InSpec

Test-Driven Security

Test-driven security is the implementation of tests into the development process, and Chef InSpec is one tool that will help you get started with this process. These security tests are intended to define the security features required for a system to be production ready.

In this post, we will walk through the process of using test-driven security, with proscriptive security tests, using Chef InSpec.

Regression Testing Security

Regression testing is the testing of software to ensure that changes do not break existing behavior. As new features are added, or even as bugs are fixed, we want to verify that previously working behavior does not break and result in new bugs. Regression tests are added to keep quality from slipping as the software evolves. They prevent duplicate work and help ensure a better user experience.

Typically the process of adding a regression test is as follows:

  • Bug reported
  • Bug identified
  • Fix written
  • Regression test added to ensure that the problem doesn’t happen again

If the same bug happens more than once, then shame on us for not having done the work needed to prevent it from happening again.

Let’s reapply this thinking to security. Eventually, you’re going to have a security incident. Mistakes are made. Previously unknown issues are discovered. And sometimes, people just do things they shouldn’t do. After the incident has been resolved, why not add tests to reduce the likelihood that the same incident will happen again?

Chef InSpec to Bring Security and Ops Together

Increasingly, Ops teams want the ability to go as fast as their Dev teams. Just as a Development team may want to ship a new feature or service quickly, Ops teams want to ship their own new services and features, or adapt to changing needs, more quickly.

InSpec is Chef’s testing framework for specifying compliance, security, and policy requirements in code. By defining your requirements as code, testing for them can be easily integrated into your service development cycles and deployment pipeline.

To get started with InSpec, see Annie Hedgpeth’s tutorial. I’m more interested in demonstrating how Ops and Security teams can use it together to prevent repeats of incidents after they’ve become familiar with it and its language.

Some more mature Ops teams have already adopted InSpec for their system level integration testing needs. This allows an Ops team to develop and deploy with better confidence that their changes do not adversely impact existing behavior. For example, while working on changes to an Nginx Chef cookbook, the following InSpec policy would be useful to ensure that changes didn’t inadvertently break the ability for the service to successfully start up:

# nginx base
control 'base-1' do
  impact 1.0
  title 'nginx running and listening.'
  desc 'Ensure service is running and listening'

  describe service('nginx') do
    it { should be_running }
  end
  # Newer InSpec releases name this option `protocol`; `proto` is the deprecated spelling.
  describe host('127.0.0.1', port: 443, protocol: 'tcp') do
    it { should be_reachable }
  end
end

Now, how can we extend this to work with Security?

Just Tell Me What I Can’t Do

InSpec can be leveraged similarly for security policy compliance. We’re not talking about regulatory compliance, but rather, compliance with those internal policies that have been developed to ensure the security stance of an organization. But instead of taking a prescriptive approach to compliance, we will take a proscriptive approach. That is, Security should, where possible, focus on telling Dev and Ops less about how their systems should be configured and more about how they shouldn’t be configured.

Why approach security policy compliance this way? We want to change the process of ensuring the security of the environment from a gate to a guardrail. Rather than requiring people to interrupt their work to get permission to make changes and try new things, this approach empowers teams to make changes until they do something, often inadvertently, that we don’t want to happen. Making security less interruptive makes service owners more likely to come to Security when they need to instead of doing everything in their power to work around it.
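
To make the gate versus guardrail distinction concrete, here is a minimal plain-Ruby sketch. It is not InSpec code, and the port lists and method names are hypothetical, chosen only for illustration. The prescriptive check demands one exact configuration, while the proscriptive check only objects when a stated line is crossed.

```ruby
# Ports Security has explicitly ruled out (hypothetical list).
FORBIDDEN_PORTS = [21, 23].freeze

# Prescriptive check: the host must match one exact, pre-approved configuration.
def prescriptive_pass?(listening_ports, approved_ports)
  listening_ports.sort == approved_ports.sort
end

# Proscriptive check: any configuration passes unless it uses a forbidden port.
def proscriptive_violations(listening_ports)
  listening_ports & FORBIDDEN_PORTS
end

approved     = [22, 443]
with_metrics = [22, 443, 9100] # a team adds a metrics endpoint
with_telnet  = [22, 23, 443]   # someone enables telnet by mistake

puts prescriptive_pass?(with_metrics, approved)    # false: the gate blocks a harmless change
puts proscriptive_violations(with_metrics).inspect # []: the guardrail stays out of the way
puts proscriptive_violations(with_telnet).inspect  # [23]: the guardrail catches the mistake
```

The team adding the metrics endpoint never has to stop and ask permission, yet the telnet mistake is still caught.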

How would this process work? Let’s look at a repository where InSpec testing is built in.

Chef InSpec in Action

Let’s take a walk through Chef InSpec using this Chef repo:

https://github.com/threatstack/chef-repo-inspec-example

This Chef repo provides the ability to spin up a Vagrant host via test-kitchen that configures the host to run the threatstack-to-s3 service using Habitat. The repository has three cookbooks that configure the host: base, site-nginx, and svc-threatstack-to-s3.

To run test-kitchen and get a running and configured Vagrant instance, do the following:

$ kitchen converge threatstack-to-s3
-----> Starting Kitchen (v1.16.0)
-----> Creating ...
       Bringing machine 'default' up with 'virtualbox' provider...
       ==> default: Importing base box 'bento/centos-7.3'...

The running host should have some package repositories set up, packages installed, some UNIX groups set up, and of course, the Threat Stack agent installed and configured. Nginx is installed and running. Finally, the threatstack-to-s3 service is deployed and started via Habitat, and an Nginx virtual host is set up to proxy HTTPS traffic to the Habitat service. It’s a fairly typical microservice setup.

Now run the InSpec tests in the repository. The output below has been abbreviated for the sake of clarity:

$ kitchen verify threatstack-to-s3

Most tests pass, but a handful fail. Let’s walk through some of these tests and understand why they’re here and how they act as security guard rails instead of gates.

System Integration vs. Security Policy Compliance Tests

Let’s start by discussing the difference between our system integration tests and security policy compliance tests. For us, a system integration test describes the state we want a running system to be in. A security policy compliance test ensures we haven’t violated a security policy constraint. How does this work in practice? Let’s take a look at two sets of tests for the wheel group and its users.

The wheel group is the designated group in our environment that all admin users should be in. Membership in that group grants the user wide-ranging “sudo” permissions in our configuration. We have a set of tests to ensure that people we expect to be admins exist and are in the wheel group. This way we don’t deploy services that our hypothetical Ops team cannot access.

cookbooks/base/test/integration/inspec/controls/groups.rb

control 'users-1' do
  impact 0.6
  title 'Ensure admin users have been created'
  desc 'All admin users must be present on the system.'
  tag 'users', 'groups'

  admin_users.each do |u|
    describe user(u) do
      it { should exist }
      its('groups') { should include 'wheel' }
    end
  end
end

Every user we have listed as an admin must be a member of the wheel group, or this test fails.

But ask yourself how often you are asked to grant a subset of users elevated privileges. The request may, in fact, make perfect sense. It’s a new service, the team needs increased privileges in order to debug issues, and they can’t keep going through Ops to ask them to complete tasks. You oblige the request because it makes reasonable sense. You add some users to the svc-threatstack-to-s3 cookbook and make them members of the wheel group.

This is where the following test would fail:

cookbooks/base/test/integration/inspec/controls/compliance_groups.rb

control 'compliance-groups-1' do
  impact 1.0
  title 'Ensure only appropriate members of wheel group'
  desc '
    Ensure the wheel group only contains agreed upon members.
  '

  tag 'users', 'groups'
  ref 'Compliance: Host Access', url: 'https://wiki.example.com/security/compliance/ops/Host+Access+And+Permissions'

  describe users.where { groups.include?('wheel') && !admin_users.include?(username) } do
    its('entries.length') { should eq 0 }
  end
end

The test fails because Security and Ops have an agreed-upon security policy about who is allowed to be a member of the wheel group, and the most recent change puts a system out of compliance. We’re about to put a system into a state it should not be in.
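
The filter at the heart of that control can be sketched in plain Ruby, outside InSpec. The `User` struct, the usernames, and the `unapproved_members` helper below are all hypothetical stand-ins for the rows the `users` resource would return. The sketch only illustrates the logic and is not the repository's code.

```ruby
# Hypothetical stand-in for the rows InSpec's `users` resource would return.
User = Struct.new(:username, :groups)

# Returns usernames that are in `group` but not on the agreed-upon list.
def unapproved_members(users, group, approved)
  users.select { |u| u.groups.include?(group) && !approved.include?(u.username) }
       .map(&:username)
end

ADMIN_USERS = %w[alice bob].freeze # hypothetical agreed-upon admins

USERS = [
  User.new('alice', %w[alice wheel]),
  User.new('bob',   %w[bob wheel]),
  User.new('carol', %w[carol wheel]), # added by the service cookbook
  User.new('dave',  %w[dave])
].freeze

offenders = unapproved_members(USERS, 'wheel', ADMIN_USERS)
puts offenders.inspect # ["carol"]
```

An empty result corresponds to the control passing. Anything else is a wheel member that Security and Ops have not agreed on.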

What does one do now? Security is just doing their job of ensuring the integrity of the environment. As an Ops or Software engineer, go talk to Security and ask to work out additional elevated privileges for the time being. These InSpec tests shouldn’t be seen as trying to stop people from getting work done, but providing a notice that someone needs to be informed of a change to the security posture of a collection of hosts in the environment. And since the test is code, the approval process could even be handled entirely via the git PR process by requiring approval for the merge from Security.

I Don’t Know What to Do. Please Help?

When it comes to SSL, I know today to disable SSLv2 and SSLv3 due to their weaknesses. However, when it comes to protocol cipher suites to disable, I need to ask for help. InSpec is an excellent way for Security to insert their expertise in this area in an automated way. The following test is in the repository and it checks for weak and outdated ciphers:

cookbooks/site-nginx/test/integration/inspec/controls/compliance_ssl.rb

cipher_list = command('openssl ciphers').stdout.strip.split(':')

# Cipher families that the server must not accept.
disallowed_families = %w[DES MD5 PSK RC4 EXPORT NULL]

control 'compliance-ssl-2' do
  impact 0.7
  title 'Disallowed SSL/TLS ciphers'
  desc 'Test for weak or outdated SSL/TLS ciphers being enabled.'

  tag 'network'
  ref 'Compliance: Network Accessibility', url: 'https://wiki.example.com/security/compliance/ops/Network+Accessibility'

  # For every locally supported cipher in a disallowed family, attempt a
  # handshake with Nginx. A failed handshake (exit status 1) is the desired result.
  disallowed_families.each do |family|
    cipher_list.select { |x| x.include?(family) }.each do |c|
      describe command("openssl s_client -cipher #{c} -connect 127.0.0.1:443 < /dev/null") do
        its('exit_status') { should eq 1 }
      end
    end
  end
end

We start by getting the list of ciphers supported by “openssl” on the system. We then attempt to connect to Nginx, requesting each of the cipher suites from families we don’t want to allow.
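
The selection step can be tried on its own in plain Ruby. The sketch below hard-codes a made-up sample of `openssl ciphers` output so it runs without openssl installed, and `disallowed_ciphers` is a hypothetical helper name. Unlike the control, it lists each matching cipher once even when the cipher belongs to two families.

```ruby
# Families Security does not want offered, taken from the control above.
DISALLOWED_FAMILIES = %w[DES MD5 PSK RC4 EXPORT NULL].freeze

# Splits `openssl ciphers` style output and keeps suites from a disallowed family.
def disallowed_ciphers(openssl_output)
  openssl_output.strip.split(':').select do |cipher|
    DISALLOWED_FAMILIES.any? { |family| cipher.include?(family) }
  end
end

# Made-up sample output, hard-coded so the sketch runs without openssl.
SAMPLE_CIPHERS = "ECDHE-RSA-AES256-GCM-SHA384:DES-CBC3-SHA:RC4-MD5:AES128-SHA\n"

# Print the handshake command the control would run for each flagged cipher.
disallowed_ciphers(SAMPLE_CIPHERS).each do |cipher|
  puts "openssl s_client -cipher #{cipher} -connect 127.0.0.1:443 < /dev/null"
end
```

With this sample, only DES-CBC3-SHA and RC4-MD5 are selected for a handshake attempt.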

We pass all tests except for two of them:

  ×  compliance-ssl-2: Disallowed SSL/TLS ciphers (2 failed)
     ×  Command openssl s_client -cipher ECDHE-RSA-DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1

     expected: 1
          got: 0

     (compared using ==)

     ✔  Command openssl s_client -cipher ECDHE-ECDSA-DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher EDH-RSA-DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher EDH-DSS-DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher ECDH-RSA-DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher ECDH-ECDSA-DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ×  Command openssl s_client -cipher DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1

     expected: 1
          got: 0

     (compared using ==)

     ✔  Command openssl s_client -cipher PSK-3DES-EDE-CBC-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher KRB5-DES-CBC3-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher KRB5-DES-CBC3-MD5 -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher KRB5-IDEA-CBC-MD5 -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher KRB5-DES-CBC3-MD5 -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher RC4-MD5 -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher KRB5-RC4-MD5 -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher PSK-AES256-CBC-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher PSK-AES128-CBC-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher PSK-3DES-EDE-CBC-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher PSK-RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher ECDHE-RSA-RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher ECDHE-ECDSA-RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher ECDH-RSA-RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher ECDH-ECDSA-RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher RC4-MD5 -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher PSK-RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher KRB5-RC4-SHA -connect 127.0.0.1:443 < /dev/null exit_status should eq 1
     ✔  Command openssl s_client -cipher KRB5-RC4-MD5 -connect 127.0.0.1:443 < /dev/null exit_status should eq 1

With this test in place, I now know to ask Security for their expertise on what to do. I see numerous suggestions to disable DES-CBC3-SHA. But for ECDHE-RSA-DES-CBC3-SHA, I see mixed messages. Some say to disable it because of DES, and others say it’s still fine. From there I can update the Nginx configuration appropriately.

Capturing the Unexpected

Sometimes mistakes happen when you’re getting a service into production. InSpec tests are a great help in catching these mistakes. If you run kitchen verify, you’ll notice a failing test. I did not plan this failure when I started building this InSpec project; it is an actual mistake I made.

  ×  compliance-network-1: Ensure only allowed ports are network accessible (expected `Port 8080.listening?` to return false, got true)
     ×  Port 8080 should not be listening
     expected `Port 8080.listening?` to return false, got true

The test that produces this is here:

cookbooks/base/test/integration/inspec/controls/compliance_network.rb

# Check open ports
#

default_allowed_ports = attribute(
  'base_compliance_allowed_ports',
  description: 'List of allowed ports',
  default: []
)

service_allowed_ports = attribute(
  'service_compliance_allowed_ports',
  description: 'List of allowed ports for a service.',
  default: []
)

allowed_ports = default_allowed_ports + service_allowed_ports

control 'compliance-network-1' do
  impact 0.8
  title 'Ensure only allowed ports are network accessible'
  desc 'Only allowed ports should be network accessible.'

  tag 'network'
  ref 'Compliance: Network Accessibility', url: 'https://wiki.example.com/security/compliance/ops/Network+Accessibility'

  # Get list of non locally bound ports.
  # XXX: We should be checking IPv6.
  ports = command("netstat -ant -A inet | tail -n +3 | awk '{print $4}' | grep -v 127.0.0.1 | cut -d : -f 2").stdout.strip.lines.sort.uniq

  int_ports = ports.map { |i| i.to_i }
  (int_ports - allowed_ports).each do |p|
    describe port(p) do
      it { should_not be_listening }
    end
  end
end
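
The port arithmetic in this control can be sketched in plain Ruby, without InSpec or a live host. The netstat sample, the allow list, and the `exposed_ports` helper below are hypothetical. Two choices differ from the control and are assumptions of this sketch: it keeps only rows in the LISTEN state, and it splits on the last colon so that IPv6 rows such as `:::22` are handled too.

```ruby
# Sample `netstat -ant` output, hard-coded so the sketch runs anywhere.
SAMPLE_NETSTAT = <<~NETSTAT
  Active Internet connections (servers and established)
  Proto Recv-Q Send-Q Local Address           Foreign Address         State
  tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
  tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
  tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN
  tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
  tcp6       0      0 :::22                   :::*                    LISTEN
NETSTAT

# Returns the sorted, unique ports that listen on a non-loopback address.
def exposed_ports(netstat_output)
  netstat_output.lines.map(&:split)
                .select { |f| f[0].to_s.start_with?('tcp') && f[5] == 'LISTEN' }
                .map { |f| f[3] }                                   # local address column
                .reject { |addr| addr.start_with?('127.', '::1:') } # drop loopback binds
                .map { |addr| addr.rpartition(':').last.to_i }      # text after the last colon
                .uniq.sort
end

ALLOWED_PORTS = [22, 443].freeze # hypothetical allow list

violations = exposed_ports(SAMPLE_NETSTAT) - ALLOWED_PORTS
puts violations.inspect # [8080]
```

Each port left in `violations` is one the control would assert should not be listening.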

When running threatstack-to-s3 on a host via Chef Habitat, Nginx is listening for HTTPS on port 443 and proxies traffic over localhost to threatstack-to-s3 listening on port 8080. The test above failed because the service binds to all addresses on port 8080. This is a behavior I set early on in threatstack-to-s3 development before I started using Nginx. This seems like a small issue at first, but there’s a glaring potential hole.

I started proxying through Nginx because it could handle SSL for me. I also intended for Nginx to handle authentication and authorization for me. Why? Because writing secure authentication and authorization code is outside my expertise and better left to more experienced people. However, while Nginx is set up now, I’ve left an unauthenticated endpoint with no encryption available.

As teams move faster to adopt new technologies and ideas to adapt to changing needs, these sorts of mistakes are bound to happen. And that is where security compliance tests should be acting as guard rails.

Note: threatstack-to-s3 does not run in the Threat Stack product environment. It is an open source project intended to run on AWS Lambda by Threat Stack users and to act as demonstration code for handling Threat Stack webhooks.

Final Words . . .

As stated earlier, test-driven security is the implementation of tests into the development process, and Chef InSpec is one tool that will help you get started with this process. InSpec is Chef’s testing framework for specifying compliance, security, and policy requirements in code. By defining your requirements as code, testing for them can be easily integrated into your service development cycles and deployment pipeline.

If you have any comments, questions, or insights into encryption cipher suites, give me a shout at @tmclaughbos.
