Continuous configuration and compliance monitoring with AWS Config

Config splash

I wrote previously about inventorying your assets in AWS using an external open source tool.  An alternative to this approach is to use AWS Config.

This AWS service certainly has its imperfections (e.g. it doesn’t support all AWS resources) but it is easy to set up and can be quite useful too. When you first enable it, Config will analyse the resources in your account and make the summary available to you in a dashboard (example below). It’s a regional service, so you might want to enable it in all active regions.

Config

Config, however, doesn’t stop there. You can now use this snapshot as a reference point and track all changes to your resources on a timeline. It can be useful when you need to analyse historical records, demonstrate compliance or gain visibility in your change management practices. It can also notify you of any configuration changes if you set up SNS notifications.

Config rules allow you to continuously track compliance with various baselines. AWS provide quite a few out of the box and you can create your own to tailor to the specific environment you operate in. You have to pay separately for rules, so I encourage you to check out pricing first.

As with some other AWS services, you can aggregate the data in a single account. I recommend using the account used for security operations as a master. You will then need to establish a two-way handshake, inviting member accounts and authorising the master account to be able to consolidate the results.

Member

 


How to set up CloudTrail logging in AWS

I bet you already know that you should enable CloudTrail on your AWS accounts, if you haven’t yet. This service captures all the API activity taking place in your AWS account and stores it in an S3 bucket for that account by default. This means you would have to configure the logging and storage permissions for every AWS account your company has. If you are tasked with securing your cloud infrastructure, you will first need to establish how many accounts your organisation owns and how CloudTrail is configured for them. Additionally, you would want to have access to S3 buckets storing these logs in every account to be able to analyse them.

If this doesn’t sound complicated already, think of a potential error in permissions where logs can be deleted by an account administrator. Or situations where new accounts are created without your awareness and therefore not part of the overall logging pipeline. Luckily, these scenarios can be avoided if you are using the Organization Trail.

Create Trail

Your accounts have to be part of the same AWS Organization, of course. You also would need to have a separate account for security operations. Hopefully, this has been done already. If not, feel free to refer to my previous blogs on inventorying your assets and IAM fundamentals for further guidance on setting it up.

Establishing an Organization Trail not only allows you to collect, store and analyse logs centrally, it also ensures all new accounts created will have CloudTrail enabled and configured by default (and it cannot be turned off by child accounts).

Trails

Switch on Insights while you’re at it. This will simply the analysis down the line, alerting unusual API activity. Logging data events (for both S3 and Lambda) and integrating with CloudWatch Logs is also a good idea.

Where can all these logs be stored? The best destination (before archiving) is the S3 bucket in your account used for security operations, so that’s where it should be created.

S3 bucket

Enabling encryption and Object Lock is always a good idea. While encryption will help with confidentiality of your log data, Object Lock will ensure redundancy and prevent objects from accidental deletion. It requires versioning to be enabled and is best configured on bucket creation. Don’t forget to block public access!

S3 block public access

You will then must to use your organisational root account to set up Organization Trail, selecting the bucket you created in your operational security account as a destination (rather than creating a new bucket in your master account).

For this to work, you will need to set up appropriate permissions on that bucket. It is also advisable to set up access for child accounts to be able to read their own logs.

If you had other trails in your accounts previously, feel free to turn them off to avoid unnecessary duplication and save money. It’s best to give it a day for these trails to run in parallel though to ensure nothing is lost in transition. Keep your old S3 buckets used for collection in your accounts previously; you will need these logs too. You can transfer them to Glacier later on.

And that’s how you set up CloudTrail for centralised collection, storage and analysis..


How to audit your AWS environment

Cloudmapper

Let’s build on my previous blog on inventorying your AWS assets. I described how to use CloudMapper‘s collect command to gather metadata about your AWS accounts and report on resources used and potential security issues.

This open source tool can do more than that and it’s functionality is being continuously updated. Once the data on the accounts in scope is downloaded, various operations can be performed on it locally without the need to continuously query the accounts.

One of interesting use cases is to visualise your AWS environment in the browser. An example based on the test data of such a visualisation is at the top of this blog. You can apply various filters to reduce complexity which can be especially useful for larger environments.

Another piece of CloudMapper’s functionality is the ability to display trust relationships between accounts using the weboftrust command. Below is an example from Scott’s guidance on the use of this command. It demonstrates the connections between accounts, including external vendors.

weboftrust

I’m not going co cover all the commands here and suggest checking the official GitHub page for the latest list. I also recommend running CloudMapper regularly, especially in environments that constantly evolve.

An approach of that conducts regular audits. saving reports and integrating with Slack for security alerts is described here.


AWS security fundamentals: IAM

IAM

Here I am going to build on my previous blog of inventorying AWS accounts and talk about identity and access management. By now you have probably realised that your organisation, depending on its size, has more accounts with a lot of associated resources than you initially thought. The way users are created and access is managed in these accounts has a direct impact on the overall security of your infrastructure.

What accounts should your company have? Well it really depends on the nature of your organisation but I tend to see the following pattern for software development driven companies:

1. Organisation root. Your organisation root account should be used to create other accounts (and some other limited amount of operations) and otherwise shouldn’t be touched. Secure the credentials and leave it alone. It should not have any resources associated with it.

2. Identity. Not strictly necessary to have a separate account for this but isn’t it great to be able to manage all your users in a single account?

3. Operations. This account should be used for log collection and analysis. Your security team will be happy.

4, 5 and 6. A separate account for your development, staging and production environments. It’s a good idea to separate them for the ease of managing permissions and pleasing auditors.

Users and services that are managed within an AWS account, should only get access to what they need.

Security specialists are spending a great deal of their time reviewing firewall rules when working on their on-premise infrastructure to ensure they are not too permissive. When we move to the cloud, these rules look somewhat different but their importance has only increased.

To demonstrate the relationship between accounts, users, groups, roles and permissions, let’s walk through an example scenario of a developer in your company requiring read only access to the staging environment.

No automation or anything even remotely advanced is going to be discussed here as we are just covering the basics in this blog. It is no less important, however, to get these right. The principles discussed here will lay the foundation for more advanced concepts. Again, the terminology here is specific to AWS but overarching principles can be applied to any cloud environment.

To start with this scenario, let’s create a custom role CompanyReadOnly and attach an AWS managed ReadOnlyAccess policy in the Permissions tab.

Role Policies
CompanyReadOnly ReadOnlyAccess

This role allows a trusted entity (an account in this case) to access this account. When you access this account you will get the permissions defined in the policy.

Let’s say we have an account where all users are managed (the Identity account in point 2 in the list above). In this account, create a custom policy CompanyAssumeRoleStagingReadOnly allowing assuming the right role, where 123456789012 is Staging account ID which is the trusted entity for the Identity account:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::123456789012:role/CompanyReadOnly"
        }
    ]
}

Now let’s create a custom StagingReadOnly group and attach the above policy in the Permission tab.

Group Permissions
StagingReadOnly CompanyAssumeRoleStagingReadOnly

Finally add a user to that group:

User Group Permissions
Developer StagingReadOnly CompanyAssumeRoleStagingReadOnly

In this group additional permissions can be added, e.g. AWS managed enforce-mfa policy for mandatory multi-factor authentication.

Of course, granular policies specifying access to particular services rather than blanket ReadOnly is preferred. Remember the aim here is to demonstrate IAM fundamental principles rather than recommend specific approaches you should use. The policies will depend on the AWS resources your organisation actually uses.


How to inventory your AWS assets

Resource

Securing your cloud infrastructure starts with establishing visibility of your assets. I’ll be using Amazon Web Services (AWS) as an example here but principles discussed in this blog can be applied to any IaaS provider.

Speaking about securing your AWS environment specifically, a good place to start is the AWS Security Maturity Roadmap by Scott Piper. He suggests identifying all AWS accounts in your organisation as a first step in your cloud security programme. 

Following Scott’s guidance, it’s a good idea to check in with your DevOps team and/or Finance to establish what accounts are being used in your company. Capture this information in a spreadsheet, documenting account name, ID, description and an owner at a minimum. You can expand on this in the future to track compliance with baseline requirements (e.g. enabling CloudTrail logs).

Once we have a comprehensive view of the accounts used in the organisation, we need to find out what resources these accounts use and how they are configured. We can get metadata about the accounts using CloudMapper’s collect command. CloudMapper is a great open source tool and can do much more than that. It deserves a separate blog, but for now just check out setup instructions on its GitHub page and Scott’s detailed instructions on using the collect command.

The CloudMapper report will reveal the resources you use in all the regions (the image at the top of this blog is from the demo data). This can be useful in scenarios where employees in your company might test out new services and forget to switch them off or nobody knows what these services are used for to begin with. In either case, the company ends up paying for these, so it makes economic sense to investigate, and disabling them will also reduce the attack surface.

In addition to that, the report includes a section on security findings and will alert of potential misconfigurations on the account. It also provides recommendations on how to address them. Below is an example report based on the demo data. 

Findings

As we are just establishing the view of our assets in AWS at this stage, we are not going to discuss remediation activities in this blog. We will, however, use this report to understand how much work is ahead of us and prioritise accordingly.

Of course, it is always a good idea to tackle high criticality issues like publicly exposed S3 buckets with sensitive information but don’t get discouraged by a potentially large number of security findings. Instead, focus on strategic improvements that will prevent these issues from happening in the future.

To lay the foundation for a security improvements programme at this point, I suggest adding all the identified accounts to an AWS Organisation if you haven’t already. This will simplify account management and billing and allow you to apply organisation-wide service control policies.


Resilience in the Cloud

4157700219_609fca025b_z

Modern digital technology underpins the shift that enables businesses to implement new processes, scale quickly and serve customers in a whole new way.

Historically, organisations would invest in their own IT infrastructure to support their business objectives and the IT department’s role would be focused on keeping the ‘lights on’.

To minimise the chance of failure of the equipment, engineers traditionally introduced an element of redundancy in the architecture. That redundancy could manifest itself on many levels. For example, it could be a redundant datacentre, which is kept as a ‘hot’ or ‘warm’ site with a complete set of hardware and software ready to take the workload in case of the failure of a primary datacentre. Components of the datacentre, like power and cooling, can also be redundant to increase the resiliency.

On a lesser scale, within a single datacentre, networking infrastructure elements can be redundant. It is not uncommon to procure two firewalls instead of just one to configure them to balance the load or just to have a second one as a backup. Power and utilities companies still stock up on critical industrial control equipment to be able to quickly react to a failed component.

The majority of effort, however, went into protecting the data storage. Magnetic disks were assembled in RAIDs to reduce the chances of data loss in case of failure and backups were relegated to magnetic tapes to preserve less time-sensitive data and stored in separate physical locations.

Depending on specific business objectives or compliance requirements, organisations had to heavily invest in these architectures. One-off investments were, however, only one side of the story. On-going maintenance, regular tests and periodic upgrades were also required to keep these components operational. Labour, electricity, insurance and other costs were adding to the final bill. Moreover, if a company was operating in a regulated space, for example if they processed payments and cardholder data then external audits, certification and attestation were also required.

With the advent of cloud computing, companies were able to abstract away a lot of this complexity and let someone else handle the building and operation of datacentres and dealing with compliance issues relating to physical security.

The need for the business resilience, however, did not go away.

Cloud providers can offer options that far exceed (at comparable costs) the traditional infrastructure; but only if configured appropriately.

One example of this is the use of ‘zones’ of availability, where your resources can be deployed across physically separate datacentres. In this scenario, your service can be balanced across these availability zones and can remain running even if one of the ‘zones’ goes down. If you build your own infrastructure for this, you would have to build one datacentre in each zone and . You better have a solid business case for this.

It is important to keep this in mind when deciding to move to the cloud from the traditional infrastructure. Simply lifting and shifting your applications to the cloud may, in fact. These applications are unlikely to have been developed to work in the cloud and take advantage of these additional resiliency options. Therefore, I advise against such migration in favour of re-architecting.

Cloud Service Provider SLAs should also be considered. Compensation might be offered for failure to meet these, but it’s your job to check how this compares to the traditional “5 nines” of a availability in a traditional datacentre.

You should also be aware of the many differences between cloud service models.

When procuring a SaaS, for example, your ability to manage resilience is significantly reduced. In this case you are relying completely on your provider to keep the service up and running, potentially raising the provider outage concern. . Even with the data, however, your options are limited without a second application on-hand to process that data which may also require data transformation. Study the historical performance and pick your SaaS provider carefully.

IaaS gives you more options to design an architecture for your application, but with this great freedom comes great responsibility. The provider is responsible for fewer layers of the overall stack when it comes to IaaS, so you must design and maintain a lot of it yourself. When doing so, assume failure rather than thinking of it as a (remote) possibility.  Availability Zones are helpful, but not always sufficient.  What scenarios require consideration of the use of a separate geographical region? The European Banking Authority recommendations on Exit and Continuity can be an interesting example to look at from a testing and deliverability perspective.

Be mindful of characteristics of SaaS that also affect PaaS from a redundancy perspective. For example, if you’re using a proprietary PaaS then you can’t just lift and shift your data and code.

Above all, when designing for resiliency, take a risk-based approach. Not all your assets have the same criticality. , know your RPO and RTO. Remember that SaaS can be built on top of AWS or Azure, exposing you to supply chain risks.

Even when assuming the worst, you may not have to keep every single service running should the worst actually happen. For one thing, it’s too expensive – just ask your business stakeholders. The very worst time to be defining your approach to resilience is in the middle of an incident, closely followed by shortly after an incident.  As with other elements of security in the cloud, resilience should “shift left” and be addressed as early in the delivery cycle as possible.  As the Scout movement is fond of saying – “be prepared”.

Image by Berkeley Lab.


How to pass the CCSP exam

CCSP-logo-2lines

I just passed the Certified Cloud Security Practitioner (CCSP) exam. It wasn’t easy, but nothing you can’t prepare for.

Apart from the official (ISC)2 guides, here are some of the resources I used in my studies:

If you would prefer to add video lectures to your study plan, there’s a free course on Cybrary. For a quick summary, check out these study notes and mindmaps. Also, multiple sets of free flashcards are available on Quizlet.

It is a good idea to do some practice questions: there are books and mobile apps out there to help you with this. Practical experience in cloud security is also essential.

The exam tests your knowledge of the following CCSP domains:

  • Architectural Concepts and Design Requirements
  • Cloud Data Security
  • Cloud Platform and Infrastructure Security
  • Cloud Application Security
  • Operations
  • Legal and Compliance

The structure and format might change as (ISC)2 continuously revise their exams, so please check the official website to make sure you are up-to-date with the latest developments.

On the day, read the questions carefully. It’s not a time pressured exam (I was done in two hours), so it’s worth re-reading the questions and answers again to make sure you are answering exactly what is being asked. Eliminate the wrong options first and then decide on the best out of the remaining ones.

Finally, my suggestion would be to approach the questions from the perspective of a consultant. What would you recommend in each situation? Don’t go too technical – keep the business needs in mind at all times.

Don’t stress too much about the final result. I’m sure you’ll pass, but even if not on your first attempt, you’ll learn either way! Remember, the knowledge you accumulate in the process of preparing for the test itself has the most value, not the credential.

Good luck!