Modern security tools continue to improve in their ability to defend organizations’ networks and endpoints against cybercriminals. But the bad actors still occasionally find a way in.
Security teams must be able to stop threats and restore normal operations as quickly as possible. That’s why it’s essential that these teams not only have the right tools but also understand how to respond to an incident effectively.
Resources like an incident response template can be customized to define a plan with roles and responsibilities, processes, and an action item checklist.
But preparations can’t stop there. Teams must continuously train to adapt as threats rapidly evolve. Every security incident must be harnessed as an educational opportunity to help the organization better prepare for—or even prevent—future incidents.
The SANS Institute defines a framework with six steps for successful incident response (IR):
- Preparation
- Identification
- Containment
- Eradication
- Recovery
- Lessons learned
While these phases follow a logical flow, you may need to return to a previous phase in the process to repeat specific steps that were done incorrectly or incompletely the first time.
Yes, this slows down the IR. But it’s more important to complete each phase thoroughly than to try to save time by expediting steps.
1: Preparation
Goal: Get your team ready to handle events efficiently and effectively.
Everybody with access to your systems needs to be prepared for an incident, not just the incident response team. Human error is to blame for most cybersecurity breaches, so the first and most important step in IR is to educate personnel about what to look for.
Leveraging a templated incident response plan to establish roles and responsibilities for all participants—security leaders, operations managers, help desk teams, identity and access managers, as well as audit, compliance, communications, and executives—can ensure efficient coordination.
Attackers will continue to evolve their social engineering and spear phishing techniques to try to stay one step ahead of training and awareness campaigns. While most everybody now knows to ignore a poorly written email that promises a reward in return for a small up-front payment, some targets will fall victim to an off-hours text message pretending to be their boss asking for help with a time-sensitive task.
To account for these adaptations, your internal training must regularly reflect the latest trends and techniques.
Your incident responders—or security operations center (SOC), if you have one—will also require regular training, ideally based on simulations of actual incidents. An intensive tabletop exercise can raise adrenaline levels and give your team a sense of what it’s like to experience a real-world incident. You might find that some team members shine when the heat is on, while others require additional training and guidance.
Another part of your preparation is outlining a specific response strategy. The most common approach is to contain and eradicate the incident. The other option is to watch an incident in progress so you can assess the attacker’s behavior and identify their goals, assuming this does not cause irreparable harm.
Beyond training and strategy, technology plays a huge role in incident response. Logs are a critical component. Simply put, the more you log, the easier and more efficient it will be for the IR team to investigate an incident.
If you use an endpoint detection and response (EDR) platform or an extended detection and response (XDR) tool with centralized control, you can also protect yourself quickly by isolating machines, cutting them off from the network, and running counteracting commands at scale.
Other technology needed for IR includes a virtual environment where logs, files, and other data can be analyzed, along with ample storage to house this information. You don’t want to waste time during an incident setting up virtual machines and allocating storage space.
Finally, you’ll need a system for documenting your findings from an incident, whether that’s using spreadsheets or a dedicated IR documentation tool. Your documentation should cover the incident’s timeline, what systems and users were impacted, and what malicious files and indicators of compromise (IOCs) you discovered (both in the moment and retrospectively).
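As a minimal sketch of what such documentation could look like in code, the record below holds the items listed above: a timeline, impacted systems and users, malicious files, and IOCs. The class names, field names, host name, and domain are all hypothetical and chosen only for illustration; a spreadsheet or dedicated tool would serve the same purpose.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TimelineEntry:
    timestamp: datetime
    description: str


@dataclass
class IncidentRecord:
    """Hypothetical record of the items an IR team documents."""
    title: str
    timeline: list = field(default_factory=list)
    impacted_systems: set = field(default_factory=set)
    impacted_users: set = field(default_factory=set)
    malicious_files: set = field(default_factory=set)
    iocs: set = field(default_factory=set)

    def log_event(self, description: str, when: Optional[datetime] = None) -> None:
        """Add a timeline entry, defaulting to the current UTC time."""
        self.timeline.append(
            TimelineEntry(when or datetime.now(timezone.utc), description)
        )
        # Keep entries in chronological order even when a finding
        # is added retrospectively.
        self.timeline.sort(key=lambda entry: entry.timestamp)


# Example values are made up for illustration.
record = IncidentRecord(title="Example phishing incident")
record.log_event(
    "Alert raised on host WS-014",
    datetime(2024, 1, 5, 9, 30, tzinfo=timezone.utc),
)
record.log_event(
    "Initial phishing email delivered",
    datetime(2024, 1, 5, 8, 10, tzinfo=timezone.utc),
)
record.impacted_systems.add("WS-014")
record.iocs.add("example-bad-domain.invalid")
```

Sorting on every insert keeps the timeline readable when earlier events are discovered late, which is common during an investigation.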
2: Identification
Goal: Detect whether you have been breached and collect IOCs.
There are a few ways to identify that an incident has occurred or is currently in progress.
Internal detection | An incident can be discovered by your in-house monitoring team or by another member of your org (thanks to your security awareness efforts), via alerts from one or more of your security products, or during a proactive threat-hunting exercise. |
External detection | An incident can also be detected by a party outside your organization, such as a consultant, managed service provider, or business partner, who then notifies you. |
Exfiltrated data disclosed | The worst-case scenario is to learn that an incident has occurred only after discovering that data has been exfiltrated from your environment and posted to internet or darknet sites. The implications are even worse if such data includes sensitive customer information and the news leaks to the press before you have time to prepare a coordinated public response. |
No discussion about identification would be complete without bringing up alert fatigue.
If the detection settings for your security products are dialed too high, you will receive too many alerts about unimportant activities on your endpoints and network. That is a great way to overwhelm your team and can result in many ignored alerts.
The reverse scenario, where your settings are dialed too low, is equally problematic because you might miss critical events. A balanced security posture will provide just the right number of alerts to identify incidents worthy of further investigation without suffering alert fatigue.
Your security vendors can help you find the right balance and, ideally, automatically filter alerts so your team can focus on what matters.
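To make the balance described above concrete, here is a rough sketch of alert filtering: alerts below a severity threshold are dropped, and repeated alerts for the same host and rule are collapsed. The `Alert` type, the severity scale, and the rule names are assumptions made for this example; real security products define their own formats and tuning controls.

```python
from dataclasses import dataclass

# Hypothetical severity scale; real products define their own.
SEVERITY_ORDER = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Alert:
    host: str
    rule: str
    severity: str


def triage(alerts, min_severity="medium"):
    """Drop alerts below the threshold and collapse duplicates per (host, rule)."""
    threshold = SEVERITY_ORDER[min_severity]
    seen = set()
    kept = []
    for alert in alerts:
        if SEVERITY_ORDER[alert.severity] < threshold:
            continue  # Below the threshold: treated as noise.
        key = (alert.host, alert.rule)
        if key in seen:
            continue  # The same rule already fired on this host.
        seen.add(key)
        kept.append(alert)
    return kept


alerts = [
    Alert("ws-01", "usb-inserted", "low"),
    Alert("ws-02", "credential-dumping", "critical"),
    Alert("ws-02", "credential-dumping", "critical"),
    Alert("srv-01", "new-admin-account", "high"),
]
queue = triage(alerts)
```

Raising `min_severity` shrinks the queue at the risk of missing events, and lowering it does the opposite, which is the trade-off the text describes.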
During the identification phase, you will document all indicators of compromise (IOCs) gathered from alerts, such as compromised hosts and users, malicious files and processes, new registry keys, and more.
Once you have documented all IOCs, you will move to the containment phase.
3: Containment
Goal: Minimize the damage.
Containment is as much a strategy as it is a distinct step in IR.
You will want to establish an approach fit for your organization, keeping security and business implications in mind. Although isolating devices or disconnecting them from the network may prevent an attack from spreading across the organization, it could also result in significant financial damage or other business impact. These decisions should be made beforehand and clearly articulated in your IR strategy.
Containment can be broken down into short- and long-term steps, with unique implications for each.
Short-term | This includes steps you might take in the moment, like shutting down systems, disconnecting devices from the network, and actively observing the threat actor’s activities. There are pros and cons to each of these steps. |
Long-term | The best-case scenario is to keep infected systems offline so you can safely move to the eradication phase. This isn’t always possible, however, so you may need to take measures like patching, changing passwords, killing specific services, and more. |
During the containment phase, you will want to prioritize your critical devices, like domain controllers, file servers, and backup servers, to ensure they haven’t been compromised.
Additional steps in this phase include documenting which assets and threats were contained during the incident, as well as grouping devices based on whether they were compromised or not. If you are unsure, assume the worst. Once all devices have been categorized and meet your definition of containment, this phase is over.
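The grouping rule above, including "if you are unsure, assume the worst," can be sketched as follows. The device names and the three-valued verdict (`True`, `False`, `None`) are hypothetical conventions for this example only.

```python
from enum import Enum


class Status(Enum):
    CLEAN = "clean"
    COMPROMISED = "compromised"


def categorize(devices):
    """Group device names by status; an unknown verdict counts as compromised."""
    groups = {Status.CLEAN: [], Status.COMPROMISED: []}
    for name, verdict in devices.items():
        # verdict is True (compromised), False (verified clean), or None (unsure).
        # Only an explicit False is trusted as clean.
        status = Status.CLEAN if verdict is False else Status.COMPROMISED
        groups[status].append(name)
    return groups


# Hypothetical device inventory for illustration.
devices = {"dc-01": False, "file-01": True, "backup-01": None}
groups = categorize(devices)
```

Here `backup-01` lands in the compromised group because nobody has verified it, which errs on the side of caution.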
Bonus step: Investigation
Goal: Determine who, what, when, where, why, and how.
At this stage, it is worth noting another important aspect of IR: investigation.
An investigation takes place throughout the IR process. While not a phase of its own, it should be kept in mind as each step is performed. An investigation aims to answer questions about which systems were accessed and the origins of a breach.
When the incident has been contained, teams can facilitate a thorough investigation by capturing as much relevant data as possible from sources like disk and memory images and logs.
This flowchart visualizes the overall process:
You may be familiar with the term digital forensics and incident response (DFIR), but it’s worth noting that the goals of IR forensics differ from those of traditional forensics. In IR, the primary goal of forensics is to help progress from one phase to the next as efficiently as possible to resume normal business operations.
Digital forensics techniques are designed to extract as much useful information as possible from any evidence captured and turn it into useful intelligence that can help build a more complete picture of the incident or even to aid in prosecuting a bad actor.
Data points that add context to discovered artifacts might include how the attacker entered the network or moved around, which files were accessed or created, what processes were executed, and more. Of course, this can be a time-consuming process that might conflict with IR.
Notably, DFIR has evolved since the term was first coined. Organizations today have hundreds or thousands of machines, each with hundreds of gigabytes or even multiple terabytes of storage, so the traditional approach of capturing and analyzing full disk images from all compromised machines is no longer practical.
Current conditions require a more surgical approach, where specific information from each compromised machine is captured and analyzed.
4: Eradication
Goal: Make sure the threat is completely removed.
With the containment phase complete, you can move to eradication, which can be handled through disk cleaning, restoring from a clean backup, or full disk reimaging. Cleaning entails deleting malicious files and deleting or modifying registry keys. Reimaging means reinstalling the operating system.
Before taking any action, the IR team will want to refer to any organizational policies that, for example, call for specific machines to be reimaged in the event of a malware attack.
As with earlier steps, documentation plays a role in eradication. The IR team should carefully document the actions taken on each machine to ensure that nothing is missed. As an additional check, you can perform active scans of your systems for any evidence of the threat after the complete eradication process.
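One simple form of post-eradication check is to look for files whose hashes match the malicious files documented earlier. The sketch below assumes you have recorded SHA-256 hashes as IOCs; the function names are hypothetical. It only catches unmodified copies of known files, so it complements rather than replaces scans by your security products.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_known_bad(root: Path, bad_hashes: set) -> list:
    """Return files under root whose SHA-256 matches a documented IOC hash."""
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            if sha256_of(path) in bad_hashes:
                matches.append(path)
        except OSError:
            # Unreadable files are skipped in this sketch; a real scan
            # should record them for manual review.
            continue
    return matches
```

An empty result from `find_known_bad(Path("/some/mount"), documented_hashes)` is weak evidence that the known files are gone; any match means eradication on that machine is incomplete.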
5: Recovery
Goal: Get back to normal operations.
All your efforts have been leading here! The recovery phase is when you can resume business as usual. Determining when to restore operations is the key decision at this point. Ideally, this can happen without delay, but it may be necessary to wait for your organization’s off-hours or other quiet periods.
Perform one more check to verify that there aren’t any IOCs left on the restored systems. You must also determine if the root cause still exists and implement the appropriate fixes.
Now that you have learned about this type of incident, you can monitor for it in the future and establish protective controls.
6: Lessons learned
Goal: Document what happened and improve your capabilities.
Now that the incident is comfortably behind you, it’s time to reflect on each major IR step and answer key questions. There are many questions worth asking; below are a few examples:
- Identification: How long did it take to detect the incident after the initial compromise occurred?
- Containment: How long did it take to contain the incident?
- Eradication: After eradication, did you still find any signs of malware or compromise?
Probing these will help you step back and reconsider fundamental questions like: Do we have the right tools? Is our staff appropriately trained to respond to incidents?
Then the cycle returns to preparation, where you can make necessary improvements like updating your incident response plan template, technology, and processes and providing your people with better training.
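The timing questions for identification and containment can be answered from three timestamps in your incident documentation. The sketch below is one way to compute them; the function name is hypothetical, and measuring containment time from the moment of detection (rather than from the initial compromise) is an assumption you may want to change.

```python
from datetime import datetime, timedelta


def response_metrics(compromised_at, detected_at, contained_at):
    """Return time-to-detect and time-to-contain as timedeltas."""
    if not compromised_at <= detected_at <= contained_at:
        raise ValueError("timestamps must be in chronological order")
    return {
        "time_to_detect": detected_at - compromised_at,
        # Assumption: containment time is measured from detection.
        "time_to_contain": contained_at - detected_at,
    }


# Example timestamps are made up for illustration.
metrics = response_metrics(
    compromised_at=datetime(2024, 1, 5, 8, 10),
    detected_at=datetime(2024, 1, 5, 9, 30),
    contained_at=datetime(2024, 1, 5, 13, 0),
)
```

Tracking these values across incidents shows whether changes made during preparation are actually shortening your response.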
4 Pro Tips to Stay Secure
Let’s conclude with four final suggestions to bear in mind:
- The more you log, the easier the investigation will be. Log as much as possible to save money and time.
- Stay prepared by simulating attacks against your network. This will reveal how your SOC team analyzes alerts and their communication ability—critical during a real incident.
- People are integral to your organization’s security posture. Did you know human error is responsible for 95% of cyber breaches? That’s why it’s important to perform periodic training for two groups: end users and your security team.
- Consider having a specialized third-party IR team on call that can immediately help with more difficult incidents beyond your team’s ability to resolve. These teams, which may have resolved hundreds of incidents, will have the IR experience and incident response tools to hit the ground running and accelerate your IR.
You can download a free incident response template here.