Configuration Management & Vulnerability Management

The Configuration Management & Vulnerability Management practice concerns itself with patching and updating applications, version control, defect tracking and remediation, and incident handling.

Configuration Management & Vulnerability Management Level 1

[CMVM1.1: 103] Create or interface with incident response.

The SSG is prepared to respond to an event or alert and is regularly included in the incident response process, either by creating its own incident response capability or by regularly interfacing with the organization’s existing team. A regular meeting between the SSG and the incident response team can keep information flowing in both directions. Pre-built communication channels with critical vendors (e.g., infrastructure, SaaS) are also important, because they are hard to establish in the middle of an incident.

[CMVM1.2: 101] Identify software defects found in operations monitoring and feed them back to development.

Defects identified through operations monitoring are fed back to development and used to change developer behavior. In some cases, the contents of production logs can be revealing (or can reveal the need for improved logging). Offering a way to enter incident triage data into an existing bug-tracking system (perhaps by making use of a special security flag) helps, but the goal is to close the information loop and make sure that security issues actually get fixed. In the best of cases, processes in the SSDL can be improved based on operational data.

Configuration Management & Vulnerability Management Level 2

[CMVM2.1: 91] Have emergency codebase response.

The organization can make quick code changes when an application is under attack, with a rapid-response team working in conjunction with application owners and the SSG to study the code and the attack, find a resolution, and fix the production code (e.g., push a patch into production, roll back to a known-good version, deploy a new container). Often, the emergency response team is the engineering team itself. A well-defined process is a must here, but a process that has never been used might not actually work.

[CMVM2.2: 88] Track software bugs found in operations through the fix process.

Defects found in operations are fed back to development, entered into established defect management systems, and tracked through the fix process. This capability could come in the form of a two-way bridge between bug finders and bug fixers, but make sure the loop is closed completely. Setting a security flag in the bug-tracking system can help facilitate tracking.
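As a rough illustration of closing the loop, the sketch below lists security-flagged defects that came from operations and have not yet been verified as fixed. The `Defect` record, the `Status` values, and the sample IDs are hypothetical and do not correspond to any particular bug-tracking system.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Hypothetical workflow states; a real tracker will define its own.
    REPORTED = "reported"
    ASSIGNED = "assigned"
    FIXED = "fixed"
    VERIFIED = "verified"


@dataclass(frozen=True)
class Defect:
    defect_id: str
    source: str      # where the defect was found, e.g. "operations"
    security: bool   # the security flag mentioned in the text
    status: Status


def open_operations_security_defects(defects):
    """Return IDs of security-flagged operations defects not yet verified as fixed."""
    return [
        d.defect_id
        for d in defects
        if d.security and d.source == "operations" and d.status is not Status.VERIFIED
    ]


# Illustrative data only.
SAMPLE_DEFECTS = [
    Defect("BUG-1", "operations", True, Status.ASSIGNED),
    Defect("BUG-2", "operations", True, Status.VERIFIED),
    Defect("BUG-3", "qa", True, Status.REPORTED),
    Defect("BUG-4", "operations", False, Status.REPORTED),
]
```

A report like this, reviewed regularly, is one way to notice defects that were handed to development but never confirmed as fixed.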

[CMVM2.3: 64] Develop an operations inventory of applications.

The organization has a map of its software deployments. If a piece of code needs to be changed, operations or DevOps teams can reliably identify all the places where the change needs to be installed. Common components shared between multiple projects can be noted so that, when an error occurs in one application, other applications that share the same components can be fixed as well.
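A minimal sketch of such an inventory, assuming each deployment records the shared components it ships: a reverse index from component to deployments answers the question "where does this change need to be installed?" All application, environment, and component names here are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    application: str       # hypothetical application name
    environment: str       # hypothetical environment label
    components: frozenset  # shared components as "name@version" strings


def index_by_component(deployments):
    """Build a reverse index: component -> deployments that include it."""
    index = defaultdict(list)
    for deployment in deployments:
        for component in deployment.components:
            index[component].append(deployment)
    return index


def affected_deployments(index, component):
    """Return sorted (application, environment) pairs that ship the component."""
    return sorted((d.application, d.environment) for d in index.get(component, []))


# Illustrative inventory only.
INVENTORY = [
    Deployment("billing", "prod-us", frozenset({"shared-auth@1.2", "pdf-render@0.9"})),
    Deployment("billing", "prod-eu", frozenset({"shared-auth@1.2", "pdf-render@0.9"})),
    Deployment("portal", "prod-us", frozenset({"shared-auth@1.2"})),
    Deployment("reports", "prod-us", frozenset({"pdf-render@1.0"})),
]

COMPONENT_INDEX = index_by_component(INVENTORY)
```

With this index, a flaw found in `shared-auth@1.2` through one application immediately identifies the other deployments that need the same fix.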

Configuration Management & Vulnerability Management Level 3

[CMVM3.1: 2] Fix all occurrences of software bugs found in operations.

The organization fixes all instances of each bug found during operations, not just the small number of instances that trigger bug reports. This requires the ability to reexamine the entire codebase when new kinds of bugs come to light (see [CR3.3 Eradicate specific bugs from the entire codebase]). One way to approach this is to create a rule set that generalizes a deployed bug into something that can be scanned for via automated code review. Use of containers can greatly simplify deploying the fix for all occurrences of a software bug.
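The rule-set idea can be sketched as follows, assuming a bug found in operations was an SQL statement built with the `%` string-formatting operator. The rule ID, the regular expression, and the sample sources are hypothetical. A regular expression is only a rough heuristic standing in for a real static analysis rule, and it will miss some cases and flag others incorrectly.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    pattern: re.Pattern


# Hypothetical rule generalizing one deployed bug into a scannable pattern:
# a quoted string passed to execute() and immediately followed by "%".
RULES = [
    Rule(
        rule_id="OPS-001",
        description="SQL statement assembled with the % formatting operator",
        pattern=re.compile(r"execute\(\s*[\"'].*[\"']\s*%"),
    ),
]


def scan(sources, rules=RULES):
    """Return (path, line_number, rule_id) for every source line matching a rule."""
    findings = []
    for path, text in sorted(sources.items()):
        for line_number, line in enumerate(text.splitlines(), start=1):
            for rule in rules:
                if rule.pattern.search(line):
                    findings.append((path, line_number, rule.rule_id))
    return findings


# Illustrative codebase: one formatted query, one parameterized query.
SAMPLE_SOURCES = {
    "orders/db.py": 'cur.execute("SELECT * FROM orders WHERE id = %s" % order_id)\n',
    "users/db.py": 'cur.execute("SELECT * FROM users WHERE id = %s", (user_id,))\n',
}
```

Running `scan` over the whole codebase, rather than only the file named in the bug report, is what turns a single fix into a search for every occurrence.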

[CMVM3.2: 9] Enhance the SSDL to prevent software bugs found in operations.

Experience from operations leads to changes in the SSDL, which can in turn be strengthened to prevent the reintroduction of bugs found during operations. To make this process systematic, each incident response postmortem could include a “feedback to SSDL” step. This works best when root-cause analysis pinpoints where in the SDLC an error could have been introduced or slipped by uncaught. DevOps engineers might have an easier time with this because all the players are likely involved in the discussion and the solution. An ad hoc approach to SSDL improvement isn’t sufficient.

[CMVM3.3: 12] Simulate software crises.

The SSG simulates high-impact software security crises to ensure software incident response capabilities minimize damage. Simulations could test for the ability to identify and mitigate specific threats or, in other cases, begin with the assumption that a critical system or service is already compromised and evaluate the organization’s ability to respond. When simulations model successful attacks, an important question to consider is the time required to clean up. Regardless, simulations must focus on security-relevant software failure and not on natural disasters or other types of emergency response drills. Organizations that are highly dependent on vendor infrastructure (e.g., cloud service providers, SaaS) and vendor security features will naturally include those dependencies in crisis simulations.

[CMVM3.4: 13] Operate a bug bounty program.

The organization solicits vulnerability reports from external researchers and pays a bounty for each verified and accepted vulnerability received. Payouts typically follow a sliding scale linked to multiple factors, such as vulnerability type (e.g., remote code execution might be worth $10,000, whereas CSRF might be worth $750), exploitability (demonstrable exploits command much higher payouts), or specific service and software versions (widely deployed or critical services warrant higher payouts). Ad hoc or short-duration activities, such as capture-the-flag contests or informal crowd-sourced efforts, don’t constitute a bug bounty program.
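One way to picture the sliding scale is as a base amount per vulnerability type, adjusted by the other factors. In the sketch below, the base amounts come from the example figures in the text ($10,000 and $750). The bonus percentages and the function name are assumptions chosen purely for illustration, not recommended values.

```python
# Base payouts in USD, taken from the example figures in the text.
BASE_PAYOUT_USD = {
    "remote_code_execution": 10_000,
    "csrf": 750,
}

# Assumed adjustments, in percent; real programs set their own.
WORKING_EXPLOIT_BONUS_PCT = 50
CRITICAL_SERVICE_BONUS_PCT = 25


def bounty_usd(vulnerability_type, has_working_exploit, affects_critical_service):
    """Compute a payout from the base amount plus any applicable bonuses."""
    base = BASE_PAYOUT_USD[vulnerability_type]
    bonus_pct = 0
    if has_working_exploit:
        bonus_pct += WORKING_EXPLOIT_BONUS_PCT
    if affects_critical_service:
        bonus_pct += CRITICAL_SERVICE_BONUS_PCT
    # Integer arithmetic keeps the result in whole dollars.
    return base * (100 + bonus_pct) // 100
```

Under these assumed percentages, a CSRF report with no exploit against a non-critical service pays the base $750, while a remote code execution report with a working exploit against a critical service pays $17,500.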

[CMVM3.5: 0] Automate verification of operational infrastructure security.

The SSG works with engineering teams to facilitate a controlled self-service process that replaces some traditional IT efforts, such as application and infrastructure deployment, and includes verification of security properties (e.g., adherence to agreed-upon security hardening). Engineers now create networks, containers, and machine instances, orchestrate deployments, and perform other tasks that were once IT’s sole responsibility. In facilitating this change, the organization uses machine-readable policies and configuration standards to automatically detect and report on infrastructure that does not meet expectations. In some cases, the automation makes changes to running environments to bring them into compliance. In many cases, organizations use a single policy to manage automation in different environments, such as in multi-cloud and hybrid-cloud environments.
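A minimal sketch of detection against a machine-readable policy, assuming the policy is a set of required configuration values and each resource reports its current settings. The setting names, resource names, and data layout are hypothetical. The same policy is applied to resources from different environments, in the spirit of the single-policy approach described above.

```python
# Hypothetical hardening policy: setting name -> required value.
POLICY = {
    "disk_encrypted": True,
    "public_ip": False,
    "ssh_password_auth": False,
}

# Illustrative resources from several environments, checked against one policy.
RESOURCES = {
    "cloud-a/vm-1": {"disk_encrypted": True, "public_ip": False, "ssh_password_auth": False},
    "cloud-b/vm-7": {"disk_encrypted": False, "public_ip": False, "ssh_password_auth": True},
    "onprem/db-2": {"disk_encrypted": True, "public_ip": False},
}


def violations(config, policy=POLICY):
    """Return sorted setting names that differ from policy; a missing setting counts."""
    return sorted(
        name for name, required in policy.items() if config.get(name) != required
    )


def noncompliant(resources, policy=POLICY):
    """Return a report mapping each noncompliant resource to its violated settings."""
    report = {}
    for resource_name, config in resources.items():
        found = violations(config, policy)
        if found:
            report[resource_name] = found
    return report
```

This sketch only detects and reports. Automation that changes running environments to bring them into compliance would act on the same report, and is left out here.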