CVSS: an (inappropriate) industry standard for assessing cyber risk
Which is worse: a vulnerability that allows an unauthenticated attacker to perform remote arbitrary code execution against a VPN gateway your company uses, or the fact that your business’s data (assuming you are based in the United States) would be destroyed in the event of a nuclear war with Russia (given America’s lack of a comprehensive ballistic missile defense system and the general susceptibility of most server farms to thermonuclear infernos)?
According to the Common Vulnerability Scoring System (CVSS), the former vulnerability earns a “perfect” score of 10.0 (out of 10.0), while by my calculations the latter would score only 7.3. I am fairly certain, though, that most people would consider nuclear annihilation to be the worse outcome, from both a business and a general life perspective.
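For the curious, that 10.0 falls directly out of the CVSS 3.1 base-score arithmetic. Below is a minimal sketch of the calculation in Python, assuming the VPN flaw is scored with the vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H - a plausible scoring for an unauthenticated remote code execution flaw on a gateway, though the exact vector is my assumption:

```python
import math

# CVSS 3.1 metric weights, taken from the published specification
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}   # Attack Vector
AC = {"L": 0.77, "H": 0.44}                          # Attack Complexity
PR_U = {"N": 0.85, "L": 0.62, "H": 0.27}             # Privileges Required (scope unchanged)
PR_C = {"N": 0.85, "L": 0.68, "H": 0.50}             # Privileges Required (scope changed)
UI = {"N": 0.85, "R": 0.62}                          # User Interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}               # Confidentiality/Integrity/Availability impact

def roundup(x):
    # The spec's "Roundup": smallest number, to one decimal place, >= x.
    # (The official definition uses integer arithmetic to dodge floating-point
    # quirks; this simplification is fine for a sketch.)
    return math.ceil(x * 10) / 10

def base_score(av, ac, pr, ui, scope, c, i, a):
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    if scope == "U":
        impact = 6.42 * iss
        exploitability = 8.22 * AV[av] * AC[ac] * PR_U[pr] * UI[ui]
        return 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
    # Scope changed uses a different impact curve and privilege weights
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV[av] * AC[ac] * PR_C[pr] * UI[ui]
    return 0.0 if impact <= 0 else roundup(min(1.08 * (impact + exploitability), 10))

# Network attack, low complexity, no privileges or interaction needed,
# scope changed, total loss of confidentiality, integrity, and availability:
print(base_score("N", "L", "N", "N", "C", "H", "H", "H"))  # -> 10.0
```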
I’m sure people’s eyes are already rolling at my use of an extreme example here, but hear me out. Obviously the risk of the first vulnerability being “exploited” is higher than that of the second, and thus it might make sense to worry about it more. But understanding the consequences of a vulnerability’s successful exploitation - its severity - is only the first step in assessing risk, and the CVSS falls short even at that.
In this post, I will explore the use of the CVSS throughout the software industry and explain why it is not an appropriate standard for cyber risk management: it is neither designed for risk management nor an appropriate tool for evaluating severity alone.
At a high level, the system evaluates security vulnerabilities based on the potential impact to data confidentiality, integrity, and availability (known as the “CIA triad”). With its focus on CIA, the standard necessarily takes a defender’s perspective, which generally makes sense since information security professionals are focused on protecting these attributes of data (despite pronouncements that “The CIA Triad is Dead”).
Otherwise, though, I won’t rehash the standard in depth, because there is detailed documentation available explaining each aspect of it. I will instead focus on the specific areas where I think that the CVSS falls short.
Before I dive in, I want to say that I mean no disrespect whatsoever to the Forum of Incident Response and Security Teams (FIRST), which maintains the standard. All of my comments are intended in the spirit of providing constructive feedback. FIRST seems like a great organization, and they have steadily modified the CVSS over time to reflect industry trends and input. Most notably, they took an important step with the 3.1 version by clearly stating that “CVSS Measures Severity, not Risk.”
Not a risk measurement tool
Unfortunately, it doesn’t seem like the software industry is listening to FIRST. Interacting with a fair number of stakeholders - both internal and external - over my career suggests to me that many people have not internalized the FIRST position. Despite the multitude of vendor blog posts that explain (admittedly in a self-interested manner) why the CVSS alone should not be used for risk assessment, I would say that many information security teams still do exactly this.
Additionally, despite FIRST’s unequivocal position on paper, the CVSS standard itself muddies the waters regarding its focus on severity alone. Inputs to the model such as attack complexity (in the base score), as well as exploit code maturity and report confidence (in the temporal score), are actually measures of likelihood rather than severity. Their mere presence can imply that the CVSS is in fact a risk measurement tool.
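To see how these likelihood-flavored inputs move the needle, here is a hedged sketch using the open-source `cvss` Python package (a third-party calculator, not part of the standard itself). Applying temporal metrics for unproven exploit code and an unconfirmed report lowers a 9.8 base score to a 7.8 temporal score, even though nothing about the consequences of exploitation has changed:

```python
# pip install cvss  (Red Hat Product Security's open-source CVSS calculator)
from cvss import CVSS3

# The same critical flaw, with "likelihood"-style temporal metrics applied:
# E:U = exploit code unproven, RL:O = official fix available, RC:U = report unconfirmed
c = CVSS3("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H/E:U/RL:O/RC:U")
print(c.scores())  # expected: (9.8, 7.8, 7.8) -- base, temporal, environmental
```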
Not great for measuring severity
In addition to not serving as an effective risk measurement tool, the CVSS formula for delineating severity isn’t the best available option even for this narrower purpose. I’ll use a more realistic and detailed example than the one above to explore this point.
Assume your organization maintains custody of protected health information (PHI). Storing a local copy of such information on a laptop without encryption would, by my assessment, register as a 6.4 on the CVSS environmental score. I’ll assume that confidentiality is the only impacted attribute and that data integrity and availability are not at risk in this situation, because the information in question is just a local copy and not the authoritative source used by your company for any decision-making. I will also assume that the confidentiality requirement is “high” and that the attack complexity - of accessing the unencrypted information on a laptop - is “low” (because detailed instructions for recovering even encrypted data are readily available on the internet). As I mentioned previously, I consider this last parameter to be a measure of likelihood rather than severity, but it is not possible to generate a CVSS score without filling in every field.
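Since I haven’t spelled out the full vector, here is one hypothetical reconstruction - an assumption on my part - that lands on the same 6.4 environmental score, again using the `cvss` package:

```python
from cvss import CVSS3

# Hypothetical vector consistent with the scenario described above:
# AV:P = physical access to the laptop, AC:L = low complexity,
# C:H/I:N/A:N = confidentiality-only impact,
# CR:H = high confidentiality requirement for the PHI
c = CVSS3("CVSS:3.1/AV:P/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N/CR:H")
print(c.scores())  # expected: (4.6, 4.6, 6.4) -- base, temporal, environmental
```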
Qualitatively, the CVSS describes such vulnerabilities as being of “medium” severity. Multiple historical cases show, however, that companies typically pay one to several million dollars in regulatory fines following the theft of such a laptop. That figure does not even include the costs of lost productivity and reputational damage that result from such events.
Conversely, assume that a web application for the same company is leaking relatively insignificant - but technically confidential - metadata via an unauthenticated application programming interface (API), such as the last time a given user logged in. Also assume that someone discovering this API could completely deny everyone else access to it (thus completely compromising its availability) through a long-running call. The attacker could not, however, modify the underlying data. This issue would register as a 7.0 on the CVSS environmental scale (even assuming a “low” requirement for confidentiality and availability), which the system qualitatively classifies as a “high” severity vulnerability.
There might be something to be said about the potential for this flaw to allow a malicious insider to repudiate his activity on a system via a highly complex attack, which would require constant “suppression” of the API through a long-running call. But once any subsequent unauthorized activity was detected, it would be possible to apply compensating controls or simply fix the underlying issue to regain access to the login data. Thus, by itself this issue is very minor. The financial cost of this vulnerability being “exploited” is de minimis, as it would do almost nothing to disrupt business operations or create any sort of liability for the company.
Thus, incidents resulting from the exploitation of these two vulnerabilities would be of radically different real-world severity. Any competent business leader who understands the consequences of both types of events would immediately realize they are not even of the same order of magnitude. An information security professional using only the CVSS, however, would be hard-pressed to convey the enormous difference between the two (and frankly, might not even be thinking about it, which is a separate problem). Blindly following the rubric, you could only characterize these as “medium” and “high” vulnerabilities, respectively.
Note: after a good discussion with Stephen Massey on the topic, I realized I made a mistake in my CVSS calculation, so I am striking this section.
How soon to remediate?
Developing a rational policy for managing vulnerabilities requires having not only a scale for evaluating them but also an action plan for fixing them.
Faced with the prospect of a multi-million dollar fine resulting from theft of the PHI, as described above, a wise business leader might consider radical steps. These could include rapidly purchasing and installing mobile device management software on every company-owned laptop and phone to enforce full-disk encryption, along with halting operations the next day to conduct 100% mandatory retraining of the entire workforce on how to use this software. These would be costly steps but might be worth it in the long run.
Conversely, the leaking and potentially unavailable metadata would not really justify any immediate action. It could be resolved during a routine update of the web application, or, if the cost of fixing were too great, might be left unresolved indefinitely.
A brief survey of publicly available policies based on the CVSS reveals a relatively wide range of allowable timelines for defect resolution. FIRST proposes that findings scoring 7.0 and higher be resolved within two weeks. Drexel University suggests that 30 days is the right amount of time for those scoring 7.0-10.0. The University of Michigan’s website proposes three months for those rated 7.0-8.9, but anything lower is only to be resolved “based on availability of staff resources.”
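In code form, such policies boil down to a simple threshold lookup. The sketch below is a hypothetical composite of the policies surveyed above - the exact thresholds and deadlines are illustrative, not any one organization’s actual rules:

```python
# A hypothetical CVSS-only remediation policy, composited from the examples
# above -- exactly the pattern this post argues against, since the score
# alone says nothing about exploitability in a given environment.
def remediation_deadline_days(cvss_score):
    if cvss_score >= 9.0:
        return 14    # "critical": two weeks
    if cvss_score >= 7.0:
        return 30    # "high": thirty days
    if cvss_score >= 4.0:
        return 90    # "medium": next quarterly cycle
    return None      # "based on availability of staff resources"

print(remediation_deadline_days(5.0))  # -> 90 days, regardless of actual risk
```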
Which of these timelines is appropriate? In most settings, you would probably anticipate an answer of “it depends,” but in this case I can pretty confidently respond that the answer is “none of these.”
For example, the Heartbleed vulnerability (CVE-2014-0160) initially registered as a 5.0 on the CVSS (later upgraded to 7.5). For a software flaw that effectively “broke the internet” and facilitated at least 840 known breaches, though, it seems like the correct timeline for remediation should be measured in hours, rather than days. Conversely, a CVSS 9.8 issue that is not exploitable in the default configuration of an operating system (such as CVE-2017-8283) can be safely deferred indefinitely.
Thus, I would strongly advise against basing any sort of risk or vulnerability management policy purely on the CVSS.
Living with the CVSS
Unfortunately, at this stage, it is unlikely that we will be able to dispense with or substantially reform the CVSS any time soon. Many of your customers, suppliers, and colleagues likely still use it as the primary tool for evaluating software vulnerabilities (and managing cyber risk). Thus, you will probably need to work with the standard and respond to findings presented using its rubric.
Misuse of the CVSS is an especially acute problem when evaluating vulnerabilities in 3rd (and greater) party code (please take a look at my article on LinkedIn for a deep dive on this topic). Due to the near-ubiquitous presence of open source libraries in applications, software composition analysis (SCA) tools often reveal dozens of vulnerabilities with very high CVSS scores. Although generally only a minority of these are exploitable in any given deployment configuration, panic often ensues due to a laser-like focus on tool findings without consideration of context. I have found it very difficult to explain that however high the reported severity of a given vulnerability, if the likelihood of its being exploited is zero, so is the associated risk.
To deal with this reality, I would recommend several techniques.
First and foremost, I would advise referring to the raw outputs of vulnerability scanners that provide CVSS scores as “findings” rather than “vulnerabilities.” I think it is fair to say that most such findings are in fact false positives, and using more neutral language is appropriate prior to any sort of technical investigation confirming or denying the existence of a true vulnerability.
Additionally, I have found that saying things like “the effective CVSS score is zero” is useful when referring to findings that register with such scanning tools but are not exploitable under any set of realistic circumstances. This uses the same framework the counterparty may already be familiar with while still conveying your point.
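When the deployment context truly neutralizes a finding, the framework even lets you make that statement in its own terms: zeroing out the environmental “Modified” impact metrics drives the score to 0.0. A sketch, again using the third-party `cvss` package:

```python
from cvss import CVSS3

# A "critical" library finding, re-scored for a deployment where the
# vulnerable code path is unreachable: setting the Modified impact metrics
# (MC/MI/MA) to None drives the environmental score to zero.
c = CVSS3("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H/MC:N/MI:N/MA:N")
print(c.scores())  # expected: (9.8, 9.8, 0.0) -- base, temporal, environmental
```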
Furthermore, a version of the Socratic Method can also be helpful when communicating with those who are rigidly focused on the CVSS rating. Asking questions like “what would be the consequences of exploiting this finding, assuming it is valid?” can gently push them to realize that a given issue poses no risk at all (assuming this is the case).
Conversely, when dealing with issues where the outcome of a successful exploitation would be much more severe than implied by the quantitative or qualitative description of the vulnerability per CVSS, I would focus heavily on the potential financial consequences rather than the “technical” severity. Even without a lightweight vulnerability analysis technique (which I will eventually provide), pointing to the consequences of real-world exploitations of similar vulnerabilities - as I have done above - can prove illuminating.
The CVSS has strong defenders, and I am open to counter-arguments on any of the above points, so please make them in the comments section. But I am of the opinion that there are better ways of evaluating the cyber risk stemming from identified vulnerabilities. In the next edition, I will evaluate another method for doing so: the Microsoft Security Development Lifecycle’s bug bar.