How I stop projects from shipping at big tech companies
Shipping projects at a big tech company is very challenging work. Much can be written on the subject (such as Sean Goedecke’s post on how to ship), but it’s not my area of expertise. Instead, I work to make sure that projects don’t ship. Or rather, that projects that shouldn’t be shipping don’t ship. I do this for the same reason doctors sometimes measure success in terms of “nobody died who shouldn’t have”. Projects should be built to meet the security needs of the business without introducing risks that are likely to make the project cost more than the revenue it brings in. I want to ensure that we aren’t shipping projects with unnecessary risks when we know how to mitigate those risks.
The following explains at a very high level how I do this. At any given company it might look very different, with different processes or approaches taken to discovering risks and ensuring the risks are appropriately balanced against business needs.
Reviewing projects
There are many ways to review projects, and I have seen a range of them both as an external consultant and as an internal security employee. A review might consist entirely of reverse engineering hardware and firmware or software, it might be source code review and construction of fuzzers, or it might be an architecture review covering years’ worth of design documents, software, and even schematics. Whatever the case, the output of these activities is a set of vulnerabilities or risks that have been identified, hopefully along with some understanding of the security impact of these findings.
So you’ve found a problem
Let’s say you have found some problem with the project. How do you decide whether it warrants not shipping at all, should be fixed before shipping, or is a problem that can be fixed a bit later? You must take the findings and their technical impact, then evaluate them against the cost of remediation and other business-specific information.
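As a rough illustration of that evaluation, here is a minimal sketch that sorts a finding into one of those three outcomes. Everything in it is hypothetical: the `Finding` fields, the 1 to 5 scales, and the thresholds are invented for this example. In practice the judgement comes out of conversations with the people who hold the business context, not out of a formula.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    BLOCK_SHIP = "do not ship as-is"
    FIX_BEFORE_SHIP = "fix before shipping"
    FIX_LATER = "ship, then fix"


@dataclass
class Finding:
    # Hypothetical estimates on a 1 (low) to 5 (high) scale.
    technical_impact: int   # e.g. 5 = root shell on every machine
    business_impact: int    # usually needs input from other people
    remediation_cost: int   # e.g. 5 = significant architectural change


def triage(finding: Finding) -> Outcome:
    # Combine the two impact estimates into a single risk score (1 to 25).
    risk = finding.technical_impact * finding.business_impact

    # Illustrative thresholds only; real cut-offs are a business decision.
    if risk >= 12 and finding.remediation_cost >= 4:
        # Serious and expensive to fix: the fix will not fit the timeline.
        return Outcome.BLOCK_SHIP
    if risk >= 12:
        # Serious but cheap enough to fix now.
        return Outcome.FIX_BEFORE_SHIP
    # Lower risk: track it and fix it after launch.
    return Outcome.FIX_LATER
```

For example, `triage(Finding(5, 5, 5))` lands in `Outcome.BLOCK_SHIP`, while the same impact with a `remediation_cost` of 1 lands in `Outcome.FIX_BEFORE_SHIP`. The hard part is not the arithmetic but getting trustworthy numbers for `business_impact` and `remediation_cost`, which is rarely something you can do alone.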
At a big company you’re sometimes going to have to work with others to make this determination. In some cases it will be obvious, or you will already have all the background necessary to demonstrate the impact, say if you have an exploit that grants you a root shell on every machine running the product. In other cases it might be far less clear what the impact is without spending an inordinate amount of time investigating it yourself. It takes a careful balance to ensure you’re neither spending a month investigating every bug nor wasting the time of others by failing to dig into something yourself. With time and experience you learn when to confirm (or disconfirm) the impact of your own findings, and when to involve others. When you lack that experience, hopefully you have a more senior security tech lead to guide you.
Just fix it
For small bugs, or things with obvious and straightforward technical and business impact, just fix it. This might be easier said than done, but it’s not the tough edge case I’m considering here. The tough cases are when the problem is large and might impact project timelines, or when it requires significant architectural changes after the project has already been built.
Raising things up the chain
Let’s say you get to the point where a project is about to ship, but you’ve found an important technical issue. It could be architectural, or it could be a more concrete vulnerability with a proof of concept; either way, assume it is an important finding. Even once you understand the technical impact, you might not know anyone who understands the business impact. This brings us to the point where we must work with others to make a determination.
I can’t guarantee the process is going to be the same everywhere once you get to this point. I’ve seen this at a number of companies now, and even in the absence of a defined process it usually plays out in much the same way. Something is discovered, stakeholders are made aware, and eventually someone makes a decision on what to do. They might be several pay grades up from you, or it might be your direct manager or a TL who has the necessary context and decision-making power to say “this is fine, because we will do, or have already done, X to mitigate it.” They might also say “hold on a minute, this is an enormous risk and we can’t accept it.” Either way the decision is out of your hands, but there might be a significant amount of follow-up work to create a solution that mitigates the risk.
Summary
Shipping products quickly can mean the business continues to exist tomorrow. Failing to ship quickly can be an existential risk, put a big dent in revenue, or mean losing out on the next big thing. There is enormous pressure to ship quickly, but it must be balanced against security risks, not to mention other risks in areas where I am not a domain expert. It takes a village to raise a child, and the same is true of shipping projects. To successfully ship a project that doesn’t quickly crash and burn because of a security issue, we must collaborate to find the right balance.
Above all, work with your manager and other nearby folks to make sure you aren’t burning your reputation unnecessarily, especially if you’re earlier in your career or at a new company.