Lessons from the Past: What Zoom can learn from Microsoft


Summary: In the early 2000s, Microsoft faced challenges similar to the ones Zoom’s looking at today, and successfully turned things around. Some of the key lessons from Microsoft’s experiences include:

  • Think broadly about “trust”
  • Make trust the product teams’ responsibility
  • Fix your privacy practices and policies
  • Do threat modeling
  • Use the tools — and develop new ones
  • Learn from your experiences — and continue to update your processes
  • It’s a social problem, not just “technical”
  • It may take a while to address — but there’s a big potential upside

A tough time for Zoom

After a great start to the year, with usage soaring as people around the world stay home, the last few weeks have been a really tough time for Zoom. The company has always focused on convenience and usability. Now, they’re dealing with the consequences of not having paid much attention to security and privacy.

Facing an existential threat to its business, Zoom’s CEO Eric Yuan has announced that the company will be shutting down feature development for 90 days to focus on security and privacy. They’re also bringing in third-party security consultants, creating an advisory board, and engaging with security researchers.

Lessons from the past

Back in 2001, high-profile security problems (including one so severe the FBI issued a warning) had become an existential threat to Microsoft’s business.   In January of 2002, Bill Gates’ company-wide Trustworthy Computing memo announced that the company was shutting down Windows feature development to focus on security and privacy.

Michael Howard’s “10 years since the Bill Gates security memo: A personal journey” is a great short summary of what Microsoft did as part of the effort, including bringing in third-party security consultants, creating an advisory board, and engaging with security researchers.

And it worked. It took a few years, but Microsoft wound up turning things around. By the mid-2000s, security and trustworthiness were becoming a competitive advantage for the company.

I was at Microsoft Research at the time, and wound up pretty heavily involved in this work for several years, including helping plan the initial “security push”, researching attack surface reduction with Jeannette Wing and Michael, and modeling the effects of buffer overrun detection and mitigation technologies as part of a $200 million decision of whether or not to recompile the entire code base for a service pack release. It was really stressful: an incredible sense of urgency crashing up against the complexities of evolving a culture that had been seen as core to the company’s successes. At the same time, though, it was also a chance to work with some really great people and have an impact on the whole software industry.

Of course, it’s a different world today from the early 2000s.  Some of what we did looks downright quaint by today’s standards — for example all the time, energy, and money that went into flying consultants and advisors to Redmond, and flying employees to visit customers and conferences.  And Zoom’s very different than Microsoft was in quite a few ways, starting with being much more nimble.

Still, many of Microsoft’s experiences are extremely relevant. Here are some of the lessons that might be especially useful to Zoom.

Think broadly about “trust”

“Trust online will not be achieved through security because that vision is founded on a misconstrued notion of trust” — Helen Nissenbaum,  Securing Trust Online: Wisdom or Oxymoron?, 2001

Zoom clearly understands this. In “A Message to Our Users”, Eric Yuan emphasized that “we want to do what it takes to maintain your trust”, and also talked about “shifting all our engineering resources to focus on our biggest trust, safety, and privacy issues” as well as committing to providing a transparency report. That’s very encouraging!

That said, Zoom’s initial responses have primarily focused on the security side.  One clear example is their new CISO Advisory Board, made up of Chief Information Security Officers from large corporations.  Another is bringing in ex-Facebook Chief Security Officer Alex Stamos as an outside advisor, and Katie Moussouris of Luta Security to assess Zoom’s internal vulnerability handling processes.

“Trustworthiness is a much broader concept than security, and winning our customers’ trust involves more than just fixing bugs.” — Bill Gates, Trustworthy Computing, 2002

CISOs have a deep understanding of security, and Alex’s and Katie’s experiences and expertise are clearly relevant, so I can certainly see why Zoom started there. Still, to make broad progress on trust, Zoom’s also likely to need:

  • consumer privacy experts, as well as an advisory board with representatives from groups that have a deep knowledge of privacy and represent consumer interests (such as EPIC, Consumer Federation of America, Privacy International, and Privacy Rights Clearinghouse)
  • safety experts, as well as an advisory board with representatives from those who are most targeted online — including domestic violence survivors, reproductive justice advocates, trans and non-binary people, people in recovery, racial justice activists, and disabled people

Similarly, as Zoom’s refocusing engineering, I really wonder how much of the training, code review, and testing they’re doing is getting informed by this broader perspective.  As Casey Fiesler says, user personas really need to include “user stalking their ex,”  “user who wants to traumatize vulnerable folks,” and “user who thinks it’s funny to show everyone their genitals”.   That clearly hasn’t been the case so far at Zoom.

Of course, you gotta start somewhere.   Zoom’s first steps are good ones.  Hopefully they’re already working on these other aspects as well.

Make trust the product teams’ responsibility

“Once Microsoft started using the Security Development Lifecycle, there was no stopping it.” — from Life in the Digital Crosshairs, 2014

Microsoft’s Security Development Lifecycle (SDL) continues to be one of the most significant contributions of the early-2000s work. Zoom’s different enough from Microsoft that other security processes, or SDL variants for agile development and DevOps, might be better starting points; but the same principles are likely to apply. Zoom needs to find a way to operationalize security and other aspects of trustworthiness throughout their whole engineering organization, while evolving their culture to be more security-focused.

One of the most important principles of the SDL is to incorporate security into everybody’s role. It’s important and valuable to have an empowered, well-resourced security team that focuses on security and privacy, and it’s equally important to have this expertise in the teams designing, developing, and testing the products. As well as investing in training for the product teams, Microsoft wound up introducing new roles like Security Product Manager and Security Architect, and revising other job responsibilities to make the security focus explicit.

“Privacy must become integral to organizational priorities, project objectives, design processes, and planning operations.”  — Ann Cavoukian, Privacy by Design: the Seven Foundational Principles

The same is true for other aspects of trust.  Privacy and safety teams are useful; by themselves, they’re not enough.  Fortunately, as with the SDL, there are useful blueprints for the path forward — Privacy by Design is a great example.

Fix your privacy practices and policies

“This is a clear breach of GDPR” — Tara Taubman-Bassirian, in Zoom’s Security and Privacy Woes Violated GDPR, Expert Says

EPIC’s 2001 FTC complaint about Microsoft Passport’s privacy practices led to a 2002 consent decree which committed the company to cleaning up its privacy act. Progress was imperfect, but substantial in many ways. Today’s FTC ignored EPIC’s 2019 complaint against Zoom, but that doesn’t mean Zoom’s off the hook. In Europe, there’s the GDPR and regulators who don’t have a lot of patience with badly-behaving US companies. In the US, Zoom may well have problems with COPPA, FERPA, HIPAA, and potentially a bunch of state regulations as well.

Even after some improvements, Zoom’s privacy policy still has a lot of problems, including minimal restrictions on sharing users’ data with third parties. It doesn’t have to be this way. One very positive way in which Zoom today is similar to Microsoft in the early 2000s is that their business model primarily revolves around people paying for software, as opposed to advertising-based companies like Facebook and Google that rely on exploiting their users’ personal data.

Zoom really needs to fix their privacy policy; quite frankly, they shouldn’t expect any credibility in the privacy community until they do. But that’s just the first step. Getting privacy experts involved in the design and review of their products, auditing their software to learn whether other unexpected data sharing is going on (and introducing tools and processes to prevent future problems), and applying the principles of Privacy by Design throughout their engineering process are also important.

Do threat modeling

“The risks, the misuse, we never thought about that.” — Eric Yuan, in Zoom Rushes to Improve Privacy for Consumers Flooding Its Service

Threat modeling is a structured approach to looking at security threats — and what can be done in response.  As well as identifying specific threats that need to be prevented or mitigated, threat modeling also reminds developers and testers to keep security in mind, and forces the organization to document a system’s security properties — which in turn helps with tools, code review, and testing.

Microsoft’s early-2000s work on threat modeling, including Window Snyder and Frank Swiderski’s book and the broad use of the STRIDE model internally, had a significant impact not just on the company but on the broader industry. Threat modeling’s come a long way since then, with well-developed techniques and methodologies as well as excellent resources available like MITRE’s ATT&CK.

Still, many companies don’t do threat modeling very well, especially when it comes to social threats.   Facebook’s threat modeling, for example, didn’t pay attention to easy-to-predict threats such as companies like Cambridge Analytica lying to them, fake news sites trying to get more views by manipulating trending topics, intelligence agencies trying to influence elections in other countries, or communications channels being used to foment genocide.

Zoombombing is a great example of a high-profile problem that could have been anticipated and significantly reduced by even basic social threat modeling techniques. The weakness of Zoom’s muting, blocking, and moderation support (leaving attendees open to bullying, hate speech, and harassment) is another major area where Zoom hasn’t paid attention to the threats. And it’s worth noting that these aren’t just problems in the consumer and education worlds; they’re issues in corporate environments as well.

So hopefully as Zoom focuses on threat modeling, they’re getting input from Window, Casey, Shireen Mitchell, Kaliya Young, Danielle Citron, Leigh Honeywell, and others who focus on the social aspects, as well as from content moderation experts like Sarah Roberts, who have a lot of experience with how to mitigate some of these threats.
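
As a rough illustration of the kind of documentation threat modeling produces, here’s a minimal sketch in Python of a very basic threat register. It is not Microsoft’s or Zoom’s actual process: the Threat class, the unmitigated helper, and the example entries are all hypothetical. The ABUSE category is an assumed addition to the six STRIDE categories for this sketch, since harassment-style social threats don’t map cleanly onto them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Category(Enum):
    # The six STRIDE categories.
    SPOOFING = "spoofing"
    TAMPERING = "tampering"
    REPUDIATION = "repudiation"
    INFORMATION_DISCLOSURE = "information disclosure"
    DENIAL_OF_SERVICE = "denial of service"
    ELEVATION_OF_PRIVILEGE = "elevation of privilege"
    # Assumption: NOT part of STRIDE; added here to cover social threats.
    ABUSE = "abuse"


@dataclass
class Threat:
    description: str
    category: Category
    # Documented mitigations; an empty list means nothing is in place yet.
    mitigations: List[str] = field(default_factory=list)


def unmitigated(threats: List[Threat]) -> List[Threat]:
    """Return the threats that have no documented mitigation yet."""
    return [t for t in threats if not t.mitigations]


# Hypothetical entries, loosely based on the problems discussed above.
THREATS = [
    Threat("Uninvited person joins a meeting and disrupts it",
           Category.ABUSE,
           ["meeting passwords on by default"]),
    Threat("Attendee harasses other attendees during a meeting",
           Category.ABUSE),
    Threat("Third-party SDK sends user data to another company",
           Category.INFORMATION_DISCLOSURE),
]

for threat in unmitigated(THREATS):
    print(f"[{threat.category.value}] {threat.description}")
```

Even a list this simple makes gaps visible: running it prints the two entries that have no mitigation recorded, which is the kind of prompt that helps with code review and testing.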

Use the tools — and develop new ones

“Consider tools throughout the process, beginning in the planning phase” — me, in Steering the Pyramids: Tools, Technology, and Process in Engineering at Microsoft, ICSM 2002

Tools aren’t magic bullets; some of my most valuable contributions in the Microsoft security efforts were times I said “tools aren’t going to help with this particular problem.” Still, tools can make a big difference on some kinds of problems. As well as adopting commercially-available and research tools, Microsoft invested heavily in creating its own: static analysis tools (the focus of Righting Software, from 2004, which discusses the PREfix and PREfast tools I architected as well as SLAM, Vault, and ESP), as well as attack surface estimators, vulnerability scanners, and so much more.

Zoom’s undisclosed, and apparently unintentional, data-sharing with Facebook is a good example of an area where tools can be helpful: analyzing dependencies’ security behavior could have identified the privacy-invasive behavior of Facebook’s iOS SDK. Zoom’s recent, and welcome, announcement that users will soon be able to customize which data center regions their account can use for its real-time meeting traffic is another: information flow analyses, and better use of chaos testing and run-time monitoring tools, can help avoid the kind of unexpected behavior that led to meeting information unexpectedly getting routed through China a couple of months ago.

Zoom isn’t anywhere near as large as companies like Google, Facebook, and Amazon that have followed Microsoft’s playbook of developing large internal tools teams that mix research and developing practical tools.  So they’ll need to think about where off-the-shelf tools can help, where they can get creative by applying technologies like Jepsen and Alloy, and where they’ll need to move the state of the art forward.

Tools are often deployed in a tactical way, helping to address particular problems. Especially in a situation like this, it’s also worth thinking about tool usage strategically, for example looking at how tools can contribute to process and cultural change.

Learn from your experiences — and continue to update your processes

“Controls are created to prevent hazards. Accidents occur when the controls are ineffective.” — Nancy Leveson, in How To Learn More From Accidents

Microsoft’s products and processes evolved significantly as part of the focus on Trustworthy Computing. In many cases the changes were driven by analysis of security vulnerabilities. Any vulnerability is a chance to ask questions like “Why weren’t the controls that should have prevented this hazard from being shipped (like testing, code review, and pen testing) effective?” Very often the answers point to training or process gaps, or identify patterns that highlight where other vulnerabilities may be lurking.

Root cause analysis was one popular technique at Microsoft.  The state of the art has progressed significantly over the last 20 years, so other approaches may make more sense for Zoom.  How To Learn More From Accidents is an excellent intro to Leveson’s Causal Analysis Using System Theory (CAST) approach; her 2019 CAST Handbook and Engineering a Safer World: Systems Thinking Applied to Safety, from 2012, go into a lot more detail.  No matter what approach Zoom winds up using, though, there’s a lot of leverage here.

It’s also useful to apply this kind of thinking to the system level.   Zoom has had indications for a while that there were some big security and privacy problems.  Why didn’t something get done about it before it hit the front pages and the FBI was issuing warnings?   Maybe (as with Microsoft back in the day) some people had been trying to get the word out that there was a big problem but they didn’t get heard.   Maybe executives and the board understood the risks, made a rational decision to focus on other priorities, but didn’t realize quickly enough that the risks had changed significantly as a result of the pandemic.

Whatever the explanation, it almost certainly points to opportunities for improvement going forward.

It’s a social problem, not just “technical”

“These are racist cyber attacks; not innocent party crashers just stopping by to say hey.” — Dr. Dennis Johnson, in Demand that Zoom immediately create a solution to protect its users from racist cyber attacks!

Software engineers like to think of security and privacy as purely “technical” problems.   The reality, though, is that software is used by people and organizations; you can’t separate the technology from the social aspects.  Alas, as Zeynep Tufekci,  Sally Applin, and others continue to point out, most software companies have a long track record of not getting anthropologists, sociologists and other social scientists involved in the process.

All of Microsoft’s work I’ve discussed here had a strong social focus, for example the cultural, organizational, and interpersonal aspects of the SDL and threat modeling, and the “analysis is necessary but by no means sufficient” attitude towards tools.

“Applying social science perspectives to the field of computer security not only helps explain current limitations, and highlight an emerging trend, but also points the way towards a radical rethinking of how to make progress on this vital issue.” — Sarah Blankinship, Tomasz Ostwald, and me in Computer Science as a Social Science: Applications to Computer Security, 2009

Another outstanding example of the social perspective is the work that people like Window Snyder, Kymberlee Price, Katie Moussouris, Terri Forslof, Celene Richenburg, and Sarah did to change the company’s attitude about working with the security community and move towards an ecosystem approach. In an excellent Facebook discussion from a couple of years ago, Steve Lipner commented that he and other experienced security people at the company originally resisted this outreach until Window and others changed their minds.

Microsoft’s early-2000s work was also heavily influenced by people like Jeannette Wing, Helen Nissenbaum, Laurie Williams, Andreas Zeller, and Andrea Matwyshyn, whose work was infused with social perspectives. Today, Microsoft is reportedly the world’s second-largest employer of anthropologists.

Of course, Zoom won’t necessarily use the same tactics as Microsoft.  For example:

  • Microsoft’s outreach strategy was very in-person focused, including conferences and parties.  As the conference circuit moves online, Zoom’s got a great opportunity to build on the kudos they’ve gotten for their initial engagement with security researchers.
  • Zoom doesn’t have anything equivalent to Microsoft Research, but there are plenty of other ways to engage with academia.
  • Some of the most important disciplines for Zoom to engage with, like intersectional internet studies and content moderation, didn’t even exist in the early 2000s.

The calls by civil rights groups like Color Of Change, the National LGBTQ Task Force, and the National Hispanic Media Coalition for Zoom to release a plan to combat racial harassment also highlight the need for expertise in diversity, equity, and inclusion.   Perspectives from people like Safiya Noble, Ruha Benjamin, Shireen Mitchell, André Brock, and others who focus on the intersection of race and technology are especially important here.

As well as bringing experts in as consultants, Zoom also needs to build capacity by hiring them throughout the organization — including at the executive level as well as senior product and engineering roles.

It may take a while to address — but there’s a big potential upside

“We needed to change some security settings, like password enforcement on day one. But we learned a lesson, we quickly made a change.”  — Eric Yuan, in Zoom’s CEO Wants You to Trust the Company Again

Zoom’s getting a lot of justifiable praise for their fast and forceful reaction: quickly releasing several important fixes, engaging with security researchers, freezing feature development, communicating regularly and candidly. That said, they’re still at a very early stage. They’re just starting to think through what security, privacy, safety, and trust mean for them. Most likely, they’re still trying to fully understand the technical debt (and ethical debt) they’ve taken on by ignoring these issues for so many years.

Zoom will probably continue to make progress much faster than Microsoft did: their code base is a lot smaller, their development cycles are a lot faster, and they don’t have the same legacy problems. Still, it’s instructive to look at Microsoft’s timeline:

  • In September 2001 (after Code Red, Nimda, and Gartner’s recommendation that companies consider Apache rather than Microsoft’s IIS), Microsoft knew they had a problem.
  • By early 2002, Bill Gates’ memo and the Windows security push signaled the start of significant sustained investment.
  • Windows Server 2003 included some significant improvements, but in the summer of 2003 the Blaster worm led to another major mobilization with the Sledgehammer task force trying to “squash the bugs.”
  • Things only really turned the corner on a sustained basis with the introduction of the Security Development Lifecycle (SDL) in July 2004 and the release of Windows XP SP2 later that year.

At the end of the day, though, Microsoft wound up in a much stronger position than they had been in before.  By the time I was GM of Competitive Strategy in 2006-7, security and customer perceptions of trustworthiness were starting to become significant competitive advantages.  Today, Microsoft’s reputation for security is one of the reasons that school districts like New York are replacing Zoom with Microsoft Teams.

So if Zoom continues to apply the lessons they’ve learned, and sustains their new focus on trust, there’s a big upside.

Zoom already has a remarkably usable, highly scalable, and very reliable product.  If they also become leaders in security, privacy, and other aspects of trust, they’ll be in a great position.

 


Thanks to Steve, David, George, Dragos, Matt, Kristen, Jeff, Pat, Michael, Jason, Deborah, and everybody else for feedback and discussions on earlier versions of this post.