This article was co-written by Vika Basarab, technical program manager at Grammarly, and Igor Maxuk, system engineer at Grammarly.
Grammarly recently made a big push to require FIDO2 for every team member login, replacing OTP. In Part 1 of this series, we shared why we decided to undertake such a complicated transition. Check out that article to read why we couldn’t rely solely on biometric authentication (like TouchID) for implementing FIDO2 and also needed to support hardware keys (YubiKeys).
FIDO2 is a relatively new standard, and support for it is still a work in progress for some apps and services. Plus, getting every team member around the world to adopt a new way of logging in is not an easy task. In this article, we’ll share what this rollout taught us from both a technical and an operational standpoint. We hope it will help other teams that are undertaking the switch to FIDO2.
1 Scoping
The first step in our rollout was to determine what systems changes we would need in order to deliver the fullest possible support and ensure that the user experience was smooth, convenient, and consistent across different platforms and applications.
We quickly ran into a roadblock involving “embedded” browsers. An embedded browser comes into play when a native app on macOS authenticates with SSO by opening a Safari window to obtain session information. This works with login, password, and OTP, but embedded Safari doesn’t support FIDO2. We couldn’t find a way to solve this directly, and as we’re primarily an Apple shop, we had to put workarounds in place:
The good
We completely reworked the authentication process for our F5 BIG-IP VPN client. Previously, we’d used the embedded window to authenticate with Okta, but we found a way to switch to OAuth authentication through the OS’s default browser. In 99 percent of cases, the user is already authenticated with the identity provider before connecting to the VPN, so the client can get the username and establish a session without asking additional questions. We ended up with a better UX and an even more stable VPN client: a win-win.
The bad
There were no such solutions for a handful of other native apps, including Microsoft 365. We decided that only these applications would be allowed to authenticate via OTP/push. The problem with that was that our main SSO system, Okta Classic, didn’t support flexible authentication policies. Thankfully, Okta created a new product called Okta Identity Engine (OIE) that did have this support. Thus, we were able to develop a practical solution that gives us the best security possible: Critical apps and SaaS authenticate via FIDO2 only; everything else authenticates via FIDO2 if it’s supported and falls back to using OTP/push.
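The two-tier policy described above can be sketched as a small decision function. This is only an illustration of the rule, not Okta Identity Engine configuration: the `App` type, its flags, and the factor names are hypothetical, and the assumption that a critical app without FIDO2 support simply gets no fallback is ours.

```python
from dataclasses import dataclass

# Hypothetical factor names, used only for this illustration.
FIDO2 = "fido2"
OTP = "otp"
PUSH = "push"


@dataclass(frozen=True)
class App:
    name: str
    critical: bool        # hypothetical flag: critical app or SaaS
    supports_fido2: bool  # False for apps stuck behind an embedded browser


def allowed_factors(app: App) -> list[str]:
    """Return the factors a sign-in to `app` may use."""
    if app.critical:
        # Critical apps are FIDO2 only; there is no weaker fallback
        # (assumption: an unsupported critical app is simply blocked).
        return [FIDO2]
    if app.supports_fido2:
        # Everything else uses FIDO2 whenever the app can handle it.
        return [FIDO2]
    # Only apps that cannot do FIDO2 fall back to OTP/push.
    return [OTP, PUSH]
```

The useful property of writing the rule this way is that the fallback is an explicit, narrow exception rather than the default, so adding a new app never silently weakens its login.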
And the ugly
While preparing for the migration to OIE, we found that saml2aws, our primary CLI authentication tool for AWS, doesn’t work with OIE and AWS Federation. After migration, none of our engineers could manipulate AWS infrastructure from their code. This blocker almost buried the FIDO2 project. We had to completely rework the entire integration between Okta and AWS and educate engineers on how to use AWS SSO integration from the CLI.
Lessons learned
- As a precondition, check if all of your critical systems (VPN, SSO, SaaS, cloud provider, on-premise, desktop apps, and combinations of these) support the FIDO2 standard.
- FIDO2 cannot fully replace the password in SSO systems yet. There are still systems where security keys do not work well, mostly due to a lack of support for Web Authentication. Think about how you will authenticate legacy services and apps that lack FIDO2 abilities.
- Several companies are developing solutions for these issues. For example, Okta has developed FastPass, which we plan to use for native apps. Stay tuned for more on this in a future blog post.
2 Trials
We launched alpha trials with small groups of team members one month before beginning our major rollout. Before beginning the trials, we had already prepared initial documentation, such as FAQs and YubiKey setup guides. Trials helped us extend the documentation, identifying bottlenecks and points of confusion.
We uncovered crucial corner cases during our trials. For instance, YubiKeys combine several security interfaces. One of them works as an OTP using a virtual keyboard driver; it types a generated one-time string like “ccccccril….” followed by the Enter key. If you accidentally touch the key’s contact plate while writing in Slack, you can end up messaging this random string of text to hundreds of coworkers. We started to see a lot of Jira tickets that contained these OTP strings! (Fortunately, we figured out how to turn off this behavior.)
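For teams that want to gauge how often this happens, accidental OTP strings are easy to spot in text. The sketch below assumes the default Yubico OTP format of 44 characters drawn from the 16-letter “modhex” alphabet; keys configured differently will not match, and `find_leaked_otps` is a hypothetical helper, not something from our tooling.

```python
import re

# The modhex alphabet used by Yubico OTP output.
MODHEX = "cbdefghijklnrtuv"

# Assumption: default-format OTPs are exactly 44 modhex characters.
# The lookarounds keep us from matching inside a longer lowercase word.
OTP_PATTERN = re.compile(rf"(?<![a-z])[{MODHEX}]{{44}}(?![a-z])")


def find_leaked_otps(text: str) -> list[str]:
    """Return every substring of `text` that looks like a Yubico OTP."""
    return OTP_PATTERN.findall(text)
```

Running this over ticket descriptions or chat exports gives a rough count of accidental touches, which can help show whether disabling the OTP interface is worth prioritizing.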
Biometrics can also give you an unreliable authentication experience. For example, if a user needs to reset their browser session, they won’t be able to log in to Okta anymore, because TouchID-based FIDO2 on macOS requires reenrollment after such a reset.
Lessons learned
- Trials give you an opportunity to fail fast if your assumptions or implementation don’t work.
- Moreover, trials are an essential way to get feedback and polish the user experience. You need to test your solution and manuals and see how they work with your employees’ business processes.
3 Rollout
The challenge of rolling out security keys is that you can’t do it automatically or programmatically; each team member must set up the keys themselves.
At Grammarly, we tackled this in several ways:
- We made FIDO2 a required authenticator in Okta, so users had to enroll their biometric authentication (TouchID).
- As soon as a team member became eligible to receive YubiKeys, a Slack bot developed by the IT team began monitoring their enrolled factors in Okta. If they hadn’t enrolled their YubiKeys, the bot pinged them directly once a week. We decided that “loud” messages on global channels wouldn’t do the job, because we needed people to enroll two keys (in case one is lost). People often register just one key and think they’re done, so they ignore messages on a global channel. It’s much harder to keep ignoring direct pings from a bot!
- In parallel, we automatically gathered statistics on the adoption rate and sent notifications via our custom bot to the managers of people who didn’t enroll keys.
- Last but not least, if a team member enrolled at least one key, special automation would turn off all factors except FIDO2.
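The automation in the list above can be sketched as one weekly pass over team members. This is a simplified model under stated assumptions, not our actual bot: the `Member` type and `weekly_run` function are hypothetical, real factor data would come from the identity provider, and defining the adoption rate as the share of people with two keys enrolled is our choice for the example.

```python
from dataclasses import dataclass, field

REQUIRED_KEYS = 2  # two keys per person, in case one is lost


@dataclass
class Member:
    name: str
    manager: str
    hardware_keys: int = 0
    # Non-FIDO2 factors still active for this person (hypothetical names).
    other_factors: set[str] = field(default_factory=lambda: {"otp", "push"})


def weekly_run(members: list[Member]) -> tuple[list[str], list[str], float]:
    """Return (people to ping directly, managers to notify, adoption rate)."""
    to_ping: list[str] = []
    managers: set[str] = set()
    for m in members:
        if m.hardware_keys >= 1:
            # At least one key enrolled: turn off everything except FIDO2.
            m.other_factors.clear()
        if m.hardware_keys < REQUIRED_KEYS:
            # Assumption: anyone short of two keys gets a direct ping.
            to_ping.append(m.name)
        if m.hardware_keys == 0:
            # Managers hear about people who have not enrolled any key.
            managers.add(m.manager)
    enrolled = sum(1 for m in members if m.hardware_keys >= REQUIRED_KEYS)
    rate = enrolled / len(members) if members else 0.0
    return to_ping, sorted(managers), rate
```

Keeping the three outputs in a single pass means the direct pings, the manager notifications, and the adoption statistics are always computed from the same snapshot of enrollment data.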
Lessons learned
- To speed up your rollout without disrupting day-to-day work, eliminate the other options for each individual as soon as possible.
- Use automation to remind the audience about the deadlines, so you don’t have to do it.
- Start gathering data on your progress early. This is something that managers love to see!
- Communication is one of the most important parts of a FIDO2 rollout. All your employees should know about the new initiatives and understand why you’re changing the process and what benefits it will bring to the company.
4 Logistics
Grammarly introduced its remote-first hybrid work model in 2021. While this has benefited our workforce in countless ways, it meant that our IT team faced a logistical challenge in delivering YubiKeys across the globe.
For new team members, we use the Yubico enterprise portal’s bulk upload functionality to deliver their YubiKeys along with their laptops. For existing team members, we use a self-service model that allows them to order security keys from Yubico’s regional vendors or pick them up from one of Grammarly’s five hubs, our coworking and collaboration spaces.
Lessons learned
- Anticipate the logistics challenge and plan for it early, including questions like how you’ll deal with lost packages.
- Keep in mind that while Yubico’s regional vendor network is extensive, there are still places where the keys can’t be found locally.
Looking forward
In the future, we plan to optimize our logistics and reduce friction by providing a self-service solution that lets team members quickly get new keys and invalidate lost ones, all in two clicks.
We’re also excited about how the FIDO2 standard and the Yubico security hardware keys implementation give Grammarly a solid foundation for going beyond passwords at login.
We hope you learned something from this article that you can apply to your own rollout. Does the way we approach problems at Grammarly resonate with you? We’re hiring! Check out our careers page.