
28 posts tagged with "identity provider"


Automated security versus the security mindset

· 12 min read
Jens Langhammer
CTO at Authentik Security Inc

authentik is an open source Identity Provider that unifies your identity needs into a single platform, replacing Okta, Active Directory, and Auth0. Authentik Security is a public benefit company building on top of the open source project.


Automation plays a large and increasingly important role in cybersecurity. Cybersecurity vendors promote their Machine Learning and Artificial Intelligence products as the inevitable future. However, thanks to the work of security experts like Bruce Schneier, we have more insight into the human adversaries that create the underlying risks to network security, and a better understanding of why teaching humans to have a security mindset is the critical first step to keeping your network safe.

The best response to these malicious actors is to think like a security expert and develop the security mindset.

In this blog post, we examine why automation is such a popular solution to cybersecurity problems—from vulnerability scanning to risk assessments. Then, we will look at those tasks in which security automation by itself proves inadequate, with particular focus on automatic scanning. Next, we make a positive case for why the human factor will always be needed in security. Finally, we will propose that good security isn't a feature. It's a proactive security mindset that's required—one with a human element at its core.


Building an OSS security stack with Loki, Wazuh, and CodeQL to save $100k

· 12 min read



There was an article recently about the first 10 hires at nearly 20 well-known startups, and security engineers didn’t feature at all. Our third hire at Authentik Security was a security engineer so we might be biased, but even startups without the resources for a full-time security hire should have someone on their founding team wearing the security hat, so they get started on the right foot.

As security departments are cost centers (not revenue generators) it’s not unusual for startups to take a tightwad mentality with security. The good news is that you don’t need a big budget to have a good security posture. There are plenty of free and open source tools at your disposal, and a lot of what makes good security is actually organizational practices—many of which don’t cost a thing to implement.

We estimate that using mostly non-commercial security tools saves us approximately $100,000 annually, and the end-result is a robust stack of security tools and processes.

Here’s how we built out our security stack and processes using mostly free and open source software (FOSS).

Everyone agrees zero trust is good but no one correctly implements it

· 12 min read
Jens Langhammer
CTO at Authentik Security Inc



Buzzwords are the scourge of the tech industry – reviled by developers, pushed by vendors, and commanded by executives.

All too often, a buzzword is the first signal of rain (or worse): Marketers have created a trend; vendors are using the trend to explain why you need to buy their software right now; executives are worried about a problem they didn’t know existed before they read that Gartner report; and the downpour rains on developers.

Implement zero trust!

Why aren’t we shifting left?

Are we resilient? Well, can we get more resilient?

After a while, buzzwords start to look like trojan horses, and the invading army feels like a swarm of tasks that will result in little reward or recognition. It’s tempting to retreat to cynicism and to ignore every Term™ that comes your way.

But this can be risky. For better or worse, good ideas inevitably get branded, and if you want to keep up, you need to see past the branding – even if it involves stripping away the marketing fluff to see the nugget of an idea within.

There’s no better example of this than zero trust. In this post, we’ll briefly explore the term's history, explain how it became such an untrustworthy buzzword, and argue that thanks to a few advancements (mainly WireGuard), zero trust will soon go from buzzword to reality.

IPv6 addresses and why you need to make the switch now

· 14 min read
Jens Langhammer
CTO at Authentik Security Inc



IPv6 addresses have been commercially available since 2010. Yet, even after Google’s IPv6 rollout the following year, adoption by the system administrators and security engineers responsible for entire organizations’ networks has been slower than you might expect. Nor does population size, with all the work and personal devices that a large workforce brings, reliably predict which countries have deployed the protocol.

In this blog post, I explain briefly what IP addresses are and how they work; share why at Authentik Security we went full IPv6 in May 2023; and then set out some reasons why you should switch now.

What are IP addresses?

IP Addresses are locations (similar to street addresses) that are assigned to allow system administrators and others to identify and locate every point (often referred to as a node) on a network through which traffic and communication passes via the internet. For example, every server, printer, computer, laptop, and phone in a single workplace network has its own IP address.

We use domain names for websites, to avoid having to remember IP addresses, though our readers who are sysadmins, used to referencing all sorts of nodes deep within their organization’s networks, will recall them at the drop of a hat.

But, increasingly, as more devices come online and 96.6% of internet users now use a smartphone, most of the Internet of Things (IoT) devices in our workplaces and homes also have their own IP address. Things with an IP address include:

  • Computers, laptops and smartphones
  • Database servers, web servers, mail servers, virtual servers (virtual machines), and servers that store software packages for distribution
  • Other devices such as network printers, routers and services running on computer networks
  • Websites, whose domain names are mapped to IP addresses by the Domain Name System (DNS)

IP addresses are centrally overseen by the Internet Assigned Numbers Authority (IANA), with five Regional Internet Registries (RIRs).

What is the state of the IP landscape right now?

Well, it’s all down to numbers.

The previous version of this network layer communications protocol is known as IPv4. From our informed vantage point—looking over the rapid growth of ecommerce, business, government, educational, and entertainment services across the internet—it’s easy to see how its originator could not possibly have predicted that demand for IPv4 addresses would outstrip supply.

Add in the ubiquity of connected devices that allow us to access and consume those services and you can see the problem.

IP address exhaustion was foreseen in the 1980s, which is why the Internet Engineering Task Force (IETF) started work on IPv6 in the early 1990s. The first RIR to run out of IPv4 addresses was ARIN (North America) in 2015, followed by RIPE NCC (Europe) in 2019, and LACNIC (Latin America and the Caribbean) in 2020. The very last free /8 address blocks of IPv4 addresses were issued by IANA in February 2011.

The following realities contributed to the depletion of the IPv4 addresses:

  • IPv4 addresses were designed to use 32 bits and are written with decimal numbers
  • This allowed for 4.3 billion IP addresses

The IPv4 address format is written in 4 groups of up to 3 decimal digits (each group is a number from 0 to 255), each group separated by a period.
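To make the arithmetic concrete, here is a minimal sketch using Python's standard `ipaddress` module. The address `192.0.2.1` comes from a range reserved for documentation; it is our choice for illustration and has nothing to do with our own network.

```python
import ipaddress

# 32 bits give 2**32 possible addresses, roughly 4.3 billion.
IPV4_TOTAL = 2 ** 32

# 192.0.2.1 is a documentation address, used here purely as an example.
addr = ipaddress.IPv4Address("192.0.2.1")

# Each of the four period-separated groups is one octet (8 bits, 0 to 255).
octets = [int(part) for part in str(addr).split(".")]

# The same address viewed as a single 32-bit integer.
as_int = int(addr)

print(IPV4_TOTAL)  # 4294967296
print(octets)      # [192, 0, 2, 1]
print(as_int)      # 3221225985
```

Four octets of 8 bits each are where the 32 bits come from, and 32 bits are where the ceiling of about 4.3 billion addresses comes from.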

Even though IPv4 addresses still trade hands, it’s actually quite difficult now to buy a completely unused block. What’s more, they’re expensive for smaller organizations (currently around $39 each) and leasing is cheaper. Unless you can acquire them from those sources, you’ll likely now be issued IPv6 ones.

Interesting historical fact: IPv5 was developed specifically for streaming video and voice, becoming the basis for VoIP, though it was never widely adopted as a standard protocol.

IPv6 addresses, history and adoption

The development of IPv6 was initiated by the IETF in 1994, and the protocol was published as a draft standard in December 1998. IPv6 went live in June 2012 and was ratified as an internet standard in July 2017.

There is an often circulated metaphor from J. Wiljakka’s IEEE paper, Transition to IPv6 in GPRS and WCDMA Mobile Networks, stating that every grain of sand on every seashore could be allocated its own IPv6 address. Let me illustrate.

  • IPv6 addresses were designed to use 128 bits and are written with hexadecimal digits (10 numerals from 0-9 and 6 letters from A-F).
  • So, how many IPv6 addresses are there? In short, there are over 340 undecillion (roughly 3.4 × 10^38) IP addresses available!

The IPv6 address format is written in 8 groups of 4 hexadecimal digits (each digit represents 4 bits), each group separated by a colon.
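For a sense of scale, the same kind of sketch works for IPv6, again using Python's standard `ipaddress` module. The address `2001:db8::1` is taken from the range reserved for documentation and is only an example.

```python
import ipaddress

# 128 bits give 2**128 possible addresses.
IPV6_TOTAL = 2 ** 128

# 2001:db8::1 is a documentation address, used here purely as an example.
addr = ipaddress.IPv6Address("2001:db8::1")

# The fully written-out form: 8 colon-separated groups of 4 hexadecimal digits.
groups = addr.exploded.split(":")

print(IPV6_TOTAL)
print(addr.exploded)  # 2001:0db8:0000:0000:0000:0000:0000:0001
print(len(groups), [len(group) for group in groups])
```

Eight groups of four hexadecimal digits, at 4 bits per digit, add up to the full 128 bits.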

Importantly, the hierarchical structure optimizes global IP routing, keeping routing tables small.

If you plan to make the switch to IPv6, it’s worth noting that you’ll need to ensure that your devices, router, and ISP all support it.

Upward trend in the worldwide adoption by country

Over 42.9% of Google users worldwide are accessing search using the IPv6 protocol. It’s intriguing to note which countries have adopted IPv6 for the majority of their traffic:

  • France 74.38%
  • Germany 71.52%
  • India 70.18%
  • Malaysia 62.67%
  • Greece 61.43%
  • Saudi Arabia 60.93%

And yet China, Indonesia, Pakistan, Nigeria, and Russia lag surprisingly far behind many others in terms of adoption (between 5% and 15%) given their population size. Even many ISPs have been slow to switch.

You can consult Google’s per country IPv6 adoption statistics to see where your location sits in the league table.

Why we decided on a full IPv6 addresses deployment

The average internet user won’t be aware of anything much beyond what an IP address is, if even that. However for system administrators, IP addresses form a crucial part of an organization’s computer network infrastructure.

In our case, the impetus to use IPv6 addresses for authentik came from our own, internal Infrastructure Engineer, Marc Schmitt. We initially considered configuring IPv4 for internal traffic and, as an interim measure, providing IPv6 at the edge only (remaining with IPv4 for everything else). However, that would still have required providing IPv6 support for customers who needed it.

In the end, we determined it would be more efficient to adopt the IPv6 protocol while we still had time to purchase, deploy, and configure it at our leisure across our existing network. We found it to be mostly a straightforward process. Some applications still did not fully support IPv6, but we were aided by the fact that we use open source software, which meant that we were able to contribute back the changes needed to add IPv6 support to the tools we use. We were thrilled to have close access to responsive communities around some (not all!) of those tools to help with any integration issues. Plausible, our web analytics tool, was especially helpful and supportive in our shift to IPv6.

Future proofing IP addresses on our network and platform

While it seemed like there was no urgent reason to deploy IPv6 across our network, we knew that one day, it would suddenly become pressing once ISPs and larger organizations had completely run out of still-circulating IPv4 addresses.

For those customers who have not yet shifted to IPv6, we still provide IPv4 support at the edge, configuring our load balancers to receive requests over IPv4 and IPv6, and forwarding them internally over IPv6 to our services (such as our customer portal, for example).

Limiting ongoing spend

Deployment of IPv6 can be less expensive as time goes on. If we’d opted to remain with IPv4 even temporarily, we knew we would have needed to buy more IPv4 addresses.

In addition, we were paying our cloud-provider for using the NAT Gateway to convert our IPv4 addresses—all of which are private—to public IP addresses. On top of that, we were also charged a few cents per GB based on users. The costs can mount up, particularly when we pull Docker images multiple times per day. These costs were ongoing and on top of our existing cloud provider subscription. With IPv6, however, since IP addresses are already public—and there is no need to pay for the cost of translating them from private to public—the costs are limited to paying for the amount of data (incoming and outgoing traffic) passing through the network.

Unlimited pods

Specifically when using the IPv4 protocol, there’s a limitation with our cloud provider if you pull IP addresses from the same subnet for both nodes and Kubernetes pods: you are limited in the number of pods (21) you can attach to a single node. With IPv6, the limit is so much higher that it's insignificant.

Clusters setup

All original clusters were only configured for IPv4. It seemed like a good time to build in the IPv6 protocol while we were already investing time in renewing a cluster.

We’d already been planning to switch out a cluster for several reasons:

  • We wanted to build a new cluster using ArgoCD (to replace the existing FluxCD one) for better GitOps, since ArgoCD comes with a built-in UI and provides a test deployment of the changes made in PRs to the application.
  • We wanted to change the Container Network Interface (CNI) to select an IP from the same subnet as further future-proofing for when more clusters are added (a sandbox for Authentik Security and another sandbox for customers, for example). We enhanced our AWS-VPC-CNI with Cilium to handle the interconnections between clusters and currently still use it to grab IPs.

IPv6 ensures everything works out-of-the-box

If you’re a system administrator with limited time and resources, you’ll be concerned with ensuring that all devices, software, or connections are working across your network, and that traffic can flow securely without bottlenecks. So, it’s reassuring to know that IPv6 works out of the box—reducing the onboarding, expense, and maintenance feared by already overburdened sysadmins.

Stateless address auto-configuration (SLAAC)

When it comes to devices, each device on which IPv6 has been enabled will assign itself an IP address by default. With IPv6, there is no need for static or manual DHCP IP address configuration (though manual configuration is still supported). This is how it works:

  1. When a device is switched on, it requests a network prefix.
  2. A router or routers on the link will provide the network prefix to the host.
  3. The device combines the prefix with an interface ID to form its address. Previously, the interface ID was generated from the interface's MAC address. However, having a common IP based on the MAC address raises privacy concerns, so now most devices just generate a random one.
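The two ways of producing an interface ID can be sketched in a few lines of Python. This is a simplified illustration, not what any particular operating system runs: the function names, the MAC address, and the `2001:db8:1:2::/64` prefix are all made up for the example, and real devices apply extra rules when generating privacy addresses rather than drawing a plain random number.

```python
import ipaddress
import secrets


def eui64_interface_id(mac: str) -> int:
    """Older approach: derive the 64-bit interface ID from a MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe between the two halves of the 48-bit MAC to get 64 bits.
    eui = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return int.from_bytes(eui, "big")


def random_interface_id() -> int:
    """Privacy-friendly approach (simplified): a random 64-bit interface ID."""
    return secrets.randbits(64)


def build_address(prefix: str, interface_id: int) -> ipaddress.IPv6Address:
    """Combine a /64 prefix advertised by the router with an interface ID."""
    network = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


# Hypothetical values: a documentation prefix and an invented MAC address.
prefix = "2001:db8:1:2::/64"
mac_based = build_address(prefix, eui64_interface_id("00:1a:2b:3c:4d:5e"))
randomized = build_address(prefix, random_interface_id())

print(mac_based)   # 2001:db8:1:2:21a:2bff:fe3c:4d5e
print(randomized)  # different on every run
```

The MAC-based address is the same on every network the device joins, which is exactly the tracking concern that led to random interface IDs.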

No need to maintain both protocols across your network or convert IPv4 to IPv6

Unless you already have IPv6 deployed right across your network, if your traffic comes in via IPv4 or legacy networks, you’ll have to:

  • Maintain both protocols
  • Route traffic differently, depending on what it is

No IP addresses sharing

Typically, public IP addresses, particularly in Europe, are shared by multiple individual units in a single apartment building, or by multiple homes on the same street. This is not really a problem for private individuals, because most people have private IP addresses assigned to them by their routers.

However, those in charge of the system administration for organizations and workplaces want to avoid sharing IP addresses. We are almost all subject to various country, state, and territory-based data protection and other compliance legislation. This makes it important to reduce the risks posed by improperly configured static IP addresses. And, given the virtually unlimited number of IP addresses now available with the IPv6 protocol, configuring unique IP addresses for every node on a network is possible.

OK but are there any compelling reasons for me to adopt IPv6 addresses now?

If our positive experience and outcomes, as well as the out-of-the-box nature of IPv6 have not yet persuaded you, these reasons might pique your interest.

Ubiquitous support for the IPv6 addresses protocol

Consider how off-putting it is for users that some online services still do not offer otherwise ubiquitous identity protection mechanisms, such as Single Sign-on (SSO) and Multi-factor Authentication (MFA). And, think of systems that do not allow you to switch off or otherwise configure pesky tracking settings that contradict data protection legislation.

Increasingly and in the same way, professionals will all simply assume that our online platforms, network services, smart devices, and tools support the IPv6 protocol, or they might go elsewhere. While not all apps support IPv6, and migration can be risky, putting this off indefinitely could deter buyers from purchasing your software solution.

Man-in-the-Middle hack reduction

Man-in-the-Middle (MITM) attacks rely on redirecting or otherwise changing the communication between two parties using Address Resolution Protocol (ARP) poisoning and other naming-type interceptions. This is how many malicious ecommerce hacks target consumers, via spoofed ecommerce, banking, password reset, or MFA links sent by email or SMS. You are less likely to experience this kind of attack when you deploy and correctly configure the IPv6 protocol, and connect to other networks and nodes on which it is similarly configured. For example, you should enable IPv6 routing, but also include DNS information and network security policies.

Are there any challenges with IPv6 that I should be aware of before starting to make the switch?

Great question! Let’s address each of the stumbling blocks in turn.

Long, multipart hexadecimal numbers

Since they are very long, IPv6 addresses are less memorable than IPv4 ones.

However, this has been alleviated using a built-in abbreviation standard. Here are the general principles:

  • Dropping any leading zeros in a group
  • Replacing a group of all zeros with a single zero
  • Replacing one run of consecutive all-zero groups with a double colon (this can be done only once per address)

Though this might take a moment to memorize, familiarity comes through use.
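You don't have to apply these rules by hand. As a small sketch, Python's standard `ipaddress` module converts between the full and abbreviated forms; both addresses below are invented examples from the documentation range.

```python
import ipaddress

# A fully written-out address (documentation range, example only).
full = "2001:0db8:0000:0000:0000:ff00:0042:8329"
addr = ipaddress.IPv6Address(full)

# Leading zeros dropped, and the run of all-zero groups replaced by "::".
print(addr.compressed)  # 2001:db8::ff00:42:8329

# Going the other way restores every group to 4 digits.
print(addr.exploded)    # 2001:0db8:0000:0000:0000:ff00:0042:8329

# With two equally long runs of zero groups, only one can become "::".
two_runs = ipaddress.IPv6Address("2001:0db8:0000:0000:0001:0000:0000:0001")
print(two_runs.compressed)  # 2001:db8::1:0:0:1
```

The last example shows why the double colon can appear only once: with two of them, there would be no way to tell how many zero groups each one stands for.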

Handling firewalls in IPv6

With IPv4, the deployment of Network Address Translation (NAT) enables system administrators in larger enterprises, with hundreds or thousands of connected and online devices, to provide a sense of security. Devices with private IP addresses are displayed to the public internet via NAT firewalls and routers that mask those private addresses behind a single, public one.

  • This helps to keep organizations’ IP addresses, devices, and networks hidden and secure.
  • Hiding the private IP address discourages malicious attacks that would attempt to target an individual IP address.

Because NAT removes the need for a huge number of public IPv4 addresses, it has additional benefits for sysadmins:

  • Helping to manage the central problem of the limited number of available IPv4 addresses
  • Allowing for flexibility in how you build and configure your network, without having to change IP addresses of internal nodes
  • Limiting the admin burden of assigning and managing IP addresses, particularly if you manage a large number of devices across networks

Firewall filter rules

It is difficult for some to move away from this secure and familiar setup. When it comes to IPv6, however, NAT is not deployed. This might prove to be a concern if you are used to relying on NAT to provide a layer of security across your network.

Instead, while a firewall is still one of the default protective mechanisms, system administrators must deploy filter rules in place of NAT.

  • In your router, you’ll be able to add both IPv4 and IPv6 values—with many device vendors now enabling it by default.
  • Then, if you’ve also configured filtering rules, when packets encounter the router, they’ll meet any firewall filter rules. The filter rule will check if the packet header matches the rule’s filtering condition, including IP information.
    • If it does, the Filter Action will be deployed
    • If not, the packet simply proceeds to the next rule
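The rule-by-rule matching described above can be sketched as a simple first-match loop. This is a toy model for illustration only, not how any specific router or firewall is implemented: the `Rule` fields, the example rules, and the choice to drop anything unmatched are our assumptions, and real filters match on many more header fields.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    source: str               # network the packet's source address must be in
    dest_port: Optional[int]  # None means "any port"
    action: str               # the filter action, e.g. "accept" or "drop"


def matches(rule: Rule, src: str, port: int) -> bool:
    """Check whether the packet header fields satisfy the rule's condition."""
    network = ipaddress.ip_network(rule.source)
    address = ipaddress.ip_address(src)
    if address.version != network.version:
        return False  # an IPv4 rule never matches an IPv6 packet, and vice versa
    return address in network and rule.dest_port in (None, port)


def evaluate(rules: List[Rule], src: str, port: int, default: str = "drop") -> str:
    """Walk the rules in order; the first match decides, else use the default."""
    for rule in rules:
        if matches(rule, src, port):
            return rule.action
    return default


# Hypothetical rule set mixing IPv6 and IPv4 values (documentation ranges).
rules = [
    Rule("2001:db8:a::/48", 443, "accept"),
    Rule("::/0", 22, "drop"),
    Rule("192.0.2.0/24", None, "accept"),
]

print(evaluate(rules, "2001:db8:a::10", 443))  # accept
print(evaluate(rules, "2001:db8:b::1", 22))    # drop
print(evaluate(rules, "192.0.2.7", 8080))      # accept
```

Without NAT hiding internal addresses, an explicit default of "drop" for unmatched inbound traffic is what keeps publicly routable IPv6 nodes from being reachable by accident.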

If you configure filtering on your router, don’t forget to also enable IPv6 there, on your other devices, and with your ISP.

Have you deployed IPv6 addresses to tackle address exhaustion?

Yes, it is true that there is still a way to go before IPv6 is adopted worldwide, as we discussed above. However, as the pace of innovative technologies, solutions, and platforms continues, we predict this will simply become one more common instrument in our tool bag.

We’d be very interested to know what you think of the IPv6 protocol, whether you’ve already converted and how you found the process. Do you have any ongoing challenges?

Join the Authentik Security community on GitHub or Discord, or send us an email at hello@goauthentik.io. We look forward to hearing from you.

Happy Birthday to Us!

· 8 min read



Even though we are shouting Happy Birthday to Us, we want to start by saying:

Thank You to you all, our users and supporters and contributors, our questioners and testers!

We simply would not be here, celebrating our 1-year mark, without your past and present support. While there are only 7 employees at Authentik Security, we know that our flagship product, authentik, has a much bigger team... you all! Our contributors and fellow builders and users are on the same team that took us this far, and we look forward to continuing the journey with you to build our amazing authentication platform on authentik!

"Photo by montatip lilitsanong on Unsplash"

3 ways you (might be) doing containers wrong

· 8 min read
Jens Langhammer
CTO at Authentik Security Inc



There are two ways to judge an application:

  1. Does it do what it’s supposed to do?
  2. Is it easy to run?

This post is about the second.

Using containers is not a best practice in itself. As an infrastructure engineer by background, I’m pretty opinionated about how to set up containers properly. Doing things the “right” way makes things easier not just for you, but for your users as well.

Below are some common mistakes that I see beginners make with containers:

  1. Using one container per application
  2. Installing things at runtime
  3. Writing logs to files instead of stdout

Mistake #1: One container per application

There tend to be two mindsets when approaching setting up containers:

  • The inexperienced usually think 1 container = 1 application
  • The other option is 1 container = 1 service

Your application usually consists of multiple services, and to my mind these should always be separated into their own containers (in keeping with the Single Responsibility Principle).

For example, authentik consists of four components (services):

  • Server
  • Worker
  • Database
  • Cache

With our deployment, that means you get four different containers because they each run one of those four services.

Why you should use one container per service

At the point where you need to scale, or need High Availability, having different processes in separate containers enables horizontal scaling. Because of how authentik deploys, if we need to handle more traffic we can scale up to 50 servers, rather than having to scale up everything. This wouldn’t work if all those components were all bundled together.

Additionally, if you’re using a container orchestrator (whether that’s Kubernetes or something simpler like Docker Compose), if it’s all bundled together, the orchestrator can’t distinguish between components because they’re all in the black box of your container.

Say you want to start up processes in a specific order. This isn’t possible if they’re in a single container (unless you rebuild the entire image). If those processes are separate, you can just tell Docker Compose to start them up in the order you want, or you can run specific components on specific servers.

Of course, your application architecture and deployment model need to support this setup, which is why it’s critical to think about these things when you’re starting out. If you’re reading this and thinking, I have a small-scale, hobby project, this doesn’t apply to me—let me put it this way: you will never regret setting things up the “right” way. It’s not going to come back to bite you if your situation changes later. It also gives users who install the application a lot more freedom and flexibility in how they want to run it.

Mistake #2: Installing things at runtime

Your container image should be complete in itself: it should contain all code and dependencies—everything it needs to run. This is the point of a container—it’s self contained.

I’ve seen people set up their container to download an application from the vendor and install it into the container on startup. While this does work, what happens if you don’t have internet access? What if the vendor shut down and that URL now points to a malicious bit of code?

If you have 100 instances downloading files at startup (or end up scaling to that point), this can lead to rate limiting, failed downloads, or your internet connection getting saturated—it’s just inefficient and causes problems that can be avoided.

Also, don’t use :latest

This leads me to a different but related bad practice: using the :latest tag. It’s a common pitfall for folks who use containers but don’t necessarily build them themselves.

It’s easy to get started with the :latest tag and it’s understandable to want the latest version without having to go into files and manually edit everything. But what can happen is that you update and suddenly it’s pointing to a new version and breaking things.

I’ve seen this happen where you’re just running something on a local server and your disk is full, so you empty out your Docker images. The next time you pull, it’s with a new version which now no longer works and you’re stuck trying to figure out what version you were on before.

Instead: Pin your dependencies

You should be pinning your dependencies to a specific version, and updating to newer versions intentionally rather than by default.

The most reliable way to do this is with a process called GitOps:

  • In the context of Kubernetes, all the YAML files you deploy with Kubernetes are stored in the central Git repository.
  • You have software in your Kubernetes cluster that automatically pulls the files from your Git repo and installs them into the cluster.
  • Then you can use a tool like Dependabot or Renovate to automatically create PRs with a new version (if there is one) so you can test and approve it, and it’s all captured in your Git history.

GitOps might be a bit excessive if you’re only running a small hobby project on a single server, but in any case you should still pin a version.

For a long time, authentik purposefully didn’t have a :latest tag, because people would use it inadvertently (sometimes not realizing they had an auto-updater running). Suddenly something wouldn’t work and there wasn’t really a way to downgrade.

We have since added it due to popular request. This is how authentik’s version tags work:

  • Our version number is made up of 3 numbers reflecting the date of the release, so the latest currently is 2023.10.1.
    • You can either use 2023.10.1 as the tag, pinning to that specific version
    • You can pin to 2023.10, which means that you always get the latest patch version, or
    • You can use 2023, which means you always get the latest version within that year.
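To see what each level of pinning gives you, here is a small Python model of the idea. It is only a sketch: the list of published versions is invented for the example, the function names are ours, and a container registry does not compute anything like this (the publisher simply moves the shorter tags to point at newer images).

```python
from typing import List, Optional


def tag_matches(tag: str, version: str) -> bool:
    """A tag matches every version that starts with the same components."""
    tag_parts = tag.split(".")
    return version.split(".")[: len(tag_parts)] == tag_parts


def resolve(tag: str, published: List[str]) -> Optional[str]:
    """Return the newest published version a tag would point at, if any."""
    candidates = [version for version in published if tag_matches(tag, version)]
    return max(
        candidates,
        key=lambda version: tuple(int(part) for part in version.split(".")),
        default=None,
    )


# Hypothetical release history, for illustration only.
published = ["2023.5.2", "2023.8.3", "2023.10.0", "2023.10.1"]

print(resolve("2023.10.0", published))  # 2023.10.0 (exact pin, never moves)
print(resolve("2023.10", published))    # 2023.10.1 (latest patch release)
print(resolve("2023", published))       # 2023.10.1 (latest release that year)
```

The shorter the tag, the more releases it can move across, so the trade-off is between hands-off updates and knowing exactly what you are running.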

The principle is roughly the same with any project using SemVer: you could just lock to v1, which means you get the latest v1 with all minor patches and fixes, without breaking updates. Then you switch to v2 when you’re ready.

With this approach you are putting some trust in the developer not to publish any breaking changes with the wrong version number (but you’re technically always putting trust in some developer when using someone else’s software!).

Mistake #3: Writing logs to files instead of stdout

This is another issue on the infrastructure side that mainly happens when you put legacy applications into containers. It used to be standard that applications put their log output into a file, and you’d probably have a system daemon set up to rotate those files and archive the old ones. This was great when everything ran on the same server without containers.

A lot of software still logs to files by default, but this makes collecting and aggregating your services’ logs much harder. Docker (and containers in general) expect that you log to standard output so your orchestration platform can route the logs to your monitoring tool of choice.

Docker puts the logs into a JSON file that it can read itself and see the timestamps and which container the log refers to. You can set up log forwarding with both Docker and Kubernetes. If you have a central logging server, the plugin gets the standard output of a container and sends it to that server.

Not logging to stdout just makes it harder for everyone, including making it harder to debug: Instead of just running docker logs + the name of the container, you need to exec into the container, go to find the files, then look at the files to start debugging.

This bad practice is arguably the easiest one to work around

As an engineer you can easily redirect the logs back from a file into the standard output, but there’s no real reason not to do it the “correct” way.
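Doing it the “correct” way is usually only a few lines. As a minimal sketch for a Python service, using only the standard library (the logger name `myapp` and the log format are placeholders, not anything authentik-specific):

```python
import logging
import sys


def configure_logging(stream=sys.stdout) -> logging.Logger:
    """Send all log records to a stream (stdout by default), not to a file."""
    logger = logging.getLogger("myapp")  # placeholder name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


logger = configure_logging()
logger.info("service started")
```

With this in place, `docker logs` shows the output directly, and rotation and shipping are left to the container runtime instead of the application.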

There aren’t many use cases where there’s an advantage to writing your logs directly to a file instead of stdout—in fact the main one is for when you’re making the first mistake (having your whole application in one container)! If you’re running multiple services in one container, then you’ll have logs from multiple different processes in one place, which could be easier to work with in a file vs stdout.

Even if you specifically want your logs to exist in a file, by default if you run docker logs it just reads a JSON file that it adds the logs to, so you’re not losing anything by logging to stdout. You can configure Docker to just put the logs into a plain text file wherever you want to.

It’s a little simplistic, but I’d encourage you to check out The Twelve-Factor App which outlines good practices for making software that’s easy to run.

Are you doing containers differently and is it working for you? Let us know in the comments, or send us an email at hello@goauthentik.io!

Okta got breached again and they still have not learned their lesson

· 7 min read
Jens Langhammer
CTO at Authentik Security Inc

authentik is an open source Identity Provider that unifies your identity needs into a single platform, replacing Okta, Active Directory, and Auth0. Authentik Security is a public benefit company building on top of the open source project.


Another security breach for Okta

Late last week, on October 20, Okta publicly shared that they had experienced a security breach. Fortunately, the damage was limited. However, the incident highlights not only how incredibly vigilant vendors (especially huge vendors of security solutions!) must be, but also how risky the careless following of seemingly reasonable requests can be.

We now know that the breach was enabled by a hacker who used stolen credentials to access the Okta support system. This malicious actor then collected session tokens that were included in HAR (HTTP Archive) files that customers had uploaded to the Okta support system. A HAR file is a JSON-formatted archive that records the HTTP requests and responses a browser makes during a session, including headers and cookies. It is not rare for a support team troubleshooting an issue to request a HAR file from their customer: Zendesk does it, Atlassian does it, Salesforce as well.

So it’s not the HAR file itself; it was what was in the file, and left in the file. And, destructively, we have all been trained not to second-guess support teams, especially the support team at one of the world’s most renowned identity protection vendors.

But it is not all on Okta; every customer impacted by this hack, including 1Password (who communicated the breach to Okta on September 29), BeyondTrust (who communicated the breach on October 2), and Cloudflare (October 18), was "guilty" of uploading HAR files that had not been scrubbed clean and still included session tokens and other sensitive access data. (Cleaning a HAR file is not always a simple task. There are tools like Google's HAR Sanitizer, but even tools like that don't 100% guarantee that the resulting file will be clean.)
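To make the idea of scrubbing concrete, here is a minimal Python sketch that redacts cookies and a few authentication headers from a HAR document before it is shared. It assumes the standard HAR layout (log, then entries, then request and response, each with headers and cookies lists); the function name and the header list are choices made for this example. It does not look inside URLs, query strings, or request and response bodies, where tokens can also hide, so treat it as a starting point rather than a guarantee.

```python
import copy

# Header names (lowercased) whose values are redacted. This list is an
# assumption for the sketch and is not exhaustive.
SENSITIVE_HEADERS = {"cookie", "set-cookie", "authorization"}


def scrub_har(har: dict) -> dict:
    """Return a copy of a HAR document with cookies and auth headers redacted."""
    cleaned = copy.deepcopy(har)  # leave the caller's data untouched
    for entry in cleaned.get("log", {}).get("entries", []):
        for side in ("request", "response"):
            message = entry.get(side, {})
            for header in message.get("headers", []):
                if header.get("name", "").lower() in SENSITIVE_HEADERS:
                    header["value"] = "REDACTED"
            for cookie in message.get("cookies", []):
                cookie["value"] = "REDACTED"
    return cleaned
```

A typical use would be to load the exported file with json.load, pass the result through scrub_har, and write the cleaned copy out with json.dump before attaching it to a support ticket.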

Target the ancillaries

An interesting aspect of this hack was that it exploited the less-considered vulnerability of support teams, which are not usually seen as a typical entryway for hackers.

But security engineers know that hackers go in at the odd, unexpected angles. A classic parallel: when someone wants data that a CEO has, they don’t go to the CEO, they go to (and through) the CEO’s assistant!

Similarly, the support team at Okta was used as an entry point. Once the hacker gained control of a single customer’s account, they worked to take control of the main Okta dashboard and the entire support system. This lateral-to-go-up movement through access control layers is a common technique of hackers.

It’s the response… lesson not yet learned

The timing of Okta's response, not great. The initial denial of the incident, not great. And then, to add insult to injury, there’s what can objectively be labeled an abysmal “announcement” blog from Okta on October 20.

Everything from the obfuscatory title to the blog’s brevity to the actual writing… and importantly, the lack of any mention at all of BeyondTrust, the company that informed Okta on October 2nd that they suspected a breach of the Okta support system.

“Tracking Unauthorized Access to Okta's Support System” has to be the lamest of all confession titles in the history of security breach announcements.

Not to acknowledge that their customers first informed them seems like a willful omission, and it absolutely illustrates that Okta has not yet learned their lesson about transparency, trusting their customers and security partners, and the importance of moving more quickly towards full disclosure. Ironically, BeyondTrust thanks Okta for their efforts and communications during the two-week period of investigation (and denial).

Back to the timing; BeyondTrust has written an excellent article about the breach, with a rather damning timeline of Okta’s responses.

“We raised our concerns of a breach to Okta on October 2nd. Having received no acknowledgement from Okta of a possible breach, we persisted with escalations within Okta until October 19th when Okta security leadership notified us that they had indeed experienced a breach and we were one of their affected customers.” (source)

The BeyondTrust blog provides important details about the persistence and ingenuity of the hacker.

“Within 30 minutes of the administrator uploading the file to Okta’s support portal an attacker used the session cookie from this support ticket, attempting to perform actions in the BeyondTrust Okta environment. BeyondTrust’s custom policies around admin console access initially blocked them, but they pivoted to using admin API actions authenticated with the stolen session cookie. API actions cannot be protected by policies in the same way as actual admin console access. Using the API, they created a backdoor user account using a naming convention like existing service accounts.”

Oddly, the BeyondTrust blog about the breach does a better job of selling Okta (by highlighting the things that went right with Okta) than the Okta announcement blog. For example, in the detailed timeline, BeyondTrust points out that one layer of prevention succeeded when the hacker attempted to access the main internal Okta dashboard: because Okta still treats dashboard access as a new sign-in, it prompted for MFA, thus thwarting the login attempt.

Cloudflare’s revelation of their communications timeline with Okta shows another case of poor response timing by Okta, another situation where the customer informed the breached vendor first, and the breached company took too long to publicly acknowledge the breach.

“In fact, we contacted Okta about the breach of their systems before they had notified us.” … “We detected this activity internally more than 24 hours before we were notified of the breach by Okta.” (source)

In their blog about this incident, Cloudflare provides a helpful set of recommendations to users, including sensible suggestions such as monitoring for new Okta users created, and reactivation of Okta users.
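One way to picture that kind of monitoring is a small filter over exported audit log events. The Python sketch below assumes each event is a dictionary with an eventType field, and that user creation and reactivation show up as "user.lifecycle.create" and "user.lifecycle.reactivate"; those field and event names are assumptions for illustration and should be checked against your identity provider's log documentation.

```python
# Event types to flag. These names are assumptions for this sketch.
WATCHED_EVENT_TYPES = {"user.lifecycle.create", "user.lifecycle.reactivate"}


def suspicious_user_events(events: list) -> list:
    """Return only the events that create or reactivate a user account."""
    return [e for e in events if e.get("eventType") in WATCHED_EVENT_TYPES]


if __name__ == "__main__":
    # Hypothetical sample data, not real log output.
    sample_events = [
        {"eventType": "user.session.start", "actor": "alice"},
        {"eventType": "user.lifecycle.create", "actor": "svc-backup2"},
        {"eventType": "user.lifecycle.reactivate", "actor": "old-admin"},
    ]
    for event in suspicious_user_events(sample_events):
        print(event["eventType"], event["actor"])
```

In practice the flagged events would feed an alert that a human reviews, since legitimate onboarding produces the same event types.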

Which just takes us back to the rather lean response by Okta; their customers wrote much more informative and helpful responses than Okta themselves.

Keep telling us

We can’t be reminded often enough about keeping our tokens safe.

This incident at Okta is parallel to the breach at Sourcegraph that we recently blogged about, in which a token was inadvertently included in a GitHub commit, and thus exposed to the world. With Okta, it was session tokens included in an uploaded HAR file, exposed to a hacker who had already gained access to the Okta support system.

But talk about things that keep security engineers up at night; timing was tight on this one.

The initial breach attempt was noticed by BeyondTrust within only 30 minutes of their having uploaded a HAR file to Okta Support. By default (and this is a good, strong, industry-standard default) Okta session tokens have a lifespan of two hours. However, with hackers moving as quickly as these, two hours is plenty long for the damage to be done. So, the extra step of scrubbing clean any and all files that are uploaded would have saved the day in this case.

Keep your enemies close, but your tokens even closer.

Stay vigilant out there

Lessons learned abound with every breach. Each of us in the software and technology area watches and learns from each attack. In the blog by BeyondTrust, they provide some valuable steps that customers and security teams can take to monitor for possible infiltration.

Strong security relies on multiple layers, enforced processes, and defense-in-depth policies.

“The failure of a single control or process should not result in breach. Here, multiple layers of controls -- e.g. Okta sign on controls, identity security monitoring, and so on, prevented a breach.” (source)

A writer on Hacker News points out that Okta has updated their documentation about generating HAR files, to tell users to sanitize the files first. But whether HAR files or GitHub commits, lack of MFA or misuse of APIs, we all have to stay ever-vigilant to keep ahead of malicious hackers.

Addendum

This blog was edited to provide updates about the 1Password announcement that they too were hacked, and to clarify that the hacker responsible for obtaining session tokens from the HAR files had originally gained entry into the Okta support system using stolen credentials.

How small companies get taxed out of security and why the whole industry suffers

· 13 min read
Jens Langhammer
CTO at Authentik Security Inc


Let’s say you’re working at a small startup: You’re the CTO, your CEO is a good friend, and you have a couple of developers working with you from a previous company. You’re building your initial tech stack, and you start – where else? – with GitHub.

The pricing is simple enough. There’s a pretty feature-rich free plan, but you’re willing to pay up because the Team plan includes features for restricting access to particular branches and protecting secrets.

But the enterprise plan, the plan that costs more than four times as much per user per month – the plan that seems targeted at, well, enterprises – promises “Security, compliance, and flexible deployment.”

Is security… not for startups?

The feature comparison bears this out: Only the enterprise plan offers single sign-on (SSO) functionality as part of the package – a feature that security experts have long agreed is essential. But don’t get mad at GitHub.

Do you want Box? You’ll have to pay twice as much for external two-factor authentication.

Do you want Mailtrap? The team, premium, and business plans won’t do. Only the enterprise plan, which costs over $300 more per month than the team plan, offers SSO.

Do you want HubSpot’s marketing product, but with SSO? Prepare to pay $2,800 more per month than the next cheapest plan.

And these are only a few examples. SSO.tax, a website started by Rob Chahin, gathers many more. If you look through, you’ll see companies like SurveyMonkey and Webflow even restrict SSO to enterprise plans with a Contact Us option instead of a price.

"pricing page"

We need to talk about SCIM: More deviation than standard

· 7 min read
Jens Langhammer
CTO at Authentik Security Inc


As a young security company, we’ve been working on our implementation of SCIM (System for Cross-domain Identity Management), which I’ll share more about below. SCIM is in many ways a great improvement on LDAP, but we’ve run into challenges in implementation and some things just seem to be harder than they need to be. Is it just us?

"authentik admin interface"

Machine-to-machine communication in authentik

· 8 min read
Jens Langhammer
CTO at Authentik Security Inc

authentik is a unified identity platform that helps with all of your authentication needs, replacing Okta, Active Directory, Auth0, and more. Building on the open-source project, Authentik Security Inc is a public benefit company that provides additional features and dedicated support.


We have provided M2M communication in authentik for the past year, and in this blog we want to share some more information about how it works in authentik, and take a look at three use cases.

What is M2M?

Broadly speaking, M2M communication is the process by which machines (devices, laptops, servers, smart appliances, or more precisely the client interface of any thing that can be digitally communicated with) exchange data. Machine-to-machine communication is an important component of IoT, the Internet of Things; M2M is how all of the “things” communicate. So M2M is more about the communication between the devices, while IoT is the larger, more complex, overarching technology.

Interestingly, M2M is also implemented as a communication process between business systems, such as banking services, or payroll workflows. One of the first fields to heavily utilize M2M is the oil and gas industry; everything from monitoring the production (volume, pressure, etc.) of gas wells, to tracking fleets of trucks and sea vessels, to the health of pipelines can be done using M2M communication.

Financial systems, analytics, really any work that involves multi-machine data processing, can be optimized using M2M.

“Machine to machine systems are the key to reliable data processing with near to zero errors” (source)

Where there is communication in software systems, there is both authentication and authorization. The basic definition of the terms is that authentication is about assessing and verifying WHO (the person, device, thing) is involved, while authorization is about what access rights that person or device has. So we choose to use the phrase “machine-to-machine communication” in order to capture both of those important aspects.

Or we could use fun terms like AuthN (authentication) and AuthZ (authorization).
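The difference between the two is easy to see in a few lines of Python. This is a toy sketch: the in-memory tables, the key, the service name, and the permission strings are all hypothetical, and a real system would verify signed tokens or certificates rather than look up a plain key.

```python
from typing import Optional

# Hypothetical lookup tables standing in for a real identity store.
API_KEYS = {"key-123": "billing-service"}
PERMISSIONS = {"billing-service": {"invoices:read"}}


def authenticate(api_key: str) -> Optional[str]:
    """AuthN: work out WHO is calling, or return None if the key is unknown."""
    return API_KEYS.get(api_key)


def authorize(identity: str, permission: str) -> bool:
    """AuthZ: decide WHAT the already-identified caller is allowed to do."""
    return permission in PERMISSIONS.get(identity, set())


if __name__ == "__main__":
    caller = authenticate("key-123")
    if caller is not None and authorize(caller, "invoices:read"):
        print(f"{caller} may read invoices")
```

Keeping the two checks as separate steps means a caller can be correctly identified and still be refused an action it has no rights to.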

So in some ways you can think of M2M as being like an internal API, with data (tokens and keys and certs and all things access-related) being passed back and forth, but specifically for authentication and authorization processes.
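As one possible picture of a machine asking for a token, the Python sketch below builds (but does not send) an OAuth 2.0 client-credentials token request. The text above does not name a specific protocol, so the choice of this grant is an assumption, as are the function name, the example URL, the client ID and secret, and the scope.

```python
import urllib.parse
import urllib.request


def build_token_request(
    token_url: str, client_id: str, client_secret: str, scope: str = "profile"
) -> urllib.request.Request:
    """Build a form-encoded POST asking for a token on behalf of a machine."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",  # no human user is involved
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,
        }
    ).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint and credentials; nothing is sent over the network.
    request = build_token_request(
        "https://idp.example.com/token", "my-service", "not-a-real-secret"
    )
    print(request.get_method(), request.full_url)
```

The calling service would send this request, read the access token from the JSON response, and present that token on its later calls to other services.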

"Screenshot of authentik UI"