OIDC workload identity on AWS

Update: after years of being on the wish list of a ton of top AWS teams, AWS released a built-in version of this feature about two weeks after we published this. Never let it be said gentle ribbing doesn’t work. Also, thanks AWS! We meant it when we said that the only thing better than having something easy to deploy was not needing to deploy anything at all. Everything in this post about workload identity is still relevant but you should probably use upstream’s implementation unless you have a good reason not to (for example, private validators for whom you need a VPC endpoint).

Introduction

We’re big fans of Tailscale. It’s fast, secure, and the developer experience doesn’t make you want to throw your laptop out the window. It makes my cryptographer heart sing, too: it’s based on WireGuard, which is based on Noise, and is one of the cleanest modern cryptographic designs around.

Tailscale just announced Workload Identity beta. Workload identity is an awesome concept that we already used in places like GitHub Actions, so we wanted to put it to work right away.

Just one problem: we run most of our infrastructure on AWS, and AWS doesn’t speak OpenID Connect (OIDC) as an identity provider for its workloads. AWS is happy to accept OIDC tokens from providers like GitHub Actions or Google Cloud, but there’s no built-in way to turn an AWS IAM identity into an OIDC token that Tailscale could verify (or anyone else for that matter).

So we built one. aws-oidc-token-exchange is an open-source bridge that lets your AWS workloads authenticate to OIDC-based services using their existing IAM identities. Let’s talk about why this matters and how it works.

What is workload identity?

“Workload identity” means your services authenticate using short-lived tokens where the platform vouches for the workload’s identity, not secrets you manage.

Traditional authentication uses secrets you create and distribute, like API keys, passwords, or certificates. You rotate them periodically and revoke them when people leave or systems are compromised. (Right? You’d never forget to do that. And you’d always detect all compromises. Surely!) Workload identity flips this: when your Lambda function or EC2 instance needs to prove who it is, it requests a token from the platform. The platform says “yes, I’m running this specific workload, here’s a cryptographically signed assertion of its identity”.

The key difference is who provides the identity attestation. With traditional secrets, you assert and guard your workload’s identity. With workload identity, the platform provider asserts your workload’s identity based on properties it controls: what code is running, on which hardware, with which configuration, under which account.

The benefits: no secret distribution, cryptographic binding to platform infrastructure, automatic expiration without manual rotation, and audit trails from the platform. Access control improves too: you change policies instead of rotating secrets. Tokens are tightly scoped based on platform-verified claims, so revoking access is surgical. Tailscale is particularly good here: you can rapidly lock down access. We find this is a useful in-between for organizations who aren’t ready for dedicated identity management tools but want the generally useful features Tailscale brings.

This isn’t new: Kubernetes has been doing this with service account tokens for years. What’s new is the ecosystem standardizing on OIDC as the interchange format, which means you can use one workload identity to access multiple services.

Tailscale’s take on workload identity

Tailscale’s workload identity feature treats nodes as first-class network identities. This is powerful because it means your EC2 instance, Lambda function, or ECS task can join your tailnet just like a user’s laptop would, but with its own cryptographically verified identity provided by the underlying platform.

This changes how nodes join your network. Normally, adding a node to your tailnet requires an auth key, which is a long-lived secret. Workload identity lets nodes join using short-lived OIDC tokens instead, where the OIDC provider (like our bridge) vouches for the workload’s identity.

Because Tailscale knows the workload’s identity claims (such as AWS account, role name, …), you can assign ACL tags based on those platform-verified claims. “All EC2 instances in our production account with the web-server role can access the database” becomes expressible as policy, not a pile of manually managed auth keys.
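To make the idea of claims-as-policy concrete, here is a minimal Python sketch of matching verified claims against rules to decide which tags a node receives. This is an illustration only, not Tailscale’s policy syntax or implementation: the `RULES` structure, the `tags_for` helper, and the `tag:prod-web` tag are hypothetical, and the claim names are borrowed from the bridge’s example token.

```python
from typing import Dict, List

# Hypothetical rules: each maps required claim values to the tags to grant.
# Claim names follow the bridge's example token; tag names are made up.
RULES: List[dict] = [
    {
        "match": {"aws:account": "123456789012", "aws:arn:role_name": "web-server"},
        "tags": ["tag:prod-web"],
    },
]


def tags_for(claims: Dict[str, object], rules: List[dict] = RULES) -> List[str]:
    """Return the tags granted to a workload whose claims were already verified."""
    granted: List[str] = []
    for rule in rules:
        # A rule applies only if every required claim matches exactly.
        if all(claims.get(name) == value for name, value in rule["match"].items()):
            granted.extend(rule["tags"])
    return granted
```

The point of the sketch is that access follows from platform-verified properties: changing who gets in means editing a rule, not reissuing a secret.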

This also improves your audit logs by letting you tie AWS identities to Tailscale ones.

Tailscale’s implementation supports OIDC federation, which means they’ll accept tokens from any OpenID Provider you configure. They verify the signature against the provider’s public keys (via JWKS, which is just a way to standardize where to find your keys, and indirectly root your identity into WebPKI), validate the claims, and grant access accordingly.

AWS’s workload identity story

AWS has a strong workload identity model: IAM. Every EC2 instance, Lambda function, and ECS task has an IAM role that uniquely identifies it. The instance metadata service (IMDS) provides temporary credentials with platform-backed identity attestation: your workload can prove it’s running on AWS infrastructure with specific properties. The problem isn’t the identity; it’s the protocol.

AWS speaks SigV4 for authentication. It’s a solid signature-based protocol (evolved from several broken schemes) where your AWS credentials sign requests, proving you possess the secret key associated with an IAM identity. This works great for AWS services, but it doesn’t help when you need to authenticate to a third-party service that expects OIDC tokens.

AWS does support OIDC, but only inbound. You can use AssumeRoleWithWebIdentity to let GitHub Actions or Google Cloud workloads assume AWS roles. Cognito lets you associate OIDC identities with AWS ones. IAM Roles Anywhere lets external workloads authenticate to AWS using X.509 certificates. IRSA (IAM Roles for Service Accounts) gives EKS Kubernetes service accounts AWS credentials via OIDC. You can register OIDC identity providers with AWS so that the identities they vouch for can assume roles. All of these are about external identities authenticating to AWS, mostly via OIDC.

What AWS doesn’t provide is the outbound direction: you can’t use your AWS identity to authenticate to an external service. There’s no built-in way for an AWS workload to obtain an OIDC token that external services can verify.

Building the bridge

Our solution is conceptually simple: build an OpenID Provider that issues OIDC tokens for AWS identities.

Architecture

The bridge consists of three endpoints. The token exchange endpoint (/token) accepts SigV4-signed requests, extracts the caller’s IAM identity from the platform, and returns a signed OIDC token. The JWKS endpoint (/.well-known/jwks.json) and the discovery endpoint (/.well-known/openid-configuration) help third-party services figure out how to use the provider.
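As a rough illustration of what the discovery endpoint serves, the Python sketch below builds a minimal OpenID Provider metadata document. The field names come from the OpenID Connect Discovery specification, but the exact set of fields and values the bridge publishes is an assumption here, as is the `discovery_document` helper itself.

```python
import json


def discovery_document(issuer: str) -> dict:
    """Build a minimal provider metadata document for a token-only issuer.

    Assumption: the real bridge may publish more (or different) fields.
    """
    issuer = issuer.rstrip("/")  # must match the token's iss claim exactly
    return {
        "issuer": issuer,
        # Where relying parties fetch the public signing keys.
        "jwks_uri": f"{issuer}/.well-known/jwks.json",
        # The bridge's token exchange endpoint described above.
        "token_endpoint": f"{issuer}/token",
        "response_types_supported": ["id_token"],
        "subject_types_supported": ["public"],
        "id_token_signing_alg_values_supported": ["RS256"],
        "claims_supported": ["iss", "sub", "aud", "iat", "exp"],
    }


if __name__ == "__main__":
    print(json.dumps(discovery_document("https://your-domain.com"), indent=2))
```

A relying party only needs the issuer URL: it fetches this document, follows `jwks_uri`, and from there can verify tokens without any further configuration.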

A note about cryptography

We use a KMS asymmetric key (RSA-2048 by default) to sign JWTs. The private key never leaves AWS’s hardware security modules, and we can cryptographically prove which AWS account issued a token by verifying it against the public key.

In a cryptographically ideal world, RSA-2048, certainly with PKCS#1 v1.5 padding, isn’t what we’d pick. Even ECDSA on P-256, the other realistic option, isn’t the modern cryptography favorite. Heck, this entire thing is built on JWTs and we haven’t exactly been subtle about our feelings there either. However, we don’t get to pick the standard here: we’re interoperating with existing systems. Additionally, a lot of the concerns are mitigated because of the hardened implementation. The Cryptographic Right Answers series is about helping developers pick safe defaults: it does not say that every deviation is a vulnerability.

If you’re wondering why not post-quantum cryptography: these are short-lived tokens used for authentication. A future quantum computer may be able to break that key, after which the attacker would be able to forge tokens. By then no-one would accept those tokens anymore: they’re only being used for authentication, implying a short validity window. This design is also rooted in WebPKI, which today has the same limitation. Therefore, it doesn’t make much sense to accept all the downsides of post-quantum cryptography today (performance, key size, ecosystem support, …) when an attacker with that capability would just attack WebPKI to have the JWKS attest to a key they control. JWKS means we can easily upgrade to quantum-resistant algorithms down the line.

Post-quantum algorithms would be important if these tokens were used for confidentiality or non-repudiation and the validity window were long enough that you couldn’t rule out practical quantum computers arriving within it. The linked blog post goes into more detail, including what you should do in situations where it does matter, like TLS key exchange.

Security properties

The design leans on AWS’s existing security guarantees rather than inventing new ones. AWS Lambda (via Function URLs or API Gateway) verifies the workload’s identity before your handler runs, giving you platform-backed identity attestation. Everything derives from IAM and KMS.

KMS handles signing with HSM-backed keys, so private key material never leaves specialized hardware. Tokens are short-lived (10 minutes by default), limiting blast radius. All token requests appear in CloudWatch logs for audit trails. AWS credentials never cross the network: workloads present signed tokens instead.

How it works

Flow

Here’s the flow when an AWS workload wants to authenticate to Tailscale:

1. The workload makes a SigV4-signed HTTP request to the token exchange endpoint, specifying the audience (for example: tailscale.com).
2. AWS-provided IAM authentication verifies the SigV4 signature and extracts platform-provided identity details into the handler’s request context, such as account ID and role ARN.
3. The handler builds a standard OIDC token with standard claims and a few AWS ones. Standard claims include iss (issuer), sub (subject, here the AWS ARN), aud (audience), iat (timestamp this token was minted), exp (timestamp this token expires). The AWS-specific claims include information like the account ID and role name.
4. The Lambda calls KMS to sign the JWT header and payload. Then, it assembles the final JWT (header.payload.signature).
5. The workload gets back the JWT, a standard OIDC token it can present to compatible Relying Parties.
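The claim-building and assembly steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the bridge’s actual handler: `build_jwt` and its parameters are hypothetical, only a couple of the AWS-specific claims are included, and the `sign` callable stands in for the KMS signing call (in a real deployment it would wrap something like a KMS Sign request using RSASSA-PKCS1-v1_5 with SHA-256).

```python
import base64
import json
import time
from typing import Callable, Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_jwt(issuer: str, arn: str, account: str, audience: str, key_id: str,
              sign: Callable[[bytes], bytes], ttl: int = 600,
              now: Optional[int] = None) -> str:
    """Assemble header.payload.signature; `sign` is a stand-in for KMS."""
    now = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT", "kid": key_id}
    claims = {
        "iss": issuer, "sub": arn, "aud": audience,
        "iat": now, "exp": now + ttl,          # short-lived by construction
        "aws:account": account, "aws:arn": arn,
    }
    # The signature covers the ASCII bytes of "<header>.<payload>".
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"
```

Passing the signer in as a callable keeps the private key out of the handler entirely: the function only ever sees the bytes to sign and the signature that comes back.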

Then, an RP (like Tailscale) verifies the token by:

1. Fetching the JWKS from our public endpoint
2. Verifying the JWT signature using the public key
3. Validating the claims (such as issuer, audience, expiration)
4. Using the verified claims for authorization decisions
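The claim validation step might look roughly like the Python sketch below. It assumes the signature has already been verified against the JWKS, and it is not Tailscale’s code: `validate_claims`, `ClaimError`, and the 30-second clock-skew `leeway` are all hypothetical choices made for illustration.

```python
import time
from typing import Optional


class ClaimError(ValueError):
    """Raised when a token's claims do not match what the verifier expects."""


def validate_claims(claims: dict, issuer: str, audience: str,
                    now: Optional[float] = None, leeway: int = 30) -> dict:
    """Check iss, aud, exp and iat on claims whose signature is ALREADY verified."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ClaimError("unexpected issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a list
    if audience not in audiences:
        raise ClaimError("unexpected audience")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now > exp + leeway:
        raise ClaimError("token expired or missing exp")
    iat = claims.get("iat")
    if isinstance(iat, (int, float)) and iat > now + leeway:
        raise ClaimError("token issued in the future")
    return claims
```

The audience check is what stops a token minted for one service from being replayed against another, which is why the workload has to name its audience when it requests the token.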

As a sequence diagram:

sequenceDiagram
    participant W as AWS Workload
    participant B as OIDC Bridge<br/>(Lambda/API Gateway)
    participant KMS as AWS KMS
    participant RP as Relying Party<br/>(e.g., Tailscale)

    Note over W,RP: Token Request Flow
    W->>B: SigV4-signed HTTP request<br/>(audience: tailscale.com)
    Note over B: AWS IAM verifies SigV4 signature<br/>and extracts identity (account, role ARN)
    B->>B: Build OIDC token with claims<br/>(iss, sub, aud, iat, exp, aws:*)
    B->>KMS: Sign JWT (header.payload)
    KMS-->>B: Signature
    B->>B: Assemble JWT<br/>(header.payload.signature)
    B-->>W: OIDC token (JWT)

    Note over W,RP: Token Verification Flow
    W->>RP: Present OIDC token
    RP->>B: Fetch JWKS (public key)
    B-->>RP: Public key
    RP->>RP: Verify JWT signature
    RP->>RP: Validate claims<br/>(issuer, audience, expiration)
    RP->>RP: Authorization decision<br/>based on verified claims
    RP-->>W: Access granted

Example usage

The aws-oidc-token-exchange README describes how to deploy the infrastructure. Assuming you have AWS credentials, you can issue a curl command to retrieve a token:

curl -sS "https://your-domain.com/token?audience=tailscale.com" \
    --aws-sigv4 aws:amz:us-east-2:execute-api \
    --user "${AWS_ACCESS_KEY_ID}:${AWS_SECRET_ACCESS_KEY}" \
    --header "x-amz-security-token: ${AWS_SESSION_TOKEN}" \
    --header "Accept: application/json"

Curl signs the request with SigV4 using whatever AWS credentials are provided. You get back a JSON response containing the JWT:

{
  "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6I...",
  "token_type": "Bearer",
  "expires_in": 600
}

Decode the JWT and you’ll see claims like:

{
  "iss": "https://your-domain.com",
  "sub": "arn:aws:sts::123456789012:assumed-role/MyRole/session",
  "aud": "tailscale.com",
  "iat": 1730736000,
  "exp": 1730736600,
  "aws:account": "123456789012",
  "aws:arn": "arn:aws:sts::123456789012:assumed-role/MyRole/session",
  "aws:user_id": "AROAI...:session",
  "aws:caller_id": "AROAI...:session",
  "aws:access_key": "ASIAI...",
  "aws:principal_org_id": "o-...",
  "aws:source_ip": "0.0.0.0",
  "aws:user_agent": "curl/8.7.1",
  "aws:arn:partition": "aws",
  "aws:arn:service": "sts",
  "aws:arn:account": "123456789012",
  "aws:arn:resource_type": "assumed-role",
  "aws:arn:resource_name": "MyRole/session",
  "aws:arn:role_name": "MyRole",
  "aws:arn:session_name": "session"
}
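If you want to inspect a token’s claims yourself, a few lines of Python are enough. The `decode_jwt_payload` helper below is a hypothetical debugging aid, not part of the project: it deliberately skips signature verification, so its output must never be used for authorization decisions.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWTs strip base64 padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Pass it the value of the access_token field from the response and it returns the claims as a dictionary.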

Integrating with Tailscale

On the Tailscale side, you configure your bridge as an OIDC provider. Register your issuer URL in Tailscale’s admin console, validate any of the available claims against expected values, and choose which tags the nodes should receive. Finally, configure the workloads that should receive those tags to use the generated audience when requesting tokens from the issuer.

Your AWS workloads can now join the tailnet by presenting their OIDC tokens. The full integration guide is in the repository’s Tailscale documentation.

A note about Tailscale Services

This feature pairs well with the (also brand new) Services feature. It has lots of use cases, but it provides a new solution to an operational issue you might’ve run into if you’re putting a lot of nodes on your tailnet. When you have ephemeral workloads joining and leaving rapidly, you accumulate increasingly ridiculous names: adminconsole-5, adminconsole-17, adminconsole-1-1-final-FINAL (okay, we made that last one up, but you get the idea). If you start using workload identity, you’re likely putting more stuff on your tailnet, and so you’re more likely to have this issue if you hadn’t yet.

Tailscale Services let you define stable service names that persist across multiple backend hosts. The service gets a consistent identity and DNS name regardless of which specific workload is handling traffic at any given moment. Your ECS task authenticates with its AWS IAM role via an OIDC token, joins your tailnet, and exposes a service with a stable name. The task might come and go, but the service identity stays put.

Conclusion

The challenge with workload identity has been feature availability and a lack of interoperability. Every platform has its own identity system (AWS IAM, Google Cloud service accounts, Kubernetes service accounts), and these systems don’t talk to each other. OIDC is emerging as the standard that changes that. It’s not perfect, but it’s good enough: the nitpicks are outweighed by the risks being mitigated.

Tailscale’s workload identity feature demonstrates this. Combining network identity with workload identity means you know exactly which workload accessed what, you have no static auth keys to manage, and your ACLs are based on tags assigned after claim verification.

Our bridge fills the gap for AWS users who want to participate in this ecosystem. It’s not a hosted service (you run it in your own AWS account), but it’s serverless, cheap (a few dollars a month for moderate usage), and secure. The code is on GitHub with docs.

And if you’re at AWS reading this: please build OIDC token issuance into IAM or STS. We’d happily deprecate this project if you did.

Thanks to Alex Gaynor, Matthew McPherrin, and Xavier Garceau-Aranda for reviewing this blog post.