Saturday, November 25, 2023

Service to Service Auth with Azure AD, MSI & OAuth 2.0 (Step by Step)

 

An appropriate service to service auth flow, I’ve imagined, should look something like the following:

Service to Service OAuth flow
Potential Auth flow between two services

At its core, I’d wanted to fulfill the following requirements from the auth layer — given two communicating services, A and B:

  • Service A should be able to expose some specific ‘roles’ / ‘scopes’ (I’ll be using roles throughout this article), for instance, ‘Service A Reader’ and ‘Service A Writer’
  • An authorized Azure user should be able to grant Service B access to some, or all, roles of Service A without tenant-wide admin consent
  • Roles should be visible in the signed JWT OAuth token, so RBAC could be implemented on top of it
  • Access to a Service A endpoint with a given token should be verifiable by the token’s audience (aud JWT field), validity start time (nbf JWT field), expiry (exp JWT field), signature (the JWT’s final segment), issuer (iss JWT field) and roles (e.g. ‘Service A Writer’ for a write operation)
  • Bonus #1: Make Azure AD deny token creation for AD entities with no assigned roles (i.e. only role-bearing clients would be able to create a token for our service, in the first place)
  • Bonus #2: Have the ability to use Managed Identities (MSI & IMDS) in order to issue access tokens from specific VMs without storing the generated client credentials
  • Bonus #3: Minimize manual processes that require user interaction with Azure Portal

After a day or so of playing with Azure AD’s enterprise applications solution, I’ve managed to satisfy the above criteria, and in this article I’ll delve into each point individually.

Choosing an appropriate OAuth 2 flow

We’ll be using OAuth 2 in our solution, and so one of the first things we need to cover is choosing an appropriate OAuth 2 flow.

A quick overview of Azure AD’s OAuth 2 flows is given below (feel free to skip if you’re already familiar with them):

  • Authorization code flow: Requires user interaction and consent, typically via the web browser, to get a code which is then used to issue an access token
  • Implicit grant flow: Created for single page web / mobile webview apps, where token creation and handling is done entirely from the front end
  • On-behalf-of flow: Helps us use one user-bound access token to create another user-bound access token for a different resource (e.g. Service A uses a given token to issue a new, user-bound token for Service B)
  • Device code flow: Useful for input-constrained devices (IoT peripherals, for instance); lets the user consent on an out-of-band channel, such as the browser on their computer, via a code and URL displayed on the constrained device
  • Resource Owner Password Credentials (ROPC) flow: Mostly used in legacy applications, where a service holds the actual user credentials
  • SAML bearer assertion flow: An interop between SAML assertions and OAuth, which allows a service to use an already issued SAML assertion to create an OAuth access token
  • Client Credentials flow: The only flow that does not require immediate user interaction, usually used when the OAuth client is acting on its own behalf, when user consent doesn’t make sense, or when authorization primitives can be configured out-of-band (for instance via Azure AD)

Reviewing the supported OAuth flows against our use case, where the ‘user’ is a service that would like to issue a token for a different service, the client credentials flow seems most appropriate, and indeed we’ll be using it throughout the rest of the article.

Application registration within AAD

Now that we’ve settled on the OAuth flow to be used by our services, we need to be able to identify them within Azure AD. To that end, let’s register our applications (Note: You must have app-registration permissions within your Azure AD instance).

Registering our applications via the portal

1. Go to your Azure AD instance

Azure AD instance

2. Click on ‘App registrations’ on the side bar

App registrations

3. Click on ‘New registration’ on the top command bar

4. Choose an appropriate name, the tenant scope (single / multi tenant app), and ‘Web API’ as the platform configuration

5. Repeat steps 1–4 above for Service B

Great ! We now have our services identifiable in Azure AD via the application registrations we’ve just created.

Adding custom service roles

Our next requirement is to be able to expose custom roles in our applications — let’s do just that, in this example, I’m going to add ‘Service.A.Reader’ and ‘Service.A.Writer’ roles to the previously created ‘Service A’ application.

Adding custom service roles via Azure Portal

  1. Go to the ‘App registration’ blade within Azure Active Directory
App registrations

2. Click on your application within the registration blade

Owned application

3. Go to the application manifest

AAD application manifest

4. Add your custom application role objects to the ‘appRoles’ JSON field within the manifest:

"appRoles": [{
    "allowedMemberTypes": ["Application"],
    "description": "Reader Role",
    "displayName": "Service A Reader",
    "id": "13371337-1337-1337-1337-133713371337",
    "isEnabled": true,
    "value": "Service.A.Reader"
},
{
    "allowedMemberTypes": ["Application"],
    "description": "Writer Role",
    "displayName": "Service A Writer",
    "id": "13371337-1337-1337-1337-133713371338",
    "isEnabled": true,
    "value": "Service.A.Writer"
}],

Note: The role “id” must be a GUID that is unique within the application’s own manifest; there is no uniqueness constraint between the roles of different applications.

5. Click on ‘Save’ to save the manifest

You have now successfully configured custom roles for your application. Using the above as a template, you may create any custom role with your own service semantics (starting with CRUD-like semantics might make sense for a lot of services, though).
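If you would rather generate the role objects than hand-edit them, the manifest fragment above can be produced with a few lines of Python. This is only a sketch: the helper name make_app_role is made up for illustration, and the random GUIDs it generates are an assumption (the fixed IDs used in this article work just as well).

```python
import json
import uuid


def make_app_role(value, display_name, description, role_id=None):
    """Return one appRoles entry for an application-only (client credentials) role."""
    return {
        "allowedMemberTypes": ["Application"],
        "description": description,
        "displayName": display_name,
        # A fresh random GUID is used unless the caller pins a specific id.
        "id": role_id or str(uuid.uuid4()),
        "isEnabled": True,
        "value": value,
    }


roles = [
    make_app_role("Service.A.Reader", "Service A Reader", "Reader Role"),
    make_app_role("Service.A.Writer", "Service A Writer", "Writer Role"),
]

# JSON text that can be pasted into the manifest's appRoles field.
manifest_fragment = json.dumps({"appRoles": roles}, indent=2)
```

Printing manifest_fragment gives JSON with real booleans for isEnabled, which is what the manifest editor expects.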

Setting an Application ID URI (OAuth resource URI) and generating App Credentials

At this point, we’ve introduced two applications to our Azure AD instance, and have configured some custom roles for one of them.

This is a good point to segue into configuring our applications to support the OAuth client credentials flow — for this we’d need to:

  1. Define a unique Application ID URI, and
  2. Generate app credentials

Adding an Application ID URI via Azure Portal

  1. Within Azure AD App registration blade, go to your application (as shown in previous steps)
  2. Go to the ‘Expose an API’ blade
Expose API blade

3. Click on the ‘Set’ button near the Application ID URI div

Set application ID URI

4. You may accept the default value of ‘api://<appId>’, In this example I’ll set it to ‘api://service-a.example.com’ (Note: The URI could be any string with a supported URI scheme, e.g. api://, https://, …)

5. Repeat the above for service B, changing the URI appropriately

Neat — You have now configured a valid application URI to be used when issuing OAuth tokens for your services.

Generating application credentials via Azure Portal

Moving on, we would now like for service B to access service A, given the application URI set for it above.

In order to generate a valid access token with Azure AD, we’d need to generate application credentials to be used when authenticating our service to Azure AD.

  1. Within Azure AD app registration blade, go to Service B (the client of service A — as shown in the previous steps)
  2. Go to the ‘Certificates & secrets’ blade
Certificates and secrets blade

3. Click on ‘New client secret’ to generate application credentials (Note: you can think of the application ID as a username, and the generated secret as a password, for authenticating to Azure AD)

New client secret

4. Set an appropriate description, and choose an expiry time which you’re comfortable with, then click ‘Add’

Adding application credentials

5. Save the generated credentials value, we’ll use it later on (from here on, I’ll reference it as ‘client secret’)

Issuing & inspecting our first OAuth token

At this stage, we should be able to issue tokens to Service A, on behalf of Service B — let’s see that in action.

  1. In Azure AD application registration blade, go to Service B (as shown in previous steps)
  2. In the Overview blade, Click on the ‘Endpoints’ button at the command bar
Azure AD endpoints

3. In the opened Endpoints blade, copy the OAuth 2.0 token endpoint (v2) URL

OAuth 2 token endpoint (v2)

4. Issue an HTTP POST call to the given URL with the following parameters (Note: Be sure to replace all templated parameters annotated by angle brackets, e.g. <param>):

$> curl -s -XPOST <token-v2-endpoint> \
-d grant_type=client_credentials \
-d client_id=<service-b-app-id> \
-d client_secret=<service-b-client-secret> \
-d scope=<service-a-application-id-uri>/.default

Note: Append a “/.default” to the configured URI, in our example it would be: api://service-a.example.com/.default

5. You should now have been issued a valid, signed, JWT-encoded access token for accessing Service A on behalf of Service B. Copy the ‘access_token’ value to https://jwt.ms to inspect your token
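For services that issue tokens from code rather than from a shell, the curl call in step 4 could be sketched in Python roughly as follows. It uses only the standard library; the function names build_token_request and fetch_access_token are illustrative, and the endpoint, client ID and secret are the same placeholders as in the curl example.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, app_id_uri):
    """Build (but do not send) the client-credentials POST request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # The v2 endpoint expects the Application ID URI followed by "/.default".
        "scope": app_id_uri.rstrip("/") + "/.default",
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_access_token(request):
    """Send the request and return the 'access_token' field of the JSON response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]
```

Calling fetch_access_token(build_token_request(...)) with Service B’s credentials and ‘api://service-a.example.com’ should return the same token string that curl prints in the ‘access_token’ field.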

Decoded JWT

Inspecting the above, we could note that:

  • The kid (key id) field in the header identifies the certificate which was used to sign the JWT
  • The aud (audience) field represents our ‘server’ Application ID URI(Service A, in our example)
  • The iss (issuer) field represents the Security Token Service that issued the token (Azure AD)
  • The nbf (not before) field represents the minimum acceptance time of the token (i.e. the token isn’t valid before that time)
  • The exp field denotes the expiry time as epoch in seconds
  • The appid field shows the client id (Service B, in our example)
  • We would use all of the above later on, when verifying that a token is indeed valid
  • You can read more about all the other fields in the JWT here

Note that one thing is missing from our token: any mention of the custom roles we’ve defined on Service A. We’ll deal with that in the next step.
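If you would rather not paste tokens into a website, the same inspection can be done locally. The sketch below only decodes the header and payload; it does not verify anything, so it is suitable for debugging and not for authorization decisions. The function name peek_jwt is made up for this example.

```python
import base64
import json


def _b64url_decode(segment):
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek_jwt(token):
    """Return (header, claims) WITHOUT verifying the signature. Inspection only."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    claims = json.loads(_b64url_decode(payload_b64))
    return header, claims
```

For the token issued above, header["kid"] and claims["aud"], claims["iss"], claims["nbf"], claims["exp"] and claims["appid"] correspond to the fields listed in the bullets.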

Granting custom roles to an Azure AD application (without Tenant-Admin Consent)

In order to implement RBAC on top of our issued OAuth JWT access tokens, we’d like to grant Service B a custom role of Service A.

Usually, the above is implemented via OAuth scopes, and you might’ve seen that in Azure AD’s ‘Expose an API’ blade

OAuth scopes in AAD

However, adding custom scopes, at least in Azure AD — only applies for OAuth flows with user consent (not client credentials).

When using the client credentials flow, we must fall back to using application roles instead. If we were to try to add our configured Service.A.Reader role via the API permissions blade in Azure Portal, we’d quickly notice that this operation requires tenant-admin consent:

Tenant Admin consent

In large enterprises, though, a tenant-admin consent might involve quite a bit of ceremony — dealing with in house ticketing systems, or talking to a remote IT / Ops department might take up to a few days, in some cases.

To grant a custom role to an application, without the need of admin consent, we can instead use the Microsoft Graph API (Note: There’s no way to execute this operation within Azure Portal, as of the time of writing).

Using Graph API to grant roles to a registered application

We’ll be invoking the appRoleAssignment creation API from Microsoft Graph on service A — this could be easily invoked via Azure CLI:

$> az rest \
--method post \
--uri https://graph.microsoft.com/beta/servicePrincipals/<service-a-enterprise-object-id>/appRoleAssignments \
--headers "{\"content-type\": \"application/json\"}" \
--body "{\"appRoleId\": \"<roleId>\", \"principalId\": \"<service-b-enterprise-object-id>\", \"principalType\": \"ServicePrincipal\", \"resourceId\": \"<service-a-enterprise-object-id>\"}"

Let’s break it down:

  • We are invoking the appRoleAssignments creation Graph API on the Enterprise Application of Service A, denoted by its Object ID (more on the difference between an Enterprise Application and an app registration below). Note that this is similar to how you would control access to Azure resources, like Azure Key Vault, where you add client grants to the ‘server’ service
  • We are passing it a role ID (In this tutorial, the Service.A.Reader role has an ID of 13371337-1337-1337-1337-133713371337)
  • We are passing in the principal ID of the Enterprise Application of Service B (Denoted by the Object ID)
  • The principal type is “Service Principal”
  • We pass in a target resource, which is once again, the Object ID of the Enterprise Application of Service A
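Hand-escaping JSON inside a shell string is error-prone, so it can help to build the URL and body programmatically and hand them to az rest (or any HTTP client). The following is a minimal sketch of the request described in the bullets above; build_role_assignment is an illustrative name, and the beta endpoint is the same one used in the command.

```python
import json

GRAPH_BASE = "https://graph.microsoft.com/beta"  # same beta endpoint as the az rest call


def build_role_assignment(resource_sp_id, client_sp_id, app_role_id):
    """Return (url, json_body) granting app_role_id of the resource to the client.

    Both IDs are Enterprise Application (service principal) object IDs.
    """
    url = f"{GRAPH_BASE}/servicePrincipals/{resource_sp_id}/appRoleAssignments"
    body = {
        "appRoleId": app_role_id,
        "principalId": client_sp_id,
        "principalType": "ServicePrincipal",
        "resourceId": resource_sp_id,
    }
    return url, json.dumps(body)
```

The returned url and body can be passed to the --uri and --body arguments respectively, which avoids the nested backslash escaping.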

In order to get the proper Object ID of both services, you should:

  1. Go to the App registration blade in Azure AD (as shown in previous steps)
  2. Go to the relevant application (as shown in previous steps)
  3. In the Overview tab, verify the application has an Enterprise Application associated with it, by looking at the “Managed application in local directory” div
Create service principal — enterprise application

4. If you see a ‘Create Service Principal’ link like above, click on it and wait for a few minutes — this will create an Enterprise Application instance for your app registration

5. Once an Enterprise Application is created, you should see a hyperlink pointing to it in the same div above

Enterprise application hyperlink

6. Clicking on the above will lead you to the relevant Enterprise Application, which will show the Object ID you should use in the REST call above

Enterprise Application Object ID

7. Alternatively, you could search your application in the Enterprise Applications blade of your Azure AD instance

A successful invocation should look similar to the below:

This command is in preview. It may be changed/removed in a future release.
{
"@odata.context": "https://graph.microsoft.com/beta/$metadata#appRoleAssignments/$entity",
"appRoleId": "13371337-1337-1337-1337-133713371339",
"creationTimestamp": "2019-10-21T18:20:53.9088367Z",
"id": "Hl8A-br-1kugbGxg1gpt2sQktBGJedJNiheeF4x1KBo",
"principalDisplayName": "Service B",
"principalId": "f9005f1e-feba-4bd6-a06c-6c60d60a6dda",
"principalType": "ServicePrincipal",
"resourceDisplayName": "Service A",
"resourceId": "38b8c0f9-837a-4abd-816f-bc51282519e2"
}

We may now verify that the role was in fact granted, by:

  1. Going to the Enterprise Applications blade within Azure AD
Enterprise application blade

2. Clicking on the Enterprise application instance of Service A (the ‘Server’ app)

Service A enterprise application

3. Going to the Users and Groups blade

Users and groups blade

4. The users table should show Service B, in a fashion similar to the below:

As of the time of writing, the Azure Portal doesn’t render ServicePrincipal types correctly (note the missing assigned role and the corrupted icon)

Note: You might’ve noticed that the object ID shown in the app registration blade for your service, differs from the object ID in the enterprise application blade.

The difference between an app registration and an enterprise application is that an app registration is the definition of your application in Azure AD (this could be single or multi tenant); the object ID shown there represents the application itself.

An enterprise application, though, is a unique ‘instance’ of a given application within your Azure AD directory; You may install multi-tenant applications (from, say, the Azure AD gallery) in several directories, each of them will get a unique service principal (object id) in the enterprise application blade.

Inspecting assigned roles within the JWT

We have now assigned our custom roles to Service B, and subsequent AAD issued tokens should include them within the JWT — let’s take a quick look:

  1. Issue a new access token (similar to previous steps)
$> curl -s -XPOST <token-v2-endpoint> \
-d grant_type=client_credentials \
-d client_id=<service-b-app-id> \
-d client_secret=<service-b-client-secret> \
-d scope=<service-a-application-id-uri>/.default

2. Inspect the access token in https://jwt.ms

JWT with roles

3. Note that the JWT above now holds a ‘roles’ field, containing the role we’ve assigned to Service B

We may now verify that a token holds a requested role, and implement RBAC on top of it.
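A role check on top of the decoded claims can be very small. The sketch below assumes the token has already been validated and decoded into a dict; the mapping from operation kind to role (REQUIRED_ROLE) is a hypothetical example built from this article’s two roles.

```python
def has_role(claims, required_role):
    """True if the decoded JWT claims carry the required application role."""
    roles = claims.get("roles", [])
    # Guard against a malformed token where 'roles' is a single string.
    if isinstance(roles, str):
        roles = [roles]
    return required_role in roles


# Hypothetical mapping from operation kind to the role it requires.
REQUIRED_ROLE = {"read": "Service.A.Reader", "write": "Service.A.Writer"}


def is_authorized(claims, operation):
    """Authorize an operation ('read' or 'write') against the token's roles."""
    required = REQUIRED_ROLE.get(operation)
    return required is not None and has_role(claims, required)
```

Unknown operations are denied by default, which is usually the safer choice for an authorization helper.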

Enforcing that tokens can only be issued to applications with one or more roles

Depending on your use case, it might be beneficial to delegate some authorization work to Azure AD, and have it enforce that only applications that have assigned roles can issue tokens for your services.

The above could be configured on the server application (Service A, in this write up):

  1. Open the Properties blade of your service’s Enterprise Application instance (as shown in previous steps)
Properties blade in Enterprise Applications

2. Set the ‘User assignment required?’ option to ‘Yes’

Enforce user assignment

3. Click the ‘Save’ button in the command bar

Save button

That’s it — issuing tokens with applications that do not have at least one application role will now fail, as shown below:

{
"error":"invalid_grant",
"error_description":"AADSTS501051: Application 'dd4f719c-fd7b-44f7-9c83-3eae26c72df6'(Service C) is not assigned to a role for the application 'api://service-a.example.com'(Service A).\r\nTrace ID: ce5fd681-b9fd-4d5c-a4d3-e10bcc072100\r\nCorrelation ID: bcc383b8-3680-41a2-b4c8-44fb33308776\r\nTimestamp: 2019-10-21 19:41:18Z",
"error_codes":[501051],
"timestamp":"2019-10-21 19:41:18Z",
"trace_id":"ce5fd681-b9fd-4d5c-a4d3-e10bcc072100",
"correlation_id":"bcc383b8-3680-41a2-b4c8-44fb33308776",
"error_uri":"https://login.microsoftonline.com/error?code=501051"
}
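A client might want to tell this specific failure apart from, say, a wrong secret, so it can surface a clearer message to whoever operates it. A small sketch, assuming the error body has the shape shown above (the function name is illustrative):

```python
import json

ROLE_NOT_ASSIGNED = 501051  # error code from the sample response above


def is_role_assignment_error(response_body):
    """True if a token-endpoint error body reports a missing role assignment."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return ROLE_NOT_ASSIGNED in payload.get("error_codes", [])
```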

Using Managed Identities (MSI) and IMDS to issue tokens, without an app secret

If you’re like me, you want to keep the number of secrets floating around in your services to a minimum. Managed Identities are pretty cool, and can accommodate this need.

Using an Azure managed identity, and assigning it to an Azure VM, you’ll be able to issue tokens from that VM, on behalf of the identity, without supplying any credentials.

Let’s create a user-assigned managed identity, assign the ‘Service.A.Writer’ role to it, bind it to an Azure VM, and use IMDS to issue a token for service A:

  1. Create a resource group to host the identity via azure CLI:
$> az group create --location westus --name "service-b-identity-resource-group"

2. Create a managed user identity in the above resource group via azure CLI:

$> az identity create --location westus --resource-group "service-b-identity-resource-group" --name "service-b-identity"

A successful operation should output something similar to:

{
"clientId": "15bd7d57-d563-433b-b018-d411baff4d49",
"clientSecretUrl": "https://control-westus.identity.azure.net/subscriptions/<subscription-id>/resourcegroups/service-b-identity-resource-group/providers/Microsoft.ManagedIdentity/userAssignedIdentities/service-b-identity/credentials?tid=<tenant-id>&oid=<object-id>&aid=<client-id>",
"id": "/subscriptions/<subscription-id>/resourcegroups/service-b-identity-resource-group/providers/Microsoft.ManagedIdentity/userAssignedIdentities/service-b-identity",
"location": "westus",
"name": "service-b-identity",
"principalId": "<object-id>",
"resourceGroup": "service-b-identity-resource-group",
"tags": {},
"tenantId": "<tenant-id>",
"type": "Microsoft.ManagedIdentity/userAssignedIdentities"
}

3. Assign a custom role to the identity (You may need to wait a couple of minutes before issuing the below, or run it twice if you get a Request_ResourceNotFound error)

$> az rest --method post --uri https://graph.microsoft.com/beta/servicePrincipals/<service-a-enterprise-object-id>/appRoleAssignments --headers "{\"content-type\": \"application/json\"}" --body "{\"appRoleId\": \"13371337-1337-1337-1337-133713371338\", \"principalId\": \"<user-identity-principal-id>\", \"resourceId\": \"<service-a-enterprise-object-id>\", \"principalType\": \"ServicePrincipal\"}"

Once again, note that we need to use the Enterprise Application object ID of service A, and the object ID of the identity (available in the principalId field of the az CLI response in step #2)

4. Assign the identity to an Azure VM (most likely, a VM that will host service B)

$> az vm identity assign -g "<vm-resource-group>" -n "<vm-name>" --identities "<user-identity-id>"

Note: We need to use the id field from the az CLI response in step #2

5. Login to the VM and use IMDS to issue a token, without secrets

$> curl -s -H Metadata:true "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2019-06-04&resource=api://service-a.example.com&client_id=<identity-client-id>"

Note: You should use the clientId field given at the output of step #2; The client id query param can be omitted if the VM has only one identity assigned, however passing the id is a best practice

6. Inspect the token in https://jwt.ms

Managed identities JWT

Note how the Service.A.Writer role is assigned in the token above.
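From application code running on the VM, the IMDS call in step 5 could look roughly like the sketch below. It mirrors the curl command (same address, api-version and Metadata header); the function names are illustrative, and the request only succeeds from inside an Azure VM that has the identity assigned.

```python
import json
import urllib.parse
import urllib.request

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_request(resource, client_id=None, api_version="2019-06-04"):
    """Build the IMDS token request for the given resource (Application ID URI)."""
    params = {"api-version": api_version, "resource": resource}
    if client_id:
        # Selects which user-assigned identity to use when the VM has several.
        params["client_id"] = client_id
    url = IMDS_TOKEN_URL + "?" + urllib.parse.urlencode(params)
    # IMDS rejects requests that lack the Metadata header.
    return urllib.request.Request(url, headers={"Metadata": "true"})


def fetch_imds_token(request):
    """Send the request from inside the VM and return the access token string."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)["access_token"]
```

No secret appears anywhere in this code, which is the whole point of using a managed identity.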

Validating the JWT — step by step

We now have the ability to issue access tokens that contain any number of custom app roles; Below I describe how service A should validate that a given token is valid.

  1. Validate that token is passed via a header / other mechanism (depending on the protocol used)
  2. Parse the JWT via the library of your choice (I’ve settled on jjwt for Java). Note that you might need to trim the signature at this step (keep the last dot in the JWT, delete everything after it), because you don’t yet have the public key matching the key that signed the JWT
  3. Verify the aud (audience) field, it should match your application ID URI
  4. Verify the iss (issuer) field, it should match https://sts.windows.net/<tid>/ for v1.0-format tokens, or https://login.microsoftonline.com/<tid>/v2.0 for v2.0-format tokens (you might want to just compare the host, for multi tenant applications)
  5. Verify that the current time is after the timestamp given at the nbf (not before) field
  6. Verify that the current time is before the timestamp given at the exp (expiry) field
  7. Optionally: Verify that the appid field contains a white-listed application, if such verification makes sense in your use case
  8. Verify that an appropriate role is available under the roles field
  9. If the above checks pass, the token contents are valid, and you should continue to verify its signature
  10. Retrieve the jwks uri from the OIDC configuration endpoint (this endpoint is available under the Endpoints blade in the app registration blade, shown in previous steps)
OIDC endpoint

11. Issuing an HTTP GET for the endpoint above, will yield a response with a jwks_uri, take note of it

jwks_uri

12. Issuing an HTTP GET for the jwks endpoint, will yield a response with a list of certificates, identifiable by a kid (key id)

jwks_uri keys

13. Back to our JWT — you should now verify that the kid available in the JWT’s header is in the keys list

14. Once verified, you should take note of the certificate chain in the x5c field of the jwks endpoint response

15. Depending on the library you use, you may need to extract the public key from the first certificate in the chain

16. With the public key at hand, use the JWT library of your choice, this time without omitting the JWT signature, to validate the signature of your token. Alternatively, use the public key to verify the signature over the JWT’s header and payload yourself
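The claim checks in steps 3 to 8 can be sketched in plain Python as below. This covers only the claims: it does not verify the signature, which still has to happen (steps 9 to 16) before the token is trusted. The names validate_claims and TokenRejected are illustrative, and the leeway parameter, a small tolerance for clock skew between machines, is an assumption that is not part of the steps above.

```python
import time


class TokenRejected(Exception):
    """Raised when a claim check fails."""


def validate_claims(claims, expected_audience, expected_issuer,
                    required_role, now=None, leeway=60):
    """Check aud, iss, nbf, exp and roles on already-decoded claims.

    Signature verification is NOT performed here.
    """
    now = time.time() if now is None else now
    if claims.get("aud") != expected_audience:
        raise TokenRejected("unexpected audience")
    if claims.get("iss") != expected_issuer:
        raise TokenRejected("unexpected issuer")
    # Reject tokens that are missing the time claims altogether.
    if "nbf" not in claims or now + leeway < claims["nbf"]:
        raise TokenRejected("token not yet valid")
    if "exp" not in claims or now - leeway >= claims["exp"]:
        raise TokenRejected("token expired")
    if required_role not in claims.get("roles", []):
        raise TokenRejected("missing role")
```

Raising an exception per failed check keeps the reason available for logging, while the caller only needs to return a generic 401 or 403 to the client.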

Do note that:

  1. For multi tenant apps, you might want to use the issuing tenant’s jwks endpoint (you may use the tid field from the JWT)
  2. You might want to keep an in memory copy of the STS public keys (jwks endpoint response) via having a background thread (or separate service) that keeps those up to date
  3. You will probably want to introduce additional caching logic; verifying the same token over and over, for instance, is redundant
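The in-memory key cache from note 2 could be sketched as follows. Instead of a background thread, this version refreshes lazily when the cache is older than max_age or when an unknown kid shows up; that is a design choice made for brevity here. The class name JwksCache and the fetch_jwks callable (anything that returns the parsed JSON of the jwks endpoint) are assumptions for illustration.

```python
import base64
import time


class JwksCache:
    """Keep the signing keys in memory and refresh them after max_age seconds."""

    def __init__(self, fetch_jwks, max_age=3600, clock=time.monotonic):
        self._fetch_jwks = fetch_jwks  # callable returning the parsed jwks JSON
        self._max_age = max_age
        self._clock = clock
        self._keys = {}
        self._loaded_at = None

    def _refresh(self):
        document = self._fetch_jwks()
        self._keys = {key["kid"]: key for key in document.get("keys", [])}
        self._loaded_at = self._clock()

    def certificate_der(self, kid):
        """Return the DER bytes of the first x5c certificate for kid, or None."""
        stale = (self._loaded_at is None
                 or self._clock() - self._loaded_at > self._max_age)
        if stale or kid not in self._keys:
            self._refresh()
        key = self._keys.get(kid)
        if key is None or not key.get("x5c"):
            return None
        # x5c entries are standard base64 (not base64url) DER certificates.
        return base64.b64decode(key["x5c"][0])
```

Note that an unknown kid triggers a refresh on every call, so a production version would likely rate-limit refreshes to avoid being pushed into hammering the jwks endpoint by tokens with bogus key IDs.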

Automating application creation with Azure CLI

At our last stop, we’d like to automate the application creation to eliminate any Azure Portal user interaction.

Let’s do it step by step:

  1. Creating an application registration via Azure CLI
$> az ad app create --display-name "Auto Service A" --credential-description "Creds" --identifier-uris "api://auto-service-a.example.com"  --password "s0meSt0ngP@ssword1!" --app-roles @roles.json

roles.json should be a file in the current directory, with the following content (You may add or remove entries as you wish):

[{
    "allowedMemberTypes": [
        "Application"
    ],
    "description": "Auto Service A Reader",
    "displayName": "Auto Service A Reader",
    "isEnabled": true,
    "value": "Auto.Service.A.Reader",
    "id": "13371337-1337-1337-1337-133713371337"
}]

The above will create an application with our custom roles pre-assigned

2. Creating a service principal for a given application via Azure CLI

$> az ad sp create --id <app-id>

Use the app ID given in the appId field in the response of step #1

3. Enforcing app roles assignment required for token issuing

$> az ad sp update --id <object-id> --set appRoleAssignmentRequired=true

Use the object ID given in the objectId field in the response of step #2

4. Repeat 1–3 for the client application, change parameters where applicable

5. Assigning roles for a given application

$> az rest --method post --uri https://graph.microsoft.com/beta/servicePrincipals/<sp-service-a-object-id>/appRoleAssignments --headers "{\"content-type\": \"application/json\"}" --body "{\"appRoleId\": \"<role-id>\", \"principalId\": \"<sp-service-b-object-id>\", \"principalType\": \"ServicePrincipal\", \"resourceId\": \"<sp-service-a-object-id>\"}"

Note that the object IDs are the ones in the objectId field from the output of step #3. Also, you may need to run this twice if you get a Request_ResourceNotFound error (this API is still in beta, and a bit flaky, at the time of writing)

That is all — we have automatically:

  • Created two app registrations, with one or more custom app roles
  • Created two service principals, each bound to its app registration
  • Enforced app role assignment requirement for token issuing
  • Assigned one or more custom roles to a client-app service principal, on the server-app service principal
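In a fully automated script, the “run it twice” advice for Request_ResourceNotFound can be wrapped in a small retry helper. The sketch below is generic: operation stands for whatever function performs the role assignment call, and it is assumed (hypothetically) to raise RuntimeError with the Graph error code in its message on failure. The attempt count and delay are arbitrary defaults.

```python
import time


def retry_on_not_found(operation, attempts=3, delay_seconds=30, sleep=time.sleep):
    """Call operation() until it stops failing with Request_ResourceNotFound.

    Any other error, or a failure on the last attempt, is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RuntimeError as error:
            retryable = "Request_ResourceNotFound" in str(error)
            if not retryable or attempt == attempts:
                raise
            # Give the directory time to replicate the new service principal.
            sleep(delay_seconds)
```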

Final thoughts

Following the steps above, you should have been able to meet all of the required service to service communication auth layer criteria we’ve discussed at the beginning of the article.

I believe that using the outlined steps, we’ve been able to meet most modern authentication and authorization requirements for ‘userless’ microservice communication, on top of Azure AD.

Thanks for reading !

Feel free to leave your comments, tips or tricks below.
