gladjohn (Contributor)

Updated the document to reflect changes from 'Short-Lived Credential (SLC)' to 'Certificate' terminology and clarified the handling of certificate revocation scenarios.

Fixes - Spec update

Changes proposed in this request
This pull request updates the documentation for MSI V2 credential revocation, clarifying and expanding the specification to focus on certificate revocation (rather than short-lived credentials) and aligning terminology, flows, error handling, and acceptance tests with current implementation and Azure AD error codes. The changes provide detailed guidance on how MSAL should handle certificate revocation, claims challenges, and telemetry.

Key documentation improvements:

Terminology and Flow Updates:

  • Changed terminology throughout from "Short-Lived Credential (SLC)" to "certificate," clarifying that the revocation and renewal process is certificate-based. Sequence diagrams and flow descriptions are updated for accuracy.

Error Handling and Remediation:

  • Added explicit mapping of AADSTS error codes (1000610–1000614) to certificate/attestation failures and detailed required MSAL remediation steps, including bypassing the cache and minting a new certificate.
  • Provided updated pseudo-code and acceptance tests to reflect these flows, emphasizing auto-remediation and claims challenge handling.
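
As a rough illustration of the error mapping described above, the decision "does this eSTS failure call for a new certificate?" could be sketched as follows. The function name and its parameters are hypothetical and not taken from MSAL; treating `invalid_client` with no AADSTS code as a certificate problem follows the "Unspecified Credential Issue" acceptance test later in this PR.

```python
from typing import Optional

# Codes the PR description ties to certificate/attestation failures.
CERT_FAILURE_CODES = range(1000610, 1000615)  # 1000610..1000614 inclusive


def needs_new_certificate(error: Optional[str], aadsts_code: Optional[int]) -> bool:
    """Return True when the sketch would bypass the cache and mint a new certificate."""
    if error != "invalid_client":
        return False
    # A missing code is treated as an unspecified credential issue.
    return aadsts_code is None or aadsts_code in CERT_FAILURE_CODES
```

Any other error, or an AADSTS code outside the range, is surfaced to the caller without minting a new certificate in this sketch.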

Claims Challenge Handling:

  • Clarified the process for handling claims challenges: when an application receives a 401 with claims, it must pass the claims to MSAL, which then mints a new certificate and retries the token request with the claims.
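
A minimal sketch of the claims challenge path, from the application's point of view. It assumes the resource returns the claims as a base64-encoded `claims="..."` parameter in the `WWW-Authenticate` header, which is an assumption about the challenge format rather than something this PR specifies. `issue_credential` and `request_token` are hypothetical callables standing in for the IMDS and eSTS calls.

```python
import base64
import re
from typing import Callable, Optional

_CLAIMS_PARAM = re.compile(r'claims="([^"]+)"')


def extract_claims(www_authenticate: str) -> Optional[str]:
    """Pull the claims JSON out of a 401 challenge header, or None if absent."""
    match = _CLAIMS_PARAM.search(www_authenticate)
    if match is None:
        return None
    encoded = match.group(1)
    # Restore any padding that was stripped from the base64 value.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.b64decode(padded).decode("utf-8")


def acquire_token_with_claims(
    claims: str,
    issue_credential: Callable[[bool], str],
    request_token: Callable[[str, str], object],
):
    """Mint a fresh certificate (cache bypassed) and retry the token request with claims."""
    certificate = issue_credential(True)  # corresponds to bypass_cache=true
    return request_token(certificate, claims)
```

The application only extracts and forwards the claims; minting the certificate and retrying stay inside the library.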

Acceptance Tests and Telemetry:

  • Revised acceptance test scenarios to match the new flows, including auto-remediation, claims challenges, and telemetry validation.
  • Updated telemetry documentation for MsalMsiCounter to reflect the new tags and expected values for improved diagnostics.

Testing
n/a

Performance impact
n/a

Documentation
n/a

gladjohn requested a review from a team as a code owner on September 23, 2025 at 23:37
@@ -1,46 +1,46 @@
-# Short-Lived Credential (SLC) Revocation Specification
+# Certificate Revocation Specification

Member:
The name of the file should also be changed.

Contributor Author:
updated


## Overview

-This document outlines the design and implementation details for short-lived credential (SLC) revocation in MSI V2 scenarios.
+This document outlines the design and implementation details for **certificate revocation** in MSI V2 scenarios.

Member:
Should we have one document that discusses all credential revocation (cert and token)?

Contributor Author:
yes

-MSAL ->> IMDS: 2. Request Short-Lived Credential (SLC)
-IMDS -->> MSAL: 3. Return SLC
-MSAL ->> eSTS: 4. Exchange SLC for Access Token
+MSAL ->> IMDS: 2. Request Credential (certificate)

Member:
I think it's safe to use "certificate" everywhere. I don't see a use case of another type of credential (jwt).

Contributor Author:
updated

- MSAL will **relay the response to IMDS as-is**, ensuring support for future suberror codes without requiring modifications.
1. Call the certificate minting endpoint **`/issuecredential?bypass_cache=true`** to force a **new certificate** (ignore any cached/invalid state).
2. Replace the current certificate with the newly issued one.
3. **Retry** the token request with eSTS using the new certificate.

Member:
Please specify retry policy. IMO, retry once?

Contributor Author:
@Robbie-Microsoft keeping this as STS retry
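
For reference, the three remediation steps quoted in this thread can be sketched as a bounded loop. The thread leaves the exact retry policy open; the sketch assumes a single remediation attempt, as the reviewer suggests, purely so that the loop terminates. `TokenResponse`, `issue_credential`, and `request_token` are hypothetical names for illustration, not MSAL types.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TokenResponse:
    access_token: Optional[str] = None
    error: Optional[str] = None
    aadsts_code: Optional[int] = None


def is_certificate_failure(resp: TokenResponse) -> bool:
    """invalid_client with AADSTS1000610..1000614, or with no code at all."""
    return resp.error == "invalid_client" and (
        resp.aadsts_code is None or 1000610 <= resp.aadsts_code <= 1000614
    )


def acquire_token(
    issue_credential: Callable[[bool], str],
    request_token: Callable[[str], TokenResponse],
    max_remediations: int = 1,  # assumption: one remediation attempt
) -> TokenResponse:
    certificate = issue_credential(False)  # normal path, cache allowed
    response = request_token(certificate)
    remediations = 0
    while is_certificate_failure(response) and remediations < max_remediations:
        # Steps 1 and 2: force a new certificate and replace the current one.
        certificate = issue_credential(True)  # /issuecredential?bypass_cache=true
        # Step 3: retry the token request with the new certificate.
        response = request_token(certificate)
        remediations += 1
    return response
```

If the retry fails with the same class of error, the last response is returned as-is, so the failure is surfaced rather than retried indefinitely.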

| **Telemetry Validation** | Ensure `MsalMsiCounter` correctly logs telemetry tags such as `MsiSource`, `TokenType`, `bypassCache`, and `CredentialOutcome`. | Telemetry records correct values for each token acquisition attempt, including failures. |
| **Test Case** | **Description** | **Expected Outcome** |
|-------------------------------------------------|---------------------------------------------------------------------------------------------------------|----------------------|
| **AADSTS1000610–14 Auto-Remediation** | eSTS returns 401 `invalid_client` with any of 1000610–1000614. | MSAL calls `/issuecredential?bypass_cache=true`, obtains a new certificate, retries; success or deterministic failure surfaced. |

Member:
I would split this into 2 - success after retry / failure after retry to make it clearer.

Contributor Author:
updated

bgavrilMS (Member) left a comment:
Approved with comments

7. **Platform**
The runtime/OS environment.
- Examples: `"net6.0-linux"`, `"net472-windows"`
Each time we record `MsalMsiCounter`, include:

Member:
At this point in time, I do not feel that the client telemetry is significant. We can add a feature request for this, but I would not prioritize it today.

Is there any server telemetry that we want to capture?

Contributor Author:
Yes. There is server-side telemetry that we send, and we also know from ESTS that mTLS was used.


1. **MsiSource** — `"AppService"`, `"CloudShell"`, `"AzureArc"`, `"ImdsV1"`, `"ImdsV2"`, `"ServiceFabric"`
2. **TokenType** — `"Bearer"`, `"POP"`, `"mtls_pop"`
3. **bypassCache** — `"true"` / `"false"`

Member:
Will this be captured in the server telemetry? In other words, given a correlation ID, can we identify the call made to the credential endpoint and to the token endpoint?

Contributor Author:
yes

3. **bypassCache** — `"true"` / `"false"`
4. **CertType** — `"Platform"`, `"inMemory"`, `"UserProvided"`
5. **CredentialOutcome** — `Not found` / `Retry Failed` / `Retry Succeeded` / `Success`
6. **MsalVersion** — e.g., `"4.61.0"`

Member:
Version and platform are included by default in the counter, I think.

Contributor Author:
removed
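
To make the remaining tag set concrete, here is a small sketch that assembles the `MsalMsiCounter` tags from the values listed in this thread, with `MsalVersion` and `Platform` left out since they are believed to be included by default. The helper name and the validation step are assumptions for illustration; how the counter is actually recorded is not shown in this PR.

```python
ALLOWED_TAG_VALUES = {
    "MsiSource": {"AppService", "CloudShell", "AzureArc", "ImdsV1", "ImdsV2", "ServiceFabric"},
    "TokenType": {"Bearer", "POP", "mtls_pop"},
    "bypassCache": {"true", "false"},
    "CertType": {"Platform", "inMemory", "UserProvided"},
    "CredentialOutcome": {"Not found", "Retry Failed", "Retry Succeeded", "Success"},
}


def build_counter_tags(msi_source, token_type, bypass_cache, cert_type, credential_outcome):
    """Build the tag dictionary for one token acquisition attempt (hypothetical helper)."""
    tags = {
        "MsiSource": msi_source,
        "TokenType": token_type,
        "bypassCache": "true" if bypass_cache else "false",
        "CertType": cert_type,
        "CredentialOutcome": credential_outcome,
    }
    # Reject values outside the documented set so typos do not reach telemetry.
    for name, value in tags.items():
        if value not in ALLOWED_TAG_VALUES[name]:
            raise ValueError(f"unexpected value {value!r} for tag {name}")
    return tags
```

A remediated request would then carry, for example, `bypassCache="true"` together with `CredentialOutcome="Retry Succeeded"`.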

bgavrilMS (Member) left a comment:
Not sure about telemetry. Maybe start with having those values logged. Let's focus more on server telemetry.

gladjohn requested a review from bgavrilMS on September 25, 2025 at 00:11
participant eSTS

Application ->> MSAL: 1. Request Access Token
MSAL ->> IMDS: 2. Request Certificate

Contributor:
The CSR metadata request is part of the flow. I think you should add it here.

Contributor Author:
added

participant eSTS

Application ->> MSAL: 1. Request Access Token
MSAL ->> IMDS: 2. Request Certificate

Contributor:
Mention /issuecredential like you do below

Contributor Author:
added

MSAL ->> eSTS: 4. Exchange Certificate for Access Token
eSTS -->> MSAL: 5. Response (HTTP 200 / error)

alt Token Revoked / Attestation invalid

Contributor:
both alts are the same except for 5a and 5b. Do you need to repeat the entire portion of the graph after those sections? Can you just group 5a and 5b together on the graph?

Contributor Author:
these are two different error conditions

participant Application
participant MSAL
participant IMDS
participant eSTS

Contributor:
Why is there a "IMDS/eSTS" section? Why not point towards the already existing "IMDS" and "eSTS" sections?

Contributor Author:
one is for cert revocation, that MSAL handles internally. The other is for token revocation.

// Certificate/attestation validation failures (AADSTS1000610–1000614), or
// invalid_client with no AADSTS code (unspecified credential issue)
if (aadsts is null or 1000610 or 1000611 or 1000612 or 1000613 or 1000614)
{
// Force a NEW certificate and retry

Contributor:
We're going to force a second cert refresh?

Contributor:
you already do it above if claims are provided

Contributor Author:
these are two different use cases.

| **AADSTS1000610–14 Auto-Remediation (Retry Fails)** | Same initial condition as above. New cert minted, retry still fails deterministically (e.g., repeated same code). | Failure surfaced after retry. (`CredentialOutcome=Retry Failed`) |
| **Unspecified Credential Issue** | eSTS returns `invalid_client` without codes. MSAL forces new certificate and retries. | Token succeeds or failure surfaced (assert correct `CredentialOutcome`). |
| **Claims Challenge Path** | Resource 401 with claims; app supplies claims; MSAL re-mints cert (`bypass_cache=true`) and retries with claims. | New token with claims. (`CredentialOutcome=Success`) |
| **IMDS/IssueCredential Failure Path** | `/issuecredential` call fails (network / service / malformed). | Failure returned; no infinite retry. (`CredentialOutcome=Retry Failed` if after a retry attempt) |

Contributor:
specify which retry policy

Contributor Author:
not sure I follow this comment

Updated revocation scenarios and clarified certificate request process.