Add support for Azure managed identity #4897

Open
wants to merge 20 commits into
base: master

Conversation

bentsherman
Member

Close #4871

Thanks @swampie for the minimal example. Is there any config required to authenticate with Azure Batch? Currently with this PR, no batch credentials will be provided if only the managed identity is specified

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

netlify bot commented Apr 9, 2024

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 127376a
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/6633550974c0d00008901595
😎 Deploy Preview https://deploy-preview-4897--nextflow-docs-staging.netlify.app

Member

@pditommaso pditommaso left a comment


Sounds good, but there are zero unit tests.

@swampie

swampie commented Apr 11, 2024

@bentsherman in the example I instantiate a DefaultAzureCredential, as you do in your change. This should be enough, since the instance(s) are configured to use that clientId.

@adamrtalbot
Collaborator

I'm with Ben - where do you authenticate against the Batch service?

Collaborator

@adamrtalbot adamrtalbot left a comment


I'm confused as to how this is authenticating against the Batch service. Also, we should leave some scope open for a system assigned (anonymous) managed identity.

docs/azure.md (outdated)
Comment on lines 230 to 232
final credential = new DefaultAzureCredentialBuilder()
.managedIdentityClientId(clientId)
.build()
Collaborator

It would be worth checking if this can pick up a system assigned identity as well as a named one. You should be able to do this by dropping the managedIdentityClientId(clientId) part (educated guess here):

        final credentialBuilder = new DefaultAzureCredentialBuilder()
        if( clientId ) {
            // only set for a user-assigned identity
            credentialBuilder.managedIdentityClientId(clientId)
        }

        final credential = credentialBuilder.build()

Collaborator

This will also enable people to authenticate as themselves on their personal machines if they're logged in to Azure.

Member Author

We'll need another config option to explicitly enable the system assigned identity.

Here are the docs for the default credential builder: https://learn.microsoft.com/en-us/java/api/com.azure.identity.defaultazurecredentialbuilder?view=azure-java-stable

It's not obvious to me that it defaults to the system-assigned identity, since it seems to support other credentials. But maybe you can discern better than me.

Collaborator

If you just use DefaultAzureCredentialBuilder without any options, it tries credentials in this order: https://learn.microsoft.com/en-us/azure/developer/java/sdk/identity-azure-hosted-auth#default-azure-credential

  1. Environment - DefaultAzureCredential reads account information specified via environment variables and uses it to authenticate.
  2. Managed Identity - If the application deploys to an Azure host with Managed Identity enabled, DefaultAzureCredential authenticates with that account.
  3. IntelliJ - If you've authenticated via Azure Toolkit for IntelliJ, DefaultAzureCredential authenticates with that account.
  4. Visual Studio Code - If you've authenticated via the Visual Studio Code Azure Account plugin, DefaultAzureCredential authenticates with that account.
  5. Azure CLI - If you've authenticated an account via the Azure CLI az login command, DefaultAzureCredential authenticates with that account.

More specifically they give this example where they state that the clientId is only required if using a user assigned identity:

/**
 * Authenticate with a managed identity.
 */
public void createManagedIdentityCredential() {
    ManagedIdentityCredential managedIdentityCredential = new ManagedIdentityCredentialBuilder()
        .clientId("<USER ASSIGNED MANAGED IDENTITY CLIENT ID>") // only required for user assigned
        .build();

    // Azure SDK client builders accept the credential as a parameter
    SecretClient client = new SecretClientBuilder()
        .vaultUrl("https://{YOUR_VAULT_NAME}.vault.azure.net")
        .credential(managedIdentityCredential)
        .buildClient();
}

https://github.com/Azure/azure-sdk-for-java/wiki/Azure-Identity-Examples#authenticating-in-azure-with-managed-identity
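
To make the lookup order above concrete, here is a minimal sketch of an ordered credential chain in plain Java. It is a simplified model, not the Azure SDK's implementation: the class, the method names, and the placeholder tokens are all hypothetical, and only the five sources from the list above are modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Simplified model of an ordered credential chain. This is NOT the Azure SDK's code.
class CredentialChainSketch {

    // Returns the name of the first source that yields a token.
    // If none does, throws with one message per source that was tried.
    static String firstAvailable(Map<String, Supplier<Optional<String>>> chain) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, Supplier<Optional<String>>> entry : chain.entrySet()) {
            if (entry.getValue().get().isPresent()) {
                return entry.getKey();
            }
            failures.add(entry.getKey() + " is unavailable");
        }
        throw new IllegalStateException(String.join("\n", failures));
    }

    // Hypothetical chain in the order listed above; the suppliers are stand-ins.
    static Map<String, Supplier<Optional<String>>> exampleChain(boolean onAzureHost, boolean cliLoggedIn) {
        Map<String, Supplier<Optional<String>>> chain = new LinkedHashMap<>();
        chain.put("Environment", Optional::empty);
        chain.put("ManagedIdentity", () -> onAzureHost ? Optional.of("mi-token") : Optional.empty());
        chain.put("IntelliJ", Optional::empty);
        chain.put("VisualStudioCode", Optional::empty);
        chain.put("AzureCli", () -> cliLoggedIn ? Optional.of("cli-token") : Optional.empty());
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(firstAvailable(exampleChain(true, true)));   // ManagedIdentity
        System.out.println(firstAvailable(exampleChain(false, true)));  // AzureCli
    }
}
```

In this model, a host with a managed identity wins over a local Azure CLI login because it sits earlier in the chain, and a machine with neither gets one "unavailable" message per source.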

Member Author

Does this mean that with the system-assigned identity you would only need to know the batch/storage account names and then you could submit jobs from anywhere without any client-side credentials?

Member Author

Or is the "system" the node from which you are submitting tasks i.e. the Nextflow head job which would presumably be running in Azure?

Collaborator

Both. You would need to give it the batch and storage account names, but it would authenticate because of what it is, rather than what it knows. So if you ran it on an Azure VM with a System-assigned identity, you would need to:

  • Grant that system assigned identity permissions for the Batch and Storage account
  • Configure Nextflow to try and submit to that Batch and Storage account
  • Run Nextflow on that virtual machine, which will assume that identity

@adamrtalbot
Collaborator

You will also need to update the azcopy version for this to work, see here for some additional details: #3314 (comment)

@swampie

swampie commented Apr 12, 2024

> I'm confused as to how this is authenticating against the Batch service. Also, we should leave some scope open for a system assigned (anonymous) managed identity.

If the pool has been assigned a user-assigned managed identity (done manually at the moment), then any operation performed by instances of the pool (towards Batch or Storage) will use the clientId for authentication.

@adamrtalbot
Collaborator

> > I'm confused as to how this is authenticating against the Batch service. Also, we should leave some scope open for a system assigned (anonymous) managed identity.
>
> If the pool has been assigned a user-assigned managed identity (done manually at the moment), then any operation performed by instances of the pool (towards Batch or Storage) will use the clientId for authentication.

But how does it know which Batch account to authenticate against? It will assume the identity, but it still needs to know which Batch account to use when it submits jobs and tasks, and that doesn't appear to be in the code.

@bentsherman
Member Author

> But how does it know which Batch account to authenticate against? It will assume the identity, but it still needs to know which Batch account to use when it submits jobs and tasks, and that doesn't appear to be in the code.

Good point, currently with the managed identity it will leave the batch credentials as null and the batch account is ignored.

Here are the docs for batch credentials: https://learn.microsoft.com/en-us/java/api/com.microsoft.azure.batch.auth.batchcredentials?view=azure-java-stable

The following implementations are available:

  • BatchApplicationTokenCredentials: Application token based credentials for use with a Batch Service Client.
  • BatchSharedKeyCredentials: Shared key credentials for an Azure Batch account.
  • BatchUserTokenCredentials: User token based credentials for use with a Batch Service Client.

@adamrtalbot
Collaborator

pditommaso and others added 3 commits April 15, 2024 10:18
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
…mProvider.groovy

Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
@vsmalladi
Contributor

@adamrtalbot this will be great to try with TES plugin

Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Comment on lines +308 to +325
final batchEndpoint = "https://batch.core.windows.net/"
final authenticationEndpoint = "https://login.microsoftonline.com/"

final clientId = config.managedIdentity().clientId
final credentialBuilder = new DefaultAzureCredentialBuilder()
if( clientId )
    credentialBuilder.managedIdentityClientId(clientId)
final credential = credentialBuilder.build()
final token = credential.getTokenSync( new TokenRequestContext() )

return new BatchApplicationTokenCredentials(
    config.batch().endpoint,
    clientId,
    token.getToken(),
    null,
    batchEndpoint,
    authenticationEndpoint
)
Member Author

@vsmalladi I adapted this from your C# solution, but I'm pretty sure I'm missing something. In the C# SDK there is BatchTokenCredentials, which only requires the account URL and token, but in the Java SDK there are BatchApplicationTokenCredentials and BatchUserTokenCredentials, which both require more arguments.

Perhaps you can enlighten us on how to set up these batch credentials with the managed identity.

Collaborator

I spent a while looking into this and found that while the storage account authenticated OK, the Batch Client could not authenticate at all. Here's a snippet from the logs:

May-02 12:03:01.724 [Task submitter] DEBUG n.cloud.azure.batch.AzBatchService - [AZURE BATCH] Creating Azure Batch client using Managed Identity credentials
May-02 12:03:01.725 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential EnvironmentCredential is unavailable.
May-02 12:03:01.725 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential WorkloadIdentityCredential is unavailable.
May-02 12:03:01.725 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential ManagedIdentityCredential is unavailable.
May-02 12:03:01.726 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential SharedTokenCacheCredential is unavailable.
May-02 12:03:01.726 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential IntelliJCredential is unavailable.
May-02 12:03:01.726 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential AzureCliCredential is unavailable.
May-02 12:03:01.726 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential AzurePowerShellCredential is unavailable.
May-02 12:03:01.726 [Task submitter] ERROR c.a.i.implementation.IdentityClient - Missing scope in request
May-02 12:03:01.726 [Task submitter] INFO  c.a.identity.ChainedTokenCredential - Azure Identity => Attempted credential AzureDeveloperCliCredential is unavailable.
May-02 12:03:01.728 [Task submitter] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=sayHello (3); work-dir=az://redacted
  error [com.azure.identity.CredentialUnavailableException]: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/environmentcredential/troubleshoot
WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/java/identity/workloadidentitycredential/troubleshoot
Managed Identity authentication is not available.
SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
IntelliJ Authentication not available. Please log in with Azure Tools for IntelliJ plugin in the IDE. Fore more details refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/intellijcredential/troubleshoot
To convert to a resource string the specified array must be exactly length 1
To convert to a resource string the specified array must be exactly length 1
Missing scope in requestTo mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azure-identity-java-default-azure-credential-troubleshoot
May-02 12:03:01.732 [Task submitter] ERROR nextflow.processor.TaskProcessor - Error executing process > 'sayHello (3)'

Caused by:
  EnvironmentCredential authentication unavailable. Environment variables are not fully configured.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/environmentcredential/troubleshoot
WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/java/identity/workloadidentitycredential/troubleshoot
Managed Identity authentication is not available.
SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
IntelliJ Authentication not available. Please log in with Azure Tools for IntelliJ plugin in the IDE. Fore more details refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/intellijcredential/troubleshoot
To convert to a resource string the specified array must be exactly length 1
To convert to a resource string the specified array must be exactly length 1
Missing scope in requestTo mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azure-identity-java-default-azure-credential-troubleshoot

com.azure.identity.CredentialUnavailableException: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/environmentcredential/troubleshoot
WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/java/identity/workloadidentitycredential/troubleshoot
Managed Identity authentication is not available.
SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
IntelliJ Authentication not available. Please log in with Azure Tools for IntelliJ plugin in the IDE. Fore more details refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/intellijcredential/troubleshoot
To convert to a resource string the specified array must be exactly length 1
To convert to a resource string the specified array must be exactly length 1
Missing scope in requestTo mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azure-identity-java-default-azure-credential-troubleshoot
	at com.azure.identity.ChainedTokenCredential.getTokenSync(ChainedTokenCredential.java:143)
	at nextflow.cloud.azure.batch.AzBatchService.createBatchCredentialsWithManagedIdentity(AzBatchService.groovy:317)
	at nextflow.cloud.azure.batch.AzBatchService.createBatchClient(AzBatchService.groovy:339)
	at nextflow.cloud.azure.batch.AzBatchService.memoizedMethodPriv$getClient(AzBatchService.groovy:125)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328)
	at groovy.lang.MetaClassImpl.doInvokeMethod(MetaClassImpl.java:1333)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1088)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007)
	at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:645)
	at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:628)
	at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:82)
	at nextflow.cloud.azure.batch.AzBatchService$_closure1.doCall(AzBatchService.groovy)
	at nextflow.cloud.azure.batch.AzBatchService$_closure1.doCall(AzBatchService.groovy)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:279)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007)
	at groovy.lang.Closure.call(Closure.java:433)
	at org.codehaus.groovy.runtime.memoize.Memoize$MemoizeFunction.lambda$call$0(Memoize.java:137)
	at org.codehaus.groovy.runtime.memoize.ConcurrentCommonCache.getAndPut(ConcurrentCommonCache.java:137)
	at org.codehaus.groovy.runtime.memoize.ConcurrentCommonCache.getAndPut(ConcurrentCommonCache.java:113)
	at org.codehaus.groovy.runtime.memoize.Memoize$MemoizeFunction.call(Memoize.java:136)
	at groovy.lang.Closure.call(Closure.java:412)
	at nextflow.cloud.azure.batch.AzBatchService.getClient(AzBatchService.groovy)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328)
	at groovy.lang.MetaClassImpl.doInvokeMethod(MetaClassImpl.java:1333)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1088)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007)
	at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:645)
	at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:628)
	at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:82)
	at nextflow.cloud.azure.batch.AzBatchService$_getPool_lambda13.doCall(AzBatchService.groovy:677)
	at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:236)
	at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
	at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:75)
	at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:176)
	at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:437)
	at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:115)
	at nextflow.cloud.azure.batch.AzBatchService.apply(AzBatchService.groovy:991)
	at nextflow.cloud.azure.batch.AzBatchService.getPool(AzBatchService.groovy:677)
	at nextflow.cloud.azure.batch.AzBatchService.getOrCreatePool(AzBatchService.groovy:655)
	at nextflow.cloud.azure.batch.AzBatchService.submitTask(AzBatchService.groovy:354)
	at nextflow.cloud.azure.batch.AzBatchTaskHandler.submit(AzBatchTaskHandler.groovy:91)
	at nextflow.processor.TaskPollingMonitor.submit(TaskPollingMonitor.groovy:203)
	at nextflow.processor.TaskPollingMonitor.submitPendingTasks(TaskPollingMonitor.groovy:572)
	at nextflow.processor.TaskPollingMonitor.submitLoop(TaskPollingMonitor.groovy:397)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328)
	at groovy.lang.MetaClassImpl.doInvokeMethod(MetaClassImpl.java:1333)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1088)
	at groovy.lang.MetaClassImpl.invokeMethodClosure(MetaClassImpl.java:1017)
	at groovy.lang.MetaClassImpl.doInvokeMethod(MetaClassImpl.java:1207)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1088)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007)
	at groovy.lang.Closure.call(Closure.java:433)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovy.lang.Closure.run(Closure.java:505)
	at java.base/java.lang.VirtualThread.run(VirtualThread.java:309)
May-02 12:03:01.735 [Task submitter] DEBUG nextflow.Session - Session aborted -- Cause: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/environmentcredential/troubleshoot
WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/java/identity/workloadidentitycredential/troubleshoot
Managed Identity authentication is not available.
SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
IntelliJ Authentication not available. Please log in with Azure Tools for IntelliJ plugin in the IDE. Fore more details refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/intellijcredential/troubleshoot
To convert to a resource string the specified array must be exactly length 1
To convert to a resource string the specified array must be exactly length 1
Missing scope in requestTo mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azure-identity-java-default-azure-credential-troubleshoot
May-02 12:03:01.739 [Task submitter] DEBUG nextflow.Session - The following nodes are still active:
  [operator] view
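
One possible reading of the "To convert to a resource string the specified array must be exactly length 1" and "Missing scope in request" lines is that the token was requested without any scope, since the diff under review calls getTokenSync with an empty TokenRequestContext. This is an inference from the log, not a confirmed diagnosis. The sketch below models that kind of check in plain Java; it is a stand-in rather than the SDK's code, and the scope string (the Batch endpoint from the diff plus a ".default" suffix) is an assumption.

```java
import java.util.List;

// Simplified stand-in for a scope check. This is NOT the Azure SDK's implementation.
class ScopeSketch {

    static final String DEFAULT_SUFFIX = "/.default";

    // Converts a single "<resource>/.default" scope into its resource string.
    // Rejects an empty or multi-entry scope list, using the message seen in the log.
    static String scopesToResource(List<String> scopes) {
        if (scopes.size() != 1) {
            throw new IllegalArgumentException(
                "To convert to a resource string the specified array must be exactly length 1");
        }
        String scope = scopes.get(0);
        return scope.endsWith(DEFAULT_SUFFIX)
            ? scope.substring(0, scope.length() - DEFAULT_SUFFIX.length())
            : scope;
    }

    public static void main(String[] args) {
        // Assumed scope: the Batch endpoint from the diff plus the ".default" suffix.
        System.out.println(scopesToResource(List.of("https://batch.core.windows.net/.default")));
        try {
            scopesToResource(List.of()); // models a token request with no scopes
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If that reading is right, adding a scope to the request context (for example with TokenRequestContext.addScopes) would be the first thing to try; whether that alone is enough for Batch is not established here.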

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
pditommaso and others added 6 commits April 22, 2024 09:44
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
…4914) [ci fast]

Some commented out lines were leftover from 27d01e3 in Azure Batch pool setup tests. This PR uncomments them so they are explicit.

Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
We switched from using /master/ to using /latest/ - update the quick switch dropdown.

Signed-off-by: Phil Ewels <phil.ewels@seqera.io>
Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
@bentsherman
Member Author

@adamrtalbot @swampie can one of you try out this PR with a managed identity? I'm still not sure about the batch auth, but the storage auth should work.

@adamrtalbot
Collaborator

I'm afraid no luck. Steps:

  1. Spun up a node in Azure Batch with a managed identity.
  2. SSH'd onto the node and ran from VS Code for convenience (important), then installed Java + Gradle with SDKMAN.
  3. Added the following configuration:
process.executor = 'azurebatch'

workDir = "az://redacted"

azure {
    managedIdentity {
        clientId = 'redacted'
    }
    storage {
        accountName = 'redacted'
    }
    batch {
        accountName = 'redacted'
        location = 'redacted'
    }
}

Got the following error message:

batch-explorer-user@b0deb3ce8f9f474f9fefbd4fe61f80b2000000:~/hello$ nextflow-dev run hello -c nextflow.config 

 N E X T F L O W   ~  version 24.02.0-edge

Launching `https://github.com/nextflow-io/hello` [romantic_cantor] DSL2 - revision: 7588c46ffe [master]

ERROR ~ EnvironmentCredential authentication unavailable. Environment variables are not fully configured.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/environmentcredential/troubleshoot
WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/java/identity/workloadidentitycredential/troubleshoot
Managed Identity authentication is not available.
SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
IntelliJ Authentication not available. Please log in with Azure Tools for IntelliJ plugin in the IDE. Fore more details refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/intellijcredential/troubleshoot
AzureCliCredential authentication unavailable. Azure CLI not installed.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/azclicredential/troubleshoot
Unable to execute PowerShell. Please make sure that it is installed in your system
AzureDeveloperCliCredential authentication unavailable. Azure Developer CLI not installed.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/azdevclicredential/troubleshootTo mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azure-identity-java-default-azure-credential-troubleshoot

 -- Check '.nextflow.log' file for details

azure-managed-identity-error.log

Note the error tries a number of authentication steps.

To make sure the managed identity was working, I installed azcopy 10.24.0 and downloaded a file using the following script; note that no credentials were supplied:

export AZCOPY_JOB_PLAN_LOCATION="."
export AZCOPY_AUTO_LOGIN_TYPE="MSI"
export AZCOPY_LOG_LOCATION="."

azcopy copy 'https://redacted.blob.core.windows.net/path/test.txt' test.txt
bash test_mi.sh 
INFO: Scanning...
INFO: Login with identity succeeded.
INFO: Authenticating to source using Azure AD
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support

Job 62bdf3e5-bee9-d84c-6ff7-95f37d23dd11 has started
Log file is located at: ./62bdf3e5-bee9-d84c-6ff7-95f37d23dd11.log

100.0 %, 1 Done, 0 Failed, 0 Pending, 0 Skipped, 1 Total, 


Job 62bdf3e5-bee9-d84c-6ff7-95f37d23dd11 summary
Elapsed Time (Minutes): 0.0333
Number of File Transfers: 1
Number of Folder Property Transfers: 0
Number of Symlink Transfers: 0
Total Number of Transfers: 1
Number of File Transfers Completed: 1
Number of Folder Transfers Completed: 0
Number of File Transfers Failed: 0
Number of Folder Transfers Failed: 0
Number of File Transfers Skipped: 0
Number of Folder Transfers Skipped: 0
Total Number of Bytes Transfer


@Memoized
static synchronized BlobServiceClient getOrCreateBlobServiceWithManagedIdentity(String accountName, String clientId) {
    log.debug "Creating Azure blob storage client -- accountName: $accountName; clientId: ${clientId ?: '<system-assigned identity>'}"
Collaborator

Might as well be consistent!

Suggested change
- log.debug "Creating Azure blob storage client -- accountName: $accountName; clientId: ${clientId ?: '<system-assigned identity>'}"
+ log.debug "Creating Azure blob storage client using Managed Identity ${clientId ?: '<system-assigned identity>'}"

@adamrtalbot
Collaborator

Interestingly, I ran az login and it was able to authenticate but I got the following error:

nextflow-dev run hello -c nextflow.config 

 N E X T F L O W   ~  version 24.02.0-edge

Launching `https://github.com/nextflow-io/hello` [wise_dubinsky] DSL2 - revision: 7588c46ffe [master]

[-        ] sayHello -
ERROR ~ Error executing process > 'sayHello (2)'

Caused by:
  If you are using a StorageSharedKeyCredential, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate method call.
If you are using a SAS token, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate generateSas method call.
Please remember to disable 'Azure-Storage-Log-String-To-Sign' before going to production as this string can potentially contain PII.
Status code 403, (empty body)


 -- Check '.nextflow.log' file for details

azure-managed-identity-error-2.log

@mgopez

mgopez commented Apr 26, 2024

> Interestingly, I ran az login and it was able to authenticate but I got the following error:
>
> [...]
> Status code 403, (empty body)

Possibly related, possibly not: I've seen this error when using a service principal. I needed to give the service principal (or, in this case, the managed identity) the correct permissions for the storage account, as per the docs:

  • Contributor
  • Storage Blob Data Contributor
  • Storage Blob Data Reader

That resolved it for me.

@adamrtalbot
Collaborator

Thanks @mgopez. Looks like I might have more permissions than that, but not those specific permissions, so that might be the problem.

…aged Identity credentials

Signed-off-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
@@ -103,7 +103,7 @@ class AzBatchExecutor extends Executor implements ExtensionPoint {

         // Generate an account SAS token using either activeDirectory configs or storage account keys
         if (!config.storage().sasToken) {
-            config.storage().sasToken = config.activeDirectory().isConfigured()
+            config.storage().sasToken = config.activeDirectory().isConfigured() || config.managedIdentity().isConfigured()
Collaborator

@bentsherman @swampie I needed to add this line to get it to recognise it shouldn't use the account key, but I think I may have missed something. Presumably we will need a generateContainerWithManagedIdentity method but I don't know the details yet.

Member Author

Good catch, I will update this part

Member Author

Looking into it, I think this change should work, because it will transparently use whichever blob client was created for service principal / managed identity
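
For readers following the diff above, here is a minimal sketch of the selection logic being discussed. It is written in plain Java with no Azure dependencies. The class `SasTokenSource`, the enum `Kind`, and the method `select` are hypothetical names introduced only for illustration; they are not part of the Nextflow code base. The sketch assumes, based on the code comment and the review remark, that an explicit SAS token is used as-is, that a configured service principal or managed identity takes the identity-based path, and that the storage account key is the fallback.

```java
/** Hypothetical stand-in for the storage credential selection discussed in the diff. */
final class SasTokenSource {

    /** The three outcomes the surrounding discussion distinguishes. */
    enum Kind { EXISTING_SAS_TOKEN, IDENTITY_BASED, ACCOUNT_KEY }

    /**
     * Mirrors the shape of the changed condition: an existing SAS token wins,
     * then a service principal OR a managed identity, then the account key.
     */
    static Kind select(String sasToken,
                       boolean activeDirectoryConfigured,
                       boolean managedIdentityConfigured) {
        // Groovy's `!config.storage().sasToken` is true for null and for an empty string
        if (sasToken != null && !sasToken.isEmpty()) {
            return Kind.EXISTING_SAS_TOKEN;
        }
        // The added `|| managedIdentity().isConfigured()` keeps a managed identity
        // from falling through to the account key branch
        if (activeDirectoryConfigured || managedIdentityConfigured) {
            return Kind.IDENTITY_BASED;
        }
        return Kind.ACCOUNT_KEY;
    }

    public static void main(String[] args) {
        System.out.println(select(null, false, true));   // managed identity only
        System.out.println(select(null, false, false));  // nothing configured
    }
}
```

Without the extra condition, the "managed identity only" case would take the account key branch, which matches the behaviour reported in the review comment.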

@adamrtalbot
Collaborator

@vsmalladi and I took a look at this last week. We found that the storage client seems to authenticate OK, but the Batch client fails to authenticate. We ended up changing the Batch client code to the following, but it still didn't work (domain redacted):

    protected BatchCredentials createBatchCredentialsWithManagedIdentity() {
        log.debug "[AZURE BATCH] Creating Azure Batch client using Managed Identity credentials"

        final batchEndpoint = "https://batch.core.windows.net/"
        final authenticationEndpoint = "https://management.core.windows.net/"

        final clientId = config.managedIdentity().clientId
        final credentialBuilder = new DefaultAzureCredentialBuilder()
        if( clientId ) {
            log.debug "[AZURE BATCH] Client ID: ${clientId}"
            credentialBuilder.managedIdentityClientId(clientId)
        }
        final credential = credentialBuilder.build()
        final tokenContext = new TokenRequestContext()
            .setTenantId("${domain}")
            .addScopes(String.format("%s/.default", AzureEnvironment.AZURE.getManagementEndpoint()))
        log.debug "[AZURE BATCH] Tenant ID: ${tokenContext.getTenantId()}"
        final token = credential.getTokenSync( tokenContext )

        return new BatchApplicationTokenCredentials(
                config.batch().endpoint, // base URL
                clientId,           // client ID
                token.getToken(), // secret
                "${domain}", // domain (tenant?)
                batchEndpoint, // batchEndpoint
                authenticationEndpoint // authenticationEndpoint
        )
    }

We were producing a correct token but still getting errors. It's not clear to me how the auth flow works for this, so I tried using the Azure CLI, but made no progress.
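
One thing that may be worth checking in the snippet above is which resource the token is requested for. A `.default` scope names a resource, and the resulting token is issued for that resource. The sketch below is plain Java with no Azure dependencies. It only reproduces the string formatting from the snippet, so the class `TokenScopes` and its methods are hypothetical helpers. It assumes that `AzureEnvironment.AZURE.getManagementEndpoint()` returns the same value as the `authenticationEndpoint` literal in the snippet. Whether this mismatch explains the failure is an assumption and is not confirmed anywhere in this thread.

```java
/** Hypothetical helper that reproduces the scope formatting used in the snippet. */
final class TokenScopes {

    /** Value of `batchEndpoint` in the snippet. */
    static final String BATCH_RESOURCE = "https://batch.core.windows.net/";

    /** Value of `authenticationEndpoint` in the snippet (assumed equal to the management endpoint). */
    static final String MANAGEMENT_RESOURCE = "https://management.core.windows.net/";

    /** Builds the scope exactly as the snippet does: String.format("%s/.default", resource). */
    static String defaultScope(String resource) {
        return String.format("%s/.default", resource);
    }

    /** True when the scope string was derived from the given resource URI. */
    static boolean targets(String scope, String resource) {
        return scope.startsWith(resource);
    }

    public static void main(String[] args) {
        String requested = defaultScope(MANAGEMENT_RESOURCE);
        System.out.println("requested scope:  " + requested);
        System.out.println("batch scope:      " + defaultScope(BATCH_RESOURCE));
        System.out.println("targets Batch?    " + targets(requested, BATCH_RESOURCE));
    }
}
```

Because both resource URIs end in a slash, this format produces a double slash before `.default`. Whether that matters depends on the credential implementation, which this sketch does not cover.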

@adamrtalbot
Collaborator

I tried to log in via the CLI with --debug enabled to see if I could work out the auth flow. Here are the redacted logs. The first is from running az login --debug --identity --username $USER_ASSIGNED_MANAGED_IDENTITY_ID:
azure-mi-login.log

The second is from running az batch account login --debug --name $AZURE_BATCH_ACCOUNT_NAME --resource-group $AZURE_BATCH_RESOURCE_GROUP:
batch_account_login.log

Hopefully this helps someone work out how to authenticate with the SDK 🤦

@pditommaso
Member

I'd suggest creating a minimal implementation to isolate the core goal, i.e. a bare, simple Java or Groovy class that takes the managed identity and authenticates against both Storage and Batch.
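
As a rough illustration of what such an isolating class could look like, the sketch below runs one independent check per service and reports which one fails. This tells a Storage permission problem apart from a Batch authentication problem. Everything in it is hypothetical: `AuthProbe`, `Check`, and `runAll` are names invented here, and the lambdas in `main` are fakes standing in for real SDK calls (for example, listing containers or listing pools with the managed identity credential). No Azure library is used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical harness: runs one authenticated call per service and reports each result. */
final class AuthProbe {

    /** A single authenticated call against one service; throws if it is rejected. */
    @FunctionalInterface
    interface Check {
        void run() throws Exception;
    }

    /** Runs every check independently so that one failure does not hide the others. */
    static Map<String, String> runAll(Map<String, Check> checks) {
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, Check> entry : checks.entrySet()) {
            try {
                entry.getValue().run();
                results.put(entry.getKey(), "OK");
            } catch (Exception e) {
                results.put(entry.getKey(), "FAILED: " + e.getMessage());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Check> checks = new LinkedHashMap<>();
        // Fake checks for demonstration only; replace with real Storage and Batch calls
        checks.put("storage", () -> { /* e.g. list blob containers */ });
        checks.put("batch", () -> { throw new IllegalStateException("simulated 403"); });

        runAll(checks).forEach((service, status) ->
                System.out.println(service + ": " + status));
    }
}
```

Keeping the two checks separate mirrors what was observed earlier in the thread, where the storage client authenticated but the Batch client did not.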
