Add support for Azure managed identity #4897

Open
wants to merge 24 commits into
base: master
Changes from 4 commits
Commits
721a82b
Add support for Azure managed identity
bentsherman Apr 7, 2024
11c6c51
Merge branch 'master' into 4871-azure-managed-identity
pditommaso Apr 15, 2024
08bb147
minor edits
bentsherman Apr 15, 2024
07b3178
Update docs
bentsherman Apr 15, 2024
c2904aa
Update plugins/nf-azure/src/main/nextflow/cloud/azure/nio/AzFileSyste…
adamrtalbot Apr 17, 2024
d94697a
Add option for system-assigned identity
bentsherman Apr 17, 2024
d3a8954
Generate batch credentials from managed identity and SAS token
bentsherman Apr 17, 2024
d93efe0
Bump nf-amazon@2.4.2
pditommaso Apr 15, 2024
7daa94e
Bump nf-azure@1.6.0
pditommaso Apr 15, 2024
913208d
Bump nf-google@1.12.0
pditommaso Apr 15, 2024
757d6d6
Bump nf-tower@1.9.0
pditommaso Apr 15, 2024
6ffc91d
Bump nf-wave@1.4.0
pditommaso Apr 15, 2024
afcd819
Update changelog
pditommaso Apr 15, 2024
dbf9cd6
[release 24.03.0-edge] Update timestamp and build number [ci fast]
pditommaso Apr 15, 2024
fee1d98
fix: remove commented out test lines in Azure Batch Pool opts tests. …
adamrtalbot Apr 15, 2024
b625790
Fix Missing error code when no entry is specified
pditommaso Apr 15, 2024
79f39d4
Fix docs snippet
pditommaso Apr 15, 2024
7dc4305
Docs: Version switcher dropdown link fix (#4918)
ewels Apr 15, 2024
ebdcc5b
Use Azure Active directory to configure container SAS while using Man…
adamrtalbot May 2, 2024
127376a
Merge branch 'master' into 4871-azure-managed-identity
adamrtalbot May 2, 2024
7afb0e5
Add tenant ID
bentsherman May 19, 2024
0bf7f30
Add missing else
bentsherman May 22, 2024
368eae9
Get the tenant id from the config
bentsherman May 27, 2024
c33fa67
Add tenant id to managedIdentity scope, minor edits
bentsherman May 27, 2024
28 changes: 28 additions & 0 deletions docs/azure.md
@@ -418,6 +418,34 @@ azure {
}
```

(azure-managed-identities)=

## Managed identities

:::{versionadded} 24.03.0-edge
:::

An Azure [managed identity](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview) can be used to authenticate with Azure Blob storage without using secrets such as a storage account key.

The managed identity can be specified as follows:

```groovy
azure {
    managedIdentity {
        clientId = '<YOUR MANAGED IDENTITY>'
    }

    storage {
        accountName = '<YOUR STORAGE ACCOUNT NAME>'
    }

    batch {
        accountName = '<YOUR BATCH ACCOUNT NAME>'
        location = '<YOUR BATCH ACCOUNT LOCATION>'
    }
}
```

## Advanced configuration

Read the {ref}`Azure configuration<config-azure>` section to learn more about advanced configuration options.
12 changes: 3 additions & 9 deletions docs/config.md
@@ -373,29 +373,24 @@ The following settings are available:
: Enable autoscaling feature for the pool identified with `<name>`.

`azure.batch.pools.<name>.fileShareRootPath`
: *New in `nf-azure` version `0.11.0`*
: If mounting File Shares, this is the internal root mounting point. Must be `/mnt/resource/batch/tasks/fsmounts` for CentOS nodes or `/mnt/batch/tasks/fsmounts` for Ubuntu nodes (default is for CentOS).

`azure.batch.pools.<name>.lowPriority`
: *New in `nf-azure` version `1.4.0`*
: Enable the use of low-priority VMs (default: `false`).

`azure.batch.pools.<name>.maxVmCount`
: Specify the max of virtual machine when using auto scale option.

`azure.batch.pools.<name>.mountOptions`
: *New in `nf-azure` version `0.11.0`*
: Specify the mount options for mounting the file shares (default: `-o vers=3.0,dir_mode=0777,file_mode=0777,sec=ntlmssp`).

`azure.batch.pools.<name>.offer`
: *New in `nf-azure` version `0.11.0`*
: Specify the offer type of the virtual machine type used by the pool identified with `<name>` (default: `centos-container`).

`azure.batch.pools.<name>.privileged`
: Enable the task to run with elevated access. Ignored if `runAs` is set (default: `false`).

`azure.batch.pools.<name>.publisher`
: *New in `nf-azure` version `0.11.0`*
: Specify the publisher of virtual machine type used by the pool identified with `<name>` (default: `microsoft-azure-batch`).

`azure.batch.pools.<name>.runAs`
@@ -411,7 +406,6 @@ The following settings are available:
: Specify the scheduling policy for the pool identified with `<name>`. It can be either `spread` or `pack` (default: `spread`).

`azure.batch.pools.<name>.sku`
: *New in `nf-azure` version `0.11.0`*
: Specify the ID of the Compute Node agent SKU which the pool identified with `<name>` supports (default: `batch.node.centos 8`).

`azure.batch.pools.<name>.startTask.script`
@@ -435,16 +429,16 @@ The following settings are available:
`azure.batch.pools.<name>.vmType`
: Specify the virtual machine type used by the pool identified with `<name>`.

`azure.managedIdentity.clientId`
: Specify the client ID for an Azure [managed identity](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview). See {ref}`azure-managed-identities` for more details.

`azure.registry.server`
: *New in `nf-azure` version `0.9.8`*
: Specify the container registry from which to pull the Docker images (default: `docker.io`).

`azure.registry.userName`
: *New in `nf-azure` version `0.9.8`*
: Specify the username to connect to a private container registry.

`azure.registry.password`
: *New in `nf-azure` version `0.9.8`*
: Specify the password to connect to a private container registry.

`azure.retryPolicy.delay`
@@ -269,24 +269,20 @@ class AzBatchService implements Closeable {
}


protected createBatchCredentialsWithKey() {
protected BatchCredentials createBatchCredentialsWithKey() {
log.debug "[AZURE BATCH] Creating Azure Batch client using shared key creddentials"

if (config.batch().endpoint || config.batch().accountKey || config.batch().accountName) {
// Create batch client
if (!config.batch().endpoint)
throw new IllegalArgumentException("Missing Azure Batch endpoint -- Specify it in the nextflow.config file using the setting 'azure.batch.endpoint'")
if (!config.batch().accountName)
throw new IllegalArgumentException("Missing Azure Batch account name -- Specify it in the nextflow.config file using the setting 'azure.batch.accountName'")
if (!config.batch().accountKey)
throw new IllegalArgumentException("Missing Azure Batch account key -- Specify it in the nextflow.config file using the setting 'azure.batch.accountKey'")
if( !config.batch().endpoint )
throw new IllegalArgumentException("Missing Azure Batch endpoint -- Specify it in the nextflow.config file using the setting 'azure.batch.endpoint'")
if( !config.batch().accountName )
throw new IllegalArgumentException("Missing Azure Batch account name -- Specify it in the nextflow.config file using the setting 'azure.batch.accountName'")
if( !config.batch().accountKey )
throw new IllegalArgumentException("Missing Azure Batch account key -- Specify it in the nextflow.config file using the setting 'azure.batch.accountKey'")

return new BatchSharedKeyCredentials(config.batch().endpoint, config.batch().accountName, config.batch().accountKey)

}
return new BatchSharedKeyCredentials(config.batch().endpoint, config.batch().accountName, config.batch().accountKey)
}

protected createBatchCredentialsWithServicePrincipal() {
protected BatchCredentials createBatchCredentialsWithServicePrincipal() {
log.debug "[AZURE BATCH] Creating Azure Batch client using Service Principal credentials"

final batchEndpoint = "https://batch.core.windows.net/";
@@ -307,12 +303,15 @@ class AzBatchService implements Closeable {
protected BatchClient createBatchClient() {
log.debug "[AZURE BATCH] Executor options=${config.batch()}"

def cred = config.activeDirectory().isConfigured()
? createBatchCredentialsWithServicePrincipal()
: createBatchCredentialsWithKey()
BatchCredentials cred

if( config.activeDirectory().isConfigured() )
cred = createBatchCredentialsWithServicePrincipal()
else if( config.batch().endpoint || config.batch().accountKey || config.batch().accountName )
adamrtalbot marked this conversation as resolved.
cred = createBatchCredentialsWithKey()

// Create batch client
def client = BatchClient.open(cred as BatchCredentials)
def client = BatchClient.open(cred)

Global.onCleanup((it)->client.protocolLayer().restClient().close())

@@ -18,6 +18,8 @@ package nextflow.cloud.azure.batch
import java.nio.file.Path
import java.time.OffsetDateTime

import com.azure.identity.ClientSecretCredentialBuilder
import com.azure.identity.DefaultAzureCredentialBuilder
import com.azure.storage.blob.BlobContainerClient
import com.azure.storage.blob.BlobServiceClient
import com.azure.storage.blob.BlobServiceClientBuilder
@@ -130,7 +132,7 @@ class AzHelper {
return delegationKey
}

static String generateContainerUserDelegationSas(BlobContainerClient client, Duration duration, UserDelegationKey key) {
static String generateContainerUserDelegationSas(BlobContainerClient client, Duration duration, UserDelegationKey key) {

final startTime = OffsetDateTime.now()
final indicatedExpiryTime = startTime.plusHours(duration.toHours())
@@ -153,7 +155,7 @@ class AzHelper {
}

static String generateAccountSas(BlobServiceClient client, Duration duration) {
final expiryTime = OffsetDateTime.now().plusSeconds(duration.toSeconds());
final expiryTime = OffsetDateTime.now().plusSeconds(duration.toSeconds())
final signature = new AccountSasSignatureValues(
expiryTime,
ACCOUNT_PERMS,
@@ -172,8 +174,8 @@ class AzHelper {
static synchronized BlobServiceClient getOrCreateBlobServiceWithKey(String accountName, String accountKey) {
log.debug "Creating Azure blob storage client -- accountName=$accountName; accountKey=${accountKey?.substring(0,5)}.."

final credential = new StorageSharedKeyCredential(accountName, accountKey);
final endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
final credential = new StorageSharedKeyCredential(accountName, accountKey)
final endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName)

return new BlobServiceClientBuilder()
.endpoint(endpoint)
@@ -190,12 +192,49 @@ class AzHelper {

log.debug "Creating Azure blob storage client -- accountName: $accountName; sasToken: ${sasToken?.substring(0,10)}.."

final endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
final endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName)

return new BlobServiceClientBuilder()
.endpoint(endpoint)
.sasToken(sasToken)
.buildClient()
}

@Memoized
static synchronized BlobServiceClient getOrCreateBlobServiceWithServicePrincipal(String accountName, String clientId, String clientSecret, String tenantId) {
log.debug "Creating Azure Blob storage client using Service Principal credentials"

final endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName)

final credential = new ClientSecretCredentialBuilder()
.clientId(clientId)
.clientSecret(clientSecret)
.tenantId(tenantId)
.build()

return new BlobServiceClientBuilder()
.credential(credential)
.endpoint(endpoint)
.buildClient()
}

@Memoized
static synchronized BlobServiceClient getOrCreateBlobServiceWithManagedIdentity(String accountName, String clientId) {
if( !clientId )
throw new IllegalArgumentException("Missing Azure blob managed identity client ID")

log.debug "Creating Azure blob storage client -- accountName: $accountName; clientId: ${clientId}"

final endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName)

final credential = new DefaultAzureCredentialBuilder()
.managedIdentityClientId(clientId)
.build()
Collaborator

It would be worth checking whether this can pick up a system-assigned identity as well as a named one. You should be able to do this by only calling managedIdentityClientId(clientId) when a client ID is given (educated guess here):

        final builder = new DefaultAzureCredentialBuilder()
        if( clientId ) {
            builder.managedIdentityClientId(clientId)
        }

        final credential = builder.build()

Collaborator

This will also enable people to authenticate as themselves on their personal machines if they're logged in to Azure.

Member Author

We'll need another config option to explicitly enable the system assigned identity.

Here are the docs for the default credential builder: https://learn.microsoft.com/en-us/java/api/com.azure.identity.defaultazurecredentialbuilder?view=azure-java-stable

It's not obvious to me that it defaults to the system-assigned identity, since it seems to support other credentials. But maybe you can discern better than me.

Collaborator

If you just build a credential with DefaultAzureCredentialBuilder and no further options, it tries the following sources in this order: https://learn.microsoft.com/en-us/azure/developer/java/sdk/identity-azure-hosted-auth#default-azure-credential

  1. Environment - DefaultAzureCredential reads account information specified via environment variables and uses it to authenticate.
  2. Managed Identity - If the application deploys to an Azure host with Managed Identity enabled, DefaultAzureCredential authenticates with that account.
  3. IntelliJ - If you've authenticated via Azure Toolkit for IntelliJ, DefaultAzureCredential authenticates with that account.
  4. Visual Studio Code - If you've authenticated via the Visual Studio Code Azure Account plugin, DefaultAzureCredential authenticates with that account.
  5. Azure CLI - If you've authenticated an account via the Azure CLI az login command, DefaultAzureCredential authenticates with that account.

More specifically they give this example where they state that the clientId is only required if using a user assigned identity:

/**
 * Authenticate with a managed identity.
 */
public void createManagedIdentityCredential() {
    ManagedIdentityCredential managedIdentityCredential = new ManagedIdentityCredentialBuilder()
        .clientId("<USER ASSIGNED MANAGED IDENTITY CLIENT ID>") // only required for user assigned
        .build();

    // Azure SDK client builders accept the credential as a parameter
    SecretClient client = new SecretClientBuilder()
        .vaultUrl("https://{YOUR_VAULT_NAME}.vault.azure.net")
        .credential(managedIdentityCredential)
        .buildClient();
}

https://github.com/Azure/azure-sdk-for-java/wiki/Azure-Identity-Examples#authenticating-in-azure-with-managed-identity
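
To make the ordering above concrete, here is a toy model of a "first available source wins" chain in plain Java. It does not use the Azure SDK: the class `CredentialChainSketch`, the source labels, and the availability map are all made up for illustration, and the real `DefaultAzureCredential` may differ in detail. It only shows why a host with a managed identity would be picked before an Azure CLI login when both are present.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only: hypothetical names, no Azure SDK involved.
final class CredentialChainSketch {

    // Order copied from the list quoted above; the labels are illustrative.
    static final String[] ORDER = {
        "environment", "managed-identity", "intellij", "vscode", "azure-cli"
    };

    /**
     * Returns the first source in ORDER that is marked as available,
     * or an empty Optional if no source is available.
     */
    static Optional<String> resolve(Map<String, Boolean> available) {
        for (String source : ORDER) {
            if (Boolean.TRUE.equals(available.get(source))) {
                return Optional.of(source);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // An Azure VM with an identity attached, where someone also ran `az login`.
        Map<String, Boolean> azureVm = new LinkedHashMap<>();
        azureVm.put("managed-identity", true);
        azureVm.put("azure-cli", true);
        System.out.println(resolve(azureVm).orElse("none"));

        // A laptop with only an Azure CLI login.
        Map<String, Boolean> laptop = new LinkedHashMap<>();
        laptop.put("azure-cli", true);
        System.out.println(resolve(laptop).orElse("none"));
    }
}
```

In this model the VM resolves to the managed identity and the laptop falls through to the CLI login, which matches the behaviour described in this thread.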

Member Author

Does this mean that with the system-assigned identity you would only need to know the batch/storage account names and then you could submit jobs from anywhere without any client-side credentials?

Member Author

Or is the "system" the node from which you are submitting tasks i.e. the Nextflow head job which would presumably be running in Azure?

Collaborator

Both. You would need to give it the batch and storage account names, but it would authenticate because of what it is, rather than what it knows. So if you ran it on an Azure VM with a System-assigned identity, you would need to:

  • Grant that system assigned identity permissions for the Batch and Storage account
  • Configure Nextflow to try and submit to that Batch and Storage account
  • Run Nextflow on that virtual machine, which will assume that identity


return new BlobServiceClientBuilder()
.credential(credential)
.endpoint(endpoint)
.buildClient()
}

}
@@ -19,7 +19,7 @@ import groovy.transform.CompileStatic
import nextflow.cloud.azure.nio.AzFileSystemProvider

/**
* Model Azure identity options from nextflow config file
* Model Azure Entra (formerly Active Directory) config options
*
* @author Abhinav Sharma <abhi18av@outlook.com>
*/
@@ -40,13 +40,16 @@ class AzConfig {

private AzActiveDirectoryOpts activeDirectoryOpts

private AzManagedIdentityOpts managedIdentityOpts

AzConfig(Map azure) {
this.batchOpts = new AzBatchOpts( (Map)azure.batch ?: Collections.emptyMap() )
this.storageOpts = new AzStorageOpts( (Map)azure.storage ?: Collections.emptyMap() )
this.registryOpts = new AzRegistryOpts( (Map)azure.registry ?: Collections.emptyMap() )
this.azcopyOpts = new AzCopyOpts( (Map)azure.azcopy ?: Collections.emptyMap() )
this.retryConfig = new AzRetryConfig( (Map)azure.retryPolicy ?: Collections.emptyMap() )
this.activeDirectoryOpts = new AzActiveDirectoryOpts((Map) azure.activeDirectory ?: Collections.emptyMap())
this.managedIdentityOpts = new AzManagedIdentityOpts((Map) azure.managedIdentity ?: Collections.emptyMap())
}

AzCopyOpts azcopy() { azcopyOpts }
@@ -61,6 +64,8 @@ class AzConfig {

AzActiveDirectoryOpts activeDirectory() { activeDirectoryOpts }

AzManagedIdentityOpts managedIdentity() { managedIdentityOpts }

static AzConfig getConfig(Session session) {
if( !session )
throw new IllegalStateException("Missing Nextflow session")
@@ -0,0 +1,42 @@
/*
* Copyright 2013-2024, Seqera Labs
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package nextflow.cloud.azure.config

import groovy.transform.CompileStatic
import nextflow.cloud.azure.nio.AzFileSystemProvider

/**
* Model Azure managed identity config options
*
* @author Ben Sherman <bentshermann@gmail.com>
*/
@CompileStatic
class AzManagedIdentityOpts {

    String clientId

    AzManagedIdentityOpts(Map config) {
        assert config != null
        this.clientId = config.clientId
    }
bentsherman marked this conversation as resolved.

    Map<String, Object> getEnv() {
        Map<String, Object> props = new HashMap<>();
        props.put(AzFileSystemProvider.AZURE_MANAGED_IDENTITY, clientId)
        return props
    }

}
@@ -49,10 +49,10 @@ class AzPathFactory extends FileSystemPathFactory {
throw new IllegalArgumentException("Invalid Azure path URI - make sure the schema prefix does not container more than two slash characters - offending value: $uri")

final storageConfigEnv = AzConfig.getConfig().storage().getEnv()

final activeDirectoryConfigEnv = AzConfig.getConfig().activeDirectory().getEnv()
final managedIdentityConfigEnv = AzConfig.getConfig().managedIdentity().getEnv()

final configEnv = storageConfigEnv + activeDirectoryConfigEnv
final configEnv = storageConfigEnv + activeDirectoryConfigEnv + managedIdentityConfigEnv

// find the related file system
final fs = getFileSystem(uri0(uri), configEnv)
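
A note on the `+` above: in Groovy, adding two maps returns a new map in which entries from the right-hand map take precedence over entries with the same key on the left. The following plain-Java sketch mirrors that behaviour with `putAll`. The class `EnvMergeSketch` and the keys used in `main` are hypothetical and are not the actual constants defined in `AzFileSystemProvider`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of right-hand precedence when combining option maps.
// Hypothetical class name and keys; not part of nf-azure.
final class EnvMergeSketch {

    /** Merges the given maps left to right; later maps override earlier ones. */
    @SafeVarargs
    static Map<String, Object> merge(Map<String, Object>... parts) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map<String, Object> part : parts) {
            result.putAll(part);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> storage = new HashMap<>();
        storage.put("example.account.name", "mystorage");   // hypothetical key

        Map<String, Object> managedIdentity = new HashMap<>();
        managedIdentity.put("example.managed.identity", "my-client-id");  // hypothetical key

        // Equivalent in spirit to: storage + managedIdentity
        System.out.println(merge(storage, managedIdentity));
    }
}
```

Because the three option classes in this change use distinct keys, the order of the operands should not matter here; it would only matter if two of them ever wrote the same key.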