Terraform module to set up Amazon Managed Workflows for Apache Airflow (MWAA), i.e. Airflow as a managed service on AWS.

idealo/terraform-aws-mwaa

AWS MWAA Terraform Module

Terraform module which creates AWS MWAA resources and connects them together.

How to

Contribute

If the automated doc generation (listed under checks) fails as part of a PR from a fork, please mention us in the PR conversation or raise an issue.

Use

Use this code to create a basic MWAA environment (using all default parameters, see Inputs):

module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  account_id           = "123456789012" # 12-digit AWS account ID
  environment_name     = "MyEnvironment"
  internet_gateway_id  = "igw-12345"
  private_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"] # depending on your VPC IP range
  public_subnet_cidrs  = ["10.0.3.0/24", "10.0.4.0/24"] # depending on your VPC IP range
  region               = "us-west-1"
  source_bucket_arn    = "arn:aws:s3:::my-mwaa-bucket" # S3 bucket names must be lowercase
  vpc_id               = "vpc-12345"
}

Add permissions to the Airflow execution role

To give additional permissions to your Airflow execution role (e.g. elasticmapreduce:RunJobFlow to start an EMR cluster), create a policy document containing the permissions you need:

data "aws_iam_policy_document" "additional_execution_policy_doc" {
  statement {
    effect = "Allow"
    actions = [
      "<Your permissions>"
    ]
    resources = [
      "<YourResource>"
    ]
  }
}

Then pass the document JSON to the module:

module "airflow" {
  ...
  additional_execution_role_policy_document_json = data.aws_iam_policy_document.additional_execution_policy_doc.json
  ...
}
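
As a concrete illustration of the pattern above, the following sketch grants the execution role permission to start EMR clusters. It assumes elasticmapreduce:RunJobFlow (the EMR API action that launches a cluster) is what you need; the data source name emr_policy_doc is hypothetical, and the wildcard resource is only there for brevity. Real setups would typically narrow the resources and may also need iam:PassRole for the EMR roles.

```hcl
# Hypothetical example: allow the MWAA execution role to start EMR clusters.
data "aws_iam_policy_document" "emr_policy_doc" {
  statement {
    effect    = "Allow"
    actions   = ["elasticmapreduce:RunJobFlow"]
    resources = ["*"] # assumption: restrict this in a real environment
  }
}

module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  # ... required inputs as in the basic example ...

  additional_execution_role_policy_document_json = data.aws_iam_policy_document.emr_policy_doc.json
}
```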

Add custom plugins

Upload plugins.zip to S3 and pass its path relative to the root of the MWAA bucket to the plugins_s3_path parameter. If you zip and upload the file via Terraform, it looks like this:

module "airflow" {
  ...
  plugins_s3_path = aws_s3_object.your_plugin.key # aws_s3_bucket_object is deprecated since AWS provider v4
  ...
}
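
The zip-and-upload step itself could be sketched as follows. This is one possible approach, not necessarily how the module authors do it: it assumes the hashicorp/archive provider is available, that your plugin sources live in a ./plugins directory (a hypothetical path), and that the bucket is the versioned aws_s3_bucket.mwaa from the S3 Bucket configuration section.

```hcl
# Assumption: plugin sources live in ./plugins next to this configuration.
data "archive_file" "plugins" {
  type        = "zip"
  source_dir  = "${path.module}/plugins"
  output_path = "${path.module}/build/plugins.zip"
}

resource "aws_s3_object" "plugins" {
  bucket      = aws_s3_bucket.mwaa.id
  key         = "plugins.zip"
  source      = data.archive_file.plugins.output_path
  source_hash = data.archive_file.plugins.output_base64sha256 # re-upload when the archive changes

  # version_id is only populated once versioning is enabled on the bucket
  depends_on = [aws_s3_bucket_versioning.mwaa]
}

module "airflow" {
  # ... other inputs ...
  plugins_s3_path           = aws_s3_object.plugins.key
  plugins_s3_object_version = aws_s3_object.plugins.version_id
}
```

Passing plugins_s3_object_version pins the environment to a specific upload, so a changed archive results in an explicit environment update.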

Use your own networking config

If you set create_networking_config = false, no subnets, EIPs, NAT gateways, or route tables are created. You still need these networking resources for your environment to run, so follow the official documentation to create them properly and pass the IDs of your existing private subnets via private_subnet_ids.
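
A minimal sketch of such a setup, assuming two existing private subnets (MWAA requires two private subnets in different Availability Zones) that already have outbound routing in place. All IDs below are placeholders.

```hcl
module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  account_id        = "123456789012"
  environment_name  = "MyEnvironment"
  region            = "us-west-1"
  source_bucket_arn = "arn:aws:s3:::my-mwaa-bucket"
  vpc_id            = "vpc-12345"

  # Bring your own networking: the module creates no subnets, EIPs,
  # NAT gateways, or route tables.
  create_networking_config = false
  private_subnet_ids       = ["subnet-11111", "subnet-22222"] # placeholder IDs
}
```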

S3 Bucket configuration

MWAA needs an S3 bucket to store the DAG files. Here is a minimal configuration for this bucket:

resource "aws_s3_bucket" "mwaa" {
  bucket = "<YourBucketName>"
}

resource "aws_s3_bucket_versioning" "mwaa" {
  bucket = aws_s3_bucket.mwaa.bucket
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_public_access_block" "mwaa" {
  # required: https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-s3-bucket.html
  bucket                  = aws_s3_bucket.mwaa.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
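
To connect this bucket to the module and populate it, something like the following could work. It is a sketch under assumptions: the DAG files are assumed to live in a local ./dags directory (a hypothetical path), and the resource name dags is illustrative.

```hcl
# Assumption: DAG files live in ./dags next to this configuration.
resource "aws_s3_object" "dags" {
  for_each = fileset("${path.module}/dags", "**/*.py")

  bucket = aws_s3_bucket.mwaa.id
  key    = "dags/${each.value}"
  source = "${path.module}/dags/${each.value}"
  etag   = filemd5("${path.module}/dags/${each.value}") # re-upload on change
}

module "airflow" {
  # ... other inputs ...
  source_bucket_arn = aws_s3_bucket.mwaa.arn
  dag_s3_path       = "dags/" # matches the key prefix used above
}
```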

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0.0 |
| aws | >= 5.0.0 |
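
In your own root configuration, these constraints can be expressed with a standard terraform block, for example:

```hcl
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0.0"
    }
  }
}
```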

Providers

| Name | Version |
|------|---------|
| aws | 5.45.0 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| aws_eip.this | resource |
| aws_iam_role.this | resource |
| aws_iam_role_policy.this | resource |
| aws_internet_gateway.this | resource |
| aws_mwaa_environment.this | resource |
| aws_nat_gateway.this | resource |
| aws_route_table.private | resource |
| aws_route_table.public | resource |
| aws_route_table_association.private | resource |
| aws_route_table_association.public | resource |
| aws_security_group.this | resource |
| aws_security_group_rule.egress_all_ipv4 | resource |
| aws_security_group_rule.egress_all_ipv6 | resource |
| aws_security_group_rule.ingress_from_self | resource |
| aws_subnet.private | resource |
| aws_subnet.public | resource |
| aws_availability_zones.available | data source |
| aws_iam_policy_document.assume | data source |
| aws_iam_policy_document.base | data source |
| aws_iam_policy_document.this | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| account_id | Account ID of the account in which MWAA will be started | string | n/a | yes |
| additional_associated_security_group_ids | Security group IDs of existing security groups that should be associated with the MWAA environment. | list(string) | [] | no |
| additional_execution_role_policy_document_json | Additional permissions to attach to the base mwaa execution role | string | "{}" | no |
| airflow_configuration_options | additional configuration to overwrite airflows standard config | map(string) | {} | no |
| airflow_version | Airflow version to be used | string | "2.0.2" | no |
| create_networking_config | true if networking resources (subnets, eip, NAT gateway and route table) should be created. | bool | true | no |
| dag_processing_logs_enabled | n/a | bool | true | no |
| dag_processing_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
| dag_s3_path | Relative path of the dags folder within the source bucket | string | "dags/" | no |
| enable_ipv6_in_security_group | Enable IPv6 in the security group | bool | false | no |
| environment_class | n/a | string | "mw1.small" | no |
| environment_name | Name of the MWAA environment | string | n/a | yes |
| internet_gateway_id | ID of the internet gateway to the VPC, if not set and create_networking_config = true an internet gateway will be created | string | null | no |
| kms_key_arn | KMS CMK ARN to use by MWAA for data encryption. MUST reference the same KMS key as used by S3 bucket specified by source_bucket_arn, if the bucket uses KMS. If not specified, the default AWS owned key for MWAA will be used for backward compatibility with version 1.0.1 of this module. | string | null | no |
| max_workers | numeric string, min 1 | string | "10" | no |
| min_workers | numeric string, min 1 | string | "1" | no |
| plugins_s3_object_version | n/a | string | null | no |
| plugins_s3_path | relative path of the plugins.zip within the source bucket | string | null | no |
| private_subnet_cidrs | CIDR blocks for the private subnets MWAA uses. Must be at least 2 if create_networking_config=true | list(string) | [] | no |
| private_subnet_ids | Subnet Ids of the existing private subnets that should be used if create_networking_config=false | list(string) | [] | no |
| public_subnet_cidrs | CIDR blocks for the public subnets MWAA uses. Must be at least 2 if create_networking_config=true | list(string) | [] | no |
| region | AWS Region where the environment and its resources will be created | string | n/a | yes |
| requirements_s3_object_version | n/a | string | null | no |
| requirements_s3_path | relative path of the requirements.txt (incl. filename) within the source bucket | string | null | no |
| scheduler_logs_enabled | n/a | bool | true | no |
| scheduler_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
| source_bucket_arn | ARN of the bucket in which DAGs, Plugin and Requirements are put | string | n/a | yes |
| startup_script_s3_object_version | n/a | string | null | no |
| startup_script_s3_path | The relative path to the script hosted in your bucket. The script runs as your environment starts before starting the Apache Airflow process. | string | null | no |
| tags | tags and logging | map(string) | {} | no |
| task_logs_enabled | n/a | bool | true | no |
| task_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "INFO" | no |
| vpc_id | VPC id of the VPC in which the environments resources are created | string | n/a | yes |
| webserver_access_mode | Default: PRIVATE_ONLY | string | null | no |
| webserver_logs_enabled | n/a | bool | true | no |
| webserver_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
| weekly_maintenance_window_start | The day and time of the week in Coordinated Universal Time (UTC) 24-hour standard time to start weekly maintenance updates of your environment in the following format: DAY:HH:MM. For example: TUE:03:30. You can specify a start time in 30 minute increments only | string | "MON:01:00" | no |
| worker_logs_enabled | n/a | bool | true | no |
| worker_logs_level | One of: DEBUG, INFO, WARNING, ERROR, CRITICAL | string | "WARNING" | no |
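
To show how several of the optional inputs combine, here is a hedged sketch of a tuned environment. The concrete values (environment class, worker counts, the core.default_task_retries option, the tag) are illustrative assumptions, not recommendations from the module authors.

```hcl
module "airflow" {
  source  = "idealo/mwaa/aws"
  version = "x.x.x"

  # ... required inputs as in the basic example ...

  environment_class = "mw1.medium"
  min_workers       = "2" # passed as strings, per the Type column
  max_workers       = "5"

  task_logs_level      = "INFO"
  scheduler_logs_level = "ERROR"

  webserver_access_mode           = "PUBLIC_ONLY"
  weekly_maintenance_window_start = "SUN:02:30" # DAY:HH:MM in UTC, 30-minute increments

  # Keys follow the "<section>.<option>" form of Airflow configuration options
  airflow_configuration_options = {
    "core.default_task_retries" = "3"
  }

  tags = {
    Team = "data-platform" # hypothetical tag
  }
}
```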

Outputs

| Name | Description |
|------|-------------|
| mwaa_arn | The arn of the created MWAA environment. |
| mwaa_execution_role_arn | The IAM Role arn for MWAA Execution Role. |
| mwaa_nat_gateway_public_ips | List of the ips of the nat gateways created by this module. |
| mwaa_security_group_id | The security group id of the MWAA Environment. |
| mwaa_service_role_arn | The Service Role arn for MWAA environment. |
| mwaa_webserver_url | The webserver URL of the MWAA Environment. |
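
These outputs can be referenced from the calling configuration in the usual way. The sketch below assumes the module instance is named airflow, as in the usage examples; the output names on the left are hypothetical. Depending on how the module exposes it, the webserver URL may be a bare hostname without a scheme.

```hcl
# Re-export the Airflow web UI address from the root module.
output "airflow_webserver_url" {
  description = "Webserver URL of the MWAA environment"
  value       = module.airflow.mwaa_webserver_url
}

# Useful for allowlisting MWAA egress traffic in external systems.
output "airflow_nat_gateway_ips" {
  description = "Public IPs of the NAT gateways created by the module"
  value       = module.airflow.mwaa_nat_gateway_public_ips
}
```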