AWS Batch User Guide

AWS Batch lets you run batch computing workloads on AWS. It has four main components: jobs, which are the units of work; job definitions, which specify how jobs are run; job queues, which hold jobs until they are scheduled; and compute environments, which provide the resources that jobs run on. This user guide explains how to get started with AWS Batch, including setting up IAM roles and security groups, submitting example jobs, and creating job definitions.

Uploaded by

Himanshu Khare


Amazon's trademarks and trade dress may not be used in connection with any product or service that is not
Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or
discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may
or may not be affiliated with, connected to, or sponsored by Amazon.

Table of Contents
What Is AWS Batch? ........................................................................................................................... 1
Components of AWS Batch ......................................................................................................... 1
Jobs ................................................................................................................................. 1
Job Definitions ................................................................................................................... 1
Job Queues ....................................................................................................................... 1
Compute Environment ........................................................................................................ 1
Getting Started .......................................................................................................................... 2
Setting Up ........................................................................................................................................ 3
Sign Up for AWS ........................................................................................................................ 3
Create an IAM User .................................................................................................................... 3
Create IAM Roles for your Compute Environments and Container Instances ........................................ 5
Create a Key Pair ....................................................................................................................... 5
Create a Virtual Private Cloud ..................................................................................................... 7
Create a Security Group .............................................................................................................. 7
Install the AWS CLI .................................................................................................................... 8
Getting Started .................................................................................................................................. 9
Step 1: Define a Job ................................................................................................................... 9
Step 2: Configure the Compute Environment and Job Queue ......................................................... 11
Jobs ................................................................................................................................................ 14
Submitting a Job ..................................................................................................................... 14
Job States ............................................................................................................................... 16
Job Environment Variables ........................................................................................................ 17
Automated Job Retries .............................................................................................................. 18
Job Dependencies .................................................................................................................... 19
Job Timeouts ........................................................................................................................... 19
Array Jobs ............................................................................................................................... 20
Example Array Job Workflow ............................................................................................. 21
Tutorial: Using array job index ........................................................................................... 23
Multi-node Parallel Jobs ............................................................................................................ 27
Environment Variables ...................................................................................................... 28
Node Groups ................................................................................................................... 28
Job Lifecycle .................................................................................................................... 28
Compute Environment Considerations ................................................................................. 29
GPU Jobs ................................................................................................................................ 30
Job definitions ................................................................................................................................. 31
Creating a job definition ........................................................................................................... 31
Creating a multi-node parallel job definition ................................................................................ 36
Job definition template ............................................................................................................. 39
Job definition parameters ......................................................................................................... 43
Job definition name ......................................................................................................... 44
Type ............................................................................................................................... 44
Parameters ...................................................................................................................... 44
Platform capabilities ......................................................................................................... 45
Propagate tags ................................................................................................................. 45
Container properties ......................................................................................................... 45
Node properties ............................................................................................................... 61
Retry strategy .................................................................................................................. 62
Tags ................................................................................................................................ 64
Timeout .......................................................................................................................... 64
Using the awslogs log driver ...................................................................................................... 64
Available awslogs log driver options ................................................................................... 65
Specifying a log configuration in your job definition ............................................................. 66
Specifying sensitive data ........................................................................................................... 67
Using Secrets Manager ...................................................................................................... 67
Using Systems Manager Parameter Store ............................................................................ 73
Amazon EFS volumes ................................................................................................................ 75
Amazon EFS volume considerations .................................................................................... 76
Using Amazon EFS access points ........................................................................................ 76
Specifying an Amazon EFS file system in your job definition .................................................. 77
Example job definitions ............................................................................................................. 79
Use environment variables ................................................................................................ 79
Using parameter substitution ............................................................................................. 80
Test GPU functionality ...................................................................................................... 80
Multi-node parallel job ..................................................................................................... 81
Job queues ...................................................................................................................................... 82
Creating a job queue ................................................................................................................ 82
Job queue template ......................................................................................................... 83
Job queue parameters .............................................................................................................. 83
Job queue name .............................................................................................................. 83
Priority ............................................................................................................................ 83
Scheduling policy ............................................................................................................. 84
State ............................................................................................................................... 84
Compute environment order .............................................................................................. 84
Tags ................................................................................................................................ 85
Job Scheduling ................................................................................................................................ 86
Compute environment ...................................................................................................................... 87
Managed compute environments ................................................................................................ 87
Unmanaged compute environments ........................................................................................... 88
Compute resource AMIs ............................................................................................................ 88
Compute resource AMI specification ................................................................................... 89
Creating a compute resource AMI ....................................................................................... 90
Using a GPU workload AMI ................................................................................................ 92
Launch template support .......................................................................................................... 96
Amazon EC2 user data in launch templates ......................................................................... 97
Creating a compute environment ............................................................................................... 99
To create a managed compute environment using AWS Fargate resources .............................. 100
To create a managed compute environment using EC2 resources .......................................... 101
To create an unmanaged compute environment using EC2 resources ..................................... 103
Compute environment template ............................................................................................... 104
Compute environment parameters ............................................................................................ 105
Compute environment name ............................................................................................ 105
Type ............................................................................................................................. 106
State ............................................................................................................................. 106
Compute resources ......................................................................................................... 106
Service role .................................................................................................................... 112
Tags .............................................................................................................................. 112
EC2 Configurations ................................................................................................................. 113
Allocation strategies ............................................................................................................... 113
Memory Management ............................................................................................................. 114
Reserving System Memory ............................................................................................... 114
Viewing Compute Resource Memory ................................................................................. 114
Scheduling policies ......................................................................................................................... 116
Creating a scheduling policy .................................................................................................... 116
Scheduling policy template .............................................................................................. 117
Scheduling policy parameters .................................................................................................. 117
Scheduling policy name .................................................................................................. 117
Fair share policy ............................................................................................................. 118
Tags .............................................................................................................................. 119
Orchestrate AWS Batch jobs ............................................................................................................ 120
Viewing state machine details .................................................................................................. 120
Editing a state machine ........................................................................................................... 120
Running a state machine ......................................................................................................... 121
AWS Batch on AWS Fargate ............................................................................................................. 122
When to use Fargate ............................................................................................................... 122
Job definitions on Fargate ....................................................................................................... 122
Job queues on Fargate ............................................................................................................ 124
Compute environments on Fargate ........................................................................................... 124
Elastic Fabric Adapter ...................................................................................................................... 125
IAM policies, roles, and permissions .................................................................................................. 127
Policy structure ...................................................................................................................... 127
Policy syntax .................................................................................................................. 128
Actions for AWS Batch .................................................................................................... 128
Amazon Resource Names for AWS Batch ........................................................................... 129
Testing permissions ........................................................................................................ 129
Supported resource-level permissions ....................................................................................... 130
Condition keys ............................................................................................................... 137
Example policies ..................................................................................................................... 138
Read-only access ............................................................................................................ 138
Restricting user, image, privilege, role ............................................................................... 139
Restrict job submission .................................................................................................... 140
Restrict job queue .......................................................................................................... 140
AWS Batch managed policy ..................................................................................................... 141
AWSBatchFullAccess ........................................................................................................ 141
Creating IAM policies .............................................................................................................. 142
AWS Batch service IAM role ..................................................................................................... 142
Amazon ECS instance role ....................................................................................................... 145
Amazon EC2 spot fleet role ..................................................................................................... 145
Create Amazon EC2 spot fleet roles in the AWS Management Console ................................... 146
Create Amazon EC2 Spot Fleet Roles with the AWS CLI ....................................................... 146
EventBridge IAM role .............................................................................................................. 147
EventBridge ................................................................................................................................... 149
AWS Batch Events .................................................................................................................. 149
Job State Change Events ................................................................................................. 149
AWS Batch Jobs as EventBridge Targets .................................................................................... 151
Creating a Scheduled Job ................................................................................................ 151
Event Input Transformer .................................................................................................. 152
Tutorial: Listening for AWS Batch EventBridge ............................................................................ 154
Prerequisites .................................................................................................................. 154
Step 1: Create the Lambda Function ................................................................................. 154
Step 2: Register Event Rule .............................................................................................. 155
Step 3: Test Your Configuration ........................................................................................ 156
Tutorial: Sending Amazon Simple Notification Service Alerts for Failed Job Events ........................... 157
Prerequisites .................................................................................................................. 157
Step 1: Create and Subscribe to an Amazon SNS Topic ........................................................ 157
Step 2: Register Event Rule .............................................................................................. 157
Step 3: Test Your Rule ..................................................................................................... 158
CloudWatch Logs ............................................................................................................................ 159
CloudWatch Logs IAM Policy .................................................................................................... 159
Installing and configuring the CloudWatch agent ........................................................................ 160
Viewing CloudWatch Logs ....................................................................................................... 160
CloudTrail ...................................................................................................................................... 162
AWS Batch Information in CloudTrail ........................................................................................ 162
Understanding AWS Batch Log File Entries ................................................................................ 163
Tutorial: Creating a VPC .................................................................................................................. 165
Step 1: Create an Elastic IP Address for Your NAT Gateway .......................................................... 165
Step 2: Run the VPC Wizard .................................................................................................... 165
Step 3: Create Additional Subnets ............................................................................................ 166
Next Steps ............................................................................................................................. 166
Security ......................................................................................................................................... 168
Identity and Access Management .............................................................................................. 168
Audience ....................................................................................................................... 168
Authenticating with identities .......................................................................................... 169
Managing access using policies ......................................................................................... 170
How AWS Batch works with IAM ...................................................................................... 172
Execution IAM role .......................................................................................................... 176
Identity-based policy examples ........................................................................................ 178
Troubleshooting ............................................................................................................. 179
Using Service-Linked Roles .............................................................................................. 181
AWS managed policies .................................................................................................... 189
Compliance Validation ............................................................................................................. 195
Infrastructure Security ............................................................................................................. 196
Tagging your resources ................................................................................................................... 197
Tag basics .............................................................................................................................. 197
Tagging your resources ........................................................................................................... 197
Tag restrictions ...................................................................................................................... 198
Working with tags using the console ......................................................................................... 199
Adding tags on an individual resource on creation .............................................................. 199
Adding and deleting tags on an individual resource ............................................................ 199
Working with tags using the CLI or API ..................................................................................... 199
Service Quotas ............................................................................................................................... 201
Troubleshooting ............................................................................................................................. 202
INVALID Compute Environment .............................................................................................. 202
Incorrect Role Name or ARN ............................................................................................ 202
Repairing an INVALID Compute Environment .................................................................... 203
Jobs Stuck in RUNNABLE Status ................................................................................................ 203
Spot Instances Not Tagged on Creation ..................................................................................... 204
Spot Instances not scaling down .............................................................................................. 204
Attach AmazonEC2SpotFleetTaggingRole managed policy to your Spot Fleet role in the AWS Management Console ...................................................................................................... 205
Attach AmazonEC2SpotFleetTaggingRole managed policy to your Spot Fleet role with the AWS CLI ........................................................................................................................ 205
Can't retrieve Secrets Manager secrets ...................................................................................... 205
Can't override job definition resource requirements ..................................................................... 206
Document history ........................................................................................................................... 207


What Is AWS Batch?

AWS Batch helps you to run batch computing workloads on the AWS Cloud. Batch computing is a
common way for developers, scientists, and engineers to access large amounts of compute resources.
AWS Batch removes the undifferentiated heavy lifting of configuring and managing the required
infrastructure, similar to traditional batch computing software. This service can efficiently provision
resources in response to submitted jobs to eliminate capacity constraints, reduce compute costs,
and deliver results quickly.

As a fully managed service, AWS Batch helps you to run batch computing workloads of any scale. AWS
Batch automatically provisions compute resources and optimizes the workload distribution based on
the quantity and scale of the workloads. With AWS Batch, there's no need to install or manage batch
computing software, so you can focus your time on analyzing results and solving problems.

Components of AWS Batch

AWS Batch simplifies running batch jobs across multiple Availability Zones within a Region. You can
create AWS Batch compute environments within a new or existing VPC. After a compute environment is
up and associated with a job queue, you can create job definitions that specify which Docker container
images your jobs run. Container images are stored in and pulled from container registries, which can
exist within or outside of your AWS infrastructure.

Jobs
A job is a unit of work (such as a shell script, a Linux executable, or a Docker container image) that you
submit to AWS Batch. It has a name and runs as a containerized application on AWS Fargate or Amazon EC2
resources in your compute environment, using parameters that you specify in a job definition. Jobs can
reference other jobs by name or by ID, and can depend on the successful completion of other jobs.
For more information, see Jobs (p. 14).
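
To make the dependency idea concrete, the following is a minimal sketch, not an official example. It only builds the keyword arguments for a job submission in the shape of the AWS Batch SubmitJob request; it does not contact AWS. The queue name, job definition name, and job ID are hypothetical placeholders.

```python
from typing import Dict, List, Optional


def build_submit_job_request(
    job_name: str,
    job_queue: str,
    job_definition: str,
    depends_on_job_ids: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Return keyword arguments shaped like an AWS Batch SubmitJob request."""
    request: Dict[str, object] = {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
    }
    if depends_on_job_ids:
        # Each entry refers, by job ID, to a job this one depends on.
        request["dependsOn"] = [{"jobId": job_id} for job_id in depends_on_job_ids]
    return request


# Hypothetical names and job ID, for illustration only.
preprocess = build_submit_job_request("preprocess", "example-queue", "example-job-def")
report = build_submit_job_request(
    "report",
    "example-queue",
    "example-job-def",
    depends_on_job_ids=["00000000-0000-0000-0000-000000000000"],
)
```

Assuming boto3 is installed and credentials are configured, a dictionary like `report` could be passed as `boto3.client("batch").submit_job(**report)`. In practice the job ID would come from the response to the earlier submission rather than being hard-coded.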

Job Definitions
A job definition specifies how jobs are to be run. You can think of a job definition as a blueprint for the
resources in your job. You can supply your job with an IAM role to provide access to other AWS resources.
You also specify both memory and CPU requirements. The job definition can also control container
properties, environment variables, and mount points for persistent storage. Many of the specifications in
a job definition can be overridden by specifying new values when submitting individual jobs. For more
information, see Job definitions (p. 31).
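
As an illustration of the blueprint idea, the sketch below assembles a dictionary in the shape of a RegisterJobDefinition request, covering the IAM role, CPU and memory requirements, and environment variables. It then builds a small override of the kind that could accompany an individual job submission. The helper names, image, role ARN, and values are assumptions chosen for the example, and nothing is sent to AWS.

```python
from typing import Dict, Optional


def build_job_definition(
    name: str,
    image: str,
    vcpus: int,
    memory_mib: int,
    job_role_arn: Optional[str] = None,
    environment: Optional[Dict[str, str]] = None,
) -> dict:
    """Return a dict shaped like an AWS Batch RegisterJobDefinition request."""
    container = {
        "image": image,
        "command": ["echo", "hello world"],
        # The API expects resource values as strings.
        "resourceRequirements": [
            {"type": "VCPU", "value": str(vcpus)},
            {"type": "MEMORY", "value": str(memory_mib)},
        ],
    }
    if job_role_arn:
        container["jobRoleArn"] = job_role_arn
    if environment:
        container["environment"] = [
            {"name": key, "value": value} for key, value in environment.items()
        ]
    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": container,
    }


def build_memory_override(memory_mib: int) -> dict:
    """Return containerOverrides that replace only the memory requirement."""
    return {"resourceRequirements": [{"type": "MEMORY", "value": str(memory_mib)}]}


# Hypothetical values, for illustration only.
definition = build_job_definition(
    name="example-job-def",
    image="busybox",
    vcpus=1,
    memory_mib=2048,
    job_role_arn="arn:aws:iam::123456789012:role/ExampleBatchJobRole",
    environment={"STAGE": "test"},
)
overrides = build_memory_override(4096)
```

The definition holds the defaults, while `overrides` would travel with a single submission as its `containerOverrides` value, leaving the registered definition unchanged.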

Job Queues
When you submit an AWS Batch job, you submit it to a particular job queue, where the job resides until
it's scheduled onto a compute environment. You associate one or more compute environments with a job
queue. You can also assign priority values for these compute environments and even across job queues
themselves. For example, you can have a high priority queue that you submit time-sensitive jobs to, and
a low priority queue for jobs that can run anytime when compute resources are cheaper.
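
The high priority and low priority queues described above might be expressed as follows. This is a sketch under stated assumptions: the dictionaries follow the shape of the CreateJobQueue request, the queue and compute environment names are hypothetical, and no request is sent. In AWS Batch, a queue with a higher integer priority is evaluated before one with a lower value.

```python
from typing import List


def build_job_queue(name: str, priority: int, compute_environments: List[str]) -> dict:
    """Return a dict shaped like an AWS Batch CreateJobQueue request."""
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        # Compute environments are tried in ascending "order".
        "computeEnvironmentOrder": [
            {"order": position, "computeEnvironment": environment}
            for position, environment in enumerate(compute_environments, start=1)
        ],
    }


# Hypothetical queue and compute environment names.
urgent_queue = build_job_queue("urgent", 100, ["on-demand-ce", "spot-ce"])
background_queue = build_job_queue("background", 1, ["spot-ce"])
```

Here time-sensitive jobs would go to `urgent`, which prefers the On-Demand environment, while `background` uses only the cheaper Spot environment.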

Compute Environment
A compute environment is a set of managed or unmanaged compute resources that are used to run
jobs. With managed compute environments, you can specify the desired compute type (Fargate or EC2) at
several levels of detail. You can set up compute environments that use a particular EC2 instance type,
such as c5.2xlarge or m5.10xlarge. Or, you can choose only to specify that
you want to use the newest instance types. You can also specify the minimum, desired, and maximum
number of vCPUs for the environment, along with the amount that you're willing to pay for a Spot
Instance as a percentage of the On-Demand Instance price and a target set of VPC subnets. AWS Batch
efficiently launches, manages, and terminates compute resources as needed. You can also manage your own
compute environments. In that case, you're responsible for setting up and scaling the instances in an Amazon
ECS cluster that AWS Batch creates for you. For more information, see Compute environment (p. 87).

Getting Started
Get started with AWS Batch by creating a job definition, compute environment, and a job queue in the
AWS Batch console.

The AWS Batch first-run wizard gives you the option of creating a compute environment and a job queue
and submitting a sample Hello World job. If you already have a Docker image you want to launch in AWS
Batch, you can create a job definition with that image and submit that to your queue instead. For more
information, see Getting Started with AWS Batch (p. 9).

Setting Up with AWS Batch

If you've already signed up for Amazon Web Services (AWS) and have been using Amazon Elastic
Compute Cloud (Amazon EC2) or Amazon Elastic Container Service (Amazon ECS), you are close to being
able to use AWS Batch. The setup process for these services is very similar, as AWS Batch uses Amazon
ECS container instances in its compute environments. To use the AWS CLI with AWS Batch, you must use
a version of the AWS CLI that supports the latest AWS Batch features. If you do not see support for an
AWS Batch feature in the AWS CLI, you should upgrade to the latest version. For more information, see
http://aws.amazon.com/cli/.
Note
Because AWS Batch uses components of Amazon EC2, you use the Amazon EC2 console for
many of these steps.

Complete the following tasks to get set up for AWS Batch. If you have already completed any of these
steps, you may skip them and move on to installing the AWS CLI.

1. Sign Up for AWS (p. 3)

2. Create an IAM User (p. 3)
3. Create IAM Roles for your Compute Environments and Container Instances (p. 5)
4. Create a Key Pair (p. 5)
5. Create a Virtual Private Cloud (p. 7)
6. Create a Security Group (p. 7)
7. Install the AWS CLI (p. 8)

Sign Up for AWS

When you sign up for AWS, your AWS account is automatically signed up for all services, including
Amazon EC2 and AWS Batch. You are charged only for the services that you use.

If you have an AWS account already, skip to the next task. If you don't have an AWS account, use the
following procedure to create one.

To create an AWS account

1. Open https://portal.aws.amazon.com/billing/signup.
2. Follow the online instructions.

Part of the sign-up procedure involves receiving a phone call and entering a verification code on the
phone keypad.

Note your AWS account number, because you'll need it for the next task.

Create an IAM User

Services in AWS, such as Amazon EC2 and AWS Batch, require that you provide credentials when you
access them, so that the service can determine whether you have permission to access its resources. The
console requires your password. You can create access keys for your AWS account to access the command
line interface or API. However, we don't recommend that you access AWS using the credentials for your
AWS account; we recommend that you use AWS Identity and Access Management (IAM) instead. Create
an IAM user, and then add the user to an IAM group with administrative permissions or grant this user
administrative permissions. You can then access AWS using a special URL and the IAM user's credentials.

If you signed up for AWS but have not created an IAM user for yourself, you can create one using the IAM
console.

To create an administrator user for yourself and add the user to an administrators group
(console)

1. Sign in to the IAM console as the account owner by choosing Root user and entering your AWS
account email address. On the next page, enter your password.
Note
We strongly recommend that you adhere to the best practice of using the Administrator
IAM user that follows and securely lock away the root user credentials. Sign in as the root
user only to perform a few account and service management tasks.
2. In the navigation pane, choose Users and then choose Add user.
3. For User name, enter Administrator.
4. Select the check box next to AWS Management Console access. Then select Custom password, and
then enter your new password in the text box.
5. (Optional) By default, AWS requires the new user to create a new password when first signing in. You
can clear the check box next to User must create a new password at next sign-in to allow the new
user to reset their password after they sign in.
6. Choose Next: Permissions.
7. Under Set permissions, choose Add user to group.
8. Choose Create group.
9. In the Create group dialog box, for Group name enter Administrators.
10. Choose Filter policies, and then select AWS managed - job function to filter the table contents.
11. In the policy list, select the check box for AdministratorAccess. Then choose Create group.
Note
You must activate IAM user and role access to Billing before you can use the
AdministratorAccess permissions to access the AWS Billing and Cost Management
console. To do this, follow the instructions in step 1 of the tutorial about delegating access
to the billing console.
12. Back in the list of groups, select the check box for your new group. Choose Refresh if necessary to
see the group in the list.
13. Choose Next: Tags.
14. (Optional) Add metadata to the user by attaching tags as key-value pairs. For more information
about using tags in IAM, see Tagging IAM entities in the IAM User Guide.
15. Choose Next: Review to see the list of group memberships to be added to the new user. When you
are ready to proceed, choose Create user.

You can use this same process to create more groups and users and to give your users access to your AWS
account resources. To learn about using policies that restrict user permissions to specific AWS resources,
see Access management and Example policies.

To sign in as this new IAM user, sign out of the AWS console, then use the following URL, where
your_aws_account_id is your AWS account number without the hyphens (for example, if your AWS
account number is 1234-5678-9012, your AWS account ID is 123456789012):

https://your_aws_account_id.signin.aws.amazon.com/console/

Enter the IAM user name and password that you just created. When you're signed in, the navigation bar
displays "your_user_name @ your_aws_account_id".
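The conversion from an AWS account number to the sign-in URL can be sketched as follows. The function names are hypothetical, and the 12-digit check is an assumption based on the example account number above.

```python
def account_id_from_number(account_number):
    """Strip the hyphens from an account number such as 1234-5678-9012."""
    account_id = account_number.replace("-", "")
    if len(account_id) != 12 or not (account_id.isascii() and account_id.isdigit()):
        raise ValueError("expected a 12-digit AWS account number")
    return account_id


def sign_in_url(account_number):
    """Build the IAM user sign-in URL for the given account number."""
    account_id = account_id_from_number(account_number)
    return f"https://{account_id}.signin.aws.amazon.com/console/"


url = sign_in_url("1234-5678-9012")
print(url)
```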

If you don't want the URL for your sign-in page to contain your AWS account ID, you can create an
account alias. From the IAM dashboard, choose Create Account Alias and enter an alias, such as your
company name. To sign in after you create an account alias, use the following URL:

https://your_account_alias.signin.aws.amazon.com/console/

To verify the sign-in link for IAM users for your account, open the IAM console and check under IAM
users sign-in link on the dashboard.

For more information about IAM, see the AWS Identity and Access Management User Guide.

Create IAM Roles for your Compute Environments and Container Instances
Your AWS Batch compute environments and container instances require AWS account credentials to
make calls to other AWS APIs on your behalf. You must create an IAM role that provides these credentials
to your compute environments and container instances, then associate that role with your compute
environments.
Note
The AWS Batch compute environment and container instance roles are automatically created
for you in the console first-run experience, so if you intend to use the AWS Batch console,
you can move ahead to the next section. If you plan to use the AWS CLI instead, complete the
procedures in AWS Batch service IAM role (p. 142) and Amazon ECS instance role (p. 145)
before creating your first compute environment.

Create a Key Pair

AWS uses public-key cryptography to secure the login information for your instance. A Linux instance,
such as an AWS Batch compute environment container instance, has no password to use for SSH access;
you use a key pair to log in to your instance securely. You specify the name of the key pair when you
create your compute environment, then provide the private key when you log in using SSH.

If you haven't created a key pair already, you can create one using the Amazon EC2 console. Note that
if you plan to launch instances in multiple Regions, you'll need to create a key pair in each Region. For
more information about Regions, see Regions and Availability Zones in the Amazon EC2 User Guide for
Linux Instances.

To create a key pair

1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

2. From the navigation bar, select a Region for the key pair. You can select any Region that's available
to you, regardless of your location. However, key pairs are specific to a Region. For example, if you
plan to launch an instance in the US West (Oregon) Region, you must create a key pair for the
instance in the same Region.

3. In the navigation pane, choose Key Pairs, Create Key Pair.

4. In the Create Key Pair dialog box, for Key pair name, enter a name for the new key pair, and choose
Create. Choose a name that you can remember, such as your IAM user name, followed by -key-pair,
plus the Region name. For example, me-key-pair-uswest2.
5. The private key file is automatically downloaded by your browser. The base file name is the name
you specified as the name of your key pair, and the file name extension is .pem. Save the private key
file in a safe place.
Important
This is the only chance for you to save the private key file. You'll need to provide the name
of your key pair when you launch an instance and the corresponding private key each time
you connect to the instance.
6. If you will use an SSH client on a Mac or Linux computer to connect to your Linux instance, use the
following command to set the permissions of your private key file so that only you can read it.

$ chmod 400 your_user_name-key-pair-region_name.pem

For more information, see Amazon EC2 Key Pairs in the Amazon EC2 User Guide for Linux Instances.
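If you prefer to set the permissions from a script, the chmod 400 command above can be approximated in Python. This is a minimal sketch that assumes a Mac or Linux computer; it runs against a throwaway file with a placeholder name rather than a real private key.

```python
import os
import stat
import tempfile


def restrict_to_owner_read(path):
    """Equivalent of `chmod 400 path`: only the owner can read the file."""
    os.chmod(path, stat.S_IRUSR)  # S_IRUSR is 0o400
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstrate on a temporary file rather than a real private key.
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "me-key-pair-uswest2.pem")
    with open(key_path, "w") as handle:
        handle.write("placeholder, not a real key\n")
    mode = restrict_to_owner_read(key_path)
    print(oct(mode))
    # Restore write permission so that the temporary file can be cleaned up.
    os.chmod(key_path, stat.S_IRUSR | stat.S_IWUSR)
```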

To connect to your instance using your key pair

To connect to your Linux instance from a computer running Mac or Linux, specify the .pem file to your
SSH client with the -i option and the path to your private key. To connect to your Linux instance from a
computer running Windows, you can use either MindTerm or PuTTY. If you plan to use PuTTY, you'll need
to install it and use the following procedure to convert the .pem file to a .ppk file.

(Optional) To prepare to connect to a Linux instance from Windows using PuTTY

1. Download and install PuTTY from http://www.chiark.greenend.org.uk/~sgtatham/putty/. Be sure
to install the entire suite.
2. Start PuTTYgen (for example, from the Start menu, choose All Programs, PuTTY, and PuTTYgen).
3. Under Type of key to generate, choose RSA. If you're using an earlier version of PuTTYgen, choose
SSH-2 RSA.

4. Choose Load. By default, PuTTYgen displays only files with the extension .ppk. To locate your .pem
file, choose the option to display files of all types.

5. Select the private key file that you created in the previous procedure and choose Open. Choose OK
to dismiss the confirmation dialog box.
6. Choose Save private key. PuTTYgen displays a warning about saving the key without a passphrase.
Choose Yes.
7. Specify the same name for the key that you used for the key pair. PuTTY automatically adds the
.ppk file extension.

Create a Virtual Private Cloud

Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a virtual network
that you've defined. We strongly suggest that you launch your container instances in a VPC.

If you have a default VPC, you can skip this section and move to the next task, Create a Security
Group (p. 7). To determine whether you have a default VPC, see Supported Platforms in the Amazon
EC2 Console in the Amazon EC2 User Guide for Linux Instances. Otherwise, you can create a nondefault
VPC in your account using the steps below.

To create a nondefault VPC

1. Open the Amazon VPC console at https://console.aws.amazon.com/vpc/.

2. From the navigation bar, select a Region for the VPC. VPCs are specific to a Region, so you should
select the same Region in which you created your key pair.
3. On the VPC dashboard, choose Start VPC Wizard.
4. On the Step 1: Select a VPC Configuration page, ensure that VPC with a Single Public Subnet is
selected, and choose Select.
5. On the Step 2: VPC with a Single Public Subnet page, enter a friendly name for your VPC for VPC
name. Leave the other default configuration settings, and choose Create VPC. On the confirmation
page, choose OK.

For more information about Amazon VPC, see What is Amazon VPC? in the Amazon VPC User Guide.

Create a Security Group

Security groups act as a firewall for associated compute environment container instances, controlling
both inbound and outbound traffic at the container instance level. You can add rules to a security group
that enable you to connect to your container instance from your IP address using SSH. You can also add
rules that allow inbound and outbound HTTP and HTTPS access from anywhere. Add any rules to open
ports that are required by your tasks.

Note that if you plan to launch container instances in multiple Regions, you need to create a security
group in each Region. For more information, see Regions and Availability Zones in the Amazon EC2 User
Guide for Linux Instances.
Note
You need the public IP address of your local computer, which you can get using a service.
For example, we provide the following service: http://checkip.amazonaws.com/ or
https://checkip.amazonaws.com/. To locate another service that provides your IP address, use the
search phrase "what is my IP address." If you are connecting through an Internet service provider
(ISP) or from behind a firewall without a static IP address, you need to find out the range of IP
addresses used by client computers.

To create a security group with least privilege

1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

2. From the navigation bar, select a Region for the security group. Security groups are specific to a
Region, so you should select the same Region in which you created your key pair.
3. In the navigation pane, choose Security Groups, Create Security Group.
4. Enter a name for the new security group and a description. Choose a name that you can remember,
such as your IAM user name, followed by _SG_, plus the Region name. For example, me_SG_useast1.
5. In the VPC list, ensure that your default VPC is selected; it's marked with an asterisk (*).

6. AWS Batch container instances do not require any inbound ports to be open. However, you might
want to add an SSH rule so you can log into the container instance and examine the containers in
jobs with Docker commands. You can also add rules for HTTP if you want your container instance to
host a job that runs a web server. Complete the following steps to add these optional security group
rules.

On the Inbound tab, create the following rules and choose Create:

• Choose Add Rule. For Type, choose HTTP. For Source, choose Anywhere (0.0.0.0/0).
• Choose Add Rule. For Type, choose SSH. For Source, ensure that Custom IP is selected, and
specify the public IP address of your computer or network in CIDR notation. To specify an
individual IP address in CIDR notation, add the routing prefix /32. For example, if your IP address
is 203.0.113.25, specify 203.0.113.25/32. If your company allocates addresses from a range,
specify the entire range, such as 203.0.113.0/24.
Note
For security reasons, we don't recommend that you allow SSH access from all IP addresses
(0.0.0.0/0) to your instance, except for testing purposes and only for a short time.
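The CIDR notation used for the SSH rule can be checked with the Python standard library ipaddress module. The following is a minimal sketch; the helper names ssh_source_cidr and is_open_to_world are hypothetical.

```python
import ipaddress


def ssh_source_cidr(address):
    """Return CIDR notation for a single IP address or an address range.

    A bare IPv4 address becomes a /32 network. A range with host bits set
    (for example 203.0.113.25/24) raises ValueError.
    """
    return str(ipaddress.ip_network(address, strict=True))


def is_open_to_world(cidr):
    """Return True if the rule source allows every IP address."""
    return ipaddress.ip_network(cidr).prefixlen == 0


single = ssh_source_cidr("203.0.113.25")
company_range = ssh_source_cidr("203.0.113.0/24")
print(single, company_range, is_open_to_world("0.0.0.0/0"))
```

A check such as is_open_to_world can be used to flag an SSH rule whose source is 0.0.0.0/0 before you create it.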

Install the AWS CLI

To use the AWS CLI with AWS Batch, install the latest AWS CLI version. For information about installing
the AWS CLI or upgrading it to the latest version, see Installing the AWS Command Line Interface in the
AWS Command Line Interface User Guide.

Getting Started with AWS Batch

Get started with AWS Batch by creating a job definition, compute environment, and a job queue in the
AWS Batch console.

With the AWS Batch first-run wizard, you can create a compute environment and a job queue and can
optionally also submit a sample hello world job. If you already have a Docker image that you want to
launch in AWS Batch, you can create a job definition with that image and submit that to your queue
instead.
Important
Before you begin, be sure that you completed the steps in Setting Up with AWS Batch (p. 3)
and that your AWS user has the required permissions. Admin users don't need to worry about
permissions issues. For more information, see Creating Your First IAM Admin User and Group in
the IAM User Guide.

Step 1: Define a Job

In this section, you either create a job definition or move ahead to creating a compute
environment and job queue without one.

To configure job options

1. Open the AWS Batch console first-run wizard at https://console.aws.amazon.com/batch/home#/wizard.
2. To create an AWS Batch job definition, compute environment, and job queue and then submit your
job, choose Using Amazon EC2. To only create the compute environment and job queue without
submitting a job, choose No job submission.
3. If you chose to create a job definition, then complete the next four sections of the first-run wizard.
They are Job run-time, Environment, Parameters, and Environment variables. Then, choose Next.
If you're not creating a job definition, choose Next and move on to Step 2: Configure the Compute
Environment and Job Queue (p. 11).

To specify job run time

1. If you're creating a new job definition, for Job definition name, specify a name for your job
definition.
2. (Optional) For Job role, specify an IAM role that provides the container in your job with permissions
to use the AWS APIs. This feature uses Amazon ECS IAM roles for task functionality. For more
information about this feature, including configuration prerequisites, see IAM Roles for Tasks in the
Amazon Elastic Container Service Developer Guide.
Note
Only roles that have the Amazon Elastic Container Service Task Role trust relationship are
shown here. For instructions on creating an IAM role for your AWS Batch jobs, see Creating
an IAM Role and Policy for your Tasks in the Amazon Elastic Container Service Developer
Guide.
3. For Container image, choose the Docker image to use for your job. By default, images in the Docker
Hub registry are available. Optionally, you can also specify other repositories with
repository-url/image:tag. The parameter can be up to 255 characters in length. It can contain uppercase and
lowercase letters, numbers, hyphens (-), underscores (_), colons (:), periods (.), forward slashes (/),
and number signs (#). The parameter maps to Image in the Create a container section of the Docker
Remote API and the IMAGE parameter of docker run.

Note
The architecture of a Docker image must match the processor architecture of the compute resources
that it's scheduled on. For example, ARM-based Docker images can only run on ARM-based
compute resources.

• Images in Amazon ECR Public repositories use the full registry/repository[:tag]
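The length and character rules for the Container image parameter can be checked before you submit a job definition. The following is a minimal sketch of such a check. The is_valid_image_name helper is hypothetical, and it only covers the rules stated above, not whether the image exists in a registry.

```python
import re

# Letters, numbers, hyphens, underscores, colons, periods, forward slashes,
# and number signs, with a total length of 1 to 255 characters.
_IMAGE_PATTERN = re.compile(r"[A-Za-z0-9_:./#-]{1,255}")


def is_valid_image_name(image):
    """Return True if the image name satisfies the stated character rules."""
    return _IMAGE_PATTERN.fullmatch(image) is not None


for candidate in ("busybox", "repository-url/image:tag", "bad image"):
    print(candidate, is_valid_image_name(candidate))
```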

To specify resources for your environment

1. For Command, specify the command to pass to the container. This parameter maps to Cmd in the
Create a container section of the Docker Remote API and the COMMAND parameter to docker run. For
more information about the Docker CMD parameter, see
https://docs.docker.com/engine/reference/builder/#cmd.
Note
You can use parameter substitution default values and placeholders in your command. For
more information, see Parameters (p. 44).
2. For vCPUs, specify the number of vCPUs to reserve for the container. This parameter maps to
CpuShares in the Create a container section of the Docker Remote API and the --cpu-shares
option to docker run. Each vCPU is equivalent to 1,024 CPU shares.
3. For Memory, specify the hard limit (in MiB) of memory to present to the job's container. If your
container attempts to exceed the memory specified here, the container is stopped. This parameter
maps to Memory in the Create a container section of the Docker Remote API and the --memory
option to docker run.
4. For Job attempts, specify the maximum number of times to attempt your job (in case it fails). For
more information, see Automated Job Retries (p. 18).
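The unit conversions behind the vCPUs and Memory fields can be sketched as follows. The helper names are hypothetical. The example assumes that the Docker Memory setting is expressed in bytes and that 1 MiB is 1,048,576 bytes.

```python
CPU_SHARES_PER_VCPU = 1024
BYTES_PER_MIB = 1024 * 1024


def cpu_shares(vcpus):
    """Convert a vCPU count to Docker CPU shares (1,024 shares per vCPU)."""
    return vcpus * CPU_SHARES_PER_VCPU


def memory_bytes(memory_mib):
    """Convert a memory hard limit in MiB to bytes."""
    return memory_mib * BYTES_PER_MIB


print(cpu_shares(2), memory_bytes(128))
```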

Parameters

(Optional) Specify parameter substitution default values and placeholders in your command. For more
information, see Parameters (p. 44).

1. For Key, specify the key for your parameter.

2. For Value, specify the value for your parameter.
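Parameter substitution can be pictured with a small local simulation. This sketch assumes that placeholders take the form Ref::name and occupy a whole command token. The substitute_parameters helper is hypothetical and only illustrates the idea of default values that a job submission can replace.

```python
PLACEHOLDER_PREFIX = "Ref::"  # assumption: placeholder syntax used in commands


def substitute_parameters(command, defaults, overrides=None):
    """Replace Ref::name tokens in a command with parameter values.

    Values passed in overrides take precedence over the defaults.
    Placeholders with no matching key are left unchanged.
    """
    values = {**defaults, **(overrides or {})}
    resolved = []
    for token in command:
        if token.startswith(PLACEHOLDER_PREFIX):
            name = token[len(PLACEHOLDER_PREFIX):]
            resolved.append(values.get(name, token))
        else:
            resolved.append(token)
    return resolved


command = ["echo", "Ref::greeting", "Ref::missing"]
with_defaults = substitute_parameters(command, {"greeting": "hello"})
with_override = substitute_parameters(command, {"greeting": "hello"}, {"greeting": "hi"})
print(with_defaults, with_override)
```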

To specify environment variables

(Optional) Specify environment variables to pass to your job's container. This parameter maps to Env in
the Create a container section of the Docker Remote API and the --env option to docker run.
Important
We don't recommend that you use plaintext environment variables for sensitive information,
such as credential data.

1. For Key, specify the key for your environment variable.

2. For Value, specify the value for your environment variable.

Step 2: Configure the Compute Environment and Job Queue
A compute environment is a way to reference your compute resources (Amazon EC2 instances): the
settings and constraints that tell AWS Batch how instances should be configured and automatically
launched. You submit your jobs to a job queue that stores jobs until the AWS Batch scheduler runs the
job on a compute resource within your compute environment.
Note
At this time, you can only create a managed compute environment in the first-run wizard. To
create an unmanaged compute environment, see Creating a compute environment (p. 99).

To configure your compute environment type

1. For Compute environment name, specify a unique name for your compute environment.
2. For Service role, choose to create a new role or use an existing role that allows the AWS Batch
service to make calls to the required AWS APIs on your behalf. For more information, see
AWS Batch service IAM role (p. 142). If you choose to create a new role, the required role
(AWSBatchServiceRole) is created for you.
3. For EC2 instance role, choose to create a new role or use an existing role that allows the Amazon
ECS container instances that are created for your compute environment to make calls to the required
AWS APIs on your behalf. For more information, see Amazon ECS instance role (p. 145). If you
choose to create a new role, the required role (ecsInstanceRole) is created for you.

To configure your instances

1. For Provisioning model, choose On-Demand to launch Amazon EC2 On-Demand instances or Spot
to use Amazon EC2 Spot Instances.
2. If you chose to use Amazon EC2 Spot Instances:

a. For Maximum bid price, choose the maximum percentage that a Spot Instance price can be when
compared with the On-Demand price for that instance type before instances are launched.
For example, if your bid percentage is 20%, then the Spot price must be less than 20% of the
current On-Demand price for that EC2 instance. You always pay the lowest (market) price and
never more than your maximum percentage.
b. For Spot fleet role, choose to create a new role or use an existing Amazon EC2 Spot Fleet
IAM role to apply to your Spot compute environment. If you choose to create a new role, the
required role (aws-ec2-spot-fleet-role) is created for you. For more information, see
Amazon EC2 spot fleet role (p. 145).
3. For Allowed instance types, choose the Amazon EC2 instance types that can be launched. You
can specify instance families to launch any instance type within those families (for example, c5,
c5n, or p3), or you can specify specific sizes within a family (such as c5.8xlarge). Note that metal
instance types aren't in the instance families (for example, c5 doesn't include c5.metal). You can
also choose optimal to pick instance types (from the C4, M4, and R4 instance families) on the fly
that match the demand of your job queues.
Note
When you create a compute environment, the instance types that you select for the
compute environment must share the same architecture. For example, you can't mix x86
and ARM instances in the same compute environment.

Note
Currently, optimal uses instance types from the C4, M4, and R4 instance families. In
Regions that don't have instance types from those instance families, instance types from
the C5, M5, and R5 instance families are used.
4. For Minimum vCPUs, choose the minimum number of EC2 vCPUs that your compute environment
should maintain, regardless of job queue demand.
5. For Desired vCPUs, choose the number of EC2 vCPUs that your compute environment should launch
with. As your job queue demand increases, AWS Batch can increase the desired number of vCPUs
in your compute environment and add EC2 instances, up to the maximum vCPUs. As demand
decreases, AWS Batch can decrease the desired number of vCPUs in your compute environment and
remove instances, down to the minimum vCPUs.
6. For Maximum vCPUs, choose the maximum number of EC2 vCPUs that your compute environment
can scale out to, regardless of job queue demand.
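The bid percentage rule and the vCPU bounds described in the steps above can be sketched as two small functions. This is only an illustration of the arithmetic, not the AWS Batch scaling logic. The function names and the prices are hypothetical.

```python
def spot_price_allowed(spot_price, on_demand_price, bid_percentage):
    """Return True if the Spot price is below the bid percentage of On-Demand."""
    return spot_price < on_demand_price * bid_percentage / 100


def clamp_desired_vcpus(requested, minimum, maximum):
    """Keep the desired vCPU count between the minimum and maximum vCPUs."""
    return max(minimum, min(requested, maximum))


# Hypothetical prices: with a 20% bid, an On-Demand price of 0.10 means that
# instances launch only while the Spot price stays below 0.02.
print(spot_price_allowed(0.015, 0.10, 20))
print(spot_price_allowed(0.030, 0.10, 20))
print(clamp_desired_vcpus(300, 0, 256))
```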

To set up your networking

Compute resources are launched into the VPC and subnets that you specify here. This way, you can
control the network isolation of AWS Batch compute resources.
Important
Compute resources need access to communicate with the Amazon ECS service endpoint. This
can be through an interface VPC endpoint or through your compute resources having public IP
addresses.
For more information about interface VPC endpoints, see Amazon ECS Interface VPC Endpoints
(AWS PrivateLink) in the Amazon Elastic Container Service Developer Guide.
If you do not have an interface VPC endpoint configured and your compute resources do not
have public IP addresses, then they must use network address translation (NAT) to provide
this access. For more information, see NAT gateways in the Amazon VPC User Guide and
Tutorial: Creating a VPC with Public and Private Subnets for Your Compute
Environments (p. 165).

1. For VPC Id, choose a VPC to launch your instances into.

2. For Subnets, choose which subnets in the selected VPC should host your instances. By default, all
subnets that are within the selected VPC are chosen.
3. For Security groups, choose a security group to attach to your instances. By default, the default
security group for your VPC is chosen.

To tag your instances

(Optional) Apply key-value pair tags to instances that are launched in your compute environment. For
example, you can specify "Name": "AWS Batch Instance - C4OnDemand" as a tag so that each
instance in your compute environment has that name. This is helpful for recognizing your AWS Batch
instances in the Amazon EC2 console. By default, the compute environment name is used to tag your
instances.

1. For Key, specify the key for your tag.

2. For Value, specify the value for your tag.

To set up your job queue

Submit your jobs to a job queue, which stores jobs until the AWS Batch scheduler runs the job on a
compute resource within your compute environment.

• For Job queue name, choose a unique name for your job queue.

To review and create

The Connected compute environments for this job queue section shows that your new compute
environment is associated with your new job queue and its order. Later, you can associate other compute
environments with the job queue. The job scheduler uses the compute environment order to determine
which compute environment should start a given job. Compute environments must be in the VALID state
before you can associate them with a job queue. You can associate up to three compute environments
with a job queue.

• Review the compute environment and job queue configuration and choose Create to create your
compute environment.
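If you later create a job queue outside the wizard, the ordering and the three-environment limit can be captured in a request body like the following sketch. The field names are intended to mirror the CreateJobQueue API as we understand it and should be checked against the API reference. The job_queue_request helper and the resource names are hypothetical.

```python
MAX_COMPUTE_ENVIRONMENTS = 3


def job_queue_request(name, priority, compute_environments):
    """Build a CreateJobQueue-style request body.

    compute_environments is a list of names in the order that the scheduler
    should try them.
    """
    if not 1 <= len(compute_environments) <= MAX_COMPUTE_ENVIRONMENTS:
        raise ValueError("associate between one and three compute environments")
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": index, "computeEnvironment": environment}
            for index, environment in enumerate(compute_environments, start=1)
        ],
    }


queue_request = job_queue_request("example-queue", 1, ["example-spot", "example-on-demand"])
print(queue_request)
```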

Jobs
Jobs are the unit of work invoked by AWS Batch. Jobs can be invoked as containerized applications
running on Amazon ECS container instances in an ECS cluster.

Containerized jobs can reference a container image, command, and parameters. For more information,
see Job definition parameters (p. 43).

You can submit a large number of independent, simple jobs.

Topics
• Submitting a Job (p. 14)
• Job States (p. 16)
• AWS Batch Job Environment Variables (p. 17)
• Automated Job Retries (p. 18)
• Job Dependencies (p. 19)
• Job Timeouts (p. 19)
• Array Jobs (p. 20)
• Multi-node Parallel Jobs (p. 27)
• GPU Jobs (p. 30)

Submitting a Job
After you have registered a job definition, you can submit it as a job to an AWS Batch job queue. Many of
the parameters that are specified in the job definition can be overridden at runtime.

To submit a job

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.

2. From the navigation bar, select the Region to use.
3. In the navigation pane, choose Jobs, Submit job.
4. For Job name, choose a unique name for your job.
5. For Job definition, choose a previously created job definition for your job. For more information, see
Creating a job definition (p. 31).
6. For Job queue, choose a previously created job queue. For more information, see Creating a job
queue (p. 82).
7. For Job type, choose Single for a single job or Array to submit an array job. For more information,
see Array Jobs (p. 20). This option isn't available for multi-node parallel jobs.
8. (Array jobs only) For Array size, specify an array size between 2 and 10,000.
9. (Optional) Declare any job dependencies. A job may have up to 20 dependencies. For more
information, see Job Dependencies (p. 19).

a. For Job depends on, enter the job IDs for any jobs that must finish before this job starts.

b. (Array jobs only) For N-To-N job dependencies, specify one or more job IDs for any array jobs
for which each child job index of this job should depend on the corresponding child index job of
the dependency. For example, JobB:1 depends on JobA:1, and so on.
c. (Array jobs only) Select Run children sequentially to create a SEQUENTIAL dependency for the
current array job. This ensures that each child index job waits for its earlier sibling to finish. For
example, JobA:1 depends on JobA:0 and so on.
10. For Job attempts, specify the maximum number of times to attempt your job (in case it fails). For
more information, see Automated Job Retries (p. 18).
11. (Optional) For Execution timeout, specify the maximum number of seconds to allow your job
attempts to run. If an attempt exceeds the timeout duration, it is stopped and the status moves to
FAILED. For more information, see Job Timeouts (p. 19).
Important
Jobs running on Fargate resources can't expect to run for more than 14 days. After 14 days,
the Fargate resources may no longer be available and the job will be terminated.
12. (Optional) In the Parameters section, you can specify parameter substitution default values
and placeholders to use in the command that your job's container runs when it starts. For more
information, see Parameters (p. 44).

a. Choose Add parameter.

b. For Key, specify the key for your parameter.
c. For Value, specify the value for your parameter.
13. For vCPUs, specify the number of vCPUs to reserve for the container. This parameter maps to
CpuShares in the Create a container section of the Docker Remote API and the --cpu-shares
option to docker run. Each vCPU is equivalent to 1,024 CPU shares. You must specify at least one
vCPU.
14. For Memory, specify the hard limit (in MiB) of memory to present to the job's container. If your
container attempts to exceed the memory specified here, the container is killed. This parameter
maps to Memory in the Create a container section of the Docker Remote API and the --memory
option to docker run. You must specify at least 4 MiB of memory for a job.
15. (Optional) For Number of GPUs, specify the number of GPUs your job will use.

The job will run on a container with the specified number of GPUs pinned to that container.
16. For Command, specify the command to pass to the container. For simple commands, you can type
the command as you would at a command prompt in the Space delimited tab. Verify that the
JSON result (which is passed to the Docker daemon) is correct. For more complicated commands
(for example, with special characters), you can switch to the JSON tab and enter the string array
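equivalent there.
17. (Optional) In the Environment variables section, specify any environment variables to pass to your
job's container.

a. Choose Add environment variable.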

b. For Key, specify the key for your environment variable.

Note
Environment variables must not start with AWS_BATCH; this naming convention is
reserved for variables that are set by the AWS Batch service.
c. For Value, specify the value for your environment variable.
18. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
job. For more information, see Tagging your AWS Batch resources (p. 197).
19. Choose Submit job.
Note
Logs for RUNNING, SUCCEEDED, and FAILED jobs are available in CloudWatch
Logs; the log group is /aws/batch/job, and the log stream name format is
first200CharsOfJobDefinitionName/default/ecs_task_id (this format may
change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
name with the DescribeJobs API operation. For more information, see View Log Data Sent
to CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are
set to never expire, but you can modify the retention period. For more information, see
Change Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
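
The log stream naming convention in the note above can be sketched in a few lines of Python. This is only an illustration, assuming the format stays as documented (the note warns that it may change). `log_stream_name` is a hypothetical helper, not part of any AWS SDK.

```python
LOG_GROUP = "/aws/batch/job"  # log group named in the note above


def log_stream_name(job_definition_name: str, ecs_task_id: str) -> str:
    """Build first200CharsOfJobDefinitionName/default/ecs_task_id."""
    # Only the first 200 characters of the job definition name are kept.
    return f"{job_definition_name[:200]}/default/{ecs_task_id}"
```

For a running job, reading the stream name from the DescribeJobs response is the safer choice, because it does not depend on the format staying the same.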

Job States
When you submit a job to an AWS Batch job queue, the job enters the SUBMITTED state. It then passes
through the following states until it succeeds (exits with code 0) or fails (exits with a non-zero code). AWS
Batch jobs can have the following states:

SUBMITTED

A job that has been submitted to the queue, and has not yet been evaluated by the scheduler. The
scheduler evaluates the job to determine if it has any outstanding dependencies on the successful
completion of any other jobs. If there are dependencies, the job is moved to PENDING. If there are no
dependencies, the job is moved to RUNNABLE.
PENDING

A job that resides in the queue and isn't yet able to run due to a dependency on another job or
resource. After the dependencies are satisfied, the job is moved to RUNNABLE.
RUNNABLE

A job that resides in the queue, has no outstanding dependencies, and is therefore ready to be
scheduled to a host. Jobs in this state are started as soon as sufficient resources are available in one
of the compute environments that are mapped to the job's queue. However, jobs can remain in this
state indefinitely when sufficient resources are unavailable.
Note
If your jobs do not progress to STARTING, see Jobs Stuck in RUNNABLE Status (p. 203) in
the troubleshooting section.
STARTING

These jobs have been scheduled to a host and the relevant container initiation operations are
underway. After the container image is pulled and the container is up and running, the job
transitions to RUNNING.
RUNNING

The job is running as a container job on an Amazon ECS container instance within a compute
environment. When the job's container exits, the process exit code determines whether the job
succeeded or failed. An exit code of 0 indicates success, and any non-zero exit code indicates failure.
If the job associated with a failed attempt has any remaining attempts left in its optional retry
strategy configuration, the job is moved to RUNNABLE again. For more information, see Automated
Job Retries (p. 18).
Note
Logs for RUNNING jobs are available in CloudWatch Logs; the log group is /aws/batch/
job, and the log stream name format is first200CharsOfJobDefinitionName/
default/ecs_task_id (this format may change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
name with the DescribeJobs API operation. For more information, see View Log Data Sent
to CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are
set to never expire, but you can modify the retention period. For more information, see
Change Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
SUCCEEDED

The job has successfully completed with an exit code of 0. The job state for SUCCEEDED jobs is
persisted in AWS Batch for at least 24 hours.
Note
Logs for SUCCEEDED jobs are available in CloudWatch Logs; the log group is /aws/batch/
job, and the log stream name format is first200CharsOfJobDefinitionName/
default/ecs_task_id (this format may change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
name with the DescribeJobs API operation. For more information, see View Log Data Sent
to CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are
set to never expire, but you can modify the retention period. For more information, see
Change Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
FAILED

The job has failed all available attempts. The job state for FAILED jobs is persisted in AWS Batch for
at least 24 hours.
Note
Logs for FAILED jobs are available in CloudWatch Logs; the log group is /aws/batch/
job, and the log stream name format is first200CharsOfJobDefinitionName/
default/ecs_task_id (this format may change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
name with the DescribeJobs API operation. For more information, see View Log Data Sent to
CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are set
to never expire, but you can modify the retention period. For more information, see Change
Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
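
The state descriptions above can be summarized as a transition table. The following Python sketch covers only the transitions described in this guide (including the PENDING to FAILED move that occurs when a dependency fails); it is not an official AWS state machine, it omits moves caused by cancelling or terminating a job, and the names `TRANSITIONS`, `can_transition`, and `is_terminal` are illustrative.

```python
# Transitions as described in the text of this guide; an illustrative
# model only, not an authoritative list from the service.
TRANSITIONS = {
    "SUBMITTED": {"PENDING", "RUNNABLE"},
    "PENDING": {"RUNNABLE", "FAILED"},                # FAILED if a dependency fails
    "RUNNABLE": {"STARTING"},
    "STARTING": {"RUNNING"},
    "RUNNING": {"SUCCEEDED", "FAILED", "RUNNABLE"},   # RUNNABLE when an attempt is retried
    "SUCCEEDED": set(),
    "FAILED": set(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the table allows moving from current to target."""
    return target in TRANSITIONS.get(current, set())


def is_terminal(state: str) -> bool:
    """SUCCEEDED and FAILED have no outgoing transitions."""
    return state in TRANSITIONS and not TRANSITIONS[state]
```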

AWS Batch Job Environment Variables

AWS Batch automatically sets specific environment variables in container jobs. These environment
variables provide introspection for the containers inside jobs, and you can use the values of these
variables in the logic of your applications. All variables that are set by AWS Batch begin with the prefix,
AWS_BATCH_. This is a protected environment variable prefix, and you cannot use this prefix for your
own variables in job definitions or overrides.

The following environment variables are available in job containers:

AWS_BATCH_CE_NAME

This variable is set to the name of the compute environment in which your job is placed.

AWS_BATCH_JOB_ARRAY_INDEX

This variable is only set in child array jobs. The array job index begins at 0, and each child job
receives a unique index number. For example, an array job with 10 children has index values of 0-9.
You can use this index value to control how your array job children are differentiated. For more
information, see Tutorial: Using the array job index to control job differentiation (p. 23).
AWS_BATCH_JOB_ATTEMPT

This variable is set to the job attempt number. The first attempt is numbered 1. For more
information, see Automated Job Retries (p. 18).
AWS_BATCH_JOB_ID

This variable is set to the AWS Batch job ID.

AWS_BATCH_JOB_MAIN_NODE_INDEX

This variable is only set in multi-node parallel jobs. This variable is set to the index number of the
job's main node. Your application code can compare the AWS_BATCH_JOB_MAIN_NODE_INDEX to
the AWS_BATCH_JOB_NODE_INDEX on an individual node to determine if it is the main node.
AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS

This variable is only set in multi-node parallel job child nodes (it isn't present on the main node).
This variable is set to the private IPv4 address of the job's main node. Your child node's application
code can use this address to communicate with the main node.
AWS_BATCH_JOB_NODE_INDEX

This variable is only set in multi-node parallel jobs. This variable is set to the node index number of
the node. The node index begins at 0, and each node receives a unique index number. For example, a
multi-node parallel job with 10 children has index values of 0-9.
AWS_BATCH_JOB_NUM_NODES

This variable is only set in multi-node parallel jobs. This variable is set to the number of nodes that
you have requested for your multi-node parallel job.
AWS_BATCH_JQ_NAME

This variable is set to the name of the job queue to which your job was submitted.
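
Application code can read these variables at startup. The sketch below is one possible way to collect them in Python; `batch_context` and the dictionary keys it returns are hypothetical names chosen for this example, and variables that are not set for a given job type are reported as None.

```python
import os
from typing import Mapping, Optional


def batch_context(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the AWS_BATCH_* variables listed above; absent ones map to None."""

    def as_int(name: str) -> Optional[int]:
        value = env.get(name)
        return int(value) if value is not None else None

    return {
        "job_id": env.get("AWS_BATCH_JOB_ID"),
        "job_queue": env.get("AWS_BATCH_JQ_NAME"),
        "compute_environment": env.get("AWS_BATCH_CE_NAME"),
        "attempt": as_int("AWS_BATCH_JOB_ATTEMPT"),
        "array_index": as_int("AWS_BATCH_JOB_ARRAY_INDEX"),        # child array jobs only
        "node_index": as_int("AWS_BATCH_JOB_NODE_INDEX"),          # multi-node parallel jobs only
        "main_node_index": as_int("AWS_BATCH_JOB_MAIN_NODE_INDEX"),
        "num_nodes": as_int("AWS_BATCH_JOB_NUM_NODES"),
        "main_node_ip": env.get("AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise outside a container.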

Automated Job Retries

You can apply a retry strategy to your jobs and job definitions that allows failed jobs to be automatically
retried. Possible failure scenarios include:

• Any non-zero exit code from a container job

• Amazon EC2 instance failure or termination
• Internal AWS service error or outage

When a job is submitted to a job queue and placed into the RUNNING state, that is considered an
attempt. By default, each job is given one attempt to move to either the SUCCEEDED or FAILED job
state. However, both the job definition and the job submission workflows allow you to specify a retry
strategy with between 1 and 10 attempts. For more information, see Retry strategy (p. 62).

At runtime, the AWS_BATCH_JOB_ATTEMPT environment variable is set to the container's corresponding
job attempt number. The first attempt is numbered 1, and subsequent attempts are in ascending order
(2, 3, 4, and so on).

If a job attempt fails for any reason, and the number of attempts specified in the retry configuration is
greater than the AWS_BATCH_JOB_ATTEMPT number, then the job is placed back in the RUNNABLE state.
For more information, see Job States (p. 16).
Note
Jobs that have been cancelled or terminated are not retried. Also, jobs that fail due to an invalid
job definition are not retried.

For more information, see Creating a job definition (p. 31) and Submitting a Job (p. 14).
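
The retry rule described in this section can be expressed as a small decision function. This is a sketch of the documented behavior, not service code: `next_state_after_failure` is a hypothetical name, and the `retryable` flag stands in for the cases the guide says are never retried (cancelled or terminated jobs, invalid job definitions, and attempts stopped by a timeout).

```python
def next_state_after_failure(attempt: int, max_attempts: int,
                             retryable: bool = True) -> str:
    """Return the state a job moves to after a failed attempt.

    attempt      -- value of AWS_BATCH_JOB_ATTEMPT for the failed attempt (starts at 1)
    max_attempts -- attempts configured in the retry strategy (1 to 10)
    retryable    -- False for failures that are never retried
    """
    if not 1 <= max_attempts <= 10:
        raise ValueError("a retry strategy allows between 1 and 10 attempts")
    if not retryable:
        return "FAILED"
    # Retry only while the configured attempts exceed the current attempt number.
    return "RUNNABLE" if max_attempts > attempt else "FAILED"
```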

Job Dependencies
When you submit an AWS Batch job, you can specify the job IDs on which the job depends. When you
do so, the AWS Batch scheduler ensures that your job is run only after the specified dependencies
have successfully completed. After they succeed, the dependent job transitions from PENDING to
RUNNABLE and then to STARTING and RUNNING. If any of the job dependencies fail, the dependent job
automatically transitions from PENDING to FAILED.

For example, Job A can express a dependency on up to 20 other jobs that must succeed before it can run.
You can then submit additional jobs that have a dependency on Job A and up to 19 other jobs.

For array jobs, you can specify a SEQUENTIAL type dependency without specifying a job ID so that
each child array job completes sequentially, starting at index 0. You can also specify an N_TO_N type
dependency with a job ID. That way, each index child of this job must wait for the corresponding
index child of each dependency to complete before it can begin. For more information, see Array
Jobs (p. 20).

To submit an AWS Batch job with dependencies, see Submitting a Job (p. 14).
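
The three dependency shapes described above can be assembled programmatically. The following Python sketch builds a `dependsOn` list in the form used by the SubmitJob examples in this guide; `build_depends_on` is a hypothetical helper, and counting every entry (including the SEQUENTIAL one) toward the limit of 20 is an assumption made here for safety.

```python
from typing import Dict, Iterable, List

MAX_DEPENDENCIES = 20  # limit stated in this guide


def build_depends_on(job_ids: Iterable[str] = (),
                     n_to_n_job_ids: Iterable[str] = (),
                     sequential: bool = False) -> List[Dict[str, str]]:
    """Build a dependsOn list from standard, N_TO_N, and SEQUENTIAL dependencies."""
    depends_on = [{"jobId": job_id} for job_id in job_ids]
    depends_on += [{"jobId": job_id, "type": "N_TO_N"} for job_id in n_to_n_job_ids]
    if sequential:
        # SEQUENTIAL refers to the array job itself, so it carries no job ID.
        depends_on.append({"type": "SEQUENTIAL"})
    if len(depends_on) > MAX_DEPENDENCIES:
        raise ValueError(f"a job may have up to {MAX_DEPENDENCIES} dependencies")
    return depends_on
```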

Job Timeouts
You can configure a timeout duration for your jobs so that if a job runs longer than that, AWS Batch
terminates the job. For example, you might have a job that you know should only take 15 minutes to
complete. Sometimes your application gets stuck in a loop and runs forever, so you can set a timeout of
30 minutes to terminate the stuck job.

You specify an attemptDurationSeconds parameter, which must be at least 60 seconds, either in your
job definition, or when you submit the job. When this number of seconds has passed following the job
attempt's startedAt timestamp, AWS Batch terminates the job. On the compute resource, your job's
container receives a SIGTERM signal to give your application a chance to shut down gracefully. If the
container is still running after 30 seconds, a SIGKILL signal is sent to forcefully shut down the container.

Timeout terminations are handled on a best-effort basis. You shouldn't expect your timeout termination
to happen exactly when the job attempt times out (it may take a few seconds longer). If your application
requires precise timeout execution, you should implement this logic within the application. If you have
a large number of jobs timing out concurrently, the timeout terminations behave as a first in, first out
queue, where jobs are terminated in batches.

If a job is terminated for exceeding the timeout duration, it isn't retried. If a job attempt fails on its own,
then it can retry if retries are enabled, and the timeout countdown is started over for the new attempt.
Important
Jobs running on Fargate resources can't expect to run for more than 14 days. If the timeout
duration exceeds 14 days, the Fargate resources may no longer be available and the job will be
terminated.

For array jobs, child jobs have the same timeout configuration as the parent job.

For information about submitting an AWS Batch job with a timeout configuration, see Submitting a
Job (p. 14).
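
To make the timing concrete, the sketch below builds a timeout setting and estimates when the SIGTERM and SIGKILL signals would be sent. It is an illustration under stated assumptions: `timeout_config` and `termination_times` are hypothetical helpers, the start time is assumed to be an epoch timestamp in milliseconds, and because terminations are best effort the returned values are the earliest expected times, not guarantees.

```python
from datetime import datetime, timedelta, timezone

MIN_TIMEOUT_SECONDS = 60     # minimum attemptDurationSeconds stated above
SIGKILL_GRACE_SECONDS = 30   # delay between SIGTERM and SIGKILL stated above


def timeout_config(attempt_duration_seconds: int) -> dict:
    """Return a timeout block for a job definition or job submission."""
    if attempt_duration_seconds < MIN_TIMEOUT_SECONDS:
        raise ValueError("attemptDurationSeconds must be at least 60")
    return {"timeout": {"attemptDurationSeconds": attempt_duration_seconds}}


def termination_times(started_at_ms: int, attempt_duration_seconds: int):
    """Return the earliest (SIGTERM, SIGKILL) times for one job attempt."""
    started = datetime.fromtimestamp(started_at_ms / 1000, tz=timezone.utc)
    sigterm = started + timedelta(seconds=attempt_duration_seconds)
    return sigterm, sigterm + timedelta(seconds=SIGKILL_GRACE_SECONDS)
```

For the 30-minute example above, `timeout_config(1800)` gives the setting, and an application that needs a precise cutoff would still enforce it internally.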

Array Jobs
An array job is a job that shares common parameters, such as the job definition, vCPUs, and memory. It
runs as a collection of related, yet separate, basic jobs that may be distributed across multiple hosts and
may run concurrently. Array jobs are the most efficient way to run extremely parallel jobs such as Monte
Carlo simulations, parametric sweeps, or large rendering jobs.

AWS Batch array jobs are submitted just like regular jobs. However, you specify an array size (between 2
and 10,000) to define how many child jobs should run in the array. If you submit a job with an array size
of 1000, a single job runs and spawns 1000 child jobs. The array job is a reference or pointer to manage
all the child jobs. This allows you to submit large workloads with a single query.

When you submit an array job, the parent array job gets a normal AWS Batch job ID. Each child job has
the same base ID, but the array index for the child job is appended to the end of the parent ID, such as
example_job_ID:0 for the first child job of the array.

At runtime, the AWS_BATCH_JOB_ARRAY_INDEX environment variable is set to the container's
corresponding job array index number. The first array job index is numbered 0, and subsequent indexes
are in ascending order (1, 2, 3, and so on). You can use this index value to control how your array job
children are differentiated. For more information, see Tutorial: Using the array job index to control job
differentiation (p. 23).

For array job dependencies, you can specify a type for a dependency, such as SEQUENTIAL or N_TO_N.
You can specify a SEQUENTIAL type dependency (without specifying a job ID) so that each child array job
completes sequentially, starting at index 0. For example, if you submit an array job with an array size of
100, and specify a dependency with type SEQUENTIAL, 100 child jobs are spawned sequentially, where
the first child job must succeed before the next child job starts. The figure below shows Job A, an array
job with an array size of 10. Each job in Job A's child index is dependent on the previous child job. Job A:1
can't start until job A:0 finishes.

You can also specify an N_TO_N type dependency with a job ID for array jobs so that each index child of
this job must wait for the corresponding index child of each dependency to complete before it can begin.

The figure below shows Job A and Job B, two array jobs with an array size of 10,000 each. Each job in Job
B's child index is dependent on the corresponding index in Job A. Job B:1 can't start until job A:1 finishes.

If you cancel or terminate a parent array job, all of the child jobs are cancelled or terminated with it. You
can cancel or terminate individual child jobs (which moves them to the FAILED status) without affecting
the other child jobs. However, if a child array job fails (on its own or by cancelling/terminating manually),
the parent job also fails.
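
The child job naming and the two array dependency types can be modeled in a few lines. The Python sketch below is illustrative only: the helper names are hypothetical, and the mappings simply show which child waits for which, following the SEQUENTIAL and N_TO_N rules described in this section.

```python
def check_array_size(size: int) -> int:
    """Array sizes must be between 2 and 10,000."""
    if not 2 <= size <= 10000:
        raise ValueError("array size must be between 2 and 10,000")
    return size


def child_job_id(parent_job_id: str, index: int) -> str:
    """Child IDs are the parent ID with the array index appended after a colon."""
    return f"{parent_job_id}:{index}"


def sequential_dependencies(parent_job_id: str, size: int) -> dict:
    """SEQUENTIAL: each child waits for its previous sibling (child 0 waits for none)."""
    check_array_size(size)
    return {child_job_id(parent_job_id, i): child_job_id(parent_job_id, i - 1)
            for i in range(1, size)}


def n_to_n_dependencies(job_id: str, dependency_job_id: str, size: int) -> dict:
    """N_TO_N: child i of this job waits for child i of the dependency."""
    check_array_size(size)
    return {child_job_id(job_id, i): child_job_id(dependency_job_id, i)
            for i in range(size)}
```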

Example Array Job Workflow

A common workflow for AWS Batch customers is to run a prerequisite setup job, run a series of
commands against a large number of input tasks, and then conclude with a job that aggregates results
and writes summary data to Amazon S3, DynamoDB, Amazon Redshift, or Aurora.

For example:

• JobA: A standard, non-array job that performs a quick listing and metadata validation of objects in an
Amazon S3 bucket, BucketA. The SubmitJob JSON syntax is shown below.

{
    "jobName": "JobA",
    "jobQueue": "ProdQueue",
    "jobDefinition": "JobA-list-and-validate:1"
}

• JobB: An array job with 10,000 copies that is dependent upon JobA, that runs CPU-intensive
commands against each object in BucketA and uploads results to BucketB. The SubmitJob JSON
syntax is shown below.

{
    "jobName": "JobB",
    "jobQueue": "ProdQueue",
    "jobDefinition": "JobB-CPU-Intensive-Processing:1",
    "containerOverrides": {
        "resourceRequirements": [
            {
                "type": "MEMORY",
                "value": "4096"
            },
            {
                "type": "VCPU",
                "value": "32"
            }
        ]
    },
    "arrayProperties": {
        "size": 10000
    },
    "dependsOn": [
        {
            "jobId": "JobA_job_ID"
        }
    ]
}

• JobC: Another 10,000 copy array job that is dependent upon JobB with an N_TO_N dependency
model, that runs memory-intensive commands against each item in BucketB, writes metadata to
DynamoDB, and uploads the resulting output to BucketC. The SubmitJob JSON syntax is shown
below.

{
    "jobName": "JobC",
    "jobQueue": "ProdQueue",
    "jobDefinition": "JobC-Memory-Intensive-Processing:1",
    "containerOverrides": {
        "resourceRequirements": [
            {
                "type": "MEMORY",
                "value": "32768"
            },
            {
                "type": "VCPU",
                "value": "1"
            }
        ]
    },
    "arrayProperties": {
        "size": 10000
    },
    "dependsOn": [
        {
            "jobId": "JobB_job_ID",
            "type": "N_TO_N"
        }
    ]
}

• JobD: An array job that performs 10 validation steps that each need to query DynamoDB and may
interact with any of the above Amazon S3 buckets. Each of the steps in JobD runs the same command,
but the behavior is different based on the value of the AWS_BATCH_JOB_ARRAY_INDEX environment
variable within the job's container. These validation steps run sequentially (for example, JobD:0, then
JobD:1, and so on). The SubmitJob JSON syntax is shown below.

{
    "jobName": "JobD",
    "jobQueue": "ProdQueue",
    "jobDefinition": "JobD-Sequential-Validation:1",
    "containerOverrides": {
        "resourceRequirements": [
            {
                "type": "MEMORY",
                "value": "32768"
            },
            {
                "type": "VCPU",
                "value": "1"
            }
        ]
    },
    "arrayProperties": {
        "size": 10
    },
    "dependsOn": [
        {
            "jobId": "JobC_job_ID"
        },
        {
            "type": "SEQUENTIAL"
        }
    ]
}

• JobE: A final, non-array job that performs some simple cleanup operations and sends an Amazon
SNS notification with a message that the pipeline has completed and a link to the output URL. The
SubmitJob JSON syntax is shown below.

{
    "jobName": "JobE",
    "jobQueue": "ProdQueue",
    "jobDefinition": "JobE-Cleanup-and-Notification:1",
    "parameters": {
        "SourceBucket": "s3://JobD-Output-Bucket",
        "Recipient": "[email protected]"
    },
    "dependsOn": [
        {
            "jobId": "JobD_job_ID"
        }
    ]
}
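
The JobA_job_ID style placeholders in these requests stand for job IDs that are only known after each earlier job is submitted. The Python sketch below shows one way to chain the five submissions; `submit_pipeline` is a hypothetical function, `submit` is a caller-supplied callable that is assumed to send one SubmitJob request and return the new job ID, and the container overrides and parameters are omitted for brevity.

```python
from typing import Callable, Dict


def submit_pipeline(submit: Callable[[dict], str],
                    queue: str = "ProdQueue") -> Dict[str, str]:
    """Submit JobA through JobE, wiring each dependsOn entry to a returned job ID."""
    ids: Dict[str, str] = {}

    def request(name: str, definition: str, **extra) -> dict:
        return {"jobName": name, "jobQueue": queue,
                "jobDefinition": definition, **extra}

    ids["JobA"] = submit(request("JobA", "JobA-list-and-validate:1"))
    ids["JobB"] = submit(request(
        "JobB", "JobB-CPU-Intensive-Processing:1",
        arrayProperties={"size": 10000},
        dependsOn=[{"jobId": ids["JobA"]}]))
    ids["JobC"] = submit(request(
        "JobC", "JobC-Memory-Intensive-Processing:1",
        arrayProperties={"size": 10000},
        dependsOn=[{"jobId": ids["JobB"], "type": "N_TO_N"}]))
    ids["JobD"] = submit(request(
        "JobD", "JobD-Sequential-Validation:1",
        arrayProperties={"size": 10},
        dependsOn=[{"jobId": ids["JobC"]}, {"type": "SEQUENTIAL"}]))
    ids["JobE"] = submit(request(
        "JobE", "JobE-Cleanup-and-Notification:1",
        dependsOn=[{"jobId": ids["JobD"]}]))
    return ids
```

Keeping the transport behind `submit` lets the same wiring be used with an SDK client, the AWS CLI, or a stub that records requests during testing.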

Tutorial: Using the array job index to control job differentiation
This tutorial shows how to use the AWS_BATCH_JOB_ARRAY_INDEX environment variable (that each
child job is assigned) to differentiate the child jobs. The example uses the child job's index number to
read a specific line in a file. Then, it substitutes the parameter associated with that line number with a
command inside the job's container. The result is that you can have multiple AWS Batch jobs running the
same Docker image and command arguments. However, the results are different because the array job
index is used as a modifier.

In this tutorial, you create a text file that has all of the colors of the rainbow, each on its own line. Then,
you create an entrypoint script for a Docker container that converts the index into a value that can be
used for a line number in the color file. The index starts at zero, but line numbers start at one. Create
a Dockerfile that copies the color and index files to the container image and sets ENTRYPOINT for the
image to the entrypoint script. The Dockerfile and resources are built to a Docker image that's pushed
to Amazon ECR. You then register a job definition that uses your new container image, submit an AWS
Batch array job with that job definition, and view the results.

Prerequisites
This tutorial has the following prerequisites:

• An AWS Batch compute environment. For more information, see Creating a compute
environment (p. 99).
• An AWS Batch job queue and associated compute environment. For more information, see Creating a
job queue (p. 82).
• The AWS CLI installed on your local system. For more information, see Installing the AWS Command
Line Interface in the AWS Command Line Interface User Guide.
• Docker installed on your local system. For more information, see About Docker CE in the Docker
documentation.

Step 1: Build a Container Image

You can use the AWS_BATCH_JOB_ARRAY_INDEX in a job definition in the command parameter.
However, we recommend that you create a container image that uses the variable in an entrypoint script
instead. This section describes how to create such a container image.

To build your Docker container image

1. Create a new directory to use as your Docker image workspace and navigate to it.
2. Create a file named colors.txt in your workspace directory and paste the following into it.

red
orange
yellow
green
blue
indigo
violet

3. Create a file named print-color.sh in your workspace directory and paste the following into it.
Note
The LINE variable is set to the AWS_BATCH_JOB_ARRAY_INDEX + 1 because the array
index starts at 0, but line numbers start at 1. The COLOR variable is set to the color in
colors.txt that's associated with its line number.

#!/bin/sh
LINE=$((AWS_BATCH_JOB_ARRAY_INDEX + 1))
COLOR=$(sed -n ${LINE}p /tmp/colors.txt)
echo My favorite color of the rainbow is $COLOR.

4. Create a file named Dockerfile in your workspace directory and paste the contents below into it.
This Dockerfile copies the previous files to your container and sets the entrypoint script to run when
the container starts.

FROM busybox
COPY print-color.sh /tmp/print-color.sh
COPY colors.txt /tmp/colors.txt
RUN chmod +x /tmp/print-color.sh
ENTRYPOINT /tmp/print-color.sh

5. Build your Docker image:

docker build -t print-color .

6. Test your container with the following script. This script sets the AWS_BATCH_JOB_ARRAY_INDEX
variable to 0 locally and then increments it to simulate what an array job with seven children does.

AWS_BATCH_JOB_ARRAY_INDEX=0
while [ $AWS_BATCH_JOB_ARRAY_INDEX -le 6 ]
do
    docker run -e AWS_BATCH_JOB_ARRAY_INDEX=$AWS_BATCH_JOB_ARRAY_INDEX print-color
    AWS_BATCH_JOB_ARRAY_INDEX=$((AWS_BATCH_JOB_ARRAY_INDEX + 1))
done

The following is the output.

My favorite color of the rainbow is red.
My favorite color of the rainbow is orange.
My favorite color of the rainbow is yellow.
My favorite color of the rainbow is green.
My favorite color of the rainbow is blue.
My favorite color of the rainbow is indigo.
My favorite color of the rainbow is violet.
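
The index-to-line mapping performed by print-color.sh can also be written in Python, which may help when the per-index logic grows beyond a single sed call. This is a sketch that mirrors the shell script rather than a replacement for it; `line_for_index` and `message` are hypothetical names, and an out-of-range index yields an empty string, matching what sed prints for a missing line.

```python
from typing import Mapping

COLORS_TXT = "red\norange\nyellow\ngreen\nblue\nindigo\nviolet\n"


def line_for_index(text: str, array_index: int) -> str:
    """Return the line selected by an array index (index 0 maps to line 1)."""
    line_number = array_index + 1          # array indexes start at 0, lines at 1
    lines = text.splitlines()
    if 1 <= line_number <= len(lines):
        return lines[line_number - 1]
    return ""                              # no such line, like `sed -n Np`


def message(env: Mapping[str, str], text: str = COLORS_TXT) -> str:
    """Build the same sentence that print-color.sh echoes."""
    index = int(env["AWS_BATCH_JOB_ARRAY_INDEX"])
    return f"My favorite color of the rainbow is {line_for_index(text, index)}."
```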

Step 2: Push your image to Amazon ECR

Now that you built and tested your Docker container, push it to an image repository. This example uses
Amazon ECR, but you can use another registry, such as DockerHub.

1. Create an Amazon ECR image repository to store your container image. This example only uses the
AWS CLI, but you can also use the AWS Management Console. For more information, see Creating a
Repository in the Amazon Elastic Container Registry User Guide.

aws ecr create-repository --repository-name print-color

2. Tag your print-color image with your Amazon ECR repository URI that was returned from the
previous step.

docker tag print-color aws_account_id.dkr.ecr.region.amazonaws.com/print-color

3. Log in to your Amazon ECR registry. For more information, see Registry Authentication in the
Amazon Elastic Container Registry User Guide.

aws ecr get-login-password --region region | docker login --username AWS \
    --password-stdin aws_account_id.dkr.ecr.region.amazonaws.com

4. Push your image to Amazon ECR:

docker push aws_account_id.dkr.ecr.region.amazonaws.com/print-color
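
The repository URI used in the tag, login, and push commands follows one pattern, which a script can build once and reuse. The helper below is a minimal sketch assuming the hostname pattern shown in this step; `ecr_image_uri` is a hypothetical name, and the account ID and Region in any call are placeholders you supply.

```python
from typing import Optional


def ecr_image_uri(aws_account_id: str, region: str, repository: str,
                  tag: Optional[str] = None) -> str:
    """Build aws_account_id.dkr.ecr.region.amazonaws.com/repository[:tag]."""
    uri = f"{aws_account_id}.dkr.ecr.{region}.amazonaws.com/{repository}"
    # Without an explicit tag, Docker falls back to the "latest" tag.
    return f"{uri}:{tag}" if tag else uri
```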

Step 3: Create and register a Job definition

Now that your Docker image is in an image registry, you can specify it in an AWS Batch job definition.
Then, you can use it later to run an array job. This example only uses the AWS CLI. However, you can also
use the AWS Management Console. For more information, see Creating a job definition (p. 31).

To create a job definition

1. Create a file named print-color-job-def.json in your workspace directory and paste the
following into it. Replace the image repository URI with your own image's URI.

{
    "jobDefinitionName": "print-color",
    "type": "container",
    "containerProperties": {
        "image": "aws_account_id.dkr.ecr.region.amazonaws.com/print-color",
        "resourceRequirements": [
            {
                "type": "MEMORY",
                "value": "250"
            },
            {
                "type": "VCPU",
                "value": "1"
            }
        ]
    }
}

2. Register the job definition with AWS Batch:

aws batch register-job-definition --cli-input-json file://print-color-job-def.json
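
If you prefer to generate the job definition file instead of editing it by hand, the structure above maps directly onto a Python dictionary. This is a sketch under the assumption that only the image URI, memory, and vCPU values vary; `print_color_job_definition` and `to_cli_input_json` are hypothetical helpers, and the output is intended for the --cli-input-json option used in this step.

```python
import json


def print_color_job_definition(image_uri: str, memory_mib: int = 250,
                               vcpus: int = 1) -> dict:
    """Return the print-color job definition with the given image and resources."""
    return {
        "jobDefinitionName": "print-color",
        "type": "container",
        "containerProperties": {
            "image": image_uri,
            "resourceRequirements": [
                # Resource values are written as strings, as in the file above.
                {"type": "MEMORY", "value": str(memory_mib)},
                {"type": "VCPU", "value": str(vcpus)},
            ],
        },
    }


def to_cli_input_json(request: dict) -> str:
    """Serialize a request for use with --cli-input-json file://..."""
    return json.dumps(request, indent=4)
```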

Step 4: Submit an AWS Batch array job

After you register your job definition, you can submit an AWS Batch array job that uses your new
container image.

To submit an AWS Batch array job

1. Create a file named print-color-job.json in your workspace directory and paste the following
into it.
Note
This example assumes the default job queue name that's created by the AWS Batch first-run
wizard. If your job queue name is different, replace the first-run-job-queue name with
your job queue name.

{
    "jobName": "print-color",
    "jobQueue": "first-run-job-queue",
    "arrayProperties": {
        "size": 7
    },
    "jobDefinition": "print-color"
}

2. Submit the job to your AWS Batch job queue. Note the job ID that's returned in the output.

aws batch submit-job --cli-input-json file://print-color-job.json

3. Describe the job's status and wait for the job to move to SUCCEEDED.
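
Waiting for the job to finish is usually done by polling its status. The loop below is a minimal sketch: `wait_for_job` is a hypothetical helper, `get_status` is a caller-supplied function assumed to return the job's current status string (for example, by calling the DescribeJobs API operation), and the polling interval and limit are arbitrary defaults.

```python
import time
from typing import Callable

TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_job(get_status: Callable[[], str], poll_seconds: float = 10.0,
                 max_polls: int = 360,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the job reaches SUCCEEDED or FAILED, then return that status."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)   # wait before asking again
    raise TimeoutError("job did not reach a terminal state in time")
```

Injecting `sleep` keeps the loop testable without real delays, and the bounded number of polls avoids waiting forever on a job that stays in RUNNABLE.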

Step 5: View your array job logs

After your job reaches the SUCCEEDED status, you can view the CloudWatch Logs from the job's
container.

To view your job's logs in CloudWatch Logs

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.

2. In the left navigation pane, choose Jobs.
3. For Job queue, select a queue.
4. In the Status section, choose succeeded.
5. To display all of the child jobs for your array job, select the job ID that was returned in the previous
section.
6. To see the logs from the job's container, select one of the child jobs and choose View logs.

7. View the other child jobs' logs. Each job returns a different color of the rainbow.

Multi-node Parallel Jobs

Multi-node parallel jobs enable you to run single jobs that span multiple Amazon EC2 instances.
With AWS Batch multi-node parallel jobs, you can run large-scale, tightly coupled, high performance
computing applications and distributed GPU model training without the need to launch, configure,
and manage Amazon EC2 resources directly. An AWS Batch multi-node parallel job is compatible with
any framework that supports IP-based, internode communication, such as Apache MXNet, TensorFlow,
Caffe2, or Message Passing Interface (MPI).

Multi-node parallel jobs are submitted as a single job. However, your job definition (or job submission
node overrides) specifies the number of nodes to create for the job and what node groups to create. Each
multi-node parallel job contains a main node, which is launched first. After the main node is up, the
child nodes are launched and started. If the main node exits, the job is considered finished, and the child
nodes are stopped. For more information, see Node Groups (p. 28).

Multi-node parallel job nodes are single-tenant, meaning that only a single job container is run on each
Amazon EC2 instance.

The final job status (SUCCEEDED or FAILED) is determined by the final job status of the main node. To
get the status of a multi-node parallel job, you can describe the job using the job ID that was returned
when you submitted the job. If you need the details for child nodes, then you must describe each child
node individually. Nodes are addressed using #N notation (starting with 0). For example, to access the
details of the second node of a job, you need to describe aws_batch_job_id#1 using the AWS Batch
DescribeJobs API action. The startedAt, stoppedAt, statusReason, and exit information for a
multi-node parallel job is populated from the main node.

If you specify job retries, then a main node failure triggers another attempt; child node failures do not.
Each new attempt of a multi-node parallel job updates the corresponding attempt of its associated child
nodes.

To run multi-node parallel jobs on AWS Batch, your application code must contain the frameworks and
libraries necessary for distributed communication.
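
The #N notation for addressing individual nodes is easy to generate when you need to describe every node of a job. The helpers below are an illustrative sketch with hypothetical names; they only build the identifiers and leave the DescribeJobs call to your own tooling.

```python
from typing import List


def node_job_id(job_id: str, node_index: int) -> str:
    """Return the identifier for one node, for example job_id#1 for the second node."""
    if node_index < 0:
        raise ValueError("node indexes start at 0")
    return f"{job_id}#{node_index}"


def all_node_job_ids(job_id: str, num_nodes: int) -> List[str]:
    """Return the identifiers for every node of a multi-node parallel job."""
    return [node_job_id(job_id, index) for index in range(num_nodes)]
```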

Environment Variables
At runtime, in addition to the standard environment variables that all AWS Batch jobs receive, each node
is configured with the following environment variables that are specific to multi-node parallel jobs:

AWS_BATCH_JOB_MAIN_NODE_INDEX

This variable is set to the index number of the job's main node. Your application code can compare
the AWS_BATCH_JOB_MAIN_NODE_INDEX to the AWS_BATCH_JOB_NODE_INDEX on an individual
node to determine if it is the main node.
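AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS

This variable is only set in multi-node parallel job child nodes (it isn't present on the main node).
This variable is set to the private IPv4 address of the job's main node. Your child node's application
code can use this address to communicate with the main node.
AWS_BATCH_JOB_NODE_INDEX

This variable is set to the node index number of the node. The node index begins at 0, and each
node receives a unique index number. For example, a multi-node parallel job with 10 children has
index values of 0-9.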
AWS_BATCH_JOB_NUM_NODES

This variable is set to the number of nodes that you have requested for your multi-node parallel job.
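
A node can work out its own role by comparing AWS_BATCH_JOB_NODE_INDEX with AWS_BATCH_JOB_MAIN_NODE_INDEX. The following Python sketch shows that comparison; `node_role` and `main_node_address` are hypothetical helpers, and the environment is passed in as a mapping so the logic can be tried outside a container.

```python
import os
from typing import Mapping, Optional


def node_role(env: Mapping[str, str] = os.environ) -> str:
    """Return "main" if this node is the job's main node, otherwise "child"."""
    node_index = int(env["AWS_BATCH_JOB_NODE_INDEX"])
    main_index = int(env["AWS_BATCH_JOB_MAIN_NODE_INDEX"])
    return "main" if node_index == main_index else "child"


def main_node_address(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the main node's private IPv4 address, or None when it isn't set."""
    # The address variable is only present on child nodes.
    return env.get("AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS")
```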

Node Groups
A node group is an identical group of job nodes that all share the same container properties. AWS Batch
lets you specify up to five distinct node groups for each job.

Each group can have its own container images, commands, environment variables, and so on. For
example, you can submit a job that requires a single c4.xlarge instance for the main node, and five
c4.xlarge instance child nodes; each of these distinct node groups may specify different container
images or commands to run for each job.

Alternatively, all of the nodes in your job can use a single node group, and your
application code can differentiate node roles (main node vs. child node) by comparing
the AWS_BATCH_JOB_MAIN_NODE_INDEX environment variable against its own value for
AWS_BATCH_JOB_NODE_INDEX. You may have up to 1000 nodes in a single job. This is the default limit
for instances in an Amazon ECS cluster, which can be increased on request.
Note
Currently all node groups in a multi-node parallel job must use the same instance type.

Job Lifecycle
When you submit a multi-node parallel job, the job enters the SUBMITTED status, and it waits for any
job dependencies to ﬁnish. Then the job moves to the RUNNABLE status, and AWS Batch provisions the
instance capacity required to run your job and launches these instances.

Each multi-node parallel job contains a main node. The main node is a single subtask that AWS Batch
monitors to determine the outcome of the submitted multi-node parallel job. The main node is launched
first and it moves to the STARTING status.

When the main node reaches the RUNNING status (after the node's container is running), the child
nodes are launched and they also move to the STARTING status. The child nodes come up in
random order. There are no guarantees on the timing or ordering of child node launch. To ensure
that all the nodes of the job are in the RUNNING status (after each node's container is running),
your application code can either query the AWS Batch API to get the main node and child node
information, or coordinate within the application code to wait until all nodes are online before
starting any distributed processing task. The private IP address of the main node is available as the
AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS environment variable in each child node. Your
application code may use this information to coordinate and communicate data between each task.

As individual nodes exit, they move to SUCCEEDED or FAILED, depending on their exit code. If the main
node exits, the job is considered ﬁnished, and all of the child nodes are stopped. If a child node dies, AWS
Batch does not take any action on the other nodes in the job. If you do not want your job to continue
with a reduced number of nodes, you must factor this into your application code to terminate or cancel
the job.
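
One way a child node might coordinate with the main node before starting distributed work is to retry a TCP connection until the main node accepts it. The sketch below assumes that your application on the main node listens on a TCP port of your choosing; the function name wait_for_main_node, the port, and the timeout values are hypothetical and not defined by AWS Batch.

```python
import socket
import time


def wait_for_main_node(host, port, timeout_s=300.0, interval_s=2.0):
    """Retry a TCP connection to the main node until it accepts or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means the main node's service is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused or timed out: give up after the deadline, otherwise retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A child node could pass the value of AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS as host and exit with a nonzero code if the function returns False.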

Compute Environment Considerations

There are several things to consider when conﬁguring compute environments to run multi-node parallel
jobs with AWS Batch.

• Multi-node parallel jobs are not supported on UNMANAGED compute environments.

• If you intend to submit multi-node parallel jobs to a compute environment, consider creating a cluster
placement group in a single Availability Zone and associating it with your compute resources. This
keeps your multi-node parallel jobs on a logical grouping of instances in close proximity with high
network flow potential. For more information, see Placement Groups in the Amazon EC2 User Guide for
Linux Instances.
• Multi-node parallel jobs are not supported on compute environments that use Spot Instances.
• AWS Batch multi-node parallel jobs use the Amazon ECS awsvpc network mode, which gives your
multi-node parallel job containers the same networking properties as Amazon EC2 instances. Each
multi-node parallel job container gets its own elastic network interface, a primary private IP address,
and an internal DNS hostname. The network interface is created in the same VPC subnet as its host
compute resource. Any security groups that are applied to your compute resources are also applied to
it. For more information, see Task Networking with the awsvpc Network Mode in the Amazon Elastic
Container Service Developer Guide.
• Your compute environment may have no more than five security groups associated with it.
• The elastic network interfaces that are created and attached to your compute resources cannot be
detached manually or modified by your account. This is to prevent the accidental deletion of an elastic
network interface that is associated with a running job. To release the elastic network interfaces for a
task, terminate the job.
• Your compute environment must have enough maximum vCPUs to support your multi-node parallel
job.
• Your Amazon EC2 instance limits must be able to satisfy the number of instances required to run your
job. For example, if your job requires 30 instances, but your account can only run 20 instances in a
Region, your job gets stuck in the RUNNABLE status.
• If you specify an instance type for a node group in a multi-node parallel job, your compute
environment must be able to launch that instance type.
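
The capacity considerations above can be sketched as a simple pre-submission check. This is an illustration only: the function name and its parameters are hypothetical, the limits must be looked up by you, and the sketch assumes that each node of the job runs on its own instance.

```python
def capacity_problems(num_nodes, vcpus_per_node, max_vcpus, instance_limit):
    """Return reasons why a multi-node parallel job might stay in RUNNABLE."""
    problems = []
    required_vcpus = num_nodes * vcpus_per_node
    if required_vcpus > max_vcpus:
        problems.append(
            f"needs {required_vcpus} vCPUs but the compute environment allows {max_vcpus}"
        )
    # Simplifying assumption: one instance per node.
    if num_nodes > instance_limit:
        problems.append(
            f"needs {num_nodes} instances but the account limit is {instance_limit}"
        )
    return problems


# The example from the list above: 30 instances required, 20 allowed in the Region.
print(capacity_problems(num_nodes=30, vcpus_per_node=4, max_vcpus=256, instance_limit=20))
```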

GPU Jobs
GPU jobs help you to run jobs that use an instance's GPUs.

The following Amazon EC2 GPU-based instance types are supported. For more information, see Amazon
EC2 G3 Instances, Amazon EC2 G4 Instances, Amazon EC2 P2 Instances, Amazon EC2 P3 Instances, and
Amazon EC2 P4d Instances.

Instance type  GPUs  GPU Memory  vCPUs  Memory    Network Bandwidth
g3s.xlarge     1     8 GiB       4      30.5 GiB  10 Gbps
g3.4xlarge     1     8 GiB       16     122 GiB   Up to 10 Gbps
g3.8xlarge     2     16 GiB      32     244 GiB   10 Gbps
g3.16xlarge    4     32 GiB      64     488 GiB   25 Gbps
g4dn.xlarge    1     16 GiB      4      16 GiB    Up to 25 Gbps
g4dn.2xlarge   1     16 GiB      8      32 GiB    Up to 25 Gbps
g4dn.4xlarge   1     16 GiB      16     64 GiB    Up to 25 Gbps
g4dn.8xlarge   1     16 GiB      32     128 GiB   50 Gbps
g4dn.12xlarge  4     64 GiB      48     192 GiB   50 Gbps
g4dn.16xlarge  1     16 GiB      64     256 GiB   50 Gbps
p2.xlarge      1     12 GiB      4      61 GiB    High
p2.8xlarge     8     96 GiB      32     488 GiB   10 Gbps
p2.16xlarge    16    192 GiB     64     732 GiB   20 Gbps
p3.2xlarge     1     16 GiB      8      61 GiB    Up to 10 Gbps
p3.8xlarge     4     64 GiB      32     244 GiB   10 Gbps
p3.16xlarge    8     128 GiB     64     488 GiB   25 Gbps
p3dn.24xlarge  8     256 GiB     96     768 GiB   100 Gbps
p4d.24xlarge   8     320 GiB     96     1152 GiB  4x100 Gbps

The resourceRequirements (p. 55) parameter for the job deﬁnition speciﬁes the number of GPUs to
be pinned to the container. This number of GPUs isn't available to any other job running on that instance
for the duration of that job. All instance types in a compute environment that run GPU jobs should be
from the p2, p3, p4, g3, g3s, or g4 instance families. If this isn't done a GPU job might get stuck in the
RUNNABLE status.

Jobs that don't use the GPUs can be run on GPU instances. However, they might cost more to run on the
GPU instances than on similar non-GPU instances. Depending on the speciﬁc vCPU, memory, and time
needed, these non-GPU jobs might block GPU jobs from running.
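
As a hedged illustration of the resourceRequirements parameter, the following Python sketch builds a containerProperties fragment that requests GPUs. The helper name and the image name are hypothetical placeholders; the GPU resource type and the string-valued value field follow the AWS Batch job definition format.

```python
import json


def gpu_container_properties(image, vcpus, memory_mib, gpus):
    """Build a containerProperties fragment that pins `gpus` GPUs to the container."""
    if gpus < 1:
        raise ValueError("a GPU job should request at least one GPU")
    return {
        "image": image,
        "vcpus": vcpus,
        "memory": memory_mib,
        # Resource requirement values are passed as strings.
        "resourceRequirements": [{"type": "GPU", "value": str(gpus)}],
    }


# "my-gpu-image" is a placeholder image name.
print(json.dumps(gpu_container_properties("my-gpu-image", 4, 16384, 1), indent=2))
```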

Job definitions
AWS Batch job definitions specify how jobs are to be run. While each job must reference a job definition,
many of the parameters that are specified in the job definition can be overridden at runtime.

Contents
• Creating a job definition (p. 31)
• Creating a multi-node parallel job definition (p. 36)
• Job definition template (p. 39)
• Job definition parameters (p. 43)
• Using the awslogs log driver (p. 64)
• Specifying sensitive data (p. 67)
• Amazon EFS volumes (p. 75)
• Example job definitions (p. 79)

Some of the attributes speciﬁed in a job deﬁnition include:

• Which Docker image to use with the container in your job

• How many vCPUs and how much memory to use with the container
• The command the container should run when it is started
• What (if any) environment variables should be passed to the container when it starts
• Any data volumes that should be used with the container
• What (if any) IAM role your job should use for AWS permissions

For a complete description of the parameters available in a job deﬁnition, see Job deﬁnition
parameters (p. 43).
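
To make the attributes listed above concrete, the following Python sketch assembles a small job definition and prints it as JSON. Every value is a placeholder chosen for illustration (including the role ARN and the volume paths); only the field names follow the job definition template in this guide.

```python
import json

# All values are placeholders; only the field names follow the job definition template.
job_definition = {
    "jobDefinitionName": "example-job",
    "type": "container",
    "containerProperties": {
        "image": "ubuntu",                 # Docker image to use
        "vcpus": 1,                        # vCPUs reserved for the container
        "memory": 128,                     # hard memory limit in MiB
        "command": ["echo", "hello"],      # command run when the container starts
        "environment": [{"name": "STAGE", "value": "test"}],
        "volumes": [{"name": "scratch", "host": {"sourcePath": "/tmp/scratch"}}],
        "mountPoints": [
            {"sourceVolume": "scratch", "containerPath": "/scratch", "readOnly": False}
        ],
        "jobRoleArn": "arn:aws:iam::123456789012:role/example-job-role",
    },
}

print(json.dumps(job_definition, indent=2))
```

The printed JSON could be saved to a file and adapted for use with the AWS CLI --cli-input-json option.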

Creating a job deﬁnition

Before you can run jobs in AWS Batch, you must create a job deﬁnition. This process varies slightly for
single-node and multi-node parallel jobs. This topic covers creating a job deﬁnition for an AWS Batch job
that's not a multi-node parallel job.

To create a multi-node parallel job deﬁnition, see Creating a multi-node parallel job deﬁnition (p. 36).
For more information about multi-node parallel jobs, see Multi-node Parallel Jobs (p. 27).

To create a new job deﬁnition

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.

2. From the navigation bar, select the Region to use.
3. In the navigation pane, choose Job deﬁnitions, Create.
4. For Name, enter a unique name for your job deﬁnition. The name can be up to 128 characters in
length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).
5. For Platform, choose EC2 if the job runs on EC2 instances, or Fargate if the job runs on AWS Fargate
capacity. For more information, see AWS Batch on AWS Fargate (p. 122).
6. In the Retry Strategies section, you can specify the number of times to retry a job. You can also
create conditions to decide whether a failed job should be retried. These conditions are based
on string matching of the error code and reasons that are listed for the job attempt. For more
information, see Automated Job Retries (p. 18).

a. For Job attempts, specify the number of times to attempt your job if it fails. This number must
be between one (1) and ten (10), inclusive.
b. (Optional) Select Add evaluate on exit to add up to five (5) conditions to match string patterns
with the exit code, status reason, and reason that are returned in the job attempt. For each set
of conditions, Action must be set to either Retry (to retry until the number of job attempts has
been reached), or Exit to stop retrying the job.
7. (Optional) For Execution timeout, specify the maximum number of seconds that you want to allow
your job attempts to run. If an attempt exceeds the timeout duration, it's stopped and the status
moves to FAILED. For more information, see Job Timeouts (p. 19).
8. For Multi-node parallel, leave this box unchecked. To create a multi-node parallel job definition
instead, see Creating a multi-node parallel job definition (p. 36).
9. In Container properties, you can specify properties that are passed to the Docker daemon when the
job is placed.

a. For Image, choose the Docker image to use for your job. Images in the Docker Hub registry
are available by default. You can also specify other repositories with repository-
url/image:tag. Up to 255 letters (uppercase and lowercase), numbers, hyphens, underscores,
colons, periods, forward slashes, and number signs are allowed. This parameter maps to Image
in the Create a container section of the Docker Remote API and the IMAGE parameter of docker
run.
Note
Docker image architecture must match the processor architecture of the compute
resources that they're scheduled on. For example, ARM-based Docker images can only
run on ARM-based compute resources.

• Images in Amazon ECR Public repositories use the full registry/repository[:tag]
or registry/repository[@digest] naming conventions. For example,
public.ecr.aws/registry_alias/my-web-app:latest.
• Images in Amazon ECR repositories use the full registry/repository[:tag] naming
convention. For example, aws_account_id.dkr.ecr.region.amazonaws.com/my-web-
app:latest.
• Images in official repositories on Docker Hub use a single name (for example, ubuntu or
mongo).
• Images in other repositories on Docker Hub are qualified with an organization name (for
example, amazon/amazon-ecs-agent).
• Images in other online repositories are qualified further by a domain name (for example,
quay.io/assemblyline/ubuntu).
b. For Command, specify the command to pass to the container. For simple commands, you can
type the command as you would at a command prompt in the Space delimited tab. Then, verify
that the JSON result (which is passed to the Docker daemon) is correct. For more complicated
commands (for example, with special characters), you can switch to the JSON tab and enter the
string array equivalent there.

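This parameter maps to Cmd in the Create a container section of the Docker Remote API and the
COMMAND parameter to docker run. For more information about the Docker CMD parameter, go to
https://docs.docker.com/engine/reference/builder/#cmd.
Note
You can use default values for parameter substitution and placeholders in your
command. For more information, see Parameters (p. 44).
c. For vCPUs, specify the number of vCPUs to reserve for the container. This parameter maps to
CpuShares in the Create a container section of the Docker Remote API and the --cpu-shares
option to docker run. Each vCPU is equivalent to 1,024 CPU shares. You must specify at least
one vCPU.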
d. For Memory, specify the hard limit (in MiB) of memory to present to the job's container. If your
container attempts to exceed the memory speciﬁed here, the container is killed. This parameter
maps to Memory in the Create a container section of the Docker Remote API and the --memory
option to docker run. You must specify at least 4 MiB of memory for a job.
Note
You can maximize your resource utilization by prioritizing memory for jobs of a speciﬁc
instance type. For instructions, see Compute Resource Memory Management (p. 114).
e. (Optional) For Number of GPUs, specify the number of GPUs your job uses.

The job runs on a container with the speciﬁed number of GPUs pinned to that container.
f. In the Additional conﬁguration section, you can specify additional parameters to be used with
the container.

i. (Optional) For Job role, you can specify an IAM role that provides the container in your job
with permissions to use the AWS APIs. This feature uses Amazon ECS IAM roles for task
functionality. For more information, including configuration prerequisites, see IAM Roles for
Tasks in the Amazon Elastic Container Service Developer Guide.
Note
A job role is required for jobs that are running on Fargate resources.
Note
Only roles that have the Amazon Elastic Container Service Task Role trust
relationship are shown here. For more information about creating an IAM role for
your AWS Batch jobs, see Creating an IAM Role and Policy for your Tasks in the
Amazon Elastic Container Service Developer Guide.
ii. For Execution role, you can specify an IAM role that grants the Amazon ECS container
and Fargate agents permission to make AWS API calls on your behalf. This feature uses
Amazon ECS IAM roles for task functionality. For more information, including configuration
prerequisites, see Amazon ECS task execution IAM roles in the Amazon Elastic Container
Service Developer Guide.
Note
An execution role is required for jobs running on Fargate resources.
iii. (Optional, only for jobs running on Fargate resources) In the Assign public IP section,
select Enable to give the job a public IP address. For a job that's running in a private subnet
to send outbound traffic to the internet, the private subnet requires a NAT gateway be
attached to route requests to the internet. You might want to do this so that you can pull
container images. For more information, see Amazon ECS task networking in the Amazon
Elastic Container Service Developer Guide.
iv. (Optional) In the Mount points section, you can configure mount points for your job's
container to access.

A. For Container path, enter the path on the container at which to mount the host
volume.
B. For Source volume, enter the name of the volume to mount.
C. To make the volume read-only for the container, choose Read-only.
v. (Optional, only for jobs running on EC2 resources) In the Ulimits section, you can conﬁgure
any ulimit values to use for your job's container.

A. Choose Add limit.

B. For Limit name, choose a ulimit to apply.
C. For Soft limit, choose the soft limit to apply for the ulimit type.
D. For Hard limit, choose the hard limit to apply for the ulimit type.
vi. (Optional) In the Environment variables section, you can specify environment variables to
pass to your job's container. This parameter maps to Env in the Create a container section
of the Docker Remote API and the --env option to docker run.
Important
We don't recommend that you use plaintext environment variables for sensitive
information, such as credential data.

A. Choose Add environment variable.

B. For Key, specify the key for your environment variable.
Note
Environment variables must not start with AWS_BATCH. This naming
convention is reserved for variables that are set by the AWS Batch service.
C. For Value, specify the value for your environment variable.
vii. (Optional) In the Volumes section, you can specify data volumes to pass to
your job's container. To add a volume, select Add volume.

A. For Name, enter a name for your volume. The name can be up to 255 characters in
length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and
underscores (_).
B. (Optional) To use an Amazon EFS file system, select Enable EFS.

I. For Filesystem ID, enter the ﬁle system ID.

II. (Optional) For Root directory, enter the directory within the Amazon EFS file
system to mount as the root directory inside the host. If this parameter is omitted,
the root of the Amazon EFS volume is used. Specifying / has the same effect as
omitting this parameter.
C. (Optional, only for jobs running on EC2 resources) For Source Path, enter the path on
the host instance to present to the container. If you leave this field empty, then the
Docker daemon assigns a host path for you. If you specify a source path, then the data
volume persists at the specified location on the host container instance until you delete
it. If the source path doesn't exist on the host container instance, the Docker daemon
creates it. If the location does exist, the contents of the source path folder are exported
to the container.
D. (Optional) To use transit encryption, select Enable transit encryption. Transit
encryption enables encryption for Amazon EFS data in transit between the AWS Batch
host and the Amazon EFS server. Transit encryption must be enabled if Amazon EFS
IAM authorization is used. For more information, see Encrypting data in transit in the
Amazon Elastic File System User Guide.

I. (Optional) For Transit encryption port, enter the port to use when sending
encrypted data between the AWS Batch host and the Amazon EFS server. If you
don't specify a transit encryption port, it uses the port selection strategy that the
Amazon EFS mount helper uses. The value must be between 0 and 65,535. For
more information, see EFS Mount Helper in the Amazon Elastic File System User
Guide.
II. (Optional) For Access point ID, enter the access point ID to use. If an access point
is specified, the root directory value must either be omitted or set to /. For more
information, see Working with Amazon EFS Access Points in the Amazon Elastic File
System User Guide.
III. (Optional) To use the job role when mounting the Amazon EFS file system,
select Use selected job role. For more information, see AWS Batch execution IAM
role (p. 176).
viii. (Optional) In the Security section, you can configure security options for your job's
container.

A. To give your job's container elevated permissions on the host instance (similar to the
root user), select Enable privileged mode. This parameter maps to Privileged
in the Create a container section of the Docker Remote API and the --privileged
option to docker run.
B. For User, enter the user name to use inside the container. This parameter maps to User
in the Create a container section of the Docker Remote API and the --user option to
docker run.
ix. (Optional) In the Linux Parameters section, you can conﬁgure any device mappings to use
for your job's container. This allows the container to be able to access a device on the host
instance.

A. (Optional) In the Devices section, add a device mapping.

I. Choose Add device.

II. For Host path, specify the path of a device in the host instance.
III. For Container path, specify the path in the container at which to expose the
device mapped from the host instance. If this is left blank (unspecified), then the host
path is used in the container.
IV. For Permissions, choose one or more permissions to apply to the device in the
container. The available permissions are READ, WRITE, and MKNOD.
B. (Optional) In the Shared memory size section, enter the size (in MiB) of the /dev/shm
volume.
C. (Optional) In the Max swap size section, enter the total amount of swap memory (in
MiB) that the container can use.
D. (Optional) In the Swappiness section, enter a value between 0 and 100 to indicate the
swappiness behavior of the container. If it's not specified and swapping is enabled, the
default value is 60. For more information, see swappiness (p. 49) in Job definition
parameters (p. 43).
E. (Optional) In the Tmpfs section, to add a tmpfs mount, choose Add tmpfs.

I. In the Container path field, enter the absolute file path in the container where the
tmpfs volume is mounted.
II. In the Size field, enter the size (in MiB) of the tmpfs volume.
III. (Optional) In the Mount options field, enter the mount options. For
more information, including the list of available mount options, see
mountOptions (p. 50) in Job definition parameters (p. 43).
x. (Optional) In the Log configuration section, you can configure the log driver to use for your
job's container. By default, the awslogs log driver is used.

A. In the Log driver section, select the log driver to use. For more information about the
available log drivers, see logDriver (p. 51) in Job deﬁnition parameters (p. 43).
B. (Optional) In the Options section, select Add option to add an option.

I. In the Name ﬁeld, enter the name of the option. The options available vary by log
driver. For more information, see the log driver documentation.
II. In the Value ﬁeld, enter the value of the option.
C. (Optional) In the Secrets section, select Add secret to add a secret.

I. In the Name field, enter the name of the secret. For more information, see
secretOptions (p. 53) in Job definition parameters (p. 43).
II. In the Value field, enter the ARN of the secret.

10. (Optional) In the Parameters section, you can specify parameter substitution default values
and placeholders to use in the command that your job's container runs when it starts. For more
information, see Parameters (p. 44).

a. Choose Add parameter.

b. For Key, specify the key for your parameter.
c. For Value, specify the value for your parameter.
11. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
job deﬁnition. For more information, see Tagging your AWS Batch resources (p. 197).
12. Choose Create job deﬁnition.
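
The limits stated in this procedure can be checked before you open the console. The following Python sketch is an illustration under stated assumptions: the function name and error messages are hypothetical, and the checks cover only the limits quoted above (name length and characters, one to ten job attempts, at least one vCPU, at least 4 MiB of memory).

```python
import re

# Up to 128 letters, numbers, hyphens, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


def check_job_definition_inputs(name, attempts, vcpus, memory_mib):
    """Return a list of violations of the limits described in the procedure."""
    errors = []
    if not NAME_PATTERN.fullmatch(name):
        errors.append("name must be 1-128 letters, numbers, hyphens, or underscores")
    if not 1 <= attempts <= 10:
        errors.append("job attempts must be between 1 and 10")
    if vcpus < 1:
        errors.append("at least one vCPU is required")
    if memory_mib < 4:
        errors.append("at least 4 MiB of memory is required")
    return errors
```

An empty list means the inputs satisfy these limits; it does not guarantee that the service accepts the job definition.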

Creating a multi-node parallel job deﬁnition

To create a single-node job deﬁnition, see Creating a job deﬁnition (p. 31).

To create a multi-node parallel job deﬁnition

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.

2. From the navigation bar, select the Region to use.
3. In the navigation pane, choose Job deﬁnitions, Create.
4. For Name, enter a unique name for your job deﬁnition. Up to 128 letters (uppercase and lowercase),
numbers, hyphens, and underscores are allowed.
5. For Platform, choose EC2.
6. In the Retry Strategies section, you can specify the number of times to retry a job. You can also
create conditions to decide whether a failed job should be retried. This is based on string matching
of the error code and reasons listed for the job attempt. For more information, see Automated Job
Retries (p. 18).

a. For Job attempts, specify the number of times to attempt your job (in case it fails). This number
must be between one (1) and ten (10), inclusive.
b. (Optional) Select Add evaluate on exit to add up to five (5) conditions to match string patterns
with the exit code, status reason, and reason that are returned in the job attempt. For each set
of conditions, Action must be set to either Retry (to retry until the number of job attempts has
been reached), or Exit to stop retrying the job.
7. (Optional) For Execution timeout, specify the maximum number of seconds you would like to allow
your job attempts to run. If an attempt exceeds the timeout duration, it is stopped and the status
moves to FAILED. For more information, see Job Timeouts (p. 19).
8. For Multi-node parallel, select Enable multi-node parallel and then complete the
following substeps. To create a single-node job definition instead, see Creating a job
definition (p. 31).

a. For Number of nodes, enter the total number of nodes to use for your job.
b. For Main node, enter the node index to use for the main node. The default main node index is 0.
c. Select Add node range. This creates a Node range section.

i. For Target nodes, specify the range for your node group, using range_start:range_end
notation.

You can create up to five node ranges for the number of nodes you specified for your job.
Node ranges use the index value for a node, and the node index begins at 0. The range end
index value of your final node group should be the number of nodes you specified in Step
8.a (p. 36), minus one. For example, if you specified 10 nodes, and you want to use a
single node group, then your end range should be 9.
ii. In Container properties, you can specify properties that are passed to the Docker daemon
for the nodes in the node range.

A. For Image, choose the Docker image to use for your job. Images in the Docker
Hub registry are available by default. You can also specify other repositories with
repository-url/image:tag. Up to 255 letters (uppercase and lowercase), numbers,
hyphens, underscores, colons, periods, forward slashes, and number signs are allowed.
This parameter maps to Image in the Create a container section of the Docker Remote
API and the IMAGE parameter of docker run.
Note
Docker image architecture must match the processor architecture of the
compute resources that they're scheduled on. For example, ARM-based Docker
images can only run on ARM-based compute resources.

• Images in Amazon ECR Public repositories use the full
registry/repository[:tag] or registry/repository[@digest] naming conventions.
For example, public.ecr.aws/registry_alias/my-web-app:latest.
• Images in Amazon ECR repositories use the full registry/
repository[:tag] naming convention. For example,
aws_account_id.dkr.ecr.region.amazonaws.com/my-web-app:latest.
• Images in official repositories on Docker Hub use a single name (for example,
ubuntu or mongo).
• Images in other repositories on Docker Hub are qualified with an organization name
(for example, amazon/amazon-ecs-agent).
• Images in other online repositories are qualified further by a domain name (for
example, quay.io/assemblyline/ubuntu).
B. For Command, specify the command to pass to the container. For simple commands,
you can type the command as you would at a command prompt in the Space delimited
tab. Then, verify that the JSON result (which is passed to the Docker daemon) is
correct. For more complicated commands (for example, with special characters), you
can switch to the JSON tab and enter the string array equivalent there.

This parameter maps to Cmd in the Create a container section of the Docker Remote
API and the COMMAND parameter to docker run. For more information about the Docker
CMD parameter, go to https://docs.docker.com/engine/reference/builder/#cmd.
Note
You can use default values for parameter substitution and placeholders in your
command. For more information, see Parameters (p. 44).
C. For vCPUs, specify the number of vCPUs to reserve for the container. This parameter
maps to CpuShares in the Create a container section of the Docker Remote API and
the --cpu-shares option to docker run. Each vCPU is equivalent to 1,024 CPU
shares. You must specify at least one vCPU.
D. For Memory, specify the hard limit (in MiB) of memory to present to the job's
container. If your container attempts to exceed the memory speciﬁed here, the
container is killed. This parameter maps to Memory in the Create a container section of
the Docker Remote API and the --memory option to docker run. You must specify at
least 4 MiB of memory for a job.

Note
If you're trying to maximize your resource utilization by providing your jobs
as much memory as possible for a particular instance type, see Compute
Resource Memory Management (p. 114).
E. (Optional) For Number of GPUs, specify the number of GPUs your job uses.

The job runs on a container with the speciﬁed number of GPUs pinned to that
container.
F. In the Additional conﬁguration section, you can specify additional parameters to be
used with the container.

I. (Optional) For Job role, you can specify an IAM role that provides the container
in your job with permissions to use the AWS APIs. This feature uses Amazon ECS
IAM roles for task functionality. For more information, including conﬁguration
prerequisites, see IAM Roles for Tasks in the Amazon Elastic Container Service
Developer Guide.
Note
A job role is required for jobs that are running on Fargate resources.
Note
Only roles that have the Amazon Elastic Container Service Task Role
trust relationship are shown here. For more information about creating an
IAM role for your AWS Batch jobs, see Creating an IAM Role and Policy for
your Tasks in the Amazon Elastic Container Service Developer Guide.
II. (Optional) In the Volumes section, you can specify data volumes to
pass to your job's container.

1. For Name, enter a name for your volume. Up to 255 letters (uppercase and
lowercase), numbers, hyphens, and underscores are allowed.
2. (Optional) For Source Path, enter the path on the host instance to present to
the container. If you leave this field empty, then the Docker daemon assigns a
host path for you. If you specify a source path, then the data volume persists
at the specified location on the host container instance until you delete it
manually. If the source path doesn't exist on the host container instance, the
Docker daemon creates it. If the location does exist, the contents of the source
path folder are exported to the container.
III. (Optional) In the Mount points section, you can configure mount points for your
job's container to access.

1. For Container path, enter the path on the container at which to mount the
host volume.
2. For Source volume, enter the name of the volume to mount.
3. To make the volume read-only for the container, choose Read-only.
IV. (Optional) In the Ulimits section, you can conﬁgure any ulimit values to use for
your job's container.

1. Choose Add limit.

2. For Limit name, choose a ulimit to apply.
3. For Soft limit, choose the soft limit to apply for the ulimit type.
4. For Hard limit, choose the hard limit to apply for the ulimit type.
V. (Optional) In the Environment variables section, you can specify environment
variables to pass to your job's container. This parameter maps to Env in the Create
a container section of the Docker Remote API and the --env option to docker run.

Important
We don't recommend using plaintext environment variables for sensitive
information, such as credential data.

1. Choose Add environment variable.

2. For Key, specify the key for your environment variable.
Note
Environment variables must not start with AWS_BATCH; this naming
convention is reserved for variables that are set by the AWS Batch
service.
3. For Value, specify the value for your environment variable.
VI. (Optional) In the Security section, you can conﬁgure security options for your job's
container.

1. To give your job's container elevated privileges on the host instance (similar to
the root user), select Privileged. This parameter maps to Privileged in the
Create a container section of the Docker Remote API and the --privileged
option to docker run.
2. For User, enter the user name to use inside the container. This parameter
maps to User in the Create a container section of the Docker Remote API and
the --user option to docker run.
VII. (Optional) In the Linux Parameters section, you can conﬁgure any device
mappings to use for your job's container so that the container can access a device
on the host instance.

1. In the Devices section, choose Add device.

2. For Host path, specify the path of a device in the host instance.
3. For Container path, specify the path in the container at which to expose the
device mapped from the host instance. If this is left blank, then the host path is
used in the container.
4. For Permissions, choose one or more permissions to apply to the device in the
container. The available permissions are READ, WRITE, and MKNOD.
9. Return to Step 8.c.i (p. 36) and repeat for each node group to conﬁgure for your job.
10. (Optional) In the Parameters section, you can specify parameter substitution default values
and placeholders to use in the command that your job's container runs when it starts. For more
information, see Parameters (p. 44).

a. Choose Add parameter.

Job definition template

The following is an empty job definition template. You can use this template to create your job
definition, which can then be saved to a file and used with the AWS CLI --cli-input-json option. For
more information about these parameters, see Job definition parameters (p. 43).

{
"jobDefinitionName": "",
"type": "container",
"parameters": {
"KeyName": ""
},
"containerProperties": {
"image": "",
"vcpus": 0,
"memory": 0,
"command": [
""
],
"jobRoleArn": "",
"executionRoleArn": "",
"volumes": [
{
"host": {
"sourcePath": ""
},
"name": "",
"efsVolumeConfiguration": {
"fileSystemId": "",
"rootDirectory": "",
"transitEncryption": "ENABLED",
"transitEncryptionPort": 0,
"authorizationConfig": {
"accessPointId": "",
"iam": "ENABLED"
}
}
}
],
"environment": [
{
"name": "",
"value": ""
}
],
"mountPoints": [
{
"containerPath": "",
"readOnly": true,
"sourceVolume": ""
}
],
"readonlyRootFilesystem": true,
"privileged": true,
"ulimits": [
{
"hardLimit": 0,
"name": "",
"softLimit": 0
}
],
"user": "",
"instanceType": "",
"resourceRequirements": [
{
"value": "",
"type": "VCPU"
}
],
"linuxParameters": {
"devices": [
{
"hostPath": "",
"containerPath": "",
"permissions": [
"MKNOD"
]
}
],
"initProcessEnabled": true,
"sharedMemorySize": 0,
"tmpfs": [
{
"containerPath": "",
"size": 0,
"mountOptions": [
""
]
}
],
"maxSwap": 0,
"swappiness": 0
},
"logConfiguration": {
"logDriver": "json-file",
"options": {
"KeyName": ""
},
"secretOptions": [
{
"name": "",
"valueFrom": ""
}
]
},
"secrets": [
{
"name": "",
"valueFrom": ""
}
],
"networkConfiguration": {
"assignPublicIp": "ENABLED"
},
"fargatePlatformConfiguration": {
"platformVersion": ""
}
},
"nodeProperties": {
"numNodes": 0,
"mainNode": 0,
"nodeRangeProperties": [
{
"targetNodes": "",
"container": {
"image": "",
"vcpus": 0,
"memory": 0,
"command": [
""
],
"jobRoleArn": "",
"executionRoleArn": "",
"volumes": [
{
"host": {
"sourcePath": ""
},
"name": "",
"efsVolumeConfiguration": {
"fileSystemId": "",
"rootDirectory": "",
"transitEncryption": "DISABLED",
"transitEncryptionPort": 0,
"authorizationConfig": {
"accessPointId": "",
"iam": "DISABLED"
}
}
}
],
"environment": [
{
"name": "",
"value": ""
}
],
"mountPoints": [
{
"containerPath": "",
"readOnly": true,
"sourceVolume": ""
}
],
"readonlyRootFilesystem": true,
"privileged": true,
"ulimits": [
{
"hardLimit": 0,
"name": "",
"softLimit": 0
}
],
"user": "",
"instanceType": "",
"resourceRequirements": [
{
"value": "",
"type": "GPU"
}
],
"linuxParameters": {
"devices": [
{
"hostPath": "",
"containerPath": "",
"permissions": [
"MKNOD"
]
}
],
"initProcessEnabled": true,
"sharedMemorySize": 0,
"tmpfs": [
{
"containerPath": "",
"size": 0,
"mountOptions": [
""
]
}
],
"maxSwap": 0,
"swappiness": 0
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"KeyName": ""
},
"secretOptions": [
{
"name": "",
"valueFrom": ""
}
]
},
"secrets": [
{
"name": "",
"valueFrom": ""
}
],
"networkConfiguration": {
"assignPublicIp": "DISABLED"
},
"fargatePlatformConfiguration": {
"platformVersion": ""
}
}
}
]
},
"retryStrategy": {
"attempts": 0,
"evaluateOnExit": [
{
"onStatusReason": "",
"onReason": "",
"onExitCode": "",
"action": "EXIT"
}
]
},
"propagateTags": true,
"timeout": {
"attemptDurationSeconds": 0
},
"tags": {
"KeyName": ""
},
"platformCapabilities": [
"FARGATE"
]
}

Note
You can generate the preceding job deﬁnition template with the following AWS CLI command:

$ aws batch register-job-definition --generate-cli-skeleton
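
The generated template lists every field with a placeholder value, so it usually needs trimming before it's useful. The following Python sketch shows one possible way to drop the empty placeholders. The helper name `prune_empty` and the sample values (`example-job`, `busybox`) are hypothetical, and the rule "remove empty strings, lists, and objects" is an assumption for illustration. Numeric and Boolean placeholders such as `0` and `true` are left alone and still need to be reviewed by hand.

```python
import json


def prune_empty(value):
    """Recursively drop empty strings, empty lists, and empty dicts."""
    if isinstance(value, dict):
        pruned = {key: prune_empty(item) for key, item in value.items()}
        return {key: item for key, item in pruned.items() if item not in ("", [], {})}
    if isinstance(value, list):
        pruned = [prune_empty(item) for item in value]
        return [item for item in pruned if item not in ("", [], {})]
    return value


# A small fragment of the template with a few fields filled in (hypothetical values).
skeleton_fragment = {
    "jobDefinitionName": "example-job",
    "type": "container",
    "parameters": {"KeyName": ""},
    "containerProperties": {
        "image": "busybox",
        "command": ["echo", "hello"],
        "jobRoleArn": "",
        "volumes": [{"host": {"sourcePath": ""}, "name": ""}],
        "resourceRequirements": [
            {"type": "VCPU", "value": "1"},
            {"type": "MEMORY", "value": "512"},
        ],
    },
}

job_definition = prune_empty(skeleton_fragment)
print(json.dumps(job_definition, indent=2))
```

You can save the printed JSON to a file and pass it to the AWS CLI with the --cli-input-json option, as described earlier in this section.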

Job definition parameters

Job definitions are split into several parts, including the job definition name, the type of the job
definition, parameter substitution placeholder defaults, and the container properties for the job.

Contents

• Job definition name (p. 44)
• Type (p. 44)
• Parameters (p. 44)
• Platform capabilities (p. 45)
• Propagate tags (p. 45)
• Container properties (p. 45)
• Node properties (p. 61)
• Retry strategy (p. 62)
• Tags (p. 64)
• Timeout (p. 64)

Job deﬁnition name

jobDefinitionName

When you register a job definition, you specify a name. The name can be up to 128 characters in
length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).
The first job definition that's registered with that name is given a revision of 1. Any subsequent job
definitions that are registered with that name are given an incremental revision number.

Type: String

Required: Yes
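
The naming rules above can be checked before you call the API. The following is a minimal Python sketch; the function name `is_valid_job_definition_name` is hypothetical, and the pattern encodes only the rules stated in this section (1 to 128 letters, numbers, hyphens, and underscores).

```python
import re

# Letters, numbers, hyphens, and underscores; up to 128 characters.
JOB_DEFINITION_NAME = re.compile(r"[A-Za-z0-9_-]{1,128}")


def is_valid_job_definition_name(name: str) -> bool:
    """Return True if the name satisfies the rules described above."""
    return JOB_DEFINITION_NAME.fullmatch(name) is not None


print(is_valid_job_definition_name("nightly-transcode_v2"))
print(is_valid_job_definition_name("name with spaces"))
```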

Type
type

When you register a job deﬁnition, you specify the type of job. If the job runs on Fargate resources,
then multinode isn't supported. For more information about multi-node parallel jobs, see the
section called “Creating a multi-node parallel job deﬁnition” (p. 36).

Type: String

Valid values: container | multinode

Required: Yes

Parameters
parameters

When you submit a job, you can specify parameters that should replace the placeholders or override
the default job definition parameters. Parameters in job submission requests take precedence over
the defaults in a job definition. This means that you can use the same job definition for multiple jobs
that use the same format, and programmatically change values in the command at submission time.

Type: String to string map

Required: No

When you register a job deﬁnition, you can use parameter substitution placeholders in the command
ﬁeld of a job's container properties. For example:

"command": [ "ffmpeg", "-i", "Ref::inputfile", "-c", "Ref::codec", "-o", "Ref::outputfile" ]

In the preceding example, there are Ref::inputfile, Ref::codec, and Ref::outputfile
parameter substitution placeholders in the command. You can use the parameters object in the
job definition to set default values for these placeholders. For example, to set a default for the
Ref::codec placeholder, you specify the following in the job definition:

"parameters" : {"codec" : "mp4"}

When this job deﬁnition is submitted to run, the Ref::codec argument in the command for the
container is replaced with the default value, mp4.
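
To make the precedence rule concrete, the following Python sketch imitates the substitution locally. It is an illustration only and not how AWS Batch implements the feature. The function name `substitute_parameters` and the file names are hypothetical, and two assumptions are made: a placeholder occupies a whole command argument, and a placeholder without a value is left unchanged.

```python
def substitute_parameters(command, defaults, overrides=None):
    """Replace Ref::name arguments, letting submit-time values win over defaults."""
    values = {**defaults, **(overrides or {})}
    prefix = "Ref::"
    resolved = []
    for argument in command:
        name = argument[len(prefix):] if argument.startswith(prefix) else None
        if name is not None and name in values:
            resolved.append(values[name])
        else:
            # Assumption: unresolved placeholders and plain arguments pass through.
            resolved.append(argument)
    return resolved


command = ["ffmpeg", "-i", "Ref::inputfile", "-c", "Ref::codec", "-o", "Ref::outputfile"]
defaults = {"codec": "mp4"}                                      # from the job definition
submitted = {"inputfile": "in.mov", "outputfile": "out.mp4"}     # from the job submission

resolved = substitute_parameters(command, defaults, submitted)
print(resolved)
```

With these inputs, the Ref::codec argument takes the default from the job definition, and the other two placeholders take the values supplied at submission time.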

Platform capabilities
platformCapabilities

The platform capabilities that are required by the job definition. If no value is specified, it defaults to
EC2. For jobs that run on Fargate resources, specify FARGATE.

Type: Array of strings

Valid values: EC2 | FARGATE

Required: No

Propagate tags
propagateTags

Specifies whether to propagate the tags from the job or job definition to the corresponding Amazon
ECS task. If no value is specified, the tags aren't propagated. Tags can only be propagated to the
tasks when the task is created. For tags with the same name, job tags are given priority over job
definition tags. If the total number of combined tags from the job and job definition is over 50, the
job is moved to the FAILED state.

Type: Boolean

Required: No
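
The precedence and limit described above can be sketched in Python as follows. The function name `merge_propagated_tags` and the sample tags are hypothetical. The sketch assumes that the limit of 50 applies to the combined set after tags with the same name are merged, which this guide doesn't state explicitly.

```python
MAX_COMBINED_TAGS = 50


def merge_propagated_tags(job_definition_tags, job_tags):
    """Combine tags, giving job tags priority over job definition tags."""
    merged = {**job_definition_tags, **job_tags}
    if len(merged) > MAX_COMBINED_TAGS:
        # AWS Batch moves the job to FAILED in this case; the sketch raises instead.
        raise ValueError(f"{len(merged)} combined tags exceeds {MAX_COMBINED_TAGS}")
    return merged


merged_tags = merge_propagated_tags(
    {"team": "media", "stage": "test"},   # job definition tags
    {"stage": "prod"},                    # job tags
)
print(merged_tags)
```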

Container properties
When you register a job definition, you must specify a list of container properties that are passed to the
Docker daemon on a container instance when the job is placed. The following container properties are
allowed in a job definition. For single-node jobs, these container properties are set at the job definition
level. For multi-node parallel jobs, container properties are set at the Node properties (p. 61) level for
each node group.

command

The command that's passed to the container. This parameter maps to Cmd in the Create a container
section of the Docker Remote API and the COMMAND parameter to docker run. For more information
about the Docker CMD parameter, see https://docs.docker.com/engine/reference/builder/#cmd.

"command": ["string", ...]

Type: String array

Required: No
environment

The environment variables to pass to a container. This parameter maps to Env in the Create a
container section of the Docker Remote API and the --env option to docker run.
Important
We don't recommend that you use plaintext environment variables for sensitive
information, such as credential data.
Note
Environment variables must not start with AWS_BATCH. This naming convention is reserved
for variables that are set by the AWS Batch service.

Type: Array of key-value pairs

Required: No
name

The name of the environment variable.

Type: String

Required: Yes, when environment is used.

value

The value of the environment variable.

Type: String

Required: Yes, when environment is used.

"environment" : [
{ "name" : "envName1", "value" : "envValue1" },
{ "name" : "envName2", "value" : "envValue2" }
]
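
Because names that start with AWS_BATCH are reserved, it can help to check an environment list before you register the job definition. The following Python sketch shows one way to do that; the function name `find_reserved_names` and the sample variables are hypothetical.

```python
RESERVED_PREFIX = "AWS_BATCH"


def find_reserved_names(environment):
    """Return the names of environment variables that use the reserved prefix."""
    return [
        entry["name"]
        for entry in environment
        if entry["name"].startswith(RESERVED_PREFIX)
    ]


environment = [
    {"name": "envName1", "value": "envValue1"},
    {"name": "AWS_BATCH_CUSTOM", "value": "not allowed"},
]
print(find_reserved_names(environment))
```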

executionRoleArn

When you register a job deﬁnition, you can specify an IAM role. The role provides the Amazon ECS
container agent with permissions to call the API actions that are speciﬁed in its associated policies
on your behalf. Jobs that are running on Fargate resources must provide an execution role. For more
information, see AWS Batch execution IAM role (p. 176).

Type: String

Required: No
fargatePlatformConfiguration

The platform conﬁguration for jobs that are running on Fargate resources. Jobs that are running on
EC2 resources must not specify this parameter.

Type: FargatePlatformConﬁguration object

Required: No
platformVersion

The AWS Fargate platform version to use for the jobs, or LATEST to use a recent, approved version
of the AWS Fargate platform.

Type: String

Default: LATEST

Required: No
image

The image used to start a job. This string is passed directly to the Docker daemon. Images in
the Docker Hub registry are available by default. You can also specify other repositories with
repository-url/image:tag. Up to 255 letters (uppercase and lowercase), numbers, hyphens,
underscores, colons, periods, forward slashes, and number signs are allowed. This parameter maps
to Image in the Create a container section of the Docker Remote API and the IMAGE parameter of
docker run.
Note
The Docker image architecture must match the processor architecture of the compute resources
that the job is scheduled on. For example, ARM-based Docker images can only run on ARM-based
compute resources.
• Images in Amazon ECR Public repositories use the full registry/repository[:tag]
or registry/repository[@digest] naming conventions. For example,
public.ecr.aws/registry_alias/my-web-app:latest.
• Images in Amazon ECR repositories use the full registry/repository[:tag] naming
convention. For example, aws_account_id.dkr.ecr.region.amazonaws.com/my-web-app:latest.
• Images in official repositories on Docker Hub use a single name (for example, ubuntu or mongo).
• Images in other repositories on Docker Hub are qualified with an organization name (for example,
amazon/amazon-ecs-agent).
• Images in other online repositories are qualified further by a domain name (for example,
quay.io/assemblyline/ubuntu).

Type: String

Required: Yes
instanceType

The instance type to use for a multi-node parallel job. All node groups in a multi-node parallel job
must use the same instance type. This parameter isn't valid for single-node container jobs or for jobs
running on Fargate resources.

Type: String

Required: No
jobRoleArn

When you register a job deﬁnition, you can specify an IAM role. The role provides the job container
with permissions to call the API actions that are speciﬁed in its associated policies on your behalf.
For more information, see IAM Roles for Tasks in the Amazon Elastic Container Service Developer
Guide.

Type: String

Required: No
linuxParameters

Linux-speciﬁc modiﬁcations that are applied to the container, such as details for device mappings.

"linuxParameters": {
"devices": [
{
"hostPath": "string",
"containerPath": "string",
"permissions": [
"READ", "WRITE", "MKNOD"
]
}
],
"initProcessEnabled": true|false,
"sharedMemorySize": 0,
"tmpfs": [
{
"containerPath": "string",
"size": integer,
"mountOptions": [
"string"
]
}
],
"maxSwap": integer,
"swappiness": integer
}

Type: LinuxParameters object

Required: No
devices

List of devices mapped into the container. This parameter maps to Devices in the Create a
container section of the Docker Remote API and the --device option to docker run.
Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.

Type: Array of Device objects

Required: No
hostPath

The path of the device on the host container instance.

Type: String

Required: Yes
containerPath

The path where the device is exposed in the container. If this isn't specified, the device is
exposed at the same path as the host path.

Type: String

Required: No

permissions

Permissions for the device in the container. If this isn't specified, the permissions are set to
READ, WRITE, and MKNOD.

Type: Array of strings

Required: No

Valid values: READ | WRITE | MKNOD

initProcessEnabled

If true, run an init process inside the container that forwards signals and reaps processes.
This parameter maps to the --init option to docker run. This parameter requires version 1.25
of the Docker Remote API or greater on your container instance. To check the Docker Remote
API version on your container instance, log into your container instance and run the following
command: sudo docker version | grep "Server API version"

Type: Boolean

Required: No
maxSwap

The total amount of swap memory (in MiB) a job can use. This parameter is translated to the
--memory-swap option to docker run where the value is the sum of the container memory
plus the maxSwap value. For more information, see --memory-swap details in the Docker
documentation.

If a maxSwap value of 0 is speciﬁed, the container doesn't use swap. Accepted values are 0
or any positive integer. If the maxSwap parameter is omitted, the container uses the swap
conﬁguration for the container instance that it's running on. A maxSwap value must be set for
the swappiness parameter to be used.
Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.

Type: Integer

Required: No
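
The relationship between maxSwap and the --memory-swap option is simple arithmetic, which the following Python sketch works through. The function name `docker_memory_swap_mib` is hypothetical, and returning None for an omitted maxSwap is a convention of this sketch that stands for "use the container instance's swap configuration".

```python
def docker_memory_swap_mib(memory_mib, max_swap_mib=None):
    """Return the --memory-swap value (in MiB) implied by memory and maxSwap."""
    if max_swap_mib is None:
        # maxSwap omitted: the container instance's swap configuration applies.
        return None
    if max_swap_mib < 0:
        raise ValueError("maxSwap must be 0 or a positive integer")
    # --memory-swap is the sum of the container memory and the maxSwap value.
    return memory_mib + max_swap_mib


print(docker_memory_swap_mib(2048, 1024))  # 2048 MiB memory plus 1024 MiB swap
print(docker_memory_swap_mib(2048, 0))     # equal to memory, so no swap is used
```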
sharedMemorySize

The value for the size (in MiB) of the /dev/shm volume. This parameter maps to the --shm-
size option to docker run.
Note
This parameter isn't applicable to jobs running on Fargate resources and shouldn't be
provided.

Type: Integer

Required: No
swappiness

You can use this to tune a container's memory swappiness behavior. A swappiness value of
0 causes swapping to not happen unless absolutely necessary. A swappiness value of 100
causes pages to be swapped very aggressively. Accepted values are whole numbers between 0
and 100. If the swappiness parameter isn't specified, a default value of 60 is used. If a value
isn't specified for maxSwap, then this parameter is ignored. If maxSwap is set to 0, the container
doesn't use swap. This parameter maps to the --memory-swappiness option to docker run.

Consider the following when you use a per-container swap conﬁguration.

• Swap space must be enabled and allocated on the container instance for the containers to
use.
Note
The Amazon ECS optimized AMIs don't have swap enabled by default. You must
enable swap on the instance to use this feature. For more information, see Instance
Store Swap Volumes in the Amazon EC2 User Guide for Linux Instances or How do I
allocate memory to work as swap space in an Amazon EC2 instance by using a swap
file?.
• The swap space parameters are only supported for job definitions using EC2 resources.
• If the maxSwap and swappiness parameters are omitted from a job definition, each
container has a default swappiness value of 60 and the total swap usage is limited to two
times the memory reservation of the container.
Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.

Type: Integer

Required: No
tmpfs

The container path, mount options, and size of the tmpfs mount.

Type: Array of Tmpfs objects

Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.

Required: No
containerPath

The absolute ﬁle path in the container where the tmpfs volume is mounted.

Type: String

Required: Yes
mountOptions

The list of tmpfs volume mount options.

Type: Array of strings

Required: No

size

The size (in MiB) of the tmpfs volume.

Type: Integer

Required: Yes
logConfiguration

The log conﬁguration speciﬁcation for the job.

This parameter maps to LogConfig in the Create a container section of the Docker Remote API and
the --log-driver option to docker run. By default, containers use the same logging driver that
the Docker daemon uses. However, the container can use a different logging driver than the Docker
daemon by specifying a log driver with this parameter in the container definition. To use a different
logging driver for a container, the log system must be either configured on the container instance or
on another log server to provide remote logging options. For more information about the options
for different supported log drivers, see Configure logging drivers in the Docker documentation.
Note
AWS Batch currently supports a subset of the logging drivers available to the Docker
daemon (shown in the LogConfiguration data type).

"logConfiguration": {
"logDriver": "string",
"options": {
"optionName1" : "optionValue1",
"optionName2" : "optionValue2"
},
"secretOptions": [
{
"name" : "secretOptionName1",
"valueFrom" : "secretOptionArn1"
},
{
"name" : "secretOptionName2",
"valueFrom" : "secretOptionArn2"
}
]
}

Type: LogConﬁguration object

Required: No
logDriver

The log driver to use for the job. By default, AWS Batch enables the awslogs log driver. The
valid values listed for this parameter are log drivers that the Amazon ECS container agent can
communicate with by default.

This parameter maps to LogConfig in the Create a container section of the Docker Remote
API and the --log-driver option to docker run. By default, jobs use the same logging driver
that the Docker daemon uses. However, the job can use a different logging driver than the
Docker daemon by specifying a log driver with this parameter in the job definition. If you want
to specify another logging driver for a job, the log system must be configured on the
container instance in the compute environment. Alternatively, configure it on
another log server to provide remote logging options. For more information about the options
for different supported log drivers, see Configure logging drivers in the Docker documentation.
Note
AWS Batch currently supports a subset of the logging drivers that are available to the
Docker daemon. Additional log drivers might be available in future releases of the
Amazon ECS container agent.

The supported log drivers are awslogs, fluentd, gelf, json-file, journald, logentries,
syslog, and splunk.
Note
Jobs that are running on Fargate resources are restricted to the awslogs and splunk
log drivers.

This parameter requires version 1.18 of the Docker Remote API or greater on your container
instance. To check the Docker Remote API version on your container instance, log into your
container instance and run the following command: sudo docker version | grep
"Server API version"
Note
The Amazon ECS container agent that's running on a container instance
must register the logging drivers that are available on that instance with the
ECS_AVAILABLE_LOGGING_DRIVERS environment variable. Otherwise, the containers
placed on that instance can't use these log conﬁguration options. For more information,
see Amazon ECS Container Agent Conﬁguration in the Amazon Elastic Container Service
Developer Guide.
awslogs

Speciﬁes the Amazon CloudWatch Logs logging driver. For more information, see Using the
awslogs log driver (p. 64) and Amazon CloudWatch Logs logging driver in the Docker
documentation.
fluentd

Speciﬁes the Fluentd logging driver. For more information, including usage and options, see
Fluentd logging driver in the Docker documentation.
gelf

Speciﬁes the Graylog Extended Format (GELF) logging driver. For more information,
including usage and options, see Graylog Extended Format logging driver in the Docker
documentation.
journald

Speciﬁes the journald logging driver. For more information, including usage and options,
see Journald logging driver in the Docker documentation.
json-file

Speciﬁes the JSON ﬁle logging driver. For more information, including usage and options,
see JSON File logging driver in the Docker documentation.
splunk

Speciﬁes the Splunk logging driver. For more information, including usage and options, see
Splunk logging driver in the Docker documentation.

syslog

Speciﬁes the syslog logging driver. For more information, including usage and options, see
Syslog logging driver in the Docker documentation.

Type: String

Required: Yes

Valid values: awslogs | fluentd | gelf | journald | json-file | splunk | syslog

Note
If you have a custom driver that's not listed earlier that you would like to work with the
Amazon ECS container agent, you can fork the Amazon ECS container agent project
that's available on GitHub and customize it to work with that driver. We encourage you
to submit pull requests for changes that you would like to have included. However,
Amazon Web Services doesn't currently support running modified copies of
this software.
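
The valid values and the Fargate restriction described for logDriver can be combined into a quick local check. The following Python sketch is one way to do it; the function name `check_log_driver` and the "EC2" and "FARGATE" platform arguments are conventions of this sketch, and the driver sets are copied from the valid values listed above.

```python
VALID_LOG_DRIVERS = {
    "awslogs", "fluentd", "gelf", "journald", "json-file", "splunk", "syslog",
}
FARGATE_LOG_DRIVERS = {"awslogs", "splunk"}


def check_log_driver(log_driver, platform="EC2"):
    """Return an error message, or None if the driver is acceptable."""
    if log_driver not in VALID_LOG_DRIVERS:
        return f"unsupported log driver: {log_driver}"
    if platform == "FARGATE" and log_driver not in FARGATE_LOG_DRIVERS:
        return f"{log_driver} isn't available for jobs on Fargate resources"
    return None


print(check_log_driver("json-file"))
print(check_log_driver("json-file", platform="FARGATE"))
```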
options

Log conﬁguration options to send to a log driver for the job.

This parameter requires version 1.19 of the Docker Remote API or greater on your container
instance.

Type: String to string map

Required: No
secretOptions

An object representing the secret to pass to the log conﬁguration. For more information, see
Specifying sensitive data (p. 67).

Type: Object array

Required: No
name

The name of the log driver option to set in the job.

Type: String

Required: Yes
valueFrom

The ARN of the secret to expose to the log configuration of the container. The supported
values are either the full ARN of the Secrets Manager secret or the full ARN of the
parameter in the SSM Parameter Store.
Note
If the SSM Parameter Store parameter exists in the same Region as the task you're
launching, then you can use either the full ARN or name of the parameter. If the
parameter exists in a different Region, then the full ARN must be specified.

Type: String

Required: Yes
memory

This parameter is deprecated. Use resourceRequirements (p. 55) instead.

The number of MiB of memory reserved for the job.

As an example of how to use resourceRequirements (p. 55), if your job definition contains
lines similar to this:

"containerProperties": {
"memory": 512
}

The equivalent lines using resourceRequirements (p. 55) are as follows.

"containerProperties": {
"resourceRequirements": [
{
"type": "MEMORY",
"value": "512"
}
]
}

Type: Integer

Required: Yes
mountPoints

The mount points for data volumes in your container. This parameter maps to Volumes in the
Create a container section of the Docker Remote API and the --volume option to docker run.

"mountPoints": [
{
"sourceVolume": "string",
"containerPath": "string",
"readOnly": true|false
}
]

Type: Object array

Required: No
sourceVolume

The name of the volume to mount.

Type: String

Required: Yes, when mountPoints is used.

containerPath

The path in the container at which to mount the host volume.

Type: String

Required: Yes, when mountPoints is used.

readOnly

If this value is true, the container has read-only access to the volume. If this value is false,
then the container can write to the volume.

Type: Boolean

Required: No

Default: False
networkConfiguration

The network conﬁguration for jobs that are running on Fargate resources. Jobs that are running on
EC2 resources must not specify this parameter.

"networkConfiguration": {
"assignPublicIp": "string"
}

Type: Object

Required: No
assignPublicIp

Indicates whether the job should have a public IP address. This is required if the job needs
outbound network access.

Type: String

Valid values: ENABLED | DISABLED

Required: No

Default: DISABLED
privileged

When this parameter is true, the container is given elevated permissions on the host container
instance (similar to the root user). This parameter maps to Privileged in the Create a container
section of the Docker Remote API and the --privileged option to docker run. This parameter isn't
applicable to jobs running on Fargate resources. For those jobs, either omit it or specify it as false.

"privileged": true|false

Type: Boolean

Required: No
readonlyRootFilesystem

When this parameter is true, the container is given read-only access to its root ﬁle system. This
parameter maps to ReadonlyRootfs in the Create a container section of the Docker Remote API
and the --read-only option to docker run.

"readonlyRootFilesystem": true|false

Type: Boolean

Required: No
resourceRequirements

The type and amount of a resource to assign to a container. The supported resources include GPU,
MEMORY, and VCPU.

"resourceRequirements" : [
{
"type": "GPU",
"value": "number"
}
]

Type: Object array

Required: No
type

The type of resource to assign to a container. The supported resources include GPU, MEMORY, and
VCPU.

Type: String

Required: Yes, when resourceRequirements is used.

value

The quantity of the speciﬁed resource to reserve for the container. The values vary based on the
type speciﬁed.
type="GPU"

The number of physical GPUs to reserve for the container. The number of GPUs reserved
for all containers in a job shouldn't exceed the number of available GPUs on the compute
resource that the job is launched on.
type="MEMORY"

The hard limit (in MiB) of memory to present to the container. If your container attempts to
exceed the memory specified here, the container is killed. This parameter maps to Memory
in the Create a container section of the Docker Remote API and the --memory option to
docker run. You must specify at least 4 MiB of memory for a job. This is required but can be
specified in several places for multi-node parallel (MNP) jobs. It must be specified for each
node at least once.
Note
If you're trying to maximize your resource utilization by providing your jobs as
much memory as possible for a particular instance type, see Compute Resource
Memory Management (p. 114).

For jobs that are running on Fargate resources, value must match one of the
supported values. Moreover, the VCPU value must be one of the values supported for that
memory value.

VCPU         MEMORY

0.25 vCPU    512, 1024, and 2048 MiB
0.5 vCPU     1024, 2048, 3072, and 4096 MiB
1 vCPU       2048, 3072, 4096, 5120, 6144, 7168, and 8192 MiB
2 vCPU       4096, 5120, 6144, 7168, 8192, 9216, 10240, 11264, 12288, 13312, 14336,
             15360, and 16384 MiB
4 vCPU       8192, 9216, 10240, 11264, 12288, 13312, 14336, 15360, 16384, 17408, 18432,
             19456, 20480, 21504, 22528, 23552, 24576, 25600, 26624, 27648, 28672, 29696,
             and 30720 MiB

type="VCPU"

The number of vCPUs reserved for the job. This parameter maps to CpuShares in the
Create a container section of the Docker Remote API and the --cpu-shares option to
docker run. Each vCPU is equivalent to 1,024 CPU shares. For jobs that are running on
EC2 resources, you must specify at least one vCPU. This is required but can be speciﬁed in
several places. It must be speciﬁed for each node at least once.

For jobs that are running on Fargate resources, value must match one of the
supported values, and the MEMORY value must be one of the values supported for that
VCPU value. The supported values are 0.25, 0.5, 1, 2, and 4.

Type: String

Required: Yes, when resourceRequirements is used.
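
The supported Fargate combinations listed under type="MEMORY" can be encoded as a lookup table and checked before you register a job definition. The following Python sketch is one way to do it. The names `FARGATE_MEMORY_BY_VCPU` and `is_supported_fargate_combination` are hypothetical, and the values are copied from the table in this section, so they reflect this version of the guide only.

```python
# Supported MEMORY values (MiB) for each VCPU value, taken from the table above.
FARGATE_MEMORY_BY_VCPU = {
    "0.25": [512, 1024, 2048],
    "0.5": [1024, 2048, 3072, 4096],
    "1": list(range(2048, 8192 + 1, 1024)),
    "2": list(range(4096, 16384 + 1, 1024)),
    "4": list(range(8192, 30720 + 1, 1024)),
}


def is_supported_fargate_combination(resource_requirements):
    """Check the VCPU and MEMORY entries of a resourceRequirements list."""
    values = {entry["type"]: entry["value"] for entry in resource_requirements}
    vcpu = values.get("VCPU")
    memory = values.get("MEMORY")
    if vcpu not in FARGATE_MEMORY_BY_VCPU or memory is None:
        return False
    return memory.isdigit() and int(memory) in FARGATE_MEMORY_BY_VCPU[vcpu]


requirements = [
    {"type": "VCPU", "value": "0.5"},
    {"type": "MEMORY", "value": "2048"},
]
print(is_supported_fargate_combination(requirements))
```

Both values are strings because the value field of resourceRequirements has the String type.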

secrets

The secrets for the job that are exposed as environment variables. For more information, see
Specifying sensitive data (p. 67).

"secrets": [
{
"name": "secretName1",
"valueFrom": "secretArn1"
},
{
"name": "secretName2",
"valueFrom": "secretArn2"
}
...
]

Type: Object array

Required: No
name

The name of the environment variable that contains the secret.

Type: String

Required: Yes, when secrets is used.

valueFrom

The secret to expose to the container. The supported values are either the full ARN of the
Secrets Manager secret or the full ARN of the parameter in the SSM Parameter Store.
Note
If the SSM Parameter Store parameter exists in the same Region as the job you're
launching, then you can use either the full ARN or name of the parameter. If the
parameter exists in a diﬀerent Region, then the full ARN must be speciﬁed.

Type: String

Required: Yes, when secrets is used.

ulimits

A list of ulimits values to set in the container. This parameter maps to Ulimits in the Create a
container section of the Docker Remote API and the --ulimit option to docker run.

"ulimits": [
{
"name": string,
"softLimit": integer,
"hardLimit": integer
}
...
]

Type: Object array

Required: No
name

The type of the ulimit.

Type: String

Required: Yes, when ulimits is used.

hardLimit

The hard limit for the ulimit type.

Type: Integer

Required: Yes, when ulimits is used.

softLimit

The soft limit for the ulimit type.

Type: Integer

Required: Yes, when ulimits is used.

user

The user name to use inside the container. This parameter maps to User in the Create a container
section of the Docker Remote API and the --user option to docker run.

"user": "string"

Type: String

Required: No
vcpus

This parameter is deprecated. Use resourceRequirements (p. 55) instead.

The number of vCPUs reserved for the container.

As an example of how to use resourceRequirements, if your job definition contains lines similar
to this:

"containerProperties": {
"vcpus": 2
}

The equivalent lines using resourceRequirements (p. 55) are as follows.

"containerProperties": {
"resourceRequirements": [
{
"type": "VCPU",
"value": "2"
}
]
}

Type: Integer

Required: Yes
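
If you have existing job definitions that use the deprecated vcpus and memory parameters, the conversion shown in the examples for those parameters can be automated. The following Python sketch is one possible approach. The function name `migrate_to_resource_requirements` is hypothetical, and the choice to keep an existing resourceRequirements entry over a deprecated value of the same type is an assumption of this sketch.

```python
import copy


def migrate_to_resource_requirements(container_properties):
    """Return a copy with vcpus and memory moved into resourceRequirements."""
    migrated = copy.deepcopy(container_properties)
    requirements = migrated.setdefault("resourceRequirements", [])
    present = {entry["type"] for entry in requirements}
    for old_key, new_type in (("vcpus", "VCPU"), ("memory", "MEMORY")):
        if old_key in migrated:
            value = migrated.pop(old_key)
            if new_type not in present:
                # resourceRequirements values are strings, not integers.
                requirements.append({"type": new_type, "value": str(value)})
    return migrated


old_properties = {"image": "busybox", "vcpus": 2, "memory": 512}
new_properties = migrate_to_resource_requirements(old_properties)
print(new_properties)
```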
volumes

When you register a job deﬁnition, you can specify a list of volumes that are passed to the Docker
daemon on a container instance. The following parameters are allowed in the container properties:

"volumes": [
{
"name": "string",
"host": {
"sourcePath": "string"
},
"efsVolumeConfiguration": {
"authorizationConfig": {
"accessPointId": "string",
"iam": "string"
},
"fileSystemId": "string",
"rootDirectory": "string",
"transitEncryption": "string",
"transitEncryptionPort": number
}
}
]

name

The name of the volume. Up to 255 letters (uppercase and lowercase), numbers, hyphens, and
underscores are allowed. This name is referenced in the sourceVolume parameter of container
deﬁnition mountPoints.

Type: String

Required: No
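
Because each mountPoints entry refers to a volume by name, a mismatch between the two lists is an easy mistake to make. The following Python sketch reports mount points whose sourceVolume doesn't match any volume name. The function name `find_unknown_source_volumes` and the sample names (`scratch`, `datasets`) are hypothetical.

```python
def find_unknown_source_volumes(container_properties):
    """Return sourceVolume values that don't match any defined volume name."""
    volume_names = {
        volume.get("name") for volume in container_properties.get("volumes", [])
    }
    return [
        mount["sourceVolume"]
        for mount in container_properties.get("mountPoints", [])
        if mount["sourceVolume"] not in volume_names
    ]


container_properties = {
    "volumes": [{"name": "scratch", "host": {"sourcePath": "/mnt/scratch"}}],
    "mountPoints": [
        {"sourceVolume": "scratch", "containerPath": "/scratch", "readOnly": False},
        {"sourceVolume": "datasets", "containerPath": "/data", "readOnly": True},
    ],
}
print(find_unknown_source_volumes(container_properties))
```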
host

The contents of the host parameter determine whether your data volume persists on the
host container instance and where it's stored. If the host parameter is empty, then the Docker
daemon assigns a host path for your data volume. However, the data isn't guaranteed to persist
after the container associated with it stops running.

Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.

Type: Object

Required: No
sourcePath

The path on the host container instance that's presented to the container. If this parameter
is empty, then the Docker daemon assigns a host path for you.

If the host parameter contains a sourcePath ﬁle location, then the data volume persists
at the speciﬁed location on the host container instance until you delete it manually. If the
sourcePath value doesn't exist on the host container instance, the Docker daemon creates
it. If the location does exist, the contents of the source path folder are exported.

Type: String

Required: No
efsVolumeConfiguration

This parameter is speciﬁed when you're using an Amazon Elastic File System ﬁle system for task
storage. For more information, see Amazon EFS Volumes in the AWS Batch User Guide.

Type: Object

Required: No
authorizationConfig

The authorization conﬁguration details for the Amazon EFS ﬁle system.

Type: Object

Required: No
accessPointId

The Amazon EFS access point ID to use. If an access point is speciﬁed, the root directory
value that's speciﬁed in the EFSVolumeConfiguration must either be omitted or
set to /. This enforces the path that's set on the EFS access point. If an access point is
used, transit encryption must be enabled in the EFSVolumeConfiguration. For more
information, see Working with Amazon EFS Access Points in the Amazon Elastic File
System User Guide.

Type: String

Required: No
iam

Determines whether to use the AWS Batch job IAM role defined in a job definition when
mounting the Amazon EFS file system. If enabled, transit encryption must be enabled
in the EFSVolumeConfiguration. If this parameter is omitted, the default value of
DISABLED is used. For more information, see Using Amazon EFS Access Points in the
AWS Batch User Guide.

Type: String

Valid values: ENABLED | DISABLED

Required: No
fileSystemId

The Amazon EFS ﬁle system ID to use.

Type: String

Required: No
rootDirectory

The directory within the Amazon EFS file system to mount as the root directory inside
the host. If this parameter is omitted, the root of the Amazon EFS volume is used. If you
specify /, it has the same effect as omitting this parameter. The maximum length is 4,096
characters.
Important
If an EFS access point is specified in the authorizationConfig, the root
directory parameter must either be omitted or set to /. This enforces the path
that's set on the Amazon EFS access point.

Type: String

Required: No
transitEncryption

Determines whether to enable encryption for Amazon EFS data in transit between the
Amazon ECS host and the Amazon EFS server. Transit encryption must be enabled if
Amazon EFS IAM authorization is used. If this parameter is omitted, the default value of
DISABLED is used. For more information, see Encrypting data in transit in the Amazon
Elastic File System User Guide.

Type: String

Valid values: ENABLED | DISABLED

Required: No
transitEncryptionPort

The port to use when sending encrypted data between the Amazon ECS host and the
Amazon EFS server. If you don't specify a transit encryption port, it uses the port selection
strategy that the Amazon EFS mount helper uses. The value must be between 0 and 65,535.
For more information, see EFS Mount Helper in the Amazon Elastic File System User Guide.

Type: Integer

Required: No
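The constraints that tie the efsVolumeConfiguration parameters together can be checked before a job definition is registered. The following Python sketch is illustrative only: the function name and the wording of the messages are assumptions, and it covers just the rules stated above (access point, IAM authorization, root directory length, and port range).

```python
def check_efs_volume_configuration(config):
    """Return a list of problems found in an efsVolumeConfiguration dict
    (hypothetical helper; covers only the rules described in this section)."""
    problems = []
    auth = config.get("authorizationConfig", {})
    encryption = config.get("transitEncryption", "DISABLED")
    root = config.get("rootDirectory")

    if auth.get("accessPointId"):
        if root not in (None, "/"):
            problems.append("rootDirectory must be omitted or set to / with an access point")
        if encryption != "ENABLED":
            problems.append("transitEncryption must be ENABLED with an access point")
    if auth.get("iam", "DISABLED") == "ENABLED" and encryption != "ENABLED":
        problems.append("transitEncryption must be ENABLED with IAM authorization")
    if root is not None and len(root) > 4096:
        problems.append("rootDirectory is longer than 4,096 characters")
    port = config.get("transitEncryptionPort")
    if port is not None and not 0 <= port <= 65535:
        problems.append("transitEncryptionPort must be between 0 and 65,535")
    return problems


# The file system and access point IDs below are placeholders.
print(check_efs_volume_configuration({
    "fileSystemId": "fs-12345678",
    "authorizationConfig": {"accessPointId": "fsap-1234", "iam": "ENABLED"},
    "transitEncryption": "ENABLED",
}))
```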

Node properties
nodeProperties

When you register a multi-node parallel job definition, you must specify a list of node properties.
These node properties should define the number of nodes to use in your job, the main node index,
and the different node ranges to use. If the job runs on Fargate resources, then you can't specify
nodeProperties. Use containerProperties instead. The following node
properties are allowed in a job definition. For more information, see Multi-node Parallel Jobs (p. 27).

Type: NodeProperties object

Required: No
mainNode

Speciﬁes the node index for the main node of a multi-node parallel job. This node index value
must be smaller than the number of nodes.

Type: Integer

Required: Yes
numNodes

The number of nodes that are associated with a multi-node parallel job.

Type: Integer

Required: Yes
nodeRangeProperties

A list of node ranges and their properties that are associated with a multi-node parallel job.

Type: Array of NodeRangeProperty objects

Required: Yes
targetNodes

The range of nodes, using node index values. A range of 0:3 indicates nodes with index
values of 0 through 3. If the starting range value is omitted (:n), then 0 is used to start
the range. If the ending range value is omitted (n:), then the highest possible node index
is used to end the range. Your cumulative node ranges must account for all nodes
(0:n). You can nest node ranges, for example 0:10 and 4:5. In this case, the 4:5 range
properties override the 0:10 properties.

Type: String

Required: No
container

The container details for the node range. For more information, see Container
properties (p. 45).

Type: ContainerProperties object

Required: No
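The targetNodes range rules can be sketched as a small Python check. This is an illustration under stated assumptions: the function names are hypothetical, only the start:end, :end, and start: forms described above are handled, and the check confirms only that mainNode is a valid index and that the ranges together cover every node.

```python
def parse_target_nodes(target_nodes, num_nodes):
    """Expand a targetNodes string such as "0:3", ":3" or "4:" into an
    inclusive (start, end) pair of node indexes."""
    start_text, separator, end_text = target_nodes.partition(":")
    if not separator:
        raise ValueError("expected a range in start:end form")
    start = int(start_text) if start_text else 0            # ":n" starts at 0
    end = int(end_text) if end_text else num_nodes - 1      # "n:" ends at the highest index
    if not 0 <= start <= end < num_nodes:
        raise ValueError("range %r is outside 0:%d" % (target_nodes, num_nodes - 1))
    return start, end


def uncovered_nodes(node_properties):
    """Return the node indexes that no nodeRangeProperties entry covers."""
    num_nodes = node_properties["numNodes"]
    if not 0 <= node_properties["mainNode"] < num_nodes:
        raise ValueError("mainNode must be smaller than numNodes")
    covered = set()
    for node_range in node_properties["nodeRangeProperties"]:
        start, end = parse_target_nodes(node_range["targetNodes"], num_nodes)
        covered.update(range(start, end + 1))
    return sorted(set(range(num_nodes)) - covered)


print(uncovered_nodes({
    "numNodes": 8,
    "mainNode": 0,
    "nodeRangeProperties": [{"targetNodes": "0:3"}, {"targetNodes": "4:"}],
}))
```

An empty list means that the ranges account for all nodes.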

Retry strategy
retryStrategy

When you register a job definition, you can optionally specify a retry strategy to use for failed jobs
that are submitted with this job definition. Any retry strategy that's specified during a SubmitJob
operation overrides the retry strategy defined here. By default, each job is attempted one time. If
you specify more than one attempt, the job is retried if it fails. Examples of a failed attempt include a
job that returns a non-zero exit code or a container instance that's terminated. For more information, see
Automated job retries.

Type: RetryStrategy object

Required: No

attempts

The number of times to move a job to the RUNNABLE status. You can specify between 1 and 10
attempts. If attempts is greater than one, a job that fails is retried until it has moved to
RUNNABLE that many times.

"attempts": integer

Type: Integer

Required: No
evaluateOnExit

An array of up to 5 objects that specify conditions under which the job should be retried or failed. If
this parameter is specified, then the attempts parameter must also be specified.

"evaluateOnExit": [
{
"action": "string",
"onExitCode": "string",
"onReason": "string",
"onStatusReason": "string"
}
]

Type: Array of EvaluateOnExit objects

Required: No
action

Speciﬁes the action to take if all of the speciﬁed conditions (onStatusReason, onReason,
and onExitCode) are met. The values aren't case sensitive.

Type: String

Required: Yes

Valid values: RETRY | EXIT

onExitCode

Contains a glob pattern to match against the decimal representation of the ExitCode
that's returned for a job. The pattern can be up to 512 characters in length. It can contain
only numbers. It cannot contain letters or special characters. It can optionally end with an
asterisk (*) so that only the start of the string needs to be an exact match.

Type: String

Required: No
onReason

Contains a glob pattern to match against the Reason that's returned for a job. The pattern
can be up to 512 characters in length. It can contain letters, numbers, periods (.), colons
(:), and white space (spaces, tabs). It can optionally end with an asterisk (*) so that only the
start of the string needs to be an exact match.

Type: String

Required: No

onStatusReason

Contains a glob pattern to match against the StatusReason that's returned for a job. The
pattern can be up to 512 characters in length. It can contain letters, numbers, periods (.),
colons (:), and white space (spaces, tabs). It can optionally end with an asterisk (*) so that
only the start of the string needs to be an exact match.

Type: String

Required: No
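The way an evaluateOnExit list selects an action can be sketched in Python. This is an illustration, not the service's implementation. It assumes that the entries are checked in order and that the first entry whose conditions all match decides the action, and it treats pattern matching as case sensitive. Neither point is stated above. The function names are hypothetical.

```python
def glob_matches(pattern, value):
    """Match a pattern that may end with one asterisk, in which case only
    the start of the value has to match."""
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return value == pattern


def decide_action(evaluate_on_exit, exit_code=None, reason="", status_reason=""):
    """Return "RETRY" or "EXIT" for the first matching entry, else None."""
    observed = {
        # onExitCode is matched against the decimal representation.
        "onExitCode": "" if exit_code is None else str(exit_code),
        "onReason": reason,
        "onStatusReason": status_reason,
    }
    for rule in evaluate_on_exit:
        conditions = [key for key in observed if key in rule]
        # Every condition that the entry specifies has to match.
        if all(glob_matches(rule[key], observed[key]) for key in conditions):
            return rule["action"].upper()   # action values aren't case sensitive
    return None


rules = [
    {"onStatusReason": "Host EC2*", "action": "retry"},
    {"onExitCode": "1*", "action": "EXIT"},
]
print(decide_action(rules, status_reason="Host EC2 instance terminated"))
print(decide_action(rules, exit_code=137))
```

In this sketch, an entry that specifies no conditions matches every failed attempt.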

Tags
tags

Key-value pair tags to associate with the job deﬁnition. For more information, see Tagging your AWS
Batch resources (p. 197).

Type: String to string map

Required: No

Timeout
timeout

You can configure a timeout duration for your jobs so that if a job runs longer than that, AWS Batch
terminates the job. If a job is terminated due to a timeout, it isn't retried. Any timeout configuration
that's specified during a SubmitJob operation overrides the timeout configuration defined here. For
more information, see Job Timeouts (p. 19).

Type: JobTimeout object

Required: No
attemptDurationSeconds

The time duration in seconds (measured from the job attempt's startedAt timestamp) after which
AWS Batch terminates unfinished jobs. The minimum value for the timeout is 60 seconds.

Type: Integer

Required: No

Using the awslogs log driver

By default, AWS Batch enables the awslogs log driver to send log information to CloudWatch Logs. You
can use this feature to view different logs from your containers in one convenient location and prevent
your container logs from taking up disk space on your container instances. This topic helps you configure
the awslogs log driver in your job definitions.
Note
The type of information that's logged by the containers in your job depends mostly on their
ENTRYPOINT command. By default, the logs that are captured show the command output
that you normally see in an interactive terminal if you ran the container locally, which are the
STDOUT and STDERR I/O streams. The awslogs log driver simply passes these logs from Docker
to CloudWatch Logs. For more information about how Docker logs are processed, including
alternative ways to capture diﬀerent ﬁle data or streams, see View logs for a container or service
in the Docker documentation.

To send system logs from your container instances to CloudWatch Logs, see Using CloudWatch Logs
with AWS Batch (p. 159). For more information about CloudWatch Logs, see Monitoring Log Files and
CloudWatch Logs quotas in the Amazon CloudWatch Logs User Guide.

Available awslogs log driver options

The awslogs log driver supports the following options in AWS Batch job deﬁnitions. For more
information, see CloudWatch Logs logging driver in the Docker documentation.

awslogs-region

Required: No

Specify the Region where the awslogs log driver should send your Docker logs. By default, the
Region that's used is the same one as the one for the job. You can choose to send all of your logs
from jobs in different Regions to a single Region in CloudWatch Logs. Doing this makes them
all visible from one location. Alternatively, you can separate them by Region for a more granular
approach. However, when you choose this option, make sure that the specified log group exists in
the Region that you specified.
awslogs-group

Required: No

With the awslogs-group option, you can specify the log group that the awslogs log driver sends
its log streams to. If this isn't specified, /aws/batch/job is used.
awslogs-stream-prefix

Required: No

With the awslogs-stream-prefix option, you can associate a log stream with the specified
prefix, and the Amazon ECS task ID of the AWS Batch job that the container belongs to. If you
specify a prefix with this option, then the log stream takes the following format:

prefix-name/default/ecs-task-id

awslogs-datetime-format

Required: No

This option deﬁnes a multiline start pattern in Python strftime format. A log message consists
of a line that matches the pattern and any following lines that don't match the pattern. Thus the
matched line is the delimiter between log messages.

One example of a use case for using this format is for parsing output such as a stack dump, which
might otherwise be logged in multiple entries. The correct pattern allows it to be captured in a
single entry.

For more information, see awslogs-datetime-format.

This option always takes precedence if both awslogs-datetime-format and
awslogs-multiline-pattern are configured.
Note
Multiline logging performs regular expression parsing and matching of all log messages.
This may have a negative impact on logging performance.

awslogs-multiline-pattern

Required: No

This option deﬁnes a multiline start pattern using a regular expression. A log message consists of
a line that matches the pattern and any following lines that don't match the pattern. Thus, the
matched line is the delimiter between log messages.

For more information, see awslogs-multiline-pattern in the Docker documentation.

This option is ignored if awslogs-datetime-format is also conﬁgured.

Note
Multiline logging performs regular expression parsing and matching of all log messages.
This might have a negative impact on logging performance.
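The effect of a multiline start pattern can be illustrated with a short Python sketch. It isn't the log driver's code: the function name is hypothetical, and Python's re module stands in for the regular expression engine that the driver uses, so pattern syntax might differ in places.

```python
import re


def group_multiline(lines, start_pattern):
    """Group raw output lines into log messages. A line that matches the
    start pattern begins a new message, and any following lines that
    don't match are appended to it."""
    start = re.compile(start_pattern)
    messages = []
    for line in lines:
        if start.search(line) or not messages:
            messages.append(line)
        else:
            messages[-1] = messages[-1] + "\n" + line
    return messages


output = ["ERROR boom", "  at a()", "  at b()", "INFO ok"]
for message in group_multiline(output, r"^(ERROR|INFO)"):
    print(repr(message))
```

With this pattern, the stack dump that follows the ERROR line is captured in one message instead of three separate entries.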
awslogs-create-group

Required: No

Specify whether you want the log group automatically created. If this option isn't speciﬁed, it
defaults to false.
Warning
This option isn't recommended. We recommend that you create the log group in advance
using the CloudWatch Logs CreateLogGroup API action. Otherwise, each job tries to create the log
group, which increases the chance that the job fails.
Note
The IAM policy for your execution role must include the logs:CreateLogGroup
permission before you attempt to use awslogs-create-group.

Specifying a log conﬁguration in your job deﬁnition

By default, AWS Batch enables the awslogs log driver. This section describes how to customize the
awslogs log conﬁguration for a job. For more information, see Creating a job deﬁnition (p. 31).

The following log configuration JSON snippets have a logConfiguration object specified for each job.
One is for a WordPress job that sends logs to a log group called awslogs-wordpress, and another is
for a MySQL container that sends logs to a log group called awslogs-mysql. Both containers use the
awslogs-example log stream prefix.

"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "awslogs-wordpress",
"awslogs-stream-prefix": "awslogs-example"
}
}

"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "awslogs-mysql",
"awslogs-stream-prefix": "awslogs-example"
}
}

In the AWS Batch console, the log configuration for the wordpress job definition is specified as shown
in the following image.

After you have registered a job definition with the awslogs log driver in its log
configuration, you can submit a job with that job definition to start sending logs to CloudWatch Logs.
For more information, see Submitting a Job (p. 14).
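The logConfiguration objects in this section follow one pattern, which can be sketched as a small Python helper. The function names are hypothetical, the task ID in the usage line is a placeholder, and the stream name follows the prefix-name/default/ecs-task-id format given for awslogs-stream-prefix.

```python
def awslogs_configuration(group, stream_prefix=None, region=None):
    """Build a logConfiguration dict for the awslogs log driver,
    including only the options that were provided."""
    options = {"awslogs-group": group}
    if stream_prefix:
        options["awslogs-stream-prefix"] = stream_prefix
    if region:
        options["awslogs-region"] = region
    return {"logDriver": "awslogs", "options": options}


def log_stream_name(stream_prefix, ecs_task_id):
    """Return the log stream name that results from a stream prefix."""
    return "%s/default/%s" % (stream_prefix, ecs_task_id)


print(awslogs_configuration("awslogs-wordpress", "awslogs-example"))
print(log_stream_name("awslogs-example", "example-task-id"))
```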

Specifying sensitive data

With AWS Batch, you can inject sensitive data into your jobs by storing your sensitive data in either AWS
Secrets Manager secrets or AWS Systems Manager Parameter Store parameters, and then reference them
in your job deﬁnition.

Secrets can be exposed to a job in the following ways:

• To inject sensitive data into your containers as environment variables, use the secrets job definition
parameter.
• To reference sensitive information in the log configuration of a job, use the secretOptions job
definition parameter.

Topics
• Specifying sensitive data using Secrets Manager (p. 67)
• Specifying sensitive data using Systems Manager Parameter Store (p. 73)

Specifying sensitive data using Secrets Manager

With AWS Batch, you can inject sensitive data into your jobs by storing your sensitive data in AWS Secrets
Manager secrets and then referencing them in your job deﬁnition. Sensitive data stored in Secrets
Manager secrets can be exposed to a job as environment variables or as part of the log conﬁguration.

When you inject a secret as an environment variable, you can specify a JSON key or version of a secret to
inject. This process helps you control the sensitive data exposed to your job. For more information about
secret versioning, see Key Terms and Concepts for AWS Secrets Manager in the AWS Secrets Manager User
Guide.

Considerations for specifying sensitive data using Secrets Manager

The following should be considered when using Secrets Manager to specify sensitive data for jobs.

• To inject a secret using a speciﬁc JSON key or version of a secret, the container instance in your
compute environment must have version 1.37.0 or later of the Amazon ECS container agent installed.
However, we recommend using the latest container agent version. For information about checking
your agent version and updating to the latest version, see Updating the Amazon ECS container agent
in the Amazon Elastic Container Service Developer Guide.

To inject the full contents of a secret as an environment variable or to inject a secret in a log
conﬁguration, your container instance must have version 1.22.0 or later of the container agent.
• Only secrets that store text data, which are secrets created with the SecretString parameter of the
CreateSecret API, are supported. Secrets that store binary data, which are secrets created with the
SecretBinary parameter of the CreateSecret API aren't supported.
• When using a job deﬁnition that references Secrets Manager secrets to retrieve sensitive data for your
jobs, if you're also using interface VPC endpoints, you must create the interface VPC endpoints for
Secrets Manager. For more information, see Using Secrets Manager with VPC Endpoints in the AWS
Secrets Manager User Guide.
• Sensitive data is injected into your job when the job is initially started. If the secret is subsequently
updated or rotated, the job doesn't receive the updated value automatically. To pick up the updated
secret value, you must launch a new job.
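The container agent version requirements above can be expressed as a simple comparison. The following Python sketch is illustrative: the feature labels and function names are made up for this example, and only the two minimum versions stated in the list are used.

```python
# Minimum Amazon ECS container agent versions, taken from the list above.
# The feature labels are hypothetical names for this sketch.
MINIMUM_AGENT_VERSION = {
    "full_secret": (1, 22, 0),
    "log_configuration": (1, 22, 0),
    "json_key_or_version": (1, 37, 0),
}


def parse_version(text):
    """Turn "1.37.0" into (1, 37, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def agent_supports(feature, agent_version):
    """Return True if the agent version meets the minimum for the feature."""
    return parse_version(agent_version) >= MINIMUM_AGENT_VERSION[feature]


print(agent_supports("json_key_or_version", "1.36.2"))
print(agent_supports("full_secret", "1.36.2"))
```

Comparing tuples of integers avoids the mistake of comparing version strings character by character, where "1.100.0" would sort before "1.37.0".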

Required IAM permissions for AWS Batch secrets

To use this feature, you must have the execution role and reference it in your job deﬁnition. This allows
the container agent to pull the necessary Secrets Manager resources. For more information, see AWS
Batch execution IAM role (p. 176).

To provide access to the Secrets Manager secrets that you create, manually add the following
permissions as an inline policy to the execution role. For more information, see Adding and Removing
IAM Policies in the IAM User Guide.

• secretsmanager:GetSecretValue–Required if you're referencing a Secrets Manager secret.

• kms:Decrypt–Required only if your secret uses a custom KMS key and not the default key. The ARN
for your custom key should be added as a resource.

The following example inline policy adds the required permissions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:secretsmanager:<region>:<aws_account_id>:secret:<secret_name>",
        "arn:aws:kms:<region>:<aws_account_id>:key/<key_id>"
      ]
    }
  ]
}
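If you generate this policy from a script, the optional kms:Decrypt permission can be added only when a custom key is in use. The following Python sketch shows one way to do that. The function name is hypothetical, and the ARNs in the usage example are placeholders.

```python
import json


def secrets_access_policy(secret_arns, kms_key_arns=()):
    """Build the inline policy document for the execution role. The
    kms:Decrypt action is included only when custom key ARNs are given."""
    actions = ["secretsmanager:GetSecretValue"]
    resources = list(secret_arns)
    if kms_key_arns:
        actions.append("kms:Decrypt")
        resources.extend(kms_key_arns)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": actions, "Resource": resources}
        ],
    }


policy = secrets_access_policy(
    ["arn:aws:secretsmanager:us-east-2:123456789012:secret:appauthexample-AbCdEf"]
)
print(json.dumps(policy, indent=2))
```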

Injecting sensitive data as an environment variable

Within your job deﬁnition, you can specify the following items:

• The secrets object containing the name of the environment variable to set in the job
• The Amazon Resource Name (ARN) of the Secrets Manager secret
• Additional parameters that contain the sensitive data to present to the job

The following example shows the full syntax that must be speciﬁed for the Secrets Manager secret.

arn:aws:secretsmanager:region:aws_account_id:secret:secret-name:json-key:version-stage:version-id

The following section describes the additional parameters. These parameters are optional. However,
if you don't use them, you must include the colons : to use the default values. Examples are provided
below for more context.

json-key

Specifies the name of the key in a key-value pair with the value that you want to set as the
environment variable value. Only values in JSON format are supported. If you don't specify a JSON
key, then the full contents of the secret are used.
version-stage

Specifies the staging label of the version of a secret that you want to use. If a version staging label
is specified, you can't specify a version ID. If no version stage is specified, the default behavior is to
retrieve the secret with the AWSCURRENT staging label.

Staging labels are used to keep track of diﬀerent versions of a secret when they are either updated
or rotated. Each version of a secret has one or more staging labels and an ID. For more information,
see Key Terms and Concepts for AWS Secrets Manager in the AWS Secrets Manager User Guide.
version-id

Specifies the unique identifier of the version of a secret that you want to use. If a version ID is
specified, you can't specify a version staging label. If no version ID is specified, the default behavior
is to retrieve the secret with the AWSCURRENT staging label.

Version IDs are used to keep track of diﬀerent versions of a secret when they are either updated or
rotated. Each version of a secret has an ID. For more information, see Key Terms and Concepts for
AWS Secrets Manager in the AWS Secrets Manager User Guide.

Example container deﬁnitions

The following examples show ways that you can reference Secrets Manager secrets in your container
deﬁnitions.

Example referencing a full secret

The following is a snippet of a job definition showing the format when referencing the full text of a
Secrets Manager secret.

{
  "containerProperties": {
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:secret_name-AbCdEf"
    }]
  }
}

Example referencing a speciﬁc key within a secret

The following shows an example output from a get-secret-value command that displays the contents of
a secret along with the version staging label and version ID associated with it.

{
  "ARN": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf",
  "Name": "appauthexample",
  "VersionId": "871d9eca-18aa-46a9-8785-981dd39ab30c",
  "SecretString": "{\"username1\":\"password1\",\"username2\":\"password2\",\"username3\":\"password3\"}",
  "VersionStages": [
    "AWSCURRENT"
  ],
  "CreatedDate": 1581968848.921
}

Reference a speciﬁc key from the previous output in a container deﬁnition by specifying the key name at
the end of the ARN.

{
  "containerProperties": {
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf:username1::"
    }]
  }
}

Example referencing a speciﬁc secret version

The following shows an example output from a describe-secret command that displays the metadata
of a secret, including the staging labels that are attached to each version of the secret.

{
"ARN": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf",
"Name": "appauthexample",
"Description": "Example of a secret containing application authorization data.",
"RotationEnabled": false,
"LastChangedDate": 1581968848.926,
"LastAccessedDate": 1581897600.0,
"Tags": [],
"VersionIdsToStages": {
"871d9eca-18aa-46a9-8785-981dd39ab30c": [
"AWSCURRENT"
],
"9d4cb84b-ad69-40c0-a0ab-cead36b967e8": [
"AWSPREVIOUS"
]
}
}

Reference a specific version staging label from the previous output in a container definition by specifying
the staging label at the end of the ARN.

{
  "containerProperties": {
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf::AWSPREVIOUS:"
    }]
  }
}

Reference a specific version ID from the previous output in a container definition by specifying the
version ID at the end of the ARN.

{
  "containerProperties": {
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf:::9d4cb84b-ad69-40c0-a0ab-cead36b967e8"
    }]
  }
}

Example referencing a speciﬁc key and version staging label of a secret

The following shows how to reference both a speciﬁc key within a secret and a speciﬁc version staging
label.

{
  "containerProperties": {
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf:username1:AWSPREVIOUS:"
    }]
  }
}

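To specify a specific key and version ID, end the ARN with the key name, an empty version staging label,
and the version ID (for example, appauthexample-AbCdEf:username1::9d4cb84b-ad69-40c0-a0ab-cead36b967e8).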
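The reference forms in these examples can be generated by a small helper. The following Python sketch is illustrative: the function name is hypothetical, and it assumes that the optional fields are appended in the order json-key, version-stage, version-id, which is the order in which the parameters are described above.

```python
def secret_reference(secret_arn, json_key="", version_stage="", version_id=""):
    """Build a valueFrom string for a Secrets Manager secret. Optional
    fields that aren't used are left empty, but their colons are kept."""
    if version_stage and version_id:
        raise ValueError("specify a version staging label or a version ID, not both")
    if not (json_key or version_stage or version_id):
        return secret_arn   # reference the full contents of the secret
    return "%s:%s:%s:%s" % (secret_arn, json_key, version_stage, version_id)


arn = "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf"
print(secret_reference(arn, json_key="username1"))
print(secret_reference(arn, json_key="username1", version_stage="AWSPREVIOUS"))
print(secret_reference(arn, json_key="username1",
                       version_id="9d4cb84b-ad69-40c0-a0ab-cead36b967e8"))
```

Rejecting a call that passes both a staging label and a version ID mirrors the rule that only one of the two can be specified.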

Injecting sensitive data in a log conﬁguration

Within your job deﬁnition, when specifying a logConfiguration you can specify secretOptions
with the name of the log driver option to set in the container and the full ARN of the Secrets Manager
secret containing the sensitive data to present to the container.

The following is a snippet of a job definition showing the format when referencing a Secrets Manager
secret.

{
  "containerProperties": {
    "logConfiguration": {
      "logDriver": "splunk",
      "options": {
        "splunk-url": "https://cloud.splunk.com:8080"
      },
      "secretOptions": [{
        "name": "splunk-token",
        "valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:secret_name-AbCdEf"
      }]
    }
  }
}

Creating an AWS Secrets Manager secret

You can use the Secrets Manager console to create a secret for your sensitive data. For more information,
see Creating a Basic Secret in the AWS Secrets Manager User Guide.

To create a basic secret

Use Secrets Manager to create a secret for your sensitive data.

1. Open the Secrets Manager console at https://console.aws.amazon.com/secretsmanager/.

2. Choose Store a new secret.
3. For Select secret type, choose Other type of secrets.
4. Specify the details of your custom secret as Key and Value pairs. For example, you can specify a key
of UserName, and then supply the appropriate user name as its value. Add a second key with the
name of Password and the password text as its value. You could also add entries for a database
name, server address, or TCP port. You can add as many pairs as you need to store the information
you require.

Alternatively, you can choose the Plaintext tab and enter the secret value in any way you like.
5. Choose the AWS KMS encryption key that you want to use to encrypt the protected text in the
secret. If you don't choose one, Secrets Manager checks to see if there's a default key for the
account, and uses it if it exists. If a default key doesn't exist, Secrets Manager creates one for you
automatically. You can also choose Add new key to create a custom CMK speciﬁcally for this secret.
To create your own AWS KMS CMK, you must have permissions to create CMKs in your account.
6. Choose Next.
7. For Secret name, type an optional path and name, such as production/MyAwesomeAppSecret
or development/TestSecret, and choose Next. You can optionally add a description to help you
remember the purpose of this secret later.

The secret name must be ASCII letters, digits, or any of the following characters: /_+=.@-
8. (Optional) At this point, you can conﬁgure rotation for your secret. For this procedure, leave it at
Disable automatic rotation and choose Next.

For information about how to conﬁgure rotation on new or existing secrets, see Rotating Your AWS
Secrets Manager Secrets.
9. Review your settings, and then choose Store secret to save everything you entered as a new secret
in Secrets Manager.

Specifying sensitive data using Systems Manager Parameter Store

With AWS Batch, you can inject sensitive data into your containers by storing your sensitive data in AWS
Systems Manager Parameter Store parameters and then referencing them in your container deﬁnition.

Topics
• Considerations for specifying sensitive data using Systems Manager Parameter Store (p. 73)
• Required IAM permissions for AWS Batch secrets (p. 73)
• Injecting sensitive data as an environment variable (p. 74)
• Injecting sensitive data in a log conﬁguration (p. 74)
• Creating an AWS Systems Manager Parameter Store parameter (p. 75)

Considerations for specifying sensitive data using Systems Manager Parameter Store

The following should be considered when specifying sensitive data for containers using Systems
Manager Parameter Store parameters.

• This feature requires that your container instance have version 1.22.0 or later of the container agent.
However, we recommend using the latest container agent version. For information about checking
your agent version and updating to the latest version, see Updating the Amazon ECS container agent
in the Amazon Elastic Container Service Developer Guide.
• Sensitive data is injected into the container for your job when the container is initially started. If the
secret or Parameter Store parameter is subsequently updated or rotated, the container doesn't receive
the updated value automatically. To pick up the updated value, you must launch a new job.

Required IAM permissions for AWS Batch secrets

To use this feature, you must have the execution role and reference it in your job deﬁnition. This allows
the Amazon ECS container agent to pull the necessary AWS Systems Manager resources. For more
information, see AWS Batch execution IAM role (p. 176).

To provide access to the AWS Systems Manager Parameter Store parameters that you create, manually
add the following permissions as an inline policy to the execution role. For more information, see Adding
and Removing IAM Policies in the IAM User Guide.

• ssm:GetParameters–Required if you're referencing a Systems Manager Parameter Store parameter
in a job definition.
• secretsmanager:GetSecretValue–Required if you're referencing a Secrets Manager secret, either
directly or through a Systems Manager Parameter Store parameter, in a job definition.
• kms:Decrypt–Required only if your secret uses a custom KMS key and not the default key. The ARN
for your custom key should be added as a resource.

The following example inline policy adds the required permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameters",
        "secretsmanager:GetSecretValue",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:ssm:<region>:<aws_account_id>:parameter/<parameter_name>",
        "arn:aws:secretsmanager:<region>:<aws_account_id>:secret:<secret_name>",
        "arn:aws:kms:<region>:<aws_account_id>:key/<key_id>"
      ]
    }
  ]
}

Injecting sensitive data as an environment variable

Within your container deﬁnition, specify secrets with the name of the environment variable to set
in the container and the full ARN of the Systems Manager Parameter Store parameter containing the
sensitive data to present to the container.

The following is a snippet of a job definition showing the format when referencing a Systems Manager
Parameter Store parameter. If the Systems Manager Parameter Store parameter exists in the same
Region as the job that you're launching, then you can use either the full ARN or name of the parameter.
If the parameter exists in a different Region, then the full ARN must be specified.

{
  "containerProperties": {
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:ssm:region:aws_account_id:parameter/parameter_name"
    }]
  }
}
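The choice between the parameter name and the full ARN can be sketched as follows. This Python helper is hypothetical, and the account ID and Regions in the usage example are placeholders. It returns the name when the parameter is in the same Region as the job and builds the full ARN otherwise.

```python
def parameter_reference(parameter_name, parameter_region, job_region, account_id):
    """Return a valueFrom string for a Parameter Store parameter: the
    name within the same Region, the full ARN across Regions."""
    if parameter_region == job_region:
        return parameter_name
    # The ARN already supplies the slash after "parameter".
    name = parameter_name.lstrip("/")
    return "arn:aws:ssm:%s:%s:parameter/%s" % (parameter_region, account_id, name)


print(parameter_reference("test/database_password", "us-east-2", "us-east-2", "123456789012"))
print(parameter_reference("test/database_password", "us-west-2", "us-east-2", "123456789012"))
```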

Injecting sensitive data in a log conﬁguration

Within your container definition, when specifying a logConfiguration you can specify
secretOptions with the name of the log driver option to set in the container and the full ARN of the
Systems Manager Parameter Store parameter containing the sensitive data to present to the container.
Important
If the Systems Manager Parameter Store parameter exists in the same Region as the job you're
launching, then you can use either the full ARN or name of the parameter. If the parameter
exists in a different Region, then the full ARN must be specified.

The following is a snippet of a job definition showing the format when referencing a Systems Manager
Parameter Store parameter.

{
  "containerProperties": {
    "logConfiguration": {
      "logDriver": "fluentd",
      "options": {
        "tag": "fluentd demo"
      },
      "secretOptions": [{
        "name": "fluentd-address",
        "valueFrom": "arn:aws:ssm:region:aws_account_id:parameter/parameter_name"
      }]
    }
  }
}

Creating an AWS Systems Manager Parameter Store parameter

You can use the AWS Systems Manager console to create a Systems Manager Parameter Store parameter
for your sensitive data. For more information, see Walkthrough: Create and Use a Parameter in a
Command (Console) in the AWS Systems Manager User Guide.

To create a Parameter Store parameter

1. Open the AWS Systems Manager console at https://console.aws.amazon.com/systems-manager/.

2. In the navigation pane, choose Parameter Store, Create parameter.
3. For Name, type a hierarchy and a parameter name. For example, type test/database_password.
4. For Description, type an optional description.
5. For Type, choose String, StringList, or SecureString.
Note

• If you choose SecureString, the KMS Key ID field appears. If you don't provide a KMS
CMK ID, a KMS CMK ARN, an alias name, or an alias ARN, then the system uses alias/
aws/ssm. This is the default KMS CMK for Systems Manager. To avoid using this key,
choose a custom key. For more information, see Use Secure String Parameters in the AWS
Systems Manager User Guide.
• When you create a secure string parameter in the console by using the key-id parameter
with either a custom KMS CMK alias name or an alias ARN, you must specify the prefix
alias/ before the alias. The following is an ARN example:

arn:aws:kms:us-east-2:123456789012:alias/MyAliasName

The following is an alias name example:

alias/MyAliasName

6. For Value, type a value. For example, MyFirstParameter. If you chose SecureString, the value is
masked as you enter it.
7. Choose Create parameter.

Amazon EFS volumes

Amazon Elastic File System (Amazon EFS) provides simple, scalable file storage for use with your AWS
Batch jobs. With Amazon EFS, storage capacity is elastic. It scales automatically as you add and remove
files. Your applications can have the storage they need, when they need it.

You can use Amazon EFS file systems with AWS Batch to export file system data across your fleet of
container instances. That way, your jobs have access to the same persistent storage. However, you must
configure your container instance AMI to mount the Amazon EFS file system before the Docker daemon
starts. Also, your job definitions must reference volume mounts on the container instance to use the file
system. The following sections help you get started using Amazon EFS with AWS Batch.

Amazon EFS volume considerations

The following should be considered when using Amazon EFS volumes:

• For jobs using EC2 resources, Amazon EFS file system support was added as a public preview with
Amazon ECS optimized AMI version 20191212 with container agent version 1.35.0. However,
Amazon EFS file system support entered general availability with Amazon ECS optimized AMI version
20200319 with container agent version 1.38.0, which contained the Amazon EFS access point and IAM
authorization features. We recommend that you use Amazon ECS optimized AMI version 20200319
or later to take advantage of these features. For more information, see Amazon ECS optimized AMI
versions in the Amazon Elastic Container Service Developer Guide.
Note
If you create your own AMI, you must use container agent 1.38.0 or later, ecs-init version
1.38.0-1 or later, and run the following commands on your Amazon EC2 instance. This is all
to enable the Amazon ECS volume plugin. The commands are dependent on whether you're
using Amazon Linux 2 or Amazon Linux as your base image.
Amazon Linux 2

$ sudo yum install amazon-efs-utils

$ sudo systemctl enable --now amazon-ecs-volume-plugin

Amazon Linux

$ sudo yum install amazon-efs-utils

$ sudo shutdown -r now

• For jobs using Fargate resources, Amazon EFS file system support was added when using platform
version 1.4.0 or later. For more information, see AWS Fargate platform versions in the Amazon Elastic
Container Service Developer Guide.
• When specifying Amazon EFS volumes in jobs using Fargate resources, Fargate creates a supervisor
container that is responsible for managing the Amazon EFS volume. The supervisor container uses
a small amount of the job's memory. The supervisor container is visible when querying the task
metadata version 4 endpoint. For more information, see Task metadata endpoint version 4 in the
Amazon Elastic Container Service User Guide for AWS Fargate.

Using Amazon EFS access points

Amazon EFS access points are application-specific entry points into an EFS file system that help you to
manage application access to shared datasets. For more information about Amazon EFS access points
and how to control access to them, see Working with Amazon EFS Access Points in the Amazon Elastic
File System User Guide.

Access points can enforce a user identity, including the user's POSIX groups, for all file system requests
that are made through the access point. Access points can also enforce a different root directory for the
file system so that clients can only access data in the specified directory or its subdirectories.
Note
When creating an EFS access point, you specify a path on the file system to serve as the root
directory. When you reference the EFS file system with an access point ID in your AWS Batch job
definition, the root directory must either be omitted or set to /. This enforces the path that's set
on the EFS access point.

You can use an AWS Batch job IAM role to enforce that specific applications use a specific access point.
By combining IAM policies with access points, you can easily provide secure access to specific datasets for
your applications. This feature uses Amazon ECS IAM roles for task functionality. For more information,
see IAM Roles for Tasks in the Amazon Elastic Container Service Developer Guide.

Specifying an Amazon EFS file system in your job definition

To use Amazon EFS file system volumes for your containers, you must specify the volume and mount
point configurations in your job definition. The following job definition JSON snippet shows the syntax
for the volumes and mountPoints objects for a container:

{
"containerProperties": {
"image": "amazonlinux:2",
"command": [
"ls",
"-la",
"/mount/efs"
],
"mountPoints": [
{
"sourceVolume": "myEfsVolume",
"containerPath": "/mount/efs",
"readOnly": true
}
],
"volumes": [
{
"name": "myEfsVolume",
"efsVolumeConfiguration": {
"fileSystemId": "fs-12345678",
"rootDirectory": "/path/to/my/data",
"transitEncryption": "ENABLED",
"transitEncryptionPort": integer,
"authorizationConfig": {
"accessPointId": "fsap-1234567890abcdef1",
"iam": "ENABLED"
}
}
}
]
}
}

efsVolumeConfiguration

Type: Object

Required: No

This parameter is specified when using Amazon EFS volumes.

fileSystemId

Type: String

Required: Yes

The Amazon EFS file system ID to use.

rootDirectory

Type: String

Required: No

The directory within the Amazon EFS file system to mount as the root directory inside the host.
If this parameter is omitted, the root of the Amazon EFS volume is used. Specifying / has the
same effect as omitting this parameter. It can be up to 4,096 characters in length.
Important
If an EFS access point is specified in the authorizationConfig, the root directory
parameter must either be omitted or set to /. This enforces the path that's set on the
EFS access point.
transitEncryption

Type: String

Valid values: ENABLED | DISABLED

Required: No

Determines whether to enable encryption for Amazon EFS data that's in transit between the
AWS Batch host and the Amazon EFS server. Transit encryption must be enabled if Amazon
EFS IAM authorization is used. If this parameter is omitted, the default value of DISABLED is
used. For more information, see Encrypting data in transit in the Amazon Elastic File System User
Guide.
transitEncryptionPort

Type: Integer

Required: No

The port to use when sending encrypted data between the AWS Batch host and the Amazon
EFS server. If you don't specify a transit encryption port, it uses the port selection strategy
that the Amazon EFS mount helper uses. The value must be between 0 and 65,535. For more
information, see EFS Mount Helper in the Amazon Elastic File System User Guide.
authorizationConfig

Type: Object

Required: No

The authorization configuration details for the Amazon EFS file system.
accessPointId

Type: String

Required: No

The access point ID to use. If an access point is specified, the root directory value in the
efsVolumeConfiguration must either be omitted or set to /. This enforces the path
that's set on the EFS access point. If an access point is used, transit encryption must be
enabled in the efsVolumeConfiguration. For more information, see Working with
Amazon EFS Access Points in the Amazon Elastic File System User Guide.
iam

Type: String

Valid values: ENABLED | DISABLED

Required: No

Determines whether to use the AWS Batch job IAM role that's defined in a job definition
when mounting the Amazon EFS file system. If enabled, transit encryption must be
enabled in the efsVolumeConfiguration. If this parameter is omitted, the default value
of DISABLED is used. For more information about execution IAM roles, see AWS Batch
execution IAM role (p. 176).
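
The constraints in the parameter descriptions above can be checked locally before a job definition is registered. The following Python sketch is an illustration only: check_efs_volume_configuration is a hypothetical helper that covers just the rules stated in this section, not the full API schema.

```python
def check_efs_volume_configuration(config):
    """Return a list of problems found in an efsVolumeConfiguration dict.

    Only the constraints described in this section are checked.
    """
    problems = []
    if not config.get("fileSystemId"):
        problems.append("fileSystemId is required")

    auth = config.get("authorizationConfig", {})
    encryption = config.get("transitEncryption", "DISABLED")
    root = config.get("rootDirectory")

    if auth.get("accessPointId") and root not in (None, "/"):
        problems.append("rootDirectory must be omitted or '/' with an access point")
    if auth.get("accessPointId") and encryption != "ENABLED":
        problems.append("transitEncryption must be ENABLED with an access point")
    if auth.get("iam", "DISABLED") == "ENABLED" and encryption != "ENABLED":
        problems.append("transitEncryption must be ENABLED with IAM authorization")

    port = config.get("transitEncryptionPort")
    if port is not None and not 0 <= port <= 65535:
        problems.append("transitEncryptionPort must be between 0 and 65535")
    if root is not None and len(root) > 4096:
        problems.append("rootDirectory can be at most 4,096 characters")
    return problems


# Placeholder IDs. The access point and the non-root directory conflict.
example = {
    "fileSystemId": "fs-12345678",
    "rootDirectory": "/path/to/my/data",
    "transitEncryption": "ENABLED",
    "authorizationConfig": {"accessPointId": "fsap-1234567890abcdef1", "iam": "ENABLED"},
}
print(check_efs_volume_configuration(example))
```

Removing rootDirectory from the example, or setting it to /, leaves the list of problems empty.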

Example job definitions

The following example job definitions illustrate how to use common patterns such as environment
variables, parameter substitution, and volume mounts.

Use environment variables

The following example job definition uses environment variables to specify a file type and Amazon S3
URL. This particular example is from the Creating a Simple "Fetch & Run" AWS Batch Job compute blog
post. The fetch_and_run.sh script that's described in the blog post uses these environment variables
to download the myjob.sh script from S3 and declare its file type.

Even though the command and environment variables are hardcoded into the job definition in this
example, you can specify command and environment variable overrides to make the job definition more
versatile.

{
"jobDefinitionName": "fetch_and_run",
"type": "container",
"containerProperties": {
"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/fetch_and_run",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "2000"
},
{
"type": "VCPU",
"value": "2"
}
],
"command": [
"myjob.sh",
"60"
],
"jobRoleArn": "arn:aws:iam::123456789012:role/AWSBatchS3ReadOnly",
"environment": [
{
"name": "BATCH_FILE_S3_URL",
"value": "s3://my-batch-scripts/myjob.sh"
},
{
"name": "BATCH_FILE_TYPE",
"value": "script"
}
],
"user": "nobody"
}
}
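
As a sketch of the overrides mentioned above, the following Python builds SubmitJob-style request parameters that replace the command and the BATCH_FILE_S3_URL value at submission time. The helper name, job name, queue name, and script URL are assumptions made for illustration.

```python
import json


def fetch_and_run_request(job_name, job_queue, script_url, args):
    """Build request parameters that override the command and the
    BATCH_FILE_S3_URL variable of the fetch_and_run job definition."""
    script_name = script_url.rsplit("/", 1)[-1]
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": "fetch_and_run",
        "containerOverrides": {
            "command": [script_name, *args],
            "environment": [
                {"name": "BATCH_FILE_S3_URL", "value": script_url},
            ],
        },
    }


# Hypothetical job, queue, and script names.
request = fetch_and_run_request(
    "nightly_report", "my_queue", "s3://my-batch-scripts/report.sh", ["120"]
)
print(json.dumps(request, indent=2))
```

A dictionary of this shape could be passed to an SDK call such as the boto3 Batch client's submit_job(**request). BATCH_FILE_TYPE isn't overridden here, so the value from the job definition would still apply.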

Using parameter substitution

The following example job definition illustrates how to allow for parameter substitution and to set
default values.

The Ref:: declarations in the command section are used to set placeholders for parameter substitution.
When you submit a job with this job definition, you specify the parameter overrides to fill in those
values, such as the inputfile and outputfile. The parameters section that follows sets a default
for codec, but you can override that parameter as needed.

For more information, see Parameters (p. 44).

{
"jobDefinitionName": "ffmpeg_parameters",
"type": "container",
"parameters": {"codec": "mp4"},
"containerProperties": {
"image": "my_repo/ffmpeg",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "2000"
},
{
"type": "VCPU",
"value": "2"
}
],
"command": [
"ffmpeg",
"-i",
"Ref::inputfile",
"-c",
"Ref::codec",
"-o",
"Ref::outputfile"
],
"jobRoleArn": "arn:aws:iam::123456789012:role/ECSTask-S3FullAccess",
"user": "nobody"
}
}
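
To make the substitution concrete, the following Python imitates locally how the Ref:: placeholders in the preceding command could be resolved from the defaults and the overrides given at submission. It's a sketch for illustration, not the service's implementation; resolve_command and the file names are hypothetical.

```python
def resolve_command(command, defaults, overrides=None):
    """Replace Ref::name placeholders using defaults merged with overrides.

    Local imitation only. Placeholders without a value are left unchanged
    here; the service itself may treat them differently.
    """
    values = {**defaults, **(overrides or {})}
    resolved = []
    for token in command:
        if token.startswith("Ref::"):
            key = token[len("Ref::"):]
            resolved.append(values.get(key, token))
        else:
            resolved.append(token)
    return resolved


command = ["ffmpeg", "-i", "Ref::inputfile", "-c", "Ref::codec", "-o", "Ref::outputfile"]
defaults = {"codec": "mp4"}  # from the "parameters" section of the job definition
overrides = {"inputfile": "in.mov", "outputfile": "out.mp4"}  # given at submission
print(resolve_command(command, defaults, overrides))
```

An override that includes codec would replace the default, because overrides are merged after the defaults.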

Test GPU functionality

The following example job definition tests if the GPU workload AMI described in Using a GPU workload
AMI (p. 92) is configured properly. This example job definition runs the TensorFlow deep MNIST
classifier example from GitHub.

{
"containerProperties": {
"image": "tensorflow/tensorflow:1.8.0-devel-gpu",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "32000"
},
{
"type": "VCPU",
"value": "8"
}
],
"command": [
"sh",
"-c",
"cd /tensorflow/tensorflow/examples/tutorials/mnist; python mnist_deep.py"
]
},
"type": "container",
"jobDefinitionName": "tensorflow_mnist_deep"
}

You can create a file with the preceding JSON text called tensorflow_mnist_deep.json and then
register an AWS Batch job definition with the following command:

aws batch register-job-definition --cli-input-json file://tensorflow_mnist_deep.json

Multi-node parallel job

The following example job definition illustrates a multi-node parallel job. For more information, see
Building a tightly coupled molecular dynamics workflow with multi-node parallel jobs in AWS Batch in
the AWS Compute blog.

{
"jobDefinitionName": "gromacs-jobdef",
"jobDefinitionArn": "arn:aws:batch:us-east-2:123456789012:job-definition/gromacs-jobdef:1",
"revision": 6,
"status": "ACTIVE",
"type": "multinode",
"parameters": {},
"nodeProperties": {
"numNodes": 2,
"mainNode": 0,
"nodeRangeProperties": [
{
"targetNodes": "0:1",
"container": {
"image": "123456789012.dkr.ecr.us-east-2.amazonaws.com/gromacs_mpi:latest",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "24000"
},
{
"type": "VCPU",
"value": "8"
}
],
"command": [],
"jobRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"ulimits": [],
"instanceType": "p3.2xlarge"
}
}
]
}
}
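
The targetNodes value in the preceding example is a range of node indexes. The following Python sketch expands such a range so that it can be compared with numNodes. It assumes the "start:end" form shown above, and that an omitted start means node 0 and an omitted end means the highest node index. expand_target_nodes is a hypothetical helper.

```python
def expand_target_nodes(target_nodes, num_nodes):
    """Expand a "start:end" node range into a list of node indexes."""
    if ":" not in target_nodes:
        raise ValueError(f"expected a 'start:end' range, got {target_nodes!r}")
    start_text, _, end_text = target_nodes.partition(":")
    start = int(start_text) if start_text else 0
    end = int(end_text) if end_text else num_nodes - 1
    if not 0 <= start <= end < num_nodes:
        raise ValueError(f"range {target_nodes!r} does not fit {num_nodes} nodes")
    return list(range(start, end + 1))


node_properties = {"numNodes": 2, "mainNode": 0, "targetNodes": "0:1"}
nodes = expand_target_nodes(node_properties["targetNodes"], node_properties["numNodes"])
print(nodes)
print(node_properties["mainNode"] in nodes)
```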

Job queues
Jobs are submitted to a job queue where they reside until they can be scheduled to run in a compute
environment. An AWS account can have multiple job queues. For example, you can create a queue that
uses Amazon EC2 On-Demand Instances for high-priority jobs and another queue that uses Amazon EC2
Spot Instances for low-priority jobs. Job queues have a priority that's used by the scheduler to determine
which jobs in which queue should be evaluated for execution first.

Creating a job queue

Before you can submit jobs in AWS Batch, you must create a job queue. When you create a job queue,
you associate one or more compute environments to the queue and assign an order of preference for the
compute environments.

You also set a priority for the job queue that determines the order in which the AWS Batch scheduler
places jobs onto its associated compute environments. For example, if a compute environment is
associated with more than one job queue, the job queue with a higher priority is given preference for
scheduling jobs to that compute environment.

To create a job queue

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.

2. From the navigation bar, select the Region to use.
3. In the navigation pane, choose Job queues, Create.
4. For Job queue name, enter a unique name for your job queue. Up to 128 letters (uppercase and
lowercase), numbers, and underscores are allowed.
5. For Priority, enter an integer value for the job queue's priority. Job queues with a higher priority (or
a higher integer value for the priority parameter) are evaluated first when associated with the
same compute environment. Priority is determined in descending order. For example, a job queue
with a priority value of 10 is given scheduling preference over a job queue with a priority value of 1.
6. (Optional) Expand Additional configuration.

• For State, select Enabled so that your job queue can accept job submissions.
7. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
job queue. For more information, see Tagging your AWS Batch resources (p. 197).
8. In the Connected compute environments section, select one or more compute environments from
the list to associate with the job queue, in the order that the queue should attempt placement. The
job scheduler uses compute environment order to determine which compute environment should
start a given job. Compute environments must be in the VALID state before you can associate them
with a job queue. You can associate up to three compute environments with a job queue.
Note
All compute environments that are associated with a job queue must share the same
provisioning model, either EC2 (On-Demand and Spot) or Fargate (Fargate and Fargate
Spot). AWS Batch doesn't support mixing provisioning models in a single job queue.
Note
All compute environments that are associated with a job queue must share the same
architecture. AWS Batch doesn't support mixing compute environment architecture types in
a single job queue.
You can change the order of compute environments by choosing the up and down arrows next to
the Order column in the table.
9. Choose Create to finish and create your job queue.

Job queue template

An empty job queue template is shown below. You can fill in this template, save it to a file, and then
use it with the AWS CLI --cli-input-json option. For more information
about these parameters, see CreateJobQueue in the AWS Batch API Reference.

{
"jobQueueName": "",
"state": "DISABLED",
"priority": 0,
"computeEnvironmentOrder": [
{
"order": 0,
"computeEnvironment": ""
}
],
"tags": {
"KeyName": ""
}
}

Note
You can generate the preceding job queue template with the following AWS CLI command.

$ aws batch create-job-queue --generate-cli-skeleton
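
As an alternative to editing the template by hand, the following Python sketch fills it in and applies the limits described in this guide: a name of up to 128 letters, numbers, and underscores, and at most three compute environments. The helper name, queue name, and compute environment names are hypothetical.

```python
import json
import re

# Naming rule as stated in this guide.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9_]{1,128}$")


def job_queue_request(name, priority, compute_environments, state="ENABLED"):
    """Fill in the job queue template.

    compute_environments is a list of names or ARNs, first choice first.
    """
    if not NAME_PATTERN.match(name):
        raise ValueError("name: up to 128 letters, numbers, and underscores")
    if not 1 <= len(compute_environments) <= 3:
        raise ValueError("associate between one and three compute environments")
    return {
        "jobQueueName": name,
        "state": state,
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": index, "computeEnvironment": environment}
            for index, environment in enumerate(compute_environments, start=1)
        ],
    }


queue = job_queue_request("high_priority", 10, ["ce_on_demand", "ce_spot"])
print(json.dumps(queue, indent=2))
```

The printed JSON can be saved to a file and supplied to aws batch create-job-queue with the --cli-input-json option described above.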

Job queue parameters

Job queues are split into four basic components: the name, state, and priority of the job queue, and the
compute environment order.

Job queue name

jobQueueName

The name for your job queue. Up to 128 letters (uppercase and lowercase), numbers, and
underscores are allowed.

Type: String

Required: Yes

Priority
priority

The priority of the job queue. Job queues with a higher priority (or a higher integer value for the
priority parameter) are evaluated first when associated with the same compute environment. Priority
is determined in descending order. For example, a job queue with a priority value of 10 is given
scheduling preference over a job queue with a priority value of 1. All of the compute environments
must be either EC2 (EC2 or SPOT) or Fargate (FARGATE or FARGATE_SPOT); EC2 and Fargate
compute environments can't be mixed.

Type: Integer

Required: Yes

Scheduling policy
schedulingPolicyArn

The Amazon Resource Name (ARN) of the scheduling policy for the job queue. Job queues that
don't have a scheduling policy are scheduled in a first-in, first-out (FIFO) model. After a job queue
has a scheduling policy, it can be replaced but can't be removed. A job queue without a scheduling
policy is scheduled as a FIFO job queue and can't have a scheduling policy added. Job queues with
a scheduling policy can have a maximum of 500 active fair share identifiers. When the limit has been
reached, submissions of any jobs that add a new fair share identifier fail.

Type: String

Required: No

State
state

The state of the job queue. If the job queue state is ENABLED (the default value), it can accept jobs.
If the job queue state is DISABLED, new jobs can't be added to the queue, but jobs already in the
queue can finish.

Type: String

Valid values: ENABLED | DISABLED

Required: No

Compute environment order

computeEnvironmentOrder

The set of compute environments mapped to a job queue and their order relative to each other. The
job scheduler uses this parameter to determine which compute environment should run a specific
job. Compute environments must be in the VALID state before you can associate them with a job
queue. You can associate up to three compute environments with a job queue. All of the compute
environments must be either EC2 (EC2 or SPOT) or Fargate (FARGATE or FARGATE_SPOT); EC2 and
Fargate compute environments can't be mixed.
Note
All compute environments that are associated with a job queue must share the same
architecture. AWS Batch doesn't support mixing compute environment architecture types in
a single job queue.

Type: Array of ComputeEnvironmentOrder objects

Required: Yes
computeEnvironment

The Amazon Resource Name (ARN) of the compute environment.

Type: String

Required: Yes
order

The order of the compute environment. Compute environments are tried in ascending order.
For example, if two compute environments are associated with a job queue, the compute
environment with a lower order integer value is tried for job placement ﬁrst.
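
The two orderings described in this section run in opposite directions: job queues are preferred by descending priority, and compute environments are tried by ascending order. A minimal Python sketch follows, with hypothetical helper names and made-up queue and environment names.

```python
def evaluation_order(job_queues):
    """Queues that share a compute environment: higher priority first."""
    return sorted(job_queues, key=lambda queue: queue["priority"], reverse=True)


def placement_order(compute_environment_order):
    """Compute environments of one queue: lower order value first."""
    ranked = sorted(compute_environment_order, key=lambda entry: entry["order"])
    return [entry["computeEnvironment"] for entry in ranked]


queues = [
    {"jobQueueName": "low", "priority": 1},
    {"jobQueueName": "high", "priority": 10},
]
environments = [
    {"order": 2, "computeEnvironment": "ce_spot"},
    {"order": 1, "computeEnvironment": "ce_on_demand"},
]
print([queue["jobQueueName"] for queue in evaluation_order(queues)])
print(placement_order(environments))
```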

Tags
tags

Key-value pair tags to associate with the job queue. For more information, see Tagging your AWS
Batch resources (p. 197).

Type: String to string map

Required: No

Job Scheduling
The AWS Batch scheduler evaluates when, where, and how to run jobs that have been submitted to a job
queue. If you need to guarantee the order that jobs are run, use the dependsOn parameter of the SubmitJob
operation to specify the dependencies for each job.

By default, jobs run in approximately the order in which they are submitted (first in, first out), as
long as all dependencies on other jobs have been met. If the job queue has a scheduling policy, the
scheduling policy determines the order in which jobs are run. For more information, see Scheduling
policies (p. 116).
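
The following Python sketch shows one way to build a strict run order with dependsOn: each job is submitted with a dependency on the job before it. The submit argument is an assumption. It stands for any callable that accepts SubmitJob-style keyword arguments and returns a dictionary containing jobId. A stand-in is used here so that the sketch runs without AWS access, and the job, queue, and job definition names are hypothetical.

```python
def chain_jobs(job_names, job_queue, job_definition, submit):
    """Submit jobs so that each one depends on the previous one."""
    previous_id = None
    job_ids = []
    for name in job_names:
        request = {
            "jobName": name,
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
        }
        if previous_id is not None:
            request["dependsOn"] = [{"jobId": previous_id}]
        previous_id = submit(**request)["jobId"]
        job_ids.append(previous_id)
    return job_ids


# Stand-in for a real client call; it records requests and invents job IDs.
recorded = []


def fake_submit(**request):
    recorded.append(request)
    return {"jobId": f"job-{len(recorded)}"}


print(chain_jobs(["extract", "transform", "load"], "my_queue", "etl_step", fake_submit))
```

With an SDK, a function such as the boto3 Batch client's submit_job could be passed as submit in place of the stand-in.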

Compute environment
Job queues are mapped to one or more compute environments. Compute environments contain the
Amazon ECS container instances that are used to run containerized batch jobs. A specific compute
environment can also be mapped to one or more job queues. Within a job queue, the associated
compute environments each have an order that's used by the scheduler to determine where jobs that
are ready to be run should run. If the first compute environment has a status of VALID and has available
resources, the job is scheduled to a container instance within that compute environment. If the first
compute environment has a status of INVALID or can't provide a suitable compute resource, the
scheduler attempts to run the job on the next compute environment.
Note
AWS Batch does not support Windows containers, on either Fargate or EC2 resources.
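
The fallback behavior described above can be pictured with a short Python sketch. It's a simplification and not the scheduler's actual logic: the available_vcpus field is a made-up stand-in for whatever capacity check the scheduler performs, and the environment names are hypothetical.

```python
def choose_compute_environment(ordered_environments, vcpus_needed):
    """Return the first environment that is VALID and has room, else None.

    ordered_environments must already be in the queue's placement order.
    """
    for environment in ordered_environments:
        is_valid = environment["status"] == "VALID"
        has_room = environment["available_vcpus"] >= vcpus_needed
        if is_valid and has_room:
            return environment["name"]
    return None


environments = [
    {"name": "ce_first", "status": "INVALID", "available_vcpus": 64},
    {"name": "ce_second", "status": "VALID", "available_vcpus": 2},
    {"name": "ce_third", "status": "VALID", "available_vcpus": 16},
]
print(choose_compute_environment(environments, vcpus_needed=8))
```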

Topics
• Managed compute environments (p. 87)
• Unmanaged compute environments (p. 88)
• Compute resource AMIs (p. 88)
• Launch template support (p. 96)
• Creating a compute environment (p. 99)
• Compute environment template (p. 104)
• Compute environment parameters (p. 105)
• EC2 Configurations (p. 113)
• Allocation strategies (p. 113)
• Compute Resource Memory Management (p. 114)

Managed compute environments

You can use managed compute environments to meet business requirements. In a managed compute
environment, AWS Batch helps you to manage the capacity and instance types of the compute resources
within the environment. This is based on the compute resource specification that you define when you
create the compute environment. You can choose to use either EC2 On-Demand Instances and EC2 Spot
Instances, or Fargate and Fargate Spot capacity, in your managed compute
environment. You can optionally set a maximum price so that Spot Instances only launch when the Spot
Instance price is under a specified percentage of the On-Demand price.
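
The percentage rule is simple arithmetic, sketched below in Python. The prices are illustrative values in USD per instance hour, not real quotes, and the helper names are hypothetical. Decimal is used to avoid floating-point rounding in the comparison.

```python
from decimal import Decimal


def spot_price_ceiling(on_demand_price, percentage):
    """Price that the Spot price must stay under for instances to launch."""
    return Decimal(str(on_demand_price)) * Decimal(percentage) / Decimal(100)


def spot_can_launch(spot_price, on_demand_price, percentage):
    """True when the Spot price is under the given share of On-Demand."""
    return Decimal(str(spot_price)) < spot_price_ceiling(on_demand_price, percentage)


# With an On-Demand price of 0.10 and a 60 percent limit, the ceiling is 0.06.
print(spot_price_ceiling("0.10", 60))
print(spot_can_launch("0.05", "0.10", 60))
print(spot_can_launch("0.07", "0.10", 60))
```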

Managed compute environments launch Amazon ECS container instances into the VPC and subnets that
you specify when you create the compute environment. Amazon ECS container instances need external
network access to communicate with the Amazon ECS service endpoint. Some subnets don't provide
container instances with public IP addresses. If your container instances don't have public IP addresses,
they must use network address translation (NAT) to gain this access. For more information, see NAT
gateways in the Amazon VPC User Guide. For more information about how to create a VPC, see Tutorial:
Creating a VPC with Public and Private Subnets for Your Compute Environments (p. 165).

By default, AWS Batch managed compute environments use a recent, approved version of the Amazon
ECS optimized AMI for compute resources. However, you might want to create your own AMI to use for
your managed compute environments for various reasons. For more information, see Compute resource
AMIs (p. 88).
Note
AWS Batch doesn't upgrade the AMIs in a compute environment after it's created. For example,
it doesn't update the AMIs in your compute environment when a newer version of the
Amazon ECS optimized AMI is available. You're responsible for the management of the guest
operating system. This includes any updates and security patches. You're also responsible for
any additional application software or utilities that you install on the compute resources. To use
a new AMI for your AWS Batch jobs:

1. Create a new compute environment with the new AMI.

2. Add the compute environment to an existing job queue.
3. Remove the earlier compute environment from your job queue.
4. Delete the earlier compute environment.
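
Steps 2 and 3 above amount to two edits of the job queue's compute environment order. The following Python sketch shows them on a plain list. The helper names are hypothetical and the ARNs are shortened placeholders.

```python
def add_environment(order, new_arn):
    """Step 2: append the new compute environment as the last choice."""
    next_order = max((entry["order"] for entry in order), default=0) + 1
    return order + [{"order": next_order, "computeEnvironment": new_arn}]


def remove_environment(order, old_arn):
    """Step 3: drop the earlier compute environment from the list."""
    return [entry for entry in order if entry["computeEnvironment"] != old_arn]


current = [{"order": 1, "computeEnvironment": "ce-old-ami"}]
with_both = add_environment(current, "ce-new-ami")
only_new = remove_environment(with_both, "ce-old-ami")
print(with_both)
print(only_new)
```

Each resulting list could be supplied as the computeEnvironmentOrder of an UpdateJobQueue request. Adding the new environment before removing the old one means the queue always has at least one compute environment.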

Unmanaged compute environments

In an unmanaged compute environment, you manage your own compute resources. You must verify that
the AMI you use for your compute resources meets the Amazon ECS container instance AMI specification.
For more information, see Compute resource AMI specification (p. 89) and Creating a compute
resource AMI (p. 90).
Note
AWS Fargate resources aren't supported in unmanaged compute environments.

After you create your unmanaged compute environment, use the DescribeComputeEnvironments API
operation to view the compute environment details. Find the Amazon ECS cluster that's associated with
the environment and then manually launch your container instances into that Amazon ECS cluster.

The following AWS CLI command also provides the Amazon ECS cluster ARN:

$ aws batch describe-compute-environments \
--compute-environments unmanagedCE \
--query "computeEnvironments[].ecsClusterArn"

For more information, see Launching an Amazon ECS container instance in the Amazon Elastic Container
Service Developer Guide. When you launch your compute resources, use the following Amazon EC2 user
data to specify the Amazon ECS cluster that the resources should register with. Replace ecsClusterArn
with the cluster ARN you obtained with the previous command.

#!/bin/bash
echo "ECS_CLUSTER=ecsClusterArn" >> /etc/ecs/ecs.config
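
If you launch instances from a script, the user data above can be generated rather than edited by hand. The following Python sketch does only the text substitution. ecs_user_data is a hypothetical helper and the cluster ARN is a made-up placeholder.

```python
def ecs_user_data(cluster):
    """Return the user data script that registers an instance with cluster."""
    return (
        "#!/bin/bash\n"
        f'echo "ECS_CLUSTER={cluster}" >> /etc/ecs/ecs.config\n'
    )


# Placeholder ARN; use the value returned by describe-compute-environments.
script = ecs_user_data("arn:aws:ecs:us-east-1:123456789012:cluster/example_cluster")
print(script)
```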

Compute resource AMIs

By default, AWS Batch managed compute environments use a recent, approved version of the Amazon
ECS optimized AMI for compute resources. However, you might want to consider creating your own AMI
to use for your managed and unmanaged compute environments. You might do this if you need to do
any of the following:

• Increase the storage size of your AMI root or data volumes

• Add instance storage volumes for supported Amazon EC2 instance types
• Configure the Amazon ECS container agent with custom options
• Configure Docker to use custom options
• Configure a GPU workload AMI that allows containers to access GPU hardware on supported Amazon
EC2 instance types

Note
AWS Batch doesn't upgrade the AMIs in a compute environment after it's created. For example,
it doesn't update the AMIs in your compute environment when a newer version of the
Amazon ECS optimized AMI is available. You're responsible for the management of the guest
operating system. This includes any updates and security patches. You're also responsible for
any additional application software or utilities that you install on the compute resources. To use
a new AMI for your AWS Batch jobs:

1. Create a new compute environment with the new AMI.

2. Add the compute environment to an existing job queue.
3. Remove the earlier compute environment from your job queue.
4. Delete the earlier compute environment.

Topics
• Compute resource AMI specification (p. 89)
• Creating a compute resource AMI (p. 90)
• Using a GPU workload AMI (p. 92)

Compute resource AMI specification

The basic AWS Batch compute resource AMI specification consists of the following items:

Required

• A modern Linux distribution that's running at least version 3.10 of the Linux kernel on an HVM
virtualization type AMI. Windows containers are not supported.
Important
Multi-node parallel jobs can only run on compute resources that were launched on an Amazon
Linux instance with the ecs-init package installed. We recommend that you use the default
Amazon ECS optimized AMI when you create your compute environment. You can do this by
not specifying a custom AMI. For more information, see Multi-node Parallel Jobs (p. 27).
• The Amazon ECS container agent. We recommend that you use the latest version. For more
information, see Installing the Amazon ECS Container Agent in the Amazon Elastic Container Service
Developer Guide.
• The awslogs log driver must be specified as an available log driver with the
ECS_AVAILABLE_LOGGING_DRIVERS environment variable when the Amazon ECS container agent is
started. For more information, see Amazon ECS Container Agent Configuration in the Amazon Elastic
Container Service Developer Guide.
• A Docker daemon that's running at least version 1.9, and any Docker runtime dependencies. For more
information, see Check runtime dependencies in the Docker documentation.
Note
For a better experience, we recommend the Docker version that ships with and is tested with
the corresponding Amazon ECS agent version that you're using. For more information, see
Amazon ECS Container Agent Versions in the Amazon Elastic Container Service Developer
Guide.

Recommended

• An initialization and nanny process to run and monitor the Amazon ECS agent. The Amazon ECS
optimized AMI uses the ecs-init upstart process, and other operating systems might use systemd.
To view several example user data configuration scripts that use systemd to start and monitor the
Amazon ECS container agent, see Example container instance User Data Configuration Scripts in the
Amazon Elastic Container Service Developer Guide. For more information about ecs-init, see the
ecs-init project on GitHub. At a minimum, managed compute environments require the Amazon
ECS agent to start at boot. If the Amazon ECS agent isn't running on your compute resource, then it
can't accept jobs from AWS Batch.

The Amazon ECS optimized AMI is preconfigured with these requirements and recommendations.
We recommend that you use the Amazon ECS optimized AMI or an Amazon Linux AMI with the ecs-init
package installed for your compute resources. You should choose another AMI if your application
requires a specific operating system or a Docker version that's not yet available in those AMIs. For more
information, see Amazon ECS-Optimized AMI in the Amazon Elastic Container Service Developer Guide.
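
The two version minimums in the Required list (Linux kernel 3.10 and Docker 1.9) can be checked with a small script. The following Python sketch compares only the major and minor numbers. The helper names and sample version strings are illustrative, and the check doesn't cover the other requirements, such as the container agent or the awslogs log driver.

```python
import re

MIN_KERNEL = (3, 10)
MIN_DOCKER = (1, 9)


def major_minor(version_text):
    """Extract (major, minor) from a string such as "4.14.256-197.amzn2"."""
    match = re.match(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognized version: {version_text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_version_minimums(kernel_version, docker_version):
    """True when both versions satisfy the minimums listed in this section."""
    kernel_ok = major_minor(kernel_version) >= MIN_KERNEL
    docker_ok = major_minor(docker_version) >= MIN_DOCKER
    return kernel_ok and docker_ok


print(meets_version_minimums("4.14.256-197.484.amzn2.x86_64", "20.10.7"))
print(meets_version_minimums("3.10.0", "1.8.3"))
```

Comparing tuples of integers avoids the mistake of comparing version strings as text, where "1.10" would sort before "1.9".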

Creating a compute resource AMI

You can create your own custom compute resource AMI to use for your managed and unmanaged
compute environments, provided that you follow the Compute resource AMI specification (p. 89).
After you create your custom AMI, you can create a compute environment that uses that AMI,
associate it with a job queue, and then start submitting jobs to that queue.

To create a custom compute resource AMI

1. Choose a base AMI to start from. The base AMI must use HVM virtualization, and it can't be a
Windows AMI.
Note
The AMI that you choose for a compute environment must match the architecture of the
instance types that you intend to use for that compute environment. For example, if your
compute environment uses A1 instance types, the compute resource AMI that you choose
must support ARM instances. Amazon ECS vends both x86 and ARM versions of the Amazon
ECS optimized Amazon Linux 2 AMI. For more information, see Amazon ECS optimized
Amazon Linux 2 AMI in the Amazon Elastic Container Service Developer Guide.

The Amazon ECS optimized Amazon Linux 2 AMI is the default AMI for compute resources in
managed compute environments. It's preconfigured and tested on AWS Batch by AWS engineers,
and it's the simplest AMI to start with to get your compute resources running on AWS
quickly. For more information, see Amazon ECS Optimized AMI in the Amazon Elastic Container
Service Developer Guide.

Alternatively, you can choose another Amazon Linux 2 variant and install the ecs-init package
with the commands below. For more information, see Installing the Amazon ECS container agent on
an Amazon Linux 2 EC2 instance in the Amazon Elastic Container Service Developer Guide:

$ sudo amazon-linux-extras disable docker

$ sudo amazon-linux-extras install ecs-init

For example, if you want to run GPU workloads on your AWS Batch compute resources, you can start
with the Amazon Linux Deep Learning AMI and configure it to be able to run AWS Batch jobs. For
more information, see Using a GPU workload AMI (p. 92).
Important
If you choose a base AMI that doesn't support the ecs-init package, you must configure
a way to start the Amazon ECS agent at boot and keep it running. To view several example
user data configuration scripts that use systemd to start and monitor the Amazon ECS
container agent, see Example container instance user data configuration scripts in the
Amazon Elastic Container Service Developer Guide.
2. Launch an instance from your selected base AMI with the appropriate storage options for your
AMI. You can configure the size and number of attached Amazon EBS volumes, or instance storage
volumes if the instance type you've selected supports them. For more information, see Launching an
Instance and Amazon EC2 Instance Store in the Amazon EC2 User Guide for Linux Instances.
3. Connect to your instance with SSH and perform any necessary configuration tasks. This might
include any or all of the following steps:

• Installing the Amazon ECS container agent. For more information, see Installing the Amazon ECS
Container Agent in the Amazon Elastic Container Service Developer Guide.
• Configuring a script to format instance store volumes.
• Adding instance store volume or Amazon EFS file systems to the /etc/fstab file so that they're
mounted at boot.
• Configuring Docker options, such as enabling debugging or adjusting base image size.
• Installing packages or copying files.

For more information, see Connecting to Your Linux Instance Using SSH in the Amazon EC2 User
Guide for Linux Instances.
4. If you started the Amazon ECS container agent on your instance, you must stop it and remove any
persistent data checkpoint files before creating your AMI. Otherwise, the agent doesn't start on
instances that are launched from your AMI.

a. Stop the Amazon ECS container agent.

• Amazon ECS-optimized Amazon Linux 2 AMI:

sudo systemctl stop ecs

• Amazon ECS-optimized Amazon Linux AMI:

sudo stop ecs

b. Remove the persistent data checkpoint files. By default, these files are located in the /var/
lib/ecs/data/ directory. Use the following command to remove any such files.

sudo rm -rf /var/lib/ecs/data/*

5. Create a new AMI from your running instance. For more information, see Creating an Amazon EBS-
Backed Linux AMI in the Amazon EC2 User Guide for Linux Instances.
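
Step 5 can also be scripted. The following is a minimal sketch, assuming the boto3 SDK and valid AWS credentials. The instance ID and AMI name are placeholders, and build_create_image_request and create_ami are hypothetical helper names that aren't part of AWS Batch.

```python
import json


def build_create_image_request(instance_id, ami_name):
    """Build the keyword arguments for the Amazon EC2 CreateImage operation."""
    return {
        "InstanceId": instance_id,
        "Name": ami_name,
        "Description": "Custom AWS Batch compute resource AMI",
        # Allow EC2 to reboot the instance so the file system is consistent.
        "NoReboot": False,
    }


def create_ami(request, region="us-east-2"):
    """Call CreateImage and return the new AMI ID (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without the SDK

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_image(**request)["ImageId"]


if __name__ == "__main__":
    # The instance ID below is a placeholder, not a real resource.
    request = build_create_image_request("i-0123456789abcdef0", "my-batch-ami")
    print(json.dumps(request, indent=2))
```

Only create_ami contacts AWS. Keeping the request builder separate makes it possible to review the request before any resources are created.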

To use your new AMI with AWS Batch

1. After the AMI is created, create a compute environment with your new AMI. Make sure that you
select Enable user-specified AMI ID and specify your custom AMI ID in Step 5.h.iii (p. 102). For
more information, see Creating a compute environment (p. 99).
Note
The AMI that you choose for a compute environment must match the architecture of the
instance types that you intend to use for that compute environment. For example, if your
compute environment uses A1 instance types, the compute resource AMI that you choose
must support ARM instances. Amazon ECS vends both x86 and ARM versions of the Amazon
ECS optimized Amazon Linux 2 AMI. For more information, see Amazon ECS optimized
Amazon Linux 2 AMI in the Amazon Elastic Container Service Developer Guide.


2. Create a job queue and associate your new compute environment. For more information, see
Creating a job queue (p. 82).
Note
All compute environments that are associated with a job queue must share the same
architecture. AWS Batch doesn't support mixing compute environment architecture types in
a single job queue.
3. (Optional) Submit a sample job to your new job queue. For more information, see Example job
definitions (p. 79), Creating a job definition (p. 31), and Submitting a Job (p. 14).
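
Step 2 of this procedure can be scripted as well. The following is a minimal sketch, assuming the boto3 SDK. The queue name, priority, and compute environment name are placeholders, and build_job_queue_request and create_job_queue are hypothetical helper names.

```python
import json


def build_job_queue_request(queue_name, compute_environments, priority=1):
    """Build keyword arguments for the AWS Batch CreateJobQueue operation.

    compute_environments is a list of names or ARNs, in the order in which
    AWS Batch should try them.
    """
    if not compute_environments:
        raise ValueError("a job queue needs at least one compute environment")
    return {
        "jobQueueName": queue_name,
        "state": "ENABLED",
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": index, "computeEnvironment": name}
            for index, name in enumerate(compute_environments, start=1)
        ],
    }


def create_job_queue(request, region="us-east-2"):
    """Call CreateJobQueue and return the queue ARN (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without the SDK

    batch = boto3.client("batch", region_name=region)
    return batch.create_job_queue(**request)["jobQueueArn"]


if __name__ == "__main__":
    # "custom-ami-ce" is a placeholder compute environment name.
    print(json.dumps(build_job_queue_request("custom-ami-queue", ["custom-ami-ce"]), indent=2))
```

Remember that every compute environment listed for one queue must use the same architecture.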

Using a GPU workload AMI

To run GPU workloads on your AWS Batch compute resources, you must use an AMI with GPU support.
For more information, see Working with GPUs on Amazon ECS and Amazon ECS-optimized AMIs in
the Amazon Elastic Container Service Developer Guide.

In managed compute environments, if the compute environment specifies any p2, p3, p4, g3, g3s, or g4
instance types or instance families, then AWS Batch uses an Amazon ECS GPU optimized AMI.

In unmanaged compute environments, an Amazon ECS GPU-optimized AMI is recommended. You
can use the AWS Command Line Interface or AWS Systems Manager Parameter Store GetParameter,
GetParameters, and GetParametersByPath operations to retrieve the metadata for the recommended
Amazon ECS GPU-optimized AMIs.

The following examples demonstrate the use of GetParameter.

AWS CLI

$ aws ssm get-parameter --name /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended \
    --region us-east-2 --output json

The output includes the AMI information in the Value parameter:

{
    "Parameter": {
        "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended",
        "LastModifiedDate": 1555434128.664,
        "Value": "{\"schema_version\":1,\"image_name\":\"amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs\",\"image_id\":\"ami-083c800fe4211192f\",\"os\":\"Amazon Linux 2\",\"ecs_runtime_version\":\"Docker version 18.06.1-ce\",\"ecs_agent_version\":\"1.27.0\"}",
        "Version": 9,
        "Type": "String",
        "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended"
    }
}

Python

from __future__ import print_function

import json
import boto3

ssm = boto3.client('ssm', 'us-east-2')

response = ssm.get_parameter(Name='/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended')
# The Value field is itself a JSON document, so decode it before reading its keys.
jsonVal = json.loads(response['Parameter']['Value'])
print("image_id = " + jsonVal['image_id'])
print("image_name = " + jsonVal['image_name'])

The output only includes the AMI ID and AMI name:

image_id = ami-083c800fe4211192f
image_name = amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs

The following examples demonstrate the use of GetParameters.

AWS CLI

$ aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name \
    /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id \
    --region us-east-2 --output json

The output includes the full metadata for each of the parameters:

{
    "InvalidParameters": [],
    "Parameters": [
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id",
            "LastModifiedDate": 1555434128.749,
            "Value": "ami-083c800fe4211192f",
            "Version": 9,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id"
        },
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name",
            "LastModifiedDate": 1555434128.712,
            "Value": "amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs",
            "Version": 9,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name"
        }
    ]
}

Python

from __future__ import print_function

import boto3

ssm = boto3.client('ssm', 'us-east-2')

response = ssm.get_parameters(
    Names=['/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name',
           '/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id'])
# Each returned parameter carries its full path in Name and a plain string in Value.
for parameter in response['Parameters']:
    print(parameter['Name'] + " = " + parameter['Value'])

The output includes the AMI ID and AMI name, using the full path for the names:

/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id = ami-083c800fe4211192f
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name = amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs

The following examples demonstrate the use of GetParametersByPath.

AWS CLI

$ aws ssm get-parameters-by-path --path /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended \
    --region us-east-2 --output json

The output includes the full metadata for all of the parameters under the specified path:

{
    "Parameters": [
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_agent_version",
            "LastModifiedDate": 1555434128.801,
            "Value": "1.27.0",
            "Version": 8,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_agent_version"
        },
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_runtime_version",
            "LastModifiedDate": 1548368308.213,
            "Value": "Docker version 18.06.1-ce",
            "Version": 1,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_runtime_version"
        },
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id",
            "LastModifiedDate": 1555434128.749,
            "Value": "ami-083c800fe4211192f",
            "Version": 9,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id"
        },
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name",
            "LastModifiedDate": 1555434128.712,
            "Value": "amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs",
            "Version": 9,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name"
        },
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/os",
            "LastModifiedDate": 1548368308.143,
            "Value": "Amazon Linux 2",
            "Version": 1,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/os"
        },
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/schema_version",
            "LastModifiedDate": 1548368307.914,
            "Value": "1",
            "Version": 1,
            "Type": "String",
            "ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/schema_version"
        }
    ]
}

Python

from __future__ import print_function

import boto3

ssm = boto3.client('ssm', 'us-east-2')

response = ssm.get_parameters_by_path(Path='/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended')
# Print every parameter stored directly under the recommended path.
for parameter in response['Parameters']:
    print(parameter['Name'] + " = " + parameter['Value'])

The output includes the values of all the parameter names at the specified path, using the full path
for the names:

/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_agent_version = 1.27.0
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_runtime_version = Docker version 18.06.1-ce
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id = ami-083c800fe4211192f
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name = amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/os = Amazon Linux 2
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/schema_version = 1

For more information, see Retrieving Amazon ECS-Optimized AMI Metadata in the Amazon Elastic
Container Service Developer Guide.


Launch template support

AWS Batch supports using Amazon EC2 launch templates with your EC2 compute environments. With
launch templates, you can modify the default configuration of your AWS Batch compute resources
without needing to create customized AMIs.
Note
Launch templates aren't supported on AWS Fargate resources.

You must create a launch template before you can associate it with a compute environment. You can
create a launch template in the Amazon EC2 console, or you can use the AWS CLI or an AWS SDK. For
example, the following JSON file represents a launch template that resizes the Docker data volume for
the default AWS Batch compute resource AMI and also sets it to be encrypted.

{
    "LaunchTemplateName": "increase-container-volume-encrypt",
    "LaunchTemplateData": {
        "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/xvdcz",
                "Ebs": {
                    "Encrypted": true,
                    "VolumeSize": 100,
                    "VolumeType": "gp2"
                }
            }
        ]
    }
}

You can create the previous launch template by saving the JSON to a file called lt-data.json and
running the following AWS CLI command:

aws ec2 --region <region> create-launch-template --cli-input-json file://lt-data.json

For more information about launch templates, see Launching an Instance from a Launch Template in the
Amazon EC2 User Guide for Linux Instances.
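
The same launch template can be created from an AWS SDK. The following is a minimal sketch, assuming the boto3 SDK. It reuses the template shown above, and create_launch_template is a hypothetical helper name.

```python
import json

# Mirrors the JSON launch template shown earlier in this section.
LAUNCH_TEMPLATE = {
    "LaunchTemplateName": "increase-container-volume-encrypt",
    "LaunchTemplateData": {
        "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/xvdcz",
                "Ebs": {"Encrypted": True, "VolumeSize": 100, "VolumeType": "gp2"},
            }
        ]
    },
}


def create_launch_template(template, region):
    """Call CreateLaunchTemplate and return the template ID (needs AWS credentials)."""
    import boto3  # imported lazily so the template can be inspected without the SDK

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_launch_template(**template)
    return response["LaunchTemplate"]["LaunchTemplateId"]


if __name__ == "__main__":
    # Print the request only; call create_launch_template to create the resource.
    print(json.dumps(LAUNCH_TEMPLATE, indent=2))
```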

If you use a launch template to create your compute environment, you can move the following existing
compute environment parameters to your launch template:
Note
If any of these parameters (with the exception of Amazon EC2 tags) are specified both in the
launch template and in the compute environment configuration, the compute environment
parameters take precedence. Amazon EC2 tags are merged between the launch template and
the compute environment configuration. If there is a collision on the tag's key, then the value in
the compute environment configuration takes precedence.

• Amazon EC2 key pair

• Amazon EC2 AMI ID
• Security group IDs
• Amazon EC2 tags
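
The precedence rules in the preceding note can be illustrated with a short sketch. This models the documented behavior only and isn't AWS Batch code. The function names and the use of None for "not specified" are assumptions made for illustration.

```python
def resolve_parameter(launch_template_value, compute_environment_value):
    """Return the effective value of a parameter that can be set in both places.

    None means "not specified". The compute environment value wins when set.
    """
    if compute_environment_value is not None:
        return compute_environment_value
    return launch_template_value


def merge_ec2_tags(launch_template_tags, compute_environment_tags):
    """Merge tags; on a key collision the compute environment value wins."""
    merged = dict(launch_template_tags)
    merged.update(compute_environment_tags)
    return merged


if __name__ == "__main__":
    template_tags = {"Team": "research", "Name": "from-template"}
    environment_tags = {"Name": "AWS Batch Instance"}
    print(merge_ec2_tags(template_tags, environment_tags))
    # The AMI IDs below are placeholders.
    print(resolve_parameter("ami-from-template", "ami-from-environment"))
```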

The following launch template parameters are ignored by AWS Batch:

• Instance type (specify your desired instance types when you create your compute environment)


• Instance role (specify your desired instance role when you create your compute environment)
• Network interface subnets (specify your desired subnets when you create your compute environment)
• Instance market options (AWS Batch must control Spot Instance configuration)
• Disable API termination (AWS Batch must control instance lifecycle)

AWS Batch doesn't support updating a compute environment with a new launch template version. If you
update your launch template, you must create a new compute environment with the new template for
the changes to take effect.

Amazon EC2 user data in launch templates

You can supply Amazon EC2 user data in your launch template that's run by cloud-init when your
instances launch. Your user data can perform common configuration scenarios, including but not limited
to:

• Including users or groups
• Installing packages
• Creating partitions and file systems

Amazon EC2 user data in launch templates must be in the MIME multi-part archive format. This is
because your user data is merged with other AWS Batch user data that's required to configure your
compute resources. You can combine multiple user data blocks together into a single MIME multi-part
file. For example, you might want to combine a cloud boothook that configures the Docker daemon with
a user data shell script that writes configuration information for the Amazon ECS container agent.

If you're using AWS CloudFormation, the AWS::CloudFormation::Init type can be used with the cfn-init
helper script to perform common configuration scenarios.

A MIME multi-part file consists of the following components:

• The content type and part boundary declaration: Content-Type: multipart/mixed; boundary="==BOUNDARY=="
• The MIME version declaration: MIME-Version: 1.0
• One or more user data blocks that contain the following components:
• The opening boundary that signals the beginning of a user data block: --==BOUNDARY==
• The content type declaration for the block: Content-Type: text/cloud-config;
charset="us-ascii". For more information about content types, see the Cloud-Init
documentation.
• The content of the user data, for example, a list of shell commands or cloud-init directives
• The closing boundary that signals the end of the MIME multi-part file: --==BOUNDARY==--
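
The components listed above can be assembled with the Python standard library instead of by hand. The following is a minimal sketch, assuming the email package. The build_user_data helper and the sample block contents are illustrative only, and the generated parts include extra headers (such as Content-Transfer-Encoding) that the hand-written examples omit.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

BOUNDARY = "==MYBOUNDARY=="


def build_user_data(blocks):
    """Return a MIME multi-part archive as text.

    blocks is an iterable of (subtype, content) pairs, for example
    ("cloud-config", "...") or ("x-shellscript", "...").
    """
    archive = MIMEMultipart("mixed", boundary=BOUNDARY)
    for subtype, content in blocks:
        # Each block declares its own content type, such as text/cloud-config.
        archive.attach(MIMEText(content, subtype, "us-ascii"))
    return archive.as_string()


if __name__ == "__main__":
    print(build_user_data([
        ("cloud-config", "packages:\n- amazon-efs-utils\n"),
        ("x-shellscript", "#!/bin/bash\necho ECS_IMAGE_CLEANUP_INTERVAL=60m >> /etc/ecs/ecs.config\n"),
    ]))
```

Generating the archive avoids mismatched boundary strings, which are an easy mistake to make when several blocks are combined manually.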

The following are example MIME multi-part files that you can use to create your own.
Note
If you add user data to a launch template in the Amazon EC2 console, you can paste it in as
plaintext, or upload from a file. If you use the AWS CLI or an AWS SDK, you must first base64
encode the user data and submit that string as the value of the UserData parameter when you
call CreateLaunchTemplate, as shown in this JSON.

{
    "LaunchTemplateName": "base64-user-data",
    "LaunchTemplateData": {
        "UserData": "ewogICAgIkxhdW5jaFRlbXBsYXRlTmFtZSI6ICJpbmNyZWFzZS1jb250YWluZXItdm9sdW..."
    }
}
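
The base64 encoding step can be done with the Python standard library. The following is a minimal sketch. The sample MIME text, the template name, and the helper names encode_user_data and build_launch_template are placeholders chosen for illustration.

```python
import base64
import json

# A small MIME multi-part document used only to demonstrate the encoding.
SAMPLE_USER_DATA = (
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="\n'
    '\n'
    '--==MYBOUNDARY==\n'
    'Content-Type: text/x-shellscript; charset="us-ascii"\n'
    '\n'
    '#!/bin/bash\n'
    'echo ECS_IMAGE_CLEANUP_INTERVAL=60m >> /etc/ecs/ecs.config\n'
    '\n'
    '--==MYBOUNDARY==--\n'
)


def encode_user_data(mime_text):
    """Return the base64 string expected by the UserData parameter."""
    return base64.b64encode(mime_text.encode("utf-8")).decode("ascii")


def build_launch_template(name, mime_text):
    """Build a CreateLaunchTemplate request body that carries the user data."""
    return {
        "LaunchTemplateName": name,
        "LaunchTemplateData": {"UserData": encode_user_data(mime_text)},
    }


if __name__ == "__main__":
    print(json.dumps(build_launch_template("base64-user-data", SAMPLE_USER_DATA), indent=2))
```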

Examples
• Example: Mount an existing Amazon EFS file system (p. 98)
• Example: Override default Amazon ECS container agent configuration (p. 98)
• Example: Mount an existing Amazon FSx for Lustre file system (p. 99)

Example: Mount an existing Amazon EFS file system

This example MIME multi-part file configures the compute resource to install the amazon-efs-utils
package and mount an existing Amazon EFS file system at /mnt/efs.

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/cloud-config; charset="us-ascii"

packages:
- amazon-efs-utils

runcmd:
- file_system_id_01=fs-abcdef123
- efs_directory=/mnt/efs

- mkdir -p ${efs_directory}
- echo "${file_system_id_01}:/ ${efs_directory} efs tls,_netdev" >> /etc/fstab
- mount -a -t efs defaults

--==MYBOUNDARY==--

Example: Override default Amazon ECS container agent configuration

This example MIME multi-part file overrides the default Docker image cleanup settings for a compute
resource.

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
echo ECS_IMAGE_CLEANUP_INTERVAL=60m >> /etc/ecs/ecs.config
echo ECS_IMAGE_MINIMUM_CLEANUP_AGE=60m >> /etc/ecs/ecs.config

--==MYBOUNDARY==--


Example: Mount an existing Amazon FSx for Lustre file system

This example MIME multi-part file configures the compute resource to install the lustre2.10 package
from the Extras Library and mount an existing FSx for Lustre file system at /scratch. This example is
for Amazon Linux 2. For installation instructions for other Linux distributions, see Installing the Lustre
Client in the Amazon FSx for Lustre User Guide.

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/cloud-config; charset="us-ascii"

runcmd:
- file_system_id_01=fs-0abcdef1234567890
- region=us-east-2
- fsx_directory=/scratch
- amazon-linux-extras install -y lustre2.10
- mkdir -p ${fsx_directory}
- mount -t lustre ${file_system_id_01}.fsx.${region}.amazonaws.com@tcp:fsx ${fsx_directory}

--==MYBOUNDARY==--

The mount point must also be mapped into the container by using the volumes and mountPoints
members of the container properties in the job definition.

{
    "volumes": [
        {
            "host": {
                "sourcePath": "/scratch"
            },
            "name": "Scratch"
        }
    ],
    "mountPoints": [
        {
            "containerPath": "/scratch",
            "sourceVolume": "Scratch"
        }
    ]
}

Creating a compute environment

Before you can run jobs in AWS Batch, you need to create a compute environment. You can create a
managed compute environment where AWS Batch manages the Amazon EC2 instances or AWS Fargate
resources within the environment based on your specifications. Alternatively, you can create an
unmanaged compute environment where you handle the Amazon EC2 instance configuration within the
environment.

Contents
• To create a managed compute environment using AWS Fargate resources (p. 100)
• To create a managed compute environment using EC2 resources (p. 101)


• To create an unmanaged compute environment using EC2 resources (p. 103)

To create a managed compute environment using AWS Fargate resources

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.
2. From the navigation bar, select the Region to use.
3. In the navigation pane, choose Compute environments, Create.
4. Configure the environment.

a. For Compute environment type, choose Managed.
b. For Compute environment name, specify a unique name for your compute environment. The
name can be up to 128 characters in length. It can contain uppercase and lowercase letters,
numbers, hyphens (-), and underscores (_).
c. Ensure that Enable compute environment is selected so that your compute environment can
accept jobs from the AWS Batch job scheduler.
d. Expand Additional settings: service role, instance role, EC2 key pair.

• For Service role, choose Batch service-linked role. The role allows the AWS Batch service
to make calls to the required AWS API operations on your behalf. For more information, see
Service-linked role permissions for AWS Batch (p. 181).
5. Configure your Instance configuration.

a. For Provisioning model, choose Fargate to launch Fargate On-Demand resources or Fargate
Spot to use Fargate Spot resources.
b. For Maximum vCPUs, choose the maximum number of vCPUs that your compute environment
can scale out to, regardless of job queue demand.
6. Configure networking.
Important
Compute resources need access to communicate with the Amazon ECS service endpoint.
This can be through an interface VPC endpoint or through your compute resources having
public IP addresses.
For more information about interface VPC endpoints, see Amazon ECS Interface VPC
Endpoints (AWS PrivateLink) in the Amazon Elastic Container Service Developer Guide.
If you do not have an interface VPC endpoint configured and your compute resources do
not have public IP addresses, then they must use network address translation (NAT) to
provide this access. For more information, see NAT gateways in the Amazon VPC User Guide.
For more information, see Tutorial: Creating a VPC with Public and Private Subnets for Your
Compute Environments (p. 165).

a. For VPC ID, choose a VPC where you intend to launch your instances.
b. For Subnets, choose which subnets in the selected VPC should host your instances. By default,
all subnets within the selected VPC are chosen.
c. (Optional) Expand Additional settings: Security groups, EC2 tags.

• For Security groups, choose a security group to attach to your instances. By default, the
default security group for your VPC is chosen.
7. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
compute environment. For more information, see Tagging your AWS Batch resources (p. 197).
8. Choose Create compute environment to finish.
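
The console procedure above has an SDK equivalent. The following is a minimal sketch, assuming the boto3 SDK and the CreateComputeEnvironment operation. The subnet and security group IDs are placeholders, the helper names are hypothetical, and the name check only restates the rule from step 4.b. The service role is omitted on the assumption that the AWS Batch service-linked role is then used.

```python
import json
import re

# Up to 128 characters: letters, numbers, hyphens, and underscores (step 4.b).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


def build_fargate_environment(name, subnets, security_groups, max_vcpus, spot=False):
    """Build keyword arguments for CreateComputeEnvironment on Fargate."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("invalid compute environment name: %r" % name)
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",  # corresponds to Enable compute environment
        "computeResources": {
            "type": "FARGATE_SPOT" if spot else "FARGATE",
            "maxvCpus": max_vcpus,
            "subnets": list(subnets),
            "securityGroupIds": list(security_groups),
        },
    }


def create_environment(request, region="us-east-2"):
    """Submit the request and return the environment ARN (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without the SDK

    batch = boto3.client("batch", region_name=region)
    return batch.create_compute_environment(**request)["computeEnvironmentArn"]


if __name__ == "__main__":
    # The subnet and security group IDs below are placeholders.
    request = build_fargate_environment(
        "fargate-ce", ["subnet-0123456789abcdef0"], ["sg-0123456789abcdef0"], max_vcpus=16)
    print(json.dumps(request, indent=2))
```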


To create a managed compute environment using EC2 resources

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.
2. From the navigation bar, select the Region to use.
3. In the navigation pane, choose Compute environments, Create.
4. Configure the environment.

a. For Compute environment type, choose Managed.
b. For Compute environment name, specify a unique name for your compute environment. The
name can be up to 128 characters in length. It can contain uppercase and lowercase letters,
numbers, hyphens (-), and underscores (_).
c. Ensure that Enable compute environment is selected so that your compute environment can
accept jobs from the AWS Batch job scheduler.
d. (Optional) Expand Additional settings: service role, instance role, EC2 key pair.

i. For Service role, choose Batch service-linked role. The role allows the AWS Batch service
to make calls to the required AWS API operations on your behalf. For more information, see
Service-linked role permissions for AWS Batch (p. 181).
ii. For Instance role, choose to create a new instance profile or use an existing instance profile
that has the required IAM permissions attached. This instance profile allows the Amazon
ECS container instances that are created for your compute environment to make calls to
the required AWS API operations on your behalf. For more information, see Amazon ECS
instance role (p. 145). If you choose to create a new instance profile, the required role
(ecsInstanceRole) is created for you.
iii. For EC2 key pair, choose an existing Amazon EC2 key pair to associate with the instance at
launch. You can use this key pair to connect to your instances with SSH. Verify that your
security group allows incoming traffic on port 22.
5. Configure your Instance configuration.

a. For Provisioning model, choose On-Demand to launch Amazon EC2 On-Demand Instances or
Spot to use Amazon EC2 Spot Instances.
b. If you chose to use Spot Instances:

• (Optional) For Maximum % on-demand price, choose the maximum percentage that a Spot
Instance price can be when compared with the On-Demand price for that instance type
before instances are launched. For example, if your maximum price is 20%, then the Spot
price must be less than 20% of the current On-Demand price for that EC2 instance. You
always pay the lowest (market) price and never more than your maximum percentage. If
you leave this field empty, the default value is 100% of the On-Demand price.
c. For Minimum vCPUs, choose the minimum number of EC2 vCPUs that your compute
environment should maintain, regardless of job queue demand.
d. For Maximum vCPUs, choose the maximum number of EC2 vCPUs that your compute
environment can scale out to, regardless of job queue demand.
e. For Desired vCPUs, choose the number of EC2 vCPUs that your compute environment should
launch with. As your job queue demand increases, AWS Batch can increase the desired number
of vCPUs in your compute environment and add EC2 instances, up to the maximum vCPUs. As
demand decreases, AWS Batch can decrease the desired number of vCPUs in your compute
environment and remove instances, down to the minimum vCPUs.
f. For Allowed instance types, choose the Amazon EC2 instance types that can be launched. You
can specify instance families to launch any instance type within those families (for example, c5,
c5n, or p3), or you can specify specific sizes within a family (such as c5.8xlarge). Note that
metal instance types aren't in the instance families. For example, c5 doesn't include c5.metal.
You can also choose optimal to select instance types (from the C4, M4, and R4 instance
families) that match the demand of your job queues.
Note
When you create a compute environment, the instance types that you select for the
compute environment must share the same architecture. For example, you can't mix
x86 and ARM instances in the same compute environment.
Note
AWS Batch will scale GPUs based on the required amount in your job queues. To use
GPU scheduling, the compute environment must include instance types from the p2,
p3, p4, g3, g3s, or g4 families.
Note
Currently, optimal uses instance types from the C4, M4, and R4 instance families. In
Regions that don't have instance types from those instance families, instance types
from the C5, M5, and R5 instance families are used.
g. For Allocation strategy, choose the allocation strategy to use when selecting instance
types from the list of allowed instance types. BEST_FIT_PROGRESSIVE is usually the better
choice for EC2 On-Demand compute environments, and SPOT_CAPACITY_OPTIMIZED for
EC2 Spot compute environments. For more information, see the section called “Allocation
strategies” (p. 113).
h. (Optional) Expand Additional settings: launch template, user specified AMI.

i. (Optional) For Launch template, select an existing Amazon EC2 launch template to
configure your compute resources. The default version of the template is automatically
populated. For more information, see Launch template support (p. 96).
ii. (Optional) For Launch template version, enter $Default, $Latest, or a specific version
number to use.
Important
After the compute environment is created, the launch template version used
isn't changed even if the $Default or $Latest version for the launch template
is updated. To use a new launch template version, create a new compute
environment, add the new compute environment to the existing job queue, remove
the old compute environment from the job queue, and delete the old compute
environment.
iii. (Optional) Check Enable user-specified AMI ID to use your own custom AMI. By default,
AWS Batch managed compute environments use a recent, approved version of the Amazon
ECS optimized AMI for compute resources. You can create and use your own AMI in your
compute environment by following the compute resource AMI specification. For more
information, see Compute resource AMIs (p. 88).
Note
The AMI that you choose for a compute environment must match the architecture
of the instance types that you intend to use for that compute environment. For
example, if your compute environment uses A1 instance types, the compute
resource AMI that you choose must support ARM instances. Amazon ECS vends
both x86 and ARM versions of the Amazon ECS optimized Amazon Linux 2 AMI. For
more information, see Amazon ECS optimized Amazon Linux 2 AMI in the Amazon
Elastic Container Service Developer Guide.

• For AMI ID, paste your custom AMI ID and choose Validate AMI.
iv. (Optional) For EC2 configuration, choose Image type and Image ID override values to
provide information for AWS Batch to select Amazon Machine Images (AMIs) for instances
in the compute environment. If the Image ID override isn't specified for each Image type,
AWS Batch selects a recent Amazon ECS optimized AMI. If no Image type is specified, the
default is Amazon Linux for non-GPU, non-AWS Graviton instances. In the future, this
default will change to Amazon Linux 2 for all non-GPU instances.

Amazon Linux 2

Default for all AWS Graviton-based instance families (for example, C6g, M6g, R6g, and
T4g) and can be used for all non-GPU instance types.
Amazon Linux 2 (GPU)

Default for all GPU instance families (for example, P4 and G4) and can be used for all
non-AWS Graviton-based instance types.
Amazon Linux

Default for all non-GPU, non-AWS Graviton instance families. Amazon Linux is reaching
the end-of-life of standard support. For more information, see Amazon Linux AMI.
6. Configure networking.
Important
Compute resources need access to communicate with the Amazon ECS service endpoint.
This can be through an interface VPC endpoint or through your compute resources having
public IP addresses.
For more information about interface VPC endpoints, see Amazon ECS Interface VPC
Endpoints (AWS PrivateLink) in the Amazon Elastic Container Service Developer Guide.
If you do not have an interface VPC endpoint configured and your compute resources do
not have public IP addresses, then they must use network address translation (NAT) to
provide this access. For more information, see NAT gateways in the Amazon VPC User Guide.
For more information, see Tutorial: Creating a VPC with Public and Private Subnets for Your
Compute Environments (p. 165).

a. For VPC ID, choose the VPC in which to launch your instances.

b. For Subnets, choose which subnets in the selected VPC should host your instances. By default,
all subnets within the selected VPC are chosen.
c. (Optional) Expand Additional settings: Security groups, EC2 tags.

i. For Security groups, choose a security group to attach to your instances. By default, the
default security group for your VPC is chosen.
ii. (Optional) In the EC2 tags, you can tag the Amazon EC2 instances used by your On-
Demand Instances. For example, you can specify "Name": "AWS Batch Instance -
C4OnDemand" as a tag so that each instance in your compute environment has that name.
This is helpful for recognizing your AWS Batch instances in the Amazon EC2 console.
Note
EC2 tags aren't available when using the Fargate or Fargate Spot provisioning
models.
7. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
compute environment. For more information, see Tagging your AWS Batch resources (p. 197).
8. Choose Create compute environment to ﬁnish.

To create an unmanaged compute environment using EC2 resources

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.
2. From the navigation bar, select the Region to use.

3. In the navigation pane, choose Compute environments, Create environment.
4. For Compute environment type, choose Unmanaged.
5. For Compute environment name, specify a unique name for your compute environment. The name
can be up to 128 characters in length. It can contain uppercase and lowercase letters, numbers,
hyphens (-), and underscores (_).
6. For Service role, choose Batch service-linked role. The role allows the AWS Batch service to make
calls to the required AWS API operations on your behalf. For more information, see Service-linked
role permissions for AWS Batch (p. 181).
7. Ensure that Enable compute environment is selected so that your compute environment can accept
jobs from the AWS Batch job scheduler.
8. Choose Create to ﬁnish.
9. (Optional) Retrieve the Amazon ECS cluster ARN for the associated cluster. The following AWS CLI
command provides the Amazon ECS cluster ARN for a compute environment:

aws batch describe-compute-environments --compute-environments unmanagedCE \
    --query "computeEnvironments[].ecsClusterArn"

10. (Optional) Launch container instances into the associated Amazon ECS cluster. For more information,
see Launching an Amazon ECS container instance in the Amazon Elastic Container Service Developer
Guide. When you launch your compute resources, use the following Amazon EC2 user data to specify
the Amazon ECS cluster ARN that the resources should register with. Replace ecsClusterArn with
the cluster ARN you obtained with the previous command.

#!/bin/bash
echo "ECS_CLUSTER=ecsClusterArn" >> /etc/ecs/ecs.config

Note
Your unmanaged compute environment doesn't have any compute resources until you
launch them manually.
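
As a rough illustration of step 10, the following Python sketch builds the user data script from a cluster ARN. The function name build_user_data and the example ARN are hypothetical and are used only for illustration; the script body is the same two lines shown in the preceding step.

```python
def build_user_data(ecs_cluster_arn):
    """Return a bash user data script that points the ECS agent at the given cluster."""
    if not ecs_cluster_arn:
        raise ValueError("ecs_cluster_arn must not be empty")
    return (
        "#!/bin/bash\n"
        f'echo "ECS_CLUSTER={ecs_cluster_arn}" >> /etc/ecs/ecs.config\n'
    )


# Hypothetical ARN for illustration only; use the value returned by
# describe-compute-environments for your own compute environment.
example_arn = "arn:aws:ecs:us-east-1:123456789012:cluster/unmanagedCE_Batch_example"
user_data = build_user_data(example_arn)
print(user_data)
```

You can paste the printed script into the user data field when you launch the instance.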

Compute environment template

The following example shows an empty compute environment template. You can use this template to
create your compute environment that can then be saved to a ﬁle and used with the AWS CLI --cli-
input-json option. For more information about these parameters, see CreateComputeEnvironment in
the AWS Batch API Reference.

{
    "computeEnvironmentName": "",
    "type": "UNMANAGED",
    "state": "ENABLED",
    "computeResources": {
        "type": "SPOT",
        "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
        "minvCpus": 0,
        "maxvCpus": 0,
        "desiredvCpus": 0,
        "instanceTypes": [
            ""
        ],
        "imageId": "",
        "subnets": [
            ""
        ],
        "securityGroupIds": [
            ""
        ],
        "ec2KeyPair": "",
        "instanceRole": "",
        "tags": {
            "KeyName": ""
        },
        "placementGroup": "",
        "bidPercentage": 0,
        "spotIamFleetRole": "",
        "launchTemplate": {
            "launchTemplateId": "",
            "launchTemplateName": "",
            "version": ""
        },
        "ec2Configuration": [
            {
                "imageType": "",
                "imageIdOverride": ""
            }
        ]
    },
    "serviceRole": "",
    "tags": {
        "KeyName": ""
    }
}

Note
You can generate the preceding compute environment template with the following AWS CLI
command.

$ aws batch create-compute-environment --generate-cli-skeleton
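
Instead of editing the full skeleton by hand, you might generate a trimmed input file programmatically. The following Python sketch is one possible approach for an unmanaged compute environment. The helper name build_unmanaged_request is hypothetical, and the name check only mirrors the rule stated in this guide (up to 128 letters, numbers, hyphens, and underscores).

```python
import json
import re

# Name rule from this guide: up to 128 letters, numbers, hyphens, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,128}")


def build_unmanaged_request(name, state="ENABLED"):
    """Return a minimal create-compute-environment input for an unmanaged environment."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid compute environment name: {name!r}")
    if state not in ("ENABLED", "DISABLED"):
        raise ValueError(f"invalid state: {state!r}")
    return {"computeEnvironmentName": name, "type": "UNMANAGED", "state": state}


request = build_unmanaged_request("unmanagedCE")
print(json.dumps(request, indent=4))
```

If you save the printed JSON to a file (for example, a hypothetical compute-environment.json), you can pass it to the AWS CLI with --cli-input-json file://compute-environment.json.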

Compute environment parameters

Compute environments are split into ﬁve basic components: the name, type, and state of the compute
environment, the compute resource deﬁnition (if it's a managed compute environment), and the service
role to use to provide IAM permissions to AWS Batch.

Topics
• Compute environment name (p. 105)
• Type (p. 106)
• State (p. 106)
• Compute resources (p. 106)
• Service role (p. 112)
• Tags (p. 112)

Compute environment name

computeEnvironmentName

The name for your compute environment. The name can be up to 128 characters in length. It can
contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).

Type: String

Required: Yes

Type
type

The type of the compute environment. Choose MANAGED to have AWS Batch manage the EC2 or
Fargate compute resources that you deﬁne. For more information, see Compute resources (p. 106).
Choose UNMANAGED to manage your own EC2 compute resources.

Type: String

Valid values: MANAGED | UNMANAGED

Required: Yes

State
state

The state of the compute environment.

If the state is ENABLED, the AWS Batch scheduler attempts to place jobs within the environment.
These jobs are from an associated job queue on the compute resources. If the compute environment
is managed, it can scale its instances out or in automatically based on job queue demand.

If the state is DISABLED, the AWS Batch scheduler doesn't attempt to place jobs within the
environment. Jobs in a STARTING or RUNNING state continue to progress normally. Managed
compute environments in the DISABLED state don't scale out. However, after instances go idle, they
scale in to the smallest number of instances that satisﬁes the minvCpus value.

Type: String

Valid values: ENABLED | DISABLED

Required: No

Compute resources
computeResources

Details of the compute resources managed by the compute environment. For more information, see
Compute Environments.

Type: ComputeResource object

Required: This parameter is required for managed compute environments

type

The type of compute environment. You can choose either to use EC2 On-Demand Instances
(EC2) and EC2 Spot Instances (SPOT), or to use Fargate capacity (FARGATE) and Fargate Spot
capacity (FARGATE_SPOT) in your managed compute environment. If you choose SPOT, you
must also specify an Amazon EC2 Spot Fleet role with the spotIamFleetRole parameter. For
more information, see Amazon EC2 spot ﬂeet role (p. 145).

Valid values: EC2 | SPOT | FARGATE | FARGATE_SPOT

Required: Yes
allocationStrategy

The allocation strategy to use for the compute resource if not enough instances of the best
ﬁtting EC2 instance type can be allocated. This might be due to availability of the instance
type in the Region or Amazon EC2 service limits. For more information, see Allocation
strategies (p. 113).
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be speciﬁed.
BEST_FIT (default)

AWS Batch selects an instance type that best ﬁts the needs of the jobs with a preference
for the lowest cost instance type. If additional instances of the selected instance type
aren't available, AWS Batch waits for the additional instances to be available. If there aren't
enough instances available, or if you're hitting Amazon EC2 service limits, then additional
jobs don't run until currently running jobs have completed. This allocation strategy keeps
costs lower but can limit scaling. If you're using Spot Fleets with BEST_FIT, then the Spot
Fleet IAM Role must be specified.
BEST_FIT_PROGRESSIVE

Use additional instance types that are large enough to meet the requirements of the jobs
in the queue, with a preference for instance types with a lower cost for each unit vCPU. If
additional instances of the previously selected instance types aren't available, AWS Batch
selects new instance types.
SPOT_CAPACITY_OPTIMIZED

(Only available for Spot Instance compute resources) Use additional instance types that
are large enough to meet the requirements of the jobs in the queue, with a preference for
instance types that are less likely to be interrupted.

With both BEST_FIT_PROGRESSIVE and SPOT_CAPACITY_OPTIMIZED strategies, AWS Batch
might need to exceed maxvCpus to meet your capacity requirements. In this event, AWS Batch
never exceeds maxvCpus by more than a single instance.

Valid values: BEST_FIT | BEST_FIT_PROGRESSIVE | SPOT_CAPACITY_OPTIMIZED

Required: No
minvCpus

The minimum number of Amazon EC2 vCPUs that an environment should maintain (even if a
compute environment is DISABLED).
Note
This parameter isn't applicable to jobs running on Fargate resources, and shouldn't be
speciﬁed.

Type: Integer

Required: Yes
maxvCpus

The maximum number of Amazon EC2 vCPUs that an environment can reach.
Note
With both BEST_FIT_PROGRESSIVE and SPOT_CAPACITY_OPTIMIZED allocation
strategies, AWS Batch might need to exceed maxvCpus to meet your capacity
requirements. In this event, AWS Batch never exceeds maxvCpus by more than a single
instance. For example, AWS Batch uses no more than a single instance from among
those speciﬁed in your compute environment.

Type: Integer

Required: Yes
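
For capacity planning, the "no more than a single instance" rule can be turned into a conservative upper bound. The following Python sketch assumes that the overshoot is at most the vCPU count of the largest instance type in the compute environment. The function name and the instance list are illustrative only, and this reading of the rule is an interpretation rather than a documented formula.

```python
def vcpu_upper_bound(max_vcpus, vcpus_by_instance_type):
    """Conservative bound: maxvCpus plus the vCPUs of the largest allowed instance type."""
    if not vcpus_by_instance_type:
        raise ValueError("at least one instance type is required")
    largest = max(vcpus_by_instance_type.values())
    return max_vcpus + largest


# Illustrative instance types and their vCPU counts.
instance_vcpus = {"c5.large": 2, "c5.xlarge": 4, "c5.2xlarge": 8}
bound = vcpu_upper_bound(16, instance_vcpus)
print(bound)  # 24
```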
desiredvCpus

The desired number of Amazon EC2 vCPUs in the compute environment. AWS Batch modifies
this value between the minimum and maximum values based on job queue demand.
Note
This parameter isn't applicable to jobs running on Fargate resources, and shouldn't be
speciﬁed.

Type: Integer

Required: No
instanceTypes

The instance types that can be launched. This parameter isn't applicable to jobs that are running
on Fargate resources, and shouldn't be speciﬁed. You can specify instance families to launch
any instance type within those families (for example, c5, c5n, or p3). Or, you can specify
specific sizes within a family (such as c5.8xlarge). Note that metal instance types aren't in the
instance families (for example, c5 doesn't include c5.metal). You can also choose optimal to
select instance types (from the C4, M4, and R4 instance families) that match the demand of your
job queues.
Note
When you create a compute environment, the instance types that you select for the
compute environment must share the same architecture. For example, you can't mix
x86 and ARM instances in the same compute environment.
Note
Currently, optimal uses instance types from the C4, M4, and R4 instance families. In
Regions that don't have instance types from those instance families, instance types
from the C5, M5, and R5 instance families are used.

Type: Array of strings

Required: Yes
imageId

This parameter is deprecated.

The Amazon Machine Image (AMI) ID used for instances launched in the compute environment.
This parameter is overridden by the imageIdOverride member of the Ec2Configuration
structure.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be speciﬁed.
Note
The AMI that you choose for a compute environment must match the architecture of
the instance types that you intend to use for that compute environment. For example, if
your compute environment uses A1 instance types, the compute resource AMI that you
choose must support ARM instances. Amazon ECS vends both x86 and ARM versions
of the Amazon ECS optimized Amazon Linux 2 AMI. For more information, see Amazon
ECS optimized Amazon Linux 2 AMI in the Amazon Elastic Container Service Developer
Guide.

Type: String

Required: No
subnets

The VPC subnets into which the compute resources are launched. These subnets must be within
the same VPC. Fargate compute resources can contain a maximum of 16 subnets. For more
information, see VPCs and Subnets in the Amazon VPC User Guide.

Type: Array of strings

Required: Yes
securityGroupIds

The Amazon EC2 security groups associated with instances launched in the compute
environment. One or more security groups must be speciﬁed, either in securityGroupIds or
using a launch template referenced in launchTemplate. This parameter is required for jobs
running on Fargate resources and must contain at least one security group. (Fargate doesn't
support launch templates.) If security groups are speciﬁed using both securityGroupIds and
launchTemplate, the values in securityGroupIds will be used.

Type: Array of strings

Required: Yes
ec2KeyPair

The EC2 key pair that's used for instances launched in the compute environment. You can use
this key pair to log in to your instances with SSH.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be speciﬁed.

Type: String

Required: No
instanceRole

The Amazon ECS instance profile to attach to Amazon EC2 instances in a compute environment.
This parameter isn't applicable to jobs that are running on Fargate resources, and shouldn't be
specified. You can specify the short name or full Amazon Resource Name (ARN) of an instance
profile. For example, ecsInstanceRole or arn:aws:iam::aws_account_id:instance-
profile/ecsInstanceRole. For more information, see Amazon ECS instance role (p. 145).

Type: String

Required: No
tags

Key-value pair tags to be applied to EC2 instances that are launched in the compute
environment. For example, you can specify "Name": "AWS Batch Instance -
C4OnDemand" as a tag so that each instance in your compute environment has that name. This
is helpful for recognizing your AWS Batch instances in the Amazon EC2 console. These tags can't
be updated or removed after the compute environment has been created. Any changes require
creating a new compute environment and removing the previous compute environment. These
tags aren't seen when using the AWS Batch ListTagsForResource API operation.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be speciﬁed.

Type: String to string map

Required: No
placementGroup

The Amazon EC2 placement group to associate with your compute resources. This parameter
isn't applicable to jobs running on Fargate resources, and shouldn't be speciﬁed. If you intend
to submit multi-node parallel jobs to your compute environment, you should consider creating
a cluster placement group and associate it with your compute resources. This keeps your multi-
node parallel job on a logical grouping of instances within a single Availability Zone with high
network ﬂow potential. For more information, see Placement Groups in the Amazon EC2 User
Guide for Linux Instances.

Type: String

Required: No
bidPercentage

The maximum percentage that an EC2 Spot Instance price can be when compared with the
On-Demand price for that instance type before instances are launched. For example, if your
maximum percentage is 20%, then the Spot price must be less than 20% of the current On-
Demand price for that EC2 instance. You always pay the lowest (market) price and never more
than your maximum percentage. If you leave this ﬁeld empty, the default value is 100% of the
On-Demand price.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be speciﬁed.

Required: No
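
The bidPercentage arithmetic described above can be sketched in Python as follows. The function names and the prices are hypothetical. The comparison uses "less than" to match the wording of this section.

```python
def max_spot_price(on_demand_price, bid_percentage=100):
    """Highest Spot price allowed, as a percentage of the On-Demand price."""
    if bid_percentage < 0:
        raise ValueError("bid_percentage must not be negative")
    return on_demand_price * bid_percentage / 100


def can_launch(spot_price, on_demand_price, bid_percentage=100):
    """True if the Spot price is below the configured maximum percentage."""
    return spot_price < max_spot_price(on_demand_price, bid_percentage)


# Hypothetical hourly prices in USD, for illustration only.
print(can_launch(0.015, 0.10, bid_percentage=20))  # True: 0.015 is below 20% of 0.10
print(can_launch(0.030, 0.10, bid_percentage=20))  # False: 0.030 is above 20% of 0.10
```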
spotIamFleetRole

The Amazon Resource Name (ARN) of the Amazon EC2 Spot Fleet IAM role applied to a SPOT
compute environment. This role is required if the allocation strategy is set to BEST_FIT or
if the allocation strategy isn't specified. For more information, see Amazon EC2 spot fleet
role (p. 145).
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
Important
To tag your Spot Instances on creation, the Spot Fleet IAM role specified here must
use the newer AmazonEC2SpotFleetTaggingRole managed policy. The previously
recommended AmazonEC2SpotFleetRole managed policy doesn't have the required
permissions to tag Spot Instances. For more information, see Spot Instances Not
Tagged on Creation (p. 204).

Type: String

Required: This parameter is required for SPOT compute environments.

launchTemplate

An optional launch template to associate with your compute resources. This parameter isn't
applicable to jobs running on Fargate resources, and shouldn't be speciﬁed. Any other compute
resource parameters that you specify in a CreateComputeEnvironment API operation override
the same parameters in the launch template. To use a launch template, you must specify
either the launch template ID or launch template name in the request, but not both. For more
information, see Launch template support (p. 96).

Type: LaunchTemplateSpecification object

Required: No
launchTemplateId

The ID of the launch template.

Type: String

Required: No
launchTemplateName

The name of the launch template.

Type: String

Required: No
version

The version number of the launch template, $Latest, or $Default.

If the value is $Latest, the latest version of the launch template is used. If the value is
$Default, the default version of the launch template is used.
Important
After the compute environment is created, the launch template version used
will not be changed, even if the $Default or $Latest version for the launch
template is updated. To use a new launch template version, create a new compute
environment, add the new compute environment to the existing job queue, remove
the old compute environment from the job queue, and delete the old compute
environment.

Default: $Default.

Type: String

Required: No
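
The launch template rules above (specify either the ID or the name, but not both, with $Default as the default version) can be checked before you submit a request. The following Python sketch is one way to do that. The helper name normalize_launch_template and the example template ID are hypothetical.

```python
def normalize_launch_template(spec):
    """Validate a launchTemplate object and fill in the default version."""
    has_id = bool(spec.get("launchTemplateId"))
    has_name = bool(spec.get("launchTemplateName"))
    if has_id == has_name:
        raise ValueError(
            "specify exactly one of launchTemplateId or launchTemplateName"
        )
    result = {key: value for key, value in spec.items() if value}
    # This guide states that the default version is $Default.
    result.setdefault("version", "$Default")
    return result


# Hypothetical launch template ID for illustration only.
template = normalize_launch_template({"launchTemplateId": "lt-0123456789abcdef0"})
print(template)
```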
ec2Configuration

Provides information used to select Amazon Machine Images (AMIs) for instances in the EC2
compute environment. If Ec2Configuration isn't speciﬁed, the default is Amazon Linux 2
(ECS_AL2). Before March 31, 2021, this default was Amazon Linux (ECS_AL1) for non-GPU, non
AWS Graviton instances.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be speciﬁed.

Type: Array of Ec2Conﬁguration objects

Required: No
imageIdOverride

The AMI ID used for instances launched in the compute environment that matches the
image type. This setting overrides the imageId set in the computeResource object.

Type: String

Required: No

imageType

The image type to match with the instance type to select an AMI. If the imageIdOverride
parameter isn't speciﬁed, then a recent Amazon ECS optimized AMI is used.
Amazon Linux 2 (ECS_AL2)

Default for all AWS Graviton based instance families (for example, C6g, M6g, R6g, and
T4g) and can be used for all non-GPU instance types.
Amazon Linux 2 (GPU) (ECS_AL2_NVIDIA)

Default for all GPU instance families (for example P4 and G4) and can be used for all
non AWS Graviton based instance types.
Amazon Linux (ECS_AL1)

Default for all non-GPU, non AWS Graviton instance families. Amazon Linux will
discontinue standard support. For more information, see Amazon Linux AMI.

Type: String

Required: Yes
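
Several computeResources parameters in this section are marked as not applicable to Fargate resources. The following Python sketch collects those parameter names, as listed in this section, and reports any that appear in a Fargate compute resource definition. The function name and the sample values are hypothetical.

```python
FARGATE_TYPES = {"FARGATE", "FARGATE_SPOT"}

# Parameters that this section marks as not applicable to Fargate resources.
NOT_APPLICABLE_ON_FARGATE = {
    "allocationStrategy", "minvCpus", "desiredvCpus", "instanceTypes",
    "imageId", "ec2KeyPair", "instanceRole", "tags", "placementGroup",
    "bidPercentage", "spotIamFleetRole", "launchTemplate", "ec2Configuration",
}


def inapplicable_fargate_parameters(compute_resources):
    """Return the sorted names of parameters that shouldn't be set for Fargate."""
    if compute_resources.get("type") not in FARGATE_TYPES:
        return []
    return sorted(NOT_APPLICABLE_ON_FARGATE & set(compute_resources))


# Hypothetical subnet, security group, and key pair values.
resources = {
    "type": "FARGATE",
    "maxvCpus": 16,
    "subnets": ["subnet-11111111"],
    "securityGroupIds": ["sg-11111111"],
    "instanceTypes": ["c5"],
    "ec2KeyPair": "my-key",
}
print(inapplicable_fargate_parameters(resources))  # ['ec2KeyPair', 'instanceTypes']
```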

Service role
serviceRole

The full Amazon Resource Name (ARN) of the IAM role that allows AWS Batch to make calls to other
AWS services on your behalf. For more information, see AWS Batch service IAM role (p. 142).
Important
If your account has already created the AWS Batch service-linked role
(AWSServiceRoleForBatch), that role is used by default for your compute environment
unless you specify a role here. If the AWS Batch service-linked role doesn't exist in your
account, and no role is speciﬁed here, the service tries to create the AWS Batch service-
linked role in your account. For more information about the AWSServiceRoleForBatch
service-linked role, see Service-linked role permissions for AWS Batch (p. 181).

If your specified role has a path other than /, then you must either specify the full role ARN (this is
recommended) or prefix the role name with the path.
Note
Depending on how you created your AWS Batch service role, its ARN might contain the
service-role path prefix. When you only specify the name of the service role, AWS Batch
assumes that your ARN doesn't use the service-role path prefix. Because of this, we
recommend that you specify the full ARN of your service role when you create compute
environments.

Type: String

Required: No

Tags
tags

Key-value pair tags to associate with the compute environment. For more information, see Tagging
your AWS Batch resources (p. 197).

Type: String to string map

Required: No

EC2 Conﬁgurations
AWS Batch uses Amazon ECS optimized AMIs for EC2 and EC2 Spot compute environments. The default
is Amazon Linux 2 (ECS_AL2). Before March 31, 2021, this default was Amazon Linux (ECS_AL1) for non-
GPU, non AWS Graviton instances.

We made this change because the Amazon Linux AMI has discontinued standard support and entered
into a maintenance support period, which is scheduled to end on June 30, 2023. The Amazon Linux AMI
will continue to receive critical and important security updates for a reduced list of packages. During the
maintenance support period, an Amazon Linux AMI might still be used for newly-created managed EC2
and EC2 Spot compute environments by specifying an Ec2Configuration parameter when creating a
compute environment. After the end of the maintenance support period, an Amazon Linux AMI will no
longer be a supported image type for new AWS Batch compute environments.

Existing compute environments and instances will not be affected by this change and will continue
to operate with their configured AMI until the end of the maintenance support period. After that
period ends, the Amazon Linux AMI will no longer be a supported image type for AWS Batch compute
environments. We encourage migration of all compute environments to Amazon Linux 2 prior to
June 30, 2023. Not all instance
types introduced after March 31, 2021, will be supported by the Amazon Linux AMI. If you use launch
templates with custom user data, confirm that everything is configured as expected.

The storage configuration differs between the Amazon ECS optimized Amazon Linux AMI and Amazon
Linux 2-based Amazon ECS optimized AMIs. For more information, see AMI storage configuration in the
Amazon Elastic Container Service Developer Guide.

Allocation strategies
When a managed compute environment is created, AWS Batch selects instance types from the
instanceTypes specified that best fit the needs of the jobs. The allocation strategy defines behavior
when AWS Batch needs additional capacity. This parameter isn't applicable to jobs running on Fargate
resources, and shouldn't be specified. For more information, see Allocation strategies (p. 113).

BEST_FIT (default)

AWS Batch selects an instance type that best ﬁts the needs of the jobs with a preference for the
lowest-cost instance type. If additional instances of the selected instance type aren't available, AWS
Batch waits for the additional instances to be available. If there aren't enough instances available, or
if you're hitting Amazon EC2 service limits, then additional jobs don't run until currently running
jobs have completed. This allocation strategy keeps costs lower but can limit scaling. If you're using
Spot Fleets with BEST_FIT, then the Spot Fleet IAM Role must be specified.
BEST_FIT_PROGRESSIVE

AWS Batch selects additional instance types that are large enough to meet the requirements of
the jobs in the queue. It has a preference for instance types with a lower cost for each unit vCPU.
If additional instances of the previously selected instance types aren't available, AWS Batch selects
new instance types.
SPOT_CAPACITY_OPTIMIZED

AWS Batch selects one or more instance types that are large enough to meet the requirements of
the jobs in the queue, with a preference for instance types that are less likely to be interrupted. This
allocation strategy is only available for Spot Instance compute resources.

With both BEST_FIT_PROGRESSIVE and SPOT_CAPACITY_OPTIMIZED strategies, AWS Batch might
need to exceed maxvCpus to meet your capacity requirements. In this event, AWS Batch never exceeds
maxvCpus by more than a single instance.

Compute Resource Memory Management

When the Amazon ECS container agent registers a compute resource into a compute environment, the
agent must determine how much memory the compute resource has available to reserve for your jobs.
Because of platform memory overhead and memory occupied by the system kernel, this number is
diﬀerent than the installed memory amount that is advertised for Amazon EC2 instances. For example,
an m4.large instance has 8 GiB of installed memory. However, this does not always translate to exactly
8192 MiB of memory available for jobs when the compute resource registers.

If you specify 8192 MiB for the job, and none of your compute resources have 8192 MiB or greater
of memory available to satisfy this requirement, then the job cannot be placed in your compute
environment. If you are using a managed compute environment, then AWS Batch must launch a larger
instance type to accommodate the request.

The default AWS Batch compute resource AMI also reserves 32 MiB of memory for the Amazon ECS
container agent and other critical system processes. This memory is not available for job allocation. For
more information, see Reserving System Memory (p. 114).

The Amazon ECS container agent uses the Docker ReadMemInfo() function to query the total memory
available to the operating system. Linux provides command line utilities to determine the total memory.

Example - Determine Linux total memory

The free command returns the total memory that is recognized by the operating system.

$ free -b

Example output for an m4.large instance running the Amazon ECS-optimized Amazon Linux AMI.

             total       used       free     shared    buffers     cached
Mem:    8373026816  348180480 8024846336      90112   25534464  205418496
-/+ buffers/cache:  117227520 8255799296

This instance has 8373026816 bytes of total memory, which translates to 7985 MiB available for tasks.

Reserving System Memory

If you occupy all of the memory on a compute resource with your jobs, then it is possible that your jobs
will contend with critical system processes for memory and possibly trigger a system failure. The Amazon
ECS container agent provides a configuration variable called ECS_RESERVED_MEMORY, which you can
use to remove a specified number of MiB of memory from the pool that is allocated to your jobs. This
effectively reserves that memory for critical system processes.

The default AWS Batch compute resource AMI reserves 32 MiB of memory for the Amazon ECS container
agent and other critical system processes.
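
The arithmetic behind the m4.large example can be sketched in Python as follows. This sketch assumes that the agent rounds the total down to whole MiB and then subtracts the ECS_RESERVED_MEMORY value. The function names are illustrative, and the exact rounding behavior of the agent is an assumption.

```python
MIB = 1024 * 1024


def bytes_to_mib(total_bytes):
    """Convert bytes to whole MiB, rounding down."""
    return total_bytes // MIB


def memory_for_jobs(total_bytes, reserved_mib=32):
    """Estimate the MiB left for jobs after the reserved system memory."""
    return bytes_to_mib(total_bytes) - reserved_mib


total = 8373026816  # "total" value from the free -b output shown earlier
print(bytes_to_mib(total))     # 7985
print(memory_for_jobs(total))  # 7953
```

To change the reservation, you can set ECS_RESERVED_MEMORY in /etc/ecs/ecs.config on the compute resource, for example ECS_RESERVED_MEMORY=256.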

Viewing Compute Resource Memory

You can view how much memory a compute resource registers with in the Amazon ECS console (or with
the DescribeContainerInstances API operation). If you are trying to maximize your resource utilization
by providing your jobs as much memory as possible for a particular instance type, you can observe the
memory available for that compute resource and then assign your jobs that much memory.

To view compute resource memory

1. Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.

2. Choose the cluster that hosts your compute resources to view. The cluster name for your compute
environment begins with your compute environment name.
3. Choose ECS Instances, and select a compute resource from the Container Instance column to view.
4. The Resources section shows the registered and available memory for the compute resource.

The Registered memory value is what the compute resource registered with Amazon ECS when it
was ﬁrst launched, and the Available memory value is what has not already been allocated to jobs.
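
If you retrieve the same values with the DescribeContainerInstances API operation, the registered and available memory appear as MEMORY entries in the registeredResources and remainingResources lists of each container instance. The following Python sketch extracts both values from one such entry. The sample dictionary is hypothetical data that follows that response shape, and the function name is illustrative.

```python
def memory_summary(container_instance):
    """Return registered and available memory (MiB) for one container instance."""
    def memory_of(resources):
        for resource in resources:
            if resource.get("name") == "MEMORY":
                return resource.get("integerValue")
        return None

    return {
        "registered_mib": memory_of(container_instance.get("registeredResources", [])),
        "available_mib": memory_of(container_instance.get("remainingResources", [])),
    }


# Hypothetical container instance entry, for illustration only.
sample = {
    "registeredResources": [
        {"name": "CPU", "type": "INTEGER", "integerValue": 2048},
        {"name": "MEMORY", "type": "INTEGER", "integerValue": 7953},
    ],
    "remainingResources": [
        {"name": "CPU", "type": "INTEGER", "integerValue": 1024},
        {"name": "MEMORY", "type": "INTEGER", "integerValue": 3857},
    ],
}
summary = memory_summary(sample)
print(summary)
```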

Scheduling policies
Scheduling policies allow compute resources in a job queue to be allocated in a more equitable manner
between different users or workloads. Different workloads or users are assigned different fair share
identifiers. AWS Batch assigns each fair share identifier a share based on the total weight of all recently
used fair share identifiers, which defines the amount of the total resources available for use by jobs with
that fair share identifier. Time can be added to the fair share analysis by assigning a share decay time
to the policy. A long decay time gives more weight to time and less to the defined weight. Compute
resources can be held in reserve for fair share identifiers that are not active by specifying a compute
reservation.

Creating a scheduling policy

Before you can create a job queue with a scheduling policy, you must create a scheduling policy. When
you create a scheduling policy, you associate one or more fair share identifiers or fair share identifier
prefixes with weights for the queue and optionally assign a decay period and compute reservation to the
policy.

To create a scheduling policy

1. Open the AWS Batch console at https://console.aws.amazon.com/batch/.

2. From the navigation bar, select the Region to use.
3. In the navigation pane, choose Scheduling policies, Create.
4. For Name, enter a unique name for your scheduling policy. Up to 128 letters (uppercase and
lowercase), numbers, hyphens, and underscores are allowed.
5. (Optional) For Share decay seconds, enter an integer value for the scheduling policy's share decay
time. A longer share decay time considers compute resource usage over a longer period when
scheduling jobs. This can allow jobs using a fair share identifier to temporarily use more compute
resources than the weight for that fair share identifier would allow if that fair share identifier had
not recently been using compute resources.
6. (Optional) For Compute reservation, enter an integer value for the scheduling policy's compute
reservation. The compute reservation will hold some vCPUs in reserve to be used for fair share
identifiers that are not currently active.

The reserved ratio is (computeReservation/100)^ActiveFairShares where ActiveFairShares is
the number of active fair share identifiers.

For example, a computeReservation value of 50 indicates that AWS Batch should reserve 50% of
the maximum available vCPUs if there is only one active fair share identifier, 25% if there are two
active fair share identifiers, and 12.5% if there are three active fair share identifiers. A
computeReservation value of 25 indicates that AWS Batch should reserve 25% of the maximum
available vCPUs if there is only one active fair share identifier, 6.25% if there are two active fair
share identifiers, and 1.56% if there are three active fair share identifiers.
7. In the Share attributes section, you can specify the fair share identifier and weight for each fair
share identifier to associate with the scheduling policy.

a. Choose Add share identiﬁer.

b. For Share identifier, specify the fair share identifier. If the string ends with '*', this becomes
a fair share identifier prefix used to match fair share identifiers for jobs. All of the fair share
identifiers and fair share identifier prefixes in a scheduling policy must be unique and cannot
overlap. For example, you can't have fair share identifier prefix 'UserA*' and fair share identifier
'UserA1' in the same scheduling policy.
c. For Weight factor, specify the relative weight for the fair share identifier. The default value is
1.0. A lower value has a higher priority for compute resources. If a fair share identifier prefix
is used, jobs with fair share identifiers that start with the prefix will share the weight factor.
This effectively increases the weight factor for those jobs, lowering their individual priority but
maintaining the same weight factor for the fair share identifier prefix.
8. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
scheduling policy. For more information, see Tagging your AWS Batch resources (p. 197).
9. Choose Submit to finish and create your scheduling policy.
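
The reserved ratio formula from step 6 can be checked with a short Python sketch. The function name reserved_ratio is illustrative. The inputs are the computeReservation value and the number of active fair share identifiers, as defined in that step.

```python
def reserved_ratio(compute_reservation, active_fair_shares):
    """Return (computeReservation/100)^ActiveFairShares as a fraction of maximum vCPUs."""
    if active_fair_shares < 1:
        raise ValueError("active_fair_shares must be at least 1")
    return (compute_reservation / 100) ** active_fair_shares


for reservation in (50, 25):
    for active in (1, 2, 3):
        ratio = reserved_ratio(reservation, active)
        print(f"computeReservation={reservation}, active={active}: {ratio * 100:.2f}%")
```

The printed percentages match the worked examples in step 6 (50%, 25%, and 12.5% for a value of 50; 25%, 6.25%, and about 1.56% for a value of 25).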

Scheduling policy template

An empty scheduling policy template is shown below. You can use this template to create your
scheduling policy, which can then be saved to a file and used with the AWS CLI --cli-input-json
option. For more information about these parameters, see CreateSchedulingPolicy in the AWS Batch API
Reference.

{
  "name": "",
  "fairsharePolicy": {
    "shareDecaySeconds": 0,
    "computeReservation": 0,
    "shareDistribution": [
      {
        "shareIdentifier": "",
        "weightFactor": 0.0
      }
    ]
  },
  "tags": {
    "KeyName": ""
  }
}

Note
You can generate the preceding scheduling policy template with the following AWS CLI command.

$ aws batch create-scheduling-policy --generate-cli-skeleton
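The template can also be filled in programmatically before it is passed to --cli-input-json. The following Python sketch is only an illustration and is not part of AWS Batch: the helper build_scheduling_policy, the policy name, and the share identifiers are hypothetical. The range checks mirror the limits listed under Scheduling policy parameters, and the sketch assumes that the tags key can be left out when no tags are given.

```python
import json


def build_scheduling_policy(name, compute_reservation, share_decay_seconds, shares, tags=None):
    """Return a dict shaped like the scheduling policy template (hypothetical helper)."""
    if not 0 <= compute_reservation <= 99:
        raise ValueError("computeReservation must be between 0 and 99")
    if not 0 <= share_decay_seconds <= 604800:
        raise ValueError("shareDecaySeconds must be between 0 and 604800")
    distribution = []
    for identifier, weight in shares.items():
        if not 0.0001 <= weight <= 999.9999:
            raise ValueError(f"weightFactor out of range for {identifier!r}")
        distribution.append({"shareIdentifier": identifier, "weightFactor": weight})
    policy = {
        "name": name,
        "fairsharePolicy": {
            "shareDecaySeconds": share_decay_seconds,
            "computeReservation": compute_reservation,
            "shareDistribution": distribution,
        },
    }
    if tags:
        # Assumption: tags are optional, so the key is omitted when empty.
        policy["tags"] = dict(tags)
    return policy


policy = build_scheduling_policy(
    "ExampleFairSharePolicy",              # hypothetical policy name
    compute_reservation=50,
    share_decay_seconds=3600,
    shares={"UserA*": 1.0, "UserB": 0.5},  # hypothetical share identifiers
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

If the printed JSON is saved to a file (for example, a hypothetical scheduling-policy.json), it could be supplied as --cli-input-json file://scheduling-policy.json.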

Scheduling policy parameters

Scheduling policies are split into three basic components: the name, fair share policy, and tags of the
scheduling policy.

Scheduling policy name

name

The name for your scheduling policy. Up to 128 letters (uppercase and lowercase), numbers,
hyphens, and underscores are allowed.

Type: String

Required: Yes

Fair share policy

fairsharePolicy

The fair share policy of the scheduling policy.

"fairsharePolicy": {
  "computeReservation": number,
  "shareDecaySeconds": number,
  "shareDistribution": [
    {
      "shareIdentifier": "string",
      "weightFactor": number
    }
  ]
}

Type: Object

Required: No
computeReservation

A value used to reserve some of the available maximum VCPU for fair share identifiers that have
not yet been used.

The reserved ratio is (computeReservation/100)^ActiveFairShares, where ActiveFairShares is the
number of active fair share identifiers.

For example, a computeReservation value of 50 indicates that AWS Batch should reserve
50% of the maximum available VCPU if there is only one active fair share identifier, 25% if there
are two active fair share identifiers, and 12.5% if there are three active fair share identifiers.
A computeReservation value of 25 indicates that AWS Batch should reserve 25% of the
maximum available VCPU if there is only one active fair share identifier, 6.25% if there are two
active fair share identifiers, and 1.56% if there are three active fair share identifiers.

Type: Integer

Valid range: Minimum value of 0. Maximum value of 99.

Required: No
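The reserved ratio formula above can be checked with a few lines of Python. This is a minimal sketch: the function name reserved_fraction is hypothetical, and the sketch assumes at least one active fair share identifier.

```python
def reserved_fraction(compute_reservation: int, active_fair_shares: int) -> float:
    """Fraction of maximum vCPU reserved: (computeReservation/100)^ActiveFairShares."""
    if not 0 <= compute_reservation <= 99:
        raise ValueError("computeReservation must be between 0 and 99")
    if active_fair_shares < 1:
        # Assumption of this sketch: at least one fair share identifier is active.
        raise ValueError("active_fair_shares must be at least 1")
    return (compute_reservation / 100) ** active_fair_shares


# Reproduce the cases described above for computeReservation values of 50 and 25.
for reservation in (50, 25):
    for active in (1, 2, 3):
        percent = reserved_fraction(reservation, active) * 100
        print(f"computeReservation={reservation}, active={active}: {percent:.4g}% reserved")
```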
shareDecaySeconds

The time period to use to calculate a fair share percentage for each fair share identifier in use.
A value of zero (0) indicates that only current usage should be measured. The decay allows for
more recently run jobs to have more weight than jobs that ran earlier.

Type: Integer

Valid range: Minimum value of 0. Maximum value of 604800 (1 week).

Required: No
shareDistribution

Array of objects that contain the weights for the fair share identifiers for the fair share policy.
Fair share identifiers that are not included have a default weight of 1.0.

"shareDistribution": [
  {
    "shareIdentifier": "string",
    "weightFactor": number
  }
]

Type: Array

Required: No
shareIdentifier

A fair share identifier or fair share identifier prefix. If the string ends with '*', then this string
specifies a fair share identifier prefix for fair share identifiers that begin with that prefix. For
example, if the value is UserA* and the weightFactor is 1, and there are two fair share
identifiers that begin with UserA, then each of those fair share identifiers will have a weight
of 2; if there are five such fair share identifiers, then each would have a weight of 5.

The list of fair share identifiers and fair share identifier prefixes in a fair share policy cannot
overlap. For example, you cannot have a fair share identifier prefix of UserA* and a fair
share identifier of UserA-1 in the same fair share policy.

Type: String

Required: Yes
weightFactor

The weight factor for the fair share identifier. The default value is 1.0. A lower value has
a higher priority for compute resources. For example, jobs that use a share identifier with
a weight factor of 0.125 (1/8) get 8 times the compute resources of jobs that use a share
identifier with a weight factor of 1.

The smallest supported value is 0.0001 and the largest supported value is 999.9999.

Type: Float

Required: No
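The overlap rule for shareIdentifier and the weightFactor example can be sketched in Python as follows. The function names are hypothetical. The overlap test is one reading of the rule above (two entries overlap if some job identifier could match both), and relative_share assumes that compute share is inversely proportional to the weight factor, which is how the 0.125 example reads.

```python
from itertools import combinations


def identifiers_overlap(a: str, b: str) -> bool:
    """Return True if two shareIdentifier entries could match the same job identifier."""
    a_is_prefix, b_is_prefix = a.endswith("*"), b.endswith("*")
    a_base = a[:-1] if a_is_prefix else a
    b_base = b[:-1] if b_is_prefix else b
    if a_is_prefix and b_is_prefix:
        return a_base.startswith(b_base) or b_base.startswith(a_base)
    if a_is_prefix:
        return b_base.startswith(a_base)
    if b_is_prefix:
        return a_base.startswith(b_base)
    return a_base == b_base


def find_overlaps(identifiers):
    """List every pair of entries in a share distribution that overlap."""
    return [(a, b) for a, b in combinations(identifiers, 2) if identifiers_overlap(a, b)]


def relative_share(weight_a: float, weight_b: float) -> float:
    """How many times more compute A gets than B, assuming share is proportional to 1/weight."""
    return weight_b / weight_a


print(find_overlaps(["UserA*", "UserA-1", "UserB"]))
print(relative_share(0.125, 1.0))
```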

Tags
tags

Key-value pair tags to associate with the scheduling policy. For more information, see Tagging your
AWS Batch resources (p. 197).

Type: String to string map

Required: No

Orchestrate AWS Batch jobs with Step Functions state machines in the AWS Batch console
You can use the AWS Batch console to view details about your Step Functions state machines and the
AWS Batch jobs that they use.

Sections
• Viewing state machine details (p. 120)
• Editing a state machine (p. 120)
• Running a state machine (p. 121)

Viewing state machine details

The AWS Batch console displays a list of your state machines in the current AWS Region that contain at
least one workflow step that submits an AWS Batch job.

Choose a state machine to view a graphical representation of the workflow. Steps highlighted in blue
represent AWS Batch jobs. Use the graph controls to zoom in, zoom out, and center the graph.
Note
When an AWS Batch job is dynamically referenced with JsonPath in the state machine definition,
the job details cannot be shown in the AWS Batch console. Instead, the job name is listed
as a Dynamic reference, and the corresponding steps in the graph are grayed out.

To view state machine details

1. Open the AWS Batch console Workflow orchestration powered by Step Functions page.
2. Choose a state machine.

The AWS Batch console opens the Details page.

For more information, see Step Functions in the AWS Step Functions Developer Guide.

Editing a state machine

When you want to edit a state machine, AWS Batch opens the Edit definition page of the Step Functions
console.

To edit a state machine

1. Open the AWS Batch console Workflow orchestration powered by Step Functions page.
2. Choose a state machine.
3. Choose Edit.

The Step Functions console opens the Edit definition page.

4. Edit the state machine and choose Save.

For more information about editing state machines, see Step Functions state machine language in the
AWS Step Functions Developer Guide.

Running a state machine

When you want to run a state machine, AWS Batch opens the New execution page of the Step Functions
console.

To run a state machine

1. Open the AWS Batch console Workflow orchestration powered by Step Functions page.
2. Choose a state machine.
3. Choose Execute.

The Step Functions console opens the New execution page.

4. (Optional) Edit the state machine and choose Start execution.

For more information about running state machines, see Step Functions state machine execution
concepts in the AWS Step Functions Developer Guide.

AWS Batch on AWS Fargate

AWS Fargate is a technology that you can use with AWS Batch to run containers without having to
manage servers or clusters of Amazon EC2 instances. With AWS Fargate, you no longer have to provision,
configure, or scale clusters of virtual machines to run containers. This removes the need to choose server
types, decide when to scale your clusters, or optimize cluster packing.

When you run your jobs with Fargate resources, you package your application in containers, specify the
CPU and memory requirements, define networking and IAM policies, and launch the application. Each
Fargate job has its own isolation boundary and does not share the underlying kernel, CPU resources,
memory resources, or elastic network interface with another job.

Contents
• When to use Fargate (p. 122)
• Job definitions on Fargate (p. 122)
• Job queues on Fargate (p. 124)
• Compute environments on Fargate (p. 124)

When to use Fargate

We recommend using Fargate in most scenarios. Fargate launches and scales the compute to closely
match the resource requirements that you specify for the container. With Fargate, you don't need
to over-provision or pay for additional servers. You also don't need to worry about the specifics of
infrastructure-related parameters such as instance type. When the compute environment needs to
be scaled up, jobs that run on Fargate resources can get started more quickly. Typically, it takes a few
minutes to spin up a new Amazon EC2 instance. However, jobs that run on Fargate can be provisioned in
about 30 seconds, depending on the container image size, number of jobs, and other factors.

However, we recommend that you use Amazon EC2 if your jobs require any of the following:

• more than 4 vCPUs
• more than 30 gibibytes (GiB) of memory
• a GPU
• an Arm-based AWS Graviton CPU
• a custom Amazon Machine Image (AMI)
• any of the linuxParameters (p. 48) parameters
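The checklist above can be expressed as a small helper for reviewing job requirements. This is only a sketch: the JobNeeds class and its field names are hypothetical and are not AWS Batch parameters.

```python
from dataclasses import dataclass


@dataclass
class JobNeeds:
    """Hypothetical summary of one job's requirements."""
    vcpus: float
    memory_gib: float
    needs_gpu: bool = False
    needs_graviton: bool = False
    needs_custom_ami: bool = False
    uses_linux_parameters: bool = False


def reasons_to_prefer_ec2(job: JobNeeds) -> list:
    """Return the listed conditions that apply; an empty list suggests Fargate fits."""
    reasons = []
    if job.vcpus > 4:
        reasons.append("more than 4 vCPUs")
    if job.memory_gib > 30:
        reasons.append("more than 30 GiB of memory")
    if job.needs_gpu:
        reasons.append("a GPU")
    if job.needs_graviton:
        reasons.append("an Arm-based AWS Graviton CPU")
    if job.needs_custom_ami:
        reasons.append("a custom AMI")
    if job.uses_linux_parameters:
        reasons.append("linuxParameters")
    return reasons


print(reasons_to_prefer_ec2(JobNeeds(vcpus=0.25, memory_gib=0.5)))
print(reasons_to_prefer_ec2(JobNeeds(vcpus=8, memory_gib=16, needs_gpu=True)))
```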

If you have a large number of jobs, we recommend Amazon EC2 because jobs can be dispatched at a
higher rate to EC2 resources than to Fargate resources. Moreover, more jobs can run concurrently when
you use EC2. For more information, see AWS Fargate service quotas in the Amazon Elastic Container
Service Developer Guide.
Note
AWS Batch does not support Windows containers, on either Fargate or EC2 resources.

Job definitions on Fargate

AWS Batch jobs on Fargate don't support all of the job definition parameters that are available. Some
parameters are not supported at all, and others behave differently for Fargate jobs.

The following list describes job definition parameters that are not valid or otherwise restricted in Fargate
jobs.

platformCapabilities

Must be specified as FARGATE.

"platformCapabilities": [ "FARGATE" ]

type

Must be specified as container.

"type": "container"

Parameters in containerProperties
executionRoleArn

Must be specified for jobs running on Fargate resources. For more information, see IAM Roles for
Tasks in the Amazon Elastic Container Service Developer Guide.

"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole"

fargatePlatformConfiguration

(Optional, only for Fargate job definitions). Specifies the Fargate platform version, or LATEST
for a recent platform version. Possible values for platformVersion are 1.3.0, 1.4.0, and
LATEST (default).

"fargatePlatformConfiguration": { "platformVersion": "1.4.0" }

instanceType, ulimits

Not applicable for jobs running on Fargate resources.

memory, vcpus

These settings must be specified in resourceRequirements.

privileged

Either don't specify this parameter, or specify false.

"privileged": false

resourceRequirements

Both memory and vCPU requirements must be specified, using supported values (p. 56). GPU
resources are not supported for jobs running on Fargate resources.

"resourceRequirements": [
{"type": "MEMORY", "value": "512"},
{"type": "VCPU", "value": "0.25"}
]

Parameters in linuxParameters
devices, maxSwap, sharedMemorySize, swappiness, tmpfs

Not applicable for jobs running on Fargate resources.

Parameters in logConfiguration
logDriver

Only awslogs and fluentd are supported. For more information, see Using the awslogs log
driver (p. 64).
Members in networkConfiguration
assignPublicIp

If the private subnet does not have a NAT gateway attached to send traffic to the Internet,
assignPublicIp must be "ENABLED". For more information, see
AWS Batch execution IAM role (p. 176).
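The restrictions in the preceding list can be combined into one job definition and checked before registration. The following Python sketch is an illustration only: the job definition name, image, and command are hypothetical, and fargate_problems is a hypothetical helper that covers only the rules listed above, not the validation that AWS Batch itself performs.

```python
import json

# Keys the list above marks as not applicable (or relocated) for Fargate jobs.
CONTAINER_KEYS_NOT_ALLOWED = {"instanceType", "ulimits", "memory", "vcpus"}
LINUX_KEYS_NOT_ALLOWED = {"devices", "maxSwap", "sharedMemorySize", "swappiness", "tmpfs"}

job_definition = {
    "jobDefinitionName": "example-fargate-job",  # hypothetical name
    "type": "container",
    "platformCapabilities": ["FARGATE"],
    "containerProperties": {
        "image": "public.ecr.aws/amazonlinux/amazonlinux:latest",  # hypothetical image
        "command": ["echo", "hello"],                              # hypothetical command
        "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
        "resourceRequirements": [
            {"type": "MEMORY", "value": "512"},
            {"type": "VCPU", "value": "0.25"},
        ],
        "fargatePlatformConfiguration": {"platformVersion": "1.4.0"},
        "networkConfiguration": {"assignPublicIp": "ENABLED"},
    },
}


def fargate_problems(definition: dict) -> list:
    """Return the problems this sketch can detect; an empty list means none were found."""
    problems = []
    props = definition.get("containerProperties", {})
    if definition.get("platformCapabilities") != ["FARGATE"]:
        problems.append("platformCapabilities must be FARGATE")
    if definition.get("type") != "container":
        problems.append("type must be container")
    if not props.get("executionRoleArn"):
        problems.append("executionRoleArn is required")
    if props.get("privileged"):
        problems.append("privileged must be omitted or false")
    for key in sorted(CONTAINER_KEYS_NOT_ALLOWED.intersection(props)):
        problems.append(f"{key} is not valid in containerProperties")
    for key in sorted(LINUX_KEYS_NOT_ALLOWED.intersection(props.get("linuxParameters", {}))):
        problems.append(f"linuxParameters.{key} is not applicable")
    types = {req["type"] for req in props.get("resourceRequirements", [])}
    if not {"MEMORY", "VCPU"} <= types:
        problems.append("resourceRequirements must include MEMORY and VCPU")
    if "GPU" in types:
        problems.append("GPU resources are not supported")
    return problems


print(fargate_problems(job_definition))
print(json.dumps(job_definition, indent=2))
```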

Job queues on Fargate

AWS Batch job queues on Fargate are essentially unchanged. The only restriction is that the compute
environments listed in computeEnvironmentOrder must all be Fargate compute environments
(FARGATE or FARGATE_SPOT). EC2 and Fargate compute environments can't be mixed.
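As a rough check of this rule, the following Python sketch takes the computeResources type of each compute environment in a queue's computeEnvironmentOrder and reports whether Fargate and EC2 types are mixed. The function name is hypothetical, and looking up each environment's type is left to the caller.

```python
FARGATE_TYPES = {"FARGATE", "FARGATE_SPOT"}


def queue_mixes_platforms(compute_resource_types) -> bool:
    """Return True if the listed types mix Fargate and non-Fargate compute environments."""
    kinds = {resource_type in FARGATE_TYPES for resource_type in compute_resource_types}
    return len(kinds) > 1


print(queue_mixes_platforms(["FARGATE", "FARGATE_SPOT"]))
print(queue_mixes_platforms(["FARGATE", "EC2"]))
```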

Compute environments on Fargate

AWS Batch compute environments on Fargate don't support all of the compute environment parameters
that are available. Some parameters are not supported at all, and others have specific requirements for
Fargate.

The following list describes compute environment parameters that are not valid or otherwise restricted
in Fargate compute environments.

type

This parameter must be MANAGED.

"type": "MANAGED"

Parameters in the computeResources object

allocationStrategy, bidPercentage, desiredvCpus, imageId, instanceTypes,
ec2Configuration, ec2KeyPair, instanceRole, launchTemplate, minvCpus,
placementGroup, spotIamFleetRole

These aren't applicable for Fargate compute environments and shouldn't be provided.
subnets

If the subnets listed in this parameter don't have NAT gateways attached, the assignPublicIp
parameter in the job definition must be set to ENABLED.
tags

This isn't applicable for Fargate compute environments and shouldn't be provided. To
specify tags for Fargate compute environments, use the tags parameter that's not in the
computeResources object.
type

This must be either FARGATE or FARGATE_SPOT.

"type": "FARGATE_SPOT"

Elastic Fabric Adapter

An Elastic Fabric Adapter (EFA) is a network device to accelerate High Performance Computing (HPC)
applications. AWS Batch supports applications that use EFA if the following conditions are met.

• Compute environment contains only supported instance types (c5n.18xlarge, c5n.metal,
i3en.24xlarge, m5dn.24xlarge, m5n.24xlarge, r5dn.24xlarge, r5n.24xlarge, and
p3dn.24xlarge).
• The OS in the AMI supports EFA: Amazon Linux, Amazon Linux 2, Red Hat Enterprise Linux 7.6, CentOS
7.6, Ubuntu 16.04, Ubuntu 18.04.
• The AMI has the EFA driver loaded.
• The security group for the EFA must allow all inbound and outbound traffic to and from the security
group itself.
• All instances that use an EFA should be in the same cluster placement group.
• The job definition must include a devices member with hostPath set to
/dev/infiniband/uverbs0 to allow the EFA device to be passed through to the container. If
containerPath is specified, it must also be set to /dev/infiniband/uverbs0. If permissions is set,
it must be set to READ | WRITE | MKNOD.

The location of the linuxParameters member is different for multi-node parallel jobs and single-node
container jobs. The examples below demonstrate the differences but are missing required values.

Example for multi-node parallel job

{
  "jobDefinitionName": "EFA-MNP-JobDef",
  "type": "multinode",
  "nodeProperties": {
    ...
    "nodeRangeProperties": [
      {
        ...
        "container": {
          ...
          "linuxParameters": {
            "devices": [
              {
                "hostPath": "/dev/infiniband/uverbs0",
                "containerPath": "/dev/infiniband/uverbs0",
                "permissions": [
                  "READ", "WRITE", "MKNOD"
                ]
              }
            ]
          }
        }
      }
    ]
  }
}

Example for single-node container job

{
  "jobDefinitionName": "EFA-Container-JobDef",
  "type": "container",
  ...
  "containerProperties": {
    ...
    "linuxParameters": {
      "devices": [
        {
          "hostPath": "/dev/infiniband/uverbs0"
        }
      ]
    }
  }
}

For more information about EFA, see Elastic Fabric Adapter in Amazon EC2 User Guide for Linux Instances.
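The device requirements for EFA, and the two locations of linuxParameters, can be checked with a short script before a job definition is registered. The following Python sketch uses hypothetical helper names and assumes that, for multi-node parallel jobs, every node range needs the EFA device.

```python
EFA_DEVICE = "/dev/infiniband/uverbs0"
EFA_PERMISSIONS = {"READ", "WRITE", "MKNOD"}


def is_valid_efa_device(device: dict) -> bool:
    """Check one devices entry against the EFA rules described in this section."""
    if device.get("hostPath") != EFA_DEVICE:
        return False
    if "containerPath" in device and device["containerPath"] != EFA_DEVICE:
        return False
    if "permissions" in device and set(device["permissions"]) != EFA_PERMISSIONS:
        return False
    return True


def linux_parameters(job_definition: dict) -> list:
    """Collect linuxParameters from either job definition layout."""
    if job_definition.get("type") == "multinode":
        ranges = job_definition["nodeProperties"]["nodeRangeProperties"]
        return [node_range["container"].get("linuxParameters", {}) for node_range in ranges]
    return [job_definition["containerProperties"].get("linuxParameters", {})]


def has_efa_device(job_definition: dict) -> bool:
    """Assumption: every node range (or the single container) must pass the EFA device."""
    return all(
        any(is_valid_efa_device(device) for device in params.get("devices", []))
        for params in linux_parameters(job_definition)
    )


single_node = {
    "type": "container",
    "containerProperties": {"linuxParameters": {"devices": [{"hostPath": EFA_DEVICE}]}},
}
print(has_efa_device(single_node))
```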

AWS Batch IAM policies, roles, and permissions
By default, IAM users don't have permission to create or modify AWS Batch resources, or perform tasks
using the AWS Batch API. This means that they also can't do so using the AWS Batch console or the AWS
CLI. To allow IAM users to create or modify resources and submit jobs, you must create IAM policies that
grant IAM users permission to use the specific resources and API operations they need. Then, attach
those policies to the IAM users or groups that require those permissions.

When you attach a policy to a user or group of users, it allows or denies the users permissions to perform
the specified tasks on the specified resources. For more information, see Permissions and Policies in the
IAM User Guide. For more information about managing and creating custom IAM policies, see Managing
IAM Policies.

Likewise, AWS Batch makes calls to other AWS services on your behalf, so the service must authenticate
with your credentials. This authentication is accomplished by creating an IAM role and policy that can
provide these permissions and then associating that role with your compute environments when you
create them. For more information, see Amazon ECS instance role (p. 145), IAM Roles, Using Service-
Linked Roles, and Creating a Role to Delegate Permissions to an AWS Service in the IAM User Guide.

Getting Started

An IAM policy must grant or deny permissions to use one or more AWS Batch actions.

Topics
• Policy structure (p. 127)
• Supported resource-level permissions for AWS Batch API actions (p. 130)
• Example policies (p. 138)
• AWS Batch managed policy (p. 141)
• Creating AWS Batch IAM policies (p. 142)
• AWS Batch service IAM role (p. 142)
• Amazon ECS instance role (p. 145)
• Amazon EC2 spot fleet role (p. 145)
• EventBridge IAM role (p. 147)

Policy structure
The following topics explain the structure of an IAM policy.

Topics
• Policy syntax (p. 128)
• Actions for AWS Batch (p. 128)
• Amazon Resource Names for AWS Batch (p. 129)
• Checking that users have the required permissions (p. 129)

Policy syntax
An IAM policy is a JSON document that consists of one or more statements. Each statement is structured
as follows:

{
  "Statement":[{
      "Effect":"effect",
      "Action":"action",
      "Resource":"arn",
      "Condition":{
        "condition":{
          "key":"value"
        }
      }
    }
  ]
}

There are various elements that make up a statement:

• Effect: The effect can be Allow or Deny. By default, IAM users don't have permission to use resources
and API actions, so all requests are denied. An explicit allow overrides the default. An explicit deny
overrides any allows.
• Action: The action is the specific API action that you're granting or denying permission for. To learn
about specifying an action, see Actions for AWS Batch (p. 128).
• Resource: The resource that's affected by the action. With some AWS Batch API actions, you can
include specific resources in your policy that can be created or modified by the action. To specify a
resource in the statement, use its Amazon Resource Name (ARN). For more information, see Supported
resource-level permissions for AWS Batch API actions (p. 130) and Amazon Resource Names for AWS
Batch (p. 129). If the AWS Batch API operation currently doesn't support resource-level permissions,
you must use the * wildcard to specify that all resources can be affected by the action.
• Condition: Conditions are optional. They can be used to control when your policy is in effect.

For more information about example IAM policy statements for AWS Batch, see Creating AWS Batch IAM
policies (p. 142).
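The statement structure described above can be assembled in code and serialized to JSON. The following Python sketch is an illustration: make_statement is a hypothetical helper, and the "Version" element (set to the current IAM policy language version, 2012-10-17) is an addition that doesn't appear in the skeleton above.

```python
import json


def make_statement(effect, actions, resources, condition=None):
    """Build one policy statement from the elements described above (hypothetical helper)."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("Effect must be Allow or Deny")
    statement = {"Effect": effect, "Action": actions, "Resource": resources}
    if condition:
        # Conditions are optional, so the element is only added when provided.
        statement["Condition"] = condition
    return statement


policy = {
    "Version": "2012-10-17",
    "Statement": [
        # Read-only access; "*" is used because not every action supports resource-level permissions.
        make_statement("Allow", ["batch:Describe*", "batch:ListJobs"], "*"),
    ],
}
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```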

Actions for AWS Batch

In an IAM policy statement, you can specify any API action from any service that supports IAM.
For AWS Batch, use the following prefix with the name of the API action: batch:. For example:
batch:SubmitJob and batch:CreateComputeEnvironment.

To specify multiple actions in a single statement, separate them with commas as follows:

"Action": ["batch:action1", "batch:action2"]

You can also specify multiple actions using wildcards (*). For example, you can specify all actions whose
name begins with the word "Describe" as follows:

"Action": "batch:Describe*"

To specify all AWS Batch API actions, use the wildcard (*) as follows:

"Action": "batch:*"

For a list of AWS Batch actions, see Actions in the AWS Batch API Reference.
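To see which action names a wildcard pattern would cover, you can approximate the matching locally. The following Python sketch is not how IAM evaluates policies. It handles only the * wildcard and assumes that action names are compared without regard to case. The function name action_matches is hypothetical.

```python
import re


def action_matches(pattern: str, action: str) -> bool:
    """Approximate Action matching: '*' matches any run of characters, case-insensitively."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, action, flags=re.IGNORECASE) is not None


for name in ("batch:DescribeJobs", "batch:DescribeJobQueues", "batch:SubmitJob"):
    print(name, action_matches("batch:Describe*", name))
```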

Amazon Resource Names for AWS Batch

Each IAM policy statement applies to the resources that you specify using their ARNs.

An ARN has the following general syntax:

arn:aws:[service]:[region]:[account]:resourceType/resourcePath

service

The service (for example, batch).

region

The Region for the resource (for example, us-east-2).

account

The AWS account ID, with no hyphens (for example, 123456789012).

resourceType

The type of resource (for example, compute-environment).

resourcePath

A path that identifies the resource. You can use the wildcard (*) in your paths.
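The ARN components above can be combined with a small helper. This Python sketch uses the hypothetical function name batch_arn and a hypothetical queue name. It assumes the standard aws partition and a 12-digit account ID.

```python
def batch_arn(region: str, account: str, resource_type: str, resource_path: str) -> str:
    """Assemble an AWS Batch ARN from the components described above."""
    if not (account.isdigit() and len(account) == 12):
        raise ValueError("account must be a 12-digit AWS account ID with no hyphens")
    # Assumption: the standard "aws" partition is used.
    return f"arn:aws:batch:{region}:{account}:{resource_type}/{resource_path}"


# A specific job queue (hypothetical name) and every compute environment in the account.
print(batch_arn("us-east-2", "123456789012", "job-queue", "example-queue"))
print(batch_arn("us-east-2", "123456789012", "compute-environment", "*"))
```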

AWS Batch currently supports resource-level permissions on several API operations.
For more information, see Supported resource-level permissions for AWS Batch API actions (p. 130).
To specify all resources, or if a specific API action doesn't support ARNs, use the wildcard (*) in the
Resource element as follows:

"Resource": "*"

Checking that users have the required permissions

Before you put an IAM policy into production, we recommend that you check whether it grants users the
permissions to use the particular API actions and resources they need.

First, create an IAM user for testing purposes and attach the IAM policy to the test user. Then, make a
request as the test user. You can make test requests in the console or with the AWS CLI.
Note
You can also test your policies with the IAM Policy Simulator. For more information about the
policy simulator, see Working with the IAM Policy Simulator in the IAM User Guide.

If the policy doesn't grant the user the permissions that you expected, or is overly permissive, you can
adjust the policy as needed. Retest until you get the desired results.
Important
It can take several minutes for policy changes to propagate before they take effect. Therefore,
we recommend that you allow five minutes to pass before you test your policy updates.

If an authorization check fails, the request returns an encoded message with diagnostic information. You
can decode the message using the DecodeAuthorizationMessage action. For more information, see
DecodeAuthorizationMessage in the AWS Security Token Service API Reference, and
decode-authorization-message in the AWS CLI Command Reference.

Supported resource-level permissions for AWS Batch API actions
The term resource-level permissions refers to the ability to specify the resources that users are allowed to
perform actions on. AWS Batch has partial support for resource-level permissions. For certain AWS Batch
actions, you can control when users are allowed to use those actions based on conditions that have to be
fulfilled, or specific resources that users are allowed to use. For example, you can grant users permissions
to submit jobs, but only to a specific job queue and only with a specific job definition.
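The job submission example in the preceding paragraph might look like the following policy, built here as a Python dictionary and printed as JSON. This is a sketch under assumptions: the queue name, job definition name, Region, and account ID are hypothetical placeholders, and your own policy might need additional resources or conditions.

```python
import json

REGION = "us-east-2"        # hypothetical Region
ACCOUNT = "123456789012"    # hypothetical account ID

submit_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "batch:SubmitJob",
            "Resource": [
                # Only this job queue (hypothetical name) ...
                f"arn:aws:batch:{REGION}:{ACCOUNT}:job-queue/example-queue",
                # ... and only this job definition (hypothetical name), any revision.
                f"arn:aws:batch:{REGION}:{ACCOUNT}:job-definition/example-definition:*",
            ],
        }
    ],
}

print(json.dumps(submit_only_policy, indent=2))
```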

The following list describes the AWS Batch API actions that currently support resource-level permissions,
as well as the supported resources, resource ARNs, and condition keys for each action.
Important
If an AWS Batch API action isn't listed in this list, then it doesn't support resource-level
permissions. If an AWS Batch API action doesn't support resource-level permissions, you can
grant users permission to use the action, but you have to specify a wildcard (*) for the resource
element of your policy statement.

Actions

CancelJob (p. 130), CreateComputeEnvironment (p. 130), CreateJobQueue (p. 131),
CreateSchedulingPolicy (p. 131), DeleteComputeEnvironment (p. 131),
DeleteJobQueue (p. 132), DeleteSchedulingPolicy (p. 132), DeregisterJobDefinition (p. 132),
ListTagsForResource (p. 132), RegisterJobDefinition (p. 133), SubmitJob (p. 134),
TagResource (p. 134), TerminateJob (p. 135), UntagResource (p. 136),
UpdateComputeEnvironment (p. 136), UpdateSchedulingPolicy (p. 137),
UpdateJobQueue (p. 137)

CancelJob

Cancels a job in an AWS Batch queue.

Resource
Job

arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags associated with the resource.

CreateComputeEnvironment

Creates an AWS Batch compute environment.

Resource
Compute Environment

arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags associated with the resource.

Condition keys
aws:RequestTag/${TagKey} (String)

Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)

Filters actions based on the tag keys that are passed in the request.
CreateJobQueue

Creates an AWS Batch job queue.

Resource
Compute Environment

arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Job Queue

arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Scheduling Policy

arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Condition keys
aws:RequestTag/${TagKey} (String)

Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)

Filters actions based on the tag keys that are passed in the request.
DeleteComputeEnvironment

Deletes an AWS Batch compute environment.

Resource
Compute Environment

arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
CreateSchedulingPolicy

Creates an AWS Batch scheduling policy.

Resource
Scheduling Policy

arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Condition keys
aws:RequestTag/${TagKey} (String)

Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)

Filters actions based on the tag keys that are passed in the request.
DeleteJobQueue

Deletes the specified job queue. Deleting the job queue eventually deletes all of the jobs in the
queue. Jobs are deleted at a rate of about 16 jobs each second.
Resource
Job Queue

arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
DeleteSchedulingPolicy

Deletes the specified scheduling policy.

Resource
Scheduling Policy

arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
DeregisterJobDefinition

Deregisters an AWS Batch job definition.

Resource
Job Definition

arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
ListTagsForResource

Lists the tags for the specified resource.

Resource
Compute Environment

arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Job

arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Job Definition

arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Job Queue

arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Scheduling Policy

arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
RegisterJobDefinition

Registers an AWS Batch job definition.

Resource
Job Definition

arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
Condition keys
batch:AWSLogsCreateGroup (Boolean)

When this parameter is true, the awslogs-group is created for the logs.
batch:AWSLogsGroup (String)

The awslogs group where the logs are located.

batch:AWSLogsRegion (String)

The Region where the logs are sent to.

batch:AWSLogsStreamPrefix (String)

The awslogs log stream prefix.

batch:Image (String)

The Docker image used to start a job.

batch:LogDriver (String)

The log driver used for the job.

batch:Privileged (Boolean)

When this parameter is true, the container for the job is given elevated permissions on the
host container instance (similar to the root user).
batch:User (String)

The user name or numeric uid to use inside the container for the job.
aws:RequestTag/${TagKey} (String)

Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)

Filters actions based on the tag keys that are passed in the request.
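As an illustration of how these condition keys can be used, the following Python sketch builds a policy that denies RegisterJobDefinition when the job definition requests a privileged container. It assumes the standard IAM Bool condition operator with the batch:Privileged key. Treat it as a starting point and not as a recommended policy.

```python
import json

deny_privileged_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "batch:RegisterJobDefinition",
            "Resource": "*",
            # Applies only when the request sets privileged to true.
            "Condition": {"Bool": {"batch:Privileged": "true"}},
        }
    ],
}

print(json.dumps(deny_privileged_policy, indent=2))
```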
SubmitJob

Submits an AWS Batch job from a job definition.

Resource
Job

arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags associated with the resource.

Job Definition

arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags associated with the resource.

Job Queue

arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)

Filters actions based on the tags that are associated with the resource.
TagResource

Tags the specified resource.
