AWS Batch User Guide
Amazon's trademarks and trade dress may not be used in connection with any product or service that is not
Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or
discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may
or may not be affiliated with, connected to, or sponsored by Amazon.
Table of Contents
What Is AWS Batch? (p. 1)
    Components of AWS Batch (p. 1)
        Jobs (p. 1)
        Job Definitions (p. 1)
        Job Queues (p. 1)
        Compute Environment (p. 1)
    Getting Started (p. 2)
Setting Up (p. 3)
    Sign Up for AWS (p. 3)
    Create an IAM User (p. 3)
    Create IAM Roles for your Compute Environments and Container Instances (p. 5)
    Create a Key Pair (p. 5)
    Create a Virtual Private Cloud (p. 7)
    Create a Security Group (p. 7)
    Install the AWS CLI (p. 8)
Getting Started (p. 9)
    Step 1: Define a Job (p. 9)
    Step 2: Configure the Compute Environment and Job Queue (p. 11)
Jobs (p. 14)
    Submitting a Job (p. 14)
    Job States (p. 16)
    Job Environment Variables (p. 17)
    Automated Job Retries (p. 18)
    Job Dependencies (p. 19)
    Job Timeouts (p. 19)
    Array Jobs (p. 20)
        Example Array Job Workflow (p. 21)
        Tutorial: Using array job index (p. 23)
    Multi-node Parallel Jobs (p. 27)
        Environment Variables (p. 28)
        Node Groups (p. 28)
        Job Lifecycle (p. 28)
        Compute Environment Considerations (p. 29)
    GPU Jobs (p. 30)
Job definitions (p. 31)
    Creating a job definition (p. 31)
    Creating a multi-node parallel job definition (p. 36)
    Job definition template (p. 39)
    Job definition parameters (p. 43)
        Job definition name (p. 44)
        Type (p. 44)
        Parameters (p. 44)
        Platform capabilities (p. 45)
        Propagate tags (p. 45)
        Container properties (p. 45)
        Node properties (p. 61)
        Retry strategy (p. 62)
        Tags (p. 64)
        Timeout (p. 64)
    Using the awslogs log driver (p. 64)
        Available awslogs log driver options (p. 65)
        Specifying a log configuration in your job definition (p. 66)
    Specifying sensitive data (p. 67)
        Using Secrets Manager (p. 67)
What Is AWS Batch?
As a fully managed service, AWS Batch helps you to run batch computing workloads of any scale. AWS
Batch automatically provisions compute resources and optimizes the workload distribution based on
the quantity and scale of the workloads. With AWS Batch, there's no need to install or manage batch
computing software, so you can focus your time on analyzing results and solving problems.
Components of AWS Batch
Jobs
A unit of work (such as a shell script, a Linux executable, or a Docker container image) that you submit
to AWS Batch. It has a name, and runs as a containerized application on AWS Fargate or Amazon EC2
resources in your compute environment, using parameters that you specify in a job definition. Jobs can
reference other jobs by name or by ID, and can be dependent on the successful completion of other jobs.
For more information, see Jobs (p. 14).
Job Definitions
A job definition specifies how jobs are to be run. You can think of a job definition as a blueprint for the
resources in your job. You can supply your job with an IAM role to provide access to other AWS resources.
You also specify both memory and CPU requirements. The job definition can also control container
properties, environment variables, and mount points for persistent storage. Many of the specifications in
a job definition can be overridden by specifying new values when you submit individual jobs. For more
information, see Job definitions (p. 31).
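For illustration, a minimal container job definition might look like the following sketch. The name, image, command, and resource values are placeholders rather than values from this guide:
{
    "jobDefinitionName": "my-first-job-def",
    "type": "container",
    "containerProperties": {
        "image": "busybox",
        "command": ["echo", "hello world"],
        "resourceRequirements": [
            { "type": "VCPU", "value": "1" },
            { "type": "MEMORY", "value": "2048" }
        ]
    }
}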
Job Queues
When you submit an AWS Batch job, you submit it to a particular job queue, where the job resides until
it's scheduled onto a compute environment. You associate one or more compute environments with a job
queue. You can also assign priority values for these compute environments and even across job queues
themselves. For example, you can have a high priority queue that you submit time-sensitive jobs to, and
a low priority queue for jobs that can run anytime when compute resources are cheaper.
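For example, the two queues described above might be created with AWS CLI commands similar to the following sketch; the queue and compute environment names are illustrative:
aws batch create-job-queue --job-queue-name high-priority --state ENABLED --priority 10 \
    --compute-environment-order order=1,computeEnvironment=on-demand-ce
aws batch create-job-queue --job-queue-name low-priority --state ENABLED --priority 1 \
    --compute-environment-order order=1,computeEnvironment=spot-ce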
Compute Environment
A compute environment is a set of managed or unmanaged compute resources that are used to run
jobs. With managed compute environments, you can specify the desired compute type (Fargate or EC2) at
several levels of detail. You can set up compute environments that use a particular type of EC2 instance,
such as c5.2xlarge or m5.12xlarge, or you can choose only to specify that you want to use the newest
instance types. You can also specify the minimum, desired, and maximum number of vCPUs for the
environment, along with the amount that you're willing to pay for a Spot Instance as a percentage of
the On-Demand Instance price, and a target set of VPC subnets. AWS Batch efficiently launches, manages,
and terminates compute resources as needed. You can also manage your own compute environments;
in that case, you're responsible for setting up and scaling the instances in the Amazon ECS cluster that
AWS Batch creates for you. For more information, see Compute environment (p. 87).
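For example, a managed EC2 compute environment might be created with an AWS CLI command similar to the following sketch. The subnet, security group, and role values are placeholders for resources that you create during setup:
aws batch create-compute-environment \
    --compute-environment-name my-managed-ce \
    --type MANAGED \
    --state ENABLED \
    --service-role AWSBatchServiceRole \
    --compute-resources type=EC2,minvCpus=0,desiredvCpus=0,maxvCpus=64,instanceTypes=optimal,subnets=subnet-0123456789abcdef0,securityGroupIds=sg-0123456789abcdef0,instanceRole=ecsInstanceRole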
Getting Started
Get started with AWS Batch by creating a job definition, compute environment, and a job queue in the
AWS Batch console.
The AWS Batch first-run wizard gives you the option of creating a compute environment and a job queue
and submitting a sample Hello World job. If you already have a Docker image you want to launch in AWS
Batch, you can create a job definition with that image and submit that to your queue instead. For more
information, see Getting Started with AWS Batch (p. 9).
Setting Up with AWS Batch
Complete the following tasks to get set up for AWS Batch. If you have already completed any of these
steps, you may skip them and move on to installing the AWS CLI.
Sign Up for AWS
If you have an AWS account already, skip to the next task. If you don't have an AWS account, use the
following procedure to create one.
1. Open https://portal.aws.amazon.com/billing/signup.
2. Follow the online instructions.
Part of the sign-up procedure involves receiving a phone call and entering a verification code on the
phone keypad.
Note your AWS account number, because you'll need it for the next task.
Create an IAM User
Signing in to the AWS Management Console requires your password. You can create access keys for your
AWS account to access the command line interface or API. However, we don't recommend that you access AWS using the credentials for your
AWS account; we recommend that you use AWS Identity and Access Management (IAM) instead. Create
an IAM user, and then add the user to an IAM group with administrative permissions or grant this user
administrative permissions. You can then access AWS using a special URL and the IAM user's credentials.
If you signed up for AWS but have not created an IAM user for yourself, you can create one using the IAM
console.
To create an administrator user for yourself and add the user to an administrators group
(console)
1. Sign in to the IAM console as the account owner by choosing Root user and entering your AWS
account email address. On the next page, enter your password.
Note
We strongly recommend that you adhere to the best practice of using the Administrator
IAM user that follows and securely lock away the root user credentials. Sign in as the root
user only to perform a few account and service management tasks.
2. In the navigation pane, choose Users and then choose Add user.
3. For User name, enter Administrator.
4. Select the check box next to AWS Management Console access. Then select Custom password, and
then enter your new password in the text box.
5. (Optional) By default, AWS requires the new user to create a new password when first signing in. You
can clear the check box next to User must create a new password at next sign-in to allow the new
user to reset their password after they sign in.
6. Choose Next: Permissions.
7. Under Set permissions, choose Add user to group.
8. Choose Create group.
9. In the Create group dialog box, for Group name enter Administrators.
10. Choose Filter policies, and then select AWS managed - job function to filter the table contents.
11. In the policy list, select the check box for AdministratorAccess. Then choose Create group.
Note
You must activate IAM user and role access to Billing before you can use the
AdministratorAccess permissions to access the AWS Billing and Cost Management
console. To do this, follow the instructions in step 1 of the tutorial about delegating access
to the billing console.
12. Back in the list of groups, select the check box for your new group. Choose Refresh if necessary to
see the group in the list.
13. Choose Next: Tags.
14. (Optional) Add metadata to the user by attaching tags as key-value pairs. For more information
about using tags in IAM, see Tagging IAM entities in the IAM User Guide.
15. Choose Next: Review to see the list of group memberships to be added to the new user. When you
are ready to proceed, choose Create user.
You can use this same process to create more groups and users and to give your users access to your AWS
account resources. To learn about using policies that restrict user permissions to specific AWS resources,
see Access management and Example policies.
To sign in as this new IAM user, sign out of the AWS console, then use the following URL, where
your_aws_account_id is your AWS account number without the hyphens (for example, if your AWS
account number is 1234-5678-9012, your AWS account ID is 123456789012):
https://your_aws_account_id.signin.aws.amazon.com/console/
Enter the IAM user name and password that you just created. When you're signed in, the navigation bar
displays "your_user_name @ your_aws_account_id".
If you don't want the URL for your sign-in page to contain your AWS account ID, you can create an
account alias. From the IAM dashboard, choose Create Account Alias and enter an alias, such as your
company name. To sign in after you create an account alias, use the following URL:
https://your_account_alias.signin.aws.amazon.com/console/
To verify the sign-in link for IAM users for your account, open the IAM console and check under IAM
users sign-in link on the dashboard.
For more information about IAM, see the AWS Identity and Access Management User Guide.
Create a Key Pair
If you haven't created a key pair already, you can create one using the Amazon EC2 console. Note that
if you plan to launch instances in multiple Regions, you'll need to create a key pair in each Region. For
more information about Regions, see Regions and Availability Zones in the Amazon EC2 User Guide for
Linux Instances.
For more information, see Amazon EC2 Key Pairs in the Amazon EC2 User Guide for Linux Instances.
To connect to your Linux instance from a computer running Mac or Linux, specify the .pem file to your
SSH client with the -i option and the path to your private key. To connect to your Linux instance from a
computer running Windows, you can use either MindTerm or PuTTY. If you plan to use PuTTY, you'll need
to install it and use the following procedure to convert the .pem file to a .ppk file.
4. Choose Load. By default, PuTTYgen displays only files with the extension .ppk. To locate your .pem
file, choose the option to display files of all types.
5. Select the private key file that you created in the previous procedure and choose Open. Choose OK
to dismiss the confirmation dialog box.
6. Choose Save private key. PuTTYgen displays a warning about saving the key without a passphrase.
Choose Yes.
7. Specify the same name for the key that you used for the key pair. PuTTY automatically adds the
.ppk file extension.
Create a Virtual Private Cloud
If you have a default VPC, you also can skip this section and move to the next task, Create a Security
Group (p. 7). To determine whether you have a default VPC, see Supported Platforms in the Amazon
EC2 Console in the Amazon EC2 User Guide for Linux Instances. Otherwise, you can create a nondefault
VPC in your account using the steps below.
For more information about Amazon VPC, see What is Amazon VPC? in the Amazon VPC User Guide.
Create a Security Group
Note that if you plan to launch container instances in multiple Regions, you need to create a security
group in each Region. For more information, see Regions and Availability Zones in the Amazon EC2 User
Guide for Linux Instances.
Note
You need the public IP address of your local computer, which you can get using a service.
For example, we provide the following service: http://checkip.amazonaws.com/ or
https://checkip.amazonaws.com/. To locate another service that provides your IP address, use the
search phrase "what is my IP address." If you are connecting through an Internet service provider
(ISP) or from behind a firewall without a static IP address, you need to find out the range of IP
addresses used by client computers.
6. AWS Batch container instances do not require any inbound ports to be open. However, you might
want to add an SSH rule so you can log into the container instance and examine the containers in
jobs with Docker commands. You can also add rules for HTTP if you want your container instance to
host a job that runs a web server. Complete the following steps to add these optional security group
rules.
On the Inbound tab, create the following rules and choose Create:
• Choose Add Rule. For Type, choose HTTP. For Source, choose Anywhere (0.0.0.0/0).
• Choose Add Rule. For Type, choose SSH. For Source, ensure that Custom IP is selected, and
specify the public IP address of your computer or network in CIDR notation. To specify an
individual IP address in CIDR notation, add the routing prefix /32. For example, if your IP address
is 203.0.113.25, specify 203.0.113.25/32. If your company allocates addresses from a range,
specify the entire range, such as 203.0.113.0/24.
Note
For security reasons, we don't recommend that you allow SSH access from all IP addresses
(0.0.0.0/0) to your instance, except for testing purposes and only for a short time.
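If you prefer the AWS CLI, an SSH rule like the one above can be added with a command similar to the following sketch; the security group ID and IP address are placeholders:
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 22 \
    --cidr 203.0.113.25/32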
Getting Started with AWS Batch
With the AWS Batch first-run wizard, you can create a compute environment and a job queue and can
optionally also submit a sample hello world job. If you already have a Docker image that you want to
launch in AWS Batch, you can create a job definition with that image and submit that to your queue
instead.
Important
Before you begin, be sure that you completed the steps in Setting Up with AWS Batch (p. 3)
and that your AWS user has the required permissions. Admin users don't need to worry about
permissions issues. For more information, see Creating Your First IAM Admin User and Group in
the IAM User Guide.
Step 1: Define a Job
1. If you're creating a new job definition, for Job definition name, specify a name for your job
definition.
2. (Optional) For Job role, specify an IAM role that provides the container in your job with permissions
to use the AWS APIs. This feature uses Amazon ECS IAM roles for task functionality. For more
information about this feature, including configuration prerequisites, see IAM Roles for Tasks in the
Amazon Elastic Container Service Developer Guide.
Note
Only roles that have the Amazon Elastic Container Service Task Role trust relationship are
shown here. For instructions on creating an IAM role for your AWS Batch jobs, see Creating
an IAM Role and Policy for your Tasks in the Amazon Elastic Container Service Developer
Guide.
3. For Container image, choose the Docker image to use for your job. By default, images in the Docker
Hub registry are available. Optionally, you can also specify other repositories with repository-
url/image:tag. The parameter can be up to 255 characters in length. It can contain uppercase and
lowercase letters, numbers, hyphens (-), underscores (_), colons (:), periods (.), forward slashes (/),
and number signs (#). The parameter maps to Image in the Create a container section of the Docker
Remote API and the IMAGE parameter of docker run.
Note
Docker image architecture must match the processor architecture of the compute resources
that they're scheduled on. For example, ARM-based Docker images can only run on ARM-
based compute resources.
1. For Command, specify the command to pass to the container. This parameter maps to Cmd in the
Create a container section of the Docker Remote API and the COMMAND parameter to docker run. For
more information about the Docker CMD parameter, see https://docs.docker.com/engine/reference/builder/#cmd.
Note
You can use parameter substitution default values and placeholders in your command. For
more information, see Parameters (p. 44).
2. For vCPUs, specify the number of vCPUs to reserve for the container. This parameter maps to
CpuShares in the Create a container section of the Docker Remote API and the --cpu-shares
option to docker run. Each vCPU is equivalent to 1,024 CPU shares.
3. For Memory, specify the hard limit (in MiB) of memory to present to the job's container. If your
container attempts to exceed the memory specified here, the container is stopped. This parameter
maps to Memory in the Create a container section of the Docker Remote API and the --memory
option to docker run.
4. For Job attempts, specify the maximum number of times to attempt your job (in case it fails). For
more information, see Automated Job Retries (p. 18).
Parameters
(Optional) Specify parameter substitution default values and placeholders in your command. For more
information, see Parameters (p. 44).
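For example, a job definition might declare a placeholder in its command and a default value in its parameters; the names below are illustrative, not part of this guide's examples:
"command": ["echo", "Ref::message"],
"parameters": { "message": "hello from the job definition" }
At submission time, you can override the default with a command similar to:
aws batch submit-job --job-name test-parameters --job-queue my-queue \
    --job-definition my-job-def --parameters message="hello from submit-job"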
(Optional) Specify environment variables to pass to your job's container. This parameter maps to Env in
the Create a container section of the Docker Remote API and the --env option to docker run.
Important
We don't recommend that you use plaintext environment variables for sensitive information,
such as credential data.
Step 2: Configure the Compute Environment and Job Queue
1. For Compute environment name, specify a unique name for your compute environment.
2. For Service role, choose to create a new role or use an existing role that allows the AWS Batch
service to make calls to the required AWS APIs on your behalf. For more information, see
AWS Batch service IAM role (p. 142). If you choose to create a new role, the required role
(AWSBatchServiceRole) is created for you.
3. For EC2 instance role, choose to create a new role or use an existing role that allows the Amazon
ECS container instances that are created for your compute environment to make calls to the required
AWS APIs on your behalf. For more information, see Amazon ECS instance role (p. 145). If you
choose to create a new role, the required role (ecsInstanceRole) is created for you.
1. For Provisioning model, choose On-Demand to launch Amazon EC2 On-Demand instances or Spot
to use Amazon EC2 Spot Instances.
2. If you chose to use Amazon EC2 Spot Instances:
a. For Maximum bid price, choose the maximum percentage that a Spot Instance price can be
when compared with the On-Demand price for that instance type before instances are launched.
For example, if your bid percentage is 20%, then the Spot price must be less than 20% of the
current On-Demand price for that EC2 instance. You always pay the lowest (market) price and
never more than your maximum percentage.
b. For Spot fleet role, choose to create a new role or use an existing Amazon EC2 Spot Fleet
IAM role to apply to your Spot compute environment. If you choose to create a new role, the
required role (aws-ec2-spot-fleet-role) is created for you. For more information, see
Amazon EC2 spot fleet role (p. 145).
3. For Allowed instance types, choose the Amazon EC2 instance types that can be launched. You
can specify instance families to launch any instance type within those families (for example, c5,
c5n, or p3), or you can specify specific sizes within a family (such as c5.8xlarge). Note that metal
instance types aren't in the instance families (for example, c5 doesn't include c5.metal). You can
also choose optimal to pick instance types (from the C4, M4, and R4 instance families) on the fly
that match the demand of your job queues.
Note
When you create a compute environment, the instance types that you select for the
compute environment must share the same architecture. For example, you can't mix x86
and ARM instances in the same compute environment.
Note
Currently, optimal uses instance types from the C4, M4, and R4 instance families. In
Regions that don't have instance types from those instance families, instance types from
the C5, M5, and R5 instance families are used.
4. For Minimum vCPUs, choose the minimum number of EC2 vCPUs that your compute environment
should maintain, regardless of job queue demand.
5. For Desired vCPUs, choose the number of EC2 vCPUs that your compute environment should launch
with. As your job queue demand increases, AWS Batch can increase the desired number of vCPUs
in your compute environment and add EC2 instances, up to the maximum vCPUs, and as demand
decreases, AWS Batch can decrease the desired number of vCPUs in your compute environment and
remove instances, down to the minimum vCPUs.
6. For Maximum vCPUs, choose the maximum number of EC2 vCPUs that your compute environment
can scale out to, regardless of job queue demand.
Compute resources are launched into the VPC and subnets that you specify here. This way, you can
control the network isolation of AWS Batch compute resources.
Important
Compute resources need access to communicate with the Amazon ECS service endpoint. This
can be through an interface VPC endpoint or through your compute resources having public IP
addresses.
For more information about interface VPC endpoints, see Amazon ECS Interface VPC Endpoints
(AWS PrivateLink) in the Amazon Elastic Container Service Developer Guide.
If you do not have an interface VPC endpoint configured and your compute resources do not
have public IP addresses, then they must use network address translation (NAT) to provide
this access. For more information, see NAT gateways in the Amazon VPC User Guide. For more
information, see Tutorial: Creating a VPC with Public and Private Subnets for Your Compute
Environments (p. 165).
(Optional) Apply key-value pair tags to instances that are launched in your compute environment. For
example, you can specify "Name": "AWS Batch Instance - C4OnDemand" as a tag so that each
instance in your compute environment has that name. This is helpful for recognizing your AWS Batch
instances in the Amazon EC2 console. By default, the compute environment name is used to tag your
instances.
Submit your jobs to a job queue, which stores jobs until the AWS Batch scheduler runs the job on a
compute resource within your compute environment.
• For Job queue name, choose a unique name for your job queue.
The Connected compute environments for this job queue section shows that your new compute
environment is associated with your new job queue and its order. Later, you can associate other compute
environments with the job queue. The job scheduler uses the compute environment order to determine
which compute environment should start a given job. Compute environments must be in the VALID state
before you can associate them with a job queue. You can associate up to three compute environments
with a job queue.
• Review the compute environment and job queue configuration and choose Create to create your
compute environment.
Jobs
Jobs are the unit of work invoked by AWS Batch. Jobs can be invoked as containerized applications
running on Amazon ECS container instances in an ECS cluster.
Containerized jobs can reference a container image, command, and parameters. For more information,
see Job definition parameters (p. 43).
Topics
• Submitting a Job (p. 14)
• Job States (p. 16)
• AWS Batch Job Environment Variables (p. 17)
• Automated Job Retries (p. 18)
• Job Dependencies (p. 19)
• Job Timeouts (p. 19)
• Array Jobs (p. 20)
• Multi-node Parallel Jobs (p. 27)
• GPU Jobs (p. 30)
Submitting a Job
After you have registered a job definition, you can submit it as a job to an AWS Batch job queue. Many of
the parameters that are specified in the job definition can be overridden at runtime.
To submit a job
a. For Job depends on, enter the job IDs for any jobs that must finish before this job starts.
b. (Array jobs only) For N-To-N job dependencies, specify one or more job IDs for any array jobs
for which each child job index of this job should depend on the corresponding child index job of
the dependency. For example, JobB:1 depends on JobA:1, and so on.
c. (Array jobs only) Select Run children sequentially to create a SEQUENTIAL dependency for the
current array job. This ensures that each child index job waits for its earlier sibling to finish. For
example, JobA:1 depends on JobA:0 and so on.
10. For Job attempts, specify the maximum number of times to attempt your job (in case it fails). For
more information, see Automated Job Retries (p. 18).
11. (Optional) For Execution timeout, specify the maximum number of seconds to allow your job
attempts to run. If an attempt exceeds the timeout duration, it is stopped and the status moves to
FAILED. For more information, see Job Timeouts (p. 19).
Important
Jobs running on Fargate resources can't expect to run for more than 14 days. After 14 days,
the Fargate resources may no longer be available and the job will be terminated.
12. (Optional) In the Parameters section, you can specify parameter substitution default values
and placeholders to use in the command that your job's container runs when it starts. For more
information, see Parameters (p. 44).
The job will run on a container with the specified number of GPUs pinned to that container.
16. For Command, specify the command to pass to the container. For simple commands, you can type
the command as you would at a command prompt in the Space delimited tab. Verify that the
JSON result (which is passed to the Docker daemon) is correct. For more complicated commands
(for example, with special characters), you can switch to the JSON tab and enter the string array
equivalent there.
This parameter maps to Cmd in the Create a container section of the Docker Remote API and the
COMMAND parameter to docker run. For more information about the Docker CMD parameter, go to
https://docs.docker.com/engine/reference/builder/#cmd.
Note
You can use parameter substitution default values and placeholders in your command. For
more information, see Parameters (p. 44).
17. (Optional) You can specify environment variables to pass to your job's container. This parameter
maps to Env in the Create a container section of the Docker Remote API and the --env option to
docker run.
Important
We do not recommend using plaintext environment variables for sensitive information, such
as credential data.
Note
Environment variables must not start with AWS_BATCH; this naming convention is
reserved for variables that are set by the AWS Batch service.
c. For Value, specify the value for your environment variable.
18. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
job. For more information, see Tagging your AWS Batch resources (p. 197).
19. Choose Submit job.
Note
Logs for RUNNING, SUCCEEDED, and FAILED jobs are available in CloudWatch
Logs; the log group is /aws/batch/job, and the log stream name format is
first200CharsOfJobDefinitionName/default/ecs_task_id (this format may
change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
name with the DescribeJobs API operation. For more information, see View Log Data Sent
to CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are
set to never expire, but you can modify the retention period. For more information, see
Change Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
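For example, with the AWS CLI you might retrieve a job's log stream name and then fetch its log events; the job ID and log stream name are placeholders:
aws batch describe-jobs --jobs <job-id> \
    --query "jobs[0].container.logStreamName" --output text
aws logs get-log-events --log-group-name /aws/batch/job \
    --log-stream-name <log-stream-name>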
Job States
When you submit a job to an AWS Batch job queue, the job enters the SUBMITTED state. It then passes
through the following states until it succeeds (exits with code 0) or fails (exits with a non-zero code). AWS
Batch jobs can have the following states:
SUBMITTED
A job that has been submitted to the queue, and has not yet been evaluated by the scheduler. The
scheduler evaluates the job to determine if it has any outstanding dependencies on the successful
completion of any other jobs. If there are dependencies, the job is moved to PENDING. If there are no
dependencies, the job is moved to RUNNABLE.
PENDING
A job that resides in the queue and isn't yet able to run due to a dependency on another job or
resource. After the dependencies are satisfied, the job is moved to RUNNABLE.
RUNNABLE
A job that resides in the queue, has no outstanding dependencies, and is therefore ready to be
scheduled to a host. Jobs in this state are started as soon as sufficient resources are available in one
of the compute environments that are mapped to the job's queue. However, jobs can remain in this
state indefinitely when sufficient resources are unavailable.
Note
If your jobs do not progress to STARTING, see Jobs Stuck in RUNNABLE Status (p. 203) in
the troubleshooting section.
STARTING
These jobs have been scheduled to a host and the relevant container initiation operations are
underway. After the container image is pulled and the container is up and running, the job
transitions to RUNNING.
RUNNING
The job is running as a container job on an Amazon ECS container instance within a compute
environment. When the job's container exits, the process exit code determines whether the job
succeeded or failed. An exit code of 0 indicates success, and any non-zero exit code indicates failure.
If the job associated with a failed attempt has any remaining attempts left in its optional retry
strategy configuration, the job is moved to RUNNABLE again. For more information, see Automated
Job Retries (p. 18).
Note
Logs for RUNNING jobs are available in CloudWatch Logs; the log group is /aws/batch/
job, and the log stream name format is first200CharsOfJobDefinitionName/
default/ecs_task_id (this format may change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
name with the DescribeJobs API operation. For more information, see View Log Data Sent
to CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are
set to never expire, but you can modify the retention period. For more information, see
Change Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
SUCCEEDED
The job has successfully completed with an exit code of 0. The job state for SUCCEEDED jobs is
persisted in AWS Batch for at least 24 hours.
Note
Logs for SUCCEEDED jobs are available in CloudWatch Logs; the log group is /aws/batch/
job, and the log stream name format is first200CharsOfJobDefinitionName/
default/ecs_task_id (this format may change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
name with the DescribeJobs API operation. For more information, see View Log Data Sent
to CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are
set to never expire, but you can modify the retention period. For more information, see
Change Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
FAILED
The job has failed all available attempts. The job state for FAILED jobs is persisted in AWS Batch for
at least 24 hours.
Note
Logs for FAILED jobs are available in CloudWatch Logs; the log group is /aws/batch/
job, and the log stream name format is first200CharsOfJobDefinitionName/
default/ecs_task_id (this format may change in the future).
After a job reaches the RUNNING status, you can programmatically retrieve its log stream
with the DescribeJobs API operation. For more information, see View Log Data Sent to
CloudWatch Logs in the Amazon CloudWatch Logs User Guide. By default, these logs are set
to never expire, but you can modify the retention period. For more information, see Change
Log Data Retention in CloudWatch Logs in the Amazon CloudWatch Logs User Guide.
AWS Batch Job Environment Variables
AWS Batch sets several environment variables in your job's container so that your application code can
inspect details about its own runtime environment.
AWS_BATCH_CE_NAME
This variable is set to the name of the compute environment in which your job is placed.
AWS_BATCH_JOB_ARRAY_INDEX
This variable is only set in child array jobs. The array job index begins at 0, and each child job
receives a unique index number. For example, an array job with 10 children has index values of 0-9.
You can use this index value to control how your array job children are differentiated. For more
information, see Tutorial: Using the array job index to control job differentiation (p. 23).
AWS_BATCH_JOB_ATTEMPT
This variable is set to the job attempt number. The first attempt is numbered 1. For more
information, see Automated Job Retries (p. 18).
AWS_BATCH_JOB_ID
This variable is set to the AWS Batch job ID.
AWS_BATCH_JOB_MAIN_NODE_INDEX
This variable is only set in multi-node parallel jobs. This variable is set to the index number of the
job's main node. Your application code can compare the AWS_BATCH_JOB_MAIN_NODE_INDEX to
the AWS_BATCH_JOB_NODE_INDEX on an individual node to determine if it is the main node.
AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS
This variable is only set in multi-node parallel job child nodes (it isn't present on the main node).
This variable is set to the private IPv4 address of the job's main node. Your child node's application
code can use this address to communicate with the main node.
AWS_BATCH_JOB_NODE_INDEX
This variable is only set in multi-node parallel jobs. This variable is set to the node index number of
the node. The node index begins at 0, and each node receives a unique index number. For example, a
multi-node parallel job with 10 children has index values of 0-9.
AWS_BATCH_JOB_NUM_NODES
This variable is only set in multi-node parallel jobs. This variable is set to the number of nodes that
you have requested for your multi-node parallel job.
AWS_BATCH_JQ_NAME
This variable is set to the name of the job queue to which your job was submitted.
Automated Job Retries
When a job is submitted to a job queue and placed into the RUNNING state, that is considered an
attempt. By default, each job is given one attempt to move to either the SUCCEEDED or FAILED job
state. However, both the job definition and the job submission workflows allow you to specify a retry
strategy with between 1 and 10 attempts. For more information, see Retry strategy (p. 62).
If a job attempt fails for any reason, and the number of attempts specified in the retry configuration is
greater than the AWS_BATCH_JOB_ATTEMPT number, then the job is placed back in the RUNNABLE state.
For more information, see Job States (p. 16).
Note
Jobs that have been cancelled or terminated are not retried. Also, jobs that fail due to an invalid
job definition are not retried.
For more information, see Creating a job definition (p. 31) and Submitting a Job (p. 14).
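For example, a retry strategy can be set in the job definition or per job at submission time. Both forms below are sketches with illustrative names:
"retryStrategy": { "attempts": 3 }
aws batch submit-job --job-name retry-test --job-queue my-queue \
    --job-definition my-job-def --retry-strategy attempts=3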
Job Dependencies
When you submit an AWS Batch job, you can specify the job IDs on which the job depends. When you
do so, the AWS Batch scheduler ensures that your job is run only after the specified dependencies
have successfully completed. After they succeed, the dependent job transitions from PENDING to
RUNNABLE and then to STARTING and RUNNING. If any of the job dependencies fail, the dependent job
automatically transitions from PENDING to FAILED.
For example, Job A can express a dependency on up to 20 other jobs that must succeed before it can run.
You can then submit additional jobs that have a dependency on Job A and up to 19 other jobs.
For array jobs, you can specify a SEQUENTIAL type dependency without specifying a job ID so that
each child array job completes sequentially, starting at index 0. You can also specify an N_TO_N type
dependency with a job ID. That way, each index child of this job must wait for the corresponding
index child of each dependency to complete before it can begin. For more information, see Array
Jobs (p. 20).
To submit an AWS Batch job with dependencies, see Submitting a Job (p. 14).
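For example, dependencies can be expressed at submission time with the AWS CLI; the names and job IDs below are placeholders. The second command submits an array job whose child indexes each depend on the corresponding child index of another array job:
aws batch submit-job --job-name JobB --job-queue my-queue \
    --job-definition my-job-def --depends-on jobId=<JobA-job-id>
aws batch submit-job --job-name JobC --job-queue my-queue \
    --job-definition my-job-def --array-properties size=100 \
    --depends-on jobId=<JobB-job-id>,type=N_TO_N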
Job Timeouts
You can configure a timeout duration for your jobs so that if a job runs longer than that, AWS Batch
terminates the job. For example, you might have a job that you know should only take 15 minutes to
complete. Sometimes your application gets stuck in a loop and runs forever, so you can set a timeout of
30 minutes to terminate the stuck job.
You specify an attemptDurationSeconds parameter, which must be at least 60 seconds, either in your
job definition, or when you submit the job. When this number of seconds has passed following the job
attempt's startedAt timestamp, AWS Batch terminates the job. On the compute resource, your job's
container receives a SIGTERM signal to give your application a chance to shut down gracefully. If the
container is still running after 30 seconds, a SIGKILL signal is sent to forcefully shut down the container.
Timeout terminations are handled on a best-effort basis. You shouldn't expect your timeout termination
to happen exactly when the job attempt times out (it may take a few seconds longer). If your application
requires precise timeout execution, you should implement this logic within the application. If you have
a large number of jobs timing out concurrently, the timeout terminations behave as a first in, first out
queue, where jobs are terminated in batches.
If a job is terminated for exceeding the timeout duration, it isn't retried. If a job attempt fails on its own,
then it can retry if retries are enabled, and the timeout countdown is started over for the new attempt.
Important
Jobs running on Fargate resources can't expect to run for more than 14 days. If the timeout
duration exceeds 14 days, the Fargate resources may no longer be available and the job will be
terminated.
For array jobs, child jobs have the same timeout configuration as the parent job.
For information about submitting an AWS Batch job with a timeout configuration, see Submitting a
Job (p. 14).
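For example, a 30-minute timeout can be set in the job definition or per job at submission time; both sketches below use illustrative names:
"timeout": { "attemptDurationSeconds": 1800 }
aws batch submit-job --job-name timeout-test --job-queue my-queue \
    --job-definition my-job-def --timeout attemptDurationSeconds=1800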
Array Jobs
An array job is a job that shares common parameters, such as the job definition, vCPUs, and memory. It
runs as a collection of related, yet separate, basic jobs that may be distributed across multiple hosts and
may run concurrently. Array jobs are the most efficient way to run extremely parallel jobs such as Monte
Carlo simulations, parametric sweeps, or large rendering jobs.
AWS Batch array jobs are submitted just like regular jobs. However, you specify an array size (between 2
and 10,000) to define how many child jobs should run in the array. If you submit a job with an array size
of 1000, a single job runs and spawns 1000 child jobs. The array job is a reference or pointer to manage
all the child jobs. This allows you to submit large workloads with a single query.
When you submit an array job, the parent array job gets a normal AWS Batch job ID. Each child job has
the same base ID, but the array index for the child job is appended to the end of the parent ID, such as
example_job_ID:0 for the first child job of the array.
For array job dependencies, you can specify a type for a dependency, such as SEQUENTIAL or N_TO_N.
You can specify a SEQUENTIAL type dependency (without specifying a job ID) so that each child array job
completes sequentially, starting at index 0. For example, if you submit an array job with an array size of
100, and specify a dependency with type SEQUENTIAL, 100 child jobs are spawned sequentially, where
the first child job must succeed before the next child job starts. The figure below shows Job A, an array
job with an array size of 10. Each job in Job A's child index is dependent on the previous child job. Job A:1
can't start until job A:0 finishes.
You can also specify an N_TO_N type dependency with a job ID for array jobs so that each index child of
this job must wait for the corresponding index child of each dependency to complete before it can begin.
The figure below shows Job A and Job B, two array jobs with an array size of 10,000 each. Each job in Job
B's child index is dependent on the corresponding index in Job A. Job B:1 can't start until job A:1 finishes.
If you cancel or terminate a parent array job, all of the child jobs are cancelled or terminated with it. You
can cancel or terminate individual child jobs (which moves them to the FAILED status) without affecting
the other child jobs. However, if a child array job fails (on its own or by cancelling/terminating manually),
the parent job also fails.
Example Array Job Workflow
A common pattern is to chain jobs and array jobs together with dependencies. For example:
• JobA: A standard, non-array job that performs a quick listing and metadata validation of objects in an
Amazon S3 bucket, BucketA. The SubmitJob JSON syntax is shown below.
{
"jobName": "JobA",
"jobQueue": "ProdQueue",
"jobDefinition": "JobA-list-and-validate:1"
}
• JobB: An array job with 10,000 copies that is dependent upon JobA, that runs CPU-intensive
commands against each object in BucketA and uploads results to BucketB. The SubmitJob JSON
syntax is shown below.
{
"jobName": "JobB",
"jobQueue": "ProdQueue",
"jobDefinition": "JobB-CPU-Intensive-Processing:1",
"containerOverrides": {
"resourceRequirements": [
{
"type": "MEMORY",
"value": "4096"
},
{
"type": "VCPU",
"value": "32"
}
]
},
"arrayProperties": {
"size": 10000
},
"dependsOn": [
{
"jobId": "JobA_job_ID"
}
]
}
• JobC: Another 10,000 copy array job that is dependent upon JobB with an N_TO_N dependency
model, that runs memory-intensive commands against each item in BucketB, writes metadata to
DynamoDB, and uploads the resulting output to BucketC. The SubmitJob JSON syntax is shown
below.
{
"jobName": "JobC",
"jobQueue": "ProdQueue",
"jobDefinition": "JobC-Memory-Intensive-Processing:1",
"containerOverrides": {
"resourceRequirements": [
{
"type": "MEMORY",
"value": "32768"
},
{
"type": "VCPU",
"value": "1"
}
]
},
"arrayProperties": {
"size": 10000
},
"dependsOn": [
{
"jobId": "JobB_job_ID",
"type": "N_TO_N"
}
]
}
• JobD: An array job that performs 10 validation steps that each need to query DynamoDB and may
interact with any of the above Amazon S3 buckets. Each of the steps in JobD run the same command,
but the behavior is different based on the value of the AWS_BATCH_JOB_ARRAY_INDEX environment
variable within the job's container. These validation steps run sequentially (for example, JobD:0, then
JobD:1, and so on). The SubmitJob JSON syntax is shown below.
{
"jobName": "JobD",
"jobQueue": "ProdQueue",
"jobDefinition": "JobD-Sequential-Validation:1",
"containerOverrides": {
"resourceRequirements": [
{
"type": "MEMORY",
"value": "32768"
},
{
"type": "VCPU",
"value": "1"
}
]
},
"arrayProperties": {
"size": 10
},
"dependsOn": [
{
"jobId": "JobC_job_ID"
},
{
"type": "SEQUENTIAL"
}
]
}
• JobE: A final, non-array job that performs some simple cleanup operations and sends an Amazon
SNS notification with a message that the pipeline has completed and a link to the output URL. The
SubmitJob JSON syntax is shown below.
{
"jobName": "JobE",
"jobQueue": "ProdQueue",
"jobDefinition": "JobE-Cleanup-and-Notification:1",
"parameters": {
"SourceBucket": "s3://JobD-Output-Bucket",
"Recipient": "[email protected]"
},
"dependsOn": [
{
"jobId": "JobD_job_ID"
}
]
}
Tutorial: Using the array job index to control job differentiation
In this tutorial, you create a text file that has all of the colors of the rainbow, each on its own line. Then,
you create an entrypoint script for a Docker container that converts the index into a value that can be
used for a line number in the color file. The index starts at zero, but line numbers start at one. Create
a Dockerfile that copies the color file and the entrypoint script to the container image and sets the
ENTRYPOINT for the image to the entrypoint script. The Dockerfile and resources are built into a Docker image that's pushed
to Amazon ECR. You then register a job definition that uses your new container image, submit an AWS
Batch array job with that job definition, and view the results.
Prerequisites
This tutorial has the following prerequisites:
• An AWS Batch compute environment. For more information, see Creating a compute
environment (p. 99).
• An AWS Batch job queue and associated compute environment. For more information, see Creating a
job queue (p. 82).
• The AWS CLI installed on your local system. For more information, see Installing the AWS Command
Line Interface in the AWS Command Line Interface User Guide.
• Docker installed on your local system. For more information, see About Docker CE in the Docker
documentation.
1. Create a new directory to use as your Docker image workspace and navigate to it.
2. Create a file named colors.txt in your workspace directory and paste the following into it.
red
orange
yellow
green
blue
indigo
violet
3. Create a file named print-color.sh in your workspace directory and paste the following into it.
Note
The LINE variable is set to the AWS_BATCH_JOB_ARRAY_INDEX + 1 because the array
index starts at 0, but line numbers start at 1. The COLOR variable is set to the color in
colors.txt that's associated with its line number.
#!/bin/sh
LINE=$((AWS_BATCH_JOB_ARRAY_INDEX + 1))
COLOR=$(sed -n ${LINE}p /tmp/colors.txt)
echo My favorite color of the rainbow is $COLOR.
4. Create a file named Dockerfile in your workspace directory and paste the contents below into it.
This Dockerfile copies the previous files to your container and sets the entrypoint script to run when
the container starts.
FROM busybox
COPY print-color.sh /tmp/print-color.sh
COPY colors.txt /tmp/colors.txt
RUN chmod +x /tmp/print-color.sh
ENTRYPOINT /tmp/print-color.sh
5. Build your Docker image from your workspace directory. Name the image print-color so that it
matches the image name used in the test script in the next step; a typical build command is shown below.
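For example, run the following from the workspace directory that contains your Dockerfile:
docker build -t print-color .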
6. Test your container with the following script. This script sets the AWS_BATCH_JOB_ARRAY_INDEX
variable to 0 locally and then increments it to simulate what an array job with seven children does.
AWS_BATCH_JOB_ARRAY_INDEX=0
while [ $AWS_BATCH_JOB_ARRAY_INDEX -le 6 ]
do
docker run -e AWS_BATCH_JOB_ARRAY_INDEX=$AWS_BATCH_JOB_ARRAY_INDEX print-color
AWS_BATCH_JOB_ARRAY_INDEX=$((AWS_BATCH_JOB_ARRAY_INDEX + 1))
done
1. Create an Amazon ECR image repository to store your container image. This example only uses the
AWS CLI, but you can also use the AWS Management Console. For more information, see Creating a
Repository in the Amazon Elastic Container Registry User Guide.
2. Tag your print-color image with your Amazon ECR repository URI that was returned from the
previous step.
3. Log in to your Amazon ECR registry. For more information, see Registry Authentication in the
Amazon Elastic Container Registry User Guide.
4. Push your tagged print-color image to your Amazon ECR repository; a typical push command is
shown below.
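For example, replacing the repository URI with the URI returned when you created your repository:
docker push aws_account_id.dkr.ecr.region.amazonaws.com/print-color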
1. Create a file named print-color-job-def.json in your workspace directory and paste the
following into it. Replace the image repository URI with your own image's URI.
{
"jobDefinitionName": "print-color",
"type": "container",
"containerProperties": {
"image": "aws_account_id.dkr.ecr.region.amazonaws.com/print-color",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "250"
},
{
"type": "VCPU",
"value": "1"
}
]
}
}
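The job definition then needs to be registered, a step that isn't shown above. With the AWS CLI, that might look like the following command, where the file name matches the file created in the previous step:
aws batch register-job-definition --cli-input-json file://print-color-job-def.json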
1. Create a file named print-color-job.json in your workspace directory and paste the following
into it.
Note
This example assumes the default job queue name that's created by the AWS Batch first-run
wizard. If your job queue name is different, replace the first-run-job-queue name with
your job queue name.
{
"jobName": "print-color",
"jobQueue": "first-run-job-queue",
"arrayProperties": {
"size": 7
},
"jobDefinition": "print-color"
}
2. Submit the job to your AWS Batch job queue. Note the job ID that's returned in the output.
3. Describe the job's status and wait for the job to move to SUCCEEDED.
7. View the logs for the other child jobs. Each job returns a different color of the rainbow.
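If you use the AWS CLI for the submission and verification steps above, the commands might look like the following sketch. The parent job ID and log stream name are placeholders for values returned by the earlier commands:
aws batch submit-job --cli-input-json file://print-color-job.json
aws batch describe-jobs --jobs <parent-job-id>
aws batch describe-jobs --jobs <parent-job-id>:0 \
    --query "jobs[0].container.logStreamName" --output text
aws logs get-log-events --log-group-name /aws/batch/job \
    --log-stream-name <log-stream-name>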
Multi-node Parallel Jobs
With multi-node parallel jobs, you can run a single AWS Batch job that spans multiple Amazon EC2
instances, which is useful for tightly coupled workloads such as distributed high performance computing.
Multi-node parallel jobs are submitted as a single job. However, your job definition (or job submission
node overrides) specifies the number of nodes to create for the job and what node groups to create. Each
multi-node parallel job contains a main node, which is launched first. After the main node is up, the
child nodes are launched and started. If the main node exits, the job is considered finished, and the child
nodes are stopped. For more information, see Node Groups (p. 28).
Multi-node parallel job nodes are single-tenant, meaning that only a single job container is run on each
Amazon EC2 instance.
The final job status (SUCCEEDED or FAILED) is determined by the final job status of the main node. To
get the status of a multi-node parallel job, you can describe the job using the job ID that was returned
when you submitted the job. If you need the details for child nodes, then you must describe each child
node individually. Nodes are addressed using #N notation (starting with 0). For example, to access the
details of the second node of a job, you need to describe aws_batch_job_id#1 using the AWS Batch
DescribeJobs API action. The startedAt, stoppedAt, statusReason, and exit information for a multi-
node parallel job is populated from the main node.
If you specify job retries, then a main node failure triggers another attempt; child node failures do not.
Each new attempt of a multi-node parallel job updates the corresponding attempt of its associated child
nodes.
To run multi-node parallel jobs on AWS Batch, your application code must contain the frameworks and
libraries necessary for distributed communication.
Environment Variables
At runtime, in addition to the standard environment variables that all AWS Batch jobs receive, each node
is configured with the following environment variables that are specific to multi-node parallel jobs:
AWS_BATCH_JOB_MAIN_NODE_INDEX
This variable is set to the index number of the job's main node. Your application code can compare
the AWS_BATCH_JOB_MAIN_NODE_INDEX to the AWS_BATCH_JOB_NODE_INDEX on an individual
node to determine if it is the main node.
AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS
This variable is only set in multi-node parallel job child nodes (it isn't present on the main node).
This variable is set to the private IPv4 address of the job's main node. Your child node's application
code can use this address to communicate with the main node.
AWS_BATCH_JOB_NODE_INDEX
This variable is set to the node index number of the node. The node index begins at 0, and each
node receives a unique index number. For example, a multi-node parallel job with 10 nodes has index values of 0-9.
AWS_BATCH_JOB_NUM_NODES
This variable is set to the number of nodes that you have requested for your multi-node parallel job.
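The following is a minimal sketch of how a container's startup script might use these variables to decide its role. The echo statements and placeholder comments are illustrative only.
#!/bin/bash
# Decide this node's role from the AWS Batch multi-node parallel job environment variables.
if [ "$AWS_BATCH_JOB_NODE_INDEX" = "$AWS_BATCH_JOB_MAIN_NODE_INDEX" ]; then
    echo "Main node (index $AWS_BATCH_JOB_NODE_INDEX of $AWS_BATCH_JOB_NUM_NODES nodes)"
    # Main-node work, such as starting a coordinator process, would go here.
else
    echo "Child node; main node address is $AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS"
    # Child-node work, such as connecting to the coordinator, would go here.
fi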
Node Groups
A node group is an identical group of job nodes that all share the same container properties. AWS Batch
lets you specify up to five distinct node groups for each job.
Each group can have its own container images, commands, environment variables, and so on. For
example, you can submit a job that requires a single c4.xlarge instance for the main node and five c4.xlarge instances for the child nodes. Each of these distinct node groups can specify different container images or commands to run.
Alternatively, all of the nodes in your job can use a single node group, and your
application code can differentiate node roles (main node vs. child node) by comparing
the AWS_BATCH_JOB_MAIN_NODE_INDEX environment variable against its own value for
AWS_BATCH_JOB_NODE_INDEX. You may have up to 1000 nodes in a single job. This is the default limit
for instances in an Amazon ECS cluster, which can be increased on request.
Note
Currently all node groups in a multi-node parallel job must use the same instance type.
Job Lifecycle
When you submit a multi-node parallel job, the job enters the SUBMITTED status, and it waits for any
job dependencies to finish. Then the job moves to the RUNNABLE status, and AWS Batch provisions the
instance capacity required to run your job and launches these instances.
Each multi-node parallel job contains a main node. The main node is a single subtask that AWS Batch
monitors to determine the outcome of the submitted multi node job. The main node is launched first
and it moves to the STARTING status.
When the main node reaches the RUNNING status (after the node's container is running), the child
nodes are launched and they also move to the STARTING status. The child nodes come up in
random order. There are no guarantees on the timing or ordering of child node launch. To ensure that all of the nodes of the job are in the RUNNING status (after each node's container is running),
your application code can either query the AWS Batch API to get the main node and child node
information, or coordinate within the application code to wait until all nodes are online before
starting any distributed processing task. The private IP address of the main node is available as the
AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS environment variable in each child node. Your
application code may use this information to coordinate and communicate data between each task.
As individual nodes exit, they move to SUCCEEDED or FAILED, depending on their exit code. If the main
node exits, the job is considered finished, and all of the child nodes are stopped. If a child node dies, AWS
Batch does not take any action on the other nodes in the job. If you do not want your job to continue
with a reduced number of nodes, you must factor this into your application code to terminate or cancel
the job.
GPU Jobs
GPU jobs help you to run jobs that use an instance's GPUs.
Amazon EC2 GPU-based instance types from the g3, g3s, g4, p2, p3, and p4 families are supported. For more information, see Amazon EC2 G3 Instances, Amazon EC2 G4 Instances, Amazon EC2 P2 Instances, Amazon EC2 P3 Instances, and Amazon EC2 P4d Instances.
The resourceRequirements (p. 55) parameter for the job definition specifies the number of GPUs to
be pinned to the container. This number of GPUs isn't available to any other job running on that instance
for the duration of that job. All instance types in a compute environment that run GPU jobs should be
from the p2, p3, p4, g3, g3s, or g4 instance families. If this isn't done, a GPU job might get stuck in the RUNNABLE status.
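For example, a job definition that pins two GPUs to its container might include a snippet like the following (the value of 2 is illustrative):
"resourceRequirements": [
    {
        "type": "GPU",
        "value": "2"
    }
]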
Jobs that don't use the GPUs can be run on GPU instances. However, they might cost more to run on the
GPU instances than on similar non-GPU instances. Depending on the specific vCPU, memory, and time
needed, these non-GPU jobs might block GPU jobs from running.
Job definitions
AWS Batch job definitions specify how jobs are to be run. While each job must reference a job definition,
many of the parameters that are specified in the job definition can be overridden at runtime.
Contents
• Creating a job definition (p. 31)
• Creating a multi-node parallel job definition (p. 36)
• Job definition template (p. 39)
• Job definition parameters (p. 43)
• Using the awslogs log driver (p. 64)
• Specifying sensitive data (p. 67)
• Amazon EFS volumes (p. 75)
• Example job definitions (p. 79)
For a complete description of the parameters available in a job definition, see Job definition
parameters (p. 43).
To create a multi-node parallel job definition, see Creating a multi-node parallel job definition (p. 36).
For more information about multi-node parallel jobs, see Multi-node Parallel Jobs (p. 27).
Creating a job definition
6. (Optional) For Retry Strategy, you can specify the number of times to retry a failed job and conditions based on string matching of the error code and reasons that are listed for the job attempt. For more information, see Automated Job Retries (p. 18).
a. For Job attempts, specify the number of times to attempt your job if it fails. This number must
be between one (1) and ten (10), inclusive.
b. (Optional) Select Add evaluate on exit to add up to five (5) conditions to match string patterns
with the exit code, status reason, and reason that are returned in the job attempt. For each set
of conditions, Action must be set to either Retry (to retry until the number of job attempts has
been reached), or Exit to stop retrying the job.
7. (Optional) For Execution timeout, specify the maximum number of seconds that you want to allow
your job attempts to run. If an attempt exceeds the timeout duration, it's stopped and the status
moves to FAILED. For more information, see Job Timeouts (p. 19).
8. For Multi-node parallel, leave this box unchecked. To create a multi-node parallel job definition
instead, see Creating a multi-node parallel job definition (p. 36).
9. In Container properties, you can specify properties that are passed to the Docker daemon when the
job is placed.
a. For Image, choose the Docker image to use for your job. Images in the Docker Hub registry
are available by default. You can also specify other repositories with repository-
url/image:tag. Up to 255 letters (uppercase and lowercase), numbers, hyphens, underscores,
colons, periods, forward slashes, and number signs are allowed. This parameter maps to Image
in the Create a container section of the Docker Remote API and the IMAGE parameter of docker
run.
Note
Docker image architecture must match the processor architecture of the compute
resources that they're scheduled on. For example, ARM-based Docker images can only
run on ARM-based compute resources.
b. For Command, specify the command to pass to the container. This parameter maps to Cmd in the Create a container section of the Docker Remote API and the COMMAND parameter to docker run. For more information about the Docker CMD parameter, go to https://docs.docker.com/engine/reference/builder/#cmd.
Note
You can use default values for parameter substitution as well as placeholders in your
command. For more information, see Parameters (p. 44).
c. For vCPUs, specify the number of vCPUs to reserve for the container. This parameter maps to
CpuShares in the Create a container section of the Docker Remote API and the --cpu-shares
option to docker run. Each vCPU is equivalent to 1,024 CPU shares. You must specify at least
one vCPU.
d. For Memory, specify the hard limit (in MiB) of memory to present to the job's container. If your
container attempts to exceed the memory specified here, the container is killed. This parameter
maps to Memory in the Create a container section of the Docker Remote API and the --memory
option to docker run. You must specify at least 4 MiB of memory for a job.
Note
You can maximize your resource utilization by prioritizing memory for jobs of a specific
instance type. For instructions, see Compute Resource Memory Management (p. 114).
e. (Optional) For Number of GPUs, specify the number of GPUs your job uses.
The job runs on a container with the specified number of GPUs pinned to that container.
f. In the Additional configuration section, you can specify additional parameters to be used with
the container.
i. (Optional) For Job role, you can specify an IAM role that provides the container in your job
with permissions to use the AWS APIs. This feature uses Amazon ECS IAM roles for task
functionality. For more information, including configuration prerequisites, see IAM Roles for
Tasks in the Amazon Elastic Container Service Developer Guide.
Note
A job role is required for jobs that are running on Fargate resources.
Note
Only roles that have the Amazon Elastic Container Service Task Role trust
relationship are shown here. For more information about creating an IAM role for
your AWS Batch jobs, see Creating an IAM Role and Policy for your Tasks in the
Amazon Elastic Container Service Developer Guide.
ii. For Execution role, you can specify an IAM role that grants the Amazon ECS container
and Fargate agents permission to make AWS API calls on your behalf. This feature uses
Amazon ECS IAM roles for task functionality. For more information, including configuration
prerequisites, see Amazon ECS task execution IAM roles in the Amazon Elastic Container
Service Developer Guide.
Note
An execution role is required for jobs running on Fargate resources.
iii. (Optional, only for jobs running on Fargate resources) In the Assign public IP section,
select Enable to give the job a public IP address. For a job that's running in a private subnet
to send outbound traffic to the internet, the private subnet requires a NAT gateway be
attached to route requests to the internet. You might want to do this so that you can pull
container images. For more information, see Amazon ECS task networking in the Amazon
Elastic Container Service Developer Guide.
iv. (Optional) In the Mount points section, you can configure mount points for your job's
container to access.
A. For Container path, enter the path on the container at which to mount the host
volume.
B. For Source volume, enter the name of the volume to mount.
C. To make the volume read-only for the container, choose Read-only.
v. (Optional, only for jobs running on EC2 resources) In the Ulimits section, you can configure
any ulimit values to use for your job's container.
vi. (Optional) In the Environment variables section, you can specify environment variables to
pass to your job's container. This parameter maps to Env in the Create a container section
of the Docker Remote API and the --env option to docker run.
Important
We don't recommend that you use plaintext environment variables for sensitive
information, such as credential data.
vii. (Optional) In the Volumes section, you can specify data volumes for your job to pass to your job's container.
A. For Name, enter a name for your volume. The name can be up to 255 characters in length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).
B. (Optional) To use an Amazon EFS file system, select Enable EFS.
I. (Optional) For Transit encryption port, enter the port to use when sending
encrypted data between the AWS Batch host and the Amazon EFS server. If you
don't specify a transit encryption port, it uses the port selection strategy that the
Amazon EFS mount helper uses. The value must be between 0 and 65,535. For
more information, see EFS Mount Helper in the Amazon Elastic File System User
Guide.
II. (Optional) For Access point ID, enter the access point ID to use. If an access point
is specified, the root directory value must either be omitted or set to /. For more
information, see Working with Amazon EFS Access Points in the Amazon Elastic File
System User Guide.
III. (Optional) To use the execution role when mounting the Amazon EFS file system,
select Use selected job role. For more information, see AWS Batch execution IAM
role (p. 176).
viii. (Optional) In the Security section, you can configure security options for your job's
container.
A. To give your job's container elevated permissions on the host instance (similar to the
root user), select Enable privileged mode. This parameter maps to Privileged
in the Create a container section of the Docker Remote API and the --privileged
option to docker run.
B. For User, enter the user name to use inside the container. This parameter maps to User
in the Create a container section of the Docker Remote API and the --user option to
docker run.
ix. (Optional) In the Linux Parameters section, you can configure any device mappings to use for your job's container. This allows the container to access a device on the host instance.
I. In the Container path field, enter the absolute file path in the container where the tmpfs volume is mounted.
II. In the Size field, enter the size (in MiB) of the tmpfs volume.
III. (Optional) In the Mount options field, enter the mount options. For
more information, including the list of available mount options, see
mountOptions (p. 50) in Job definition parameters (p. 43).
x. (Optional) In the Log configuration section, you can configure the log driver to use for your
job's container. By default, the awslogs log driver is used.
A. In the Log driver section, select the log driver to use. For more information about the
available log drivers, see logDriver (p. 51) in Job definition parameters (p. 43).
B. (Optional) In the Options section, select Add option to add an option.
I. In the Name field, enter the name of the option. The options available vary by log
driver. For more information, see the log driver documentation.
II. In the Value field, enter the value of the option.
C. (Optional) In the Secrets section, select Add secret to add a secret.
I. In the Name field, enter the name of the secret. For more information, see
secretOptions (p. 53) in Job definition parameters (p. 43).
II. In the Value field, enter the ARN of the secret.
10. (Optional) In the Parameters section, you can specify parameter substitution default values and placeholders to use in the command that your job's container runs when it starts. For more information, see Parameters (p. 44).
Creating a multi-node parallel job definition
To create a single-node job definition, see Creating a job definition (p. 31).
6. (Optional) For Retry Strategy, you can specify the number of times to retry a failed job and conditions based on string matching of the error code and reasons that are listed for the job attempt. For more information, see Automated Job Retries (p. 18).
a. For Job attempts, specify the number of times to attempt your job (in case it fails). This number
must be between one (1) and ten (10), inclusive.
b. (Optional) Select Add evaluate on exit to add up to five (5) conditions to match string patterns
with the exit code, status reason, and reason that is returned in the job attempt. For each set
of conditions, Action must be set to either Retry (to retry until the number of job attempts has
been reached), or Exit to stop retrying the job.
7. (Optional) For Execution timeout, specify the maximum number of seconds you would like to allow
your job attempts to run. If an attempt exceeds the timeout duration, it is stopped and the status
moves to FAILED. For more information, see Job Timeouts (p. 19).
8. For Multi-node parallel, select Enable multi-node parallel and then complete the following substeps. To create a single-node job definition instead, see Creating a job definition (p. 31).
a. For Number of nodes, enter the total number of nodes to use for your job.
b. For Main node, enter the node index to use for the main node. The default main node index is 0.
c. Select Add node range. This creates a Node range section.
i. For Target nodes, specify the range for your node group, using range_start:range_end
notation.
You can create up to five node ranges for the number of nodes you specified for your job.
Node ranges use the index value for a node, and the node index begins at 0. The range end
index value of your final node group should be the number of nodes you specified in Step
8.a (p. 36), minus one. For example, if you specified 10 nodes and you want to use a
single node group, then your end range should be 9.
ii. In Container properties, you can specify properties that are passed to the Docker daemon
for the nodes in the node range.
A. For Image, choose the Docker image to use for your job. Images in the Docker
Hub registry are available by default. You can also specify other repositories with
repository-url/image:tag. Up to 255 letters (uppercase and lowercase), numbers,
hyphens, underscores, colons, periods, forward slashes, and number signs are allowed.
This parameter maps to Image in the Create a container section of the Docker Remote
API and the IMAGE parameter of docker run.
Note
Docker image architecture must match the processor architecture of the
compute resources that they're scheduled on. For example, ARM-based Docker
images can only run on ARM-based compute resources.
B. For Command, specify the command to pass to the container. This parameter maps to Cmd in the Create a container section of the Docker Remote API and the COMMAND parameter to docker run. For more information about the Docker CMD parameter, go to https://docs.docker.com/engine/reference/builder/#cmd.
Note
You can use default values for parameter substitution and placeholders in your
command. For more information, see Parameters (p. 44).
C. For vCPUs, specify the number of vCPUs to reserve for the container. This parameter
maps to CpuShares in the Create a container section of the Docker Remote API and
the --cpu-shares option to docker run. Each vCPU is equivalent to 1,024 CPU
shares. You must specify at least one vCPU.
D. For Memory, specify the hard limit (in MiB) of memory to present to the job's
container. If your container attempts to exceed the memory specified here, the
container is killed. This parameter maps to Memory in the Create a container section of
the Docker Remote API and the --memory option to docker run. You must specify at
least 4 MiB of memory for a job.
Note
If you're trying to maximize your resource utilization by providing your jobs
as much memory as possible for a particular instance type, see Compute
Resource Memory Management (p. 114).
E. (Optional) For Number of GPUs, specify the number of GPUs your job uses.
The job runs on a container with the specified number of GPUs pinned to that
container.
F. In the Additional configuration section, you can specify additional parameters to be
used with the container.
I. (Optional) For Job role, you can specify an IAM role that provides the container
in your job with permissions to use the AWS APIs. This feature uses Amazon ECS
IAM roles for task functionality. For more information, including configuration
prerequisites, see IAM Roles for Tasks in the Amazon Elastic Container Service
Developer Guide.
Note
A job role is required for jobs that are running on Fargate resources.
Note
Only roles that have the Amazon Elastic Container Service Task Role
trust relationship are shown here. For more information about creating an
IAM role for your AWS Batch jobs, see Creating an IAM Role and Policy for
your Tasks in the Amazon Elastic Container Service Developer Guide.
II. (Optional) In the Volumes section, you can specify data volumes for your job to
pass to your job's container.
1. For Name, enter a name for your volume. Up to 255 letters (uppercase and
lowercase), numbers, hyphens, and underscores are allowed.
2. (Optional) For Source Path, enter the path on the host instance to present to
the container. If you leave this field empty, then the Docker daemon assigns a
host path for you. If you specify a source path, then the data volume persists
at the specified location on the host container instance until you delete it
manually. If the source path doesn't exist on the host container instance, the
Docker daemon creates it. If the location does exist, the contents of the source
path folder are exported to the container.
III. (Optional) In the Mount points section, you can configure mount points for your
job's container to access.
1. For Container path, enter the path on the container at which to mount the
host volume.
2. For Source volume, enter the name of the volume to mount.
3. To make the volume read-only for the container, choose Read-only.
IV. (Optional) In the Ulimits section, you can configure any ulimit values to use for
your job's container.
V. (Optional) In the Environment variables section, you can specify environment variables to pass to your job's container.
Important
We don't recommend using plaintext environment variables for sensitive information, such as credential data.
VI. (Optional) In the Security section, you can configure security options for your job's container.
1. To give your job's container elevated privileges on the host instance (similar to
the root user), select Privileged. This parameter maps to Privileged in the
Create a container section of the Docker Remote API and the --privileged
option to docker run.
2. For User, enter the user name to use inside the container. This parameter
maps to User in the Create a container section of the Docker Remote API and
the --user option to docker run.
VII. (Optional) In the Linux Parameters section, you can configure any device
mappings to use for your job's container so that the container can access a device
on the host instance.
Job definition template
The following is an empty job definition template:
{
"jobDefinitionName": "",
"type": "container",
"parameters": {
"KeyName": ""
},
"containerProperties": {
"image": "",
"vcpus": 0,
"memory": 0,
"command": [
""
],
"jobRoleArn": "",
"executionRoleArn": "",
"volumes": [
{
"host": {
"sourcePath": ""
},
"name": "",
"efsVolumeConfiguration": {
"fileSystemId": "",
"rootDirectory": "",
"transitEncryption": "ENABLED",
"transitEncryptionPort": 0,
"authorizationConfig": {
"accessPointId": "",
"iam": "ENABLED"
}
}
}
],
"environment": [
{
"name": "",
"value": ""
}
],
"mountPoints": [
{
"containerPath": "",
"readOnly": true,
"sourceVolume": ""
}
],
"readonlyRootFilesystem": true,
"privileged": true,
"ulimits": [
{
"hardLimit": 0,
"name": "",
"softLimit": 0
}
],
"user": "",
"instanceType": "",
"resourceRequirements": [
{
"value": "",
"type": "VCPU"
}
],
"linuxParameters": {
"devices": [
{
"hostPath": "",
"containerPath": "",
"permissions": [
"MKNOD"
]
}
],
"initProcessEnabled": true,
"sharedMemorySize": 0,
"tmpfs": [
{
"containerPath": "",
"size": 0,
"mountOptions": [
""
]
}
],
"maxSwap": 0,
"swappiness": 0
},
"logConfiguration": {
"logDriver": "json-file",
"options": {
"KeyName": ""
},
"secretOptions": [
{
"name": "",
"valueFrom": ""
}
]
},
"secrets": [
{
"name": "",
"valueFrom": ""
}
],
"networkConfiguration": {
"assignPublicIp": "ENABLED"
},
"fargatePlatformConfiguration": {
"platformVersion": ""
}
},
"nodeProperties": {
"numNodes": 0,
"mainNode": 0,
"nodeRangeProperties": [
{
"targetNodes": "",
"container": {
"image": "",
"vcpus": 0,
"memory": 0,
"command": [
""
],
"jobRoleArn": "",
"executionRoleArn": "",
"volumes": [
{
"host": {
"sourcePath": ""
},
"name": "",
"efsVolumeConfiguration": {
"fileSystemId": "",
"rootDirectory": "",
"transitEncryption": "DISABLED",
"transitEncryptionPort": 0,
"authorizationConfig": {
"accessPointId": "",
"iam": "DISABLED"
}
}
}
],
"environment": [
{
"name": "",
"value": ""
}
],
"mountPoints": [
{
"containerPath": "",
"readOnly": true,
"sourceVolume": ""
}
],
"readonlyRootFilesystem": true,
"privileged": true,
"ulimits": [
{
"hardLimit": 0,
"name": "",
"softLimit": 0
}
],
"user": "",
"instanceType": "",
"resourceRequirements": [
{
"value": "",
"type": "GPU"
}
],
"linuxParameters": {
"devices": [
{
"hostPath": "",
"containerPath": "",
"permissions": [
"MKNOD"
]
}
],
"initProcessEnabled": true,
"sharedMemorySize": 0,
"tmpfs": [
{
"containerPath": "",
"size": 0,
"mountOptions": [
""
]
}
],
"maxSwap": 0,
"swappiness": 0
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"KeyName": ""
},
"secretOptions": [
{
"name": "",
"valueFrom": ""
}
]
},
"secrets": [
{
"name": "",
"valueFrom": ""
}
],
"networkConfiguration": {
"assignPublicIp": "DISABLED"
},
"fargatePlatformConfiguration": {
"platformVersion": ""
}
}
}
]
},
"retryStrategy": {
"attempts": 0,
"evaluateOnExit": [
{
"onStatusReason": "",
"onReason": "",
"onExitCode": "",
"action": "EXIT"
}
]
},
"propagateTags": true,
"timeout": {
"attemptDurationSeconds": 0
},
"tags": {
"KeyName": ""
},
"platformCapabilities": [
"FARGATE"
]
}
Note
You can generate the preceding job definition template with the following AWS CLI command:
aws batch register-job-definition --generate-cli-skeleton
Job definition parameters
Job definition name
jobDefinitionName
When you register a job definition, you specify a name. The name can be up to 128 characters in
length. It can contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).
The first job definition that's registered with that name is given a revision of 1. Any subsequent job
definitions that are registered with that name are given an incremental revision number.
Type: String
Required: Yes
Type
type
When you register a job definition, you specify the type of job. If the job runs on Fargate resources,
then multinode isn't supported. For more information about multi-node parallel jobs, see the
section called “Creating a multi-node parallel job definition” (p. 36).
Type: String
Required: Yes
Parameters
parameters
When you submit a job, you can specify parameters that should replace the placeholders or override
the default job definition parameters. Parameters in job submission requests take precedence over
the defaults in a job definition. This means that you can use the same job definition for multiple jobs
that use the same format, and programmatically change values in the command at submission time.
Required: No
When you register a job definition, you can use parameter substitution placeholders in the command
field of a job's container properties. For example:
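A minimal sketch of such a job definition is shown below; the job definition name, image, and command are illustrative placeholders. The command uses a Ref::codec placeholder, and the parameters object supplies its default value.
{
    "jobDefinitionName": "print-codec",
    "type": "container",
    "parameters": {
        "codec": "mp4"
    },
    "containerProperties": {
        "image": "busybox",
        "command": [ "echo", "Ref::codec" ],
        "resourceRequirements": [
            { "type": "VCPU", "value": "1" },
            { "type": "MEMORY", "value": "256" }
        ]
    }
}
At submission time, a different value could be supplied in the parameters of the SubmitJob request to override the default.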
When this job definition is submitted to run, the Ref::codec argument in the command for the
container is replaced with the default value, mp4.
Platform capabilities
platformCapabilities
The platform capabilities that are required by the job definition. If no value is specified, it defaults to EC2. For jobs that run on Fargate resources, specify FARGATE.
Type: String
Required: No
Propagate tags
propagateTags
Specifies whether to propagate the tags from the job or job definition to the corresponding Amazon
ECS task. If no value is specified, the tags aren't propagated. Tags can only be propagated to the
tasks when the task is created. For tags with the same name, job tags are given priority over job definition tags. If the total number of combined tags from the job and job definition is over 50, the job is moved to the FAILED state.
Type: Boolean
Required: No
Container properties
When you register a job definition, you must specify a list of container properties that are passed to the
Docker daemon on a container instance when the job is placed. The following container properties are
allowed in a job definition. For single-node jobs, these container properties are set at the job definition
level. For multi-node parallel jobs, container properties are set in the Node properties (p. 61) level, for
each node group.
command
The command that's passed to the container. This parameter maps to Cmd in the Create a container
section of the Docker Remote API and the COMMAND parameter to docker run. For more information
about the Docker CMD parameter, see https://docs.docker.com/engine/reference/builder/#cmd.
Required: No
environment
The environment variables to pass to a container. This parameter maps to Env in the Create a
container section of the Docker Remote API and the --env option to docker run.
Important
We don't recommend that you use plaintext environment variables for sensitive
information, such as credential data.
Note
Environment variables must not start with AWS_BATCH. This naming convention is reserved
for variables that are set by the AWS Batch service.
Required: No
name
The name of the environment variable.
Type: String
value
The value of the environment variable.
Type: String
"environment" : [
{ "name" : "envName1", "value" : "envValue1" },
{ "name" : "envName2", "value" : "envValue2" }
]
executionRoleArn
When you register a job definition, you can specify an IAM role. The role provides the Amazon ECS
container agent with permissions to call the API actions that are specified in its associated policies
on your behalf. Jobs that are running on Fargate resources must provide an execution role. For more
information, see AWS Batch execution IAM role (p. 176).
Type: String
Required: No
fargatePlatformConfiguration
The platform configuration for jobs that are running on Fargate resources. Jobs that are running on
EC2 resources must not specify this parameter.
Required: No
platformVersion
The AWS Fargate platform version to use for the jobs, or LATEST to use a recent, approved version of the AWS Fargate platform.
Type: String
Default: LATEST
Required: No
image
The image used to start a job. This string is passed directly to the Docker daemon. Images in
the Docker Hub registry are available by default. You can also specify other repositories with
repository-url/image:tag. Up to 255 letters (uppercase and lowercase), numbers, hyphens,
underscores, colons, periods, forward slashes, and number signs are allowed. This parameter maps
to Image in the Create a container section of the Docker Remote API and the IMAGE parameter of
docker run.
Note
Docker image architecture must match the processor architecture of the compute resources
that they're scheduled on. For example, ARM-based Docker images can only run on ARM-
based compute resources.
• Images in Amazon ECR Public repositories use the full registry/repository[:tag]
or registry/repository[@digest] naming conventions. For example,
public.ecr.aws/registry_alias/my-web-app:latest.
• Images in Amazon ECR repositories use the full registry/repository[:tag] naming convention. For example, aws_account_id.dkr.ecr.region.amazonaws.com/my-web-app:latest.
• Images in official repositories on Docker Hub use a single name (for example, ubuntu or mongo).
• Images in other repositories on Docker Hub are qualified with an organization name (for example,
amazon/amazon-ecs-agent).
• Images in other online repositories are qualified further by a domain name (for example,
quay.io/assemblyline/ubuntu).
Type: String
Required: Yes
instanceType
The instance type to use for a multi-node parallel job. All node groups in a multi-node parallel job
must use the same instance type. This parameter isn't valid for single-node container jobs or for jobs
running on Fargate resources.
Type: String
Required: No
jobRoleArn
When you register a job definition, you can specify an IAM role. The role provides the job container
with permissions to call the API actions that are specified in its associated policies on your behalf.
For more information, see IAM Roles for Tasks in the Amazon Elastic Container Service Developer
Guide.
Type: String
Required: No
linuxParameters
Linux-specific modifications that are applied to the container, such as details for device mappings.
"linuxParameters": {
"devices": [
{
"hostPath": "string",
"containerPath": "string",
"permissions": [
"READ", "WRITE", "MKNOD"
]
}
],
"initProcessEnabled": true|false,
"sharedMemorySize": 0,
"tmpfs": [
{
"containerPath": "string",
"size": integer,
"mountOptions": [
"string"
]
}
],
"maxSwap": integer,
"swappiness": integer
}
Required: No
devices
List of devices mapped into the container. This parameter maps to Devices in the Create a
container section of the Docker Remote API and the --device option to docker run.
Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.
Required: No
hostPath
The path for the device on the host container instance.
Type: String
Required: Yes
containerPath
The path where the device is exposed in the container. If this isn't specified, the device is exposed at the same path as the host path.
Type: String
Required: No
permissions
Permissions for the device in the container. If this isn't specified, the permissions are set to
READ, WRITE, and MKNOD.
Required: No
initProcessEnabled
If true, run an init process inside the container that forwards signals and reaps processes.
This parameter maps to the --init option to docker run. This parameter requires version 1.25
of the Docker Remote API or greater on your container instance. To check the Docker Remote
API version on your container instance, log into your container instance and run the following
command: sudo docker version | grep "Server API version"
Type: Boolean
Required: No
maxSwap
The total amount of swap memory (in MiB) a job can use. This parameter is translated to the
--memory-swap option to docker run where the value is the sum of the container memory
plus the maxSwap value. For more information, see --memory-swap details in the Docker
documentation.
If a maxSwap value of 0 is specified, the container doesn't use swap. Accepted values are 0
or any positive integer. If the maxSwap parameter is omitted, the container uses the swap
configuration for the container instance that it's running on. A maxSwap value must be set for
the swappiness parameter to be used.
Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.
Type: Integer
Required: No
sharedMemorySize
The value for the size (in MiB) of the /dev/shm volume. This parameter maps to the --shm-
size option to docker run.
Note
This parameter isn't applicable to jobs running on Fargate resources and shouldn't be
provided.
Type: Integer
Required: No
swappiness
You can use this to tune a container's memory swappiness behavior. A swappiness value of
0 causes swapping to not happen unless absolutely necessary. A swappiness value of 100
causes pages to be swapped very aggressively. Accepted values are whole numbers between 0
and 100. If the swappiness parameter isn't specified, a default value of 60 is used. If a value
isn't specified for maxSwap, then this parameter is ignored. If maxSwap is set to 0, the container
doesn't use swap. This parameter maps to the --memory-swappiness option to docker run.
Type: Integer
Required: No
tmpfs
The container path, mount options, and size of the tmpfs mount.
Required: No
containerPath
The absolute file path in the container where the tmpfs volume is mounted.
Type: String
Required: Yes
mountOptions
Valid values: "defaults" | "ro" | "rw" | "suid" | "nosuid" | "dev" | "nodev" | "exec" |
"noexec" | "sync" | "async" | "dirsync" | "remount" | "mand" | "nomand" | "atime"
| "noatime" | "diratime" | "nodiratime" | "bind" | "rbind" | "unbindable" |
"runbindable" | "private" | "rprivate" | "shared" | "rshared" | "slave" | "rslave" |
"relatime" | "norelatime" | "strictatime" | "nostrictatime" | "mode" | "uid" | "gid" |
"nr_inodes" | "nr_blocks" | "mpol"
Required: No
size
The size (in MiB) of the tmpfs volume.
Type: Integer
Required: Yes
logConfiguration
This parameter maps to LogConfig in the Create a container section of the Docker Remote API and
the --log-driver option to docker run. By default, containers use the same logging driver that
the Docker daemon uses. However the container can use a different logging driver than the Docker
daemon by specifying a log driver with this parameter in the container definition. To use a different
logging driver for a container, the log system must be either configured on the container instance or
on another log server to provide remote logging options. For more information about the options
for different supported log drivers, see Configure logging drivers in the Docker documentation.
Note
AWS Batch currently supports a subset of the logging drivers available to the Docker
daemon (shown in the LogConfiguration data type).
This parameter requires version 1.18 of the Docker Remote API or greater on your container
instance. To check the Docker Remote API version on your container instance, log into your container
instance and run the following command: sudo docker version | grep "Server API
version"
"logConfiguration": {
"devices": [
{
"logDriver": "string",
"options": {
"optionName1" : "optionValue1",
"optionName2" : "optionValue2"
}
"secretOptions": [
{
"name" : "secretOptionName1",
"valueFrom" : "secretOptionArn1"
},
{
"name" : "secretOptionName2",
"valueFrom" : "secretOptionArn2"
}
]
}
]
}
Required: No
logDriver
The log driver to use for the job. By default, AWS Batch enables the awslogs log driver. The
valid values listed for this parameter are log drivers that the Amazon ECS container agent can
communicate with by default.
This parameter maps to LogConfig in the Create a container section of the Docker Remote
API and the --log-driver option to docker run. By default, jobs use the same logging driver
that the Docker daemon uses. However, the job can use a different logging driver than the
Docker daemon by specifying a log driver with this parameter in the job definition. If you want
to specify another logging driver for a job, then the log system must be configured on the
container instance in the compute environment. Or, alternatively, you should configure it on
another log server to provide remote logging options. For more information about the options
for different supported log drivers, see Configure logging drivers in the Docker documentation.
Note
AWS Batch currently supports a subset of the logging drivers that are available to the
Docker daemon. Additional log drivers might be available in future releases of the
Amazon ECS container agent.
The supported log drivers are awslogs, fluentd, gelf, json-file, journald, logentries,
syslog, and splunk.
Note
Jobs that are running on Fargate resources are restricted to the awslogs and splunk
log drivers.
This parameter requires version 1.18 of the Docker Remote API or greater on your container
instance. To check the Docker Remote API version on your container instance, log into your
container instance and run the following command: sudo docker version | grep
"Server API version"
Note
The Amazon ECS container agent that's running on a container instance
must register the logging drivers that are available on that instance with the
ECS_AVAILABLE_LOGGING_DRIVERS environment variable. Otherwise, the containers
placed on that instance can't use these log configuration options. For more information,
see Amazon ECS Container Agent Configuration in the Amazon Elastic Container Service
Developer Guide.
awslogs
Specifies the Amazon CloudWatch Logs logging driver. For more information, see Using the
awslogs log driver (p. 64) and Amazon CloudWatch Logs logging driver in the Docker
documentation.
fluentd
Specifies the Fluentd logging driver. For more information, including usage and options, see
Fluentd logging driver in the Docker documentation.
gelf
Specifies the Graylog Extended Format (GELF) logging driver. For more information,
including usage and options, see Graylog Extended Format logging driver in the Docker
documentation.
journald
Specifies the journald logging driver. For more information, including usage and options,
see Journald logging driver in the Docker documentation.
json-file
Specifies the JSON file logging driver. For more information, including usage and options,
see JSON File logging driver in the Docker documentation.
splunk
Specifies the Splunk logging driver. For more information, including usage and options, see
Splunk logging driver in the Docker documentation.
syslog
Specifies the syslog logging driver. For more information, including usage and options, see
Syslog logging driver in the Docker documentation.
Type: String
Required: Yes
options
The configuration options to send to the log driver. This parameter requires version 1.19 of the Docker Remote API or greater on your container instance.
Required: No
secretOptions
An object representing the secret to pass to the log configuration. For more information, see
Specifying sensitive data (p. 67).
Required: No
name
Type: String
Required: Yes
valueFrom
The ARN of the secret to expose to the log configuration of the container. The supported
values are either the full ARN of the Secrets Manager secret or the full ARN of the
parameter in the SSM Parameter Store.
Note
If the SSM Parameter Store parameter exists in the same Region as the task you're
launching, then you can use either the full ARN or name of the parameter. If the
parameter exists in a different Region, then the full ARN must be specified.
Type: String
Required: Yes
memory
This parameter is deprecated. Use resourceRequirements to specify the memory requirements for the job definition instead.
As an example for how to use resourceRequirements (p. 55), if your job definition contains
lines similar to this:
"containerProperties": {
"memory": 512
}
"containerProperties": {
"resourceRequirements": [
{
"type": "MEMORY",
"value": "512"
}
]
}
Type: Integer
Required: Yes
mountPoints
The mount points for data volumes in your container. This parameter maps to Volumes in the
Create a container section of the Docker Remote API and the --volume option to docker run.
"mountPoints": [
{
"sourceVolume": "string",
"containerPath": "string",
"readOnly": true|false
}
]
Required: No
sourceVolume
The name of the volume to mount.
Type: String
containerPath
The path on the container at which to mount the host volume.
Type: String
readOnly
If this value is true, the container has read-only access to the volume. If this value is false,
then the container can write to the volume.
Type: Boolean
Required: No
Default: False
networkConfiguration
The network configuration for jobs that are running on Fargate resources. Jobs that are running on
EC2 resources must not specify this parameter.
"networkConfiguration": {
"assignPublicIp": "string"
}
Required: No
assignPublicIp
Indicates whether the job should have a public IP address. This is required if the job needs
outbound network access.
Type: String
Required: No
Default: DISABLED
privileged
When this parameter is true, the container is given elevated permissions on the host container
instance (similar to the root user). This parameter maps to Privileged in the Create a container
section of the Docker Remote API and the --privileged option to docker run. This parameter isn't
applicable to jobs running on Fargate resources and shouldn't be provided, or specified as false.
"privileged": true|false
Type: Boolean
Required: No
readonlyRootFilesystem
When this parameter is true, the container is given read-only access to its root file system. This
parameter maps to ReadonlyRootfs in the Create a container section of the Docker Remote API
and the --read-only option to docker run.
"readonlyRootFilesystem": true|false
Type: Boolean
Required: No
resourceRequirements
The type and amount of a resource to assign to a container. The supported resources include GPU,
MEMORY, and VCPU.
"resourceRequirements" : [
{
"type": "GPU",
"value": "number"
}
]
Required: No
type
The type of resource to assign to a container. The supported resources include GPU, MEMORY, and
VCPU.
Type: String
value
The quantity of the specified resource to reserve for the container. The values vary based on the
type specified.
type="GPU"
The number of physical GPUs to reserve for the container. The number of GPUs reserved
for all containers in a job shouldn't exceed the number of available GPUs on the compute
resource that the job is launched on.
type="MEMORY"
The hard limit (in MiB) of memory to present to the container. If your container attempts to exceed the memory specified here, the container is killed. This parameter maps to Memory in the Create a container section of the Docker Remote API and the --memory option to docker run. You must specify at least 4 MiB of memory for a job. This is required, but it can be specified in several places for multi-node parallel (MNP) jobs; it must be specified for each node at least once.
Note
If you're trying to maximize your resource utilization by providing your jobs as
much memory as possible for a particular instance type, see Compute Resource
Memory Management (p. 114).
For jobs that are running on Fargate resources, value must match one of the supported values, and the VCPU value must be one of the values supported for that memory value.
VCPU MEMORY
1 vCPU 2048, 3072, 4096, 5120, 6144, 7168, and 8192 MiB
type="VCPU"
The number of vCPUs reserved for the job. This parameter maps to CpuShares in the
Create a container section of the Docker Remote API and the --cpu-shares option to
docker run. Each vCPU is equivalent to 1,024 CPU shares. For jobs that are running on
EC2 resources, you must specify at least one vCPU. This is required but can be specified in
several places. It must be specified for each node at least once.
For jobs that are running on Fargate resources, value must match one of the supported values, and the MEMORY value must be one of the values supported for that VCPU value. The supported VCPU values are 0.25, 0.5, 1, 2, and 4.
Type: String
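For example, a Fargate job definition that pairs 1 vCPU with 2048 MiB of memory (one of the supported combinations noted above) might include the following illustrative snippet:
"resourceRequirements": [
    { "type": "VCPU", "value": "1" },
    { "type": "MEMORY", "value": "2048" }
]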
secrets
The secrets for the job that are exposed as environment variables. For more information, see
Specifying sensitive data (p. 67).
"secrets": [
{
"name": "secretName1",
"valueFrom": "secretArn1"
},
{
"name": "secretName2",
"valueFrom": "secretArn2"
}
...
]
Required: No
name
Type: String
valueFrom
The secret to expose to the container. The supported values are either the full ARN of the
Secrets Manager secret or the full ARN of the parameter in the SSM Parameter Store.
Note
If the SSM Parameter Store parameter exists in the same Region as the job you're
launching, then you can use either the full ARN or name of the parameter. If the
parameter exists in a different Region, then the full ARN must be specified.
Type: String
ulimits
A list of ulimits values to set in the container. This parameter maps to Ulimits in the Create a
container section of the Docker Remote API and the --ulimit option to docker run.
"ulimits": [
{
"name": string,
"softLimit": integer,
"hardLimit": integer
}
...
]
Required: No
name
Type: String
hardLimit
Type: Integer
softLimit
Type: Integer
user
The user name to use inside the container. This parameter maps to User in the Create a container
section of the Docker Remote API and the --user option to docker run.
"user": "string"
Type: String
Required: No
vcpus
This parameter is deprecated. Use resourceRequirements to specify the vCPU requirements for the job definition instead.
As an example for how to use resourceRequirements, if your job definition contains lines similar
to this:
"containerProperties": {
"vcpus": 2
}
"containerProperties": {
"resourceRequirements": [
{
"type": "VCPU",
"value": "2"
}
]
}
Type: Integer
Required: Yes
volumes
When you register a job definition, you can specify a list of volumes that are passed to the Docker
daemon on a container instance. The following parameters are allowed in the container properties:
"volumes": [
{
"name": "string",
"host": {
"sourcePath": "string"
},
"efsVolumeConfiguration": {
"authorizationConfig": {
"accessPointId": "string",
"iam": "string"
},
"fileSystemId": "string",
"rootDirectory": "string",
"transitEncryption": "string",
"transitEncryptionPort": number
}
}
]
name
The name of the volume. Up to 255 letters (uppercase and lowercase), numbers, hyphens, and
underscores are allowed. This name is referenced in the sourceVolume parameter of container
definition mountPoints.
Type: String
Required: No
host
The contents of the host parameter determine whether your data volume persists on the
host container instance and where it's stored. If the host parameter is empty, then the Docker
daemon assigns a host path for your data volume. However, the data isn't guaranteed to persist
after the container associated with it stops running.
Note
This parameter isn't applicable to jobs that are running on Fargate resources and
shouldn't be provided.
Type: Object
Required: No
sourcePath
The path on the host container instance that's presented to the container. If this parameter
is empty, then the Docker daemon assigns a host path for you.
If the host parameter contains a sourcePath file location, then the data volume persists
at the specified location on the host container instance until you delete it manually. If the
sourcePath value doesn't exist on the host container instance, the Docker daemon creates
it. If the location does exist, the contents of the source path folder are exported.
Type: String
Required: No
efsVolumeConfiguration
This parameter is specified when you're using an Amazon Elastic File System file system for task
storage. For more information, see Amazon EFS Volumes in the AWS Batch User Guide.
Type: Object
Required: No
authorizationConfig
The authorization configuration details for the Amazon EFS file system.
Type: Object
Required: No
accessPointId
The Amazon EFS access point ID to use. If an access point is specified, the root directory
value that's specified in the EFSVolumeConfiguration must either be omitted or
set to /. This enforces the path that's set on the EFS access point. If an access point is
used, transit encryption must be enabled in the EFSVolumeConfiguration. For more
information, see Working with Amazon EFS Access Points in the Amazon Elastic File
System User Guide.
Type: String
Required: No
iam
Determines whether to use the AWS Batch job IAM role defined in a job definition when
mounting the Amazon EFS file system. If enabled, transit encryption must be enabled
in the EFSVolumeConfiguration. If this parameter is omitted, the default value of
DISABLED is used. For more information, see Using Amazon EFS Access Points in the
AWS Batch User Guide.
Type: String
Required: No
fileSystemId
The Amazon EFS file system ID to use.
Type: String
Required: No
rootDirectory
The directory within the Amazon EFS file system to mount as the root directory inside
the host. If this parameter is omitted, the root of the Amazon EFS volume is used. If you
specify /, it has the same effect as omitting this parameter. The maximum length is 4,096
characters.
Important
If an EFS access point is specified in the authorizationConfig, the root
directory parameter must either be omitted or set to /. This enforces the path
that's set on the Amazon EFS access point.
Type: String
Required: No
transitEncryption
Determines whether to enable encryption for Amazon EFS data in transit between the
Amazon ECS host and the Amazon EFS server. Transit encryption must be enabled if
Amazon EFS IAM authorization is used. If this parameter is omitted, the default value of
DISABLED is used. For more information, see Encrypting data in transit in the Amazon
Elastic File System User Guide.
Type: String
Required: No
transitEncryptionPort
The port to use when sending encrypted data between the Amazon ECS host and the
Amazon EFS server. If you don't specify a transit encryption port, it uses the port selection
strategy that the Amazon EFS mount helper uses. The value must be between 0 and 65,535.
For more information, see EFS Mount Helper in the Amazon Elastic File System User Guide.
Type: Integer
Required: No
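As an illustration of how these parameters fit together, a volume that mounts an Amazon EFS file system through an access point might look similar to the following sketch. The file system ID, access point ID, volume name, and container path are placeholders.
"volumes": [
    {
        "name": "efs-data",
        "efsVolumeConfiguration": {
            "fileSystemId": "fs-12345678",
            "transitEncryption": "ENABLED",
            "authorizationConfig": {
                "accessPointId": "fsap-1234567890abcdef0",
                "iam": "ENABLED"
            }
        }
    }
],
"mountPoints": [
    {
        "sourceVolume": "efs-data",
        "containerPath": "/mnt/efs",
        "readOnly": false
    }
]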
Node properties
nodeProperties
When you register a multi-node parallel job definition, you must specify a list of node properties.
These node properties should define the number of nodes to use in your job, the main node index,
and the different node ranges to use. If the job runs on Fargate resources, then you can't specify
nodeProperties. Rather, you should use containerProperties instead. The following node
properties are allowed in a job definition. For more information, see Multi-node Parallel Jobs (p. 27).
Required: No
mainNode
Specifies the node index for the main node of a multi-node parallel job. This node index value
must be smaller than the number of nodes.
Type: Integer
Required: Yes
numNodes
The number of nodes that are associated with a multi-node parallel job.
Type: Integer
Required: Yes
nodeRangeProperties
A list of node ranges and their properties that are associated with a multi-node parallel job.
Required: Yes
targetNodes
The range of nodes, using node index values. A range of 0:3 indicates nodes with index
values of 0 through 3. If the starting range value is omitted (:n), then 0 is used to start
the range. If the ending range value is omitted (n:), then the highest possible node index
is used to end the range. Your accumulative node ranges must account for all nodes
(0:n). You can nest node ranges, for example 0:10 and 4:5. For this case, the 4:5 range
properties override the 0:10 properties.
Type: String
Required: No
container
The container details for the node range. For more information, see Container
properties (p. 45).
Required: No
Retry strategy
retryStrategy
When you register a job definition, you can optionally specify a retry strategy to use for failed jobs
that are submitted with this job definition. Any retry strategy that's specified during a SubmitJob
operation overrides the retry strategy defined here. By default, each job is attempted one time. If
you specify more than one attempt, the job is retried if it fails. Examples of a failed attempt include the job returning a non-zero exit code or the container instance being terminated. For more information, see Automated job retries.
Required: No
attempts
The number of times to move a job to the RUNNABLE status. You can specify between 1 and 10 attempts. If attempts is greater than one, the job is retried on failure until it has moved to RUNNABLE that many times.
"attempts": integer
Type: Integer
Required: No
evaluateOnExit
Array of up to 5 objects that specify conditions under which the job should be retried or failed. If
this parameter is specified, then the attempts parameter must also be specified.
"evaluateOnExit": [
{
"action": "string",
"onExitCode": "string",
"onReason": "string",
"onStatusReason": "string"
}
]
Required: No
action
Specifies the action to take if all of the specified conditions (onStatusReason, onReason,
and onExitCode) are met. The values aren't case sensitive.
Type: String
Required: Yes
onExitCode
Contains a glob pattern to match against the decimal representation of the ExitCode
that's returned for a job. The pattern can be up to 512 characters in length. It can contain
only numbers. It cannot contain letters or special characters. It can optionally end with an
asterisk (*) so that only the start of the string needs to be an exact match.
Type: String
Required: No
onReason
Contains a glob pattern to match against the Reason that's returned for a job. The pattern
can be up to 512 characters in length. It can contain letters, numbers, periods (.), colons
(:), and white space (spaces, tabs). It can optionally end with an asterisk (*) so that only the
start of the string needs to be an exact match.
Type: String
Required: No
onStatusReason
Contains a glob pattern to match against the StatusReason that's returned for a job. The
pattern can be up to 512 characters in length. It can contain letters, numbers, periods (.),
colons (:), and white space (spaces, tabs). It can optionally end with an asterisk (*) so that
only the start of the string needs to be an exact match.
Type: String
Required: No
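For example, a retry strategy that retries attempts whose status reason begins with "Host EC2" (such as Spot interruptions) and stops retrying on anything else might look like the following illustrative snippet:
"retryStrategy": {
    "attempts": 3,
    "evaluateOnExit": [
        {
            "onStatusReason": "Host EC2*",
            "action": "RETRY"
        },
        {
            "onReason": "*",
            "action": "EXIT"
        }
    ]
}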
Tags
tags
Key-value pair tags to associate with the job definition. For more information, see Tagging your AWS
Batch resources (p. 197).
Required: No
Timeout
timeout
You can configure a timeout duration for your jobs so that if a job runs longer than that, AWS Batch terminates the job. If a job is terminated due to a timeout, it isn't retried. Any timeout configuration that's specified during a SubmitJob operation overrides the timeout configuration defined here. For more information, see Job Timeouts (p. 19).
Required: No
attemptDurationSeconds
The time duration in seconds (measured from the job attempt's startedAt timestamp) after which AWS Batch terminates unfinished jobs. The minimum value for the timeout is 60 seconds.
Type: Integer
Required: No
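For example, a job definition that stops attempts running longer than one hour might include the following illustrative snippet:
"timeout": {
    "attemptDurationSeconds": 3600
}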
Using the awslogs log driver
By default, AWS Batch enables the awslogs log driver to send log information
to CloudWatch Logs. For more information about how Docker logs are processed, including
alternative ways to capture different file data or streams, see View logs for a container or service
in the Docker documentation.
To send system logs from your container instances to CloudWatch Logs, see Using CloudWatch Logs
with AWS Batch (p. 159). For more information about CloudWatch Logs, see Monitoring Log Files and
CloudWatch Logs quotas in the Amazon CloudWatch Logs User Guide.
awslogs-region
Required: No
Specify the Region where the awslogs log driver should send your Docker logs. By default, the
Region that's used is the same one as the one for the job. You can choose to send all of your logs
from jobs in different Regions to a single Region in CloudWatch Logs so that they are all visible
from one location. Alternatively, you can separate them by Region for a more granular
approach. However, when you choose this option, make sure that the specified log groups exist in
the Region that you specify.
awslogs-group
Required: No
With the awslogs-group option, you can specify the log group that the awslogs log driver sends
its log streams to. If this isn't specified, /aws/batch/job is used.
awslogs-stream-prefix
Required: No
With the awslogs-stream-prefix option, you can associate a log stream with the specified
prefix, and the Amazon ECS task ID of the AWS Batch job that the container belongs to. If you
specify a prefix with this option, then the log stream takes the following format:
prefix-name/default/ecs-task-id
awslogs-datetime-format
Required: No
This option defines a multiline start pattern in Python strftime format. A log message consists
of a line that matches the pattern and any following lines that don't match the pattern. Thus the
matched line is the delimiter between log messages.
One example of a use case for using this format is for parsing output such as a stack dump, which
might otherwise be logged in multiple entries. The correct pattern allows it to be captured in a
single entry.
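For example, if each log message begins with a timestamp such as 2021-05-01 19:00:01, a pattern like
the following (illustrative) groups a stack trace with the timestamped line that precedes it.
"awslogs-datetime-format": "%Y-%m-%d %H:%M:%S"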
awslogs-multiline-pattern
Required: No
This option defines a multiline start pattern using a regular expression. A log message consists of
a line that matches the pattern and any following lines that don't match the pattern. Thus, the
matched line is the delimiter between log messages.
awslogs-create-group
Required: No
Specify whether you want the log group to be created automatically. If this option isn't specified, it
defaults to false.
Warning
This option isn't recommended. We recommend that you create the log group in advance
using the CloudWatch Logs CreateLogGroup API action. Otherwise, each job tries to create the log
group, increasing the chance that the job fails.
Note
The IAM policy for your execution role must include the logs:CreateLogGroup
permission before you attempt to use awslogs-create-group.
Specifying a log configuration in your job definition
The following log configuration JSON snippets have a logConfiguration object specified for each job.
One is for a WordPress job that sends logs to a log group called awslogs-wordpress, and another is
for a MySQL container that sends logs to a log group called awslogs-mysql. Both containers use the
awslogs-example log stream prefix.
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "awslogs-wordpress",
"awslogs-stream-prefix": "awslogs-example"
}
}
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "awslogs-mysql",
"awslogs-stream-prefix": "awslogs-example"
}
}
In the AWS Batch console, the log configuration for the wordpress job definition is specified as shown
in the following image.
After you register a job definition with the awslogs log driver in its log
configuration, you can submit a job with that job definition to start sending logs to CloudWatch Logs.
For more information, see Submitting a Job (p. 14).
Specifying sensitive data
• To inject sensitive data into your containers as environment variables, use the secrets job definition
parameter.
• To reference sensitive information in the log configuration of a job, use the secretOptions job
definition parameter.
Topics
• Specifying sensitive data using Secrets Manager (p. 67)
• Specifying sensitive data using Systems Manager Parameter Store (p. 73)
Specifying sensitive data using Secrets Manager
When you inject a secret as an environment variable, you can specify a JSON key or version of a secret to
inject. This process helps you control the sensitive data exposed to your job. For more information about
secret versioning, see Key Terms and Concepts for AWS Secrets Manager in the AWS Secrets Manager User
Guide.
• To inject a secret using a specific JSON key or version of a secret, the container instance in your
compute environment must have version 1.37.0 or later of the Amazon ECS container agent installed.
However, we recommend using the latest container agent version. For information about checking
your agent version and updating to the latest version, see Updating the Amazon ECS container agent
in the Amazon Elastic Container Service Developer Guide.
To inject the full contents of a secret as an environment variable or to inject a secret in a log
configuration, your container instance must have version 1.22.0 or later of the container agent.
• Only secrets that store text data, which are secrets created with the SecretString parameter of the
CreateSecret API, are supported. Secrets that store binary data, which are secrets created with the
SecretBinary parameter of the CreateSecret API, aren't supported.
• When using a job definition that references Secrets Manager secrets to retrieve sensitive data for your
jobs, if you're also using interface VPC endpoints, you must create the interface VPC endpoints for
Secrets Manager. For more information, see Using Secrets Manager with VPC Endpoints in the AWS
Secrets Manager User Guide.
• Sensitive data is injected into your job when the job is initially started. If the secret is subsequently
updated or rotated, the job doesn't receive the updated value automatically. You must launch a new
job for it to use the updated secret value.
To provide access to the Secrets Manager secrets that you create, manually add the following
permissions as an inline policy to the execution role. For more information, see Adding and Removing
IAM Policies in the IAM User Guide.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetSecretValue",
"kms:Decrypt"
],
"Resource": [
"arn:aws:secretsmanager:<region>:<aws_account_id>:secret:<secret_name>",
"arn:aws:kms:<region>:<aws_account_id>:key/<key_id>"
]
}
]
}
To inject a secret as an environment variable, specify the following in your job definition:
• The secrets object containing the name of the environment variable to set in the job
• The Amazon Resource Name (ARN) of the Secrets Manager secret
• Additional parameters that contain the sensitive data to present to the job
The following example shows the full syntax that must be specified for the Secrets Manager secret.
arn:aws:secretsmanager:region:aws_account_id:secret:secret-name:json-key:version-stage:version-id
The following section describes the additional parameters. These parameters are optional. However,
if you don't use them, you must include the colons (:) so that the default values are used. Examples are
provided below for more context.
json-key
Specifies the name of the key in a key-value pair with the value that you want to set as the
environment variable value. Only values in JSON format are supported. If you don't specify a JSON
key, then the full contents of the secret is used.
version-stage
Specifies the staging label of the version of a secret that you want to use. If a version staging label
is specified, you can't specify a version ID. If no version stage is specified, the default behavior is to
retrieve the secret with the AWSCURRENT staging label.
Staging labels are used to keep track of different versions of a secret when they are either updated
or rotated. Each version of a secret has one or more staging labels and an ID. For more information,
see Key Terms and Concepts for AWS Secrets Manager in the AWS Secrets Manager User Guide.
version-id
Specifies the unique identifier of the version of a secret that you want to use. If a version ID is
specified, you can't specify a version staging label. If no version ID is specified, the default behavior
is to retrieve the secret with the AWSCURRENT staging label.
Version IDs are used to keep track of different versions of a secret when they are either updated or
rotated. Each version of a secret has an ID. For more information, see Key Terms and Concepts for
AWS Secrets Manager in the AWS Secrets Manager User Guide.
The following is a snippet of a job definition showing the format when referencing the full text of a
Secrets Manager secret.
"containerProperties": [{
"secrets": [{
"name": "environment_variable_name",
"valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:secret_name-AbCdEf"
}]
}]
}
The following shows an example output from a get-secret-value command that displays the contents of
a secret along with the version staging label and version ID associated with it.
{
"ARN": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf",
"Name": "appauthexample",
"VersionId": "871d9eca-18aa-46a9-8785-981dd39ab30c",
"SecretString": "{\"username1\":\"password1\",\"username2\":\"password2\",
\"username3\":\"password3\"}",
"VersionStages": [
"AWSCURRENT"
],
"CreatedDate": 1581968848.921
}
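For reference, you can produce output like the preceding with the AWS CLI; the secret name below is the
one used in this example.
aws secretsmanager get-secret-value --secret-id appauthexample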
Reference a specific key from the previous output in a job definition by specifying the key name at
the end of the ARN.
{
"containerProperties": [{
"secrets": [{
"name": "environment_variable_name",
"valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-
AbCdEf:username1::"
}]
}]
}
The following shows an example output from a describe-secret command that displays the unencrypted
contents of a secret along with the metadata for all versions of the secret.
{
"ARN": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-AbCdEf",
"Name": "appauthexample",
"Description": "Example of a secret containing application authorization data.",
"RotationEnabled": false,
"LastChangedDate": 1581968848.926,
"LastAccessedDate": 1581897600.0,
"Tags": [],
"VersionIdsToStages": {
"871d9eca-18aa-46a9-8785-981dd39ab30c": [
"AWSCURRENT"
],
"9d4cb84b-ad69-40c0-a0ab-cead36b967e8": [
"AWSPREVIOUS"
]
}
}
Reference a specific version staging label from the previous output in a job definition by specifying
the staging label at the end of the ARN.
{
"containerProperties": [{
"secrets": [{
"name": "environment_variable_name",
"valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-
AbCdEf::AWSPREVIOUS:"
}]
}]
}
Reference a specific version ID from the previous output in a job definition by specifying the version
ID at the end of the ARN.
{
"containerProperties": [{
"secrets": [{
"name": "environment_variable_name",
"valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-
AbCdEf::9d4cb84b-ad69-40c0-a0ab-cead36b967e8"
}]
}]
}
The following shows how to reference both a specific key within a secret and a specific version staging
label.
{
"containerProperties": [{
"secrets": [{
"name": "environment_variable_name",
"valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-
AbCdEf:username1:AWSPREVIOUS:"
}]
}]
}
To specify a specific key and version ID, use the following syntax.
{
"containerProperties": [{
"secrets": [{
"name": "environment_variable_name",
"valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:appauthexample-
AbCdEf:username1::9d4cb84b-ad69-40c0-a0ab-cead36b967e8"
}]
}]
}
The following is a snippet of a job definition showing the format when referencing a Secrets Manager
secret in the log configuration of a job.
{
"containerProperties": [{
"logConfiguration": [{
"logDriver": "splunk",
"options": {
"splunk-url": "https://cloud.splunk.com:8080"
},
"secretOptions": [{
"name": "splunk-token",
"valueFrom": "arn:aws:secretsmanager:region:aws_account_id:secret:secret_name-
AbCdEf"
}]
}]
}]
}
Alternatively, you can choose the Plaintext tab and enter the secret value in any way you like.
5. Choose the AWS KMS encryption key that you want to use to encrypt the protected text in the
secret. If you don't choose one, Secrets Manager checks to see if there's a default key for the
account, and uses it if it exists. If a default key doesn't exist, Secrets Manager creates one for you
automatically. You can also choose Add new key to create a custom CMK specifically for this secret.
To create your own AWS KMS CMK, you must have permissions to create CMKs in your account.
6. Choose Next.
7. For Secret name, type an optional path and name, such as production/MyAwesomeAppSecret
or development/TestSecret, and choose Next. You can optionally add a description to help you
remember the purpose of this secret later.
The secret name must be ASCII letters, digits, or any of the following characters: /_+=.@-
8. (Optional) At this point, you can configure rotation for your secret. For this procedure, leave it at
Disable automatic rotation and choose Next.
For information about how to configure rotation on new or existing secrets, see Rotating Your AWS
Secrets Manager Secrets.
9. Review your settings, and then choose Store secret to save everything you entered as a new secret
in Secrets Manager.
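If you prefer the AWS CLI, you can create a comparable secret with a command along the following lines.
The secret name and value are placeholders.
aws secretsmanager create-secret \
    --name production/MyAwesomeAppSecret \
    --secret-string '{"username1":"password1"}'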
Using Systems Manager Parameter Store
Topics
• Considerations for specifying sensitive data using Systems Manager Parameter Store (p. 73)
• Required IAM permissions for AWS Batch secrets (p. 73)
• Injecting sensitive data as an environment variable (p. 74)
• Injecting sensitive data in a log configuration (p. 74)
• Creating an AWS Systems Manager Parameter Store parameter (p. 75)
• This feature requires that your container instance have version 1.22.0 or later of the container agent.
However, we recommend using the latest container agent version. For information about checking
your agent version and updating to the latest version, see Updating the Amazon ECS container agent
in the Amazon Elastic Container Service Developer Guide.
• Sensitive data is injected into the container for your job when the container is initially started. If the
secret or Parameter Store parameter is subsequently updated or rotated, the container doesn't receive
the updated value automatically. You must launch a new job for it to use the updated secrets.
To provide access to the AWS Systems Manager Parameter Store parameters that you create, manually
add the following permissions as an inline policy to the execution role. For more information, see Adding
and Removing IAM Policies in the IAM User Guide.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ssm:GetParameters",
"secretsmanager:GetSecretValue",
"kms:Decrypt"
],
"Resource": [
"arn:aws:ssm:<region>:<aws_account_id>:parameter/<parameter_name>",
"arn:aws:secretsmanager:<region>:<aws_account_id>:secret:<secret_name>",
"arn:aws:kms:<region>:<aws_account_id>:key/<key_id>"
]
}
]
}
The following is a snippet of a job definition showing the format when referencing a Systems Manager
Parameter Store parameter. If the Systems Manager Parameter Store parameter exists in the same
Region as the job that you're launching, then you can use either the full ARN or the name of the
parameter. If the parameter exists in a different Region, then the full ARN must be specified.
{
"containerProperties": [{
"secrets": [{
"name": "environment_variable_name",
"valueFrom": "arn:aws:ssm:region:aws_account_id:parameter/parameter_name"
}]
}]
}
The following is a snippet of a job definition showing the format when referencing a Systems Manager
Parameter Store parameter in the log configuration of a job.
{
"containerProperties": [{
"logConfiguration": [{
"logDriver": "fluentd",
"options": {
"tag": "fluentd demo"
},
"secretOptions": [{
"name": "fluentd-address",
"valueFrom": "arn:aws:ssm:region:aws_account_id:parameter:parameter_name"
}]
}]
}]
}
• If you choose SecureString, the KMS Key ID field appears. If you don't provide a KMS
CMK ID, a KMS CMK ARN, an alias name, or an alias ARN, then the system uses alias/
aws/ssm. This is the default KMS CMK for Systems Manager. To avoid using this key,
choose a custom key. For more information, see Use Secure String Parameters in the AWS
Systems Manager User Guide.
• When you create a secure string parameter in the console by using the key-id parameter
with either a custom KMS CMK alias name or an alias ARN, you must specify the prefix
alias/ before the alias. The following are an alias ARN example and an alias name example:
arn:aws:kms:us-east-2:123456789012:alias/MyAliasName
alias/MyAliasName
6. For Value, type a value. For example, MyFirstParameter. If you chose SecureString, the value is
masked exactly as you entered it.
7. Choose Create parameter.
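If you prefer the AWS CLI, you can create a comparable secure string parameter with a command along the
following lines. The parameter name and value are placeholders.
aws ssm put-parameter \
    --name MyFirstParameter \
    --value "my-parameter-value" \
    --type SecureString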
Amazon EFS volumes
You can use Amazon EFS file systems with AWS Batch to export file system data across your fleet of
container instances. That way, your jobs have access to the same persistent storage. However, you must
configure your container instance AMI to mount the Amazon EFS file system before the Docker daemon
starts. Also, your job definitions must reference volume mounts on the container instance to use the file
system. The following sections help you get started using Amazon EFS with AWS Batch.
Amazon EFS volume considerations
• For jobs using EC2 resources, Amazon EFS file system support was added as a public preview with
Amazon ECS optimized AMI version 20191212 with container agent version 1.35.0. However,
Amazon EFS file system support entered general availability with Amazon ECS optimized AMI version
20200319 with container agent version 1.38.0, which contained the Amazon EFS access point and IAM
authorization features. We recommend that you use Amazon ECS optimized AMI version 20200319
or later to take advantage of these features. For more information, see Amazon ECS optimized AMI
versions in the Amazon Elastic Container Service Developer Guide.
Note
If you create your own AMI, you must use container agent 1.38.0 or later, ecs-init version
1.38.0-1 or later, and run the following commands on your Amazon EC2 instance. This is all
to enable the Amazon ECS volume plugin. The commands are dependent on whether you're
using Amazon Linux 2 or Amazon Linux as your base image.
Amazon Linux 2
Amazon Linux
• For jobs using Fargate resources, Amazon EFS file system support was added when using platform
version 1.4.0 or later. For more information, see AWS Fargate platform versions in the Amazon Elastic
Container Service Developer Guide.
• When specifying Amazon EFS volumes in jobs using Fargate resources, Fargate creates a supervisor
container that is responsible for managing the Amazon EFS volume. The supervisor container uses
a small amount of the job's memory. The supervisor container is visible when querying the task
metadata version 4 endpoint. For more information, see Task metadata endpoint version 4 in the
Amazon Elastic Container Service User Guide for AWS Fargate.
Access points can enforce a user identity, including the user's POSIX groups, for all file system requests
that are made through the access point. Access points can also enforce a different root directory for the
file system so that clients can only access data in the specified directory or its subdirectories.
Note
When creating an EFS access point, you specify a path on the file system to serve as the root
directory. When you reference the EFS file system with an access point ID in your AWS Batch job
definition, the root directory must either be omitted or set to /. This enforces the path that's set
on the EFS access point.
You can use an AWS Batch job IAM role to enforce that specific applications use a specific access point.
By combining IAM policies with access points, you can easily provide secure access to specific datasets for
your applications. This feature uses Amazon ECS IAM roles for task functionality. For more information,
see IAM Roles for Tasks in the Amazon Elastic Container Service Developer Guide.
Specifying an Amazon EFS file system in your job definition
The following job definition snippet shows the volumes and mountPoints syntax for a container that
uses an Amazon EFS file system.
{
"containerProperties": [
{
"name": "container-using-efs",
"image": "amazonlinux:2"
],
"command": [
"ls",
"-la",
"/mount/efs"
],
"mountPoints": [
{
"sourceVolume": "myEfsVolume",
"containerPath": "/mount/efs",
"readOnly": true
}
],
"volumes": [
{
"name": "myEfsVolume",
"efsVolumeConfiguration": {
"fileSystemId": "fs-12345678",
"rootDirectory": "/path/to/my/data",
"transitEncryption": "ENABLED",
"transitEncryptionPort": integer,
"authorizationConfig": {
"accessPointId": "fsap-1234567890abcdef1",
"iam": "ENABLED"
}
}
}
]
}
]
}
efsVolumeConfiguration
Type: Object
Required: No
This parameter is specified when you're using an Amazon EFS file system for job storage.
fileSystemId
The Amazon EFS file system ID to use.
Type: String
Required: Yes
rootDirectory
Type: String
Required: No
The directory within the Amazon EFS file system to mount as the root directory inside the host.
If this parameter is omitted, the root of the Amazon EFS volume is used. Specifying / has the
same effect as omitting this parameter. It can be up to 4,096 characters in length.
Important
If an EFS access point is specified in the authorizationConfig, the root directory
parameter must either be omitted or set to /. This enforces the path that's set on the
EFS access point.
transitEncryption
Type: String
Required: No
Determines whether to enable encryption for Amazon EFS data that's in transit between the
AWS Batch host and the Amazon EFS server. Transit encryption must be enabled if Amazon
EFS IAM authorization is used. If this parameter is omitted, the default value of DISABLED is
used. For more information, see Encrypting data in transit in the Amazon Elastic File System User
Guide.
transitEncryptionPort
Type: Integer
Required: No
The port to use when sending encrypted data between the AWS Batch host and the Amazon
EFS server. If you don't specify a transit encryption port, it uses the port selection strategy
that the Amazon EFS mount helper uses. The value must be between 0 and 65,535. For more
information, see EFS Mount Helper in the Amazon Elastic File System User Guide.
authorizationConfig
Type: Object
Required: No
The authorization configuration details for the Amazon EFS file system.
accessPointId
Type: String
Required: No
The access point ID to use. If an access point is specified, the root directory value in the
efsVolumeConfiguration must either be omitted or set to /. This enforces the path
that's set on the EFS access point. If an access point is used, transit encryption must be
enabled in the EFSVolumeConfiguration. For more information, see Working with
Amazon EFS Access Points in the Amazon Elastic File System User Guide.
iam
Type: String
Required: No
Determines whether to use the AWS Batch job IAM role that's defined in a job definition
when mounting the Amazon EFS file system. If enabled, transit encryption must be
enabled in the EFSVolumeConfiguration. If this parameter is omitted, the default value
of DISABLED is used. For more information about execution IAM roles, see AWS Batch
execution IAM role (p. 176).
Example job definitions
Even though the command and environment variables are hardcoded into the job definition in this
example, you can specify command and environment variable overrides to make the job definition more
versatile.
{
"jobDefinitionName": "fetch_and_run",
"type": "container",
"containerProperties": {
"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/fetch_and_run",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "2000"
},
{
"type": "VCPU",
"value": "2"
}
],
"command": [
"myjob.sh",
"60"
],
"jobRoleArn": "arn:aws:iam::123456789012:role/AWSBatchS3ReadOnly",
"environment": [
{
"name": "BATCH_FILE_S3_URL",
"value": "s3://my-batch-scripts/myjob.sh"
},
{
"name": "BATCH_FILE_TYPE",
"value": "script"
}
],
"user": "nobody"
}
}
Using parameter substitution
The Ref:: declarations in the command section are used to set placeholders for parameter substitution.
When you submit a job with this job definition, you specify the parameter overrides to fill in those
values, such as the inputfile and outputfile. The parameters section that follows sets a default
for codec, but you can override that parameter as needed.
{
"jobDefinitionName": "ffmpeg_parameters",
"type": "container",
"parameters": {"codec": "mp4"},
"containerProperties": {
"image": "my_repo/ffmpeg",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "2000"
},
{
"type": "VCPU",
"value": "2"
}
],
"command": [
"ffmpeg",
"-i",
"Ref::inputfile",
"-c",
"Ref::codec",
"-o",
"Ref::outputfile"
],
"jobRoleArn": "arn:aws:iam::123456789012:role/ECSTask-S3FullAccess",
"user": "nobody"
}
}
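For example, you might submit a job with this job definition and supply the parameter values as follows.
The job queue name and file names are placeholders.
aws batch submit-job \
    --job-name ffmpeg-example \
    --job-queue my-job-queue \
    --job-definition ffmpeg_parameters \
    --parameters inputfile=input.mp4,outputfile=output.mp4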
{
"containerProperties": {
"image": "tensorflow/tensorflow:1.8.0-devel-gpu",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "32000"
},
{
"type": "VCPU",
"value": "8"
}
],
"command": [
"sh",
"-c",
"cd /tensorflow/tensorflow/examples/tutorials/mnist; python mnist_deep.py"
]
},
"type": "container",
"jobDefinitionName": "tensorflow_mnist_deep"
}
You can create a file with the preceding JSON text called tensorflow_mnist_deep.json and then
register an AWS Batch job definition with the following command:
aws batch register-job-definition --cli-input-json file://tensorflow_mnist_deep.json
Multi-node parallel job
{
"jobDefinitionName": "gromacs-jobdef",
"jobDefinitionArn": "arn:aws:batch:us-east-2:123456789012:job-definition/gromacs-
jobdef:1",
"revision": 6,
"status": "ACTIVE",
"type": "multinode",
"parameters": {},
"nodeProperties": {
"numNodes": 2,
"mainNode": 0,
"nodeRangeProperties": [
{
"targetNodes": "0:1",
"container": {
"image": "123456789012.dkr.ecr.us-east-2.amazonaws.com/gromacs_mpi:latest",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "24000"
},
{
"type": "VCPU",
"value": "8"
}
],
"command": [],
"jobRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"ulimits": [],
"instanceType": "p3.2xlarge"
}
}
]
}
}
Job queues
Jobs are submitted to a job queue where they reside until they can be scheduled to run in a compute
environment. An AWS account can have multiple job queues. For example, you can create a queue that
uses Amazon EC2 On-Demand instances for high priority jobs and another queue that uses Amazon EC2
Spot Instances for low-priority jobs. Job queues have a priority that's used by the scheduler to determine
which jobs in which queue should be evaluated for execution first.
You also set a priority for the job queue that determines the order in which the AWS Batch scheduler
places jobs onto its associated compute environments. For example, if a compute environment is
associated with more than one job queue, the job queue with a higher priority is given preference for
scheduling jobs to that compute environment.
• For State, select Enabled so that your job queue can accept job submissions.
7. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
job queue. For more information, see Tagging your AWS Batch resources (p. 197).
8. In the Connected compute environments section, select one or more compute environments from
the list to associate with the job queue, in the order that the queue should attempt placement. The
job scheduler uses compute environment order to determine which compute environment should
start a given job. Compute environments must be in the VALID state before you can associate them
with a job queue. You can associate up to three compute environments with a job queue.
Note
All compute environments that are associated with a job queue must share the same
provisioning model, either EC2 (On-Demand and Spot) or Fargate (Fargate and Fargate
Spot). AWS Batch doesn't support mixing provisioning models in a single job queue.
Note
All compute environments that are associated with a job queue must share the same
architecture. AWS Batch doesn't support mixing compute environment architecture types in
a single job queue.
You can change the order of compute environments by choosing the up and down arrows next to
the Order column in the table.
9. Choose Create to finish and create your job queue.
Job queue template
{
"jobQueueName": "",
"state": "DISABLED",
"priority": 0,
"computeEnvironmentOrder": [
{
"order": 0,
"computeEnvironment": ""
}
],
"tags": {
"KeyName": ""
}
}
Note
You can generate the preceding job queue template with the following AWS CLI command.
aws batch create-job-queue --generate-cli-skeleton
Job queue name
jobQueueName
The name for your job queue. Up to 128 letters (uppercase and lowercase), numbers, and
underscores are allowed.
Type: String
Required: Yes
Priority
priority
The priority of the job queue. Job queues with a higher priority (or a higher integer value for the
priority parameter) are evaluated first when associated with the same compute environment. Priority
is determined in descending order, for example, a job queue with a priority value of 10 is given
scheduling preference over a job queue with a priority value of 1. All of the compute environments
must be either EC2 (EC2 or SPOT) or Fargate (FARGATE or FARGATE_SPOT); EC2 and Fargate
compute environments can't be mixed.
Type: Integer
Required: Yes
Scheduling policy
schedulingPolicyArn
The Amazon Resource Name (ARN) of the scheduling policy for the job queue. Job queues that
don't have a scheduling policy are scheduled in a first-in, first-out (FIFO) model. After a job queue
has a scheduling policy, it can be replaced but can't be removed. A job queue without a scheduling
policy is scheduled as a FIFO job queue and can't have a scheduling policy added. Job queues with
a scheduling policy can have a maximum of 500 active fair share identifiers. When the limit has been
reached, submissions of any jobs that add a new fair share identifier fail.
Type: String
Required: No
State
state
The state of the job queue. If the job queue state is ENABLED (the default value), it can accept jobs.
If the job queue state is DISABLED, new jobs can't be added to the queue, but jobs already in the
queue can finish.
Type: String
Required: No
Compute environment order
computeEnvironmentOrder
The set of compute environments mapped to a job queue and their order relative to each other. The
job scheduler uses this parameter to determine which compute environment should run a specific
job. Compute environments must be in the VALID state before you can associate them with a job
queue. You can associate up to three compute environments with a job queue. All of the compute
environments must be either EC2 (EC2 or SPOT) or Fargate (FARGATE or FARGATE_SPOT); EC2 and
Fargate compute environments can't be mixed.
Note
All compute environments that are associated with a job queue must share the same
architecture. AWS Batch doesn't support mixing compute environment architecture types in
a single job queue.
Required: Yes
computeEnvironment
Type: String
Required: Yes
order
The order of the compute environment. Compute environments are tried in ascending order.
For example, if two compute environments are associated with a job queue, the compute
environment with a lower order integer value is tried for job placement first.
Tags
tags
Key-value pair tags to associate with the job queue. For more information, see Tagging your AWS
Batch resources (p. 197).
Required: No
Job Scheduling
The AWS Batch scheduler evaluates when, where, and how to run jobs that have been submitted to a job
queue. If you need to guarantee the order in which jobs run, use the dependsOn parameter to SubmitJob
to specify the dependencies for each job.
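For example, a command along the following lines submits a job that starts only after the specified job
ID succeeds. The job name, queue, job definition, and job ID are placeholders.
aws batch submit-job \
    --job-name follow-up-job \
    --job-queue my-job-queue \
    --job-definition my-job-definition \
    --depends-on jobId=876da822-4198-45f2-a252-6cea32512ea8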
By default, jobs run in approximately the order in which they are submitted (first in, first out), as
long as all dependencies on other jobs have been met. If the job queue has a scheduling policy, the
scheduling policy will determine the order in which jobs are run. For more information, see Scheduling
policies (p. 116).
Compute environment
Job queues are mapped to one or more compute environments. Compute environments contain the
Amazon ECS container instances that are used to run containerized batch jobs. A specific compute
environment can also be mapped to one or more job queues. Within a job queue, the associated
compute environments each have an order that's used by the scheduler to determine where jobs that
are ready to be run should run. If the first compute environment has a status of VALID and has available
resources, the job is scheduled to a container instance within that compute environment. If the first
compute environment has a status of INVALID or can't provide a suitable compute resource, the
scheduler attempts to run the job on the next compute environment.
Note
AWS Batch does not support Windows containers, on either Fargate or EC2 resources.
Topics
• Managed compute environments (p. 87)
• Unmanaged compute environments (p. 88)
• Compute resource AMIs (p. 88)
• Launch template support (p. 96)
• Creating a compute environment (p. 99)
• Compute environment template (p. 104)
• Compute environment parameters (p. 105)
• EC2 Configurations (p. 113)
• Allocation strategies (p. 113)
• Compute Resource Memory Management (p. 114)
Managed compute environments
Managed compute environments launch Amazon ECS container instances into the VPC and subnets that
you specify when you create the compute environment. Amazon ECS container instances need external
network access to communicate with the Amazon ECS service endpoint. Some subnets don't provide
container instances with public IP addresses. If your container instances don't have public IP addresses,
they must use network address translation (NAT) to gain this access. For more information, see NAT
gateways in the Amazon VPC User Guide. For more information about how to create a VPC, see Tutorial:
Creating a VPC with Public and Private Subnets for Your Compute Environments (p. 165).
By default, AWS Batch managed compute environments use a recent, approved version of the Amazon
ECS optimized AMI for compute resources. However, you might want to create your own AMI to use for
your managed compute environments for various reasons. For more information, see Compute resource
AMIs (p. 88).
Note
AWS Batch doesn't upgrade the AMIs in a compute environment after it's created. For example,
it also doesn't update the AMIs in your compute environment when a newer version of the
Amazon ECS optimized AMI is available. You're responsible for the management of the guest
operating system. This includes any updates and security patches. You're also responsible for
any additional application software or utilities that you install on the compute resources. To use
a new AMI for your AWS Batch jobs:
Unmanaged compute environments
After you create your unmanaged compute environment, use the DescribeComputeEnvironments API
operation to view the compute environment details. Find the Amazon ECS cluster that's associated with
the environment and then manually launch your container instances into that Amazon ECS cluster.
The following AWS CLI command also provides the Amazon ECS cluster ARN (replace the compute
environment name with your own):
aws batch describe-compute-environments \
    --compute-environments <compute_environment_name> \
    --query "computeEnvironments[].ecsClusterArn"
For more information, see Launching an Amazon ECS container instance in the Amazon Elastic Container
Service Developer Guide. When you launch your compute resources, specify the Amazon ECS cluster ARN
that the resources should register with by using the following Amazon EC2 user data. Replace
ecsClusterArn with the cluster ARN that you obtained with the previous command.
#!/bin/bash
echo "ECS_CLUSTER=ecsClusterArn" >> /etc/ecs/ecs.config
Note
AWS Batch doesn't upgrade the AMIs in a compute environment after it's created. For example,
it also doesn't update the AMIs in your compute environment when a newer version of the
Amazon ECS optimized AMI is available. You're responsible for the management of the guest
operating system. This includes any updates and security patches. You're also responsible for
any additional application software or utilities that you install on the compute resources. To use
a new AMI for your AWS Batch jobs:
Topics
• Compute resource AMI specification (p. 89)
• Creating a compute resource AMI (p. 90)
• Using a GPU workload AMI (p. 92)
Compute resource AMI specification
Required
• A modern Linux distribution that's running at least version 3.10 of the Linux kernel on an HVM
virtualization type AMI. Windows containers are not supported.
Important
Multi-node parallel jobs can only run on compute resources that were launched on an Amazon
Linux instance with the ecs-init package installed. We recommend that you use the default
Amazon ECS optimized AMI when you create your compute environment. You can do this by
not specifying a custom AMI. For more information, see Multi-node Parallel Jobs (p. 27).
• The Amazon ECS container agent. We recommend that you use the latest version. For more
information, see Installing the Amazon ECS Container Agent in the Amazon Elastic Container Service
Developer Guide.
• The awslogs log driver must be specified as an available log driver with the
ECS_AVAILABLE_LOGGING_DRIVERS environment variable when the Amazon ECS container agent is
started. For more information, see Amazon ECS Container Agent Configuration in the Amazon Elastic
Container Service Developer Guide.
• A Docker daemon that's running at least version 1.9, and any Docker runtime dependencies. For more
information, see Check runtime dependencies in the Docker documentation.
Note
For a better experience, we recommend the Docker version that ships with and is tested with
the corresponding Amazon ECS agent version that you're using. For more information, see
Amazon ECS Container Agent Versions in the Amazon Elastic Container Service Developer
Guide.
Recommended
• An initialization and nanny process to run and monitor the Amazon ECS agent. The Amazon ECS
optimized AMI uses the ecs-init upstart process, and other operating systems might use systemd.
To view several example user data configuration scripts that use systemd to start and monitor the
Amazon ECS container agent, see Example container instance User Data Configuration Scripts in the
Amazon Elastic Container Service Developer Guide. For more information about ecs-init, see the
ecs-init project on GitHub. At a minimum, managed compute environments require the Amazon
ECS agent to start at boot. If the Amazon ECS agent isn't running on your compute resource, then it
can't accept jobs from AWS Batch.
The Amazon ECS optimized AMI is preconfigured with these requirements and recommendations.
We recommend that you use the Amazon ECS optimized AMI or an Amazon Linux AMI with the ecs-
init package installed for your compute resources. You should choose another AMI if your application
requires a specific operating system or a Docker version that's not yet available in those AMIs. For more
information, see Amazon ECS-Optimized AMI in the Amazon Elastic Container Service Developer Guide.
Creating a compute resource AMI
1. Choose a base AMI to start from. The base AMI must use HVM virtualization, and it can't be a
Windows AMI.
Note
The AMI that you choose for a compute environment must match the architecture of the
instance types that you intend to use for that compute environment. For example, if your
compute environment uses A1 instance types, the compute resource AMI that you choose
must support ARM instances. Amazon ECS vends both x86 and ARM versions of the Amazon
ECS optimized Amazon Linux 2 AMI. For more information, see Amazon ECS optimized
Amazon Linux 2 AMI in the Amazon Elastic Container Service Developer Guide.
The Amazon ECS optimized Amazon Linux 2 AMI is the default AMI for compute resources in
managed compute environments. The Amazon ECS optimized Amazon Linux 2 AMI is preconfigured
and tested on AWS Batch by AWS engineers. It's the simplest AMI for you to get started and to get
your compute resources that are running on AWS quickly. For more information, see Amazon ECS
Optimized AMI in the Amazon Elastic Container Service Developer Guide.
Alternatively, you can choose another Amazon Linux 2 variant and install the ecs-init package
with the commands below. For more information, see Installing the Amazon ECS container agent on
an Amazon Linux 2 EC2 instance in the Amazon Elastic Container Service Developer Guide.
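On Amazon Linux 2, this typically looks like the following commands, which install the agent from the
ecs extras topic and start it at boot. Verify the exact steps against the linked installation
instructions.
sudo amazon-linux-extras disable docker
sudo amazon-linux-extras install -y ecs
sudo systemctl enable --now --no-block ecs.service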
For example, if you want to run GPU workloads on your AWS Batch compute resources, you can start
with the Amazon Linux Deep Learning AMI and configure it to be able to run AWS Batch jobs. For
more information, see Using a GPU workload AMI (p. 92).
Important
If you choose a base AMI that doesn't support the ecs-init package, you must configure
a way to start the Amazon ECS agent at boot and keep it running. To view several example
user data configuration scripts that use systemd to start and monitor the Amazon ECS
container agent, see Example container instance user data configuration scripts in the
Amazon Elastic Container Service Developer Guide.
2. Launch an instance from your selected base AMI with the appropriate storage options for your
AMI. You can configure the size and number of attached Amazon EBS volumes, or instance storage
volumes if the instance type you've selected supports them. For more information, see Launching an
Instance and Amazon EC2 Instance Store in the Amazon EC2 User Guide for Linux Instances.
3. Connect to your instance with SSH and perform any necessary configuration tasks. This might
include any or all of the following steps:
• Installing the Amazon ECS container agent. For more information, see Installing the Amazon ECS
Container Agent in the Amazon Elastic Container Service Developer Guide.
• Configuring a script to format instance store volumes.
• Adding instance store volume or Amazon EFS file systems to the /etc/fstab file so that they're
mounted at boot.
• Configuring Docker options, such as enabling debugging or adjusting base image size.
• Installing packages or copying files.
For more information, see Connecting to Your Linux Instance Using SSH in the Amazon EC2 User
Guide for Linux Instances.
4. If you started the Amazon ECS container agent on your instance, you must stop it and remove any
persistent data checkpoint files before creating your AMI. Otherwise, the agent doesn't start on
instances that are launched from your AMI.
a. Stop the Amazon ECS container agent.
b. Remove the persistent data checkpoint files. By default, these files are located in the /var/
lib/ecs/data/ directory. Use the following command to remove any such files.
5. Create a new AMI from your running instance. For more information, see Creating an Amazon EBS-
Backed Linux AMI in the Amazon EC2 User Guide for Linux Instances guide.
1. After the AMI is created, create a compute environment with your new AMI. Make sure that you
select Enable user-specified AMI ID and specify your custom AMI ID in Step 5.h.iii (p. 102). For
more information, see Creating a compute environment (p. 99).
Note
The AMI that you choose for a compute environment must match the architecture of the
instance types that you intend to use for that compute environment. For example, if your
compute environment uses A1 instance types, the compute resource AMI that you choose
must support ARM instances. Amazon ECS vends both x86 and ARM versions of the Amazon
ECS optimized Amazon Linux 2 AMI. For more information, see Amazon ECS optimized
Amazon Linux 2 AMI in the Amazon Elastic Container Service Developer Guide.
2. Create a job queue and associate your new compute environment. For more information, see
Creating a job queue (p. 82).
Note
All compute environments that are associated with a job queue must share the same
architecture. AWS Batch doesn't support mixing compute environment architecture types in
a single job queue.
3. (Optional) Submit a sample job to your new job queue. For more information, see Example job
definitions (p. 79), Creating a job definition (p. 31), and Submitting a Job (p. 14).
Using a GPU workload AMI
In managed compute environments, if the compute environment specifies any p2, p3, p4, g3, g3s, or g4
instance types or instance families, then AWS Batch uses an Amazon ECS GPU optimized AMI.
AWS CLI
You can retrieve the metadata for the recommended GPU-optimized AMI with the following command (the
Region shown matches the example output):
aws ssm get-parameter --name /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended --region us-east-2
{
"Parameter": {
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended",
"LastModifiedDate": 1555434128.664,
"Value": "{\"schema_version\":1,\"image_name\":\"amzn2-ami-ecs-gpu-
hvm-2.0.20190402-x86_64-ebs\",\"image_id\":\"ami-083c800fe4211192f\",\"os\":\"Amazon
Linux 2\",\"ecs_runtime_version\":\"Docker version 18.06.1-ce\",\"ecs_agent_version\":
\"1.27.0\"}",
"Version": 9,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/amazon-
linux-2/gpu/recommended"
}
}
Python
import json
import boto3

# Create the SSM client used by the call below
ssm = boto3.client('ssm')
response = ssm.get_parameter(Name='/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/
recommended')
jsonVal = json.loads(response['Parameter']['Value'])
print("image_id = " + jsonVal['image_id'])
print("image_name = " + jsonVal['image_name'])
image_id = ami-083c800fe4211192f
image_name = amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs
AWS CLI
aws ssm get-parameters --region us-east-2 --names \
    /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name \
    /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id
The output includes the full metadata for each of the parameters:
{
"InvalidParameters": [],
"Parameters": [
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
image_id",
"LastModifiedDate": 1555434128.749,
"Value": "ami-083c800fe4211192f",
"Version": 9,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/image_id"
},
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
image_name",
"LastModifiedDate": 1555434128.712,
"Value": "amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs",
"Version": 9,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/image_name"
}
]
}
Python
import boto3

# Create the SSM client used by the call below
ssm = boto3.client('ssm')
response = ssm.get_parameters(
Names=['/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
image_name',
'/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
image_id'])
for parameter in response['Parameters']:
print(parameter['Name'] + " = " + parameter['Value'])
The output includes the AMI ID and AMI name, using the full path for the names:
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id =
ami-083c800fe4211192f
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name = amzn2-ami-
ecs-gpu-hvm-2.0.20190402-x86_64-ebs
AWS CLI
aws ssm get-parameters-by-path --region us-east-2 \
    --path /aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended
The output includes the full metadata for all of the parameters under the specified path:
{
"Parameters": [
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
ecs_agent_version",
"LastModifiedDate": 1555434128.801,
"Value": "1.27.0",
"Version": 8,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/ecs_agent_version"
},
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
ecs_runtime_version",
"LastModifiedDate": 1548368308.213,
"Value": "Docker version 18.06.1-ce",
"Version": 1,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/ecs_runtime_version"
},
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
image_id",
"LastModifiedDate": 1555434128.749,
"Value": "ami-083c800fe4211192f",
"Version": 9,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/image_id"
},
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
image_name",
"LastModifiedDate": 1555434128.712,
"Value": "amzn2-ami-ecs-gpu-hvm-2.0.20190402-x86_64-ebs",
"Version": 9,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/image_name"
},
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/os",
"LastModifiedDate": 1548368308.143,
"Value": "Amazon Linux 2",
"Version": 1,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/os"
},
{
"Name": "/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/
schema_version",
"LastModifiedDate": 1548368307.914,
"Value": "1",
"Version": 1,
"Type": "String",
"ARN": "arn:aws:ssm:us-east-2::parameter/aws/service/ecs/optimized-ami/
amazon-linux-2/gpu/recommended/schema_version"
}
]
}
Python
import boto3

# Create the SSM client used by the call below
ssm = boto3.client('ssm')
response = ssm.get_parameters_by_path(Path='/aws/service/ecs/optimized-ami/amazon-
linux-2/gpu/recommended')
for parameter in response['Parameters']:
print(parameter['Name'] + " = " + parameter['Value'])
The output includes the values of all the parameter names at the specified path, using the full path
for the names:
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_agent_version =
1.27.0
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/ecs_runtime_version =
Docker version 18.06.1-ce
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_id =
ami-083c800fe4211192f
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/image_name = amzn2-ami-
ecs-gpu-hvm-2.0.20190402-x86_64-ebs
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/os = Amazon Linux 2
/aws/service/ecs/optimized-ami/amazon-linux-2/gpu/recommended/schema_version = 1
For more information, see Retrieving Amazon ECS-Optimized AMI Metadata in the Amazon Elastic
Container Service Developer Guide.
Launch template support
You must create a launch template before you can associate it with a compute environment. You can
create a launch template in the Amazon EC2 console, or you can use the AWS CLI or an AWS SDK. For
example, the following JSON file represents a launch template that resizes the Docker data volume for
the default AWS Batch compute resource AMI and also sets it to be encrypted.
{
"LaunchTemplateName": "increase-container-volume-encrypt",
"LaunchTemplateData": {
"BlockDeviceMappings": [
{
"DeviceName": "/dev/xvdcz",
"Ebs": {
"Encrypted": true,
"VolumeSize": 100,
"VolumeType": "gp2"
}
}
]
}
}
You can create the previous launch template by saving the JSON to a file called lt-data.json and
running the following AWS CLI command:
aws ec2 create-launch-template --cli-input-json file://lt-data.json
For more information about launch templates, see Launching an Instance from a Launch Template in the
Amazon EC2 User Guide for Linux Instances.
If you use a launch template to create your compute environment, you can move the following existing
compute environment parameters to your launch template:
Note
If any of these parameters (with the exception of Amazon EC2 tags) are specified both in the
launch template and in the compute environment configuration, the compute environment
parameters take precedence. Amazon EC2 tags are merged between the launch template and
the compute environment configuration. If there is a collision on the tag's key, then the value in
the compute environment configuration takes precedence.
• Instance type (specify your desired instance types when you create your compute environment)
• Instance role (specify your desired instance role when you create your compute environment)
• Network interface subnets (specify your desired subnets when you create your compute environment)
• Instance market options (AWS Batch must control Spot Instance configuration)
• Disable API termination (AWS Batch must control instance lifecycle)
AWS Batch doesn't support updating a compute environment with a new launch template version. If you
update your launch template, you must create a new compute environment with the new template for
the changes to take effect.
Amazon EC2 user data in launch templates
Amazon EC2 user data in launch templates must be in the MIME multi-part archive format. This is
because your user data is merged with other AWS Batch user data that's required to configure your
compute resources. You can combine multiple user data blocks together into a single MIME multi-part
file. For example, you might want to combine a cloud boothook that configures the Docker daemon with
a user data shell script that writes configuration information for the Amazon ECS container agent.
If you're using AWS CloudFormation, the AWS::CloudFormation::Init type can be used with the cfn-init
helper script to perform common configuration scenarios.
The following are example MIME multi-part files that you can use to create your own.
Note
If you add user data to a launch template in the Amazon EC2 console, you can paste it in as
plaintext, or upload from a file. If you use the AWS CLI or an AWS SDK, you must first base64
encode the user data and submit that string as the value of the UserData parameter when you
call CreateLaunchTemplate, as shown in this JSON.
{
"LaunchTemplateName": "base64-user-data",
"LaunchTemplateData": {
"UserData":
"ewogICAgIkxhdW5jaFRlbXBsYXRlTmFtZSI6ICJpbmNyZWFzZS1jb250YWluZXItdm9sdW..."
}
}
Examples
• Example: Mount an existing Amazon EFS file system (p. 98)
• Example: Override default Amazon ECS container agent configuration (p. 98)
• Example: Mount an existing Amazon FSx for Lustre file system (p. 99)
Example: Mount an existing Amazon EFS file system
This example MIME multi-part file configures the compute resource to install the amazon-efs-utils
package and mount an existing Amazon EFS file system at /mnt/efs.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="
--==MYBOUNDARY==
Content-Type: text/cloud-config; charset="us-ascii"
packages:
- amazon-efs-utils
runcmd:
- file_system_id_01=fs-abcdef123
- efs_directory=/mnt/efs
- mkdir -p ${efs_directory}
- echo "${file_system_id_01}:/ ${efs_directory} efs tls,_netdev" >> /etc/fstab
- mount -a -t efs defaults
--==MYBOUNDARY==--
Example: Override default Amazon ECS container agent configuration
This example MIME multi-part file overrides the default Docker image cleanup settings for a compute
resource.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="
--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"
#!/bin/bash
echo ECS_IMAGE_CLEANUP_INTERVAL=60m >> /etc/ecs/ecs.config
echo ECS_IMAGE_MINIMUM_CLEANUP_AGE=60m >> /etc/ecs/ecs.config
--==MYBOUNDARY==--
Example: Mount an existing Amazon FSx for Lustre file system
This example MIME multi-part file configures the compute resource to install the lustre2.10 package
from the Extras Library and mount an existing FSx for Lustre file system at /scratch. This example is
for Amazon Linux 2. For installation instructions for other Linux distributions, see Installing the Lustre
Client in the Amazon FSx for Lustre User Guide.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="
--==MYBOUNDARY==
Content-Type: text/cloud-config; charset="us-ascii"
runcmd:
- file_system_id_01=fs-0abcdef1234567890
- region=us-east-2
- fsx_directory=/scratch
- amazon-linux-extras install -y lustre2.10
- mkdir -p ${fsx_directory}
- mount -t lustre ${file_system_id_01}.fsx.${region}.amazonaws.com@tcp:/fsx ${fsx_directory}
--==MYBOUNDARY==--
In the volumes and mountPoints members of the container properties, the mount points must be
mapped into the container.
{
"volumes": [
{
"host": {
"sourcePath": "/scratch"
},
"name": "Scratch"
}
],
"mountPoints": [
{
"containerPath": "/scratch",
"sourceVolume": "Scratch"
}
],
}
Creating a compute environment
Contents
• To create a managed compute environment using AWS Fargate resources (p. 100)
• To create a managed compute environment using EC2 resources (p. 101)
To create a managed compute environment using AWS Fargate resources
• For Service role, choose Batch service-linked role. The role allows the AWS Batch service
to make calls to the required AWS API operations on your behalf. For more information, see
Service-linked role permissions for AWS Batch (p. 181).
5. Configure your Instance configuration.
a. For Provisioning model, choose Fargate to launch Fargate On-Demand resources or Fargate
Spot to use Fargate Spot resources.
b. For Maximum vCPUs, choose the maximum number of vCPUs that your compute environment
can scale out to, regardless of job queue demand.
c.
6. Configure networking.
Important
Compute resources need access to communicate with the Amazon ECS service endpoint.
This can be through an interface VPC endpoint or through your compute resources having
public IP addresses.
For more information about interface VPC endpoints, see Amazon ECS Interface VPC
Endpoints (AWS PrivateLink) in the Amazon Elastic Container Service Developer Guide.
If you do not have an interface VPC endpoint configured and your compute resources do
not have public IP addresses, then they must use network address translation (NAT) to
provide this access. For more information, see NAT gateways in the Amazon VPC User Guide.
For more information, see Tutorial: Creating a VPC with Public and Private Subnets for Your
Compute Environments (p. 165).
a. For VPC ID, choose a VPC where you intend to launch your instances.
b. For Subnets, choose which subnets in the selected VPC should host your instances. By default,
all subnets within the selected VPC are chosen.
c. (Optional) Expand Additional settings: Security groups, EC2 tags.
• For Security groups, choose a security group to attach to your instances. By default, the
default security group for your VPC is chosen.
7. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
compute environment. For more information, see Tagging your AWS Batch resources (p. 197).
8. Choose Create compute environment to finish.
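If you prefer to use the AWS CLI, the same kind of Fargate compute environment can be described as a JSON request for the create-compute-environment command. The following is only a minimal sketch: the compute environment name, subnet, and security group values are placeholders, and the service role is omitted so that the AWS Batch service-linked role is used.
{
    "computeEnvironmentName": "my-fargate-ce",
    "type": "MANAGED",
    "state": "ENABLED",
    "computeResources": {
        "type": "FARGATE",
        "maxvCpus": 256,
        "subnets": [ "subnet-0123456789abcdef0" ],
        "securityGroupIds": [ "sg-0123456789abcdef0" ]
    }
}
You can save the JSON to a file and pass it to aws batch create-compute-environment with the --cli-input-json file://filename.json option.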
To create a managed compute environment using EC2 resources
i. For Service role, choose Batch service-linked role. The role allows the AWS Batch service
to make calls to the required AWS API operations on your behalf. For more information, see
Service-linked role permissions for AWS Batch (p. 181).
ii. For Instance role, choose to create a new instance profile or use an existing instance profile
that has the required IAM permissions attached. This instance profile allows the Amazon
ECS container instances that are created for your compute environment to make calls to
the required AWS API operations on your behalf. For more information, see Amazon ECS
instance role (p. 145). If you choose to create a new instance profile, the required role
(ecsInstanceRole) is created for you.
iii. For EC2 key pair choose an existing Amazon EC2 key pair to associate with the instance at
launch. You can use this key pair to connect to your instances with SSH. Make sure to verify
that your security group allows incoming traffic on port 22.
5. Configure your Instance configuration.
a. For Provisioning model, choose On-Demand to launch Amazon EC2 On-Demand Instances or
Spot to use Amazon EC2 Spot Instances.
b. If you chose to use Spot Instances:
• (Optional) For Maximum % on-demand price, choose the maximum percentage that a Spot
Instance price can be when compared with the On-Demand price for that instance type
before instances are launched. For example, if your maximum price is 20%, then the Spot
price must be less than 20% of the current On-Demand price for that EC2 instance. You
always pay the lowest (market) price and never more than your maximum percentage. If
you leave this field empty, the default value is 100% of the On-Demand price.
c. For Minimum vCPUs, choose the minimum number of EC2 vCPUs that your compute
environment should maintain, regardless of job queue demand.
d. For Maximum vCPUs, choose the maximum number of EC2 vCPUs that your compute
environment can scale out to, regardless of job queue demand.
e. For Desired vCPUs, choose the number of EC2 vCPUs that your compute environment should
launch with. As your job queue demand increases, AWS Batch can increase the desired number
of vCPUs in your compute environment and add EC2 instances, up to the maximum vCPUs. As
demand decreases, AWS Batch can decrease the desired number of vCPUs in your compute
environment and remove instances, down to the minimum vCPUs.
f. For Allowed instance types, choose the Amazon EC2 instance types that can be launched. You
can specify instance families to launch any instance type within those families (for example, c5,
c5n, or p3), or you can specify specific sizes within a family (such as c5.8xlarge). Note that
metal instance types aren't in the instance families. For example, c5 doesn't include c5.metal.
You can also choose optimal to select instance types (from the C4, M4, and R4 instance
families) as needed to match the demand of your job queues.
Note
When you create a compute environment, the instance types that you select for the
compute environment must share the same architecture. For example, you can't mix
x86 and ARM instances in the same compute environment.
Note
AWS Batch will scale GPUs based on the required amount in your job queues. To use
GPU scheduling, the compute environment must include instance types from the p2,
p3, p4, g3, g3s, or g4 families.
Note
Currently, optimal uses instance types from the C4, M4, and R4 instance families. In
Regions that don't have instance types from those instance families, instance types
from the C5, M5, and R5 instance families are used.
g. For Allocation strategy, choose the allocation strategy to use when selecting instance
types from the list of allowed instance types. BEST_FIT_PROGRESSIVE is usually the better
choice for EC2 On-Demand compute environments, and SPOT_CAPACITY_OPTIMIZED for
EC2 Spot compute environments. For more information, see the section called “Allocation
strategies” (p. 113).
h. (Optional) Expand Additional settings: launch template, user specified AMI.
i. (Optional) For Launch template, select an existing Amazon EC2 launch template to
configure your compute resources. The default version of the template is automatically
populated. For more information, see Launch template support (p. 96).
ii. (Optional) For Launch template version, enter $Default, $Latest, or a specific version
number to use.
Important
After the compute environment is created, the launch template version used
isn't changed even if the $Default or $Latest version for the launch template
is updated. To use a new launch template version, create a new compute
environment, add the new compute environment to the existing job queue, remove
the old compute environment from the job queue, and delete the old compute
environment.
iii. (Optional) Check Enable user-specified AMI ID to use your own custom AMI. By default,
AWS Batch managed compute environments use a recent, approved version of the Amazon
ECS optimized AMI for compute resources. You can create and use your own AMI in your
compute environment by following the compute resource AMI specification. For more
information, see Compute resource AMIs (p. 88).
Note
The AMI that you choose for a compute environment must match the architecture
of the instance types that you intend to use for that compute environment. For
example, if your compute environment uses A1 instance types, the compute
resource AMI that you choose must support ARM instances. Amazon ECS vends
both x86 and ARM versions of the Amazon ECS optimized Amazon Linux 2 AMI. For
more information, see Amazon ECS optimized Amazon Linux 2 AMI in the Amazon
Elastic Container Service Developer Guide.
• For AMI ID, paste your custom AMI ID and choose Validate AMI.
iv. (Optional) For EC2 configuration choose Image type and Image ID override values to
provide information for AWS Batch to select Amazon Machine Images (AMIs) for instances
in the compute environment. If the Image ID override isn't specified for each Image type,
AWS Batch selects a recent Amazon ECS optimized AMI. If no Image type is specified, the
default is Amazon Linux for non-GPU, non AWS Graviton instances. In the future, this
default will change to Amazon Linux 2 for all non-GPU instances.
Amazon Linux 2
Default for all AWS Graviton-based instance families (for example, C6g, M6g, R6g, and
T4g) and can be used for all non-GPU instance types.
Amazon Linux 2 (GPU)
Default for all GPU instance families (for example P4 and G4) and can be used for all
non AWS Graviton-based instance types.
Amazon Linux
Default for all non-GPU, non AWS Graviton instance families. Amazon Linux is reaching
the end-of-life of standard support. For more information, see Amazon Linux AMI.
i.
6. Configure networking.
Important
Compute resources need access to communicate with the Amazon ECS service endpoint.
This can be through an interface VPC endpoint or through your compute resources having
public IP addresses.
For more information about interface VPC endpoints, see Amazon ECS Interface VPC
Endpoints (AWS PrivateLink) in the Amazon Elastic Container Service Developer Guide.
If you do not have an interface VPC endpoint configured and your compute resources do
not have public IP addresses, then they must use network address translation (NAT) to
provide this access. For more information, see NAT gateways in the Amazon VPC User Guide.
For more information, see Tutorial: Creating a VPC with Public and Private Subnets for Your
Compute Environments (p. 165).
i. For Security groups, choose a security group to attach to your instances. By default, the
default security group for your VPC is chosen.
ii. (Optional) In EC2 tags, you can apply tags to the Amazon EC2 instances that are launched in
your compute environment. For example, you can specify "Name": "AWS Batch Instance -
C4OnDemand" as a tag so that each instance in your compute environment has that name.
This is helpful for recognizing your AWS Batch instances in the Amazon EC2 console.
Note
EC2 tags aren't available when you use the Fargate or Fargate Spot provisioning
models.
7. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
compute environment. For more information, see Tagging your AWS Batch resources (p. 197).
8. Choose Create compute environment to finish.
10. (Optional) Launch container instances into the associated Amazon ECS cluster. For more information,
see Launching an Amazon ECS container instance in the Amazon Elastic Container Service Developer
Guide. When you launch your compute resources, specify the Amazon ECS cluster ARN that the
resources should register with by using the following Amazon EC2 user data. Replace ecsClusterArn with
the cluster ARN you obtained with the previous command.
#!/bin/bash
echo "ECS_CLUSTER=ecsClusterArn" >> /etc/ecs/ecs.config
Note
Your unmanaged compute environment doesn't have any compute resources until you
launch them manually.
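If you need to look up the Amazon ECS cluster ARN for your unmanaged compute environment, one way is the describe-compute-environments AWS CLI command. The compute environment name below is a placeholder.
aws batch describe-compute-environments \
    --compute-environments unmanagedCE \
    --query "computeEnvironments[0].ecsClusterArn"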
Compute environment parameters
{
"computeEnvironmentName": "",
"type": "UNMANAGED",
"state": "ENABLED",
"computeResources": {
"type": "SPOT",
"allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
"minvCpus": 0,
"maxvCpus": 0,
"desiredvCpus": 0,
"instanceTypes": [
""
],
"imageId": "",
"subnets": [
""
],
"securityGroupIds": [
""
],
"ec2KeyPair": "",
"instanceRole": "",
"tags": {
"KeyName": ""
},
"placementGroup": "",
"bidPercentage": 0,
"spotIamFleetRole": "",
"launchTemplate": {
"launchTemplateId": "",
"launchTemplateName": "",
"version": ""
},
"ec2Configuration": [
{
"imageType": "",
"imageIdOverride": ""
}
]
},
"serviceRole": "",
"tags": {
"KeyName": ""
}
}
Note
You can generate the preceding compute environment template with the following AWS CLI
command.
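One way to generate such a skeleton is the AWS CLI's built-in --generate-cli-skeleton option, for example:
aws batch create-compute-environment --generate-cli-skeleton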
Topics
• Compute environment name (p. 105)
• Type (p. 106)
• State (p. 106)
• Compute resources (p. 106)
• Service role (p. 112)
• Tags (p. 112)
Compute environment name
computeEnvironmentName
The name for your compute environment. The name can be up to 128 characters in length. It can
contain uppercase and lowercase letters, numbers, hyphens (-), and underscores (_).
Type: String
Required: Yes
Type
type
The type of the compute environment. Choose MANAGED to have AWS Batch manage the EC2 or
Fargate compute resources that you define. For more information, see Compute resources (p. 106).
Choose UNMANAGED to manage your own EC2 compute resources.
Type: String
Required: Yes
State
state
If the state is ENABLED, the AWS Batch scheduler attempts to place jobs within the environment.
These jobs are from an associated job queue on the compute resources. If the compute environment
is managed, it can scale its instances out or in automatically based on job queue demand.
If the state is DISABLED, the AWS Batch scheduler doesn't attempt to place jobs within the
environment. Jobs in a STARTING or RUNNING state continue to progress normally. Managed
compute environments in the DISABLED state don't scale out. However, after instances go idle, they
scale in to the smallest number of instances that satisfies the minvCpus value.
Type: String
Required: No
Compute resources
computeResources
Details of the compute resources managed by the compute environment. For more information, see
Compute Environments.
type
The type of compute environment. You can choose either to use EC2 On-Demand Instances
(EC2) and EC2 Spot Instances (SPOT), or to use Fargate capacity (FARGATE) and Fargate Spot
capacity (FARGATE_SPOT) in your managed compute environment. If you choose SPOT, you
must also specify an Amazon EC2 Spot Fleet role with the spotIamFleetRole parameter. For
more information, see Amazon EC2 spot fleet role (p. 145).
Required: Yes
allocationStrategy
The allocation strategy to use for the compute resource if not enough instances of the best
fitting EC2 instance type can be allocated. This might be due to availability of the instance
type in the Region or Amazon EC2 service limits. For more information, see Allocation
strategies (p. 113).
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
BEST_FIT (default)
AWS Batch selects an instance type that best fits the needs of the jobs with a preference
for the lowest cost instance type. If additional instances of the selected instance type
aren't available, AWS Batch waits for the additional instances to be available. If there aren't
enough instances available, or if you're hitting Amazon EC2 service limits then additional
jobs don't run until currently running jobs have completed. This allocation strategy keeps
costs lower but can limit scaling. If you're using Spot Fleets with BEST_FIT then the Spot
Fleet IAM Role must be specified.
BEST_FIT_PROGRESSIVE
Use additional instance types that are large enough to meet the requirements of the jobs
in the queue, with a preference for instance types with a lower cost for each unit vCPU. If
additional instances of the previously selected instance types aren't available, AWS Batch
selects new instance types.
SPOT_CAPACITY_OPTIMIZED
(Only available for Spot Instance compute resources) Use additional instance types that
are large enough to meet the requirements of the jobs in the queue, with a preference for
instance types that are less likely to be interrupted.
Required: No
minvCpus
The minimum number of Amazon EC2 vCPUs that an environment should maintain (even if a
compute environment is DISABLED).
Note
This parameter isn't applicable to jobs running on Fargate resources, and shouldn't be
specified.
Type: Integer
Required: Yes
maxvCpus
The maximum number of Amazon EC2 vCPUs that an environment can reach.
Note
With both BEST_FIT_PROGRESSIVE and SPOT_CAPACITY_OPTIMIZED allocation
strategies, AWS Batch might need to exceed maxvCpus to meet your capacity
requirements. In this event, AWS Batch never exceeds maxvCpus by more than a single
instance. For example, AWS Batch uses no more than a single instance from among
those specified in your compute environment.
Type: Integer
Required: Yes
desiredvCpus
The desired number of Amazon EC2 vCPUs in the compute environment. AWS Batch modifies
this value between the minimum and maximum values based on job queue demand.
Note
This parameter isn't applicable to jobs running on Fargate resources, and shouldn't be
specified.
Type: Integer
Required: No
instanceTypes
The instance types that can be launched. This parameter isn't applicable to jobs that are running
on Fargate resources, and shouldn't be specified. You can specify instance families to launch
any instance type within those families (for example, c5, c5n, or p3). Or, you can specify
specific sizes within a family (such as c5.8xlarge). Note that metal instance types aren't in the
instance families (for example, c5 doesn't include c5.metal). You can also choose optimal to
select instance types (from the C4, M4, and R4 instance families) that match the demand of your
job queues.
Note
When you create a compute environment, the instance types that you select for the
compute environment must share the same architecture. For example, you can't mix
x86 and ARM instances in the same compute environment.
Note
Currently, optimal uses instance types from the C4, M4, and R4 instance families. In
Regions that don't have instance types from those instance families, instance types
from the C5, M5, and R5 instance families are used.
Required: Yes
imageId
The Amazon Machine Image (AMI) ID used for instances launched in the compute environment.
This parameter is overridden by the imageIdOverride member of the Ec2Configuration
structure.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
Note
The AMI that you choose for a compute environment must match the architecture of
the instance types that you intend to use for that compute environment. For example, if
your compute environment uses A1 instance types, the compute resource AMI that you
choose must support ARM instances. Amazon ECS vends both x86 and ARM versions
of the Amazon ECS optimized Amazon Linux 2 AMI. For more information, see Amazon
ECS optimized Amazon Linux 2 AMI in the Amazon Elastic Container Service Developer
Guide.
Type: String
Required: No
subnets
The VPC subnets into which the compute resources are launched. These subnets must be within
the same VPC. Fargate compute resources can contain a maximum of 16 subnets. For more
information, see VPCs and Subnets in the Amazon VPC User Guide.
Required: Yes
securityGroupIds
The Amazon EC2 security groups associated with instances launched in the compute
environment. One or more security groups must be specified, either in securityGroupIds or
using a launch template referenced in launchTemplate. This parameter is required for jobs
running on Fargate resources and must contain at least one security group. (Fargate doesn't
support launch templates.) If security groups are specified using both securityGroupIds and
launchTemplate, the values in securityGroupIds will be used.
Required: Yes
ec2KeyPair
The EC2 key pair that's used for instances launched in the compute environment. You can use
this key pair to log in to your instances with SSH.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
Type: String
Required: No
instanceRole
The Amazon ECS instance profile to attach to Amazon EC2 instances in a compute environment.
This parameter isn't applicable to jobs that are running on Fargate resources, and shouldn't be
specified. You can specify the short name or full Amazon Resource Name (ARN) of an instance
profile. For example, ecsInstanceRole or arn:aws:iam::aws_account_id:instance-
profile/ecsInstanceRole. For more information, see Amazon ECS instance role (p. 145).
Type: String
Required: No
tags
Key-value pair tags to be applied to EC2 instances that are launched in the compute
environment. For example, you can specify "Name": "AWS Batch Instance -
C4OnDemand" as a tag so that each instance in your compute environment has that name. This
is helpful for recognizing your AWS Batch instances in the Amazon EC2 console. These tags can't
be updated or removed after the compute environment has been created. Any changes require
creating a new compute environment and removing the previous compute environment. These
tags aren't seen when using the AWS Batch ListTagsForResource API operation.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
Required: No
placementGroup
The Amazon EC2 placement group to associate with your compute resources. This parameter
isn't applicable to jobs running on Fargate resources, and shouldn't be specified. If you intend
to submit multi-node parallel jobs to your compute environment, you should consider creating
a cluster placement group and associate it with your compute resources. This keeps your multi-
node parallel job on a logical grouping of instances within a single Availability Zone with high
network flow potential. For more information, see Placement Groups in the Amazon EC2 User
Guide for Linux Instances.
Type: String
Required: No
bidPercentage
The maximum percentage that an EC2 Spot Instance price can be when compared with the
On-Demand price for that instance type before instances are launched. For example, if your
maximum percentage is 20%, then the Spot price must be less than 20% of the current On-
Demand price for that EC2 instance. You always pay the lowest (market) price and never more
than your maximum percentage. If you leave this field empty, the default value is 100% of the
On-Demand price.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
Required: No
spotIamFleetRole
The Amazon Resource Name (ARN) of the Amazon EC2 Spot Fleet IAM role applied to a SPOT
compute environment. This role is required if the allocation strategy is set to BEST_FIT or
if the allocation strategy isn't specified. For more information, see Amazon EC2 spot fleet
role (p. 145).
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
Important
To tag your Spot Instances on creation, the Spot Fleet IAM role specified here must
use the newer AmazonEC2SpotFleetTaggingRole managed policy. The previously
recommended AmazonEC2SpotFleetRole managed policy doesn't have the required
permissions to tag Spot Instances. For more information, see Spot Instances Not
Tagged on Creation (p. 204).
Type: String
launchTemplate
An optional launch template to associate with your compute resources. This parameter isn't
applicable to jobs running on Fargate resources, and shouldn't be specified. Any other compute
resource parameters that you specify in a CreateComputeEnvironment API operation override
the same parameters in the launch template. To use a launch template, you must specify
either the launch template ID or launch template name in the request, but not both. For more
information, see Launch template support (p. 96).
Type: LaunchTemplateSpecification object
Required: No
launchTemplateId
Type: String
Required: No
launchTemplateName
Type: String
Required: No
version
If the value is $Latest, the latest version of the launch template is used. If the value is
$Default, the default version of the launch template is used.
Important
After the compute environment is created, the launch template version used
will not be changed, even if the $Default or $Latest version for the launch
template is updated. To use a new launch template version, create a new compute
environment, add the new compute environment to the existing job queue, remove
the old compute environment from the job queue, and delete the old compute
environment.
Default: $Default.
Type: String
Required: No
ec2Configuration
Provides information used to select Amazon Machine Images (AMIs) for instances in the EC2
compute environment. If Ec2Configuration isn't specified, the default is Amazon Linux 2
(ECS_AL2). Before March 31, 2021, this default was Amazon Linux (ECS_AL1) for non-GPU, non
AWS Graviton instances.
Note
This parameter isn't applicable to jobs that are running on Fargate resources, and
shouldn't be specified.
Required: No
imageIdOverride
The AMI ID used for instances launched in the compute environment that matches the
image type. This setting overrides the imageId set in the computeResource object.
Type: String
Required: No
imageType
The image type to match with the instance type to select an AMI. If the imageIdOverride
parameter isn't specified, then a recent Amazon ECS optimized AMI is used.
Amazon Linux 2 (ECS_AL2)
Default for all AWS Graviton based instance families (for example, C6g, M6g, R6g, and
T4g) and can be used for all non-GPU instance types.
Amazon Linux 2 (GPU) (ECS_AL2_NVIDIA)
Default for all GPU instance families (for example P4 and G4) and can be used for all
non AWS Graviton based instance types.
Amazon Linux (ECS_AL1)
Default for all non-GPU, non AWS Graviton instance families. Amazon Linux will
discontinue standard support. For more information, see Amazon Linux AMI.
Type: String
Required: Yes
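For example, an Ec2Configuration that selects the Amazon Linux 2 image type and overrides it with a custom AMI might look like the following. The AMI ID is a placeholder.
"ec2Configuration": [
    {
        "imageType": "ECS_AL2",
        "imageIdOverride": "ami-0123456789abcdef0"
    }
]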
Service role
serviceRole
The full Amazon Resource Name (ARN) of the IAM role that allows AWS Batch to make calls to other
AWS services on your behalf. For more information, see AWS Batch service IAM role (p. 142).
Important
If your account has already created the AWS Batch service-linked role
(AWSServiceRoleForBatch), that role is used by default for your compute environment
unless you specify a role here. If the AWS Batch service-linked role doesn't exist in your
account, and no role is specified here, the service tries to create the AWS Batch service-
linked role in your account. For more information about the AWSServiceRoleForBatch
service-linked role, see Service-linked role permissions for AWS Batch (p. 181).
If your specified role has a path other than /, then you must either specify the full role ARN (this is
recommended) or prefix the role name with the path.
Note
Depending on how you created your AWS Batch service role, its ARN might contain the
service-role path prefix. When you only specify the name of the service role, AWS Batch
assumes that your ARN doesn't use the service-role path prefix. Because of this, we
recommend that you specify the full ARN of your service role when you create compute
environments.
Type: String
Required: No
Tags
tags
Key-value pair tags to associate with the compute environment. For more information, see Tagging
your AWS Batch resources (p. 197).
Required: No
EC2 Configurations
AWS Batch uses Amazon ECS optimized AMIs for EC2 and EC2 Spot compute environments. The default
is Amazon Linux 2 (ECS_AL2). Before March 31, 2021, this default was Amazon Linux (ECS_AL1) for non-
GPU, non AWS Graviton instances.
We made this change because the Amazon Linux AMI has discontinued standard support and entered
into a maintenance support period, which is scheduled to end on June 30, 2023. The Amazon Linux AMI
will continue to receive critical and important security updates for a reduced list of packages. During the
maintenance support period, an Amazon Linux AMI might still be used for newly-created managed EC2
and EC2 Spot compute environments by specifying an Ec2Configuration parameter when creating a
compute environment. After the end of the maintenance support period, an Amazon Linux AMI will no
longer be a supported image type for new AWS Batch compute environments.
Existing compute environments and instances will not be affected by this change and will continue
to operate with their configured AMI until the end of the maintenance support period. After that period
ends, the Amazon Linux AMI will no longer be a supported image type for AWS Batch compute environments. We encourage
migration of all compute environments to Amazon Linux 2 prior to June 30, 2023. Not all instance
types introduced after March 31, 2021, will be supported by the Amazon Linux AMI. If you use launch
templates with custom user data, confirm that everything is configured as expected.
The storage configuration differs between the Amazon ECS optimized Amazon Linux AMI and Amazon
Linux 2-based Amazon ECS optimized AMIs. For more information, see AMI storage configuration in the
Amazon Elastic Container Service Developer Guide.
Allocation strategies
When a managed compute environment is created, AWS Batch selects instance types from the
instanceTypes specified that best fit the needs of the jobs. The allocation strategy defines behavior
when AWS Batch needs additional capacity. This parameter isn't applicable to jobs running on Fargate
resources, and shouldn't be specified. For more information, see Allocation strategies (p. 113).
BEST_FIT (default)
AWS Batch selects an instance type that best fits the needs of the jobs with a preference for the
lowest-cost instance type. If additional instances of the selected instance type aren't available, AWS
Batch waits for the additional instances to be available. If there aren't enough instances available, or
if the user is hitting Amazon EC2 service limits then additional jobs don't run until currently running
jobs have completed. This allocation strategy keeps costs lower but can limit scaling. If you're using
Spot Fleets with BEST_FIT then the Spot Fleet IAM Role must be specified.
BEST_FIT_PROGRESSIVE
AWS Batch selects additional instance types that are large enough to meet the requirements of
the jobs in the queue. It has a preference for instance types with a lower cost for each unit vCPU.
If additional instances of the previously selected instance types aren't available, AWS Batch selects
new instance types.
SPOT_CAPACITY_OPTIMIZED
AWS Batch selects one or more instance types that are large enough to meet the requirements of
the jobs in the queue, with a preference for instance types that are less likely to be interrupted. This
allocation strategy is only available for Spot Instance compute resources.
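As a quick illustration, a Spot compute environment that prefers interruption-resilient instance types might set the following in its computeResources. Other required members, such as maxvCpus and subnets, are omitted here for brevity, and the instance families are only examples.
"computeResources": {
    "type": "SPOT",
    "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
    "instanceTypes": [ "c5", "m5", "r5" ]
}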
Memory Management
If you specify 8192 MiB for the job, and none of your compute resources have 8192 MiB or greater
of memory available to satisfy this requirement, then the job cannot be placed in your compute
environment. If you are using a managed compute environment, then AWS Batch must launch a larger
instance type to accommodate the request.
The default AWS Batch compute resource AMI also reserves 32 MiB of memory for the Amazon ECS
container agent and other critical system processes. This memory is not available for job allocation. For
more information, see Reserving System Memory (p. 114).
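If you want to reserve more memory for the agent and other system processes than the default, you can set the Amazon ECS container agent's ECS_RESERVED_MEMORY option in user data, in the same way as the earlier launch template examples. The 256 MiB value below is only illustrative.
#!/bin/bash
echo ECS_RESERVED_MEMORY=256 >> /etc/ecs/ecs.config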
The Amazon ECS container agent uses the Docker ReadMemInfo() function to query the total memory
available to the operating system. Linux provides command line utilities to determine the total memory.
The free command returns the total memory that is recognized by the operating system.
$ free -b
Example output for an m4.large instance running the Amazon ECS-optimized Amazon Linux AMI.
This instance has 8373026816 bytes of total memory, which translates to 7985 MiB available for tasks.
The default AWS Batch compute resource AMI reserves 32 MiB of memory for the Amazon ECS container
agent and other critical system processes.
Viewing Compute Resource Memory
To make as much memory as possible available to your jobs for a particular instance type, you can observe the
memory available for that compute resource and then assign your jobs that much memory.
The Registered memory value is what the compute resource registered with Amazon ECS when it
was first launched, and the Available memory value is what has not already been allocated to jobs.
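If you prefer the command line to the console, the same information is available from Amazon ECS as the registeredResources and remainingResources of a container instance. The cluster name and container instance ID below are placeholders.
aws ecs describe-container-instances \
    --cluster my-batch-cluster \
    --container-instances 0123456789abcdef0 \
    --query "containerInstances[0].[registeredResources,remainingResources]"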
Scheduling policies
Scheduling policies allow compute resources in a job queue to be allocated in a more equitable manner
between different users or workloads. Different workloads or users are assigned different fair share
identifiers. AWS Batch assigns each fair share identifier a share based on the total weight of all recently
used fair share identifiers, which defines the amount of the total resources available for use by jobs with
that fair share identifier. Time can be added to the fair share analysis by assigning a share decay time
to the policy. A long decay time gives more weight to time and less to the defined weight. Compute
resources can be held in reserve for fair share identifiers that are not active by specifying a compute
reservation.
For example, a computeReservation value of 50 indicates that AWS Batch should reserve 50% of
the maximum available VCPU if there is only one fair share identifier, 25% if there are two fair share
identifiers, and 12.5% if there are three fair share identifiers. A computeReservation value of 25
indicates that AWS Batch should reserve 25% of the maximum available VCPU if there is only one
fair share identifier, 6.25% if there are two fair share identifiers, and 1.56% if there are three fair
share identifiers.
7. In the Share attributes section, you can specify the fair share identifier and weight for each fair
share identifier to associate with the scheduling policy.
overlap. For example, you can't have the fair share identifier prefix 'UserA*' and the fair share identifier
'UserA1' in the same scheduling policy.
c. For Weight factor, specify the relative weight for the fair share identifier. The default value is
1.0. A lower value has a higher priority for compute resources. If a fair share identifier prefix
is used, jobs with fair share identifiers that start with the prefix will share the weight factor.
This effectively increases the weight factor for those jobs, lowering their individual priority but
maintaining the same weight factor for the fair share identifier prefix.
8. (Optional) In the Tags section, you can specify the key and value for each tag to associate with the
scheduling policy. For more information, see Tagging your AWS Batch resources (p. 197).
9. Choose Submit to finish and create your scheduling policy.
Scheduling policy template
{
"name": "",
"fairsharePolicy": {
"shareDecaySeconds": 0,
"computeReservation": 0,
"shareDistribution": [
{
"shareIdentifier": "",
"weightFactor": 0.0
}
]
},
"tags": {
"KeyName": ""
}
}
Note
You can generate the preceding scheduling policy template with the following AWS CLI command.
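One way to generate such a skeleton is the AWS CLI's built-in --generate-cli-skeleton option, for example:
aws batch create-scheduling-policy --generate-cli-skeleton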
The name for your scheduling policy. Up to 128 letters (uppercase and lowercase), numbers,
hyphens, and underscores are allowed.
Type: String
Required: Yes
Fair share policy
"fairsharePolicy": {
"computeReservation": number,
"shareDecaySeconds": number,
"shareDistribution": [
{
"shareIdentifier": "string",
"weightFactor": number
}
]
}
Type: Object
Required: No
computeReservation
A value used to reserve some of the available maximum VCPU for fair share identifiers that have
not yet been used.
For example, a computeReservation value of 50 indicates that AWS Batch should reserve
50% of the maximum available VCPU if there is only one active fair share identifier, 25% if there
are two active fair share identifiers, and 12.5% if there are three active fair share identifiers.
A computeReservation value of 25 indicates that AWS Batch should reserve 25% of the
maximum available VCPU if there is only one active fair share identifier, 6.25% if there are two
active fair share identifiers, and 1.56% if there are three active fair share identifiers. In general, the
reserved percentage is (computeReservation/100)^ActiveFairShares, where ActiveFairShares is the number
of active fair share identifiers.
Type: Integer
Required: No
shareDecaySeconds
The time period to use to calculate a fair share percentage for each fair share identifier in use.
A value of zero (0) indicates that only current usage should be measured. The decay allows for
more recently run jobs to have more weight than jobs that ran earlier.
Type: Integer
Required: No
shareDistribution
Array of objects that contain the weights for the fair share identifiers for the fair share policy.
Fair share identifiers that are not included have a default weight of 1.0.
"shareDistribution": [
{
"shareIdentifier": "string",
"weightFactor": number
}
]
Type: Array
Required: No
shareIdentifier
A fair share identifier or fair share identifier prefix. If the string ends with '*' then this string
specifies a fair share identifier prefix for fair share identifiers that begin with that prefix. For
example, if the value is UserA* and the weightFactor is 1, and there are two fair share
identifiers that begin with UserA, then each of those fair share identifiers will have a weight
of 2; if there are five such fair share identifiers, then each would have a weight of 5.
The list of fair share identifiers and fair share identifier prefixes in a fair share policy cannot
overlap. For example, you cannot have a fair share identifier prefix of UserA* and a fair
share identifier of UserA-1 in the same fair share policy.
Type: String
Required: Yes
weightFactor
The weight factor for the fair share identifier. The default value is 1.0. A lower value has
a higher priority for compute resources. For example, jobs that use a share identifier with
a weight factor of 0.125 (1/8) get 8 times the compute resources of jobs that use a share
identifier with a weight factor of 1.
The smallest supported value is 0.0001 and the largest supported value is 999.9999.
Type: Float
Required: No
Tags
tags
Key-value pair tags to associate with the scheduling policy. For more information, see Tagging your
AWS Batch resources (p. 197).
Required: No
Sections
• Viewing state machine details (p. 120)
• Editing a state machine (p. 120)
• Running a state machine (p. 121)
Viewing state machine details
Choose a state machine to view a graphical representation of the workflow. Steps highlighted in blue
represent AWS Batch jobs. Use the graph controls to zoom in, zoom out, and center the graph.
Note
When an AWS Batch job is dynamically referenced with JsonPath in the state machine definition,
the function details cannot be shown in the AWS Batch console. Instead, the job name is listed
as a Dynamic reference, and the corresponding steps in the graph are grayed out.
1. Open the AWS Batch console Workflow orchestration powered by Step Functions page.
2. Choose a state machine.
For more information, see Step Functions in the AWS Step Functions Developer Guide.
Editing a state machine
1. Open the AWS Batch console Workflow orchestration powered by Step Functions page.
2. Choose a state machine.
3. Choose Edit.
For more information about editing state machines, see Step Functions state machine language in the
AWS Step Functions Developer Guide.
Running a state machine
1. Open the AWS Batch console Workflow orchestration powered by Step Functions page.
2. Choose a state machine.
3. Choose Execute.
For more information about running state machines, see Step Functions state machine execution
concepts in the AWS Step Functions Developer Guide.
When you run your jobs with Fargate resources, you package your application in containers, specify the
CPU and memory requirements, define networking and IAM policies, and launch the application. Each
Fargate job has its own isolation boundary and does not share the underlying kernel, CPU resources,
memory resources, or elastic network interface with another job.
Contents
• When to use Fargate (p. 122)
• Job definitions on Fargate (p. 122)
• Job queues on Fargate (p. 124)
• Compute environments on Fargate (p. 124)
However, we recommend that you use Amazon EC2 if your jobs require any of the following:
If you have a large number of jobs, we recommend Amazon EC2 because jobs can be dispatched at a
higher rate to EC2 resources than to Fargate resources. Moreover, more jobs can run concurrently when
you use EC2. For more information, see AWS Fargate service quotas in the Amazon Elastic Container
Service Developer Guide.
Note
AWS Batch does not support Windows containers, on either Fargate or EC2 resources.
Job definitions on Fargate
The following list describes job definition parameters that are not valid or otherwise restricted in Fargate
jobs.
platformCapabilities
"platformCapabilities": [ "FARGATE" ]
type
"type": "container"
Parameters in containerProperties
executionRoleArn
Must be specified for jobs running on Fargate resources. For more information, see IAM Roles for
Tasks in the Amazon Elastic Container Service Developer Guide.
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole"
fargatePlatformConfiguration
(Optional, only for Fargate job definitions). Specifies the Fargate platform version, or LATEST
for a recent platform version. Possible values for platformVersion are 1.3.0, 1.4.0, and
LATEST (default).
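For example, to request a recent platform version:
"fargatePlatformConfiguration": { "platformVersion": "LATEST" }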
instanceType, ulimits
These parameters aren't applicable to jobs running on Fargate resources and shouldn't be provided.
privileged
This parameter isn't supported for jobs running on Fargate resources. If it's specified, it must be false.
"privileged": false
resourceRequirements
Both memory and vCPU requirements must be specified, using supported values (p. 56). GPU
resources are not supported for jobs running on Fargate resources.
"resourceRequirements": [
{"type": "MEMORY", "value": "512"},
{"type": "VCPU", "value": "0.25"}
]
Parameters in linuxParameters
devices, maxSwap, sharedMemorySize, swappiness, tmpfs
These parameters aren't applicable to jobs running on Fargate resources and shouldn't be provided.
Parameters in logConfiguration
logDriver
Only awslogs and fluentd are supported. For more information, see Using the awslogs log
driver (p. 64).
Members in networkConfiguration
assignPublicIp
If the private subnet does not have a NAT gateway attached to send traffic to the Internet,
assignPublicIp must be "ENABLED". For more information, see
AWS Batch execution IAM role (p. 176).
Compute environments on Fargate
The following list describes compute environment parameters that are not valid or otherwise restricted
in Fargate compute environments.
type
"type": "MANAGED"
These aren't applicable for Fargate compute environments and shouldn't be provided.
subnets
If the subnets listed in this parameter don't have NAT gateways attached, the assignPublicIp
parameter in the job definition must be set to ENABLED.
tags
This isn't applicable for Fargate compute environments and shouldn't be provided. To
specify tags for Fargate compute environments, use the tags parameter that's not in the
computeResources object.
type
"type": "FARGATE_SPOT"
The location of the LinuxParameters member will be different for multi-node parallel jobs and single-
node container jobs. The examples below demonstrate the differences but are missing required values.
{
"jobDefinitionName": "EFA-MNP-JobDef",
"type": "multinode",
"nodeProperties": {
...
"nodeRangeProperties": [
{
...
"container": {
...
"linuxParameters": {
"devices": [
{
"hostPath": "/dev/infiniband/uverbs0",
"containerPath": "/dev/infiniband/uverbs0",
"permissions": [
"READ", "WRITE", "MKNOD"
]
},
],
},
},
},
],
},
}
{
"jobDefinitionName": "EFA-Container-JobDef",
"type": "container",
...
"containerProperties": {
...
"linuxParameters": {
"devices": [
{
"hostPath": "/dev/infiniband/uverbs0",
},
],
},
},
}
For more information about EFA, see Elastic Fabric Adapter in the Amazon EC2 User Guide for Linux Instances.
When you attach a policy to a user or group of users, it allows or denies the users permissions to perform
the specified tasks on the specified resources. For more information, see Permissions and Policies in the
IAM User Guide. For more information about managing and creating custom IAM policies, see Managing
IAM Policies.
Likewise, AWS Batch makes calls to other AWS services on your behalf, so the service must authenticate
with your credentials. This authentication is accomplished by creating an IAM role and policy that can
provide these permissions and then associating that role with your compute environments when you
create them. For more information, see Amazon ECS instance role (p. 145), IAM Roles, Using Service-
Linked Roles, and Creating a Role to Delegate Permissions to an AWS Service in the IAM User Guide.
Getting Started
An IAM policy must grant or deny permissions to use one or more AWS Batch actions.
Topics
• Policy structure (p. 127)
• Supported resource-level permissions for AWS Batch API actions (p. 130)
• Example policies (p. 138)
• AWS Batch managed policy (p. 141)
• Creating AWS Batch IAM policies (p. 142)
• AWS Batch service IAM role (p. 142)
• Amazon ECS instance role (p. 145)
• Amazon EC2 spot fleet role (p. 145)
• EventBridge IAM role (p. 147)
Policy structure
The following topics explain the structure of an IAM policy.
Topics
• Policy syntax (p. 128)
• Actions for AWS Batch (p. 128)
• Amazon Resource Names for AWS Batch (p. 129)
• Checking that users have the required permissions (p. 129)
Policy syntax
An IAM policy is a JSON document that consists of one or more statements. Each statement is structured
as follows:
{
"Statement":[{
"Effect":"effect",
"Action":"action",
"Resource":"arn",
"Condition":{
"condition":{
"key":"value"
}
}
}
]
}
• Effect: The effect can be Allow or Deny. By default, IAM users don't have permission to use resources
and API actions, so all requests are denied. An explicit allow overrides the default. An explicit deny
overrides any allows.
• Action: The action is the specific API action that you're granting or denying permission for. To learn
about specifying action, see Actions for AWS Batch (p. 128).
• Resource: The resource that's affected by the action. With some AWS Batch API actions, you can
include specific resources in your policy that can be created or modified by the action. To specify a
resource in the statement, use its Amazon Resource Name (ARN). For more information, see Supported
resource-level permissions for AWS Batch API actions (p. 130) and Amazon Resource Names for AWS
Batch (p. 129). If the AWS Batch API operation currently doesn't support resource-level permissions,
you must use the * wildcard to specify that all resources can be affected by the action.
• Condition: Conditions are optional. They can be used to control when your policy is in effect.
For more information about example IAM policy statements for AWS Batch, see Creating AWS Batch IAM
policies (p. 142).
To specify multiple actions in a single statement, separate them with commas as follows:
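The usual form is a JSON array of action names; the two actions below are placeholders.
"Action": ["batch:action1", "batch:action2"]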
You can also specify multiple actions using wildcards (*). For example, you can specify all actions whose
name begins with the word "Describe" as follows:
"Action": "batch:Describe*"
To specify all AWS Batch API actions, use the wildcard (*) as follows:
"Action": "batch:*"
For a list of AWS Batch actions, see Actions in the AWS Batch API Reference.
Amazon Resource Names for AWS Batch
arn:aws:[service]:[region]:[account]:resourceType/resourcePath
resourcePath
A path that identifies the resource. You can use the wildcard (*) in your paths.
AWS Batch currently supports resource-level permissions on several API operations.
For more information, see Supported resource-level permissions for AWS Batch API actions (p. 130).
To specify all resources, or if a specific API action doesn't support ARNs, use the wildcard (*) in the
Resource element as follows:
"Resource": "*"
First, create an IAM user for testing purposes and attach the IAM policy to the test user. Then, make a
request as the test user. You can make test requests in the console or with the AWS CLI.
Note
You can also test your policies with the IAM Policy Simulator. For more information about the
policy simulator, see Working with the IAM Policy Simulator in the IAM User Guide.
If the policy doesn't grant the user the permissions that you expected, or is overly permissive, you can
adjust the policy as needed. Retest until you get the desired results.
Important
It can take several minutes for policy changes to propagate before they take effect. Therefore,
we recommend that you allow five minutes to pass before you test your policy updates.
If an authorization check fails, the request returns an encoded message with diagnostic information. You
can decode the message using the DecodeAuthorizationMessage action. For more information, see
DecodeAuthorizationMessage in the AWS Security Token Service API Reference, and decode-authorization-
message in the AWS CLI Command Reference.
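For example, you can decode the message from the AWS CLI as follows; the encoded message value is a placeholder.
aws sts decode-authorization-message --encoded-message encoded-message-from-error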
The following list describes the AWS Batch API actions that currently support resource-level permissions,
as well as the supported resources, resource ARNs, and condition keys for each action.
Important
If an AWS Batch API action isn't listed in this list, then it doesn't support resource-level
permissions. If an AWS Batch API action doesn't support resource-level permissions, you can
grant users permission to use the action, but you have to specify a wildcard (*) for the resource
element of your policy statement.
Actions
CancelJob
arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)
CreateComputeEnvironment
arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Condition keys
aws:RequestTag/${TagKey} (String)
Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)
Filters actions based on the tag keys that are passed in the request.
CreateJobQueue
arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job Queue
arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Scheduling Policy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Condition keys
aws:RequestTag/${TagKey} (String)
Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)
Filters actions based on the tag keys that are passed in the request.
DeleteComputeEnvironment
arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
CreateSchedulingPolicy
Resource
Scheduling Policy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Condition keys
aws:RequestTag/${TagKey} (String)
Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)
Filters actions based on the tag keys that are passed in the request.
DeleteJobQueue
Deletes the specified job queue. Deleting the job queue eventually deletes all of the jobs in the
queue. Jobs are deleted at a rate of about 16 jobs each second.
Resource
Job Queue
arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
DeleteSchedulingPolicy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
DeregisterJobDefinition
arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
ListTagsForResource
Resource
Compute Environment
arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job
arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job Definition
arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job Queue
arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Scheduling Policy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
RegisterJobDefinition
arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Condition keys
batch:AWSLogsCreateGroup (Boolean)
When this parameter is true, the awslogs-group is created for the logs.
batch:AWSLogsGroup (String)
batch:AWSLogsRegion (String)
batch:Privileged (Boolean)
When this parameter is true, the container for the job is given elevated permissions on the
host container instance (similar to the root user).
batch:User (String)
The user name or numeric uid to use inside the container for the job.
aws:RequestTag/${TagKey} (String)
Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)
Filters actions based on the tag keys that are passed in the request.
SubmitJob
arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)
arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)
arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
TagResource
Resource
Compute Environment
arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job
arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job Definition
arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job Queue
arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Scheduling Policy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Condition keys
aws:RequestTag/${TagKey} (String)
Filters actions based on the tags that are passed in the request.
aws:TagKeys (String)
Filters actions based on the tag keys that are passed in the request.
TerminateJob
arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
UntagResource
arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job
arn:aws:batch:region:account:job/jobId
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job Definition
arn:aws:batch:region:account:job-definition/definition-name:revision
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Job Queue
arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Scheduling Policy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Condition keys
aws:TagKeys (String)
Filters actions based on the tag keys that are passed in the request.
UpdateComputeEnvironment
arn:aws:batch:region:account:compute-environment/compute-environment-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
UpdateJobQueue
arn:aws:batch:region:account:job-queue/queue-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
Scheduling Policy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
UpdateSchedulingPolicy
arn:aws:batch:region:account:scheduling-policy/scheduling-policy-name
Condition keys
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
batch:AWSLogsCreateGroup (Boolean)
When this parameter is true, the awslogs-group is created for the logs.
batch:AWSLogsGroup (String)
The awslogs group where the logs are located.
batch:LogDriver (String)
The log driver used for the job.
batch:Privileged (Boolean)
When this parameter is true, the container for the job is given elevated permissions on the host
container instance (similar to the root user).
aws:ResourceTag/${TagKey} (String)
Filters actions based on the tags that are associated with the resource.
aws:RequestTag/${TagKey} (String)
Filters actions based on the tags that are passed in the request.
batch:ShareIdentifier (String)
The share identifier that's specified when the job is submitted.
batch:User (String)
The user name or numeric uid to use inside the container for the job.
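For example, the following is a hedged sketch of a statement that uses the logging condition keys to require the awslogs log driver and to forbid automatic log group creation when job definitions are registered; the values shown are illustrative.
{
    "Effect": "Allow",
    "Action": "batch:RegisterJobDefinition",
    "Resource": "arn:aws:batch:<aws_region>:<aws_account_id>:job-definition/*",
    "Condition": {
        "StringEquals": {
            "batch:LogDriver": ["awslogs"]
        },
        "Bool": {
            "batch:AWSLogsCreateGroup": "false"
        }
    }
}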
Example policies
The following examples show policy statements that you could use to control the permissions that IAM
users have to AWS Batch.
Examples
• Example: Read-only access (p. 138)
• Example: Restricting to POSIX user, Docker image, privilege level, and role on job
submission (p. 139)
• Example: Restrict to job definition prefix on job submission (p. 140)
• Example: Restrict to job queue (p. 140)
Example: Read-only access
The following policy grants users read-only access by allowing the AWS Batch Describe* and List* API
actions. Users don't have permission to perform any other actions on the resources (unless another
statement grants them permission to do so) because they're denied permission to use API actions by
default.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"batch:Describe*",
"batch:List*"
],
"Resource": "*"
}
]
}
Example: Restricting to POSIX user, Docker image, privilege level, and role on job submission
The first and second statements allow a user to register and deregister any job definition whose name
begins with JobDefA_.
The first statement also uses conditional context keys to restrict the POSIX user, privileged status, and
container image values within the containerProperties of a job definition. For more information,
see RegisterJobDefinition in the AWS Batch API Reference. In this example, job definitions can only be
registered when the POSIX user is set to nobody, the privileged flag is set to false, and the image is set
to myImage in an Amazon ECR repository.
Important
Docker resolves the user parameter to that user's uid from within the container image. In most
cases, this is found in the /etc/passwd file within the container image. This name resolution
can be avoided by using direct uid values in both the job definition and any associated IAM
policies. Both the AWS Batch API operations and the batch:User IAM conditional keys support
numeric values.
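For example, if the container image defines a user with uid 1000, a condition such as the following sketch matches the numeric value directly instead of the user name; the uid shown is only an example.
"Condition": {
    "StringEquals": {
        "batch:User": ["1000"]
    }
}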
The third statement restricts a user to passing only a specific role to a job definition.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"batch:RegisterJobDefinition"
],
"Resource": [
"arn:aws:batch:<aws_region>:<aws_account_id>:job-definition/JobDefA_*"
],
"Condition": {
"StringEquals": {
"batch:User": [
"nobody"
],
"batch:Image": [
"<aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/myImage"
]
},
"Bool": {
"batch:Privileged": "false"
}
}
},
{
"Effect": "Allow",
"Action": [
"batch:DeregisterJobDefinition"
],
"Resource": [
"arn:aws:batch:<aws_region>:<aws_account_id>:job-definition/JobDefA_*"
]
},
{
"Effect": "Allow",
"Action": [
"iam:PassRole"
],
"Resource": [
"arn:aws:iam::<aws_account_id>:role/MyBatchJobRole"
]
}
]
}
Example: Restrict to job definition prefix on job submission
The following policy allows a user to submit jobs only with job definitions whose names begin with
JobDefA_, to any job queue.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"batch:SubmitJob"
],
"Resource": [
"arn:aws:batch:<aws_region>:<aws_account_id>:job-definition/JobDefA_*",
"arn:aws:batch:<aws_region>:<aws_account_id>:job-queue/*"
]
}
]
}
Example: Restrict to job queue
The following policy allows a user to submit jobs only to the queue1 job queue, using any job definition.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"batch:SubmitJob"
],
"Resource": [
"arn:aws:batch:<aws_region>:<aws_account_id>:job-definition/*",
"arn:aws:batch:<aws_region>:<aws_account_id>:job-queue/queue1"
]
}
]
}
AWSBatchFullAccess
This policy allows full administrator access to AWS Batch.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"batch:*",
"cloudwatch:GetMetricStatistics",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:DescribeKeyPairs",
"ec2:DescribeVpcs",
"ec2:DescribeImages",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeLaunchTemplateVersions",
"ecs:DescribeClusters",
"ecs:Describe*",
"ecs:List*",
"logs:Describe*",
"logs:Get*",
"logs:TestMetricFilter",
"logs:FilterLogEvents",
"iam:ListInstanceProfiles",
"iam:ListRoles"
],
"Resource":"*"
},
{
"Effect":"Allow",
"Action":[
"iam:PassRole"
],
"Resource":[
"arn:aws:iam::*:role/AWSBatchServiceRole",
"arn:aws:iam::*:role/service-role/AWSBatchServiceRole",
"arn:aws:iam::*:role/ecsInstanceRole",
"arn:aws:iam::*:instance-profile/ecsInstanceRole",
"arn:aws:iam::*:role/iaws-ec2-spot-fleet-role",
"arn:aws:iam::*:role/aws-ec2-spot-fleet-role",
"arn:aws:iam::*:role/AWSBatchJobRole*"
]
},
{
"Effect":"Allow",
"Action":[
"iam:CreateServiceLinkedRole"
],
"Resource":"arn:aws:iam::*:role/*Batch*",
"Condition": {
"StringEquals": {
"iam:AWSServiceName": "batch.amazonaws.com"
}
}
}
]
}
Creating IAM policies
When you attach a policy to a user or group of users, it allows or denies the users permission to perform
the specified tasks on the specified resources. For more information, see Permissions and Policies in the
IAM User Guide. For more information about managing and creating custom IAM policies, see Managing
IAM Policies.
AWS Batch service IAM role
In most cases, the AWS Batch service role is created for you automatically in the console first-run
experience. The role uses the AWSBatchServiceRole managed policy, which contains the following
permissions.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeAccountAttributes",
"ec2:DescribeInstances",
"ec2:DescribeInstanceAttribute",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:DescribeKeyPairs",
"ec2:DescribeImages",
"ec2:DescribeImageAttribute",
"ec2:DescribeInstanceStatus",
"ec2:DescribeSpotInstanceRequests",
"ec2:DescribeSpotFleetInstances",
"ec2:DescribeSpotFleetRequests",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeVpcClassicLink",
"ec2:DescribeLaunchTemplateVersions",
"ec2:CreateLaunchTemplate",
"ec2:DeleteLaunchTemplate",
"ec2:RequestSpotFleet",
"ec2:CancelSpotFleetRequests",
"ec2:ModifySpotFleetRequest",
"ec2:TerminateInstances",
"ec2:RunInstances",
"autoscaling:DescribeAccountLimits",
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:CreateLaunchConfiguration",
"autoscaling:CreateAutoScalingGroup",
"autoscaling:UpdateAutoScalingGroup",
"autoscaling:SetDesiredCapacity",
"autoscaling:DeleteLaunchConfiguration",
"autoscaling:DeleteAutoScalingGroup",
"autoscaling:CreateOrUpdateTags",
"autoscaling:SuspendProcesses",
"autoscaling:PutNotificationConfiguration",
"autoscaling:TerminateInstanceInAutoScalingGroup",
"ecs:DescribeClusters",
"ecs:DescribeContainerInstances",
"ecs:DescribeTaskDefinition",
"ecs:DescribeTasks",
"ecs:ListAccountSettings",
"ecs:ListClusters",
"ecs:ListContainerInstances",
"ecs:ListTaskDefinitionFamilies",
"ecs:ListTaskDefinitions",
"ecs:ListTasks",
"ecs:CreateCluster",
"ecs:DeleteCluster",
"ecs:RegisterTaskDefinition",
"ecs:DeregisterTaskDefinition",
"ecs:RunTask",
"ecs:StartTask",
"ecs:StopTask",
"ecs:UpdateContainerAgent",
"ecs:DeregisterContainerInstance",
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:DescribeLogGroups",
"iam:GetInstanceProfile",
"iam:GetRole"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ecs:TagResource",
"Resource": [
"arn:aws:ecs:*:*:task/*_Batch_*"
]
},
{
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": [
"*"
],
"Condition": {
"StringEquals": {
"iam:PassedToService": [
"ec2.amazonaws.com",
"ec2.amazonaws.com.cn",
"ecs-tasks.amazonaws.com"
]
}
}
},
{
"Effect": "Allow",
"Action": "iam:CreateServiceLinkedRole",
"Resource": "*",
"Condition": {
"StringEquals": {
"iam:AWSServiceName": [
"spot.amazonaws.com",
"spotfleet.amazonaws.com",
"autoscaling.amazonaws.com",
"ecs.amazonaws.com"
]
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateTags"
],
"Resource": [
"*"
],
"Condition": {
"StringEquals": {
"ec2:CreateAction": "RunInstances"
}
}
}
]
}
You can use the following procedure to see if your account already has the AWS Batch service role and
attach the managed IAM policy if needed. The role's trust relationship must allow the AWS Batch service
principal to assume the role, as follows.
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "Service": "batch.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
    }]
}
Amazon ECS instance role
The Amazon ECS instance role and instance profile are automatically created for you in the console first-
run experience. However, you can use the following procedure to check and see if your account already
has the Amazon ECS instance role and instance profile and to attach the managed IAM policy if needed.
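If you create the ecsInstanceRole manually, its trust relationship must allow Amazon EC2 to assume the role. The following is a minimal sketch of that trust policy.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
Attach the AmazonEC2ContainerServiceforEC2Role managed policy to this role to grant the Amazon ECS container agent the permissions that it needs.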
Amazon EC2 spot fleet role
If you create a managed compute environment that uses Amazon EC2 Spot Instances, your account also
needs the AWSServiceRoleForEC2Spot and AWSServiceRoleForEC2SpotFleet service-linked roles for
Amazon EC2 Spot and Spot Fleet. Use the following instructions to create all of these roles. For more
information, see Using Service-Linked Roles and Creating a Role to Delegate Permissions to an AWS
Service in the IAM User Guide.
Topics
• Create Amazon EC2 spot fleet roles in the AWS Management Console (p. 146)
• Create Amazon EC2 Spot Fleet Roles with the AWS CLI (p. 146)
Note
In the past, there have been two managed policies for the Amazon EC2 Spot Fleet role.
• AmazonEC2SpotFleetRole: This is the original managed policy for the Spot Fleet role.
However, we no longer recommend you use it with AWS Batch. This policy doesn't
support Spot Instance tagging in compute environments, which is required to use the
AWSServiceRoleForBatch service-linked role. If you previously created a Spot Fleet role
with this policy, see Spot Instances Not Tagged on Creation (p. 204) to apply the new
recommended policy to that role.
• AmazonEC2SpotFleetTaggingRole: This role provides all of the necessary permissions to tag
Amazon EC2 Spot Instances. Use this role to allow Spot Instance tagging on your AWS Batch
compute environments.
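If you create the Spot Fleet role yourself rather than through the console first-run experience, the role's trust relationship must allow the Spot Fleet service to assume it, and you attach the AmazonEC2SpotFleetTaggingRole managed policy described above. A minimal sketch of that trust policy follows.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "spotfleet.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}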
To create the AWSServiceRoleForEC2Spot IAM service-linked role for Amazon EC2 Spot, follow the steps
in Using Service-Linked Roles in the IAM User Guide.
EventBridge IAM role
The trust relationship for your EventBridge IAM role must provide the events.amazonaws.com service
principal the ability to assume the role, as follows.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "events.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
The policy attached to your EventBridge IAM role should allow batch:SubmitJob permissions on your
resources. AWS Batch provides the AWSBatchServiceEventTargetRole managed policy to provide
these permissions, as follows.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"batch:SubmitJob"
],
"Resource": "*"
}
]
}
AWS Batch Events
You can use EventBridge to gain further insights about your AWS Batch service. More specifically, you
can use it to check the progress of jobs, build AWS Batch custom workflows, generate usage reports or
metrics, or build your own dashboards. With AWS Batch and EventBridge, you don't need scheduling and
monitoring code that continuously polls AWS Batch for job status changes. Instead, you can handle AWS
Batch job state changes asynchronously using a variety of Amazon EventBridge targets. These include
AWS Lambda, Amazon Simple Queue Service, Amazon Simple Notification Service, or Amazon Kinesis
Data Streams.
Events from the AWS Batch event stream are delivered at least once. If duplicate events are sent, each
event provides enough information to identify the duplicate. For example, you can compare the time
stamp of the event and the job status.
AWS Batch jobs are available as EventBridge targets. Using simple rules, you can match events and
submit AWS Batch jobs in response to them. For more information, see What is EventBridge? in the
Amazon EventBridge User Guide. You can also use EventBridge to schedule automated actions that
self-trigger at certain times using cron or rate expressions. For more information, see Creating an
Amazon EventBridge rule that runs on a schedule in the Amazon EventBridge User Guide. For an example
walkthrough, see AWS Batch Jobs as EventBridge Targets (p. 151).
Topics
• AWS Batch Events (p. 149)
• AWS Batch Jobs as EventBridge Targets (p. 151)
• Tutorial: Listening for AWS Batch EventBridge (p. 154)
• Tutorial: Sending Amazon Simple Notification Service Alerts for Failed Job Events (p. 157)
Job State Change Events
Note
Events aren't created for the initial job submission.
Job state change events are delivered in the following format (the detail section resembles the
JobDetail object that's returned from a DescribeJobs API operation in the AWS Batch API Reference).
For more information about EventBridge parameters, see Events and Event Patterns in the Amazon
EventBridge User Guide.
{
"version": "0",
"id": "c8f9c4b5-76e5-d76a-f980-7011e206042b",
"detail-type": "Batch Job State Change",
"source": "aws.batch",
"account": "123456789012",
"time": "2022-01-11T23:36:40Z",
"region": "us-east-1",
"resources": [
"arn:aws:batch:us-east-1:123456789012:job/4c7599ae-0a82-49aa-ba5a-4727fcce14a8"
],
"detail": {
"jobArn": "arn:aws:batch:us-east-1:123456789012:job/4c7599ae-0a82-49aa-
ba5a-4727fcce14a8",
"jobName": "event-test",
"jobId": "4c7599ae-0a82-49aa-ba5a-4727fcce14a8",
"jobQueue": "arn:aws:batch:us-east-1:123456789012:job-queue/
PexjEHappyPathCanary2JobQueue",
"status": "RUNNABLE",
"attempts": [],
"createdAt": 1641944200058,
"retryStrategy": {
"attempts": 2,
"evaluateOnExit": []
},
"dependsOn": [],
"jobDefinition": "arn:aws:batch:us-east-1:123456789012:job-definition/first-run-
job-definition:1",
"parameters": {},
"container": {
"image": "137112412989.dkr.ecr.us-east-1.amazonaws.com/amazonlinux:latest",
"command": [
"sleep",
"600"
],
"volumes": [],
"environment": [],
"mountPoints": [],
"ulimits": [],
"networkInterfaces": [],
"resourceRequirements": [
{
"value": "2",
"type": "VCPU"
}, {
"value": "256",
"type": "MEMORY"
}
],
"secrets": []
},
"tags": {
"resourceArn": "arn:aws:batch:us-east-1:123456789012:job/4c7599ae-0a82-49aa-
ba5a-4727fcce14a8"
150
AWS Batch User Guide
AWS Batch Jobs as EventBridge Targets
},
"propagateTags": false,
"platformCapabilities": []
}
}
AWS Batch Jobs as EventBridge Targets
You can also use EventBridge to schedule automated actions that are invoked at certain times using
cron or rate expressions. For more information, see Creating an Amazon EventBridge rule that runs on a
schedule in the Amazon EventBridge User Guide.
Common use cases for AWS Batch jobs as an EventBridge target include the following:
• A scheduled job that runs at regular time intervals. For example, a cron job that runs only during
low-usage hours when Amazon EC2 Spot Instances are less expensive.
• An AWS Batch job runs in response to an API operation that's logged in CloudTrail. For example, a job
is submitted whenever an object is uploaded to a specified Amazon S3 bucket, with the EventBridge
input transformer passing the bucket and key name of the object to AWS Batch parameters each time.
Note
In this scenario, all of the AWS resources (such as the Amazon S3 bucket, the EventBridge rule,
and all CloudTrail logs) must be in the same Region.
Before you can submit AWS Batch jobs with EventBridge rules and targets, the EventBridge service needs
several permissions to run AWS Batch jobs on your behalf. When you create a rule in the EventBridge
console that specifies an AWS Batch job as a target, you're provided with an opportunity to create this
role. For more information about the required service principal and IAM permissions for this role, see
EventBridge IAM role (p. 147).
A rule can't have the same name as another rule in the same Region and on the same event bus.
5. For Define pattern, choose Schedule.
6. Either choose Fixed rate of and specify how often the task is to run, or choose Cron expression and
specify a cron expression that defines when the task is to be triggered. For more information, see
Creating an Amazon EventBridge rule that runs on a schedule in the Amazon EventBridge User Guide
• For Fixed rate of, enter the interval and unit for your schedule.
• For Cron expression, enter the cron expression for your task schedule. These expressions have
six required fields. Each field is separated by white space. For more information and examples of
cron expressions, see Cron Expressions in the Amazon EventBridge User Guide.
7. For Select event bus, choose AWS default event bus. You can only create scheduled rules on the
default event bus.
8. For Select targets, choose Batch job queue and fill in the following fields appropriately:
• Job queue: Enter the Amazon Resource Name (ARN) of the job queue to schedule your job in.
• Job definition: Enter the name and revision or full ARN of the job definition to use for your job.
• Job name: Enter a name for your job.
• Array size: (Optional) Enter an array size for your job to run more than one copy. For more
information, see Array Jobs (p. 20).
• Job attempts: (Optional) Enter the number of times to retry your job if it fails. For more
information, see Automated Job Retries (p. 18).
9. For Batch job queue target types, EventBridge needs permission to send events to the target.
EventBridge can create the IAM role needed for your rule to run. Do one of these things:
• To create an IAM role automatically, choose Create a new role for this specific resource
• To use an IAM role that you created before, choose Use existing role
a. For Maximum age of event, enter a value between 1 minute (00:01) and 24 hours (24:00).
b. For Retry attempts, enter a number between 0 and 185.
11. For Dead-letter queue, choose whether to use a standard Amazon SQS queue as a dead-letter
queue. EventBridge sends events that match this rule to the dead-letter queue if it can't deliver
them to the target. Do one of the following:
• Choose None to not use a dead-letter queue.
• Choose Select an Amazon SQS queue in the current AWS account to use as the dead-letter
queue and then select the queue to use from the drop-down list.
• Choose Select an Amazon SQS queue in another AWS account as a dead-letter queue and
then enter the ARN of the queue to use. You must attach a resource-based policy to the queue
that grants EventBridge permission to send messages to it.
12. (Optional) Enter one or more tags for the rule.
13. Choose Create.
Event Input Transformer
The following example job definition uses the Ref::S3bucket and Ref::S3key parameter substitution
placeholders in its container command.
{
"jobDefinitionName": "echo-parameters",
"containerProperties": {
"image": "busybox",
"resourceRequirements": [
{
"type": "MEMORY",
"value": "2000"
},
{
"type": "VCPU",
"value": "2"
}
],
"command": [
"echo",
"Ref::S3bucket",
"Ref::S3key"
]
}
}
Then, you simply create an AWS Batch event target that parses information from the event that starts it
and transforms it into a parameters object. When the job runs, the parameters from the trigger event
are passed to the command of the job container.
Note
In this scenario, all of the AWS resources (such as Amazon S3 buckets, EventBridge rules, and
CloudTrail logs) must be in the same Region.
{
"S3BucketValue":"$.detail.bucket.name",
"S3KeyValue":"$.detail.object.key"
}
8. For the lower input transformer text box, create the Parameters structure to pass to the AWS Batch
job. These parameters are substituted for the Ref::S3bucket and Ref::S3key placeholders in the
command of the job container when the job runs.
{
"Parameters" :
{
"S3bucket": <S3BucketValue>,
"S3key": <S3KeyValue>
}
}
You can also pass a ContainerOverrides structure to update the command, environment variables, and
other settings.
{
"Parameters" :
{
"S3bucket": <S3BucketValue>
},
"ContainerOverrides" :
{
"Command":
[
"echo",
"Ref::S3bucket"
]
}
}
Note
The names of the members of the ContainerOverrides structure must be capitalized.
For example, Command and ResourceRequirements instead of command and
resourceRequirements.
9. Choose an existing EventBridge IAM role to use for your job, or Create a new role for this specific
resource to create a new one. For more information, see EventBridge IAM role (p. 147).
10. Choose Configure details and then for Rule definition, fill in the following fields appropriately, and
then choose Create rule.
Tutorial: Listening for AWS Batch EventBridge
Prerequisites
This tutorial assumes that you have a working compute environment and job queue that are ready to
accept jobs. If you don't have a running compute environment and job queue to capture events from,
follow the steps in Getting Started with AWS Batch (p. 9) to create one. At the end of this tutorial, you
can optionally submit a job to this job queue to test that you have configured your Lambda function
correctly.
import json

def lambda_handler(event, context):
    print(json.dumps(event))
This is a simple Python 3.8 function that prints the events sent by AWS Batch. If everything is
configured correctly, at the end of this tutorial, you will see that the event details appear in the
CloudWatch Logs log stream that's associated with this Lambda function.
7. Choose Deploy.
Step 2: Register Event Rule
A rule can't have the same name as another rule in the same Region and on the same event bus.
5. For Define pattern, select Event Pattern as the event source, and then select Custom pattern.
6. Paste the following event pattern into the text area.
{
"source": [
"aws.batch"
]
}
This rule applies across all of your AWS Batch groups and to every AWS Batch event. Alternatively,
you can create a more specific rule to filter out some results.
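For example, the following sketch of an event pattern narrows the rule to job state changes for a single, hypothetical job queue; the queue ARN is a placeholder.
{
    "source": ["aws.batch"],
    "detail-type": ["Batch Job State Change"],
    "detail": {
        "jobQueue": ["arn:aws:batch:us-east-1:123456789012:job-queue/queue1"]
    }
}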
7. For Select targets, in Target, choose Lambda function, and select your Lambda function.
8. For Select event bus, choose AWS default event bus. You can only create scheduled rules on the
default event bus.
a. For Maximum age of event, enter a value between 1 minute (00:01) and 24 hours (24:00).
b. For Retry attempts, enter a number between 0 and 185.
12. For Dead-letter queue, choose whether to use a standard Amazon SQS queue as a dead-letter
queue. EventBridge sends events that match this rule to the dead-letter queue if it can't deliver
them to the target. Do one of the following:
• Choose None to not use a dead-letter queue.
• Choose Select an Amazon SQS queue in the current AWS account to use as the dead-letter
queue and then select the queue to use from the drop-down list.
• Choose Select an Amazon SQS queue in another AWS account as a dead-letter queue and
then enter the ARN of the queue to use. You must attach a resource-based policy to the queue
that grants EventBridge permission to send messages to it.
13. (Optional) Enter one or more tags for the rule.
14. Choose Create.
Tutorial: Sending Amazon Simple Notification Service Alerts for Failed Job Events
Prerequisites
This tutorial assumes that you have a working compute environment and job queue that are ready to
accept jobs. If you don't have a running compute environment and job queue to capture events from,
follow the steps in Getting Started with AWS Batch (p. 9) to create one.
A rule can't have the same name as another rule in the same Region and on the same event bus.
5. For Define pattern, select Event Pattern as the event source, and then select Custom pattern.
6. Paste the following event pattern into the text area.
{
"detail-type": [
"Batch Job State Change"
],
"source": [
"aws.batch"
],
"detail": {
"status": [
"FAILED"
]
}
}
This code defines an EventBridge rule that matches any event where the job status is FAILED. For
more information about event patterns, see Events and Event Patterns in the Amazon EventBridge
User Guide.
7. This rule applies across all of your AWS Batch groups and to every AWS Batch event. Alternatively,
you can create a more specific rule to filter out some results.
8. For Select event bus, choose AWS default event bus. You can only create scheduled rules on the
default event bus.
9. For Select targets, in Target, choose SNS topic, and select JobFailedAlert.
10. For Retry policy and dead-letter queue, under Retry policy:
a. For Maximum age of event, enter a value between 1 minute (00:01) and 24 hours (24:00).
b. For Retry attempts, enter a number between 0 and 185.
c. For Dead-letter queue, choose whether to use a standard Amazon SQS queue as a dead-letter
queue. EventBridge sends events that match this rule to the dead-letter queue if it can't deliver
them to the target. Do one of the following:
• Choose None to not use a dead-letter queue.
• Choose Select an Amazon SQS queue in the current AWS account to use as the dead-letter
queue and then select the queue to use from the drop-down list.
• Choose Select an Amazon SQS queue in another AWS account as a dead-letter queue
and then enter the ARN of the queue to use.
11. (Optional) Enter one or more tags for the rule.
12. Choose Create.
Step 3: Test Your Rule
To test a rule
3. Check your email to confirm that you received an email alert for the failed job notification.
Using CloudWatch Logs with AWS Batch
For information about sending logs from your jobs to CloudWatch Logs, see Using the awslogs log
driver (p. 64). For more information about CloudWatch Logs, see Monitoring Log Files in the Amazon
CloudWatch User Guide.
Topics
• CloudWatch Logs IAM Policy (p. 159)
• Installing and configuring the CloudWatch agent (p. 160)
• Viewing CloudWatch Logs (p. 160)
CloudWatch Logs IAM Policy
The following policy allows your compute resources to create log groups and log streams, put log events,
and describe log streams in CloudWatch Logs.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:DescribeLogStreams"
],
"Resource": [
"arn:aws:logs:*:*:*"
]
}
]
}
Installing and configuring the CloudWatch agent
For more information, see Download and configure the CloudWatch agent using the command line in the
Amazon CloudWatch User Guide.
Viewing CloudWatch Logs
4. Choose a log stream to view. By default, the streams are identified by the first 200 characters of the
job name and the Amazon ECS task ID.
AWS Batch Information in CloudTrail
To learn more about CloudTrail, see the AWS CloudTrail User Guide.
For an ongoing record of events in your AWS account, including events for AWS Batch, create a trail.
A trail enables CloudTrail to deliver log files to an Amazon S3 bucket. By default, when you create a
trail in the console, the trail applies to all AWS Regions. The trail logs events from all Regions in the
AWS partition and delivers the log files to the Amazon S3 bucket that you specify. Additionally, you can
configure other AWS services to further analyze and act upon the event data collected in CloudTrail logs.
For more information, see the following:
All AWS Batch actions are logged by CloudTrail and are documented in the AWS Batch API Reference
(https://docs.aws.amazon.com/batch/latest/APIReference/). For example, calls to the SubmitJob,
ListJobs, and DescribeJobs actions generate entries in the CloudTrail log files.
Every event or log entry contains information about who generated the request. The identity
information helps you determine the following:
• Whether the request was made with root or AWS Identity and Access Management (IAM) user
credentials.
• Whether the request was made with temporary security credentials for a role or federated user.
• Whether the request was made by another AWS service.
Understanding AWS Batch Log File Entries
The following example shows a CloudTrail log entry that demonstrates the
CreateComputeEnvironment action.
{
"eventVersion": "1.05",
"userIdentity": {
"type": "AssumedRole",
"principalId": "AIDACKCEVSQ6C2EXAMPLE:admin",
"arn": "arn:aws:sts::012345678910:assumed-role/Admin/admin",
"accountId": "012345678910",
"accessKeyId": "AKIAIOSFODNN7EXAMPLE",
"sessionContext": {
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2017-12-20T00:48:46Z"
},
"sessionIssuer": {
"type": "Role",
"principalId": "AIDACKCEVSQ6C2EXAMPLE",
"arn": "arn:aws:iam::012345678910:role/Admin",
"accountId": "012345678910",
"userName": "Admin"
}
}
},
"eventTime": "2017-12-20T00:48:46Z",
"eventSource": "batch.amazonaws.com",
"eventName": "CreateComputeEnvironment",
"awsRegion": "us-east-1",
"sourceIPAddress": "203.0.113.1",
"userAgent": "aws-cli/1.11.167 Python/2.7.10 Darwin/16.7.0 botocore/1.7.25",
"requestParameters": {
"computeResources": {
"subnets": [
"subnet-5eda8e04"
],
"tags": {
"testBatchTags": "CLI testing CE"
},
"desiredvCpus": 0,
"minvCpus": 0,
"instanceTypes": [
"optimal"
],
"securityGroupIds": [
"sg-aba9e8db"
],
"instanceRole": "ecsInstanceRole",
"maxvCpus": 128,
"type": "EC2"
},
"state": "ENABLED",
"type": "MANAGED",
"serviceRole": "service-role/AWSBatchServiceRole",
"computeEnvironmentName": "Test"
},
"responseElements": {
"computeEnvironmentName": "Test",
"computeEnvironmentArn": "arn:aws:batch:us-east-1:012345678910:compute-environment/
Test"
},
"requestID": "890b8639-e51f-11e7-b038-EXAMPLE",
"eventID": "874f89fa-70fc-4798-bc00-EXAMPLE",
"readOnly": false,
"eventType": "AwsApiCall",
"recipientAccountId": "012345678910"
}
Tutorial: Creating a VPC
This tutorial guides you through creating a VPC with two public subnets and two private subnets, which
are provided with internet access through a NAT gateway.
Non-default subnets, such as those created by the VPC wizard, are not auto-assigned public IPv4
addresses. Instances launched in the public subnet must be assigned a public IPv4 address to
communicate with the Amazon ECS service endpoint.
Step 3: Create Additional Subnets
1. In the left navigation pane, choose Subnets and then Create Subnet.
2. For Name tag, enter a name for your subnet, such as Public subnet.
3. For VPC, choose the VPC that you created earlier.
4. For Availability Zone, choose the same Availability Zone as the additional private subnet that you
created in the previous procedure.
5. For IPv4 CIDR block, enter a valid CIDR block. For example, the wizard creates CIDR blocks in
10.0.0.0/24 and 10.0.1.0/24 by default. You could use 10.0.2.0/24 for your second public subnet.
6. Choose Yes, Create.
7. Select the public subnet that you just created and choose Route Table, Edit.
8. By default, the private route table is selected. Choose the other available route table so that the
0.0.0.0/0 destination is routed to the internet gateway (igw-xxxxxxxx) and choose Save.
9. With your second public subnet still selected, choose Subnet Actions, Modify auto-assign IP
settings.
10. Select Enable auto-assign public IPv4 address and choose Save, Close.
Next Steps
After you have created your VPC, you should consider the following next steps:
• Create security groups for your public and private resources if they require inbound network access.
For more information, see Working with Security Groups in the Amazon VPC User Guide.
• Create an AWS Batch managed compute environment that launches compute resources into your
new VPC. For more information, see Creating a compute environment (p. 99). If you use the compute
environment creation wizard in the AWS Batch console, you can specify the VPC that you just created
and the public or private subnets into which to launch your instances, depending on your use case.
• Create an AWS Batch job queue that is mapped to your new compute environment. For more
information, see Creating a job queue (p. 82).
• Create a job definition to run your jobs with. For more information, see Creating a job
definition (p. 31).
• Submit a job with your job definition to your new job queue. This job will land in the compute
environment you created with your new VPC and subnets. For more information, see Submitting a
Job (p. 14).
Security in AWS Batch
Security is a shared responsibility between AWS and you. The shared responsibility model describes this
as security of the cloud and security in the cloud:
• Security of the cloud – AWS is responsible for protecting the infrastructure that runs AWS services in
the AWS Cloud. AWS also provides you with services that you can use securely. Third-party auditors
regularly test and verify the effectiveness of our security as part of the AWS Compliance Programs.
To learn about the compliance programs that apply to AWS Batch, see AWS Services in Scope by
Compliance Program.
• Security in the cloud – Your responsibility is determined by the AWS service that you use. You are also
responsible for other factors including the sensitivity of your data, your company's requirements, and
applicable laws and regulations.
This documentation helps you understand how to apply the shared responsibility model when using AWS
Batch. The following topics show you how to configure AWS Batch to meet your security and compliance
objectives. You also learn how to use other AWS services that help you to monitor and secure your AWS
Batch resources.
Topics
• Identity and Access Management for AWS Batch (p. 168)
• Compliance Validation for AWS Batch (p. 195)
• Infrastructure Security in AWS Batch (p. 196)
Identity and Access Management for AWS Batch
Topics
• Audience (p. 168)
• Authenticating with identities (p. 169)
• Managing access using policies (p. 170)
• How AWS Batch works with IAM (p. 172)
• AWS Batch execution IAM role (p. 176)
• Identity-based policy examples for AWS Batch (p. 178)
• Troubleshooting AWS Batch identity and access (p. 179)
• Using service-linked roles for AWS Batch (p. 181)
• AWS managed policies for AWS Batch (p. 189)
Audience
How you use AWS Identity and Access Management (IAM) differs, depending on the work you do in AWS
Batch.
Service user – If you use the AWS Batch service to do your job, then your administrator provides you with
the credentials and permissions that you need. As you use more AWS Batch features to do your work, you
might need additional permissions. Understanding how access is managed can help you request the right
permissions from your administrator. If you cannot access a feature in AWS Batch, see Troubleshooting
AWS Batch identity and access (p. 179).
Service administrator – If you're in charge of AWS Batch resources at your company, you probably
have full access to AWS Batch. It's your job to determine which AWS Batch features and resources your
employees should access. You must then submit requests to your IAM administrator to change the
permissions of your service users. Review the information on this page to understand the basic concepts
of IAM. To learn more about how your company can use IAM with AWS Batch, see How AWS Batch works
with IAM (p. 172).
IAM administrator – If you're an IAM administrator, you might want to learn details about how you can
write policies to manage access to AWS Batch. To view example AWS Batch identity-based policies that
you can use in IAM, see Identity-based policy examples for AWS Batch (p. 178).
Authenticating with identities
You must be authenticated (signed in to AWS) as the AWS account root user, an IAM user, or by assuming
an IAM role. You can also use your company's single sign-on authentication, or even sign in using Google
or Facebook. In these cases, your administrator previously set up identity federation using IAM roles.
When you access AWS using credentials from another company, you are assuming a role indirectly.
To sign in directly to the AWS Management Console, use your password with your root user email or your
IAM user name. You can access AWS programmatically using your root user or IAM user access keys. AWS
provides SDK and command line tools to cryptographically sign your request using your credentials. If
you don't use AWS tools, you must sign the request yourself. Do this using Signature Version 4, a protocol
for authenticating inbound API requests. For more information about authenticating requests, see
Signature Version 4 Signing Process in the AWS General Reference.
Regardless of the authentication method that you use, you might also be required to provide additional
security information. For example, AWS recommends that you use multi-factor authentication (MFA) to
increase the security of your account. To learn more, see Using Multi-Factor Authentication (MFA) in AWS
in the IAM User Guide.
When you create an access key pair, save the access key ID and secret access key in a secure location. You
cannot recover the secret access key in the future. Instead, you must generate a new access key pair.
An IAM group is an identity that specifies a collection of IAM users. You can't sign in as a group. You
can use groups to specify permissions for multiple users at a time. Groups make permissions easier to
manage for large sets of users. For example, you could have a group named IAMAdmins and give that
group permissions to administer IAM resources.
Users are different from roles. A user is uniquely associated with one person or application, but a role
is intended to be assumable by anyone who needs it. Users have permanent long-term credentials, but
roles provide temporary credentials. To learn more, see When to Create an IAM User (Instead of a Role) in
the IAM User Guide.
IAM roles
An IAM role is an identity within your AWS account that has specific permissions. It is similar to an IAM
user, but isn't associated with a specific person. You can temporarily assume an IAM role in the AWS
Management Console by switching roles. You can assume a role by calling an AWS CLI or AWS API
operation or by using a custom URL. For more information about methods for using roles, see Using IAM
Roles in the IAM User Guide.
IAM roles with temporary credentials are useful in the following situations.
• Temporary IAM user permissions – An IAM user can assume an IAM role to temporarily take on
different permissions for a specific task.
• Federated user access – Instead of creating an IAM user, you can use existing identities from AWS
Directory Service, your enterprise user directory, or a web identity provider. These are known as
federated users. AWS assigns a role to a federated user when access is requested through an identity
provider. For more information about federated users, see Federated users and roles in the IAM User
Guide.
• Cross-account access – You can use an IAM role to allow someone (a trusted principal) in a different
account to access resources in your account. Roles are the primary way to grant cross-account access.
However, with some AWS services, you can attach a policy directly to a resource (instead of using a role
as a proxy). To learn the difference between roles and resource-based policies for cross-account access,
see How IAM Roles Differ from Resource-based Policies in the IAM User Guide.
• AWS service access – A service role is an IAM role that a service assumes to perform actions on your
behalf. An IAM administrator can create, modify, and delete a service role from within IAM. For more
information, see Creating a role to delegate permissions to an AWS service in the IAM User Guide.
• Applications running on Amazon EC2 – You can use an IAM role to manage temporary credentials
for applications that are running on an EC2 instance and making AWS CLI or AWS API requests.
This is preferable to storing access keys within the EC2 instance. To assign an AWS role to an EC2
instance and make it available to all of its applications, you create an instance profile that is attached
to the instance. An instance profile contains the role and enables programs that are running on the
EC2 instance to get temporary credentials. For more information, see Using an IAM role to grant
permissions to applications running on Amazon EC2 instances in the IAM User Guide.
To learn whether to use IAM roles, see When to Create an IAM Role (Instead of a User) in the IAM User
Guide.
Managing access using policies
You control access in AWS by creating policies and attaching them to IAM identities or AWS resources. A
policy is an object in AWS that, when associated with an identity or resource, defines their permissions.
Most policies are stored in AWS as JSON documents. For more information about the structure and contents of JSON policy
documents, see Overview of JSON Policies in the IAM User Guide.
An IAM administrator can use policies to specify who has access to AWS resources, and what actions
they can perform on those resources. Every IAM entity (user or role) starts with no permissions. In other
words, by default, users can do nothing, not even change their own password. To give a user permission
to do something, an administrator must attach a permissions policy to a user. Or the administrator can
add the user to a group that has the intended permissions. When an administrator gives permissions to a
group, all users in that group are granted those permissions.
IAM policies define permissions for an action regardless of the method that you use to perform the
operation. For example, suppose that you have a policy that allows the iam:GetRole action. A user with
that policy can get role information from the AWS Management Console, the AWS CLI, or the AWS API.
Identity-based policies
Identity-based policies are JSON permissions policy documents that you can attach to an identity, such
as an IAM user, role, or group. These policies control what actions that identity can perform, on which
resources, and under what conditions. To learn how to create an identity-based policy, see Creating IAM
Policies in the IAM User Guide.
Identity-based policies can be further categorized as inline policies or managed policies. Inline policies
are embedded directly into a single user, group, or role. Managed policies are standalone policies that
you can attach to multiple users, groups, and roles in your AWS account. Managed policies include AWS
managed policies and customer managed policies. To learn how to choose between a managed policy or
an inline policy, see Choosing Between Managed Policies and Inline Policies in the IAM User Guide.
Resource-based policies
Resource-based policies are JSON policy documents that you attach to a resource such as an Amazon S3
bucket. Service administrators can use these policies to define what actions a specified principal (account
member, user, or role) can perform on that resource and under what conditions. Resource-based policies
are inline policies. There are no managed resource-based policies.
• Permissions boundaries – A permissions boundary is an advanced feature in which you set the
maximum permissions that an identity-based policy can grant to an IAM entity (IAM user or role).
You can set a permissions boundary for an entity. The resulting permissions are the intersection of
entity's identity-based policies and its permissions boundaries. Resource-based policies that specify
the user or role in the Principal field aren't limited by the permissions boundary. An explicit deny
in any of these policies overrides the allow. For more information about permissions boundaries, see
Permissions Boundaries for IAM Entities in the IAM User Guide.
• Service control policies (SCPs) – SCPs are JSON policies that specify the maximum permissions for
an organization or organizational unit (OU) in AWS Organizations. AWS Organizations is a service for
grouping and centrally managing multiple AWS accounts that your business owns. If you enable all
features in an organization, then you can apply service control policies (SCPs) to any or all of your
accounts. The SCP limits permissions for entities in member accounts, including each AWS account
root user. For more information about Organizations and SCPs, see How SCPs Work in the AWS
Organizations User Guide.
• Session policies – Session policies are advanced policies that you pass as a parameter when you
programmatically create a temporary session for a role or federated user. The resulting session's
permissions are the intersection of the user or role's identity-based policies and the session policies.
Permissions can also come from a resource-based policy. An explicit deny in any of these policies
overrides the allow. For more information, see Session Policies in the IAM User Guide.
How AWS Batch works with IAM
To get a high-level view of how AWS Batch and other AWS services work with most IAM features, see
AWS services that work with IAM in the IAM User Guide.
Identity-based policies are JSON permissions policy documents that you can attach to an identity, such
as an IAM user, group of users, or role. These policies control what actions users and roles can perform,
on which resources, and under what conditions. To learn how to create an identity-based policy, see
Creating IAM policies in the IAM User Guide.
With IAM identity-based policies, you can specify allowed or denied actions and resources as well as the
conditions under which actions are allowed or denied. You can't specify the principal in an identity-based
policy because it applies to the user or role to which it is attached. To learn about all of the elements
that you can use in a JSON policy, see IAM JSON policy elements reference in the IAM User Guide.
To view examples of AWS Batch identity-based policies, see Identity-based policy examples for AWS
Batch (p. 178).
Resource-based policies are JSON policy documents that you attach to a resource. Examples of resource-
based policies are IAM role trust policies and Amazon S3 bucket policies. In services that support resource-
based policies, service administrators can use them to control access to a specific resource. For the
resource where the policy is attached, the policy defines what actions a specified principal can perform
on that resource and under what conditions. You must specify a principal in a resource-based policy.
Principals can include accounts, users, roles, federated users, or AWS services.
To enable cross-account access, you can specify an entire account or IAM entities in another account as
the principal in a resource-based policy. Adding a cross-account principal to a resource-based policy is
only half of establishing the trust relationship. When the principal and the resource are in different AWS
accounts, an IAM administrator in the trusted account must also grant the principal entity (user or role)
permission to access the resource. They grant permission by attaching an identity-based policy to the
entity. However, if a resource-based policy grants access to a principal in the same account, no additional
identity-based policy is required. For more information, see How IAM roles differ from resource-based
policies in the IAM User Guide.
The Action element of an IAM identity-based policy describes the specific action or actions that will be
allowed or denied by the policy. Policy actions usually have the same name as the associated AWS API
operation. The action is used in a policy to grant permissions to perform the associated operation.
To see a list of AWS Batch actions, see Actions Defined by AWS Batch in the Service Authorization
Reference.
Policy actions in AWS Batch use the following prefix before the action:
batch
"Action": [
"batch:action1",
"batch:action2"
]
You can specify multiple actions using wildcards (*). For example, to specify all actions that begin with
the word Describe, include the following action:
"Action": "batch:Describe*"
To view examples of AWS Batch identity-based policies, see Identity-based policy examples for AWS
Batch (p. 178).
The Resource element specifies the object or objects to which the action applies. Statements must
include either a Resource or a NotResource element. You specify a resource using an ARN or using the
wildcard (*) to indicate that the statement applies to all resources.
To see a list of AWS Batch resource types and their ARNs, see Resources Defined by AWS Batch in the
Service Authorization Reference. To learn with which actions you can specify the ARN of each resource,
see Actions Defined by AWS Batch.
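For example, a statement that should apply only to a particular job queue and to any revision of a particular job definition might use resource ARNs such as the following sketch; the queue and job definition names are placeholders.
"Resource": [
    "arn:aws:batch:<aws_region>:<aws_account_id>:job-queue/queue1",
    "arn:aws:batch:<aws_region>:<aws_account_id>:job-definition/JobDefA_*"
]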
To view examples of AWS Batch identity-based policies, see Identity-based policy examples for AWS
Batch (p. 178).
The Condition element (or Condition block) lets you specify conditions in which a statement is in
effect. The Condition element is optional. You can build conditional expressions that use condition
operators, such as equals or less than, to match the condition in the policy with values in the request.
If you specify multiple Condition elements in a statement, or multiple keys in a single Condition
element, AWS evaluates them using a logical AND operation. If you specify multiple values for a single
condition key, AWS evaluates the condition using a logical OR operation. All of the conditions must be
met before the statement's permissions are granted.
You can also use placeholder variables when you specify conditions. For example, you can grant an IAM
user permission to access a resource only if it is tagged with their IAM user name. For more information,
see IAM Policy Elements: Variables and Tags in the IAM User Guide.
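For example, a condition such as the following sketch allows the action only when the resource carries a hypothetical Owner tag whose value matches the requesting user's name.
"Condition": {
    "StringEquals": {
        "aws:ResourceTag/Owner": "${aws:username}"
    }
}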
To see a list of AWS Batch condition keys, see Condition Keys for AWS Batch in the Service Authorization
Reference. To learn with which actions and resources you can use a condition key, see Actions Defined by
AWS Batch.
To view examples of AWS Batch identity-based policies, see Identity-based policy examples for AWS
Batch (p. 178).
AWS Batch doesn't support access control lists (ACLs).
Access control lists (ACLs) control which principals (account members, users, or roles) have permissions to
access a resource. ACLs are similar to resource-based policies, although they do not use the JSON policy
document format.
Attribute-based access control (ABAC) is an authorization strategy that defines permissions based on
attributes. In AWS, these attributes are called tags. You can attach tags to IAM entities (users or roles)
and to many AWS resources. Tagging entities and resources is the first step of ABAC. Then you design
ABAC policies to allow operations when the principal's tag matches the tag on the resource that they are
trying to access.
ABAC is helpful in environments that are growing rapidly and helps with situations where policy
management becomes cumbersome.
To control access based on tags, you provide tag information in the condition element of a policy using
the aws:ResourceTag/key-name, aws:RequestTag/key-name, or aws:TagKeys condition keys.
For more information about ABAC, see What is ABAC? in the IAM User Guide. To view a tutorial with steps
for setting up ABAC, see Use attribute-based access control (ABAC) in the IAM User Guide.
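As an illustration only, the following sketch allows batch:SubmitJob when the job queue and job definition carry a hypothetical team tag that matches the same tag on the calling principal; it assumes that you tag both your principals and your AWS Batch resources with that key.
{
    "Effect": "Allow",
    "Action": "batch:SubmitJob",
    "Resource": [
        "arn:aws:batch:<aws_region>:<aws_account_id>:job-queue/*",
        "arn:aws:batch:<aws_region>:<aws_account_id>:job-definition/*"
    ],
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/team": "${aws:PrincipalTag/team}"
        }
    }
}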
Some AWS services don't work when you sign in using temporary credentials. For additional information,
including which AWS services work with temporary credentials, see AWS services that work with IAM in
the IAM User Guide.
You are using temporary credentials if you sign in to the AWS Management Console using any method
except a user name and password. For example, when you access AWS using your company's single
sign-on (SSO) link, that process automatically creates temporary credentials. You also automatically
create temporary credentials when you sign in to the console as a user and then switch roles. For more
information about switching roles, see Switching to a role (console) in the IAM User Guide.
You can manually create temporary credentials using the AWS CLI or AWS API. You can then use those
temporary credentials to access AWS. AWS recommends that you dynamically generate temporary
credentials instead of using long-term access keys. For more information, see Temporary security
credentials in IAM.
When you use an IAM user or role to perform actions in AWS, you are considered a principal. Policies
grant permissions to a principal. When you use some services, you might perform an action that
then triggers another action in a different service. In this case, you must have permissions to perform
both actions. To see whether an action requires additional dependent actions in a policy, see Actions,
Resources, and Condition Keys for AWS Batch in the Service Authorization Reference.
A service role is an IAM role that a service assumes to perform actions on your behalf. An IAM
administrator can create, modify, and delete a service role from within IAM. For more information, see
Creating a role to delegate permissions to an AWS service in the IAM User Guide.
Warning
Changing the permissions for a service role might break AWS Batch functionality. Edit service
roles only when AWS Batch provides guidance to do so.
A service-linked role is a type of service role that is linked to an AWS service. The service can assume the
role to perform an action on your behalf. Service-linked roles appear in your IAM account and are owned
by the service. An IAM administrator can view, but not edit the permissions for service-linked roles.
For details about creating or managing service-linked roles, see AWS services that work with IAM. Find
a service in the table that includes a Yes in the Service-linked role column. Choose the Yes link to view
the service-linked role documentation for that service.
AWS Batch execution IAM role
The execution role grants the Amazon ECS container agent permission to make AWS API calls on your
behalf, such as pulling container images from Amazon ECR and sending container logs to CloudWatch
Logs. The role requires the following permissions.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "*"
}
]
}
An execution role is automatically created for you in the AWS Batch console first-run experience;
however, you should manually attach the managed IAM policy for tasks to allow Amazon ECS to add
permissions for future features and enhancements as they are introduced. You can use the following
procedure to check and see if your account already has the execution role and to attach the managed
IAM policy if needed.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "ecs-tasks.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
4. In the Choose a use case section, in the Or select a service to view its use cases section, choose
Elastic Container Service.
5. For Select your use case, choose Elastic Container Service Task, then choose Next: Permissions.
6. In the Attach permissions policy section, search for AmazonECSTaskExecutionRolePolicy, select
the policy, and then choose Next: Tags, and then Next: Review.
7. For Role Name, type ecsTaskExecutionRole and choose Create role.
Identity-based policy examples for AWS Batch
To learn how to create an IAM identity-based policy using these example JSON policy documents, see
Creating IAM policies in the IAM User Guide.
Topics
• Policy best practices (p. 178)
• Using the AWS Batch console (p. 178)
• Allow users to view their own permissions (p. 179)
Policy best practices
• Get Started Using AWS Managed Policies – To start using AWS Batch quickly, use AWS managed
policies to give your employees the permissions they need. These policies are already available in
your account and are maintained and updated by AWS. For more information, see Get Started Using
Permissions With AWS Managed Policies in the IAM User Guide.
• Grant Least Privilege – When you create custom policies, grant only the permissions required
to perform a task. Start with a minimum set of permissions and grant additional permissions as
necessary. Doing so is more secure than starting with permissions that are too lenient and then trying
to tighten them later. For more information, see Grant Least Privilege in the IAM User Guide.
• Enable MFA for Sensitive Operations – For extra security, require IAM users to use multi-factor
authentication (MFA) to access sensitive resources or API operations. For more information, see Using
Multi-Factor Authentication (MFA) in AWS in the IAM User Guide.
• Use Policy Conditions for Extra Security – To the extent that it's practical, define the conditions under
which your identity-based policies allow access to a resource. For example, you can write conditions to
specify a range of allowable IP addresses that a request must come from. You can also write conditions
to allow requests only within a specified date or time range, or to require the use of SSL or MFA. For
more information, see IAM JSON Policy Elements: Condition in the IAM User Guide. An example follows this list.
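For example, the following AWS CLI sketch attaches an inline policy that allows read-only AWS Batch actions only from a specific address range. The user name, policy name, and CIDR range are placeholders for illustration.

# Attach an inline policy that restricts Batch read-only access to one IP range
aws iam put-user-policy \
    --user-name example-user \
    --policy-name BatchReadOnlyFromCorpNetwork \
    --policy-document '{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Action": ["batch:Describe*", "batch:List*"],
        "Resource": "*",
        "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}}
      }]
    }'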
You don't need to allow minimum console permissions for users that are making calls only to the AWS
CLI or the AWS API. Instead, allow access to only the actions that match the API operation that you're
trying to perform.
To ensure that users and roles can still use the AWS Batch console, also attach the AWS Batch
ConsoleAccess or ReadOnly AWS managed policy to the entities. For more information, see Adding
permissions to a user in the IAM User Guide.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ViewOwnUserInfo",
"Effect": "Allow",
"Action": [
"iam:GetUserPolicy",
"iam:ListGroupsForUser",
"iam:ListAttachedUserPolicies",
"iam:ListUserPolicies",
"iam:GetUser"
],
"Resource": ["arn:aws:iam::*:user/${aws:username}"]
},
{
"Sid": "NavigateInConsole",
"Effect": "Allow",
"Action": [
"iam:GetGroupPolicy",
"iam:GetPolicyVersion",
"iam:GetPolicy",
"iam:ListAttachedGroupPolicies",
"iam:ListGroupPolicies",
"iam:ListPolicyVersions",
"iam:ListPolicies",
"iam:ListUsers"
],
"Resource": "*"
}
]
}
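To manage this example as a customer managed policy with the AWS CLI, a sketch such as the following creates the policy from a local copy of the JSON document and attaches it to a user. The file name, policy name, user name, and account ID are placeholders.

# Create a customer managed policy from the JSON document shown above
aws iam create-policy \
    --policy-name ViewOwnPermissions \
    --policy-document file://view-own-permissions.json

# Attach the policy to a user, using the ARN returned by create-policy
aws iam attach-user-policy \
    --user-name example-user \
    --policy-arn arn:aws:iam::123456789012:policy/ViewOwnPermissions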
Topics
• I am not authorized to perform an action in AWS Batch (p. 180)
• I am not authorized to perform iam:PassRole (p. 180)
• I want to view my access keys (p. 180)
• I'm an administrator and want to allow others to access AWS Batch (p. 181)
• I want to allow people outside of my AWS account to access my AWS Batch resources (p. 181)
The following example error occurs when the mateojackson IAM user tries to use the console
to view details about a fictional my-example-widget resource but does not have the fictional
batch:GetWidget permissions.
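The error is similar to the following (the account ID shown is an example):

User: arn:aws:iam::123456789012:user/mateojackson is not authorized to perform:
batch:GetWidget on resource: my-example-widget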
In this case, Mateo asks his administrator to update his policies to allow him to access the my-example-
widget resource using the batch:GetWidget action.
Some AWS services allow you to pass an existing role to that service, instead of creating a new service
role or service-linked role. To do this, you must have permissions to pass the role to the service.
The following example error occurs when an IAM user named marymajor tries to use the console to
perform an action in AWS Batch. However, the action requires the service to have permissions granted by
a service role. Mary doesn't have permissions to pass the role to the service.
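The error is similar to the following (the account ID shown is an example):

User: arn:aws:iam::123456789012:user/marymajor is not authorized to perform: iam:PassRole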
In this case, Mary asks her administrator to update her policies to allow her to perform the
iam:PassRole action.
Access keys consist of two parts: an access key ID (for example, AKIAIOSFODNN7EXAMPLE) and a secret
access key (for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY). Like a user name and
password, you must use both the access key ID and secret access key together to authenticate your
requests. Manage your access keys as securely as you do your user name and password.
Important
Don't provide your access keys to a third party, even to help find your canonical user ID. By
doing this, you might give someone permanent access to your account.
When you create an access key pair, you are prompted to save the access key ID and secret access key in
a secure location. The secret access key is available only at the time you create it. If you lose your secret
access key, you must add new access keys to your IAM user. You can have a maximum of two access keys.
If you already have two, you must delete one key pair before creating a new one. To view instructions,
see Managing Access Keys in the IAM User Guide.
To get started right away, see Creating Your First IAM Delegated User and Group in the IAM User Guide.
• To learn whether AWS Batch supports these features, see How AWS Batch works with IAM (p. 172).
• To learn how to provide access to your resources across AWS accounts that you own, see Providing
Access to an IAM User in Another AWS Account That You Own in the IAM User Guide.
• To learn how to provide access to your resources to third-party AWS accounts, see Providing Access to
AWS Accounts Owned by Third Parties in the IAM User Guide.
• To learn how to provide access through identity federation, see Providing Access to Externally
Authenticated Users (Identity Federation) in the IAM User Guide.
• To learn the difference between using roles and resource-based policies for cross-account access, see
How IAM Roles Differ from Resource-based Policies in the IAM User Guide.
A service-linked role makes setting up AWS Batch easier because you don’t have to manually add the
necessary permissions. AWS Batch defines the permissions of its service-linked roles, and unless defined
otherwise, only AWS Batch can assume its roles. The defined permissions include the trust policy and the
permissions policy, and that permissions policy cannot be attached to any other IAM entity.
You can delete a service-linked role only after first deleting its related resources. This protects your
AWS Batch resources because you can't inadvertently remove permission to access the resources.
For information about other services that support service-linked roles, see AWS Services That Work with
IAM and look for the services that have Yes in the Service-Linked Role column. Choose a Yes with a link
to view the service-linked role documentation for that service.
The role permissions policy allows AWS Batch to complete the following actions on the specified
resources.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeAccountAttributes",
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeInstanceAttribute",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:DescribeKeyPairs",
"ec2:DescribeImages",
"ec2:DescribeImageAttribute",
"ec2:DescribeSpotInstanceRequests",
"ec2:DescribeSpotFleetInstances",
"ec2:DescribeSpotFleetRequests",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeVpcClassicLink",
"ec2:DescribeLaunchTemplateVersions",
"ec2:RequestSpotFleet",
"autoscaling:DescribeAccountLimits",
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeAutoScalingInstances",
"ecs:DescribeClusters",
"ecs:DescribeContainerInstances",
"ecs:DescribeTaskDefinition",
"ecs:DescribeTasks",
"ecs:ListClusters",
"ecs:ListContainerInstances",
"ecs:ListTaskDefinitionFamilies",
"ecs:ListTaskDefinitions",
"ecs:ListTasks",
"ecs:DeregisterTaskDefinition",
"ecs:TagResource",
"ecs:ListAccountSettings",
"logs:DescribeLogGroups",
"iam:GetInstanceProfile",
"iam:GetRole"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream"
],
"Resource": "arn:aws:logs:*:*:log-group:/aws/batch/job*"
},
{
"Effect": "Allow",
"Action": [
"logs:PutLogEvents"
],
"Resource": "arn:aws:logs:*:*:log-group:/aws/batch/job*:log-stream:*"
},
{
"Effect": "Allow",
"Action": [
"autoscaling:CreateOrUpdateTags"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": [
"*"
],
"Condition": {
"StringEquals": {
"iam:PassedToService": [
"ec2.amazonaws.com",
"ec2.amazonaws.com.cn",
"ecs-tasks.amazonaws.com"
]
}
}
},
{
"Effect": "Allow",
"Action": "iam:CreateServiceLinkedRole",
"Resource": "*",
"Condition": {
"StringEquals": {
"iam:AWSServiceName": [
"spot.amazonaws.com",
"spotfleet.amazonaws.com",
"autoscaling.amazonaws.com",
"ecs.amazonaws.com"
]
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateLaunchTemplate"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:TerminateInstances",
"ec2:CancelSpotFleetRequests",
"ec2:ModifySpotFleetRequest",
"ec2:DeleteLaunchTemplate"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:ResourceTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": [
"autoscaling:CreateLaunchConfiguration",
"autoscaling:DeleteLaunchConfiguration"
],
"Resource":
"arn:aws:autoscaling:*:*:launchConfiguration:*:launchConfigurationName/AWSBatch*"
},
{
"Effect": "Allow",
"Action": [
"autoscaling:CreateAutoScalingGroup",
"autoscaling:UpdateAutoScalingGroup",
"autoscaling:SetDesiredCapacity",
"autoscaling:DeleteAutoScalingGroup",
"autoscaling:SuspendProcesses",
"autoscaling:PutNotificationConfiguration",
"autoscaling:TerminateInstanceInAutoScalingGroup"
],
"Resource": "arn:aws:autoscaling:*:*:autoScalingGroup:*:autoScalingGroupName/
AWSBatch*"
},
{
"Effect": "Allow",
"Action": [
"ecs:DeleteCluster",
"ecs:DeregisterContainerInstance",
"ecs:RunTask",
"ecs:StartTask",
"ecs:StopTask"
],
"Resource": "arn:aws:ecs:*:*:cluster/AWSBatch*"
},
{
"Effect": "Allow",
"Action": [
"ecs:RunTask",
"ecs:StartTask",
"ecs:StopTask"
],
"Resource": "arn:aws:ecs:*:*:task-definition/*"
},
{
"Effect": "Allow",
"Action": [
"ecs:StopTask"
],
"Resource": "arn:aws:ecs:*:*:task/*/*"
},
{
"Effect": "Allow",
"Action": [
"ecs:CreateCluster",
"ecs:RegisterTaskDefinition"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": "ec2:RunInstances",
"Resource": [
"arn:aws:ec2:*::image/*",
"arn:aws:ec2:*::snapshot/*",
"arn:aws:ec2:*:*:subnet/*",
"arn:aws:ec2:*:*:network-interface/*",
"arn:aws:ec2:*:*:security-group/*",
"arn:aws:ec2:*:*:volume/*",
"arn:aws:ec2:*:*:key-pair/*",
"arn:aws:ec2:*:*:launch-template/*",
"arn:aws:ec2:*:*:placement-group/*",
"arn:aws:ec2:*:*:capacity-reservation/*",
"arn:aws:ec2:*:*:elastic-gpu/*",
"arn:aws:elastic-inference:*:*:elastic-inference-accelerator/*"
]
},
{
"Effect": "Allow",
"Action": "ec2:RunInstances",
"Resource": "arn:aws:ec2:*:*:instance/*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateTags"
],
"Resource": [
"*"
],
"Condition": {
"StringEquals": {
"ec2:CreateAction": [
"RunInstances",
"CreateLaunchTemplate",
"RequestSpotFleet"
]
}
}
}
]
}
You must configure permissions to allow an IAM entity (such as a user, group, or role) to create, edit, or
delete a service-linked role. For more information, see Service-Linked Role Permissions in the IAM User
Guide.
If your account was using AWS Batch before March 10, 2021, when it began supporting service-linked roles,
then AWS Batch created the AWSServiceRoleForBatch role in your account. To learn more, see A New Role
Appeared in My IAM Account.
If you delete this service-linked role and then need to create it again, you can use the same process to
recreate the role in your account. When you create a compute environment, AWS Batch creates the
service-linked role for you again.
To allow an IAM entity to edit the description of the AWSServiceRoleForBatch service-linked role
Add the following statement to the permissions policy. This allows the IAM entity to edit the description
of a service-linked role.
{
"Effect": "Allow",
"Action": [
"iam:UpdateRoleDescription"
],
"Resource": "arn:aws:iam::*:role/aws-service-role/batch.amazonaws.com/
AWSServiceRoleForBatch",
"Condition": {"StringLike": {"iam:AWSServiceName": "batch.amazonaws.com"}}
}
Add the following statement to the permissions policy. This allows the IAM entity to delete a service-
linked role.
{
"Effect": "Allow",
"Action": [
"iam:DeleteServiceLinkedRole",
"iam:GetServiceLinkedRoleDeletionStatus"
],
"Resource": "arn:aws:iam::*:role/aws-service-role/batch.amazonaws.com/
AWSServiceRoleForBatch",
"Condition": {"StringLike": {"iam:AWSServiceName": "batch.amazonaws.com"}}
}
You must delete all AWS Batch compute environments that use the AWSServiceRoleForBatch role in all
AWS Regions before you can delete the AWSServiceRoleForBatch role.
1. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
2. In the navigation pane of the IAM console, choose Roles. Then select the check box next to
AWSServiceRoleForBatch, not the name or row itself.
3. Choose Delete role.
4. In the confirmation dialog box, review the service last accessed data, which shows when each of the
selected roles last accessed an AWS service. This helps you to confirm whether the role is currently
active. If you want to proceed, choose Yes, Delete to submit the service-linked role for deletion.
5. Watch the IAM console notifications to monitor the progress of the service-linked role deletion.
Because the IAM service-linked role deletion is asynchronous, after you submit the role for deletion,
the deletion task can succeed or fail.
• If the task succeeds, then the role is removed from the list and a notification of success appears at
the top of the page.
• If the task fails, you can choose View details or View Resources from the notifications to learn
why the deletion failed. If the deletion fails because the role is using the service's resources, then
the notification includes a list of resources, if the service returns that information. You can then
clean up the resources and submit the deletion again.
Note
You might have to repeat this process several times, depending on the information that
the service returns. For example, your service-linked role might use six resources and your
service might return information about five of them. If you clean up the five resources
and submit the role for deletion again, the deletion fails and the service reports the one
remaining resource. A service might return all of the resources, a few of them, or it might
not report any resources.
• If the task fails and the notification does not include a list of resources, then the service might not
return that information. To learn how to clean up the resources for that service, see AWS services
that work with IAM. Find your service in the table, and choose the Yes link to view the service-
linked role documentation for that service.
1. Because a service-linked role can't be deleted if it's being used or has associated resources, you
must submit a deletion request. That request can be denied if these conditions aren't met. You must
capture the deletion-task-id from the response to check the status of the deletion task. Enter
the following command to submit a service-linked role deletion request:
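For example:

# Submit the deletion request and note the deletion task ID in the response
aws iam delete-service-linked-role --role-name AWSServiceRoleForBatch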
2. Use the following command to check the status of the deletion task:
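For example, using the task ID captured from the previous response:

# Check the status of the deletion task
aws iam get-service-linked-role-deletion-status --deletion-task-id deletion-task-id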
The status of the deletion task can be NOT_STARTED, IN_PROGRESS, SUCCEEDED, or FAILED.
If the deletion fails, the call returns the reason that it failed so that you can troubleshoot. If the
deletion fails because the role is using the service's resources, then the notification includes a list of
resources, if the service returns that information. You can then clean up the resources and submit
the deletion again.
Note
You might have to repeat this process several times, depending on the information that
the service returns. For example, your service-linked role might use six resources and your
service might return information about five of them. If you clean up the five resources
and submit the role for deletion again, the deletion fails and the service reports the one
remaining resource. A service might return all of the resources, a few of them. Or, it might
not report any resources. To learn how to clean up the resources for a service that doesn't
report any resources, see AWS services that work with IAM. Find your service in the table,
and choose the Yes link to view the service-linked role documentation for that service.
1. To submit a deletion request for a service-linked role, call DeleteServiceLinkedRole. In the request,
specify the AWSServiceRoleForBatch role name.
Because a service-linked role cannot be deleted if it is being used or has associated resources, you
must submit a deletion request. That request can be denied if these conditions are not met. You
must capture the DeletionTaskId from the response to check the status of the deletion task.
2. To check the status of the deletion, call GetServiceLinkedRoleDeletionStatus. In the request, specify
the DeletionTaskId.
The status of the deletion task can be NOT_STARTED, IN_PROGRESS, SUCCEEDED, or FAILED.
If the deletion fails, the call returns the reason that it failed so that you can troubleshoot. If the
deletion fails because the role is using the service's resources, then the notification includes a list of
resources, if the service returns that information. You can then clean up the resources and submit
the deletion again.
Note
You might have to repeat this process several times, depending on the information that
the service returns. For example, your service-linked role might use six resources and your
service might return information about five of them. If you clean up the five resources
and submit the role for deletion again, the deletion fails and the service reports the one
remaining resource. A service might return all of the resources, a few of them, or it might
not report any resources. To learn how to clean up the resources for a service that does not
report any resources, see AWS services that work with IAM. Find your service in the table,
and choose the Yes link to view the service-linked role documentation for that service.
You can use AWS managed policies for simpler identity access management for your team and
provisioned AWS resources. AWS managed policies cover a variety of common use cases, are available
by default in your AWS account, and are maintained and updated on your behalf. You can't change the
permissions in AWS managed policies. If you require greater flexibility, you can alternatively choose to
create IAM customer managed policies. This way, you can provide your team and provisioned resources with
only the exact permissions they need.
For more information about AWS managed policies, see AWS managed policies in the IAM User Guide.
AWS services maintain and update AWS managed policies on your behalf. Periodically, AWS services add
additional permissions to an AWS managed policy. AWS managed policies are most likely to be updated when
a new feature launches or a new operation becomes available. These updates automatically affect all identities
(users, groups, and roles) where the policy is attached. However, they don't remove permissions or break
your existing permissions.
Additionally, AWS supports managed policies for job functions that span multiple services. For example,
the ReadOnlyAccess AWS managed policy provides read-only access to all AWS services and resources.
When a service launches a new feature, AWS adds read-only permissions for new operations and
resources. For a list and descriptions of job function policies, see AWS managed policies for job functions
in the IAM User Guide.
The BatchServiceRolePolicy policy is attached to a service-linked role. This allows AWS Batch to perform
actions on your behalf. You can't attach this policy to your IAM entities. For more information, see Using
service-linked roles for AWS Batch (p. 181).
This policy grants AWS Batch permissions that allow access to related services, including Amazon EC2,
Amazon EC2 Auto Scaling, Amazon ECS, and Amazon CloudWatch Logs.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeAccountAttributes",
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeInstanceAttribute",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:DescribeKeyPairs",
"ec2:DescribeImages",
"ec2:DescribeImageAttribute",
"ec2:DescribeSpotInstanceRequests",
"ec2:DescribeSpotFleetInstances",
"ec2:DescribeSpotFleetRequests",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeVpcClassicLink",
"ec2:DescribeLaunchTemplateVersions",
"ec2:RequestSpotFleet",
"autoscaling:DescribeAccountLimits",
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeAutoScalingInstances",
"ecs:DescribeClusters",
"ecs:DescribeContainerInstances",
"ecs:DescribeTaskDefinition",
"ecs:DescribeTasks",
"ecs:ListClusters",
"ecs:ListContainerInstances",
"ecs:ListTaskDefinitionFamilies",
"ecs:ListTaskDefinitions",
"ecs:ListTasks",
"ecs:DeregisterTaskDefinition",
"ecs:TagResource",
"ecs:ListAccountSettings",
"logs:DescribeLogGroups",
"iam:GetInstanceProfile",
"iam:GetRole"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream"
],
"Resource": "arn:aws:logs:*:*:log-group:/aws/batch/job*"
},
{
"Effect": "Allow",
"Action": [
"logs:PutLogEvents"
],
"Resource": "arn:aws:logs:*:*:log-group:/aws/batch/job*:log-stream:*"
},
{
"Effect": "Allow",
"Action": [
"autoscaling:CreateOrUpdateTags"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": [
"*"
],
"Condition": {
"StringEquals": {
"iam:PassedToService": [
"ec2.amazonaws.com",
"ec2.amazonaws.com.cn",
"ecs-tasks.amazonaws.com"
]
}
}
},
{
"Effect": "Allow",
"Action": "iam:CreateServiceLinkedRole",
"Resource": "*",
"Condition": {
"StringEquals": {
"iam:AWSServiceName": [
"spot.amazonaws.com",
"spotfleet.amazonaws.com",
"autoscaling.amazonaws.com",
"ecs.amazonaws.com"
]
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateLaunchTemplate"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:TerminateInstances",
"ec2:CancelSpotFleetRequests",
"ec2:ModifySpotFleetRequest",
"ec2:DeleteLaunchTemplate"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:ResourceTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": [
"autoscaling:CreateLaunchConfiguration",
"autoscaling:DeleteLaunchConfiguration"
],
"Resource":
"arn:aws:autoscaling:*:*:launchConfiguration:*:launchConfigurationName/AWSBatch*"
},
{
"Effect": "Allow",
"Action": [
"autoscaling:CreateAutoScalingGroup",
"autoscaling:UpdateAutoScalingGroup",
"autoscaling:SetDesiredCapacity",
"autoscaling:DeleteAutoScalingGroup",
"autoscaling:SuspendProcesses",
"autoscaling:PutNotificationConfiguration",
"autoscaling:TerminateInstanceInAutoScalingGroup"
],
"Resource": "arn:aws:autoscaling:*:*:autoScalingGroup:*:autoScalingGroupName/
AWSBatch*"
},
{
"Effect": "Allow",
"Action": [
"ecs:DeleteCluster",
"ecs:DeregisterContainerInstance",
"ecs:RunTask",
"ecs:StartTask",
"ecs:StopTask"
],
"Resource": "arn:aws:ecs:*:*:cluster/AWSBatch*"
},
{
"Effect": "Allow",
"Action": [
"ecs:RunTask",
"ecs:StartTask",
"ecs:StopTask"
],
"Resource": "arn:aws:ecs:*:*:task-definition/*"
},
{
"Effect": "Allow",
"Action": [
"ecs:StopTask"
],
"Resource": "arn:aws:ecs:*:*:task/*/*"
},
{
"Effect": "Allow",
"Action": [
"ecs:CreateCluster",
"ecs:RegisterTaskDefinition"
],
"Resource": "*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": "ec2:RunInstances",
"Resource": [
"arn:aws:ec2:*::image/*",
"arn:aws:ec2:*::snapshot/*",
"arn:aws:ec2:*:*:subnet/*",
"arn:aws:ec2:*:*:network-interface/*",
"arn:aws:ec2:*:*:security-group/*",
"arn:aws:ec2:*:*:volume/*",
"arn:aws:ec2:*:*:key-pair/*",
"arn:aws:ec2:*:*:launch-template/*",
"arn:aws:ec2:*:*:placement-group/*",
"arn:aws:ec2:*:*:capacity-reservation/*",
"arn:aws:ec2:*:*:elastic-gpu/*",
"arn:aws:elastic-inference:*:*:elastic-inference-accelerator/*"
]
},
{
"Effect": "Allow",
"Action": "ec2:RunInstances",
"Resource": "arn:aws:ec2:*:*:instance/*",
"Condition": {
"Null": {
"aws:RequestTag/AWSBatchServiceTag": "false"
}
}
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateTags"
],
"Resource": [
"*"
],
"Condition": {
"StringEquals": {
"ec2:CreateAction": [
"RunInstances",
"CreateLaunchTemplate",
"RequestSpotFleet"
]
}
}
}
]
}
You can attach BatchFullAccess to your IAM entities. AWS Batch also attaches this policy to a service role
that allows AWS Batch to perform actions on your behalf.
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"batch:*",
"cloudwatch:GetMetricStatistics",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:DescribeKeyPairs",
"ec2:DescribeVpcs",
"ec2:DescribeImages",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeLaunchTemplateVersions",
"ecs:DescribeClusters",
"ecs:Describe*",
"ecs:List*",
"logs:Describe*",
"logs:Get*",
"logs:TestMetricFilter",
"logs:FilterLogEvents",
"iam:ListInstanceProfiles",
"iam:ListRoles"
],
"Resource":"*"
},
{
"Effect":"Allow",
"Action":[
"iam:PassRole"
],
"Resource":[
"arn:aws:iam::*:role/AWSBatchServiceRole",
"arn:aws:iam::*:role/service-role/AWSBatchServiceRole",
"arn:aws:iam::*:role/ecsInstanceRole",
"arn:aws:iam::*:instance-profile/ecsInstanceRole",
"arn:aws:iam::*:role/iaws-ec2-spot-fleet-role",
"arn:aws:iam::*:role/aws-ec2-spot-fleet-role",
"arn:aws:iam::*:role/AWSBatchJobRole*"
]
},
{
"Effect":"Allow",
"Action":[
"iam:CreateServiceLinkedRole"
],
"Resource":"arn:aws:iam::*:role/*Batch*",
"Condition": {
"StringEquals": {
"iam:AWSServiceName": "batch.amazonaws.com"
}
}
}
]
}
View details about updates to AWS managed policies for AWS Batch since this service began tracking
these changes. For automatic alerts about changes to this page, subscribe to the RSS feed on the AWS
Batch Document history page.
BatchServiceRolePolicy (p. 190) and AWSBatchServiceRole (p. 142) policies updated (December 6,
2021)
Updated to add support for describing the status of AWS Batch managed instances in Amazon EC2
so unhealthy instances are replaced.
BatchServiceRolePolicy (p. 190) policy updated (March 26, 2021)
Updated to add support for placement group, capacity reservation, elastic GPU, and Elastic Inference
resources in Amazon EC2.
BatchServiceRolePolicy (p. 190) policy added (March 10, 2021)
Added IAM permissions to allow the AWSServiceRoleForBatch service-linked role to be added to the
account.
AWS Batch started tracking changes (March 10, 2021)
AWS Batch started tracking changes for its AWS managed policies.
To learn whether AWS Batch or other AWS services are in scope of specific compliance programs, see
AWS Services in Scope by Compliance Program. For general information, see AWS Compliance Programs.
You can download third-party audit reports using AWS Artifact. For more information, see Downloading
Reports in AWS Artifact.
Your compliance responsibility when using AWS services is determined by the sensitivity of your data,
your company's compliance objectives, and applicable laws and regulations. AWS provides the following
resources to help with compliance:
• Security and Compliance Quick Start Guides – These deployment guides discuss architectural
considerations and provide steps for deploying baseline environments on AWS that are security and
compliance focused.
• Architecting for HIPAA Security and Compliance Whitepaper – This whitepaper describes how
companies can use AWS to create HIPAA-eligible applications.
Note
Not all AWS services are HIPAA eligible. For more information, see the HIPAA Eligible Services
Reference.
• AWS Compliance Resources – This collection of workbooks and guides might apply to your industry
and location.
• Evaluating Resources with Rules in the AWS Config Developer Guide – The AWS Config service assesses
how well your resource configurations comply with internal practices, industry guidelines, and
regulations.
• AWS Security Hub – This AWS service provides a comprehensive view of your security state within AWS
that helps you check your compliance with security industry standards and best practices.
• AWS Audit Manager – This AWS service helps you continuously audit your AWS usage to simplify how
you manage risk and compliance with regulations and industry standards.
You use AWS published API calls to access AWS Batch through the network. Clients must support
Transport Layer Security (TLS) 1.0 or later. We recommend TLS 1.2 or later. Clients must also support
cipher suites with perfect forward secrecy (PFS) such as Ephemeral Diffie-Hellman (DHE) or Elliptic Curve
Ephemeral Diffie-Hellman (ECDHE). Most modern systems such as Java 7 and later support these modes.
Additionally, requests must be signed by using an access key ID and a secret access key that is associated
with an IAM principal. Or you can use the AWS Security Token Service (AWS STS) to generate temporary
security credentials to sign requests.
Contents
• Tag basics (p. 197)
• Tagging your resources (p. 197)
• Tag restrictions (p. 198)
• Working with tags using the console (p. 199)
• Working with tags using the CLI or API (p. 199)
Tag basics
A tag is a label that you assign to an AWS resource. Each tag consists of a key and an optional value, both
of which you define.
Tags enable you to categorize your AWS resources by, for example, purpose, owner, or environment.
When you have many resources of the same type, you can quickly identify a specific resource based on
the tags you've assigned to it. For example, you can define a set of tags for your AWS Batch services to
help you track each service's owner and stack level. We recommend that you devise a consistent set of
tag keys for each resource type.
Tags are not automatically assigned to your resources. After you add a tag, you can edit tag keys and
values or remove tags from a resource at any time. If you delete a resource, any tags for the resource are
also deleted.
Tags don't have any semantic meaning to AWS Batch and are interpreted strictly as a string of characters.
You can set the value of a tag to an empty string, but you can't set the value of a tag to null. If you add a
tag that has the same key as an existing tag on that resource, the new value overwrites the old value.
You can work with tags using the AWS Management Console, the AWS CLI, and the AWS Batch API.
If you're using AWS Identity and Access Management (IAM), you can control which users in your AWS
account have permission to create, edit, or delete tags.
If you're using the AWS Batch console, you can apply tags to new resources when they are created or to
existing resources at any time using the Tags tab on the relevant resource page.
If you're using the AWS Batch API, the AWS CLI, or an AWS SDK, you can apply tags to new resources
using the tags parameter on the relevant API action or to existing resources using the TagResource
API action. For more information, see TagResource.
Some resource-creating actions enable you to specify tags for a resource when the resource is created.
If tags cannot be applied during resource creation, the resource creation process fails. This ensures that
resources you intended to tag on creation are either created with specified tags or not created at all. If
you tag resources at the time of creation, you don't need to run custom tagging scripts after resource
creation.
The following table describes the AWS Batch resources that can be tagged, and the resources that can be
tagged on creation.
Tag restrictions
The following basic restrictions apply to tags:
Working with tags using the console
• To add a tag — specify the key and value in the empty text boxes at the end of the list.
• To delete a tag — choose the button next to the tag.
6. Repeat this process for each tag you want to add or delete, and then choose Edit tags to finish.
Working with tags using the CLI or API
The following examples show how to tag or untag resources using the AWS CLI.
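For example, the following commands add a tag to a job definition and then remove it. The resource ARN, tag key, and tag value are placeholders.

# Add a tag to an existing resource
aws batch tag-resource \
    --resource-arn "arn:aws:batch:us-east-1:123456789012:job-definition/example-job-definition:1" \
    --tags team=research

# Remove the same tag from the resource
aws batch untag-resource \
    --resource-arn "arn:aws:batch:us-east-1:123456789012:job-definition/example-job-definition:1" \
    --tag-keys team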
The following command lists the tags associated with an existing resource.
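For example (the resource ARN is a placeholder):

# List the tags on an AWS Batch resource, such as a job definition
aws batch list-tags-for-resource \
    --resource-arn "arn:aws:batch:us-east-1:123456789012:job-definition/example-job-definition:1"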
Some resource-creating actions enable you to specify tags when you create the resource. The following
actions support tagging on creation: CreateComputeEnvironment, CreateJobQueue, RegisterJobDefinition, and SubmitJob.
• Maximum number of job queues: 50. For more information, see Job queues (p. 82).
• Maximum number of transactions per second (TPS) for each account for SubmitJob operations: 65
Depending on how you use AWS Batch, additional quotas might apply. To learn about Amazon EC2
quotas, see Amazon EC2 Service Quotas in the AWS General Reference. For more information about
Amazon ECS quotas, see Amazon ECS Service Quotas in the AWS General Reference.
INVALID Compute Environment
However, if you manually type the name or ARN for an IAM role in an AWS CLI command or your SDK code,
AWS Batch can't validate the string. It accepts the bad value and attempts to create the environment.
After the environment fails to be created, it moves to an INVALID state, and you see the
following errors.
One common cause for this issue is if you only specify the name of an IAM role when using the AWS CLI
or the AWS SDKs, instead of the full ARN. This is because depending on how you created the role, the
ARN might contain a service-role path prefix. For example, if you manually create the AWS Batch
service role using the procedures in AWS Batch service IAM role (p. 142), your service role ARN would
look like this:
arn:aws:iam::123456789012:role/AWSBatchServiceRole
However, if you created the service role as part of the console first run wizard today, your service role
ARN would look like this:
arn:aws:iam::123456789012:role/service-role/AWSBatchServiceRole
When you only specify the name of an IAM role when using the AWS CLI or the AWS SDKs, AWS Batch
assumes that your ARN doesn't use the service-role path prefix. Because of this, we recommend that
you specify the full ARN for your IAM roles when you create compute environments.
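For example, a minimal AWS CLI call for an unmanaged compute environment passes the full ARN. The environment name and account ID shown are placeholders.

# Create an unmanaged compute environment, specifying the full service role ARN
aws batch create-compute-environment \
    --compute-environment-name example-unmanaged-ce \
    --type UNMANAGED \
    --service-role arn:aws:iam::123456789012:role/service-role/AWSBatchServiceRole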
To repair a compute environment that's misconfigured this way, see Repairing an INVALID Compute
Environment (p. 203).
AWS Batch jobs send their log information to CloudWatch Logs. To enable this, you must configure
your compute resources to use the awslogs log driver. If you base your compute resource AMI off
of the Amazon ECS optimized AMI (or Amazon Linux), then this driver is registered by default with
the ecs-init package. If you use a different base AMI, then you must verify that the awslogs
log driver is specified as an available log driver with the ECS_AVAILABLE_LOGGING_DRIVERS
environment variable when the Amazon ECS container agent is started. For more information, see
Compute resource AMI specification (p. 89) and Creating a compute resource AMI (p. 90).
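For example, on a custom base AMI you can verify that the Amazon ECS container agent configuration file lists the awslogs driver before the agent starts. A typical entry in /etc/ecs/ecs.config looks like the following:

ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","awslogs"]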
Insufficient resources
If your job definitions specify more CPU or memory resources than your compute resources can
allocate, then your jobs are never placed. For example, if your job specifies 4 GiB of memory, and your
compute resources have less than that available, then the job can't be placed on those compute
compute resources have less than that available, then the job can't be placed on those compute
resources. In this case, you must reduce the specified memory in your job definition or add larger
compute resources to your environment. Some memory is reserved for the Amazon ECS container
agent and other critical system processes. For more information, see Compute Resource Memory
Management (p. 114).
Compute resources need access to communicate with the Amazon ECS service endpoint. This can be
through an interface VPC endpoint or through your compute resources having public IP addresses.
For more information about interface VPC endpoints, see Amazon ECS Interface VPC Endpoints
(AWS PrivateLink) in the Amazon Elastic Container Service Developer Guide.
If you do not have an interface VPC endpoint configured and your compute resources do not have
public IP addresses, then they must use network address translation (NAT) to provide this access.
For more information, see NAT gateways in the Amazon VPC User Guide. For more information, see
Tutorial: Creating a VPC with Public and Private Subnets for Your Compute Environments (p. 165).
Amazon EC2 instance limit reached
The number of Amazon EC2 instances that your account can launch in an AWS Region is determined
by your EC2 instance limit. Certain instance types have a per-instance-type limit as well. For
more information on your account's Amazon EC2 instance limits (including how to request a limit
increase), see Amazon EC2 Service Limits in the Amazon EC2 User Guide for Linux Instances.
For more information on diagnosing jobs stuck in RUNNABLE status, see Why is my AWS Batch job stuck
in RUNNABLE status? in the AWS Knowledge Center.
To fix Spot Instance tagging on creation, use the following procedure to apply the current
recommended IAM managed policy to your Amazon EC2 Spot Fleet role. Any future Spot
Instances created with that role then have permission to apply instance tags on creation.
Topics
• Attach AmazonEC2SpotFleetTaggingRole managed policy to your Spot Fleet role in the AWS
Management Console (p. 205)
• Attach AmazonEC2SpotFleetTaggingRole managed policy to your Spot Fleet role with the AWS
CLI (p. 205)
Attach AmazonEC2SpotFleetTaggingRole
managed policy to your Spot Fleet role in the AWS
Management Console
To apply the current IAM managed policy to your Amazon EC2 Spot Fleet role
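With the AWS CLI, you can attach the same managed policy with a single command. The following is a sketch; the role name aws-ec2-spot-fleet-role is a placeholder for the Spot Fleet role that your compute environment uses.

# Attach the recommended managed policy to your Spot Fleet role
aws iam attach-role-policy \
    --role-name aws-ec2-spot-fleet-role \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2SpotFleetTaggingRole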
Can't override job definition resource requirements
If you try to override these resource requirements, you might see the following error message:
"This value was submitted in a deprecated key and may conflict with the value provided by the job
definition's resource requirements."
To correct this, specify the memory and vCPU requirements in the resourceRequirements member of the
containerOverrides. For example, if your memory and vCPU overrides are specified in the following lines:
"containerOverrides": {
"memory": 8192,
"vcpus": 4
}
"containerOverrides": {
"resourceRequirements": [
{
"type": "MEMORY",
"value": "8192"
},
{
"type": "VCPU",
"value": "4"
}
],
}
Make the same change to the memory and vCPU requirements that are specified in the containerProperties
object in the job definition. For example, if your memory and vCPU requirements are specified in the
following lines:

"containerProperties": {
    "memory": 4096,
    "vcpus": 2
}

Convert them to:

"containerProperties": {
    "resourceRequirements": [
        {
            "type": "MEMORY",
            "value": "4096"
        },
        {
            "type": "VCPU",
            "value": "2"
        }
    ]
}
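As a sketch, the same resourceRequirements overrides can be supplied through the AWS CLI when you submit a job. The job name, job queue, and job definition are placeholders.

# Submit a job, overriding memory and vCPU through resourceRequirements
aws batch submit-job \
    --job-name example-job \
    --job-queue example-queue \
    --job-definition example-job-definition \
    --container-overrides '{"resourceRequirements":[{"type":"MEMORY","value":"8192"},{"type":"VCPU","value":"4"}]}'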
Document history
The following table describes the important changes to the documentation since the initial release of
AWS Batch. We also update the documentation frequently to address the feedback that you send us.
• AWS managed policy updates - Update to existing policies (December 6, 2021): AWS Batch updated existing managed policies.
• Fair share scheduling (November 9, 2021): AWS Batch adds support for adding scheduling policies to job queues.
• Added service-linked role (March 10, 2021): AWS Batch adds the AWSServiceRoleForBatch service-linked role.
• Amazon Linux 2 support (November 24, 2020): AWS Batch adds support for automatic selection of Amazon Linux 2 AMIs in the compute environment using the EC2 Configuration parameters.
• Enhanced retry strategy (October 20, 2020): AWS Batch enhances the retry strategy for jobs. Jobs can now be retried, or further retries stopped, by matching the ExitCode, Reason, or StatusReason of a job with patterns.
• Allocation strategies (October 16, 2019): AWS Batch adds support for multiple strategies to choose instance types.
• Multi-node parallel jobs (November 19, 2018): AWS Batch adds support for multi-node parallel jobs. You can use this feature to run single jobs that span multiple Amazon EC2 instances.
• Amazon EC2 launch template support (November 12, 2018): AWS Batch adds support for using launch templates with compute environments.
• AWS Batch job timeouts (April 5, 2018): AWS Batch adds support for job timeouts. With this support, you can configure a specific timeout duration for your jobs so that if a job runs longer than it should, AWS Batch terminates the job.
• AWS Batch jobs as EventBridge targets (March 1, 2018): AWS Batch jobs are made available as EventBridge targets. By creating simple rules, you can match events and submit AWS Batch jobs in response to them.
• CloudTrail auditing for AWS Batch (January 10, 2018): CloudTrail can audit calls made to AWS Batch API actions.
• Array jobs (November 28, 2017): AWS Batch adds support for array jobs. You can use array jobs for parameter sweep and Monte Carlo workloads.
• Expanded AWS Batch tagging (October 26, 2017): AWS Batch expands support for the tagging function. You can use this function to specify tags for Amazon EC2 Spot Instances launched within managed compute environments.
• AWS Batch event stream for EventBridge (October 24, 2017): AWS Batch adds the event stream for EventBridge. You can use the AWS Batch event stream to receive near real-time notifications regarding the state of jobs that are submitted to your job queues.
• Automated job retries (March 28, 2017): AWS Batch adds support for job retries. With this update, you can apply a retry strategy to your jobs and job definitions that allows your jobs to be automatically retried if they fail.