Problem Statement − Use the boto3 library in Python to run an AWS Glue job. For example, run the job run_s3_file_job.
Approach/Algorithm to solve this problem
Step 1 − Import boto3 and botocore exceptions to handle exceptions.
Step 2 − job_name is the mandatory parameter of the function, while arguments is optional. Some jobs take arguments to run. In that case, the arguments can be passed as a dict.
For example: arguments = {'arguments1': 'value1', 'arguments2': 'value2'}
If the job takes no arguments, pass only job_name.
Step 3 − Create an AWS session using the boto3 library. Make sure region_name is set in the default profile. If it is not, pass region_name explicitly when creating the session.
Step 4 − Create an AWS client for glue.
Step 5 − Now call the start_job_run function, passing JobName and, if required, Arguments.
Step 6 − Once the job starts, start_job_run returns a response that contains the JobRunId along with the response metadata.
Step 7 − Handle the generic exception if something goes wrong while starting the job.
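Steps 2 to 4 can be sketched as small helpers. This is a minimal illustration under stated assumptions, not part of the original example: build_glue_arguments, session_kwargs and create_glue_client are hypothetical names, and the leading "--" on argument keys reflects the convention Glue jobs commonly use for job parameters, so drop it if your job reads its arguments differently. boto3 is imported only inside create_glue_client, which lets the two pure helpers be tried without AWS access.

```python
def build_glue_arguments(params=None):
    """Build a dict for the Arguments parameter of start_job_run.

    Assumption: keys get a leading '--' when it is missing, and values are
    converted to strings, because Arguments is a string-to-string map.
    """
    if not params:
        return {}
    return {
        (key if key.startswith("--") else "--" + key): str(value)
        for key, value in params.items()
    }


def session_kwargs(region_name=None, profile_name=None):
    """Collect only the session options that were given explicitly."""
    kwargs = {}
    if region_name:
        kwargs["region_name"] = region_name
    if profile_name:
        kwargs["profile_name"] = profile_name
    return kwargs


def create_glue_client(region_name=None, profile_name=None):
    """Create a Glue client, passing region_name only when it is supplied."""
    import boto3  # imported here so the helpers above work without boto3

    session = boto3.session.Session(**session_kwargs(region_name, profile_name))
    return session.client("glue")


print(build_glue_arguments({"arguments1": "value1", "--arguments2": 2}))
# prints {'--arguments1': 'value1', '--arguments2': '2'}
```

With these helpers, a call such as create_glue_client(region_name="us-east-1") covers the case where the default profile has no region configured; the region shown is only a placeholder.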
Example
Use the following code to run an existing glue job −
import boto3
from botocore.exceptions import ClientError


def run_glue_job(job_name, arguments=None):
    # Region and credentials are taken from the default profile
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        # Arguments is optional; send an empty dict when none are given
        response = glue_client.start_job_run(JobName=job_name, Arguments=arguments or {})
        return response
    except ClientError as e:
        raise Exception("boto3 client error in run_glue_job: " + str(e)) from e
    except Exception as e:
        raise Exception("Unexpected error in run_glue_job: " + str(e)) from e


print(run_glue_job("run_s3_file_job"))
Output
{'JobRunId': 'jr_5f8136286322ce5b7d0387e28df6742abc6f5e6892751431692ffd717f45fc00', 'ResponseMetadata': {'RequestId': '36c48542-a060-468b-83cc-b067a540bc3c', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sat, 13 Feb 2021 13:36:50 GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length': '82', 'connection': 'keep-alive', 'x-amzn-requestid': '36c48542-a060-468b-83cc-b067a540bc3c'}, 'RetryAttempts': 0}}
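The response is a plain dictionary, so the run ID can be read from it directly. The sketch below shows one possible way to do that and to wait for the run to finish; it is an illustration under assumptions rather than part of the original example. extract_job_run_id and wait_for_job_run are hypothetical names, the polling interval, the poll limit and the set of states treated as final are choices made here, and the Glue client is passed in as a parameter so the function can be exercised with a stand-in object.

```python
import time

# Assumption: these JobRunState values are treated as final for polling.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"}


def extract_job_run_id(response):
    """Return the JobRunId from a start_job_run response dict."""
    return response["JobRunId"]


def wait_for_job_run(glue_client, job_name, run_id, poll_seconds=30, max_polls=120):
    """Poll get_job_run until the run reaches one of TERMINAL_STATES.

    glue_client is any object with a get_job_run(JobName=..., RunId=...)
    method, such as the client created in the example above.
    """
    for _ in range(max_polls):
        response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
        state = response["JobRun"]["JobRunState"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(
        "Job run " + run_id + " did not finish after " + str(max_polls) + " polls"
    )


# Reading the run ID from a response shaped like the output above
sample_response = {
    "JobRunId": "jr_5f8136286322ce5b7d0387e28df6742abc6f5e6892751431692ffd717f45fc00",
    "ResponseMetadata": {"HTTPStatusCode": 200, "RetryAttempts": 0},
}
print(extract_job_run_id(sample_response))
```

In real use, the returned state would typically be checked against "SUCCEEDED" before any downstream step reads the job's results.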