Description
Describe the bug
The options provided in PythonRayExecutableProps are not all useful for the Ray job type. For instance, the extraPythonFiles and extraFiles keys correspond to parameters on Spark and plain Python jobs, but not on Ray jobs.
This could be a source of confusion (as it was for me) for people looking to make use of the Ray job --s3-py-modules parameter, which currently needs to be provided under defaultArguments.
Expected Behavior
I would expect that any Ray-specific arguments would be exposed under PythonRayExecutableProps and that non-Ray parameters would not be.
Current Behavior
The extraPythonFiles key has no effect.
Reproduction Steps
new Job(this, 'job', {
  role: myRole,
  jobName: myJobName,
  timeout: Duration.minutes(1),
  workerType: WorkerType.Z_2X,
  workerCount: 2,
  maxConcurrentRuns: 1,
  executionClass: ExecutionClass.STANDARD,
  executable: JobExecutable.pythonRay({
    glueVersion: GlueVersion.V4_0,
    pythonVersion: PythonVersion.THREE_NINE,
    runtime: Runtime.RAY_TWO_FOUR,
    script: Code.fromBucket("<bucketPath>", "<scriptPath>"),
    // does not work
    // extraPythonFiles: [Code.fromBucket("<bucketPath>", "<dependenciesPath>")],
  }),
  defaultArguments: {
    // does work
    '--s3-py-modules': `s3://${props.assetsBucket.bucketName}/${props.dependenciesPath}`,
  }
});
Possible Solution
https://github.com/aws/aws-cdk/blob/v2.118.0/packages/@aws-cdk/aws-glue-alpha/lib/job-executable.ts#L256 - PythonRayExecutableProps is just an alias over the generic Python job props and is not differentiated for Ray, which makes for a misleading experience. A fix would look like refactoring this to omit the irrelevant props and include the missing ones.
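To make the suggestion concrete, here is a rough sketch of the shape such a refactor could take. It uses simplified stand-in types rather than the real aws-glue-alpha interfaces, so everything in it is an assumption for illustration: GenericPythonProps, RayProps, the s3PyModules key, and the rayDefaultArguments helper are hypothetical names, not existing CDK API. The only detail taken from this report is that Ray jobs read the --s3-py-modules argument.

```typescript
// Hypothetical, simplified stand-in for the generic Python executable props.
// The real interfaces in aws-glue-alpha use Code objects and enum-like classes.
interface GenericPythonProps {
  script: string;
  pythonVersion: string;
  extraPythonFiles?: string[];
  extraFiles?: string[];
}

// Hypothetical Ray-specific props: drop the keys that have no effect on Ray
// jobs and add a Ray-only key (the name s3PyModules is an assumption).
type RayProps = Omit<GenericPythonProps, 'extraPythonFiles' | 'extraFiles'> & {
  runtime: string;
  s3PyModules?: string;
};

// Hypothetical helper: translate the Ray-only key into the job argument that
// currently has to be written by hand under defaultArguments.
function rayDefaultArguments(props: RayProps): Record<string, string> {
  const args: Record<string, string> = {};
  if (props.s3PyModules !== undefined) {
    args['--s3-py-modules'] = props.s3PyModules;
  }
  return args;
}
```

With a type like this, passing extraPythonFiles to a Ray executable would become a compile-time error instead of being silently ignored, and the dependency location would no longer need to be supplied through defaultArguments.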
Additional Information/Context
No response
CDK CLI Version
2.113.0
Framework Version
No response
Node.js Version
v18.19.0
OS
macOS
Language
TypeScript
Language Version
5
Other information
No response