(step_functions): S3 Json Lines Item Reader #33601

@MFC-MiguelFerreira

Description

Describe the feature

The AWS Step Functions team recently introduced support for JSON Lines (JSONL) in Distributed Map, allowing efficient processing of large datasets stored in this format:
🔗 AWS Blog Post – JSONL Support in Step Functions Distributed Map

Currently, the AWS CDK provides S3JsonItemReader (docs), which supports reading JSON objects from an S3 file. However, this construct does not support JSONL files. Given that JSONL is now natively supported by Step Functions Distributed Map, it would be highly beneficial to have native support for JSONL in the CDK as well.
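To illustrate why a reader built for a single JSON document cannot simply be pointed at a JSONL file, here is a minimal sketch using only the Python standard library. The sample data and the helper names `read_json_array` and `read_jsonl` are made up for this example and are not part of the CDK or Step Functions.

```python
import json

# Hypothetical sample data, for illustration only.
json_array_text = '[{"id": 1}, {"id": 2}]'
jsonl_text = '{"id": 1}\n{"id": 2}\n'


def read_json_array(text):
    """Parse a single JSON document whose top-level value is an array."""
    return json.loads(text)


def read_jsonl(text):
    """Parse JSON Lines: one independent JSON value per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


items_from_array = read_json_array(json_array_text)
items_from_jsonl = read_jsonl(jsonl_text)

# A plain JSON parser rejects JSONL input because it contains
# more than one top-level value.
try:
    json.loads(jsonl_text)
    jsonl_parses_as_json = True
except json.JSONDecodeError:
    jsonl_parses_as_json = False
```

Both helpers yield the same items, but the JSONL text is not itself a valid JSON document, which is why the reader needs to know which format to expect.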

Use Case

Developers using AWS Step Functions with CDK would be able to seamlessly leverage JSONL for large-scale data processing, without resorting to custom implementations or workarounds.

Proposed Solution

Introduce a new construct (or extend the existing S3JsonItemReader) to support JSONL files, aligning with the latest Step Functions capabilities.

Example:

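from aws_cdk import aws_stepfunctions as stepfunctions

# s3_bucket is assumed to be an existing S3 bucket construct defined elsewhere in the stack.
s3_jsonl_reader = stepfunctions.S3JsonItemReader(
    bucket=s3_bucket,
    key="data.jsonl",
    format=stepfunctions.JsonFormat.JSONL  # Example of a possible new parameter (not part of the current API)
)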
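For context, the following is a rough sketch of the `ItemReader` fragment that such a construct would presumably need to render into the state machine definition. It assumes the JSONPath form of Amazon States Language, where the reader's arguments live under `Parameters`. The helper name `jsonl_item_reader`, the bucket name `my-bucket`, and the key are hypothetical, and the exact output of any eventual CDK construct may differ.

```python
import json


def jsonl_item_reader(bucket_name, key):
    """Build an ItemReader fragment for a Distributed Map reading a JSONL object.

    Hypothetical helper for illustration; not a CDK API.
    """
    return {
        "Resource": "arn:aws:states:::s3:getObject",
        # InputType tells Distributed Map how to split the object into items.
        "ReaderConfig": {"InputType": "JSONL"},
        "Parameters": {"Bucket": bucket_name, "Key": key},
    }


# Placeholder bucket and key, for illustration only.
reader = jsonl_item_reader("my-bucket", "data.jsonl")
rendered = json.dumps(reader, indent=2)
```

The only difference from a JSON reader in this sketch is the `InputType` value, which suggests that extending the existing reader with a format option and adding a dedicated construct are both workable designs.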

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.280.0

Environment details (OS name and version, etc.)

Windows 11, python

Metadata

Assignees

No one assigned

    Labels

    @aws-cdk/aws-stepfunctions: Related to AWS StepFunctions
    effort/medium: Medium work item – several days of effort
    feature-request: A feature should be added or improved.
    p2
