-
Notifications
You must be signed in to change notification settings - Fork 12
Open
Description
Describe the enhancement requested
The Arrow IPC spec supports custom_metadata (key-value pairs) on RecordBatch messages via the Message.custom_metadata field. The Swift FlatBuffers generated code (org_apache_arrow_flatbuf_Message) already has full read/write support for this field, but the Swift RecordBatch model, reader, and writer all ignore it.
Proposed changes
- Add a
customMetadata: [String: String]property toRecordBatch(defaulting to[:]for backward compatibility) - Add
addMetadatabuilder methods onRecordBatch.Builder - Serialize metadata into the FlatBuffers
Messagewrapper when writing (covers file, streaming, and Flight paths viatoMessage) - Deserialize metadata from the FlatBuffers
Messagewrapper when reading (coversreadFile,readStreaming, andfromMessage)
Design notes
[String: String]matches the pragmatic API used by pyarrow and most Arrow consumers. Duplicate keys from other implementations are deduplicated (last wins). Dictionary order is not preserved across round-trips.- Flight support comes for free since
toMessage(batch:)delegates towriteRecordBatch. ArrowTable.from(recordBatches:)does not propagate per-batch metadata, which is expected sinceArrowTableis a table-level abstraction.
Files affected
| File | Change |
|---|---|
Sources/Arrow/ArrowTable.swift |
Add customMetadata property and builder methods |
Sources/Arrow/ArrowWriter.swift |
Serialize metadata into FlatBuffers Message |
Sources/Arrow/ArrowReader.swift |
Deserialize metadata from FlatBuffers Message |
Tests/ArrowTests/IPCTests.swift |
Round-trip, builder API, cross-language, and multi-batch tests |
I have an implementation ready and will open a PR referencing this issue.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels