Hello @rhshadrach,
I've created a fix that raises a ValueError when a StringArray is created from a list of lists with inconsistent lengths or non-character elements. This makes the behavior the same for consistent and inconsistent input formats, and I have also tested it.
I'd like to hear your opinion on raising an error whenever a list of lists is passed with dtype=StringDtype, to avoid ambiguous behavior. If preferred, we could instead join the inner lists into strings automatically; I'm happy to adjust based on guidance.
Example case:
pd.array([["t", "e", "s", "t"], ["w", "o", "r", "d"]], dtype="string")
Output:
<StringArray>
['test', 'word']
Length: 2, dtype: string
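To make the "raise an error" option concrete, here is a minimal standalone sketch. It is not the actual change in this PR: `check_no_nested_sequences` is a hypothetical name, and the check is plain Python rather than pandas internals. It only illustrates rejecting any list or tuple element before the values would be converted to strings.

```python
def check_no_nested_sequences(values):
    """Hypothetical helper: reject list-of-lists input meant for a string array."""
    for i, item in enumerate(values):
        # str is itself a sequence, so only list/tuple count as "nested" here.
        if isinstance(item, (list, tuple)):
            raise ValueError(
                f"element {i} is a {type(item).__name__}, expected a scalar string"
            )
    return values


# Flat input passes through unchanged.
check_no_nested_sequences(["test", "word"])

# Nested input is rejected regardless of whether the inner lengths match.
try:
    check_no_nested_sequences([["t", "e", "s", "t"], ["w", "o", "r", "d"]])
except ValueError as exc:
    print(exc)
```

With this approach, consistent and inconsistent inner lengths fail the same way, which is the uniform behavior described above. The "join" alternative would replace the `raise` with something like `"".join(item)`.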
Thanks