__repr__ wrong column alignment with non-ascii characters #1620

@manuteleco

Hi,

it seems that when a DataFrame, a Series, and possibly other objects contain non-ASCII characters inside non-unicode (byte) strings, the __repr__ method does not align the values in their columns correctly. Unicode strings are not affected. I'm using pandas '0.8.1.dev-70c3deb' on a Linux box.

Sample code:

# -*- coding: utf-8 -*-

from pandas import Series, DataFrame

df1 = DataFrame([["aaaa", 1], ["bbbb", 2]])
df2 = DataFrame([["aaää", 1], ["bbbb", 2]])
df3 = DataFrame([[u"aaää", 1], ["bbbb", 2]])

# Comparison between "similar dataframes"
print df1
print
print df2
print
print df3
print

# Other cases:
s1 = Series(["ä", "bbbb", "ßß"])
print
print s1

This results in:

      0  1
0  aaaa  1
1  bbbb  2

        0  1
0  aaää  1
1    bbbb  2

      0  1
0  aaää  1
1  bbbb  2


0      ä
1    bbbb
2    ßß
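
A possible explanation, offered as an assumption and not confirmed against the pandas source: the padding looks as if it were computed from the UTF-8 byte length of each value instead of its character count. The sketch below (Python 3 syntax, unlike the Python 2 sample above) reproduces the same offsets for the values used in this report. The helpers `pad_by_bytes` and `pad_by_chars` are hypothetical names made up for illustration; they are not pandas functions.

```python
# -*- coding: utf-8 -*-
# Hypothetical sketch: right-justify values using two different width measures.
# "\u00e4" is "ä" and "\u00df" is "ß"; escapes are used to keep the file encoding-proof.


def pad_by_bytes(values):
    """Right-justify using the UTF-8 byte length (assumed cause of the bug)."""
    encoded = [v.encode("utf-8") for v in values]
    width = max(len(b) for b in encoded)
    # Each non-ASCII character here takes 2 bytes, so it "uses up" extra padding.
    return [" " * (width - len(b)) + v for v, b in zip(values, encoded)]


def pad_by_chars(values):
    """Right-justify using the number of characters (the expected behaviour)."""
    width = max(len(v) for v in values)
    return [v.rjust(width) for v in values]


frame_column = ["aa\u00e4\u00e4", "bbbb"]
series_values = ["\u00e4", "bbbb", "\u00df\u00df"]

for label, values in (("frame column", frame_column), ("series", series_values)):
    print(label, "padded by bytes:", pad_by_bytes(values))
    print(label, "padded by chars:", pad_by_chars(values))
```

With byte-based widths, "bbbb" gets two extra leading spaces next to "aaää", and "ä" ends up only two columns to the left of "bbbb" instead of three, which matches the output shown above.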

Thanks and regards.
