read_html: fails to parse column #3606

Closed
@timmie

Description

The second column of the table at
http://code.google.com/p/pythonxy/wiki/StandardPlugins#Python_packages
is not parsed, as the following code shows:

# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>

# <codecell>

import pandas as pd

# <codecell>

url = 'http://code.google.com/p/pythonxy/wiki/StandardPlugins'

# <codecell>

dfs = pd.read_html(url, attrs={'class': 'wikitable'})

# <codecell>

dfs

# <codecell>

dfs = pd.read_html(url, flavor='lxml', attrs={'class': 'wikitable'})

# <codecell>

dfs

# <codecell>

python_core = dfs[0]

# <codecell>

python_core[:10]
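
One way to confirm that a column is really being dropped is to count the cells in each row of the raw HTML and compare that number with the column count of the DataFrame returned by `pd.read_html`. The sketch below uses only the standard library. The `CellCounter` class, the `cells_per_row` helper, and the sample table are made up for illustration and are not part of pandas. It ignores `colspan` and assumes the fragment contains no nested tables.

```python
from html.parser import HTMLParser


class CellCounter(HTMLParser):
    """Count the <td>/<th> cells in each <tr> of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one cell count per completed row
        self._current = None  # cell count of the row being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # Start counting cells for a new row.
            self._current = 0
        elif tag in ("td", "th") and self._current is not None:
            self._current += 1

    def handle_endtag(self, tag):
        if tag == "tr" and self._current is not None:
            # Row finished: record its cell count.
            self.rows.append(self._current)
            self._current = None


def cells_per_row(html):
    """Return a list with the number of cells found in each table row."""
    parser = CellCounter()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical stand-in for the wiki table; the real page is not reproduced here.
sample = """
<table class="wikitable">
  <tr><th>Package</th><th>Version</th><th>Description</th></tr>
  <tr><td>examplepkg</td><td>1.0</td><td>Made-up row</td></tr>
</table>
"""

print(cells_per_row(sample))
```

If the counts reported for the real page are larger than `python_core.shape[1]`, the parser is losing a column rather than the page lacking one.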

Labels: Bug, IO Data