Exception in loadarff with quoted nominal attributes in scipy 1.3.0 #10232

@sebp

Description

In scipy 1.3.0, loadarff was refactored (see #9854), and it now fails to parse ARFF files in which the values of a nominal attribute are quoted. I believe the cause is that csv.reader is used to parse individual data lines and strips the quotes, whereas NominalAttribute._get_nom_val preserves the quotes in the declared values, so the two never match.
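The suspected mismatch can be illustrated without scipy. This is only a sketch: `declared_values` and `data_line` are hypothetical stand-ins taken from the ARFF file and error message in this report, and the `quotechar="'"` setting is an assumption about how the data line gets read, not a copy of scipy's internals.

```python
import csv

# Hypothetical stand-ins, not scipy internals: the declared nominal values
# keep their quotes (as shown in the error message), the data line is raw.
declared_values = ("'yes'", "'no'")
data_line = "18,'no'"

# Assumption: the reader treats the single quote as its quote character,
# so the quotes around the field are stripped while parsing.
row = next(csv.reader([data_line], quotechar="'"))

print(row)                        # ['18', 'no']
print(row[1] in declared_values)  # False: 'no' != "'no'"
```

If this is what happens inside loadarff, either the declared values would need their quotes stripped or the parsed field would need them kept for the lookup to succeed.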

Reproducing code example:

from scipy.io.arff import loadarff
from io import StringIO

ARFF_FILE = """@relation SOME_DATA
@attribute age numeric
@attribute smoker {'yes', 'no'}
@data
18,'no'
24,'yes'
44,'no'
56,'no'
89,'yes'
11,'no'
"""

with StringIO(ARFF_FILE) as fin:
    data, meta = loadarff(fin)

Error message:

Traceback (most recent call last):
  File "test_parse_nominal.py", line 18, in <module>
    loadarff(fin)
  File "/miniconda3/envs/scipy13/lib/python3.7/site-packages/scipy/io/arff/arffread.py", line 738, in loadarff
    return _loadarff(ofile)
  File "/miniconda3/envs/scipy13/lib/python3.7/site-packages/scipy/io/arff/arffread.py", line 803, in _loadarff
    a = list(generator(ofile))
  File "/miniconda3/envs/scipy13/lib/python3.7/site-packages/scipy/io/arff/arffread.py", line 801, in generator
    yield tuple([attr[i].parse_data(row[i]) for i in elems])
  File "/miniconda3/envs/scipy13/lib/python3.7/site-packages/scipy/io/arff/arffread.py", line 801, in <listcomp>
    yield tuple([attr[i].parse_data(row[i]) for i in elems])
  File "/miniconda3/envs/scipy13/lib/python3.7/site-packages/scipy/io/arff/arffread.py", line 165, in parse_data
    str(self.values)))
ValueError: no value not in ("'yes'", "'no'")

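Note that the first no in the message is the value being parsed, not part of the sentence. Formatting the value with %r instead of %s might be a good idea, since the message would then read 'no' value not in (...).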
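A small sketch of the difference between the two format specifiers. The format string here is reconstructed from the error message above and is an assumption; the actual string in arffread.py may differ.

```python
value = "no"                    # the parsed field, quotes already stripped
values = ("'yes'", "'no'")      # declared nominal values, quotes preserved

# Assumed format string, reconstructed from the traceback above.
msg_s = "%s value not in %s" % (value, str(values))
# Same message with %r, which shows the value as a quoted string literal.
msg_r = "%r value not in %s" % (value, str(values))

print(msg_s)  # no value not in ("'yes'", "'no'")
print(msg_r)  # 'no' value not in ("'yes'", "'no'")
```

With %r the offending value is visibly delimited, which makes it harder to misread as part of the sentence.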

Scipy/Numpy/Python version information:

1.3.0 1.16.4 sys.version_info(major=3, minor=7, micro=3, releaselevel='final', serial=0)

Labels: defect (A clear bug or issue that prevents SciPy from being installed or used as expected), scipy.io
