Skip to content

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Mar 29, 2014

Closes #5300.

Currently drop raises ValueError when non-existing label is passed. I think it is useful if drop has an option to suppress error and drop existing labels only.

For example, I sometimes process lots of files which has slightly different columns, and want to drop if data has unnecessary columns. Previously, I have to prepare different drop arguments checking columns existing each data.

@jreback
Copy link
Contributor

jreback commented Mar 29, 2014

this could be added on after #6599
@hayd

@@ -6762,6 +6762,26 @@ def test_drop_names(self):
self.assertEqual(obj.columns.name, 'second')
self.assertEqual(list(df.columns), ['d', 'e', 'f'])

self.assertRaises(ValueError, df.drop, ['g'])
self.assertRaises(ValueError, df.drop, ['g'], 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's give at least one of these an errors='raise' argument (and add it to an existing test that doesn't raise) just for completeness.

@jtratner
Copy link
Contributor

This all looks fine to me - thanks for the patch.

We might want to reduce the number of test cases (just because we don't necessarily need to duplicate), but aside from that, quite good. - somebody else want to take a look too? I may be a bit rusty :P

@hayd
Copy link
Contributor

hayd commented Mar 30, 2014

Looks all good to me. But let's wait for #6599. :)

@jreback jreback added this to the 0.14.0 milestone Mar 30, 2014
@jreback
Copy link
Contributor

jreback commented Mar 30, 2014

yep....so @hayd will revisit after that #6599 merge

@jreback
Copy link
Contributor

jreback commented Jan 25, 2015

@sinhrks this idea is ok, can you rebase

@sinhrks
Copy link
Member Author

sinhrks commented Jan 26, 2015

@jreback Yes, rebased.

@@ -2226,7 +2228,9 @@ def drop(self, labels):
indexer = self.get_indexer(labels)
mask = indexer == -1
if mask.any():
raise ValueError('labels %s not contained in axis' % labels[mask])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, shouldn't this raise a KeyError? @jorisvandenbossche @shoyer to be consistent with say .loc and other indexers? @hayd

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of a gray area to me. We're not doing actual indexing here. I would probably stick with ValueError, especially to remain consistent with the existing implementation.

@sinhrks sinhrks force-pushed the drop branch 2 times, most recently from 44421ca to 58b4a4d Compare February 15, 2015 03:29
@jreback jreback modified the milestones: 0.16.1, 0.16.0 Mar 5, 2015
@jreback
Copy link
Contributor

jreback commented Mar 5, 2015

have to think about this

@sinhrks sinhrks force-pushed the drop branch 2 times, most recently from fc6794d to 3fdca30 Compare April 1, 2015 12:05
@jreback
Copy link
Contributor

jreback commented Apr 4, 2015

@sinhrks can you rebase this.

@sinhrks
Copy link
Member Author

sinhrks commented Apr 4, 2015

Sure, rebased.

jreback added a commit that referenced this pull request Apr 8, 2015
ENH: drop function now has errors keyword for non-existing column handling
@jreback jreback merged commit a4ae0cf into pandas-dev:master Apr 8, 2015
@jreback
Copy link
Contributor

jreback commented Apr 8, 2015

@sinhrks thanks!

@sinhrks sinhrks deleted the drop branch April 11, 2015 13:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More tolerant dataframe drop method for multiple columns deletion
5 participants