@antznette1 antznette1 commented Oct 30, 2025

@rhshadrach

Adds an autofilter parameter to DataFrame.to_excel() to enable Excel autofilter
over the header row and data range when writing files.

Usage:

df.to_excel("output.xlsx", autofilter=True)

Supports xlsxwriter and openpyxl engines. When autofilter=True, applies an
autofilter over the written data range (header row through last data row).
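For reviewers unfamiliar with how the two engines take a filter range: openpyxl accepts an A1-style string through `ws.auto_filter.ref`, while xlsxwriter's `worksheet.autofilter()` accepts zero-based row and column bounds. The sketch below only illustrates the conversion between the two forms; the helper names `column_letter` and `autofilter_ref` are hypothetical and are not taken from this PR.

```python
def column_letter(col_idx: int) -> str:
    """Convert a zero-based column index to Excel letters (0 -> 'A', 26 -> 'AA')."""
    if col_idx < 0:
        raise ValueError("col_idx must be non-negative")
    letters = ""
    n = col_idx + 1  # switch to 1-based for the bijective base-26 conversion
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def autofilter_ref(first_row: int, first_col: int, last_row: int, last_col: int) -> str:
    """Build an A1-style range from zero-based, inclusive bounds (hypothetical helper)."""
    return (
        f"{column_letter(first_col)}{first_row + 1}:"
        f"{column_letter(last_col)}{last_row + 1}"
    )


# A header row plus three data rows across three columns, written at the origin.
print(autofilter_ref(0, 0, 3, 2))
```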

Closes #62651

…ge for xlsxwriter/openpyxl; keep engine_kwargs semantics intact
@antznette1
Author

Hi @rhshadrach, I hope you’re doing well.
When you get a chance, could you please review my PR? I’d really appreciate your feedback so we can move it forward.
Thanks a lot!

Member

@rhshadrach rhshadrach left a comment

Thanks for the PR! I think this is looking good. Unfortunately it seems that odfpy does not provide a reasonable way to add an autofilter. If that is the case, I think we should raise when autofilter=True is passed and this engine is used. Can you add this, along with a test?

Can you also add tests for:

  • nonzero startrow / startcol
  • A DataFrame with MultiIndex columns (also called a hierarchical index) with both merge_cells=True and merge_cells=False.
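A minimal sketch of the requested guard, assuming the check can be expressed as a standalone function. The name `check_autofilter_support` and the error message are hypothetical; the PR's actual implementation may place the check inside the ODS writer instead. The engine string `"odf"` is the name pandas uses for the odfpy-backed writer.

```python
def check_autofilter_support(engine: str, autofilter: bool) -> None:
    """Hypothetical guard: reject autofilter for engines that cannot apply it."""
    if autofilter and engine == "odf":
        raise NotImplementedError(
            "autofilter is not supported by the 'odf' engine"
        )


# Supported engines and autofilter=False pass through silently.
check_autofilter_support("openpyxl", True)
check_autofilter_support("odf", False)
```

A matching test would assert that `NotImplementedError` is raised for the `"odf"` engine with `autofilter=True`, for example with `pytest.raises`.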

autofilter: bool = False,
) -> None:
"""
Write object to an Excel sheet.
Member

Can you remove the doc decorator on L2149?

Member

Also - it looks like there was a merge issue here and code got duplicated below.

Author

@antznette1 antznette1 Nov 2, 2025

> Can you remove the doc decorator on L2149?

@rhshadrach
It’s used for docstring parameter substitution. Removing it would break the docstring formatting used across pandas methods. The autofilter parameter is already documented in the Parameters section (around line 2252):

autofilter : bool, default False
    Whether to apply autofilter to the header row.

Is there a specific reason to remove the decorator, or would you prefer a different docstring format? If you saw something else at L2149, can you point me to the exact location?

Comment on lines +518 to +519
# track bounds (1-based for openpyxl)
if min_r is None or abs_row < min_r:
Member

Are we able to determine these by startrow / startcol along with the DataFrame shape?
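To make the question concrete, here is a sketch of what a shape-based calculation could look like in the simplest case. It assumes the header occupies `n_header_rows` contiguous rows and the index occupies `n_index_cols` contiguous columns; both parameter names and the function `shape_based_bounds` are hypothetical, and cases such as MultiIndex columns may need extra handling that is not covered here.

```python
from typing import Tuple


def shape_based_bounds(
    n_rows: int,
    n_cols: int,
    startrow: int = 0,
    startcol: int = 0,
    n_header_rows: int = 1,
    n_index_cols: int = 1,
) -> Tuple[int, int, int, int]:
    """Return zero-based, inclusive (first_row, first_col, last_row, last_col).

    Assumes a simple layout: header rows first, then data rows, with index
    columns to the left of the data columns.
    """
    first_row = startrow
    first_col = startcol
    last_row = startrow + n_header_rows + n_rows - 1
    last_col = startcol + n_index_cols + n_cols - 1
    return first_row, first_col, last_row, last_col


# A 3x2 frame with a one-row header and a one-column index, offset by (2, 1).
print(shape_based_bounds(3, 2, startrow=2, startcol=1))
```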

Author

@antznette1 antznette1 Nov 2, 2025

> Are we able to determine these by startrow / startcol along with the DataFrame shape?

@rhshadrach
I looked into calculating the autofilter bounds from startrow/startcol and DataFrame shape, but _write_cells only receives the cells list, startrow, and startcol—not the DataFrame shape, header/index flags, or MultiIndex structure.

To use shape-based calculation, we'd need to:

  • Pass additional parameters (DataFrame shape, header/index flags, MultiIndex levels) through the call chain
  • Handle edge cases like filtered columns, custom formatting that changes row/col counts, and MultiIndex headers with varying depths

The current approach tracks bounds while iterating through the cells we're already writing:

  • Matches what's actually written
  • No API changes needed
  • Minimal overhead
  • Handles edge cases (MultiIndex, merged cells, conditional formatting)

If you prefer a shape-based calculation, I can add those parameters to _write_cells, though it will increase complexity. The current tracking approach is straightforward and accurate.
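For comparison, a minimal sketch of the tracking approach described above, using 1-based coordinates as the openpyxl code comment in the diff indicates. The `Cell` dataclass and the `track_bounds` function are simplified stand-ins introduced for illustration, not the PR's code; the variable names `min_r` and `abs_row` follow the snippet under review.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Cell:
    """Simplified stand-in for a written cell; positions are zero-based."""
    row: int
    col: int


def track_bounds(
    cells: Iterable[Cell], startrow: int = 0, startcol: int = 0
) -> Optional[Tuple[int, int, int, int]]:
    """Return 1-based (min_r, min_c, max_r, max_c), or None if no cells were seen."""
    min_r = min_c = max_r = max_c = None
    for cell in cells:
        # Absolute 1-based position of the cell on the sheet.
        abs_row = startrow + cell.row + 1
        abs_col = startcol + cell.col + 1
        if min_r is None or abs_row < min_r:
            min_r = abs_row
        if max_r is None or abs_row > max_r:
            max_r = abs_row
        if min_c is None or abs_col < min_c:
            min_c = abs_col
        if max_c is None or abs_col > max_c:
            max_c = abs_col
    if min_r is None:
        return None
    return min_r, min_c, max_r, max_c


cells = [Cell(0, 0), Cell(0, 1), Cell(1, 0), Cell(1, 1), Cell(2, 1)]
print(track_bounds(cells, startrow=2, startcol=1))
```

The trade-off in this sketch is that the bounds come from whatever cells are iterated, so no extra parameters are needed, at the cost of four comparisons per cell.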

@antznette1
Author

> Thanks for the PR! I think this is looking good. Unfortunately it seems that odfpy does not provide a reasonable way to add an autofilter. If that is the case, I think we should raise when autofilter=True is passed and this engine is used. Can you add this and a test.
>
> Can you also add tests for:
>
>   • nonzero startrow / startcol
>   • A DataFrame with MultiIndex columns (also called a hierarchical index) with both merge_cells=True and merge_cells=False.

Okay, I'm working on them now.

antznette1 and others added 3 commits November 2, 2025 00:31
- Remove duplicate to_excel function code in generic.py
- Add NotImplementedError for odfpy engine when autofilter=True
- Remove broad exception handling from autofilter implementations
- Add comprehensive tests for nonzero startrow/startcol
- Add tests for MultiIndex columns with merge_cells=True and False
- Improve tests to verify each column has autofilter
- Remove redundant test_to_excel test
- Remove redundant pytest.importorskip from test functions
@antznette1 antznette1 requested a review from rhshadrach November 3, 2025 12:24
Development

Successfully merging this pull request may close these issues.

ENH: : Add header Autofilter and optional bold via engine_kwargs in to_excel

2 participants