Skip to content

use zlib-rs for gzip compression in rust code #15417

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 15, 2025

Conversation

folkertdev
Copy link
Contributor

What does this PR try to resolve?

This PR uses zlib-rs via the flate2 crate. It is used for (de)compressing gzip files (e.g. in cargo package).

Using zlib-rs is significantly faster, and produces output of roughly similar size. For the windows-bindgen crate
(a very large crate), the speedup is over 60%, or about 2 seconds.

> time cargo package -p windows-bindgen --no-verify
   Packaging windows-bindgen v0.61.0 (/home/folkertdev/rust/windows-rs/crates/libs/bindgen)
    Updating crates.io index
    Packaged 76 files, 31.2MiB (8.2MiB compressed)

________________________________________________________
Executed in    3.30 secs    fish           external
   usr time    3.19 secs  424.00 micros    3.19 secs
   sys time    0.05 secs   61.00 micros    0.05 secs

> time ~/rust/cargo/target/release/cargo package -p windows-bindgen --no-verify
   Packaging windows-bindgen v0.61.0 (/home/folkertdev/rust/windows-rs/crates/libs/bindgen)
    Updating crates.io index
    Packaged 76 files, 31.2MiB (8.3MiB compressed)

________________________________________________________
Executed in    1.25 secs    fish           external
   usr time    1.15 secs    0.00 micros    1.15 secs
   sys time    0.04 secs  589.00 micros    0.04 secs

How should we test and review this PR?

Generally CI/the test suite should handle correctness.

Something to look out for is zlib-rs producing larger binaries than before. So far we're seeing output sizes that are roughly the same (sometimes a bit better, sometimes a bit worse) as the status quo.

We've not specifically looked at decompression yet, mostly because we could not come up with a good command to benchmark. In general zlib-rs is much faster than stock zlib there too.

Additional information

For the time being, cargo still depends on stock zlib via e.g. curl and git. As far as I know it is the intention to eventually move away from these C dependencies, at which point the dependency on stock zlib also disappears.

Various C dependencies (curl, git) still rely on `libz-sys`, so for the time being a system libc is still required. But, using zlib-rs via flate2 is straightforward, and gives good speedup for `cargo package`. It is also extremely portable, because it's just rust code.
@rustbot
Copy link
Collaborator

rustbot commented Apr 10, 2025

r? @epage

rustbot has assigned @epage.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-documenting-cargo-itself Area: Cargo's documentation S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 10, 2025
@joshtriplett joshtriplett added I-nominated-to-discuss To be discussed during issue triage on the next Cargo team meeting and removed A-documenting-cargo-itself Area: Cargo's documentation labels Apr 10, 2025
@joshtriplett
Copy link
Member

Thanks, @folkertdev!

Nominating for next week's Cargo meeting.

@ehuss ehuss moved this to Nominated in Cargo status tracker Apr 15, 2025
@ehuss ehuss removed the I-nominated-to-discuss To be discussed during issue triage on the next Cargo team meeting label Apr 15, 2025
@joshtriplett joshtriplett added the relnotes Release-note worthy label Apr 15, 2025
@epage
Copy link
Contributor

epage commented Apr 15, 2025

We talked about how we want to approach these kind of changes in today's Cargo team meeting.

For gitoxide, its still under active development and each feature needs a lot of vetting before adoption, so we're taking an incremental, feature-at-at-ime, approach with an opt-in and opt-out windows to ensure as smooth of a path as possible.

For something like zlib-rs, it feels smaller and more mature. We likely don't need a a multi-release transition window. We do want it merged at the beginning of a release window to give as much runtime within the release as possible. We also appreciate how easy it would be to back this out during beta if need be. The main risk we see with zlib-rs is platform support.

For our -sys crates, we allow people building cargo to decide whether to use the vendored or system version of the package. The question came up if we should still support those cases in addition to the pure-Rust implementation. I don't feel that that model needs to be applied when adopting a pure-Rust variant of C code. We could use C implementations of json, blake3, toml, base64, etc and the question didn't come up of whether that should be made configurable.

As this is still early in the release Window, we can move forward with this.

@epage epage added this pull request to the merge queue Apr 15, 2025
Merged via the queue into rust-lang:master with commit d811228 Apr 15, 2025
25 checks passed
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 17, 2025
Update cargo

4 commits in 864f74d4eadcaea3eeda37a2e7f4d34de233d51e..d811228b14ae2707323f37346aee3f4147e247e6
2025-04-11 20:37:27 +0000 to 2025-04-15 15:18:42 +0000
- use `zlib-rs` for gzip compression in rust code (rust-lang/cargo#15417)
- test(rustfix): Use `snapbox` for snapshot testing (rust-lang/cargo#15429)
- chore(deps): update rust crate gix to 0.71.0 [security] (rust-lang/cargo#15391)
- Make sure search paths inside OUT_DIR precede external paths (rust-lang/cargo#15221)

Also,

* The license exception of sha1_smol with BSD-3-Clause is no longer needed, as `gix-*` doesn't depend on it.
* Cargo depends on zlib-rs, which is distributed under Zlib license

r? ghost
@rustbot rustbot added this to the 1.88.0 milestone Apr 17, 2025
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Apr 19, 2025
Update cargo

4 commits in 864f74d4eadcaea3eeda37a2e7f4d34de233d51e..d811228b14ae2707323f37346aee3f4147e247e6
2025-04-11 20:37:27 +0000 to 2025-04-15 15:18:42 +0000
- use `zlib-rs` for gzip compression in rust code (rust-lang/cargo#15417)
- test(rustfix): Use `snapbox` for snapshot testing (rust-lang/cargo#15429)
- chore(deps): update rust crate gix to 0.71.0 [security] (rust-lang/cargo#15391)
- Make sure search paths inside OUT_DIR precede external paths (rust-lang/cargo#15221)

Also,

* The license exception of sha1_smol with BSD-3-Clause is no longer needed, as `gix-*` doesn't depend on it.
* Cargo depends on zlib-rs, which is distributed under Zlib license

r? ghost
@reneleonhardt
Copy link

Does this mean a final decision has been made or are alternatives like zstd still on the table? #2526

@weihanglo
Copy link
Member

This was just a performance improvement. Had no discussion around #2526 when it got merged.

wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Jun 29, 2025
Pkgsrc changes:
 * Adjust patches to adapt to upstream changes and new versions.
 * associated checksums

Upstream changes relative to 1.87.0:

Version 1.88.0 (2025-06-26)
==========================

Language
--------
- [Stabilize `#![feature(let_chains)]` in the 2024 edition.]
  (rust-lang/rust#132833)
  This feature allows `&&`-chaining `let` statements inside `if`
  and `while`, allowing intermixture with boolean expressions. The
  patterns inside the `let` sub-expressions can be irrefutable or
  refutable.
- [Stabilize `#![feature(naked_functions)]`.]
  (rust-lang/rust#134213)
  Naked functions allow writing functions with no compiler-generated
  epilogue and prologue, allowing full control over the generated
  assembly for a particular function.
- [Stabilize `#![feature(cfg_boolean_literals)]`.]
  (rust-lang/rust#138632)
  This allows using boolean literals as `cfg` predicates, e.g.
  `#[cfg(true)]` and `#[cfg(false)]`.
- [Fully de-stabilize the `#[bench]` attribute]
  (rust-lang/rust#134273). Usage of `#[bench]`
  without `#![feature(custom_test_frameworks)]` already triggered
  a deny-by-default future-incompatibility lint since Rust 1.77,
  but will now become a hard error.
- [Add warn-by-default `dangerous_implicit_autorefs` lint against
  implicit autoref of raw pointer dereference.]
  (rust-lang/rust#123239) The
  lint [will be bumped to deny-by-default]
  (rust-lang/rust#141661) in the next
  version of Rust.
- [Add `invalid_null_arguments` lint to prevent invalid usage of
  null pointers.] (rust-lang/rust#119220)
  This lint is uplifted from `clippy::invalid_null_ptr_usage`.
- [Change trait impl candidate preference for builtin impls and
  trivial where-clauses.] (rust-lang/rust#138176)
- [Check types of generic const parameter defaults]
  (rust-lang/rust#139646)

Compiler
--------
- [Stabilize `-Cdwarf-version` for selecting the version of DWARF
  debug information to generate.]
  (rust-lang/rust#136926)

Platform Support
----------------
- [Demote `i686-pc-windows-gnu` to Tier 2.]
  (https://blog.rust-lang.org/2025/05/26/demoting-i686-pc-windows-gnu/)

Refer to Rust's [platform support page][platform-support-doc]
for more information on Rust's tiered platform support.

[platform-support-doc]: https://doc.rust-lang.org/rustc/platform-support.html

Libraries
---------
- [Remove backticks from `#[should_panic]` test failure message.]
  (rust-lang/rust#136160)
- [Guarantee that `[T; N]::from_fn` is generated in order of
  increasing indices.] (rust-lang/rust#139099),
  for those passing it a stateful closure.
- [The libtest flag `--nocapture` is deprecated in favor of the
  more consistent `--no-capture` flag.]
  (rust-lang/rust#139224)
- [Guarantee that `{float}::NAN` is a quiet NaN.]
  (rust-lang/rust#139483)

Stabilized APIs
---------------

- [`Cell::update`]
  (https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#method.update)
- [`impl Default for *const T`]
  (https://doc.rust-lang.org/nightly/std/primitive.pointer.html#impl-Default-for-*const+T)
- [`impl Default for *mut T`]
  (https://doc.rust-lang.org/nightly/std/primitive.pointer.html#impl-Default-for-*mut+T)
- [`HashMap::extract_if`]
  (https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html#method.extract_if)
- [`HashSet::extract_if`]
  (https://doc.rust-lang.org/stable/std/collections/struct.HashSet.html#method.extract_if)
- [`proc_macro::Span::line`]
  (https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.line)
- [`proc_macro::Span::column`]
  (https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.column)
- [`proc_macro::Span::start`]
  (https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.start)
- [`proc_macro::Span::end`]
  (https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.end)
- [`proc_macro::Span::file`]
  (https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.file)
- [`proc_macro::Span::local_file`]
  (https://doc.rust-lang.org/stable/proc_macro/struct.Span.html#method.local_file)

These previously stable APIs are now stable in const contexts:

- [`NonNull<T>::replace`]
  (https://doc.rust-lang.org/stable/std/ptr/struct.NonNull.html#method.replace)
- [`<*mut T>::replace`]
  (https://doc.rust-lang.org/stable/std/primitive.pointer.html#method.replace)
- [`std::ptr::swap_nonoverlapping`]
  (rust-lang/rust#137280)
- [`Cell::{replace, get, get_mut, from_mut, as_slice_of_cells}`]
  (rust-lang/rust#137928)

Cargo
-----
- [Stabilize automatic garbage collection.]
  (rust-lang/cargo#14287)
- [use `zlib-rs` for gzip compression in rust code]
  (rust-lang/cargo#15417)

Rustdoc
-----
- [Doctests can be ignored based on target names using `ignore-*` attributes.]
  (rust-lang/rust#137096)
- [Stabilize the `--test-runtool` and `--test-runtool-arg` CLI
  options to specify a program (like qemu) and its arguments to run
  a doctest.] (rust-lang/rust#137096)

Compatibility Notes
-------------------
- [Finish changing the internal representation of pasted tokens]
  (rust-lang/rust#124141). Certain invalid
  declarative macros that were previously accepted in obscure
  circumstances are now correctly rejected by the compiler. Use of
  a `tt` fragment specifier can often fix these macros.
- [Fully de-stabilize the `#[bench]` attribute]
  (rust-lang/rust#134273). Usage of `#[bench]`
  without `#![feature(custom_test_frameworks)]` already triggered
  a deny-by-default future-incompatibility lint since Rust 1.77,
  but will now become a hard error.
- [Fix borrow checking some always-true patterns.]
  (rust-lang/rust#139042) The borrow checker
  was overly permissive in some cases, allowing programs that
  shouldn't have compiled.
- [Update the minimum external LLVM to 19.]
  (rust-lang/rust#139275)
- [Make it a hard error to use a vector type with a non-Rust ABI
  without enabling the required target feature.]
  (rust-lang/rust#139309)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Release-note worthy S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

7 participants