Maintain a list of reverse dependencies of sklearn #6
There may be a way to use pip options to get more information about where dependencies come from during the install, but I haven't found anything convincing in less than 5 minutes. If you have a working environment with sklearn installed, you may be able to use pipdeptree to figure out which package requires sklearn, something like:
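One possible invocation (a sketch rather than the original commenter's exact command; it assumes a pipdeptree version that supports the --reverse and --packages options):

# show the installed packages that declare a dependency on sklearn
pipdeptree --reverse --packages sklearn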
This does not seem like a very workable approach: there are likely many thousands of packages depending on sklearn. My hope so far is that people can identify which packages still depend on sklearn and open an issue in the relevant repository, so that the situation gradually improves.
Here's a list of 1,605 PyPI packages that depend on sklearn, taken from the database dump from https://github.com/sethmlarson/pypi-data via:

sqlite3 'pypi.db' 'SELECT package_name FROM deps WHERE dep_name LIKE "sklearn" GROUP BY package_name;' > deps.txt
Nice, thanks a lot! I guess it could also be useful to have it ordered by number of downloads (descending), which seems doable if I read the project README correctly. This would make it possible to open issues/PRs in the most downloaded repos first. Note that there are likely some caveats with this kind of thing.
sqlite3 'pypi.db' 'SELECT DISTINCT downloads, package_name FROM deps INNER JOIN packages ON deps.package_name = packages.name WHERE dep_name LIKE "sklearn" ORDER BY downloads DESC;' > deps-by-downloads.txt

Here's the top 50 (it tails off quickly).
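The cutoff can also be applied directly in the query (a sketch, not a command from the thread; it relies only on SQLite's standard LIMIT clause against the same schema):

sqlite3 'pypi.db' 'SELECT DISTINCT downloads, package_name FROM deps INNER JOIN packages ON deps.package_name = packages.name WHERE dep_name LIKE "sklearn" ORDER BY downloads DESC LIMIT 50;'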
Also as a side comment, it is worth putting these numbers next to the downloads of sklearn itself: the sklearn package sees ~332k downloads per day. Summing the number of downloads of the top 50 packages depending on sklearn gives a point of comparison.
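That sum could be computed from the same database dump (a sketch, not a command from the thread; it assumes the pypi.db schema used in the queries above):

sqlite3 'pypi.db' 'SELECT SUM(downloads) FROM (SELECT DISTINCT downloads, package_name FROM deps INNER JOIN packages ON deps.package_name = packages.name WHERE dep_name LIKE "sklearn" ORDER BY downloads DESC LIMIT 50);'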
Closing this one: the brownout period ends in a few days (December 1st) and we are not planning to do anything more about this.
We have just had our first container build broken by this error. The containers' package lists are very large (hundreds of packages, mostly in the form of secondary and tertiary dependencies), with many data scientists contributing their desired packages to the installation list. Yet pip is very uninformative about the source of the problem, failing to show which package has the deprecated sklearn in its requirements. Can you perhaps start a blacklist of primary packages that still require sklearn and let GitHub users maintain it?
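For anyone hitting the same wall, one way to narrow down the culprit in an environment that did install (a sketch, not from the thread; it assumes a POSIX shell, an activated Python environment, and standard .dist-info metadata) is to scan the installed packages' metadata for a declared dependency on sklearn:

# locate the active environment's site-packages directory
site_packages=$(python -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")
# list the METADATA files (one per installed package) that require sklearn
grep -l '^Requires-Dist: sklearn' "$site_packages"/*.dist-info/METADATA

The pattern may also match packages whose dependency name merely starts with sklearn, so the matches are worth double-checking, and this only covers packages that actually got installed before the build broke.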