42
20
roganartu
> Keeping the dependency information in a database, which is queried during the resolution process, allows us to choose dependencies using criteria specified by the developer instead of merely importing the latest possible versions, as pip's backtracking algorithm does. You can specify quality criteria depending on the application's traits and environment. For instance, applications deployed to production environments must be secure, so it is important that dependencies do not introduce vulnerabilities. When a data scientist trains a machine learning model in an isolated environment, however, it is acceptable to use dependency versions that are vulnerable but offer a performance gain, thus saving time and resources.

This seems like a really bad idea to me. I could understand and perhaps get behind the idea that you might use something like this to find the optimal version of a package to use in a given project, but unexpected differences between your development environment and production are a common source of outages.

It also requires using a different package manager called Thamos: https://thoth-station.ninja/docs/developers/thamos/. This tool then outputs requirements files compatible with Pipenv, pip, or pip-tools (though notably not Poetry).

That being said, all of the examples and config seem very centered around ML use cases, with the Thamos config accepting settings for OS, CPU, and CUDA versions. Is variance in performance between otherwise-compatible versions of ML packages really that big a problem?

deycallmeajay
Yeah, this sounds like a terrible idea. The current goal is reproducible, hermetic builds. Adding more complexity makes it much harder to get the same artifact build after build, and it gives attackers another avenue for supply chain injection.
benjamir
Yeah: Use even more complexity to build software. How about taming software? (duckandcover)
akx
Um...

> The Python Packaging Authority (PyPA), along with the Python community, is working on an endpoint to provide the dependency information.

So what is the `requires_dist` key in e.g. https://pypi.org/pypi/Django/3.2/json ?

(My experimental dependency locking tool Pipimi (https://github.com/akx/pipimi/blob/f055b0c0/pipimi.py#L43-L5...) uses that endpoint.)
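
For reference, here is a minimal sketch of reading that key from the PyPI JSON endpoint linked above. The helper names (`pypi_json_url`, `extract_requires_dist`, `fetch_requires_dist`) are made up for illustration and are not taken from Pipimi or Thoth; the only assumptions are the URL pattern from the link above and that the payload nests `requires_dist` under `info`, where it can also be null.

```python
import json
from urllib.request import urlopen


def pypi_json_url(project, version):
    # URL pattern taken from the Django 3.2 link above.
    return f"https://pypi.org/pypi/{project}/{version}/json"


def extract_requires_dist(payload):
    # `requires_dist` sits under "info"; it may be missing or null when a
    # release declares no dependencies, so fall back to an empty list.
    return payload.get("info", {}).get("requires_dist") or []


def fetch_requires_dist(project, version):
    # Hypothetical helper: performs one HTTP GET and parses the JSON body.
    with urlopen(pypi_json_url(project, version)) as resp:
        return extract_requires_dist(json.load(resp))


if __name__ == "__main__":
    # Prints one requirement specifier per line (needs network access).
    for requirement in fetch_requires_dist("Django", "3.2"):
        print(requirement)
```

Each entry is a requirement string declared by that release, so this only gives the direct dependencies of one version, not a resolved set.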

dgan
Hm, bikeshedding, but "thoth" is a horrible name for anyone who isn't a proficient English speaker. I honestly can't pronounce it.
zuj
Why, I mean, why? I am serious, why?
TotallyNotOla
I see it's time to update xkcd 1987. https://xkcd.com/1987/