This is great, and a definite step up from the current requirements.txt. Why are the requirements like function calls though? Why can't they just keep a simple text format that's more human readable and editable, rather than something that just looks like a bunch of code? I think it's better in Ruby because you can omit parentheses and it looks cleaner, but my preference above both is package.json. Cargo is also really good. So in my eyes, moving to something like this is somewhat of a step backwards from requirements.txt, but taken together it looks like it will be better overall.
There are arguments for and against, but in general I think a configuration file should be human readable and editable, as well as easily understood by an IDE without having to run an interpreter. So something like YAML or TOML, or even a simple INI would be better than the function calls in the Pipfile. However, the lock file isn't meant to be edited by hand or really ever looked at, so it being in JSON or something less human readable is fine.
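To make the objection concrete, here's a rough sketch of what the function-call style implies (the DSL names below are illustrative stand-ins, not the actual pipfile API): the only general way to "parse" it is to execute it.

```python
# The proposed Pipfile is executable Python rather than plain data, so reading
# it means running it. The function names are illustrative, not the real
# pipfile API; the stub definitions just make the point that the file is code.
requirements = {"source": None, "default": {}, "dev": {}}

def source(url, verify_ssl=True):
    requirements["source"] = url

def package(name, version="*"):
    requirements["default"][name] = version

def dev_package(name, version="*"):
    requirements["dev"][name] = version

# --- what the Pipfile itself might look like ---
source('https://pypi.python.org/simple')
package('requests')
package('Django', '==1.6')
dev_package('nose')

print(requirements)
```

A TOML or INI equivalent would carry exactly the same information as inert key/value pairs that any editor or IDE could read without an interpreter.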
Also, why doesn't pip just, by default, look for requirements.txt or the Pipfile? It's silly having to type pip install -r requirements.txt, and I will also find it silly having to type pip -p. It's how NPM, Bundler, and many other package managers work; why do I need to pass a flag to install requirements?
The function calls in the Pipfile make me suspect it's going to be read and eval-ed within Python. And that makes me shudder.
Many packages have abused the executable nature of setup.py files by importing obscure packages or adding otherwise fragile logic. I would hate to see Pipfiles go the same way.
Strong +1 to making it a declarative format like YAML or TOML instead of something executable.
That's how Bundler does it in the Ruby world, and I haven't really seen any logic in Gemfiles. Pretty much everyone sticks to the DSL. Of course someone somewhere has done that, but I personally have never seen it.
For the non-install case, like when you're running a package index, having to eval the dependency specification is horrible.
This is also why wheels (the new python package format) use a static file instead of setup.py. The Python ecosystem has been trying to get off of "just eval setup.py" for years.
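To make that concrete: a wheel's dependency metadata is a static file inside the archive that anything can read without running code. A small example (the wheel filename is just a placeholder):

```python
# Print the declared dependencies of a wheel by reading its static METADATA
# file; no setup.py gets executed. The filename is only an example.
import zipfile

with zipfile.ZipFile("requests-2.12.4-py2.py3-none-any.whl") as whl:
    metadata_name = next(n for n in whl.namelist() if n.endswith(".dist-info/METADATA"))
    for line in whl.read(metadata_name).decode("utf-8").splitlines():
        if line.startswith("Requires-Dist:"):
            print(line)
```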
I'm not sure why avoiding arbitrary code execution during install is important when you're going to do arbitrary code execution shortly after. What else is the purpose of installing a package?
My concern is more about fragile code than malicious code. I've had many experiences of packages where "python setup.py install" wouldn't work locally, and then I looked inside setup.py, and was confronted with a wall of conditional statements or dependencies on system libraries. Dependency declaration can, and should, be much simpler and more bulletproof than that.
> why do I need to call a flag to install requirements?
Because pip is used extensively as a manual command to install something - you don't expect something like DIR or ls to automatically get its list of files to list from a DIR.txt or a ls.txt file, do you?
Well ls without arguments lists the most relevant thing, i.e. the files in the current directory.
So it would follow that tools do their thing based on the current directory, as specified by package.json/Makefile/Gulpfile/Dockerfile/Vagrantfile et al.
One evening, Master Foo and Nubi attended a gathering of programmers who had met to learn from each other. One of the programmers asked Nubi to what school he and his master belonged. Upon being told they were followers of the Great Way of Unix, the programmer grew scornful.
“The command-line tools of Unix are crude and backward,” he scoffed. “Modern, properly designed operating systems do everything through a graphical user interface.”
Master Foo said nothing, but pointed at the moon. A nearby dog began to bark at the master's hand.
“I don't understand you!” said the programmer.
Master Foo remained silent, and pointed at an image of the Buddha. Then he pointed at a window.
“What are you trying to tell me?” asked the programmer.
Master Foo pointed at the programmer's head. Then he pointed at a rock.
“Why can't you make yourself clear?” demanded the programmer.
Master Foo frowned thoughtfully, tapped the programmer twice on the nose, and dropped him in a nearby trashcan.
As the programmer was attempting to extricate himself from the garbage, the dog wandered over and piddled on him.
At that moment, the programmer achieved enlightenment.
--Master Foo Discourses on the Graphical User Interface
What if one intended to type "pip install pillow" but got as far as "pip install " and unintentionally hit <Return> ?
Oh but there's a requirements.txt file sitting right there and "awwww heck now pip is reading it in and installing everything in there.. all I wanted was this new package not listed in that requirements.txt file..."
Sure that happens, but is it worth the hassle of having to type -p or -r requirements.txt every time you want to install the requirements for your project? Following that dogma without regard for the reality of what actually works better is silly. It's not always the case that doing what everyone else does is good, but it seems the consensus is to have sensible defaults.
This is a similar argument to what happened when Babel in JS land decided to not have anything happen by default and explicitly require you to decide your configuration. It's silly to force users to do something that could be solved by a sensible default, and this is a clear case where a sensible default is appropriate.
I agree that explicit is better than implicit. Especially in this case. You can create an alias pipr='pip install -r requirements.txt' or similar if you find that fits your needs better.
Especially for newcomers, an implicit "use Pipfile if exists, else requirements.txt" is just unnecessary magic.
`pip install` should only be for the default way that projects define dependencies, and only where that default is the one way. Were we to have both a Pipfile and requirements.txt, and maybe a third and a fourth option, I don't think a bare `pip install` should work.
I also don't think it's ever a good idea to have multiple options for something like that.
It's not bike shedding, it's a valid concern: either I type it infrequently enough that I will probably forget, meaning I'll have to either remember after I type it incorrectly the first time or look it up on Google, or I type it frequently enough that it actually does affect my productivity. In my eyes, having no default and forcing users to type in something like -r requirements.txt or -p is just forcing users to do something that can be sensibly defaulted, and wasting their time. It's like an elevator: are you expected to press the close-door button every time, or press an open-door button when you arrive at your floor? No, because the elevator manufacturer knows that getting in an elevator and pressing a floor button means you want the door to close and then open for you. The same can be said about pip install: if a requirements file (requirements.txt, Pipfile) exists and the user types pip install, it's simple enough to infer that the user wants to install based on that file. What's wrong with that inference? Dogma?
An elevator door must always be closed prior to going up or down in a building.
Pip will sometimes install from Pipfile, sometimes from requirements, sometimes from another command line argument. What if both Pipfile & requirements.txt exist?
Requiring a user to say what they want from a command line tool (that takes barely two seconds to type) is hardly an exhausting task.
Absolutely. If it does the wrong thing, e.g. by doing that outside a virtualenv, I'd very much prefer it to not destroy my global environment. Or if you have different .txt files for different purposes, I'd rather not destroy the virtualenv by adding / changing packages by accident.
Besides. Tab completion. And if it's really that important, make an `alias pipi="pip install -r requirements.txt"`.
For me it's more about encouraging good behavior. While it's common to pip install random packages, that should really be the exception rather than the rule. Usually you should make a project, add some packages, and run `pip install`.
Right now, the whole system encourages you to build bespoke environments that are really difficult to replicate. Making the default use requirements.txt encourages you to write down everything you're doing in a reproducible way.
> write down everything you're doing in a reproducible way
Making `pip install` context-sensitive (dependent on the current working directory) would reduce reproducibility. Imagine the instructions to a beginner: (1) download, (2) cd, (3) pip install. Many times the beginner will skip step 2 or accidentally wind up in the wrong directory.
I really think the average beginner is not somebody who knows nothing about packaging, but someone like me who knows a lot about other packaging systems. When I started with pip a few months ago, everything was confusing, largely because pip has confusing conventions that don't match tools like bundler, cargo, and npm that I'm more familiar with.
I do think you're misrepresenting the instructions to the beginner. The instructions to the beginner are (1) download (2) cd (3) run tox.
Beginners shouldn't even need to know what pip is, tox (or another build tool) should handle everything for you.
My preference would be that pip install only has one mode. Realistically, there is going to be backwards compatibility stuff, but ideally, even in advanced use you should never have a reason to use anything other than pip install, and the rest should be specified in config files.
The same thing that happens when you want to remove "*csv" but unintentionally hit <Return> after the "*".
Having pip 'just work' if there's a config file in the current directory would make it behave the same as 'npm install'. I have absolutely no love for npm and the whole node_modules debacle, but neither have I heard of anyone complain of the problem you describe, inadvertently 'npm install'ing instead of 'npm install foo'ing. It's rare enough and easy enough to revert that it's not that much of a problem, IMO.
Of course, if you're going to design a new package manager, you should look at the way OSes do it, not the way language ecosystems do it. OSes are much more battle-tested with their managers, and can't afford to handwave away problems and edge-cases.
I don't think so, not if you're building a language package manager.
I think you should look carefully at what npm and cargo do, and try your best to avoid what os package managers do.
Unfortunately, half the problem with python is that system level package managers want to rule the world, and not allow you to have 'user level' packages (ie. libraries) installed.
pygtk, for example, can't be pip installed. In fact, you can't actually (afaik) even use it from a virtualenv without modifying sys.path.
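The sys.path hack I mean is roughly this (the dist-packages path is distro- and version-specific, so treat it as a placeholder):

```python
# Reach outside the virtualenv to the system-wide package directory so the
# OS-installed pygtk becomes importable. Path varies by distro/Python version.
import sys

sys.path.append("/usr/lib/python2.7/dist-packages")

import gtk  # pygtk, installed by the system package manager, not by pip
```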
The most annoying thing about pip for me is that it isn't a 'complete story' in many cases. You have to actually install system level python-foo packages, which are not pinned in any meaningful way, and then hope that your application works.
Repeatable builds should be the goal of a python package manager; and that means being able to install specific versions of python (and C) libraries, in a way that is independent of the underlying OS, whatever it is.
OS level package managers solve a different problem; providing a single consistent packaged version of libraries and applications for users; having repeatable builds to generate those packages is a different problem, and, as evidenced by the huge amount of work the debian folk are putting into repeatable builds, not a trivial one; certainly not one that dpkg has already solved.
"... and, as evidenced by the huge amount of work the debian folk are putting into repeatable builds, not a trivial one; certainly not one that dpkg has already solved."
The reproducible builds project in Debian is about reproducing the build and getting every single bit the same. I think that's the primary reason why it's hard - there are various innocent sources of non-determinism, e.g. dictionary ordering in Python (pre 3.6), time stamps, build environment details, etc. If you can settle for just getting a specific repeatable set of versions, I think Debian solved that ages ago, it's just that they don't keep old versions lingering around forever.
By the way, if you take a step back and squint, I wouldn't be so sure OS level package managers really are that different from what you call language package managers.
I don't think you have to get far back. Package managers are a dime a dozen, with very little to distinguish one from another. The only reason I know to choose a particular one is always "when in Rome."
And if you read the link, you'll see discussion of this.
1. The executable format is used to create and output a .lock file, which is JSON and is the thing actually used to reproduce the environment (i.e., the .lock file is what you'd use when deploying the full environment).
2. TOML and other formats are being considered; right now something that parses to a Python AST is used for convenience to work out the API.
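To make point 1 concrete: since the lock file is plain JSON, any tool in any language can read it with a stock parser; something like this (the schema shown is a guess, not the settled format):

```python
# Read pinned versions out of a Pipfile.lock. The nesting assumed here
# (group -> {package: pin}) is for illustration only; the real schema may
# differ. The point is: json.load, no eval.
import json

with open("Pipfile.lock") as f:
    lock = json.load(f)

for group, packages in lock.items():
    for name, pin in packages.items():
        print(group, name, pin)
```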
Of course, nobody commenting in this thread actually read any of that, so it's just people reiterating "I can't believe they didn't consider this thing they considered!"
Installation files are run by default. If you have a normal Python module sitting on PyPI and install it, it may run a setup.py, which could potentially do anything that your currently active user is allowed to do.
Adding custom programming logic to the requirements file isn't really an issue when setup.py could already grab a C compiler and start binding system libraries into its package.
You're missing the forest for the trees. My point was that you're losing flexibility when you're not able to parse/generate a data-driven file using a well-known data format.
And unless you're using a homoiconic format, code is not (easily) parsable as data.
When do you need to do such conditional installing? (Aside from the `if dev: install dev-dependencies`, which is special-cased in npm and this proposal too).
Py2/3 compatibility? I've never had to do this, interested what the use-case is.
Github and Discourse did some conditional installation stuff when dealing with different versions of Rails. I don't remember the specifics of Discourse, but the Github one was due to their change from Rails 2.3 to Rails 3 and the different packages they needed to include in 2.3. This case is pretty minimal though and I haven't heard of many (any?) cases outside of those two.
Can't we just leave it as it is? I love the fact that requirements.txt is stupid simple. No over-engineered JSON shenanigans. Wanna group things into prod, dev, etc.? Create two files.
PyPA, the Python Packaging Authority, is a working group that maintains many of the Python packaging projects (e.g. pip), and therefore Pipfile will in time likely become the new accepted standard.
This is great news. Coming from Ruby and being used to Bundler, doing anything in Python or JS always was a huge pain. Countless times I deleted the current virtual environment or did an `rm -rf node_modules` to start fresh. So I'm excited to see Yarn for JS show up and now this.
The main problem with requirements.txt, as I see it, is that you don't get exact versions unless you specify it in your requirements.txt. So you'd have to have a loose requirements.txt and then generate a second requirements file after having done `pip install -r requirements.txt` to get the exact versions that were installed.
Further, if you happen to "accidentally" `pip install some-package` in your virtual environment, your app might now be using different packages locally without you noticing. With Pipfile the need for virtual environments is pretty much gone, assuming that at runtime it will automatically load the version of a package specified in the lockfile, which is not clear to me yet from the README.
> generate a second requirements file after having done `pip install -r requirements.txt` to get the exact versions that were installed
So you'd need to have a `requirements.txt` with loose versions suitable for upgrading your app's deps, run `pip install -r requirements.txt`, and then `pip freeze > requirements.locked.txt`. Then everyone should be using `pip install -r requirements.locked.txt`, as well as during your build. But that's cumbersome and error prone, and doesn't free you from having the wrong version of a dep in case you `pip install some-package` later on.
I'm not sure why you'd need two requirements.txt. You'd normally create a virtualenv, pip install what you need, then lock the versions with "pip freeze > requirements.txt". You don't need an initial requirements.txt to install new packages.
This way you can run pip install -r requirements.txt when you want to update your dependencies and then lock the resolved dependencies in requirements.locked.txt, so that you get deterministic builds when the code runs in production environments where reproducibility and reliability are important. It also gives you a clearer idea of what are top level dependencies and what are transitive dependencies, because the transitive dependencies will only be listed in requirements.locked.txt. However this system has limitations and isn't standardized. If you want different groups, say development, production, and testing, you end up with a separate pair of files for each group.
And even if you can tell which are your transitive dependencies by comparing .locked.txt to .txt, it does not tell you why a given transitive dependency is in your locked dependencies, e.g. you don't know which of your top level dependencies is pulling it in.
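The "which" part is at least easy to script, even if the "why" isn't; a crude sketch (this assumes plain `name==version` lines, so real files with -r includes, markers, or VCS URLs need more care):

```python
# Compare declared vs. frozen requirements to list the transitive pins.
# Assumes simple "name==version" lines; -r includes, markers, and VCS URLs
# would need extra handling.
def names(path):
    pkgs = set()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith(("#", "-")):
                pkgs.add(line.split("==")[0].lower())
    return pkgs

declared = names("requirements.txt")
locked = names("requirements.locked.txt")
print("transitive:", sorted(locked - declared))
```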
One common reason is to avoid hard-pinning the versions of your transitive dependencies. In my current Django project I have 19 declared dependencies and 26 transitive dependencies. We have one file for the declared ones and then another we generate with pip freeze. This way the transitive dependencies can evolve on their own without us having to keep track of them.
Pipfile looks like a definite improvement over the pip install, pip freeze workflow.
You're misunderstanding. The whole point is that nothing slips in, but at the same time, you don't have to force a specific version of something in order to achieve that. The killer feature of Bundler for long term maintenance is the ability to upgrade a single requirement in a minimal fashion.
So you start with a Gemfile that is your minimum requirements with no versions specified, the first time you `bundle install` it generates a Gemfile.lock which is then sticky. Over time your requirements are completely frozen until you decide to update, which you can do piecemeal via `bundle update gem1 gem2 etc...`. If you have a reason to avoid a newer library, then put a soft restriction in the Gemfile, preferably with a comment as to why that restriction is there and you have a very powerful long-term system for managing versions over time.
Just freezing and forgetting is a recipe for disaster when you have to update months or years later, and the transitive dependency updates are overwhelming and conflicted. Similarly exact versions specified make it fiddly to upgrade and hard to tell if there were reasons behind specific versions.
The model you describe works well enough for Bundler with a Gemfile describing desired versions which can be loose or tight, and a Gemfile.lock specifying exact versions for all dependencies. It works much better in practice than one without the other, as in the case of package.json and non-deterministic npm.
Why are you saying that? You can freeze npm deps if you want to. Frankly, npm does something very right, which is allowing different versions of the same dependency in a nested tree of dependencies. I don't think there is another language which allows that?
You can hack it in a fair number of languages, but yeah, NPM's approach is pretty uncommon. E.g. in Java, you can use "Jar Jar Links" to recompile a lib into a new namespace, which can allow multiple versions to coexist. NPM does make it transparent though, which is extremely convenient, and I can't name any other language that supports that.
But that's not always an option. Bower exists for a reason - all that duplication / bloat is unacceptable for browsers to download. It can also mean hell for static initialization / mutable state, because there's no longer a single owner of the global resource.
This is almost identical to how bundler in Ruby works, right down to the language native dependency DSL, named groups, file name conventions (Pipfile = Gemfile, Pipfile.lock = Gemfile.lock), and deterministic builds.
It's identical because bundler mostly got it right, and dependency management in Ruby, while still not great/perfect, is better than just about everywhere else.
Wow, I'm amazed at how skewed my world view was. The first two comments I read are praising Ruby dependency management and scorning Python's requirements.
I'm sitting here thinking, "what the hell is wrong?". I've honestly only experienced trouble with Ruby, while being completely satisfied with how python virtualenv works.
I guess if anything this proves that it's about habit. Habitual use of something makes it the easiest product for the habitual user. Ruby is something I force myself through when I want to try a product while Python is something I develop my own products in.
My background is in Python, too. There was a summer where I dug into Ruby and Rails and I was really impressed with a lot of the concepts. I had dabbled with Flask and Django, but things like the Gem lockfile, switching easily between test, dev, and production databases, and database versioning (at least with little setup) solved problems I had when using Python (I'm not a webdev).
I tried to pitch the Gemfile and lockfile approach when we were developing our own internal packaging system but nobody seemed to "get it" or see the value. I also tried to pitch database versioning (which alembic seems to do), but again, no takers.
I feel like it was a failing on my part to communicate or show the value in these things...they came randomly out of meetings and I probably botched the concept when pitching it.
I've stumbled with requirements.txt and setting up a new package. I'm also picky and don't like installing packages to my system (and I'm not always using virtualenv), so I have to look up how to install to my homedir. So I've stumbled with Python packaging (although I think everyone can admit it's a bit of a hodgepodge) while I liked how Ruby did it.
I used to think PHP was amazing and build tools were weird and unnecessary when I built things mostly in PHP. It's hard to see the point of many tools if you aren't working with them all the time.
I think this is the product of people who got to know both Python and Ruby very well and found Python lacking here. There are lots of things Ruby developers were gifted from people who also know Python and found Ruby lacking. Python is generally something I force myself through so I'm not one of those people but so glad they exist.
BTW, Ruby has tools similar to virtualenv: chruby, rbenv, and rvm all do basically the same thing.
My experience has been quite the opposite wrt ruby and python. I spent 2 years with Ruby as my primary language, and the regularity with which I would end up with a subtly (or drastically) broken Ruby environment was astounding. With python, I've rarely if ever run into such problems.
The problem in Ruby is that the most frequently recommended tools are overcomplicated and break in horrible ways (but I repeat myself). You only need to set a couple of environment variables to define a working Ruby environment. I regularly have people ask me "do you use rvm or rbenv?", only to be surprised when I say "neither, they're both horrible."
I think the sore point is definitely build and packaging systems.
Take zulip for example, they do use requirements but they go their own way in most other things.
Managing an application is much easier if it uses a standard build system: setup.py, requirements.txt and so forth.
Gitlab is an example of a very complex packaging for a ruby application, but it works! It's complex but solid.
People are tugging in all kinds of different directions. My bad experiences with Ruby and node usually include seeing a loooong list of dependencies being installed and then at dependency #187 it suddenly stops for some reason like one rogue commit breaking compatibility with other packages.
This is hell to someone who doesn't develop in the language regularly, it's a bad packaging system for users.
To be clear, I'm not saying Python is better. I'm just identifying the issues I've had. The only reason Python is easier for me is because I've decided to use it more than the other languages.
On the one hand you can build a complex but solid system like Gitlab has, on the other you can use more standardized systems to distribute your app that require more steps and are less automated. But they're well documented and established methods used for that language.
As rickycook mentions in another comment, pip-tools is a great solution for maintaining the distinction between allowed version ranges and locked, fully qualified versions for the given environment.
I've been using pip-tools with tox for a couple of years now. I maintain a requirements.in and requirements.testing.in, and then I can run
$ tox -e pip-compile
to generate my fully qualified requirements. The pip-compile command is handled by a tox.ini section.
The remaining nasty part is automating the extraction of requirements for setup.py's install_requires and dependency_links. I wrote a function to handle VCS links and other complicated syntax that I'm copying around to all of my projects. Otherwise, pip-tools has been a great solution.
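For the curious, the helper is roughly along these lines (a simplified sketch, not the exact function I use):

```python
# Split a requirements file into install_requires and dependency_links so
# setup.py can reuse it. Simplified: real files have more edge cases.
def parse_requirements(path):
    install_requires, dependency_links = [], []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            if line.startswith(("git+", "hg+", "svn+", "bzr+")):
                # VCS link: pip wants the full URL, setuptools wants the egg name
                dependency_links.append(line)
                if "#egg=" in line:
                    install_requires.append(line.split("#egg=")[1])
            else:
                install_requires.append(line)
    return install_requires, dependency_links
```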
Exactly, this is a much better way of doing pinning, if only because it's much more human readable and easily parsable. I've been using it for a while as well, and find it very convenient.
It seems to me that that's the direction we should be heading.
I'm afraid I never saw what was wrong with specifying dependencies in setup.py. For me, having requirements.txt and setup.py is confusing. Can't we just stick with setup.py? (and yes, I've read Donald Stufft's post https://caremad.io/posts/2013/07/setup-vs-requirement/ and remain unconvinced).
The best way this was explained to me was to keep your unversioned dependencies in setup.py and keep developing and testing against latest. Then when you release, requirements.txt should be a result of the build (not an input) which says "these versions work, this is how you install this release". It makes more sense for an application than for a library, but I think making that the primary distinction only adds confusion.
All that said, I have no clue what Pipfile is adding. The rationale appears to be that people sometimes don't use requirements.txt properly? Can one of the very enthusiastic commentators explain their enthusiasm?
That PEP leads down an interesting path -- hadn't realized there was a completely new packaging tool called "flit" that's aiming to be much more lightweight than setuptools.
The section titled "So Why Does Abstract and Concrete Matter?" on Donald's post explains pretty clearly why you can't have these two types of dependencies in the same file.
There is another discussion on the Pipfile repo about this that may also clarify things for you [0]. The example I posted there [1], which I'll post again here, is:
---
A project can only have 1 set of abstract dependencies (setup.py/pyproject.toml), but different users working with that project can have different sets of concrete dependencies (requirements.txt/Pipfile) which allow them to fulfill those abstract dependencies from PyPI mirrors, private package indexes, personal forks on GitHub, or somewhere else.
So if project Car depends on Engine (an abstract dependency), I can choose to install Car but grab Engine specifically from a fork I made on GitHub (a concrete dependency) that has some performance improvements. Meanwhile, someone else working at a big company that doesn't want to depend on external services to build and deploy their internal Python projects can choose to install both Car and Engine from their private package index as opposed to PyPI (another concrete dependency).
You can't merge these two types of dependencies together into one file without hampering people's ability to choose where to get their dependencies from.
---
According to another commenter on that issue [2], both Rust and Ruby have a similar split in how they specify dependencies.
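Putting the Car/Engine example in code (project names and URLs are made up for illustration): the abstract side lives in setup.py, the concrete side in whatever requirements/Pipfile layer the installer reads.

```python
# setup.py for the hypothetical "car" project: an abstract dependency.
# It says *what* is needed, not *where* to get it or which exact build.
from setuptools import setup

setup(
    name="car",
    version="1.0",
    install_requires=["engine>=1.0"],
)

# The concrete choice lives outside setup.py, e.g. a requirements.txt line
# pointing at a personal fork (illustrative URL):
#   git+https://github.com/me/engine.git@perf-tweaks#egg=engine
```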
We know it'll be abused; we should have learnt our lesson from SCons and setup.py that using Python code itself as a declarative DSL wasn't a great idea before, and still isn't. Just use a standard hierarchical file format (JSON, TOML, XML, whatever).
Features for introspecting and editing `Pipfile.lock` should be rolled into pip and exported as a core Python module; an API for editing Pipfile.lock is a good idea, but executing a `Pipfile` is not.
About time. The Ruby, Elixir, and Rust communities are far, far along in their package management tools. Working with pip feels like going back in time these days.
For me npm has always been the worst dependency manager I've used.
What bugs me the most is that it installs all packages to node_modules by default. It is possible to specify another location, but then your application will probably break because it has to know where your node_modules directory is.
Then there's the whole non-determinism thing: https://docs.npmjs.com/how-npm-works/npm3-nondet
There are some issues with it, but think more about its concepts. I agree at the very least that npm is better than pip, but I feel like pip is really outdated. I'm surprised so many people are saying it's fine and that setup.py is fine... I think it's just that they're more familiar with it.
Check out yarn as well; some of its practices are really awesome, taking inspiration from bundler and cargo.
This is as someone who doesn't use either of these languages that much. Even composer is better than pip IMO.
Pip has the same kinds of issues with transitive dependencies, but way way way worse since it doesn't allow multiple versions. When you have lib A requiring C ("A -> C1") and "B -> C2", the version of C you get depends on which of A or B was installed first, because the second's requirements are flat-out ignored.
Probably like 90% of the python projects I've looked at have requirement-conflicts because pip doesn't even warn you about this.
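You can check any environment for this yourself; something along these lines reports requirements that the installed set silently violates:

```python
# Report installed distributions whose declared requirements are not satisfied
# by what is actually installed (the situation pip quietly allows).
import pkg_resources

for dist in pkg_resources.working_set:
    for req in dist.requires():
        try:
            pkg_resources.get_distribution(req)
        except pkg_resources.VersionConflict as err:
            print("{} wants {}, but {} is installed".format(
                dist.project_name, req, err.dist))
        except pkg_resources.DistributionNotFound:
            print("{} wants {}, which is not installed".format(
                dist.project_name, req))
```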
That's interesting, because personally I always found the fact that npm installs to `node_modules` to be one of its best features. No need for RVM Gemsets or python Virtualenv; everything gets installed inside your project directory and just works.
Additionally, since each package gets its own `node_modules` directory, there's no need to worry about conflicting dependencies. Multiple versions of the same module can run in the same process with no interference.
Minus the horrific bloat and slow install processes, and when you add a shrinkwrap file (otherwise prod runs who-knows-what)[1]: yeah, npm is pretty fantastic. No conflicts ever, super simple, they did a lot right. Fits in perfectly with the low-developer-brain-cost JS ecosystem.
[1] this is quite a large number of cases, but they are significant drawbacks.
I prefer composer defaults.
Always save dependencies to the composer.json (easy to forget --save)
Always create a lock file when doing composer install
I do sort of like node's ability to recursively nest dependencies to avoid dependency hell. Though that has its downsides too... you will never know what is in your modules dir.
If you need old-version retrieval on a routine basis, library regression testing has a problem. This gives package developers an excuse to break backwards compatibility. Historically, Python has avoided that, except for the Python 3 debacle.
I'm a bit confused as well, unless you're referring to "installing libs at any version except HEAD", in which case I vehemently disagree with you. Deterministic builds are a must-have for production systems.
Why can't you get the versions in question from the origin of the error (deployment config if it's an internal error, or asking the user if it's an external report)?
It is certainly possible, but adding a manual step for something that has no reason not to be automatic doesn't make sense. Yes, you can certainly choose to do more manual work for no reason, but why would you?
And if you don't want that (for example if you distribute a library), you can still distribute the project without the lockfile, and then have the users distribute the lockfile back to you to get a repeatable error.
They're probably claiming that using a lockfile excessively would put less incentive on library developers to keep backwards compatibility, because nothing would break if the user doesn't explicitly upgrade.
As someone who has been dealing with managing Python requirements a lot lately, I was excited to see what Pipfile is all about. After reading the post and all the comments here, it's still not really clear to me what the value add is over existing solutions.
There are a lot of mentions of deterministic builds, but that is already very achievable with pip-compile (part of pip-tools) or just pip freeze.
The grouping functionality allows you to have just one requirements file instead of one per environment (i.e. production, test, development, etc) which is mostly just a personal preference IMO. This isn't particularly compelling to me, but that could be because I'm already used to the traditional "pythonic" way of having one file per environment. Using the -r command within a requirement file allows one to recursively include other requirements files to avoid duplication of common dependencies across environments.
The difference in syntax between traditional requirements files and Pipfiles is indeed pretty large. The Pipfile syntax is quite a bit more verbose which I'm personally not a fan of, but this will come down to personal preference and what one is familiar with.
It's unclear if Pipfiles as proposed here is meant to include the dependency resolution functionality of the pip-compile command provided by pip-tools. That is a very critical step as vanilla pip makes no guarantees about respecting version pins of nested dependencies; only that some version of a nested dependency will be present but not necessarily the one intended.
Another big unknown that others have asked about as well is how Pipfiles can be used to manage requirements for a library in a way that allows other libraries/apps that do not use Pipfiles themselves to still list said library in their requirements.txt.
Apologies if my comments come across as overly negative or dismissive; I applaud any effort to improve the tooling around Python dependencies. But as someone already familiar with the Python/pip ecosystem, it's not clear how this would improve or simplify the solutions that are already out there.
I have not used Ruby extensively, but a lot of the times I had to install something with it, I got stuck with version conflicts between Ruby itself and package deps.
In my python workflow everything lives within a project virtualenv. Dependencies are defined in setup.py with install_requires, extras_require and tests_require. I build against latest; version constraints are added mostly when the latest version of a package has a problem.
Now when I commit to a dev, stage or prod branch, our CI generates a version-pinned requirements.txt which is used to install the virtualenv on stage/prod.
No, I don't remember why that wasn't an option - maybe I was not even aware of it. I am not really experienced in the Ruby toolset, just trying to use tools written in Ruby.
Yeah, sadly Ruby has the same environment issues as Python. Python has virtualenv, Ruby has rbenv - they're essentially required, sometimes even for global binaries.
For Python, there's `pipsi` for automatically creating unique virtualenvs for binaries - that's the right approach, and it's worth adopting ASAP for future sanity. It even lets you mix python 2 and 3 binaries without any issues. I don't know what the equivalent would be for Ruby.
My initial reaction to reading this was "finally". Coming from a Ruby background I found Python's package management to be seriously lacking in ease of use. Kudos to the PyPA team!
I've been using pip-tools for the last year or so. It does a similar job so far. I'd be happy to switch to something that's supported out of the box by pip, though.
This feels like putting the cart before the horse: proposing a tool to solve a problem, without opening the problem to discussion, gathering requirements/desired features, and comparing it to similar tools in other languages.
I welcome any improvement to python packaging, but I'd start with improving pip itself & making it into a library that tools can wrap around.
I'd like something like `npm install --save Django` -- a command that finds the most recent stable version of Django, and then adds _that_ package to requirements.txt, and leaves Django's dependencies out of reqs.txt.
I know that it's just a few simple operations for me to do it myself, but npm has spoiled me.
If I'm reading you correctly, the missing feature in pip is to list dependencies for one package so they can be put into requirements.txt.
This is a missing feature, there are various hacks but nothing concrete.
If you're working on a large project you have to keep track of all your dependencies and preferably only add the main dependency, not its dependencies, into requirements.
I tend to avoid doing pip freeze > requirements.txt because it lists packages I know shouldn't be in requirements.
Like civilian, my main ask for an improved pip would be a way to save a single explicit package into my requirements.txt by adding a --save flag when installing it.
I don't want to include sub-dependencies in my requirements.txt.
The way to maintain a requirements.txt with only your direct dependencies (base dependencies maybe, I'm self-taught on the lingo) is to manually add them one by one.
Pip freeze is useless because it adds a lot of things you don't need into requirements, making it cluttered and hard to follow.
Yes, I (and civilian) know that. The point is that it's comparatively cumbersome. You have to manually add the specific version into requirements.txt instead of being able to automatically save it as you install.
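The behaviour being asked for is small enough to sketch as a standalone helper (this is a hypothetical function, not a real pip flag):

```python
# Hypothetical "pip install --save" workalike: install a package, then append
# only that top-level package, pinned to the installed version, to requirements.txt.
import subprocess
import sys

def pip_install_save(package, requirements="requirements.txt"):
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])
    out = subprocess.check_output(
        [sys.executable, "-m", "pip", "show", package], universal_newlines=True
    )
    version = next(
        line.split(":", 1)[1].strip()
        for line in out.splitlines()
        if line.lower().startswith("version:")
    )
    with open(requirements, "a") as f:
        f.write("{}=={}\n".format(package, version))

# pip_install_save("Django")  # would append e.g. "Django==1.10.4"
```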
The main reason this is better is explained in the third paragraph in the README: Deterministic builds. You can specify target versions in your Pipfile, while your Pipfile.lock will contain the actual exact versions pip installed (even the exact git commit) so if your app is built on another machine you know the libraries are exactly the same, preventing slight version differences from causing you problems without requiring you to be ultra-specific in your dependencies.
Another benefit is named groups, which allow you to more succinctly specify dependencies for various environments (dev/test/production/etc.) Along with this you get the benefits of the lock file so you can be assured the subset of libraries you use in production will be the exact tested libraries you use in your larger dev/test environment.
Deterministic builds are good, but I think the issue people are seeing with this is that it's like setup.py, in that it's not easily parsable without a full Python interpreter. We could already have a requirements.lock.
In fact, there's already a library for this called pip-tools (https://github.com/nvie/pip-tools) that generates a requirements.txt (the lock file) from a requirements.in (your direct dependencies).
Worth pointing out that the builds will be deterministic only in regards to Python libraries. For anything else (e.g. libxml) different machines might still have completely different versions, and behaviors, so determinism goes out the window.
How is the Pipfile.lock distributed? Is it intended to be checked into source control alongside the Pipfile? If so, how would that help someone pip install my Python project (say from pypi) using that Pipfile.lock and get the benefits of a deterministic build?
Installing the project would cause it to be built from the lock file (assuming the Pipfile hasn't changed). This will mean your users won't get "some version" of a library you depend on between version 1.0 and 2.0 or whatever range you specified. They'll get the exact package you last successfully used and checked in yourself, right down to the git commit if applicable.
Once you modify the Pipfile, pip will resolve your dependencies and try to make the specified changes by adding/removing/upgrading packages. If PyPA continues following bundler conventions, this will be done by making the fewest changes possible from the existing versions in your lock file. You'll also have an upgrade mode where pip will rebuild your project from the Pipfile, looking for the most recent versions of all libraries (or a specified library) within your specified version ranges. When done and your app/tests are working, check in the new version and you can ensure all your users/environments will be able to upgrade cleanly.
Depends on the use. For a library, it doesn't need to be distributed at all, since the application that's using the library will eventually dictate the actual versions of dependencies (so it can play nicely with other libraries). For an application, yes: you commit it along-side the pipfile, and re-generate the lock file when you upgrade things.
One use case I don't see covered is grouping by python versions. Sometimes "soft" dependencies are only available/supported/not-broken on certain python versions.
It'd be nice to be able to mark a dependency as being conditional based on some expression, e.g. `$PYTHON_VERSION_MAJOR >= 3`.
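For what it's worth, PEP 508 environment markers already express this kind of condition in requirements files on reasonably recent pip (e.g. `enum34; python_version < "3.4"`), and they can be evaluated programmatically, so presumably a Pipfile could lean on the same machinery:

```python
# Evaluate a PEP 508 environment marker with the `packaging` library
# (the library pip vendors for this purpose).
from packaging.markers import Marker

marker = Marker('python_version >= "3"')
print(marker.evaluate())  # True on a Python 3 interpreter, False on Python 2
```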
Give it time. Moving to a native python DSL will make extending the dependency system more straightforward. Also, if they continue copying the best of Ruby's bundler they will have a way to specify python version/platform specific dependencies.
This seems like a huge step backwards to me. Why would you want to go from a parse-able, machine-readable, data-driven syntax to something that can't easily be introspected, isn't machine-readable without firing up a full Python interpreter, is as flexible as actual code and is thus subject to all the abuse you can introduce with actual code, etc..
"- If there's a corner case not thought of, file is still Python and allows people to easily extend"
... and ...
"- Using Python file might cause the same problems as setup.py"
Ugh.
Also, how is the "lock file" actually distributed? Unless you can pip install a wheel and have it include an embedded lock file, then you've still got to have some out-of-band mechanism for copying the lock file around like you would a requirements.txt or even a fully-fledged virtual environment.
Your lock file goes into version control. This ensures all your developers and environments are using the exact same version of every library. You generate a new lock file when upgrading/adding libraries.
Don't be so scared. Ruby developers have been doing this for over 7 years now. Trust us, this is much better. Your Python friends at Pypa are copying Ruby because dependencies are less painful there than just about anywhere. They are still painful, but this will help.
I think the issue here is that it adds so much complexity. There's a project called pip-tools (https://github.com/nvie/pip-tools) that does a similar thing (generates a requirements.txt, test-requirements.txt, etc.) from requirements.in files, so it serves the same purpose as a .lock but is backward compatible, much simpler, not subject to abuse by having it be actual code, and is machine parsable by any language.
Having a more complex requirements system than what we have with requirements.txt is good, but at what cost? Is Python doing this just because other languages do it this way? I think it is, actually.
The purpose of a lock file is to separate the concepts of "acceptable versions" from "the list of last exact versions I used and were working". These are very different things and requirements.txt doesn't do both without you getting really anal with your version requirements (and losing the benefits of less strict version requirements).
With pip-tools, requirements.txt is your lock file, and everything in it is pinned. It's built from an acceptable-versions input usually called "requirements.in".
Probably, yeah. Though unless it's Real Python™ code, and can do lots of shenanigans, I don't see how it's more flexible. And all those shenanigans mean :'( in the same ways as setup.py. And if it's not Real Python™ and just a python-like declarative syntax, then why not just add the features to requirements.txt and the command-line, and maintain that parity?
Dependency groups don't really mean an advantage to me. E.g. with pip-tools, on the code I work on, we've got 3 requirements*.txt files. One (requirements.txt) for prod, and ones for test/dev/any additional scopes you may want. Then you `-r requirements.txt` in requirements-dev.in (and -test), and you're guaranteed to maintain the same versions as production when resolving requirements-dev.txt, or have conflicts if something dev adds prevents that from working. In CI / build / etc you just install the single file that's relevant to you.
Bundling groups into the file would be nice for not being able to make mistakes (our approach above requires you to compile things in order, for example), but that would have to weigh pretty heavily against breaking backwards compatibility and a very-simple DSL that already exists.
I'm not sure age is a good indication of goodness in this regard. Python has been doing setup.py for much longer than 7 years. It's a similar format, and ripe for abuse because you have full access to a general purpose programming language in it. I've seen setup.pys call out to Inkscape. I don't want a general purpose programming language in my requirements.txt (or equivalent). I want a dumb declarative data file.
Also, I'm not sure how a lock file in source control will help me when someone who is not checking out source wants to pip install my project using said lock file.
Age is a good indication: if there were serious issues with this approach, we would have found them by now.
There seem to be a lot of strange fears from python developers about this approach and my comment on age was an attempt to assuage those fears. Since ruby developers are quite happy with bundler (at least in comparison to other communities) and have been for some time it's a reasonable point and not the only one I made.
I'm not sure how pip plans to use the lock file for projects distributed via pip. They may very well have a solution planned for that.
However, for projects that are distributed by source (which are many) the lock file ensures deterministic builds, and requirements.txt generally doesn't without being pedantic with your versions.
> There seem to be a lot of strange fears from python developers
There you go again.
This isn't fear. It's asking the question of: What. Is. The Damn. Point. Of. This?
I have yet to see a simple, cogent explanation of why this is better than a list of requirements specifying:
pyside==x.y.z
pillow==x.y.z
As opposed to this proposed Thing.
You know what this reminds of?
It reminds me of replacing super simple .INI text files with configuration files stored as XML - change for the sake of it, because Change!
Now, I'd love to see a very simple explanation and justification on why we should all move to this new Thing you're making.
Something like "This new Thing is better because..." , followed by practical /examples/ , because right now, all I see is More Complicated Stuff For The Sake Of It.
There were clear fears/risks/concerns stated by witten directly above. I didn't invent out of whole cloth the idea that fears were being stated.
> subject to all the abuse you can introduce
> might cause [...] problems
I've said a few things about the potential benefits, though I am not involved in the project. The actual authors offer the best list of benefits: https://github.com/pypa/pipfile#the-concept
They also point out this will eventually be built into pip, so it will have the benefits of all the existing workarounds like pip-tools without any setup. It does seem like a lot more than change for the sake of change, and since there is already so much change, why not take the opportunity to adopt a flexible DSL built for future extension, so there is less real change in the future.
In my view, that's already out of hand, and isn't even really parsable without some bespoke library.
I'm a big fan of this idea of having a requirements.lock. I can call pip to do the parsing, then I can parse the lock which is just json.
The fact that you need to parse the requirements.txt feels like a design issue with pip to me. (Of course, I'm coming from Rubyland.) As for setup.py, setup.py is broken because it conflates build scripting with package metadata and dependencies. Now that tox exists things are better (but the fact that tox is not a general-purpose programming language is IMO unfortunate coming from Rake/Ruby/Bundler/Gemfiles.)
Was that really necessary? You just moved the tone of your reply to a less pleasant manner.
The poster isn't scared. They, like me, are failing to see the necessity of this compared to the more simple, straightforward, already-working, KISS-principle-following requirements.txt, which was my immediate reaction to reading the description of this project on the github link.
It was an attempt to directly address the FUD about running python code during dependency resolution. You're doing this to run more python code from the project's author so this seemed odd and an irrational fear.
Maybe my tone came across condescending. That wasn't the intent. It was meant in jest in hopes the reader would second guess the fear.
KISS is great, but this project is here because it's not already working for everyone. There is benefit to a flexible dependency system that keeps the concepts of "acceptable versions" separate from "actual versions last used" while also adopting an extendible DSL ready for edge cases that haven't been thought of.
To be clear, I totally get the benefits of formal support for an "actual versions last used". That is a topic near and dear to my heart, having been burnt by floating versions more than once. It's just that I don't want to swallow a too-flexible DSL in order to get that benefit, because I've also been burnt multiple times by using a DSL when a declarative data format will suffice.
And you don't need both. You can pip freeze a frozen_requirements.txt. You can ship around a whole virtual environment. And I can conceive of more requirements.txt 2.0 solutions that don't require over-wrought DSLs.
You guys were really burned by this setup.py stuff, huh? I empathize.
I also find it a little odd since a destructive project author can still find ways to mess things up without any code executing during setup, right?
Since this is for project level dependencies it doesn't seem like the potential for abuse is too high since you'll likely have a single Pipfile per project and the project will control it.
I don't know that anything will prevent destructive developers from being destructive. I would hate to throw out all DSLs because of a few bad actors.
For what it's worth in Ruby even library dependencies are written using a ruby DSL that runs unprotected during install, as well as some post installation hook facilities. There have been minor abuses like annoyingly long post-install messages and other crap that has occurred, but through simple community pressure the offending libraries are eventually pushed back in line and it's not an issue Ruby developers regularly encounter. Communities are different but Python seems like a community that appreciates best practices and is maybe better at enforcing them than Ruby is, so I'm not sure preventing one minor way a terrible developer can bite you is worth hamstringing your build system.
Yes, an over-wrought DSL is not a prerequisite. You can do deterministic builds and even grouped dependencies without it. But complex platform specific dependencies, git based dependencies, and dependency edge cases we haven't yet envisioned are more easily captured in an extendible native DSL than a more static data format. The DSL also has the advantage of being "just Python" so it will be very easy to remember for Python developers.
Maybe the benefits aren't worth the trade-offs but don't be so quick to count potential or perceived trade-offs as actual ones until you try out the new build system and see how it impacts your workflow. I know you got burned but this isn't setup.py and potential for abuse isn't the same as abuse... yet.
It's a matter of opaqueness, not of abuse: the install system will be unable to reason about what an arbitrary script does, severely limiting functionality.
For example, pip will be unable to ensure dependencies remain constant (after installing something else, after DST begins or ends, between a dry run and an actual install, after unrelated software alters environment variables, etc.). This is much worse than a 95% correct dependencies manifest that can be easily hacked to 100% correct.
pip will be able to do everything you suggest by using the lockfile.
The whole point is that if you want things to be reproducible you don't even need to look in the Pipfile. requirements.txt already supports conditionals, so the Pipfile.lock is strictly simpler to reason about in that sense.
Can you give some concrete examples of the issues with using a setup.py/DSL? I've only been using python intermittently for a few years and I'm probably not exposed to these things. Am I setting myself up for a fall by using setup.py?
I've seen setup.pys that read random files from the filesystem. I've seen setup.pys that shell out to random system commands that may or may not be installed. It's just a horrible format for what it's trying to accomplish.. because it's not a format! It's a programming language.
Not as such, but I've encountered setups that shell out to Inkscape, ImageMagick and other tools, and don't gracefully handle failure, sometimes to the point where the only error is "Failed to install egg".
> Don't be so scared. Ruby developers have been doing this for over 7 years now. Trust us, this is much better. Your Python friends at Pypa are copying Ruby because dependencies are less painful there than just about anywhere. They are still painful, but this will help.
I think the condescending tone is quite unnecessary ("Trust us, this is much better. Your Python friends at Pypa").
In my experience Ruby is a far worse culprit than Python for breaking things. First of all, `rvm` and `rbenv` are both bloated and rather unreasonable invasions of the shell (hi-jacking `cd`, seriously?). Secondly, the fact that Ruby modules are globally shared can cause endless breakage from a few misbehaving packages, especially ones that like to monkey-patch std-lib classes (refinements should improve the latter, but it will take a while for the ecosystem to catch up).
While I agree with the issues that may stem from using code to express dependencies, there are upsides. I think the main deficiency in Python is the need for virtual environments per project (and the fact this requires a separate tool).
If this Pipfile.lock approach allows me to always have the correct package versions without virtualenv, it's a huge step forward.
We can already reference other requirements files with `-r req-dev.txt` inside requirements.txt.
Current requirements.txt isn't really machine-readable without firing up a full Python interpreter.
All I ask is that if you add a freeze command, make sure that it remembers a git-style install of a package so that it yields the right directive later on: the full git URL and ref (e.g. something like `git+https://github.com/user/project.git@<ref>#egg=project`), NOT just the egg name.
While we're on the topic... the name requirements.txt is so generic and undescriptive. It breaks the principle of least astonishment. Something like packages.pip would make more sense.
A .txt file is meant for human readable sentences, yet this file is meant to be consumed by pip. You can only "customize" it working within the confines of pip freeze. If you were new to programming, would you suspect that requirements.txt was somehow linked to Python? Probably not.
Why can't ruby or JavaScript or any language make use of requirements.txt? They could, and make the same claim you just made, which would break pip install -r.
Text simply means text format, not that "humans should read this". The human editability of text is a nice-to-have, but not a requirement of the format structure.
For comparison, npm has a package.json which is hand-editable, and lots of people do customize it by hand to get their npm scripts running, but:
a) If you're new to programming, you might be confused by which language it belongs to, just like with a requirements.txt
b) if you peek into it, you may not think it's generated, consumed, or edited by machines, because the JSON format is so readable
I never said humans should read it, just that it's human readable. And of course requirements.txt is technically human readable, but the problem is that it breaks scripts if you do anything other than pip freeze to it. This is obvious in the npm world (you better put JSON in a .json file!).
My point is that the .txt extension doesn't give any hints to its use or that it's breakable, and that the name "requirements" doesn't describe the intended domain of the file (Python only). Even naming it "python_requirements.txt" would be an improvement.
I wonder if they're using the term "deterministic build" in the same way that, say, Debian uses it, and whether they should choose something else to describe this.
Installing from GitHub is a pain and if you upgrade packages and freeze into requirements.txt you can't tell which versions you've overwritten. You can easily stuff this up with a deploy script.
The problem is that it can be used in both a deterministic and non-deterministic way (no version-specifiers specified, not all sub-dependencies listed). People want the best of both worlds.