Does anyone else think this reflects badly on Python? The fact that the author has to use a bunch of different tools to manage Python versions/projects is intimidating.
I don't say this out of negativity for the sake of negativity. Earlier today, I was trying to resurrect an old Python project that was using pipenv.
"pipenv install" gave me an error about accepting 1 argument, but 3 were provided.
Then I switched to Poetry. Poetry kept on detecting my Python 2 installation, but not my Python 3 installation. It seemed like I had to use pyenv, which I didn't want to use, since that's yet another tool to set up on different machines.
I gave up and started rewriting the project (web scraper) in Node.js with Puppeteer.
Granted, I'm just a scientific programmer, but my workplace has a full-blown software team maintaining a multi-million-line codebase. That codebase is rebuilt every night, and as I understand it, you're not allowed to submit a change that breaks anything. And they have people whose job is to keep their tools working.
What people casually think of as "Python" is really a huge, dynamic ecosystem of packages. Imagine that there are 60k packages, each with 1k lines of code... that's a 60-million-line codebase, and it can't be checked for breaking changes. Short of continually testing your code against the latest versions of packages, you're going to hit some bumps if you haul old code out of the vault and try to fire it up on a new system.
I don't know how Javascript developers handle this.
I handle it by running Python inside an isolated environment -- WinPython does this for me -- and occasionally having to fix something if a new version of a package causes a breaking change.
The drawback of my method is deployment -- there is no small nugget of code that I can confidently share with someone. They have to install their own environment for running my stuff, or take their chances, which usually ends badly.
> you're going to hit some bumps if you haul old code out of the vault and try to fire it up on a new system.
Most package managers have lockfiles that allow for some degree of determinism. Of course if a library introduces breaking changes you're going to have to rewrite, but only when explicitly upgrading the dependency.
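The crudest version of that in Python is just freezing exact versions and committing the result (a rough sketch, nothing more):

    pip freeze > requirements.txt      # pins everything currently installed, transitive deps included
    pip install -r requirements.txt    # later, elsewhere: reproduce exactly that set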
And nowadays most HPC centers run on Spack or EasyBuild anyway. I adopted EasyBuild two years ago, have since switched to Spack, and see that our local HPC center is also using it for their new modules...
When I am working on large projects like this and the myriad of problems that Python and its friends bring us, I tend to see if projects like "nuitka" can solve them. For example, if you need to scale your jobs, just imagine how many openat syscalls one single Python script causes, and then multiply that by the number of cores your job will have. Nuitka solves most of those problems by packaging everything. It is almost like statically linking your code to run on the cluster.
I love Python. Throughout my life I tried learning many languages, and Python is the only one that really stuck, and was able to do useful things. Learning Python changed my life, in 5 years my salary more than doubled, and for the last 5 years I've been a full time developer. A coworker likes to say that I think in Python.
That said, I 100% agree. I don't have the answer, except that I wish that there was one official answer, for developers and deployment that was easy to explain to beginners.
For what it's worth, I've been using pipenv for over a year and it works well enough. I think npm's approach is better, but not perfect. I've heard good things about yarn. I know CPAN is the grandfather of all of them. I've barely used gems but they seem like magic, and Go, get your repo URI out of my code, please and thank you. :-) All fun aside, what languages have it right? And is there maybe a way to come up with a unified language dependency manager?
Basically this. Every tool does the same thing slightly differently and all run into different variations of the exact same problem. Is your interpreter in your path? Do you have the right version of your modules installed?
Technically, OP didn’t have to use pipenv for anything except knowing which versions of each dependency to install (Pipfile.lock) with good old pip. Those other tools are mere conveniences. Giving up on a language for that...that’s drastic.
> and is there maybe a way to come up with a unified language dependency manager?
For interpreter/compiler version management you can use asdf [0]. It works for all popular programming languages. You can use it to replace tools such as pyenv, nvm, gvm, etc.
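Roughly what that looks like for Python (version numbers are only examples; older asdf spells the first command `asdf plugin-add`):

    asdf plugin add python       # the Python plugin builds via pyenv's python-build
    asdf install python 3.8.0
    asdf local python 3.8.0      # writes .tool-versions in the project directory
    python --version             # resolved through asdf's shims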
This is why I manage every nontrivial project I do nowadays with Nix (Haskell, Go, Python, C & C++, bash ..anything)
Everything is pinned to exact source revisions. You can be relatively sure to be able to git clone and nix-shell and be off to the races.
You can even go the extra mile and provide working editor integration in your nix-shell (especially easy with emacs). So you can enable anyone to git clone, nix-shell, and open a working editor.
The biggest downside is becoming proficient in Nix isn't easy or even straightforward.
While you're here, how do you do this with nixpkgs?
I looked into using NixOS for Node.js deployments recently, and was amazed to find that the versions of node in the nixpkgs repo are just pinned to X.Y.0 releases, with no discernible way to update to a bugfix release after .0, so... I don't see how this could possibly be used for production deployments?
While the Nix package manager supports the coexistence of multiple versions of a package, the nixpkgs package collection does not contain every single version of every package. However, it does make it very easy to refer to past versions of a package by importing package specifications from an older version of nixpkgs.
I think this is a reasonable choice, considering that the main purpose of nixpkgs is to provide packages for the NixOS distribution. It's impossible to actively maintain every single version of every package.
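A rough command-line sketch of that, if I have the flag right (REVISION is a placeholder, not a real pin):

    # pull the package set from an older nixpkgs snapshot instead of your current channel
    nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/REVISION.tar.gz -p python37 nodejs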
I've mostly stuck to nix-shell -p foo bar baz to setup my environments. It's holding me back and I should pull the trigger on a shell.nix file, just gotta learn the language.
And then to expand, if you wish to pin one of those packages to a specific rev, the easiest way is to create an "overlay" (it's a Nix design pattern) that you apply to nixpkgs to override whatever version is in your nixpkgs version.
The big gap is management of the full dependency tree. With yarn I can get a package.lock which pretty well ensures I'll have the same exact version of everything, with no unexpected changes, every time I run yarn install. I get the same thing in the Rust world with Cargo.
In Python it's a mess. Some packages specify their deps in setup.py; some in a requirements file, which may or may not be read in by their setup.py. It's not rare to need multiple 'pip install' commands to get a working environment, especially when installing fairly boutique extensions to frameworks like Django.
There just isn't a holistic, opinionated approach to specifying dependencies, especially at a project level. Which leaves unexpected upgrades of dependencies (occasionally leading to regressions) as a reality for Python devs.
There are two new tools in the Python ecosystem, which try to fill the gap left by cargo, npm, yarn & co.:
One is pipenv [0], which works similarly to yarn & co. It uses Pipfile/Pipfile.lock to define and lock dependencies. Pipenv has a major flaw: it can't be used to publish packages on pypi.org (you still need twine & setup.py for that). It's also known for being slow and somewhat buggy. Despite all that, pipenv is an "official" tool maintained by the "Python Packaging Authority".
The other one is poetry [1], which works exactly like yarn & co. It uses "pyproject.toml" to specify dependencies and "poetry.lock" to lock them. Poetry does most of the things "right", but it's still an underdog compared to pipenv.
Both tools have not yet fully matured, thus there are a lot of complaints.
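For a feel of the day-to-day usage of both (just a sketch; requests and app.py are placeholders):

    pipenv install requests      # writes Pipfile and Pipfile.lock
    pipenv run python app.py

    poetry add requests          # records it in pyproject.toml and poetry.lock
    poetry run python app.py
    poetry build                 # builds sdist/wheel
    poetry publish               # uploads to PyPI, the part pipenv doesn't cover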
I've worked on tons of small to medium-small Python projects over the years where I didn't fix dependency versions at all, not even major versions, just a requirements.txt with a list of package names (usually it's a list of maybe at most ten well-known libraries, resulting in at most twenty actual packages pulled from PyPI). Come back three years later, pull the latest versions of everything, code still works fine.
Now try that with JavaScript or Rust. If you don't fix versions, come back three months later and compatibility is usually fucked up beyond all recognition.
Some languages embraced better dependency locking because they absolutely couldn't not solve the problem.
I’ve only recently started working with Python, and I’ve already been bitten by TensorFlow v1 and v2 packages having different APIs, so the viability of that approach will depend heavily on which packages you use.
However in SemVer a major version number change is how breaking changes are documented, so seeing a v1 to v2 change coupled with having to do some work to fix breakage is just expected, something that may well be necessary for a project to make progress.
ML, pydata etc. are really worlds apart from more traditional Python ecosystems; Guido himself admitted a couple of years back that he had no idea about those silos and sat down with some leaders of those communities to hear their needs. Those communities tend to have their own recommendations and best practices.
My very brief exposure to TF seems to suggest that dev environments surrounding TF are way harder to set up than my “list of bare packages in a requirements.txt file” scenario which is sufficient for a lot of more traditional endeavors.
pipenv generates a Pipfile.lock, if it can. I've used it primarily for Airflow, and some packages within Airflow have incompatible version ranges for the same dependency, which means it can't generate the lock file.
Yes. As someone who has never dove deep into python, but has had some contact with it: the package manager ecosystem is the #1 thing keeping me away from it.
npm sucks and all, but at least it just works and doesn't get in my way as much.
Anecdotally, I've had more Node packages with native code in them fail to build for me when installed via npm, than Python packages with native code fail when installed via pip. That whole node-gyp thing is a huge mess.
> npm sucks and all, but at least it just works and doesn't get in my way as much.
used many package managers: pip, gem, go, npm, composer, etc... npm is the only one i have recent memories of having to kill & the only one that makes the fan go off (well okay most c/c++ do that too...)
quite frankly surprised by what i am seeing about python here. i have never been into the new stuff, pipenv, poetry, pipx, etc... maybe that's where the bad experience is coming from? i even don't know when and why it got so complex...
npm is equivalent to combining pip and virtualenv into a single tool. This gives better ergonomics when switching between projects since you never have to "activate" your environment, it's always activated when standing in the project directory.
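Spelled out with the standard tools, the Python equivalent of npm's one-step flow is roughly:

    python3 -m venv .venv              # per-project environment, like node_modules
    source .venv/bin/activate          # the extra "activate" step npm users never see
    pip install -r requirements.txt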
Isn't this what Pipenv does? What has been a downer for me is that many of the cloud providers do not support pipfiles in their serverless app services (Elastic Beanstalk, App Engine etc.)
On second thought, at least on GCP I should be able to put the pipfiles into .gcloudignore and just update the requirements.txt file with each new commit using git hooks, build scripts or a ci/cd tool.
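Something like this, assuming the pipenv flag hasn't changed (a sketch):

    pipenv lock -r > requirements.txt   # regenerate requirements.txt from Pipfile.lock before deploy
    git add requirements.txt            # or do this from a git hook / CI step instead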
That does sound convenient. I wonder if the virtualenv aspect is relevant though, i.e. do people really deploy npm apps outside of a container/isolation layer?
I imagine if you're deploying docker, you probably should be developing in docker (e.g. using PyCharm's remote interpreter/docker interpreter integration).
Coming from the scientific computing realm, I've only ever done "real" code in Python until a couple of weeks ago when I was forced to do some work in Typescript. Once I got over the initial worries about the best method to install npm to avoid Python-esque problems and gave nvm a try, I was very pleasantly surprised by the package management process and was able to just get on with the work.
I've tried various Python env management solutions in the past (mostly leaning towards conda), but had recently settled on just using separate LXC/LXD containers for each project.
Just spent ~1 hour trying to set up a working python environment .... so yes. It's in the classic phase where it has an ecosystem with a bunch of problems that are small enough that they aren't being tackled comprehensively by the core language, but large enough that n different solutions are being created in parallel by different groups. The result is an explosion in complexity for anybody just trying to get their job done and ... it's actually very unpythonic!
The alternative is an opinionated build system defined by the language developers.
Like any dictatorship, that can be fine if those in charge are benevolent and competent. For programming languages, the first is almost always true, but the second is far from guaranteed. Skill at programming language development has no bearing on skill at developing a build, packaging, and distribution system.
Go is a prime example of this. The language is pleasant to use with few ugly surprises, but their build system has been awful for a decade, only now reaching a semi-decent state with modules (which are still pretty damn ugly).
With python, on the other hand, there's competition in this space, and as a result the tools are pretty nice, albeit fragmented.
But then there's rust, which has a nice language AND a nice build system. You take a big risk when building both the language and build system; sometimes it works, sometimes it doesn't. And you risk fragmentation if you don't. It's a tough choice.
The thing is, Golang and Python had the same problem with regard to dependencies: they punted on it, and the community came up with several competing products that confused users.
Until now I just deal with that by always explicitly specifying the whole path to the Python executable. Granted, my needs are very run-of-the-mill, but wouldn't that suffice for many people? I've tried several times over the last few months to get into all this Python environment/package stuff, but it all feels like yak shaving... much like the hours and hours I spent customizing vim years ago, which were fun (at the time) but which weren't 'productive', in that they'll never in my lifetime pay themselves off.
I maintain a mix of old and new python projects (sadly, still working on migrating some older 2.7 stuff to 3) and my setup is the same as TheChaplain. I just keep a separate venv for each project and have it setup appropriately in VSCode. With VSCode, I don't even have to think about which venv I am working with outside of when I initially set the interpreter.
Until you have to deploy on a machine without internet access, and suddenly pip -r requirements is not enough, especially if you don't have a local pip mirror.
I develop a project that gets deployed to some users with locked down boxes (tight permissions, no internet), and it's really not that bad. You just download the dependencies using `pip download package_name` and bundle them with your project. Your install is basically the same; `pip install -r requirements.txt --no-index --find-links /path/localdeps/`.
It's not as nice as just doing a regular pip -r, but it works and isn't that much effort.
For me the biggest problem is C-based python modules that can't just be installed in your virtual environment but want to be part of the global installation.
Tkinter and mpi4py are the most recent ones I've had this problem with. I expect someone will tell me "it's trivial to install these in a venv, just do X", but X is not obvious to me.
It's trivial to install C programs with the appropriate tools such as Nix and Spack. You might end up in Tcl-hell in your new Python environment, but as it's not that widely used anymore, one should be fine.
Having said that: you generally want these packages integrated with your system (which provides a self-consistent Tcl as well as MPI environment).
Pyenv is for installing multiple versions of python. Virtualenvs are a layer beneath that.
It’s super useful for maintaining static versions of python, like 2.7, 3.6 and 3.7 when you have many projects that have different python requirements.
I understand how this was needed historically, when using the official installer might overwrite the Python you already had installed. But as far as I know, you can download an installer for a new version, run it, and it doesn't touch your previous installation.
For example I've had 3.7 on my macOS system for a while, installed from the official installer, not through Homebrew. I just installed 3.8, which pointed my python3 to 3.8. But my 3.7 was still there; I created an alias python3.7 to point to that. So I can run or install anything I want against 3.7 or 3.8.
Why do we still need pyenv? I'm not asking that antagonistically, I really don't understand at this point and I'm wondering if I'm missing something.
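For reference, the manual approach I'm describing is roughly this (the paths assume the python.org framework installers on macOS):

    # keep both interpreters around; the newest installer owns the plain python3 name
    alias python3.7='/Library/Frameworks/Python.framework/Versions/3.7/bin/python3'
    python3 --version        # 3.8, because the 3.8 installer repointed python3
    python3.7 -m venv env37  # still able to create environments against 3.7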
Specifically, pyenv will dynamically and programmatically link the $PATH for the Python executables (python, pip, etc.) to the desired version as defined either by an environment variable or by the contents of a .python-version file in CWD or an ancestor thereof. The .python-version file can be checked into version control.
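In practice that flow looks something like this (the version numbers are only examples):

    pyenv install 3.7.5
    pyenv install 3.8.0
    cd myproject
    pyenv local 3.8.0     # writes .python-version, which can be committed
    python --version      # resolved through pyenv's shims on $PATH -> 3.8.0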
Honestly I haven’t used official installations before so I can’t speak to that too much. Pyenv mostly uses official builds though so it’s mostly an automated frontend to manual installs.
I like being able to specify the global and local versions for my projects and the system as a whole. I also use it as a virtualenv manager. It works well with pipenv (which I still use in anger) and vscode.
I suspect, but could be wrong, that the disconnect here is because devs who are making open source packages need to make sure they run on multiple different versions of Python.
If you're working on closed-source code or have tight control of your environment, it's enough to develop and run on a single version, rendering pyenv and whatnot unnecessary.
Poetry will use the default “python” command found on the PATH. If you’re working on multiple Python interpreters for the same project, it’s very useful to combine Poetry with Pyenv.
Compared to other languages and ecosystems, it really is lagging behind. Dependency and version management were afterthoughts in Python. I dread having to maintain our Python projects.
We've stuck to virtualenv and pip mainly for the reason that we've got plumbing that works and we'd rather be doing other things than finding new plumbing.
Very few issues arise from our choice of build tooling. Not enough to consider switching at the moment. I suspect I'll try pyenv next time I have a new dev machine to set up, but only because it seems fairly painless to switch from virtualenv.
There's a lot of "shiny new toy syndrome" where people want to try the latest tool that is "allegedly" better but it's a pain in multiple other factors.
Pip works, virtualenv works. They might not be great tools, but they do the job.
I don't want to worry too much about my environment; that's why I'm skeptical about new tools. Because they might break when you least expect it.
> The fact that the author has to use a bunch of different tools to manage Python versions/projects is intimidating.
It just shows that Python is used for a lot of purposes and there is no single tool that handles all usecases.
> I gave up and started rewriting the project (web scraper) in Node.js with Puppeteer.
And when you'll get to the point where you need to work with different node projects, you will also need several tools to manage node versions and different environments, so that doesn't help at all.
The thing is, if I need to switch node versions, I can use nvm. And I don't need to manage different environments in node, because the dependencies are contained in "node_modules" and not "attached" to a Python interpreter instance.
I'm a Java developer and I make fun of Maven and Gradle as much as anyone, but overall it seems like I am better off than I would be in the Python ecosystem for dependency management as well as managing the version of the language I compile and run with.
To be fair, he says in the article that his requirements are somewhat different from most Python developers, to wit:
* "I need to develop against multiple Python versions - various Python 3 versions (3.6, 3.7, 3.8, mostly), PyPy, and occasionally Python 2.7 (less and less often, thankfully)."
* "I work on many projects simultaneously, each with different sets of dependencies, so some sort of virtual environment or isolation is critical."
It's kind of shocking to hear these two quoted as "different than most" requirements – as a Ruby developer, this sounds like exactly a thing that any engineer supporting production systems would need routinely. RVM and bundler are standard developer tools and I would never question the need for supporting multiple versions on the same machine, unless in a very well-defined scenario where RVM was unneeded (like in a containerized environment packaged through a pipeline.)
So sure, there is more than one way to manage a Ruby runtime version, but are there any competitors to Ruby's Bundler? I feel like it's the undisputed champion of Ruby dependency management, working with Rubygems, another unopposed incumbent in its own space, and it never even occurred to me that my language should have more than one of either such tool. Can someone help me understand what drove the Python ecosystem to need more than one dependency manager, ELI5?
I am pretty cloistered as a Ruby developer, but my limited professional experience with other language runtimes tells me that the Ruby "developer environment" experience is really second to none, (change my mind.) Is there any tool which is nearly comparable to Ruby's Pry for Python or other interpreted languages? (I doubt that there is!)
Managing gemsets and multiple versions of Ruby is old-hat.
I think contemporary best practices start with the production environment and work backwards towards creating a development environment as close to the production environment as possible.
These days, most production environments are effectively isolated containers. If your production environment is a container, you probably should develop in a container as well. In that case you don't need much tooling for isolating an application's dependencies from other applications.
The tooling that you do need is for building a Python application, which means (1) get the dependencies, (2) copy over some source code (or invoke the C compiler if building a C<->Python extension), and (3) run tests. Python's builtin setuptools does that fine. It didn't strike me as amazingly simple, but it's not amazingly complex either. pip is essentially a convenience wrapper for setuptools, i.e. pip is to setuptools as apt is to dpkg.
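Concretely, those three steps with the builtin tooling look roughly like this (a sketch assuming a conventional setup.py layout):

    pip install -r requirements.txt   # (1) get the dependencies
    pip install .                     # (2) copy the code / build any C extensions via setup.py
    python setup.py test              # (3) run the tests with setuptools' built-in test command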
Basically, I believe that because of Docker, isolation is an irrelevant criterion by which to judge a language/ecosystem.
Well then I guess you've reasoned yourself into a nice position from which you can claim that the problem I routinely handle cleanly almost every day is out of scope and unworthy of attention. This is the one important feature of Ruby that has enabled me to keep working without containers.
I'm a developer that supports multiple production applications, and I frequently need to make my development environment as similar to production as possible in order to reproduce a production issue for debugging purposes. I depend on that isolation to be able to do this. My work environment is such that I'm not generally permitted to use containers in production (yet). So it stands to reason through your argument that perhaps I shouldn't use them in development either. It sounds like if I were using Python as well instead of Ruby, I'd be having a much harder time.
Honestly, if I could use containers in dev and prod I would; I truly do believe the grass is greener ;) But I would not sacrifice this marvelous isolation tech. In fact, I'd prefer to take RVM with me into docker-land so that I can A/B test Ruby versions within the same container image, and be guaranteed that my cluster's worker nodes won't have to take extraordinary measures and carry both images just to ensure the application can boot without a download delay whenever we have to revert the canary (or whatever other minor, potentially reversible lifecycle event would normally force a node to download a new, expensive base set of image layers all over again).
Bundler is really nice and much better than virtualenv/pipenv/etc. I find this sadly ironic as I actually consider most of the Python ecosystem to have much more breadth and quality in terms of libraries; it’s just a pain in the ass to manage those dependencies once you actually want to use them.
Package Management - npm? bower? yarn? Which should I use this week?
Interpreter Versions - Revisiting a project I last touched 2 years ago on node 8 has a load of locked dependencies that only work on that node version. OK, let's bring in nvm so I can get a dev environment going.
Executable Packages - oh no I've got two different projects with different and incompatible versions of apollo, what do I do? Oh right, sure npx for executable isolation so we don't pollute the global namespace.
Every ecosystem has these problems, and if they don't it's probably because they're still relatively esoteric.
> Every ecosystem has these problems, and if they don't it's probably because they're still relatively esoteric.
Exactly! I'm not aware of any non-compiled language where (all) these issues are solved much better. I can be very productive with the tools I mentioned above and I'm glad that they work almost identical for both my main drivers (Python and JS/TS).
I think it really comes down to Python not having a chosen way to handle package management as well as Python being dependent on the underlying C libraries and compilers for the given platform.
Since Python did not prescribe a way to handle it the community has invented multiple competing ways to solve the problem, most of which have shortcomings in one way or another.
To further add to the confusion, most Linux and Unix-based operating systems (Linux, MacOS, etc.) have their own system Python which can easily get fouled up if one is not careful.
This is one place where Java's use of a virtual machine REALLY shines. You can build an uberjar of your application and throw it at a JVM and (barring JNI or non-standard database drivers) it just works. There is also usually no "system Java", so there is nothing to break along those lines.
Exactly. It's not Python's only problem, but far and away the most painful snags I've hit with packages is when they use C code, and thereby drag in the whole system. "I'll just `pip install` this- Oh, I need to install foo-dev? Okay, `apt-install foo-dev`... oh, that's in Ubuntu but not Debian? Well this is gonna be fun..." Now I trend a bit more exotic in my systems (NixOS, Termux, *BSD, ...) but if my Python were just Python it would just work so long as I have a working Python install; in practice that's rarely enough.
You can have multiple JVMs or JDKs installed, and therefore the need to change environment variables depending on your use cases, but I was referring to Java being part of the operating system in the same way that Python is part of some operating systems, for example several Linux distributions (Fedora, RHEL, and practically all derivatives).
Isn't it still just a package on Red Hat distros? A base system package, granted, because some system tools are written in Python.
But in any case, it just becomes one more version of Python to consider. If you're already dealing with multiple versions, what difference does it make?
It is "just a package" in the sense that there are RPMs for Python, but many system management tools are Python scripts that assume you have Python and specific Python libraries installed so that everything will run correctly.
There's no issue with using the system Python, but any Python packages should be installed via yum or similar Red Hat / Fedora tools and not pip.
Note that the newer versions of RHEL have created developer-specific tool packages to separate the system packages from developer packages. This allows the developer packages to get upgraded quickly so developers have new, shiny tools without breaking the compatibility that the base system needs to keep running.
I'm not sure how people manage to "foul up" their system Python, but you are doing something extremely wrong when you give random devtools root access to transform a typical (non-NixOS-and-friends) production environment (your workstation!) into a custom development environment.
Given that: are you sure the problems arise around pure Python packages (which generally have good enough forward compatibility), or is the problem with all the cool "machine-code"-embedded packages (of which there are a lot!)? And yeah, those might indeed break randomly when installed on different systems from different time periods. But that's a problem all binary packages have!
I see the same problems with the python ecosystem.
There are a lot of tools, and confusion between versions, especially because of the breaking changes between v2 and v3 (try explaining to someone why he has both and why typing python gives him v2.7 by default).
I love the elegance and simplicity of the language and many tools written with it but this is a point I'd really much appreciate to be improved.
Because of that I sometimes just rewrite something rather than fixing it under 2.7. It's perhaps a bit more work sometimes, but not as frustrating as trying to get something running that is deprecated in parts.
It occurs to me that, with respect to version dependency, you can think of Python and Java programming as similar to Smalltalk programming: you program and alter the environment.
In Smalltalk you change parts of the environment. In Python, Java (and Ruby?), you change the entire environment, as described in TFA.
People hate on Gradle endlessly, but the fact that 99% of my JVM based applications can be successfully launched including entirely self-contained dependencies with
./gradlew run
is a huge boon and one of the things that keeps me sticking with the ecosystem.
You kind of mentioned this yourself already, but this boon is more of a feature of the JVM (the classpath) rather than the dependency manager.
If Python would have a similar concept rather than depending on a global module location we would be able to replicate the same developer ergonomics as we have for the JVM.
Well, with Haskell and Clojure, many people just use a pretty plain text editor plus a REPL. And for Haskell at least, that's super easy to install. I suppose there isn't One True Way, but each of the popular options (cabal, Nix+cabal, stack) is only a couple of steps.
Haskell with new-style cabal works like a charm with `cabal init` for package setup. And then ghci from there will give you...
- expression evaluation ofc
- type-at-point via type hole annotations
- Laziness inspector via :print
- Breakpoint debugger (someone just posted a nice Reddit text post about it today)
- Package information of terms via :info (and :load $MODULE to get any top-level term in $MODULE into repl scope)
I use mostly JavaScript for my day-to-day web stuff. I've been really turned off from using Python for more things because of these issues. My experience with managing dependencies in JS has been much easier than with Python--I'm really astonished that such a popular language has done such a bad job at this for so long.
I read it as a symptom of a very active project. It’s being taken to new places daily and independently. It might be somewhat like all the various Linux distributions. Somewhat overwhelming from the outside but in practice not so much.
Yes. As a non-Python programmer, I sometimes had to make software or dependencies written in Python work. It was always a long series of steps: installing some package manager, setting up a virtual environment, running another package manager, etc. And of course, it failed at some point before the whole thing was working.
On the contrary, I seldom had these kinds of issues with projects coded in C#, C or C++: most of the time the few steps to compile the project succeeded and produced a usable binary.
Really? With a decade long experience with C# I have never found the nuget package manager to be superior to pip. I don’t think nuget is necessarily worse either, but there are so many abandoned packages that died with some .Net version. Which means you’re either building your own extensions or abandoning packages.
As far as virtual enviroments go, I actually kind of like them. They were containers before containers became a thing, and I’ve had much fewer issues with them than say NPM, but they are more burdensome than adding full binaries to your C# project.
Where compiled languages shine, and maybe I'm misunderstanding you, is when you need to distribute an executable to end-users. C# is much better at that than Python, but we haven't actually done that at my shop in around 10 years.
It's admittedly bad, any Python dev who says otherwise isn't being honest.
That said, once you get it down, you're not burdened by it much/at all. You can start and begin work on a new project in seconds, which feels important for a language that prides itself on ease of use.
I have never understood the need for all the different tools surrounding Python packaging, development environments, or things like Pipenv. For years, I have used Virtualenv and a script to create a virtual environment in my project folder. It's as simple as a node_modules folder, the confusion around it is puzzling to me.
Nowadays, using setuptools to create packages is really easy too, there's a great tutorial on the main Python site. It's not as easy as node.js, sure, but there's tools like Cookiecutter to remove the boilerplate from new packages.
requirements.txt files aren't very elegant, but they work well enough.
And with Docker, all this is even easier. The Python + Docker story is really nice.
Honestly I just love these basic tools and how they let me do my job without worrying about are they the latest and greatest. My python setup has been stable for years and I am so productive with it.
I'm firmly set on virtualenv with virtualenvwrapper for some convenience functions. Need a new space for a project? mkvirtualenv -p /path/to/python projectname (-p only if I'm not using the default configured in the virtualenv config file, which is rare)
From there it's just "workon projectname" and just "deactivate" when I'm done (or "workon otherprojectname")
It has been stable and working for ages now. I just don't see any strong incentive to change.
I have been doing this since Ubuntu 14.04. It's been stable even as I upgraded to 18.04, which has Python 3 as the default. The only downside compared with a tool such as pipenv is the automatic update of packages that pipenv can offer, and its ability to be integrated into CI/CD pipelines.
I am fine with venv, but I need to keep a list of root dependencies and separate dev dependencies from those that the product needs to work. Since I need tools for this functionality, why not just use pipenv or poetry?
Isn't this just a variant of what the original comment is critiquing, though? In order to sanely use Python and external dependencies you need some highly stateful, magical tool that does 'environments' for you. The conceptual load of this is quite high. Adding docker only makes it higher - now you're not merely 'virtualizing' a single language runtime but an entire OS as well - just to deal with the fact your language runtime has trouble handling dependencies.
if the conceptual load of having a large collection of binaries separately from your project (and ready for reuse!) is that high, you probably should use something else. I wonder how C/C++ people just manage this hardship...
(and for all npm-friends: it's exactly the same "conceptual load" which arises from having node 8 and 12 around. except that these will be just incompatible by default, so noone bothers)
I don't understand the point you're making, where does a 'large collection of binaries' come in? The thing I'm talking about is right in the name 'virtual environment'. Or are you simply answering the GP's comment about this being a potential wart in Python with the suggestion that people should just use something else?
The point is that the classic virtualenv does not take care of different base distributions (3.5, 3.6, 3.7...). You have to install them yourself and switch them in your PATH before setting up a virtualenv. If this is too much "conceptual load", I wish the parent good luck finding something that doesn't have this problem ;). If there are binaries involved, there is the additional problem that you also depend on compiler features and C libs, which I wish everyone good luck replicating in their CI for any language (containerization and NixOS/guix/spack are approaches to that problem).
I think we are talking about completely different things. Forget the actual virtualenv and Python for a sec.
Imagine you want to check out the hypothetical language Jagoust. This goes something like so - you install Jagoust and maybe some Jagoust tools. You fire up your interactive environment and set it up so it knows where Jagoust and its tools are. Using your enviornment you do things like tell Jagoust where your project is or which particular version of Jagoust to use, etc.. And that's pretty much it and while the details are different, the process is quite familiar to you since it just leverages your standard interactive environment in ways mechanically similar to Rugoja, another language you know.
Later you decide you want to try the language Snek. The first thing you notice in the Snek tutorial is that your standard interactive environment is somehow not good enough - you need a meta-environment for it and your project. Perplexed, you google some more docs and ask some Snek people and find out that meta-environment used to be how things were done but these days, you probably want a quasi- or para-environment. But why do you need an environment for your environment? You do not know. Such are the mysteries of the Snek.
Well, you need some way to switch between different variants of Jagoust in the first setup too? And if everything is so well integrated (w/o bash-scripts and all...), just name Jagoust ;).
No one forces people to use virtualenvs with "snek"/Python. They can always install Python somewhere and "pip install" packages into that installation hierarchy, switching around different hierarchies with bash scripts (and now have a look at what all the other interactive environments do...).
Last time I looked, it was very tedious to set up `pip` to be secure and pin your dependencies to hashes. Without this, a compromise of a library's PyPI account would allow them to execute arbitrary code on your system, assuming you didn't notice the change.
You can use `pip-tools` to get something like a Gemfile/package.json, but there are a few restrictions that are suboptimal.
So Pipenv/Poetry are the current best ways to get something like the package management story that other languages like Ruby and JS have had for a long time (and that Go's been polishing recently too).
I was going to say that pinning hashes seems to be painful, but since I last looked it now has a --generate-hashes flag.
Also if you want to link directly to a git repo, you can only install it in editable mode, you can't just install it with the rest of your packages (means you get a `src` directory where you ran pip, which makes Dockerizing slightly annoying, and probably impacts performance slightly).
Maybe this has also been fixed since I last upgraded.
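For reference, the hashed-pinning flow with pip-tools looks roughly like this (it assumes your top-level deps live in a requirements.in file):

    pip-compile --generate-hashes requirements.in     # emits a fully pinned requirements.txt with hashes
    pip install --require-hashes -r requirements.txt  # pip rejects anything whose hash doesn't match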
You’re absolutely right. I use pipenv almost entirely because it can activate a local environment automatically when entering a folder in the terminal. That, and the fact that my virtualenv folder lives in ~/.local, lets me work directly in Dropbox. Nothing I couldn't live without.
Seriously. The author is bending over backwards to accommodate Poetry from every direction, from the stupidest installation instructions I've heard, to "can't transfer from requirements.txt", to "it doesn't work well with Docker, but doable". Like, what exactly does it add that's worth all this complexity? Make you a mai tai every hour?
> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that."
For the small Python projects I did, I used venv and pip. Learned my lesson after wasting a couple of hours fighting through dependency issues.
Being from a Java shop for a long time: if I have to switch between different versions of Java, all I do is change JAVA_HOME to point to the correct version, go to the base project directory, and "mvn clean install" does the job. :)
Miniconda makes this really simple and their doc on environments is easy to read/understand. The benefits of conda really shine when trying to install a package with external dependencies to an environment.
The one thing I thought was neat here was pipx. I do have a few CLIs set up in my default conda env and haven't run into any dependency problems yet, but have occasionally tried to use them while another env is activated. Having a separate env automatically created for the entry points is a nice value add.
Eeeehhh I think I will be downvoted to hell and back for this but after I read the article I had the feeling of "why are you making this feel more complex than it needs to be?"
I mean, compared to Java and C# I have a MUCH EASIER time setting up my development environment. Installing Python, if I am on a Windows box I mean, is enough to satisfy a lot of the requirements. I then clone the repo of the project and
> "why are you making this feel more complex than it needs to be?"
Because it's more complex if you have projects on multiple Python versions and if you want to lock your Python packages to specific versions. (Pip can bite you when different packages have different requirements for the same lib).
> Although Docker meets all these requirements, I don't really like using it. I find it slow, frustrating, and overkill for my purposes.
How so? I've been using Docker for development for years now and haven't experienced this, except for some slowness I experienced with Docker Compose upon upgrading to macOS Catalina (which turned out to be a bug with PyInstaller, not Docker or Docker Compose). This is on a Mac, btw; I hear that Docker on Linux is blazing fast.
I personally would absolutely leverage Docker for the setup being described here: multiple versions with lots of environmental differences between each other. That's what Docker was made for! I would love to read a blog post covering how to do this!
My experience has been that it's significantly more effort to meet my requirements with Docker, and that I spend a _lot_ of time waiting on Docker builds, or trying to debug finicky issues in my Dockerfile/docker-compose setup.
I'm sure all of these things have fixes -- of course they do! But I find the toolset challenging, and the documentation difficult to follow. I'd love to learn what I'm missing, but I also need to balance that against Getting Shit Done.
It seems like long builds are either (a) necessary or (b) user error. (a) If you have a tree of dependencies and you change the root, you should rebuild everything that depends on it to make sure it's still compatible. (b) if you placed your application into one of the initial Dockerfile layers, but then you're installing dependencies that don't depend on you, it's user error.
What's the situation where your application needs to go first in the Dockerfile, and then you need to put a bunch of stuff that doesn't depend on your application?
The Dockerfile that's provided looks like it would be very slow to build. I always try to make Dockerfiles that install deps and then install my python package (usually just copy in the code and set PYTHONPATH) to fully take advantage of the docker build cache. When you have lots of services it really reduces the time it takes to iterate with `docker-compose up -d --build`-like setups.
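A minimal sketch of that ordering (the image tag and file names here are assumptions, not taken from the article's Dockerfile):

    cat > Dockerfile <<'EOF'
    FROM python:3.8-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install -r requirements.txt   # cached until requirements.txt changes
    COPY . .
    EOF
    docker build -t myapp .   # code edits now only invalidate the final COPY layer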
In addition to the popular conda, it's worth checking out WinPython for scientific use. Each WinPython installation is an isolated environment that resides in a folder. To move an installation to another computer, just copy the folder. To completely remove it from your system, delete the folder.
I find it useful to keep a WinPython installation on a flash drive in my pocket. I can plug it into somebody's computer and run my own stuff, without worrying that I'm going to bollix up their system.
I've used both and recommend Poetry. It's got a larger feature set (it can be used to manage packages _and_ publish packages), it's faster, and it's more actively developed (measured by releases). Pipenv's last release was 2018-11-26, and Poetry has been publishing pre-releases as recently as three days ago to prepare for v1.0.0.
I did a quick comparison here[0], and I'm planning to do an update with the latest version of Poetry.
Similar to the OP, I found pipenv to be quite unstable. At the time (about a year ago) it was really more interesting beta software than the production-quality software it was advertised as. It was also quite a bit slower than pip.
But what really pushed me away is that installing or upgrading any single package upgraded all dependencies, with no way to disable this behavior. (I believe you can now.) A package manager should help me manage change (and thereby risk), not prevent me from doing so.
Poetry is probably the best of the all-in-one solutions. It does its job well but I've found the documentation lacking.
In the end, I've settled on pyenv-virtualenv to manage my environments and pip-tools to manage dependencies. It's simple and meets my needs.
Pipenv has been nearly completely broken for a year without a release. Installing from scratch rarely works without providing a full path to the python you want to reference.
Now that poetry also manages virtual environments it’s far and away the better choice.
Caveat - Heroku doesn’t understand pyproject files yet, so no native poetry integration. Heroku are working on this.
I switched from pipenv to poetry over 1 year ago. I love it!
The main reasoning was so that I could easily build and publish packages to a private repository and then easily import packages from both pypi and the private repository.
We have a custom PyPI server and need all requests to go through it; however, we haven't figured out a way to make Poetry always use our index server for all modules instead of pypi.org.
Poetry is amazing, if only for the ability to separate dev and build dependencies. Maybe pipenv does this, but I couldn't get it working well enough to find out.
I've given up on trying to manage Python versions on my machine and just develop inside Docker containers now (thanks to VSCode), using tightly versioned Python base images.
I chose not to use either after trying both. Neither solves understanding `setup.py` (they are just indirections on it). Of the two, Poetry seemed more comprehensive and stable across releases. There's a small cognitive load of knowing a couple of twine commands if you don't use either.
Switched to poetry and couldn't be happier. From my experience, poetry wins hands down. It managed to replace flit, remove duplicate dependencies, and maintain stability across machines. All while using the standard pyproject.toml configuration file.
> On Linux, the system Python is used by the OS itself, so if you hose your Python you can hose your system.
I never managed to hose the OS Python on Linux, by sticking to a very simple rule: DON'T BE ROOT. Don't work as root, don't run `sudo`.
On Linux, I use the system python + virtualenv. Good enough.
When I need a different python version, I use docker (or podman, which is an awesome docker replacement in context of development environments) + virtualenv in the container. (Why virtualenv in the container? Because I use them outside the container as well, and IMHO it can't hurt to be consistent).
I love Python syntax, but I still haven't found a sufficiently popular way to deploy my code with the same set of settings as my dev box (other than literally shipping a VM).
So setting up a dev env is one problem, but deploying it so that the prod env is the same and works the same is another.
which is another layer of abstraction and dependency that I do not really need, e.g. poetry being no longer maintained, poetry (or whatever) having an urgent bugfix, etc.
No, pip has been available in standard Python distributions since 3.4.[1] Distributions that don't come with pip (e.g. some Linux distro packages) are non-standard.
This article is great; those are viable solutions for sure. One of the alternatives is conda: it's common among data scientists, but many of its features (isolation between environments, the ability to keep a private repository off the internet) meet enterprise needs.
I would generally reach for conda instead of this, but they seem quite comparable in aggregate.
And, given that I've been trying NixOS lately and had loads of trouble and failing to get Conda to work, I will definitely give this setup a try.
(I haven't quite embraced the nix-shell everything solution. It still has trouble with some things. My current workaround is a Dockerfile and a requirements.txt file, which does work...)
I like Python as a language, but when I see how clean the tools of other similar languages are, for example Ruby, compared to the clusterf*ck of the Python ecosystem, it just makes me want to close the terminal. I'm always wondering how it became the #1 language on Stack Overflow.
Seconded. Just to be clear asdf manages interpreters, not project dependencies. It actually uses pyenv under the hood to manage Python versions. I use it to manage Elixir and Python versions and it works rather well. I also find its CLI interface well designed and consistent.
There are two things that I find a bit elusive with Python:
1. Highlight to run
2. Remoting into a kernel
Both features are somewhat related. I want to be able to fire up a Python kernel on a remote server. I want to be able to connect to it easily (not having to ssh tunnel over 6 different ports). I want to connect my IDE to it and easily send commands and view data objects remotely. Spyder does all this, but it's not great. You have to run a custom kernel to be able to view variables locally.
Finally, I want to be able to connect to a Nameko or Flask instance as I would any remote kernel and hot-swap code out as needed.
In my experience, conda breaks quite often. Most recently, conda has changed the location where it stores DLLs (e.g. for PyQt), which broke pyinstaller-based workflows.
In principle, it's a good idea; in practice, I'm not satisfied. On Windows, it's an easy solution, especially for packages that depend on non-python dependencies (e.g. hdf5).
Start with `docker` and learn the basic concepts: the difference between an image and a container, layers, etc. Copy a Python `Dockerfile` and see that it works. After a while you'll get the hang of it and will be able to know what to google and how to navigate the Docker manual. Pythonspeed.com has some good protips once you understand the basics.
You'll get confident and from there learning `docker-compose` is a breeze.
I use pip-tools. It fits in nicely as an additional component to the standard toolset (pip and virtualenv). But most people probably do not need to freeze environments so it's great to be able to not use it for most projects.
I can vouch for this. Anaconda is especially good for simulation/data stuff (based on the focus on which packages are included by default).
One pain point though: getting it to work with Sublime Text 3 requires you to set the `CONDA_DLL_SEARCH_MODIFICATION_ENABLE` environment variable to `1` on Windows.
Not a flaw of Anaconda: it just pays attention to how to work with multiple Python installations on Windows.
Hi kovek! You might like https://www.codementor.io/ . I admit I'm a mentor there, and I've made about twenty bucks helping people, but anyway it was super fun helping people. :)
With pipx when you install things they go into isolated environments. With pip you're just installing things globally.
This difference is important due to dependencies- if you have two different CLI tools you want to install but they have conflicting dependencies then pip is going to put at least one of them into an unusable state, while pipx will allow them to both coexist on the same system.
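A quick sketch of what that looks like (the package names are just examples):

    python3 -m pip install --user pipx
    pipx install black       # each tool gets its own venv, typically under ~/.local/pipx/venvs
    pipx install httpie      # so black's and httpie's dependencies never collide
    pipx list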
I haven't used pipx, but as far as I understand, pipx = pip + venv.
If your pip executable is in a virtualenv, the "globally installed" is locally installed.
pipx, poetry, pipenv and co are still nice wrappers to have, I suppose. It just feels less useful now that most of my projects are dockerized.
> Governance: the lead of Pipenv was someone with a history of not treating his collaborators well. That gave me some serious concerns about the future of the project, and of my ability to get bugs fixed.
Doesn't seem fair. You're not abandoning requests, are you?
Is this comment just trolling? Load the main page of his site.
> I'm a software developer, co-creator of Django, and an experienced engineering leader. I previously ran teams at 18F and Heroku. I'm currently taking new clients through my consultancy, REVSYS.
Please don't break HN's guidelines by being snarky or putting down others' work in a shallow way. If you know more or have a different perspective to offer, try sharing some of what you know so we can all learn something!
Being in the data science community myself, I prefer straight venv + pip to conda. It’s simpler for me to manage errors. I only use conda when I have to.
Hi whalesalad, good to meet you! Now that you know me, you can never say that again anymore :-) Although, tbf, I only use conda for my machine learning related projects. I've tried using pip for that but was at risk of massive hair loss.
In the scientific community, there is a widespread "just use anaconda" message. Many people spend their entire lives inside anaconda, and equate it with Python.
1. Argument from authority doesn’t mean anything to me. I also don’t believe creating Django or running a Python consultancy endow someone with especially useful opinions of Python packaging tooling. (Not that the author isn’t knowledgeable, just you seem to think there’s an A implies B relationship between those two items and having good opinions about Python packaging, and there’s not).
2. Conda is quite widely used outside of data science. It’s for example part of Anaconda enterprise offerings used by huge banks, government agencies, universities, etc., on large projects often with no use cases related to data science. Conda itself has no logical connection with data science, it’s just a package & environment manager.
In each of my last 4 jobs, 2 at large Fortune 500 ecommerce companies, conda has been the environment manager used for all internal Python development. Still use pip a lot within conda envs, but conda is the one broader constant.
Giving a counterexample is not argument from authority. I did not respond to the parent comment to discuss any feature of conda, only to dispel the wrong claim that only mostly data science projects rely on it.