> Python actually has another, more primitive, package manager called easy_install, which is installed automatically when you install Python itself.
It's actually not; it's part of setuptools/distribute, though some Python distributions (actually just brew, as far as I know) include distribute alongside Python.
Also, while the quick skim of the rest of this looks mostly good, there's some unnecessary advice which complicates things.
virtualenv is not necessary if the crux of the advice is "don't install packages globally" (which is fantastic advice, second probably to the more important advice "don't install stuff with sudo").
What beginners (and even some more experienced programmers) need to learn about is --user. You install packages with `pip install --user thing`, and they become per-user installed (be sure to add `~/.local/bin` to your `$PATH`). This is enough to not require sudo and for 99% of cases is actually sufficient.
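As a quick sketch of that workflow (the package name is just an example, and the exact per-user path varies by platform):

```
# install into the per-user site-packages instead of system-wide
pip install --user requests

# make sure per-user console scripts are on your PATH (e.g. in ~/.bashrc)
export PATH="$HOME/.local/bin:$PATH"
```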
There is a rare case where you actually have packages that depend on different, conflicting versions of another dependency. This has happened ~1 time to me.
Don't get me wrong, I like virtualenv for other reasons (relating to workflow and maintenance), but if we're teaching people about packaging, there's no particularly great reason to dedicate most of a tutorial on it.
> What beginners (and even some more experienced programmers) need to learn about is --user. You install packages with `pip install --user thing`, and they become per-user installed (be sure to add `~/.local/bin` to your `$PATH`). This is enough to not require sudo and for 99% of cases is actually sufficient.
Is this relevant advice for packages needed by your web server account, e.g. www-data?
Thanks, this is great advice. easy_install comes preinstalled on a Mac, which is where the confusion arose. Will update the article when I get a chance.
As I install/develop my python apps into VMs with a fixed python version, I rarely use virtualenv... don't see the need for the extra complexity.
I've eagerly read pieces like this but haven't yet found out the reason this solution is problematic or that I'm doing it wrong. Just that no one else seems to be recommending it. Anyone have an idea?
One other thing I'm not happy about regarding packaging best practices (and PyPI) is that security updates cannot be automated, leading to vulnerable packages.
Virtualenv is about managing multiple/conflicting versions of libraries, not different versions of Python itself. You're accomplishing the same thing by using a different VM for each app, just with higher overhead and isolation of things other than Python as well.
I recently learned the lesson about global packages. I thought it would be so nice to have all packages readily on hand, but now startup time is about 15 seconds of crawling the filesystem looking for files (over NFS).
Just start using virtualenv for everything. By default virtualenv starts you with a clean python setup each time and nothing you've installed globally with pip will affect you.
Lots of benefits, but trading 'cd path/to/my/project && source env/bin/activate' for 'workon project_env' (with autocomplete) is alone easily worth the five seconds it takes to check it out.
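For anyone who hasn't tried it yet, a rough sketch of the plain-virtualenv workflow being traded away here (the directory name is arbitrary):

```
# create an isolated environment inside the project
virtualenv env

# activate it; python and pip now point inside env/
source env/bin/activate
pip install requests

# leave the environment when done
deactivate
```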
I find it's really useful for beginners to clearly understand what's actually happening under the covers before they start adding magical stuff on top. They can always add magic later, if they prefer that approach.
> Lots of benefits, but trading 'cd path/to/my/project && source env/bin/activate' for 'workon project_env' (with autocomplete) is alone easily worth the five seconds it takes to check it out.
And/or use the virtualenvwrapper plug-in for oh-my-zsh. Automatically activates virtualenv when you cd to the working directory.
The only problem with virtualenvwrapper is setting it up can and does cause problems for beginning developers new to the shell. Otherwise it's really awesome. :-)
Definitely worth using as an experienced Python person. But you can skip a lot of confusion with new Python developers by waiting until later to introduce it :)
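For reference, the setup that tends to trip people up is roughly this (the script path varies by how virtualenvwrapper was installed, so treat it as an example):

```
pip install --user virtualenvwrapper

# in ~/.bashrc or ~/.zshrc; adjust the script path for your install
export WORKON_HOME=~/.virtualenvs
source ~/.local/bin/virtualenvwrapper.sh

# afterwards:
mkvirtualenv project_env    # create and activate a new env
workon project_env          # reactivate it later, with tab completion
```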
Man, global system-wide installations that require admin rights by default? That's certainly something! Quite the stark contrast to Node.js and npm, where everything is installed locally into the current directory (under node_modules) by default, and "global" installation is actually a per-user installation. Tricking pip with virtualenv seems to get you pretty close to what you get by default with npm, albeit still somewhat clunkier. But to be fair, most other package-managing solutions seem to pale in comparison to npm :-)
Either way, nice article. Now if only most packages weren't still for Python 2... PyPI says it has 30099 packages total, but only around 2104 of them are for Python 3 (going by the number of entries on the "Python 3 Packages" page [1]).
npm's default for -g is to install to Node's prefix, which is usually /usr or /usr/local. If you want it to install to your home directory, you can set the prefix to somewhere appropriate in your ~/.npmrc, which gives roughly the same behavior as pip's --user flag.
Edit: perhaps you changed your .npmrc or set the option via npm and forgot about it? I just checked on a fresh user, and 'npm install -g' definitely tries to install to /usr, just like pip.
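If you do want per-user "global" installs from npm, a sketch of that configuration (the directory name is arbitrary):

```
# point npm's prefix at a user-writable directory
npm config set prefix "$HOME/.npm-global"

# make the installed CLI tools visible
export PATH="$HOME/.npm-global/bin:$PATH"
```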
That sounds broken to me. Then it's no longer -g for global. You do run your app under a different account than your user-account, right? Something like "nodeuser"?
No, why would I do something like that on a development box? (Or run web stuff on Windows servers for that matter.) And pretty much the only things I install with -g are useful CLI tools - any code I write will have its dependencies installed locally and listed in package.json for 'npm install'.
It wasn't clear (to me) that this was a development box. And it certainly wasn't something npm could know -- so my point still stands. If there's a way to install packages globally, then they should be globally available -- also on windows. But perhaps this is documented somewhere.
As for why you would run stuff on windows, perhaps you were writing an ajax gateway to a legacy system and it made more sense to run the node server on the same machine as the legacy system?
(To be clear, I would pity you if that was the case, but you never know ;-)
The primary use case of npm is quite different. No one installs system-wide npm packages.
Virtualenv solves a different problem (create a complete Python environment inside a directory) so you can replicate the various production setups in your machine and develop. It's not a way to avoid admin-privileges to install system software, for that you can just pip install --user, use homebrew, whatever.
Based on the article I'd say the main reason to use it is so that you can have what amounts to local packages instead of having to rely on global packages (be they system-wide or user-specific). This is what npm does by default - packages are installed locally to node_modules.
And for replicating production setups I'd rather take it a step further and use something like Vagrant instead of replicating just one part of the setup (Python).
...Which would be equivalent to npm install -g XXX, whereas to replicate npm install XXX you'd need virtualenv. I don't think there even is an equivalent to pip install XXX without virtualenv with Node/npm (global system-wide installation for all users that requires admin rights).
If you're in Brighton in the UK and you like this then maybe you'll be interested in the one day workshop run by Jamie (author of the blog post) and myself. This post was actually based on some of the material written by Jamie for the course.
Next one is happening next Thursday and there are still a couple of tickets:
Nice article, but after using Leiningen (the Clojure solution to a similar problem, based on Maven), it's really hard to go back to something like this. I really, really wish there was an equivalent in Python (really, in every language I use).
Generally you should strongly avoid putting generated artefacts into version control. This leads to complete pain if ever you find yourself trying to diff or merge when they inevitably change. The problem is that you end up with conflicts which are completely unnecessary - you should always be able to just regenerate the virtualenv at any time.
This is especially true for non-relocatable artefacts (as others have mentioned) such as virtualenvs or compiled binaries.
Another thing is that these generated artefacts can be costly in terms of space consumed in the repository - maybe not so much for a virtualenv with one package in it, but for binaries or larger virtualenvs, these things can become quite large. In addition they're often not so friendly for git's delta compression which is better suited for textual data. You can end up unnecessarily increasing the size of your repository significantly, which is another thing best avoided.
Keep your requirements file checked in, but not the virtualenv. The built env should be seen as disposable, and is both location- (i.e. path on disk) and platform-specific (for libs which are not pure Python).
I keep my virtualenvs in ~/.virtualenvs/, away from the project.
I find it best to keep virtual envs completely away from the project (I use http://virtualenvwrapper.readthedocs.org/en/latest/ which puts them by default in ~/.virtualenvs). A virtualenv is completely machine-specific.
If your project is a package itself (i.e. it has a setup.py file), then use that file to specify dependencies. On a new machine I check out a copy, create a virtual env and activate it. Then in the local copy I run "pip install -e .". This installs all the requirements from setup.py in the virtualenv, and links the local copy of my project to it as well. Now your package is available in the virtual env, but fully editable.
If your python project is not a package, you can install its dependencies in a virtual env with pip. Then run "pip freeze" to generate a list of all installed packages. Save that to a text file in your repository, e.g. ``requirements.txt``. On a different machine, or a fresh venv, you can then do "pip install -r requirements.txt" to set everything up in one go.
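As a rough sketch of both workflows (package names are just examples):

```
# package-style project: an editable install pulls dependencies from setup.py
virtualenv env
source env/bin/activate
pip install -e .

# non-package project: record and restore the environment instead
pip install requests flask
pip freeze > requirements.txt
# later, in a fresh venv or on another machine:
pip install -r requirements.txt
```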
Alright, so after I set up the environment using pip and virtualenv, I see it has python in it, etc. If I use pip freeze > requirements.txt, it lists the packages I have installed using pip, but it doesn't list anything for the python version itself. How do I make sure the right python version gets captured if I don't check in the /env/ folder?
I think this works well as a convention even if you're not deploying to Heroku. I also like the suggestion to put a guard in setup.py that checks sys.version_info.
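Something along these lines, I think (a minimal sketch; the version bound and names are just placeholders):

```python
# setup.py
import sys

# fail fast with a clear message rather than with obscure errors later on
if sys.version_info < (2, 7):
    sys.exit("This project requires Python 2.7 or newer.")

from setuptools import setup

setup(
    name="myproject",        # placeholder
    version="0.1",
    packages=["myproject"],
)
```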
Virtual envs don't relocate well; they tend to be very specific to the machine and even the install location. Plus, you can recreate them from your requirements.txt file, so there's no need.
You could have issues with other people not running the same version of Python; you might also have different site packages if you are working on an experimental branch that will be pushed or merged later.
I would just like to give some different advice regarding creating virtualenvs and installing dependencies:
When you create the virtualenv, the current package you're working on doesn't get added to site-packages, so you're forced to be at the repository root to import the package.
The best approach is to have a proper setup.py file so you can do `python setup.py develop`, which will link the package you're working on into the virtualenv's site-packages. This way it acts as if it's installed and you can import it any way you like.
If you define your requirements in setup.py (I think you should), you can even skip the `pip install -r requirements.txt` step.
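A minimal setup.py along those lines might look like this (names and version pins are placeholders):

```python
from setuptools import setup, find_packages

setup(
    name="mypackage",        # placeholder
    version="0.1",
    packages=find_packages(),
    # declare dependencies here instead of (or as well as) requirements.txt;
    # `python setup.py develop` or `pip install -e .` will pull them in
    install_requires=[
        "requests>=1.0",
        "flask",
    ],
)
```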
I've cooked up a package template that can help getting this working:
I'd like to know what the best practices with regards to security are for using pip, or installing packages in general.
How do you verify package integrity? Do you simply pray that PyPI isn't compromised at the moment, or do you download your packages from Github instead, because the main repositories have more eyeballs on them?
How do you do security updates with pip?
I'm using apt-get at the moment which gives me security updates AFAIK, but my need is growing for more recent versions and certain packages that aren't accessible with apt.
One important note is to use pip>=1.3 (included in virtualenv>=1.9), as prior to this version pip downloaded from PyPI over plain HTTP and was thus vulnerable to man-in-the-middle attacks.
You might also like to check out wheel, which allows you to compile signed binary distributions that you can install using pip.
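I haven't used the signing side, but building and installing a wheel looks roughly like this (the filename is illustrative and depends on your package and Python version):

```
pip install wheel

# build a binary wheel into dist/ (needs a setup.py)
python setup.py bdist_wheel

# install the result with pip
pip install dist/mypackage-0.1-py27-none-any.whl
```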
> Python actually has another, more primitive, package manager called easy_install, which is installed automatically when you install Python itself. pip is vastly superior to easy_install for lots of reasons, and so should generally be used instead. You can use easy_install to install pip as follows:
I found it quite ironic that the author says pip is "vastly superior" to easy_install and then proceeds to install pip using easy_install.
Thanks for the article! I recently spent some time going through the process of learning VirtualEnv / Pip; after looking at several tutorials, agreed that this is one of the better ones out there. A few other things I saw in other articles that you might like to clarify (though I understand there's a tradeoff with simplicity/clarity).
(1) specifically state that `pip freeze` is how to create requirements file (as folks have said in comments already)
(2) add "further reading" link on VirtualEnvWrapper, as it adds some convenience methods to ease use of VirtualEnv
(3) the "yolk" package shows you what's currently installed; it can be helpful to `pip install yolk` then run `yolk -l` to view the different packages installed in each of your envs.
(4) when installing a package, you can specify the version, e.g. `pip install mod==1.1`, whereas `pip install mod` will simply install the latest
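Putting those together, the commands are roughly:

```
pip freeze > requirements.txt     # (1) record what's installed in the active env
pip install yolk && yolk -l       # (3) list installed packages per env
pip install mod==1.1              # (4) pin a specific version ("mod" is a placeholder)
```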
No. Use homebrew for system packages and use pip to install python packages. It's much more flexible and doesn't rely on package managers keeping up with releases.
In the real world, a typical pip requirements.txt file will have a mix of package names (which pip looks for and downloads from an index like PyPI), git repo URLs (eggs installed directly from a git server, e.g. from GitHub), and bleeding-edge, track-the-latest-changes `-e git-repo#egg=eggname` URLs. That you can switch between these with ease is important, e.g. to switch to using your fork of some package rather than the last official release.
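For example, a requirements.txt mixing those sources might look like this (names and URLs are made up):

```
# pinned release pulled from PyPI
requests==1.2.0

# a package installed straight from a git server
git+https://github.com/example/somelib.git#egg=somelib

# editable install tracking the latest changes on your fork
-e git+https://github.com/you/yourfork.git#egg=otherlib
```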
Consider MacPorts over Homebrew. I'll withhold opinion on any system that turns the one directory in the system that UNIX sanctions for the site administrator's control ONLY (since, like, the 80s) into a Git repository. MacPorts (and basically every other system of this kind) gets it right, unsurprisingly.
I'd never have believed a day would come where something like Homebrew would ever gain the traction it has.
A good introduction. I would like to hear more about deployment with virtualenv though: is it expected that you just document any packages with requirements.txt and then create the virtualenv on the deployment target and set everything up again? Or can you "package" a virtualenv for deployment?
Just generate your requirements file from your env locally (pip freeze > requirements.txt), deploy all of your files (env folder excluded) to your server however you want, and then run 'env/bin/pip install -r requirements.txt' on your server.
If I use pip freeze > requirements.txt, it lists the packages I have installed using pip, but it doesn't list anything for the python version itself. How do I make sure the right python version gets captured if I don't check in the /env/ folder?
Deployment/distribution can be handled by bundling the app. For Python 2.x there's PyInstaller[0], Py2App[1], and Py2Exe[2], all of which do much the same thing: make a single binary out of a Python app and all its dependencies, including the interpreter. Then you distribute that and don't worry about what the user has or hasn't got.
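For example, with PyInstaller (the script name is a placeholder):

```
pip install pyinstaller

# bundle the script, its dependencies, and the interpreter into one executable
pyinstaller --onefile myapp.py

# the result ends up under dist/
```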
Yep. Years ago I settled on a convention of always installing a virtualenv into a 've' directory in the project, so I can just set the shebang line on scripts (e.g. Django's manage.py) to "#!ve/bin/python". My Django project template sets all that up for me automatically. So now I just have muscle memory for typing "./manage.py ..." etc. and I never have to activate virtualenvs, mess around with virtualenvwrapper-type hacks, or accidentally run one project's script with a different project's virtualenv activated.
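In other words, roughly (sketching the convention, not a recipe):

```
# create the env inside the project root
virtualenv ve

# then give project scripts a relative shebang, e.g. the first line of manage.py:
#   #!ve/bin/python
# so running ./manage.py from the project root always uses that project's env
```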
Speaking as someone relatively new to Python (coming from an embedded development background, mostly with C/C++): What's the standard way of distributing client-side programs/libraries? If you only have one script you can just put it in /usr/local/bin/ but otherwise you have to mess with the system site-packages or modify sys.path before importing, right? I've seen a surprisingly large number of distros that didn't check /usr/ and /usr/local/ for packages.
Do you just hand the user an automatic virtualenv script? (Outside of using one of the binary builders out there, obviously.)
Some of the mentioned problems with the traditional method are partially solved with `pip install requests --user` but I understand that the bigger problem/main reason for virtualenv isn't helped by this.
However, I was very surprised that the author didn't mention venv (http://docs.python.org/3.3/library/venv.html) at all since it is basically virtualenv but part of the Python standard library.
You aren't implying that dependency hell is a problem unique to Python/pip, are you?
Dealing with dependencies is always step 2 for me when learning a new language, and dependency hell seems like a universal problem. I could be wrong though.
Honestly, Maven is way easier, more powerful, and works identically on Windows, Linux, or Mac. The key is that Java lets you set the classpath as a command-line arg, whereas PYTHONPATH is an environment variable.
IMO, buildout is more like Maven for Python. But again, these tools address "dependency hell", which to me is a problem all languages have, not just Python.
Excellent tutorial. I'm new to Python, and I always hated it when things just install without letting you know where they're going, and then when there's some upgrade all sorts of weird errors keep coming. Thanks a lot.
Thanks for this article, it was definitely needed. I've known that I should be using virtualenv for a while now, but actually trying to figure it out has been somewhat daunting until now.
It also comes in handy when I'm trying to install something finicky that has a million dependencies, each of which also might be finicky (e.g. numba) and I screw something up and just want to start over clean and not muck around uninstalling things. virtualenv makes this easy—I just make a new environment for experimenting and then delete it if I mess up and want to start over, leaving my system configuration clean and other environments intact.
Unix fragmentation of where the bin and lib directories should reside, i.e. /bin, /usr/bin, /usr/local/bin, ~/bin, ...
Windows doesn't have symlinks and the different packaging tools have tried to implement the functionality in various different ways.
Python doesn't add the path of the "main" executed file to the module lookup path. (edit: actually, I think this is wrong. I meant to say "Python module import lookup is complicated.")
The problem is that different applications might require different versions of the same packages. If you install packages globally, then you can only ever have one version available. This is a VERY BAD THING if you have any expectation of running more than one application per machine.