How to make a Python package in 2021 (antonz.org)
396 points by gilad on April 8, 2021 | 205 comments



The number of extra tools used in this article boggles my mind.

Are you writing a simple library? Create a setup.py. Copy paste an existing setup.py and modify it to suit your purposes. Now you have a working, pip installable python package.

Want to publish to PyPI? Use twine. It's standard and it's simple.

You don't need complicated tooling for simple projects.
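For anyone who hasn't seen one, a minimal setup.py is only a handful of lines; something like this sketch, where the package name and dependency are placeholders:

    # setup.py -- minimal sketch; name and dependency are placeholders
    from setuptools import setup, find_packages

    setup(
        name="mypackage",
        version="0.1.0",
        packages=find_packages(),
        install_requires=["requests"],
    )

Then building and publishing is roughly:

    python setup.py sdist bdist_wheel   # bdist_wheel needs the wheel package installed
    twine upload dist/*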


Comments on this reply seem to have forgotten two key things about python.

1) It’s batteries included. The standard library is a core feature, and it is extensive in order to reduce the number of 3rd-party library dependencies.

2) “There should be one— and preferably only one —obvious way to do it. Although that way may not be obvious at first unless you're Dutch.” The Pythonic way is to use the standard library, setup.py, pypi, and so on. Python is not Rust, or Node, or Clojure, or anything else. Do it Pythonically and good things will happen.

In the history of Python, it’s only because of the Zen’s guiding principles and the “come join us” attitude of the community that Python accidentally became “enterprise grade”. It succeeded in spite of the lack of Cargo or NPM, or (shudder) Maven[, or CI/CD, or 3D game engines, or easy binary package productization, or everything else Python is obviously bad at]; almost precisely because it’s not those things. Python encourages me to get my task done, and all it asks is that I leave the flame wars at the door and embrace it for what it is.

It saddens me to see so many people jump down zomglings' throat for making the daring suggestion that Pythonistas use the language as designed. Because he is absolutely right about this: “Complex is better than complicated.”


Unpopular opinion: Python is bait and switch. Naive, aspiring programmers are told to learn Python because it’s the perfect beginner language, only to quickly hit a brick wall of complication they are unable to foresee (because they’re completely new to all this) and ill-equipped to navigate.

Like it or not, Python is now “enterprise grade”. The sooner it either grows up and figures out packages, ci/cd, binaries etc, or it loses its reputation as a great beginner and general-purpose programming language to something like Julia or Crystal, the better.


The thing is that using setuptools is neither standard nor Pythonic. It isn't part of the standard library. It's a way of doing things that is broken, and has been specifically called out by the Python developer community as something that people should stop doing.


I actually think one should do the exact opposite to what you're suggesting. I've experienced modern package management through Cargo and anything below that level now seems like returning to stone age.

In the python ecosystem, poorly defined packages are a widespread problem. You never know what you get in the next version upgrade.

So my suggestion: burn all the "simple" python packaging tools in a big fire and move everything to Poetry today. It will be painful at the start but will increase the quality of the ecosystem by a huge margin if just 50% of the projects do this.


Bit of a side note but remember cargo learned a lot from generations of package managers, across languages. It essentially represents some of the best that package managers have to offer. You only get that kind of result by starting from scratch every few years (as in, rust started from scratch when the Node/NPM ecosystem had been around for years to steal ideas from, Haskell had been around for years to steal language design from, etc.).

Rust is incredibly lucky to have been created when it was because it benefits immensely from these things (I think it's the best new-to-middle-aged production-usable systems language out there today).

I agree with the idea but languages like Python and their ecosystems are really hard to move (remember python 2->3? is that even over?) -- it's a herculean and often impossible task.


Cargo is not hugely different from Maven which has been working fine for over a decade. Yes, it takes some polished ideas from other systems, but Python has had more packaging tools come and go than several other ecosystems put together.


+1 for Maven. And Maven was launched in 2004, 17 (!) years ago. It's almost old enough to vote.

A lot of ecosystems put on some solid horse blinders in order to avoid Java at all costs (Javascript/Node/NPM being another example). They've avoided good ideas from Java for more than a decade.


Migrating python2 -> python3 is hard because it requires rewriting code and because you can't really run python3 code if you depend upon python2.

Can poetry consume non-poetry packages and create packages which other package management systems can consume? If so, then:

1. Projects can move more effectively independently.

2. Projects can re-package as a single task rather than a massive rewrite effort.


I was hoping to see this blog post highlight poetry usage. Any good resource for this? I've already seen the official page.


I learned with this article. And it teaches how to combine Poetry with Pyenv effectively.

https://blog.jayway.com/2019/12/28/pyenv-poetry-saviours-in-...


I quite liked the series of blog posts called Hypermodern Python: https://cjolowicz.github.io/posts/hypermodern-python-01-setu...


Like the other commenter, I was also expecting them to use Poetry.

It's the best one I've tried so far.


Came here to comment about how it would be better to use poetry, thanks for spelling it out for me.


Using setup.py does not mean "not using extra tools". It depends on setuptools, which is an "extra tool" just like flit (used in the article) or any other tool. In fact, using only setuptools, one will need a whole additional set of tools to manage things like:

    * virtual environments (granted, venv is now part of the stdlib, but it's still an "extra tool")
    * publishing to PyPI or another index (twine)
    * dependency management (both for development and actual dependencies)
Plus the tools that are needed anyway, to manage common development actions like:

    * unit tests (pytest)
    * static typing (mypy)
    * linting (flake8, pylint)
    * styling (black)
The article is correct in using pyproject.toml, which has become the standard way to specify the build mechanism for your package [0]. Even setuptools supports it in the latest versions [1], meaning that setup.py is becoming obsolete, or at least unnecessary.
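For reference, the build section described in [0] and [1] is tiny; with setuptools as the backend it looks something like this sketch:

    [build-system]
    requires = ["setuptools>=40.8.0", "wheel"]
    build-backend = "setuptools.build_meta"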

Finally, tools like Poetry [2] offer a whole set of functionalities in one place (dependency management, virtual environments, publishing), which means that they need fewer "extra tools" than just setuptools.

[0] https://www.python.org/dev/peps/pep-0518/

[1] https://setuptools.readthedocs.io/en/latest/build_meta.html

[2] https://python-poetry.org/


Where is there a comprehensive poetry packaging tutorial?


Right here:

    poetry build -f wheel
    poetry publish
You asked for packaging, and that is pretty much it. Of course, setting up a project and its dependencies take a bit more work; the basic intro for that is here: https://python-poetry.org/docs/basic-usage/
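For the curious, the pyproject.toml that Poetry reads for those commands looks roughly like this (name, author and dependencies are just placeholders):

    [tool.poetry]
    name = "mypackage"
    version = "0.1.0"
    description = ""
    authors = ["Your Name <you@example.com>"]

    [tool.poetry.dependencies]
    python = "^3.8"
    requests = "^2.25"

    [build-system]
    requires = ["poetry-core>=1.0.0"]
    build-backend = "poetry.core.masonry.api"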


That is fantastic.


How do you handle version pinning? hash checking? CI? testing on multiple platforms? multiple python versions? deployment? credential management? package data? version bumps?

Sure, experts know how to do all these things because they spent many days learning them, but I'd rather outsource to a tool.


Iteratively. You don't need to solve all those problems at once.

Version pinning can be done in setup.py using the same syntax you would see in a requirements.txt file. You should be very conservative when pinning versions in a library, though.
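Something like this, inside setup() (the packages and versions are made up):

    install_requires=[
        "requests>=2.20,<3",   # a range is usually friendlier for a library
        "click==7.1.2",        # exact pins are better left to applications
    ],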

You can lean on your CI tool (e.g. GitHub Actions) to handle testing, hash checking, credential management, etc. But I recommend all of this start as a bunch of locally runnable scripts.

I typically bump version directly in a version file and move on with my life.

This stuff usually builds up iteratively and at least for me has never been the starting point. The starting point should be a library worth sharing. It is not the end of the world if you release the first few versions manually.


TBH as someone trying to use Python professionally it is extremely frustrating that basic things with regards to package management are something you have to iterate towards, as opposed to just being obvious and default.


One thing that has become clear to me, from playing around a bit with go, rust, and nim, is that it is astonishingly better when the language has exactly one set of tools, that everyone uses.

Even if that set of tools is kinda crappy (Glances at go), it's just so nice to not have all the bikeshedding and just get on with it.


I’m not familiar enough with Go but at first glance the go mod stuff seemed pretty decent.

E.g. more flexible than Cargo in that you could have a large codebase with two different versions of a dependency isolated to their compile units allowing you to gradually adopt a breaking change dependency in a large code base. I was kinda bowled over with that feature (the isolation part is key).

For python I’m finding poetry much more ergonomic than pipenv; it’s not just the speed difference, it’s the convenience of one tool, which aligns with what you’re saying, although the existence of poetry doesn’t delete historical aberrations in Python’s long history.


I sympathize. It is unfortunate that the python community never settled around a tool like leiningen for clojure or cargo for rust or npm for node.

What we saw with npm was the entire community iterating towards a feature set and everyone reaping the benefits automatically with npm updates. package-lock.json is a good example of this.


Worth noting is that cargo and npm weren't "settled around"; they were developed and presented, from the beginning, alongside the relevant compiler and runtime. There was never a question; the batteries were included.

Leiningen is the weird one where people did actually settle fairly well around an unofficial solution in the absence of an official one. I think the norm with languages that forego official tooling is closer to what we've seen in Python.


The Python community has considered an "official" packaging tool in the past, but in those conversations found that the community had too many preferences to find a good compromise. That's the trouble with having a highly diverse set of uses and integrations, and lots of legacy.

If you're curious, the email threads about Conda and defining wheels are interesting.


I feel like the entire point of a BDFL is that they can just ignore this sort of thing and make the hard call, but it never happened.


Maybe it could still happen? It seems like a super high value challenge that the BDFL could take on: build out the official set of tools (setup.py, twine, virtualenv, pip) to support features that make people seek out alternatives (pyproject.toml, poetry, flit, conda, pyenv, pipenv).


I realize this is controversial but from reading the docs I really thought Pipenv was the official solution. Took me a while to realize this wasn't the case.


I went through the same progression, thinking pipenv was the official solution before deciding it wasn’t. Then, just now, I realized that pipenv [1] is currently owned by the Python Packaging Authority (PyPA) who also owns pip [2] and virtualenv [3]. I don’t know the right answer but this illustrates the confusion of not coalescing around an official solution.

[1]: https://github.com/pypa/pipenv

[2]: https://github.com/pypa/pip

[3]: https://github.com/pypa/virtualenv


What happened was that Kenneth Reitz socially-engineered his way into the PyPA to get his tool blessed. The community lashed out (since the tool had obvious shortcomings and a somewhat dubious development process) and recommendations were softened. Eventually the PyPA had to take over pipenv when Reitz had other issues, and they are now forever burdened with what is a bit of a dud.


As I understand it, that was to prevent bad behavior rather than to signal approval.


I think the time has passed, both in terms of there no longer being a BDFL, as well as Python passing a point where the call can be made.


Sadly I bet you’re right on this. My small sliver of hope is that Guido van Rossum’s ideas to “make using Python better” [1] include better packaging.

[1]: https://news.ycombinator.com/item?id=25071847


There are quite a lot of people working on better packaging (look at the Python discourse forum, for instance), but this is not really a topic that Guido has got involved in, at least in the time I've been paying attention.

But with or without a BDFL, one packaging tool to rule them all is a pretty tall order. The needs of a package like scipy, which incorporates C & Fortran code, are pretty different from something like requests. And different communities using Python have their own entrenched tools and techniques. It takes more than someone saying "Foo is officially blessed" to shift that, even if everyone respects the person saying that.


Is there a new BDFL? I remember reading this half a year ago or so https://hub.packtpub.com/why-guido-van-rossum-quit/


On the whole, Guido's career as a BDFL was astoundingly effective. Maybe he made the right call. It'd have been a terrible idea to alienate the science community just when data science was taking off as a field.


I'm not disagreeing, but I am curious if you have anything specific you would point to with regards to Guido's role being successful. I'm just ignorant really, it's not intended to be a leading question at all.


Too many things to answer here. The easiest is to point to the popularity of the language. He's made decisions I disagree with, but I can't argue with success.


I don't know how much the original author had to do with Python's success in the last 20 years. The success of Python in data science is because of NumPy/Scipy/Pandas. Things like packaging that needed leadership never got any.


The leadership was that the science community would benefit from a science-specific tool, like Conda, and that making one ring to rule them all would be too difficult.

Also, yes, Guido and many other early contributors have been active for 30ish years.


Conda isn't really science specific, just like pip isn't webdev specific.


The result isn't, but the motivation was. And have you noticed who is using which tool?


> The success of Python in data science is because of NumPy/Scipy/Pandas.

And NumPy/Scipy/Pandas exist (and are successful) because...? /s


Rubygems, and then Bundler, followed the same pattern. Neither was batteries-included, both were unofficial community efforts. Bundler directly influenced cargo.


Yarn / npm are still in competition today, I think?

The js tooling is particularly immature, e.g. fake packages along with typo squatting is rife. Compare with boring old maven where that’s not been an issue in >10 years at this point.


I completely agree. Elsewhere we see sanity by default, in Python there's still open questions and hopeful but early stage projects.


The issue is that every time the community settles on a tool a new tool is made to fix the issues with the old tool rather than just refactoring the old tool.


I think the Python community have largely settled on pip, just like the JavaScript community have mostly opted for npm.

Personally I much prefer pip over npm, it feels much more polished.


but pip isn't enough and not exactly user friendly


I know Python much better than I do rust or node, but I think the Python design decision was decent here: you can use different tools, but all of them should put the configuration in pyproject.toml. That file has fields which are universal and others which can depend on the exact tooling used. So build tools, repos and so on can get the info they need and at least potentially, do the right thing for code packages with different tools.


yea, it is. but it's a sane thing to do. recommending poetry for a beginner is a bad idea. (nothing against that package).

python is a mature, old, software system. three times older than go or rust. way older than zig. these modern languages have learned from the field as a whole and implemented tools that people are taking for granted nowadays.

as with any mature software system, people & companies have established their preferred ways of doing things. it's going to be hard to have the language council dictate ways of doing things.

i would recommend going simple and using the standard set of modules until poetry or whatever gets into it.


Recommending the standard set of modules is the opposite of "going simple." Poetry removes a lot of the complexity and user-unfriendliness inherent in the previous set of standard modules. For any Python beginner coming from another popular language Poetry is likely going to be very similar to the dependency and package management in the language they're coming from. Python's standard set of modules sticks out like a sore thumb when compared to the tools in other popular languages.


Well, Python did set up a standard - pyproject.toml. And it's what poetry uses. It's just, uh, "light".


It's also not ready yet, missing critical features like editable installs. Right now, you still need a shim setup.py. Until pyproject.toml can actually replace setup.py, I see little incentive to start using it: It's just one more file to add. The one exception is if the package actually has build requirements, e.g. for Cython modules.
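The shim really is just a stub that defers everything to setup.cfg/pyproject.toml, something like:

    # setup.py -- shim kept only so `pip install -e .` works;
    # the actual metadata lives in setup.cfg / pyproject.toml
    from setuptools import setup

    setup()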


The only reason we use pyproject.toml is because of the stupid black formatter that refuses to support setup.cfg, which every other python tool under the sun supports.


Yeah, it's annoying. But Black has so little configuration that I'm ok with hardcoding command-line parameters in my Makefile/tox.ini/precommit hooks (`--skip-string-normalization --line-length 79`)


There are other formatters, some may even be better.


I don't recognize this. I use poetry all the time, without setup.py and the local installs are editable. I have published half a dozen packages which don't have a setup.py and they all work fine.


You can't install a poetry package editably into another environment. It's really a missing feature in the pyproject spec. There's an open issue about it under one of the pypa repos. Someone just needs to do the work of implementing it in pip/setuptools.


> recommending poetry for a beginner is a bad idea.

Strongly disagree. There are so many footguns with low-level tools like pip that I can't recommend it to anybody but an expert (but an expert doesn't need my recommendation anyway).


To be fair, while this is a single article, if you only look at step 1, 2 and 3, you get a fully published package with only one tool used (flit) and not much extra.

It's the succeeding sections (A, B, C, D, E) that get more advanced, but they're all optional. You should definitely do A, but the rest I'd say it's a lot more opinionated and definitely not needed.


Fair point. I wouldn't recommend this article to someone just starting out with python, though. It's often good to understand generation n-1's way of doing things but not to be married to it.


> Version pinning can be done in setup.py using the same syntax you would see in a requirements.txt file

The problem with this approach is that it doesn't handle transitive dependencies well. Say you depend on version 1.4.6 of a particular library. And then that library depends on version >= 2 of some other library. When you install your package, you know that you'll get version 1.4.6 of the first library but have no idea what version you'll get of the second library. You can of course pin all the transitive dependencies - except that clutters up the setup.py and is a massive pain to keep up to date as you bump individual dependency versions.


Seems like a solid argument for a switch to use go's minimal version selection

the version selected during a build is the one with the most minimal version that satisfies all other constraints. this means if you have libA that needs dep>=1.1 and libB that needs dep>=1.3, you get dep=1.3 even if dep 1.9 is out. your build never changes because of a new version release, as long as they release with proper semantic versioning. if you later include libC that needs dep>=1.8, you'll get that version, but because you changed your immediate dependencies, not due to a surprise down the dependency line.

https://research.swtch.com/vgo-principles


It's at least consistent, which, IMO, is better than just getting a random version. I do, however, think it's a bit unfortunate that it prevents picking up bug / security fixes in transitive dependencies.

Imagine that you depend on library A of a particular version which itself depends on library B. With minimal version selection, as long as you don't bump your dependency on library A (or some other library that depends on library B), you'll continue to get that same version of library B. But, then library B releases a critical security fix. With minimal version selection, there isn't a great way to pick up that fix. You can _hope_ that library A releases a new version that requires the fix - but that may or may not happen and could take a while. Or, you could add an explicit dependency on the new version of library B - which is unfortunate, since, your main package doesn't depend on library B directly.

Lock files solve this problem. You can depend on whatever version of library A that you need and lock the transitive dependencies. And once library B releases its fix, you can update your lock file without having to bump the version of library A.

Tools like poetry provide the features to automate this workflow.
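As a rough sketch of that workflow with Poetry (the library names here are hypothetical):

    # library-a is a direct dependency, library-b is a transitive one
    poetry update library-b    # refresh only library-b in poetry.lock
    poetry install             # sync the environment with the lock file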


> You should be very conservative when pinning versions in a library, though

No; you should be very conservative when pinning versions in an application, not in a library. Check this article for the explanation: https://caremad.io/posts/2013/07/setup-vs-requirement/


Anybody familiar with the history of requests knows this is bad advice.


requests is definitely not a good example here. :/


> You don't need to solve all those problems at once.

Why spend time tweaking the setup when you can just get it right in the first place with less work?


Do you recommend poetry?

I've been meaning to try it but haven't had the time to migrate an existing project to poetry.


I used poetry for multiple projects already and I quite like it because it makes version management really straightforward and you can just go like

  poetry new foo
And it will set up all the basic stuff for you right away. Having all the project dependencies and metadata in pyproject.toml makes sense and reduces cognitive overload. Having poetry manage your venvs automagically is a good extra.
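If I remember right, the scaffold that `poetry new foo` generates is roughly:

  foo
  ├── pyproject.toml
  ├── README.rst
  ├── foo
  │   └── __init__.py
  └── tests
      ├── __init__.py
      └── test_foo.py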

There is still room for improvement, e.g. I had some trouble with their install script at times (python vs python3)


Yeah it's pretty good. There are plenty of problems and holes of course, but for most pure-Python packages it does a nice job most of the time.

If you're an expert, you can do it with setup.py or setup.cfg, but I don't think it's normally worth the trouble.


I usually don't pin, I'd rather deal with upstream BC breaks as they are published instead of accumulating tech debt. I call this "continuously integrating upstream code", because Continuous Integration is a practice, not a tool.


You don't pin anything for a package. I'm not aware of any "standard" CI that a package tool could set up. I guess you mean testing on multiple versions, in which case tox will help. Deployment for a package is handled by twine. Package data? What about it? Version bumps should always be manual but I recommend setuptools-scm.

You seem to be confusing packages with "apps". It's very important to understand the clear distinction between these.


> You seem to be confusing packages with "apps".

I'm not confusing libraries with applications. Pinning dependency versions enables a repeatable test environment.


I generally reach for pip-tools when I need to pin versions in a requirements file for a deployable app, like an API.

It's by far the simplest option I've found.

If your project is a library, just use setup.py and express your deps abstractly (min, max or version range). Don't pin to specific versions at all, if you can help it.


This is bad advice. Do not create a setup.py for a new package. (Keeping setup.py for an old package can be okay.)

The author is correct that you want a tool such as flit or poetry which will work with pyproject.toml. Setting up a basic package will be no harder than using setuptools, and it is much more future-proof. You won't have to copy-paste some other crufty setup config either.

It is fair that you don't need all the external tools in this tutorial. In particular, using make is very silly since you can configure different linting and testing workflows directly in pyproject.toml, rather than pull in a whole other system which only works decently on *nix. Poetry also removes the need for tox.


Future-proof... Poetry 1.1 broke compatibility with 1.0. 1.1 lockfiles would crash Poetry 1.0, and 1.0 lockfiles would be thrown away by Poetry 1.1.

It does not correctly verify hashes (if at all) [1]. You can't add packages without updating all your dependencies. Monorepos are not supported. PEP-508-compliant Git dependencies cause the tool to crash with TypeError [2].

I think Poetry is the right direction, I use it for everything, but it's not the silver bullet you're painting it to be (yet). It's definitely not on par with Cargo, or maybe even npm.

[1]: https://github.com/python-poetry/poetry/issues/3765

[2]: https://github.com/python-poetry/poetry/issues/3425


I didn't say it was a silver bullet... I said "Don't Use setup.py".


Fair enough. I saw many mentions of Poetry and merged them in my head when I finally replied. Apologies.


Unfortunately, the pyproject.toml format still doesn't support editable installs. So, setup.py is still required if you need this feature.


And why exactly do you think setup.py is deprecated?


Because of PEP 518.


I would probably recommend just following https://packaging.python.org/tutorials/packaging-projects/ instead.


Ah I'm stuck in the dark ages.


For all the hate nodejs gets, it solved the software packaging problem. That is the sole reason npm ecosystem is so big.

I have to say that it probably isn't a fair comparison because python is much older than nodejs. Python package management might very well have been state of the art in 1995.


My impression was that npm does little more than fetch code from the Web and stick it in a 'node_modules' directory. I've even seen npm used for things that aren't even JS, just bunch of files.

This approach ends up with multiple, potentially-incompatible versions of the same package in a project. True, that's less of a problem in JS, since it's interpreted (deferring imports to runtime) and un(i)typed (no need to check if interfaces match up). Yet even that has led to replacements/complements like yarn.


> My impression was that npm does little more than fetch code from the Web and stick it in a 'node_modules' directory.

Yes. There's hardly even a standard directory structure, let alone a standard way to convert source code to published code. Every slightly non-trivial repo basically has an ad hoc build system of its own. Ever tried to fix a bug in a package, and realized that using git://github.com/user/repo#branch doesn't work, because npm only downloads the source code, which bears no resemblance to the built products? I fixed two bugs in two third party packages within the past week, had to deal with this twice. Ran into the Node 12+ and Gulp 3.x incompatibility issue twice in the past month (with totally modern, actively developed packages), too.

npm has more sophisticated dependency resolution and locking than pip, sure. Python packaging is more consistent in basically every other regard.


> This approach ends up with multiple, potentially-incompatible versions of the same package in a project.

If data generated by one version of a library is being consumed by another version, that's a bug in the code that moves data between them.


Maybe the reason that the library itself is simple in the article is that that’s just an example and the author wants to show how to do it properly, end to end.



You seem to have missed the point. The project in the article was simple because that is best for exposition. The article would hardly have been more helpful with the code for a realistic python package dumped into it, would it?


Just use Poetry [1]. It's popular and works well.

[1] https://python-poetry.org/


Huge thumbs up to Poetry. It's drastically simplified package management for me and replaced Pipenv (which I simply dreaded working with due to performance and DX issues).

I no longer start Python projects without Poetry. It's really good.

EDIT: Also, it's being integrated to most PaaS as well. I deploy to Render.com with Poetry now.


That's a 72MB download, and yet another way to fragment the ecosystem. Not something I'd get just to make a package when a default Python setup has recommended tools and everything I need to make and/or install packages.


> a default Python setup has recommended tools and everything I need to make and/or install packages

No it doesn't. Neither setuptools nor pip are part of the standard library. Yes, they are installed by default in many cases, but they are still "extra tools".


On the list of priorities for any package manager, download size should be #100 on the list.

If you've solved 1-99 and download size is an issue, you're in heaven already.


72MB? On a development machine? Are you on dialup?

Poetry doesn't fragment the ecosystem. Unlike setuptools it uses pyproject.toml, which can be read by other tools, and is the correct way of storing package configuration.

A package built using Poetry is installable without Poetry in the exact same way as one built using setuptools.


Last time I tried poetry it was so broken that it was not even usable. I may try again later.


I had a tough time with it ~3 years ago, but now it works great for me


I think it's the right direction, and I look forward to Poetry maturing. But right now it has a lot of gotchas, and I would only recommend it for people who are serious about dependency management/compliance/reproducibility.

See details upthread https://news.ycombinator.com/item?id=26739234


The default way is pip freeze, which I feel is too verbose; is tracking every dependency of django worth it?


That's if you want to pin transitive dependencies, which is the de facto standard in JS world but not always true in Python world, depending on your context.


Yes. Also, check pip-tools by jazzband.


Which, like any jazzband project, you must not use for a commercial project, as stated by their CoC: the Contributor Covenant with a vague & bizarre modification about ethical use at the end. You are simply not allowed to contribute in any way on any kind of paid time, nor are you allowed to pay someone to contribute. This is of course not advertised when they tell you to give them your OSS projects, for which you have already chosen a license and maybe even a CoC. For this reason, despite the many non-profit projects I maintain, I stay away from jazzband.


Do you mean

> Other unethical or unprofessional conduct

from https://jazzband.co/about/conduct ? Or something else?


There are at least 5 comments in this thread saying "just use the tool I use"



+1 for poetry. It also includes deterministic dependency resolution with a lock file.

I just published a repo today[0] using Poetry and it didn’t take me more than 5 minutes. Poetry build && poetry publish

[0] https://github.com/santiagobasulto/hyper-inspector


This looks cool. I had never come across rich[0] before.

[0] https://github.com/willmcgugan/rich


Yup, was surprised to see no mention of Poetry. Hands down, the best package manager for Python.


Holy crap, for some reason I never thought to consider another package manager than Pip, which I loathe.


`poetry` and `pipenv` are both so much slower than `pip-compile` for me (there are many open issues for both complaining about lock speed), and manage to update locked dependencies half the time despite me asking them not to with e.g. `poetry lock --no-update`.


Do any big projects actually use it?


It's only three years old, I can't think of many big projects created in that timespan.


Unless Instagram found migrating to it valuable enough to fund the effort


How does it compare to pipenv?


Pipenv only targets applications; Poetry targets both applications and libraries. Pipenv has quite some drama behind it that I do not want to get into; in contrast, Poetry's development has been quite professional. Pipenv enjoys better tool support, e.g. it is recognized and supported by VS Code; but Poetry does not have the same level of support.


I don't know anything about pipenv drama, so out of morbid curiosity I looked for it and this is the first thing I found:

https://github.com/pypa/pipenv/issues/2228

This is one of the most ridiculous issues I've read. If the rest of the "drama" is like that, then eh.


The drama was surrounding false advertising so to speak. Pipenv promised a lot but did not quite deliver, much like the earlier days of MongoDB. But more importantly, it pretended or at least heavily implied it was an official PSF-affiliated project, when it was not. How that claim was substantiated was also subject to drama.


It also had no releases for over a year, even though the master branch was getting frequent updates and the performance of the last release was atrocious (and I'm not sure if it has improved much).

I did like the emojis in the logs though.


FWIW here's VSCode's issue for supporting Poetry[1] and here's their plan to support it[2]:

[1] https://github.com/microsoft/vscode-python/issues/8372

[2] https://github.com/microsoft/vscode-python/wiki/Support-poet...


From memory, there was a whole thing where pipenv, created by Kenneth Reitz of requests fame, was inaccurately portrayed as the official successor to pip when that wasn't true


I recall this, but I wasn’t quite clear on what the issue was with pipenv itself (other than questionable behaviour from the author)


It was buggy. It was slow. It didn't even try to support developing libraries.


Now it’s all coming back to me. I remember it half supporting in-house pip repositories but not quite


I just moved a project from pipenv to poetry at work. My biggest issue with pipenv is that you can't selectively upgrade dependencies. Trying to `pipenv update xyz` basically blows away your lockfile and updates everything. There's a command line flag to be more selective but it doesn't work. I found an open GitHub issue about it that's years old.

Poetry by contrast works pretty much like any modern dependency system you'd be familiar with from another language like cargo, npm, or hex.


In addition to everything else already said, pipenv is mind-bogglingly slow and buggy. Like 30 minutes and a timeout error to install pyspark.


Poetry is better.


Well that settles it


I think pipenv is better for managing non-published Python environments, so there.


pipenv has slightly better support in vscode


> Support for poetry environments is currently our highest upvoted feature request on GitHub.

https://github.com/microsoft/vscode-python/wiki/Support-poet...


For poetry, it makes sense to use `poetry self update --preview` — I often come across weird bugs that take too long to get fixed in the current release.


I hadn't heard of flit. It does seem like it's not brand new on the scene; however, it is primarily a single-author project, so expect a tool which is opinionated and whose opinions may not necessarily reflect a broad consensus:

https://github.com/takluyver/flit/graphs/contributors

With a title like this, I'd be expecting to see an article describing the latest tools and recommendations from the PyPA, which are here:

https://packaging.python.org/tutorials/packaging-projects/

(In short, it's setup.cfg + pyproject.toml, `python3 -m build` to build, twine to upload.)
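In practice that flow is only a handful of commands, assuming your metadata already lives in setup.cfg:

    pip install build twine
    python3 -m build       # writes an sdist and a wheel into dist/
    twine upload dist/*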


Thomas is well known as one of the maintainers of IPython and Jupyter, and developed flit while working on the PEP for pyproject.toml and the pip backend allowing things like python -m build.

Though `python -m build` only works _if_ you use something like flit or setup.py as the backend to build the package, which is why you can set flit as a build-backend.

So yes, flit is one of the latest tools, and yes, it is one of the things that pushed for the ability to use pyproject.toml + python3 -m build; you just seem to miss some subtleties of the toolchain.


The additional context is appreciated— it sounds like this tool is something which is likely to be supported long term, so that at least is good.


How does flit compare to poetry? They seem to both be doing the same thing, and in a very similar way.


Poetry does much more than Flit, like resolving dependencies, creating a lock file, and managing an environment where you can run your code. In particular, Poetry is meant to support application development (where you want to have a fixed version of your dependencies) as well as library development.

Flit is more aimed at being the simplest possible thing to put a package on PyPI, if that's all you want to do. It expects you to list dependencies in pyproject.toml manually.


I feel like this is a good place to mention Pip-tools [0] which can generate a lockfile of sorts from a standard requirements.txt (or a setup.py). Specifically, it resolves all dependencies (including hashes), and writes them to a new "requirements" file that you can read with the usual `pip install -r`.
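A rough sketch of the workflow (the requirements.in file name follows the pip-tools convention):

    pip install pip-tools
    pip-compile requirements.in                      # pin everything into requirements.txt
    pip-compile --generate-hashes requirements.in    # same, but with hashes included
    pip install -r requirements.txt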

The nice part about Pip-tools versus Flit or Poetry or Pipenv is that Pip-tools lets you keep using Setuptools if you want to, or if you're unable to switch to one of the others for some reason (and valid reasons do exist).

[0]: https://pypi.org/project/pip-tools


Flit is definitely opinionated, and not suitable for every use case. As Carreau hinted, I think its bigger impact will be from the specifications, especially PEP 517, which it helped to prompt, rather than people using Flit directly. The specifications mean it's practical to make new tools which interoperate nicely, without having to either wrap setuptools or carefully imitate its behaviour.


The PyPA recommendation is old and out of date. Using a system that doesn't manage constraints and hash-checked lockfiles is bad practice.


The author mixes different things like linting and testing into the packaging process, which (IMHO) are not really part of making a package. The process is really much easier than this article makes it seem:

- Write a simple setup.py file.

- Generate a source or binary release by e.g. running "python setup.py sdist"

- You're done!

Adding a setup.py file is already enough to make your library pip-installable, so you could argue that this is a package already. The files generated by "setup.py" can also be pip-installed, so they are also packages. Now you might want to upload your package to a repository, for which there are different tools available, the simplest one being twine. Again, you just install it and run "twine upload dist/*" and your packages get uploaded to PyPI (it will ask for a username and password). So why complicate things?


I don't see how your version is easier than the sequence of commands in the first few steps of the article, which is basically `pip install flit; flit init; flit publish`. Flit is just as easy to install as twine, but you save yourself the hassle of having to write a setup.py.


Maybe I'm too old-fashioned then. But I like that you don't have any dependencies when using distutils/setuptools with a `setup.py` file, so if you don't distribute your code you're already done. I'm also not a fan of tools that are just wrappers around other tools.


Flit isn't (mostly) a wrapper around other tools - it has its own code to create and upload packages. This was one of the motivating cases for the PEPs (517, 518) defining a standard interface for build tools, so it's practical to make tools like this without wrapping setuptools.

(flit install does wrap pip, however)


How does setuptools not count as a dependency?

If you've never run into setuptools compatibility problems, you've either been much luckier than me, or you haven't done much with Python packages.

Vanilla ubuntu used to come with a version of setuptools which didn't work for installing many recent packages.


Honestly, I don't even bother "packaging" Python tools anymore. Just put it in all in a git repo, and pip can install using

    pip install git+https://myg.it/repo.git


Also automatically handles the issue of public vs private etc. Note this [1] recommends using zip files for speed, especially for larger repositories:

    pip install https://github.com/django/django/archive/master.zip

    pip install https://github.com/django/django/archive/stable/1.7.x.zip

[1] https://stackoverflow.com/questions/20101834/pip-install-fro...


I do that where I work because we have an internal git that is wide open, but packages from the outside world need to get whitelisted.

More controversially: With a few (<10) lines of code in your setup file, you can make an install-able module out of jupyter notebooks.


But now packages on PyPI can't depend on it.


But very often, that's ok...


How does this work for dependencies?

And what if some of the dependencies are incompatible with the versions already on your system?


Why not to make a Python package in 2021.

Even as a long time Python user, the packaging ecosystem feels fragmented and error-prone at best. Honestly, it sours the experience of writing Python code knowing you might eventually need to make it work on another computer.


I agree.

What I actually like is using the system package manager to install stuff. pacman -S or apt-get install

letting multiple packaging systems muck with your system is a recipe for hurt somewhere down the line.


Even more fragmentation? As a dev I'm not going to make packages for anything other than 'pip install' and maybe Ubuntu if you're lucky. I would also heavily discourage distros from shipping ancient buggy versions of the package, which is all distros are good for these days.

Packaging sucks.


The only constant in the python community over the last twenty years is the criticism of its packaging


I think for the vast majority of at least pure-Python projects you could just use poetry and upload your packages to PyPI or a private index. You can go from an empty directory to publishing a package within minutes with poetry (although, of course, you probably shouldn't).


It's fragmented, but it doesn't need to be error prone if people use the good tools instead of the old low-level tools.


There would be no room for error if we just put the libraries in with the project as files instead of adding all these extra steps. Nobody seems to like this simple, bulletproof method anymore for some reason though.


That's exactly what a package manager does


A package manager is a whole separate program with config files that adds an extra build step and works like a black box. I mean just having the libraries entirely present in the repository. If an update is needed then someone pastes it in and commits it. This also lets you organize how you want.


That's often called "vendoring" these days.


After reading this article, I would still recommend people use this guide instead.

https://cjolowicz.github.io/posts/hypermodern-python-01-setu...

It's great for beginners and experts. As a long time python veteran it completely changed how I work with python for the better. It is lengthy with lots of optional steps. Just skip the ones you don't find relevant.


The recommendation to set up a Makefile on top of tox is a bit odd to be honest. Tox basically "just works", and you can do things like pass stuff to `pytest` by setting up `{posargs}` in the tox config (see [0])
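For instance, a tox.ini along these lines forwards anything after `--` straight to pytest (e.g. `tox -- -k some_test --pdb`):

    [testenv]
    deps = pytest
    commands = pytest {posargs}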

I do feel like tox gets a bad rap despite having a complete feature set. I think a part of it is that the documentation is complete but not organized from the tox user's perspective, so for someone who shows up on a project using it, it's hard to figure out the quickstart (though the "general tips and tricks" page gets somewhere [1])

Anyways yeah, would not recommend Make over just leaning into tox more here.

EDIT: also, this article reminded me of how much I really dislike Github Action's configuration syntax. Just balls of mud on top of the Docker "ball of mud" strategy. I will re-iterate my belief that a CI system where the configuration system isn't declarative but just like..... procedural Lua will be a billion dollar business. CI is about running commands one after another! If you want declarative DAGs use Bazel

[0]: https://tox.readthedocs.io/en/latest/example/pytest.html?hig...

[1]: https://tox.readthedocs.io/en/latest/example/general.html


I’d argue the counterpoint actually: Writing Makefile targets for common commands significantly improves usability and ergonomics, especially when they follow common idioms (make test, make build, make install, ...).

The recipes for each target describe not only how a project intends to run each tool, but which tools it intends to run. Instead of having to know that this project runs tests under tox, while that one runs only under pytest, we can run ‘make test’ in each, and count on the recipe doing the Right Thing.

That consistency across projects makes it much easier for someone to get started on a new project (or to remember how the pieces fit together on your own project from a few months ago)


For me it’s important for a testing command to be able to receive parameters at runtime (for example tox test -- --pdb), is it possible to do that with make in general? I never knew how.

I generally agree with your sentiment, though. I’m usually limiting myself to Python stuff so don’t have much exposure to make, it’s always felt like a less powerful task runner than other stuff


IMO as long as there is a healthy CI pipeline which people can check to see how to build/test things, ultimately it doesn't matter much.


I prefer to put `.PHONY` before every phony target, not gather them all in one place.

When phony targets are all written out in the beginning of the Makefile, it's easy to forget to alter this list when a phony target is added or removed later.

This rarely causes an error, but still I've seen many Makefiles with outdated .PHONY lists.
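i.e. something like this, so the declaration sits right next to the target it covers:

    .PHONY: test
    test:
        pytest

    .PHONY: lint
    lint:
        flake8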


I’m always glad to see Make being used. It’s such a powerful and simple tool that usually does the job just as well as more “bespoke” CLI’s for various frameworks and languages


I rather just write a bash script. It's the lowest common denominator. Make may not be installed by default in many places and it has some weird syntax quirks that make it annoying to use IMO.

Here's an example of how I like to do my "bash scripts that sorta work like make": https://github.com/francislavoie/laravel-websockets-example/... basically each function is a "command", so I do like "./utils start" or whatever.


I like this style as well, we keep all our scripts in a ./bin directory, e.g. ./bin/lint.sh, ./bin/test.sh, etc. just as discoverable as make commands (run ls ./bin) and much easier to maintain.

If you really want make, you can also just call out to your bash scripts from make:

    test:
        ./bin/test.sh
    lint:
        ./bin/lint.sh
    ...


Please don't write a bash script.


Please elaborate. This is a useless comment without more information.


I would strongly discourage using make on new projects, make syntax is full of footguns and quirks (not being able to pass multiple args to subcommands is an easy example).

Bash, Python, or even Typescript are much easier, safer, and more widely standardized environments to maintain and grow your scripts once you get past a few lines.


It's a programming language that distinguishes between spaces and tabs in a way that changes behavior. It's also the only PL that I know of that's outright incompatible with expand-all-tabs-to-spaces editing policy, which is what the vast majority of coders use in practice.

If you want to see a powerful and actually simple tool that does the job, take a look at DJB redo: https://redo.readthedocs.io/en/latest/


I like the idea of Make, but it's far too hacky. Even using it for a static blog site (turning foo.md -> foo.html, which is pretty close to the usual foo.c -> foo.o examples) ended up with recursive invocations, rule-creation macros, double-escaped sigils, eval, etc.

There are a bunch of lightweight alternatives to Make out there (I hear Ninja is pretty good). My personal preference is Nix these days (although that's quite heavyweight).


The biggest problem with make appears to be that people refuse to spend a little time learning how it works, and instead charge off to reimplement it, poorly, instead.


I have a cookiecutter template for this at https://github.com/simonw/python-lib (also click-app and datasette-plugin)

It sets up GitHub actions for publishing the package to PyPI when you create a release on GitHub - which I find to be a really productive way of working.


cookiecutter is great. I recently moved from click to typer, which I so far really have enjoyed. I probably should make a cc template for that one day...


When you are doing machine learning, conda is widely used. Why? Because you can install non-python things like cudatoolkit or ffmpeg (you can even install python, so you are sure that everybody is using the same version of python)

Python is fantastic at gluing specialized tools/libraries, but a lot of these require non-python dependencies (most are written in more performant languages). IMO, this is a big difference when comparing with Cargo for Rust because most of the dependencies in Rust are written in Rust.

The state of packaging in Python is kinda meh, the official documentation here [0] suggests to create 3 additional files in order to create a package:

  - pyproject.toml
  - setup.cfg
  - setup.py  # optional, needed to make editable pip installs work
if you add conda, you may need 2 additional files:

  - meta.yaml  # to build you conda package
  - environment.yml
With this much boilerplate, I understand why people are creating tools like flit.

[0] -- https://packaging.python.org/tutorials/packaging-projects/


2021 and still there is no clear winner or official way to install dependencies and create packages.

The worst part of Python.


Is the 2.7 vs 3.?? not an issue anymore in the Python world?


2.7 is dead.


Let's see how many high-quality Python packaging guides we can cram into one discussion, shall we?

I have no problems using setuptools directly, as outlined in https://packaging.python.org/guides/distributing-packages-us...


I still don't understand why in 2021 `pip`, which is the standard package and dependencies manager in Python, cannot 1) build a package from some spec 2) generate the scaffolding needed to build such package

`gem` from Ruby does that (well, it doesn't 2 AFAIK but at least it does 1)


Slightly off topic, but does anyone recommend any great guides for building browser javascript / node.js packages, i.e. listing recommended linters, profilers, testing strategy, documentation template, or a project structure?


Somewhat related: I have an example repository which I've been using to keep track of the tools I use, aimed at people in a research lab who are relatively new to Python. I made it because the existing example/template repositories I found don't gel nicely with the way I like to set up and think about things. Here it is -- hope you find it useful:

https://github.com/alknemeyer/python-template/


Which is the more standard way to use venv for Python 3.8 or newer?

1) venv at the same level:

create new project on GH

git clone https://github.com/user/myproject

python -m venv myproject

cd myproject

(activate venv)

____

2) clone first, venv in subdirectory:

create new project on GH

git clone https://github.com/user/myproject

cd myproject

python -m venv venv

(activate venv)

___

3) something completely different? if so, what?


Can somebody confirm that this is reasonable and state of the art? Or is it just a "look what I can do" type blog post?


It's more the latter, if you read the rest of the thread you'll get a feel for the issue - there's no broad consensus on what Python should do for package management.

Flit is not an unreasonable pick, but it's not a silver bullet.


I would say it is quite up to date. First, it is using pyproject.toml which is now the standard way to define build requirements for Python packages. Second, its collection of additional (linting etc) tools is pretty solid; there are potential alternatives in few cases (e.g. I would personally use Poetry rather than flit, and wouldn't use make for development scripts) but that is pretty much it.


I've never even heard of pyproject.toml and I open github projects behind packages to vet them (to a minimal degree, but still, making sure it's not a typosquat and still maintained and often also reading a bit of source code) on at least a weekly basis. It's not always python of course, but often enough. Maybe it's just some freak coincidence that I somehow never saw it or just don't recall while really it's mostly everywhere, but this broad statement for something I've never even heard of makes me think of JavaScript and the 'standard framework' that changes every six months.


With PEP-518 being 5 years old it's still relatively new, and most tools have implemented the support for it relatively recently. The key introduction article was written exactly one year ago: https://snarky.ca/what-the-heck-is-pyproject-toml/


It's actually not too uncommon - mostly due to the rising popularity of poetry.



Can we have "How to make an X package"? I would love to also see examples with C++ (if it is possible at all) and Nim


What happened to setup.py and wheels?


I work in a very small company where we are building the plane as we learn to fly, and I'm the only person there with any programming experience. I've been working on trying to improve my hobbyist-level (at best) knowledge of the Python ecosystem over the last year or so.

I've gotten to the point where a number of smaller tools I've put together can now be used in larger projects. I learned the hard way that just copying files around makes it hard to know which version is in that project, and upgrading, especially once there is more than one file, becomes a lot tougher. I learned the harder way that trying to link the same file into multiple projects is a great way to really screw things up.

About 4 or 5 months ago I made a real effort to try to learn how to use virtual environments (pipenv) and packaging to make it so that if I update one of the smaller tools, I don't clobber all the downstream projects that rely on it. I wanted to make it so that when I update something to add features or change things, I can go back and fix older projects it's used in at my leisure. I haven't even begun to touch on unit testing, and I have no clue what linting is. Things are kind of working so far, but it feels very hacky and fragile, and I know it can (and SHOULD) be better.

All of this stuff around packaging and being able to install those packages is very daunting, and trying to stumble on the right tutorials is extremely frustrating. The vast majority of them assume I want to share my stuff with the world on PyPI, or that I have servers available to me to create private PyPI indexes, but I don't. Yet I still want my packages to "resolve their own dependencies" when I install or upgrade them.

And when it comes to learning things like testing, the few tutorials I've looked at either use different tools to do it, or their examples are so oversimplified that when I look at my own code, I don't know where to begin.

I say all of this because looking at this tutorial, it's more of the same. I want to make my code better. I want to make it easier to use those smaller projects in larger projects. But then it says things like "Every solid open-source project runs cloud tests after each commit, so we will too," but it doesn't do anything to explain what that is or why it should be done, besides "everyone does it, so you should, too."

I think what makes it even harder is that when something like this gets shared, there are so many conflicting opinions. Some people say to just use setuptools, other say that setuptools is on its way out and to use pyproject.toml (or some other tool) instead. It's all just so... hard!

I'm sorry. This is coming off a bit ranty, and that's not what I intended. I'm just feeling frustrated and I'm not sure of a better way to express that I need help with finding help. There are even a lot of things that I'm sure I need help with, but I just don't know what they are. It makes it really hard to verbalize what I need to another person, let alone to get the right words into a search engine to take me there.


Still hardcoding version numbers in 2021? Why not use setupmeta?


Probably because setupmeta seems to only support setup.py, which this guide doesn't use (and isn't generally recommended for most use cases).


I maintain over 50 public packages with setup.py and most of them have setupmeta, it's great and I would recommend it for all use cases.


Sure, and that’s fine. My statement was based off https://packaging.python.org/tutorials/packaging-projects/

> dynamic metadata [(setup.py)] should be used only as an escape hatch when absolutely necessary.


Dynamic version numbers based on git tags are absolutely necessary to me, to ease my continuous integration practice; that's also what openstack/pbr does, amongst probably other things



