Python's New Package Landscape

by teajunky on 10/18/2018, 11:58 AM with 165 comments

by whitehouse3 on 10/18/2018, 12:28 PM

While pipenv has garnered a lot of attention and praise for ease of use, it falls over whenever I integrate it with any serious work. `pipenv lock` can take 20-30 minutes on a small Flask app (~18 dependencies). And it often mixes up virtualenvs, enabling the wrong one with seemingly no remedy. I see the problems on Windows, macOS and Ubuntu. 2018 is not the year of pipenv, for me. I'm sticking with regular virtualenvs and the manual hell of requirements.txt. I hope it gets better eventually.

by cik on 10/18/2018, 3:31 PM

We just went through this cycle. Ultimately we build packages (debs) and Docker images for deployment within VMs. Depending on the component, our build process either pushes the deb to repos or uses the deb in the Docker image.

After trying to replace pip with Pipenv, we had to stop. Dependency resolution for 20 declared dependencies (that in turn pull down > 100 components) takes well over 5 minutes. With Poetry, it takes less than 33 seconds on a clean system. The times are consistent for both Ubuntu 16.04 and Mac OS X.

Our only goal was to get to the point we're now in: tracking dependencies and separating dev requirements (like ipython and pdbpp) from our other requirements. Poetry made it fast, simple, and made me an addict.

Over two days, I moved our entire codebase and every single (active) personal project I had to poetry. I don't regret it :)

by scrollaway on 10/18/2018, 12:42 PM

Published May 11th, 2018. But it's interesting it's popping up again. It's a good explanation of the landscape as of 2018, though Pipenv has since gone in a weird direction. There are a lot of recommendations for it, but I sometimes get the feeling people don't understand what they're recommending, such as replacing some things that work (setup.cfg) with things that don't do the same thing (Pipfile).

Man, the Python packaging ecosystem is one of those things which really bring me down regarding the state of Python, because there is such an extremely high barrier for breaking backwards compatibility and nothing really works.

The JS ecosystem is far better in this regard. Pipenv was most promising because it followed in Yarn's footsteps, but it didn't go all the way in replacing pip (which it really should have). So now there's still a bunch of stuff handled by pip, which pipenv does not / cannot know about, and this isn't really fixable.

The end result is that instead of telling people about pip + virtualenv, we now have pip, virtualenv and pipenv to talk about. And people who don't understand the full stack, and the exact role of each tool, can't really understand how to properly do the tasks we choose to recommend delegating to each one of them.

There are three separate-but-related use cases:

- "Installing a library" (npm install; pip install).

- "Publishing a library" (setup.py. Or Twine if you're using a tool. Both use setuptools.).

- "Deploying a Project", local dev or production (pipenv. Well, if it's configured with a pipfile, otherwise virtualenv, and who knows where your dependencies are, maybe requirements.txt. Pipenv does create a virtualenv anyway, so you can use that. Anyway you should be in docker, probably. Make sure you have pip installed systemwide. Yes I know it comes with python, but some distributions remove it from Python. Stop asking why, it's simple. What do you mean this uses Python 3.6 but there's only Python 3.5 available on Debian? Wait, no, don't install pyenv, that's not a good idea! COME BACK!)

The JS ecosystem manages to have two tools, both of which can do all of this. I don't know how we keep messing up when we have good prior work to look at.

by andybak on 10/18/2018, 1:34 PM

In case this scares any new users, I've used nothing more than pip and virtualenv for several years with no issues of note.

by sambe on 10/18/2018, 2:05 PM

Whenever talk in Python-world goes towards packaging, I feel like I have been transported to Javascript-world: it's never clear to me what concrete problems are being solved by the new tools/libraries.

This article seems well-written and well-intentioned. Despite reading it, I don't know why I would not have loose dependencies in setup.py and concrete, pinned dependencies in requirements.txt. It's never felt hard to manage or to sync up - the hard part is wading through all the different tools and recommendations.
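The split described above (loose ranges in setup.py, exact pins in requirements.txt) can be kept in sync with a small check. The following is only a minimal sketch using the standard library: the package names, version numbers, and the helper names `project_name`, `parse_pins`, and `unpinned` are all made up for illustration, and the parsing deliberately handles only plain `name==version` lines.

```python
import re

# Hypothetical loose ranges, as they might appear in setup.py's install_requires.
INSTALL_REQUIRES = ["flask>=1.0", "requests>=2.19,<3"]

# Hypothetical exact pins, as they might appear in requirements.txt.
REQUIREMENTS_TXT = """\
# concrete versions for deployment
flask==1.0.2
requests==2.19.1
"""


def project_name(requirement):
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if match is None:
        raise ValueError("cannot parse requirement: %r" % requirement)
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def parse_pins(text):
    """Map project name -> pinned version for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if "==" in line:
            name, version = line.split("==", 1)
            pins[project_name(name)] = version.strip()
    return pins


def unpinned(install_requires, requirements_text):
    """Names declared loosely but missing an exact pin."""
    pins = parse_pins(requirements_text)
    names = (project_name(req) for req in install_requires)
    return sorted(name for name in names if name not in pins)


print(unpinned(INSTALL_REQUIRES, REQUIREMENTS_TXT))
```

A check like this only compares names. It does not verify that a pinned version falls inside the loose range, which would need a real version-specifier parser.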

by Bogdanp on 10/18/2018, 12:24 PM

Having used pure pip + virtualenv{,wrapper}, pip-tools + virtualenv, poetry and Pipenv for medium to large applications, I'm going to be sticking to pip-tools for the time being for apps. Poetry is fine, but pip-tools is faster and there's less to learn. Pipenv is unbearably slow for large applications and often buggy.

For libraries, I've been using Poetry for molten[1] and pure setuptools for dramatiq[2] and, at least for my needs, pure setuptools seems to be the way to go.

[1]: https://github.com/Bogdanp/molten

[2]: https://github.com/Bogdanp/dramatiq

by phren0logy on 10/18/2018, 1:01 PM

I'm not a programmer by trade, but I dabble, and these issues make it much less fun.

In my limited experience, Clojure's Leiningen is a far more pleasant way to solve these problems. I'm sure there are many other examples in other languages, but in the few I've used, nothing comes close. Each project has versioned dependencies, and they stay in their own little playground. A REPL started from within the project finds everything. Switch directories to a different project, and that all works as expected, too. It's a dream.

[https://leiningen.org/]

by CMCDragonkai on 10/18/2018, 1:08 PM

I've tried a lot of solutions, but nix-shell is hands down the best I've used. I wrote a little gist detailing how to develop in Python using Nix: https://gist.github.com/CMCDragonkai/b2337658ff40294d251cc79...

by samwillis on 10/18/2018, 12:38 PM

Not really packaging but related: my favourite new tool is pyenv (https://github.com/pyenv/pyenv). It made setting up a new laptop with various versions of Python so much quicker.

I haven’t used Pipenv yet but it works with pyenv to create virtual envs with a specified Python version as well as all the correct packages.

by patagonia on 10/18/2018, 2:22 PM

Imagine I’m just getting started with Python, and I see this article. I think to myself, “Awesome, a primer!”

Then I start reading (these comments)... mayyybe I should try Julia... or anything else, at least while I’m still getting started.

by breatheoften on 10/18/2018, 3:04 PM

I’m looking forward to a future where I no longer have to use languages that require different mechanisms to reference functionality from library code than the ones you use for your own source ... all the incidental complexity around custom compilation processes is, in reality, just an enormously non-productive relic of the past.

In the future — you have a set of entry points to your program, these are crawled by the language aware tool chain to identify and assemble all the requirements for the program (including 3rd party functionality). There’s no need for separate tools to manage packages, caches, and virtual environments — let’s just put all this logic into the compiler(s) — where necessary let the application describe the necessary state of the external world and empower language toolchains to ensure that it’s so ... let’s live in the future already ...

by epage on 10/18/2018, 1:16 PM

The biggest problem I have with Python packaging tools is how to start using them. I'd rather not install all of them in my global site-packages. Do I need to create a virtualenv just to get a tool to manage my virtualenvs?
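One way around the bootstrapping question above is to give the tooling its own environment using the standard library's `venv` module, so nothing lands in the global site-packages. This is a minimal sketch of that idea, not what Poetry or Pipenv do themselves. The function name `create_tool_env` and the directory layout in the usage note are assumptions.

```python
import venv
from pathlib import Path


def create_tool_env(env_dir, with_pip=True):
    """Create an isolated environment at env_dir and return its path.

    with_pip=True asks venv to bootstrap pip into the new environment,
    which relies on ensurepip being available for this interpreter.
    """
    env_dir = Path(env_dir)
    venv.EnvBuilder(with_pip=with_pip).create(str(env_dir))
    return env_dir
```

For example, `create_tool_env(Path.home() / ".tool-envs" / "packaging")` (a hypothetical location) would produce an environment whose own pip can then install the packaging tool of choice, keeping it apart from every project's virtualenv.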

I have seen poetry is working on their bootstrapping story. I could not get their current solution to work on Ubuntu. Maybe what they are developing towards will work.

https://github.com/sdispater/poetry/issues/342

by hultner on 10/18/2018, 12:52 PM

I've migrated to pipenv in most of my projects. It's simple and great for application development, but I still write everything to work with pure pip as well, so the Pipfile basically lists my application as a dependency and I mainly use it for the lock files.

For library development I target pure pip/setuptools but still use pipenv during the development phase. There have been a few cases where pipenv had problems and I had to either remove my virtualenv and reinitialize it or even remove my Pipfile/lockfile, but since I still have my setup.py it's not a big deal for me.

As for uploading etc I use twine but I wrap everything in a makefile to make handling easier.

A problem I noticed recently was a case where one of my developers used a tool which was implicitly installed in the testing environment, since it was a subdependency of a testing tool, but was not installed into the production image. This resulted in "faulty" code passing CI/CD and getting automatically deployed to the live development environment, where it broke (so it never reached staging). It caused a bit of a headache before I found the cause.
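The failure mode above (code importing something that is only present as a subdependency of a test tool) can be caught earlier by comparing what the code imports with what the project declares. Below is a minimal sketch using the standard library's `ast` module. The helper names, the sample source, and the `declared` and `ignore` sets are all hypothetical, and the sketch assumes import names match declared names, which is not always true (for example, `yaml` is provided by PyYAML).

```python
import ast


def top_level_imports(source):
    """Return the set of top-level module names imported by Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import of the project's own code.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def undeclared_imports(source, declared, ignore=()):
    """Imports in source that are neither declared nor explicitly ignored."""
    return sorted(top_level_imports(source) - set(declared) - set(ignore))


# Hypothetical module: 'six' is used directly but never declared.
SOURCE = "import os\nimport requests\nfrom six.moves import urllib\n"
print(undeclared_imports(SOURCE, declared={"requests"}, ignore={"os"}))
```

Standard-library modules have to be listed in `ignore` by the caller. Running a check like this in CI against the production dependency list, rather than the test one, is the point of the exercise.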

by antpls on 10/18/2018, 12:43 PM

No mention of containers? I haven't written Python code in a while, but it would have been nice to have a comparison with container technologies, which weren't available when PyPI and pip were created. Containers solve both of the problems from the article, isolation and repeatability, for any language. Are virtual env tools still needed in the container era?

by xapata on 10/18/2018, 3:33 PM

No mention of Anaconda?! How strange. I recommend using `conda` instead of virtualenv, and instead of pip where possible.

A Python project does not only depend on Python modules, but non-Python modules as well. Beyond Python, conda helps manage your other dependencies, like your database. I use Miniconda instead of Anaconda, to avoid the initial mega-download.

by binalpatel on 10/18/2018, 7:44 PM

I find conda far and away the best tool to manage Python packages and dependencies. Being able to concisely contain all Python and binary dependencies together is invaluable.

I recently wrote about it in a blog post: using conda within containers has solved almost every pain point we had with Python packaging and with getting things into production reliably.

by azag0 on 10/18/2018, 12:52 PM

For pure-python library projects, I found Poetry the best option these days (haven't tried Hatch). But it is still heavily under development, so it's not necessarily a black-box solution.

The biggest pain point of Pipenv for me is that it cannot as yet selectively update a single dependency without updating the whole environment.

by erikb on 10/18/2018, 8:33 PM

If you only think in Python then packaging really might be a pain point. But honestly, if we look at other languages it's actually not so bad. I'm specifically calling out Golang here, because it claimed exactly this topic as an initial design goal and to this day has basically failed at delivering it.

There are even languages like C++ where the community as a (w)hole has given up on that topic and instead opts for completely building every tool by building the underlying libraries up first manually.

Considering all this, who can actually beat Python at this point? Java maybe? Is Ruby still competing? How is NodeJS doing?

Currently with what I see around me (mostly Go and C++) I don't feel too bad about setuptools+pip+virtualenv anymore.

by dlitvakb on 10/18/2018, 12:28 PM

I have been using purely setuptools for all of our open source Python libraries at Contentful, but have found that lately I've been getting deprecation warnings from PyPI not to use `setup.py upload` anymore.

What should the alternative be now?

Edit: I'm reading about twine right now, but I cannot begin to comprehend why it's not bundled directly if this is what they are intending for us to use to upload packages.

by _fbpt on 10/19/2018, 1:36 AM

Pipenv is unusable for me, since launching your app only works when your current working directory is the Pipfile directory. If you want to launch an app via a shell script from another directory, you first have to cd to the Pipfile dir, then run pipenv shell (maybe you can pass in a second shell script as an argument).
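A possible workaround for the working-directory issue above is a tiny launcher that changes directory only for the child process and then goes through `pipenv run`. This is a sketch under assumptions: the helper names `pipenv_run_command` and `launch` are invented here, and the script path is interpreted relative to the Pipfile's folder.

```python
import subprocess
from pathlib import Path


def pipenv_run_command(script, *args):
    """Build the argv for running a script through `pipenv run`."""
    return ["pipenv", "run", "python", str(script), *args]


def launch(project_dir, script, *args):
    """Run the script with the working directory set to the Pipfile's folder."""
    project_dir = Path(project_dir).resolve()
    if not (project_dir / "Pipfile").is_file():
        raise FileNotFoundError("no Pipfile in %s" % project_dir)
    # cwd applies only to the child process; the caller's directory is unchanged.
    return subprocess.call(pipenv_run_command(script, *args), cwd=str(project_dir))
```

One consequence of this design is that the app itself still sees the project folder as its working directory, so any relative paths passed as arguments resolve against that folder rather than the caller's.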

The article mentions Pipsi is designed to make command-line apps globally accessible, and I'll try it out.

Additionally, adding git/src/package/module.py may be fine when you're using an IDE, but when browsing in a file manager, you must navigate 3 directories deep to even see any source files, which seems to be trending towards the inconvenience and pain of Java projects.

by ericcholis on 10/18/2018, 12:43 PM

I gave conda a shot and found it to be better than pip + virtualenv, but still not amazing.

by EamonnMR on 10/18/2018, 7:10 PM

We tried Pipenv last year, ran into a number of bugs, this one being the most irritating:

https://github.com/pypa/pipenv/issues/786

by reilly3000 on 10/18/2018, 7:08 PM

Why is it that Python is geared towards archiving packages at the site level by default while npm, composer, et al. tend towards including packages in the project's folder? Is it a convention from a time when disk space was less plentiful?
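This does not answer the "why", but the site-level default mentioned above is easy to inspect from the interpreter itself. The sketch below uses only the standard library, and the function names are made up for illustration. It reports whether the interpreter is running inside a virtual environment and where pure-Python packages get installed.

```python
import sys
import sysconfig


def in_virtual_env():
    """True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; the stdlib venv module
    # makes sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def install_location():
    """Directory where pure-Python packages are installed for this interpreter."""
    return sysconfig.get_paths()["purelib"]


print(in_virtual_env(), install_location())
```

Outside a virtual environment the reported directory belongs to the interpreter installation, shared by every project. Inside one it sits under the environment's own folder, which is the closest Python gets by default to a per-project package directory.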

by luord on 10/18/2018, 9:53 PM

I rarely even use requirements.txt and never use it in my personal projects.

I just pin the project's direct dependencies in the setup.py file and install the folder directly. I know it might cause bugs with different developers (or the CI) using different versions of the upstream dependencies but I guess I trust the developers who create each library I'm using. The moment I directly import something from what used to be an upstream dependency, I pin it too.

So far this approach hasn't given me trouble, but I'll still take a look at poetry based on what I read in the comments here.

by Gaelan on 10/18/2018, 3:58 PM

Huh, OpenDNS blocks this due to "a security threat that was discovered by the Cisco Umbrella security researchers."

by mattdeboard on 10/18/2018, 1:32 PM

Interestingly this URL gets blocked by my work's security thing. Never saw that before.

edit: I requested an exemption but corp IT staff came back and said there's definitely been malware identified on that site. So... be careful with your clicks.

edit2: Well who knows where the malware alert is coming from, might be an ad or something.

by qwerty456127 on 10/18/2018, 6:48 PM

Sadly there are things you can't always install reliably from the Python repository. E.g. you may have to install things like scipy and keras from the OS or a third-party repository (like brew or conda), as pip install would fail during the build process.

by TheOtherHobbes on 10/18/2018, 7:28 PM

I was trying to set up pipenv on a Mac earlier in the week.

Being able to select Python 2 or Python 3 environments is great.

Unfortunately it decided all my packages were in /var/mail.

No patience to debug it, so I gave up on it.

by agumonkey on 10/18/2018, 3:23 PM

(incf confusion)

by liveoneggs on 10/18/2018, 12:40 PM

this incredible complexity is what docker is really good at simplifying

by datavirtue on 10/18/2018, 12:56 PM

Who curates the packages to prevent security issues?