Python performance myths and fairy tales

by todsacerdoti on 8/6/2025, 8:36 AM with 209 comments

by btown on 8/6/2025, 11:11 AM

I think an important bit of context here is that computers are very, very good at speculative happy-path execution.

The examples in the article seem gloomy: how could a JIT possibly do all the checks to make sure the arguments aren’t funky before adding them together, in a way that’s meaningfully better than just running the interpreter? But in practice, a JIT can create code that does these checks, and modern processors will branch-predict the happy path and effectively run it in parallel with the checks.
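A hand-written sketch (not real JIT output, and much simplified) of what such guarded code can look like, written in Python for readability:

```python
import operator

def jitted_add(a, b):
    # Guard: check the speculation that both operands are exact ints.
    # A branch predictor assumes the check passes and races ahead on the
    # fast path while the check resolves.
    if type(a) is int and type(b) is int:
        return a + b               # fast path: stands in for a native add
    return operator.add(a, b)      # deoptimize: full dynamic dispatch
```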

JavaScript, too, has complex prototype chains and common use of boxed objects, but V8 has made common use cases extremely fast. I'm excited for the future of Python.

by dgan on 8/6/2025, 12:37 PM

"Rewrite the hot path in C/C++" is also a landmine because how inefficient the boundary crossing is. so you really need "dispatch as much as possible at once" instead of continuously calling the native code

by nu11ptr on 8/6/2025, 1:09 PM

The primary focus here is good and something I hadn't considered: Python's memory being so dynamic leads to poor cache locality. Makes sense. I will leave that to others to dig into.

That aside, I was expecting some level of a pedantic argument, and wasn't disappointed by this one:

"A compiler for C/C++/Rust could turn that kind of expression into three operations: load the value of x, multiply it by two, and then store the result. In Python, however, there is a long list of operations that have to be performed, starting with finding the type of p, calling its __getattribute__() method, through unboxing p.x and 2, to finally boxing the result, which requires memory allocation. None of that is dependent on whether Python is interpreted or not, those steps are required based on the language semantics."

The problem with this argument is that the user isn't trying to do these things; they are trying to do multiplication. So the fact that the language has to do all these things in the end DOES mean it is slow. Why? Because if these things weren't done, the end result could still be achieved. They are pure overhead, of no value in this situation. In other words, if Python had a sufficiently intelligent compiler/JIT, these things could be optimized away (in this use case, but certainly not all). The argument is akin to: "Python isn't slow, it is just doing a lot of work." That might be true, but you can't leave it there. You have to ask whether this work has value, and in this case, it does not.

By the same argument, someone could say that any interpreted language that is highly optimized is "fast" because the interpreter itself is optimized. But again, this is the wrong way to think about it. You always have to start by asking: what is the user trying to do, and (in comparison to what is considered a fast language) is it fast to compute? If the answer is "no", then the language isn't fast, even if it meets the expected objectives. Playing games like this is why users get confused about "fast" vs. "slow" languages. Slow isn't inherently "bad", but call a spade a spade. In this case, the proper way to talk about it is to say: "It has a fast interpreter." The last word tells any developer with sufficient experience what they need to know (since they understand that statically compiled/JIT and interpreted languages are in different speed classes and shouldn't be directly compared for execution speed).

by ehsantn on 8/6/2025, 9:21 PM

The article highlights important challenges regarding Python performance optimization, particularly due to its highly dynamic nature. However, a practical solution involves viewing Python fundamentally as a Domain Specific Language (DSL) framework, rather than purely as a general-purpose interpreted language. DSLs can effectively be compiled into highly efficient machine code.

Examples such as Numba JIT for numerical computation, Bodo JIT/dataframes for data processing, and PyTorch for deep learning demonstrate this clearly. Python’s flexible syntax enables creating complex objects and their operators such as array and dataframe operations, which these compilers efficiently transform into code approaching C++-level performance. DSL operator implementations can also leverage lower-level languages such as C++ or Rust when necessary. Another important aspect not addressed in the article is parallelism, which DSL compilers typically handle quite effectively.
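A minimal Numba sketch of the DSL idea (assumes `numba` and `numpy` are installed): the decorated function is treated as an embedded numerical DSL and compiled to machine code on first call.

```python
import numpy as np
from numba import njit

@njit
def sum_of_squares(a):
    total = 0.0
    for x in a:        # typed loop: compiled to native code, no boxing
        total += x * x
    return total

a = np.arange(1_000_000, dtype=np.float64)
sum_of_squares(a)         # first call triggers compilation
print(sum_of_squares(a))  # subsequent calls run at native speed
```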

Given that data science and AI are major use cases for Python, compilers like Numba, Bodo, and PyTorch illustrate how many performance-critical scenarios can already be effectively addressed. Investing further in DSL compilers presents a practical pathway to enhancing Python’s performance and scalability across numerous domains, without compromising developer usability and productivity.

Disclaimer: I have previously worked on Numba and Bodo JIT.

by Mithriil on 8/6/2025, 4:04 PM

> His "sad truth" conclusion is that "Python cannot be super-fast" without breaking compatibility.

A decent case for Python 4.0?

> So, maybe, "a JIT compiler can solve all of your problems"; they can go a long way toward making Python, or any dynamic language, faster, Cuni said. But that leads to "a more subtle problem". He put up a slide with a trilemma triangle: a dynamic language, speed, or a simple implementation. You can have two of those, but not all three.

This trilemma keeps pulling me back toward Julia. It's less simple than Python, but much faster (though offset by pre-compilation time), and almost as dynamic. I'm glad this language didn't die.

by nromiun on 8/6/2025, 11:04 AM

I really hope PyPy gets more popular so that I don't have to argue Python is pretty fast for the nth time.

Even if you have to stick to CPython, tools like Numba and Pythran can give you amazing performance for minimal code changes.

by Ulti on 8/6/2025, 12:02 PM

Feels like Mojo is worth a shoutout in this context: https://www.modular.com/mojo. It aims to be a superset of Python in syntax, where functions declared with "fn" instead of "def" are assumed to be statically typed and compilable with Numba-style optimisations.

by johnisgood on 8/7/2025, 10:29 AM

> Python is fast enough for some tasks, he said, which is why there are so many people using it and attending conferences like EuroPython.

The conclusion is logically flawed: it conflates language popularity with performance. Conference attendance and widespread use are sociological indicators, not evidence of Python's performance; conflating the two is intellectually negligent.

Additionally, Python's speed is largely due to C extensions handling performance-critical tasks, not the interpreter itself. Perl, however, is often faster even in pure code, especially for text processing and regex, thanks to its optimized engine, making it inherently quicker in many common scenarios.

by mrkeen on 8/6/2025, 11:17 AM

I didn't read with 100% focus, but this LWN account of the talk seemed to confirm those myths instead of debunking them.

by ic_fly2 on 8/6/2025, 12:19 PM

It’s a good article on speed.

But honestly, the thing that makes any of my programs slow is network calls. There, a nice async setup goes a long way, and then k8s for the scaling.
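For what it's worth, a minimal sketch of that kind of async setup (assumes `aiohttp` is installed; the URL is a placeholder):

```python
import asyncio
import aiohttp

async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def main():
    urls = ["https://example.com"] * 10
    async with aiohttp.ClientSession() as session:
        # gather() overlaps the network waits: total latency is roughly
        # that of the slowest call, not the sum of all of them.
        pages = await asyncio.gather(*(fetch(session, u) for u in urls))
        print(len(pages))

asyncio.run(main())
```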

by quantumspandex on 8/6/2025, 10:55 AM

So we are paying 99% of the performance just for the 1% of cases where it's nice to code in.

Why do people think it's a good trade-off?

by NeutralForest on 8/6/2025, 10:16 AM

Cool article. I think a lot of those issues are not Python-specific, so it's a good overview of what others can learn from a now 30-year-old language! I think we'll probably go down the JS/TS route, where another compiler (PyPy or mypyc or something else) will work alongside CPython, but I don't see Python 4 happening.

by coldtea on 8/6/2025, 11:48 PM

One big Python performance myth is the promise, made several years ago, that Python would get 5x faster within five years. So far the related changes have brought not even 2x gains.

by adsharma on 8/6/2025, 2:21 PM

The most interesting part of this article is the link to SPy, which attempts to find a subset of Python that can be made performant.

by teo_zero on 8/6/2025, 1:26 PM

I don't know Python well enough to propose any meaningful contribution, but it seems to me that most issues would be mitigated by a sort of "final" statement or qualifier that prohibits any further changes to the underlying data structure, thus enabling all the nice optimizations, tricks, and shortcuts that compilers and interpreters can't afford when data is allowed to change shape under their feet.
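Python has no such "final" statement, but `__slots__` is a rough existing analogue of the idea: it seals the attribute layout of instances, which is exactly the kind of guarantee such optimizations would need.

```python
class Point:
    __slots__ = ("x", "y")   # fixed layout: no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
p.x = 3.0                    # fine: "x" is declared in __slots__
try:
    p.z = 4.0                # rejected: "z" is not in __slots__
except AttributeError:
    print("layout is sealed")
```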

by taeric on 8/6/2025, 4:15 PM

It's amusing to see the top comment on the site be about how Common Lisp approached this. And it's hard not to agree with it.

I don't understand how we had super dynamic systems decades ago that were easier to optimize than people care to understand. Heaven help folks if they ever get a chance to use Mathematica.

by hansvm on 8/6/2025, 2:34 PM

In the "dynamic" section, it's much worse than the author outlines. You can't even assume that the constant named "10" will point to a value which behaves like you expect the number 10 to behave.
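In the same spirit, even module-level "constants" are just rebindable attributes; a tiny demonstration:

```python
import math

math.pi = 3      # nothing stops this rebinding
print(math.pi)   # 3 -- any code compiled assuming the old value is now wrong
```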

by abhijeetpbodas on 8/6/2025, 12:19 PM

An earlier version of the talk is at https://www.youtube.com/watch?v=ir5ShHRi5lw (I could not find the EuroPython one).

by pu_pe on 8/6/2025, 2:51 PM

Python and other high-level languages may actually decrease in popularity with better LLMs. If you are not the one writing the code, you might as well have it written in a more performant language from the start.

by actinium226 on 8/6/2025, 6:11 PM

A lot of the examples he gives, like the numpy/calc function, are easily converted to C/C++/Rust. The article sort of dismisses this at the start, and that's fine if we want to focus on the speed of Python itself, but it seems like both the only solution and the obvious solution to many of the problems specified.

by writebetterc on 8/6/2025, 10:51 AM

Good job on dispelling the myth of "compiler = fast". I hope SPy will be able to transfer some of its ideas to CPython with time.

by pabe on 8/6/2025, 2:32 PM

The SPy demo is really good at showing the difference in performance between Python and its derivative. Well done!

by lkirk on 8/6/2025, 6:51 PM

For me, using Python as a data-analysis language, it's not Python's speed that is an annoyance or pain point, it's the concurrency story. Julia's built-in concurrency primitives are much more ergonomic, in my opinion.

by 1vuio0pswjnm7 on 8/6/2025, 8:02 PM

"He started by asking the audience to raise their hands if they thought "Python is slow or not fast enough";"

Wrong question

Maybe something like, "Python startup time is as fast as other interpreters"

Comparatively, Python (startup time) is slow(er)

by fumeux_fume on 8/6/2025, 3:53 PM

Slow or fast ultimately matters in the context in which you need to use it. Perhaps these are only myths and fairy tales for an incredibly small subset of people who value execution speed as the highest priority but choose to use Python for implementation.

by ntoll on 8/6/2025, 12:24 PM

Antonio is a star. He's also a very talented artist.

by pjmlp on 8/6/2025, 11:38 AM

Basically, leave Python to OS and application scripting tasks, and as a BASIC replacement for those learning to program.

by Redoubts on 8/6/2025, 5:48 PM

Wonder if Mojo has gotten anywhere further, since they're trying to bring speed without sacrificing most of the syntax.

https://docs.modular.com/mojo/why-mojo/#a-member-of-the-pyth...

by zazazx on 8/7/2025, 4:59 PM

Whoever wrote that up would be smart to put down the computer language books for a while and brush up on their English composition skills.

(Your downvotes prove me right.)

by meinersbur on 8/6/2025, 11:05 AM

Is it just me or does the talk actually confirm all its Python "myths and fairy tales"?

by crabbone on 8/6/2025, 2:00 PM

Again and again, the most important question is "why?", not "how?". Python wasn't made to be fast. If you want a language that can go fast, you need to build that in from the start: give developers tools to manage memory layout, give developers tools to manage execution flow, let them hint the compiler about situations that present potential for optimization, restrict dispatch and polymorphism, and restrict semantics to fewer interpretations.

Python has none of that. It's a hyper-bloated language with extremely poor design choices all around: many ways of doing the same thing, many ways of doing stupid things, no way of communicating the programmer's intention to the compiler... So why even bother? Why not use a language designed by a sensible designer for this specific purpose?

The news about performance improvements in Python just sounds to me like spending useful resources on useless goals. We aren't going forward by making Python slightly faster and slightly more bloated; we're just making this bad language even harder to get rid of.

by game_the0ry on 8/6/2025, 1:56 PM

I know I am going to get some hate for this from the "Python-stans", but "Python" and "performance" should never be associated with each other, and the same goes for any scripting/interpreted programming language, especially one with a global interpreter lock.

While performance (however you may mean that) is always a worthy goal, you may need to question your choice of language if you start hitting performance ceilings.

As the saying goes - "Use the right tool for the job." Use case should dictate tech choices, with few exceptions.

Ok, now that I have said my piece, you can downvote me :)

by 2d8a875f-39a2-4 on 8/6/2025, 10:40 AM

Do you still need an add-on library to use more than one core?
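For CPU-bound work under the GIL, the usual answer has been processes rather than threads, and that much ships in the standard library; a minimal sketch (the free-threaded, GIL-less builds are a separate, still-experimental story):

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_bound(n):
    # Pure-CPU work, so the parallelism across cores is visible.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:  # defaults to one worker per core
        results = list(pool.map(cpu_bound, [10**6] * 8))
    print(results[:2])
```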

by tuna74 on 8/6/2025, 5:02 PM

In computing terms, saying something is "slow" is kind of pointless. Saying something is "effective" or "low latency" provides much more information.

by robmccoll on 8/6/2025, 12:41 PM

Python as a language will likely never have a "fast" implementation and still be Python. It is way too dynamic to be predictable from the code alone, or even from an execution stream, in a way that allows you to simplify the actual code executed at runtime, whether through AOT or JIT compilation. The language itself is also quite large in terms of syntax and built-in capability at this point, which makes new feature-complete implementations that don't make major trade-offs quite challenging. Given how capable LLMs are at translating code, it seems like the perfect time to build a language with similar syntax, but better-scoped behavior, stricter rules around typing, and tooling to make porting code and libraries automated and relatively painless. What would existing candidates be, and why won't they work as a replacement?