2 min read
The Curious Case of 257

The mysterious ways of a certain programming language

Python has an interesting caching feature, which I discovered recently -

code

This is called small integer caching.

What Python does is pre-allocate a global list of integers as singletons in the range [-5, 256] during initialisation. So every time that an integer is created in this range, instead of creating a new object, a reference to the existing object is returned. Therefore it saves a lot of space and computation for commonly used integers.

This can also be seen during computations.

code

Another interesting byte. When initialised in the same line, the interpreter creates a new object for the first variable, and then reference the second variable with the same object. This is because the compiler can see both the variables and can optimise the referencing.

code

Hey wait! I am not done yet.

code

Another new confusing thing?

The compiler also optimises identical literals, when the code is compiled in the same code object. So, 1. If the Python compiler can see the whole code, more optimisations may apply to the code. When typing into the Python shell each line is a completely different statement, parsed, and compiled separately.

This works for floats as well, which is not cached.

code

Literals inside tuples and other data structures are shared as well.

code

Java uses a similar approach and caches the integers between -128 to 127. However this works only on autoboxing. They’re not cached if built with the constructor.

References -

Why Python is Slow: Looking Under the Hood

Python internals: Arbitrary-precision integer implementation

What’s with the integer cache maintained by the interpreter?