Understanding CPUs can help speed up Numba and NumPy code
When you need to speed up your NumPy processing—or just reduce your memory usage—the Numba just-in-time compiler is a great tool. It lets you write Python code that gets compiled at runtime to machine code, allowing you to get the kind of speed improvements you’d get from languages like C, Fortran, or Rust.
Or at least, that’s the theory. In practice, your initial Numba code may be no faster than the NumPy equivalent.
But you can do better once you understand how CPUs work. And that knowledge will help you more broadly, with any compiled language.
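To make the starting point concrete, here is a minimal sketch of the kind of first Numba function described above, next to a NumPy equivalent. The function name `sum_of_squares` and the sample data are illustrative assumptions, not taken from the article, and the fallback decorator exists only so the snippet still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without Numba, fall back to plain Python
    # so the example still runs (just without compilation).
    def njit(func):
        return func


@njit
def sum_of_squares(values):
    # An explicit loop: Numba compiles this to machine code on first call.
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return total


data = np.arange(1_000, dtype=np.float64)

# Compiled loop: no intermediate array is allocated.
numba_result = sum_of_squares(data)

# NumPy equivalent: `data * data` creates a temporary array before summing.
numpy_result = float(np.sum(data * data))
```

Both versions compute the same value. The loop version avoids the temporary array, which is where the memory savings come from, but whether it is actually faster than NumPy depends on how well the compiled code uses the CPU.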
Source: Understanding CPUs can help speed up Numba and NumPy code, an article by Itamar Turner-Trauring.