Sun 11 Apr 2021

Bloom filters

A Bloom filter is a probabilistic data structure present in many common applications. Its purpose is answering the question: "is this item in the set?" very fast and not using a lot of space. The answers can be NO, or MAYBE YES.

Source: Bloom filters an article by Ricardo Ander-Egg Aguilar.

Using PyTorch + NumPy? You're making a mistake.

Bugs in ML code are notoriously hard to fix - they don’t cause compile errors but silently regress accuracy. Once you have endured the pain and fixed one of these, the lesson is forever etched into your brain, right? Wrong. Recently, an old foe made a comeback - a familiar bug bit me again! As before, the performance improved significantly after fixing it.

The bug was subtle and easy to make. How many others has it done damage to? Curious, I downloaded over a hundred thousand repositories from GitHub that import PyTorch, and analysed their source code. I kept projects that define a custom dataset, use NumPy’s random number generator with multi-process data loading, and are more-or-less straightforward to analyse using abstract syntax trees. Out of these, over 95% of the repositories are plagued by this problem. It’s inside PyTorch’s official tutorial, OpenAI’s code, and NVIDIA’s projects. Even Karpathy admitted falling prey to it.

Source: Using PyTorch + NumPy? You're making a mistake, an article by Tanel Pärnamaa.

Rust async is colored, and that’s not a big deal

In this article I will show that Rust async functions are colored, by both the original definition and in practice. This is not meant as an criticism of Rust async, though – I don’t see function colors as an insurmountable issue, but as a reflection of the fundamental difference of async and sync models of the world. Languages that hide that difference do so by introducing compromises that might not be acceptable in a systems language like Rust or C++ – for example, by entirely forbidding the use of system threads, or by complicating the invocation of foreign or OS-level blocking calls. Colored functions are also present in at least C#, Python, Kotlin, and C++, so they’re not a quirk of JavaScript and Rust. And additional features of Rust async do make it easier to connect async code with traditional blocking code, something that is just not possible in JavaScript.

Source: Rust async is colored, and that’s not a big deal.