week 04, 2022

Uninitialized Memory: Unsafe Rust is Too Hard

Rust is in many ways not just a modern systems language, but also quite a pragmatic one. It promises safety and provides an entire framework that makes creating safe abstractions possible with minimal to zero runtime overhead. A well-known pragmatic solution in the language is an explicit way to opt out of safety by using unsafe. In unsafe blocks anything goes.

Source: Uninitialized Memory: Unsafe Rust is Too Hard, an article by Armin Ronacher.

Black - The Uncompromising Code Formatter

Black is the uncompromising Python code formatter. By using it, you agree to cede control over minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting. You will save time and mental energy for more important matters.

Blackened code looks the same regardless of the project you're reading. Formatting becomes transparent after a while and you can focus on the content instead.

Black makes code review faster by producing the smallest diffs possible.

Source: black.

Procrastinate: PostgreSQL-based Task Queue for Python

Procrastinate is an open-source Python 3.7+ distributed task processing library, leveraging PostgreSQL to store task definitions, manage locks and dispatch tasks. It can be used within both sync and async code.

In other words, from your main code, you call specific functions (tasks) in a special way and instead of being run on the spot, they’re scheduled to be run elsewhere, now or in the future.
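The idea of deferring a call instead of running it on the spot can be sketched with an in-memory toy queue. This is only an illustration of the concept: `ToyQueue`, `Job`, `run_due` and the `delay` parameter are hypothetical names, this is not Procrastinate's API, and nothing is stored in PostgreSQL here.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Job:
    """A recorded call: which task to run, with what arguments, and when."""
    task_name: str
    kwargs: Dict[str, Any]
    run_at: float  # earliest UNIX timestamp at which the job may run


class ToyQueue:
    """Hypothetical in-memory stand-in for a real, database-backed queue."""

    def __init__(self) -> None:
        self.tasks: Dict[str, Callable[..., Any]] = {}
        self.jobs: List[Job] = []

    def task(self, func: Callable[..., Any]) -> Callable[..., Any]:
        """Register func as a task and attach a defer() helper to it."""
        self.tasks[func.__name__] = func

        def defer(delay: float = 0.0, **kwargs: Any) -> None:
            # Record the call instead of executing it.
            self.jobs.append(Job(func.__name__, kwargs, time.time() + delay))

        func.defer = defer  # type: ignore[attr-defined]
        return func

    def run_due(self, now: Optional[float] = None) -> List[Any]:
        """Worker side: run every job whose run_at has passed."""
        now = time.time() if now is None else now
        due = [job for job in self.jobs if job.run_at <= now]
        self.jobs = [job for job in self.jobs if job.run_at > now]
        return [self.tasks[job.task_name](**job.kwargs) for job in due]


queue = ToyQueue()


@queue.task
def add(a: int, b: int) -> int:
    return a + b


add.defer(a=1, b=2)              # scheduled for "now", not executed yet
add.defer(delay=3600, a=5, b=5)  # scheduled for an hour from now
```

A separate worker would call `queue.run_due()` periodically. A real library additionally has to persist jobs and lock them so that two workers never pick up the same job.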

Source: Procrastinate: PostgreSQL-based Task Queue for Python.

Devious SQL: Message Queuing Using Native PostgreSQL

An interesting question came up on the #postgresql IRC channel about how to use native PostgreSQL features to handle queuing behavior. There are existing solutions for queuing, both within PostgreSQL, such as the venerable pgq project, and as dedicated message queues like RabbitMQ, Kafka, etc. I wanted to explore what could be done with native Postgres primitives and I thought this warranted an entry in my Devious SQL series.

Source: Devious SQL: Message Queuing Using Native PostgreSQL, an article by David Christensen.

An alternative Docker installation with Multipass on macOS

Last week I received an email from the Docker Team which said that Docker for Mac (the software which also comes with a GUI) will be forbidden for commercial use when the company has more than 250 employees OR makes more than $10 million per year. To use it commercially the company has to get licenses for every developer using it, starting at $5/month. This made me think about what an alternative could be for devs that don’t want to use Docker for Mac anymore, since I read a lot of posts saying that many devs don’t even need it. Most of them interact via CLI anyway. I stumbled across a nice article from Josh Gorneau where he uses multipass to host his Docker VM. In this case, it is an Ubuntu 20.04 installation. So a couple of commands, such as the VM configuration used in this post, will be similar to those in Josh’s article.

Source: An alternative Docker installation with Multipass on macOS without using Docker for Mac, an article by Niklas Metje.

Docker for Mac - Docker Machine / Vagrant / Ansible

If you need docker and kernel modules to support things like SCTP, IP_VS, WireGuard etc. then this project might be for you.

These Vagrant boxes are intended to replace Docker for Mac and utilise docker-machine, Vagrant, VirtualBox and Ansible to provide a fully featured Linux VM.

Motivation: Docker for Mac was proving to be a workflow pain rather than a workflow gain. It was slowing down my 16" MacBook Pro (32GB RAM, 6 CPUs), draining the battery, and causing the fans to constantly spin at full speed. There had also been occurrences where kernel modules had been removed, making it difficult to do system development.

Source: Docker for Mac - Docker Machine / Vagrant / Ansible, an article by James Stenhouse.

Rust Futures and Tasks

With the recent talk about “Contexts” in the Rust community, and some other thoughts I had recently, I want to explore in a bit more detail what the difference between Futures and Tasks is in Rust.

The difference between Futures and Tasks is like the difference between concurrency and parallelism.

Source: Rust Futures and Tasks, an article by Arpad Borsos.

The Internals of PostgreSQL

PostgreSQL is a well-designed, open-source, multi-purpose relational database system that is widely used throughout the world. It is one huge system made up of integrated subsystems, each of which implements a particular complex feature and cooperates with the others. Although understanding the internal mechanisms is crucial for both administration and integration with PostgreSQL, its size and complexity make that difficult. The main purposes of this document are to explain how each subsystem works and to provide the whole picture of PostgreSQL.

Source: The Internals of PostgreSQL, an online book by Hironobu Suzuki.

Three kinds of memory leaks

So, you’ve got a program that’s using more and more memory over time as it runs. Probably you can immediately identify this as a likely symptom of a memory leak.

But when we say “memory leak”, what do we actually mean? In my experience, apparent memory leaks divide into three broad categories, each with somewhat different behavior, and requiring distinct tools and approaches to debug. This post aims to describe these classes, and provide tools and techniques for figuring out both which class you’re dealing with, and how to find the leak.

Source: Three kinds of memory leaks, an article by Nelson Elhage.

Micro C, Part 0: Introduction

In this series, we will explore how to write a compiler for a small subset of C to LLVM in Haskell. Our language, Micro C, is basically a small subset of real C. We'll have basic numeric types, a real bool type, pointers, and structs. At the end of the series, we'll have a beautiful executable, mcc (Micro C Compiler), that takes one .mc source file and produces an executable.

Source: Micro C, Part 0: Introduction, an article by Joseph Morag.

The fastest way to read a CSV in Pandas

You have a large CSV, you’re going to be reading it into Pandas, but every time you load it, you have to wait for the CSV to load. And that slows down your development feedback loop, and might meaningfully slow down your production processing.

But it’s possible to read the data in faster. Let’s see how.

Source: The fastest way to read a CSV in Pandas, an article by Itamar Turner-Trauring.

Static Typing Python Decorators

Accurately static typing decorators in Python is an icky business. The wrapper function obfuscates type information required to statically determine the types of the parameters and the return values of the wrapped function.

Let's write a decorator that registers the decorated functions in a global dictionary during function definition time.
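A minimal sketch of such a registering decorator, under the assumption that the decorator returns the function unchanged. The names `REGISTRY`, `register` and `greet` are illustrative and not necessarily the ones used in the article. Because there is no wrapper function, a bound `TypeVar` is enough for a type checker to keep the decorated function's exact signature.

```python
from typing import Any, Callable, Dict, TypeVar

# F stands for "whatever callable type was passed in".
F = TypeVar("F", bound=Callable[..., Any])

# Global registry, filled while the decorated functions are being defined.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(func: F) -> F:
    """Record func under its name and return it unchanged."""
    REGISTRY[func.__name__] = func
    return func


@register
def greet(name: str) -> str:
    return f"Hello, {name}!"


# A type checker still sees greet as (name: str) -> str,
# so a call such as greet(42) would be flagged.
print(greet("world"))
print(sorted(REGISTRY))
```

Decorators that wrap the function in a new inner function need more machinery (for example `typing.ParamSpec`, available from Python 3.10) to carry the parameter types through the wrapper.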

Source: Static Typing Python Decorators, an article by Redowan Delowar.

Creating a Postgres Foreign Data Wrapper

Here at DoltHub some of us have been working with PostgreSQL extensions recently. This is an introductory tutorial on how to get started building a PostgreSQL foreign data wrapper. We introduce the basics around setting up a project for building and installing the extension and how to implement a very basic read only scan.

Source: Creating a Postgres Foreign Data Wrapper, an article by Aaron Son.

Introducing Parser Builders

We are excited to release 0.5.0 of swift-parsing, our library for turning nebulous data into well-structured data, with a focus on composition, performance, and generality. This release brings a new level of ergonomics to the library by using Swift’s @resultBuilder machinery, allowing you to express complex parsers with a minimal amount of syntactic noise.

Source: Introducing Parser Builders.

The Curse of NixOS

I've used NixOS as the only OS on my laptop for around three years at this point. Installing it has felt sort of like a curse: on the one hand, it's so clearly the only operating system that actually gets how package management should be done. After using it, I can't go back to anything else. On the other hand, it's extremely complicated, constantly changing software that requires configuration with the second-worst homegrown config programming language I've ever used.

Source: The Curse of NixOS, an article by Wesley Aptekar-Cassels.

Modern Bash Scripting

Writing shell scripts used to be a major, major pain for me. I remember many frustrating sessions, where I tried to find a misplaced quote or a missing backtick. I cursed shell script and only used it as a last resort.

Source: Modern Bash (Zsh) Scripting.

Why we're migrating (many of) our servers from Linux to FreeBSD

There are many alternative operating systems to Linux and the *BSD family is varied and complete. FreeBSD, in my opinion, today is the "all rounder" system par excellence, i.e. well refined and suitable both for use on large servers and small embedded systems. The other BSDs have strengths that, in some fields, make them particularly suitable but FreeBSD, in my humble opinion, is suitable (almost) for every purpose.

So back to the main topic of this article, why am I migrating many of the servers we manage to FreeBSD? The reasons are many, I will list some of them with corresponding explanations.

Source: Why we're migrating (many of) our servers from Linux to FreeBSD, an article by Stefano Marinelli.