week 01, 2023

Sun 08 Jan 2023

Things they didn’t teach you about Software Engineering

As always, a disclaimer before we start: this is purely subjective. Whether you are a seasoned professional or just starting out in the field, I hope these insights will provide valuable perspective.

Source: Things they didn't teach you about Software Engineering, an article by Vadim Kravcenko.

software development

Dotfiles Management

Not much of a project, but this might be useful for some folks. Here's how I am currently keeping track of all the configuration for my laptop.

The system I've settled on is copied from other people – tracking dotfiles as a git repo – but taken to its extreme where the entire root filesystem is trackable. Importantly,

Any file on the machine can be added to the dotfiles repo

The dotfiles repo doesn't interfere with any other git repos you're working with

Source: Dotfiles Management, an article by Tim Alex Jacobs.

dotfile

Verifiable AES: encryption using zero-knowledge proofs

Encryption is transforming messages into random-looking texts to ensure confidentiality between two parties. What is our objective here? We want to generate a proof allowing us to verify an encryption algorithm, ensuring it does what it was designed for.

Source: Verifiable AES: encryption using zero-knowledge proofs.

cryptography

To Green Angel Tower, Part 1

The evil minions of the undead Sithi Storm King are beginning their final preparations for the kingdom-shattering culmination of their dark sorceries, drawing King Elias ever deeper into their nightmarish, spell-spun world.

As the Storm King’s power grows and the boundaries of time begin to blur, the loyal allies of Prince Josua struggle to rally their forces at the Stone of Farewell. There, too, Simon and the surviving members of the League of the Scroll have gathered for a desperate attempt to unravel mysteries from the forgotten past.

For if the League can reclaim these age-old secrets of magic long-buried beneath the dusts of time, they may be able to reveal to Josua and his army the only means of striking down the unslayable foe....

In the evening I started reading To Green Angel Tower, Part 1, book 3 of Memory, Sorrow, and Thorn by Tad Williams.

Sat 07 Jan 2023

A Brief Defense of XML

XML is precisely what it says on the tin: an extensible markup language. It’s a markup language with a completely uniform syntax so that the alphabet of markup elements is customizable. And for what it is, there is truly no replacement. Every other markup language supports only a limited set of markup directives defined from the factory. The tradeoff is generality for ease of authoring: limited markup languages can have terser syntax for specific elements.
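To make the "customizable alphabet of markup elements" concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The recipe vocabulary (recipe, ingredient, step, qty) is invented for this illustration and does not come from the article.

```python
import xml.etree.ElementTree as ET

# A made-up vocabulary: none of these element names are predefined by XML.
doc = """<recipe name="omelette">
  <ingredient qty="2">egg</ingredient>
  <ingredient qty="1">butter</ingredient>
  <step>Whisk the eggs.</step>
</recipe>"""

root = ET.fromstring(doc)

# The same generic parser handles any element alphabet.
ingredients = [(el.text, el.get("qty")) for el in root.iter("ingredient")]
```

The parser needs no knowledge of the vocabulary up front, which is the uniformity the excerpt describes.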

Source: A Brief Defense of XML, an article by Fernando Borretti.

Parallel streaming in Haskell: Part 1 - Fast, efficient, and fun!

Over the last 2 years, we moved our inherently sequential data processing engine, written in Haskell, to a parallel version. Running the parallel version of our system barely increases CPU time, while the wall time (time from start to end) is significantly reduced.

Source: Parallel streaming in Haskell: Part 1 - Fast efficient fun, an article by Yorick Sijsling and Joris Burgers.

haskell

Logging practices I follow

There are many pitfalls that can lead to useless, wasteful and confusing logs. Therefore I follow a specific set of practices which allows me to write better logs while also being consistent across the system.

Remember that logs are for developers; you are going to be the only one reading them. So before you log something, ask yourself:

Is this log really needed? Does it relay important information I couldn’t get from the other logs in the same flow?

Am I going to log an object that can be huge in production? If so, can I log just a few metrics of that object instead? For example, its length, or a few hand-picked important attributes.

Will the information I am about to log help me debug or understand the flow?
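The second question above can be sketched in Python as follows. This is only an illustration under assumptions: the logger name, the summarize helper, and the id/status keys are hypothetical and not taken from the article.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


def summarize(items, sample_keys=("id", "status")):
    """Return a small, loggable summary of a potentially huge list of dicts."""
    first = items[0] if items else {}
    return {
        "count": len(items),
        # Hand-pick a few attributes instead of dumping the whole object.
        "first": {key: first.get(key) for key in sample_keys},
    }


def process(items):
    # Log the size and a few chosen attributes, not the full payload.
    logger.info("processing batch: %s", summarize(items))
    return len(items)
```

The log line stays roughly the same size no matter how large the batch is in production.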

Source: Logging practices I follow, an article by Eliran Turgeman.

logging

The Snowman (2017)

Detective Harry Hole investigates the disappearance of a woman whose scarf is found wrapped around an ominous-looking snowman.

In the evening Esme and I watched The Snowman. The movie is based on a book by Jo Nesbø which I read several years ago. I liked the movie, although it was a bit slow, and give it a 7 out of 10.

Fri 06 Jan 2023

Stone of Farewell: Good

In the afternoon I finished Stone of Farewell, Memory, Sorrow & Thorn Book 2 by Tad Williams. I liked the book; it is at least as good as the previous one in the series or maybe slightly better.

Microfeatures I'd like to see in more languages

There are roughly three classes of language features:

1. Features that the language is effectively designed around, such that you can’t add it after the fact. Laziness in Haskell, the borrow checker in Rust, etc.

2. Features that heavily define how to use the language. Adding these is possible later, but would take a lot of design, engineering, and planning. I’d say pattern matching, algebraic data types, and async fall under here.

3. Quality-of-life features that aren’t too hard to add, and don’t meaningfully change a language in its absence. Often syntactic sugar, like Python’s chained evaluators (if 2 <= x < 10).

Most PLT and language design work is focused around (1) and (2), because those are the most important, but I have a deep fondness for (3)-type features. Because they’re so minor, they’re the most likely to spread between languages, since the overhead of adding them is so small. Since I spend a lot of time in niche obscure languages, I also encounter a lot of cool QoL features that most people might not have seen before. Here’s a few of them!
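As a small illustration of the chained comparison mentioned in the excerpt, here is a minimal Python sketch. The function name in_range is made up for this example.

```python
def in_range(x):
    # Chained comparison: behaves like (2 <= x) and (x < 10),
    # except that x is evaluated only once.
    return 2 <= x < 10
```

In most other languages the same check has to be spelled out as two comparisons joined with a logical and.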

Source: Microfeatures I'd like to see in more languages, an article by Hillel Wayne.

software development

Announcing turmoil

Today, we are happy to announce the initial release of turmoil, a framework for developing and testing distributed systems.

Testing distributed systems is hard. Non-determinism is everywhere (network, time, threads, etc.), making reproducible results difficult to achieve. Development cycles are lengthy due to deployments. All these factors slow down development and make it difficult to ensure system correctness.

turmoil strives to solve these problems by simulating hosts, time and the network. This allows for an entire distributed system to run within a single process on a single thread, achieving deterministic execution. We also provide fine-grained control over the network, with support for dropping, holding and delaying messages between hosts.

Source: Announcing turmoil, an article by Brett McChesney.

rust

Johnny Mnemonic (1995)

A data courier, literally carrying a data package inside his head, must deliver it before he dies from the burden or is killed by the Yakuza.

In the evening Alice, Esme, and I watched Johnny Mnemonic. The movie was quite slow and I give it a 6 out of 10.

Thu 05 Jan 2023

Prototype Pollution in Python

The main objective of this research is to prove the possibility of having a variation of Prototype Pollution in other programming languages, including those that are class-based by showing Class Pollution in Python.
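A rough sketch of what class pollution can look like, assuming a naive recursive merge of untrusted input onto an object. The Employee class and the merge helper are hypothetical and written for this note; they are not the research's own code.

```python
class Employee:
    role = "staff"  # class attribute shared by all instances


def merge(src, dst):
    """Naively copy a nested dict onto an object's attributes (unsafe on purpose)."""
    for key, value in src.items():
        if isinstance(value, dict) and hasattr(dst, key):
            # Recursing through attributes lets input reach special names
            # such as __class__.
            merge(value, getattr(dst, key))
        else:
            setattr(dst, key, value)


victim = Employee()
other = Employee()

# Attacker-controlled input walks from the instance to its class via __class__.
merge({"__class__": {"role": "admin"}}, victim)
```

After the merge, the role attribute has changed on the class itself, so other, which was never touched directly, also sees the new value.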

Source: Prototype Pollution in Python.

python

6 Practices for Effective Pull Requests

We’ll dig deeper into some of the benefits and drawbacks of Github Flow - specifically around pull request-based code review - later in this article, with the goal of identifying practices that allow us to work most effectively within this model.

Source: 6 Practices for Effective Pull Requests, an article by Pete Hodgson.

github

Generalized LL (GLL) Parser

This tutorial is a complete implementation of GLL Parser in Python including SPPF parse tree extraction. The Python interpreter is embedded so that you can work through the implementation steps.

Source: Generalized LL (GLL) Parser, an article by Rahul Gopinath.

Wed 04 Jan 2023

Python Magic Methods You Haven't Heard About

Python's magic methods - also known as dunder (double underscore) methods - can be used to implement a lot of cool things. Most of the time we use them for simple stuff, such as constructors (__init__), string representation (__str__, __repr__) or arithmetic operators (__add__/__mul__). There are however many more magic methods which you probably haven't heard about and in this article we will explore all of them (even the hidden and undocumented)!
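For reference, the "simple stuff" named in the excerpt might look like the sketch below. The Vector class is an assumption made for illustration and covers only the common methods, not the lesser-known ones the article explores.

```python
class Vector:
    def __init__(self, x, y):
        # Constructor: called when Vector(x, y) is created.
        self.x, self.y = x, y

    def __repr__(self):
        # Unambiguous string representation, used by repr() and the REPL.
        return f"Vector({self.x}, {self.y})"

    def __add__(self, other):
        # Implements the + operator between two vectors.
        return Vector(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar):
        # Implements vector * scalar.
        return Vector(self.x * scalar, self.y * scalar)
```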

Source: Python Magic Methods You Haven't Heard About, an article by Martin Heinz.

python

Lazy Evaluation Using Recursive Python Generators

We all are familiar with Python's generators and all their benefits. But, what if I told you that we can make them even better by combining them with recursion? So, let's see how we can use them to implement "lazy recursion" and supercharge what we already do with generators in Python!
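One way to picture a recursive generator is a lazy flatten of nested lists using yield from. This is a sketch of the general technique, not necessarily one of the article's examples, and the name flatten is chosen here for illustration.

```python
from itertools import islice


def flatten(nested):
    """Lazily yield the leaves of arbitrarily nested lists."""
    for item in nested:
        if isinstance(item, list):
            # Delegate to a recursive generator; nothing runs until consumed.
            yield from flatten(item)
        else:
            yield item


# Only as much of the structure is walked as the consumer asks for.
first_two = list(islice(flatten([[1, 2], [3, [4, 5]]]), 2))
```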

Source: Lazy Evaluation Using Recursive Python Generators, an article by Martin Heinz.

python

Don't do this: creating useless indexes

I’d like to talk about a fight I’m having with some developers (most developers?) almost every time I’m called for performance reasons: stop creating more indexes and only keep the useful ones!

Source: Don't do this: creating useless indexes, an article by Lætitia Avrot.

postgres

Tue 03 Jan 2023

Zero-dependency random number generation in Rust

Random numbers are very interesting. It feels like magic that we can generate such unpredictable entropies from deterministic sources.

But how does this happen? Before jumping into the generation of random numbers in Rust, let's understand the process of random number generation and how true randomness could never be created without special hardware.
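To show what a deterministic source means in practice, here is a minimal xorshift32 sketch in Python rather than Rust. The choice of algorithm and the function name are assumptions made for this note; the article may use different techniques.

```python
from itertools import islice


def xorshift32(seed):
    """Yield an endless stream of 32-bit pseudo-random integers."""
    state = seed & 0xFFFFFFFF
    if state == 0:
        raise ValueError("seed must be non-zero in its low 32 bits")
    while True:
        # Shift-and-xor steps (13, 17, 5), masked to stay within 32 bits.
        state ^= (state << 13) & 0xFFFFFFFF
        state ^= state >> 17
        state ^= (state << 5) & 0xFFFFFFFF
        yield state


# The same seed always reproduces the same "random" sequence.
run_a = list(islice(xorshift32(42), 5))
run_b = list(islice(xorshift32(42), 5))
```

Because the output depends only on the seed, unpredictability has to come from how the seed is chosen, which is where an external entropy source comes in.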

Source: Zero-dependency random number generation in Rust, an article by Orhun Parmaksız.

rust

Writing a Python SQL engine from scratch

When I first started writing SQLGlot in early 2021, my goal was just to translate SQL queries from SparkSQL to Presto and vice versa. However, over the last year and a half, I've ended up with a full-fledged SQL engine. SQLGlot can now parse and transpile between 18 SQL dialects and can execute all 24 TPC-H SQL queries. The parser and engine are all written from scratch using Python.

Source: Writing a Python SQL engine from scratch, an article by Toby Mao.

python

Keeping Your Valuables Under Lock and Key

Locking reference attributes can be a fast and easy way to protect the internal state of your objects.

Perl has built-in support for read-only arrays and hashes via Internals::SvREADONLY, but modules like Sub::Trigger::Lock exist to make using the feature simpler in object-oriented code.

Source: Keeping Your Valuables Under Lock and Key, an article by Toby Inkster.

perl

The Man from Toronto (2022)

The world's deadliest assassin and New York's biggest screw-up are mistaken for each other at an Airbnb rental.

In the evening Esme and I watched The Man from Toronto. I liked the movie and give it a 7 out of 10.

Mon 02 Jan 2023

A Decade of HardenedBSD

This year, HardenedBSD's codebase will turn a decade old. This article provides a retrospective on ten years of hard, rewarding work along with some personal reflections.

Source: A Decade of HardenedBSD, an article by Shawn Webb.

How to Test

This post describes my current approach to testing. When I started programming professionally, I knew how to write good code, but good tests remained a mystery for a long time. This is not due to a lack of advice; on the contrary, there’s an abundance of information & terminology about testing. This celestial emporium of benevolent knowledge includes TDD, BDD, unit tests, integrated tests, integration tests, end-to-end tests, functional tests, non-functional tests, blackbox tests, glassbox tests, …

Knowing all this didn’t help me to create better software. What did help was trying out different testing approaches myself, and looking at how other people write tests. Keep in mind that my background is mostly in writing compiler front-ends for IDEs. This is a rather niche area, which is especially amenable to testing. Compilers are pure self-contained functions. I don’t know how to best test modern HTTP applications built around inter-process communication.

Source: How to Test, an article by Alex Kladov.

testing

Classifying Python virtual environment workflows

I have been spending some time as of late thinking, and asking the community via the fediverse, about how people deal with virtual environments in Python. I have ended up with various ways of classifying people's virtual environment management and I wanted to write it all down to both not forget and to explain to all the nice people answering my various polls on the topic why I was asking those questions.

Source: Classifying Python virtual environment workflows, an article by Brett Cannon.

python

Wind River (2017)

A veteran hunter helps an FBI agent investigate the murder of a young woman on a Wyoming Native American reservation.

In the evening Alice, Esme, and I watched Wind River. I liked the movie and give it an 8 out of 10.