Nix shell template

Nix shells are the best tool for creating software development environments right now. This article provides a template to get you started with Nix shells from scratch, and explains how to add common features.

Source: Nix shell template, an article by Victor Engmark.

TIL - IN is not the same as ANY

Not exactly from today, rather from a month or two ago, but still on my “noteworthy list”. So after a remarkably long quiet period of no surprises (Postgres doesn’t generally surprise one badly), I managed to learn something controversial - a thing considered generally good, using ANY instead of IN-list in this case, can have downsides nevertheless!

Source: TIL - IN is not the same as ANY, an article by Kaarel Moppel.

Corinna in the Perl Core

It’s been a years-long, painful process, but with the release of Perl v5.38.0, the first bits of Corinna have been added to the Perl core. For those who have not been following along, Corinna is a project to add a new object system to the Perl core. Note that it’s not taking anything away from Perl; it’s adding a core object system for better memory consumption, performance, and elegance.

Source: Corinna in the Perl Core, an article by Curtis “Ovid” Poe.

Demystifying Text Data with the unstructured Python Library

In the world of data, textual data stands out as being particularly complex. It doesn’t fall into neat rows and columns like numerical data does. As a side project, I’m in the process of developing my own personal AI assistant. The objective is to use the data within my notes and documents to answer my questions. The important benefit is that all data processing will occur locally on my computer, ensuring that no documents are uploaded to the cloud, and my documents will remain private.

To handle such unstructured data, I’ve found the unstructured Python library to be extremely useful. It’s a flexible tool that works with various document formats, including Markdown, XML, and HTML documents.

Source: Demystifying Text Data with the unstructured Python Library (+alternatives), an article by Saeed Esmaili.

Image Upscaling Using Neural Networks

Do you remember those classic scenes from the CSI TV series? A detective, peering at a pixelated image from a surveillance camera, instructs the tech whiz, "zoom enhance". With some keyboard strokes, the blurry image transforms, revealing a perfectly clear license plate. We've all had a good laugh at that, dismissing it as pure Hollywood bullshit, right?

Source: Image Upscaling Using Neural Networks.

Regex engine internals as a library

Over the last several years, I’ve rewritten Rust’s regex crate to enable better internal composition, and to make it easier to add optimizations while maintaining correctness. In the course of this rewrite I created a new crate, regex-automata, which exposes much of the regex crate internals as their own APIs for others to use. To my knowledge, this is the first regex library to expose its internals to the degree done in regex-automata as a separately versioned library.

This blog post discusses the problems that led to the rewrite, how the rewrite solved them and a guided tour of regex-automata’s API.

Source: Regex engine internals as a library, an article by Andrew Gallant.

Mastering Intermediate Linux Commands for Server Management

As a sysadmin, you often come across complex tasks that require more than just basic commands. That’s why it’s important to learn some intermediate-level Linux commands that can make your work easier and more efficient.

These commands can help you automate repetitive tasks, manage processes, and monitor system performance, among other things. In this article, we will explore some of these commands and their usage.

Source: Mastering Intermediate Linux Commands for Efficient Server Management, an article by Akash Rajpurohit.

Most Tests Should Be Generated

Traditional testing wisdom eventually invokes the test pyramid, which is a guide to the proportion of tests to write along the isolation / integration spectrum. There’s an eternal debate about what the best proportion should be at each level, but interestingly it’s always presented with the assumption that test cases are hand-written. We should also think about test generation as a dimension, and if I were to draw a pyramid about it I’d place generated tests on the bottom and hand-written scenarios on top, i.e. most tests should be generated.

Source: Most Tests Should Be Generated, an article by Alex Weisberger.
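To make the idea of generated tests concrete, here is a minimal sketch that is not taken from the linked article. It assumes a hypothetical run-length encoder as the code under test and uses only the standard library `random` module. Instead of hand-writing examples, it generates many inputs and checks one property: decoding an encoded string gives back the original.

```python
import random


def run_length_encode(text):
    """Hypothetical function under test: 'aaab' -> [('a', 3), ('b', 1)]."""
    encoded = []
    for char in text:
        if encoded and encoded[-1][0] == char:
            # Same character as the previous run: extend that run.
            encoded[-1] = (char, encoded[-1][1] + 1)
        else:
            encoded.append((char, 1))
    return encoded


def run_length_decode(pairs):
    """Inverse of run_length_encode."""
    return "".join(char * count for char, count in pairs)


def generated_round_trip_cases(case_count=200, seed=0):
    """Generate random inputs and check a property instead of fixed examples."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(case_count):
        length = rng.randint(0, 20)
        # A two-letter alphabet makes repeated runs likely.
        text = "".join(rng.choice("ab") for _ in range(length))
        assert run_length_decode(run_length_encode(text)) == text
    return case_count
```

A dedicated property-based testing library would add input shrinking and smarter generators; the loop above only shows the basic shape.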

Two Ways to Turbo-Charge tox

The traditional way to speed up tox runs is running it as tox run-parallel (née tox --parallel or just tox -p). And while it’s currently broken in tox 4 for some users (yours truly included), it’s a great feature that Nox is sorely lacking.

But there are more ways, and I’d like to share two of them with you. Both methods don’t make much difference in CIs like GitHub Actions (just like tox run-parallel, mind you!), but they can do wonders for your local development. Which is where I have the least patience, so let’s dive right in!

Source: Two Ways to Turbo-Charge tox, an article by Hynek Schlawack.

Demystifying Pratt Parsers

Pratt parsers are a beautiful way of solving the operator precedence problem:

How can an expression like 1+2-3*4+5/6^7-8*9 be parsed to meet the expectations of your PEMDAS-trained brain? Where do you put the parentheses? What goes first?

Source: Demystifying Pratt Parsers, an article by Martin Janiczek.
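As a rough illustration of how a Pratt parser answers the "where do the parentheses go" question, here is a minimal sketch that is not taken from the linked article. The binding powers and the helper names (`tokenize`, `parse`, `show`) are my own assumptions, and `^` is treated as right-associative.

```python
import re

# Binding powers: a higher number binds tighter. These values are assumptions.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20, "^": 30}
RIGHT_ASSOCIATIVE = {"^"}


def tokenize(source):
    """Split the input into integer and operator/parenthesis tokens."""
    return re.findall(r"\d+|[-+*/^()]", source)


def parse(tokens, min_power=0):
    """Parse tokens (consumed from the front) into a nested tuple tree."""
    token = tokens.pop(0)
    if token == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # drop the closing ")"
    else:
        left = int(token)
    while tokens and tokens[0] in BINDING_POWER:
        operator = tokens[0]
        power = BINDING_POWER[operator]
        if power < min_power:
            break  # this operator belongs to an enclosing call
        tokens.pop(0)
        # Left-associative operators demand a strictly stronger operator on
        # their right-hand side; right-associative ones accept an equal one.
        next_min = power if operator in RIGHT_ASSOCIATIVE else power + 1
        right = parse(tokens, next_min)
        left = (operator, left, right)
    return left


def show(tree):
    """Render the tree with explicit parentheses around every operation."""
    if isinstance(tree, int):
        return str(tree)
    operator, left, right = tree
    return f"({show(left)}{operator}{show(right)})"


print(show(parse(tokenize("1+2-3*4+5/6^7-8*9"))))
# prints ((((1+2)-(3*4))+(5/(6^7)))-(8*9))
```

The sketch does no error handling: malformed input such as a missing closing parenthesis will raise or misparse.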

How to think about async/await in Rust

Some documentation of Rust async and await has presented it as a seamless alternative to threads. Just sprinkle these keywords through your code and get concurrency that scales better! I think this is very misleading. An async fn is a different thing from a normal Rust fn, and you need to think about different things to write correct code in each case.

This post presents a different way of looking at async that I think is more useful, and less likely to lead to cancellation-related bugs.

Source: How to think about async/await in Rust, an article by Cliff L. Biffle.

Joins 13 Ways

Relational (inner) joins are really common in the world of databases, and one weird thing about them is that it seems like everyone has a different idea of what they are. In this post I’ve aggregated a bunch of different definitions, ways of thinking about them, and ways of implementing them that will hopefully be interesting. They’re not without redundancy, some of them are arguably the same, but I think they’re all interesting perspectives nonetheless.

Source: Joins 13 Ways, an article by Justin Jaffray.
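Two of the more familiar ways to look at an inner join can be sketched in a few lines. This is my own illustration, not code from the linked article: rows are plain dictionaries, and the function names and the sample tables in the usage lines are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(left_rows, right_rows, key):
    """Inner join as a filtered cross product of both inputs."""
    return [
        {**left, **right}
        for left in left_rows
        for right in right_rows
        if left[key] == right[key]
    ]


def hash_join(left_rows, right_rows, key):
    """Inner join that first indexes the right side by the join key."""
    index = defaultdict(list)
    for right in right_rows:
        index[right[key]].append(right)
    return [
        {**left, **right}
        for left in left_rows
        # .get avoids creating empty entries for keys with no match
        for right in index.get(left[key], [])
    ]


# Hypothetical sample data.
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 3, "item": "lamp"},
]
assert nested_loop_join(users, orders, "user_id") == hash_join(users, orders, "user_id")
```

Both functions return the same rows; the difference is that the nested loop compares every pair, while the hash variant looks each left row up in an index.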

You Don’t Need __all__

Every now and then, I get a PR from a well-meaning contributor trying to add __all__ to a Python module for whatever reason. I always decline these, they are unnecessary (at least for the way I structure my code) and I thought I’d write a short post explaining why.

Source: You Don’t Need __all__, an article by James Turk.
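For context on what `__all__` actually does, here is a small sketch of my own, not from the linked article. It builds two in-memory modules (hypothetical names `demo_with_all` and `demo_with_underscore`) and lists which names `from module import *` would bind. One module uses `__all__`; the other relies on the underscore-prefix convention. Whether that convention is the reasoning the article gives is an assumption on my part.

```python
import sys
import types


def build_module(name, source):
    """Create an in-memory module (a stand-in for a real .py file)."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module  # make it importable by name
    return module


def star_import_names(module_name):
    """Return the names that `from module_name import *` would bind."""
    namespace = {}
    exec(f"from {module_name} import *", namespace)
    # exec adds __builtins__ to the namespace; it is not part of the import.
    return sorted(name for name in namespace if name != "__builtins__")


# Variant 1: __all__ lists the public names explicitly.
build_module(
    "demo_with_all",
    "__all__ = ['public']\ndef public(): pass\ndef helper(): pass\n",
)
# Variant 2: no __all__; the helper is hidden with a leading underscore.
build_module(
    "demo_with_underscore",
    "def public(): pass\ndef _helper(): pass\n",
)

print(star_import_names("demo_with_all"))         # prints ['public']
print(star_import_names("demo_with_underscore"))  # prints ['public']
```

Either way, `__all__` only affects star imports; an explicit `from demo_with_all import helper` still works.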

Tree-Structured Concurrency

In this post I want to provide you with a practical introduction to structured concurrency. I will do my best to explain what it is, why it's relevant, and how you can start applying it to your Rust projects today. Structured concurrency is a lens I use in almost all of my reasoning about async Rust, and I think it might help others too.

Source: Tree-Structured Concurrency, an article by Yoshua Wuyts.

Cause of Death

On a quiet day, away from the hustle of Richmond, in a small cottage on the Virginia coast, Dr. Kay Scarpetta receives a disturbing phone call from the Chesapeake police. Thirty feet deep in the murky waters of Virginia's Elizabeth River, a scuba diver's body is discovered near the Inactive Naval Shipyard. As the police begin searching for clues, the wallet of investigative reporter Ted Eddings is found.

Unnerved by the possible identity of the victim, Scarpetta orders the crime scene roped off and left alone until she arrives. What was he doing there, searching for Civil War relics as the officer suggested, or was there a bigger story? As she rifles through the multitude of clues, a second murder hits much closer to home. This new development puts Scarpetta and her colleagues hot on the trail of a military conspiracy.

In the evening I started reading Cause of Death, Kay Scarpetta book 7 by Patricia Cornwell.

My git workflow

Every now and then, at work, I find myself discussing git workflows, commit messages, branching, releasing, versioning, changelogs etc. Since my opinion has remained fairly consistent for the past few years, I found myself repeating the same points a lot, so I wrote it down. This page is the resulting compilation of my opinions on the software development lifecycle (SDLC), without workplace-specific tangents.

Source: My git workflow, an article by Jean-Baptiste Doyon.

Automating Command Execution Across All Tmux Panes

As developers, we often find ourselves working in multiple tmux panes, each running different applications or instances of the same application. When we make changes to a configuration file, such as ~/.vimrc for Vim or ~/.aliases for our shell, we need to manually reload that configuration in each relevant instance. This can be a time-consuming process, especially when working with a large number of panes. But also, let's be wizards and automate this process!

In this post, we'll explore a simple automation that can save you a lot of time and effort. We'll focus on a specific use case — reloading a .vimrc file across all Vim instances in tmux panes — but the pattern can be applied to a variety of scenarios.

Source: TIL: Automating Command Execution Across All Tmux Panes, an article by François Leblanc.

NixOS and my Descent into Insanity

As I tend to do, I picked a topic to write about that is much larger in scope than I could manage in a reasonable amount of time. Did I learn? Apparently not. This article started off with switching from zsh to fish. Then I thought, "Might as well manage it all with Nix!", which led me to switch to home manager to manage my dotfiles which led me to using Nix everywhere I possibly could.

As expected, using Nix where it's not supported caused some issues. Buckle up, and watch my slow descent into madness (Nix).

Source: NixOS and my Descent into Insanity.

Practical Procedural Macros

An explanation of how to implement practical procedural macros in the Rust programming language. Explains the different types of macros, then shows an implementation of a procedural macro following best practices, focusing on testing and ergonomics. Assumes some familiarity with Rust.

Source: Practical Procedural Macros, an article by Hugo Elhaj-Lahsen.

Rust fact vs. fiction: 5 Insights from Google's Rust journey

In this post, we will analyze some data covering years of early adoption of Rust here at Google. At Google, we have been seeing increased Rust adoption, especially in our consumer applications and platforms. Pulling from the over 1,000 Google developers who have authored and committed Rust code as some part of their work in 2022, we’ll address some rumors head-on, both confirming some issues that could be improved and sharing some enlightening discoveries we have made along the way.

Source: Rust fact vs. fiction: 5 Insights from Google's Rust journey in 2022, an article by Lars Bergstrom and Kathy Brennan.

FreeBSD Jails Containers

FreeBSD networking and containers (Jails) stacks are very mature and provide lots of useful features … yet for some reason these features are not properly advertised by the FreeBSD project … or not even documented at all.

Source: FreeBSD Jails Containers.

Is ORM still an 'anti pattern'?

ORMs are one of those things that software writers like to pick on. There are many online articles that go by the same tune: “ORMs are an anti-pattern. They are a toy for startups, but eventually hurt more than help.”

This is an exaggeration. ORMs aren’t bad. Are they perfect? Definitely not, just like anything else in software. At the same time, the criticisms are expected—two years ago, I would’ve agreed with that stereotyped headline wholeheartedly. I’ve had my share of “What do you mean the ORM ran the server out of memory?” incidents.

But in reality, ORMs are more misused than overused.

Source: Is ORM still an 'anti pattern'?, an article by Anh-Tho Chuong.

When NumPy is too slow

If you’re doing numeric calculations, NumPy is a lot faster than plain Python, but sometimes that’s not enough. What should you do when your NumPy-based code is too slow?

Your first thought might be parallelism, but that should probably be the last thing you consider. There are many speedups you can do before parallelism becomes helpful, from algorithmic improvements to working around NumPy’s architectural limitations.

Let’s see why NumPy can be slow, and then some solutions to help speed up your code even more.

Source: When NumPy is too slow, an article by Itamar Turner-Trauring.

The self-supervised learning cookbook

We have released a new “Cookbook of Self-Supervised Learning,” a practical guide for AI researchers and practitioners on how to navigate SSL recipes, understand its various knobs and levers, and gain the know-how needed to experiment with SSL's untapped flavors. This is part of our efforts to lower the barrier and help democratize access to SSL research. You’ll also find tips and tricks from more than a dozen authors across multiple universities, including New York University, University of Maryland, UC Davis, University of Montreal; as well as leading Meta AI researchers, such as Yann LeCun.

Source: The Self-Supervised Learning Cookbook.

Welcome to Codon

Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 100x or more, on a single thread. Codon supports native multithreading which can lead to speedups many times higher still.

The Codon framework is fully modular and extensible, allowing for the seamless integration of new modules, compiler optimizations, domain-specific languages and so on. We actively develop Codon extensions for a number of domains such as bioinformatics and quantitative finance.

Source: Welcome to Codon.

Advanced macOS Command-Line Tools

macOS is fortunate to have access to the huge arsenal of standard Unix tools. There are also a good number of macOS-specific command-line utilities that provide unique macOS functionality. To view the full documentation for any of these commands, run man <command>.

Source: Advanced macOS Commands.

From Potter's Field

An unidentified nude female sits propped against a fountain in Central Park. There are no signs of struggle. When Dr. Kay Scarpetta and her colleagues Benton Wesley and Pete Marino arrive on the scene, they instantly recognize the signature of serial killer Temple Brooks Gault. Scarpetta, on assignment with the FBI, visits the New York City morgue on Christmas morning, where she must use her forensic expertise to give a name to the nameless—a difficult task. But as she sorts through conflicting forensic clues, Gault claims his next victim. He has infiltrated the FBI’s top secret artificial-intelligence system developed by Scarpetta’s niece, and sends taunting messages as his butchery continues, moving terrifyingly closer to Scarpetta herself.

In the afternoon I started reading From Potter's Field, Kay Scarpetta book 6 by Patricia Cornwell.

An Introduction to Parser Combinators

If you’ve ever had to write a parser before, you know that creating parsers can be a tedious and complicated process. The good news is that it doesn’t have to be this way. In this post, I’m going to introduce parser combinators - a technique for building parsers that I’ve found to be both practical and fun to play around with.

Source: An Introduction to Parser Combinators, an article by Varun Ramesh.
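To give a feel for the technique, here is a minimal parser-combinator sketch of my own, not the code from the linked article. Every name in it (`char`, `either`, `many1`, `sequence`, `mapped`) is hypothetical. A parser here is assumed to be a function that takes the text and a position and returns `(value, new_position)` on success or `None` on failure.

```python
def char(expected):
    """Parser that matches one specific character."""
    def parser(text, position):
        if position < len(text) and text[position] == expected:
            return expected, position + 1
        return None
    return parser


def either(*parsers):
    """Try each parser in order and return the first success."""
    def parser(text, position):
        for candidate in parsers:
            result = candidate(text, position)
            if result is not None:
                return result
        return None
    return parser


def many1(inner):
    """One or more repetitions of inner, collected in a list."""
    def parser(text, position):
        values = []
        result = inner(text, position)
        while result is not None:
            value, position = result
            values.append(value)
            result = inner(text, position)
        return (values, position) if values else None
    return parser


def sequence(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, position):
        values = []
        for step in parsers:
            result = step(text, position)
            if result is None:
                return None
            value, position = result
            values.append(value)
        return values, position
    return parser


def mapped(inner, transform):
    """Apply transform to the value produced by inner."""
    def parser(text, position):
        result = inner(text, position)
        if result is None:
            return None
        value, position = result
        return transform(value), position
    return parser


# Small parsers combined into bigger ones.
digit = either(*(char(d) for d in "0123456789"))
number = mapped(many1(digit), lambda digits: int("".join(digits)))
addition = mapped(
    sequence(number, char("+"), number),
    lambda parts: parts[0] + parts[2],
)

print(addition("12+30", 0))  # prints (42, 5)
```

The appeal is that `number` and `addition` are built purely by combining smaller parsers, without a separate grammar file or generator step.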

Gitflow and GitHub Flow compared: Which one is better?

Gitflow is, by far, the most popular branching model and possibly the one that has endured the test of time the most. Introduced by Vincent Driessen in 2010, its fundamental idea is that you should isolate your work into different types of git branches.

Other branching strategies, such as the centralized workflow (for those teams that come from SVN), and the forking workflow (for open-source projects) exist. Git, as a version control system, only details basic branching operations, and it remains controversial as to which approach is the best. Beyond those basic branching operations, it's a matter of opinion.

In this article we will compare Gitflow with its newer approach, GitHub Flow.

Source: Gitflow and GitHub Flow compared: Which one is better?.

Completely purge files from a git repository

I have occasionally ended up with files I did not want in my git repositories. These can both take up a lot of space, and contain sensitive data that we just want to remove (such as MySQL dumps, deploy keys etc).

Git keeps a history of all files, so just deleting the file doesn’t “make it go away”. The only way to completely remove the file is to scan through all history, removing all references to (and history of) those files, and finally pruning the git repo (physically removing references to what we just deleted). Finally you have to force-push the repo changes back to the remote, overwriting the remote.

Source: Completely purge files from a git repository (including history), an article by Ralph Slooten.

Big O Notation: A Simple Explanation With Examples

It’s hard to create efficient algorithms without understanding the time and space complexity of various operations. The concept of Big O notation helps programmers understand how quickly or slowly an algorithm will execute as the input size grows.

In this article, we’ll cover the basics of Big O notation, why it is used and how to describe the time and space complexity of algorithms with examples.

Source: Big O Notation: A Simple Explanation With Examples.
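A quick way to see the difference between growth rates is to count steps rather than measure time. The sketch below is my own and not from the linked article; the function names are hypothetical. It counts loop iterations for a linear search, which is O(n), and a binary search, which is O(log n), on the same sorted list.

```python
def linear_search_steps(sorted_values, target):
    """Count loop iterations of a front-to-back scan: O(n) in the worst case."""
    steps = 0
    for value in sorted_values:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_values, target):
    """Count loop iterations when halving the search range: O(log n)."""
    steps = 0
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sorted_values[middle] == target:
            break
        if sorted_values[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return steps


values = list(range(1024))
print(linear_search_steps(values, 1023))  # prints 1024
print(binary_search_steps(values, 1023))  # prints 11
```

Doubling the list to 2048 elements doubles the worst case of the linear scan but adds only one more step to the binary search.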

Clearing up some misconceptions about Passkeys

I am unreasonably excited about passkeys, I’ve long been looking for a better/more convenient way than passwords to do authentication, and I think passkeys are finally it.

However, whenever I see passkeys mentioned (for example on the recent Tailscale post about them), there are always a lot of misconceptions that surface in the debate. I’d like to clear some of them here, and hopefully explain a bit better what passkeys are.

Source: Clearing up some misconceptions about Passkeys, an article by Stavros Korokithakis.

The Bubble (2022)

A group of actors and actresses stuck inside a pandemic bubble at a hotel attempt to complete a film.

In the evening Alice, Esme, and I watched The Bubble. I didn't like the movie much and give it a 5 out of 10.