week 05, 2022

Sun 06 Feb 2022

Gut microbe linked to depression in large health study

The trillions of bacteria in and on our bodies can bolster our health and contribute to disease, but just which microbes are the key actors has been elusive. Now, a study involving thousands of people in Finland has identified a potential microbial culprit in some cases of depression.

The finding, which emerged from a study of how genetics and diet affect the microbiome, “is really solid proof that this association could have major clinical importance,” says Jack Gilbert, a microbial ecologist at the University of California, San Diego, who was not involved with the work.

Source: Gut microbe linked to depression in large health study, an article by Elizabeth Pennisi.

science

What's in a Good Error Message?

As software developers, we’ve all come across those annoying, not-so-useful error messages when using some library or framework: "Couldn’t parse config file", "Lacking permission for this operation", etc. Ok, ok, so something went wrong apparently; but what exactly? What config file? Which permissions? And what should you do about it? Error messages lacking this kind of information quickly create a feeling of frustration and helplessness.

So what makes a good error message then? To me, it boils down to three pieces of information which should be conveyed by an error message:

Context: What led to the error? What was the code trying to do when it failed?

The error itself: What exactly failed?

Mitigation: What needs to be done in order to overcome the error?

Source: What's in a Good Error Message?, an article by Gunnar Morling.

software development
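To make the three parts concrete, here is a minimal Python sketch of my own; it is not taken from the article. The names ConfigError and load_port, the file name, and the message wording are all hypothetical and only illustrate an error that carries context, the failure itself, and a mitigation.

```python
class ConfigError(Exception):
    """Hypothetical error type carrying context, problem, and mitigation."""

    def __init__(self, context: str, problem: str, mitigation: str):
        self.context = context
        self.problem = problem
        self.mitigation = mitigation
        # One message that answers: what was going on, what failed, what to do.
        super().__init__(f"{context}: {problem}. {mitigation}")


def load_port(settings: dict, path: str) -> int:
    """Read the 'port' setting from already-parsed settings (assumed a dict)."""
    context = f"While reading server settings from '{path}'"
    raw = settings.get("port")
    if raw is None:
        raise ConfigError(
            context=context,
            problem="required key 'port' is missing",
            mitigation="Add a line such as 'port = 8080' to the file",
        )
    try:
        return int(raw)
    except (TypeError, ValueError):
        raise ConfigError(
            context=context,
            problem=f"value {raw!r} for key 'port' is not an integer",
            mitigation="Use a whole number between 1 and 65535",
        ) from None


if __name__ == "__main__":
    try:
        load_port({"port": "eighty"}, "app.conf")
    except ConfigError as error:
        print(error)
```

Keeping the three parts as separate attributes, rather than only one formatted string, would also let a caller log or display them individually.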

Personal Linux Setup with Git Repos and Stow

I had a dream

A low power, always-on computer I could SSH into from any other computer in the house.

All of my projects and data in Git repos available for cloning and updating from any computer in the house.

My personal Linux/UNIX configuration ("dotfiles") available to any computer in the house for instant and granular installation.

No dependencies on any computer outside my home network.

Source: Personal Linux Setup with Git Repos and Stow, an article by Dave Gauer.

Sat 05 Feb 2022

A Primer: Accessing services in Kubernetes

Whenever I work on a local or remote Kubernetes cluster, I tend to want to connect to my application to send it an HTTP request or something similar.

Source: A Primer: Accessing services in Kubernetes, an article by Alex Ellis.

Comprehensive Guide to Grouping and Aggregating with Pandas

One of the most basic analysis functions is grouping and aggregating data. In some cases, this level of analysis may be sufficient to answer business questions. In other instances, this activity might be the first step in a more complex data science analysis. In pandas, the groupby function can be combined with one or more aggregation functions to quickly and easily summarize data. This concept is deceptively simple and most new pandas users will understand this concept. However, they might be surprised at how useful complex aggregation functions can be for supporting sophisticated analysis.

This article will quickly summarize the basic pandas aggregation functions and show examples of more complex custom aggregations. Whether you are a new or more experienced pandas user, I think you will learn a few things from this article.

Source: Comprehensive Guide to Grouping and Aggregating with Pandas, an article by Chris Moffitt.

python
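As a small illustration of combining groupby with several aggregation functions, here is a sketch of my own rather than an example from the article. The sales table, its column names, and the output column names are made up for the illustration; it assumes pandas 0.25 or later for the named-aggregation syntax.

```python
import pandas as pd

# Made-up sample data: one row per sale.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south", "south"],
        "amount": [100, 150, 80, 120, 100],
    }
)

# Group by region and compute several aggregates at once.
# Each keyword names an output column: (input column, aggregation).
summary = sales.groupby("region").agg(
    total=("amount", "sum"),
    average=("amount", "mean"),
    # A custom aggregation: the range of amounts within each group.
    spread=("amount", lambda s: s.max() - s.min()),
)

print(summary)
```

The lambda stands in for the kind of custom aggregation the article mentions; any function that reduces a Series to a single value can be used in that position.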

What Is a Python Package?

This article is a tutorial that’s meant to cover Python packages, modules, and import statements in enough depth for a beginner to intermediate Python developer. The goal is to answer common questions people have about these topics. We also will strive to explain everything clearly and step by step, to show you errors you might see along the way and how to solve them, and to illustrate how things work with simple examples.

Source: What Is a Python Package?, an article by John Lockwood.

python

This Is 40 (2012)

Pete and Debbie are both about to turn 40, their kids hate each other, both of their businesses are failing, they're on the verge of losing their house, and their relationship is threatening to fall apart.

In the evening we watched This Is 40. I liked the movie a little, so I give it a 6 out of 10. I did like that part of Debaser by the band Pixies was played and that there was a Pixies poster in the office of one of the main characters. Alice liked that Billie Joe Armstrong of the band Green Day was in the movie.

Fri 04 Feb 2022

Implementation of CIDR routing table in Rust

In this article, I will implement the basic CIDR routing table in Rust, optimize the routing algorithm, and benchmark different solutions for comparison.

Source: Implementation of CIDR routing table in Rust, an article by Rostyslav Toch.
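For readers unfamiliar with the idea, the core of a CIDR routing table is a longest-prefix match. The sketch below is my own and is written in Python with the standard ipaddress module; it is not the article's Rust implementation. The class name RoutingTable, the next-hop labels, and the plain linear scan are assumptions made for brevity, and the scan is exactly the kind of naive baseline an optimized version would replace.

```python
import ipaddress


class RoutingTable:
    """Hypothetical minimal routing table using a linear longest-prefix match."""

    def __init__(self):
        self._routes = []  # list of (network, next_hop) pairs

    def add(self, cidr: str, next_hop: str) -> None:
        self._routes.append((ipaddress.ip_network(cidr), next_hop))

    def lookup(self, address: str):
        """Return the next hop of the most specific matching route, or None."""
        ip = ipaddress.ip_address(address)
        best = None
        for network, next_hop in self._routes:
            if ip in network:
                # Prefer the route with the longest prefix (most specific).
                if best is None or network.prefixlen > best[0].prefixlen:
                    best = (network, next_hop)
        return best[1] if best is not None else None


table = RoutingTable()
table.add("0.0.0.0/0", "default-gw")
table.add("10.0.0.0/8", "gw-a")
table.add("10.1.0.0/16", "gw-b")

print(table.lookup("10.1.2.3"))
print(table.lookup("10.2.0.1"))
print(table.lookup("8.8.8.8"))
```

Each lookup here costs time proportional to the number of routes; a trie keyed on address bits is one common way to avoid that.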

Why not ZFS

ZFS is a hybrid filesystem and volume manager system that has become quite popular recently but has some important and unexpected problems.

It has many good features, which are probably why it is used: snapshots (with send/receive support), checksumming, RAID of some kind (with scrubbing support), deduplication, compression, and encryption.

But ZFS also has a lot of downsides. It is not the only way to achieve those features on Linux, and there are better alternatives.

Source: Why not ZFS.

Optimizing GoAWK with a bytecode compiler and virtual machine

I recently sped up GoAWK by switching from a tree-walking interpreter to a bytecode compiler with a virtual machine interpreter. I discuss why it’s faster and how the new interpreter works.

Source: Optimizing GoAWK with a bytecode compiler and virtual machine, an article by Ben Hoyt.
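To show the structural difference between the two approaches, here is a toy sketch of my own in Python; it has nothing to do with GoAWK's actual code. The expression format (nested tuples), the opcode names, and the function names are all invented for the illustration, and the sketch says nothing about relative speed.

```python
# An expression is either a number or a tuple: (operator, left, right).
OPERATIONS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def eval_tree(node):
    """Tree-walking interpreter: recurse over the tree on every evaluation."""
    if isinstance(node, (int, float)):
        return node
    operator, left, right = node
    return OPERATIONS[operator](eval_tree(left), eval_tree(right))


def compile_tree(node, code=None):
    """Compile the tree once into a flat list of (opcode, argument) pairs."""
    if code is None:
        code = []
    if isinstance(node, (int, float)):
        code.append(("PUSH", node))
    else:
        operator, left, right = node
        compile_tree(left, code)
        compile_tree(right, code)
        code.append(("BINOP", operator))
    return code


def run(code):
    """Stack-based virtual machine: a simple loop over the instructions."""
    stack = []
    for opcode, argument in code:
        if opcode == "PUSH":
            stack.append(argument)
        else:  # BINOP: pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATIONS[argument](left, right))
    return stack.pop()


expression = ("+", 1, ("*", 2, 3))
bytecode = compile_tree(expression)
print(bytecode)
print(eval_tree(expression), run(bytecode))
```

The compiled form is a flat sequence, so the work of walking the tree is done once at compile time instead of on every run.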

Thu 03 Feb 2022

Using org-mode as an SQL playground

For every web app I work on, a database client is always present for exploring data and building complex queries. Recently, I have moved on from my PgAdmin to org-mode for this purpose, because why not.

Source: Using org-mode as an SQL playground, an article by Charanjit Singh.

Dumpster diving the Go garbage collector

In order to better understand how the garbage collector works, I decided to trace its low-level behavior on a live application. In this investigation, I'll instrument the Go garbage collector with eBPF uprobes.

Source: Dumpster diving the Go garbage collector, an article by Natalie Serrino.

Modern inetd in FreeBSD

The inetd ‘super-server’ is a special application which ties incoming network connections to locally-run commands. Using a single super-server to handle all network requests conserves memory and CPU resources at the expense of increased application latency. Although inetd has largely fallen out of fashion today, it was the most common method for handling network requests in the early days of the Internet.

Source: Modern inetd in FreeBSD, an article by Tom Jones.

Wed 02 Feb 2022

A toy DNS resolver

In this post, I want to explain how DNS resolvers work in a different way – with a short Go program that does the same thing described in the comic. The main function (resolve) is actually just 20 lines, including comments.

Source: A toy DNS resolver, an article by Julia Evans.

Postgres large JSON value query performance

Postgres supports three types for "schemaless" data: JSON (added in 9.2), JSONB (added in 9.4), and HSTORE (added in 8.2 as an extension). Unfortunately, the performance of queries of all three gets substantially slower (2-10×) for values larger than about 2 kiB, due to how Postgres stores long variable-length data (TOAST). The same performance cliff applies to any variable-length type, like TEXT and BYTEA. This article contains some quick-and-dirty benchmark results to explore how Postgres's performance changes for the "schemaless" data types when they become large. My conclusion is that you should expect 2-10× slower queries once a row gets larger than Postgres's 2 kiB limit.

Source: Postgres large JSON value query performance, an article by Evan Jones.

grep Flags – The Good Stuff

grep is one of the most universal and commonly-used commands on the command line. I count about 50 flags you can use on it in my man page. So which ones are the ones you should know about for everyday use?

Source: grep Flags – The Good Stuff, an article by Ian Miell.

command line

Tue 01 Feb 2022

Introduction to Free Monads

If you’ve been around Haskell circles for a bit, you’ve probably seen the term “free monads”. This article aims to introduce free monads and explain why they are useful.

To whet your appetite a little, free monads are basically a way to easily get a generic pure Monad instance for any Functor. This can be rather useful in many cases when you’re dealing with tree-like structures, but to name a few:

To build an AST for an eDSL using do-notation.

To have different semantics for the same monad in different contexts, e.g., define an interpreter and a pretty-printer for an eDSL, or have a mock interpreter in addition to a real one.

To build a decision-tree type structure harnessing the do-notation for non-determinism (like with lists, but for trees).

Source: Introduction to Free Monads, an article by Nikolay Yakimov.

haskell

A brief guide to perl character encoding

I originally wrote this at work, after my team spent far too many days yelling at the computer because of Mojibake. Thanks to my employer for allowing me to publish it, and the several colleagues who provided helpful feedback. Any errors are, naturally, not their fault.

Source: A brief guide to perl character encoding, an article by David Cantrell.

The Basics of Emacs Configuration

The most valuable aspect of Emacs is the ability it gives you to customize your environment and workflow perfectly for whatever it is you want to do.

Source: The Basics of Emacs Configuration.

emacs

Mon 31 Jan 2022

10 Unknown Security Pitfalls for Python

Python developers trust their applications to have a solid security state due to the use of standard libraries and common frameworks. However, within Python, just like in any other programming language, there are certain features that can be misleading or misused by developers. Often it is only a very minor subtlety or detail that can make developers slip and add a severe security vulnerability to the code base.

In this blog post, we share 10 security pitfalls we encountered in real-world Python projects. We chose pitfalls that we believe are less known in the developer community. By explaining each issue and its impact we hope to raise awareness and sharpen your security mindset. If you are using any of these features, make sure to check your Python code!

Source: 10 Unknown Security Pitfalls for Python, an article by Dennis Brinkrolf.

Panics vs cancellation, part 1

One of the things people often complain about when doing Async Rust is cancellation. This has always been a bit confusing to me, because it seems to me that async cancellation should feel a lot like panics in practice, and people don’t complain about panics very often (though they do sometimes). This post is the start of a short series comparing panics and cancellation, seeking after the answer to the question “Why is async cancellation a pain point and what should we do about it?” This post focuses on explaining Rust’s panic philosophy and explaining why I see panics and cancellation as being quite analogous to one another.

Source: Panics vs cancellation, part 1, an article by Niko Matsakis.

rust

Slicing in Python

Bas Steins wrote A Comprehensive Guide to Slicing in Python.

python
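As a quick reminder of what slicing looks like, here are a few basic forms. These examples are my own and are not taken from the guide; the variable names are arbitrary.

```python
letters = list("abcdefg")

first_three = letters[:3]       # start defaults to the beginning
middle = letters[2:5]           # the stop index is exclusive
last_two = letters[-2:]         # negative indices count from the end
every_other = letters[::2]      # the third value is the step
reversed_copy = letters[::-1]   # a negative step walks backwards

# A slice can also be stored as an object and reused.
odd_positions = slice(1, None, 2)
odds = letters[odd_positions]

print(first_three, middle, last_two, every_other, reversed_copy, odds)
```

Slicing a list always produces a new list, so none of these expressions modify letters.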