
week 05, 2022

Gut microbe linked to depression in large health study

The trillions of bacteria in and on our bodies can bolster our health and contribute to disease, but just which microbes are the key actors has been elusive. Now, a study involving thousands of people in Finland has identified a potential microbial culprit in some cases of depression.

The finding, which emerged from a study of how genetics and diet affect the microbiome, “is really solid proof that this association could have major clinical importance,” says Jack Gilbert, a microbial ecologist at the University of California, San Diego, who was not involved with the work.

Source: Gut microbe linked to depression in large health study, an article by Elizabeth Pennisi.

What's in a Good Error Message?

As software developers, we’ve all come across those annoying, not-so-useful error messages when using some library or framework: "Couldn’t parse config file", "Lacking permission for this operation", etc. Ok, ok, so something went wrong apparently; but what exactly? What config file? Which permissions? And what should you do about it? Error messages lacking this kind of information quickly create a feeling of frustration and helplessness.

So what makes a good error message then? To me, it boils down to three pieces of information which should be conveyed by an error message:

  • Context: What led to the error? What was the code trying to do when it failed?
  • The error itself: What exactly failed?
  • Mitigation: What needs to be done in order to overcome the error?

Source: What's in a Good Error Message?, an article by Gunnar Morling.
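
The three elements above can be sketched in Python. This is a minimal illustration and is not taken from the article; `ConfigError`, `load_port`, and the file name `app.conf` are hypothetical names chosen for the example.

```python
class ConfigError(Exception):
    """Hypothetical exception that carries context, the error, and a mitigation."""

    def __init__(self, context: str, error: str, mitigation: str) -> None:
        self.context = context
        self.error = error
        self.mitigation = mitigation
        # Combine the three parts into one human-readable message.
        super().__init__(f"{context}: {error}. {mitigation}")


def load_port(settings: dict, source: str) -> int:
    """Read the 'port' setting; `source` names where the settings came from."""
    context = f"While reading server settings from '{source}'"
    raw = settings.get("port")
    if raw is None:
        raise ConfigError(
            context=context,
            error="the required key 'port' is missing",
            mitigation="Add a line such as 'port = 8080' to that file",
        )
    try:
        return int(raw)
    except ValueError:
        raise ConfigError(
            context=context,
            error=f"the value {raw!r} for 'port' is not an integer",
            mitigation="Set 'port' to a whole number between 1 and 65535",
        ) from None


if __name__ == "__main__":
    try:
        load_port({"port": "eighty"}, "app.conf")
    except ConfigError as exc:
        print(exc)
```

Keeping the three parts as separate attributes lets calling code log or display them individually, while the combined string still reads as one message.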

Personal Linux Setup with Git Repos and Stow

I had a dream

  • A low power, always-on computer I could SSH into from any other computer in the house.
  • All of my projects and data in Git repos available for cloning and updating from any computer in the house.
  • My personal Linux/UNIX configuration ("dotfiles") available to any computer in the house for instant and granular installation.
  • No dependencies on any computer outside my home network.

Source: Personal Linux Setup with Git Repos and Stow, an article by Dave Gauer.

Comprehensive Guide to Grouping and Aggregating with Pandas

One of the most basic analysis functions is grouping and aggregating data. In some cases, this level of analysis may be sufficient to answer business questions. In other instances, this activity might be the first step in a more complex data science analysis. In pandas, the groupby function can be combined with one or more aggregation functions to quickly and easily summarize data. This concept is deceptively simple and most new pandas users will understand it. However, they might be surprised at how useful complex aggregation functions can be for supporting sophisticated analysis.

This article will quickly summarize the basic pandas aggregation functions and show examples of more complex custom aggregations. Whether you are a new or more experienced pandas user, I think you will learn a few things from this article.

Source: Comprehensive Guide to Grouping and Aggregating with Pandas, an article by Chris Moffitt.
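
As a rough sketch of what combining groupby with aggregation functions looks like, the example below uses made-up sales data. The column names and the `unit_range` helper are assumptions for illustration and do not come from the article.

```python
import pandas as pd

# Made-up sales records, purely for illustration.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south", "south"],
        "units": [10, 20, 5, 15, 10],
        "price": [2.0, 3.0, 4.0, 4.0, 1.0],
    }
)

# Named aggregation: each keyword names an output column and maps to
# an (input column, aggregation function) pair.
summary = sales.groupby("region").agg(
    total_units=("units", "sum"),
    mean_price=("price", "mean"),
    orders=("units", "count"),
)


def unit_range(series: pd.Series) -> int:
    """Custom aggregation: gap between the largest and smallest order."""
    return series.max() - series.min()


# A custom function can be passed to agg in place of a built-in name.
spread = sales.groupby("region")["units"].agg(unit_range)

if __name__ == "__main__":
    print(summary)
    print(spread)
```

Named aggregation keeps the output column names explicit, which tends to be easier to read than renaming columns after the fact.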

What Is a Python Package?

This article is a tutorial that’s meant to cover Python packages, modules, and import statements in enough depth for a beginner to intermediate Python developer. The goal is to answer common questions people have about these topics. We also will strive to explain everything clearly and step by step, to show you errors you might see along the way and how to solve them, and to illustrate how things work with simple examples.

Source: What Is a Python Package?, an article by John Lockwood.

This Is 40 (2012)

Pete and Debbie are both about to turn 40, their kids hate each other, both of their businesses are failing, they're on the verge of losing their house, and their relationship is threatening to fall apart.

In the evening we watched This Is 40. I liked the movie a little, so I give it a 6 out of 10. I did like that part of Debaser by the band Pixies was played and that there was a Pixies poster in the office of one of the main characters. And Alice liked that Billie Joe Armstrong of the band Green Day was in the movie.

Why not ZFS

ZFS is a hybrid filesystem and volume manager system that has become quite popular recently but has some important and unexpected problems.

It has many good features, which are probably why it is used: snapshots (with send/receive support), checksumming, RAID of some kind (with scrubbing support), deduplication, compression, and encryption.

But ZFS also has a lot of downsides. It is not the only way to achieve those features on Linux, and there are better alternatives.

Source: Why not ZFS.

Modern inetd in FreeBSD

The inetd ‘super-server’ is a special application which ties incoming network connections to locally-run commands. Using a single super-server to handle all network requests conserves memory and CPU resources at the expense of increased application latency. Although inetd has largely fallen out of fashion today, it was the most common method for handling network requests in the early days of the Internet.

Source: Modern inetd in FreeBSD, an article by Tom Jones.

A toy DNS resolver

In this post, I want to explain how DNS resolvers work in a different way – with a short Go program that does the same thing described in the comic. The main function (resolve) is actually just 20 lines, including comments.

Source: A toy DNS resolver, an article by Julia Evans.

Postgres large JSON value query performance

Postgres supports three types for "schemaless" data: JSON (added in 9.2), JSONB (added in 9.4), and HSTORE (added in 8.2 as an extension). Unfortunately, the performance of queries of all three gets substantially slower (2-10×) for values larger than about 2 kiB, due to how Postgres stores long variable-length data (TOAST). The same performance cliff applies to any variable-length types, like TEXT and BYTEA. This article contains some quick-and-dirty benchmark results to explore how Postgres's performance changes for the "schemaless" data types when they become large. My conclusion is that you should expect 2-10× slower queries once a row gets larger than Postgres's 2 kiB limit.

Source: Postgres large JSON value query performance, an article by Evan Jones.

Introduction to Free Monads

If you’ve been around Haskell circles for a bit, you’ve probably seen the term “free monads”. This article aims to introduce free monads and explain why they are useful.

To whet your appetite a little, free monads are basically a way to easily get a generic pure Monad instance for any Functor. This can be rather useful in many cases when you’re dealing with tree-like structures, but to name a few:

  • To build an AST for an eDSL using do-notation.
  • To have different semantics for the same monad in different contexts, e.g., define an interpreter and a pretty-printer for an eDSL, or have a mock interpreter in addition to a real one.
  • To build a decision-tree type structure harnessing the do-notation for non-determinism (like with lists, but for trees).

Source: Introduction to Free Monads, an article by Nikolay Yakimov.
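
The article works in Haskell. The sketch below only approximates the idea in Python, which has neither do-notation nor type classes, so `bind` is chained by hand. All names (`Pure`, `Free`, `Say`, `Ask`, `run_mock`, `describe`) are hypothetical and not from the article. It illustrates the second use case listed above: one program value, interpreted once by a mock interpreter and once by a pretty-printer.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Pure:
    """A finished program holding its result."""
    value: Any


@dataclass
class Free:
    """One instruction whose 'holes' contain the rest of the program."""
    step: Any


@dataclass
class Say:
    text: str
    rest: Any

    def fmap(self, f):
        return Say(self.text, f(self.rest))


@dataclass
class Ask:
    cont: Callable[[str], Any]

    def fmap(self, f):
        return Ask(lambda answer: f(self.cont(answer)))


def bind(program, f):
    """Monadic bind: graft f onto every Pure leaf of the program tree."""
    if isinstance(program, Pure):
        return f(program.value)
    return Free(program.step.fmap(lambda sub: bind(sub, f)))


def say(text):
    return Free(Say(text, Pure(None)))


def ask():
    return Free(Ask(lambda answer: Pure(answer)))


# The program is plain data; nothing runs until an interpreter walks it.
greet = bind(say("Name?"), lambda _:
        bind(ask(), lambda name:
        bind(say(f"Hello, {name}!"), lambda _:
        Pure(name))))


def run_mock(program, answers):
    """Mock interpreter: feeds canned answers and collects the output."""
    answers, output = list(answers), []
    while isinstance(program, Free):
        step = program.step
        if isinstance(step, Say):
            output.append(step.text)
            program = step.rest
        else:
            program = step.cont(answers.pop(0))
    return output, program.value


def describe(program):
    """Pretty-printer: lists the instructions, using placeholders for input."""
    lines, asked = [], 0
    while isinstance(program, Free):
        step = program.step
        if isinstance(step, Say):
            lines.append(f"say {step.text!r}")
            program = step.rest
        else:
            asked += 1
            lines.append(f"ask -> <input{asked}>")
            program = step.cont(f"<input{asked}>")
    lines.append(f"return {program.value!r}")
    return lines
```

Calling `run_mock(greet, ["Ada"])` and `describe(greet)` walks the same `greet` value with two different semantics, which is the property the article attributes to free monads.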

10 Unknown Security Pitfalls for Python

Python developers trust their applications to have a solid security state due to the use of standard libraries and common frameworks. However, within Python, just like in any other programming language, there are certain features that can be misleading or misused by developers. Often it is only a very minor subtlety or detail that can make developers slip and add a severe security vulnerability to the code base.

In this blog post, we share 10 security pitfalls we encountered in real-world Python projects. We chose pitfalls that we believe are less known in the developer community. By explaining each issue and its impact we hope to raise awareness and sharpen your security mindset. If you are using any of these features, make sure to check your Python code!

Source: 10 Unknown Security Pitfalls for Python, an article by Dennis Brinkrolf.

Panics vs cancellation, part 1

One of the things people often complain about when doing Async Rust is cancellation. This has always been a bit confusing to me, because it seems to me that async cancellation should feel a lot like panics in practice, and people don’t complain about panics very often (though they do sometimes). This post is the start of a short series comparing panics and cancellation, seeking after the answer to the question “Why is async cancellation a pain point and what should we do about it?” This post focuses on explaining Rust’s panic philosophy and explaining why I see panics and cancellation as being quite analogous to one another.

Source: Panics vs cancellation, part 1, an article by Niko Matsakis.