week 39, 2022

The Art of Logging

In this article, we will identify a log format that is easy for both humans and machines to parse and understand. Next, we will highlight the key information to log and propose a data structure for it. Finally, we will provide some important notes to keep in mind for your own projects.

Source: The Art of Logging, an article by Jaouher Kharrat.
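As a digest-side illustration (not taken from the article), here is a minimal sketch of machine-parseable logging with Python's standard `logging` module, emitting one JSON object per line. The field names and the `context` key are my own assumptions, not the data structure the article proposes.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: structured fields are passed through
        # logging's `extra` argument under the key "context".
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger
```

A call such as `build_logger("shop", sys.stderr).info("order created", extra={"context": {"order_id": 42}})` would then produce a line that both `grep` and a log pipeline can work with.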

Signals in prod: dangers and pitfalls

A signal is an event that Linux systems generate in response to some condition. Signals can be sent by the kernel to a process, by a process to another process, or a process to itself. Upon receipt of a signal, a process may take action.

Signals are a core part of Unix-like operating environments and have existed since more or less the dawn of time. They are the plumbing for many of the core components of the operating system—core dumping, process life cycle management, etc.—and in general, they've held up pretty well in the fifty or so years that we have been using them. As such, when somebody suggests that using them for interprocess communication (IPC) is potentially dangerous, one might think these are the ramblings of someone desperate to invent the wheel. However, this article is intended to demonstrate cases where signals have been the cause of production issues and offer some potential mitigations and alternatives.

Source: Signals in prod: dangers and pitfalls, an article by Chris Down.
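To make the "a process to itself" case concrete, here is a small sketch of my own (not from the article) using Python's `signal` module on a Unix-like system. The handler and function names are hypothetical, and `SIGUSR1` is not available on Windows.

```python
import os
import signal

received = []  # signal numbers seen by the handler


def on_signal(signum, frame):
    # Keep the handler tiny: record the event and return.
    received.append(signum)


def send_to_self(signum=signal.SIGUSR1):
    """Install a handler, send the signal to this process, report what arrived."""
    signal.signal(signum, on_signal)  # must be called from the main thread
    os.kill(os.getpid(), signum)      # a process signalling itself
    return list(received)
```

Without the `signal.signal` call, the default action for `SIGUSR1` is to terminate the process, which hints at why signals need care in production.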

Rate Limiting with NGINX and NGINX Plus

One of the most useful, but often misunderstood and misconfigured, features of NGINX is rate limiting. It allows you to limit the number of HTTP requests a user can make in a given period of time. A request can be as simple as a GET request for the homepage of a website or a POST request on a log‑in form.

Rate limiting can be used for security purposes, for example to slow down brute‑force password‑guessing attacks. It can help protect against DDoS attacks by limiting the incoming request rate to a value typical for real users, and (with logging) identify the targeted URLs. More generally, it is used to protect upstream application servers from being overwhelmed by too many user requests at the same time.

In this blog we will cover the basics of rate limiting with NGINX as well as more advanced configurations. Rate limiting works the same way in NGINX Plus.

Source: NGINX Rate Limiting, an article by Amir Rawdat.
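NGINX documents its request limiter as a "leaky bucket". The sketch below is my own simplified Python model of that idea, keyed per client; it is not NGINX's implementation, and the class name, parameters, and injectable clock are assumptions made for illustration.

```python
import time


class LeakyBucketLimiter:
    """Per-key limiter: each request adds 1 to a bucket that drains at `rate`/s."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate      # requests drained per second
        self.burst = burst    # bucket capacity
        self.clock = clock    # injectable so the logic can be tested
        self.buckets = {}     # key -> (level, last_seen)

    def allow(self, key):
        now = self.clock()
        level, last = self.buckets.get(key, (0.0, now))
        # Drain the bucket for the time elapsed since the last request.
        level = max(0.0, level - (now - last) * self.rate)
        if level + 1 > self.burst:
            self.buckets[key] = (level, now)
            return False      # over the limit: reject this request
        self.buckets[key] = (level + 1, now)
        return True
```

With `rate=1` and `burst=2`, a client can send two requests at once, is rejected on the third, and regains one slot per second. A real deployment would also need to expire idle keys.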

Fun with FreeBSD: Your First Linux Guest

The FreeBSD operating system contains innumerable powerful features. One of these features is bhyve, its native type 2 (OS-level) hypervisor, which can host virtual machines running multiple different OSes, including Linux.

This post will walk you through creating a Linux virtual machine on FreeBSD using the CBSD tool, which greatly simplifies creating and managing bhyve VMs.

Source: Fun with FreeBSD: Your First Linux Guest, an article by Karen Bruner.

Maps and Memory Leaks in Go

When working with maps in Go, we need to understand some important characteristics of how a map grows and shrinks. Let’s delve into this to prevent issues that can cause memory leaks.

Source: Maps and Memory Leaks in Go, an article by Teiva Harsanyi.

3 Ways to Watch Logs in Real Time in Linux

You know how to view files in Linux. You use the cat command or perhaps the less command for this purpose.

That's good for files that have static content. But log files are dynamic and their content changes with time. To monitor logs, you need to watch the log file as its content changes.

How do you see the content of log files in real time? Tail is the most popular command for this purpose but there are some other tools as well. I'll show them to you in this tutorial.

Source: Watch Logs in Real Time in Linux With Tail, Less & Multitail, an article by Abhishek Prakash.
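For a feel of what following a log involves, here is a rough polling sketch of my own in Python. It is not how `tail` is implemented, and the function names and poll interval are assumptions.

```python
import os
import time


def read_new_lines(path, offset=0):
    """Return (complete_lines, new_offset) for bytes appended since `offset`."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < offset:
            offset = 0  # file shrank (truncated or rotated): start over
        f.seek(offset)
        chunk = f.read()
    # Only consume up to the last newline so half-written lines wait.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def follow(path, poll_interval=0.5):
    """Yield lines appended to `path`, starting from its current end."""
    offset = os.path.getsize(path)
    while True:
        lines, offset = read_new_lines(path, offset)
        for line in lines:
            yield line
        if not lines:
            time.sleep(poll_interval)
```

Iterating over `follow("/var/log/syslog")` would then print lines as they arrive, much like watching the file with `tail -f`.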

Facts about State Machines

I hold the opinion that state machines are often misunderstood and under-applied.

And that's why I wrote this. The goal of this list of facts is not to teach you what state machines are or how to use them; there are plenty of other resources for that. Rather, the goal here is to motivate their usage and to highlight things about them that are frequently overlooked, but nonetheless relevant.

Source: Facts about State Machines, an article by Chris Pressey.

The International (2009)

An Interpol agent attempts to expose a high-profile financial institution's role in an international arms dealing ring.

In the evening I watched The International. Halfway through the movie I realised that I had seen it before, years ago. Still, I liked the movie and give it a 7 out of 10.

Making python fast for free - adventures with mypyc

I recently learnt that mypy has a compiler called mypyc. The compiler uses standard Python type hints to generate C extensions automatically from Python code. I found the idea very interesting as I have a library (Lagom - a dependency injection container) which is fairly extensively annotated with types. I liked the idea of getting a performance boost without having to rewrite any code or having to deal with multiple languages. This blog post is intended to be a short overview of what I did, the problems I ran into, and the workflow I ended up with.

Source: Making python fast for free - adventures with mypyc, an article by Steve Brazier.

LISTEN / NOTIFY: Automatic client notification in PostgreSQL

LISTEN / NOTIFY is a feature that enables users to listen to what goes on in the database. It is one of the oldest functionalities in PostgreSQL and is still widely used. The main question is: What is the purpose of the asynchronous query interface (LISTEN / NOTIFY), and what is it good for? The basic idea is to avoid polling.

Source: LISTEN / NOTIFY: Automatic client notification in PostgreSQL, an article by Hans-Jürgen Schönig.

Find slow data processing tasks (before your customers do)

Here are some of the ways you can discover your data processing jobs are too slow:

  1. Jobs start getting killed when they hit timeouts.
  2. Customers start complaining about slow or failed jobs.
  3. Your cloud computing bill is twice what it was last month.

While these notification mechanisms do work, it’s probably best not to rely on them. Life is easier when jobs finish successfully, customers are happy, and you have plenty of money left over in your budget.

That means you want to identify unexpected slowness or high memory usage before the situation gets that bad. The sooner you can identify performance problems, the sooner you can fix them.

So how can you identify inefficient tasks in your data pipeline or workflow? Let’s find out!

Source: Find slow data processing tasks (before your customers do), an article by Itamar Turner-Trauring.

Why Async Rust

I often find async Rust to be misunderstood. Conversations around "why async" often focus on performance - a topic which is highly dependent on workloads, and results in people wholly talking past each other. While performance is not a bad reason to choose async Rust, we often only notice performance when we experience a lack of it. So I want to instead focus on which features async Rust provides which aren't present in non-async Rust. Though we'll talk a bit about performance too at the end of this post.

Source: Why Async Rust, an article by Yoshua Wuyts.