Plurrrr

Testing Code That is Difficult to Test (With Perl)

Code that performs side effects is difficult to test because we need to figure out how to sandbox the effects so we can observe the state of the sandbox before and after executing the effectful code. The difficulty is increased when the side-effectful code also depends on specific OS configurations. Let us explore my solution to such a predicament.

Source: Testing Code That is Difficult to Test (With Perl), an article by Nicholas Hubbard.

The tools I use to build my website

Every so often I get an email from someone starting out in web development who asks something along these lines: “What do you use to create your website, benhoyt.com? Do you use a Content Management System? What theme do you use?”

I generally reply with a brief response, saying how I like to keep it simple: I use my text editor to write Markdown files, test locally using the Jekyll static site generator, and then push them live to GitHub Pages using a Git tool. I don’t use a fancy “theme”, just a simple layout I created using a few dozen lines of HTML and CSS.

Source: The tools I use to build my website, an article by Ben Hoyt.

Semantic Networks

A semantic network or net is a graph structure for representing knowledge in patterns of interconnected nodes and arcs. Computer implementations of semantic networks were first developed for artificial intelligence and machine translation, but earlier versions have long been used in philosophy, psychology, and linguistics. The Giant Global Graph of the Semantic Web is a large semantic network (Berners-Lee et al. 2001; Hendler & van Harmelen 2008).

Source: Semantic Networks, an article by John F. Sowa.
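
As a rough illustration of the "nodes and arcs" idea in the excerpt above, a semantic network can be sketched as a set of labelled arcs between concept nodes. This is not taken from Sowa's article: the class name `SemanticNetwork`, the `is_a` and `has` relation labels, and the sample facts are all assumptions made up for this sketch.

```python
from collections import defaultdict


class SemanticNetwork:
    """A toy semantic network: labelled arcs between concept nodes."""

    def __init__(self):
        # source node -> relation label -> set of target nodes
        self._arcs = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        """Record one arc, e.g. ("spider", "is_a", "arachnid")."""
        self._arcs[source][relation].add(target)

    def related(self, source, relation):
        """Return the direct targets of one relation (a copy, never None)."""
        return set(self._arcs.get(source, {}).get(relation, set()))

    def is_a(self, node, ancestor):
        """Follow "is_a" arcs transitively; a node counts as its own ancestor."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current == ancestor:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self.related(current, "is_a"))
        return False


# Hypothetical sample facts, chosen only for illustration.
net = SemanticNetwork()
net.add("tarantula", "is_a", "spider")
net.add("spider", "is_a", "arachnid")
net.add("spider", "has", "eight legs")
```

With these facts, `net.is_a("tarantula", "arachnid")` holds because the query walks two `is_a` arcs, while the reverse query does not.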

Worst practices #5 through #1

Every so often, you see code that someone else has written—or code that you wrote—and smack your head in wonder, disbelief, and dismay.

My previous article, “Ten Java coding antipatterns to avoid: Worst practices #10 through #6,” explores five of those antipatterns. I’ll conclude the discussion here with the final five worst practices, plus a bonus.

I’ll reiterate what I wrote in the previous article’s introduction: You should avoid these worst practices—and eliminate them when you maintain or refactor existing code. And, of course, resolve them if you see these issues during a code review.

Source: Ten Java coding antipatterns to avoid: Worst practices #5 through #1, an article by Ian Darwin.

How Kubernetes Reinvented Virtual Machines (in a good sense)

There are lots of posts trying to show how simple it is to get started with Kubernetes. But many of these posts use complicated Kubernetes jargon for that, so even those with some prior server-side knowledge might be bewildered. Let me try something different here. Instead of explaining one unfamiliar matter (how to run a web service in Kubernetes?) with another (you just need a manifest, with three sidecars and a bunch of gobbledygook), I'll try to reveal how Kubernetes is actually a natural development of the good old deployment techniques.

Source: How Kubernetes Reinvented Virtual Machines (in a good sense), an article by Ivan Velichko.

Print based debugging and infrequent developers

I've long been an enthusiastic user of print based debugging, although I did eventually realize that I reach for a debugger when dealing with certain sorts of bugs. But print based debugging is eternally controversial, with any number of people ready to tell you that you should use a debugger instead and that you're missing out by not doing so. Recently I had a thought about that and how it interacts with how much programming people do.

Source: Print based debugging and infrequent developers, an article by Chris Siebenmann.

Running Linux microVMs on macOS (M1/M2)

Sometimes, while working on macOS, you may find the need to test something quickly on Linux, or use some utility that's only available on this OS. But, of course, you don't want to go through the whole process of creating the VM from scratch.

The good news is, you don't need to! Using krunvm you can create and start a microVM from a regular container image (that is, an OCI image), in just two commands and a couple of seconds.

Source: Running Linux microVMs on macOS (M1/M2), an article by Sergio López.

Design Patterns with Rust Types

This post introduces some patterns and tricks to better utilise Rust's type system for clean and safe code.

This post is on the advanced side and in general there are no absolutes - these patterns usually need to be evaluated on a case-by-case basis to see if the cost / benefit trade-off is worth it.

Source: Design Patterns with Rust Types.

asdf: Manage multiple runtime versions with a single CLI tool

asdf is a tool version manager. All tool version definitions are contained within one file (.tool-versions) which you can check in to your project's Git repository to share with your team, ensuring everyone is using the exact same versions of tools.

The old way of working required multiple CLI version managers, each with their distinct API, configuration files and implementation (e.g. $PATH manipulation, shims, environment variables, etc.). asdf provides a single interface and configuration file to simplify development workflows, and can be extended to all tools and runtimes via a simple plugin interface.

Source: the Introduction of asdf.

Lisp in Vim

Fifteen years ago, writing Lisp code in Vim was an odd adventure. There were no good plugins for Vim that assisted in structured editing of Lisp s-expressions or allowed interactive programming by embedding a Lisp Read-Eval-Print-Loop (REPL) or a debugger within the editor. The situation is much better now. In the last ten years, we have seen active development of two Vim plugins named Slimv and Vlime. Slimv is over 10 years old now. Vlime is more recent and less than 3 years old right now. Both support interactive programming in Lisp.

I am going to discuss and compare both Slimv and Vlime in this article. I will show how to get started with both plugins and introduce some of their basic features.

Source: Lisp in Vim, an article by Susam Pal.

The many flavors of hashing

In practical computer science, hashing is a very important concept. It is used in everything from simple data structures (like hash maps) to highly complex data structures (like bloom filters or hyperloglog counters), database indices and sharding, storage and communication integrity, distributed storage, most password authentication and storage mechanisms, digital signatures, other cryptographic constructs based on Merkle trees (including Git or digital ledgers), and possibly many other use-cases I'm not even aware of right now.

However, not every hash algorithm is appropriate in all of these scenarios, and in fact, very few algorithms are usable in more than a couple of situations. Even worse, using the wrong algorithm will lead in the best case scenario to performance problems, but in the worst case scenario to security issues and even financial loss. Thus, knowing which algorithm to pick for which application is crucial.

Therefore I'll try to summarize how I approach the topic of hashing, including use-cases, recommended algorithms, and links to other articles.

Source: The many flavors of hashing, an article by Ciprian Dorin Craciun.
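
To make the point about picking the right algorithm concrete, here is a minimal sketch that contrasts two of the use-cases listed above: a fast digest for integrity checks and a deliberately slow key-derivation function for password storage. It uses only Python's standard `hashlib`, `hmac` and `os` modules. The choice of SHA-256 and PBKDF2, the iteration count, and the function names are assumptions for this sketch, not the recommendations made in the article.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for this sketch; tune it for real deployments.
PBKDF2_ITERATIONS = 200_000


def content_digest(data: bytes) -> str:
    """Fast cryptographic digest, suitable for integrity checks."""
    return hashlib.sha256(data).hexdigest()


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Slow, salted derivation, suitable for password storage."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)


# Hypothetical usage: store (salt, stored) and verify a later login attempt.
salt, stored = hash_password("correct horse battery staple")
```

Using `content_digest` for passwords would be an instance of the "wrong algorithm" problem: it is fast by design, which is exactly what makes brute-force guessing cheap.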

Uncompressing Folders in Swift

So imagine you need to get multiple files and folders from an API. One option for doing so is to get all the file names and request them from whatever file server you are using. This is terrible; don't do this. The optimal way is to bundle the entire directory into a compressed format and distribute that one file. Okay great, say you needed these files in an iOS/iPadOS or macOS application. That means you will need to decompress the files that you received in Swift.

Source: Uncompressing Folders in Swift, an article by Quindarius Lyles-Woods.

Debugging Postgres autovacuum problems: 13 tips

If you’ve been running PostgreSQL for a while, you’ve heard about autovacuum. Yes, autovacuum, the thing which everybody asks you not to turn off, which is supposed to keep your database clean and reduce bloat automatically.

And yet—imagine this: one fine day, you see that your database size is larger than you expect, the I/O load on your database has increased, and things have slowed down without much change in workload. You begin looking into what might have happened. You run the excellent Postgres bloat query and you notice you have a lot of bloat. So you run the VACUUM command manually to clear the bloat in your Postgres database. Good!

But then you have to address the elephant in the room: why didn’t Postgres autovacuum clean up the bloat in the first place…? Does the above story sound familiar? Well, you are not alone. 😊

Source: Debugging Postgres autovacuum problems: 13 tips, an article by Samay Sharma.

Oldest and Fatherless: The Terrible Secret of Tom Bombadil

Old Tom Bombadil. Possibly the least liked character in The Lord of the Rings. A childish figure so disliked by fans of the book that few object to his absence from all adaptations of the story. And yet, there is another way of looking at Bombadil, based only on what appears in the book itself, that paints a very different picture of this figure of fun.

What do we know about Tom Bombadil? He is fat and jolly and smiles all the time. He is friendly and gregarious and always ready to help travellers in distress.

Except that none of that can possibly be true.

Source: Oldest and Fatherless: The Terrible Secret of Tom Bombadil.

You might also be interested in the comments on Hacker News.

Responsive and accessible typography and why you should care

How many times have you been aware of text's different shapes and sizes while browsing the web lately? Probably not many, unless you found an extremely uncomfortable typography that pushed you to quickly flee the website.

Typography is a silent tool that UX designers and developers can sometimes take for granted. There is much noise around this topic. Pixels? Are breakpoints enough to switch sizes across devices? Do we even need breakpoints at all?

Let’s find out about a few key concepts to succeed at responsive and accessible typography as a front-end developer or as a UX designer.

Source: Responsive and accessible typography and why you should care, an article by Maria Eugenia Trapani.

Swift Proposal: Move Function

In this document, we propose adding a new function called move to the Swift standard library, which ends the lifetime of a specific local let, local var, or consuming function parameter, and which enforces this by causing the compiler to emit a diagnostic upon any uses that are after the move function. This allows for code that relies on forwarding ownership of values for performance or correctness to communicate that requirement to the compiler and to human readers.

Source: Move Function + "Use After Move" Diagnostic, an article by Michael Gottesman, Andrew Trick, and Joe Groff.

SQLite Internals: Pages & B-trees

This constrained size means that SQLite doesn't include every bell and whistle. It's careful to include the 95% of what you need in a database—strong SQL support, transactions, windowing functions, CTEs, etc—without cluttering the source with more esoteric features. This limited feature set also means the structure of the database can stay simple and makes it easy for anyone to understand.

Source: SQLite Internals: Pages & B-trees, an article by Ben Johnson.

Using GNU Stow to manage your dotfiles

I accidentally stumbled upon something yesterday that I felt like sharing, which fell squarely into the "why the hell didn’t I know about this before?" category. In this post, I’ll describe how to manage the various configuration files in your GNU/Linux home directory (aka "dotfiles" like .bashrc) using GNU Stow.

Source: Using GNU Stow to manage your dotfiles, an article by Brandon Invergo.

The limits of Python vectorization as a performance technique

Vectorization in Python, as implemented by NumPy, can give you faster operations by using fast, low-level code to operate on bulk data. And Pandas builds on NumPy to provide similarly fast functionality. But vectorization isn’t a magic bullet that will solve all your problems: sometimes it will come at the cost of higher memory usage, sometimes the operation you need isn’t supported, and sometimes it’s just not relevant.

Source: The limits of Python vectorization as a performance technique, an article by Itamar Turner-Trauring.

Solving “The Dangler” Conundrum with Container Queries and :has()

Y’know that situation where you tell the client, “Here’s your website and you can edit those four (4) little homepage features in the CMS” and the client says “Okay okay okay” and you check the site a week later and it looks bad because the client —despite your incredible documentation— put an odd number of items in the feature grid? It’s a major minor problem that’s tough to explain to the client, but it all comes down to…

The dangler.

Source: Solving “The Dangler” Conundrum with Container Queries and :has(), an article by Dave Rupert.

When Not to Use Docker: Cases Where Containers Don’t Help

Many organizations that adopt Docker or an adjacent containerization technology find it increases efficiency and accelerates the development process. Docker’s not something that magically improves every system though. In this article, we’ll look at some scenarios where moving to containers might be more of a hindrance than a help.

Source: When Not to Use Docker: Cases Where Containers Don’t Help, an article by James Walker.

What they don't teach you about sockets

In order to effectively write applications that communicate via sockets, there were some realizations I needed to make that weren't explicitly told to me by any of the documentation I read.

If you have experience writing applications using sockets, all of this information should be obvious to you. It wasn't obvious to me as an absolute beginner, so I'm trying to make it more explicit in the hopes of shortening another beginner's time getting their feet wet with sockets.

Source: What they don't teach you about sockets, an article by Macoy Madson.

Useless Math That Turned Out to Be Extremely Important

Mathematicians are a peculiar people. We live in our own little world, studying esoteric ideas that may or may not have much connection to the real world. One might wonder whether the bulk of what we study is actually useful. It’s true, after all, that we often pursue ideas not because there’s an immediate application, but simply because they’re interesting. By and large, it seems we aren’t overly concerned about immediate real-world application of our results.

Don’t start campaigning to cut math funding just yet, though. Math that doesn’t have applications today may very quickly become extremely important, even becoming integral to our way of life!

Source: Useless Math That Turned Out to Be Extremely Important, an article by Alex Shumway.

A Guide to Naming Variables

Software is written for people to understand; variable names should be chosen accordingly. People need to comb through your code and understand its intent in order to extend or fix it. Too often, variable names waste space and hinder comprehension. Even well-intentioned engineers often choose names that are, at best, only superficially useful. This document is meant to help engineers choose good variable names. It artificially focuses on code reviews because they expose most of the issues with bad variable names. There are, of course, other reasons to choose good variable names (such as improving code maintenance).

Source: A Guide to Naming Variables, an article by Jacob Gabrielson.

Is keeping dates in UTC really the best solution?

In many projects, the approach to dates is quite nonchalant. People do as they want. When on-premise systems were king, the common problem was that it was hard to know precisely when something happened. The consistency of the configuration depended on how meticulous ops people were. It wasn’t shocking to find out that the server had a different time zone, the application had a different one, and the user had a different time zone. At one point, the development community found a compromise that “maybe we would use the same time zone everywhere, for instance UTC.”

Source: Is keeping dates in UTC really the best solution?, an article by Oskar Dudycz.
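
The compromise described in the excerpt above can be sketched as follows: timestamps recorded in different local zones are normalized to UTC before storage, and converted back only for display. This is an illustration, not code from the article. The fixed offsets `SERVER_TZ` and `USER_TZ`, the helper `to_utc`, and the sample timestamps are assumptions, and fixed offsets deliberately ignore daylight saving rules.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets standing in for real time zones.
SERVER_TZ = timezone(timedelta(hours=-5))
USER_TZ = timezone(timedelta(hours=2))


def to_utc(local: datetime) -> datetime:
    """Normalize an aware datetime to UTC; reject naive ones."""
    if local.tzinfo is None:
        raise ValueError("naive datetime: the time zone is unknown")
    return local.astimezone(timezone.utc)


# The same instant, as seen by the server and by the user.
server_event = datetime(2022, 10, 1, 7, 30, tzinfo=SERVER_TZ)
user_event = datetime(2022, 10, 1, 14, 30, tzinfo=USER_TZ)

# Stored form: both normalize to 12:30 UTC.
stored_server = to_utc(server_event)
stored_user = to_utc(user_event)

# Display form: convert back only at the edge, for a specific viewer.
shown_to_user = stored_server.astimezone(USER_TZ)
```

Rejecting naive datetimes in `to_utc` is a design choice of this sketch: it refuses to guess which of the server, application, or user zones a value was recorded in.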

Hardening SSH

In 2019, Netcraft found 74.2% of web-facing machines run Linux. During an IPv4-wide census in 2016, an OpenSSH banner was detected 75% of the time when there was a response on TCP port 22. It's safe to say OpenSSH is probably the world's most popular software for connecting to servers remotely. It's also one of the most prized attack vectors given the functionality offered to anyone able to connect.

Hardening the security aspects of an OpenSSH configuration is very challenging. It's even worse for teams that aren't focused on network security and can't justify the budget for consultants setting up bespoke systems.

Source: Hardening SSH, an article by Mark Litwintschik.

Unit-aware data frames with composite, dimensional and ixset-typed

In this post we’re going to see how we can stitch together a few libraries to make a unit-aware queryable data frame from a CSV using extensible records. By the end of this text, we’ll be able to parse a CSV of data from the periodic table, complete with the correct units, and able to quickly ask questions about our data set using the generated indices.

Source: Unit-aware data frames with composite, dimensional and ixset-typed, an article by Dan Firth.

Why Would Git Push a Larger than Necessary Pack

In my time pretending to be an engineer and working with git at Twitter, I’ve seen an interesting behavior pop up intermittently. People start complaining about git-push being slow. This particular issue becomes hard to diagnose, especially since the pandemic, because we can’t be certain of the quality of connection being used, and optimizations to git-push have always taken a back seat to all the other changes we’ve done to git internally. But it has persisted long enough that it needed some deeper diving into, and the intermittent nature always fascinated me. Let’s talk about the problem a little more.

Source: Why Would Git Push a Larger than Necessary Pack, an article by Kiran Paul.

A freshly molted Chromatopelma cyaneopubescens

Yesterday, in the early evening, I noticed that the Chromatopelma cyaneopubescens I keep had molted. And today, because I could guide it carefully into a different position, I took a few photos.

Freshly molted Chromatopelma cyaneopubescens.

In the photo above you can see why this tarantula has the common name green bottle blue tarantula or GBB for short.

Typing your way into safety

I've been working with Python typing annotations in the last few years as part of our main product at Flare Systems. I've found them to be a wonderful tool to support refactoring and make the code more readable. Lately, I explored how we can make APIs safer with the use of types. I will specifically look at how we can use Python typing annotations to make os.system foolproof.

Source: Typing your way into safety, an article by Israël Hallé.
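
One way the idea in the excerpt above could look in practice is sketched below. It is not necessarily the approach the article takes: the names `SafeCommand`, `build_command` and `run` are hypothetical, and the technique shown is `typing.NewType` combined with `shlex.quote`. The wrapper accepts only values produced by the quoting constructor, so a static type checker can flag a raw string passed to it.

```python
import os
import shlex
from typing import NewType

# A distinct static type for shell strings that have been safely assembled.
# At runtime a SafeCommand is still a plain str; the check is static only.
SafeCommand = NewType("SafeCommand", str)


def build_command(program: str, *args: str) -> SafeCommand:
    """The only intended way to obtain a SafeCommand: quote every part."""
    parts = (program, *args)
    return SafeCommand(" ".join(shlex.quote(part) for part in parts))


def run(command: SafeCommand) -> int:
    """Thin wrapper around os.system that only accepts SafeCommand."""
    return os.system(command)


# Hypothetical hostile input: the shell metacharacters end up inside quotes.
cmd = build_command("echo", "hello; rm -rf /")
```

A call such as `run("echo " + user_input)` still works at runtime, but a type checker would report it because a plain `str` is not a `SafeCommand`. The guarantee therefore depends on running the checker.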