week 52, 2021

Almost Always Unsigned

The need for signed integer arithmetic is often misplaced as most integers never represent negative values within a program. The indexing of arrays and iteration count of a loop reflect this concept as well. There should be a propensity to use unsigned integers more often than signed, yet despite this, most code incorrectly chooses to use signed integers almost exclusively.

Source: Almost Always Unsigned, an article by Dale Weiler.

Go Fuzzing

Fuzzing is a type of automated testing which continuously manipulates inputs to a program to find bugs. Go fuzzing uses coverage guidance to intelligently walk through the code being fuzzed to find and report failures to the user. Since it can reach edge cases which humans often miss, fuzz testing can be particularly valuable for finding security exploits and vulnerabilities.

Source: Go Fuzzing.
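
To make the idea concrete, here is a minimal mutation-based fuzzer sketched in Python. It is not taken from the Go documentation and it is not coverage-guided the way Go's native fuzzing is: it only mutates seed inputs at random and records the ones that raise. The function under test (average_word_length), the seed inputs, and the alphabet are all hypothetical, chosen so that an easy-to-miss edge case exists.

```python
import random

ALPHABET = "abc \t\n"  # assumption: a tiny alphabet keeps the search space small


def average_word_length(text):
    """Hypothetical function under test; it breaks on input without any words."""
    words = text.split()
    return sum(len(word) for word in words) / len(words)


def mutate(data, rng):
    """Apply one random insert, delete, or replace to the string."""
    chars = list(data)
    op = rng.choice(("insert", "delete", "replace"))
    if op == "insert" or not chars:
        chars.insert(rng.randint(0, len(chars)), rng.choice(ALPHABET))
    elif op == "delete":
        del chars[rng.randrange(len(chars))]
    else:
        chars[rng.randrange(len(chars))] = rng.choice(ALPHABET)
    return "".join(chars)


def fuzz(target, seeds, iterations=2000, seed=0):
    """Run target on randomly mutated seeds and collect (input, exception) pairs."""
    rng = random.Random(seed)
    failures = []
    for _ in range(iterations):
        candidate = rng.choice(seeds)
        for _ in range(rng.randint(1, 4)):  # stack a few mutations per attempt
            candidate = mutate(candidate, rng)
        try:
            target(candidate)
        except Exception as exc:
            failures.append((candidate, exc))
    return failures


if __name__ == "__main__":
    found = fuzz(average_word_length, ["hello world", "a"])
    print(len(found), "failing inputs, for example", repr(found[0][0]))
```

A real fuzzer would also keep inputs that reach new code paths and mutate those further; that feedback loop is what the coverage guidance in the excerpt refers to.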

Databass, Part 1: Queries

It's been a while since my last language series on this blog, but I figured I shouldn't let an entire calendar year go by without doing some technical writing here. This time we'll be working on creating a toy relational database in the vein of Tutorial D, as described in Databases, Types, and The Relational Model: The Third Manifesto by C.J. Date and Hugh Darwen. However, instead of creating a full database language with its own syntax, we're going to embed the database language in Haskell. In particular, we're going to try and get ghc to ensure that queries are well typed as opposed to writing our own type checker.

Source: Databass, Part 1: Queries, an article by Joseph Morag.

The Modern Guide to OAuth

I know what you are thinking: is this really another guide to OAuth 2.0?

Well, yes and no. This guide is different than most of the others out there because it covers all of the ways that we actually use OAuth. It also covers all of the details you need to be an OAuth expert without reading all the specifications or writing your own OAuth server. This document is based on hundreds of conversations and client implementations as well as our experience building FusionAuth, an OAuth server which has been downloaded over a million times.

Source: The Modern Guide to OAuth, an article by Brian Pontarelli and Dan Moore.

James’s OpenBSD setup notes

These are my personal notes on installing, setting up, and using OpenBSD on two Thinkpads (an X220 and a T400). They’re applicable to OpenBSD-current as at 2020-09-05 (somewhere between OpenBSD versions 6.7 and 6.8) - please bear in mind that some things may have changed if you’re using a different version.

Source: James's OpenBSD setup notes.

Daddy's Home (2015)

Brad Whitaker is a radio host trying to get his stepchildren to love him and call him Dad. But his plans turn upside down when their biological father, Dusty Mayron, returns.

In the evening Adam, Alice and I watched Daddy's Home. I liked the movie and give it a 7 out of 10. I also liked the soundtrack: Here Comes Your Man (Pixies), Self Esteem (The Offspring), and Hate to Say I Told You So (The Hives).

My Setup for Self-Hosting Dozens of Web Applications

There are nearly infinite options available for hosting software today and more come out every day. However, many articles and guides you'll find online for this kind of thing are either from public cloud providers or companies with massive infrastructure, complex application needs, and huge amounts of traffic.

I wanted to write this up mostly to share the decisions I made for the architecture and why I've done things the way I have. Although my needs are much smaller-scale and I don't currently charge any money for anything I'm running, I still want to provide the best possible experience for my sites' users and protect all the work I've put into my projects.

Source: My Setup for Self-Hosting Dozens of Web Applications + Services on a Single Server, an article by Casey Primozic.

How to back up your Git repositories

Making backups is important. You don’t want to lose all your information because of a broken device or a stolen account. One proposed solution is the 3–2–1 method (3 copies, on at least 2 different devices, and 1 of them off-site) and you should make at least one full backup every year (this could coincide with World Backup Day). What to back up is up to you. You can back up your contacts, emails, messages, social network content… and your code.

Backing up code is a bit of a tricky question. Most people host their code on their computer, probably with Git and maybe on GitHub. But having one copy is having no copies. You don’t want to depend on GitHub exclusively for your code, and it is wise to have at least one extra copy. The question is then, how to make that extra copy.

Source: How to back up your Git repositories, an article by Alberto de Murga.
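
One common way to make that extra copy is a mirror clone that gets refreshed on later runs. The Python sketch below is my own illustration and not necessarily the method the article settles on. It wraps two real Git commands, git clone --mirror and git remote update --prune; the helper names, the example URL, and the backup directory are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def backup_command(url: str, backup_root: Path) -> List[str]:
    """Build the git command that creates or refreshes a mirror of url."""
    name = url.rstrip("/").split("/")[-1]
    if not name.endswith(".git"):
        name += ".git"
    target = backup_root / name
    if target.exists():
        # A mirror already exists: fetch every ref and drop the deleted ones.
        return ["git", "-C", str(target), "remote", "update", "--prune"]
    # First run: a bare clone that copies all branches, tags, and other refs.
    return ["git", "clone", "--mirror", url, str(target)]


def backup(url: str, backup_root: Path) -> None:
    """Run the backup; raises CalledProcessError if git exits with an error."""
    backup_root.mkdir(parents=True, exist_ok=True)
    subprocess.run(backup_command(url, backup_root), check=True)


if __name__ == "__main__":
    # Hypothetical repository URL and destination, for illustration only.
    backup("https://example.com/alice/project.git", Path.home() / "git-backups")
```

Pointing backup_root at a second device or an off-site mount is what would turn this into one leg of the 3–2–1 scheme.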

The Matrix (1999)

When a beautiful stranger leads computer hacker Neo to a forbidding underworld, he discovers the shocking truth--the life he knows is the elaborate deception of an evil cyber-intelligence.

In the afternoon we watched The Matrix. I like this movie a lot and give it a solid 8.5 out of 10.

Consider SQLite

If you were creating a web app from scratch today, what database would you use? Probably the most frequent answer I see to this is Postgres, although there is a wide range of common answers: MySQL, MariaDB, Microsoft SQL Server, MongoDB, etc. Today I want you to consider: what if SQLite would do just fine?

Source: Consider SQLite, an article by Wesley Aptekar-Cassels.

5 lessons learned when I TDD an algorithm in JavaScript

I just found out that Uncle Bob wrote an article about TDDing the Diamond Square algorithm (yea, I’m slow on catching up sometimes).

In the article, Uncle Bob leads us through a way to TDD an algorithm. It’s pretty nice and gave me a few insights as to how to mock and test in intervals until the algorithm emerges.

Problem is – Uncle Bob’s code is not JavaScript!!!

So I sat down and reimplemented the algorithm. This time, I used the lessons learned from Uncle Bob’s article.

Source: 5 lessons learned when I TDD an algorithm in JavaScript, an article by Yonatan Kra.

Optimizing Postgres Queries at Scale

Heap's thousands of customers can build queries in the Heap UI to answer almost any question about how users are using their product. Optimizing all of these queries across all our customers presents special challenges you wouldn't typically encounter if you were optimizing the performance of a small set of queries within a typical app.

This post is about why this scale requires us to conduct performance experiments to optimize our SQL, and it details how we conduct those experiments.

Source: Optimizing Postgres Queries at Scale, an article by Matt Dupree.

Predictive CPU isolation of containers at Netflix

Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. However, the key insight here is that these caches are partially shared among the CPUs, which means that perfect performance isolation of co-hosted containers is not possible. If the container running on the core next to your container suddenly decides to fetch a lot of data from the RAM, it will inevitably result in more cache misses for you (and hence a potential performance degradation).

Source: Predictive CPU isolation of containers at Netflix, an article by Benoit Rostykus and Gabriel Hartmann.

Using PostgreSQL and SQL to Randomly Sample Data

In the last post of this series we introduced trying to model fire probability in Northern California based on weather data. We showed how to use SQL to do data shaping and preparation. We ended with a data set that was ready with all the fire occurrences and weather data in a single table almost prepped for logistic regression.

There is now one more step: sample the data. If you have worked with logistic regression before you know you should try to balance the number of occurrences (1) with absences (0). To do this we are going to sample from non_fire_weather a number of rows equal to the count in fire_weather and then combine them into one table.

Source: Using PostgreSQL and SQL to Randomly Sample Data, an article by Steve Pousty.
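
The article does this step in SQL. As a rough illustration of the same balancing idea, here is a small Python sketch: draw as many absence rows as there are occurrence rows, then combine the two with 1/0 labels. The function name and the row values are made up, and the sketch assumes there are at least as many non-fire rows as fire rows.

```python
import random


def balanced_sample(fire_rows, non_fire_rows, seed=None):
    """Label fire rows 1 and an equally sized random draw of non-fire rows 0."""
    rng = random.Random(seed)
    # random.sample draws without replacement; it raises ValueError when
    # non_fire_rows holds fewer rows than fire_rows.
    sampled = rng.sample(non_fire_rows, k=len(fire_rows))
    return [(row, 1) for row in fire_rows] + [(row, 0) for row in sampled]


if __name__ == "__main__":
    fire = ["f1", "f2", "f3"]                      # made-up occurrence rows
    non_fire = ["n%d" % i for i in range(1, 101)]  # made-up absence rows
    for row, label in balanced_sample(fire, non_fire, seed=42):
        print(label, row)
```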