Plurrrr

From Potter's Field

An unidentified nude female sits propped against a fountain in Central Park. There are no signs of struggle. When Dr. Kay Scarpetta and her colleagues Benton Wesley and Pete Marino arrive on the scene, they instantly recognize the signature of serial killer Temple Brooks Gault. Scarpetta, on assignment with the FBI, visits the New York City morgue on Christmas morning, where she must use her forensic expertise to give a name to the nameless—a difficult task. But as she sorts through conflicting forensic clues, Gault claims his next victim. He has infiltrated the FBI’s top secret artificial-intelligence system developed by Scarpetta’s niece, and sends taunting messages as his butchery continues, moving terrifyingly closer to Scarpetta herself.

In the afternoon I started in From Potter's Field, Kay Scarpetta book 6 by Patricia Cornwell.

An Introduction to Parser Combinators

If you’ve ever had to write a parser before, you know that creating parsers can be a tedious and complicated process. The good news is that it doesn’t have to be this way. In this post, I’m going to introduce parser combinators - a technique for building parsers that I’ve found to be both practical and fun to play around with.

Source: An Introduction to Parser Combinators, an article by Varun Ramesh.
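
To make the idea concrete, here is a minimal sketch of parser combinators in Python. The names (`char`, `many1`, `seq`, `fmap`) and the convention that a parser returns `(value, new_position)` or `None` are my own assumptions for illustration, not taken from the article.

```python
from typing import Any, Callable, Optional, Tuple

# A parser takes the input text and a position, and returns
# (value, new_position) on success or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def char(predicate: Callable[[str], bool]) -> Parser:
    """Parse a single character that satisfies `predicate`."""
    def parse(text: str, pos: int):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def many1(p: Parser) -> Parser:
    """Apply `p` one or more times and collect the results in a list."""
    def parse(text: str, pos: int):
        results = []
        while True:
            r = p(text, pos)
            if r is None:
                break
            value, pos = r
            results.append(value)
        return (results, pos) if results else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run the parsers one after another; fail if any of them fails."""
    def parse(text: str, pos: int):
        values = []
        for p in parsers:
            r = p(text, pos)
            if r is None:
                return None
            value, pos = r
            values.append(value)
        return values, pos
    return parse


def fmap(f: Callable[[Any], Any], p: Parser) -> Parser:
    """Transform the value produced by `p` with `f`."""
    def parse(text: str, pos: int):
        r = p(text, pos)
        if r is None:
            return None
        value, new_pos = r
        return f(value), new_pos
    return parse


# Small parsers are combined into bigger ones.
digit = char(str.isdigit)
number = fmap(lambda ds: int("".join(ds)), many1(digit))
plus = char(lambda c: c == "+")
addition = fmap(lambda vs: vs[0] + vs[2], seq(number, plus, number))

print(addition("12+30", 0))  # (42, 5)
```

Each combinator returns a new parser, so a grammar is built by plain function composition.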

Gitflow and GitHub Flow compared: Which one is better?

Gitflow is, by far, the most popular branching model and possibly the one that has endured the test of time the most. Introduced by Vincent Driessen in 2010, its fundamental idea is that you should isolate your work into different types of git branches.

Other branching strategies, such as the centralized workflow (for those teams that come from SVN), and the forking workflow (for open-source projects) exist. Git, as a version control system, only details basic branching operations, and it remains controversial as to which approach is the best. Beyond those basic branching operations, it's a matter of opinion.

In this article we will compare Gitflow with its newer approach, GitHub Flow.

Source: Gitflow and GitHub Flow compared: Which one is better?.

Completely purge files from a git repository

I have occasionally ended up with files I did not want in my git repositories. These can both take up a lot of space, and contain sensitive data that we just want to remove (such as MySQL dumps, deploy keys etc).

Git keeps a history of all files, so just deleting the file doesn’t “make it go away”. The only way to completely remove the file is to scan through all history, removing all references to (and history of) those files, and finally pruning the git repo (physically removing references to what we just deleted). Finally you have to force-push the repo changes back to the remote, overwriting the remote.

Source: Completely purge files from a git repository (including history), an article by Ralph Slooten.
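
The steps described above can be sketched as a small Python helper that builds the git commands. This is an assumption-laden sketch: it uses `git filter-branch`, which may not be the exact command the article uses, and which the git documentation now discourages in favour of `git filter-repo`. The function names, the default remote `origin`, and the file name are hypothetical. By default nothing is executed.

```python
import shlex
import subprocess
from typing import List


def purge_commands(path: str, remote: str = "origin") -> List[str]:
    """Build the shell commands that remove `path` from all of history."""
    inner = f"git rm --cached --ignore-unmatch {shlex.quote(path)}"
    return [
        # 1. Rewrite every commit on every ref, dropping the file.
        "git filter-branch --force --index-filter "
        + shlex.quote(inner)
        + " --prune-empty --tag-name-filter cat -- --all",
        # 2. Delete the backup refs that filter-branch leaves behind.
        "git for-each-ref --format='delete %(refname)' refs/original"
        " | git update-ref --stdin",
        # 3. Expire the reflog so the old commits become unreachable.
        "git reflog expire --expire=now --all",
        # 4. Physically remove the unreachable objects.
        "git gc --prune=now --aggressive",
        # 5. Overwrite the remote with the rewritten history.
        f"git push {shlex.quote(remote)} --force --all",
    ]


def run(commands: List[str], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(command)
        if not dry_run:
            subprocess.run(command, shell=True, check=True)


# Dry run: only prints the commands for a hypothetical file name.
run(purge_commands("dump.sql"))
```

Rewriting history changes commit hashes, so everybody else has to re-clone or reset after the force-push.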

Big O Notation: A Simple Explanation With Examples

It’s hard to create efficient algorithms without understanding the time and space complexity of various operations. The concept of Big O notation helps programmers understand how quickly or slowly an algorithm will execute as the input size grows.

In this article, we’ll cover the basics of Big O notation, why it is used, and how to describe the time and space complexity of algorithms, with examples.

Source: Big O Notation: A Simple Explanation With Examples.
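
As a small illustration of growth with input size, the sketch below counts comparisons for a linear search, O(n), and a binary search, O(log n), on sorted data. The step counters and function names are my own additions for illustration and are not from the article.

```python
def linear_search(items, target):
    """Return (index, steps); checks every element in turn: O(n)."""
    steps = 0
    for index, item in enumerate(items):
        steps += 1
        if item == target:
            return index, steps
    return -1, steps


def binary_search(items, target):
    """Return (index, steps); halves the sorted range each step: O(log n)."""
    steps = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


# Search for the last element, the worst case for linear search.
for n in (1_000, 1_000_000):
    data = list(range(n))
    _, linear_steps = linear_search(data, n - 1)
    _, binary_steps = binary_search(data, n - 1)
    print(f"n={n}: linear={linear_steps} steps, binary={binary_steps} steps")
```

Making the input a thousand times larger makes the linear search a thousand times slower, while the binary search needs only about ten extra steps.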

Clearing up some misconceptions about Passkeys

I am unreasonably excited about passkeys, I’ve long been looking for a better/more convenient way than passwords to do authentication, and I think passkeys are finally it.

However, whenever I see passkeys mentioned (for example on the recent Tailscale post about them), there are always a lot of misconceptions that surface in the debate. I’d like to clear some of them here, and hopefully explain a bit better what passkeys are.

Source: Clearing up some misconceptions about Passkeys, an article by Stavros Korokithakis.

The Bubble (2022)

A group of actors and actresses stuck inside a pandemic bubble at a hotel attempt to complete a film.

In the evening Alice, Esme, and I watched The Bubble. I didn't like the movie much and give it a 5 out of 10.

Even in the Wild, Mice Run on Wheels

In 2009, neurophysiologist Johanna Meijer set up an unusual experiment in her backyard. In an ivy-tangled corner of her garden, she and her colleagues at Leiden University in the Netherlands placed a rodent running wheel inside an open cage and trained a motion-detecting infrared camera on the scene. Then they put out a dish of food pellets and chocolate crumbs to attract animals to the wheel and waited.

Wild house mice discovered the food in short order, then scampered into the wheel and started to run. Rats, shrews, and even frogs found their way to the wheel—more than 12,000 animals over 3 years. The creatures seemed to relish the feeling of running without going anywhere.

Source: Even in the Wild, Mice Run on Wheels, an article by Emily Underwood.

Carry on Screaming! (1966)

The sinister Dr Watt has an evil scheme going. He's kidnapping beautiful young women and turning them into mannequins to sell to local stores.

In the evening Adam, Alice, Esme, and I watched Carry on Screaming!. Adam had proposed to watch this very old movie together. I didn't like it much and give it a 6 out of 10.

The Body Farm

Little Emily Steiner left a church meeting late one afternoon and strolled toward home along a lakeside path; a week later, her nude body was discovered, bound in blaze-orange duct tape. Called by the North Carolina authorities, forensic pathologist Kay Scarpetta recognizes similarities to the gruesome work of a serial killer who has long eluded the FBI. But as she tries to make sense of the evidence, she is left with questions that lead her to the Body Farm, a little known research facility in Tennessee where, with the help of some grisly experiments, she might discover the answer.

It is Scarpetta alone who can interpret the forensic hieroglyphics that eventually reveal a solution to the case as staggering as it is horrifying. But she must also endeavor to help her niece, Lucy, who is embroiled in controversy at Quantico. And Scarpetta, too, is vulnerable, as she opens herself to the first physical and emotional bond she has felt in far too long a time.

In the afternoon I started in The Body Farm, Kay Scarpetta book 5 by Patricia Cornwell.

Understanding Quantum Secrecy

We begin with describing the core problem of quantum communication: the encoding (and decoding) of information. We distinguish between the classical elementary unit of information (bit) and the quantum elementary unit of information (qubit).

We end with a description of the core problem of quantum secrecy: quantum key generation and distribution.

Source: Understanding Quantum Secrecy, an article by Declain Thomas.

Extraction II (2023)

After barely surviving his grievous wounds from his mission in Dhaka, Bangladesh, Tyler Rake is back, and his team is ready to take on their next mission.

In the evening Esme and I watched Extraction II. I liked the movie more than the first one and give it an 8 out of 10.

Custom giraffe caret

Recently, a colleague asked me if there is a way to customize an input caret using CSS. I knew you could change the color of it, but it got me thinking if we could completely replace it. The problem seemed interesting to solve.

Source: Custom giraffe caret, an article by Stanko Tadić.

Making Python 100x faster with less than 100 lines of Rust

A while ago at $work, we had a performance issue with one of our core Python libraries.

This particular library forms the backbone of our 3D processing pipeline. It’s a rather big and complex library which uses NumPy and other scientific Python packages to do a wide range of mathematical and geometrical operations.

Our system also has to work on-prem with limited CPU resources, and while at first it performed well, as the number of concurrent physical users grew we started running into problems and our system struggled to keep up with the load.

We came to the conclusion that we had to make our system at least 50 times faster to handle the increased workload, and we figured that Rust could help us achieve that.

Source: Making Python 100x faster with less than 100 lines of Rust, an article by Ohad Ravid.

Bit Hacking (with Go code)

At a fundamental level, a programmer needs to manipulate bits. Modern processors operate over data by loading in ‘registers’ and not individual bits. Thus a programmer must know how to manipulate the bits within a register.

Source: Bit Hacking (with Go code), an article by Daniel Lemire.
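
The article uses Go; the sketch below shows a few common bit manipulations in Python instead, purely as an illustration. The helper names are my own, and Python integers are arbitrary precision rather than fixed-width registers, so the examples stick to small non-negative values.

```python
def set_bit(x: int, n: int) -> int:
    """Turn bit n on."""
    return x | (1 << n)


def clear_bit(x: int, n: int) -> int:
    """Turn bit n off."""
    return x & ~(1 << n)


def toggle_bit(x: int, n: int) -> int:
    """Flip bit n."""
    return x ^ (1 << n)


def is_bit_set(x: int, n: int) -> bool:
    """Check whether bit n is on."""
    return bool((x >> n) & 1)


def lowest_set_bit(x: int) -> int:
    """Isolate the lowest set bit (0 when x is 0)."""
    return x & -x


def popcount(x: int) -> int:
    """Count set bits of a non-negative x by clearing the lowest one each time."""
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


value = 0b1000
value = set_bit(value, 1)
print(bin(value))              # 0b1010
print(is_bit_set(value, 3))    # True
print(popcount(value))         # 2
```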

Hashing

As a programmer, you use hash functions every day. They're used in databases to optimise queries, they're used in data structures to make things faster, they're used in security to keep data safe. Almost every interaction you have with technology will involve hash functions in one way or another.

Hash functions are foundational, and they are everywhere.

But what is a hash function, and how do they work?

Source: Hashing, an article by Sam Rose.
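
As a minimal sketch of what a hash function does, here is 32-bit FNV-1a in Python, used to pick a bucket the way a hash table might. FNV-1a is my choice for illustration and not necessarily an algorithm the article covers; `bucket_for` and the sample names are hypothetical.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Map arbitrary bytes to a 32-bit integer with FNV-1a."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte                          # mix in the next byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF   # multiply, keep 32 bits
    return h


def bucket_for(key: str, bucket_count: int) -> int:
    """Pick a bucket index for `key`, as a hash table would."""
    return fnv1a_32(key.encode("utf-8")) % bucket_count


# The same input always gives the same hash.
print(hex(fnv1a_32(b"a")))  # 0xe40c292c

# Spread some keys over 8 buckets.
buckets = [[] for _ in range(8)]
for name in ("alice", "bob", "carol", "dave"):
    buckets[bucket_for(name, 8)].append(name)
print(buckets)
```

FNV-1a is fast but not cryptographically secure, so it is not suitable for keeping data safe.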

Compiling typed Python

It’s been nine whole years since PEP 484 landed and brought us types from on high. This has made a lot of people very angry and been widely regarded as a bad move. Since then, people on the internet have been clamoring to find out: does this mean we can now compile Python to native code for more speed? It’s a totally reasonable question. It was one of my first questions when I first started working on Python compilers. So can we do it?

Source: Compiling typed Python, an article by Max Bernstein.

Introducing the Nix Flake Checker

Quite possibly the best thing about the Nix ecosystem is that there's a small army of people hard at work improving Nixpkgs, the largest software package repository in existence and one of the most active repos on GitHub, every single day. Not only are they constantly adding brand new packages for stuff that you might want to use—over 80,000 packages and counting!—they're also updating existing packages, which sometimes even includes fixes for critical security vulnerabilities.

But to take full advantage of this steady drumbeat of progress, it's important that you follow some best practices. To help you adopt those practices, we at Determinate Systems have created a tool called Nix Flake Checker and we're excited to release it to the Nix community.

Source: Introducing the Nix Flake Checker, an article by Luc Perkins.

NATS: building a Nix binary cache

For the past month or so, I’ve been experimenting with Nits, a different approach to NixOS deployments that is pull rather than push-based. And as part of that effort, I needed to address how exactly I would push NixOS system closures to the machines under management.

Typically, in a push-based deployment approach, you can copy the system closure directly via SSH whenever you’re deploying. But with Nits, the agent process running on the target machine needs to be able to connect and download the latest closure on demand, sometimes days or weeks after the deployment was triggered.

“Use a Binary Cache”, I hear you say. And yes, I did. But instead of spinning up an instance of Harmonia, configuring an S3 Bucket or hitting up Domen over at Cachix, instead, I decided to roll my own.

Source: NATS: building a Nix binary cache, an article by Brian McGee.

Error vs. Exception

There has been confusion about the distinction between errors and exceptions for a long time, repeated threads in Haskell-Cafe and more and more packages that handle errors and exceptions or something between. Although both terms are related and sometimes hard to distinguish, it is important to do it carefully.

Source: Error vs. Exception.

My First Impressions of Nix

Nix is a tool for configuring software environments according to source files. I’ve been hearing more and more about Nix on Hacker News and Twitter. The idea of it appeals to me, so I’ve been tinkering with it over the past few weeks.

Source: My First Impressions of Nix, an article by Michael Lynch.

Designing Pythonic library APIs

This article describes some principles I’ve found useful for designing good Python library APIs, including structure, naming, error handling, type annotations, and more. It’s a written version of a talk I gave in June 2023 at the Christchurch Python meetup.

Source: Designing Pythonic library APIs, an article by Ben Hoyt.

Building Search DSLs with Django

Search capabilities span from free text (think Google) to raw data access (think SQL). In between, there’s a wide range of options for narrowing a search that are often provided with UI elements. But what if there are too many fields for a UI to search on? Search DSLs can give a user more granular access to searching without exposing an overly complicated interface.

Source: Building Search DSLs with Django, an article by Dan Lamanna.

Children of Memory

Earth failed. In a desperate bid to escape, the spaceship Enkidu and its captain, Heorest Holt, carried its precious human cargo to a potential new paradise. Generations later, this fragile colony has managed to survive, eking out a hardy existence. Yet life is tough, and much technological knowledge has been lost.

Then strangers appear. They possess unparalleled knowledge and thrilling technology – and they've arrived from another world to help humanity’s colonies. But not all is as it seems, and the price of the strangers' help may be the colony itself.

In the evening I started in Children of Memory, Children of Time book 3 by Adrian Tchaikovsky.

In the acknowledgements the author gives a nod to his research sources which include the fantastic book The Genius of Birds by Jennifer Ackerman. I read this book several years ago while living in Mexico and highly recommend it.

The best Python feature you cannot use

Instead of having to limit sanity checks to the boundaries of the program, we could re-use those as function contracts using the assert keyword. Indeed, setting PYTHONOPTIMIZE removes all assert, making the check useful in dev, and free in production.

Unfortunately, the community doesn't know about the feature, and uses assert for things that should never be removed, so using the flag would likely introduce bugs into your program.

Source: The best Python feature you cannot use.
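
A minimal sketch of the idea: an `assert` used as a function contract. The `average` function is a hypothetical example of mine. In a normal run the assert fires; with `python -O` or `PYTHONOPTIMIZE` set, asserts are stripped and the bad call fails later with a different error.

```python
def average(values):
    """Return the mean of `values`."""
    # Contract: callers must pass a non-empty sequence.
    # This check disappears when Python runs with -O / PYTHONOPTIMIZE.
    assert len(values) > 0, "average() requires at least one value"
    return sum(values) / len(values)


print(average([2, 4, 6]))  # 4.0

try:
    average([])
except AssertionError as exc:
    # Normal (development) run: the contract catches the bad call.
    print("contract violated:", exc)
except ZeroDivisionError:
    # Optimized run: the assert was removed, so the bug surfaces later.
    print("assert was stripped (running with -O or PYTHONOPTIMIZE)")
```

This also shows the risk the quote mentions: code that relies on an assert for a check that must always run behaves differently once the flag is set.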

Ride Along 2 (2016)

As his wedding day approaches, Ben heads to Miami with his soon-to-be brother-in-law James to bring down a drug dealer who's supplying the dealers of Atlanta with product.

In the evening Adam, Alice, Esme, and I watched Ride Along 2. I liked the movie and give it a 7 out of 10.

How does Machine Learning work?

Machine learning (ML) lets computers learn from data on their own without requiring software developers to write out all the logic by hand. Given enough data, the machine can learn useful patterns in the data, which turns out to be quite powerful.

The earliest ML algorithms go back to the 1960s, but machine learning started being commonly used in the early 2000s. In 2012, “deep learning” involving large neural networks became practical and their usage and capabilities have grown exponentially since then.

Source: How does Machine Learning work?.

Large Language Models and Search

The recent breakthroughs in Large Language Model (LLM) technology are positioned to transition many areas of software. Search and Database technologies particularly have an interesting entanglement with LLMs. There are cases where Search improves the capabilities of LLMs as well as where inversely, LLMs improve the capabilities of Search. In this blog post, we will break down 5 key components of the intersection between LLMs and Search.

  • Retrieval-Augmented Generation
  • Query Understanding
  • Index Construction
  • LLMs in Re-Ranking
  • Search Result Compression

We will also conclude with some thoughts on how Generative Feedback Loops fit into this picture.

Source: Large Language Models and Search, an article by Connor Shorten and Erika Cardenas.

Ride Along (2014)

Security guard Ben must prove himself to his girlfriend's brother, top police officer James. He rides along with James on a 24-hour patrol of Atlanta.

In the evening Adam, Esme, and I watched Ride Along. I liked the movie and give it a 7 out of 10.

Exploring Dataflow Analysis in the Rust Compiler

Recently I’ve been working in static analysis land and as a part of that have been familiarizing myself with data flow analysis. I look at a fair amount of MIR and so decided to delve into the rustc_mir_dataflow crate to see how these things are handled in the rust compiler. There is a helpful introduction to this topic in the rustc dev guide, and this post fleshes things out a bit.

Source: Exploring Dataflow Analysis in the Rust Compiler, an article by David Anekstein.

Three techniques to adapt LLMs for any use case

Large language models (LLMs) have powerful general capabilities out of the box: they can answer questions, write poems and stories, invent recipes, and write code. But they may not precisely fit your use case. Their answers may be too vague, poorly formatted, or even incorrect.

Fortunately, you can adapt LLMs to meet your needs. There are three levels of LLM customization:

  1. Prompt engineering
  2. Embeddings via vector databases
  3. Fine-tuning

Each level is an order of magnitude more difficult and expensive than the previous, but offers far more customization.

Source: Three techniques to adapt LLMs for any use case, an article by Philip Kiely.

An Ode to Emacs. The Greatest Operating System

Emacs is one of those magical pieces of technology that manages to bridge the gap between being a tool that does useful work, and becoming a deeply personal component in a software developer’s life. It is one half of the editor war and is one of the longest running examples of programmers elevating their personal choices to moral imperatives. But beyond that, it is also a piece of software that has seen decades of iteration. It’s old enough that parts of it were contributed by people that have long since passed. They may be gone, but their code still lives on, making our lives just a bit easier.

Source: An Ode to Emacs. The Greatest Operating System, an article by Diego Crespo.

Why PostgreSQL High Availability Matters and How to Achieve It

Ensuring your application can handle failures and outages is crucial, and the availability of your application is only as good as the availability of your PostgreSQL instance. With that in mind, you may be wondering which PostgreSQL high availability (HA) deployment option is best for your application.

Let’s review several popular solutions that increase the high availability of PostgreSQL deployments and, as a result, the availability overall of your application. Why several and not one? Well, there’s no silver bullet or one-size-fits-all solution when it comes to high availability and PostgreSQL. So, walk through the options for a highly available deployment of PostgreSQL and then you can make a choice that fits your use case.

Source: PostgreSQL High Availability Options: A Guide.

Lou (2022)

A storm rages. A young girl is kidnapped. Her mother teams up with the mysterious woman next door to pursue the kidnapper, a journey that tests their limits and exposes shocking secrets from their pasts.

In the evening I watched Lou. I liked the movie and give it a 7 out of 10.

WWDC23: Passkeys

Last week I managed to glean more insights into Apple’s continuing evolution of their passkeys support from other places online, though, so I decided to pull it all together. Here’s the latest news on changes coming in iOS 17, iPadOS 17, and macOS 14 to Apple’s passkeys experience.

Source: WWDC23: Passkeys, an article by Matthew Miller.

Cruel and Unusual

The fingerprints say the murderer is the man who's just been executed . . .

At 11.05 one December evening in Richmond, Virginia, convicted murderer Ronnie Joe Waddell is pronounced dead in the electric chair. At the morgue Dr Kay Scarpetta waits for Waddell's body. Preparing to perform a post-mortem before the subject is dead is a strange feeling, but Scarpetta has been here before. And Waddell's death is not the only newsworthy event on this freezing night: the grotesquely wounded body of a young boy is found propped against a rubbish skip. To Scarpetta the two cases seem unrelated, until she recalls that the body of Waddell's victim had been arranged in a strikingly similar position . . .

In the evening I started in Cruel and Unusual, Book 4 in the Kay Scarpetta series by Patricia Cornwell.