An unidentified nude female sits propped against a fountain in
Central Park. There are no signs of struggle. When Dr. Kay Scarpetta
and her colleagues Benton Wesley and Pete Marino arrive on the
scene, they instantly recognize the signature of serial killer
Temple Brooks Gault. Scarpetta, on assignment with the FBI, visits
the New York City morgue on Christmas morning, where she must use
her forensic expertise to give a name to the nameless—a difficult
task. But as she sorts through conflicting forensic clues, Gault
claims his next victim. He has infiltrated the FBI’s top secret
artificial-intelligence system developed by Scarpetta’s niece, and
sends taunting messages as his butchery continues, moving
terrifyingly closer to Scarpetta herself.
In the afternoon I started in From Potter's Field, Kay Scarpetta book
6 by Patricia Cornwell.
If you’ve ever had to write a parser before, you know that creating
parsers can be a tedious and complicated process. The good news is
that it doesn’t have to be this way. In this post, I’m going to
introduce parser combinators - a technique for building parsers
that I’ve found to be both practical and fun to play around
with.
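To give a rough feel for the idea, here is a minimal sketch in Python. It is not taken from the linked post: the helper names (`char`, `digit`, `many1`, `seq`, `fmap`) are made up for this illustration. It assumes a parser is simply a function that takes the remaining input and returns a `(value, rest)` pair, or `None` on failure.

```python
from typing import Any, Callable, Optional, Tuple

# Assumption for this sketch: a parser maps input text to (value, rest) or None.
Parser = Callable[[str], Optional[Tuple[Any, str]]]


def char(c: str) -> Parser:
    """Parser that matches the literal string c."""
    def parse(s: str):
        if s.startswith(c):
            return c, s[len(c):]
        return None
    return parse


def digit(s: str):
    """Parser that matches one ASCII digit."""
    if s and s[0] in "0123456789":
        return s[0], s[1:]
    return None


def many1(p: Parser) -> Parser:
    """Apply p one or more times and collect the results in a list."""
    def parse(s: str):
        values = []
        result = p(s)
        while result is not None:
            value, s = result
            values.append(value)
            result = p(s)
        return (values, s) if values else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(s: str):
        values = []
        for p in parsers:
            result = p(s)
            if result is None:
                return None
            value, s = result
            values.append(value)
        return values, s
    return parse


def fmap(f: Callable[[Any], Any], p: Parser) -> Parser:
    """Transform the value produced by p with f."""
    def parse(s: str):
        result = p(s)
        if result is None:
            return None
        value, rest = result
        return f(value), rest
    return parse


# Small parsers combined into bigger ones.
number = fmap(lambda digits: int("".join(digits)), many1(digit))
addition = fmap(lambda parts: parts[0] + parts[2],
                seq(number, char("+"), number))
```

With these definitions, `addition("12+30")` returns `(42, "")`, while `addition("12-30")` returns `None` because the `+` parser fails.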
Gitflow is, by far, the most popular branching model and possibly
the one that has endured the test of time the most. Introduced by
Vincent Driessen in 2010, its fundamental idea is that you should
isolate your work into different types of git branches.
Other branching strategies, such as the centralized workflow (for
those teams that come from SVN), and the forking workflow (for
open-source projects) exist. Git, as a version control system, only
details basic branching operations, and it remains controversial as
to which approach is the best. Beyond those basic branching
operations, it's a matter of opinion.
> In this article we will compare Gitflow with its newer approach,
The first time I heard of Raku was maybe a year ago. I was too busy
to look into it though. I've done
that now and BOY OH BOY, do I like this language.
PostgreSQL 9.5 introduces a new SKIP LOCKED option to SELECT ... FOR
[KEY] UPDATE|SHARE. It’s used in the same place as NOWAIT and, like
NOWAIT, affects behaviour when the tuple is locked by another
transaction.
The main utility of SKIP LOCKED is for building simple, reliable and
efficient concurrent work queues.
I have occasionally ended up with files I did not want in my git
repositories. These can both take up a lot of space, and contain
sensitive data that we just want to remove (such as MySQL dumps,
deploy keys etc).
Git keeps a history of all files, so just deleting the file doesn’t
“make it go away”. The only way to completely remove the file is to
scan through all history, removing all references to (and history
of) those files, and finally pruning the git repo (physically
removing references to what we just deleted). Finally you have to
force-push the repo changes back to the remote, overwriting the
remote.
It’s hard to create efficient algorithms without understanding the
time and space complexity of various operations. The concept of Big
O notation helps programmers understand how quickly or slowly an
algorithm will execute as the input size grows.
In this article, we’ll cover the basics of Big O notation, why it is
used and how to describe the time and space complexity of algorithms
with examples.
I am unreasonably excited about passkeys. I’ve long been looking for
a better/more convenient way than passwords to do authentication,
and I think passkeys are finally it.
However, whenever I see passkeys mentioned (for example on the
recent Tailscale post about them), there are always a lot of
misconceptions that surface in the debate. I’d like to clear up some
of them here, and hopefully explain a bit better what passkeys are.
In 2009, neurophysiologist Johanna Meijer set up an unusual
experiment in her backyard. In an ivy-tangled corner of her garden,
she and her colleagues at Leiden University in the Netherlands
placed a rodent running wheel inside an open cage and trained a
motion-detecting infrared camera on the scene. Then they put out a
dish of food pellets and chocolate crumbs to attract animals to the
wheel and waited.
Wild house mice discovered the food in short order, then scampered
into the wheel and started to run. Rats, shrews, and even frogs
found their way to the wheel—more than 12,000 animals over 3
years. The creatures seemed to relish the feeling of running without
going anywhere.
For more than a quarter of a century, people have been discussing
“Which is better, MySQL or PostgreSQL?” — with no resolution. When
people ask me which is better, I have to ask them what they want to
do and how they want to do it.
A transformer model is a neural network that learns context and thus
meaning by tracking relationships in sequential data like the words
in this sentence.
The sinister Dr Watt has an evil scheme going. He's kidnapping
beautiful young women and turning them into mannequins to sell to
local stores.
In the evening Adam, Alice, Esme, and I watched Carry on Screaming!
Adam had proposed to watch this very old movie together. I didn't
like it much and give it a 6 out of 10.
Little Emily Steiner left a church meeting late one afternoon and
strolled toward home along a lakeside path; a week later, her nude
body was discovered, bound in blaze-orange duct tape. Called by the
North Carolina authorities, forensic pathologist Kay Scarpetta
recognizes similarities to the gruesome work of a serial killer who
has long eluded the FBI. But as she tries to make sense of the
evidence, she is left with questions that lead her to the Body Farm,
a little known research facility in Tennessee where, with the help
of some grisly experiments, she might discover the answer.
It is Scarpetta alone who can interpret the forensic hieroglyphics
that eventually reveal a solution to the case as staggering as it is
horrifying. But she must also endeavor to help her niece, Lucy, who
is embroiled in controversy at Quantico. And Scarpetta, too, is
vulnerable, as she opens herself to the first physical and emotional
bond she has felt in far too long a time.
In the afternoon I started in The Body Farm, Kay Scarpetta book 5 by
Patricia Cornwell.
The Go programming language provides powerful tools for managing
concurrency, but robust asynchronous code requires us as developers
to design around uncertain tasks and manifold queues. Step through
an async codebase with us in this post!
We begin by describing the core problem of quantum communication:
the encoding (and decoding) of information. We distinguish between
the classical elementary unit of information (bit) and the quantum
elementary unit of information (qubit).
We end with a description of the core problem of quantum secrecy:
quantum key generation and distribution.
A short write-up on combining digraphs, a feature built into Vim,
and Haskell's UnicodeSyntax extension, to easily write beautiful
Haskell programs with unicode symbols.
After barely surviving his grievous wounds from his mission in
Dhaka, Bangladesh, Tyler Rake is back, and his team is ready to take
on their next mission.
In the evening Esme and I watched Extraction II. I liked the movie
more than the first one and give it an 8 out of 10.
Recently, a colleague asked me if there is a way to customize an
input caret using CSS. I knew you could change the color of it, but
it got me thinking if we could completely replace it. The problem
seemed interesting to solve.
A while ago at $work, we had a
performance issue with one of our core Python libraries.
This particular library forms the backbone of our 3D processing
pipeline. It’s a rather big and complex library which uses NumPy and
other scientific Python packages to do a wide range of mathematical
and geometrical operations.
Our system also has to work on-prem with limited CPU resources, and
while at first it performed well, as the number of concurrent
physical users grew we started running into problems and our system
struggled to keep up with the load.
We came to the conclusion that we had to make our system at least 50
times faster to handle the increased workload, and we figured that
Rust could help us achieve that.
I finally had some time to play around with Nix - the immutable
package manager and build system. This had been on my agenda for a
long time, but I finally took the plunge on my M1 OSX system. I by
no means understand Nix fully yet, but I am making progress and it is
usable to me already.
At a fundamental level, a programmer needs to manipulate
bits. Modern processors operate over data by loading in ‘registers’
and not individual bits. Thus a programmer must know how to
manipulate the bits within a register.
As a programmer, you use hash functions every day. They're used in
databases to optimise queries, they're used in data structures to
make things faster, they're used in security to keep data
safe. Almost every interaction you have with technology will involve
hash functions in one way or another.
Hash functions are foundational, and they are everywhere.
But what is a hash function, and how do they work?
It’s been nine whole years since PEP 484 landed and brought us types
from on high. This has made a lot of people very angry and been
widely regarded as a bad move. Since then, people on the internet
have been clamoring to find out: does this mean we can now compile
Python to native code for more speed?
It’s a totally reasonable question. It was one of my first questions
when I first started working on Python compilers. So can we do it?
Quite possibly the best thing about the Nix
ecosystem is that there's a small army of people hard at work
improving Nixpkgs, the largest
software package repository in existence and one of the most active
repos on GitHub, every single day. Not only
are they constantly adding brand new packages for stuff that you
might want to use—over 80,000 packages and counting!—they're also
updating existing packages, which sometimes even includes fixes for
critical security vulnerabilities.
But to take full advantage of this steady drumbeat of progress, it's
important that you follow some best practices. To help you adopt
those practices, we at Determinate Systems have created a tool called
Nix Flake Checker and we're excited to release it to the Nix
community.
For the past month or so, I’ve been experimenting with
Nits, a different approach to
NixOS deployments that is pull rather than
push-based. And as part of that effort, I needed to address how
exactly I would push NixOS system closures to the machines under
management.
Typically, in a push-based deployment approach, you can copy the
system closure directly via SSH whenever you’re deploying. But with
Nits, the agent process running on the target machine needs to be
able to connect and download the latest closure on demand, sometimes
days or weeks after the deployment was triggered.
“Use a Binary Cache”, I hear you say. And yes, I did. But instead of
spinning up an instance of Harmonia, configuring an S3 Bucket or
hitting up Domen over at Cachix, I decided to roll my own.
Hi! This is an explorable explanation of Python dictionaries. This
page is dynamic and interactive — you can plug in your data and see
how the algorithms work on it.
There has been confusion about the distinction between errors and
exceptions for a long time, repeated threads in Haskell-Cafe and more
and more packages that
handle errors and exceptions or something between. Although both
terms are related and sometimes hard to distinguish, it is important
to do it carefully.
Nix is a tool for configuring software
environments according to source files. I’ve been hearing more and
more about Nix on Hacker News and Twitter. The idea of it appeals to
me, so I’ve been tinkering with it over the past few weeks.
This article describes some principles I’ve found useful for
designing good Python library APIs, including structure, naming,
error handling, type annotations, and more. It’s a written version
of a talk I gave in June 2023 at the Christchurch Python meetup.
Search capabilities span from free text (think Google) to raw data
access (think SQL). In between, there’s a wide range of options for
narrowing a search that are often provided with UI elements. But
what if there are too many fields for a UI to search on? Search DSLs
can give a user more granular access to searching without exposing
an overly complicated interface.
In the evening I finished Cruel and Unusual, Book 4 in the Kay
Scarpetta series by Patricia Cornwell. I liked the story. Cornwell is
getting better and better with each book in the Kay Scarpetta series.
Earth failed. In a desperate bid to escape, the spaceship Enkidu and
its captain, Heorest Holt, carried its precious human cargo to a
potential new paradise. Generations later, this fragile colony has
managed to survive, eking out a hardy existence. Yet life is tough,
and much technological knowledge has been lost.
Then strangers appear. They possess unparalleled knowledge and
thrilling technology – and they've arrived from another world to
help humanity’s colonies. But not all is as it seems, and the price
of the strangers' help may be the colony itself.
In the evening I started in Children of Memory, Children of Time book
3 by Adrian Tchaikovsky.
In the acknowledgements the author gives a nod to his research sources
which include the fantastic book The Genius of Birds by Jennifer
Ackerman. I read this book several years ago while living in Mexico
and highly recommend it.
Instead of having to limit sanity checks to the boundaries of the
program, we could re-use those as function contracts using the
assert keyword. Indeed, setting PYTHONOPTIMIZE removes all assert
statements, making the checks useful in dev, and free in production.
Unfortunately, the community doesn't know about the feature, and
uses assert for things that should never be removed, so using the
flag would likely introduce bugs into your program.
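The difference can be sketched as follows. This is an illustrative example only: the functions `mean` and `withdraw` are hypothetical and not from the linked article. The first check is a contract that is safe to strip, the second must always run and so uses an explicit exception.

```python
def mean(values):
    # Contract-style check: documents an assumption about the caller.
    # It disappears when Python runs with -O or PYTHONOPTIMIZE=1.
    assert len(values) > 0, "mean() requires at least one value"
    return sum(values) / len(values)


def withdraw(balance, amount):
    # Business rule that must hold in production too, so it is NOT an
    # assert: an explicit exception survives optimized mode.
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount
```

Running a script with `python -O` (or with `PYTHONOPTIMIZE=1` set) makes `__debug__` false and skips the `assert` in `mean`, while the check in `withdraw` still raises.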
In my recent post about data archiving to removable media, I laid out
the difference between backing up and archiving, and also said I’d
evaluate git-annex and dar. This post evaluates git-annex.
This article discusses the technical details of static
initialization for map data in Go binaries, and some alternative
strategies for dealing with the performance impacts.
As his wedding day approaches, Ben heads to Miami with his
soon-to-be brother-in-law James to bring down a drug dealer who's
supplying the dealers of Atlanta with product.
In the evening Adam, Alice, Esme, and I watched Ride Along 2. I liked
the movie and give it a 7 out of 10.
Machine learning (ML) lets computers learn from data on their own
without requiring software developers to write out all the logic by
hand. Given enough data, the machine can learn useful patterns in
the data, which turns out to be quite powerful.
The earliest ML algorithms go back to the 1960s but machine
learning started being commonly used in the early 2000s. In 2012,
“deep learning” involving large neural networks became practical and
their usage and capabilities have grown exponentially since then.
The recent breakthroughs in Large Language Model (LLM) technology
are positioned to transition many areas of software. Search and
Database technologies particularly have an interesting entanglement
with LLMs. There are cases where Search improves the capabilities of
LLMs as well as where inversely, LLMs improve the capabilities of
Search. In this blog post, we will break down 5 key components of
the intersection between LLMs and Search.
Retrieval-Augmented Generation
Query Understanding
Index Construction
LLMs in Re-Ranking
Search Result Compression
We will also conclude with some thoughts on how Generative Feedback
Loops fit into this picture.
Recently I’ve been working in static analysis land and as a part of
that have been familiarizing myself with data flow analysis. I look
at a fair amount of MIR and so decided to delve into the
rustc_mir_dataflow crate to see how these things are handled in
the Rust compiler. There is a helpful introduction to this topic in
the rustc dev guide,
and this post fleshes things out a bit.
Large language models (LLMs) have powerful general capabilities out
of the box: they can answer questions, write poems and stories,
invent recipes, and write code. But they may not precisely fit your
use case. Their answers may be too vague, poorly formatted, or even
incorrect.
Fortunately, you can adapt LLMs to meet your needs. There are three
levels of LLM customization:
Prompt engineering
Embeddings via vector databases
Fine-tuning
Each level is an order of magnitude more difficult and expensive
than the previous, but offers far more customization.
Emacs is one of those magical pieces of technology that manages to
bridge the gap between being a tool that does useful work, and
becoming a deeply personal component in a software developer’s
life. It is one half of the editor war and is one of the longest
running examples of programmers elevating their personal choices to
moral imperatives. But beyond that, it is also a piece of software
that has seen decades of
iteration. It’s old enough that parts of it were contributed by
people that have long since passed. They may be gone, but their code
still lives on, making our lives just a bit easier.
Ensuring your application can handle failures and outages is
crucial, and the availability of your application is only as good as
the availability of your PostgreSQL instance. With that in mind, you
may be wondering which PostgreSQL high availability (HA) deployment
option is best for your application.
Let’s review several popular solutions that increase the high
availability of PostgreSQL deployments and, as a result, the
availability overall of your application. Why several and not one?
Well, there’s no silver bullet or one-size-fits-all solution when it
comes to high availability and PostgreSQL. So, let's walk through the
options for a highly available deployment of PostgreSQL and then you
can make a choice that fits your use case.
A Bloom filter is a standard data structure in computer science to
approximate a set. Basically, you start with a large array of bits,
all
initialized at zero. Each time you want to add an element to the
set, you compute k different hash values and you set the bits at
the k corresponding locations to one.
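The description above can be sketched in a few lines of Python. This is a minimal illustration, not the linked article's implementation: the class name, the default sizes, and the way the k hash values are derived (salting SHA-256 with an index) are all assumptions made for the example. The membership check is the usual Bloom filter query: an element may be in the set only if all k of its bits are set.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes and hashing scheme are illustrative choices."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; all start at zero.
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Derive k hash values by salting SHA-256 with the index i.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set the bits at the k corresponding locations to one.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))
```

After `bf = BloomFilter()` and `bf.add("apple")`, `bf.might_contain("apple")` is always `True`. For items that were never added the answer is usually `False`, but false positives are possible, which is why the structure only approximates a set.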
A storm rages. A young girl is kidnapped. Her mother teams up with
the mysterious woman next door to pursue the kidnapper, a journey
that tests their limits and exposes shocking secrets from their
pasts.
In the evening I watched Lou. I liked the movie and give it a 7 out
of 10.
Tig is an ncurses-based text-mode interface for git. It functions
mainly as a Git repository browser, but can also assist in staging
changes for commit at chunk level and act as a pager for output from
various Git commands.
Last week I managed to glean more insights into Apple’s continuing
evolution of their passkeys support from other places online,
though, so I decided to pull it all together. Here’s the latest news
on changes coming in iOS 17, iPadOS 17, and macOS 14 to
Apple’s passkeys experience.
The fingerprints say the murderer is the man who's just been
executed . . .
At 11.05 one December evening in Richmond, Virginia, convicted
murderer Ronnie Joe Waddell is pronounced dead in the electric
chair. At the morgue Dr Kay Scarpetta waits for Waddell's
body. Preparing to perform a post-mortem before the subject is dead
is a strange feeling, but Scarpetta has been here before. And
Waddell's death is not the only newsworthy event on this freezing
night: the grotesquely wounded body of a young boy is found propped
against a rubbish skip. To Scarpetta the two cases seem unrelated,
until she recalls that the body of Waddell's victim had been
arranged in a strikingly similar position . . .
In the evening I started in Cruel and Unusual, Book 4 in the Kay
Scarpetta series by Patricia Cornwell.