Bulk loading into PostgreSQL: Options and comparison
You have a file, possibly a huge CSV, and you want to import its
content into your database. There are lots of options to do this, but
how would you decide which one to use? More often than not, the
question is how much time the bulk load would take. I found myself
asking the same thing a few days back when I wanted to design a data
ingestion process for PostgreSQL where we needed to bulk load around
250GB of data from CSV files every 24 hours.
Let’s analyze two different communication patterns:
Polling: Service B periodically queries Service A for the current
state of the users and updates its local storage.
Event driven: Service A publishes in a queue every time user
information is updated. Service B consumes the updates to stay up
Using TLA+ we are
going to model the two different patterns to see how well they fit
our requirements. In our specification, we are also going to take
into account unexpected failures that can affect the system and show
how to model them in TLA+.
Haskell Object Observation Debugger (HOOD) is a small post-mortem
debugger for the lazy functional language
Haskell. It is based on the concept of
observation of intermediate data structures, rather than the more
traditional stepping and variable examination paradigm used by
imperative language debuggers.
Shell literacy is one of the most important skills you ought to
possess as a programmer. The Unix shell is one of the most powerful
ideas ever put to code, and should be second nature to you as a
programmer. No other tool is nearly as effective at commanding your
computer to perform complex tasks quickly — or at storing them as
scripts you can use later.
The regex literals optimization avoids running the regex engine on
parts of the input text that cannot possibly ever match the regex.
An example of a regex this can be applied to is \w+@\w+\.\w+, where
the algorithm quickly finds the first @, then matches \w+ backwards
to find the start of the match, and then matches \w+\.\w+ forward to
find the end of the match. It then finds the second @, starting from
the end of the previous match, and so on. This is a fairly naive
(and incorrect) implementation, but it gives the idea of how it
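
The naive procedure described in this excerpt can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real regex engine implements it: the function name find_emails and the helper is_word are made up for this sketch, and is_word only approximates \w.

```python
import re

# Forward part of the pattern, matched right after the '@' literal.
TAIL = re.compile(r"\w+\.\w+")


def is_word(ch: str) -> bool:
    """Rough stand-in for \\w: letters, digits and underscore."""
    return ch.isalnum() or ch == "_"


def find_emails(text: str) -> list:
    """Naive literal-first search for \\w+@\\w+\\.\\w+ (hypothetical sketch)."""
    matches = []
    pos = 0  # end of the previous match; never scan before it
    while True:
        at = text.find("@", pos)  # cheap literal scan, no regex engine
        if at == -1:
            break
        # Match \w+ backwards from the '@' to find the start.
        start = at
        while start > pos and is_word(text[start - 1]):
            start -= 1
        # Match \w+\.\w+ forwards from the '@' to find the end.
        tail = TAIL.match(text, at + 1)
        if start < at and tail:
            matches.append(text[start:tail.end()])
            pos = tail.end()
        else:
            pos = at + 1  # this '@' cannot be part of a match
    return matches


sample = "mail a@b.co or x@@y, then foo_1@bar.org!"
print(find_emails(sample))
```

On this sample the sketch agrees with re.findall(r"\w+@\w+\.\w+", sample). The regex engine only runs on the short stretch after each @, which is the point of the optimization.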
We've been focusing on the second step for quite a while. In part
we've looked at the evaluation loop, a place where Python bytecode
gets executed. And in part
we've studied how the VM executes the instructions that are used to
implement variables. What we haven't covered yet is how the VM
actually computes something. We postponed this question because to
answer it, we first need to understand how the most fundamental part
of the language works. Today, we'll study the Python object system.
Our Christmas tree
In the evening we decorated the Christmas tree that was delivered to
our house by the end of the afternoon. Because the tree was quite
large, we had to move our furniture around to make space.
Cameras and Lenses
Pictures have always been a meaningful part of the human
experience. From the first cave drawings, to sketches and paintings,
to modern photography, we’ve mastered the art of recording what we
Cameras and the lenses inside them may seem a little mystifying. In
this blog post I’d like to explain not only how they work, but also
how adjusting a few tunable parameters can produce fairly different
I am writing this post from my new 13” MacBook Pro with an Apple
M1 Chip. If you’ve been following the five year long
then this might come as a sudden surprise. I will be honest, it
comes as a surprise to me too. The decision was a bit impulsive, but
my dev environment was blocking me, and time and patience were
luxuries I could not afford. I’m living that aluminium utopia dongle
life now and will stick to this for the foreseeable future.
’Tis the time of the year again when doors are
opened, so I
thought, “Well, would be nice, if we could open some doors with CSS
only.” And lo and behold, modern CSS is equipped with everything
In the morning I finished The Law of
by Michael Connelly. What a great read! Excellent, highly recommended.
In the afternoon the 2 for 1 FlexiStands™ I
had ordered on the 1st of December 2020 arrived. One for Alice
and one for me.
Pixelmator Pro 2.0
Today, when I opened Pixelmator I got a splash screen with a special
offer: Pixelmator Pro with a 50%
I couldn't resist and I bought the Pro version of this program
The Saints of Salvation
Humanity is struggling to hold out against a hostile takeover by an
alien race that claims to be on a religious mission to bring all
sentient life to its God at the End of Time. But while billions of
cocooned humans fill the holds of the Olyix’s deadly arkships,
humankind is playing an even longer game than the aliens may have
anticipated. From an ultra-secret spy mission to one of the grandest
battles ever seen, no strategy is off the table. Will a plan
millennia in the making finally be enough to defeat this seemingly
unstoppable enemy? And what secrets are the Olyix truly hiding in
their most zealously protected stronghold?
In the evening I started in The Saints of
book 3 in the Salvation Sequence by Peter F. Hamilton. I liked the
previous 2 books a lot so I have high expectations of the third and
final book in the series.
The mythical “fast” web page
Web performance can mean a lot of different things to a lot of
different people. Fundamentally, it’s a question of how fast a web
page is. But fast to whom?
When this page loaded moments ago, was it fast? If so,
congratulations, you had a fast experience. So ask yourself, does
that make this a fast page? Not so fast! Just because you had a fast
experience doesn’t mean everyone else does too. You might even
revisit this page and have yourself a slow experience.
vipe allows you to run your editor in the middle of a unix pipeline
and edit the data that is being piped between programs. Your editor
will have the full data being piped from command1 loaded into it,
and when you close it, that data will be piped into command2.
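
The core of that behaviour can be approximated in Python as a rough sketch. It is not vipe's actual implementation: the function name edit_between is hypothetical, the editor command is passed in explicitly, and the terminal handling a real tool needs when stdin and stdout are pipes is left out.

```python
import os
import subprocess
import tempfile


def edit_between(data: str, editor_cmd: list) -> str:
    """Save piped data to a temp file, run an editor on it, return the result.

    editor_cmd is an assumption of this sketch, e.g. ["vi"] or ["nano"];
    the temp file path is appended as the last argument.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)  # what command1 produced
        # Blocks until the editor process exits.
        subprocess.run(editor_cmd + [path], check=True)
        with open(path) as handle:
            return handle.read()  # what command2 will receive
    finally:
        os.unlink(path)
```

A wrapper script would read sys.stdin into data, call edit_between, and write the returned text to sys.stdout.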
Today I noticed that the female Aphonopelma seemanni I keep was in a
death curl; it was either dying or already dead. The day before I had
noticed it was leaking hemolymph from the top of its abdomen close to
the pedicel. I couldn't see any damage and had no idea why this was
When I inspected my other tarantulas I noticed another loss: the
Caribena versicolor I keep was also in a death curl. That's two in a
single day 😢.
Restoring individual postgres table
PostgreSQL allows restoration of individual tables from dump files
which can be used, for instance, to query a particular table for
retrieving data from a previous state in time, say, for
investigating a bug, or recovering accidentally deleted data.
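
As a minimal sketch of what such a restore might look like, the snippet below builds a pg_restore command line limited to one table. It assumes the dump was created in a non-plain-text format (custom, directory or tar), since pg_restore does not read plain SQL dumps. The file, table and database names are hypothetical, and the command is only printed, not executed.

```python
import shlex


def pg_restore_table_command(dump_path: str, table: str, dbname: str) -> list:
    """Build (but do not run) a pg_restore call for a single table."""
    return [
        "pg_restore",
        f"--dbname={dbname}",  # target database, ideally a scratch one
        f"--table={table}",    # restore only this table
        dump_path,
    ]


# Hypothetical names, for illustration only.
command = pg_restore_table_command("backup.dump", "users", "scratch")
print(shlex.join(command))
```

Restoring into a separate scratch database keeps the old state of the table queryable without touching live data. Note that pg_restore with --table does not restore other objects the table may depend on.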
Secret Santa is a
traditional Christmas gift exchanging scheme in which each member of
a group is randomly and anonymously assigned another member to give
a Christmas gift to (usually by drawing names from a container). It
is not valid for a person to be assigned to themself (if someone
were to draw their own name, for example, all the names should be
returned to the jar and the drawing process restarted).
Given a group of a certain size, how many different ways are there
to make valid assignments? What is the probability that at least one
person will draw their own name? What is the probability that two
people will draw each other’s names? What is a good way to have a
computer make the assignments while guaranteeing they are generated
with equal probability among all possible assignments?
It turns out that these questions about Secret Santa present good
motivation for exploring some of the fundamental concepts in
combinatorics (the math of counting). In the sections below we will
take a look at a bit of that math and the algorithms that allow us to
answer the questions we posed above. The final section presents a
program that allows
generating and anonymously sending Secret Santa assignments via
email so that we no longer need to go through the tedious ordeal of
drawing names from a hat.
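
Some of these questions can be explored with a short Python sketch. It is not the program the article presents. Valid assignments are permutations with no fixed points (derangements); the sketch counts them with the standard recurrence D(n) = (n - 1) * (D(n - 1) + D(n - 2)) and generates one by redrawing until nobody has their own name, as in the jar procedure. All function names are made up for this illustration.

```python
import math
import random


def count_derangements(n: int) -> int:
    """Number of ways n people can be assigned with nobody getting themselves."""
    if n == 0:
        return 1
    if n == 1:
        return 0
    two_back, one_back = 1, 0  # D(0), D(1)
    for k in range(2, n + 1):
        two_back, one_back = one_back, (k - 1) * (one_back + two_back)
    return one_back


def prob_someone_draws_self(n: int) -> float:
    """Probability that a uniformly random draw has at least one self-assignment."""
    return 1 - count_derangements(n) / math.factorial(n)


def random_assignment(names: list, rng: random.Random) -> dict:
    """Redraw until valid; uniform over all valid assignments (needs 2+ names)."""
    if len(names) < 2:
        raise ValueError("need at least two participants")
    receivers = list(names)
    while True:
        rng.shuffle(receivers)  # uniform random permutation
        if all(giver != receiver for giver, receiver in zip(names, receivers)):
            return dict(zip(names, receivers))


print(count_derangements(5))       # 44
print(prob_someone_draws_self(4))  # 0.625
print(random_assignment(["Ann", "Bob", "Cy", "Di"], random.Random()))
```

Because every permutation is equally likely before the check, rejecting the invalid ones leaves each valid assignment equally likely. The chance that a draw is accepted approaches 1/e, so on average fewer than three draws are needed.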
The other day I was looking at a nested map/reduce/filter
constellation with a lot of nesting, and therefore a lot of
closures. A colleague had an interesting question: "In PHP,
usually we can tell the interpreter that a function is relying on
something from outside of the function with the use keyword, so
e.g. we could tell at one level of the nesting that a function is not
only relying on its input, but something from the outside
Fuzz testing is a well-known technique for uncovering programming
errors. Many of these detectable errors have serious security
implications. Google has found thousands of security vulnerabilities
and other bugs using this technique. Fuzzing is traditionally used
on native languages such as C or C++, but last year, we built a new
Python fuzzing engine. Today, we’re releasing the Atheris fuzzing
engine as open
Mastering the Terminal to Improve Development Speed
When I was new to programming, there was nothing more impressive
than watching an expert navigate around a terminal. They could be
doing something as simple as editing a text file, but from the
outside perspective, it was awe-inspiring. A wizard at the keys,
churning out lines of code without the need to even glance at their
mouse. Fast forward several years, and I have slowly acquired the
art of the terminal. In this post, I will share several techniques
that can be used to speed up development processes. We will
specifically cover topics such as grep, tmux, aliasing, and several
others. Let’s get started!