versus Data Science” would be a better title for this book. While
that one word sums up how many people view the language, we hope we
useful. We wrote this book with scientists and engineers in mind,
but we hope that these lessons will also help
librarians, digital humanists, and everyone else who uses computing
in their research.
Git was created by Linus Torvalds out of necessity. At the time, the
Linux kernel team was using a proprietary Distributed Source Control
Management (DSCM) system. Because of licensing issues, the team
could no longer use it, so Linus decided to build Git as the DSCM
system he had always wished they had.
You might also know Linus as the creator of Linux. In fact, Linus
still manages the Linux Kernel today. As of 2020, the Linux kernel
had over 27.8 million lines of code spread across ~66 thousand files
from ~21 thousand different contributors. It has continued to be
successfully developed, maintained, and extended since it was
publicly announced in 1991. It is also worth noting that Git itself,
another large, successful open source project, is managed in the
So there must be some useful insights around long term software
development and maintenance practices we can glean by looking at how
Linus and his team use Git for their development and peer review
Storing timestamps instead of booleans, however, is one of those
things where I can go out on a limb and say it doesn’t really depend
all that much. You might as well timestamp it. There are plenty of times
in my career when I’ve stored a boolean and later wished I’d had a
timestamp. There are zero times when I’ve stored a timestamp and
regretted that decision.
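As a minimal sketch of the idea, assuming a hypothetical `Account` record (the class and field names below are illustrative, not from any particular codebase), the boolean becomes a nullable timestamp and the yes/no answer is derived from it:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Account:
    """Hypothetical record used only to illustrate the pattern."""
    name: str
    # Instead of `is_verified: bool`, record *when* it happened.
    # None means "not verified yet".
    verified_at: Optional[datetime] = None

    @property
    def is_verified(self) -> bool:
        # The boolean can always be recovered from the timestamp.
        return self.verified_at is not None

    def verify(self, now: Optional[datetime] = None) -> None:
        # Allow the caller to pass a time (handy in tests); default to UTC now.
        self.verified_at = now or datetime.now(timezone.utc)
```

The boolean is still available through `is_verified`, but the record now also answers "since when?", which a plain flag never could.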
Writing Good Unit Tests; Don’t Mock Database Connections
Unit tests are unbelievably important to us as developers because
they allow us to demonstrate the correctness of the code we’ve
written. More importantly, unit tests allow us to make updates to
our code base with the confidence that we haven’t broken
anything. In our eagerness to get 100% code coverage, however, we
often write tests for logic that perhaps we have no business
testing. I’m here to assert that creating mock database abstractions
in order to write unit tests is a bad idea almost all of the time.
Most deep neural networks are trained by stochastic gradient
descent. Now “stochastic” is a fancy Greek word for “random”; it
means that the training data are fed into the model in random order.
So what happens if the bad guys can cause the order to be not
random? You guessed it – all bets are off. Suppose, for example, a
company or a country wanted to have a credit-scoring system that’s
secretly sexist, but still be able to pretend that its training was
actually fair. Well, they could assemble a set of financial data
that was representative of the whole population, but start the
model’s training on ten rich men and ten poor women drawn from that
set – then let initialisation bias do the rest of the work.
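To make the role of ordering concrete, here is a toy sketch of where the "random order" decision lives in a training loop. Everything in it is an assumption for illustration: the one-parameter model, the `order_fn` hook, and the data. The toy data is noiseless and far too simple to exhibit the bias described above; the point is only that whoever controls `order_fn` controls what the model sees first.

```python
import random


def sgd_pass(w, data, order, lr=0.1):
    """One pass of SGD for the 1-D model y ~ w * x with squared-error loss."""
    for i in order:
        x, y = data[i]
        grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
        w -= lr * grad
    return w


def train(data, epochs=20, seed=0, order_fn=None):
    """Train from w = 0. By default the order is reshuffled every epoch."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        order = list(range(len(data)))
        if order_fn is None:
            rng.shuffle(order)       # the "stochastic" part
        else:
            order = order_fn(order)  # hypothetical hook an attacker could control
        w = sgd_pass(w, data, order)
    return w


# Illustrative data generated from y = 2 * x.
toy_data = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0)]
learned_w = train(toy_data)
```

An honest pipeline leaves `order_fn` unset; a manipulated one could pass a function that always puts hand-picked examples first.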
AVIF (AV1 Image File Format) is a royalty-free image format that
generally compresses better than other
popular alternatives (JPEG, PNG, WebP). With Chrome 85+ and Firefox
86+ (behind a feature flag) the browser support is getting better,
so it is now worth including AVIF images on web pages.
Complete Guide to Generative Adversarial Networks (GANs)
Advances in machine learning, deep learning, and neural networks
have led to a revolutionary era. The ability to create and replicate
photos, text, and images from only a collection of examples is
shocking to some and marvelous to others.
We are now at a point where technology is so advanced that deep
learning and neural networks can even generate realistic human faces
from scratch. The faces generated do not belong to any person, alive
or dead, yet they are astoundingly realistic.
One special deep learning network we have to thank for these
achievements is the Generative Adversarial Network (GAN), which is
the topic of this article. Let's briefly explore our table of
contents to understand the main topics we'll cover.
Iterated Local Search is a stochastic global optimization algorithm.
It involves the repeated application of a local search algorithm to
modified versions of a good solution found previously. In this way,
it is like a clever version of the stochastic hill climbing with
random restarts algorithm.
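The description above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the objective function, step sizes, bounds, and iteration counts are all arbitrary choices made for the example.

```python
import random


def objective(x):
    # Simple 1-D test function with its minimum at x = 0 (illustrative only).
    return x * x


def hill_climb(start, rng, step=0.1, iters=200):
    """Stochastic hill climbing: accept a nearby candidate if it is no worse."""
    best, best_val = start, objective(start)
    for _ in range(iters):
        cand = best + rng.uniform(-step, step)
        val = objective(cand)
        if val <= best_val:
            best, best_val = cand, val
    return best, best_val


def iterated_local_search(rng, restarts=20, perturb=1.0, bounds=(-5.0, 5.0)):
    """Restart the local search from perturbed copies of the best solution."""
    best, best_val = hill_climb(rng.uniform(*bounds), rng)
    for _ in range(restarts):
        # Unlike a purely random restart, the new start stays near the best point.
        start = best + rng.uniform(-perturb, perturb)
        cand, val = hill_climb(start, rng)
        if val < best_val:
            best, best_val = cand, val
    return best, best_val


best_x, best_value = iterated_local_search(random.Random(42))
```

The only difference from hill climbing with random restarts is the line that computes `start`: it perturbs the best solution found so far instead of sampling a fresh point from the whole search space.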
Simple Python Profiling with the @profile Decorator
A way of analyzing the performance of any given function or program
is to use profilers during its execution. Profilers can help us
understand timing, memory usage, and other pertinent information
about code. The key to using profilers is to determine which portions
of the code are slow or computationally expensive, which in turn
shows where optimization effort should go.
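As a rough sketch of what such a decorator can look like, here is a hand-rolled `profile` built on the standard library's `cProfile` and `pstats` modules. It is an assumption for illustration and not necessarily the decorator this article has in mind; the `last_report` attribute and the `slow_sum` function are made up for the example.

```python
import cProfile
import functools
import io
import pstats


def profile(func):
    """Hypothetical decorator: profile every call to func with cProfile."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        profiler.enable()
        try:
            return func(*args, **kwargs)
        finally:
            profiler.disable()
            # Render the ten most expensive entries by cumulative time.
            buffer = io.StringIO()
            stats = pstats.Stats(profiler, stream=buffer)
            stats.sort_stats("cumulative").print_stats(10)
            wrapper.last_report = buffer.getvalue()

    wrapper.last_report = ""
    return wrapper


@profile
def slow_sum(n):
    # Deliberately naive work to give the profiler something to measure.
    return sum(i * i for i in range(n))
```

Calling `slow_sum(100_000)` returns the sum as usual, and `print(slow_sum.last_report)` then shows the timing table for that call.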
You know that question you can get asked casually by a person you’ve
never met before or even by someone you’ve known for a long time but
haven’t really talked to about this before. Perhaps at a social
event. Perhaps at a family dinner.
Welcome to the world of Git. I hope this document will help to
advance your understanding of this powerful content tracking system,
and reveal a bit of the simplicity underlying it — however dizzying
its array of options may seem from the outside.
In this article, we’ll develop a Haskell library for continued
fractions. Continued fractions are a different representation for
real numbers, besides the fractions and decimals we all learned
about in grade school. In the process, we’ll build correct and
performant software using ideas that are central to the Haskell
programming language community: equational reasoning, property
testing, and term rewriting.
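To give a feel for the representation before any library design, here is a small sketch for rational numbers only. It is written in Python rather than Haskell and is not the library developed in this article; the function names `to_terms` and `from_terms` are made up for the example.

```python
from fractions import Fraction
from typing import List


def to_terms(x: Fraction) -> List[int]:
    """Continued fraction terms [a0; a1, a2, ...] of a rational number."""
    terms = []
    while True:
        whole = x.numerator // x.denominator  # integer part (floor)
        terms.append(whole)
        frac = x - whole
        if frac == 0:
            return terms
        x = 1 / frac  # continue with the reciprocal of the fractional part


def from_terms(terms: List[int]) -> Fraction:
    """Rebuild the rational number from its continued fraction terms."""
    value = Fraction(terms[-1])
    for term in reversed(terms[:-1]):
        value = term + 1 / value
    return value


terms = to_terms(Fraction(415, 93))  # [4, 2, 6, 7]
```

Here 415/93 = 4 + 1/(2 + 1/(6 + 1/7)), so the terms are 4, 2, 6, 7. Rational numbers always give a finite list; irrational numbers need an infinite one, which is where a lazy language becomes attractive.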
For better or worse, extruded text was a staple of the mid-90s
desktop publishing design landscape. It was rare for a party
invitation or gaming fanzine not to be blessed by this 3D text
effect on its way out the printer.
As design was increasingly destined for screen rather than page,
extrusion fell out of favour. But recently, I've noticed a quiet
Accessibility is a critical skill for developers doing work at any
point in the stack. For front-end tasks, modern CSS provides
capabilities we can leverage to make layouts more accessibly
inclusive for users of all abilities across any device.
Usually when I start a new project, I either copy the HTML structure
of the last site I built or I head over to HTML5 Boilerplate and
copy their boilerplate. Recently I didn’t start a new project, but I
had to document the structure we use at work for the sites we build.
So simply copying and pasting wasn’t an option; I had to understand
the choices that have been made. Since I spent quite some time
researching and putting the structure together, I decided to share
it with you.