In this post, I’d like to re-introduce lock-free programming, first
by defining it, then by distilling most of the information down to a
few key concepts. I’ll show how those concepts relate to one another
using flowcharts, then we’ll dip our toes into the details a little
bit. At a minimum, any programmer who dives into lock-free
programming should already understand how to write correct
multithreaded code using mutexes, and other high-level
synchronization objects such as semaphores and events.
Welcome to the third post in our series on Python at scale at
Instagram! As we mentioned in the first post in the series,
Instagram Server is a several-million-line Python monolith, and it
moves quickly: hundreds of commits each day, deployed to production
every few minutes.
We’ve run into a few pain points working with Python at that scale
and speed. This article takes a look at a few that we imagine might
impact others as well.
Bayesian Decision Theory is the statistical approach to pattern
classification. It leverages probability to make classifications,
and measures the risk (i.e. cost) of assigning an input to a given
class.
In this article we'll start by taking a look at prior probability,
and how it is not an efficient way of making predictions. Bayesian
Decision Theory makes better predictions by using the prior
probability, likelihood probability, and evidence to calculate the
posterior probability. We'll discuss all of these concepts in
detail. Finally, we'll map these concepts from Bayesian Decision
Theory to their context in machine learning.
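As a rough illustration of how the prior, likelihood, and evidence combine into a posterior, here is a minimal Python sketch. The class names and probability values are made up for the example and do not come from the article.

```python
def posterior(priors, likelihoods):
    """Return P(class | x) for every class using Bayes' rule.

    priors:      dict mapping class -> P(class)
    likelihoods: dict mapping class -> P(x | class) for one observed input x
    """
    # Evidence P(x): total probability of the observation over all classes.
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


# Hypothetical numbers: "ham" is more common a priori,
# but the observed input is far more likely under "spam".
priors = {"spam": 0.3, "ham": 0.7}
likelihoods = {"spam": 0.8, "ham": 0.1}

post = posterior(priors, likelihoods)
prediction = max(post, key=post.get)  # class with the highest posterior
```

With these assumed values, predicting from the prior alone would always pick "ham", while the posterior favors "spam" once the likelihood of the observed input is taken into account.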
Git has a reputation for being confusing. Users
stumble over terminology and phrasing that misguides their
expectations. This is most apparent in commands that “rewrite
history” such as git cherry-pick or git rebase. In my experience,
the root cause of this confusion is an interpretation of commits as
diffs that can be shuffled around. However, commits are snapshots,
not diffs!
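One way to picture the distinction is a toy model in Python where each commit is a full mapping of paths to contents and a diff is something computed from two such snapshots on demand. This is only a sketch of the idea: the `diff` helper and the file contents are hypothetical, and Git's real object format is considerably more involved.

```python
def diff(parent, child):
    """Derive the changes between two snapshots (dicts of path -> content)."""
    changes = {}
    for path in sorted(parent.keys() | child.keys()):
        before, after = parent.get(path), child.get(path)
        if before != after:
            # None means the file is absent from that snapshot.
            changes[path] = (before, after)
    return changes


# Each "commit" stores the whole tree, not a delta against its parent.
commit_a = {"README": "hello\n", "main.py": "print(1)\n"}
commit_b = {"README": "hello\n", "main.py": "print(2)\n", "LICENSE": "MIT\n"}

changes = diff(commit_a, commit_b)
```

In this model the diff is never stored; it is recomputed from whichever two snapshots you choose to compare.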
DRY, or Don't Repeat Yourself, is frequently touted as a principle
of software development. "Copy-pasta" is the derisive term applied to
a violation of it, tying together the concept of copying code and
pasta as a description of bad software development practices (see also
spaghetti code).
It is so uniformly reviled that some people call DRY a "principle"
that you should never violate. Indeed, some linters even detect
copy-paste so that it can never sneak into the code. But copy-paste
is not a comic-book villain, and DRY does not come bedecked in
primary colors to defeat it.
It is worthwhile to know why DRY started out as a principle. In
particular, for some modern software development practices,
violating DRY is the right thing to do.
Developers spend most of their time reading code, understanding it
and exploring other ways to use existing solutions. Frankly, in our
profession, very little time is spent actually writing new
libraries and creating new interfaces in real-life development. So
it is quite important to have some help with the most common
activities. Naming conventions are one such help: they improve
readability and lower the cost of usage when they are agreed upon
and widely adopted.
Some languages have their own special naming conventions that make
sense. Haskell is among them. There are a bunch of naming patterns
that are commonly used everywhere in the ecosystem (including the
standard libraries) that may help you to recognise the function’s
meaning without looking at its documentation and even its type! This
ability is especially relevant because naming is one of the hardest
development problems, so having some help and no-brainer rules to
guide in this area improves everyone’s life.
In this post, we will explore common naming conventions in Haskell
together. It is going to be useful for both creators (library and
API developers) and consumers (library users), as it establishes
norms accepted in the libraries’ APIs.
Bash scripts. Almost anyone needs to write one sooner or
later. Almost no one says “yeah, I love writing them”. And that’s
why almost everyone pays little attention while writing them.
I won’t try to make you a Bash expert (since I’m not one either),
but I will show you a minimal template that will make your scripts
safer. You don’t need to thank me, your future self will thank you.
Mypy is an optional static type checker
for Python. It's been around since 2012 and has been gaining traction
ever since. One of the main benefits of using a type checker is getting
errors at "compile time" rather than at run time.
Exhaustiveness checking is a common feature of type checkers, and a
very useful one! In this article I'm going to show you how you can
get mypy to perform exhaustiveness checking!
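One common way to get this behavior, sketched here as an assumption rather than as the article's own method, is an `assert_never` helper typed with `NoReturn`. The `OrderStatus` enum and `describe` function are hypothetical names used only for illustration.

```python
import enum
from typing import NoReturn


def assert_never(value: NoReturn) -> NoReturn:
    # At run time this is reached only if a case was missed.
    # Statically, mypy rejects the call unless the argument's type
    # has been narrowed down to "no possible values".
    raise AssertionError(f"Unhandled value: {value!r}")


class OrderStatus(enum.Enum):
    READY = "ready"
    SHIPPED = "shipped"


def describe(status: OrderStatus) -> str:
    if status is OrderStatus.READY:
        return "ready to ship"
    elif status is OrderStatus.SHIPPED:
        return "on its way"
    else:
        # If a new member is added to OrderStatus without a branch above,
        # mypy reports an incompatible argument type on this line.
        assert_never(status)
```

On Python 3.11 and later, `typing.assert_never` provides the same helper in the standard library, so the hand-written version is only needed on older versions.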
Today Apple officially released their new image format for creative
professionals, Apple ProRAW. It marks a monumental leap forward in
digital imaging on iPhone and I can’t wait to share a bit more about
it.
I’ll cover why ProRAW matters, how to shoot ProRAW, and some of the
best tools and apps for your iPhone ProRAW workflow.
If you follow my work, you know I’m a travel photographer and I’m
usually testing my camera gear in extreme environments, so I
designed a few tests for ProRAW in this same vein and that’s where I
saw this image format really shine.
Web crawling is a powerful technique to collect data from the web by
finding all the URLs for one or multiple domains. Python has several
popular web crawling libraries and frameworks.
In this article, we will first introduce different crawling
strategies and use cases. Then we will build a simple web crawler
from scratch in Python using two libraries: requests and Beautiful
Soup. Next, we will see why it’s better to use a web crawling
framework like Scrapy. Finally, we will build an example crawler
with Scrapy to collect film metadata from IMDb and see how Scrapy
scales to websites with several million pages.
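To give a feel for the core step of a from-scratch crawler, here is a small sketch of link extraction restricted to a single domain. It deliberately swaps in Python's standard-library `html.parser` in place of requests and Beautiful Soup so that it runs without extra dependencies; the `LinkCollector` class, the `extract_links` helper, and the example URLs are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return the absolute links in html that stay on base_url's domain."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    domain = urlparse(base_url).netloc
    return [url for url in parser.links if urlparse(url).netloc == domain]


page = (
    '<a href="/about">About</a> '
    '<a href="https://other.example/x">Elsewhere</a> '
    '<a href="team.html">Team</a>'
)
links = extract_links(page, "https://example.com/index.html")
```

A full crawler would download each page, feed it through a function like this, and push unseen links onto a queue of URLs to visit.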
I've been writing JSON-REST APIs in Python for a number of years,
and over that time I've found the tooling has greatly improved. To
show how you can benefit, I'm going to walk through how my approach
has evolved. However, if you want to skip to the tooling I use today,
take a look at Quart-Schema.
I first started writing Haskell about 15 years ago. My learning
curve for the language was haphazard at best. In many cases, I
learnt concepts by osmosis, and only later learned the proper
terminology and details around them. One of the prime examples of
this is pattern matching. Using a case expression in Haskell, or a
match expression in Rust, always felt natural. But it took years
to realize that patterns appeared in other parts of the languages
than just these expressions, and what terms like irrefutable
meant.
It's quite possible most Haskellers and Rustaceans will consider
this content obvious. But maybe there are a few others like me out
there who never had a chance to realize how ubiquitous patterns are
in these languages. This post may also be a fun glimpse into either
Haskell or Rust if you're only familiar with one of the languages.