Principal components analysis (PCA) is one of a family of techniques
for taking high-dimensional data, and using the dependencies between
the variables to represent it in a more tractable, lower-dimensional
form, without losing too much information. It has been widely used
for data compression and de-noising. However, its entire
mathematical process is sometimes ambiguous to the user.
In this article, I would like to discuss the entire process of PCA
mathematically, including PCA projection and reconstruction, with
most of the derivations and proofs provided. At the end of the
article, I implement PCA projection and reconstruction from
scratch. After reading this article, there should be no more black
boxes in PCA.
True or not, strong password hashing is crucial for a large
ecosystem like the WordPress one, which has always been a juicy
target for hackers. So, I decided to take a closer look at the
hashing system and try to crack WordPress hashes from scratch!
In this multi-part series of articles about Kubernetes, I'll try and
capture what I think everyone who wants to learn and work with
Kubernetes should know about.
When confronting a new data science problem, one of the first
questions to ask is which technology to use. There is hype; there
are standard tools; there are bleeding-edge technologies, entire
platforms and off-the-shelf solutions.
In the last year, the Haskell language and associated technology
have developed into the most mature ecosystem we’ve seen
to date, with innovation happening left and right across a variety
of fronts. Editor tooling, for instance, is reaching levels of
maturity we only dreamed about years ago. Simultaneously, there has
been a fair bit of discussion about the economics of the Haskell
ecosystem and the confounding factors that have led to its potential
stagnation. Most recently, for instance, there have been discussions
about “Simple Haskell” as a set of best practices to spur more
successful industry projects.
I’ve been toying around with some ideas for how to use custom
properties (aka CSS variables) for global settings in a project. The
idea is to provide control to designers/developers over consistent
styles across multiple components.
One of the interesting facts about writing your own parser
combinator library is that you will learn (or consolidate) other
knowledge in the process, such as Functors, Applicatives and, of
course, Monads, and more generally, how to design a DSL in Haskell.
So, how do you test how your app responds to a web request without
making an actual request and returning a response? One approach is
to mock the requests and responses.
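As a rough illustration of that approach, here is a minimal sketch using only Python's standard library. The function `fetch_title`, the helper `fake_urlopen`, and the URL are hypothetical names made up for this example; they do not come from the linked article, which may use a different library entirely.

```python
import json
import urllib.request
from unittest import mock


def fetch_title(url):
    """Return the "title" field of the JSON document served at url."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)["title"]


def fake_urlopen(body):
    """Build a stand-in for urlopen that yields a canned JSON body."""
    response = mock.MagicMock()
    response.read.return_value = json.dumps(body).encode("utf-8")
    # urlopen is used as a context manager, so __enter__ must hand
    # back the same fake response object.
    response.__enter__.return_value = response
    return mock.Mock(return_value=response)


def title_with_mocked_request():
    """Call fetch_title without touching the network."""
    opener = fake_urlopen({"title": "hello"})
    with mock.patch("urllib.request.urlopen", opener):
        title = fetch_title("https://example.invalid/item")
    # Also report which URL the code under test asked for.
    return title, opener.call_args[0][0]
```

The test never opens a socket: the patched `urlopen` records the call and returns the canned response, so both the parsed result and the requested URL can be checked.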
This is the first in a series of blog posts intended to provide a
gentle introduction to flakes, a new Nix feature that improves
reproducibility, composability and usability in the Nix ecosystem.
This blog post describes why flakes were introduced, and gives a
short tutorial on how to use them.
Deep Learning is all about linear algebra and calculus. If you try
to read any deep learning paper, matrix calculus is a necessary
component for understanding the concepts.
This article series is a guide to modern Python tooling with a focus
on simplicity and minimalism. It walks you through the creation of
a complete and up-to-date Python project structure, with unit tests,
static analysis, type-checking, documentation, and continuous
integration and delivery.
In this article, let's look at some of the ways to do batch HTTP
requests in Python and some of the tools at our disposal. Mainly,
we'll look at the following ways:
With the release of Python 3.9.0b1, the first of four planned betas
for the development
cycle, Python 3.9 is now feature-complete. There is still plenty to
do in terms of testing and stabilization before the October final
release. The release announcement lists a half-dozen Python
Enhancement Proposals (PEPs) that were accepted for 3.9. We have
looked at some of those PEPs along the way; there are some updates
on those. It seems like a good time to fill in some of the gaps on
what will be coming in Python 3.9.
Today I was finally able to feed the Caribena versicolor sling that
has been in my care since the first of this month. I held a small
mealworm, Tenebrio molitor, with tweezers close to it, and it
"jumped" on the small prey item. In the past the small tarantula had
refused food items of the same size, no idea why. Maybe still getting
used to its enclosure.
I was also finally able to feed a second instar Chaerilus
sp. "Java", a very small scorpion that I have been keeping since the
7th of April 2020. While it has small springtails in its enclosure,
which maybe it eats, I prefer to
actually see it eat. So in the late afternoon I managed, after a
few attempts, to start it accepting and eating a very tiny mealworm
larva.
In this blog post we’ll build a CLI application in Go, which we’ll
call go-grab-xkcd. This application fetches comics from XKCD and
provides you with various options through command-line arguments.
Among the many use cases Python covers, data analytics has become
perhaps the biggest and most significant. The Python ecosystem is
loaded with libraries, tools, and applications that make the work of
scientific computing and data analysis fast and convenient.
But for the developers behind the Julia language, aimed specifically
at “scientific computing, machine learning, data mining, large-scale
linear algebra, distributed and parallel computing”, Python isn’t
fast or convenient enough. Python represents a trade-off, good for
some parts of data analytics work but terrible for others.
Piping is one of the core concepts of Linux & Unix based operating
systems. Pipes allow you to chain together commands in a very
elegant way, passing output from one program to the input of another
to get a desired end result.
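The same idea can be sketched from Python with the standard `subprocess` module. This is only an illustration under stated assumptions: the function name `pipe_lines_through_sort` is hypothetical, and both "programs" are small Python one-liners started via `sys.executable` so that the example does not depend on any particular shell command being installed.

```python
import subprocess
import sys


def pipe_lines_through_sort(lines):
    """Chain two processes: one prints the lines, the other sorts them."""
    text = "\n".join(lines) + "\n"
    # First program: writes its argument to standard output.
    producer = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.argv[1])", text],
        stdout=subprocess.PIPE,
    )
    # Second program: reads standard input, writes the sorted lines.
    consumer = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.writelines(sorted(sys.stdin))"],
        stdin=producer.stdout,  # the pipe: producer output feeds consumer input
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy of the pipe so only the consumer holds the read end.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.splitlines()
```

Connecting `producer.stdout` to the consumer's `stdin` plays the role of the `|` character in a shell pipeline.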
In the afternoon I noticed that the Psalmopoeus irminia sling I keep
had molted; I saw the exoskeleton dangling from a piece of moss coming
out of the cork tube it lives in. Because I had an appointment I
couldn't take photos then, so I did that in the evening.
In the evening, after I had taken the above photo I also spotted a
cast-off exoskeleton in the terrarium in which I keep a Pterinochilus
murinus sling. On the 12th of this month it had already opened its
burrow, and I suspected back then that it had molted. And
now I had proof. Maybe I overlooked the exoskeleton earlier because it
was underneath the leaf of a plastic plant.
I also checked on the Chromatopelma cyaneopubescens sling I keep,
which recently also molted. Because it moved out of its webbing I
decided to try to feed it, and it readily
accepted a pre-killed mealworm, Tenebrio molitor.
I use Docker, Kubernetes, and Microsoft Azure every day, so
it makes sense for me to have aliases supporting me with these tools
and environments. However, maybe you are using different clouds and
command-line tools, so you will end up with different
aliases. The key takeaway should be that you create and use aliases
to help you get your job done.
Sometimes I find the git diff command a little inconvenient. It
can throw a lot of information at the screen at once. I use git diff
not only for verifying my changes before a commit, but also to
review pull requests, or for finding bugs introduced between two
commits. In situations where you’re looking at a lot of changed
files, having to scroll up and down so much is tedious.
Initially I wanted to write articles on those two topics separately
(mocking time and testing event loops), but during the process I
realized that the things I want to talk about are too interrelated:
when I need to mock time, it's usually to test some event loop with
it, and when I test event loops, typically mocked time is also
involved in that.
So in the end, it felt better to just combine all that in a single
article.
In 2020 there are a lot of developers and designers who want to
learn the basics of CSS. In this series of articles, I will teach
you those main topics. In this specific article, I will review the
essential CSS properties of typography while using many visual
examples.
The PDF, or Portable Document Format, is one of the most common
formats for sharing documents over the Internet. PDFs can
contain text, images, tables, forms, and rich media like videos and
animations, all in a single file.
This abundance of content types can make working with PDFs
difficult. There are a lot of different kinds of data to decode when
opening a PDF file! Fortunately, the Python ecosystem has some great
packages for reading, manipulating, and creating PDF files.