week 22, 2020

Sun 31 May 2020

Principal Component Analysis

Principal components analysis (PCA) is one of a family of techniques for taking high-dimensional data, and using the dependencies between the variables to represent it in a more tractable, lower-dimensional form, without losing too much information. It has been widely used for data compression and de-noising. However, its entire mathematical process is sometimes ambiguous to the user.

In this article, I would like to discuss the entire process of PCA mathematically, including PCA projection and reconstruction, with most of the derivations and proofs provided. At the end of the article, I implemented PCA projection and reconstruction from scratch. After reading this article, there should be no more black box in PCA.

Source: Principal Component Analysis, an article by Lei Mao.
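As a rough illustration of the projection and reconstruction steps mentioned in the excerpt, here is a minimal pure-Python sketch. It is not the implementation from the linked article: it keeps only the first principal component and finds it with power iteration instead of a full eigendecomposition. The function names and the toy data are my own assumptions.

```python
import math


def mean_center(rows):
    """Return the column means and the mean-centered rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    return mean, centered


def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, v):
    """Projection: one score per row along the component v."""
    return [sum(a * b for a, b in zip(row, v)) for row in centered]


def reconstruct(scores, v, mean):
    """Reconstruction: map the scores back to the original space."""
    return [[mean[j] + s * v[j] for j in range(len(v))] for s in scores]


# Toy data (hypothetical): every point lies on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mean, centered = mean_center(data)
v = first_component(covariance(centered))
scores = project(centered, v)
approx = reconstruct(scores, v, mean)
```

Because the toy points lie exactly on one line, a single component is enough and `approx` matches `data` up to rounding. With real data the reconstruction is only an approximation.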

WordPress passwords, explained and cracked

True or not, strong password hashing is crucial for a large ecosystem like the WordPress one, which has always been a juicy target for hackers. So, I decided to take a closer look at the hashing system and try to crack WordPress hashes from scratch!

Source: WordPress passwords, explained and cracked, an article by Francesco Carlucci.

Getting started with Kubernetes

In this multi-part series of articles about Kubernetes, I'll try and capture what I think everyone who wants to learn and work with Kubernetes should know about.

Source: Getting started with Kubernetes, an article by Peter Jausovec.

container

Sat 30 May 2020

Beyond Pandas

When confronting a new data science problem, one of the first questions to ask is which technology to use. There is hype; there are standard tools; there are bleeding-edge technologies, entire platforms and off-the-shelf solutions.

Source: Beyond Pandas: Spark, Dask, Vaex and other big data technologies battling head to head, an article by Jonathan Alexander.

python

On Marketing Haskell

In the last year, the Haskell language and associated technology have developed into the most mature ecosystem we’ve seen to date, with innovation happening left and right across a variety of fronts. Editor tooling, for instance, is reaching levels of maturity we only dreamed about years ago. Simultaneously, there has been a fair bit of discussion about the economics of the Haskell ecosystem and the confounding factors that have led to its potential stagnation. Most recently, for instance, there have been discussions about “Simple Haskell” as a set of best practices to spur more successful industry projects.

Source: On Marketing Haskell, an article by Stephen Diehl.

haskell

Global CSS options with custom properties

I’ve been toying around with some ideas for how to use custom properties (aka CSS variables) for global settings in a project. The idea is to provide control to designers/developers over consistent styles across multiple components.

Source: Global CSS options with custom properties, an article by Mark Otto.

Fri 29 May 2020

What's in a parser combinator?

One of the interesting facts about writing your own parser combinator library is that you will learn (or consolidate) other knowledge in the process, like Functors, Applicatives and, of course, Monads, and more generally, how to design a DSL in Haskell.

Source: What's in a parser combinator?

haskell

How Not to Mock Web Requests for Unit Testing

So, how do you test how your app responds to a web request without making an actual request and returning a response? One approach is to mock the requests and responses.

Source: How Not to Mock Web Requests for Unit Testing.
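To make "mock the requests and responses" concrete, here is a minimal sketch using only the Python standard library. It is not taken from the linked article. `get_title`, the URL and the JSON payload are hypothetical, and the patch replaces `urllib.request.urlopen` so nothing touches the network.

```python
import io
import json
import urllib.request
from unittest import mock


def get_title(url):
    """Hypothetical function under test: fetch JSON and return its title."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)["title"]


# A canned response body; BytesIO works as a context manager with read().
fake_body = io.BytesIO(b'{"title": "hello"}')

# Patch urlopen for the duration of the call, so no real request is made.
with mock.patch("urllib.request.urlopen", return_value=fake_body) as fake_urlopen:
    title = get_title("https://example.invalid/post/1")

# Check that the code asked for the URL we expected, exactly once.
fake_urlopen.assert_called_once_with("https://example.invalid/post/1")
```

Note that a test like this checks the code against the mock, not against the real service, so it passes even if the real response format changes.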

testing

Nix Flakes, Part 1: An introduction and tutorial

This is the first in a series of blog posts intended to provide a gentle introduction to flakes, a new Nix feature that improves reproducibility, composability and usability in the Nix ecosystem. This blog post describes why flakes were introduced, and gives a short tutorial on how to use them.

Source: Nix Flakes, Part 1: An introduction and tutorial, an article by Eelco Dolstra.

Matrix Calculus for DeepLearning (Part1)

Deep Learning is all about linear algebra and calculus. If you try to read any deep learning paper, matrix calculus is a needed component to understanding the concept.

Source: Matrix Calculus for DeepLearning (Part1), an article by Kiran U Kamath.

machine learning

Thu 28 May 2020

Hypermodern Python

This article series is a guide to modern Python tooling with a focus on simplicity and minimalism. It walks you through the creation of a complete and up-to-date Python project structure, with unit tests, static analysis, type-checking, documentation, and continuous integration and delivery.

Source: Hypermodern Python, an article by Claudio Jolowicz.

python

Concurrency In Python For Network I/O

In this article, let's look at some of the ways to do batch HTTP requests in Python and some of the tools at our disposal. Mainly, we'll look at the following ways:

Synchronous with requests module

Parallel with multiprocessing module

Threaded with threading module

Event loop based with asyncio module

Source: Concurrency In Python For Network I/O - Synchronous, Threading, Multiprocessing and Asynchronous IO, an article by Abhishek Nagekar.
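A small sketch of three of the approaches listed above, using only the standard library. This is my own illustration, not code from the linked article. To keep it runnable without a network, `fake_fetch` and `fake_fetch_async` are stand-ins that simulate I/O with a sleep; the URLs are hypothetical and never contacted. The threaded variant uses `concurrent.futures.ThreadPoolExecutor`, and the multiprocessing variant is left out.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical URLs; they are only used as labels here.
URLS = [f"https://example.invalid/item/{i}" for i in range(5)]


def fake_fetch(url):
    """Stand-in for a blocking HTTP request."""
    time.sleep(0.05)
    return f"response for {url}"


async def fake_fetch_async(url):
    """Stand-in for a non-blocking HTTP request."""
    await asyncio.sleep(0.05)
    return f"response for {url}"


def fetch_sequential(urls):
    # One request after another: total time grows with len(urls).
    return [fake_fetch(u) for u in urls]


def fetch_threaded(urls, workers=5):
    # Threads wait on I/O in parallel; map() keeps the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_fetch, urls))


async def fetch_async(urls):
    # A single thread; the event loop interleaves the waiting coroutines.
    return await asyncio.gather(*(fake_fetch_async(u) for u in urls))


sequential = fetch_sequential(URLS)
threaded = fetch_threaded(URLS)
evented = asyncio.run(fetch_async(URLS))
```

All three return the same responses in the same order; the difference lies in how long the batch takes and how the waiting is scheduled.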

python

The PEPs of Python 3.9

With the release of Python 3.9.0b1, the first of four planned betas for the development cycle, Python 3.9 is now feature-complete. There is still plenty to do in terms of testing and stabilization before the October final release. The release announcement lists a half-dozen Python Enhancement Proposals (PEPs) that were accepted for 3.9. We have looked at some of those PEPs along the way; there are some updates on those. It seems like a good time to fill in some of the gaps on what will be coming in Python 3.9.

Source: The PEPs of Python 3.9, an article by Jake Edge.

python

Wed 27 May 2020

Feeding a Caribena versicolor sling and a tiny scorpion

Today I was finally able to feed the Caribena versicolor sling that has been in my care since the first of this month. I held a small mealworm, Tenebrio molitor, with tweezers close to it, and it "jumped" on the small prey item. In the past the small tarantula had refused food items of the same size, no idea why. Maybe still getting used to its enclosure.

*Caribena versicolor* with prey: a small *Tenebrio molitor* larva.

I was also finally able to feed a second instar Chaerilus sp. "Java", a very small scorpion that I have been keeping since the 7th of April 2020. While it has small springtails in its enclosure, which it may eat, I prefer to actually see it eat. So in the late afternoon I managed, after a few attempts, to get it to accept and eat a very tiny mealworm larva.

Diving into Go by building a CLI application

In this blog post we’ll build a CLI application in Go, which we’ll call go-grab-xkcd. This application fetches comics from XKCD and provides you with various options through command-line arguments.

Source: Diving into Go by building a CLI application.

Julia vs. Python: Which is best for data science?

Among the many use cases Python covers, data analytics has become perhaps the biggest and most significant. The Python ecosystem is loaded with libraries, tools, and applications that make the work of scientific computing and data analysis fast and convenient.

But for the developers behind the Julia language — aimed specifically at “scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing”—Python isn’t fast or convenient enough. Python represents a trade-off, good for some parts of data analytics work but terrible for others.

Source: Julia vs. Python: Which is best for data science?, an article by Serdar Yegulalp.

How Linux pipes work under the hood

Piping is one of the core concepts of Linux & Unix based operating systems. Pipes allow you to chain together commands in a very elegant way, passing output from one program to the input of another to get a desired end result.

Source: How Linux pipes work under the hood, an article by Brandon Wamboldt.
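The chaining described in the excerpt can be sketched from Python with the standard `subprocess` module, which connects two processes through an operating system pipe. This is my own illustration and not code from the linked article. Both child processes are small Python one-liners, chosen so the example does not depend on any particular shell command being installed.

```python
import subprocess
import sys

# Producer: writes a line to its standard output.
producer = subprocess.Popen(
    [sys.executable, "-c", "print('hello pipes')"],
    stdout=subprocess.PIPE,
)

# Consumer: reads standard input and writes it back in upper case.
# Its stdin is the read end of the producer's pipe.
consumer = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
)

# Close our copy of the read end so only the consumer holds it.
producer.stdout.close()

output, _ = consumer.communicate()
producer.wait()
result = output.decode().strip()
```

This is roughly what a shell does for `producer | consumer`: it creates the pipe, starts both processes, and wires the output of one to the input of the other.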

command line

Tue 26 May 2020

Tarantula molts and feeding

In the afternoon I noticed that the Psalmopoeus irminia sling I keep had molted; I saw the exoskeleton dangling from a piece of moss coming out of the cork tube it lives in. Because I had an appointment I couldn't take photos then, so I did that in the evening.

The exoskeleton of a *Psalmopoeus irminia*.

In the evening, after I had taken the above photo, I also spotted a cast-off exoskeleton in the terrarium in which I keep a Pterinochilus murinus sling. On the 12th of this month it had already opened its burrow, and I suspected back then that it had molted. And now I had proof. Maybe I overlooked the exoskeleton earlier because it was underneath the leaf of a plastic plant.

The (partial) exoskeleton of a *Pterinochilus murinus*.

I also checked on the Chromatopelma cyaneopubescens sling I keep, which recently also molted. Because it moved out of its webbing I decided to try to feed it, and it readily accepted a pre-killed mealworm, Tenebrio molitor.

spider

5 Types Of ZSH Aliases You Should Know

I use Docker, Kubernetes, and Microsoft Azure every day. That said, it makes sense for me to have aliases supporting me with these tools and environments. However, maybe you are using different clouds and command-line tools so that you will end up with different aliases. The key takeaway should be that you create and use aliases to help you get your job done.

Source: 5 Types Of ZSH Aliases You Should Know, an article by Thorsten Hans.

shell

Better git diffs with FZF

Sometimes I find the git diff command a little inconvenient. It can throw a lot of information at the screen at once. I use git diff not only for verifying my changes before a commit, but also to review pull requests, or for finding bugs introduced between two commits. In the situations when you’re looking at a lot of changed files, having to scroll up and down so much is tedious.

Source: Better git diffs with FZF, an article by Rafael Mendiola.

Mon 25 May 2020

Mocking time and testing event loops in Go

Initially I wanted to write articles on those two topics separately (mocking time and testing event loops), but during the process I realized that the things I want to talk about are too interrelated: when I need to mock time, it's usually to test some event loop with it, and when I test event loops, typically mocked time is also involved in that.

So in the end, it felt better to just combine all that in a single article.

Source: Mocking time and testing event loops in Go, an article by Dmitry Frank.

CSS Basics for Typography

In 2020 there are a lot of developers and designers who want to learn the basics of CSS. In this series of articles, I will teach you those main topics. In this specific article, I will review the essential CSS properties of typography while using many visual examples.

Source: CSS Basics for Typography, an article by Elad Shechter.

Creating and Modifying PDF Files in Python

The PDF, or Portable Document Format, is one of the most common formats for sharing documents over the Internet. PDFs can contain text, images, tables, forms, and rich media like videos and animations, all in a single file.

This abundance of content types can make working with PDFs difficult. There are a lot of different kinds of data to decode when opening a PDF file! Fortunately, the Python ecosystem has some great packages for reading, manipulating, and creating PDF files.

Source: Creating and Modifying PDF Files in Python, an article by David Amos.

python