Plurrrr

week 02, 2023

How to improve Python packaging

There is an area of Python that many developers have problems with. This is an area that has seen many different solutions pop up over the years, with many different opinions, wars, and attempts to solve it. Many have complained about the packaging ecosystem and tools making their lives harder. Many beginners are confused about virtual environments. But does it have to be this way? Are the current solutions to packaging problems any good? And is the organization behind most of the packaging tools and standards part of the problem itself?

Source: How to improve Python packaging, or why fourteen tools are at least twelve too many, an article by Chris Warrick.

Formalizing f-strings

Python's formatted strings, or "f-strings", came relatively late to the language, but have become a popular feature. F-strings allow a compact representation for the common task of interpolating program data into strings, often in order to output them in some fashion. Some restrictions were placed on f-strings to simplify the implementation of them, but those restrictions are not really needed anymore and, in fact, are complicating the CPython parser. That has led to a Python Enhancement Proposal (PEP) to formalize the syntax of f-strings for the benefit of Python users while simplifying the maintenance of the interpreter itself.

Source: Formalizing f-strings, an article by Jake Edge.
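
As a small illustration of the interpolation described above, here is a minimal sketch. The variable names are made up for the example. It deliberately stays within the older, restricted syntax (for instance, the quotes inside the replacement field differ from the enclosing quotes), so it should run on Python 3.8 and later.

```python
# Hypothetical data, used only for this illustration.
name = "world"
values = {"pi": 3.14159}

# Plain interpolation of a variable into a string.
greeting = f"Hello, {name}!"

# An expression plus a format specification. The inner quotes are single
# quotes because older parsers did not allow reusing the enclosing quote.
rounded = f"{values['pi']:.2f}"

# The "=" specifier (Python 3.8+) echoes the expression next to its repr.
debug = f"{name=}"

print(greeting)
print(rounded)
print(debug)
```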

Javascript Hoisting: Thank you for nothing

As a JavaScript developer, I’m sure you’re well aware of the “joy” that is hoisting. But in case you need a refresher, hoisting is the behavior in JavaScript where variables and functions are automatically moved to the top of their scope before code execution.

Now, some might argue that hoisting is a useful feature, allowing us to reference variables and functions before they are actually declared in our code. But let’s be real here: the cons far outweigh the pros when it comes to this “feature.”

Source: Javascript Hoisting: Thank you for nothing, an article by Krishna Singh.

Makefiles for Web Work

make is a build tool that’s been around since the 1970s. It was originally designed for automating the building of C programs: installing dependencies, running tests, and compiling binaries.

These days, web projects involve many of the same steps: installing node_modules, running linters and tests, starting dev servers, and compiling files with esbuild or Rollup.

The default choice for automating these steps is often npm/yarn scripts: little shell commands written into your project’s package.json file. More complex projects sometimes evolve into using tools like Gulp/Grunt, or even full-blown Docker builds.

But I find make often fills many of the same needs without as much fuss.

Source: Makefiles for Web Work, an article by Ross Zurowski.

Sharing, Space Leaks, and Conduit and friends

Sharing conduit values leads to space leaks. Make sure that conduits are completely reconstructed on every call to runConduit; this implies we have to be careful not to create any (potentially large) conduit CAFs (skip to the final section “Avoiding space leaks” for some details on how to do this). Similar considerations apply to other streaming libraries and indeed any Haskell code that uses lazy data structures to drive computation.

Source: Sharing, Space Leaks, and Conduit and friends, an article by Edsko de Vries.

The yaml document from hell

For a data format, yaml is extremely complicated. It aims to be a human-friendly format, but in striving for that it introduces so much complexity that I would argue it achieves the opposite result. Yaml is full of footguns and its friendliness is deceptive. In this post I want to demonstrate this through an example.

Source: The yaml document from hell, an article by Ruud van Asseldonk.

Upgrading Kubernetes

One common question I see on Mastodon and Reddit is "I've inherited a cluster, how do I safely upgrade it". It's surprising that this still isn't a better understood process given the widespread adoption of k8s, but I've had to take over legacy clusters a few times and figured I would write up some of the tips and tricks I've found over the years to make the process easier.

Source: Upgrading Kubernetes - A Practical Guide, an article by Mathew Duggan.

Understanding Git through images

Git is a tool that facilitates development work by recording and tracking the change history (versions) of files, comparing past and current files, and making changes clear.

The system also allows multiple developers to edit files at once, so the work can be distributed.

Source: Understanding Git through images.

Reducing Docker Image size

Docker images are meant to give developers a uniform setup regardless of platform and environment. But obtaining this at the cost of memory and speed is not acceptable.

  • Bulkier images take longer to pull.
  • Containers that use them take longer to spin up.
  • They increase the load on the image registry.
  • With upgrades, they grow further and become harder to check for vulnerabilities.

Source: Reducing Docker Image size, an article by Pratik Singh.

GraphQL API and REST API

REST is probably the most popular way to expose your application to the external world (e.g. as a backend for the frontend or to establish a communication protocol with other applications/services). However, GraphQL is now getting more and more popular and has become a strong competitor for REST. Nevertheless, the aim of this post is not to carry out a detailed comparison of the advantages/disadvantages of these approaches, because there is already a lot of material covering that topic. I would rather like to present how REST and GraphQL differ in implementation strategies. This post can be especially helpful for people who are familiar with the REST pattern and want to learn more about GraphQL.

Source: GraphQL API and REST API.

Pagination and the problem of the total result count

When processing a big result set in an interactive application, you want to paginate the result set, that is, show it page by page. Everybody has been familiar with that since their first web search. You also get a button to scroll to the next page, and you get a total result count. This article shows the various options for pagination of a result set and their performance. It also discusses the problem of the total result count.

Source: Pagination and the problem of the total result count, an article by Laurenz Albe.
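
To make the idea concrete, here is a minimal sketch of two commonly used pagination styles, OFFSET/LIMIT and keyset pagination, plus a separate count query. The excerpt does not say which options the article covers, so this choice is an assumption. The sketch uses an in-memory SQLite database as a stand-in, and the table and function names are hypothetical.

```python
import sqlite3


def make_db(n_rows=25):
    """Create an in-memory table with ids 1..n_rows (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
    conn.executemany(
        "INSERT INTO items (id, label) VALUES (?, ?)",
        [(i, f"item {i}") for i in range(1, n_rows + 1)],
    )
    return conn


def page_by_offset(conn, page, page_size=10):
    """OFFSET pagination: the database still walks past all skipped rows."""
    rows = conn.execute(
        "SELECT id FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]


def page_by_keyset(conn, last_seen_id, page_size=10):
    """Keyset pagination: continue after the last id of the previous page."""
    rows = conn.execute(
        "SELECT id FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, page_size),
    ).fetchall()
    return [row[0] for row in rows]


def total_count(conn):
    """The total result count needs its own query over the whole result."""
    return conn.execute("SELECT count(*) FROM items").fetchone()[0]


conn = make_db()
print(page_by_offset(conn, 1))   # second page via OFFSET
print(page_by_keyset(conn, 10))  # second page via the last seen id
print(total_count(conn))
```

Both page functions return the same rows here. The difference lies in how much work the database does to reach them, and the count query always has to consider the complete result set.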

In Defense of Testing Mocks

This isn’t going to be a great defense, because I generally agree with the conventional wisdom that mocking should be avoided when possible. But I think it’s important to give things a fair shake, even if I don’t like them.

Source: In Defense of Testing Mocks, an article by Hillel Wayne.

Introducing Content Defined Chunking (CDC)

In a backup program, data de-duplication can be applied in two locations: Removing duplicate data from the same or different files within the same backup process (inter-file de-duplication), e.g. during the initial backup, or removing it between several backups that contain some of the same data (inter-backup de-duplication). While the former is desirable to have, the latter is much more important.

Source: Introducing Content Defined Chunking (CDC).
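
The following is a rough sketch of the general idea behind content defined chunking, not the scheme the article describes. It swaps in a simple gear-style rolling hash and cuts a chunk wherever the low bits of the hash are zero. All names, the seed, and the size parameters are assumptions chosen for illustration.

```python
import random

# Fixed pseudo-random table so that chunk boundaries are reproducible.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunk(data, avg_bits=6, min_size=16, max_size=256):
    """Split data at positions chosen by the content, not by fixed offsets."""
    boundary_mask = (1 << avg_bits) - 1
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear-style update: older bytes are shifted out of the low bits.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        at_boundary = size >= min_size and (h & boundary_mask) == 0
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


# Illustrative data: inserting a few bytes at the front shifts every
# fixed-size block, but content defined boundaries tend to realign.
source = random.Random(1)
original = bytes(source.getrandbits(8) for _ in range(4096))
modified = b"XXXXX" + original

shared = set(chunk(original)) & set(chunk(modified))
print(len(chunk(original)), len(chunk(modified)), len(shared))
```

Chunks that appear in both versions would only need to be stored once, which is what makes this approach attractive for de-duplication between backups.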

Where are my Git UI features from the future?

The Git version control system has been causing us misery for 15+ years. Since its inception, a thousand people have tried to make new clients for Git to improve usability.

But practically everyone has focused on providing a pretty facade to do more or less the same operations as Git on the command-line — as if Git’s command-line interface were already the pinnacle of usability.

No one bothers to consider: what are the workflows that people actually want to do? What are the features that would make those workflows easier? So instead we get clients which think that git rebase -i is the best possible way to reword a commit message, or edit an old commit, or split a commit, or even worth exposing in the UI.

Source: Where are my Git UI features from the future?, an article by Waleed Khan.