Brent Huisman

The world-wide home of Brent Huisman. Enjoy your stay!

Particle Physics, Proton Therapy, Treatment verification.

C++, Python, Numpy, Dicom, Qt.

Get in touch.

Trust

In many ways this is an excellent web… page/app, meant to teach you the social mechanics of trust. Trust might seem a hard concept to simulate, but you’ll have no problem understanding the mechanics thanks to the high-quality introduction. You’ll learn how strategies (starting with the simplest, such as trusting everyone) translate to outcomes. Highly recommended you take a look.

Trust, see bottom for the many translations.

Cogs in the machine

Quite a few of my former physics and maths classmates have gone into finance. In Amsterdam it’s one of the few areas outside of academia where such skills are in demand. Some of those businesses engage in high frequency trading, on which the American Securities and Exchange Commission published a study (even smaller ticks?). I of course expected what followed: the first post immediately goes into technical detail, and the responses to it even further. Higher-level critiques are at this time absent from the discussion. Critiques that HFT does not help productive business, that it is a form of casino trading, that it provides ghost liquidity, that you’re actually trading rich people’s money and making the market harder for smaller investors, and so on. The pattern is: HFT traders discuss the mechanics of it in great detail.

The second post is about the question of why object-oriented programming is still so widespread. Fortunately, a meta-comment is currently leading there, explaining how it takes some experience to know when to use OOP and when not, what the pitfalls are and where its application fits best. Less experienced people who hear such discussions then ‘take sides’ without really grasping the essence of the discussion. A few responses down is the key comment:

My favourite HN stereotype is that every comment chain involves the next poster correcting the last poster over a tiny detail, and then descending into long threads arguing minute details of no consequence.

I think this forum attracts a non-trivial number of people who have a hyperfocus issue beginning with the letter A.

Hyperfocus. As in: unable to not focus, unable to see or work the big picture. Needing thought-drugs to satisfy the need to focus. Well, yeah, these types are the norm in physics, and in STEM in general I suppose. If you had a mountain of money, what types would you want to hire to grow your mountain? Hyperfocused types that can’t help themselves but wade into details, plus a salary to help quench any big-picture thoughts should they pop up. Big-picture thoughts such as: most of that salary is eaten up in housing costs, paid to the same class of people that pays them. It’s painful to realize it’s still quite straightforward to co-opt smart people (maybe especially smart people) into making the rich richer and thus the world poorer, in exchange for coin and thought-drugs such as technically involved topics (e.g. HFT).

We should not segregate (higher) education (STEM separate from the humanities) and we should not forget what universities are for: producing well-rounded, mentally trained individuals, ready to take on issues coming their way and coming our way, with the capacity to balance and integrate concerns. We should not produce mono-skilled tradespeople in universities, but thinkers and tinkerers. A successful outcome of a university system would be human beings able to resist petty comforts like salary and intellectual self-gratification, people with the ability and the drive to produce a better world, and the skills to discern what ‘better’ actually means.

Distrohop 2

Some errata to Distrohop:

  1. Neon actually isn’t quite as up to date as I thought! It turns out they only keep KDE software up to date, not all Qt-based tools. And they even break them! KeePassXC is at 2.4.6, same as Ubuntu, and the Calibre from the repo is broken and won’t be fixed, because the Neon people don’t care about the admittedly slightly idiosyncratic way Calibre is put together. Still, outright breaking it isn’t very nice, so that’s two points off for Neon.
  2. My little hardware issue with my screen appears to be related to a regression in kernel 5.7. I have not yet tried the workaround, but for now booting with kernel 5.6 also works around the issue (Ubuntu comes with 5.4 and I suppose that’s why it works).
  3. Installing Noto Sans improves Fedora’s looks quite a bit (same as Neon). Also, that font stuff? Apparently the patents expired in 2010 and Fedora has been shipping the same engine as Ubuntu ever since.

So, what problems are left? That silly modprobe to access my work’s VPN, only being able to use Fedora’s Media Writer to create EFI-bootable USB drives, and a worse breeze-gtk theme. The pros are: (nearly) as up to date as Neon (for KDE apps), already much more up to date than Ubuntu, no #WONTFIX broken packages, and a few NeuroFedora packages. Starting with Fedora 33 (due in two months), BTRFS by default. I guess I’ll be staying with Fedora after all!

Update 1: The just-released kernel update 5.8 also seems to work with my screen now! Update 2: Today (August 31), after 5 hours, the USB-C connection suddenly gave out again. Replugging, restarting, different kernels, a different cable, even a USB boot with Neon: nothing produces any sign anymore. dmesg is suddenly silent… I guess I should be pointing the finger at my screen, even though it works fine if I plug it back into my desktop (over regular USB 3). Also, the laptop works fine with the dock… Hmm…

Distrohop

Distrohopping, or as I like to call it, ultimate bikeshedding. I really hadn’t done that in a very, very long time. My first Linux machine was based on Fedora Core 1: a server which mainly served files. On a (second) desktop, Ubuntu was my first distro, Warty Warthog in fact. Memories are vague now, but I liked the warm brown color, which makes me one of the few, I think.

Ever since, Ubuntu has been my main and only distro. It just werks. What I recall from Fedora is SELinux needing to be turned off to do things like access a DVD (at the time an important and frequent event). I tried Fedora briefly around the time of Core 6, used Scientific Linux during my master’s, had a Fedora-based cluster during the PhD and briefly an OpenSuse desktop, but mostly Ubuntu. Since Gnome 3, the biggest change has been the move to KDE Neon (again a distro I’ve been using since its first release). Gnome 3 just wasted too much of my screen, and KDE around the time of Plasma 5.5 really shaped up. But Neon is still 99% Ubuntu. Nothing beats the ecosystem around it: if somebody has some build/install instructions, they’re probably going to be for Ubuntu. I do not really like configuring and playing sysadmin, you see. And I certainly do not like pet systems; I treat them like cattle. I meticulously write down configuration instructions, stored in a Resilio Sync share, so I can restore a borked install or set up a new one within about an hour. (Why Resilio, you ask, you freedom-hating cretin? I also use it to share files with family and friends, and Syncthing was not made for that use-case. Rather than having two similar-ish systems, I prefer to keep it down to one. Fewer moving parts and all that.)

Recently I started a new job, and Ubuntu 20.04 was just out, but Neon was still on 18.04 (they only track LTS releases). Having to integrate into a new team that uses the latest versions of everything as much as possible (what a refreshing change from before!), I suddenly noticed how old 18.04 is. I quickly paved over with Kubuntu 20.04, but they didn’t even bother to update things like Digikam to v7! They left it at 6.4! Meanwhile, under the banner of NeuroFedora, some commonly used packages in my new field are only a yum install away. So I decided to give Fedora (32) another try, 16 years since I last used it! Here are my findings.

  1. Software is a lot more recent! I use the KDE spin obviously, and I find that in terms of KDE packages things are barely behind Neon, and the same goes for many other packages. That is, compared to Ubuntu 20.04.
  2. Provided you enable the rpmfusion repos, you don’t really need PPAs! Up-to-date packages for software like qownnotes, quodlibet, lutris: it’s all there! PPAs are nice of course, but this is not any worse.
  3. I actually wanted to try Fedora again a year or two ago, but I found that it just wouldn’t boot on EFI-only systems, of which I now have a few. I discovered by coincidence that the Media Writer shipped with Fedora is the only way I could create USB sticks that allow me to install Fedora on EFI systems! I tried the KDE USB Creator, Ventoy, Rufus and Unetbootin, which all work fine with Ubuntu ISOs, but not Fedora.
  4. SELinux strikes back. Well, not really, but strange Red Hat security precautions do. In order to use the VPN for work, I had to manually modprobe a kernel module. WHAT YEAR IS THIS???
  5. A few cosmetic nitpicks: font aliasing (remember that? WHAT YEAR IS THIS? Apparently Fedora still does not allow a possibly proprietary hinting/subpixel rendering algo in its repos) and GTK styling (there is a breeze-gtk package but it doesn’t style nearly as well as Ubuntu’s breeze-gtk-theme, so buttons really stand out).
  6. The reason I’ll be going back to Neon today (they rebased to 20.04 last week): when I plug the laptop into my screen with a USB-C cable, nothing happens! Under Ubuntu I could charge, see my screen and use my screen-connected USB peripherals without issue. I’m done troubleshooting this (it already took me 3 hours to figure out the L2TP problem).

So, if you do not rely on those particular things, Fedora works very well, and I’ve installed it on all my non-essential devices to give it a longer go. But I’ve felt enough of the pain of it not Just Werking to last me at least another decade. I’d like to try Arch and OpenSuse someday, they seem like nice projects, but all this troubleshooting is really not my cup of tea, and the truth is, on Ubuntu it almost never seems to be necessary.

Data ethics

A great talk on a very relevant subject: formalizing fairness in a drag-n-drop big data / ML world, by Arvind Narayanan. Even though many people, including those working in the field, seem to think anything automated is therefore value-free and fair, this presentation shows concretely how formalizing fairness is being investigated. For instance, you can go for outcome fairness or process fairness. Outsourcing decision making to computers means formalizing problems, and therefore of course also fairness, whether you leave it in there as a hidden bias or not. A great example he gives is stereotype mirroring: if you search for images of CEOs you’ll find many men, and since most CEOs are men, this result is unbiased and correct, according to some. Or is it?

Such is the field of data ethics. A comprehensive course is available at fast.ai.

Agroforestry

Combining forests and agriculture for fun and profit and climate and food security: super cool read!

agroforestry

Supercomputers

Apart from learning about computational neuroscience, I’ve learned a little bit about computing today as well. Coming from particle physics, I was used to using supercomputers for simulations. Of course, when using a supercomputer, you don’t run programs yourself. You define your workload or job: which program to run, in which directory, with which data, what kind of resources you need, how many, how much memory, the expected runtime, at minimum. Then you submit this definition to a job or workload scheduler, which prioritizes all the jobs submitted to it (usually FIFO I guess?) such that you and all the other users make optimal use of resources in a fair manner.

Such a supercomputer, or cluster, is then primarily interfaced with through the scheduler. I thought computing cluster equalled qsub/qstat/qdel (because in particle physics I never encountered another system), but it turns out that’s only one of many schedulers (the Portable Batch System, see here for a typical particle-physics-oriented user guide). My new supercomputer at the Jülich Supercomputing Centre, actually a set of a few clusters, uses Slurm.

The JSC does not only do research with supercomputers, but also on them: what are good ways to use these machines? I guess that’s why they use the more recent scheduler Slurm. Research also goes into how to use an increasingly diverse computational landscape: next to CPUs, GPUs are now common for (certain) computational workloads. A sort of intermediate attempt is perhaps the Xeon Phi, which is in essence (currently) a 64-72 core Atom with AVX(2/256/512) units, 4-fold hyperthreading and a small amount of fast integrated RAM, leading to thread counts somewhere between what you find on a regular CPU and a regular GPU. (I don’t know if these CPUs downclock under sustained AVX use. Update: they do too.) Then there are some ARM-based supercomputers and of course more specialized coprocessors. This leads to the problem of knowing how to 1) use all these different types of processors and 2) schedule on such diverse architectures. This field is called heterogeneous computing, or heterogeneous system architecture (HSA).

Another thing that makes life simple for particle physicists is the trivially parallelizable nature of (most?) simulations: shooting billions of particles in hundreds of configurations is a ton of independent runs, so you can easily split these over a number of cores and merge the outputs in post-processing. In other kinds of simulations, runs may not be (that) independent, because certain quantities accumulate or generate feedbacks. Think of material science simulations and, of course, neuroscientific simulations.

A common tool for multithreaded computation is OpenMP. This older project is all about parallel execution of the kind found in particle physics. Efficient use of heterogeneous systems is a bit trickier however: because the execution pipelines are not identical, the way you break up your problem must be handled differently. You probably want to break up your computation according to how well suited it is for a particular architecture: massively parallel, atomic and memory-light tasks to the GPU, hard-to-break-down, memory-intensive tasks to a CPU. For such computations, you probably also need to bring some scheduling logic into your program: if a GPU is present, how will we use it? Whether we’re assigned 4 cores, 40 or 400, and whether those cores have SSE4.x, AVX or AVX2, raises the same question.
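To make the easy case concrete, here is a minimal sketch of my own (a toy example, not from any particular code base) of embarrassingly parallel work with OpenMP: every loop iteration is an independent “run”, and a reduction merges the partial results, much like the post-processing step above.

```cpp
// Compile with: g++ -std=c++14 -fopenmp -O2 pi.cpp
// Toy example: estimate pi by Monte Carlo. Every iteration is an
// independent "run", so OpenMP can split them over cores and merge
// the partial counts with a reduction at the end.
#include <cstdio>
#include <random>

int main() {
    const long n = 100'000'000;
    long hits = 0;

    #pragma omp parallel reduction(+:hits)
    {
        // Each thread gets its own RNG, seeded independently, so the
        // "runs" stay statistically independent.
        std::mt19937_64 rng(std::random_device{}());
        std::uniform_real_distribution<double> u(0.0, 1.0);

        #pragma omp for
        for (long i = 0; i < n; ++i) {
            double x = u(rng), y = u(rng);
            if (x * x + y * y < 1.0) ++hits;
        }
    }

    std::printf("pi ~ %f\n", 4.0 * hits / n);
    return 0;
}
```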

Thus, man invented the Message Passing Interface, a standard for communication between processes. If we write a program from the ground up on the assumption that not only will there be various execution units but also that they may differ in type, we make sure to think of how we might parallelize at every step of the program. MPI is one of the ways to do it. How does MPI work? Thanks to Stackoverflow:

As to your question: processes are the actual instances of the program that are running. MPI allows you to create logical groups of processes, and in each group, a process is identified by its rank. This is an integer in the range [0, N-1] where N is the size of the group. Communicators are objects that handle communication between processes. An intra-communicator handles processes within a single group, while an inter-communicator handles communication between two distinct groups.

By default, you have a single group that contains all your processes, and the intra-communicator MPI_COMM_WORLD that handles communication between them. This is sufficient for most applications, and does blur the distinction between process and rank a bit. The main thing to remember is that the rank of a process is always relative to a group. If you were to split your processes into two groups (e.g. one group to read input and another group to process data), then each process would now have two ranks: the one it originally had in MPI_COMM_WORLD, and one in its new group.
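To make the quote concrete, here is a minimal toy sketch (my own example, not Arbor code): every process has a rank in MPI_COMM_WORLD, and after an MPI_Comm_split it gains a second rank within its new group.

```cpp
// Compile and run with: mpic++ ranks.cpp && mpirun -n 4 ./a.out
// Toy sketch of processes, ranks and communicators: split
// MPI_COMM_WORLD into two groups (even/odd world ranks), so every
// process ends up with two ranks, one per communicator.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Color selects the subgroup; key orders the ranks within it.
    int color = world_rank % 2;
    MPI_Comm group;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &group);

    int group_rank;
    MPI_Comm_rank(group, &group_rank);
    std::printf("world rank %d/%d -> rank %d in group %d\n",
                world_rank, world_size, group_rank, color);

    MPI_Comm_free(&group);
    MPI_Finalize();
    return 0;
}
```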

So, tools to help break down the computation into a topology that seems most efficient to you. How that is done, I imagine, is a topic for much debate. So much for the theory as I understand it. Now it’ll be time to understand how MPI is used in Arbor, a simulation tool I’ll be working on. A colleague actually gave a lecture roughly following the task management system in Arbor: see the video here. I’ll end the post here; I’m sure it won’t be the last on the subject!

Small update: NVIDIA call their parallel programming model SIMT - “Single Instruction, Multiple Threads”. Two other different, but related, parallel programming models are SIMD - “Single Instruction, Multiple Data”, and SMT - “Simultaneous Multithreading”. Each model exploits a different source of parallelism: SIMD applies one instruction to many data elements within a single thread (vector units), SMT interleaves independent threads over the shared execution units of one core, and SIMT runs groups of threads in lockstep, all executing the same instruction, as GPUs do.
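For SIMD in particular, here is a minimal sketch of what “one instruction, multiple data” looks like with intrinsics (assuming an AVX-capable x86 CPU; again my own toy example):

```cpp
// Compile with: g++ -mavx -O2 simd.cpp
// Toy sketch of SIMD: one AVX instruction adds eight floats at once,
// instead of a scalar loop adding them one at a time.
#include <immintrin.h>
#include <cstdio>

int main() {
    alignas(32) float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    alignas(32) float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    alignas(32) float c[8];

    __m256 va = _mm256_load_ps(a);      // load 8 floats into one register
    __m256 vb = _mm256_load_ps(b);
    __m256 vc = _mm256_add_ps(va, vb);  // single instruction, 8 additions
    _mm256_store_ps(c, vc);

    for (float x : c) std::printf("%.0f ", x);  // prints: 9 9 9 9 9 9 9 9
    std::printf("\n");
    return 0;
}
```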

Out of it

I’ve been a little bit out of it! That’s what you get when you’ve been working at a place where C++98 was not universally assumed in the code base. But let’s not speak of that now 😉

Lambdas make an awful lot of sense, but until 2011 they didn’t exist in C++. In the name of being able to compile on CentOS 6 (5?), we could only start using C++11 right around the time I left (finished the PhD and all that), so I never had an opportunity to really go into it. I only saw scary line noise on the webz. Turns out, C++ lambdas are as simple as they should be, and here they are explained simply. Thanks Bartek!
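A minimal illustration (a toy example of my own, not one of Bartek’s): the square brackets are just the capture list, and the rest is an ordinary function body.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v = {4, 1, 3, 2};
    int calls = 0;  // captured by reference below

    // [&calls] is the capture list; the rest is a plain function body.
    std::sort(v.begin(), v.end(), [&calls](int a, int b) {
        ++calls;       // mutate the captured counter
        return a < b;  // sort ascending
    });

    std::printf("comparisons: %d, first: %d\n", calls, v.front());
    return 0;
}
```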

Trust Experiments

For lack of time you’ll have to make do without much of a discussion, but that’s not really necessary, because this website guides you through every step quite clearly. It’s a discussion of trust and public (online) discourse in the context of trolls, fake news, all that, but it very neatly simulates, right alongside the discussion, the effect of strategies on the discussion as a whole. What’s really neat is that it explores increasingly complex strategies in a very accessible way, and makes clear both why (slightly) more complex strategies are essential to keeping our (online) debates healthy and how they actually work (thereby making them far less complex).

If you follow any links from this blog, make sure this one is among them.

Massacring C Pointers

You know what I’m talking about, right? That awful book. It’s an oldie, but a colleague brought it up and I thought it deserves a place on this website (future reference!).

Massacring C Pointers.

Read more in the archive.