Sam’s Blog

Interesting Bits from a Hackathon Kudos Bot

At a recent work hackathon I built a Slack bot where people give each other kudos – short public messages like “@kudos @jane Great job leading the incident retro” – and…

Fast Julia testing with julia-daemon

I’ve been using Claude Code with Julia, and the main pain point was startup time. Every test run pays Julia’s compilation tax — a simple include("runtests.jl") can take 30+…

pytest-familywise: Controlling false positives in randomized test suites

Randomized algorithms are convenient to test statistically: run the algorithm many times, compute a p-value, and assert that the null hypothesis is not rejected. The problem…

Targeted Learning Basics

What follows is a highly informal and simplified introduction to targeted learning. Our goal will be to estimate some functional \(T\) of a distribution \(P\) from a…

Building a Choral Source Separator with SepReformer in JAX

machine learning

This tutorial demonstrates how to perform audio source separation using the SepReformer architecture. We’ll use equinox for the neural network, with beartype for runtime…

Introducing tbexport

machine learning

I’ve been writing up training results for blog posts and documentation a lot lately, and the workflow always felt clunky. TensorBoard is great for interactive exploration…

Case Control Studies

Say we want to find the odds ratio for how a factor \(X\) affects the probability of having a disease \(D\). We can model this with logistic regression: \[ \text{logit}(P(D=1…

Counting People You Haven’t Met: Design-Based Survey Inference with NestedSurveys.jl

Suppose you want to know the average household income in a city of a million people. You can’t ask everyone, so you pick a sample and ask them. The question is: how do you…

A Recursive Bayesian Model of Reputation

machine learning

How should you update your beliefs about whether a claim is true when the person telling you might be lying — and the people vouching for them might also be lying? This post…

Fitting Gaussians with Missing Observations

Say you want to fit a multivariate Normal distribution to some data.

Air Quality and Congestion Pricing

This post dives into the data of a recent paper quantifying the effects of Manhattan’s recent congestion pricing scheme (Fraser et al. (2025)). The authors argue that…

AnkiVec: Vector Search for Anki

I just released AnkiVec, an Anki addon that creates vector embeddings for cards using Ollama and enables hybrid semantic search with ChromaDB.

Classifying Ships with Gaussian Process Mixtures

machine learning

I recently came across a dataset of container ship movement between Tallinn and Helsinki on Kaggle. In this notebook, we’ll try to classify whether a given ship’s trajectory…

Analyzing Coffee Yields

This post demonstrates working with generalized linear mixed models in the context of coffee bean yield data. Each row in the following dataset is an observation of coffee…

Sizecheck: Making Tensor Code Self-Documenting with Runtime Shape Validation

Writing neural networks often feels like juggling tensors in the dark. You know that attention_weights should be 4-dimensional, but PyTorch won’t tell you until your matrix…

Books and Guides

machine learning

In the style of Susan Rigetti’s classic “So You Want to Learn Physics”, this post lists some of my favorite resources for learning stuff.

Frequentist Sample Size Estimation

In the previous post, I showed a Bayesian method of sample size estimation for A/B/n testing. This post goes over the more conventional frequentist method.

Sample Size Estimation in Bayesian A/B/n Testing

This post highlights a Bayesian approach to sample size estimation in A/B/n testing. Say we’re trying to test which variant of an email message generates the highest…

An Opinionated Tooling Guide

Statistics and Data Analysis: Overall: use R. It has the largest ecosystem of statistical packages.

Finding Common Topics

machine learning

How do you find thematic clusters in a large corpus of text documents? The techniques baked into sklearn (e.g. nonnegative matrix factorization, LDA) give you some intuition…

Synthetic Controls for Texas Prison Data

This post uses a synthetic control design to study whether Texas’s prison building boom in 1993 resulted in them incarcerating more prisoners than they would have if their…

Hop Lists

Hop Lists are a novel retroactive set data structure that allows for a branching timeline. Each hop list node \(h_t\) is associated with a specific time \(t\) and a randomly…

Graph SLAM

For a robot to navigate autonomously, it needs to learn both its own location and the locations of any potential obstacles around it, given its sensors’ observations…

Diagnosing Lack of Independence in Exogenous Variables

While performing linear regression with statsmodels, you might occasionally find that your exogenous variables aren’t independent, giving you an error about a singular matrix.
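As a rough illustration, and not necessarily the approach the post itself takes, one way to confirm linear dependence is to compare the rank of the design matrix against its number of columns. The helper name `is_rank_deficient` and the toy matrix are made up for this sketch; the only library call is `numpy.linalg.matrix_rank`.

```python
import numpy as np


def is_rank_deficient(X):
    """Return True when the columns of X are linearly dependent.

    Hypothetical helper for illustration: a design matrix whose rank is
    below its column count has a singular Gram matrix X^T X.
    """
    X = np.asarray(X, dtype=float)
    return bool(np.linalg.matrix_rank(X) < X.shape[1])


# Toy design matrix (assumed data): the third column is the sum of the
# first two, so the columns are not independent.
X = np.array([
    [1.0, 2.0, 3.0],
    [1.0, 0.0, 1.0],
    [1.0, 5.0, 6.0],
    [1.0, 1.0, 2.0],
])

dependent = is_rank_deficient(X)            # all three columns
independent = is_rank_deficient(X[:, :2])   # first two columns only
```

Dropping columns one at a time and rechecking the rank is a simple way to find which variable is redundant.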

Finite Basis Gaussian Processes

machine learning

By Mercer’s theorem, every positive definite kernel \(k(x, y) : \mathcal{X} \to \mathcal{X} \to \mathbb{R}\) that we might want to use in a Gaussian Process corresponds to…

Finite Particle Approximations

machine learning

Say you have a discrete distribution \(\pi\) that you want to approximate with a small number of weighted particles. Intuitively, it seems like the best choice of…

Nearest Neighbor Gaussian Processes

machine learning

In a \(k\)-Nearest Neighbor Gaussian Process, we assume that the input points \(x\) are ordered in such a way that \(f(x_i)\) is independent of \(f(x_j)\) whenever \(i > j +…

Krylov Methods

The \(i\)th Krylov subspace \(\mathcal{K}_i\) for a symmetric matrix \(A\) starting from vector \(b\) is the subspace spanned by the vectors \(b, Ab, A^2b, \dotsc, A^{i-1}b\).…
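The definition above can be sketched directly in NumPy. This is a minimal illustration rather than code from the post: the function name `krylov_basis` and the example matrix are assumptions, and the sketch assumes the Krylov vectors are linearly independent so that a QR factorization yields an orthonormal basis for their span.

```python
import numpy as np


def krylov_basis(A, b, i):
    """Orthonormal basis (as columns) for span{b, Ab, ..., A^(i-1) b}.

    Hypothetical helper: builds the Krylov vectors by repeated
    multiplication, then orthonormalizes them with a reduced QR.
    Assumes the i vectors are linearly independent.
    """
    vectors = [np.asarray(b, dtype=float)]
    for _ in range(i - 1):
        vectors.append(A @ vectors[-1])
    K = np.column_stack(vectors)
    Q, _ = np.linalg.qr(K)
    return Q


# Small symmetric example matrix and starting vector (assumed data).
A = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 3.0, 1.0],
    [0.0, 1.0, 4.0],
])
b = np.array([1.0, 0.0, 0.0])

Q = krylov_basis(A, b, 2)

# A @ b lies in the second Krylov subspace, so projecting it onto the
# basis should leave no residual.
Ab = A @ b
residual = Ab - Q @ (Q.T @ Ab)
```

Forming the powers explicitly is numerically fragile for larger \(i\); practical methods such as Lanczos orthogonalize as they go.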

Mapping with Gaussian Conditioning

For a robot to navigate autonomously, it needs to learn the locations of any potential obstacles around it. One of the standard ways to do this is with an algorithm known as…

Conjugate Computation

machine learning

This post is about a technique that allows us to use variational message passing on models where the likelihood doesn’t have a conjugate prior. There will be a lot of Jax…

Generative ODE Models are VAEs

machine learning

Generative image models based on ordinary differential equations can be seen as forms of variational auto-encoders with a partially deterministic inference network.

Sparse Variational Gaussian Processes

This notebook introduces the Fully Independent Training Conditional (FITC) sparse variational Gaussian process model. You shouldn’t need any prior knowledge about Gaussian…

Differential Equations Refresher

In my freshman year of college, I took an introductory differential equations class. That was nine years ago. I’ve forgotten pretty much everything, so I thought I’d review…

Fun with Likelihood Ratios

machine learning

Say you’re trying to maximize a likelihood \(p_{\theta}(x)\), but you only have an unnormalized version \(\hat{p_{\theta}}\) for which \(p_{\theta}(x) =…