Tensions rise in the condaverse
GitHub has a tool called Dependabot that automatically finds outdated package
versions pinned in project configuration files and issues a pull request to
update them. Support for conda environment.yml files has long been one of the
most requested features in the Dependabot repo. At long last, GitHub has now
added partial support for conda to Dependabot, first as a beta announced last
week, and now generally available.
But there have been some issues with the rollout.
The main appeal of conda over something like Poetry, uv, or just plain old
requirements.txt is that conda can manage arbitrary dependencies, not just
Python packages. You can run
conda create --no-default-packages git micro compilers
to set up a Fortran dev environment if you want. Dependabot’s conda support
includes only Python packages. A few folks grumbled about this limitation in the
GitHub issue comments, but it’s understandable: The space of “all conda
installable packages” is vast indeed, and the Dependabot devs had to start
somewhere.
A more compelling criticism of the new feature stems from the fact that
Dependabot determines the latest versions of Python packages by looking up the
names given in environment.yml on PyPI. This is a problem because PyPI is an
entirely different package ecosystem from conda. Some package versions are
released on PyPI well before they appear in conda repos, and some packages have
different names between the two.
For a nasty example, Ipopt is a nonlinear programming solver written in C++,
and cyipopt provides Python bindings. conda install ipopt installs the C++
library, and conda install cyipopt installs the Python wrapper. But
pip install ipopt actually refers to cyipopt. The upshot, if I understand
correctly, is that if you pin ipopt in your environment.yml, then
Dependabot will check its version number against that of the latest version of
cyipopt, a flawed comparison.
Luckily, Ipopt/cyipopt is the only such case I could find in this Rosetta stone (the fact that this exists …) mapping package names across ecosystems. But anyone(ish) can post packages on PyPI, so the current behavior of Dependabot creates new opportunities for typo-squatting attacks on conda users. As Jannis Leidel (a conda maintainer) put it, “This premature rollout makes the conda ecosystem less secure and shouldn’t have occurred.”
I’m not sure what the right move is for Dependabot. For a start, they could use the Rosetta stone to map conda packages to the correct PyPI names, but this would only solve the naming issue, and not the possibility of different versions between the two repositories.
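The name-mapping idea can be sketched in a few lines of Python. This is only an illustration: the table below is a toy stand-in whose entries are assumptions (not copied from the real mapping), and the function names are hypothetical. The one design choice worth noting is that unknown names are skipped rather than guessed, so a conda package with no known PyPI counterpart is never compared against a same-named PyPI package.

```python
# Toy stand-in for a conda -> PyPI name table.
# The entries are illustrative assumptions, not the real mapping.
CONDA_TO_PYPI = {
    "numpy": "numpy",
    "cyipopt": "cyipopt",
    "ipopt": None,  # C++ library: nothing on PyPI to compare against
}


def pypi_name_for(conda_name, mapping=CONDA_TO_PYPI):
    """Return the PyPI name to query, or None if the lookup should be skipped.

    Names missing from the table also return None, so they are never
    looked up on PyPI under a guessed name.
    """
    return mapping.get(conda_name)


def names_to_check(conda_names, mapping=CONDA_TO_PYPI):
    """Map each conda package that has a known PyPI counterpart to that name."""
    checks = {}
    for name in conda_names:
        pypi = pypi_name_for(name, mapping)
        if pypi is not None:
            checks[name] = pypi
    return checks
```

Calling names_to_check(["numpy", "ipopt", "cyipopt", "git"]) keeps only numpy and cyipopt. As noted above, this addresses naming only; the version skew between the two repositories remains.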
Is ABC-SMC just an evolutionary algorithm?
Suppose we have data y and a model that expresses y as a noisy function of a parameter vector θ. We want to determine a value of θ that fits the data. For the purposes of this post, we’re concerned with models that are “difficult,” meaning we cannot write down a simple expression for the likelihood function and maximize it, whether analytically (as in ordinary least squares) or numerically (as in nonlinear regression). In fact, all we really know how to do is sample data from the model when given an arbitrary θ. (We’ll get a different y every time, because the model is nondeterministic.)
If you enjoy Bayesian statistics, then you may have already pattern-matched this problem statement to the ABC-SMC algorithm. But if you are like me and view parameter estimation as an optimization problem (there is no reason to privilege this view; it’s just how I turned out), then you might instead apply an evolutionary algorithm. Below, I describe such an algorithm, then argue that ABC-SMC is a special case. This insight suggests improvements to the implementation and usage of both evolution and ABC-SMC.
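To make the setup concrete, here is a minimal Python sketch of fitting a simulate-only model with a generic select-and-mutate loop. It is not necessarily the algorithm described in the full post, and everything in it is an assumption chosen for illustration: the toy model (a scalar parameter plus Gaussian noise), the distance (a gap between means), and the names simulate, distance, and fit.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Stand-in for a 'difficult' model: all we can do is draw noisy data."""
    return [theta + rng.gauss(0.0, 1.0) for _ in range(n)]


def distance(simulated, observed):
    """Mismatch between simulated and observed data (here, a gap in means)."""
    return abs(statistics.fmean(simulated) - statistics.fmean(observed))


def fit(observed, rng, pop_size=50, n_keep=10, generations=20, step=0.5):
    """Estimate the parameter by repeated simulation, selection, and mutation."""
    n = len(observed)
    # Initial candidates drawn from a broad (assumed) range.
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    survivors = population
    for _ in range(generations):
        # Score each candidate by simulating fresh data from it.
        scores = [distance(simulate(theta, n, rng), observed) for theta in population]
        ranked = [theta for _, theta in sorted(zip(scores, population))]
        # Keep the candidates whose simulated data best matched the observations.
        survivors = ranked[:n_keep]
        # Next generation: mutated copies of randomly chosen survivors.
        population = [rng.choice(survivors) + rng.gauss(0.0, step) for _ in range(pop_size)]
    return statistics.fmean(survivors)
```

For example, generating observed = simulate(3.0, 100, rng) with rng = random.Random(0) and then calling fit(observed, rng) should return an estimate near 3. Note that the loop never evaluates a likelihood; it only samples from the model and compares the samples with the data.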
Full of types
I’m obsessed with this essay “Raising a person in a culture full of types” by Dan Brooks. Before we get to the author’s message, let’s take a moment to appreciate his sense of comedic timing, e.g.:
My son talks incessantly about VSCO girls and Karens and other categories of people he has learned about from YouTube. He described a classmate as “the kind of person who borrows your pencil and doesn’t give it back,” i.e. she borrowed his pencil and didn’t give it back. For a while he tried to propagate a type of his own invention, “the Suzan,” whose behavior was ill-defined but tracked closely with that of my mother of the same name. It did not catch on, and eventually he concluded that he was not the kind of person who could come up with memes.
Is it just me, or is this a highly clever paragraph?
Brooks’s point, expressed better there than I will here, is that our culture’s emphasis on “being” over “doing” prevents us from separating people’s actions from their destiny. At best, the idea that people belong to fixed, inescapable categories is merely the antithesis of the growth mindset; it saps motivation from our desire to try new things (why learn programming if I am not a “math person”?). At worst, as Brooks points out, “The illusion of a fixed nature gives us an excuse to repeat bad behavior,” because every mistake is like a movie trailer for the rest of your life. And like a TV commercial, the fixed mindset is an illusion with a slope (sorry): it inclines us toward the most automatic decision instead of assessing the alternatives on their merits.
I will not pretend to be a paragon of the growth mindset. I talk myself out of good ideas, such as learning how to actually cook, all the time. But if I could choose to have one impact on the world, it would be to motivate those around me to reject limiting beliefs and embrace challenge—even at the risk of failure. And if linking you again to this essay, which has a wonderful subplot about fatherhood, is the way to do it, then, OK.
Jekyll plugin to recommend related posts
I wrote my first plugin for the Jekyll static website builder: a tool that
recommends related posts at the end of each page. It determines the similarity
between post pairs using a fairly unremarkable token-counting algorithm, so it’s
fast enough to rerun on every site build. You can configure the number of posts
to recommend and a factor parameter, which determines the algorithm’s
sensitivity to rare vs. common words.
I made a little demo of the plugin with a fake blog whose posts are the articles of the UN Universal Declaration of Human Rights. You can also see a demo on the current version of this site if you click the “read more” link below to go to this post’s individual page. I think it works pretty well!
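For readers curious what a token-counting recommender can look like, here is a rough Python sketch. It is not the plugin’s code (the plugin is written for Jekyll, and its exact scoring is not described here). The weighting scheme, the function names, and the way factor is used are all assumptions: each shared token contributes (number of posts / number of posts containing it) raised to factor, so factor = 0 counts every shared word equally and larger values favor rare words.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def related_posts(posts, title, count=2, factor=1.0):
    """Rank the other posts by weighted token overlap with the post `title`.

    `posts` maps titles to body text. Ties are broken alphabetically by title.
    """
    tokens = {name: tokenize(body) for name, body in posts.items()}
    # Document frequency: in how many posts does each token appear?
    doc_freq = Counter(tok for toks in tokens.values() for tok in toks)
    n_posts = len(posts)

    def weight(tok):
        # Rare tokens get larger weights; factor controls how much larger.
        return (n_posts / doc_freq[tok]) ** factor

    scores = {
        other: sum(weight(tok) for tok in tokens[title] & toks)
        for other, toks in tokens.items()
        if other != title
    }
    ranked = sorted(scores, key=lambda other: (-scores[other], other))
    return ranked[:count]
```

Because the scores depend only on token sets and counts, recomputing them for every pair of posts is cheap for a blog-sized site.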
Some things I tried recently
Kagi Search: It’s a paid search engine that promises to give better results than Google and friends. Indeed, the search results are a little more relevant, especially when researching technical topics. I made great use of the ability to filter and promote entire domains. However, the pricing doesn’t work for me: $5/month gets you 300 searches, which isn’t enough (I burned through the free 100 searches in a week), and for unlimited searches, you have to pay $10 for a bundle deal that also includes AI stuff I don’t care about. Kagi wants to become an everything app (probably adding email soon), which is a tough sell while claiming to be a privacy-focused company. (Same issue with Proton, by the way.)
Fender Studio: Fender, the guitar company, just kind of threw this over the fence in May. It’s a free (but not open source) digital audio workstation, so it competes with the likes of Ardour and GarageBand. But Fender Studio runs on Linux, and quite well at that. On my machine, it supports JACK with minimal configuration and achieves lower latency than Guitarix while doing a lot more. The vendored backing tracks are a bit cheesy but well engineered.
Proselint: It’s a prose … linter, i.e. you feed it your draft blog post and it
complains about vague wording and common typography problems like curly vs.
straight quotes. I like that Proselint uses regex instead of an LLM, so there’s
no creative interference; it’s more like an automated style guide than a chatty
editor. But my homegrown typography.py script (I need to upload this to GitHub
sometime) enforces a few lesser irks, such as en dashes in numerical ranges,
that Proselint lets be, so I’m still using both.
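As an illustration of the regex approach, a rule for en dashes in numerical ranges might look like the following Python sketch. This is not the typography.py script mentioned above; the function name and the pattern are assumptions. The lookarounds are there so that hyphenated number chains such as ISO dates are left alone.

```python
import re

EN_DASH = "\u2013"

# A hyphen between two runs of digits, where neither run touches
# another digit or hyphen (so 2024-01-15 is not treated as a range).
RANGE_PATTERN = re.compile(r"(?<![\d-])(\d+)-(\d+)(?![\d-])")


def fix_ranges(text):
    """Replace the hyphen in simple numerical ranges with an en dash."""
    return RANGE_PATTERN.sub(rf"\1{EN_DASH}\2", text)
```

For instance, fix_ranges("pages 10-20") swaps the hyphen for an en dash, while a date like 2024-01-15 passes through unchanged. A real style checker would need more exceptions (phone numbers, version strings, subtraction), which is part of why such rules tend to stay homegrown.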