Urban Arts Career Pathways Takeover
My team at work recently got to host a Career Pathways Takeover at Urban Arts, a nonprofit based in New York that teaches programming, animation, and storytelling to high schoolers to prepare them for success in college and beyond.
I was so impressed by the students’ thoughtfulness and engagement. We walked our group of Game Academy students through our team’s event planning process, and in less than an hour they put together a rock-solid plan for a community arts program to raise awareness about mental health issues, including how they’d use data to capture the event’s positive impact (where I fit in). I hope the vision can become a reality before long…!
Massive thanks to the Urban Arts team for hosting us and for all that you do through the Game Academy.
Apply for CLS
This year’s application for the Critical Language Scholarship from the US State Department is now open. (Thanks, group chat!)
CLS is a fully funded summer language immersion program for American undergraduate and graduate students. I strongly encourage anyone interested in learning one of the nine critical languages to apply. Some languages require prior language study; others accept beginners.
Feel free to email me if you are considering the CLS program or have questions about the experience. I completed CLS Korean in 2016 and have kept in touch with participants from lots of cohorts since then. Here is my advice post.
Good CLS blogs:
- Bethany Maz
- The Good Things Coming (Paula Zhang)
Python scripts
I cleaned up a few utility Python scripts for the GitHub:
- fetch_python_docs.py sets up a local mirror of docs.python.org on http://localhost:8004 for offline reference. It can also provide and enable a systemd unit file, so you can run the script once, bookmark the local URL, and forget about it.
- typography.py (which I mentioned here) checks for ASCII typography that can be better rendered as Unicode. For example, it recommends changing the hyphen in the page range 278-81 to an en dash.
- dated.py applies my obnoxious filename convention to create a dated working copy of a file, which is useful when collaborating with people who aren’t comfortable with version control systems.
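To make the page-range example concrete, here is a minimal sketch of that kind of check. It is not the actual typography.py code; the rule (a hyphen sitting directly between two digits) and the function names are assumptions made up for illustration.

```python
import re

# Assumed rule for this sketch: a hyphen directly between two digits
# probably marks a numeric range such as "278-81".
RANGE_HYPHEN = re.compile(r"(?<=\d)-(?=\d)")


def find_range_hyphens(text: str) -> list[int]:
    """Return the character offsets of hyphens that sit between two digits."""
    return [m.start() for m in RANGE_HYPHEN.finditer(text)]


def suggest_en_dashes(text: str) -> str:
    """Return the text with those hyphens replaced by en dashes (U+2013)."""
    return RANGE_HYPHEN.sub("\u2013", text)


if __name__ == "__main__":
    line = "See pages 278-81 for the well-known result."
    print(find_range_hyphens(line))
    print(suggest_en_dashes(line))
```

A rule this simple also flags ISO dates and phone numbers, so a real checker would likely only report the match and leave the decision to the writer.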
Tensions rise in the condaverse
GitHub has a tool called Dependabot that automatically finds outdated package
versions pinned in project configuration files and issues a pull request to
update them. Support for conda environment.yml files has long been one of the
most requested
features in the Dependabot repo. At long last, GitHub has now added partial
support for conda to Dependabot, first as a
beta announced last week,
and now
generally available.
But there have been some issues with the rollout.
The main appeal of conda over something like
Poetry, uv, or just
plain-old requirements.txt is that conda can manage arbitrary dependencies,
not just Python packages. You can
conda create --no-default-packages git micro compilers to set up a Fortran dev
environment if you want. Dependabot’s conda support includes only Python
packages. A few folks grumbled about this limitation in the GitHub issue
comments, but it’s understandable: The space of “all conda installable packages”
is vast indeed, and the Dependabot devs had to start somewhere.
A more compelling criticism of the new feature stems from the fact that
Dependabot determines the latest versions of Python packages by looking up the
names given in environment.yml on PyPI. This is a problem because PyPI is an
entirely different package ecosystem from conda. Some package versions are
released on PyPI well before they appear in conda repos, and some packages have
different names between the two.
For a nasty example, Ipopt is a nonlinear
programming solver written in C++, and
cyipopt provides Python
bindings. conda install ipopt installs the C++ library, and
conda install cyipopt installs the Python wrapper. But pip install ipopt
actually refers to cyipopt. The upshot, if I
understand correctly, is that if you pin ipopt in your environment.yml, then
Dependabot will check its version number against that of the latest version of
cyipopt, a flawed comparison.
Luckily, Ipopt/cyipopt is the only such case I could find in this Rosetta stone (the fact that this exists …) mapping package names across ecosystems. But anyone(ish) can post packages on PyPI, so the current behavior of Dependabot creates new opportunities for typo-squatting attacks on conda users. As Jannis Leidel (a conda maintainer) put it, “This premature rollout makes the conda ecosystem less secure and shouldn’t have occurred.”
I’m not sure what the right move is for Dependabot. For a start, they could use the Rosetta stone to map conda packages to the correct PyPI names, but this would only solve the naming issue, and not the possibility of different versions between the two repositories.
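As a rough sketch of the name-mapping idea, the lookup could go through a translation table before anything is queried on PyPI. The table below is a hand-written, hypothetical excerpt (not the real Rosetta stone data), and the choice to refuse a lookup for unmapped names is my assumption, not anything Dependabot is known to do.

```python
from typing import Optional

# Hypothetical excerpt of a conda -> PyPI name mapping.
# None means "no PyPI equivalent": do not look this package up on PyPI.
CONDA_TO_PYPI: dict[str, Optional[str]] = {
    "ipopt": None,         # compiled solver; the PyPI name "ipopt" refers to cyipopt
    "cyipopt": "cyipopt",  # Python bindings, same name in both ecosystems
}


def pypi_name_for(conda_name: str) -> Optional[str]:
    """Return the PyPI name to query for a conda package, or None.

    Unknown packages also return None: falling back to the conda name
    is exactly the behavior that opens the door to typo-squatting.
    """
    return CONDA_TO_PYPI.get(conda_name)


if __name__ == "__main__":
    for name in ["ipopt", "cyipopt", "some-unmapped-package"]:
        target = pypi_name_for(name)
        if target is None:
            print(f"{name}: skip, no trusted PyPI mapping")
        else:
            print(f"{name}: compare against PyPI package {target}")
```

Even with a complete table, this only picks the right package to compare against. The version gap between the two repositories remains.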
Is ABC-SMC just an evolutionary algorithm?
Suppose we have data and a model that expresses the data as a noisy function of a parameter vector. We want to determine a value of the parameter vector that fits the data. For the purposes of this post, we’re concerned with models that are “difficult,” meaning we cannot write down a simple expression for the likelihood function and maximize it, whether analytically (as in ordinary least squares) or numerically (as in nonlinear regression). In fact, all we really know how to do is sample data from the model when given an arbitrary parameter vector. (We’ll get a different sample every time, because the model is nondeterministic.)
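To pin down the interface this setup implies, here is a toy sketch. The model (normal draws with a mean and standard deviation), the names simulate and distance, and the mean-based distance are all made up for illustration. A normal model obviously has a tractable likelihood, so it only stands in for a genuinely difficult simulator.

```python
import random


def simulate(theta: tuple[float, float], n: int, rng: random.Random) -> list[float]:
    """Toy stand-in for a difficult model: n noisy draws given parameters theta."""
    mu, sigma = theta
    return [rng.gauss(mu, sigma) for _ in range(n)]


def distance(simulated: list[float], observed: list[float]) -> float:
    """Crude measure of fit (an assumption): absolute difference of sample means."""
    def mean(xs: list[float]) -> float:
        return sum(xs) / len(xs)
    return abs(mean(simulated) - mean(observed))


rng = random.Random(0)
observed = simulate((2.0, 1.0), 50, rng)  # stand-in for real data

# Two calls with the same parameter vector give different samples,
# so any measure of fit is itself a noisy quantity.
a = simulate((2.0, 1.0), 50, rng)
b = simulate((2.0, 1.0), 50, rng)
print(distance(a, observed), distance(b, observed))
```

The only operations available to a fitting procedure here are calling simulate with a candidate parameter vector and scoring the result against the observed data.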
If you enjoy Bayesian statistics, then you may have already pattern-matched this problem statement to the ABC-SMC algorithm. But if you are like me and view parameter estimation as an optimization problem (there is no reason to privilege this view; it’s just how I turned out), then you might instead apply an evolutionary algorithm. Below, I describe such an algorithm, then argue that ABC-SMC is a special case. This insight suggests improvements to the implementation and usage of both evolution and ABC-SMC.