DRMacIver's Notebook

Thoughts from David R. MacIver


Ensuring Downward Paths

I had some half-formed thoughts about test-case reduction that I wanted to work through, and this notebook is a place for working through half-formed thoughts, so here we go...

The thing I've been working on on and off for about a month now is using language inference to figure out new reduction passes.

It can be thought of as a way of taking an idea that doesn't work (using L* for test case reduction) and combining it with an idea that works but doesn't generalise well (learning a table of small string substitutions to make), and getting an idea that does work and does generalise.

This has me thinking: What, exactly, does it mean for an approach to test-case reduction to generalise?

Also does this approach generalise well? The big question with it is basically whether it works well when you change the interestingness test. I think it will, but it probably depends a lot on the choice of interestingness test and how close your reducer is to normalising it already.

In the approach to test-case reduction I favour, test-case reduction starts from set of test cases \(X\) and a total order \(\preceq\) over \(X\) called the reduction order, and a test-case reducer is a function \(r(U, x)\) where \(x \in U \subseteq X\), \(r(U, x) \in U\), and \(r(U, x) \preceq x\). For me usually \(X\) is a set of strings and \(\preceq\) is the shortlex order where \(x \preceq y\) if \(|x| < |y|\) or \(|x| = |y|\) and \(x \leq y\) lexicographically.

Your prototypical test-case reducer starts from a bunch of small-step reductions, which gives you a function \(t: X \to X^{<\omega}\), where \(t(x)_i \prec x\), and the reducer greedily tries each of \(t(x)_i\) in order. If none of them are in \(U\) it returns \(x\), otherwise it it recurses and returns \(r(U, t(x)_i)\) for the first \(i\) with \(t(x)_i \in U\).

More generally given such a function we might say that a reducer is \(t\)-local if it satisfies the following conditions:

  1. \(t(U, x) = x\) if and only if \(t(x)_i \not\in U\) for all \(i\).
  2. For all \(U, x\) is some sequence \(x = x_0, \ldots, x_n = r(U, x)\) with \(x_{i + 1} \in t(x)_i\) (note that we don't necessarily require that \(x_i \in U\)).

That is, basically a \(t\)-local reducer is one that looks like the prototypical reducer but may have some optimisations and fiddly bits. e.g. delta debugging is a \(t\) local reducer where \(t\) is maps \(x\) to all test cases that can be formed by deleting a single element of \(x\). Delta debugging starts by trying large bulk combinations of these deletions and gradually narrows it down until it's only trying the small ones. Another approach which uses the same \(t\) but a different algorithm is what I've called adaptive delta debugging.

Most reducers implemented in practice are roughly \(t\)-local for some \(t\) (often they fail to be \(t\)-local in some cases because they hit some internal limit and stop when they could be making more progress, and rerunning the reducer may produce further redctions) and I think this is good because it allows you to reason in terms of what transformations they are capable of performing.

So lets assume we have a \(t\)-local reducer. What makes it good?

Basically the tradeoff is this:

  1. The larger \(|t(x)|\) tends to be the better you will reduce \(x\).
  2. The smaller \(|t(x)|\) is the better your reducer tends to perform.

The ideal seems to be that you want \(|t(x)| = O(|x|)\), as per delta debugging, and the trick is to tweak the set within that constraints to try to get good results.

(Note that having \(|t(x)| = O(|x|)\) doesn't mean that the overall reducer runs in \(O(|x|)\) - even if you reduce the size at every step you can easily be made to make \(O(|x|^2)\) membership checks, and if you just make lexicographic transformations it's easy for performance to get exponentially bad)

The problem that you run into is that basically what you really want is not just for \(|t(x)|\) to be large, but for it to contain values that are in some sense "likely to be in \(U\)". One way this can fail to happen is when there is some sort of precondition on the range of \(U\) we are interested in. The biggest example of this is when interesting test cases have to be syntactically valid according to some grammar. When this happens, effectively you are censoring \(t\) by replacing it with its syntactically valid subset. Often when something like delta debugging gets stuck it's not because it's not because of any intrinsic limitation related to the interestingness test, it's just ended up at a point where \(t(x)\) contains no syntactically valid test-cases.

So, essentially what we want in order to get a good reducer for a format is to try to ensure that \(t(x)\) is always well populated with syntactically valid reductions of \(x\).

One way to do this might be to artificially construct interestingness tests that are basically designed to expose small reductions. I'm not quite sure what those would be. One thing I was thinking of is having interestingness tests of where \(x\) is interesting if \(y \preceq x\) for some fixed \(y\), then once those are normalised gradually "censor" the tests by removing all intermediate parts, stopping when you can't learn anything other than the direct jump to \(y\).

This can be thought of as ensuring that whenever the final reduced result is \(y\) you have as rich a set of paths leading to it as the learning process is possibly able to find.

I don't yet know how likely this is to work. I'm planning to do more experiments soon on getting this running on hypothesis-csmith and we'll see how well it can learn that, but I think there's some interesting stuff to investigate here.


From a Certain Point of View

From Jonathan Strange & Mr Norrell by Susanna Clarke, page 480:

"Artists are tricky fellows, sir, forever reshaping the world according to some design of their own," said Strange. "Indeed they are not unlike magicians in that."

A thing I often think about is the blurry line between true and useful, communication and performance.

In the context of the story, this is about visual art, but the same happens with the written or spoken word: When we attempt to convey facts, we end up summarising, simplifying, focusing on what we think is important about it, and often when we do this we end up saying things that are not, in the strictest sense, true.

(In context, Norrell is complaining about a painting that has a mirror where there is none).

Sometimes this is because we elide crucial details. Sometimes it's a lie-to-children. Sometimes it's to give the story the true aesthetic without the elaborate scaffolding needed to convey that aesthetic accurately (the mirror). Sometimes, of course, it's just a lie.

I think we end up doing this more than we think, because memory is more fallible than we like to treat it as, and we tend to think more in terms of aesthetic and narrative than detailed recollection of facts. In this sense we are all artists.


Standardised Facts

From Seeing Like a State by James C. Scott, page 80:

State simplifications have at least five characteristics that deserve emphasis. Most obviously, state simplifications are observations of only those aspects of social life that are of official interest. They are interested, utilitarian facts. Second, they are also nearly always written (verbal or numerical) documentary facts. Third, they are typically static facts. Fourth, most stylized state facts are also aggregate facts. Aggregate facts may be impersonal (the density of transportation networks) or simply a collection of facts about individuals (employment rates, literacy rates, residence patterns). Finally, for most purposes, stateofficials need to group citizens in ways that permit them to make a collective assessment. Facts that can be aggregated and presented as averages or distributions must therefore by standardized facts. However unique the actual circumstances of the various individuals who make up the aggregate, it is their sameness or, more precisely, their differences along a standardized scale or continuum that are of interest.

This is a very interesting passage that I think I probably just nodded and skimmed over when first reading this book. I find that often the random page approach is a good way of highlighting such passages.

I don't have a huge amount to say about this passage right now (because I'm ill and it's hot and as a result I am quite stupid today) but I feel like it's especially relevant right now during the COVID-19 epidemic, where we're seeing a lot of this sort of thing. Death rates, hospitalisation rates, etc. are very much an example of standardised facts, and are eliding a lot of the weird and very unpleasant ways in which people's COVID-19 symptoms differ.


Politics as long periods of boredom punctuated by moments of sheer terror

From Democracy in Small Groups by John Gastil, page 145:

Consensus and democracy didn't always work to the satisfaction of the faculty and students at the Meeting School. Even as students obtained ever-greater power, they remaind reluctant to deal with many important issues.

I think an under-appreciated constraint on the running of human civilisation is just how incredibly boring well-functioning politics is. (2020 politics is of course, sadly, not boring).

Worse, it manages to be simultaneously:

I feel like this is a combination of factors that a democratic process is almost uniquely poorly suited for, which is I guess why we end up delegating so much of the running of a country to its civil service.

But there end up being all sorts of things which have these properties and you can't delegate to a civil service, because they require actual policy decisions, and I feel like we don't really have a good handle on how those work (or, at least I don't have a good handle on how those work. Is there some sort of ethnographic study of the UK parliament I could read?).

The worst combination seems to happen when a decision requires an understanding of all these boring details, but people who are not willing to believe that have democratic oversight over the decision anyway. This is basically how you end up with Brexit.

On the other hand, I know enough about how the sausage is made that I wouldn't want a country run entirely by the civil service either (I admit I might still prefer this to the current lot). Experts are, ultimately, still people, and those people do in some way need to be aligned with the wishes of the populace, at least to the degree that those wishes are reality-based and not actively murderous.

(OK, maybe I don't want much democracy in the UK, as we seem to be pretty bad at both halves of that)

I don't have any sort of solution here. I often describe the problem of how to coordinate large groups of people whose interests are only weakly mutually aligned as the fundamental problem of civilisation, so it's not surprising that I'm not going to solve it in a dashed off notebook post, but I do feel like managing this boredom aspect is more important than I had previously realised.

Unfortunately, currently we manage the boredom of politics by getting increasingly frustrated with it and, as a result, eventually deciding to make politics exciting, and that doesn't go so well.


Inclusivity and Onboarding

From Talking About Machines by Julian Orr, page 70:

The conversation of a closely cooperating group can be quite cryptic when the members are sharing information about work; they are considering a well-defined field which can be discussed with considerable economy, verging on code.

A thing I have been thinking about a lot recently is the trade off between making a community accessible and the community being able to serve the needs of its core members.

In Being Deep in an Abstraction Stack I talk about how you need a community of people who share your abstraction stack and there is really no substitute for this no matter how much we want one.

But unfortunately by creating such a community there is a certain amount of inherent gatekeeping: It is a community for people who have that abstraction stack, therefore if you do not have that abstraction stack you cannot participate in the community as a full member. I don't necessarily mean this as a moral judgement - sometimes gateekeeping is good - but there is nevertheless a certain amount of elitism implicit in the existence of such communities. In order to talk about our niche interests we need this cryptic shared language, and attempts to make the group more inclusive by avoiding that will also make it worse for us.

I don't really have an answer to that other than "deal with it". "I need other people to talk about my interests with" is a nice problem to have, but it's still a need and it's one that it is reasonable to satisfy, and creating slightly exclusionary communities is pretty much the only way to solve that problem..

Part of why I've been thinking about this is Weird CS Theory Discord. The opening paragraph includes the sentence:

You don’t have to “be” a computer scientist, or even know much computer science, to be here, but if you’re not interested in computer science you probably won’t enjoy the space.

This hopefully makes it clear that we're not looking to gatekeep (I originally typoed that as "gategeek" which, well, valid) based on formal qualifications, but nevertheless there is a reasonably high bar to participating in the space. People are welcome to join, and they're welcome to learn to become full participants through their experience in the space, but that is there.

But, at the same time, we're a tiny free community that I created to serve my particular need for more chatter about weird CS theory. We're absolutely not a computer science institution. We cannot invest a tonne of effort in onboarding people, the best we can possibly hope to do is to be patient and welcoming to people who are still in the process of onboarding themselves.

One problem that occurs, and is important to avoid, is when this lack of inclusitivity becomes considered a feature rather than an unfortunate trade off. The specialised language becomes not a useful shorthand but a shibboleth. I have tried to head that off by making it clear that asking us to explain ourselves is always valid:

[You are encouraged to] Ask questions. (“Can you explain that?” is always fine, though answering “No” is fine too)

We have a specific norm that is designed to give people the space to indicate that they need help understanding something, but also a specific norm that we should not exhaust ourselves trying to bring people in. If people want to bring themselves in, the door is open, but we can only help so much.

Unfortunately, this too is exclusionary: Many people don't have the time or resources to do that work themselves. I feel bad about this, but ultimately the world is still a better place for the community, and there is only so much I can do about it.