I think the easiest way to think about it is that it's gradient descent, it's just a slightly weird form of it over a discrete space.
If you imagine you've got some neighbourhood function N(x) that takes a program and returns a large sample of valid programs that are "similar" to x (its neighbourhood), then your basic test-case reduction algorithm is just:
1. Start with some initial test case.
2. Look at the neighbourhood of your current test case. If any elements of it are both interesting and smaller than your current test case, move to one of those. If not, stop.
3. Return the smallest interesting test case you've seen at the end.
And most of what varies from test-case reducer to test-case reducer is that neighbourhood function (how you explore the neighbourhood also varies, but you can think of that as an implementation detail that mostly only affects performance).
This approach will generally work pretty well because the sort of bugs you tend to find are "dense in program space". This will e.g. do nothing if you start from the property that the file has a particular sha1 hash, but if you start from a property like "this uses this particular pair of language features", most nearby programs will share it.
The nice thing about doing test-case reduction is that it doesn't actually matter if your neighbourhood contains only valid programs, because you can rely on the interestingness test to filter out any invalid program. So all you care about really is:
1. It contains a fairly large set of valid programs among the test cases you consider.
2. It's not too large so as to be prohibitive to explore.
And this frees you up to just try a bunch of heuristics that are likely to work pretty well, and it turns out that there are a lot of easily accessible ones. In particular, deleting contiguous chunks of a program pretty often produces a valid program, which will always be shorter.
For a trivial example, imagine you've got something like:
if True:
blah()
in a Python program. It might look like you need to know about Python syntax to reduce this to just `blah()`, but in fact you can reduce it by deleting the 13-byte string `"if True:\n "`. So if your reducer is happy to brute force try all short strings, it will be able to make this transformation.
This isn't a real example exactly. C-reduce I think won't find this particular one (but I might be misremembering its exact heuristics here). Shrinkray will, but it will do so by understanding Python - it stops at 10-byte sequences, which won't work here. But it demonstrates the sort of thing that can work surprisingly well without understanding syntax.
Another thing you can do is you can understand just enough syntax in a general purpose way. For example, shrinkray (and I think C-Reduce) will look for integer literals in the program and try replacing them with integer literals representing a smaller number. In order to do this, it doesn't have to know anything about the program's lexical syntax, it just has to know what a number looks like.
Heuristics like this will often fail to work, but that's OK, you just need them to succeed enough, and often some pretty crude stuff will get you down to something that is small enough that a human can get it the rest of the way.
If you imagine you've got some neighbourhood function N(x) that takes a program and returns a large sample of valid programs that are "similar" to x (its neighbourhood), then your basic test-case reduction algorithm is just:
1. Start with some initial test case. 2. Look at the neighbourhood of your current test case. If any elements of it are both interesting and smaller than your current test case, move to one of those. If not, stop. 3. Return the smallest interesting test case you've seen at the end.
And most of what varies from test-case reducer to test-case reducer is that neighbourhood function (how you explore the neighbourhood also varies, but you can think of that as an implementation detail that mostly only affects performance).
This approach will generally work pretty well because the sort of bugs you tend to find are "dense in program space". This will e.g. do nothing if you start from the property that the file has a particular sha1 hash, but if you start from a property like "this uses this particular pair of language features", most nearby programs will share it.
The nice thing about doing test-case reduction is that it doesn't actually matter if your neighbourhood contains only valid programs, because you can rely on the interestingness test to filter out any invalid program. So all you care about really is:
1. It contains a fairly large set of valid programs among the test cases you consider. 2. It's not too large so as to be prohibitive to explore.
And this frees you up to just try a bunch of heuristics that are likely to work pretty well, and it turns out that there are a lot of easily accessible ones. In particular, deleting contiguous chunks of a program pretty often produces a valid program, which will always be shorter.
For a trivial example, imagine you've got something like:
if True:
in a Python program. It might look like you need to know about Python syntax to reduce this to just `blah()`, but in fact you can reduce it by deleting the 13-byte string `"if True:\n "`. So if your reducer is happy to brute force try all short strings, it will be able to make this transformation.This isn't a real example exactly. C-reduce I think won't find this particular one (but I might be misremembering its exact heuristics here). Shrinkray will, but it will do so by understanding Python - it stops at 10-byte sequences, which won't work here. But it demonstrates the sort of thing that can work surprisingly well without understanding syntax.
Another thing you can do is you can understand just enough syntax in a general purpose way. For example, shrinkray (and I think C-Reduce) will look for integer literals in the program and try replacing them with integer literals representing a smaller number. In order to do this, it doesn't have to know anything about the program's lexical syntax, it just has to know what a number looks like.
Heuristics like this will often fail to work, but that's OK, you just need them to succeed enough, and often some pretty crude stuff will get you down to something that is small enough that a human can get it the rest of the way.