Theoretical computer science can be fun. But it can also be really annoying!
Around the time most schools were starting the 2014-15 school year I saw a post on Reddit in either a math or CS group from someone asking for how to solve a particular theoretical CS problem. The poster didn't say where the problem was from or why they needed a solution, but others quickly figured out it was from someone trying to cheat on the first homework set from COMS 331 (Theory of Computing) at Iowa State University.
No one helped and the poster deleted their post. I thought it was an interesting problem and gave it a try, expecting it to be a short but interesting diversion.
I didn't expect it to take too long because although I was not a CS major (I was a math major) I did take all of the undergraduate theoretical courses at my school and I got decent grades in them. That had been ~35 years earlier so I had forgotten a lot but the problem was from the first COMS 331 homework set so shouldn't require anything past the first week or so of that class, which should all be fairly basic stuff I would still remember.
I spent a couple days on it and got absolutely nowhere. Several times since then I've remembered it, thought about it for a few hours or a day and have continued to completely fail.
If anyone is curious, here is the problem:
Define a 2-coloring of {0, 1}∗ to be a function χ : {0, 1}∗ → {red, blue}. (For example, if χ(1101) = red, we say that 1101 is red in the coloring χ.)
Prove: For every 2-coloring χ of {0, 1}∗ and every (infinite) binary sequence s ∈ {0, 1}∞, there is a sequence
w₀,w₁,w₂,···
of strings wₙ ∈ {0, 1}∗ such that
(i) s = w₀w₁w₂ ···, and
(ii) w₁, w₂, w₃, · · · are all the same color. (The string w₀ may or may not be this color.)
Let us say that an index i is bad, if every finite subsequence of s starting at i is red (i.e. for every j ≥ i we have χ(s_i ... s_j) = red). Two cases:
Case 1: there are infinitely many bad indices. Here we go to the first bad index then the second, and so on. The colour of w₀ does not matter, and since subsequent words start at a bad index, they will all be red.
Case 2: there are finitely many bad indices. Then there is some k which is larger than all bad indices. We start by going to k (again, the colour of w₀ does not matter). Since k is not bad, there must be some blue word starting at k. We take that one and move to a larger index. Again, that index is not bad. We repeat this process to find our sequence.
Nice proof, similar to the one I posted just now but simpler -- you realised that we only need one special category ("bad" rather than "hard-red" + "hard-blue"). Gonna leave mine up though :)
I don't feel like doing a formal proof, but it looks like it should be a direct consequence of the pumping lemma:
"Informally, it says that all sufficiently long strings in a regular language may be pumped—that is, have a middle section of the string repeated an arbitrary number of times—to produce a new string that is also part of the language." [1]
Needless to say, this exercise would be trivial if you just covered the pumping lemma and its applications in class, and next to impossible if you never heard of it.
PS. I took 15-251 back when it was 15-299: a brand new class without a regular number assignment. Honestly, I would have enjoyed it a lot more now than I did back then. But several assignments still stand out for me, in particular "Building from scratch" [2]. Trying to get some of that feeling now, working through Turing Complete[3] with my daughter.
My first idea is to think of this as the following game. Initially we consider all words uncolored and I start by picking an initial set of red words that can color any infinite binary sequence in red. Then you get to take away one of my words and make it blue. After that I get to pick replacement words from the remaining uncolored words to fix my set. Rinse and repeat.
I start with { 0, 1 } which will obviously do. You take - I guess without loss of generality - my 0. I pick 00 and 01 as replacements and end up with { 00, 01, 1 } which will work again. You take away my 01, I replace it with 010 and 011 and { 00, 010, 011, 1 } will still work, I guess, but it is certainly becoming less obvious. In general I will pick x0 and x1 as replacements if you take x away which will allow me to still color x, I just have to choose between x0 and x1 depending on the following bit.
If I am not mistaken, you will have to take away an infinite set of words from me in order to prevent me from being able to color any infinite binary sequence in red, if you take only finitely many, I will be left with a set of red words that still works. My guess is now that if you take away that infinite set in order to stop me, you will accidentally assemble a set of blue words that will be able to color any infinite binary sequence blue. Unfortunately I do not have the time right now to properly think this through.
I am also not too confident because I do not clearly see how being allowed to have the first word in the wrong color comes into play. On the other hand, maybe, maybe you need my 1 for sequences starting with one or something like that.
EDIT: Just had an additional thought, this might actually work. If you take my 0, then you can not also take my 1 later on or you will end up with { 0, 1 }. If you take 01, then you can not also take 10 later. So you always have to take one of my two latest replacements and leave the other one to me forever. We will build complementary sets, you have 0, I have 1, you have 01, I have 10, you maybe 011, I 010, ... Now it seems quite plausible that this will work out and we end up with two complementary set that both can color all infinite binary sequences in [mostly] a single color.
I have thought a bit more about this since. Arrange all words in a rooted binary tree with the empty word Ɛ at the root and each vertex w having color χ(w) and children w0 and w1. Each cut [1] through the tree yields a set of words that can be used to cover all infinite binary sequences. Therefore, if there is a monochromatic cut, then we are done. To obstruct a monochromatic cut of color A, there must be a monochromatic obstructing path of color B from the root all the way down to infinity. To obstruct red and blue cuts, we need a red and a blue obstructing path, and you gave an example. So additional work is required to show that the task is also possible if the coloring contains a red and a blue cut-obstructing path.
[1] With cut I mean a set of vertices where no vertex in the cut is a descendant of any other vertex in the cut. Informally, repeatedly pick a vertex in the tree and remove the subtrees rooted at its children, the leaves of the remaining tree are the cut.
Interesting problem! I've never formally studied computer science, and my math degree is far behind me, so I probably won't be the one to help you solve this, but I'm commenting to at least remind myself to check back here.
Some thoughts that immediately spring to mind (apologies if you've already thought of these - just getting the brain flowing!):
* We only need consider colorings for which 0 and 1 are differently coloured. If 0 and 1 are the same colour, then we're trivially done - let w1, w2, w3, etc. all have length 1
* this smells like it'll be some sort of combination of induction and contradiction? E.g. "assume this property is true for w1w2w3...wn-1, and that wlog they are red. Assume there is no wn that can continue the sequence of red substrings - thus, all strings starting from the end of wn are blue. If that's the case _and_ if we can convert the string [w1...wn-1] into blue substrings, then we're done. So show that there is some contradiction proving that impossible (maybe using the observation I made above that we only care about cases where 1 and 0 are different colours)"
I suspect that if you have a sequence w_0, ... w_n with the property but all possible remaining words in your sequence following w_n are the wrong color, then you can use set w_0' = concat(w_0, ... w_n) and declare victory.
You can't induct for this problem because induction will only show a property for all finite strings. Arbitrary-length finite strings, but still finite.
The following proof is much tighter than the meandering process I followed to get to it. Let me know if the meandering would be interesting!
Call a position i in s "hard-red" (respectively, "hard-blue") if every positive-length substring of s beginning at i is red (respectively, blue); otherwise call i "soft". A position is "hard" if it is hard-red or hard-blue.
There are 3 possible cases:
1. s has no hard positions.
2. s has a positive but finite number of hard positions, with the last being i.
3. s has an infinite number of hard positions.
Case 1
Every position is soft, meaning that if we start a substring of s at that position and continue to grow it by appending digits from s, eventually (after a finite number of steps) the colour of the substring will change. So we can set w₀ arbitrarily to the first digit of s, then grow w₁ from position 2 of s until it has the same colour as w₀. Then we can grow w₂ from the next digit in s until it has the same colour, and so on. This results in all words having the same colour, including the first.
Case 2
Set w₀ to the first i-1 digits of s, and w₁ to the i-th digit of s. All positions > i are soft, meaning that, as for case 1, we can repeatedly grow substrings by appending digits from s until the substring turns the same colour as w₁.
Case 3
Since s has an infinite number of hard positions, it must have an infinite number of hard-red positions, an infinite number of hard-blue positions, or both. Suppose w.l.o.g. that it has an infinite number of hard-red positions (it may or may not also have an infinite number of hard-blue positions). Define p(k) to be the k-th hard-red position in s. Set w₀ to the first p(1)-1 digits of s, and for k >= 1 set word w_k to the substring of s beginning at p(k) and ending at p(k+1)-1. w₁, w₂, w₃, ... all begin at hard-red positions, so are all red. □
The cardinality of the set if possible infinite binaries sequences is uncountable. My gut tells me there must be a way to assume the negation and construct a bijection between the naturals, leading to a contradiction.
This would be an upper level undergraduate class, probably juniors and seniors.
You would have taken courses in discrete mathematics and data structures and algorithms as prerequisites, and have familiarity with proofs and some degree of mathematical maturity.
You build yourself up to a class like this by taking all the prereqs, same as any other high school graduate.
> I know that I have no chance in real computer science department.
Don't say/believe that and limit yourself. You just need to find the right books and slowly educate yourself (never mind what others say). I have spent a lot of time collecting and reading books much of which i still don't "grok fully" but what i do understand is intellectually very stimulating and gives me an edge over the competition (when needed in the industry).
Life can be much broader once you discover one simple fact. And that is, everything around you that you call life was made up by people no smarter than you (Steve Jobs) - https://www.youtube.com/watch?v=kYfNvmF0Bqw
What one man can invent, another can discover. - Sir Arthur Conan Doyle (via Sherlock Holmes).
There is no prison as strong and unbreakable as the mental prison you choose to build and stay in - Me :-)
Can this be solved with a proof by contradiction? Let's assume that this is impossible for our sequence. There must then be at least one (sub)word of the wrong color. For this to be fatal, it must be impossible for any word containing it to be of the correct color while having the rest of the sequence (after and before that word) continue to be the same color, no matter how large we make this word to the right, because if this wasn't true, we could then simply use this bigger word and there would be a contradiction.
If this is true for more than one word, the problem can be solved by simply inverting the chosen color and using two larger words containing both subwords, which will always be of the same, originally wrong color, which leads to a contradiction.
Otherwise, if this is only true for one subword, and therefore the initial sequence and the rest of the sequence around cannot be made to be worded correctly while the subword is contained in a word of the correct color, but that the rest of the sequence can be covered with words of the same color, we can simply include this problematic subword inside w₀
In either of the three cases 0, 1 or 2+ "toxic subwords", it is always possible to find words of the same color covering the sequence. Therefore, there can be no sequence for which it is impossible to find a suitable w₀w₁w₂ ··, and the original proposition is proven.
Please tell me if you find any issue with this approach!
Around the time most schools were starting the 2014-15 school year I saw a post on Reddit in either a math or CS group from someone asking for how to solve a particular theoretical CS problem. The poster didn't say where the problem was from or why they needed a solution, but others quickly figured out it was from someone trying to cheat on the first homework set from COMS 331 (Theory of Computing) at Iowa State University.
No one helped and the poster deleted their post. I thought it was an interesting problem and gave it a try, expecting it to be a short but interesting diversion.
I didn't expect it to take too long because although I was not a CS major (I was a math major) I did take all of the undergraduate theoretical courses at my school and I got decent grades in them. That had been ~35 years earlier so I had forgotten a lot but the problem was from the first COMS 331 homework set so shouldn't require anything past the first week or so of that class, which should all be fairly basic stuff I would still remember.
I spent a couple days on it and got absolutely nowhere. Several times since then I've remembered it, thought about it for a few hours or a day and have continued to completely fail.
If anyone is curious, here is the problem:
Define a 2-coloring of {0, 1}∗ to be a function χ : {0, 1}∗ → {red, blue}. (For example, if χ(1101) = red, we say that 1101 is red in the coloring χ.)
Prove: For every 2-coloring χ of {0, 1}∗ and every (infinite) binary sequence s ∈ {0, 1}∞, there is a sequence
w₀,w₁,w₂,···
of strings wₙ ∈ {0, 1}∗ such that
(i) s = w₀w₁w₂ ···, and
(ii) w₁, w₂, w₃, · · · are all the same color. (The string w₀ may or may not be this color.)