Gvildenu | Mintys / AutomatųLaipsnynas

See also: AutomataTheory, NPComplete

Hi Joseph, Tom, Sean,

I suppose I hope to get your attention. I think I have thought through a novel way to approach automata theory, one that opens up new domains in mathematics, and also metaphysics.

Tom Munnecke wrote to the Blue Oxen collaboratory, presumed copyright: http://collab.blueoxen.net/forums/yak/2003-11/msg00110.html "Re: Chomsky, transformational grammars, etc. I have a fuzzy notion (actually all of this is fuzzy) of a "maturity model" for domains of discourse that they begin with finite state models (pigeonholing thoughts into predefined addresses - poke the address and out comes the thought)... then move to generative grammars (common deep structure with arbitrary number of transformations to a "surface structure.") then moving to self-referential evolutionary structures (crossing an "organic threshold" of being cat-like instead of toaster-like)."

At UCSD, I took a graduate course in automata theory, based on "Introduction to Automata Theory, Languages, and Computation" by John E. Hopcroft and Jeffrey D. Ullman, and was quite intriguted by the Chomsky hierarchy of the classes of languages generated by different kinds of automata.

Philosophically, I rejected the hodge podge approach to defining the classes, and looked for a universal, elegant, and metaphysically profound way to distinguish the classes, and the kinds of automata.

I believed that you could actually look at the production rules for a language and say: here the automata is, qualitatively, behaving like a finite automata, or a push-down automata, or a Turing machine. I believed that you could do this qualitatively, for each production rule. For example, you could point to a Turing machine and say: here it's only behaving as a finite automata. My teacher rejected this approach on the pedantic grounds that the definitions only made sense for the language as a whole, not for particular calculations.

I have found a way to formalize my intuition!

I use matrices. Only three. One matrix X for the variables. It's a diagonal n x n matrix, so the off-diagonal entries are zero. The diagonal entries are either 0 or 1, but for now they are n variables, the n states X(ii). Another n x n matrix A to define the alphabet of the language. I realized that, for my purposes here, I could by introducing as many variable as needed, working with matrices as large as needed, assume that the production rules for a finite automata are all of the form X(ii) -> A(ij)X(jj) where A(ij) is a unique letter in the alphabet, and may also be 0.

In my formalism, we will see that all of the words that result are strings that concatenate the letters A(ij) and always according to the rules of matrix multiplication!

The secret is in the interpretation! The four classes are generated by the following rules for matrix multiplication:

X -> AX finite automata, regular languages X -> AXA push-down automata, context free languages XA -> AX linear bounded automata, context sensitive languages XA -> X Turing machines, recursively enumerable languages

In other words, thinking of A as the generic fixed instructions that take you from one state X(ii) to another X(jj) by unique and deterministic step A(ij), and which you might specialize by setting A(ij) to 1 or 0, the above rules for matrix multiplication define the production rules, both locally and globally the character of the language.

Here in matrix multiplication I'm using + to mean "or", and the multiplication is to be understood noncommutatively, so that it is generating all manner of strings. The actual language generated will be given by a subset of strings taken by setting X(ii) = 0 and A(ij) = 0 for the nonexistent states and paths. We set the remaining states X(ii) = 1 for those that are actually used by the particular automata, and then the strings of A(ij) that result will be understood as those strings that may be accepted by the automata and are part of the language it generates.

Finite automata are exactly like matrix multiplication X -> AX in that they can only look ahead to see where the current state might take it next, and accept a letter in traveling to the next state. Note we only allow multiplication from one side. We can see clearly what kind of elements will be created, anything of the form AAAAAAAA which is to say, things like A(ij)A(jk)A(kl) etc. these are paths.

Pushdown automata make calculations based on whatever is the top tray on the stack, but they can take trays off, not just put them on. They are equivalent to context free languages which have production rules of the form Y -> cZd, which is to say, a rule like Y -> 0Y1 can generate languages like 01, 0011, 000111, 00001111, which finite automata can't because the latter don't have any memory. I noticed that in my context I could express such rules as X(ii) -> A(ij)X(jj)A(ji), or for example even X(ii) -> A(ii)X(ii)A(ii) as here I am using the location of the X as the cursor position to let me know that the lefthand A(ii) is a c, and the righthand A(ii) is a d, and distinguish them. So this should give me the full power of context free languages, and is compatible with what we have for finite automata. Again, this is deterministic, I can work backwards and see how the calculation came about. And we're still dealing with matrix multiplication, pretty much in the normal sense, except that now there is a cursor located somewhere within the middle of the path, not just at the right end.

Linear bounded automata are a special case of Turing machines. What they have in common is the ability for the cursor to carry letters to another location in the string, which is to say, shift them over to a workspace. I couldn't find it in the book, but when I was a graduate student I somewhere found out that the key production rule is of the form Xcd -> dXc which if you notice is a three-cycle and it's purpose and power is to move the c across a bunch of d's, so that Xcddddde yields dddddXce. So now the X has moved the c over next to the e, into a different context. That certainly complicates things with record to our matrix multiplication! But it's actually not a disaster, given the right interpretation, if we know the cursor location! This is because I think we can assume that the cursor only moves right! and never back! This certainly gives us something more powerful than just the pushdown automata, and I believe as powerful as linear bounded automata. You have production rules of the form X -> dXd that let you increase the size of your workspace as much as you like, and as complicated you like, on your right. And you can keep moving further to the right. The trick now is to notice that X is a diagonal matrix. This means that from the point of view of matrix multiplication, AX and XA define the same paths! if thought of as locations visited. I mean to say that a term A(12)X(22) and X(11)A(12) have the same information in terms of path, that information is not lost! Note that the value of the edge from A does not change! So no information is created! Only the value of the cursor changes! All we've done with our strings is now add an "interpretation" that X can move right as far as it likes and then takes on the value of required by its location. Also, certainly, the complication is greater, we are able to move the cursor and therefore generate new kinds of strings, and also we are here now finally able to generate the same string but in more than one way. Still, matrix multiplication continues to generate correctly the set of strings that are accepted.

Finally, the extra power of Turing machines is that they allow us to delete letters, to destroy information! Generally, in the theory, if we have a production rule from one string of letters and variables: w1...wN -> v1...vM then so long as the strings keep getting longer, we can only have the complexity of the linear bounded automata. In order to have greater complexity, we have to have shrinking strings! In other words, we have to have deletion, erasure. Well, the simple way to do that is XA -> X, simply allow a letter next to a cursor to disappear. Note, we can even do this only on the right, on our simple side. The reason is that it destroys our knowledge of where our cursor was. Earlier we were able to use the information that we were generating as AXA as a kind of memory to let us figure out where the cursor was supposed to be centered originally, and also distinguish what was created by the finite automata and by the push down automata. But now everything collapses. We have all these locations where the information we've been needing for global orientation has disappeared. We can no longer keep track of what's up, the evidence can be destroyed. But note that every thing that we're generating is being accepted! There's nothing wrong with the matrix multiplication in that regard. It's simply that we're not able to necessarily tell if a particular word is generated by the production rules, because it may have been shrunken from something very complicated. But certainly we know that the production rules, as interpreted above as matrix multiplication, are generating them!

Writing this up, there's a lot of details to prove that might not be hard for an expert of the field, but if this viewpoint is correct, it's a lot more elegant and unified than the book I have. And it opens up the use of some very powerful mathematical tools, coincidentally, from my thesis.

My thesis was on Symmetric Functions of the Eigenvalues of a Matrix. I noticed that the determinant of a matrix is the product e1 x e2...x eN of its eigenvalues, and the trace is the sum e1 + e2 +... + eN, and these are both symmetric functions. Well, symmetric functions are the foundation of combinatorics, the math of counting, listing, generating objects. The basement of mathematics, at least that's why I studied it. For a general matrix, you can't know the eigenvalues, but... you can always calculate the determinant, trace, and in fact, all of the symmetric functions of the eigenvalues. And I noticed they generate very basic combinatorial objects: walks, cycles, Lyndon words, but all built from edges from i to j, instead of blocks i. In fact, setting the off-diagonal edges to 0 lets you recover the eigenvalues, they become the diagonal elements, and so you get as a "trivial" special case, all of the combinatorics of the usual symmetric functions. Furthermore, traditional theorems for matrix combinatorics all seemed to be interpretations of trivial statements arising from this perspective.

Well, here we're only dealing with two very general matrices, A and X! And everything we're ultimately generating is of the form AAAAXAAAAAA. Well, that is a matrix whose edges are interpreted as paths. And along that path is a cursor! given by the diagonal matrix X. So we get the usual kinds of objects but replacing them with paths with cursors. Note also that the eigenvalues of AB are the same as BA, and we know the eigenvalues of X!

Another thing is that this approach can be taken up with other classes of automata, for example, the P = NP problem. Note that here the length of an input word is very clear, it is the number of letters. Whereas the the complexity is, I imagine, given by the number of production rules. Well we see immediately that the finite automata and pushdown automata can't cause any great problems. Nor can, for that matter, the linear bounded automata, because we can move to the right only to the extent that we've been generating letters on the right with the help of the pushdown automata. So the real source of complexity comes from deletion! In fact, in a sense it is given by the number of deletions, because it's equal to the extra number of additions, and by the number of extra lateral movements between deletions.

Well, the matrices X, AX, AAX, AAAX, etc. all have different sets of eigenvalues. But lateral movement doesn't affect that. So this approach offers a way to keep those two issues separate, which they need to be, if we're to do good counting.

I feel good about this. And if it is the least bit sound, then it has a lot of potential for the metaphysics that I work on, because these production rules have a very similar kind of form to basic operations, structures.

Am I on track? or way off?

Actually, I only used two. (Sorry!)

The reason I wrote three was that I was thinking of inserting an identity matrix to mark the deletions. Then it can be ignored later. But for my purposes here I realized it was simpler just to do without it.

I noticed a mistake... in the Turing machine part, the deletion rule XA -> X may not accord with what's happening on the element level. Or maybe. On the element level we have X(ii)A(ij) -> X(ij) which is nonzero only if i = j. But that may not be a problem. It's only a problem if X(ii)A(ii) -> X(ii) is not strong enough to create full strength Turing machine behavior. An alternative would be to define an operation that would collapse, delete not one element, but the neighboring cycle. That would be very powerful, but of course, messier. It would fit well, though, with the eigenvalue approach, which is all objects built from cycles.

So the overall idea of relying on matrix multiplication still stands, I think.

Also, I forgot to mention that typically the N and NP problems are all about relationships between nodes, and so might have very natural expression in terms of the edges of matrices.

The set-up makes it straightforward to distinguish between Nondeterministic machines (possibly multiple ways out of each state) and Deterministic machines (at most one output of each state). I've set up for the nondeterministic case, and I can get the deterministic case by restricting myself to the machines that are of the shape of permutation matrices. In the eigenvalue theory this is quite a profound restriction because it reduces the homogeneous symmetric functions (which generate products of all strings) to products of disjoint cycles (which is what is generated by the elementary symmetric functions, but the latter are signed). In other words, an arbitrary string of length n gets mapped, collapsed to an n-cycle.

The correct way to define the Turing machine effect, I now think, is now deletion of the cycle to the right. This is a very simple and elegant operation in this combinatorial context, and this kind of operation comes up in my thesis a lot. The cycle may be of arbitrary length. I imagine that each cycle deletion may, from the point of view of complexity, be like increasing the complexity by a power, the next polynomial up.

Shannon, so I've found a unified way to think of four different kinds of production rules. They have different roles in the dynamics of language: - The finite automata are the "natural laws" that keep generating stuff left of the cursor. - The pushdown automata, context free grammar, offer the basic parsing capability for dealing with pairs of markers, like left and right parentheses, working in a tree like fashion. They let us break down a meaning, in the primitive sense. In doing so, they also open up a workspace on the right, balanced with what they are generating on the left. - The linear bounded automata, context sensitive language, serve to move the cursor into the workspace on the right. So this means that the parsing ability can be moved to a new context. In other words, a meaning can depend on a context. - The Turing machine get its power from its ability to delete a cycle on the right. Any such cycle was created by the pushdown automata, which in this set up is the only thing that ever creates something on the right. So on the right we have combinations of creating cycles, moving cursors and deleting cycles. The net result is that we get situations where we may break down one cycle and build up another. We may, with each production rule, be "overspeaking" or "underspeaking", which is to say, building on the last deletion, or building towards the next deletion. In this way, we have the natural kind of ambiguity in language between underspoken and overspoken.

Nobody argues about the power of Turing machines in terms of sheer computational reach. I think we're mostly lacking in a cognitively elegant way to present Turing machines. That's the kind of thing that intrigues me, how to relate the intuitive understanding behind each kind of computation with different kinds of metaphysical first principles.

Joseph, for example, I imagine the above distinctions are relevant to "responsibility" as we apply it in life. There is a kind of "finite automata" responsibility that is one-step at a time, where we are simply forced to choose, or even not choose, at each step, what to do in the current situation. There is a "context free grammar" type of responsibility that pushes out for us a workspace, so that we can "take on responsibility", connect an act of intent that will be our mark in the past, with an intended action that we mark out in the future and will become real at some point. Note that it is a two-step responsibility. There is a "linear bounded automata" type of responsibility by which we ourselves move into that future we've projected, where the three-cycle "take a stand, follow through, reflect" works to move our state (our stand) adapted but pretty much intact across many experiences, until we may end up in an entirely different context where completely other things may be relevant. Finally, there is a "Turing machine" type of responsibility that says that we can have many long chains of thought that we simply delete, they will be truly forgotten, even though they may have counterparts in our past. They are subroutines that have done their job and we don't want to keep track of them.

So this all shows how we can think of time as what is on the right of our cursor. Finite bounded automata (production rules) have time flow at us one-step at a time, and recede to the left, into the past. Context free grammars let us construct our own time as a workspace where we can posit intended actions that match our acts of intent in the past. Linear bounded automata let us move with our principles into that workspace of the future. Turing machines let us delete portions of that future as simply thoughts to be forgotten and erased, although leaving counterpart experiences in the past, the acts of intention with which those thoughts were generated. In each case, we have thoughts in the future to the right of the cursor, and actions that have happened in the past, to the left of the cursor.

All of this with just two matrices, X(ii) and A(ij) and four production rules: X -> AX X -> AXA XA -> AX XA -> delete cycle to the right

These are all just the right simplicity to be expressed metaphysically from first principles.

I think of God as the unity of representations of everything. And one property of everything is that it is the simplest algorithm, the one that accepts all things. So I wonder what that means here.

My purpose is, of course, to understand life, and so my thoughts on automata are simply one more means towards that end. Certainly, "algorithmic" thinking is part of human life, not necessarily all of life. But the qualitative distinctions of the kind I made about responsibility: - dealing with things "now" as they come - depositing "notes" that you will deal with something later - taking up and holding "principles" that you carry across changes in context - being able to put aside chains of thinking that have come full circle. These are ways of thinking about human experience that predate automata, and instead, express some of the ways that we as humans have dealt with life systematically, and in the historical sense, are plausibly what have manifested themselves in the kinds of cultural systems that we find ourselves living in.

The model I'm working out brings all of the above together in a unified framework, and it already has some nice consequences. One of them is that I have a Turing machine equivalent that makes no use of an infinite tape! Instead, the "tape" is whatever workspace is rolled out by the context-free-grammar that is depositing the notes. So this resource is something that is explicitly accounted for.

Also, I have a Turing machine equivalent where the "cursor" only moves to the right, never to the left! And the only elements that can arise on the right are generated by the context-free-grammar. This means that the "future" is extremely small. It is constructed only by "depositing notes". And the complexity of the future is increased by reducing those notes, either by "moving into the future" and dealing with them proactively, or by striking out whole circles of them. The "smallness" of my "personally constructed" future seems a very real factor of human life.

The "interaction" is there. It's not between people. It's between the individual and their environment, much of which is personally constructed. What the model says is that the "real world" only effects us in ways that a "finite automata" experiences, ways that we can "deal with right now". All the rest is personally constructed, personally accepted. Any future that we acknowledge is only because of self-obligations that we have posited. And the way that we may simplify our future is A) to deal with those obligations by moving into them with a principle, or B) to dismiss sets of obligations that we find go nowhere, go back to where we started, cancel themselves out.

That may not be all of life, nor all of experience, but it seems a pretty straightforward account of a lot of down-to-earth reality. I think, compatible, with Buddhism, from what I read.

Another fantastic feature of this framework, if it hangs together, is that for the automata it defines, it deals with ALL automata at the same time. In other words, it is saying what the FULL automata would generate, and then every particular automata could be gotten by setting A(ij) = 0 for any nonexistent production rules. Furthermore, it maintains, on the left, a copy of every rule ever applied! (Just as in real life, regarding the past!) All of the power of the linear bounded automata is gotten just by shifting the cursor, and all of the power of the Turing machine is gotten by deleting cycles, which is all to say, by erasing certain elements, thanks to the ambiguity that results.

Furthermore, that a word has been "accepted" is given as soon as it is generated. This fits nicely with the picture of partial recursive functions, where we have to keep waiting and typically never know that they will "halt" for a particular word, which is to say, we never know if a small word will not be generated later, simply because it may result from deletions from a very large word.

If we think of the amount of resources used, then this framework does a good job of keeping on top of that, they are simply the number of production rules used. I should clarify something, that because each word is keeping a copy of all the stuff generated on the left, then a word can hardly to be said to be "short", they keep getting longer, unless we restrict our thinking to the nub on the right. But we can have another kind of "acceptance" that says a word is accepted if it ever appears to the right of the cursor. And this is perhaps the framework in which to study the use of resources: Given n, and automata A, what is the function f(n) that gives the number of production rules needed to have generated all of the words of length n that will be generated?

Also, we can specialize the matrix in various ways, for example, plug in "1" for various matrix edges at the end, which completely changes the set of words accepted, can collapse them into sets that are more recognizable.

Metaphysically, the matrix is about as simple, pure, powerful, basic object that there can be, if we think of A(ij) as a shift from i to j. Note that we don't even have to have a notion of integer.

Regarding modeling humans, I want to add that I think theoretical automata are much more directly "human", and reflective of human thinking and experience, then any practical contraptions, or networks of contraptions, by which we make computations. The theoretical automata are pure constructs of the human mind, reflections of it. The practical contraptions are much more distant in that regard. The theory clearly shows that the power of computation comes from deleting information, not from generating more. This means that the intelligence of contraptions can arise only from its "failures", or its ability to write something off, allow something else to deal with it.

Joseph, Thank you for pointing me to Husserl and Heidegger. I have Heidegger's "Being and Time" which every so often I pick up and put back down again, so it's good to left bracket that. Saturday at the bookstore I bought Karol Wojtyla's "Acting Person (Analecta Husserliana)" translated in Lithuanian. (He's since Pope John Paul II, he was no slouch as an existentialist philosopher. There's a question, though, of how much of this book was rounded out by an editor.)

I try to write my thoughts more simply. And I also include some speculations on the connection between "removing cycles" and "polynomial use of resources" especially given that the tough problems seem to be the ones without "sub-cycles". Andrius, http://www.ms.lt

It is cool, something nice to have in your mental tool set. Joseph wrote a bit skeptically, and he is in the heart of computer science, so I can't trump him in that space. But I can draw on wider things, and a lot of imagination.

It's exciting when any discipline, including computer science, introduces concepts and observations that touch upon our intuitions about human life. In that sense, we are modeling our experience.

In developing computer science, it has been observed that there are, intuitively, several different ways to think about computation, and they can be considered in classes of different capability.

A simple computer is one such that, given state A, you find fixed instructions that tell you what states B you can go to. For example, a state may be a valid position on a tic-tac-toe board. You can describe a "finite state automata" that would tell you, starting from the empty board, how to make valid moves to valid boards, and ultimately get to all the final boards. You can also describe a "finite state automata" that would play well for one side, or for another side, or in whatever particular way, so long as it is fixed. If there is one or no choice at the state, then it is called deterministic, and if you allow more than one choice, it is called nondeterministic.

It was noted that we can study the "languages" that consist of all possible paths through the machine, we can talk about the paths "accepted" by the machine. These are called regular languages. All tic-tac-toe games is a regular language, but so is, I think, all chess games. And even, all chess games where one or both players play without mistakes. Simply because there are only so many chess positions. You can return to the same chess position, but these would simply be loops, cycles.

It turns out, that we can imagine computers that are smarter than that! For example, consider the set of parentheses expressions, consisting of left and right parentheses that are balanced, such as: (()()) It can be shown that this is not a regular language! Basically, in order to "accept" such a string, you need memory, you need to remember to close each parentheses that you open. And there is a different kind of computer that can do this, it is more powerful. One formulation for it is a "push-down automata". I don't understand it too well, but you have a "stack" of memory, like a stack of cafeteria trays, that you build by laying down on top of each other, and then they are in "memory" so that you can go back to them, as you take them off. So, for example, (()()) might be generated by saying that ( means put down a tray, and ) means pick it back up. So (()()) would be: put down, put down, pick up, put down, pick up, pick up. And in this spirit, each tray can be a different color and come with instructions that tell you what you are allowed to do next, for example, pick up or put down trays if they happen to be a certain color. The languages are given by the sequence of trays you go through, and called context-free languages and Chomsky noted that there is something similar going on in human language, the way we build sentences and have to keep track of what we're up to syntactically. This is also important for building parsers, for parsing commands. But it's also important in everyday life: I can "make a commitment", which opens a left parentheses that waits for a right parentheses.

You have to go down that stack in order. (Just a side note: Jesus of Nazareth had a cryptic saying, "The first will be last, and the last will be first", and that's pretty much what's going on here: the first tray put down will be the last tray taken off, and the last one put down will be the first one taken off, unless you put even more down.)

Well, there is a more powerful computer, a "linear bounded automata", which we may say lets you move your position on that stack. We may think of this as a magic tray that you can move up and down the stack. What this does, is it allows you to reference how big is your stack. You can have that magic tray go into a "gliding" state where it is effectively bringing with it another tray of a particular color, perhaps leaving it at the bottom. In this way, it can effectively "count" the number of trays in the stack, and effectively void certain numbers of trays by entering an "unhappy" state. This means that you can generate languages such as "a to the 2 to the i": {aa, aaaa, aaaaaaaa, aaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,...} which is growing exponentially. A normal pushdown automata couldn't do that because it observes the "pumping lemma". Note that all of these computers have only finitely many states and instructions. But they may typically accept strings of arbitrary length. The pumping lemma observes that, when a simple computer, like a finite state automata, or a pushdown automata, is found to accept a sufficiently long string, then we can show there must be some pattern in that string that can be repeated, like a do loop, so that all such are likewise accepted. But in the case of the linear bounded automata, it can manage to break any linear pattern, any rhythm, by going into its own gliding modes, which lets it do its own internal counting. The languages it accepts are called "context sensitive languages".

Finally, the most powerful computer known, which has all manner of equivalent definitions, is called a "Turing machine". It can model absolutely every computation that we can describe. And there are many that it models that we can't even describe for practical purposes. It can go on calculating and we can never be sure if it will actually halt or not, which is to say, we may never be sure if it actually accepts a word, or not. The heart of the power of the Turing Machine is that it can destroy evidence. Imagine being allowed to delete part of the history of the trays that you've laid down. That pretty much makes it effectively impossible to keep track of all of the ways things may have been developing. It means that a very simple situation may come from a very complicated situation. Whereas with the "linear bounded automata" you always know that things are growing in complexity, time flows in one way, so to speak, and you know when you can rule things out, when they're surely not going to happen, especially for short strings. With Turing machines it's possible that you may never know for sure.

Mathematicians approach this by talking about a hierarchy of classes of languages. But, as somebody modeling life, I know that's a locally evident quality, not only globally evident. To the extent that this is helpful to model life, we should be able to say that "here the machine is acting with less sophistication, and here with more sophistication", or "this is a less sophisticated instruction, this is a more sophisticated instruction". Mathematicians don't do this, they see that as a value judgement. So that is a major goal of mine here. (For example, the "perfect chess player" may be coded with a finite state automata, but a lot more intelligence may be revealed as a pushdown automata, or linear bounded automata, or Turing machine. And I think that distinction in intelligence can be observed instruction by instruction.)

In order to do that, we need a shared framework for the different kinds of automata. Now, I haven't been exact or classical in terms of defining the different kinds of machines here. But the upshot is that, I don't have to be, because for each class there are many ways to define the associated machines, and we can choose the definitions that work best for us.

I look for the "most natural" framework. Metaphysically, in my experience, the most basic and general objects are matrices. A matrix A(ij) defines a relationship from each state i to each state j. This is metaphysically very basic, like our mind moving from one thought to another. We don't even need to define numbers to work with matrices, they can simply be understood to relate states.

Let us define the states with a matrix X(ij) where the states are given by X(ii) and when i does not equal j, we have zeroes. Then we can use the following operations to model the different kinds of instructions:

Matrix multiplication X -> AX gives what finite automata do: generating walks along the edges of A. Two-sided multiplication X -> AXA yields what the pushdown automata can do. The replacement XA -> AX turns out to leave the terms from A unaffected, and simply "shift" the cursor recorded by X. And then, we can have an operation that lets us remove a cycle of terms from A.

My instinct is that these building blocks can be used to define equivalents for all of the machines that we have discussed above.

This cycle removal operation is very interesting and I suspect related to an outstanding math problem which asks if there are problems that can be solved using a polynomial amount of resources (such as time) by a nondeterministic machine (many calculation branchings allowed), but not any deterministic one (no branchings allowed in the calculation). Basically, allowing a calculation to branch is like guessing, and is in a sense a multiplication of the calculational power, a sort of exponentiation. But if that is only internal, and the external bounds apply, its not clear if that is any real help or not.

A sample problem is to take a graph (perhaps coded by the edges of a matrix) and ask, does it have any cycle of edges that go through all the nodes? If you allow for guessing, branching, then you can design an efficient algorithm that just quickly branches through all sets of edges, and then in each case it does not take a lot of resources to check if it works. (Maybe n-squared?) Whereas if you don't allow for the branching, then nobody has found any efficient algorithm, it seems that its exponential.

A down-to-earth way to think about this is, if you add another node, say to the graph, so that the size goes from N to N+1, then how does that affect your computation? How does it compare with all that you've already done? If it's the same order of magnitude, or bigger, then it's exponential growth! That seems to be the case for analyzing those cycles. If it's a smaller order of magnitude, that is more like polynomial growth. For example, if you're adding the numbers from 1 to N, then that's roughly N-squared, which is significantly bigger than N+1, especially when N gets large.

The funky thing about the universal framework that I've set up is that it seems to generate all languages. You can specialize it by setting particular values for A and X. But, in a sense, by the basic operations, it is representing (and tracking!) every single possible machine. Furthermore, with the exception of the "Turing machine operations", it only grows in complexity. Strings are "accepted" simply by being generated. So we have a very clear way of seeing the linear boundedness of the resources. It is simply the number of operations compared with the length of the string.

Now my hunch is that each time that we remove a cycle, we are adding to the resources as if by a factor of the length of the cycle. And, in defining the operation, we don't do it for particular cycles, the cycle may be of any length, so we must assume that the cycle is as long as it might be. All of the operations are quite general! which is odd but powerful. In this set up, we can select the particular machine at the very end, but for now we are dealing with all of them at once.

If we look at the kind of problems that are considered "tough", they may be all thought to be looking for structures that lack "sub-cycles". They are looking for machines that "cannot be pumped", that have no regular pattern generator. So, for example, if in a graph with N nodes, we are looking for a cycle through all the nodes, my point is that the cycle, by definition, doesn't have any sub-cycles.

Well, in our set up, these kinds of problems are very easy to set-up. We simply code the matrix A accordingly, putting 0 for nonexisting edges. And we can do this at the end! The question is: Can we use our basic operations to generate a language that would answer our question? The answer is Yes. We can generate a language where the objects of length N are precisely those, if any, which do not have any cycles. But the only way we can restrict ourselves is if we use our cycle eliminator! And we have to do this N-1 times if we want to eliminate any subcycles. And for this particular algorithm it seems like we are using exponential resources, something like N factorial, it seems.

So this says that we have found one bad algorithm. But does that say anything about the good ones?

It might! That is because our set-up, although powerful enough to capture the different kinds of automata, is extremely restrictive. So we might be able to show that, in this set-up, there really isn't any other way to get the language we want. The only tool that we have is the cycle eliminator.

We may be able to show that, in our framework, for the resources that we have defined, there is basically no simpler approach to the problem. This might involve showing that, any restatement of the problem is actually a transformation of this statement. As a transformation, it may have essentially the same eigenvalues. And having the same eigenvalues, there may be no way of getting it to simplify.

We may also be able to show that there is a transformation from this framework into a more classical one with the traditional definitions for time resources.

It's interesting what this might mean metaphysically, if anything. But there seems to be a connection between the exponential power of our ability to make guesses (allow for several options to travel down) and the Turing machine's exponential power to destroy evidence (forcing us to travel down several options to track what is going on).

Of course, very speculative, but I think, imaginative in a useful way, at least for me. I invite such speculation on subjects that matter to us.