Musical Activity as the Basis for the Evolution of Joint Intentionality and Nonlinear Grammar
I will talk about music and logic from the point of view of evolution, the history of life in our universe. I will argue that musical activity, in particular, played an important role in the social, physical and mental development of the characteristics which make us human, including joint intentionality and nonlinear grammar. After a sketchy, whirlwind tour of the evolution of our abstract mind, I will use four levels of music - unison, rhythm, melody, harmony - to illustrate my main point, which is that intentionality - our reason why - must be built above three levels of knowledge - how, what, and whether - and consequently, we must rely on three distinct parsers in order to keep those levels distinct.
Both music and logic are examples of the peculiar evolution of the central nervous system towards ever increasing abstraction. A single-celled paramecium engages the world directly by way of its receptors for light and chemicals. But a butterfly has a nervous system, by which it pays attention to neural images of flowers. It thereby replaces reality with a set of neural answers that model a world of flowers. As neuroscientist Michael Graziano has noted, a mouse furthermore has awareness, in that it utilizes a model of attention. It can apply that model to its own attention, and thus be itself aware, but it can likewise model a cat's attention, and thus be aware of whether or not the cat is attending to the mouse. Thus the mouse lives in an abstract world of indexical and causal relationships, a world of suspicions and potential answers, which is to say, a world of neural questions.
But humans and certain other animals can moreover be conscious. Birute Galdikas has noted how orangutan males go off to live alone, as if they were Zen Buddhists, and how they can choose to ignore people or not. Similarly, we can choose what we wish to be aware of. We can step in to experience a subjective answer, we can step out to formulate an objective question, and thirdly, being conscious, we can choose between the two.
Our conscious mind is thus able to balance answers - what we know, with questions - what we don't know. This balance is crucial for both logic and music. It is the central theme for my talk.
But first I want to briefly describe the highly abstract landscape of our conscious minds. We conscious beings are not simply aware of our own attention. We are also aware of our minds, what neuroscientists call our global workspace. We experience cognitive frameworks which divide up that global workspace into various perspectives.
For example, matters of existence require two points of view: We need to be able to raise a question, "Does a chair exist or not?", but also to suppose an answer: If it exists, then it exists; if not, then not. Similarly, questions of participation require three points of view: a cycle of taking a stand, following through, and reflecting. Such a cycle is the basis for the scientific method: having a hypothesis, conducting an experiment, and interpreting the results.
Issues of knowledge require four points of view: whether, what, how and why. We experience a cup as a sensory image, What, but also as a mental blueprint, How. We may furthermore imagine Whether the cup is in a cupboard even when nobody sees it. And when we imagine Why there is a cup, then we suppose that we must know its relationships with absolutely everything, as we do with the Yoneda lemma in category theory.
Decision making requires five points of view, morality requires six points of view, and then we come to logic, which requires seven points of view, the familiar square of opposition. I think of it as a dialogue between two sides of our mind, the all-knowing, intuitive, holistic, answering side, and the all-doubting, rational, step-by-step, questioning side. We may identify these with Kahneman and Tversky's fast-thinking System 1 and slow-thinking System 2, with the intuitive right brain and the rational left brain, with yin and yang, or even with emotional female and conceptual male gender stereotypes. Logic provides a dialogue which perfectly balances the knowing side and the not-knowing side. This perfect duality of all and nothing is manifested by De Morgan's laws. The call and response of question and answer is key for music.
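For reference, De Morgan's laws can be written as follows. The notation is the standard textbook one and is not taken from the talk itself: P and Q stand for arbitrary propositions, and P(x) for an arbitrary predicate. The second line is the quantifier form, where "all" and "some" trade places under negation, which is the duality of all and nothing referred to above.

```latex
\neg (P \land Q) \equiv \neg P \lor \neg Q, \qquad \neg (P \lor Q) \equiv \neg P \land \neg Q

\neg \forall x\, P(x) \equiv \exists x\, \neg P(x), \qquad \neg \exists x\, P(x) \equiv \forall x\, \neg P(x)
```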
Logic yields, among its seven perspectives, one divided perspective, which says that something is known, and something is not known. But we can truly appreciate logic's role if we add an eighth perspective, which would be that everything is known and everything is not known, in which case the system is empty, and all of the tenuous structure collapses! We then have a cognitive framework with zero perspectives, a blank slate, an empty system. Logically, we can say this is the state of contradiction, which we can imagine as our starting point, or even the starting point of the evolution of the universe.
In order to understand this Godly state of contradiction, and what drives evolution, consider what could possibly motivate that which is prior to logic, truth, being, meaning, life or love? The only thing I can imagine is that it would ask of itself, would it be, even if it was not? As in a proof by contradiction, it would remove itself, giving rise to a state of noncontradiction. Like a patient investigator, it waits to see how it may exuberantly arise in our physical, Godless, noncontradictory world.
If our physical world is an investigation, then it is revealing not so much an embodied mind, but rather a disembodying mind. Evidently, there develops an evolutionary pressure upon us all to apply ever more resources to model what we don't know rather than what we know. As conscious beings, we end up in a tiny, abstract cycle of eight cognitive frameworks, where we every so often choose amongst perspectives like "free will" or "fate" and thereby set in motion vast unconscious scripts that have us do what we do.
Logic makes conscious beings aware of their questions. But that is very lonely. Music helps create a culture where we ask questions together. Music is arguably the first game, and as I will argue, games are what make us human.
I did a study of Gamestorming, a collection of 80 innovation games played in Silicon Valley. Each game had a purpose, and I documented a system of 24 different purposes, a framework for innovation. I further realized that every game can be thought of as potentially made up of 24 games, and that all of culture is games upon games.
Games create a shared world, which we enter by asking a question, and we leave by accepting an answer. At the heart of the game is a cycle of exploration, where we propose an idea, try it out, and evaluate the result. Through such play we are, in effect, pursuing the scientific method. We are remaking the world in terms of our cultural conceptions, and we are outdoing biological evolution with a much more rapid cultural evolution.
However, it took hundreds of thousands of years for us to physically and mentally transform into such gameplayers. Music is how and why our vocal cords developed so that we could sing in unison, and ultimately, speak in syntactic languages.
In order to enter a shared world, we need to tune into each other, first in the physical world, but then in the mental world. We can then have a game, an intentionally shared activity, what Tomasello calls joint intentionality. In comparing humans with chimpanzees, he notes that the latter can hunt a monkey as a team, with one chimpanzee chasing it from the trees above, and another running after below. But each chimpanzee is an individual thinking, "That is my monkey". Whereas humans, from infancy, are able to come together as an ad hoc "we". We work together and share the reward, but not with bystanders or cheaters.
This is reflected in our bodily harmony. Look at us around the room, how we acknowledge each other, and our whole group, by unconsciously adjusting our feet and arms and posture. I have drawn some lines to show how these children are harmonized all with each other. It's like they are dancing.
These children in Ethiopia are singing and dancing. If you look, their attention is in very different places, but their bodies are in perfect harmony. There is a beauty in the way they are standing.
Whereas these chimpanzees are physically oblivious to each other. Evidently, humans evolved a "sixth sense" by which we unconsciously manifest solidarity through our body language. Likewise, humans are able to sing in unison, whereas the other apes cannot. Rhythmic unison - singing, drumming and dancing together - may have attracted mates, and engendered a virtuous cycle of rapid evolutionary change. The foundation of music is this perfection of unison. We articulate and accentuate so as to tune in to each other. Human babies cry for attention with the intonational accents which they heard and learned while in the womb. Infant apes don't cry.
Autistic children lack this "sixth sense" of bodily harmony. It is as if they are blind to it, or in the case of Asperger's syndrome, their sense is diminished. They have trouble joining our shared worlds, and when they do succeed, it is because they consciously compensate for the lack of this unconscious faculty.
I have come to the heart of my talk, which I will illustrate with four levels of music - unison, rhythm, melody and harmony. I will show that intentionality requires the use of three distinct parsers. Joint intentionality requires that we all share those three parsers and apply them collaboratively. Jointly intentional sound is music.
Physical unison serves as a starting point for mental unison. Note that men and women sing an octave apart, and so our unison is neurologically abstract. We can sing a note together - we can tune to each other - without any rhythm. In adding a shared, recurring rhythm, we are defining a shared musical activity, what it is, what is taken as given. We are delineating an abstract space. We then adorn that rhythmic space with a melody which sketches out for us a musical system. Playing within the bounds of this system is how we create and conduct musical activity. But for sound to become music, we need a dialogue between two different systems with two different intentions, and that is harmony. That is why we have musical activity.
I will illustrate this dialogue with a Lithuanian children's song, Garnys, garnys:
Garnys, garnys turi ilgas kojas. Garnys, garnys turi ilgas kojas. Bėgčiau, bėgčiau, kad galėčiau, tokias kojas, kad turėčiau, kaip garnys. Bėgčiau, bėgčiau, kad galėčiau, ilgas kojas, kad turėčiau, kaip garnys.
Now I will sing it in English, showing how it is built up of games, where we enter with a call, and exit with a response.
Here we have a tiny game - two notes, E and G. And we repeat this game for a larger game.
"The crane, the crane".
This segment is a call for a response.
"The crane, the crane, has very long legs."
Indeed, in language, every sentence is a game. The subject is the question, What is the crane? The predicate is the answer, The crane has very long legs. Again, this sentence is taken as the call for a response.
"The crane, the crane, has very long legs. The crane, the crane, has very long legs."
But this is not yet music. It's all built up within one system, the notes E, G and A, which established the tonic A. Then comes a response from an entirely different system, another point of view.
Run, I'd run, I would, I would, if I could have such very long legs like the crane.
Run, I'd run, I would, I would, if I could have such very long legs like the crane.
This is now music, a dialogue of intentions. Here the notes D, E and G establish the tonic G. And this is the point of syntax. It distinguishes a point of view's form from its content. Thus we can discuss different points of view. We do this with a shared language, and so we can share a discussion of different intentions, and together formulate our shared intention. The more systematic the language, the more we can jostle our expectations and play with ambiguity as to which system we are in. This is the joy of music.
Let us now consider natural language. Ray Jackendoff has noted that syntax must have arisen after a protolanguage with a linear grammar which was quite robust. Such a linear grammar is used by sailors speaking pidgin; second language users who never develop fluency; people with certain brain injuries; deaf children who develop their own gestures; but also the great apes. It consists of strings of words for which there are no rules. Word order is simply determined pragmatically in context.
Syntax systematizes the language, which allows us to construct different points of view. "Jack loves Jill" can become "Jill is loved by Jack", "Jack does not love Jill", "Doesn't Jack love Jill?", "Jack does not love Jill?" and so on. None of this is possible with linear grammar.
Nonlinear grammar is rule based. Music likewise legislates rules. Joint musical activity demands perfection from all and at all times. Sounds or words must be enunciated exactly. Rituals develop. Words and concepts become categorized, as Lévi-Strauss observed. People develop a sense of right or wrong, in-tune or out-of-tune, grammatical or ungrammatical. Rules must be followed in creating new words. Syntax arises as rules which may not be broken, and is distinct from pragmatic constraints.
Let us consider how language uses the three parsers to build up intention. We start with the atomic units of meaning, which would be lexemes in Lithuanian or perhaps words in English. A first parser constructs the syntactic atoms, which would be words in Lithuanian and phrases in English. A different, second parser constructs sentences for thoughts. And finally, a third parser constructs a dialogue, a context in which we can separate out intentions, why those words and sentences were spoken.
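To make the three parsers concrete, here is a minimal sketch in Python. It is only an illustration under strong assumptions and not a model of any real language: the function names, the convention that a lexeme starting with "-" is a suffix, the use of "." and "?" as sentence boundaries, and the pairing of consecutive sentences into call and response are all hypothetical choices made for this example.

```python
from typing import List, Tuple

SENTENCE_ENDS = {".", "?"}  # assumed sentence boundaries for this toy example


def build_words(lexemes: List[str]) -> List[str]:
    """Parser 1 (toy): attach suffix lexemes such as '-s' to the preceding stem."""
    words: List[str] = []
    for lex in lexemes:
        if lex.startswith("-") and words:
            words[-1] += lex[1:]  # 'leg' + '-s' becomes 'legs'
        else:
            words.append(lex)
    return words


def build_sentences(words: List[str]) -> List[List[str]]:
    """Parser 2 (toy): group words into sentences, closing one at each boundary mark."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for word in words:
        current.append(word)
        if word in SENTENCE_ENDS:
            sentences.append(current)
            current = []
    if current:  # keep a trailing sentence that lacks a boundary mark
        sentences.append(current)
    return sentences


def render(sentence: List[str]) -> str:
    """Join a sentence's words with spaces, gluing a final boundary mark to the last word."""
    if sentence and sentence[-1] in SENTENCE_ENDS:
        return " ".join(sentence[:-1]) + sentence[-1]
    return " ".join(sentence)


def build_dialogue(sentences: List[List[str]]) -> List[Tuple[str, str]]:
    """Parser 3 (toy): pair consecutive sentences as (call, response)."""
    texts = [render(s) for s in sentences]
    return [(texts[i], texts[i + 1]) for i in range(0, len(texts) - 1, 2)]


lexemes = ["the", "crane", "has", "long", "leg", "-s", ".", "I", "would", "run", "."]
words = build_words(lexemes)
sentences = build_sentences(words)
dialogue = build_dialogue(sentences)
print(dialogue)
```

Each function consumes only the output of the level below it, so the three levels stay distinct: the first knows nothing of sentences, and the second knows nothing of who is calling and who is responding.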
These levels are relevant for copyright law because intention indicates authorship. Words are shared, and two people may by chance construct the same sentence, but a dialogue of any length is not a matter of chance. Similarly, rhythms are shared, and two people may come up with the same melody, but a harmonic arrangement is attributable to an author.
What is the meaning of language? Leonard Bernstein, in his magnificent video lectures in 1973, showed that music is a metaphor for dialogue. In music we hear call and response. We listen in on an emotional dialogue, which captivates us, even though we don't understand the words. Music teaches us conversation, just as a child in the womb learns from her mother's voice.
This suggests that music came before language. Music maps itself onto itself, one syntactic system onto another. When this musical language grew robust and became established, then it could be used to map syntax onto the semantics of the world. In the meanwhile, as our vocal cords evolved, music allowed us to tune our bodies and our minds, so that we could sing before a hunt and after a hunt, before and after work. We could bond with each other in small groups. We could have rituals.
I will conclude by returning to my diagram of the 24 games within a game. I can now explain that the four levels of music allow us to secure that tenuous learning cycle at the heart of a game. We secure it in the real world with unison, and we secure it in the game world with the harmony of our intentionality. Rhythm brings that unison over into our game world, and melody projects our harmony further out, into the real world. Thus these four levels securely relate real world semantics and game world syntax.
It's thus understandable that four kinds of games usher us into the game world, and four more kinds of games lead us back out. Inside the learning cycle, there is a game for creating the deliverable, the very meaning of the game. And there are four more games to relate that meaning to the four levels. There are another six games to keep those four levels distinct. These can be thought of as syntactic games, where we answer a question with another question, and thus we can have highly complicated games. The distinction also makes for a gap and so we can always have a game within a game.
Noam Chomsky searches for the universal grammar of language. But I am convinced that truly there is this universal grammar of games. Language is an important game, but from the point of view of syntactic purity and evolutionary primacy, music is arguably an even more fundamental game for us to explore.
Syntactic rules let us perform a task that we do not completely understand, as when playing our part in a greater musical whole. This fosters our ability to hear what others are saying as well as what we ourselves are thinking. Whereas perhaps other apes can only think one perspective at a time. Thus they can answer questions but they never ask them. Musical activity teaches us to be "I" and "you" and "they" and "we", in parallel.