What drives the complexity of a language?


Looking at English, its complexity seems to have been in constant decrease. For example, in the past, there were conjugations and a separate informal form of “you” (“thou”); all in all, the language was much closer to German. On the other hand, German still has those complexities; however, as far as I can tell, it has not gained any new ones (except for the current gender issues).

Looking at the Romance languages, there also seems to have been a loss of complexity. While Latin has three genders, as far as I can tell all modern Romance languages have only two. And I don't think any of them has the complete six-case declension of Latin.

So all in all, it looks as if the complexity of a language generally tends to go down over time. But that cannot be the whole story: complex languages like Latin or Greek didn't just pop up in their finished form, but developed from earlier languages, and at some time in the past, those languages would need to have been much simpler; it is inconceivable that the first languages humanity developed were extremely complex.

Therefore there must be times when language gets more complex (up to the high complexities of Latin and Greek), and other times when it gets simpler. And those changes surely don't just happen at random.

One guess would be that complexity is added when society changes a lot. But modern technology has certainly changed society quite a bit, and it doesn't seem to be driving the language toward more complexity. More words are invented, but not more grammar, as far as I can tell.

So what does actually drive this complexity? That is, what determines whether a language gets more complex or simpler over time?

Speculating: I would imagine that ancient languages that were mostly spoken and rarely written got complex for that reason. Dialects and local differences would also cause complexity. But once you establish a national standard for written communication, this ought to set things straight over time. Even more so with modern technology like radio, TV, and the Internet, which makes it possible to broadcast spoken language too, not just written. Lundin 22 days ago

Though obviously, cultural influences are one of the main reasons for language changes. Britain has a historical tradition of getting invaded, by the Romans, by Saxons, by Vikings, by Normans... each group leaving their mark on the language. Lundin‭ 22 days ago

This is a frame challenge answer.

There is no objective measure of "language complexity" known to me, not even attempts to define one.

Bigger tasks require more complexity, but just very little

Languages used for a drastically wider range of communication functions tend to be a little bit richer, but even that is difficult to quantify. For example, languages with abundantly used spoken and written forms tend to be a little bit more complex than those with just the spoken form. A vernacular language needs to grow additional muscle before it is ready to power newspapers, scientific journals, or international trade, but this process does not transform the language beyond recognition. If it had simple morphology, it will continue to have simple morphology, and vice versa. Any such new muscle is largely optional in its old communication functions. The old muscle gets just a bit more standardized to allow more precise expression, and even that becomes an obligatory feature of the language only if societal (extralinguistic) changes require that.

What remains is in the eye of the beholder.

Language complexity is subjective

People learn languages using different strategies depending on age and environment. The mother tongue is acquired; languages encountered later in life are learned. It is disputed whether one can skip language acquisition altogether during the first (say) five years of life and still reach fluency in a language later. It is generally accepted that more than one language can be acquired at a very young age, in addition to the ability to learn further languages in adulthood. All such "primary languages" provide a lifelong bias toward what language features we are going to consider "simple" and "natural".

Language learning, unlike acquisition at an early age, is a conscious process which may employ learning strategies such as explicit analysis of grammatical categories, paradigmatic exercises, or exercises involving fictional communication needs which focus on the linguistic form more than on the meaning and context. There's awareness and effort, and therefore also more memories of undergoing such a process.

There is some evidence that this difference between (natural) language acquisition and (conscious) language learning even translates to stimulation of different parts of the brain, when the resulting fluent bilingual speaker uses each respective language. (For example here.)

One of the artifacts of those respective language acquisition/learning processes is that we are mostly unaware of the complexity of the language(s) we learned in early childhood, while we remember quite a lot of the process and pitfalls of learning our "secondary" languages. However, the influence of the primary languages also determines which features of a particular secondary language will be familiar to us ("wow, I can just substitute some different case endings"), or highly exotic and difficult ("oh my, what do you mean by cases again?"). In the end, we are tempted to interpret languages that are typologically different, genetically unrelated, and perhaps also those encountered in more demanding communication functions, as "more complex" than languages whose complexity we take for granted thanks to our innate ability to acquire them without conscious, structured effort.

(Side note: There is a method which allows a native speaker of a language to experience more of its complexity than before. Get hired as a language instructor to foreigners and do your best to excel at the job. Unless you are a professional linguist already, you will soon realize that what you learned about your own language at school included gross oversimplifications. Money back guarantee: if this doesn't work for you as described, I'll take from you any tuition fees you thus earned.)

This applies even when looking back at older forms of one's own primary language. Recent texts are familiar and "simple". Older texts require more effort; ancient texts are in a language different enough that theoretical study of the language, or prolonged immersion, is necessary for (passive) fluency. There's clearly more complexity to tackle the further away one moves from one's own zone of fluency.

Morphology as a turning wheel

So far I have been commenting on language complexity as a whole, rather than just (for example) on the complexity of the system of productive grammatical categories. However, languages can develop along this or that typological axis, thus shifting complexity away from one layer of linguistic description into another. So a more "local" viewpoint can indeed show measurable changes of (local) complexity.

Let me now switch attention to morphology in particular, as a convenient, perhaps even quantifiable, "observation window" into the structure of a language; let's just remember that morphology is just one layer of the linguistic description and not a universal measure of language complexity.

It's time to address the examples in the OP, too.

The Norman invasion of the British Isles in 1066 marked the start of a period of heavy influence of Old French on Anglo-Saxon, which eventually disrupted a lot of its inherited Germanic morphology, shifting the typological characteristics of English from an inflected language toward an analytical language; i.e., language complexity moved from morphology to syntax and phraseology. This process was not instantaneous; it lasted many centuries as the character of the language evolved. The complexity is still there in English as a whole; it has just moved to other layers of the linguistic description.

Here I have claimed an external influence, but it is clear that we can see a similar development when we go from Latin to any (recent) successor language including Old French or Modern French; it seems that inflected languages tend to evolve into analytical ones intrinsically. Analytical languages tend to develop into agglutinative ones, and agglutinative languages into inflected ones. The wheel supposedly keeps turning; no language is ever a pure specimen of an inflected, agglutinative or analytical language, but it is somewhere on this wheel, transitioning from one linguistic type to another.

How long does a single revolution of the wheel take? Unfortunately, longer than the timespan covered by current historical linguistics. We have seen lots of examples of the individual (partial or nearly total) transitions in different language families, but, to my knowledge, we have never seen a series of successor languages walk the whole cycle.

All the languages cited in the OP belong to the Indo-European family. Proto-Indo-European, as we imagine it, was a heavily inflected language. If you like to count noun cases, genders and numbers, you'll find 8, 3 and 3 of those, respectively. Therefore, diachronic comparison within the Indo-European family tends to land on evolution from an inflected language toward an analytical language, although the distance travelled on the metaphorical wheel over the last 5000 years varies widely.

Why does the wheel turn in this direction?

[Figure: Dixon's wheel, showing Inflected to Analytical to Agglutinative and back to Inflected]

It is difficult to say with certainty what keeps this wheel turning. However, if the theory is really as generally applicable as it currently appears it might be, the driving forces must be intrinsic (linguistic) ones. Different language types somehow create room for different processes of further language evolution.

Let me finish with some speculation: a hypothetical example of what I am suggesting in this section.

An analytical language is defined as one that conveys sentence structure through prescribed word order and "grammatical words". That means that grammatical rules often prescribe what the grammatical words end up adjacent to (still as separate words). Some such combinations, especially the obligatory ones, start merging into longer words; there's a "semantic root" and grammatical affixes. As those affixes originated from grammatical words, each affix carries just a single component of meaning.

The new words are highly "logically constructed", but often tediously long, and they resist shortening pressure unevenly, due to phonotactic constraints. So there's a pressure to simplify (in the syntagmatic sense): the affixes influence each other, and some of them eventually merge into a "tight knot" of meanings, all expressed in a fairly economical form which is easy to pronounce. As a trade-off, paradigmatic complexity may have increased greatly through the same process, yielding an inflected language which has kept fewer agglutinative affixes, but which now has one "tight knot" per inflected word, marking the obligatory categories all at once, differently for different lexical items.

After that, there's a new pressure to "equalize away irregularities" across the lexical items; it's like a market consolidation phase, whose extreme outcome can be an analytical language. As long as the inflected "tight knots" are still present on all or nearly all lexical items (of inflected parts of speech), they partly block the agglutination process by keeping the boundary of a word very strict; as soon as a stage close to analytical is reached, the agglutinating process can start all over again.

Thank you for your detailed answer. BTW, as a native German speaker, I disagree that you always consider your native language as simple, as I'm quite well aware of the complexity of my language (and we've certainly spent quite some time in school on things like cases). celtschk‭ 20 days ago

@celtschk - Yes, I'm pointing out some tendencies and theories rather than hard and fast rules. French and English are further along Dixon's wheel than German or Latin. If you consider just morphology, especially just that of nouns, the latter are objectively more complex than the former. Still, your experience with cases in German helps you digest Latin declinations. Jirka Hanika 20 days ago

..."declinations" should read "declensions" :-) Jirka Hanika 16 days ago

