The Origins of Language

Before we look at the origins of language, I want to frame this story by looking at the origins of humans.

Humans are classified under the genus Homo, which contains at least seven known species. The genus belongs to a family called Hominidae (aka the great apes), which includes humans, chimpanzees, gorillas, orangutans and a ton of extinct species.

Here’s a snapshot of key players and their place in the Hominid family tree.

Hominid Timeline (Human Evolution)

But wait! If humans evolved from monkeys, then why are there still monkeys? (Yeah, there are still people out there asking this question. That’s ok.) The very short answer is this. Trees branch. Evolution is not a straight line. It’s the same reason you have cousins. Ok, back to the topic at hand.

The Origins of Language in Hunter-Gatherers

For an awfully long time, various species of humans lived in hunter-gatherer societies. This highly social way of living is thought to have driven the evolution of language, culminating in our species coming to dominate all others. Did Homo sapiens have a special gift for language that exceeded the rest?

Homo sapiens emerged around 300,000 years ago and the fossil record shows it wasn’t long before groups of hunters were able to bring down large prey animals with spears. Such feats required not just advanced tool-making, but complex social skills, which is why we think cavemen were able to plan ahead, strategise and coordinate verbally through a common language (something much more advanced than the communication of other modern-day apes).

It’s a tentative argument – it’s not as if they transferred their language into symbols and annotated their cave paintings. But there are lines of indirect evidence that lead us here.

For example, when humans first stepped onto Australia 45,000 years ago, the native megafauna were doomed, and not just because of the changing climate. Within a few thousand years, Homo sapiens hunters had used their team-working skills to hunt and help drive almost all large animals to extinction. These were beasts they had never encountered before: massive marsupial lions, dragon-like lizards, five-metre snakes, and two-tonne wombats. Nonetheless, coordinated humans reigned supreme.

Similarly, when humans crossed a land bridge from Siberia to Alaska 16,000 years ago, they thrived and spread through the North American continent by coordinating and hunting mammoths, mastodons, reindeer, horses, camels and sabre-toothed cats. Like no other species before, Homo sapiens had a special intelligence which enabled them to dominate the planet.

Was this special new ability based on complex language? It’s certainly believed so. But where’s the direct evidence of these linguistic origins? This seemingly straightforward question puts us in quite a pickle.

Most of our knowledge about ancient human history comes from bits of pottery and statues and networks of small walls excavated by heavily-bearded archaeologists. (Hey, everyone knows archaeologists have big, full, dust-collecting face-bushes. It’s the rule.) So as far as the concrete proof goes, we have to derive our conclusions from written language alone.

The Origins of Language in Agricultural Societies

At least 12,000 years ago*, after all other species of humans had disappeared, Homo sapiens switched from nomadic hunter-gatherer lifestyles to permanently settled agricultural lifestyles. Plant domestication arose independently in four centres around the world, namely in Asia and South America where the climate was more accommodating after the recent glacial period.

(*Archaeologists are still working on this timeline. The vast archaeological ruins discovered at Gobekli Tepe in Turkey date back 11,600 years, and this extraordinary creation didn’t pop up overnight. Nor did the architectural skills and astronomical observations encoded into the rock. Civilised society emerged gradually over an unknown period of time prior to this.)

Once humans were tied to their land for plant cultivation and cattle grazing – and could no longer roam in nomadic tribes – society took on entirely new forms. Tribes grew in population, prompting the creation of law, justice and politics. People took on specific job roles, allowing them to specialise in a single expert craft. Trading then became a necessity, so each family could eat more than a single crop. At the same time, we see the first evidence of written language.

In his spellbinding book, Sapiens: A Brief History of Humankind, Yuval Noah Harari explains three factors that necessitated the creation of symbolic language:

1. The capacity of human memory is limited. As agricultural societies developed rules for ownership, trading and taxation, written records maintained accounts and resolved disputes.

2. Our knowledge dies when we die. Details of land ownership were recorded so that valuable property could be retained from generation to generation, outlasting the deaths of individuals.

3. Evolutionary pressures have adapted our brains to recall lots of botanical, zoological and topographical information, but not long lists of numbers. Written language was a tool for recording large amounts of difficult-to-remember mathematical data.

So while it’s highly likely that language has existed for hundreds of thousands of years around the camp fire – and possibly among ancient civilisations yet to be identified – the current hard evidence for language extends back only 5,000 years, in the form of scripts devised to facilitate a rapidly changing way of life.

The Origins of Written Language: Sumerian Scripts

Mesopotamia Map

The Sumerians were an ancient civilisation who lived in southern Mesopotamia – present day Iraq – and by 3000 BC had developed their own written script. They combined two types of signs which they pressed into clay tablets.

The first was a mixture of base-6 and base-10 numeral systems. They had signs for 1, 10, 60, 600, 3,600 and 36,000. A legacy of the Sumerians we enjoy today is that we divide the day into 24 hours and circles into 360 degrees.
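To get a feel for how those six sign values could add up to a big number, here is a minimal Python sketch. It assumes a purely additive system where you always grab the largest sign that still fits. That is an illustration of the arithmetic only, not a claim about how any real tablet was laid out, and the names `SIGN_VALUES` and `to_sign_counts` are made up for this example.

```python
# Sign values named in the text, largest first.
SIGN_VALUES = [36000, 3600, 600, 60, 10, 1]


def to_sign_counts(n):
    """Break n into counts of each sign value, largest first.

    Assumption: a simple additive system, which is an illustrative
    simplification rather than a reconstruction of Sumerian practice.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    counts = {}
    for value in SIGN_VALUES:
        # divmod gives how many of this sign fit, and what is left over.
        counts[value], n = divmod(n, value)
    return counts


# 29086 = 8 x 3600 + 4 x 60 + 4 x 10 + 6 x 1
for value, count in to_sign_counts(29086).items():
    print(f"{count} x {value}")
```

Notice how each sign value is 10 or 6 times the one before it, which is where the mix of base-10 and base-6 shows up.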

The other Sumerian signs represented animals, land, crops, dates, people, and so on, underlining the fact that their written language was entirely functional, limited to accounting ledgers. They had no desire to record legends or philosophy or art.

And this is why the oldest recorded language in human history reads: “29,086 measures barley 37 months, Kushim” which might be interpreted as: “A total of 29,086 measures of barley were received over the course of 37 months, signed Kushim”.

The Oldest Evidence of Human Language

Partial Script vs Full Script

While technically a written language, early Sumerian writing was only partial script, meaning it could only convey certain types of factual information. Sumerians could record trades and property ownership, but they couldn’t use it to write love stories or legends.

This is opposed to full script, like modern language, which has the capacity to communicate the full spectrum of human experience. Our symbols are so versatile and abundant that we can use them to mimic our spoken language.

Partial Script vs Spoken Language

Adapted from Sapiens: A Brief History of Humankind by Yuval Noah Harari

The partial script was by design, though. Scholars believe the Sumerians invented their partial script not to mimic spoken language, but to compensate where human memory failed, in order to fuel business and society. In other words, the earliest known written language arose entirely for the purposes of accounting.

Over time, the Sumerians expanded their partial script into a full script which scholars call cuneiform. By 2500 BC, cuneiform was used to issue decrees, to record oracles, and to write letters. At the same time, the Egyptians developed their own full script known as hieroglyphics. Full scripts were also developed in China by 1200 BC and in Central America by 1000 BC.

However, partial scripts are not dead. They still play valuable roles in our modern world. Just like the original Sumerian ledgers, musical notation and quantum physics formulas are so functionally specific that they can’t extend beyond their technical means. Nonetheless they perform their narrow-band functions exceedingly well.

Sheet Music Illustration

Partial script is not an intuitive way of thinking. You actually have to spend time learning the rules of the script and then adapt your way of thinking to it. The reverse is true with full script (which reflects spoken language) because our ever-evolving language is shaped by the most common modes of human thought.

This is one reason why Einstein’s Theory of Relativity is so special. Being able to think like a physicist is rather rare.

Somewhere in between partial script and spoken language lies mathematics – arguably the world’s dominant language today. So where did maths come from? We’re going to head over to 9th century India to see how base-10 maths first arose.

The Hindu Numerals

Maths today is based on a partial script of symbols from 0 to 9, which you might know as the Arabic numerals. However, this name is unfair, because the numerals were originally developed in India. See their origins by comparison:

Glyph Comparison - Indian, Arabic, Latin Numerals

The Hindu numerals were popularised by the mathematician al-Khwarizmi (Latinised as Algoritmi) in a book written around AD 820. Invading Arabs saw the usefulness of the system, extended it with arithmetic symbols, and spread it throughout the Middle East and Europe on their travels. By the 14th century it had even replaced the incumbent Roman numeral system.

This maths script – which now dominates all 21st century writing systems – has itself birthed another revolutionary script.

Binary Code

Binary code is used by all computers and is a base-2 system which consists of just two signs: 0 and 1. So how in the name of our robot overlords does binary work with just two symbols?

The simple way to describe it is to count upwards as you usually would, but using only numbers made up exclusively of 1s and 0s. It’s weird at first, but you’ll get it. In decimal, we run out of digits after 9 and there are fixed rules about how to proceed. Binary is the same except we run out of digits after 1, and so that reset rule is applied much more often.

One way of looking at it is to count in serial: 0… 1… (ignore 2, 3, 4, 5, 6, 7, 8, 9)… 10… 11… (ignore 12, 13, 14… all the way to 99)… 100… 101… (ignore a bunch more) 110… 111… (ignore a bunch more all the way to 999)… 1000… 1001… 1010… 1011…

How to Count in Binary
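The skipping trick above can be sketched in a few lines of Python. This is just one way to show it: `binary_by_skipping` (a name invented for this example) counts up in ordinary decimal and keeps only the numbers written entirely with 0s and 1s, while `count_in_binary` uses Python's built-in `format` to do a proper base-2 conversion. The two should agree.

```python
def binary_by_skipping(limit):
    """Count upwards in decimal, keeping only numbers made of 0s and 1s."""
    kept = []
    n = 0
    while len(kept) < limit:
        # Keep n only if every one of its digits is a 0 or a 1.
        if set(str(n)) <= {"0", "1"}:
            kept.append(str(n))
        n += 1
    return kept


def count_in_binary(limit):
    """Write 0, 1, 2, ... (limit numbers in total) in base 2."""
    return [format(n, "b") for n in range(limit)]


print(binary_by_skipping(12))
print(count_in_binary(12))
```

Both lines print the same twelve values, running from 0 up to 1011, which matches the sequence spelled out above.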

Another way to look at binary is to do the conversion:

How to Convert Binary
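If you'd rather see the conversion as code, here is a small Python sketch under the usual place-value assumption: each binary digit is worth a power of 2, doubling as you move left. The function names `binary_to_decimal` and `decimal_to_binary` are invented for this example (Python's own `int("1011", 2)` does the first job in one call).

```python
def binary_to_decimal(bits):
    """Add up the place values (1, 2, 4, 8, ...) of every 1 in the string."""
    if not bits or any(b not in "01" for b in bits):
        raise ValueError("expected a non-empty string of 0s and 1s")
    total = 0
    # Walk from the rightmost digit, whose place value is 2 ** 0.
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


def decimal_to_binary(n):
    """Repeatedly halve n, collecting the remainders as binary digits."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))
    # Remainders come out lowest place first, so flip them.
    return "".join(reversed(digits))


print(binary_to_decimal("1011"))  # 8 + 0 + 2 + 1 = 11
print(decimal_to_binary(11))      # back to "1011"
```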

Binary can be encoded into many different forms, such as hexadecimal (base-16) which uses the regular numerals 0-9 and the letters A-F (eg, 11F4B2). In web colours, which is a common application of hexadecimal code, 11F4B2 is mint green. The colour of this text, dark grey, is 333333. Hex colour codes follow a simple set of rules based on combining amounts of red, green and blue. This screams for another article in itself. Save that for another day.
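As a quick taster, here is a rough Python sketch of how a six-character web colour splits into its red, green and blue parts. It assumes the common RRGGBB layout, where each pair of hex digits is one channel from 0 to 255. The function name `hex_to_rgb` is made up for this example.

```python
def hex_to_rgb(code):
    """Split an RRGGBB hex colour into (red, green, blue), each 0-255."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError("expected six hex digits, e.g. 11F4B2")
    # Each pair of hex digits is one colour channel.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


red, green, blue = hex_to_rgb("11F4B2")
print(red, green, blue)  # 17 244 178

# Each channel is really just eight binary digits underneath.
print([format(channel, "08b") for channel in (red, green, blue)])

print(hex_to_rgb("333333"))  # (51, 51, 51)
```

Lots of green, a fair bit of blue and barely any red gives the mint green. Equal amounts of all three give a grey.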

Future Language

The artificial super-intelligence that will come to dominate humankind will think in binary – at least, to start with. We can barely imagine what general AI will look like or how it will behave, let alone what exotic languages it will inevitably develop on its own.

I guess all we can say for now is that any digital intelligence created by humanity will have to start with the languages we give it. For a good week or two, before AI leaves us for dust, at least we’ll have that in common.


