The perfect language

There have been many attempts to construct languages. Some of them, mostly computer programming languages, have succeeded. While it’s been tempting to come up with a perfect human language (“perfect” defined in many ways: free of ambiguity, efficient, aesthetically beautiful), such attempts have failed.

Why is it so hard? It has to do with how humans use language. Language reflects the way we think, and we think in a fuzzy way, so language is fuzzy too. If we try to make it precise, we’ll struggle to convert our thoughts into words.

A good example of this conflict is Esperanto. Zamenhof intended it to be a perfectly phonetic language. People struggle with the mismatch between spelling and pronunciation all the time, so why not solve the problem at the outset? It turned out to be an impossible mission. Exceptions to the rules appeared (seemingly out of nowhere) as soon as the language was released and started living its own life.

This is directly related to the fuzzy thinking that happens in our brains. We don’t just abide by strict rules of grammar; we implicitly trade off the consistency of structure (which is rigid) against the freedom of antistructure (which creates exceptions). It’s as if our brains were picking the most efficient representations of our thoughts.

Why are efficient representations not always structured? For example, why can’t language employ a Huffman-like encoding where frequently used concepts have “short” representations? It’s because we don’t know ahead of time what we will use frequently, and because internalizing a representation is costly, our brains have to pick representations opportunistically. Greedy processes such as this one often create unstructured systems.
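To make the Huffman point concrete, here is a minimal Python sketch. The concepts and usage counts are invented for illustration, and the helper names (`huffman_code_lengths`, `average_length`) are my own. It builds a Huffman code from the frequencies we expect, then measures how well that code holds up if actual usage turns out to be the reverse.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from freqs."""
    if len(freqs) == 1:
        return {symbol: 1 for symbol in freqs}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(weight, next(tie), {symbol: 0}) for symbol, weight in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them gets one bit deeper.
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


def average_length(lengths, freqs):
    """Average bits per use when symbols occur with the given frequencies."""
    total = sum(freqs.values())
    return sum(lengths[s] * freqs[s] for s in freqs) / total


# Hypothetical usage counts, invented for illustration only.
expected = {"water": 50, "food": 30, "fire": 15, "wheel": 5}
actual = {"water": 5, "food": 15, "fire": 30, "wheel": 50}

planned = huffman_code_lengths(expected)  # code designed up front
ideal = huffman_code_lengths(actual)      # code we would have wanted in hindsight

print(average_length(planned, expected))  # 1.7 bits: great if the forecast holds
print(average_length(planned, actual))    # 2.75 bits: worse than a flat 2-bit code
print(average_length(ideal, actual))      # 1.7 bits: only achievable with hindsight
```

Under these made-up numbers, the carefully designed code ends up costing more than giving every concept the same two bits. The structure only pays off if the forecast was right.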

If you doubt that an ideal human language is impossible, consider analogous systems, for example the organization of information in a workplace. I’m baffled by how little sense it makes: information seems to be scattered everywhere, there are no standards for where to store documentation, and links between sources of information are broken. It’s a particularly big pain when I try to apply a precise framework to the problem of information retrieval, for example when I want to automate processes that operate on that information. Naïvely I tell myself that if I could start all over again, information would be well organized and easy to find, links would be maintained, and standards would drastically reduce the complexity of the solutions needed to tie all the information together. It’s a naïve view because that would be an incredibly inefficient approach (small systems benefit from unstructured solutions; by the time the systems get big, there are too many exceptions to come up with a good systematic solution).

This conflict between an efficient short-term (micro) solution and an effective long-term (macro) solution occurs all over the place. For example, game theorists know of many games where locally optimal solutions are at odds with globally optimal ones (say, the prisoner’s dilemma). The problem occurs because of a lack of information: in the case of language, the brain doesn’t know which representations will be used frequently; in the case of technology, it’s unclear what kind of standards are needed (implementing standards for standards’ sake is wasteful); in the case of the prisoner’s dilemma, the prisoners don’t know if they can trust one another. There is no good solution.
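The prisoner’s dilemma is small enough to check by brute force. The sketch below uses illustrative, textbook-style sentence lengths (the exact numbers are an assumption; only their ordering matters) and hypothetical names of my own. It compares what each prisoner picks on their own with the outcome that is best for the pair.

```python
SILENT, BETRAY = "silent", "betray"

# Years in prison as (mine, theirs), keyed by (my move, their move).
# Illustrative numbers only; lower is better.
YEARS = {
    (SILENT, SILENT): (1, 1),
    (SILENT, BETRAY): (3, 0),
    (BETRAY, SILENT): (0, 3),
    (BETRAY, BETRAY): (2, 2),
}


def best_response(their_move):
    """The move that minimizes my own sentence, given the other prisoner's move."""
    return min((SILENT, BETRAY), key=lambda mine: YEARS[(mine, their_move)][0])


# Locally optimal: whatever the other prisoner does, betraying is better for me.
local_choice = {their_move: best_response(their_move) for their_move in (SILENT, BETRAY)}

# Globally optimal: the pair of moves with the fewest total years served.
global_best = min(YEARS, key=lambda moves: sum(YEARS[moves]))

print(local_choice)  # {'silent': 'betray', 'betray': 'betray'}
print(global_best)   # ('silent', 'silent')
```

Each prisoner, reasoning alone, betrays and they serve four years between them. Staying silent would have cost two, but neither can count on the other to do it.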

Which is also why there will be no perfect language.