Evolution of Human Languages

An international project on the linguistic prehistory of humanity
coordinated by the Santa Fe Institute
Languages of the World: Etymological Databases
Testing the "Borean" Hypothesis

Murray Gell-Mann, Santa Fe Institute
Ilia Peiros, Santa Fe Institute
George Starostin, Russian State University for the Humanities

The issue of "Proto-World" - the hypothetical common ancestor of all of the world's languages - has remained fascinating for specialists and the general public alike for most of the 20th century, yet is still far from being resolved. Attempts to arrive at a series of "Proto-World" roots with the aid of so-called "global etymologies" have not been taken seriously by most mainstream linguists, because these "etymologies" frequently rest on surface similarities and depend significantly upon data from insufficiently studied language families. Many particular criticisms of the work done in this field by Merritt Ruhlen, John Bengtson, Vaclav Blazhek, and others have been justified. However, as more and more language families receive detailed comparative-historical treatment, suggested comparanda can be reevaluated and filtered through our up-to-date knowledge of linguistic prehistory.

While working on linguistic macrofamilies such as Eurasiatic, Dene-Caucasian, Afro-Asiatic, Austric, Niger-Kordofanian, and Amerind (or, perhaps, several macrofamilies that should be postulated in the place of a unified Amerind), the EHL team constantly had to deal with phonetically and semantically similar words in these macrofamilies that are not merely isolated chance resemblances in modern languages, but can be shown to be reconstructible at deep chronological levels. In fact, by the late 1980s the corpus of such similarities between two of the largest stocks in Eurasia - Eurasiatic and Dene-Caucasian - had become so vast that Sergei Starostin dedicated a special paper to tracing regular phonetic correspondences between the reconstructed protolanguages of these two macrofamilies. To that corpus he later added comparative data from the Afro-Asiatic and Austric macrofamilies. The provisional "macro-macro-family" that united these four large groups was given the name "Borean" (a term originally used by H. Fleming to designate a somewhat different hypothetical taxon).

As EHL expands its focus and delves into the comparative material of Africa and America, it becomes clear that the four major macrofamilies of Eurasia share many of their archaic roots with linguistic stocks found on those continents. The evidence that has already been accumulated, and that continues to grow, is unlikely to be explained away as chance resemblance. Perhaps the best argument for this statement is that the same kinds of resemblances are not found between this expanded version of "Borean" and the few remaining linguistic taxa - such as most of "Indo-Pacific" (a blanket term for several macrofamilies spoken in New Guinea and neighboring regions) and Khoisan (the "click" languages of South Africa).

The entire area of research on this - admittedly still highly speculative - connection essentially consists of trying to answer three questions: (a) Is "Borean" a reality? (b) If yes, what are the limits and the internal classification of Borean? (c) If yes, how old is Borean? The provisional answers that best suit the accumulated evidence at this moment are: (a) Yes; (b) The four macrofamilies of Eurasia may be marginally closer to each other than to languages in the Americas and Africa, but this model could easily change as we learn more about the history of the latter taxa; (c) The age of Borean could be anything in the range of 25,000 - 18,000 BP, but is not likely to exceed this range (as suggested by various lexicostatistical calculations and overall intuitive assessments).

At present, the Borean hypothesis has not yet been formulated in strictly scientific terms (not to mention being in any way validated by the comparative method), mainly because there are still multiple unresolved crucial problems in the history of the macrofamilies that may constitute it. Nevertheless, the resolution of each such problem usually increases the evidence, and it may be hoped that in the near future the Borean taxon will become far better substantiated. With the vast amount of scientifically analyzed data and a set of statistical tools to check the results, the recovery of at least several hundred "Proto-Borean" morphemes is, at any rate, a far more promising task than that of reconstructing a "Proto-World" spoken one hundred thousand rather than twenty thousand years ago. Indeed, confirmation of the reality of "Borean" would suggest the existence of a curious "linguistic bottleneck" around twenty thousand years ago, with only one of the human languages then in existence wiping out most of the others over a period of hardly more than ten thousand years. This would demonstrate that "bottlenecks" of this kind (e.g. the Bantu expansion in Africa, the Indo-European expansion in Eurasia, etc.) are not at all confined to later periods in human history, but could occur at deep prehistoric stages as well.