The list of most appropriate concepts was established through careful evaluation of concept lists used in similar studies (SI Appendix, section 3), and lexical cognates were identified by experts in Sino-Tibetan historical linguistics using the comparative method supported by state-of-the-art annotation techniques.
Second, we apply Bayesian phylogenetic methods to these data to estimate the most probable tree, outgroup, and timing of Sino-Tibetan under a range of models of cognate evolution; similar methods have been applied to several other families of languages, including Indo-European (18–20), Austronesian (12), Semitic (21), and Bantu (22).
A second group presents Sino-Tibetan basal topology as a rake, with Chinese being one of several primary branches (10).
The more recent part of the tree is thus well resolved.
The Sino-Tibetan family comprises about 500 languages (1) spoken across a wide geographic range, from the west coast of the Pacific Ocean, across China, and extending to countries beyond the Himalayas, such as Nepal, India, Bangladesh, and Pakistan (map, SI Appendix, section 2).
Speakers of these languages have played a major role in human prehistory, giving rise to several of the world’s great cultures in China, Tibet, Burma, and Nepal.
This complexity has led to claims that Sino-Tibetan is one of the greatest challenges that comparative-historical linguistics currently faces (ref. …).
The earliest paleographical inscriptions in Chinese date to before 1400 BCE, and Chinese has an abundant and well-studied literature dating back to the early first millennium BCE.
The Shāng Kingdom, the Chinese polity associated with these inscriptions, was centered on the lower Yellow River valley.