Tuesday, 14 June 2011

Panasian-Part IV: Malaysia and Indonesia

Part IV (and last) of the unsupervised Panasian set ADMIXTURE analysis.

Fst distances provided by ADMIXTURE:
About the populations in this part:
The "Malay" population consists of Malaysian Malays; "SGMalay" individuals are from Singapore; "MalayIndonesia" individuals are likewise Malay dialect speakers, but from South Sumatra, Indonesia. (The official languages of Malaysia and Indonesia, Bahasa Malaysia and Bahasa Indonesia, are different registers of a dialect continuum called "Malay", historically used as a lingua franca across both countries.)
Temuan speak a language from the "Aboriginal Malay" or "Proto-Malay" group. They are a small group nowadays, and still practice Animism and slash-and-burn agriculture.
Sundanese and Javanese are the main Austronesian-speaking inhabitants of Java and together represent almost half of the Indonesian population (20M and 80M respectively), dominating its politics. The Sundanese and Javanese languages are distinct from the official, Malay-derived Bahasa Indonesia, which most of them nowadays speak as well (some exclusively).

Mentawai are Animist/Christian slash-and-burn (swidden) farmers who plant sago, yams and taro and raise pigs and chickens, much like the original Austronesians and other modern-day isolated Austronesian peoples; they supplement this with a good deal of hunting and fishing. They live in the Mentawai Islands, just off West Sumatra.
Batak Karo and Batak Toba are also Austronesian speakers, from inland Sumatra; they seem to be dry-rice agriculturalists who also cultivate paddies in some regions.

The "Dayak" sample is derived from an (Austronesian-speaking) Indonesian Dayak tribe from East Borneo, who are slash-and-burn agriculturalists. Bidayuh are also Dayaks, but differ from the "Dayak" sample in that they are from Sarawak, Malaysia (Northwestern Borneo). They have presumably received more outside influence than the more Eastern Dayaks, and appear to be more involved in the current plantation economy. (The Iban, represented in a previous run with the main set, are also Dayaks from Western Borneo.)

Toraja are Austronesian speakers from Sulawesi's mountainous interior. They're farmers who remained Animist until recently (they are mostly Christian nowadays).

Kambera, Manggarai, Lamaholot, Lembata and Alorese live in the Lesser Sundas in Southeastern Indonesia. They speak Austronesian languages, and are mostly Christian today. These groups present visible Melanesian admixture.

Naasioi are Melanesians, speaking a Papuan language. They share Bougainville island in Papua New Guinea with Austronesian-speaking groups, and are slash-and-burn farmers.
Kensiu and Jehai are Malaysian Negrito peoples.

Some candid observations (and shameless speculations):
1) Austronesian tongues seem to correlate with the darkgreen component modal in Taiwanese Aborigines. Earlier there seemed to be a weaker correlation with Tai-Kadai languages.
These linguistic families have important similarities, and it is controversial among linguists whether they are branches of a more ancient proto-language or whether the two families simply had geographically close homelands. Their present distribution suggests homeland(s) in Southern China, and the populations analysed from China, including all Han groups, also have darkgreen elements.
Darkgreen elements are, however, largely lacking in the represented Austro-Asiatic-speaking populations (though I would be surprised if the more Southern Chinese-influenced Vietnamese weren't the exception), in Burmese Sino-Tibetan speakers, Uyghur and Okinawans, and are small in Koreans and mainland Japanese (perhaps due to an old Han admixture event or secondary wave?). All these groups do have important "red" elements, however.
This could be interpreted as suggesting an origin, or at least diffusion of Sino-Tibetan and Altaic with Neolithic Expansions directly from the Northern China/Yellow River region before the ethnogenesis of the Han (before admixture from populations from the Yangtze River valley?).
(At this moment I would tend to interpret the Altaic element, even in Central Asians, as being derived early on from the East Eurasian Neolithic just as I previously interpreted their Western admixture as being derived from the Fertile Crescent).

2) Austronesian-speaking groups closer to continental SE Asian influence, such as Bidayuh (SW Borneo) and the populations from Java, tend to have smaller darkgreen and larger darkblue elements.
The darkblue component is important in continental Southeast Asian populations and in those from the Indonesian Archipelago with more historical SE Asian connections. It somewhat correlates with Austro-Asiatic languages; perhaps it indicates an early agricultural wave from China into SE Asia, before the Tai-Kadai-Austronesian one? Darkblue is a bit more distant in Fst from the other East Asian Neolithic components, perhaps because it includes a larger degree of stabilised ancient local variation.
Darkblue in Indonesians decreases with distance from SE Asia. In this context it may have some association with a later expansion of populations (possibly linked to wet rice and other agricultural innovations from SE Asia) out of the more developed and more densely populated Java, a development continuing to this day.
3) Groups further towards the East present an increasing burgundy element, modal in the Papuan-speaking Melanesian Naasioi. These groups are swidden agriculturalists. Their relative success in resisting the farmer waves, compared with distantly related Negrito groups from the Philippines and SE Asia, may hint at some primitive agriculture among coastal Papuan-speaking peoples prior to the Austronesian expansion.
Melanesians such as Naasioi probably have some Austronesian admixture, which may be invisible here due to the lack of reference highland Papuan populations. The burgundy component may be a fusion of mostly "Papuan" with a little Austronesian, a possible reason why it may pull some of the Philippine Negrito variation in the more admixed tribes.
Polynesians also seem to have important Melanesian admixture.

Sorry I didn't include East Asian participants in this run, but the ~50,000 SNPs in this set do not overlap much with the 23andMe ones, so including them would reduce the resolution too much.
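
For anyone curious how such an overlap can be checked, here is a minimal sketch, not the procedure actually used for this run. It assumes both marker lists are available as PLINK .bim-style text (whitespace-separated, variant ID in the second column) and that matching by variant ID is good enough; the function names and the toy data are made up for illustration.

```python
def read_snp_ids(bim_lines):
    """Collect variant IDs (second column) from PLINK .bim-style lines."""
    ids = set()
    for line in bim_lines:
        fields = line.split()
        # Skip blank or malformed lines that lack an ID column.
        if len(fields) >= 2:
            ids.add(fields[1])
    return ids


def overlap_summary(ids_a, ids_b):
    """Count shared variant IDs and the share of each set they represent."""
    shared = ids_a & ids_b
    return {
        "shared": len(shared),
        "fraction_of_a": len(shared) / len(ids_a) if ids_a else 0.0,
        "fraction_of_b": len(shared) / len(ids_b) if ids_b else 0.0,
    }


# Toy marker lists (hypothetical IDs, not real data from either set).
set_a_lines = [
    "1 rs0001 0 1000 A G",
    "1 rs0002 0 2000 C T",
    "2 rs0003 0 3000 A C",
    "2 rs0004 0 4000 G T",
]
set_b_lines = [
    "1 rs0002 0 2000 C T",
    "3 rs0099 0 5000 A G",
]

summary = overlap_summary(read_snp_ids(set_a_lines), read_snp_ids(set_b_lines))
print(summary)
```

With real files, the lines would come from opening each .bim file instead of the toy lists. Matching on IDs alone ignores renamed markers and strand differences, so the count is only a rough guide to how many SNPs would survive a merge.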

Many thanks to Zack from Harappa for drawing my attention to this set; he also wrote and posted the code for converting it to bed format.

