I've been away from ADMIXTURE for a couple of weeks, too busy with other stuff.
This time I've decided to tackle East Asia. I got access to a new dataset from the Pan Asian SNP Consortium, with some 50,000 SNPs. Sadly, few of them appear to overlap with my current dataset, so merging the two would leave me with just a few thousand markers. I may try anyway in the future, but for the time being I decided to play with the Pan-Asian set on its own. To simplify things I didn't use the South Asian or Yoruba samples in this series, but I did include White Utahns to check for possible West Eurasian (Fertile Crescent) influence.
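For anyone wanting to check this kind of overlap themselves, here is a minimal sketch. It assumes both datasets are in PLINK format, where the second column of each .bim line is the variant ID; the function names and the toy rs IDs are made up for illustration. Matching on IDs alone says nothing about strand or allele coding, so the true usable overlap may be smaller still.

```python
from typing import Iterable, Set


def snp_ids(bim_lines: Iterable[str]) -> Set[str]:
    """Collect variant IDs (second column) from PLINK .bim-style lines."""
    ids = set()
    for line in bim_lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            ids.add(fields[1])
    return ids


def shared_snps(bim_a: Iterable[str], bim_b: Iterable[str]) -> Set[str]:
    """Return the variant IDs present in both datasets."""
    return snp_ids(bim_a) & snp_ids(bim_b)


# Toy data (hypothetical IDs), standing in for two real .bim files.
panasian = ["1 rs1001 0 1000 A G", "1 rs1002 0 2000 C T", "2 rs2001 0 500 A C"]
other = ["1 rs1002 0 2000 C T", "2 rs2001 0 500 A C", "3 rs3001 0 700 G T"]

common = shared_snps(panasian, other)
print(len(common), "shared SNPs:", sorted(common))
```

With real files one would pass open file handles instead of the toy lists, e.g. `shared_snps(open("a.bim"), open("b.bim"))`, where the file names are placeholders.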
I intend to apply some old tricks to make components more informative and less tied to isolated groups, but first I wanted to see how this set would behave in an unsupervised ADMIXTURE analysis. Namely, I want to check which unsupervised components are interesting and coherent with ethnographic/historical data, so I can pick them for supervised analysis and hopefully gain some further insights.
Over the next few days I'll be presenting a series of regionally split results.
Unsupervised results are good for inter-population comparisons. Most components likely don't represent any particular ancient population. A certain amount of noise from small components is also to be expected.
The following results are at K=9.
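For readers who want population-level numbers rather than per-individual bars, here is a rough sketch of how they can be computed. It assumes the usual ADMIXTURE output layout, a .Q file with one row per individual and K ancestry proportions per row, plus a population label for each individual in the same order; where those labels come from is left open, and the function name and toy values are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def population_means(
    q_rows: Sequence[Sequence[float]], labels: Sequence[str]
) -> Dict[str, List[float]]:
    """Average the per-individual ancestry proportions within each population."""
    if len(q_rows) != len(labels):
        raise ValueError("need exactly one population label per Q row")
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for row, pop in zip(q_rows, labels):
        if pop not in sums:
            sums[pop] = [0.0] * len(row)
        for k, value in enumerate(row):
            sums[pop][k] += value
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in total] for pop, total in sums.items()}


# Toy example with K=3 (the runs in this post use K=9).
toy_q = [[0.5, 0.25, 0.25], [1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
toy_labels = ["PopA", "PopA", "PopB"]

for pop, means in population_means(toy_q, toy_labels).items():
    print(pop, means)
```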
About the populations:
JapaneseML are from the mainland, as presumably are most "Japanese" without the ML qualification. They were separated in the set and I didn't fuse them. JapaneseRyukyu are from Okinawa.
SGChinese are Chinese from Singapore; BJG from Beijing.
Taiwan Hakka and Taiwan Minnan (or Hoklo) are Han Chinese, who make up the overwhelming majority of the island's current population. They are the people generally meant when speaking of "Taiwanese" nowadays. They are, however, recent arrivals, presumably having overwhelmed the native aboriginals (TaiwanAmi and TaiwanAtayal) with more efficient agricultural/social technologies only in the last 500 years or so.
TaiwanAmi and TaiwanAtayal are much older Taiwanese populations, but some discontinuity in the Paleolithic-Neolithic transition in Taiwan may imply an exogenous origin (possibly from early Neolithic China). They speak "proto-Austronesian" languages, and the Austronesian wave of language and agricultural lifestyle seems to have spread from there (or perhaps from Southeast mainland China, with a side branch going to Taiwan?).
The Austronesian Expansion seems to have been a sort of "first wave" of agriculturalists (maybe secondary in some regions). Much later, a "second wave" of advanced agriculturalist Han Chinese again had a major demographic effect, reaching other Austronesian lands as well; it is apparent even in the Philippines today, where some 20% of the population has recent Chinese ancestry. Without Western interference, this possible Han "second wave" might have spread further still, given the large amounts of land then still occupied by foragers in insular Southeast Asia and Oceania. Indeed, Negrito forager and semi-forager tribes are still under pressure today from their agricultural neighbours.
Jinuo and Karen are Burmese populations with Sino-Tibetan tongues (the same family as Tibetan and Chinese languages such as Mandarin and Cantonese). Mon speak an Austroasiatic tongue but also live in Burma.
Mlabri, Mamanwa and Kensiu are all forager/semi-forager tribes. Mlabri live in Southeast Asia; Mamanwa are Philippine Negritos, while Kensiu are Southeast Asian Negritos. Naasioi are Papuans.
It's interesting that "Red" is very close to "DarkGreen" and "LightGreen". The genetic distance is similar to that between closely related components from other Neolithic centres I've run before.
Some ancient, stabilized admixture with very different local forager groups, which is present in these unsupervised components, may even explain some of the distance, so the true affinity may be higher than seen here.
On the other hand, "Forager-components" are much less similar both to the presumed Neolithic ones and to other forager components. Actually, since I strongly suspect these modern-day foragers are hybridized with their agriculturalist neighbours, the distance between different "Negrito"-modal components may be even larger.
The distance between the Western White Utahns and these groups seems to be roughly similar to the distance between foragers and agriculturalists, and to that between different groups of foragers, but much larger than the distance between agriculturalists. A possible explanation is multiple waves of forager-swamping agriculturalists from a single centre or group of related centres in the region.
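To make "distance between components" concrete, here is one simple way such a number can be obtained from per-component allele frequencies (ADMIXTURE writes these to a .P file, one row per SNP and one column per component). This is only a sketch: it uses a basic Nei-style Fst (total minus within-component heterozygosity, summed over SNPs, divided by total heterozygosity), which is an assumption on my part and not necessarily the estimator behind the distances discussed above. The function names and toy frequencies are illustrative.

```python
from typing import List, Sequence


def pairwise_fst(freqs_a: Sequence[float], freqs_b: Sequence[float]) -> float:
    """Nei-style Fst between two components, as a ratio of sums over SNPs."""
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(freqs_a, freqs_b):
        mean = (a + b) / 2
        h_total = 2 * mean * (1 - mean)  # heterozygosity of the pooled pair
        h_within = (2 * a * (1 - a) + 2 * b * (1 - b)) / 2  # mean within-component
        numerator += h_total - h_within
        denominator += h_total
    return numerator / denominator if denominator > 0 else 0.0


def fst_matrix(p_columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """All pairwise distances; p_columns[k] holds component k's frequencies."""
    return [[pairwise_fst(ci, cj) for cj in p_columns] for ci in p_columns]


# Toy frequencies for three components over four SNPs (made-up values).
toy_components = [
    [0.10, 0.80, 0.50, 0.30],
    [0.15, 0.75, 0.55, 0.35],  # close to the first component
    [0.90, 0.20, 0.10, 0.95],  # very different from both
]

for row in fst_matrix(toy_components):
    print(["%.3f" % value for value in row])
```

In the toy data the first two components come out close to each other and far from the third, which is the kind of pattern described above for the presumed Neolithic components versus the forager ones.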
Minor forager admixture in farmers and major farmer admixture in foragers would both be invisible to unsupervised ADMIXTURE if they are ancient and no "pure" control groups are available.
I'll reserve more discussion for the supervised run. For now I'd venture to say that the red, light green, dark green and blue components are all closely related, tend to exist in a gradient of admixture with one another in similar ethnic groups, and may correspond to different but related Neolithic waves, probably all from China. There is some interesting correlation with language groups.
Ryukyu Japanese may lack some components that are more important in mainland Japanese and Koreans because of greater geographic insulation from Chinese secondary waves, perhaps much like Sardinians and Basques in the West.
Taiwanese Aborigines look promising as representatives of a vast farmer demographic wave. The presence of dark green in China may indicate its origin there, since agriculture is older and was presumably more advanced on the continent.
In the next few days I will post results for other Austronesian and Southeast Asian peoples. Then I'll do a restricted pole-supervised run.