Saturday, 14 May 2011

Restricted Pole Run: Part IV- experiments, conjectures

Due to Blogger problems I had to delay this last post.

This is a very experimental part of the run, not altering any other results. Also my interpretations here are quite speculative and I don't have very high confidence in them.

I found out a while back that when a run includes many individuals who are similar to each other but very different from the remaining samples, ADMIXTURE will pull one of the restricted poles towards that group, irrespective of any relation between the actual pole individuals and this different population.
Thus if I had included all the South Asian data-set in my last run, one of the poles would simply become dominated by them, and would peak in the Irula. I didn't want to do that yet, since South Asia is a complex place genetically, and would only make results less clear. Still, I wanted some clues as to which Fertile Crescent elements made their way there.
By including just a few individuals from the area in the run, the pole-pulling problem can be avoided, and ADMIXTURE will instead try to fit them into the non-South Asian-dominated poles.
This means that some results are necessarily artificial for these samples. For instance no adequate pole for Ancestral South Indian (ASI) is present. Since ASI are somewhat "Asian" when compared to Fertile Crescent populations, I expected ASI elements to be mainly allocated to the Siberian poles.
So bear in mind in this run, "Siberian" in South Asians is mostly not actual Siberian or Turkic admixture. It is simply the least inadequate pole for the ASI element. It doesn't matter anyway for this experiment since what I really wanted to check was which Fertile Crescent elements were present -that is, which patterns are present in ANI.
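To illustrate the "least inadequate pole" effect, here is a minimal toy sketch. It is not ADMIXTURE's own optimizer, just a plain EM estimate of one individual's proportions with the pole allele frequencies held fixed. All frequencies, the pole names "FC" and "Siberian", and the hidden source standing in for ASI are made up for illustration, and expected allele counts are used instead of real genotypes.

```python
def fit_proportions(genotype, pole_freqs, iterations=500):
    """EM estimate of ancestry proportions for one individual, poles held fixed.

    genotype: allele counts per SNP (0..2; fractional expected counts allowed).
    pole_freqs: dict of pole name -> allele frequency per SNP (strictly in 0..1).
    """
    names = list(pole_freqs)
    q = {k: 1.0 / len(names) for k in names}  # start from equal proportions
    for _ in range(iterations):
        totals = {k: 0.0 for k in names}
        for j, g in enumerate(genotype):
            # Mixture probability of each allele under the current proportions.
            p_alt = sum(q[k] * pole_freqs[k][j] for k in names)
            p_ref = sum(q[k] * (1.0 - pole_freqs[k][j]) for k in names)
            for k in names:
                f = pole_freqs[k][j]
                # Expected number of allele copies attributed to pole k.
                totals[k] += g * q[k] * f / p_alt + (2.0 - g) * q[k] * (1.0 - f) / p_ref
        q = {k: totals[k] / (2.0 * len(genotype)) for k in names}
    return q


# Hypothetical allele frequencies at six SNPs (made up for illustration).
poles = {
    "FC":       [0.9, 0.8, 0.2, 0.1, 0.7, 0.3],
    "Siberian": [0.1, 0.2, 0.8, 0.9, 0.3, 0.7],
}
hidden_source = [0.3, 0.4, 0.6, 0.7, 0.5, 0.5]  # stands in for ASI; has no pole in the run

# An individual with half FC ancestry and half ancestry from the hidden source.
expected_counts = [2.0 * (0.5 * fc + 0.5 * h) for fc, h in zip(poles["FC"], hidden_source)]

q = fit_proportions(expected_counts, poles)
print({k: round(v, 3) for k, v in q.items()})
```

In this toy the individual has no Siberian ancestry at all, yet a sizeable "Siberian" share appears, simply because the hidden source sits somewhat closer to that pole than FC does.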

So this part is highly experimental, but the additional individuals analysed here don't alter the remaining results appreciably (if removed, other individuals in the run still retain their admixture patterns).
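One simple way to check that kind of stability is to compare each shared individual's proportions between a run with the extra samples and a run without them. The sketch below assumes both runs use the same components in the same column order; the sample IDs and numbers are hypothetical, not taken from the spreadsheets.

```python
def max_shift(q_with, q_without):
    """Largest absolute change in any component, per individual present in both runs."""
    shifts = {}
    for sample_id, row in q_with.items():
        if sample_id in q_without:
            other = q_without[sample_id]
            shifts[sample_id] = max(abs(a - b) for a, b in zip(row, other))
    return shifts


# Hypothetical proportions (columns in the same order in both runs).
run_with_extras = {
    "Chuvash1": [0.70, 0.20, 0.10],
    "Totonac1": [0.15, 0.05, 0.80],  # extra sample, absent from the other run
}
run_without_extras = {
    "Chuvash1": [0.71, 0.19, 0.10],
}

shifts = max_shift(run_with_extras, run_without_extras)
print(shifts)
```

Small shifts for every shared individual would be consistent with the extra samples not disturbing the rest of the run.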
In addition you may have noticed I didn't include an Amerindian pole in this run, for two reasons. Firstly, "Amerindian" components in Europeans tend to be absorbed by the NMPC, since they exist in populations that are mostly NMPC (including the Chuvash used as a restricted pole). Siberian tends to detach because many NMPC-rich populations don't have much Siberian, but the same can't be said for the "Amerindian" I found earlier, so they tend to get mixed up (except when using an FC pole without any of it, such as the Egyptians).
The other reason was that I wanted to check which poles ADMIXTURE would allocate to Amerindians themselves if denied an exclusive pole for them. Amerindians are quite distinctive in PCA/MDS and in unsupervised runs. They cluster far away from Western populations, further away even than Siberians.
If "Amerindian"-like populations were present in Paleolithic Europe, we would expect them to be more "western" than their very "eastern" plotting position would imply -and namely more "westerly" than East Siberians.
But what if Amerindians plot in the "far east" because for some reason they have a few highly distinctive genetic variants, but are otherwise not so distinctive? When denied their own pole, these distinctive variants wouldn't be allowed to pull them away. ADMIXTURE would be forced to allocate the remaining, more conventional variability to conventional poles.
Two things might happen:
1) Amerindians would be allocated 100% to some Far-Eastern Siberian pole- which would support their plotting position being derived from their assumed Far-Eastern departing position into the New World.
2) Amerindians would be split into more conventional poles, and their more "western" position, once the few exotic elements are set aside, would be revealed. This would support western routes into the Americas, or perhaps a fast sprint after the end of the Ice Age through recently ice-cleared far Northern Eurasia (mostly bypassing the then more southeastern Siberian populations).

I thus introduced 5 unadmixed (no significant Spanish or European elements) Totonac individuals. As expected just 5 individuals weren't enough to pull the remaining poles towards them too much- they didn't get any Amerindian pole.
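The post does not say how the poles are restricted in practice. Purely as an assumption, if one wanted to approximate this set-up with ADMIXTURE's supervised mode (the `--supervised` flag, which reads a `.pop` file with one line per sample in `.fam` order and `-` for samples of unknown ancestry), the labels could be generated along these lines. The sample IDs and pole names below are hypothetical, and plain supervised mode may not match the restricted-pole method actually used here.

```python
def pop_lines(sample_ids, pole_of):
    """One label per sample, in .fam order; '-' marks samples with no assigned pole."""
    return [pole_of.get(sample_id, "-") for sample_id in sample_ids]


# Hypothetical sample order and pole assignments.
fam_order = ["Chuvash1", "Nganasan1", "Totonac1", "Totonac2"]
pole_of = {"Chuvash1": "NMPC", "Nganasan1": "Siberian1"}

lines = pop_lines(fam_order, pole_of)
print("\n".join(lines))
```

Leaving the Totonac rows as `-` is what denies them a pole of their own, so they can only be described in terms of the labelled populations.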

I actually expected Amerindians to come out as some Far-Eastern Siberian+Nganasan pole pattern. But this is what I actually got:

Siberian1 peaks in the Nganasan. Siberian2 (blue) in Yakuts and Mongolians. Siberian3 peaks in Far-East Siberians (Chukchis and Koryaks). Siberian3 is actually based on "Mongolian5", but "ran away" from them.

EBengal1 is Razib Khan from Gene Expression.
UKIND is British (explaining high WMPC) with some Indian.
I also picked 3 random Kalash; I'm not sure whether they are distinctive mostly because of inbreeding or because of long-term isolation.
Naturally for these populations part of the admixture components is artificial. There is no high "Siberian" in Indians, but it is the "least inadequate" pole to represent ASI in this run.
As for the Totonac, none of the 3 poles is actually adequate, since Amerindians are a highly distinctive population. I should point out that the NMPC in the Totonac does not correspond to European elements: the Totonac sample is quite homogeneous, with very little such admixture.

EMPC is the predominant Fertile Crescent element in India. With all the FC poles available to choose from, there is no likely reason for ADMIXTURE not to pick the most adequate one. There is some NMPC as well. The lack of WMPC+NC in these populations, which is present in the steppe pastoralists (even in the Kyrgyzstani), points IMO to distinct migrations from similar origins. The colonization of the Steppe with the development of advanced pastoralist lifestyles seems to have occurred after the Second, Out-of-Egypt, wave. The colonization of India, departing from the same region (Iranian plateau, Caucasus, South Mesopotamia?), seems to have happened before the Egyptian wave, but possibly after the EMPC one. The earliest Northwest Indian Neolithic settlements are dated to about 6000 BC, which is in accordance with this possibility.
The representation of ASI-like segments variously by Siberian3, Siberian1 and Siberian1+WMPC+NWAf may be related to ASI diversity among these populations. If South India was mostly settled, and even then with a high aboriginal persistence, only after the secondary EMPC wave developed (as opposed to a possible Northwestern settlement by a "primary" NMPC wave), this could point to a native incipient Neolithic, at least in South India.

One conjectural model:
1) An earlier less advanced expansion by a high NMPC containing population influencing only "easy" Fertile Crescent toolkit niches in the Northwest
2) Later an advanced secondary Neolithic expansion containing high EMPC from a developed Near Eastern Neolithic Centre, with much improved seeds and techniques finally making some way into Southern and Eastern India, while mostly replacing the earlier wave in the Northwest?
3) Maybe followed by a small reexpansion of the Northern element from the periphery (now mostly EMPC but still with more NMPC than Southerners?)
I'm not sure why ADMIXTURE didn't find here the small WMPC+NC elements apparently typical of Central Asian populations. But one possibility is that Central Asian demographic influence in India is overestimated in other models.

About the Totonac results: Indians had an "Eastern element" (ASI) that had to be assigned to an Eastern pole (the Siberian poles). South Indians seem to have a more "Southerly" ASI variant, which was perhaps artificially allocated to the WMPC+NWAf pole. This can be seen in PCA plots.
Amerindians on the other hand are "far eastern" even relative to Siberians. There could be a number of explanations for this, but I think it's interesting ADMIXTURE chose to represent them with Siberian3+NMPC+Siberian2. These don't correspond to actual admixture events (much like "Chinese Mexicans"). They could be partly due to much "Amerindian"-like admixture in North Europeans being allocated to the NMPC pole (since I didn't include an Amerindian pole and high NMPC populations all have residual "Amerindian"-like elements)-making it slightly more "Amerindian"-like than some other poles available.

This is pure speculation but it's as if ADMIXTURE, when forced to ignore some possible Amerindian exotic elements (due to having to pick exclusively from among pole populations without as much of them), is telling us that Amerindians are otherwise "more Western" than they seem to be in the PCA plots.
It was already strange that "Chinese Mexicans" had a smaller "Chinese" component than a Totonac one. Greater proximity between Chinese and Europeans in PCA plots would imply that a Chinese pole would tend to overestimate the Amerindian element not the reverse. East African overestimated the African component in African-Americans as predicted.

So, in summary: the Totonac obviously don't have any real NMPC, and possibly not much of the Siberian admixture they seem to have either. ADMIXTURE component patterns simply "plot" the Totonac's position relative to the poles available, while excluding elements not present in any of the respective aggregated components.
They have some affinity to NMPC only because they're denied their proper poles in this run. I think this is because NMPC has some slight affinity (having "absorbed" them in this run) to possible "Amerindian"-like variants in North Europeans, plus the Totonac being more "Western" than they seem, as long as a few conjectural exotic small elements are forcibly ignored by the run set-up.

Here is the participant's spreadsheet. Full run spreadsheet.
You may also be interested in checking out an interesting 3D PCA model at Harappa.

1 comment:

  1. Briefly examining my results, it seems that the EMPC component largely corresponds with the Ancestral North Indian component, although it seems that a part of the Ancestral South Indian also seems to have been picked up as EMPC as the ASI is only marginally more related to East-Eurasian populations. Hence, given that the West Eurasian element in South Asians is largely derived from West Asia, the EMPC component being the dominant element is but natural. The Siberian2 component seems to be picking up the rest of the ASI, along with a bit of the Siberian3. I reckon that the Siberian2 component will be closer to the ASI and/or Onge component seen in other admixture runs. Diogenes mentions that the Siberian3 component peaks in far-eastern Siberians like the Chukchis and the Koryaks. If we assume geographic proximity as the sole factor for closeness, the generic Siberian2 component should be closer to ASI than the far-eastern one as ASI is closer to SEA and East Asian populations in general. My Siberian3 score is consistent with the bits of East Asian I get on admixture runs all over the genome blogging projects. Incidentally, my N. Asian + Siberian + SEA score over at HAP was 3%. As for the NMPC, once again rather consistent with the 4-5% Northern/North East European scores I usually get. In this regard however, TamilBrahmins2's scores seem especially elevated as compared to the other two Tamil Brahmins. The WMPC + NWAf might well be some sort of Neolithic signal from the Indus valley civilization?