ADMIXTURE is an amazing program for ancestry analysis. The problem is that in unsupervised mode it picks stable old admixtures as "unadmixed" components; after all, every population is admixed if we dig far enough into the past.
It finds the Amerindian and European components in a Mexican pretty easily, but it struggles to distinguish more ancient components unless a population that still corresponds mostly to that component is in the data. That's why unsupervised runs gave us some results at odds with recent research, such as an "Irula" South Asian component.
So how can we get it to uncover such ancient fossils?
I think my last analysis of Africans, which used the new supervised mode, pointed to a method. Faced with a "childless" pole and an "orphan" component that doesn't exist as a modal component in any population in the data, ADMIXTURE tries to fit one to the other. Its algorithm presumably allows for the possibility that some of the variability of the "child" population is no longer present in the "parent" one. If we include poles such as forager populations that didn't contribute significantly to any other population in the data, ADMIXTURE will stretch that pole as much as it needs to in order to include the "orphan". ADMIXTURE necessarily assumes that all variability in the analysed populations is accounted for by the poles, including variability that it is programmed to presume was once present in a pole but that isn't actually represented there.
This is how, in the African analysis, West Africans-Bantus came to dominate "!Kung", even though "San", present in Xhosa and Tswana, was kept local.
Thus it occurred to me that this is a great method to "fish out" components for which neither an unadmixed parent population nor anything close to it survives.
I set up a run with the following poles, all of them known to be relatively "unadmixed" or highly distinctive populations with no known close relatives, such as foragers, semi-foragers and recent former foragers:
1. San (African foragers)
2. Papuans + Melanesians (isolates that may pick up a distinctive "Out of Africa" oceanic migration)
3. Nganasan (Siberian)
4. Koryak (Siberian)
5. Chukchi (Siberian)
6. !Kung (African foragers)
7. Maasai (this seemed reasonably unadmixed in the African run, and I suspected some amalgamation there)
8. Yoruba (representative of WAF Neolithic)
9. Pygmies (all) (African foragers)
10. Hadza (African foragers)
11. Evenki + the similar Yakut and Dolgans (Siberian)
I used Dienekes' run to pick the Siberian populations. I realise now that I amalgamated some of them, as I had used them that way before in some more localized runs, but it shouldn't matter.
I purposefully did not pick any Fertile Crescent populations, as I wanted to see if ADMIXTURE could discover such a component by itself. I also analysed some African and Siberian populations in the same run as a sort of control.
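For anyone wanting to set up a similar run: as far as I understand the ADMIXTURE manual, supervised mode is switched on with the `--supervised` flag and expects a `.pop` file next to the data, with one line per individual in the same order as the `.fam` file, giving the pole name for reference individuals and `-` for everyone else. Here is a minimal Python sketch of building such a file for the eleven poles above. The population labels, the pole names, the helper functions and the idea that the first `.fam` column holds the population label are all illustrative assumptions, not the actual files used for this run.

```python
# Hypothetical mapping: population label as found in the .fam file -> pole name.
# Labels and pole names are made up for illustration; adapt them to your data.
POLES = {
    "San": "San",
    "Papuan": "Papuan_Melanesian",
    "Melanesian": "Papuan_Melanesian",
    "Nganasan": "Nganasan",
    "Koryak": "Koryak",
    "Chukchi": "Chukchi",
    "Kung": "Kung",
    "Maasai": "Maasai",
    "Yoruba": "Yoruba",
    "MbutiPygmy": "Pygmy",
    "BiakaPygmy": "Pygmy",
    "Hadza": "Hadza",
    "Evenki": "Evenki_Yakut_Dolgan",
    "Yakut": "Evenki_Yakut_Dolgan",
    "Dolgan": "Evenki_Yakut_Dolgan",
}


def pop_lines(fam_lines, poles):
    """Return one .pop entry per .fam line.

    Reference individuals get their pole name; everyone else gets "-",
    which leaves their ancestry to be estimated.
    """
    entries = []
    for line in fam_lines:
        if not line.strip():
            continue  # ignore a stray blank line at the end of the file
        # Assumption: the first column (family ID) carries the population label.
        population = line.split()[0]
        entries.append(poles.get(population, "-"))
    return entries


def write_pop_file(fam_path, pop_path, poles=POLES):
    """Read a .fam file and write the matching .pop file, line for line."""
    with open(fam_path) as fam:
        entries = pop_lines(fam, poles)
    with open(pop_path, "w") as pop:
        pop.write("\n".join(entries) + "\n")
    return entries
```

With a `data.pop` written this way next to a (hypothetically named) `data.bed`, the run would be something like `admixture data.bed 11 --supervised`, with K matching the eleven poles. The control populations simply get `-` and are fitted against the poles.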
I divided the results into several tables, but they all come from the same analysis. Sorry for "San" and "Evenki" being the same colour; I don't know why Google Docs is doing this.
I'll offer an interpretation later, and rename the components. I intend to use this method in other regions as well, and if possible with the limited data available to me, design a run with a "master solution" for all populations together.
I'll also present collections of individuals from each population, to show that none of the significant components are "chunky" and so shouldn't be artefacts.