The Central Siberian Origin for Native American Y Chromosomes

Am. J. Hum. Genet., 64:619-628, 1999

Fabrício R. Santos,1,2 Arpita Pandya,1 Chris Tyler-Smith,1 Sérgio D. J. Pena,2 Moses Schanfield,3 William R. Leonard,4 Ludmila Osipova,5 Michael H. Crawford,6 and R. John Mitchell7

1Department of Biochemistry, Oxford University, Oxford; 2Departamento de Bioquímica, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil; 3Analytical Genetic Testing Center, Inc., Denver; 4Department of Anthropology, University of Florida, Gainesville; 5Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia; 6Department of Anthropology, University of Kansas, Lawrence; and 7School of Genetics and Human Variation, La Trobe University, Bundoora, Australia


Y chromosomal DNA polymorphisms were used to investigate Pleistocene male migrations to the American continent. In a worldwide sample of 306 men, we obtained 32 haplotypes constructed with the variation found in 30 distinct polymorphic sites. The major Y haplotype present in most Native Americans was traced back to recent ancestors common with Siberians, namely, the Kets and Altaians from the Yenissey River Basin and Altai Mountains, respectively. Going further back, the next common ancestor gave rise also to Caucasoid Y chromosomes, probably from the central Eurasian region. This study, therefore, suggests a predominantly central Siberian origin for Native American paternal lineages for those who could have migrated to the Americas during the Upper Pleistocene.

Address for correspondence and reprints: Dr. Fabrício R. Santos, Departamento de Biologia Geral, ICB/UFMG, Caixa Postal 486, 31.270-910 Belo Horizonte, MG, Brazil.


     The pre-Columbian settlers of the New World, who gave rise to the present-day Native Americans, are commonly believed to have come from Siberia, through the Bering land bridge, in the period 30,000-12,000 years before present (ybp). These conclusions are based on cultural, morphological, and genetic similarities between populations of the New World and Siberia (for a review, see Crawford 1998). Although affinities between Asians and Native Americans have been acknowledged for a long time, no particular population in Siberia, except for some Asian Eskimos and their relatives, has been pointed out as directly descended from ancient groups related to the American founder populations (Cavalli-Sforza et al. 1994). Morphological studies have suggested a place of origin for Native Americans along the Amur River region (Crawford 1998), and, more recently, the investigation of mtDNA lineages (Kolman et al. 1996; Merriwether et al. 1996), as well as of retrovirus infections (Neel et al. 1994), has suggested that Mongolia, instead of Siberia, is the source of populations that share the more recent ancestors with the founding population of the Americas.

     Siberia is an inhospitable place for human settlements, but the first hominids may have arrived as early as 260,000 ybp (Waters et al. 1997). The population density has never been high, and it is still vastly uninhabited. These conditions favor a high degree of population isolation and genetic drift, which could have played an important role since the first migrants left for the Americas. Furthermore, the number of indigenous Siberian populations has been decreasing since the beginning of the Russian territorial expansion in the seventeenth century (Forsyth 1996). Many populations have become extinct, and others--such as the Kets, who are inhabitants of the Yenissey River Basin--are now reduced to <1,000 individuals and are speakers of an isolated language unrelated to any other extant languages (Grimes 1996).

     The three-migrations theory (Greenberg et al. 1986) postulates that each of the major Native American groups--namely, the Amerindians, Na-Denes, and Eskimo-Aleuts--came to the Americas in three distinct migratory waves from Siberia, during the period 12,000-6,000 ybp. This theory received support a posteriori from the analysis of mtDNA (Torroni et al. 1993) and several autosomal markers (Cavalli-Sforza et al. 1994), but these data extended the entrance time of these groups to the Americas to 34,000-6,000 ybp. This model has been criticized mainly for the claims of the existence of a genetic homogeneity in present-day Amerindians (Schanfield 1992), who are considered to be the descendants of the first migrants. In addition, further mtDNA analyses have been shown to be consistent with a single migration into the Americas (Merriwether et al. 1995) and with a single migration with further population reexpansion (Forster et al. 1996; Bonatto and Salzano 1997). Some anthropometric studies have also revealed the existence of skeletons not typical of Mongoloids among the oldest hominids in the Americas, suggesting an earlier, non-Mongoloid migration (Neves and Pucciarelli 1991; Lahr 1995).

     Recent studies of the human Y chromosome have shown a major haplotype present in >90% of nonadmixed southern and central Amerindians (Pena et al. 1995; Santos et al. 1995a), indicating a genetic homogeneity and a pronounced founder effect in the formation of these populations. Later, some northern Amerindians and also Na-Dene and Eskimo speakers were studied, and, despite their higher level of admixture (Crawford 1998), these populations also displayed the same major haplotype, although not in such a high frequency (Santos et al. 1996b; Underhill et al. 1996). This major Native American haplotype initially was identified as the combination of alphoid heteroduplex (h) type II and the microsatellite DYS19 A allele (Pena et al. 1995; Santos et al. 1995a), and it subsequently was shown to also be defined by a C→T transition in the DYS199 locus (Underhill et al. 1996). The study of several Siberian populations (Karafet et al. 1997; Lell et al. 1997) has identified the presence of the DYS199 T allele only in Asian Eskimos and related tribes from the Beringia region. The presence of the T allele in far northeastern Siberia was explained as the result of either a back migration of Native American populations bearing the DYS199 T allele or simply a split of populations inhabiting Beringia, after the glaciation period. This suggests that the DYS199 T allele is a useful marker for the identification of Y haplotypes originating after the first migration to the Americas or Beringia but that more-informative (ancient) markers are needed to trace the origin of these Y chromosomes within Asia.

     In this study, the Y chromosomes of five linguistically distinct Siberian populations, as well as those of Native Americans, Europeans, Indians, Mongolians, central East Asians, and Africans, were analyzed with a set of seven polymorphic systems identifying 30 variable loci in the nonrecombining portion of the Y chromosome. The worldwide distribution of haplotypes and their evolutionary network show the recent common ancestry of Caucasoid and Native American Y chromosomes, as well as the identification of intermediate Y haplotypes in Siberian populations from the Altai Mountains and the Yenissey River Basin, namely, the Altaians and Kets, respectively.

Material and Methods

DNA Samples

     Most of the 306 male samples were obtained as DNA or were extracted from plugs prepared for pulsed-field gel electrophoresis (Mathias et al. 1994). Samples from Europeans (most were British), Indians (India and Sri Lanka), Africans (Kenyans, Pygmies, and San), central East Asians (Chinese and Japanese), Mongolians (Khalkhs), and Siberians (Buryats, Yakuts, Evenkis, Altaians, and Kets) were subsets of those described elsewhere (Mathias et al. 1994; Zerjal et al. 1997). Ten samples, from south and central Amerindians and a Na-Dene, were purchased from the National Institute of General Medical Sciences, and an additional 10 Native American samples (not Aleut-Eskimos) came from paternity tests in North America.

DNA Polymorphisms

     The variants of 6-kb and 4.1-kb alphoid units were first identified by hybridization with the pYl probe (Mathias et al. 1994) and subsequently were checked by HindIII digestion of h PCR products (Santos et al. 1995b), to avoid confusion because of the comigration of 4.1-kb bands on the gel from both the 6-kb and the 4.1-kb units. For most cases, 92R7 typing was performed by hybridization (Mathias et al. 1994), but PCR was also used to type some individuals (M. Hurles, F. R. Santos, A. Pandya, and C. Tyler-Smith, unpublished data).

     Additional systems--namely, DYS199, SRY-1532, Tat, the Y Alu polymorphism (YAP), and h--were scored, after PCR, in a MJR PTC200 thermocycler, with a 12.5-μl reaction volume, 1 μM of each primer, 200 μM dNTPs, 1.5 mM MgCl2, 1 U Taq (Bioline) per tube with 1× KCl buffer (Bioline), and other changes as follows. The locus DYS199 (Underhill et al. 1996) was amplified for 30 cycles at 94°C for 20 s, 61°C for 20 s, and 72°C for 30 s, with a modified reverse primer, 5'-AGG TAC CAG CTC TTC CCA ATT-3', containing a G→C base change (underlined) that creates an artificial MfeI restriction site when the DYS199 C allele is present. This PCR/RFLP protocol using the MfeI enzyme (New England Biolabs), resolved in native polyacrylamide gels stained by silver (Santos et al. 1996a), allowed us to determine without doubt the allele state at this site that had been detected previously by an allele-specific-primer protocol (Underhill et al. 1996). The locus SRY-1532, with an A→G mutation (Whitfield et al. 1995), was amplified for 30 cycles at 94°C for 20 s, 60°C for 20 s, and 72°C for 30 s, with the primers SRY1 (5'-TCC TTA GCA ACC ATT AAT CTG G-3') and SRY2 (5'-AAA TAG CAA AAA ATG ACA CAA GGC-3') and 0.5 U Taq per tube. The G allele was detected by the presence of the DraIII (Boehringer) restriction site on a 1.5% agarose gel in 0.5× Tris/acetate/EDTA. The YAP system was scored, as described elsewhere (Hammer and Horai 1995), in the PCR conditions described above but with Promega enzyme and buffer. The Tat polymorphism (Zerjal et al. 1997) and the h system (Santos et al. 1995b, 1996a) were detected and classified as described previously. Most individuals were also typed for the tetranucleotide microsatellite DYS19 (Santos et al. 1996a).
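The readout of the DYS199 PCR/RFLP assay can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes that the polymorphic base lies immediately 3' of the modified reverse primer (the text does not say so explicitly), so that the primer strand is extended with the complement of the DYS199 allele, and it uses the MfeI recognition sequence CAATTG. The function name `mfei_cuts` is hypothetical.

```python
# Minimal sketch of the DYS199 PCR/RFLP readout (assumptions flagged below).
MFEI_SITE = "CAATTG"                      # MfeI recognition sequence
REVERSE_PRIMER = "AGGTACCAGCTCTTCCCAATT"  # modified reverse primer quoted in the text
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mfei_cuts(dys199_allele: str) -> bool:
    """Return True if the PCR product would carry an MfeI site.

    Assumption: the polymorphic base sits directly 3' of the reverse primer,
    so the primer strand is extended with the complement of the allele.
    """
    extended = REVERSE_PRIMER + COMPLEMENT[dys199_allele]
    return MFEI_SITE in extended


for allele in ("C", "T"):
    print(allele, "cut" if mfei_cuts(allele) else "uncut")
```

Under these assumptions, only the C allele completes the site, which matches the behavior described for the modified primer.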

Population Genetics of Y Chromosomes

     Haplotype frequencies and gene diversities (Nei 1987) were calculated for all populations. A parsimonious network was constructed either manually or by median network analysis (Bandelt et al. 1995), with the knowledge of the molecular mechanisms of the h-system mutations (Santos et al. 1996a) and other loci (Mathias et al. 1994; Hammer and Horai 1995; Jobling and Tyler-Smith 1995; Whitfield et al. 1995; Underhill et al. 1996). On the basis of this network, a haplotype distance matrix was constructed by use of the number of mutation steps between each pair of Y haplotypes. The hierarchic distribution of Y chromosome diversity, measured as the variance components among individuals, populations, and geographic groups, was computed by use of analysis of molecular variance (AMOVA) software. Genetic distances between populations were calculated, and their significance was tested by use of a permutation procedure (Excoffier et al. 1992). Population pairwise FST's and other genetic distances (Excoffier et al. 1992) were used to draw neighbor-joining and UPGMA trees, with the PHYLIP package (Felsenstein 1993), that were visualized by use of TreeView software (Page 1996).
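As an illustration of the gene-diversity calculation, the following Python sketch uses the usual unbiased estimator, n/(n-1) × (1 - Σp²), which is assumed here to be the form of the Nei (1987) statistic that was applied. The haplotype labels and counts are hypothetical and are not taken from the data of this study.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical sample of 10 chromosomes: 7 of one haplotype, 3 of another.
sample = ["hapA"] * 7 + ["hapB"] * 3
diversity = gene_diversity(sample)
print(round(diversity, 3))
```

A population fixed for a single haplotype gives a diversity of 0, and the value approaches 1 as haplotypes become more numerous and more evenly distributed.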

Results

Worldwide Distribution of Y Haplotypes

     This study comprised a sample of 306 men from populations encompassing distinct linguistic affiliations. They are representatives of different geographic areas expected to be informative for the Americas settlement study. The analysis with seven polymorphic systems revealed the variation at 30 distinct loci in the nonrecombining portion of the Y chromosome, which allowed the discrimination of 32 haplotypes among 306 men. The major Amerindian haplotype (Pena et al. 1995; Underhill et al. 1996) is described here as haplotype 31, which is associated with several markers, such as h type II, the alphoid 4.1-kb units, the DYS199 T allele, the 92R7 HindIII- allele, and also the DYS19 microsatellite A allele (data not shown), as well as with the ancestral states of the polymorphisms YAP and Tat. Haplotype 10, differing from haplotype 31 only by the mutation at DYS199, was very frequent (30%) in our Native American sample and was found exclusively among the North American Indians; in addition, it was also observed in a Mongolian and four Indians. Haplotype 20, which is similar to haplotypes 10 and 31, was seen in a single North American Indian and in some populations from the central region of Siberia. It was particularly frequent in a sample of the rapidly disappearing Ket population (70%) and also was found in some Altaians (17.4%) and a single Mongolian. Haplotype 23, which is very different from haplotypes 31, 10, and 20, was seen in a single Na-Dene and could be a more recent migrant haplotype from Asia, since it is most frequent in Mongolia (42%) and also is seen in many Siberians. Haplotype 1, also similar to haplotype 10 and the most frequent in Europe (53%), is also present in India (14.5%) and was found in 20% of the Native Americans, exclusively in the samples collected for paternity tests in North America, but it is absent from Siberia or central East Asia. European ancestry was confirmed for at least one of these Native American samples with haplotype 1 in the paternity-test report. Therefore, the presence of haplotype 1 in North American Indians can be explained as a result of recent admixture with Europeans, whereas haplotypes 10, 20, and 23 cannot be explained in the same way, because they are absent from Europe.

The Y Haplotype Network

     The 32 different Y haplotypes were connected into a parsimonious network assuming no recombination, because all markers were located in the Y-specific region. Information on the mutation mechanisms that characterize the variability of the h system (Santos et al. 1995a, 1995b, 1996a) was used, as well as some additional information about the known ancestral states for DYS199 (Underhill et al. 1996), YAP (Hammer and Horai 1995), Tat (Zerjal et al. 1997), and SRY-1532 (Whitfield et al. 1995) and also the inferred ancestral state for 92R7 (Mathias et al. 1994; Jobling and Tyler-Smith 1995; Jobling et al. 1997). Variability of the 6-kb and 4.1-kb alphoid units was partially associated with the h system (Santos et al. 1995b), and most of the Y chromosomes seemed to have both the 6-kb and 4.1-kb units (Mathias et al. 1994; Santos et al. 1995b; F. R. Santos, A. Pandya, and C. Tyler-Smith, unpublished data). The deletion of the 6-kb units can generate chromosomes bearing 4.1-kb divergent units only, and further deletion of the 4.1-kb units generates chromosomes containing no divergent units. Since deletion events that cause the loss of many units of alphoid DNA are relatively frequent (Mathias et al. 1994; Santos et al. 1995a, 1995b, 1996a), we allowed these events to be recurrent on the network.

     The occurrence of an extra reverse mutation at the SRY-1532 locus is represented as a recurrent step in this parsimony network, leading to haplotype 32. It is supported by the association with 92R7 alleles, by the specific association with h type II, and by analysis with an additional 19 Y markers (F. R. Santos, A. Pandya, and C. Tyler-Smith, unpublished data). In addition, the geographic distribution of haplotype 32 is very similar to that of haplotype 1, its deduced immediate ancestor, and is quite distinct from that of haplotype 19, which shares with haplotype 32 the same allele A at SRY-1532. The same network was obtained with the procedure of median network analysis (Bandelt et al. 1995), which allows the resolution of networks containing such recurrent markers. Our unique network shows the sequential accumulation of mutations used to trace several Y lineages from the ancestral haplotype. The most likely root for this network is haplotype 19, seen only in an African San, because it bears the ancestral states for all the loci for which this information is known or inferred. A recent study with several new biallelic Y markers (Underhill et al. 1997) also supports the conclusion that the San haplotype 19 is the most ancestral Y chromosome. Following from the probable root, haplotypes 3, 13, and 10 are direct ancestors of the Native American haplotype 31, and other related haplotypes, such as haplotypes 20 and 1, share with haplotype 31 the common ancestor haplotype 10. The frequent and widespread haplotype 3 is probably very old, because it gave rise to most, if not all, of the Y chromosomes found outside Africa. Haplotype 13 also is probably old but is very rare, which could indicate that the 92R7 mutation happened quite soon after the origin of haplotype 13, producing haplotype 10. However, a simple deletion of the 6-kb alphoid units in an individual with haplotype 3 can lead to a recurrent haplotype 13. By using other Y markers (F. R. Santos, A. Pandya, and C. Tyler-Smith, unpublished data), we found that the two Chinese individuals with haplotype 13 are recurrent types, because of a de novo deletion of the 6-kb units. Thus, interpretation of the distribution of the few haplotype 13 individuals should be made with care. Fortunately, haplotype 10, the immediate ancestor of haplotype 31, is defined by the point mutation at 92R7 and, together with all chromosomes derived from it, makes up the 92R7 lineage that is important for the tracing of the major migrant Y chromosome to the Americas.
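The way a network of this kind yields a haplotype distance matrix (the number of mutation steps between each pair of haplotypes) can be sketched in Python. The edge list below covers only the part of the network described in the text around haplotype 10, and it treats every edge as exactly one mutational step; that is an assumption of this sketch, in particular for the link between haplotypes 10 and 1, for which the text gives no step count. The function names are illustrative.

```python
from collections import deque

# Partial network from the relationships described in the text. Treating each
# edge as exactly one mutational step is an assumption of this sketch.
EDGES = [(3, 13), (13, 10), (10, 31), (10, 20), (10, 1), (1, 32)]


def build_graph(edges):
    """Undirected adjacency sets keyed by haplotype number."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def steps_from(graph, start):
    """Breadth-first search: number of steps from start to every haplotype."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def distance_matrix(edges):
    """Nested dict: matrix[a][b] is the step count between haplotypes a and b."""
    graph = build_graph(edges)
    return {h: steps_from(graph, h) for h in sorted(graph)}


matrix = distance_matrix(EDGES)
print(matrix[31][20])  # steps between haplotypes 31 and 20 in this toy network
```

In this toy network, haplotypes 31 and 20 are two steps apart because both are one step from haplotype 10.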

Global Diversity of Y Chromosome Haplotypes

     Our study used markers defining many branches of the 92R7 lineage, with a consequent bias toward a better resolution of populations containing this lineage, despite the fact that most variation was found in the h polymorphism, with no such apparent bias (Santos et al. 1995b, 1996a). For this reason, the gene diversity (Nei 1987) for each population usually was increased when different 92R7-lineage derivatives were present, which also could have increased slightly the within-population variance calculated with AMOVA (discussed below). Therefore, this study was oriented toward a detailed investigation of Y lineages that are interesting with regard to the peopling of the Americas and should not be considered a broad and unbiased description of all worldwide Y lineages.

     The genetic structure of these Y chromosome data was analyzed in detail by use of AMOVA (Excoffier et al. 1992), which allows an estimation of the relative distribution of genetic diversity in three hierarchic levels: among individuals, among populations, and among geographic groups. The AMOVA resulted in the values 59%, 25%, and 16%, respectively, for worldwide Y chromosome diversity. The value of 59% of the total Y chromosome variability found among individuals is relatively low, compared with the value of 85% obtained for autosomal DNA polymorphisms (Barbujani et al. 1997). Thus, the higher degree of interpopulation and geographic diversity (25% + 16% = 41%) of Y chromosomes observed in this study emphasizes the usefulness of Y chromosome haplotype analysis, to discriminate between populations and to elucidate past male migrations within and across continents (Jobling and Tyler-Smith 1995; Santos and Tyler-Smith 1996).

Y Chromosome Population Trees

     AMOVA (Excoffier et al. 1992) also generated a matrix of FST analogs between populations of Y chromosomes. This procedure avoids the use of allele frequencies to calculate genetic distances, since the loci in the nonrecombining portion of the Y chromosome are not independent. Otherwise, if we used only the haplotype frequencies (considering the Y chromosome as a single locus, as expected), we would lose the information of shared ancestry. When running AMOVA, we computed the similarity of populations sharing haplotypes related by descent, by taking into account both haplotype frequencies and molecular differences between haplotypes (see Material and Methods). These pairwise FST's computed by AMOVA were used to draw neighbor-joining and UPGMA gene trees (Felsenstein 1993), to display the relationship of Y chromosome populations. Similar trees were obtained with two other distances (data not shown). By use of a nonparametric permutation test (Excoffier et al. 1992), the calculated FST distances were shown to be significant (P < .05), except between the Buryats and Yakuts, the closest groups on the trees. In all the trees, the Native American Y chromosomes clustered with Kets, Altaians, and Caucasoids (Europeans and Indians). European admixture cannot explain this cluster, because if we exclude from the analysis all haplotypes present in Siberians and Amerindians that are also found in Europe (such as haplotype 1, which appears in four Native Americans), the tree remains very similar (data not shown). This tree structure also did not change when we used the inferred frequency 87% for haplotype 31 among 132 Native Americans, published previously (Pena et al. 1995; Santos et al. 1995a; Underhill et al. 1996).
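The idea of an FST analog that uses both haplotype frequencies and molecular distances, together with a permutation test, can be sketched as follows. This is a simplified Python illustration for a single level of population structure, not the AMOVA software used in the study. It takes the pairwise distances as given (whether step counts or their squares are supplied is left to the caller, an assumption of this sketch), and the haplotype labels, sample sizes, and distances in the example are hypothetical.

```python
import random


def phi_st(groups, dist):
    """FST analog from pairwise distances, one level of population structure.

    groups: list of lists of haplotype labels, one list per population.
    dist:   function(label_a, label_b) -> distance between two haplotypes.
    """
    def ssd(items):
        # Sum of pairwise distances divided by the number of items.
        n = len(items)
        total = sum(dist(items[i], items[j])
                    for i in range(n) for j in range(i + 1, n))
        return total / n

    everyone = [x for g in groups for x in g]
    n_total, n_pops = len(everyone), len(groups)
    ssd_within = sum(ssd(g) for g in groups)
    ssd_among = ssd(everyone) - ssd_within
    msd_among = ssd_among / (n_pops - 1)
    msd_within = ssd_within / (n_total - n_pops)
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_pops - 1)
    var_among = (msd_among - msd_within) / n0
    denom = var_among + msd_within
    return var_among / denom if denom else 0.0


def permutation_p(groups, dist, n_perm=999, seed=0):
    """Shuffle individuals among populations and compare with the observed value."""
    rng = random.Random(seed)
    observed = phi_st(groups, dist)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if phi_st(shuffled, dist) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical example: two populations fixed for haplotypes two steps apart.
def toy_dist(a, b):
    return 0 if a == b else 2


observed, p_value = permutation_p([["X"] * 5, ["Y"] * 5], toy_dist)
print(observed, p_value)
```

Two populations that share no haplotypes and have no internal variation give the maximum value of 1, whereas populations drawn from the same pool give values near 0.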

     Although some Siberian and Native American Y chromosomes show remarkably close association with Caucasoid Y chromosomes, other Siberian populations are very distinct, clustering with other Asians. A particular Siberian cluster is formed by Buryat and Yakut Y chromosomes, mainly because of the common origin of most of their Y chromosomes, with the high frequency of the Tat mutation (table 1; Zerjal et al. 1997). Thus, our findings from the haplotype distribution, the haplotype network, and the Y chromosome population trees suggest a high interpopulation differentiation in Siberia, probably because of distinct founder populations and subsequent genetic drift. In addition, these data identify the group of Ket and Altaian Y chromosomes that are related to those among Native Americans and Caucasoids, whereas the Evenki Y chromosomes are related to those of Mongolians. Yakut and Buryat Y chromosomes, which previously were shown to have a common origin with Uralic Y chromosomes (Zerjal et al. 1997), form another distinct cluster.
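The population trees in this section were drawn with the PHYLIP package. Purely to illustrate how UPGMA clustering groups populations from a matrix of pairwise distances, a minimal Python sketch follows. The population names and distances are hypothetical placeholders, not values from this study, and the function is not a substitute for PHYLIP.

```python
def upgma(labels, dist):
    """Return the merge order of UPGMA clustering.

    labels: list of population names.
    dist:   dict mapping frozenset({a, b}) -> distance.
    Returns a list of (cluster_a, cluster_b, height) tuples.
    """
    sizes = {label: 1 for label in labels}
    d = dict(dist)
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair; ties are broken by name for determinism.
        pair = min(d, key=lambda p: (d[p], sorted(map(str, p))))
        a, b = sorted(pair, key=str)
        merged = (a, b)
        merges.append((a, b, d[pair] / 2))
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        # Drop every entry that still refers to the two merged clusters.
        d = {p: v for p, v in d.items() if a not in p and b not in p}
        sizes[merged] = size_a + size_b
    return merges


# Hypothetical pairwise distances between three placeholder populations.
toy = {
    frozenset(("pop_A", "pop_B")): 0.1,
    frozenset(("pop_A", "pop_C")): 0.5,
    frozenset(("pop_B", "pop_C")): 0.7,
}
merges = upgma(["pop_A", "pop_B", "pop_C"], toy)
for step in merges:
    print(step)
```

The closest pair is joined first, and each later join uses the average distance to all members of the existing cluster, which is why closely related populations end up as neighboring tips.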

Discussion

     A human Y chromosome phylogeny was used to trace the origins of the major founder haplotype of the Americas, haplotype 31. The worldwide distribution of Y haplotypes associated with the information of sequential mutations, displayed in the network, allowed the construction of a map showing the likely pathway of Y chromosomes migrating to the Americas. The present-day distribution of haplotypes related to haplotype 31 can be explained by a radiation from central Eurasia through a northern migration route to the Americas and a southern route to the Indian subcontinent. This dichotomy is supported by the absence of haplotypes 1, 10, 20, and 32 in China and Japan (table 1), including in the analysis of another 56 Chinese and 138 Japanese samples (F. R. Santos and C. Tyler-Smith, unpublished data). A migration route from central Eurasia to northeastern Siberia during the Upper Pleistocene was suggested recently (Lahr and Foley 1994), and the occurrence of several Caucasoid lineages in the Indian subcontinent can be explained by the immigration of Indo-European speakers from central Eurasia after 5,000 ybp (Cavalli-Sforza et al. 1994).

     The major Native American haplotype 31 is present on both sides of Beringia, most likely because of an American or Beringian origin of the mutation in the DYS199 locus (Karafet et al. 1997; Lell et al. 1997). Its immediate ancestor, haplotype 10, is a rare haplotype (11 of 306 individuals) seen only in North America (n = 6), India (n = 4), and Mongolia (n = 1). An old population bearing haplotype 10, a Native American/Siberian/Caucasoid common ancestor, has been placed somewhere in central Eurasia. Haplotypes 1 (Caucasoid), 20 (Siberian and Native American), and 31 (Native American) are derived from this ancestor. The most common European chromosome, haplotype 1, appeared in four Native American samples from paternity tests in North America; thus, they very likely could be due to recent admixture. Haplotype 20, another descendant of haplotype 10 by a simple alphoid locus-deletion step, is very frequent in Kets and was found in some Altaians, all of whom were shown to also have the DYS19 A allele (data not shown), which is also present in most individuals with haplotype 31 (Pena et al. 1995; Santos et al. 1995a; Underhill et al. 1996). Recently, the Ket language was suggested to be closely related to the Na-Dene language (Greenberg 1996), and the resemblance of Kets to Native Americans and Caucasoids, with regard to physical appearance (Forsyth 1996) and Y chromosomes (this study), makes them the most likely central Siberian population to share the same recent ancestors. The Altaians, a common denomination for seven formerly distinct Turkic populations, exhibit very diverse Y haplotypes and could have acquired their Y chromosomes from many neighboring tribes, including the Kets (Forsyth 1996).

     Our study can be compared to current research on the peopling of the Americas. Recent archeological and anthropological studies of the first settlement of the Americas are revealing many alternative migration routes and older dates for settlement (Roosevelt et al. 1996), as well as raising doubts about the Mongolian origins of the first migrants (Neves and Pucciarelli 1991; Lahr 1995). The multiple-dispersals model, suggested recently by paleoanthropologists (Lahr and Foley 1994), claims that the first migrants to the Americas were from a Southeast Asian stock, whereas our Y chromosome data suggest a northern Eurasian route of migration. However, they also proposed two distinct dispersals from central Eurasia to northeastern Siberia and Europe, one in the middle Upper Pleistocene (50,000-15,000 ybp) and another in the late Upper Pleistocene (15,000 ybp to present). The former could be the source of the Y chromosomes in those who first migrated to the Americas through Siberia, as well as the source of the Y chromosomes in those colonizing Europe in the Paleolithic.

     The very recent find of the 9,400-year-old skeleton of the Kennewick man, which displays some Caucasoid characteristics, and his contemporary, the Spirit Cave mummy, suggests that the earliest migrants could be distinct from present-day populations (Morell 1998b). Possible genetic relationships between Eurasians and Native Americans are suggested by the presence of the rare mtDNA haplogroup X in both population groups, which apparently is absent in Siberia (Morell 1998a). Alternatively, in our study, the Y chromosome data reveal a common ancestor (haplotype 10) between Native Americans and Europeans, who left some rare descendants in Siberia, among the Kets and Altaians. However, the presence of the most common European haplotype 1 in the Americas is more likely explained by recent European admixture than as a remnant of a pre-Columbian migrant. Our Y chromosome data, when compared with morphological and mtDNA data, could imply another migration of typically Mongoloid people, who would have left phenotypic traces in their Native American descendants without contributing many of their Y chromosomes. This pattern of unequal paternal and maternal contributions in the gene pools of several populations has been characterized and discussed in detail by Poloni et al. (1997).

     The major Native American Y haplotype occurs in high frequencies among Amerindians, Na-Denes, and Aleut-Eskimos (Pena et al. 1995; Santos et al. 1995a, 1996b; Underhill et al. 1996; Karafet et al. 1997; Lell et al. 1997; Rodriguez-Delfin et al. 1997; Underhill et al. 1997). It represented 90% of 90 nonadmixed South American Indians in our previous studies (Pena et al. 1995; Santos et al. 1995a) and 60% of 412 Native American Y chromosomes analyzed by other groups (Underhill et al. 1996; Karafet et al. 1997; Lell et al. 1997; Rodriguez-Delfin et al. 1997; Underhill et al. 1997), including Y chromosomes from tribes with a very high level of admixture, especially in North America (Santos et al. 1996b; Crawford 1998). The presence of this founder Y haplotype in the Americas suggests a single major migration and is compatible with a settlement model incorporating a population differentiation of all Native Americans in Beringia, as suggested by recent mtDNA studies (Forster et al. 1996; Bonatto and Salzano 1997). The first migrants bearing a proto-Caucasoid Y chromosome (haplotype 10) would have come from the region of central Siberia to Beringia 30,000 ybp (Cavalli-Sforza et al. 1994; Underhill et al. 1996). The mutation in the DYS199 locus, which produced haplotype 31, could have happened in this Pleistocene Beringian population, which would have experienced an expansion and migrated south to the Americas, through the Alberta ice-free corridor. Subsequently, the collapse of this corridor 20,000-14,000 ybp (reviewed in Bonatto and Salzano 1997) would have isolated the population that was still in Beringia from the recent migrants in the Americas, who, after a major founder effect, would give rise to the Amerindians. During this time of isolation, new minor Siberian migrants could have come to Beringia, and, at the end of the glaciation (12,000-10,000 ybp), these Beringians finally could have migrated to the Americas, originating the Na-Dene and Eskimo-Aleut speakers, with both still retaining the major haplotype 31 (Underhill et al. 1996; Karafet et al. 1997; Lell et al. 1997) and other Y chromosome lineages in frequencies higher than those in Amerindians, exemplified by haplotypes 10, 20, and 23 in North American Indians (table 1). These chromosomes could represent later migrations from central Siberia or Mongolia, despite the possibility that present-day individuals with haplotype 10 could be descendants of the first migrants prior to the acquisition of the DYS199 mutation. Other scenarios, involving later dates (<15,000 ybp), for the first settlement of the Americas are likely, but it is difficult to explain a single major migration with further differentiation for at least three major Native American groups.

     This study traces the major Native American Y chromosome haplotype to the immediate ancestor shared with present-day Siberians and to an older common ancestor shared with Caucasoids (Europeans and Indians). This common ancestry of Native Americans and Caucasoids could explain the existence of non-Mongoloid skeletons, such as the Kennewick man. Despite the fact that the Y chromosome represents only 1 of the 46 chromosomes in the human male genome, in numeric terms, its exclusive father-to-son inheritance allows us to study patrilineages that reflect the past male migrations but that may not reflect the global history of populations. However, the Y lineage is the largest of many genomic lineages that compose the population history of modern Homo sapiens, and it is the counterpart to mtDNA lineage studies. Furthermore, the human Y chromosome seems to display an association with linguistics and geography that is higher than that for mtDNA (Poloni et al. 1997), and our data concur with some current views on the settlement of the Americas. Further analysis of all Y lineages present in the Americas that uses microsatellites (Santos and Tyler-Smith 1996; Zerjal et al. 1997) will be very useful in the detailed study of all trans-Bering Strait migrating lineages, as well as in the more precise determination of their entry time into the Americas.

Acknowledgments

     We thank D. R. Carvalho-Silva and E. Tarazona-Santos for comments on the manuscript. This work was supported by grants from Conselho Nacional de Desenvolvimento Cientifico e Tecnologico and Fundação de Amparo à Pesquisa do Estado de Minas Gerais, Brazil, and from the Leverhulme Trust, United Kingdom.