The Genetic Structure of Pacific Islanders

Affiliation: Department of Anthropology, Binghamton University, Binghamton, New York, United States of America

¤ Current address: Center for the Study of Human Origins and the New York Consortium in Evolutionary Primatology, Department of Anthropology, New York University, New York, New York, United States of America

Affiliation: Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America

Correction

28 Mar 2008: Friedlaender JS, Friedlaender FR, Reed FA, Kidd KK, Kidd JR, et al. (2008) Correction: The Genetic Structure of Pacific Islanders. PLOS Genetics 4(3): 10.1371/annotation/cbdd11a0-4a29-4e7c-9e4e-c00a184c7777. https://doi.org/10.1371/annotation/cbdd11a0-4a29-4e7c-9e4e-c00a184c7777

Abstract

Human genetic diversity in the Pacific has not been adequately sampled, particularly in Melanesia. As a result, population relationships there have been open to debate. A genome scan of autosomal markers (687 microsatellites and 203 insertions/deletions) on 952 individuals from 41 Pacific populations now provides the basis for understanding the remarkable nature of Melanesian variation, and for a more accurate comparison of these Pacific populations with previously studied groups from other regions. It also shows how textured human population variation can be in particular circumstances. Genetic diversity within individual Pacific populations is shown to be very low, while differentiation among Melanesian groups is high. Melanesian differentiation varies not only between islands, but also by island size and topographical complexity. The greatest distinctions are among the isolated groups in large island interiors, which are also the most internally homogeneous. The pattern loosely tracks language distinctions. Papuan-speaking groups are the most differentiated, and Austronesian or Oceanic-speaking groups, which tend to live along the coastlines, are more intermixed. A small “Austronesian” genetic signature (always <20%) was detected in fewer than half the Melanesian groups that speak Austronesian languages, and is entirely lacking in Papuan-speaking groups. Although the Polynesians are also distinctive, they tend to cluster with Micronesians, Taiwan Aborigines, and East Asians, and not Melanesians. These findings contribute to a resolution of the debates over Polynesian origins and their past interactions with Melanesians. With regard to genetics, the earlier studies had relied heavily on the evidence from single-locus mitochondrial DNA or Y chromosome variation. Neither of these provided an unequivocal signal of phylogenetic relations or population intermixture proportions in the Pacific. Our analysis indicates that the ancestors of Polynesians moved through Melanesia relatively rapidly and intermixed only to a very modest degree with the indigenous populations there.

Author Summary

The origins and current genetic relationships of Pacific Islanders have been the subjects of interest and controversy for many decades. By analyzing the variation of a large number (687) of genetic markers in almost 1,000 individuals from 41 Pacific populations, and comparing these with East Asians and others, we contribute to the clarification and resolution of many of these issues. To judge by the populations in our survey, we find that Polynesians and Micronesians have almost no genetic relation to Melanesians, but instead are strongly related to East Asians, and particularly Taiwan Aborigines. A minority of Island Melanesian populations have indications of a small shared genetic ancestry with Polynesians and Micronesians (the ones that have this tie all speak related Austronesian languages). Inland groups who speak Papuan languages are particularly divergent and internally homogeneous. The genetic divergence among Island Melanesian populations, which is neatly organized by island, island size/topography, as well as their coastal or inland locations, is remarkable for such a small region, and enlarges our understanding of the texture of contemporary human variation.

Citation: Friedlaender JS, Friedlaender FR, Reed FA, Kidd KK, Kidd JR, Chambers GK, et al. (2008) The Genetic Structure of Pacific Islanders. PLoS Genet 4(1): e19. https://doi.org/10.1371/journal.pgen.0040019

Editor: Jonathan K. Pritchard, University of Chicago, United States of America

Received: June 12, 2007; Accepted: December 13, 2007; Published: January 18, 2008

Copyright: © 2008 Friedlaender et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Different aspects of the project were supported by National Science Foundation grants BNS-0215827, BCS 0413449, and BCS 0243064, the Wenner-Gren Foundation for Anthropological Research, the National Geographic Society Exploration Fund, Taiwan National Science Council grant 95–2627-H-195–001, and Temple University, Binghamton University, and Yale University. FAR is supported by NIH grant F32HG003801.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The populations in New Guinea and the islands immediately to the east (the Bismarck and Solomons archipelagos) are well-known for their great diversity in cultures, languages, and genetics, which by a number of measures is unsurpassed for a region of this size [1]. This area is referred to as Near Oceania, as opposed to the islands farther out in the Pacific, known as Remote Oceania [2] (see Figure 1). For simplicity, we refer only to the peoples of Near Oceania as “Melanesians,” although this term ordinarily encompasses additional groups to the east as far as Fiji, who are not covered in this study. Major parts of Near Oceania were settled from Southeast Asia early in modern human prehistory, between ∼50,000 and ∼30,000 years before present (YBP) [3–5]. Populations were relatively isolated at this edge of the human species range for the following 25,000 years. The early settlers in Near Oceania were very small groups of hunter-gatherers. For example, New Ireland, which is more than 300 km long, is estimated to have had a pre-Neolithic carrying capacity of ∼1,200 people or fewer [6]. There is evidence of sporadic, modest contact between New Guinea and the Bismarcks from 22,000 YBP, and with Bougainville/Buka in the Solomons only from ∼3,300 years ago [3,7].

Figure 1. Populations Included in This Study

(A) HGDP-CEPH population locations. The two Pacific groups are boxed.

(B) Pacific population locations. Our population samples are blue; the 2 HGDP-CEPH Melanesian “Oceanic” groups are red.

By ∼3,300 YBP [3], at least one powerful new impulse of influence had arrived with Austronesian-speaking migrants from Island Southeast Asia, likely associated with the development of effective sailing [8], and led to the appearance of the Lapita Cultural Complex in the Bismarck Archipelago. After only a few hundred years, “Lapita People” from this area had colonized the islands in Remote Oceania as far east as Tonga and Samoa, where Polynesian culture then developed [9].

The distribution and relations of Pacific language families reflect ancient settlement. Austronesian is a widespread and clearly defined linguistic family with more than 1,000 member languages, which has its greatest diversity, and likely origin, in Taiwan ∼4,000–5,000 years ago [10]. Some basic phylogenetic relations within Austronesian are sketched in Figure S1. All Austronesian languages spoken outside Taiwan belong to the Malayo-Polynesian branch, and almost all the Malayo-Polynesian languages of Oceania belong to the Oceanic branch. It is Proto Oceanic, the immediate ancestor of the Oceanic languages, that is associated with an early phase of the Lapita Cultural Complex. Proto Oceanic split into a number of branches as its descendants spread across Remote Oceania, including Proto Nuclear Micronesian and Proto Polynesian (a branch of Central Oceanic).

Almost all the other indigenous languages of Oceania are referred to as non-Austronesian, or Papuan. Most Papuan languages are found in New Guinea, with the remainder in nearby islands. This is a residual category of ∼800 languages. Most of these can be assigned to more than 20 different language families, but these families cannot be shown to be related on present evidence. There remain a number of “Papuan” isolates that cannot be grouped at all [11]. Trans New Guinea is the largest Papuan language family. It consists of ∼400 languages and dates to 6,000 to 10,000 YBP [12]. Other Papuan families, including the ones in the Bismarck and Solomon archipelagos, probably also go back at least to this period [13–15]. While it is reasonable to assume these different Papuan families had common origins further back in time, any evidence of such ties that might be recoverable with standard methods of historical linguistics has been erased over the millennia. The concentration and number of these apparently unrelated language families and isolates are unsurpassed in any other region of the world [15].

Analyses of genetic variation at some informative loci, particularly mitochondrial DNA (mtDNA) (reviewed in [16–19]), non-recombining Y-chromosome markers (NRY) (reviewed in [19,20]), and a small set of autosomal microsatellites [21], have provided divergent impressions of the population genetic structure of both Near and Remote Oceania. Because they have ¼ the effective population size of autosomal markers, the mtDNA and NRY haplotypes have been particularly subject to the effects of random genetic drift, and each autosomal marker, no matter how informative, still represents a minute fraction of the total genetic variation among populations. Even so, these data have shown that the genetic variation in Near Oceanic populations is considerably greater than in Remote Oceanic ones, and that there is a cluster of haplogroups that developed in particular islands of Near Oceania between approximately 50,000 and 30,000 years ago.
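The ¼ figure above can be checked with a short calculation. The following is a minimal sketch, assuming an idealized Wright-Fisher population with equal numbers of breeding females and males; the function names and the population sizes are hypothetical and are not taken from the study. It counts the effective number of gene copies for each marker class and shows how much faster expected diversity is lost to drift when there are fewer copies.

```python
def effective_gene_copies(n_females, n_males):
    """Effective number of gene copies per marker class (idealized model).

    Autosomes: diploid and biparental, so copies = 2 * Ne, with
    Ne = 4 * Nf * Nm / (Nf + Nm).
    mtDNA: haploid, transmitted by females only.
    NRY: haploid, transmitted by males only.
    """
    autosomal = 8 * n_females * n_males / (n_females + n_males)
    return {
        "autosomal": autosomal,
        "mtDNA": float(n_females),
        "NRY": float(n_males),
    }


def diversity_retained(copies, generations):
    """Expected fraction of heterozygosity remaining after drift alone."""
    return (1 - 1 / copies) ** generations


# Hypothetical breeding population: 500 females and 500 males.
copies = effective_gene_copies(n_females=500, n_males=500)
print(copies["mtDNA"] / copies["autosomal"])  # 0.25
print(copies["NRY"] / copies["autosomal"])    # 0.25

# Drift erodes diversity faster in the uniparental markers.
for marker, k in copies.items():
    print(marker, diversity_retained(k, generations=100))
```

With an unequal sex ratio the two uniparental ratios would differ from ¼, which is one reason the sketch is only a first approximation.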

However, a number of unresolved issues remain concerning the proper interpretation of these and other data that a comprehensive genomic sampling of neutral biparental markers across Pacific populations should clarify. These include: 1) To whom are these diverse Melanesian populations most closely related outside this region (East or South Asians, or perhaps even Africans, whom they physically resemble)? 2) How do the genetic diversity and differentiation of Near Oceanic populations compare with those in other regions? 3) Is there a clear organization of the variation among groups in Near Oceania (i.e., by language, by island, or by distance from major dispersal centers)? 4) Is there a genetic signature of Aboriginal Taiwanese/Southeast Asian or Polynesian influence in Melanesian populations, especially in the Bismarcks, where the Lapita Cultural Complex developed? 5) Are Polynesians more closely related to Asian/Aboriginal Taiwanese populations or to Melanesians?

Here we report the analysis of 687 microsatellite and 203 insertion/deletion (indel) polymorphisms in 952 individuals from 41 Pacific populations, primarily in the Bismarck Archipelago and Bougainville Island, and also including select sample sets from New Guinea, Aboriginal Taiwan, Micronesia, and Polynesia. The results show the reduced internal variation of Near Oceanic Melanesian populations and the remarkable divergence among them, and how this divergence is influenced by island size and topography, and is also correlated with language affiliation. We also detected a very small but clear genetic signature of “Asian/Polynesian” intermixture in certain Austronesian (Oceanic)-speaking populations in the region (by “genetic signature,” we mean an ancestral proportion in some groups inferred by the STRUCTURE analysis that predominates in another ancestral grouping). For global context, these data were compared with data from the Centre d'Etude du Polymorphisme Humain human genome diversity panel (HGDP-CEPH), composed of cell lines [22–24], especially its subset from East Asia. Figure 1A shows how undersampled the Pacific populations had been in the HGDP-CEPH dataset (as well as its emphasis on particular regions of Asia), and Figure 1B shows the distribution of our Pacific population samples, with its intensive coverage in Near Oceania.

Results

Our sampling strategy concentrated on Papuan-speaking populations and their immediate Oceanic-speaking neighbors from the islands immediately to the east of New Guinea, in what is called Northern Island Melanesia, consisting of the Bismarck and Solomon Archipelagos (see Figure 1B). The three largest islands of the region (New Britain, New Ireland, and Bougainville) were most intensively sampled, along with two nearby smaller islands (New Hanover and Mussau). Additional Pacific samples came from New Guinea (one set from the lowland Sepik region and one set from the Eastern Highlands), Micronesia (primarily from Belau), Polynesia (Samoans and one New Zealand Māori group), and Aboriginal Taiwan (Amis and the Taroko, a mountain Atayal group). The details of the sample locations and language family affiliations are given in Table S1 and in the Methods section.

The Global Context

Figure 2 shows the estimated values of θ (θ̂) calculated from expected heterozygosity (He) arranged from highest to lowest values, combining our Pacific populations and the HGDP-CEPH global set (the values of θ̂, He, and the average number of alleles per locus are given in Table S1). From Ohta and Kimura [25], under a stepwise model, the expected relationship between θ and heterozygosity (H) is