Comparative genetic structure among Colombian and Mexican Drosophila pseudoobscura populations by using 14 microsatellite markers: a first approach.
Alvarez, Diana1, Manuel Ruiz-García1*, Mohamed Noor2, and Victor M. Salceda3. 1Unidad de Genética (Genética de Poblaciones-Biología Evolutiva). Laboratorio de Bioquímica, Biología y Genética Molecular de Poblaciones. Departamento de Biología. Facultad de Ciencias. Pontificia Universidad Javeriana. Cra 7ª No 43-82. Bogotá DC., Colombia; 2Department of Biological Sciences. 138 Life Sciences Building. Louisiana State University. Baton Rouge, LA 70803, USA; 3Instituto Nacional de Investigaciones Nucleares, México DF., México. *For comments, please use the e-mail email@example.com
Since Dobzhansky et al. (1963) published the first karyotype study of the peripatric Colombian Drosophila pseudoobscura population, which is isolated by about 2,400 km from the main geographical range of the species in North America, this relict population has been studied by evolutionary biologists from diverse standpoints. The most outstanding works have centered on the reduced allozyme variability of the Colombian population (Prakash et al., 1969), allozyme genetic divergence between the Colombian and USA populations (Ayala and Dobzhansky, 1974), the existence of rare alleles at the XDH, Est-5 and ADH loci in the Colombian populations (Singh et al., 1976; Coyne and Felton, 1977), the sterility of hybrid males from crosses between Colombian and North American flies (Prakash, 1972), nucleotide sequence divergence at the ADH and ADH-Dup genes between the Colombian and North American populations (Schaeffer and Miller, 1991, 1992) as well as at the srRNA mitochondrial gene (Jenkins et al., 1996), and courtship behavior (Noor et al., 2000a). Furthermore, several recent works have revealed new chromosome variability and previously undetected possible natural selection acting upon the third chromosome rearrangements in the Colombian highland populations (Ruiz-Garcia et al., 1999, 2001; Alvarez et al., 2001a). Recently, the first work using microsatellite loci to compare several molecular genetic parameters, such as gene diversity levels, effective population numbers and divergence times, between the Colombian and U.S. Drosophila pseudoobscura populations has been published (Alvarez et al., 2001b). Nevertheless, in that work only five microsatellite loci were analyzed, and three Colombian populations were compared with four U.S. populations in order to find genetic differences. In the current work, we have increased the number of microsatellite loci, the sample sizes, and the number of Colombian populations.
In addition, the Colombian Drosophila pseudoobscura populations are contrasted here with several Mexican populations. It seems plausible that the Colombian populations originated from populations similar to those found nowadays in Mexico under similar semi-tropical climatic conditions. To complete the comparative picture between the Colombian and the Mexican populations, eleven Drosophila pseudoobscura populations, six in Colombia and five in Mexico, were studied by using 14 microsatellite loci (DpsX001, DpsX002, DpsX003, DpsX009, DpsX010, Dps2001, Dps2002, Dps2005, Dps3001, Dps3002, Dps3003, Dps3004, Dps4001 and Dps4002).
Materials and Methods
Throughout 1996-1998, we sampled six Colombian D. pseudoobscura populations: Torobarroso (04º 55´ N, 74º01´ W), Susa (05º 27´ N, 73º 49´ W), Sutatausa (05º 15´ N, 73º 51´ W), Potosi (04º 48´ N, 73º 56´ W), Las Palmeras (05º 16´ N, 73º 50´ W), and La Linea Dura (05º 22´ N, 73º 47´ W). The five Mexican populations were Tulancingo (20º 04´ N, 98º 20´ W), San Luis Potosí (22º 03´ N, 100º 27´ W), El Seco (19º 07´ N, 97º 33´ W), Amecameca (14º 07´ N, 98º 41´ W), and Zurahuen (19º 27´ N, 101º 45´ W) (see Figure 1). Flies were sampled from these localities monthly using fermented banana traps. Flies caught directly from nature and individuals from isofemale lines were used for all microsatellite assays. Thirty to fifty flies were used for each population. DNA was extracted from each individual in 60 mM NaCl, 5% sucrose and 1.25% SDS, and resuspended in 50 µl of TE buffer.
The PCR mix was 2 µl DNA, 2.5 mM MgCl2, 1 mM dNTPs, 0.5 mM each primer, and 1 unit of Taq polymerase. The reaction was done in a Perkin Elmer 9600 thermal cycler with an initial denaturation of 95°C for 5 minutes, followed by forty cycles of 94°C for 1 minute, 60.5°C for 1 minute, and 72°C for 1 minute, and a final extension of 72°C for 5 minutes. The annealing temperature varied according to the marker (60.5°C for DpsX001, Dps2001, Dps3002, Dps4001, and Dps4002; 58.2°C for DpsX002, DpsX003, DpsX004, Dps3001, and DpsX003; 56°C for DpsX009, DpsX010, Dps2002, and Dps3004). The products were sized on 6% denaturing polyacrylamide gels, run under the same conditions used for sequencing gels in a Hoefer SQ3 sequencer chamber. Bands were silver stained and the gels were dried. Genotypes were scored manually. All microsatellite markers employed had dinucleotide repeat motifs. Details of the isolation and genomic locations of these microsatellites were presented elsewhere (Noor et al., 2000b). Population genetic parameters were analyzed as follows:
(1) Hardy-Weinberg equilibrium was tested for each molecular marker with an exact probability test using Markov chains and the Metropolis algorithm, with 50 blocks per analysis, 1000 replications per block, and 1000 dememorization steps. Likewise, an overall multilocus probability test was used to analyze all loci simultaneously in the Colombian and Mexican sets separately, and all populations and all loci simultaneously. In addition, exact probability tests were used to assess the amount of gametic disequilibrium for the Colombian and Mexican groups and for all populations taken together.
(2) The genetic heterogeneity among the populations studied was analyzed. Gene flow estimates were obtained with Wright's (1965) FST and Slatkin's (1995) RST statistics, with exact tests. Gene flow was alternatively calculated with the private allele procedure of Slatkin (1985) and Barton and Slatkin (1986). In addition, a hierarchical F analysis was carried out with the Michalakis and Excoffier (1996) procedure, and an AMOVA (Analysis of Molecular Variance) was applied in the same fashion.
(3) A matrix of (δμ)² genetic distances (Goldstein et al., 1995) was produced to compare pairs of the eleven populations. From these pairwise distances, divergence times were calculated assuming that the mutation rate per generation (μ) for microsatellite loci with dinucleotide repeat motifs in Drosophila melanogaster is approximately 9.3 x 10^-6 (Schug et al., 1998a, b), that D. pseudoobscura has a similar microsatellite mutation rate (Noor et al., 2000b), and that the D. pseudoobscura generation time is approximately 20 days in nature.
(4) The last population genetic analysis focused on detecting recent bottleneck events in the Drosophila populations studied. Here, recent means that a population went through a bottleneck within the last 2Ne-4Ne generations, where Ne is the effective population size. The analysis followed the theory developed by Cornuet and Luikart (1996), Luikart and Cornuet (1998), and Luikart et al. (1998). A population that has experienced a recent bottleneck loses both alleles and expected heterozygosity, but the number of alleles (ko) is reduced faster than the expected heterozygosity. Therefore, the heterozygosity expected at equilibrium from the observed number of alleles (Heq) is lower than the expected heterozygosity computed from the allele frequencies (He). This heterozygosity excess has been demonstrated under the infinite allele model (Kimura and Crow, 1964), although it is less clear under a step-wise mutation model (Ohta and Kimura, 1973). The microsatellite markers employed here, although probably closer to the step-wise model, do not follow it strictly. As soon as a marker departs slightly from the step-wise model toward the infinite allele model, a bottleneck event will quickly produce a heterozygosity excess. For neutral markers in a population at mutation-drift equilibrium, a given locus is equally likely to show a slight excess or a slight deficit of heterozygosity relative to the heterozygosity calculated from the number of alleles. In contrast, in a bottlenecked population a large fraction of the loci analyzed will exhibit a significant heterozygosity excess.
To test for this excess, four procedures were used: a sign test, a standardized differences test, a Wilcoxon signed-rank test (Luikart and Cornuet, 1998; Luikart et al., 1998), and a graphical descriptor of the shape of the allele frequency distribution. A population that has not suffered a recent bottleneck will yield an L-shaped distribution (as expected in a stable population at mutation-drift equilibrium), whereas a recently bottlenecked population will show a mode-shifted distribution. The Wilcoxon signed-rank test is probably the most powerful and robust of these tests when the number of loci analyzed is low, as in the current case.
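The heterozygosity-excess logic of this test can be sketched in a few lines of Python. This is a simplified illustration and not the procedure of Cornuet and Luikart (1996), which obtains the distribution of Heq by coalescent simulation. Here Heq under the infinite allele model is approximated by solving Ewens' expected number of alleles for theta = 4Neμ and taking Heq = theta / (1 + theta), and the sign test assumes a probability of 0.5 of excess at each locus. All function names and example values are hypothetical.

```python
from math import comb


def gene_diversity(freqs, n_genes):
    """Unbiased expected heterozygosity (He) from allele frequencies."""
    return n_genes / (n_genes - 1) * (1.0 - sum(p * p for p in freqs))


def expected_alleles(theta, n_genes):
    """Ewens' expected allele number in a sample of n_genes gene copies (IAM)."""
    return sum(theta / (theta + i) for i in range(n_genes))


def heq_iam(k_obs, n_genes):
    """Approximate Heq under the IAM from the observed allele number k_obs."""
    lo, hi = 1e-9, 1e4
    for _ in range(200):  # bisection on theta = 4 * Ne * mu
        mid = (lo + hi) / 2.0
        if expected_alleles(mid, n_genes) < k_obs:
            lo = mid
        else:
            hi = mid
    theta = (lo + hi) / 2.0
    return theta / (1.0 + theta)


def sign_test_excess(n_excess, n_loci, p_excess=0.5):
    """One-tailed probability of at least n_excess loci showing He > Heq."""
    return sum(
        comb(n_loci, i) * p_excess**i * (1.0 - p_excess) ** (n_loci - i)
        for i in range(n_excess, n_loci + 1)
    )


# Hypothetical locus: 4 alleles at even frequencies in 60 sampled gene copies.
he = gene_diversity([0.25, 0.25, 0.25, 0.25], 60)
heq = heq_iam(4, 60)
print(f"He = {he:.3f}, Heq = {heq:.3f}, excess = {he > heq}")

# Hypothetical outcome: 11 of 14 loci show a heterozygosity excess.
print(f"sign test P = {sign_test_excess(11, 14):.4f}")
```

Even allele frequencies, as in the hypothetical locus, give a high He for a small number of alleles, which is the signature the tests look for across loci.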
Most of the microsatellite loci studied, both in Colombia and in Mexico, were not in Hardy-Weinberg equilibrium. The Wahlund effect and/or inbreeding could probably explain the strong heterozygote deficiency found. In the Colombian case, the loci DPSX001, DPSX002, DPSX003, DPSX010, DPS2001, DPS3001, DPS3002, DPS3003 and DPS4001 were not in Hardy-Weinberg equilibrium according to an exact test with Markov chains. For the Mexican populations, the loci departing from Hardy-Weinberg equilibrium were the same, with the notable exception of the DPS2002 locus, which was in Hardy-Weinberg equilibrium in all the Colombian populations except Potosí, but in none of the Mexican populations. Similarly, at the DPS3001 marker only one Colombian population (Sutatausa) was in Hardy-Weinberg equilibrium, whereas most of the Mexican populations were. These contrasting results for the two markers could be explained if they are under different selective pressures associated with chromosomal rearrangements that are themselves linked to different climatic characteristics in Colombia and Mexico. Only one locus showed consistent agreement with Hardy-Weinberg equilibrium in Colombia (DPS2002), whereas three loci were found in equilibrium in Mexico (DPSX009, DPS3001 and DPS3004). When all populations were considered together, Hardy-Weinberg equilibrium was significantly rejected by a multi-locus test (P = 0.0000 ± 0.0000). Likewise, when all loci were analyzed together, Hardy-Weinberg equilibrium was rejected by a multi-locus test (P = 0.0000 ± 0.0000). These results point to significantly different gene pools for the 14 microsatellites analyzed, within and between the D. pseudoobscura populations of Colombia and Mexico (Table 1).
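How a Wahlund effect produces a heterozygote deficiency can be shown with a small numerical sketch. It assumes a single biallelic locus, equal sample sizes, and subpopulations that are each in Hardy-Weinberg equilibrium; the function name and the allele frequencies are hypothetical and are not taken from the data.

```python
def wahlund_deficit(freqs):
    """Heterozygote deficit created by pooling differentiated subpopulations.

    freqs: frequency of one allele in each subpopulation (equal sample
    sizes, each subpopulation assumed to be in Hardy-Weinberg equilibrium).
    Returns (observed_het, expected_het, f) for the pooled sample.
    """
    n = len(freqs)
    # Heterozygotes actually present: mean of 2pq within subpopulations.
    observed_het = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    # Heterozygotes expected if the pooled sample were one random-mating unit.
    p_bar = sum(freqs) / n
    expected_het = 2.0 * p_bar * (1.0 - p_bar)
    f = 1.0 - observed_het / expected_het
    return observed_het, expected_het, f


ho, he, f = wahlund_deficit([0.2, 0.8])
print(f"Ho = {ho:.2f}, He = {he:.2f}, F = {f:.2f}")
# prints "Ho = 0.32, He = 0.50, F = 0.36"
```

Neither subpopulation departs from equilibrium on its own in this example; the deficit appears only because they are analyzed as one unit.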
The levels of gametic disequilibrium were slightly higher in the Colombian populations than in the Mexican ones, ranging from 4.5% (Torobarroso) to 12.73% (Susa) in Colombia, and from 0% (Zirahuen) to 8.64% (Tulancingo) in Mexico. This could indicate that the Mexican populations have slightly larger effective population sizes than the Colombian populations. When all populations were pooled, a highly significant 16.48% of locus-pair combinations showed gametic disequilibrium, which could be generated by a Wahlund effect arising from treating genetically differentiated groups as a single population.
The genetic heterogeneity was extremely high when all populations were considered together. Every locus showed P = 0.00000, and all loci jointly showed χ² = infinity with 28 degrees of freedom and P = 0.00000. When genetic heterogeneity was considered within each country, only four microsatellite loci showed no significant heterogeneity in Colombia (DPSX009, P = 0.25814; DPS2002, P = 0.16832; DPS2005, P = 0.3526; DPS4002, P = 0.0743), and four microsatellites also showed no significant heterogeneity in Mexico (DPSX009, P = 0.1372; DPSX010, P = 0.10396; DPS2005, P = 0.22364; DPS3004, P = 0.17438). The loci DPSX009 and DPS2005 thus showed no significant heterogeneity in either country, which could mean that these two markers are associated with genome regions subject to unifying natural selection. When all loci were considered jointly in Colombia, on the one hand, and in Mexico, on the other, the total genetic heterogeneity was extremely high in both countries (χ² = infinity, 28 degrees of freedom, P = 0.00000). Wright's FST statistic for the Colombian population set averaged FST = 0.03704, which was significant (χ² = infinity, 28 degrees of freedom, P = 0.00000). For the Mexican population set this statistic was smaller, FST = 0.01465, although also significant. However, the situation was reversed when the unbiased RST statistic (Slatkin, 1995) was employed. This statistic ranged from 0.03524 (average over loci) to 0.05360 (average over variance components) for Colombia, similar to the value obtained with Wright's FST statistic, whereas it ranged from 0.04495 (average over loci) to 0.07503 (average over variance components) for Mexico, which is about three to five times the genetic heterogeneity found with Wright's FST statistic.
When we employed the infinite island model, the n-dimensional island model, and the private allele model (Slatkin, 1985), the theoretical gene flow estimates were higher for the Mexican set (Nm = 16.8148, Nm = 10.7615 and Nm = 4.6152, respectively) than for the Colombian one (Nm = 6.4994, Nm = 4.5135 and Nm = 2.3474, respectively). In contrast, when the RST statistic was used to estimate gene flow among the studied populations, the situation was reversed. The gene flow estimates for the Colombian group ranged from 5.1325 to 3.3109, whereas those for the Mexican group ranged from 4.2495 to 2.4654. In any case, these gene flow estimates are moderately high, although they do not seem sufficient to cancel the genetic heterogeneity among the populations within each country. When all the Colombian and Mexican populations were considered together, the RST values increased noticeably (RST = 0.1875 (average over loci) to 0.2768 (average over variance components)). The corresponding Nm values were smaller than 1 (Nm = 0.5805-0.9629), which indicates clear genetic isolation between the Colombian and Mexican populations.
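As a worked check, the FST-based gene flow estimates can be recomputed from the FST values given above. The sketch assumes the standard relations Nm = (1/FST - 1)/4 for the infinite island model and the same quantity divided by (n/(n - 1))² for n populations; the text does not state which formulas were used, so these are assumptions, and the function names are hypothetical. With n = 6 for Colombia and n = 5 for Mexico, the results agree with the reported values to about three decimals.

```python
def nm_island(fst):
    """Infinite island model: Nm = (1 / FST - 1) / 4."""
    return (1.0 / fst - 1.0) / 4.0


def nm_n_island(fst, n_pops):
    """Finite number of islands: divide by (n / (n - 1)) ** 2."""
    return nm_island(fst) / (n_pops / (n_pops - 1)) ** 2


# FST values and population counts taken from the text.
for label, fst, n_pops in [("Colombia", 0.03704, 6), ("Mexico", 0.01465, 5)]:
    print(label, round(nm_island(fst), 4), round(nm_n_island(fst, n_pops), 4))
```

Because Nm falls steeply as FST rises, the roughly 2.5-fold difference in FST between the two countries translates into a similar difference in the estimated number of migrants per generation.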
The hierarchical Wright's F statistics, computed with the procedure of Michalakis and Excoffier (1996) with jackknifing over loci for all the Colombian and Mexican populations taken together, gave the following values: FIT = 0.449 ± 0.034, FIS = 0.365 ± 0.045, FST = 0.132 ± 0.023, which suggests a strong excess of homozygotes, especially within the total population and within the individual subpopulations. This result agrees quite well with an AMOVA (Analysis of Molecular Variance) based on squared differences in allele size. In this analysis, 45.48% of the genetic variation was found among groups (between the Colombian and Mexican sets), only 1.51% among populations within groups (among the Colombian populations and among the Mexican populations, respectively), and 53.01% within the individual Colombian and Mexican populations. Therefore, there is a considerable amount of genetic variability within each population.
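The three F statistics are linked by Wright's identity (1 - FIT) = (1 - FIS)(1 - FST), and the AMOVA components should sum to 100%. The short check below applies both to the reported figures; the function name is hypothetical, and the identity is the textbook relation rather than a description of the jackknife procedure used for the estimates.

```python
def fit_from(fis, fst):
    """Wright's identity: (1 - FIT) = (1 - FIS) * (1 - FST)."""
    return 1.0 - (1.0 - fis) * (1.0 - fst)


# Reported point estimates: FIS = 0.365, FST = 0.132.
print(round(fit_from(0.365, 0.132), 3))  # prints 0.449

# Reported AMOVA components (%): among groups, among populations
# within groups, within populations.
amova = [45.48, 1.51, 53.01]
print(round(sum(amova), 2))  # prints 100.0
```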
Table 1. FIS statistic values for each marker (14 microsatellites) and for each population (11) studied in Colombia and Mexico. In each row, the upper value is the Weir & Cockerham statistic, and the lower value is the Robertson & Hill statistic. TORB = Torobarroso, SUT = Sutatausa, POT = Potosí, LPAL = Las Palmeras, LDUR = La Línea Dura, TUL = Tulancingo, ELSC = El Seco, SLU = San Luis, AMEC = Amecameca, ZUR = Zurahuen. * P < 0.05, ** P < 0.005.
Table 2. Bottleneck analysis using the Cornuet & Luikart (1996) and Luikart et al. (1998) theory applied to each of the Drosophila pseudoobscura populations analyzed. Population symbols as in Table 1. Three different tests were applied to detect bottlenecks in these populations: the sign test, the standardized differences test and the Wilcoxon test. I. A. M. = Infinite allele mutation model. S. M. M. = Step-wise mutation model. The numbers shown are the probabilities for each of these tests.
The divergence time between the Colombian and Mexican populations was estimated from the mathematical properties of the (δμ)² genetic distance (Goldstein et al., 1995). Assuming the microsatellite mutation rates per generation obtained by Schug et al. (1998a, b), applied here to Drosophila pseudoobscura (9.3 x 10^-6 to 6.5 x 10^-6), and a generation time of 20 days in nature for D. pseudoobscura, we obtained a divergence time estimate of 61,507 to 88,002 years ago, which is highly similar to the estimates obtained by Alvarez et al. (2001b) between several Colombian and USA D. pseudoobscura populations (75,000-88,700 years ago) and to those obtained by Jenkins et al. (1996) (109,375 years ago, with one-standard-error limits of 87,500 and 131,250 years ago). Furthermore, Schaeffer (in Jenkins et al., 1996), using a substitution rate of 3.2% per million years for ADH sequences, calculated an estimate of 77,000 years ago. Among the Colombian populations, the separation could have occurred between 11,087 and 15,864 years ago, and among the Mexican populations the divergence time estimate could range from 6,790 to 9,716 years ago. However, the estimates within Colombia and within Mexico could represent “effective divergence times” rather than “real divergence times”, because the strong possibility of gene flow among the Colombian populations, on the one hand, and among the Mexican populations, on the other, cannot be excluded, as reflected in the gene flow estimates discussed above.
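The conversion from a (δμ)² distance to years can be sketched as follows. The sketch assumes the linear expectation E[(δμ)²] = 2μt for single-step mutations, so that t = (δμ)² / (2μ) generations, and the 20-day generation time stated in the text. The distance value is hypothetical, because the distance matrix is not reported here, and the function name is likewise an assumption.

```python
DAYS_PER_YEAR = 365.0


def divergence_years(delta_mu_sq, mu, generation_days=20.0):
    """Divergence time in years from a (delta mu)^2 distance.

    Assumes E[(delta mu)^2] = 2 * mu * t, with t in generations.
    """
    generations = delta_mu_sq / (2.0 * mu)
    return generations * generation_days / DAYS_PER_YEAR


distance = 20.0  # hypothetical (delta mu)^2 value, not taken from the paper
younger = divergence_years(distance, 9.3e-6)  # faster rate, younger date
older = divergence_years(distance, 6.5e-6)    # slower rate, older date
print(f"{younger:,.0f} to {older:,.0f} years ago")
```

For a fixed distance, the two bounds differ only by the ratio of the mutation rates, 9.3/6.5, which is also the ratio of the reported bounds (88,002 / 61,507).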
Finally, recent bottleneck events were investigated in each of the Colombian and Mexican populations analyzed, by using the population genetic theory proposed by Cornuet and Luikart (1996) and Luikart et al. (1998). Two Colombian populations (Susa and Torobarroso) did not show any evidence of recent bottleneck events. Another two Colombian populations, Sutatausa and Potosí, showed some indication of a recent bottleneck when the Wilcoxon and the standardized differences tests were applied under an infinite allele mutation model, but not under a step-wise mutation model. In contrast, and surprisingly, the Mexican populations of Tulancingo, El Seco, San Luis, and Amecameca appeared to have gone through recent bottlenecks. For Tulancingo, the recent bottleneck was supported under both mutation models and by the sign, standardized differences and Wilcoxon tests. For El Seco, the Wilcoxon test under the infinite allele mutation model also revealed a recent bottleneck. The same was found for San Luis and for Amecameca under the infinite allele model with the sign, standardized differences and Wilcoxon tests. Although in the scientific literature the Colombian Drosophila pseudoobscura is repeatedly considered a clear case of a strong past bottleneck, or of an important founder effect at its origin, our molecular population genetic analyses show that this population has not recently gone through repeated bottlenecks, whereas the Mexican populations, which have more alleles and higher gene diversity, do seem to have gone through recent bottlenecks (Table 2). Environmental changes over the last decades in the Mexican localities studied could probably explain several recent bottlenecks in the Mexican populations.
References: Alvarez, D., M. Ruiz-García, J. Guerrero, and J.P. Jaramillo 2001a, Genetics and Molecular Biology (in press); Alvarez, D., M. Noor, and M. Ruiz-García 2001b, Biotropica (in press); Ayala, F.J., and Th. Dobzhansky 1974, Pan-Pacific Entomologist 50: 211-219; Barton, N.H., and M. Slatkin 1986, Heredity 56: 409-416; Cornuet, J.M., and G. Luikart 1996, Genetics 144: 2001-2014; Coyne, J.A., and A.A. Felton 1977, Genetics 134: 1289-1303; Dobzhansky, Th., A.S. Hunter, O. Pavlovsky, B. Spassky, and B. Wallace 1963, Genetics 48: 91-103; Goldstein, D.B., A. Ruiz-Linares, L.L. Cavalli-Sforza, and M.W. Feldman 1995, Proceedings National Academy of Sciences USA 92: 6723-6727; Jenkins, T.M., C.J. Basten, and W.W. Anderson 1996, Molecular Biology and Evolution 13: 1266-1275; Kimura, M., and J.F. Crow 1964, Genetics 49: 725-738; Luikart, G., and J.M. Cornuet 1998, Conservation Biology 12: 228-237; Luikart, G., F.W. Allendorf, B. Sherwin, and J.M. Cornuet 1998, Journal of Heredity 89: 238-247; Michalakis, Y., and L. Excoffier 1996, Genetics 142: 1061-1064; Noor, M.A.F., M.A. Williams, D. Alvarez, and M. Ruiz-García 2000a, Journal of Insect Behavior 13: 255-262; Noor, M.A.F., M.D. Schug, and C.F. Aquadro 2000b, Genetical Research 75: 25-35; Ohta, T., and M. Kimura 1973, Genetical Research 22: 201-204; Prakash, S., 1972, Genetics 72: 143-155; Prakash, S., R.C. Lewontin, and J.L. Hubby 1969, Proceedings National Academy of Sciences USA 59: 398-405; Ruiz-García, M., D. Alvarez, and C. Guerrero 1999, Drosophila Information Service 82: 20-26; Ruiz-García, M., D. Alvarez, C.J. Guerrero, and V.M. Salceda 2001, Annales de la Société Entomologique de France 36 (4) (in press); Schaeffer, S.W., and E.L. Miller 1991, Proceedings National Academy of Sciences USA 88: 6097-6101; Schaeffer, S.W., and E.L. Miller 1992, Genetics 132: 471-480; Schug, M.D., C.M. Hutter, M.A.F. Noor, and C.F. Aquadro 1998a, Genetica 102/103: 359-367; Schug, M.D., C.M. Hutter, K.A. Wetterstrand, M.S. Gaudette, T.F.C. Mackay, and C.F. Aquadro 1998b, Molecular Biology and Evolution 15: 1751-1760; Singh, R.S., R.C. Lewontin, and A.A. Felton 1976, Genetics 84: 609-629; Slatkin, M., 1985, Evolution 39: 53-65; Slatkin, M., 1995, Genetics 139: 457-462; Wright, S., 1965, Evolution 19: 395-420.
Figure 1. Map of the Mexican and Colombia