Genetic Associations with Gestational Length and Spontaneous Preterm Birth

Discovery stage


Despite evidence that genetic factors contribute to the duration of gestation and the risk of preterm birth, robust associations with genetic variants have not been identified. We used large data sets that included the gestational duration to determine possible genetic associations.


We performed a genomewide association study in a discovery set of samples obtained from 43,568 women of European ancestry using gestational duration as a continuous trait and term or preterm (<37 weeks) birth as a dichotomous outcome. We used samples from three Nordic data sets (involving a total of 8643 women) to test for replication of genomic loci that had significant genomewide association (P<5.0×10^−8) or an association with suggestive significance (P<1.0×10^−6) in the discovery set.


In the discovery and replication data sets, four loci (EBF1, EEFSEC, AGTR2, and WNT4) were significantly associated with gestational duration. Functional analysis showed that an implicated variant in WNT4 alters the binding of the estrogen receptor. The association between variants in ADCY5 and RAP2C and gestational duration had suggestive significance in the discovery set and significant evidence of association in the replication sets; these variants also showed genomewide significance in a joint analysis. Common variants in EBF1, EEFSEC, and AGTR2 showed association with preterm birth with genomewide significance. An analysis of mother–infant dyads suggested that these variants act at the level of the maternal genome.


In this genomewide association study, we found that variants at the EBF1, EEFSEC, AGTR2, WNT4, ADCY5, and RAP2C loci were associated with gestational duration and variants at the EBF1, EEFSEC, and AGTR2 loci with preterm birth. Previously established roles of these genes in uterine development, maternal nutrition, and vascular control support their mechanistic involvement. (Funded by the March of Dimes and others.)

Raw data

These two files contain the summary results of the top 10,000 SNPs associated with gestational length and preterm birth identified from the 23andMe (discovery) data set. Each file includes nine columns:
  1. snp: SNP rsID
  2. chr: chromosome
  3. pos: physical position (GRCh37/hg19)
  4. alleles: alleles based on positive strand of reference genome. Allele B is used as the reference allele for frequency and effect.
  5. freq: frequency of allele B
  6. eff: effect size. For gestational length, the effect is the estimated change in gestational duration, in days, per B allele; for preterm birth, the effect is the estimated odds ratio (OR) per B allele.
  7. se: standard error (not adjusted for inflation)
  8. p: p-value (before adjustment for inflation)
  9. p.adj: p-value (inflation adjusted). Gestational length: lambda=1.038; preterm birth: lambda=1.025
The complete summary statistics are available upon request from 23andMe or Dr. David Hinds.
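
The column layout above can be read with a few lines of Python. The sketch below is illustrative only and rests on several assumptions that this page does not state: that each file is whitespace-delimited with one SNP per row, that the columns appear in the order listed, and that the inflation adjustment is the standard genomic-control rescaling (dividing the 1-degree-of-freedom chi-square statistic by lambda). The function names (parse_row, genomic_control_p, classify) and the example row are hypothetical and are not taken from the released files.

```python
import math
from statistics import NormalDist

# Column order as listed in the description above.
COLUMNS = ["snp", "chr", "pos", "alleles", "freq", "eff", "se", "p", "p.adj"]

# Inflation factors quoted in the column description.
LAMBDA = {"gestational_length": 1.038, "preterm_birth": 1.025}

# Significance thresholds used in the study.
GENOMEWIDE = 5.0e-8
SUGGESTIVE = 1.0e-6


def parse_row(line):
    """Parse one data row (ASSUMPTION: whitespace-delimited fields)."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    row["pos"] = int(row["pos"])  # chr stays a string (it may be "X")
    for key in ("freq", "eff", "se", "p", "p.adj"):
        row[key] = float(row[key])
    return row


def genomic_control_p(p, lam):
    """Rescale a two-sided p-value by genomic control.

    ASSUMPTION: standard genomic control, i.e. the 1-df chi-square
    statistic (z squared) is divided by lam. The page does not say
    exactly how p.adj was computed.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    z = -NormalDist().inv_cdf(p / 2.0)  # |z| matching the two-sided p
    return math.erfc(z / math.sqrt(lam) / math.sqrt(2.0))


def classify(p):
    """Label a p-value using the study's two thresholds."""
    if p < GENOMEWIDE:
        return "genomewide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "neither"


# Synthetic row for illustration only; NOT taken from the data files.
example = parse_row("rs0000001 1 123456 A/G 0.25 0.5 0.1 1e-7 2e-7")
adjusted = genomic_control_p(example["p"], LAMBDA["gestational_length"])
print(example["snp"], classify(example["p"]), adjusted)
```

Because lambda is greater than 1 for both traits, the rescaled p-value is always slightly larger than the unadjusted one. If the recomputed values do not match the p.adj column in the files, the adjustment was probably computed differently, and the p.adj column should be treated as authoritative.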



2017-05-09: Initial posting of results files