Genetic studies of accelerometer-based sleep measures yield new insights into human sleep behaviour
Article | OPEN | Published: 05 April 2019
Samuel E. Jones, Vincent T. van Hees, […]Andrew R. Wood
Nature Communications volume 10, Article number: 1585 (2019)
Abstract Sleep is an essential human function but its regulation is poorly understood. Using accelerometer data from 85,670 UK Biobank participants, we perform a genome-wide association study of 8 derived sleep traits representing sleep quality, quantity and timing, and validate our findings in 5,819 individuals. We identify 47 genetic associations at P < 5 × 10−8, of which 20 reach a stricter threshold of P < 8 × 10−10. These include 26 novel associations with measures of sleep quality and 10 with nocturnal sleep duration. The majority of identified variants associate with a single sleep trait, except for variants previously associated with restless legs syndrome. For sleep duration we identify a missense variant (p.Tyr727Cys) in PDE11A as the likely causal variant. As a group, sleep quality loci are enriched for serotonin processing genes. Although accelerometer-derived measures of sleep are imperfect and may be affected by restless legs syndrome, these findings provide new biological insights into sleep compared to previous efforts based on self-report sleep measures.
Sleep quality and quantity are uncorrelated with timing
Descriptive statistics and correlations between the eight accelerometer-derived phenotypes are shown in Table 1 and Supplementary Table 1. We observed little phenotypic correlation (R) between measures of sleep timing and measures of nocturnal sleep duration and quality (−0.10 ≤ R ≤ 0.12). These negligible or limited correlations between timing and duration are consistent with data from self-reported chronotype and sleep duration (R = −0.01). We also observed limited correlation between sleep duration and sleep quality as represented by the number of nocturnal sleep episodes (R = 0.14) but observed a stronger correlation between sleep duration and sleep efficiency (R = 0.57). The correlation between self-reported sleep duration and accelerometer-derived sleep duration was 0.19, and the correlation between self-reported chronotype (morningness) and L5 timing was −0.29.
Table 1 Descriptive statistics of sleep and activity measures derived from accelerometer data
Accelerometer-derived sleep pattern estimates are heritable
To estimate the proportion of variance attributable to genetic factors for a given trait, we used BOLT-REML to estimate SNP-based heritability (h2SNP) (Table 2). The h2SNP estimates ranged from 2.8% (95% CI 2.0%, 3.6%) for variation in sleep duration (defined as the standard deviation of accelerometer-derived sleep duration across all nights), to 22.3% (95% CI 21.5%, 23.1%) for number of nocturnal sleep episodes. For sleep duration, we observed higher heritability using the accelerometer-derived measure (h2SNP = 19.0%, 95% CI 18.2%, 19.8%) in comparison to self-report sleep duration (h2SNP = 8.8%, 95% CI 8.6%, 9.0%). The heritability estimates for sleep and activity timings (maximum h2SNP = 11.7%, 95% CI 10.9%, 12.5%) were lower than for self-report chronotype (h2SNP = 13.7%, 95% CI 13.3%, 14.0%)31.
Table 2 Heritability estimates of derived sleep variables from BOLT-REML
Low genetic correlation between sleep duration estimates
To quantify the genetic overlap between accelerometer-derived and self-reported sleep traits, we performed genetic correlation analyses using LD-score regression as implemented in LD-Hub32. We observed strong genetic correlations of L5, M10 and sleep midpoint timing with self-report chronotype (rG > 0.79), and weaker genetic correlation between accelerometer-derived versus self-reported sleep duration (rG = 0.43). This observation may be due to differences in the genetic contribution to variation in self-reported versus accelerometer-derived sleep duration or differences in the accuracy of self-reported phenotypes.
Forty-seven genetic associations identified for sleep traits
To identify genetic loci associated with accelerometer-derived sleep traits, we performed a genome-wide association analysis of 11,977,111 variants in up to 85,670 individuals for the eight accelerometer-derived sleep traits. We identified 47 genetic associations across seven of the phenotypes at the standard GWAS threshold (P < 5 × 10−8). Among these associations, 20 reached a more stringent threshold of P < 8 × 10−10. We estimate that this stricter threshold better controls the type 1 error rate, as it accounts for the approximate number of independent genetic variants analysed31 and the 8 accelerometer-based traits (Table 3 and Supplementary Figs. 1 and 2). Twenty-six associations were observed for sleep quality measures, including 21 variants associated with number of nocturnal sleep episodes and five associated with sleep efficiency (8 and 2 at P < 8 × 10−10, respectively). An additional eight genetic associations were identified for sleep and activity timing. These included six associated with L5 timing, one associated with M10 timing, and one associated with sleep midpoint. Only three associations with L5 timing were detected at P < 8 × 10−10. Finally, for sleep duration we observed 13 associations: 11 with sleep duration and 2 with diurnal inactivity (6 and 1 at P < 8 × 10−10, respectively). Of these 47 associations reaching P < 5 × 10−8 and the 20 associations reaching P < 8 × 10−10, 31 and 9 were not previously reported in studies based on self-report measures, respectively (Table 3). The variance explained by all the discovered loci ranged from 0.04% for sleep midpoint timing to 0.8% for number of nocturnal sleep episodes. The lambda GC observed across these analyses ranged from 1.03 (sleep duration variability) to 1.14 (number of nocturnal sleep episodes), while LD-score intercepts ranged from 1.03 (diurnal inactivity) to 1.07 (sleep midpoint timing).
Given that the median χ2 test statistic can be inflated for polygenic traits as sample size increases, the LD-score intercepts suggest that the limited inflation of test statistics observed is more likely due to the polygenicity of the phenotypes tested than to population stratification33,34.
Table 3 Summary statistics for 47 genetic associations identified in the UK Biobank reaching P < 5 × 10−8
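The lambda GC values quoted above are genomic inflation factors. As a minimal sketch, assuming the usual definition (median observed χ2 statistic divided by the median of a 1-degree-of-freedom χ2 distribution), the calculation looks as follows. The function name and the test statistics are hypothetical and are not taken from the study data.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained as the square of the 75th percentile of a standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def lambda_gc(z_scores):
    """Genomic inflation factor: median observed chi-square / expected median."""
    chi2 = [z * z for z in z_scores]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical null test statistics: evenly spaced standard-normal quantiles.
n = 10_001
null_z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

# Hypothetical inflated statistics: every chi-square value scaled by 1.14.
inflated_z = [z * 1.14 ** 0.5 for z in null_z]

print(round(lambda_gc(null_z), 3))
print(round(lambda_gc(inflated_z), 3))
```

A lambda GC above 1 does not by itself separate polygenicity from stratification, which is why the text also reports LD-score intercepts.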
Replication of 47 genetic associations in 5819 individuals
We attempted to replicate our findings in up to 5819 adults from the Whitehall II (N = 2,144), CoLaus (N = 2,257), and Rotterdam Study (subsample from RS-I, RS-II and RS-III, N = 1,418) who had worn similar wrist-worn accelerometer devices for a duration comparable to that of the UK Biobank participants. Individual study and meta-analysis results for the three replication studies are presented in Supplementary Data 1. Of the 47 associations, the signal near GPR139 (rs8045740) reached Bonferroni significance (P = 0.001) and 11 were associated at P < 0.05 after meta-analysis of the replication studies. Given the limited power to detect single SNP associations in the replication meta-analysis, we next examined the directional consistency of allele effect estimates. Of the 20 associations reaching P < 8 × 10−10, 18 were directionally consistent in the replication cohort meta-analyses (Pbinomial = 3 × 10−4). Of the additional 27 signals, 18 were directionally consistent in the replication meta-analysis (Pbinomial = 0.03). Finally, for traits with more than one independent lead SNP associated at P < 5 × 10−8 in the UK Biobank (Table 3), we combined the effects of the lead SNPs on the respective sleep trait (aligned to the trait increasing allele) and tested them in the replication data. In the combined-effects analysis, we observed overall associations with sleep duration (P = 0.008), sleep efficiency (P = 3 × 10−4), number of nocturnal sleep episodes (P = 2 × 10−6), and sleep timing (P = 0.034) (Supplementary Data 2).
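The directional-consistency check above can be illustrated with a binomial sign test. The sketch below assumes a one-sided test against a 50% chance of agreement. This section does not say how its binomial test was configured (one- or two-sided, handling of ties), so these values need not match the reported Pbinomial figures. The function name is hypothetical.

```python
from math import comb


def sign_test_p(n_consistent, n_total):
    """One-sided binomial P of seeing at least n_consistent directionally
    consistent effects out of n_total when each agrees with probability 0.5."""
    tail = sum(comb(n_total, k) for k in range(n_consistent, n_total + 1))
    return tail / 2 ** n_total


# Counts taken from the text: 18 of 20 and 18 of 27 consistent signals.
print(sign_test_p(18, 20))
print(sign_test_p(18, 27))
```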
The genetics of sleep quality overlaps with sleep disorders
Of the five variants associated with sleep efficiency, a measure of sleep quality, one was the strongly associated PAX8 sleep duration signal11 (rs62158169, P = 2 × 10−8) and one was a restless legs syndrome/insomnia-associated signal (MEIS1)18,35 (rs113851554, P = 5 × 10−22). Of the 20 loci associated with number of nocturnal sleep episodes, one is represented by the APOE variant (rs429358). This variant is a proxy for the APOE ε4 risk allele that is strongly associated with late-onset Alzheimer’s disease and cognitive decline36. The ε4 allele is associated with a reduced number of nocturnal sleep episodes (−0.13 sleep episodes; 95% CI: −0.16, −0.11; P = 4 × 10−8). This finding is strengthened by additional analyses of the ε2, ε3 and ε4 APOE Alzheimer’s disease risk alleles, with an overall reduction in the number of nocturnal sleep episodes observed with higher risk haplotypes (F(5, 72,578) = 5.36, P = 0.001) (Supplementary Table 2). This finding is inconsistent with the observational association between cognitive decline in older age and poorer sleep quality37,38,39,40. One possible explanation for this finding is ascertainment bias in the UK Biobank, whereby carriers of the ε4 risk allele are protected from cognitive decline through other factors. We also noted that the APOE ε4 risk allele was nominally associated (P < 0.05) with sleep timing (L5, −1.8 min per allele, P = 4 × 10−6), sleep midpoint (−0.6 min per allele; P = 0.002), sleep duration (−1.1 min per allele, P = 7 × 10−4), and diurnal inactivity (−1.0 min per allele, P = 2 × 10−5). Apart from the APOE variant (rs429358), which had double the effect size in the older half of the cohort (Supplementary Table 2), there were minimal differences in effect sizes in a range of sensitivity analyses, including removing individuals on sleep or depression medication, adjustments for BMI and lifestyle factors, and splitting the cohort by median age (Supplementary Data 3 and Supplementary Methods).
Six associations identified for estimates of sleep timing
We identified six loci associated with L5 timing, of which three have not previously been associated with self-report chronotype but have been associated with restless legs syndrome35. The lead variants at these three loci are in strong to modest LD with the previously reported variants associated with restless legs syndrome (rs113851554, MEIS1, P = 2 × 10−35, LD r2 = 1.00; rs12991815, C1D, P = 2 × 10−9, LD r2 = 0.96; rs9369062, BTBD9, P = 9 × 10−14, LD r2 = 0.49). The three variants that reside in loci previously associated with self-report chronotype are in strong to modest linkage disequilibrium with those previously reported12,15,16 (rs1144566, RGS16, P = 8 × 10−12, LD r2 > 0.91; rs12927162, TOX3, P = 3 × 10−8, LD r2 = 1.00; rs4882315, ALG10B, P = 2 × 10−8, LD r2 = 0.58). The variant rs1144566 is a missense coding change (p.His137Arg) in exon 5 of RGS16, a known circadian rhythm gene, which contains variants strongly associated with self-report chronotype12. In a parallel self-report chronotype study in the UK Biobank, rs1144566 represented the strongest association, with the T allele having a morningness odds ratio of 1.26 (P = 2 × 10−95)31. In addition, variants in the region of TOX3 have previously been associated with restless legs syndrome35. However, our lead SNP (rs12927162) was not in LD with the previously reported index variant at this locus (rs45544231, LD r2 = 0.004). There were minimal differences in effect sizes when we performed a range of sensitivity analyses, including removing individuals on depression medication, adjustments for BMI and lifestyle factors and splitting the cohort by median age (Supplementary Data 3 and Supplementary Methods).
Ten novel loci associated with estimates of sleep duration
We identified 11 loci associated with accelerometer-derived sleep duration, including ten not previously reported to be associated with self-report sleep duration, despite the fivefold increase in sample size available for a parallel self-report sleep duration GWAS study14 (Fig. 1 and Supplementary Data 4). This lower overlap in signals is consistent with the lower genetic correlation between self-reported and accelerometer-derived sleep duration than between chronotype and accelerometer-derived measures of sleep and activity timing. The lead variants representing the ten new sleep duration loci all had the same direction and larger effects in the accelerometer data compared to self-report data, with effect sizes ranging from 1.3 to 5.9 min compared to 0.1 to 0.8 min (self-report P < 0.05), with the MEIS1 locus having the strongest effect. Two of the ten new sleep duration signals, rs113851554 in MEIS1 (P = 2 × 10−25) and rs9369062 in BTBD9 (P = 2 × 10−10), have previously been associated with restless legs syndrome. The one variant previously detected based on self-report sleep duration, near PAX8, was the first variant to be associated with sleep duration through GWAS11. The minor PAX8 allele effect size was consistent across accelerometer-derived measures of sleep duration (2.7 min per allele, 95% CI: 2.1 to 3.3, P = 3 × 10−21) and self-report sleep duration (2.4 min per allele, 95% CI: 2.1 to 2.8, P = 7 × 10−49). We observed similar effect sizes in a subset of 72,510 unrelated Europeans from the UK Biobank, when removing individuals on depression medication and after adjusting for BMI and lifestyle factors. To confirm that associations were not influenced by age-related differences in sleep, we confirmed that there was also no difference in effect sizes between younger and older individuals (above and below the median age of 63.7 years) (Supplementary Data 3).
Fig. 1 Comparison of SNP effect estimates on accelerometer and self-report sleep duration. The effects for 11 genetic variants associated with accelerometer-derived sleep duration against effect estimates from a parallel GWAS of self-report sleep duration14 are presented. Error bars represent the 95% confidence intervals for each effect estimate
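The statement that the PAX8 effect is consistent between the two measures can be checked informally from the reported estimates and confidence intervals. The sketch below recovers standard errors from the 95% CIs and applies a two-sided z-test to the difference. It treats the two estimates as independent, which is an assumption made here and only approximate, since both come from overlapping UK Biobank participants. This test is not described in the source, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric normal-theory confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def difference_z_test(b1, se1, b2, se2):
    """Two-sided z-test for b1 - b2, treating the estimates as independent."""
    z = (b1 - b2) / sqrt(se1 ** 2 + se2 ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# PAX8 minor allele effects in minutes per allele, as reported in the text.
se_accel = se_from_ci(2.1, 3.3)  # accelerometer-derived: 2.7 (2.1 to 3.3)
se_self = se_from_ci(2.1, 2.8)   # self-report: 2.4 (2.1 to 2.8)

z, p = difference_z_test(2.7, se_accel, 2.4, se_self)
print(f"z = {z:.2f}, P = {p:.2f}")
```

A large P here indicates no evidence that the two effect sizes differ, in line with the consistency described in the text.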
Fine-mapping analysis identifies likely causal variants
To identify credible sets of likely causal SNPs (log10 Bayes Factor > 2) within 500 Kb of the lead SNPs for each trait with a genetic association (P < 5 × 10−8), we used FINEMAP41 (Supplementary Data 5). This approach assigns each variant, among those tested, a probability of being the causal allele. Two loci contained a coding variant with a probability >80% of being the causal variant. The first variant (rs17400325, MAF = 4.2%) was a missense variant (p.Tyr727Cys) in PDE11A, a phosphodiesterase highly expressed in the hippocampus, and was associated with sleep duration (P = 2 × 10−8) and sleep efficiency (P = 2 × 10−10). The other was the missense APOE variant, a proxy for the ε4 allele known to predispose to Alzheimer’s disease and responsible for the association signal with the number of nocturnal sleep episodes. Of the remaining loci, five fine-mapped variants are eQTLs in the Genotype-Tissue Expression (GTEx) project42. Of these, only the fine-mapped variant at the CLUAP1 locus associated with the number of nocturnal sleep episodes (P = 4 × 10−9) was the lead variant for the corresponding eQTL (log10 Bayes Factor = 2.48, Pcausal = 0.72) (Supplementary Data 5). CLUAP1 has previously been associated with photoreceptor maintenance43.
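The idea of turning per-variant Bayes factors into causal probabilities and a credible set can be sketched under a deliberately simpler model than the one FINEMAP uses. The example assumes exactly one causal variant per region and equal priors for all variants. FINEMAP itself searches over configurations with multiple causal variants, so this is an illustration of the concept and not the method applied in the study. All SNP names and Bayes factors below are hypothetical.

```python
def posterior_probabilities(log10_bfs):
    """Posterior causal probability per SNP, assuming exactly one causal SNP
    in the region and an equal prior probability for every SNP."""
    top = max(log10_bfs.values())
    # Subtract the maximum before exponentiating to avoid overflow.
    weights = {snp: 10 ** (lbf - top) for snp, lbf in log10_bfs.items()}
    total = sum(weights.values())
    return {snp: w / total for snp, w in weights.items()}


def credible_set(posteriors, coverage=0.95):
    """Smallest set of SNPs whose summed posterior reaches the coverage."""
    chosen, cumulative = [], 0.0
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical log10 Bayes factors for four SNPs in one region.
example = {"snpA": 3.0, "snpB": 2.4, "snpC": 1.0, "snpD": 0.2}
post = posterior_probabilities(example)
print({snp: round(p, 3) for snp, p in post.items()})
print(credible_set(post))
```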
Serotonin pathway-related genes enriched at associated loci
We used MAGMA44 to assess tissue enrichment of genes at associated loci across the sleep traits. All traits showed an enrichment of genes in the cerebellum (Supplementary Figs. 3 and 4). Loci associated with number of nocturnal sleep episodes were enriched for genes involved in serotonin pathways (PBonferroni = 3 × 10−4) (Supplementary Table 3).
Associated variants are implicated in restless legs syndrome
We observed most variants to be associated with either sleep quality, duration, or timing, but not combinations of these sleep characteristics. However, the variant rs113851554 at the MEIS1 locus was associated with sleep quality (sleep efficiency), duration, and timing (L5). In addition, the variant rs9369062 at the BTBD9 locus was associated with both sleep duration and L5 timing. Both variants have previously been reported as associated with restless legs syndrome (Fig. 2). To follow up this observation, we performed Mendelian Randomisation using 20 variants associated with restless legs syndrome in the discovery stage of the most recent and largest genome-wide association study35. We tested these 20 variants against all eight activity-monitor-derived sleep traits and showed a clear causative association of restless legs syndrome with all sleep traits. We also observed a causative association of restless legs syndrome with self-report sleep duration and chronotype, suggesting that variants associated with restless legs syndrome were not artefacts of the accelerometer-derived measures of sleep (Supplementary Data 6).
Fig. 2 Effects of restless legs syndrome-associated SNPs on derived sleep traits. Presented are the effect estimates for genetic variants associated with (a) either L5 timing or sleep duration, (b) either sleep duration or the number of nocturnal sleep episodes, and (c) either L5 timing or sleep quality (number of nocturnal sleep episodes or sleep efficiency). Variants previously associated with restless legs syndrome are highlighted in red. Effect estimates represent standard deviations of the inverse-normal distribution of each trait. Error bars represent the 95% confidence intervals for each effect estimate
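The Mendelian Randomisation step described above combines per-variant effects on the exposure (restless legs syndrome) and on the outcome (a sleep trait) into one causal estimate. A minimal sketch of a fixed-effect inverse-variance-weighted (IVW) estimator follows. It assumes first-order weights that ignore uncertainty in the exposure effects, and the input numbers are hypothetical. The source reports IVW P values elsewhere but does not detail the estimator, so this is one common formulation and not necessarily the one used.

```python
from math import sqrt
from statistics import NormalDist


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted Mendelian Randomisation estimate.

    Each variant contributes the ratio beta_outcome / beta_exposure, weighted
    by beta_exposure**2 / se_outcome**2 (first-order weights).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = sqrt(1 / sum(weights))
    p = 2 * (1 - NormalDist().cdf(abs(estimate / se)))
    return estimate, se, p


# Hypothetical per-variant effects for three instruments (not study data).
bx = [0.10, 0.20, 0.15]    # variant effects on the exposure
by = [0.05, 0.10, 0.075]   # variant effects on the outcome
se = [0.01, 0.02, 0.015]   # standard errors of the outcome effects

estimate, std_err, p_value = ivw_estimate(bx, by, se)
print(estimate, std_err, p_value)
```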
Waist-hip-ratio causally influences sleep outcomes
To assess causality of phenotypes, we used genetic correlations, estimated using LD-Hub32, to prioritise traits with evidence of genetic overlap for subsequent Mendelian Randomisation (MR) analyses. We tested for genetic correlations between the eight activity-monitor-derived measures and 234 published GWAS studies across a range of diseases and traits. Given previous reports that genetic correlations are similar to phenotypic correlations45, this approach also enabled us to analyse phenotypes under-represented, not recorded, or not well defined within the UK Biobank. After adjustment for the number of genetic correlations tested (8 × 234), we observed genetic correlations between sleep traits and obesity and educational attainment related traits (Supplementary Data 7). After adjusting for the number of tests in the bi-directional MR analysis (100), we observed evidence that higher waist-hip-ratio (adjusted for BMI) is causally associated with lower sleep duration (PIVW = 5 × 10−6) and lower sleep efficiency (PIVW = 3 × 10−4). In addition, we observed higher educational attainment to be causally associated with lower sleep duration (PIVW = 5 × 10−5). However, given that the genetic correlation and MR analyses are not independent, only the causal association of waist-hip-ratio (adjusted for BMI) with sleep duration remained significant after applying a more stringent threshold (PIVW ≤ 3 × 10−5) to account for a maximum of 234 bi-directional MR analyses (Supplementary Data 8).
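The multiple-testing logic in this paragraph can be followed with a short calculation. The sketch assumes a family-wise error rate of 0.05 with Bonferroni correction, which is not stated explicitly in this section. The stricter 3 × 10−5 threshold is taken directly from the text and is not re-derived. The labels and function name are hypothetical.

```python
ALPHA = 0.05  # assumed family-wise error rate; not stated in this section


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# IVW P values as reported in the text.
reported = {
    "WHR (adj. BMI) -> sleep duration": 5e-6,
    "WHR (adj. BMI) -> sleep efficiency": 3e-4,
    "Educational attainment -> sleep duration": 5e-5,
}

genetic_corr = bonferroni_threshold(8 * 234)  # 8 traits x 234 published GWAS
primary = bonferroni_threshold(100)           # 100 bi-directional MR tests
stringent = 3e-5                              # stricter threshold quoted in the text

passes_primary = [name for name, p in reported.items() if p < primary]
survivors = [name for name, p in reported.items() if p <= stringent]

print(genetic_corr, primary)
print(passes_primary)
print(survivors)
```

Under these assumptions all three reported associations pass the primary threshold, and only the waist-hip-ratio effect on sleep duration passes the stricter one, matching the text.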