Genome-Wide Association Study for OCD Complications

Obsessive-compulsive disorder (OCD) is characterized by repetitive behaviors and thoughts experienced by affected individuals (Visscher, Brown, McCarthy, et al. 2012). Twin and family studies have revealed that OCD is a multifactorial familial condition involving both environmental and polygenic factors (Moran, 2013). Genetic studies have revealed that the interaction of the glutamatergic, serotonergic, and dopaminergic systems, and the genes affecting them, plays a crucial role in the functioning of the neural circuits involved (Yang, Lee, Goddard, and Visscher, 2011). Meanwhile, environmental factors, including psychological trauma, adverse perinatal events, and neurological trauma, may modify the risk genes and consequently bring out obsessive-compulsive behaviors (Visscher, Brown, McCarthy, et al. 2012).

OCD is a relatively common and debilitating neuropsychiatric disorder affecting 2% of the U.S. population (Arnold, Sicard, Burroughs, et al. 2006). Its obsessions and repetitive behaviors are experienced as unwanted (Baxter, Scott, Vos, et al. 2012). In other words, OCD is a clinically heterogeneous disorder with different types of symptomatic expression (Murray and Lopez, 1996). In the United States, OCD can begin in early childhood or in adulthood; between 30% and 50% of people with OCD experience onset in childhood, often before 10 years of age (Pauls, Abramovitch, Rauch, et al. 2014).

The risk factors associated with OCD include stress and a history of child abuse. In the healthcare environment, OCD continues to demand psychiatric attention because of its high diagnosis rate (Murray and Lopez, 1996).

Genome-wide association studies (GWAS) offer an approach to investigating the genetic basis of OCD complications (Ziegler, 2009; Zohar, 1999).

“The premise of the GWAS design is that extensive common variation in the human genome, as exhibited by SNPs with frequencies greater than 1%, is responsible for the risk of most genetically complex disorders” (Cantor et al., 2010, p. 6).

In essence, GWAS are the standard method for gene discovery. Genotyping technologies combined with computational statistical analysis have become the standard way to analyze GWAS data, and they are applied here to the OCD problem (Barrett, Healy-Farrell, & March, 2004).

The goal of this research is to analyze GWAS data using computational and statistical tools in order to understand OCD complications.


The objective of this lab report is to use statistical and computational tools to analyze GWAS data and thereby provide an effective method for understanding OCD complications. The sampling information, selection procedures, and quality control are used to assess the effectiveness of the GWAS.

Study Selection Procedure

The study uses data from the QIMR (Queensland Institute of Medical Research) for a case-control GWAS. The QIMR data comprise ~2,372,500 SNPs (single nucleotide polymorphisms). A careful selection process for the sample is essential to avoid design bias.

Study exclusion/inclusion

The study selects only subjects of European ancestry, to avoid the confounding that arises from genetic variation across ethnic groups; non-European subjects are removed. The study also removes individuals who are closely related. Subjects whose genetic sex is discrepant with their recorded gender are removed as well, to guard against sample mix-ups.

Raw data

The extracted data consist of SNP genotypes, phenotypes, and MAF (minor allele frequencies).

Quality control

Standard statistical quality control (QC) is performed on the QIMR genotype data to improve the quality of the SNP data. In the final OCD dataset, the case sample size is matched to the control sample size (n=500), and the control sample is matched on male and female subjects (n=750). Data with excessive missing values (>20%) were dropped to remove potential errors, which set the case-to-control ratio to 50:50 and improves the signal-to-noise ratio. Moreover, SNPs with a low MAF (<10%) are removed. SNPs are dropped if their rate of missing values exceeds 10%. SNPs are also dropped if the MAF is below 10% in both the case and the control groups.
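
The SNP-level filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the genotype coding (0, 1, or 2 copies of one allele, None for a missing call), the function names, and the toy data are all hypothetical; only the 10% thresholds come from the text.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # assumed coding: 0/1/2 copies of one allele, None = missing call


def missing_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with no genotype call at this SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """MAF computed from the non-missing calls only."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1.0 - freq)            # the minor allele is the rarer one


def passes_qc(genotypes: Sequence[Genotype],
              max_missing: float = 0.10,
              min_maf: float = 0.10) -> bool:
    """Keep a SNP only if its missingness and MAF satisfy both thresholds."""
    return (missing_rate(genotypes) <= max_missing
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical toy data: SNP name -> genotypes for ten samples.
snps = {
    "snp_common": [0, 1, 2, 1, 0, 1, 1, 0, 2, 1],
    "snp_rare": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "snp_missing": [0, 1, None, None, 1, 2, None, 0, 1, 1],
}
kept = [name for name, g in snps.items() if passes_qc(g)]
print(kept)
```

In this toy example only the first SNP survives: the second has a MAF of 5% and the third is missing 30% of its calls.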

The statistical analysis tests for differences in allele frequency between cases and controls with the aid of the PLINK software. The study also simulates OCD phenotypes with the aid of the GCTA software to produce plausible genome-wide associations. Descriptive statistics are used to summarize the raw data, and the median, p-value, and chi-square are reported. The outcome of the analysis is presented in graphical form.
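
A case-control comparison of allele frequencies can be illustrated with a 1-degree-of-freedom allelic chi-square test on a 2x2 table of allele counts. The sketch below is an assumption about the kind of test involved, written in plain Python rather than taken from PLINK; the function name and the allele counts are hypothetical, and the output names F_A, F_U, CHISQ, P, and OR simply mirror the quantities discussed in this report.

```python
import math


def allelic_test(case_a1: int, case_a2: int, ctrl_a1: int, ctrl_a2: int):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (F_A, F_U, chi_square, p_value, odds_ratio) for allele A1.
    """
    n_case = case_a1 + case_a2
    n_ctrl = ctrl_a1 + ctrl_a2
    n = n_case + n_ctrl
    f_a = case_a1 / n_case          # A1 frequency in affected subjects
    f_u = ctrl_a1 / n_ctrl          # A1 frequency in unaffected subjects

    # Pearson chi-square for a 2x2 table, without continuity correction.
    diff = case_a1 * ctrl_a2 - case_a2 * ctrl_a1
    chi_sq = n * diff ** 2 / (
        n_case * n_ctrl * (case_a1 + ctrl_a1) * (case_a2 + ctrl_a2)
    )

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    odds_ratio = (case_a1 * ctrl_a2) / (case_a2 * ctrl_a1)
    return f_a, f_u, chi_sq, p_value, odds_ratio


# Hypothetical allele counts (two alleles per subject); not the study's data.
f_a, f_u, chi_sq, p, odds = allelic_test(
    case_a1=450, case_a2=550, ctrl_a1=400, ctrl_a2=600
)
print(f"F_A={f_a:.3f} F_U={f_u:.3f} CHISQ={chi_sq:.2f} P={p:.3g} OR={odds:.3f}")
```

The odds ratio here is the odds of carrying allele A1 among cases divided by the same odds among controls, so a value above 1 marks A1 as the candidate risk allele.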


The data analysis produced the results below, which clarify what the GWAS contributes to the understanding of OCD complications.

Figure 1: Manhattan Plot

Figure 1 presents the Manhattan plot, in which no SNP surpasses the genome-wide significance threshold marked by the red dotted line.

Figure 2: QQ Plot

Figure 2 provides the QQ plot of the GWAS. The genomic inflation factor (lambda) is 0.99, which indicates minimal to no inflation of the test statistics. Lambda measures inflation: it is the ratio of the median observed test statistic to the median expected under the null, so it equals 1 when the two medians are equal. In a GWAS, deviation from the null line can reflect polygenic architecture as well as bias that inflates the test statistics.
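
The lambda value read from a QQ plot can be computed directly from the p-values. The sketch below assumes that every test statistic follows a chi-square distribution with 1 degree of freedom under the null; the function names and the evenly spaced p-values are hypothetical and stand in for real GWAS output.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def p_to_chi2(p: float) -> float:
    """Convert a p-value to the matching 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile; squaring removes the sign
    return z * z


def genomic_inflation(p_values) -> float:
    """Lambda = median observed chi-square / median expected under the null."""
    observed = [p_to_chi2(p) for p in p_values]
    return median(observed) / CHI2_1DF_MEDIAN


# Hypothetical p-values spread evenly over (0, 1), i.e. a well-behaved null.
n = 999
uniform_p = [(i + 0.5) / n for i in range(n)]
lam = genomic_inflation(uniform_p)
print(round(lam, 3))
```

A lambda close to 1, such as the 0.99 reported here, means the bulk of the observed statistics match the null expectation; values well above 1 would point to inflation from stratification or other bias.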

The Q-Q plot helps verify the quality of the associations obtained from the case-control analysis, that is, that they are not confounded by unaccounted variables such as population stratification. The plot also shows a small number of SNPs that deviate from the association pattern of the remaining SNPs. The graphical illustration shows that the moderate p

Figure 3: Regional LD Plot

Figure 3 shows the extent to which the LD (linkage disequilibrium) block is tagged by the SNP, and how well the SNPs in LD tag the top SNP. The vertical lines on the LD plot mark positions 25592137 and 25718520. Table 1 presents the results of the analysis.
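
How strongly one SNP tags another is usually summarized by the r-squared statistic of linkage disequilibrium. The following sketch computes r-squared from phased haplotypes; the function name, the 0/1 allele coding, and the example haplotypes are assumptions made for illustration and are not data from this study.

```python
def ld_r_squared(haplotypes) -> float:
    """r-squared between two biallelic SNPs from phased haplotypes.

    Each haplotype is a pair (a, b), with 1 and 0 coding the two alleles
    at SNP A and SNP B respectively.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r-squared is undefined for a monomorphic SNP")
    return d * d / denom


# Hypothetical phased haplotypes: the two SNPs agree on 9 of 10 chromosomes.
haps = [(1, 1)] * 4 + [(0, 0)] * 5 + [(1, 0)]
r2 = ld_r_squared(haps)
print(round(r2, 3))
```

An r-squared near 1 means the two SNPs carry almost the same information, so genotyping one effectively tags the other; values near 0 mean the SNP says little about its neighbor.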

Table 1: Outcome of the Analysis

Top SNP: rs1487971
Odds ratio for the risk allele: 1.318 (about 1.3 times greater than for the T allele, cases v. controls)
Chi-square: 27.16
F_A: allele frequency in all affected people (cases)
F_U: allele frequency in unaffected people (controls)

Figure 4: Regional Association Plot

Figure 4 provides the regional association plot, which shows the SNPs in the region together with genes taken from NCBI’s gene datasets. The points are drawn according to the strength of their correlation with the top SNP. The effect size for the top SNP is an odds ratio of 1.318, but the association is not genome-wide significant: its p-value of 1.874e-07 is larger than the conventional genome-wide threshold of 5e-08. The chi-square value (27.16) corresponds to this p-value, so it likewise falls short of the value needed for significance. (The chi-square value is in Table 1.) Moreover, because the normal physiological role of BLM hydrolase is generally unknown, the extent to which the gene function aligns with the GWAS statistical association remains unclear.
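
The relationship between the reported chi-square and p-value can be checked numerically. The sketch below assumes the chi-square statistic has 1 degree of freedom (an assumption about the test, not something the report states) and uses 5e-08 as the conventional genome-wide threshold; the function names are illustrative.

```python
import math
from statistics import NormalDist

GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def chi2_1df_p_value(chi_sq: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2.0))


def chi2_1df_critical(alpha: float) -> float:
    """Chi-square value (1 df) whose upper-tail p-value equals alpha."""
    return NormalDist().inv_cdf(alpha / 2.0) ** 2


chi_sq = 27.16  # chi-square reported for the top SNP
p = chi2_1df_p_value(chi_sq)
critical = chi2_1df_critical(GENOME_WIDE_ALPHA)
print(f"p = {p:.3e}, -log10(p) = {-math.log10(p):.2f}")
print(f"chi-square needed for p < 5e-8: {critical:.2f}")
print("genome-wide significant:", p < GENOME_WIDE_ALPHA)
```

Under these assumptions the computed p-value lands close to the reported 1.874e-07, and the chi-square required for genome-wide significance is somewhat higher than 27.16, which is why the top SNP is suggestive rather than significant.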


“Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery” (Cantor et al., 2010, p. 6). GWAS have thus become a critical and standard method for illuminating disease genes relevant to OCD complications (Cullen et al. 2007). OCD encompasses both compulsions and obsessions, which give rise to a wide range of behaviors and thoughts that affect individuals (Geller et al. 2003). Typically, individuals suffering from OCD fear causing harm to others and to themselves, which aggravates avoidance behavior (Moran, 2013). Heredity has been identified as one of the major causes of OCD, which is transmitted within families. Risk factors for OCD include obsessive-compulsive symptoms in family members (Pauls, Abramovitch, Rauch, et al. 2014). Twin studies have provided evidence that OCD has a genetic component and that it can also be transmitted within families through shared environmental factors (Ruscio, Stein, Chiu, et al. 2010). Specifically, genetic variance accounts for 40% of compulsive and obsessive behaviors in the sampled twins (Stein, Andersen, & Overo, 2007; Zohar, Greenberg, & Denys, 2012). Moreover, non-shared environmental factors contribute 51% of the variance in OCD (Stein, Andersen, & Overo, 2007; Mattheisen, Samuels, Wang, et al. 2014).

A substantial number of GWAS findings have been linked to disorders; however, the SNP variants implicated by GWAS explain only a fraction of the genetic risk (Stewart et al. 2007; Stewart et al. 1999). When a disease is genetically complex, GWAS are believed “to provide an effective and unbiased approach to revealing the risk alleles for genetically complex non-Mendelian disorders” (Cantor et al., 2010, p. 6). The statistical analysis tests whether SNP associations are connected to genes, and statistical methods are used to accumulate evidence across SNP associations (Stewart, Mayerfeld, Arnold, et al. 2013). Statistical permutation is one method used to address issues such as sources of bias (Gibson, 2010).
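
The idea of statistical permutation can be sketched as follows: the case/control labels are shuffled many times, and the observed allele-frequency difference is compared with the differences obtained under shuffling. This is a generic label-permutation test written for illustration, not the procedure of any cited paper; the function names, the number of permutations, the seeds, and the randomly generated genotypes are all assumptions.

```python
import random


def allele_freq_diff(genotypes, labels) -> float:
    """Absolute difference in allele frequency between cases (1) and controls (0)."""
    case = [g for g, y in zip(genotypes, labels) if y == 1]
    ctrl = [g for g, y in zip(genotypes, labels) if y == 0]
    return abs(sum(case) / (2 * len(case)) - sum(ctrl) / (2 * len(ctrl)))


def permutation_p_value(genotypes, labels, n_perm=2000, seed=0) -> float:
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = allele_freq_diff(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if allele_freq_diff(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 50 cases followed by 50 controls, genotypes coded 0/1/2.
data_rng = random.Random(1)
labels = [1] * 50 + [0] * 50
genotypes = [data_rng.choice([0, 1, 2]) for _ in labels]
p = permutation_p_value(genotypes, labels)
print(p)
```

Because the null distribution is built from the data themselves, the empirical p-value does not depend on the chi-square approximation, which is one reason permutation is used as a check against bias.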

The study rests on the following elements. The data consist of a large sample population, which helped achieve the research outcomes. The statistical analysis tools helped identify the genetic associations (Cantor, Lange, and Sinsheimer, 2010). In essence, each of these elements was developed to accomplish the research objectives (Ruscio, Stein, Chiu, et al. 2010). Moreover, the large study sample provided sufficient statistical power to meet the research objectives.


Arnold, P., Sicard, T., Burroughs, E. et al. (2006). Glutamate Transporter Gene SLC1A1 Associated With Obsessive-compulsive Disorder. Arch Gen Psychiatry, 63(7), p.769.

Baxter, A., Scott, K., Vos, T. and Whiteford, H. (2012). Global prevalence of anxiety disorders: a systematic review and meta-regression. Psychological Medicine, 43(05), pp.897-910.

Barrett, P., Healy-Farrell, L. & March, J. S. (2004). Cognitive-behavioral family treatment of childhood obsessive-compulsive disorder: a controlled trial. J. Am. Acad. Child Adolesc. Psychiatry. 43, 46-62.

Cantor, R., Lange, K. and Sinsheimer, J. (2010). Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application. The American Journal of Human Genetics, 86(1): 6-22.

Cullen, B. et al. (2007). Factor analysis of the Yale-Brown Obsessive Compulsive Scale in a family study of obsessive-compulsive disorder. Depress. Anxiety. 130-138.

Geller, D. A. et al. (2003). Which SSRI? A meta-analysis of pharmacotherapy trials in pediatric obsessive-compulsive disorder. Am. J. Psychiatry. 160, 1919-1928.

Gibson, G. (2010). Hints of hidden heritability in GWAS. Nature Genetics, 42(7), pp.558-560.

Mattheisen, M., Samuels, J., Wang, Y., et al. (2014). Genome-wide association study in obsessive-compulsive disorder: results from the OCGAS. Molecular Psychiatry, 20(3): 337-344.

Moran, M. (2013). DSM-5 Updates Depressive, Anxiety, and OCD Criteria. PN, 48(4): 22-43.

Murray, C. and Lopez, A. (1996). The global burden of disease. [Cambridge, Mass.]: Published by the Harvard School of Public Health on behalf of the World Health Organization and the World Bank.

Pauls, D., Abramovitch, A., Rauch, S. and Geller, D. (2014). Obsessive-compulsive disorder: an integrative genetic and neurobiological perspective. Nature Reviews Neuroscience, 15(6): 410-424.

Ruscio, A. M., Stein, D. J., Chiu, W. T. et al. (2010). The epidemiology of obsessive-compulsive disorder in the National Comorbidity Survey Replication. Mol. Psychiatry. 15, 53-63.

Stein, D. J., Andersen, E. W. & Overo, K. F. (2007). Response of symptom dimensions in obsessive-compulsive disorder to treatment with citalopram or placebo. Rev. Bras. Psiquiatr. 29, 303-307.

Stewart, S. E. et al. (2007). Principal components analysis of obsessive-compulsive disorder symptoms in children and adolescents. Biol. Psychiatry. 61, 285-291.

Stewart, S., Mayerfeld, C., Arnold, P., et al. (2013). Meta-analysis of association between obsessive-compulsive disorder and the 3' region of neuronal glutamate transporter gene SLC1A1. Am. J. Med. Genet., 162(4): 367-379.

Visscher, P., Brown, M., McCarthy, M. et al. (2012). Five Years of GWAS Discovery. The American Journal of Human Genetics, 90(1): 7-24.

Yang, J., Lee, S., Goddard, M. and Visscher, P. (2011). GCTA: A Tool for Genome-wide Complex Trait Analysis. The American Journal of Human Genetics, 88(1): 76-82.

Ziegler, A. (2009). Genome-wide association studies: quality control and population-based measures. Genetic Epidemiology, 33(S1):S45-S50.

Zohar, A. H. (1999). The epidemiology of obsessive-compulsive disorder in children and adolescents. Child Adolesc. Psychiatr. Clin. N. Am. 8, 445-460.

Zohar, J., Greenberg, B. & Denys, D. (2012). Obsessive-compulsive disorder. Handb. Clin. Neurol. 106, 375-390.

[Figure: regional LD plot for rs1487971 (CEU), marked interval 25592137 to 25718520. Axes: chromosome 17 position (hg18) (kb); r-squared; recombination rate.]

[Figure: regional association plot of the plotted SNPs. Axes: position on chr17 (Mb); p-value; recombination rate. Display range: chr17:25196879-25996879. Min p-value: 1.87E-7 (chr17:25596486). Max p-value: 9.99E-1 (chr17:25826990). Generated Mon Aug 17 06:21:54 2015.]