Advances in sequencing technology and publicly available sequencing data from large populations have enabled informative genome-wide association studies (GWAS) that link genes with phenotypic traits of interest. In response to the increased demand, many publicly available tools for conducting GWAS have been developed. However, these tools lack a comprehensive pipeline that includes both pre-GWAS analysis, such as outlier removal, data transformation, and calculation of best linear unbiased predictions or estimates (BLUP/BLUE), and post-GWAS analysis, such as haploblock analysis and candidate gene identification. Here, we present HAPPI GWAS, an R-based tool that performs pre-GWAS, GWAS, and post-GWAS analysis in an automated pipeline on the Linux command line while maintaining user flexibility in model selection and other parameter thresholds. The tool leverages well-known software such as the R package GAPIT and Haploview by stitching their outputs together into a holistic pipeline that is easy to navigate and operate. Unlike other GWAS tools, which return a list of SNPs, HAPPI GWAS outputs a list of putative candidate genes in high linkage disequilibrium with each significant SNP, along with the gene descriptions. Automatic outputs include comprehensive, publication-ready summary tables and figures that facilitate easy comparison across traits. HAPPI GWAS gives users an easy-to-use and flexible workflow for running repeatable and comprehensive GWAS analyses of multiple traits with minimal user intervention.
Coauthors: Yen On Chan – University of Missouri-Columbia; Vivek Shrestha – University of Missouri-Columbia; Alex Lipka – University of Illinois; Ruthie Angelovici – University of Missouri-Columbia