This file describes how to run the AMELIA script written by Jennifer Asimit in the Applied Statistical Genetics group led by Eleftheria Zeggini at the Wellcome Trust Sanger Institute.

The analysis requires three source files:
1) amelia_script_general.R
2) amelia_analysis3_general.R
3) ameliaN2.so

The first script (amelia_script_general.R) is the only one that the user needs to call, but the other two files are required for the main script to work properly. All three files must be in the same directory as the data so that they can be found.

The main script (amelia_script_general.R) takes 6 input parameters:
1) Chromosome number to analyze (e.g. an integer between 1 and 22)
2) Minor Allele Frequency Cutoff for analysis (e.g. 0.05)
3) Genotype File Name
4) Phenotype File Name
5) Gene Definition File
6) Output File Name

The Genotype File is a tab-delimited file with one line per SNP and 6+3N columns, where N is the number of individuals being analyzed. Missing values are coded by a dot ("."). The columns must be in this order:
1) Chromosome
2) Position
3) ID
4) Reference Allele, one of A, C, G, T
5) Alternative Allele, one of A, C, G, T
6) Consensus SNP Quality Score
7) Allele 1 for Individual 1, 0 for the reference allele and 1 for the alternative allele
8) Allele 2 for Individual 1, 0 for the reference allele and 1 for the alternative allele
9) Genotype Quality Score for Genotype of Individual 1 OR Non-consensus SNP Quality Score of Individual 1
....and so on for each individual to analyze (three columns per individual)

If consensus SNP quality scores are to be used, then set the values in (9) to 100.
If non-consensus SNP quality scores or genotype quality scores are to be used, then set the values in (6) to 100.
If both consensus SNP quality scores and genotype quality scores are available, then AMELIA may be run using both levels of quality scores. In this case do not modify the columns in (6) and (9).
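The genotype file layout described above (6 fixed columns followed by 3 columns per individual) can be checked before running AMELIA. The following is a minimal sketch in Python, not part of the AMELIA distribution; the function name count_individuals is hypothetical, and it assumes that the missing-value dot is only accepted in the allele columns.

```python
# Illustrative pre-flight check for one line of the genotype file.
# Not part of AMELIA; names and strictness are assumptions.
VALID_BASES = {"A", "C", "G", "T"}
VALID_ALLELES = {"0", "1", "."}  # "." is the missing-value code


def count_individuals(line: str) -> int:
    """Validate one tab-delimited genotype line and return N."""
    fields = line.rstrip("\n").split("\t")
    per_individual = len(fields) - 6  # columns after the 6 fixed ones
    if per_individual < 3 or per_individual % 3 != 0:
        raise ValueError(f"expected 6+3N columns, got {len(fields)}")

    ref, alt = fields[3], fields[4]
    if ref not in VALID_BASES or alt not in VALID_BASES:
        raise ValueError(f"alleles must be A, C, G or T, got {ref}/{alt}")

    # Each individual contributes: allele 1, allele 2, quality score.
    for start in range(6, len(fields), 3):
        allele1, allele2 = fields[start], fields[start + 1]
        if allele1 not in VALID_ALLELES or allele2 not in VALID_ALLELES:
            raise ValueError(f"bad allele code near column {start + 1}")

    return per_individual // 3


# Two individuals; the second has missing alleles.
example = "\t".join(
    ["22", "16050408", "rs1", "T", "C", "100",
     "0", "1", "40",
     ".", ".", "0"]
)
print(count_individuals(example))
```

Running the check on every line and confirming that N matches the number of rows in the phenotype file catches most formatting mistakes before the analysis starts.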
The Phenotype File is a single column of case-control status for the individuals in the genotype file, listed in the same order as in the genotype file.

The script is called in the following manner from the command line:

>R CMD BATCH '--args arg_1 arg_2 arg_3 arg_4 arg_5 arg_6' amelia_script_general.R chr__amelia_script.Rout

Here arg_1 to arg_6 are the six input parameters of the main script, given in order.

The output file is a tab-delimited file with 5 columns:
1) Gene Name
2) Number of Variants Analyzed
3) KBAT p-value
4) AMELIA p-value
5) Number of Permutations Performed
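The command line and the five-column output described above can be handled from a small wrapper. This is a hedged sketch in Python rather than anything shipped with AMELIA: the function names, the dictionary keys, and the example file names are hypothetical, and it assumes the output file has no header line and that unavailable p-values, if any, are written as NA.

```python
import csv
import io


def build_command(chrom, maf, geno, pheno, genes, out,
                  log="chr__amelia_script.Rout"):
    """Return the R CMD BATCH call as an argument list for subprocess.run."""
    # The six parameters travel as ONE argument that starts with --args,
    # mirroring the single-quoted string on the command line.
    args = "--args " + " ".join(
        str(a) for a in (chrom, maf, geno, pheno, genes, out))
    return ["R", "CMD", "BATCH", args, "amelia_script_general.R", log]


def _p_value(text):
    """Convert a p-value field, mapping NA to None (assumed convention)."""
    return None if text == "NA" else float(text)


def read_results(handle):
    """Parse the tab-delimited output into a list of dictionaries."""
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if not record:
            continue  # skip blank lines
        if len(record) != 5:
            raise ValueError(f"expected 5 columns, got {len(record)}")
        gene, n_variants, kbat_p, amelia_p, n_perm = record
        rows.append({
            "gene": gene,
            "n_variants": int(n_variants),
            "kbat_p": _p_value(kbat_p),
            "amelia_p": _p_value(amelia_p),
            "n_permutations": int(n_perm),
        })
    return rows


# Example with made-up file names and a made-up result line.
print(build_command(22, 0.05, "geno.txt", "pheno.txt", "genes.txt", "out.txt"))
print(read_results(io.StringIO("GENE1\t12\t0.03\t0.01\t1000\n")))
```

The list returned by build_command can be passed to subprocess.run from the directory that holds the scripts and the data. File names containing spaces would not survive the space-separated --args string, so they are best avoided.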