058_ Replication study: Identification and characterization of schizophrenia patient strata using genotype imputed transcriptional profiles
Research Question and Aims
Schizophrenia (SCZ) is a highly heterogeneous disease, with potentially different genetic and biological mechanisms contributing to disease etiology in different patients. We aim to identify distinct SCZ patient groups based on differences in their genetic risk profiles. To that end, we have developed a novel computational approach that uses large reference cohorts and imputed gene expression profiles to stratify patients into subgroups with different genetic liability profiles toward impairment of distinct biological pathways. We have applied this approach to the PGC SCZ wave 2 dataset and leveraged the UK Biobank to annotate the patient clusters and predict endophenotypic differences between the patient groups. We now aim to validate these predictions using the deep phenotyping information available in the PsyCourse cohort, and to confirm differences in clinical and biographical (endo)phenotypes between PsyCourse patient groups defined by the CASTOM-iGEx algorithm. These analyses will be included in the revision of the current CASTOM-iGEx manuscript.
We hypothesize that groups of patients with mental illness, defined by a patient grouping scheme derived from the PGC SCZ dataset, will exhibit significant differences in their clinical and endophenotypic characteristics.
Data from all PsyCourse participants with available genotype data will be included in this study.
Gene and pathway level imputation
We will use the CASTOM-iGEx pipeline to impute gene expression levels and pathway activity profiles from genotype information alone, using pre-trained machine learning models for 10 distinct cell types and tissues.
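As an illustration only, the idea of imputing expression from genotypes can be sketched under the assumption that each pre-trained model is a sparse linear predictor over SNP allele dosages. This is a simplification chosen for the example, not a description of the CASTOM-iGEx models. The function name, SNP identifiers, weights, and dosages below are all hypothetical.

```python
def impute_expression(dosages, weights, intercept=0.0):
    """Predict one gene's expression for one individual from SNP dosages.

    Assumes a linear model: expression = intercept + sum(weight * dosage).
    """
    # Only SNPs present in both the genotype data and the model contribute.
    shared = dosages.keys() & weights.keys()
    return intercept + sum(weights[snp] * dosages[snp] for snp in shared)


# Hypothetical pre-trained weights for one gene in one tissue.
gene_weights = {"rs1": 0.30, "rs2": -0.10, "rs3": 0.05}

# Hypothetical allele dosages (0 to 2) for two individuals.
genotypes = {
    "patient_A": {"rs1": 2.0, "rs2": 1.0, "rs4": 0.0},
    "patient_B": {"rs1": 0.0, "rs2": 2.0, "rs3": 1.0},
}

# One imputed expression value per individual for this gene.
imputed = {pid: impute_expression(d, gene_weights) for pid, d in genotypes.items()}
```

Repeating this per gene and per tissue would yield one imputed expression matrix per tissue. Pathway activity profiles would require an additional aggregation step that is not shown here.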
We will use a patient clustering structure, derived from imputed gene expression profiles in dorsolateral prefrontal cortex in the PGC SCZ dataset, and project the PsyCourse patients onto this pre-existing structure using their PsyCourse imputed gene expression profiles.
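One simple way to picture such a projection is nearest-centroid assignment: each new patient is scaled with statistics from the reference cohort and assigned to the closest reference cluster. The sketch below assumes this scheme purely for illustration; the projection actually used by CASTOM-iGEx may differ. All names and numbers are hypothetical.

```python
import math


def standardize(profile, ref_mean, ref_sd):
    """Scale a profile with reference-cohort statistics, not the new cohort's own."""
    return [(x - m) / s for x, m, s in zip(profile, ref_mean, ref_sd)]


def assign_cluster(profile, centroids):
    """Return the label of the centroid with the smallest Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


# Hypothetical per-gene mean and standard deviation from the reference cohort.
ref_mean = [1.0, 2.0, 0.5]
ref_sd = [0.5, 1.0, 0.25]

# Hypothetical reference cluster centroids in standardized gene space.
centroids = {
    "cluster_1": [1.0, -1.0, 0.0],
    "cluster_2": [-1.0, 1.0, 0.5],
}

# Hypothetical imputed expression profiles for two new patients.
new_patients = {
    "P1": [1.5, 1.0, 0.5],
    "P2": [0.4, 3.2, 0.6],
}

assignments = {
    pid: assign_cluster(standardize(profile, ref_mean, ref_sd), centroids)
    for pid, profile in new_patients.items()
}
```

Scaling with reference statistics keeps the new cohort in the same coordinate system as the original clustering, so cluster labels keep their meaning across cohorts.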
Subsequently, we will compare a battery of clinical and other (endo)phenotypic characteristics between the PsyCourse patients assigned to different clusters (see below). For these comparisons, appropriate generalized linear models correcting for relevant covariates will be used.
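To make the covariate-adjusted comparison concrete, the following dependency-free sketch fits the simplest such model: a Gaussian linear model (identity link) with a cluster indicator plus covariates, solved by least squares. The choice of covariates (age, sex), the two-cluster setup, and the data are invented for illustration. A real analysis would use a statistics package to obtain standard errors and p-values, and other model families (for example logistic) for non-continuous phenotypes.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_adjusted_linear_model(phenotype, cluster_indicator, covariates):
    """Return [intercept, cluster effect, covariate effects...] by least squares."""
    design = [[1.0, float(g)] + [float(v) for v in cov]
              for g, cov in zip(cluster_indicator, covariates)]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, phenotype)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Invented toy data: cluster membership (0/1) and covariates (age, sex).
cluster = [0, 0, 0, 1, 1, 1, 0, 1]
covars = [(25, 0), (40, 1), (33, 0), (29, 1), (51, 0), (45, 1), (60, 1), (38, 0)]
# Noise-free phenotype built from known effects so the fit can be checked.
pheno = [2.0 + 1.5 * g + 0.1 * age - 0.5 * sex
         for g, (age, sex) in zip(cluster, covars)]

coefs = fit_adjusted_linear_model(pheno, cluster, covars)
cluster_effect = coefs[1]  # phenotype difference between clusters, adjusted for covariates
```

Because the toy phenotype is generated without noise, the fitted cluster effect recovers the simulated value. With real data, the estimate would need to be accompanied by a test statistic and corrected for the number of phenotypes compared.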