2022-01-28
058_ Replication study: Identification and characterization of schizophrenia patient strata using genotype imputed transcriptional profiles
Research Question and Aims
Schizophrenia (SCZ) is a highly heterogeneous disease in which different genetic and biological mechanisms may contribute to disease etiology in different patients. We aim to identify distinct SCZ patient groups based on differences in their genetic risk profiles. To that end, we have developed a novel computational approach that uses large reference cohorts and imputed gene expression profiles to stratify patients into subgroups whose genetic liability profiles point towards impairment of distinct biological pathways. We have applied this approach to the PGC SCZ wave 2 dataset and leveraged the UK Biobank to annotate the patient clusters and predict endophenotypic differences between the patient groups. We now aim to validate these predictions using the deep phenotyping information available in the PsyCourse cohort and to confirm differences in clinical and biographical (endo-)phenotypes between PsyCourse patient groups defined by the CASTOM-iGEx algorithm. These analyses will be included in the revision of the current CASTOM-iGEx manuscript.
Analytic Plan
We hypothesize that groups of patients with mental illness, defined by a patient grouping scheme derived from the PGC SCZ dataset, will differ significantly in their clinical and endophenotypic characteristics.
Participants
Data from all PsyCourse participants with available genotype data will be included in this study.
Gene and pathway level imputation
We will use the CASTOM-iGEx pipeline to impute gene expression levels and pathway activity profiles from genotype information alone, using pre-trained machine learning models for 10 distinct cell types and tissues.
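The imputation step can be illustrated with a minimal sketch. It assumes, purely for illustration, that each pre-trained model is a linear predictor (one weight per variant and gene) and that a pathway score is the mean standardized expression of its member genes. The function names, the weight matrix, and the pathway membership matrix are hypothetical and do not describe the actual CASTOM-iGEx implementation.

```python
import numpy as np


def impute_expression(dosages, weights):
    """Predict gene expression as a weighted sum of allele dosages.

    dosages: (n_samples, n_variants) allele dosages, typically in [0, 2]
    weights: (n_variants, n_genes) hypothetical pre-trained model weights
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages and weights disagree on the number of variants")
    return dosages @ weights


def pathway_scores(expression, membership):
    """Average the standardized expression of the genes in each pathway.

    expression: (n_samples, n_genes) imputed expression
    membership: (n_genes, n_pathways) 0/1 indicator matrix (hypothetical)
    """
    expression = np.asarray(expression, dtype=float)
    membership = np.asarray(membership, dtype=float)
    sd = expression.std(axis=0)
    sd[sd == 0] = 1.0  # constant genes contribute a z-score of 0
    z = (expression - expression.mean(axis=0)) / sd
    counts = membership.sum(axis=0)
    counts[counts == 0] = 1.0  # empty pathways get a score of 0
    return (z @ membership) / counts
```

In this sketch one weight matrix would be needed per tissue or cell type, so the same genotype data would yield one expression matrix per tissue.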
Patient stratification
We will use a patient clustering structure obtained from the PGC SCZ dataset, based on imputed gene expression profiles in dorsolateral prefrontal cortex, and project the PsyCourse patients onto this pre-existing clustering structure using their respective imputed gene expression profiles.
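One simple way to picture such a projection is nearest-centroid assignment, sketched below. This is an assumption made for illustration only: the projection rule actually used by CASTOM-iGEx is not described in this proposal. The reference means, standard deviations, and cluster centroids are hypothetical inputs that would come from the PGC SCZ analysis.

```python
import numpy as np


def project_onto_clusters(expr_new, ref_mean, ref_sd, centroids):
    """Assign each new sample to the closest reference cluster centroid.

    expr_new:  (n_samples, n_genes) imputed expression of the new cohort
    ref_mean:  (n_genes,) per-gene mean in the reference cohort
    ref_sd:    (n_genes,) per-gene standard deviation in the reference cohort
    centroids: (n_clusters, n_genes) centroids on the standardized scale
    """
    expr_new = np.asarray(expr_new, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Standardize with reference statistics so both cohorts share one scale.
    z = (expr_new - np.asarray(ref_mean, dtype=float)) / np.asarray(ref_sd, dtype=float)
    # Squared Euclidean distance from every sample to every centroid.
    diff = z[:, None, :] - centroids[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    return dist.argmin(axis=1)
```

Standardizing with the reference cohort's statistics rather than the new cohort's keeps the cluster definitions fixed, which is the point of projecting onto a pre-existing structure.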
Statistical analysis
Subsequently, we will compare a battery of clinical and other (endo-)phenotypic characteristics between the PsyCourse patients assigned to different clusters (see below). For these comparisons, appropriate generalized linear models correcting for relevant covariates will be used.
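As a minimal sketch of a covariate-adjusted comparison, the code below tests a cluster effect on a continuous phenotype with a Gaussian linear model (the identity-link special case of a generalized linear model) by comparing nested models. The function names and the choice of covariates are hypothetical. Binary or count phenotypes would need a different link function, and the models used in the actual analysis may differ.

```python
import numpy as np


def residual_ss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def cluster_effect_f(y, clusters, covariates):
    """F statistic for adding cluster membership to a covariate-only model.

    y:          (n,) continuous phenotype
    clusters:   (n,) cluster labels
    covariates: (n,) or (n, k) numeric covariates (e.g. age, coded sex)
    Returns (F, df_numerator, df_denominator).
    """
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    n = len(y)
    covariates = np.asarray(covariates, dtype=float).reshape(n, -1)
    levels = np.unique(clusters)
    if len(levels) < 2:
        raise ValueError("need at least two clusters to compare")
    # Dummy-code clusters, using the first level as the reference.
    dummies = np.column_stack([(clusters == lv).astype(float) for lv in levels[1:]])
    X_null = np.hstack([np.ones((n, 1)), covariates])
    X_full = np.hstack([X_null, dummies])
    rss_null = residual_ss(X_null, y)
    rss_full = residual_ss(X_full, y)
    df1 = dummies.shape[1]
    df2 = n - X_full.shape[1]
    f_stat = ((rss_null - rss_full) / df1) / (rss_full / df2)
    return f_stat, df1, df2
```

A p-value could then be obtained from the F distribution with the returned degrees of freedom, for example with `scipy.stats.f.sf(f_stat, df1, df2)`. With many phenotypes tested, a multiple-testing correction would also be needed.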
Resources needed
Sociodemographic variables
v1_sex
v1_ageBL
v1_yob
v1_seas_birth
v1_center
Diagnosis
v1_scid_dsm_dx_cat
v1_stat
v1_scid_dsm_dx
Somatic information/comorbidities
v1_height
v1_weight
v1_waist
v1_bmi
v1_chol_trig
v1_hyperten
v1_ang_pec
v1_heart_att
v1_stroke
v1_diabetes
v1_hyperthy
v1_hypothy
v1_osteopor
v1_asthma
v1_copd
v1_allerg
v1_neuroder
v1_psoriasis
v1_autoimm
v1_cancer
v1_stom_ulc
v1_kid_fail
v1_stone
v1_epilepsy
v1_migraine
v1_parkinson
v1_liv_cir_inf
v1_tbi
v1_beh
v1_eyear
v1_inf
Psychiatric history
v1_cur_psy_trm
v1_outpat_psy_trm
v1_age_1st_out_trm
v1_daypat_inpat_trm
v1_age_1st_inpat_trm
v1_dur_illness
v1_1st_ep
v1_tms_daypat_outpat_trm
v1_cat_daypat_outpat_trm
Family history
v1_fam_hist
Severity
v1_panss_sum_pos
v1_panss_sum_neg
v1_panss_sum_gen
v1_cgi_s
v1_gaf
v1_1st_ep
v1_dur_illness
v1_age_1st_inpat_trm
v1_drugs
v1_con_medication_variables_1
v1_clin_medication_variables_1
v1_med_clin_orig
Neurocognitive function
v1_nrpsy_com
v1_nrpsy_lng
v1_nrpsy_mtv
v1_nrpsy_tmt_A_rt
v1_nrpsy_tmt_A_err
v1_nrpsy_tmt_B_rt
v1_nrpsy_tmt_B_err
v1_nrpsy_dgt_sp_frw
v1_nrpsy_dgt_sp_bck
v1_nrpsy_dg_sym
v1_nrpsy_mwtb
Substance abuse
v1_ever_smkd
v1_no_cig
v1_lftm_alc_dep
v1_pst6_ill_drg
v1_evr_ill_drg
v1_evr_hvy_usr