open_projects

Differences

This shows you the differences between two versions of the page.

open_projects [2024/01/22 14:54] – [Open projects in Bioinformatics] project
open_projects [2024/02/14 17:09] (current) project
Line 4: Line 4:
  
 If you are a potential supervisor, [[supervisor_instructions:click here]] If you are a potential supervisor, [[supervisor_instructions:click here]]
 +
 +=== Investigation of the effect of the circadian rhythm on the genetic control of gene expression ===
 +
 +Contact: Sonia Shah <sonia.shah@imb.uq.edu.au>, Solal Chauquet <uqschauq@uq.edu.au>
 +
 +The circadian rhythm reflects the daily cycle of behaviours and metabolic processes organisms exhibit. A 24-hour gene expression pattern occurs at the molecular level, with genes activated either during the day or night. Different tissues all display circadian control, with some more affected than others. Within the liver, for example, 3000 genes are subjected to circadian control. This regulation is orchestrated by a small group of CLOCK genes, establishing feedback loops that result in rhythmic gene expression in every tissue.
 +
 +We know that gene expression can be influenced by genetic variants, called expression quantitative trait loci (eQTLs), and this may be one mechanism linking genetic variants to disease. As a result, large eQTL datasets have been generated to assist in understanding disease mechanisms. However, it remains unknown whether sample collection time can affect eQTL identification. This project therefore aims to identify the possible effects of the circadian rhythm on the genetic control of gene expression using the Genotype-Tissue Expression (GTEx) dataset.
 +
 +During this project, you will run Python tools such as PEER and tensorQTL to identify eQTLs within 49 tissues. You will subsequently investigate the associations identified and follow up on the role of the genes under circadian control within different phenotypes.
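
To make the idea of an eQTL concrete, the following is a minimal sketch of a single-variant association test: an ordinary least-squares regression of a gene's expression on genotype dosage (0, 1 or 2 copies of the alternative allele). The function name `eqtl_association` and the toy data are illustrative assumptions; this is not the PEER or tensorQTL interface, and it omits the covariate adjustment a real analysis would include.

```python
import math


def eqtl_association(genotypes, expression):
    """Regress expression on genotype dosage.

    Returns (slope, t_statistic) from a simple linear regression.
    Hypothetical helper for illustration only.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (e - mean_e)
              for g, e in zip(genotypes, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares gives the standard error of the slope.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(genotypes, expression))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se


# Toy data (made up): six samples, dosage 0/1/2, normalised expression.
dosage = [0, 0, 1, 1, 2, 2]
expr = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, t_stat = eqtl_association(dosage, expr)
print(f"effect per allele: {slope:.3f}, t = {t_stat:.2f}")
```

One way to probe the project's question under this simplified model would be to run the same test separately on samples grouped by collection time and compare the estimated slopes.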
 +
 +=== Understanding the influence of taste and olfactory perception on eating behaviour and health conditions using big genetic data ===
 +
 +Contact info: Daniel Hwang <d.hwang@uq.edu.au>
 +
 +Project description: Human perception of taste and smell plays a key role in food preferences and choices. There is a large and growing body of work suggesting that taste and smell (together known as "chemosensory perception") determine eating behaviour and dietary intake, a primary risk factor of chronic conditions such as obesity, cardiometabolic disorders, and cancer. Evidence to date is largely based on observational studies that are susceptible to confounding and reverse causation, leaving the "causal effects" of chemosensory perception on food consumption unclear. If their relationship is truly causal, flavour modification may represent a tangible way of modifying food consumption in a way that benefits public health outcomes. This project aims to: (i) elucidate the genetic architecture underlying individual differences in taste and smell perception, (ii) use this information to assess their causal effects on eating behaviour, and (iii) create a sensory-food causal network mapping individual sensory qualities (i.e. sweet taste, bitter taste, and more) to individual food items.
 +
 +=== Increasing drug success rate in human clinical trials using genomics ===
 +
 +Around 90% of drug candidates fail in human clinical trials, largely due to lack of efficacy or safety concerns. This partly reflects the limitations of using in vitro and animal studies to predict the effect of compounds in humans. Recent studies highlight that drug targets backed by evidence from human genetic studies are twice as likely to make it to market. Human genetic data can also identify potential adverse side effects. Having such information before embarking on human clinical trials could improve the success rate of a compound and help avoid adverse outcomes for participants. This project will apply statistical genomics analyses to publicly available human genomic data to predict the efficacy and any safety concerns of compounds that are currently in the drug development pipeline.
 +
 +Project significance: Findings from this project could potentially identify new therapeutic applications for these compounds or unknown side effects, ultimately informing future human clinical trials.
 +
 +Contact: Sonia Shah <sonia.shah@imb.uq.edu.au>
 +
 +Supervisors: You will be working with a multidisciplinary team of supervisors: Prof Dave Evans, Dr Sonia Shah, Prof Glenn King and Assoc. Prof Nathan Palpant.
 +
 +Familiarity with computational analyses (e.g. using R or Python) is needed for this project. Some knowledge of genome-wide association studies and statistical genomics methods such as Mendelian randomisation analysis would be beneficial.
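
For readers unfamiliar with Mendelian randomisation, the sketch below shows the standard inverse-variance-weighted (IVW) estimator computed from per-variant summary statistics. It is an illustration under simplifying assumptions (independent variants, no pleiotropy), not the project's pipeline; the function name `ivw_estimate` and the numbers are made up.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance-weighted Mendelian randomisation estimate.

    beta_exposure: variant effects on the exposure (e.g. drug target level)
    beta_outcome:  variant effects on the outcome (e.g. disease risk)
    se_outcome:    standard errors of the outcome effects
    Hypothetical helper for illustration only.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx
                      for w, bx in zip(weights, beta_exposure))
    return numerator / denominator


# Made-up summary statistics for three independent variants.
bx = [0.1, 0.2, 0.3]
by = [0.05, 0.10, 0.15]
se = [0.01, 0.01, 0.01]
print(f"estimated causal effect: {ivw_estimate(bx, by, se):.3f}")
```

Each variant on its own gives a Wald ratio `by / bx`; the IVW estimate is a weighted average of those ratios, with more precisely estimated variants counting for more.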
 +
 +=== Developing quiescent stem cell classifier using single cell transcriptomics ===
 +Contact info: Dr Lachlan Harris (Lachlan.Harris@qimrberghofer.edu.au), Dr Olga Kondrashova (Olga.Kondrashova@qimrberghofer.edu.au)
 +
 +Quiescence is a reversible state of cell-cycle arrest, sometimes referred to as the “G0” phase of the cell-cycle. It is an adaptive feature of most adult stem cell populations, where it ensures that stem cells divide only when needed, preserving regenerative capacity. However, quiescence is also adopted by cancer stem cells to evade chemo- and radiotherapies that preferentially kill fast-dividing cells. Single-cell data promises to uncover the molecular regulation of quiescent stem cells in health and disease but the identification of these cells within these datasets is either reliant on expert knowledge and manual curation or is currently impossible, due to a lack of marker genes. 
 +
 +The most common classifiers that define cell-cycle stages (G1/S/G2/M) in single-cell RNA-sequencing (scRNA-seq) data were trained on populations of actively cycling cells. Therefore, these tools cannot identify quiescent stem cells in the “G0” phase of the cell-cycle. It is an outstanding question as to whether there are sufficient transcriptomic similarities across quiescent stem cells from different tissue types to build a generalisable model to discriminate these cellular populations. Furthermore, it is unknown whether such a model would generalise to cancerous tissue, where increased variability in transcriptomic states often degrades the distinction between cell types. 
 +
 +This project aims to develop a broadly applicable quiescent classifier. As a first step towards this, this project will seek to 1) contribute to the curation of datasets and isolation of tissue-agnostic and tissue-specific feature sets that define quiescent stem cells and 2) compare methods for training quiescent classifiers and for determining the most salient features. 
 +
 +
 +=== Understanding sex-specific cardiovascular disease risk ===
 +
 +Contact info: Dr Sonia Shah (sonia.shah@imb.uq.edu.au), Dr Clara Jiang (j.jiang@uq.edu.au)
 +
 +Description: Cardiovascular diseases (CVD) account for 35% of female deaths globally (29% in Australia). However, CVDs remain under-studied, under-diagnosed and under-treated in women. This sex disparity is partly due to the lack of knowledge of female-specific risk factors. This project involves statistical analysis of large-scale health and genetic data to identify sex-specific CVD risk factors and underlying mechanisms.
 +
 +Requirements: A background in genetics and computational data analysis is preferable.
 +
 +=== De-risking the drug development pipeline by finding biomarkers of drug action ===
 +
 +Supervisor: Dr Nathan Palpant (n.palpant@uq.edu.au)
 +
 +More than 90% of drugs fail to reach clinical approval. Genetic evidence supporting a drug-target-indication link can improve the success rate by more than 50%. This project aims to make use of consortium-level data resources (UK Biobank, Human Cell Atlas, ENCODE etc.) to identify genetic links between genetic targets and phenotypes to help facilitate the translation of drugs from healthy individuals (Phase 1 clinical trial assessing safety) into sick patients (Phase 2 clinical trial assessing efficacy). Finding orthogonal biomarkers of drug action in healthy individuals is critical to de-risk drug dosing when transitioning from Phase 1 to Phase 2 trials. Using ASIC1a inhibitors, which are being developed to treat heart attacks, as candidate drugs, we aim to develop a functionally validated computational pipeline to predict orthogonal biomarkers of ASIC1a inhibitor drug action in healthy individuals to help inform dosing in human clinical trials. Computationally predicted biomarkers will be validated using genetic knockout animals and pharmacological inhibitors of ASIC1a. Collectively, this project will help develop a proof-of-principle computational pipeline for orthogonal biomarker prediction of drug targets in the human genome.
 +
 +=== Parsing the genome into functional units to understand the genetic basis of cell identity and function ===
 +
 +Supervisor: Dr Nathan Palpant (n.palpant@uq.edu.au)
 +
 +The billions of bases in the genome are shared among all cell types and tissues in the body. Understanding how regions of the genome control the diverse functions of cells is fundamental to understanding evolution, development, and disease. We recently identified approaches to define diverse biologically constrained regions of the genome that appear to control very specific cellular functions. This project will evaluate how these biologically constrained regions of the genome have influenced evolutionary processes, evaluate their regulatory basis in controlling the identity and function of cells, and analyse the promiscuity of cross-talk between different biologically constrained regions. The project will also study how these genomic regions impact disease mechanisms by evaluating how disease-associated variants in different regions influence survival of patients with cancer and assessing whether these regions are associated with identifying causal disease variants in human complex trait data. The project will involve significant collaborative work with industry partners and researchers across Australia with the goal of providing critical insights into fundamental mechanisms of genome regulation.    
  
 === Machine learning integration of sequencing and imaging data in cancer research === === Machine learning integration of sequencing and imaging data in cancer research ===
Line 113: Line 171:
  
  
-=== Trans-ancestry conditional analyses of genome-wide association studies === +=== Decoding Transcription Factor Dosage Effects on Cell State Transitions with DoseH-Seq ===
  
 -Contact: Dr Loic Yengo (l.yengo@imb.uq.edu.au) +Contact info: Dr Christian Nefzger (c.nefzger@imb.uq.edu.au), Ralph Patrick (ralph.patrick@imb.uq.edu.au) and Marina Naval-Sanchez (m.navalsanchez@imb.uq.edu.au)
  
 -The experimental design of genome-wide association studies (GWAS) consists in testing the association between a large number of DNA polymorphisms and a trait of interest. Classically, these associations are tested using a simple linear regression (i.e. one at a time) framework, which cannot distinguish associations from correlated variants. To solve that issue, conditional and joint (COJO) analyses leverage the correlation structure between polymorphisms to identify subsets of variants that are jointly associated with the trait of interest. Current implementations of COJO algorithms can be applied to GWAS performed in individuals of a single ancestry, where the correlation structure between variants is constant; but they cannot yet handle meta-analyses of GWAS from diverse ancestries (e.g. East-Asian, European). +Cell identity is controlled by different combinations of transcription factors (TFs) that bind to genomic regulatory elements to regulate gene expression. TF activity is not binary in most instances but graded and in response to TF dosage levels (e.g., Naqvi et al., Nat Genet., 2023, PMID: 37024583). For this reason, TFs are strongly enriched for haploinsufficient disease associations (Seidman et al, 2002, J. Clin. Invest. PMID: 11854316; Van de Lee et al., 2020, Trends Genet., PMID: 32451166) and TF dosage and stoichiometry strongly affect reprogramming outcomes (e.g., Polo et al, 2012, Cell, PMID: 32939092; An et al., 2019, Cell Reports, PMID: 31722212). Furthermore, TF dosage effects may also underlie seemingly contradictory effects linked to overactivation of certain TFs in cancer contexts, including of the Nfi family (Becker-Santos, 2017, The Lancet Discovery Science, PMID: 28596133).
  
 -This project aims at developing COJO algorithm to simultaneously perform variants selection and meta-analyses of multiple GWAS from participants of diverse ancestries. The research will include: (i) developing and comparing algorithms, (ii) testing the impact of violations of model assumptions through simulations and (iii) writing C++ based software implementing this algorithm. Application of this research can improve our ability to discover genes involved in the susceptibility of common diseases. +Single-cell RNA+ATAC-seq is a uniquely powerful assay to measure the impact of TF levels on cell regulatory architecture; however, no tools currently exist to directly study TF dosage effects on temporal cell state transitions. To address these gaps, we developed Dosage and Hashtag sequencing (DoseH-seq), an expansion of the 10x Genomics single-nucleus (sn)RNA+ATAC-seq assay that enables sensitive detection of lentiviral perturbations (e.g., TFs) linked to a heterogeneously expressed promoter. In combination with sample hash tagging, multiple temporal and dosage states can be profiled for theoretically any number of genes of interest. This allows detection of TF dosage-dependent effects on temporal cell state transitions, chromatin architecture, co-factor expression, and the rewiring of TF networks at high resolution. Compatibility with BGI sequencing technology enables the generation of low-cost datasets. We demonstrate the utility of DoseH-seq by tracking the dosage effects of the somatic transcription factor Nfix during reprogramming towards pluripotency. Contrary to the current dogma, we find that Nfi overexpression can act either as a reprogramming roadblock or as a reprogramming booster, depending on TF dosage and context. These insights may help resolve the TF’s paradoxical role in cancer. DoseH-seq represents a powerful tool for elucidating, and ultimately controlling, both desired and pathological cell state transitions.
  
 -The ideal candidate will have a good understanding of the multiple linear regression model and will be able to efficiently program in R/Python and C++. +The applicant would help drive method establishment around our novel DoseH-seq technique and support analysis to understand TF dosage effects with established data sets. The ideal candidate will be able to program efficiently in R or Python. This project is looking for bioinformatics Masters students (ideally 16 units, but we consider 8-unit applicants as well). We also consider PhD students.
  
  
  
 +=== Trans-ancestry conditional analyses of genome-wide association studies === 
  
-=== DNA sequence analysis to investigate why prevalence of adverse effects to ACE inhibitor medication differs across ancestries === +Contact: Dr Loic Yengo (l.yengo@imb.uq.edu.au)
-  +
-Contact: Dr Sonia Shah (s.shah1@uq.edu.au) +
-  +
-The angiotensin converting enzyme (ACE) is a component of the renin-angiotensin pathway which regulates blood pressure. It is a target for blood pressure lowering medication (ACE inhibitors). The efficacy and occurrence of adverse side-effects from ACE inhibitor treatment is different amongst difference ancestries. +
-  +
-This project will analyse exome sequence data of the ACE gene in different ancestries to determine if there are differences in structure across different ancestries, which may explain the ancestry differences in ACE inhibitor adverse effets. +
-  +
-The ideal candidate will have knowledge and experience in bioinformatics, particularly DNA and protein sequence analysis and analysis of next generation sequence data. Though not necessary, experience with tools such as NCBI BLAST, samtools, vcftools, other sequence analysis packages in R will be advantageous.+
  
 +The experimental design of genome-wide association studies (GWAS) consists in testing the association between a large number of DNA polymorphisms and a trait of interest. Classically, these associations are tested using a simple linear regression (i.e. one at a time) framework, which cannot distinguish associations from correlated variants. To solve that issue, conditional and joint (COJO) analyses leverage the correlation structure between polymorphisms to identify subsets of variants that are jointly associated with the trait of interest. Current implementations of COJO algorithms can be applied to GWAS performed in individuals of a single ancestry, where the correlation structure between variants is constant; but they cannot yet handle meta-analyses of GWAS from diverse ancestries (e.g. East-Asian, European).
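
The limitation of one-at-a-time testing described above can be illustrated with a toy example. The sketch below uses individual-level data and two correlated variants, only one of which affects the trait; it is not the COJO algorithm, which works from GWAS summary statistics and a reference correlation (LD) structure. All names and numbers are illustrative assumptions.

```python
def centred(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def marginal_slope(x, y):
    """Slope from regressing y on a single variant (one at a time)."""
    xc, yc = centred(x), centred(y)
    return dot(xc, yc) / dot(xc, xc)


def joint_slopes(x1, x2, y):
    """Slopes from regressing y on both variants at once (2x2 normal equations)."""
    a, b, yc = centred(x1), centred(x2), centred(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yc), dot(b, yc)
    det = s11 * s22 - s12 * s12  # non-zero unless the variants are collinear
    return ((s1y * s22 - s12 * s2y) / det,
            (s11 * s2y - s12 * s1y) / det)


# Made-up genotype dosages: snp2 is correlated with snp1 but has no effect.
snp1 = [0, 1, 2, 0, 1, 2, 1, 2]
snp2 = [0, 1, 2, 0, 1, 1, 2, 2]
trait = [2.0 * g for g in snp1]  # only snp1 is causal

print("marginal effect of snp2:", round(marginal_slope(snp2, trait), 3))
print("joint effects (snp1, snp2):",
      tuple(round(b, 3) for b in joint_slopes(snp1, snp2, trait)))
```

Tested alone, the non-causal variant appears associated because it tags the causal one; fitted jointly, its effect vanishes. Because the correlation between variants enters the joint fit directly, differences in that correlation across ancestries are what make the trans-ancestry case harder.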
  
 +This project aims at developing a COJO algorithm to simultaneously perform variant selection and meta-analyses of multiple GWAS from participants of diverse ancestries. The research will include: (i) developing and comparing algorithms, (ii) testing the impact of violations of model assumptions through simulations and (iii) writing C++-based software implementing this algorithm. Application of this research can improve our ability to discover genes involved in the susceptibility to common diseases.
 +
 +The ideal candidate will have a good understanding of the multiple linear regression model and will be able to efficiently program in R/Python and C++.
  
  
Line 203: Line 258:
 In this project, students will use cutting-edge data sources including reduced representation bisulphite sequencing data, whole genome bisulphite sequencing, long read sequencing and human methylation data to develop a tool to impute methylation sites from low coverage ONT sequence data.  In this project, students will use cutting-edge data sources including reduced representation bisulphite sequencing data, whole genome bisulphite sequencing, long read sequencing and human methylation data to develop a tool to impute methylation sites from low coverage ONT sequence data. 
  
-  
 This project is designed for students who are studying for Masters of Molecular Biology, Masters of Biotechnology, & Masters of Bioinformatics.  This project is designed for students who are studying for Masters of Molecular Biology, Masters of Biotechnology, & Masters of Bioinformatics. 
  
 Available for semester 1, 2 and summer  Available for semester 1, 2 and summer 
  
- +=== Differential methylated regions related to puberty in Brahman cattle ===
- +
-==== Differential methylated regions related to puberty in Brahman cattle ===+
  
 Puberty is a complex whole-body phenomenon that affects bone growth. In this study, we investigated how puberty in Bos indicus females affects methylation profiles in the epiphyseal growth plate, the cartilage that is essential to bone growth in long bones. The student will analyse nanopore sequencing data of 12 samples (6 pre-puberty and 6 post-puberty) to call methylation and identify the differentially methylated regions between these two groups. Puberty is a complex whole-body phenomenon that affects bone growth. In this study, we investigated how puberty in Bos indicus females affects methylation profiles in the epiphyseal growth plate, the cartilage that is essential to bone growth in long bones. The student will analyse nanopore sequencing data of 12 samples (6 pre-puberty and 6 post-puberty) to call methylation and identify the differentially methylated regions between these two groups.
Line 217: Line 269:
  
 Available for semester 1, 2 and summer  Available for semester 1, 2 and summer 
- 
  
 === CRISPR === === CRISPR ===
Line 234: Line 285:
  
 Available all year, for Master of Bioinformatics students; suitable for one semester, full-time. Available all year, for Master of Bioinformatics students; suitable for one semester, full-time.
- 
-=== The transcriptional landscape of cardiovascular differentiation === 
- 
-Contact info: Nathan Palpant (n.palpant@uq.edu.au) 
- 
-Project description: Analysing the transcriptional landscape of cardiovascular differentiation from stem cells at single cell resolution. Stem cells provide a mechanism for generating all cell types of the body. Understanding the mechanisms by which stem cells differentiation into these diverse cell types is central to utilizing them for understanding developmental biology and maximizing their translational potential for cell therapeutics and drug discovery. In this project, you will make use of and develop computational and statistical tools to study the transcriptional landscape of cardiovascular differentiation at single cell resolution. The project will include implementing protocols for quality control analysis, normalization, and clustering, analysing gene networks underlying cell subpopulations, identifying key genetic regulators of cell states, and helping develop novel strategies for studying and analysing single cell RNA-sequencing data to study biological questions.   
- 
-Availability, requirements, etc: The project as available on an ongoing basis for honours or masters of Bioinformatics students, full time. 
  
 === Machine learning and data integration in bioinformatics === === Machine learning and data integration in bioinformatics ===
Line 252: Line 295:
  
 Availability all year, for bioinformatics students with problem-solving skills, Honours or Masters. Availability all year, for bioinformatics students with problem-solving skills, Honours or Masters.
- 
- 
  
 === Reconstruction of ancestral proteins === === Reconstruction of ancestral proteins ===
open_projects.1705895687.txt.gz · Last modified: 2024/01/22 14:54 by project