Data Science Hub: Computational biology
We’re a dedicated bioinformatics team that uses computational approaches to drive biological discoveries. Our core facility offers comprehensive data analysis, software development, and consulting tailored to the needs of diverse research projects.
Meet the team
Hu Chen
Hari Yalamanchili
Yi Zhong
Hyun-Hwan Jeong
Johnathan Jia
Mehadi Hasan
A recent case study
We recently collaborated with Dr. Huda Zoghbi’s lab on a study of MeCP2, the protein whose loss of function causes Rett syndrome (RTT) and whose overexpression causes MECP2 duplication syndrome (MDS). The researchers identified Gdf11 as a gene tightly regulated by MeCP2. Normalizing Gdf11 levels in MDS mice improved behavioral deficits, while losing one copy of Gdf11 led to neurobehavioral deficits, suggesting that Gdf11 plays a critical role in brain function. Our RNA-seq and CUT&RUN analyses were pivotal in establishing the relationship between MeCP2 and Gdf11, and they illustrate how these techniques can uncover the molecular mechanisms underlying neurological diseases. The study was published in eLife in 2023.
Our tools:
Bioconductor — an open-source project that provides a comprehensive collection of R packages for analyzing and interpreting high-throughput genomic data. Because it builds on the R statistical computing environment, it inherits R’s data manipulation and visualization capabilities.
STAR (Spliced Transcripts Alignment to a Reference) — a fast RNA-seq aligner that maps sequencing reads to a reference genome while accounting for splicing. It works in two steps: a seed search that finds maximal mappable prefixes (the longest read segments that match the genome exactly), followed by clustering, stitching, and scoring of those seeds into full alignments that can span splice junctions.
Salmon — a lightweight, quasi-mapping-based tool for quantifying transcript abundance from RNA-seq data without full read alignment. Its inference runs in two phases: an online phase based on stochastic collapsed variational Bayesian inference, followed by an offline phase that refines the abundance estimates.
HISAT2 (Hierarchical Indexing for Spliced Alignment of Transcripts 2) — a fast and sensitive alignment program for mapping next-generation sequencing reads to a reference genome. It uses an indexing scheme called the Hierarchical Graph FM index (HGFM), which pairs a global index with many small local indexes that together cover the genome, to achieve high accuracy and efficiency. It’s particularly well suited to spliced alignment of reads that span multiple exons.
BWA (Burrows-Wheeler Aligner) — a widely used software package for mapping low-divergent sequences against a large reference genome. It uses the Burrows-Wheeler transform and backward search with a suffix array to enable fast and accurate alignments. BWA supports both short and long read alignment and is often used in variant calling pipelines.
Bowtie — an ultrafast, memory-efficient short read aligner that uses the Burrows-Wheeler transform and FM-index to align sequencing reads to large genomes. It performs unspliced alignment only; spliced aligners such as TopHat build on it. The original Bowtie allows mismatches but not gaps, while Bowtie 2 adds gapped alignment. Bowtie is commonly used in sequencing applications including ChIP-seq, RNA-seq, and BS-seq.
Trinity — a de novo transcriptome assembly tool designed to reconstruct full-length transcripts from RNA-seq data without relying on a reference genome. It employs a three-step process: Inchworm, Chrysalis, and Butterfly. Trinity handles alternative splicing events, resolves paralogous transcripts, and produces isoform-level assemblies. It’s particularly useful for non-model organisms or when a reference genome is unavailable.
Seurat — an R package designed for analyzing and exploring single-cell RNA-sequencing data. It offers a comprehensive set of tools for quality control, normalization, dimensionality reduction, clustering, and differential expression analysis of scRNA-seq datasets.
Scanpy — a Python-based toolkit for analyzing single-cell gene expression data, built on top of the scientific Python stack, including NumPy, SciPy, Matplotlib, and Pandas.
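Several of the aligners above (BWA, Bowtie, and HISAT2) rely on the Burrows-Wheeler transform and FM-index backward search. The toy Python sketch below illustrates the general idea on a short string. It is not taken from any of these tools: the function names (bwt, build_fm_index, count_matches) are hypothetical, the transform is built by naively sorting rotations, and real aligners use compressed, sampled indexes and handle mismatches, which this sketch does not.

```python
from collections import Counter


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (toy inputs only)."""
    text += "$"  # sentinel; sorts before the letters A, C, G, T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def build_fm_index(bwt_text: str):
    """Build the two FM-index tables used by backward search.

    C[c]      = number of characters in the text that sort before c
    occ[c][i] = number of occurrences of c in bwt_text[:i]
    """
    counts = Counter(bwt_text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    occ = {c: [0] * (len(bwt_text) + 1) for c in counts}
    for i, ch in enumerate(bwt_text):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ


def count_matches(pattern: str, C, occ, n: int) -> int:
    """Count exact occurrences of pattern, scanning it right to left."""
    lo, hi = 0, n  # half-open range of rows in the sorted rotation matrix
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    reference = "GATTACAGATTACA"  # made-up example sequence
    transformed = bwt(reference)
    C, occ = build_fm_index(transformed)
    for read in ("GATTACA", "TTA", "CCC"):
        print(read, count_matches(read, C, occ, len(transformed)))
```

Each step of the search narrows a range of rows in the sorted rotation matrix, so the cost of counting matches depends on the read length rather than the genome length. That property is what makes this family of indexes attractive for mapping many short reads to a large reference.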
Get in touch with us
Texas Children's Hospital researchers can submit a ticket here.
Baylor College of Medicine researchers can submit a ticket here.
Researchers from other institutions can contact us at researchdata@texaschildrens.org