Differential Gene Expression Analysis in Alzheimer’s Disease

Client or Institution: Reproducible Research Project
Project Goal: Identify and visualize differentially expressed genes (DEGs) in Alzheimer’s disease using RNA-Seq data and statistical methods
Tools Used: R, DESeq2, ggplot2, ComplexHeatmap, Python (preprocessing)
Duration: 3 weeks
Outcome: Discovered top DEGs in Alzheimer’s disease; created a public Shiny app to interactively explore MA plots, dispersion, heatmaps, PCA, and gene counts
Dataset: GSE53697
Web App: Explore Webapp

From Preprocessing to Shiny App Exploration Using DESeq2

Abstract

This project presents a full workflow for differential gene expression (DGE) analysis using RNA-Seq data from the GSE53697 dataset. Quantile normalization and PCA-based exploratory analysis were used to detect and exclude outlier samples. Differential analysis was then carried out on the raw counts using the DESeq2 package in R.

The results were visualized through a variety of plots (MA, dispersion, volcano, PCA, and heatmaps) that highlight key transcriptional changes between Alzheimer’s patients and healthy controls. A Shiny dashboard was also created to allow collaborators and researchers to explore the data interactively.

🧠 Background & Problem Statement

Understanding how gene expression is altered in Alzheimer’s disease is crucial for identifying potential biomarkers and therapeutic targets. However, RNA-Seq datasets are high-dimensional and noisy, so they require careful preprocessing, normalization, and statistical testing.

This project addresses the full spectrum of this challenge:

Preprocess the GSE53697 RNA-Seq dataset
Remove noise/outliers
Perform DGE analysis
Deliver interactive plots for better interpretation

🗂 Dataset

GSE53697 is a public gene expression dataset from the GEO repository. It contains RNA-Seq data from human Alzheimer’s disease and control samples.

📌 Preprocessing

Quantile normalization (via preprocessCore in R)
PCA used to identify and exclude strong outliers
Raw counts retained for DESeq2 analysis
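
To make the normalization step concrete, here is a minimal Python sketch of quantile normalization. The project itself used preprocessCore in R; the function name `quantile_normalize` and the toy matrix are illustrative assumptions, and ties are broken by sort order here, which a production implementation may handle differently.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix given as a list of rows."""
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    # Column-wise view: one list of values per sample.
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / n_samples for i in range(n_genes)
    ]
    # Replace each value with the mean for its rank within its own sample.
    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, gene_idx in enumerate(order):
            normalized[gene_idx][j] = rank_means[rank]
    return normalized


# Toy matrix (4 genes x 3 samples); values are made up for illustration.
counts = [
    [5, 4, 3],
    [2, 1, 4],
    [3, 4, 6],
    [4, 2, 8],
]
normalized = quantile_normalize(counts)
```

After this transformation every sample shares the same distribution of values, which suits exploratory plots such as PCA. DESeq2 expects unnormalized counts, which is why the raw counts are kept for the differential analysis.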

⚙️ Workflow Overview

1. Exploratory Data Analysis (EDA)

PCA Scatter Plot

PCA was used to detect outlier samples and observe clustering by disease state.

Clear separation between Alzheimer’s and Control samples.
An earlier PCA plot shows potential outliers.
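
One way to turn "potential outliers on a PCA plot" into a repeatable check is sketched below in Python. This is an assumption-laden illustration, not the criterion used in the project: it computes PC1 scores by power iteration on the sample-by-sample Gram matrix and flags samples whose modified z-score (median and MAD based) exceeds 3.5, a commonly cited rule of thumb. The function names, sample names, and expression values are hypothetical.

```python
import math
import statistics


def first_pc_scores(samples, n_iter=200):
    """PC1 scores for a samples x genes matrix, via power iteration."""
    n = len(samples)
    n_genes = len(samples[0])
    # Center each gene across samples.
    means = [sum(s[g] for s in samples) / n for g in range(n_genes)]
    centered = [[s[g] - means[g] for g in range(n_genes)] for s in samples]
    # Sample-by-sample Gram matrix: small even with thousands of genes.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered]
            for ri in centered]
    # Power iteration converges to the leading eigenvector of the Gram matrix.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(n_iter):
        new = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    eigenvalue = sum(
        v * sum(g * w for g, w in zip(row, vec)) for v, row in zip(vec, gram)
    )
    # Scores on PC1 are the eigenvector scaled by sqrt(eigenvalue).
    return [math.sqrt(eigenvalue) * v for v in vec]


def flag_outliers(names, scores, threshold=3.5):
    """Flag samples whose modified z-score on PC1 exceeds the threshold."""
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    if mad == 0:
        return []
    return [name for name, s in zip(names, scores)
            if 0.6745 * abs(s - med) / mad > threshold]


# Hypothetical log-scale expression values (6 samples x 4 genes).
sample_names = ["AD_1", "AD_2", "AD_3", "CTL_1", "CTL_2", "CTL_3"]
expression = [
    [5.0, 6.1, 7.0, 3.2],
    [5.2, 6.0, 7.1, 3.0],
    [5.1, 5.9, 6.9, 3.1],
    [4.9, 6.2, 7.2, 3.3],
    [5.0, 6.0, 7.0, 3.1],
    [9.5, 1.0, 12.0, 8.0],  # deliberately aberrant sample
]
flagged = flag_outliers(sample_names, first_pc_scores(expression))
```

A numeric rule like this is best used alongside visual inspection of the PCA plot, since a sample can be biologically distinct without being a technical outlier.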

2. Differential Expression Analysis with DESeq2

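DESeq2 was used to fit negative binomial generalized linear models to the raw count data
Statistical testing produced, for each gene, a log2 fold change (log2FoldChange), a p-value, and a p-value adjusted for multiple testing (padj; Benjamini-Hochberg by default in DESeq2)
Significant DEGs were defined by |log2FC| > 1 and padj < 0.05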
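
The significance rule above can be sketched in Python as follows. The analysis itself was run with DESeq2 in R, whose default adjustment method is Benjamini-Hochberg; DESeq2 also applies independent filtering before adjustment, which this sketch omits. The gene names, fold changes, and p-values are hypothetical placeholders.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene results.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
log2fc = [2.3, -1.6, 1.4, 0.4, -0.2]
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]

padj = benjamini_hochberg(pvals)
# Thresholds taken from the text: |log2FC| > 1 and padj < 0.05.
significant = [
    g for g, lfc, q in zip(genes, log2fc, padj) if abs(lfc) > 1 and q < 0.05
]
```

In this toy example GENE_C has a raw p-value below 0.05 and a large fold change, but its adjusted p-value rises above 0.05, so it is not called significant. That is the practical effect of multiple testing correction.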

📊 Visualizations

📈 Dispersion Plot

Plots gene-wise dispersion estimates against the mean of normalized counts to assess model fit and biological variability.
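
For intuition about what "dispersion" means here, the Python sketch below computes a rough method-of-moments estimate under the negative binomial relationship variance = mean + dispersion × mean². This is only an illustration: DESeq2 estimates dispersions by likelihood and shrinks them toward a fitted trend, which this sketch does not attempt. The function name and the counts are hypothetical.

```python
import statistics


def moments_dispersion(normalized_counts):
    """Rough dispersion estimate from var = mean + alpha * mean**2."""
    mean = statistics.mean(normalized_counts)
    var = statistics.variance(normalized_counts)  # sample variance (n - 1)
    if mean == 0:
        return 0.0
    # Clamp at zero: variance at or below the mean is Poisson-like.
    return max((var - mean) / mean ** 2, 0.0)


# Hypothetical normalized counts for one gene within one condition.
overdispersed = moments_dispersion([90, 110, 100, 130, 70])
poisson_like = moments_dispersion([98, 102, 100, 95, 105])
```

A dispersion of 0.04 corresponds to a biological coefficient of variation of about 20% (the square root of the dispersion) for highly expressed genes.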

🧪 MA Plot

Plots each gene’s log2 fold change (y-axis) against its mean of normalized counts (x-axis).
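
The two coordinates of an MA plot can be illustrated with a short Python sketch. It uses a simple ratio of group means with a pseudocount, which is an assumption made for clarity; DESeq2 reports fold changes from its fitted model rather than from a raw ratio. The function name and counts are hypothetical.

```python
import math


def ma_point(case_counts, control_counts, pseudocount=1.0):
    """Return (mean of normalized counts, log2 fold change) for one gene."""
    all_counts = list(case_counts) + list(control_counts)
    mean_expression = sum(all_counts) / len(all_counts)
    case_mean = sum(case_counts) / len(case_counts)
    control_mean = sum(control_counts) / len(control_counts)
    # The pseudocount avoids division by zero for unexpressed genes.
    log2_fold_change = math.log2(
        (case_mean + pseudocount) / (control_mean + pseudocount)
    )
    return mean_expression, log2_fold_change


# Hypothetical normalized counts for one gene.
x, y = ma_point(case_counts=[31, 31, 31], control_counts=[7, 7, 7])
```

Genes with low mean expression tend to show noisier fold changes, which is why the MA plot is useful for spotting whether large fold changes are concentrated among weakly expressed genes.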

🌋 Volcano Plot

Plots statistical significance (p-value on a -log10 scale) against effect size (log2 fold change) so that genes can be prioritized on both at once.
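
A volcano plot point and its color category can be derived as in the following Python sketch. The cutoffs reuse the DEG definition from this write-up (|log2FC| > 1, padj < 0.05); the function name, the labels "up", "down", and "ns", and the use of padj rather than the raw p-value on the y-axis are assumptions.

```python
import math


def volcano_point(log2fc, padj, lfc_cutoff=1.0, alpha=0.05):
    """Return (x, y, label) for one gene on a volcano plot."""
    # Floor padj so that a reported value of 0 does not break the logarithm.
    y = -math.log10(max(padj, 1e-300))
    if padj < alpha and log2fc > lfc_cutoff:
        label = "up"
    elif padj < alpha and log2fc < -lfc_cutoff:
        label = "down"
    else:
        label = "ns"  # not significant under these cutoffs
    return log2fc, y, label


# Hypothetical genes: strong up, strong down, and a small significant change.
up = volcano_point(2.0, 0.001)
down = volcano_point(-1.8, 0.01)
small = volcano_point(0.5, 0.0001)
```

The third example shows why both axes matter: a gene can be highly significant yet change too little to pass the fold-change cutoff.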

🧬 Heatmap of Top DEGs

Shows clustered gene expression patterns across conditions and genders.
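
Heatmaps of top DEGs are often drawn on row-scaled values so that each gene is compared against its own average. The write-up does not state whether rows were scaled, so the Python sketch below is an assumption about a common preparation step, with hypothetical gene names and values.

```python
import statistics


def row_zscores(row):
    """Scale one gene's expression values to mean 0 and unit variance."""
    mean = statistics.mean(row)
    sd = statistics.stdev(row)
    if sd == 0:
        # A gene with constant expression carries no pattern to display.
        return [0.0] * len(row)
    return [(x - mean) / sd for x in row]


# Hypothetical log-scale expression for two genes across three samples.
expression = {
    "GENE_A": [2.0, 4.0, 6.0],
    "GENE_B": [7.0, 7.0, 7.0],
}
scaled = {gene: row_zscores(values) for gene, values in expression.items()}
```

Row scaling keeps a few highly expressed genes from dominating the color range, at the cost of hiding absolute expression levels.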

🧰 Outcome

Identified multiple upregulated genes (e.g., CP, CD44, SERPINA3)
Built an interactive Shiny app to explore:
- PCA
- Volcano plots
- MA & dispersion plots
- Count plots for any gene
- Heatmaps with annotation tracks
The Shiny dashboard loads a saved .rds object and renders plots with ggplot2, plotly, ComplexHeatmap, and shinydashboard

💡 Lessons Learned

PCA was essential for quality control and sample exclusion
Outlier removal substantially improved the clarity of DE patterns
Automating Shiny dashboard deployment provides an efficient way to share complex analyses

💬 Discussion

This project offers a reproducible, interpretable, and shareable pipeline for DGE analysis. It highlights best practices in both statistical rigor (DESeq2, multiple testing correction) and communication (interactive visualization). The analysis aligns well with reproducibility standards and can serve as a blueprint for similar transcriptomics projects.

📣 Call to Action

Got a GEO dataset you’d like to explore?
🔬 I can help you turn raw expression data into clear, interactive insights.
💬 Start a conversation to build your own DESeq2 pipeline or Shiny dashboard.
