biology – Your Bioinformatics Developer

Ensemble Feature Selection for Robust Multi-Omics Biomarker Discovery

Feb 26, 2026

in Bioinformatics Data Analytics, Bioinformatics Workflows, Pipelines and Tools, Omics & Multi-Omics Analysis, Projects

A Systems-Level Approach to Biomarker Discovery

Abstract: Biomarker discovery often suffers from instability: different feature selection algorithms produce different gene lists, which reduces reproducibility and clinical trust. This project addressed that challenge by designing and implementing a comprehensive ensemble feature selection framework that integrates multiple ranking strategies…
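The excerpt above is truncated and does not say which ranking strategies the project combines or how. As a rough illustration of the general idea only, the sketch below assumes a simple mean-rank (Borda-style) aggregation: each selector scores every feature, scores are converted to ranks, and features are ordered by their average rank. The function names, gene names, and scores are all hypothetical and are not taken from the project.

```python
from statistics import mean


def rank_features(scores):
    """Convert {feature: score} into {feature: rank}, where rank 1 is the
    highest score. Ties are broken alphabetically by feature name."""
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: pos for pos, feature in enumerate(ordered, start=1)}


def ensemble_rank(score_tables):
    """Order features by their mean rank across several selectors.

    Assumption: every selector scores the same set of features and a
    higher score always means a more important feature.
    """
    if not score_tables:
        raise ValueError("at least one score table is required")
    features = set(score_tables[0])
    for table in score_tables[1:]:
        if set(table) != features:
            raise ValueError("all selectors must score the same features")
    rankings = [rank_features(table) for table in score_tables]
    mean_ranks = {f: mean(r[f] for r in rankings) for f in features}
    # Lower mean rank = more consistently important across selectors.
    return sorted(features, key=lambda f: (mean_ranks[f], f))


# Hypothetical importance scores from three different selectors.
t_test = {"GENE_A": 4.1, "GENE_B": 2.0, "GENE_C": 3.5, "GENE_D": 0.3}
forest = {"GENE_A": 0.30, "GENE_B": 0.10, "GENE_C": 0.45, "GENE_D": 0.15}
lasso = {"GENE_A": 0.8, "GENE_B": 0.0, "GENE_C": 0.6, "GENE_D": 0.1}

consensus = ensemble_rank([t_test, forest, lasso])
print(consensus)
```

Averaging ranks rather than raw scores sidesteps the fact that different selectors report importance on incomparable scales, which is one common reason to aggregate at the rank level.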

Benchmarking Germline and Somatic Variant Calling Pipelines Using CoSAP and Galaxy

Jun 5, 2025

—

by

Cyrille Njume

in Bioinformatics Data Analytics, Bioinformatics Workflows, Pipelines and Tools, Omics & Multi-Omics Analysis, Projects

A Comparative Study with 9 Pipelines on Real Exome Data

Abstract: This project evaluated nine variant calling pipelines using whole exome sequencing (WES) data. It involved combinations of two mappers (BWA, Bowtie2) and four variant callers (DeepVariant, HaplotypeCaller, Strelka, SomaticSniper) across both germline and…

Getting Started with Trimmomatic for Illumina Sequencing

Apr 30, 2024

—

by

Cyrille Njume

in Uncategorized

Trimmomatic preprocesses Illumina NGS data by removing adapter contamination and trimming low-quality bases. It offers several quality filtering methods and supports paired-end reads. It is open source, and trimming reads before alignment can improve mapping accuracy and the reliability of downstream analyses in genomics research.

Quality Control with MultiQC: A Complete Guide

Apr 30, 2024

—

by

Cyrille Njume

in Uncategorized

MultiQC is a bioinformatics tool that consolidates results from multiple samples into a single HTML report, which streamlines the review of high-throughput sequencing data. It supports over 100 tools and provides summary statistics and visualizations. Automating this step saves time and reduces manual errors across a range of genomic workflows.

Quality Control of High-Volume Sequencing Data with FastQC: A Complete Guide

Apr 26, 2024

—

by

Cyrille Njume

in Uncategorized

High-throughput sequencing technologies have transformed genomics by enabling massive data generation in record time. But with large volumes of data comes the responsibility of ensuring its quality. This is where FastQC steps in, offering a robust quality control solution tailored for sequencing data.

🔍 What is FastQC?

FastQC is a Java-based application designed to analyze sequencing…

Exploring Essential Data Types and Formats in Bioinformatics: Origins and Applications

Apr 23, 2024

—

by

Cyrille Njume

in Uncategorized

Bioinformatics merges biology and computational science to analyze complex biological data, and it underpins genomics, transcriptomics, and proteomics. Understanding the main data types, such as sequencing data, and the file formats that store them is essential for effective analysis. Each format addresses a specific need in data management and interpretation, which supports discoveries in the life sciences and personalized medicine.

Gene Set Enrichment and KEGG Pathway Analysis Using ClusterProfiler

Jan 20, 2024

—

by

Cyrille Njume

in Bioinformatics Data Analytics, Custom Solutions with Python and R Programming, Data Visualization & Dashboards, Projects

This project applied KEGG-based gene set enrichment analysis to DESeq2 results to visualize biological pathway alterations in Alzheimer’s disease. Using R and ClusterProfiler, enriched pathways were identified and visualized, revealing significant immune and neurodegenerative responses. The findings could inform future research and biomarker discovery in Alzheimer’s disease.

BIOPRED: A Machine Learning-Based Web Application for Accurate Bioactivity Prediction, Drug Repurposing, and Molecular Docking

Jun 5, 2023

—

by

Cyrille Njume

in Bioinformatics Data Analytics, Custom Solutions with Python and R Programming, Machine Learning & Bioinformatics Consulting, Projects

BIOPRED is a machine learning-driven web application developed for predicting drug-target interactions and supporting molecular docking. Using ChEMBL data, it applies several algorithms to both regression and classification tasks with high accuracy. Researchers can enter SMILES strings and get bioactivity predictions, which supports drug repurposing efforts.
