What should be the normalization protocol for RNA seq data for WGCNA?
sukeshinik5 ▴ 20
India

Hello, I am performing WGCNA on RNA sequencing data for a cohort. We are struggling to choose a normalization method that will not interfere with the correlation with the traits.

1. Which method is most appropriate for normalizing RNA-seq data for WGCNA? I have come across analyses where vst and rlog are used, but with little explanation of why DESeq2 median-of-ratios normalization or quantile normalization was not used instead.

2. We also have 4 separate sequencing batches, for which I am using removeBatchEffect after normalization.

3. In the case of vst/rlog, should the transformation be performed on a matrix or on a DESeq2 object?

Below is the code for the different approaches I have tried. Could you please correct me where it is wrong and comment on which one to choose?

METHOD 1: USING CPM (library:edgeR)

data_filt <- cpm(data_filt, log=TRUE)

data_adj1 <- removeBatchEffect(data_filt, Batch = coldata$Batch, covariates=NULL[,-1])

----------------------------------------------------------------------

METHOD 2: USING QUANTILE NORMALIZATION (library:preprocessCore)

QST<- preprocessCore::normalize.quantiles(data_filt, copy = TRUE, keep.names = TRUE)

rownames( QST) <- rownames(data_filt)

colnames( QST) <- colnames(data_filt)

data_adj <- removeBatchEffect(QST, Batch = coldata$Batch, covariates=NULL[,-1])

------------------------------------------------------------------------

METHOD 3: USING DESEQ2 (library:DESeq2)

dds <- DESeqDataSetFromMatrix(countData = data_filt, colData = coldata, design = ~1) #~1 because i have a cohort and not case control

dds <- DESeq(dds)

norm_count <- counts(dds, normalized = TRUE)

data_adj <- removeBatchEffect(norm_count, batch = coldata$Batch, covariates=NULL[,-1])

------------------------------------------------------------------------

METHOD 4: USING VST (library:DESeq2)

dds <- DESeqDataSetFromMatrix(countData = data_filt, colData = coldata, design = ~1)

dds <- DESeq(dds)

vsd <-varianceStabilizingTransformation(dds) #requires matrix or DESeq object

vsd<-getVarianceStabilizedData(vsd)

data_adj1 <- removeBatchEffect(vsd, batch = coldata$Batch, covariates=NULL[,-1])

-------------------------------------------------------------------------

METHOD 5: USING RLOG (library: DESeq2)

rlog <- rlogTransformation(data_filt, blind = FALSE)

data_adj<- removeBatchEffect(rlog, batch = coldata$Batch, covariates = NULL[,-1])

--------------------------------------------------------------------------

Your help will be really appreciated in this matter, Thank you, Sukeshini K

ATpoint ★ 4.5k
Germany

WGCNA is still not part of Bioconductor (see the earlier thread "Batch adjustment for cohort based RNA seq data"), so please check its documentation for what the authors recommend. In general, note that batch regression should be done on log2-scale data, so Method 3, which passes linear-scale normalized counts to removeBatchEffect, is wrong per se. For where to find the WGCNA documentation at the moment, see https://bioinformatics.stackexchange.com/questions/21885/where-to-access-the-wgcna-tutorial-documents-horvath-lab-site-down/21886#21886
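
To illustrate the point about doing batch regression on a log2-like scale, here is a minimal sketch. It is not taken from the WGCNA documentation and is not a recommendation of one transformation over another. The data are simulated with `makeExampleDESeqDataSet`, and the `Batch` labels (4 batches of 3 samples) are hypothetical stand-ins for the real `coldata$Batch`.

```r
library(DESeq2)  # also attaches SummarizedExperiment, which provides assay()
library(limma)

set.seed(1)
# Simulated stand-in for the real count matrix: 2000 genes x 12 samples
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)
# Hypothetical batch labels: 4 sequencing batches of 3 samples each
dds$Batch <- factor(rep(paste0("B", 1:4), each = 3))
design(dds) <- ~ 1  # cohort without a case/control contrast, as in the question

# Variance-stabilizing transformation on the DESeq2 object;
# the result is on a log2-like scale
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
expr <- assay(vsd)

# Batch regression on the transformed values, not on normalized counts
expr_adj <- removeBatchEffect(expr, batch = dds$Batch)

# WGCNA expects samples in rows and genes in columns
datExpr <- t(expr_adj)
```

Note that the argument is `batch` in lower case. After the adjustment, the per-gene mean is the same in every batch, which is a quick sanity check on real data as well.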


Thank you so much for your response. Can you please tell me what type of preprocessing should be expected for any gene correlation analysis in RNA sequencing data?


Read the WGCNA docs, please. Just repeating the question is basically ignoring what I just wrote.


I apologize if you felt that I had ignored the response. I am new to the technical terms and the analysis. From the above, what I understood is that rlog does a log2 transformation, and vst also mostly performs a log2 transformation, so these methods are more appropriate. Regarding the WGCNA tutorial, it works with microarray datasets and uses RMA; if I quantile normalize, log-transform, and then correct for batch effects, will that be appropriate? (I understand WGCNA is not part of Bioconductor; I only need to know whether what I am doing is technically incorrect.)

https://alexslemonade.github.io/refinebio-examples/04-advanced-topics/network-analysis_rnaseq_01_wgcna.html (suggests vst)

Thank you and apologies for bearing my silly questions, I truly appreciate your support, Sukeshini K
