Article
Integrative analysis of RNA-Seq and DNA-Seq data via boosting
Published: September 6, 2019
Introduction and question: Identifying SNPs and InDels by comparing tumor and normal samples often suffers from low power and a high false positive rate. One way to address this problem is to increase the sample size, but genome sequencing is expensive and the sample size is often limited, especially for rare diseases. Another way is to integrate data from other molecular levels. Assuming that a mutation at the DNA level leads to a modified transcription level, RNA-Seq and DNA-Seq data can be combined.
Material and methods: We present a boosting approach for performing such a combined analysis. In a first step, RNA-Seq data are analyzed to identify differentially expressed genes. The reciprocals of the resulting p-values are then used as weights in a likelihood-based boosting algorithm to identify SNPs and InDels in DNA-Seq data. The same weight is used for each SNP within a gene and within 200 kb upstream and downstream of the gene body to include regulatory regions. The boosting algorithm utilizes resampling techniques, and SNPs/InDels are selected based on inclusion frequencies. This approach was developed on simulated RNA-Seq and DNA-Seq data for tumor and control samples.
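The weighting step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It only shows how a reciprocal p-value from the RNA-Seq step could be assigned to every variant within a gene body extended by 200 kb on each side. The names `Gene`, `variant_weights`, `default_weight`, and `min_p` are hypothetical. The handling of overlapping windows (take the largest weight), of variants outside every window (weight 1, equivalent to p = 1), and of p-values of zero (floored at `min_p`) are assumptions not stated in the abstract, and the gene and variant coordinates are invented toy values.

```python
from dataclasses import dataclass

FLANK_BP = 200_000  # 200 kb upstream and downstream, as stated in the abstract


@dataclass
class Gene:
    name: str
    chrom: str
    start: int      # gene body start (bp)
    end: int        # gene body end (bp)
    p_value: float  # differential-expression p-value from the RNA-Seq step


def variant_weights(variants, genes, default_weight=1.0, min_p=1e-300):
    """Return one weight per variant: 1/p of a gene whose extended window
    (gene body +/- FLANK_BP) contains the variant.

    Assumptions (not from the abstract): a variant covered by several
    windows takes the largest weight; a variant covered by no window takes
    default_weight; p-values are floored at min_p to avoid division by zero.
    """
    weights = []
    for chrom, pos in variants:
        candidates = [
            1.0 / max(g.p_value, min_p)
            for g in genes
            if g.chrom == chrom
            and g.start - FLANK_BP <= pos <= g.end + FLANK_BP
        ]
        weights.append(max(candidates) if candidates else default_weight)
    return weights


# Toy data (invented for illustration only)
genes = [
    Gene("GENE_A", "chr1", 1_000_000, 1_050_000, p_value=0.001),
    Gene("GENE_B", "chr1", 5_000_000, 5_020_000, p_value=0.5),
]
variants = [("chr1", 900_000), ("chr1", 5_010_000), ("chr2", 100)]
weights = variant_weights(variants, genes)
```

In this toy example, the first variant lies in the upstream flank of the strongly differentially expressed GENE_A and receives a large weight, the second lies inside the body of GENE_B and receives a small one, and the third lies outside every window and keeps the default. No variant is removed; the weights would only shift how readily the boosting step selects each one.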
Results: On the simulated data, the integrated analysis of DNA-Seq and RNA-Seq data for identifying SNPs and InDels raised the power and reduced the false positive rate compared to an analysis of DNA-Seq data alone.
Discussion: A mutation at the DNA level does not necessarily lead to a changed transcription level; it may affect only the protein level. This type of mutation is down-weighted by the proposed boosting approach but can still potentially be identified, in contrast to approaches that filter mutations based on differentially expressed genes.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.