Abstract
Biclustering algorithms are unsupervised machine learning algorithms that find paired subsets of samples and variables exhibiting co-dependence in a dataset. While matrix-factorization-based biclustering is especially suited to revealing enrichment patterns in transcriptomic datasets, a full matrix-factorization-based biclustering pipeline has been lacking. Here we present mfBiclust, an R package with a Shiny-based GUI that enables users to apply recently developed biclustering pipelines to transcriptomics datasets. In our general matrix-factorization pipeline, a data matrix is approximated as the product of two factors. The optimal number of biclusters for a dataset can be estimated by bi-cross-validating truncated singular value decompositions. Biclustering results can be visualized and exported, facilitating functional characterization of the observed biclusters. mfBiclust is thus potentially useful for analyzing data from genomics and transcriptomics assays.
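To make the rank-selection step concrete, the following is a minimal NumPy sketch of bi-cross-validation for truncated singular value decompositions. It is not mfBiclust's API: the package is written in R, and the function names (`truncated_svd`, `bcv_error`), the fold counts, and the block-holdout scheme are assumptions chosen for illustration. A block of rows and columns is held out, a rank-k truncated SVD is fitted to the block that shares no rows or columns with it, and the held-out block is predicted from the remaining two blocks.

```python
import numpy as np


def truncated_svd(M, k):
    """Return the rank-k truncated SVD factors (U_k, s_k, Vt_k) of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def bcv_error(X, k, row_folds=2, col_folds=2, seed=0):
    """Mean squared held-out error of a rank-k fit under bi-cross-validation.

    Hypothetical helper: for every (row fold, column fold) pair, X is split as
        [[A, B],
         [C, D]]
    with A held out, and A is predicted as B @ pinv(D_k) @ C, where D_k is
    the rank-k truncated SVD of D.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    row_groups = np.array_split(rng.permutation(n), row_folds)
    col_groups = np.array_split(rng.permutation(p), col_folds)

    total = 0.0
    for rows in row_groups:
        for cols in col_groups:
            row_mask = np.zeros(n, dtype=bool)
            row_mask[rows] = True
            col_mask = np.zeros(p, dtype=bool)
            col_mask[cols] = True

            A = X[np.ix_(row_mask, col_mask)]    # held-out block
            B = X[np.ix_(row_mask, ~col_mask)]   # held-out rows, kept columns
            C = X[np.ix_(~row_mask, col_mask)]   # kept rows, held-out columns
            D = X[np.ix_(~row_mask, ~col_mask)]  # block used for the fit

            U, s, Vt = truncated_svd(D, k)
            # Invert only singular values that are safely above zero.
            inv = np.zeros_like(s)
            nonzero = s > 1e-12
            inv[nonzero] = 1.0 / s[nonzero]
            D_k_pinv = (Vt.T * inv) @ U.T        # pseudo-inverse of D_k

            A_hat = B @ D_k_pinv @ C
            total += np.sum((A - A_hat) ** 2)

    # Every entry of X is held out exactly once across all fold pairs.
    return total / X.size


if __name__ == "__main__":
    # Synthetic example: a rank-2 signal plus noise (illustrative data only).
    rng = np.random.default_rng(42)
    signal = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 30))
    X = signal + 0.1 * rng.normal(size=signal.shape)

    errors = {k: bcv_error(X, k) for k in range(0, 6)}
    best_k = min(errors, key=errors.get)
    print(errors, best_k)
```

In this sketch the candidate number of biclusters is the rank k with the smallest held-out error. The factors returned by `truncated_svd` also illustrate the product-of-two-factors approximation: `(U * s) @ Vt` is the rank-k approximation of the input matrix.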