


clonealign assigns single-cell RNA-seq expression to cancer clones by probabilistically mapping RNA-seq to clone-specific copy number profiles using reparametrization gradient variational inference. This is particularly useful when clones have been inferred using ultra-shallow single-cell DNA-seq, where SNV analysis is not possible.

Getting started


  1. Introduction to clonealign: overview of clonealign, including data preparation, model fitting, plotting results, and advanced inference control
  2. Preparing copy number data for input to clonealign: instructions for taking region/range-specific copy number profiles and converting them to gene- and clone-specific copy numbers for input to clonealign
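
As a rough illustration of the conversion described in the second vignette, the sketch below maps segment-level copy number to genes using base R only. The toy data frames, their column names, and the helper `gene_copy_number()` are hypothetical and not part of clonealign. Genes that straddle a breakpoint or fall outside every segment are set to NA here, which is only one possible choice.

```r
# Hypothetical toy input: copy number per genomic segment for clones A, B, C
segments <- data.frame(
  chr   = c("1", "1", "2"),
  start = c(1, 5001, 1),
  end   = c(5000, 10000, 8000),
  A     = c(2, 3, 2),
  B     = c(2, 1, 2),
  C     = c(4, 3, 1),
  stringsAsFactors = FALSE
)

# Hypothetical toy input: gene coordinates
genes <- data.frame(
  gene  = c("g1", "g2", "g3", "g4"),
  chr   = c("1", "1", "1", "2"),
  start = c(100, 4800, 6000, 500),
  end   = c(900, 5200, 7000, 1500),
  stringsAsFactors = FALSE
)

clone_names <- c("A", "B", "C")

# Build a gene-by-clone matrix. A gene takes a segment's copy number only
# when it lies entirely inside exactly one segment; otherwise it stays NA.
gene_copy_number <- function(genes, segments, clone_names) {
  cn <- matrix(NA_real_,
               nrow = nrow(genes), ncol = length(clone_names),
               dimnames = list(genes$gene, clone_names))
  for (i in seq_len(nrow(genes))) {
    hit <- which(segments$chr == genes$chr[i] &
                 segments$start <= genes$start[i] &
                 segments$end >= genes$end[i])
    if (length(hit) == 1) {
      cn[i, ] <- unlist(segments[hit, clone_names])
    }
  }
  cn
}

cnv_mat <- gene_copy_number(genes, segments, clone_names)
print(cnv_mat)
```

Genes left as NA (here `g2`, which spans a breakpoint) could be dropped with `cnv_mat[complete.cases(cnv_mat), ]` before going further.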


clonealign is built using Google’s TensorFlow and so requires installation of the R package tensorflow:

tensorflow::install_tensorflow(extra_packages = "tensorflow-probability", version = "1.12.0")

Note that clonealign uses the TensorFlow Probability library, which requires TensorFlow version >= 1.12.0; the command above installs both.
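
To confirm that the version requirement is met, something like the following sketch may help. The helper `meets_min_version()` is hypothetical. The check assumes that `tensorflow::tf_version()` reports only the major and minor version (and returns NULL when no TensorFlow installation is found), so it compares against "1.12" rather than "1.12.0".

```r
# Hypothetical helper: TRUE if `installed` is at least `required`
meets_min_version <- function(installed, required = "1.12") {
  utils::compareVersion(as.character(installed), required) >= 0
}

# Only query TensorFlow if the R package is available
if (requireNamespace("tensorflow", quietly = TRUE)) {
  tf_ver <- tensorflow::tf_version()  # assumed NULL when TensorFlow is not found
  if (is.null(tf_ver)) {
    message("No TensorFlow installation found")
  } else if (!meets_min_version(tf_ver)) {
    message("TensorFlow ", as.character(tf_ver), " is older than 1.12")
  } else {
    message("TensorFlow ", as.character(tf_ver), " meets the requirement")
  }
}
```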

clonealign can then be installed from GitHub:

install.packages("devtools") # If not already installed
devtools::install_github("kieranrcampbell/clonealign")


clonealign accepts either a cell-by-gene matrix of raw counts or a SingleCellExperiment with a counts assay as gene expression input. It also requires a gene-by-clone matrix or data.frame giving the copy number of each gene in each clone. Cells are then assigned to clones by calling:

cal <- clonealign(gene_expression_data, # matrix or SingleCellExperiment
                  copy_number_data)     # matrix or data.frame
print(cal)

A clonealign_fit for 200 cells, 100 genes, and 3 clones
To access clone assignments, call x$clone
To access ML parameter estimates, call x$ml_params

head(cal$clone)

[1] "B" "C" "C" "B" "C" "B"
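
To make the expected input shapes concrete, here is a minimal sketch that builds toy inputs with the same dimensions as the example output (200 cells, 100 genes, 3 clones) using base R. The gene and cell names are made up, and the counts are drawn independently of copy number, so they illustrate the layout only and would not give a meaningful fit. Checking that genes appear in the same order in both inputs is a cheap sanity check, whether or not the package requires it.

```r
set.seed(1)

n_cells <- 200
n_genes <- 100
clone_names <- c("A", "B", "C")
gene_names <- paste0("gene_", seq_len(n_genes))   # hypothetical gene IDs
cell_names <- paste0("cell_", seq_len(n_cells))   # hypothetical cell IDs

# Gene-by-clone matrix of copy numbers (random values between 1 and 4)
copy_number_data <- matrix(
  sample(1:4, n_genes * length(clone_names), replace = TRUE),
  nrow = n_genes,
  dimnames = list(gene_names, clone_names)
)

# Cell-by-gene matrix of raw counts (random Poisson draws)
gene_expression_data <- matrix(
  rpois(n_cells * n_genes, lambda = 5),
  nrow = n_cells,
  dimnames = list(cell_names, gene_names)
)

# Columns of the expression matrix and rows of the copy number matrix
# both index genes, so they should agree
stopifnot(
  ncol(gene_expression_data) == nrow(copy_number_data),
  identical(colnames(gene_expression_data), rownames(copy_number_data))
)

# With clonealign and TensorFlow installed, the call shown above would be:
# cal <- clonealign(gene_expression_data, copy_number_data)
```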


clonealign: statistical integration of independent single-cell RNA and DNA sequencing data from human cancers, Genome Biology 2019


Kieran R Campbell, University of British Columbia