Trajectory Inference with VIA#

VIA (Velocity and Topology Inference Algorithm) is a single-cell trajectory inference method for topology construction, pseudotime estimation, automatic terminal-state prediction, and visualization of temporal gene dynamics along lineages. This tutorial follows the example data and analysis workflow provided by the VIA authors, while using the OmicVerse wrapper interface for a more direct AnnData workflow.

If you use VIA in your research, please cite:

Generalized and scalable trajectory inference in single-cell omics data with VIA

Code repository: ShobiStassen/VIA

Reproducible Colab tutorial: https://colab.research.google.com/drive/1A2X23z_RLJaYLbXaiCbZa-fjNbuGACrD?usp=sharing

%matplotlib inline

from pathlib import Path
import warnings

import scanpy as sc
import matplotlib.pyplot as plt
import omicverse as ov

warnings.filterwarnings("ignore", category=FutureWarning)

ov.plot_set(font_path='Arial')
Path("figures").mkdir(exist_ok=True)

%load_ext autoreload
%autoreload 2
🔬 Starting plot initialization...
Using already downloaded Arial font from: /var/folders/rv/3jnfbs0d6r7d0c5bfj7ft5k00000gn/T/omicverse_arial.ttf
Registered as: Arial
🧬 Detecting GPU devices…
✅ Apple Silicon MPS detected
    • [MPS] Apple Silicon GPU - Metal Performance Shaders available

   ____            _     _    __                  
  / __ \____ ___  (_)___| |  / /__  _____________ 
 / / / / __ `__ \/ / ___/ | / / _ \/ ___/ ___/ _ \ 
/ /_/ / / / / / / / /__ | |/ /  __/ /  (__  )  __/ 
\____/_/ /_/ /_/_/\___/ |___/\___/_/  /____/\___/                                              

🔖 Version: 2.2.1rc1   📚 Tutorials: https://omicverse.readthedocs.io/
✅ plot_set complete.

Data Loading and Preprocessing#

This tutorial uses the scRNA_hematopoiesis hematopoietic development dataset provided by the VIA authors. The data have already been normalized and log-transformed, but not scaled. After loading the dataset, we compute PCA; VIA will use adata.obsm["X_pca"] as the cell feature space.

adata = ov.single.scRNA_hematopoiesis()
sc.tl.pca(adata, svd_solver='arpack', n_comps=200)
adata
Cell type PRE_B2 has  515 cells
Cell type MONO1 has  186 cells
Cell type GMP has  28 cells
Cell type EOS2 has  3 cells
Cell type ERY4 has  53 cells
Cell type mDC (cDC) has  96 cells
Cell type GRAN1 has  3 cells
Cell type BASO1 has  2 cells
Cell type B_a1 has  27 cells
Cell type B_a4 has  8 cells
Cell type PRE_B3 has  2 cells
Cell type ERY3 has  104 cells
Cell type MONO2 has  35 cells
Cell type HSC1 has  2365 cells
Cell type MEP has  339 cells
Cell type ERY1 has  773 cells
Cell type HSC2 has  10 cells
Cell type MEGA1 has  12 cells
Cell type CMP has  968 cells
Cell type pDC has  204 cells
Cell type B_a2 has  1 cells
Cell type Nka3 has  5 cells
Cell type ERY2 has  40 cells
Cell type TCEL7 has  1 cells
AnnData object with n_obs × n_vars = 5780 × 14651
    obs: 'clusters', 'palantir_pseudotime', 'palantir_diff_potential', 'label'
    uns: 'cluster_colors', 'ct_colors', 'palantir_branch_probs_cell_types', 'pca'
    obsm: 'tsne', 'MAGIC_imputed_data', 'palantir_branch_probs', 'X_pca'
    varm: 'PCs'

Model Construction and Run#

VIA needs a cell feature matrix for trajectory inference, such as X_pca, X_scVI, or X_glue. Here we use X_pca and set adata_ncomps=80 to use the first 80 principal components.

We also specify the cell annotation column used for coloring and graph summaries through clusters; this tutorial uses adata.obs["label"]. If root_user is not provided, VIA attempts to choose the root cell automatically. Here the root cell index is set explicitly to reproduce the author example. The two-dimensional visualization coordinates are selected with basis; this dataset provides adata.obsm["tsne"].

For more parameter details, see the VIA documentation: https://pyvia.readthedocs.io/en/latest/Parameters and Attributes.html

v0 = ov.single.pyVIA(
    adata=adata,
    adata_key='X_pca',
    adata_ncomps=80,
    basis='tsne',
    clusters='label',
    knn=30,
    random_seed=4,
    root_user=[4823],
)

v0.run()
2026-05-23 00:15:53.139373	Running VIA over input data of 5780 (samples) x 80 (features)
2026-05-23 00:15:53.139707	Knngraph has 30 neighbors
2026-05-23 00:15:55.565303	Finished global pruning of 30-knn graph used for clustering at level of 0.15. Kept 46.3 % of edges. 
2026-05-23 00:15:55.577221	Number of connected components used for clustergraph  is 1
2026-05-23 00:15:55.705682	Commencing community detection
2026-05-23 00:15:56.048954	Finished community detection. Found 207 clusters.
2026-05-23 00:15:56.049939	Merging 189 very small clusters (<10)
2026-05-23 00:15:56.051191	Finished detecting communities. Found 18 communities
2026-05-23 00:15:56.051347	Making cluster graph. Global cluster graph pruning level: 0.15
2026-05-23 00:15:56.058392	Graph has 1 connected components before pruning
2026-05-23 00:15:56.058911	Graph has 2 connected components after pruning
2026-05-23 00:15:56.059177	Graph has 1 connected components after reconnecting
2026-05-23 00:15:56.059300	0.0% links trimmed from local pruning relative to start
2026-05-23 00:15:56.059308	62.6% links trimmed from global pruning relative to start
initial links 198 and final_links_n 198
2026-05-23 00:15:56.060286	component number 0 out of  [0]
2026-05-23 00:15:56.085227	The root index, 4823 provided by the user belongs to cluster number 2 and corresponds to cell type HSC1
2026-05-23 00:15:56.086122	Computing lazy-teleporting expected hitting times
2026-05-23 00:16:00.048271	Ended all multiprocesses, will retrieve and reshape
2026-05-23 00:16:00.063818	start computing walks with rw2 method
memory for rw2 hittings times  2. Using rw2 based pt
2026-05-23 00:16:01.292715	Identifying terminal clusters corresponding to unique lineages...
2026-05-23 00:16:01.292735	Closeness:[3, 5, 7, 9, 12]
2026-05-23 00:16:01.292742	Betweenness:[1, 2, 3, 4, 7, 9, 10, 12, 14, 15, 16]
2026-05-23 00:16:01.292746	Out Degree:[3, 4, 5, 7, 9, 10, 12, 14, 15, 16]
2026-05-23 00:16:01.292808	Cluster 5 had 3 or more neighboring terminal states [7, 9, 16] and so we removed cluster 7
2026-05-23 00:16:01.292824	We removed cluster 10 from the shortlist of terminal states
2026-05-23 00:16:01.292873	Terminal clusters corresponding to unique lineages in this component are [3, 4, 5, 9, 12, 14, 15, 16] 
2026-05-23 00:16:01.292883	Calculating lineage probability at memory 5
2026-05-23 00:16:02.555445	Cluster or terminal cell fate 3 is reached 15.0 times
2026-05-23 00:16:02.571441	Cluster or terminal cell fate 4 is reached 448.0 times
2026-05-23 00:16:02.587845	Cluster or terminal cell fate 5 is reached 85.0 times
2026-05-23 00:16:02.601460	Cluster or terminal cell fate 9 is reached 85.0 times
2026-05-23 00:16:02.613330	Cluster or terminal cell fate 12 is reached 443.0 times
2026-05-23 00:16:02.624441	Cluster or terminal cell fate 14 is reached 531.0 times
2026-05-23 00:16:02.635918	Cluster or terminal cell fate 15 is reached 507.0 times
2026-05-23 00:16:02.649561	Cluster or terminal cell fate 16 is reached 51.0 times
2026-05-23 00:16:02.652447	There are (8) terminal clusters corresponding to unique lineages {3: 'PRE_B2', 4: 'CMP', 5: 'ERY1', 9: 'ERY1', 12: 'MONO1', 14: 'pDC', 15: 'pDC', 16: 'ERY1'}
2026-05-23 00:16:02.652466	Begin projection of pseudotime and lineage likelihood
2026-05-23 00:16:03.561362	Cluster graph layout based on forward biasing
2026-05-23 00:16:03.561940	Starting make edgebundle viagraph...
2026-05-23 00:16:03.561954	Make via clustergraph edgebundle
2026-05-23 00:16:03.610889	Hammer dims: Nodes shape: (18, 2) Edges shape: (74, 3)
2026-05-23 00:16:03.611198	Graph has 1 connected components before pruning
2026-05-23 00:16:03.611704	Graph has 6 connected components after pruning
2026-05-23 00:16:03.612676	Graph has 1 connected components after reconnecting
2026-05-23 00:16:03.612811	52.7% links trimmed from local pruning relative to start
2026-05-23 00:16:03.612821	63.5% links trimmed from global pruning relative to start
initial links 74 and final_links_n 35
2026-05-23 00:16:03.613568	Start making edgebundle milestone with 150 milestones...This can be recomputed with make_edgebundle_milestone()
2026-05-23 00:16:03.613580	Start finding milestones
2026-05-23 00:16:04.777856	End milestones with 150
2026-05-23 00:16:04.777945	Will use via-pseudotime for edges, otherwise consider providing a list of numeric labels (single cell level) or via_object
2026-05-23 00:16:04.787501	Recompute weights
2026-05-23 00:16:04.806073	pruning milestone graph based on recomputed weights
2026-05-23 00:16:04.806701	Graph has 1 connected components before pruning
2026-05-23 00:16:04.806968	Graph has 1 connected components after pruning
2026-05-23 00:16:04.807017	Graph has 1 connected components after reconnecting
2026-05-23 00:16:04.809682	68.6% links trimmed from global pruning relative to start
2026-05-23 00:16:04.809739	regenerate igraph on pruned edges
2026-05-23 00:16:04.817431	Setting numeric label as single cell pseudotime for coloring edges
2026-05-23 00:16:04.828442	Making smooth edges
REMEMBER TO RE-INCLUDE the PLT.SHOW HERE - COMMENTING IT OUT FOR NOW
2026-05-23 00:16:05.086211	Time elapsed 10.7 seconds

Basic Visualization#

Before drawing trajectory plots, we first inspect the cell annotations in tSNE space.

fig, ax = plt.subplots(1,1,figsize=(4,4))
ov.pl.embedding(
    adata,
    basis="tsne",
    color=['label'],
    frameon=False,
    ncols=1,
    wspace=0.5,
    show=False,
    ax=ax
)
fig.savefig('figures/via_fig1.png',dpi=300,bbox_inches = 'tight')

VIA Graph#

VIA provides several trajectory visualization modes. We first show the cluster-graph-level trajectory abstraction: the left panel shows the true-label composition of each VIA cluster, and the right panel shows VIA pseudotime. This view is useful for checking the global topology, root position, and terminal branches.

fig, ax, ax1 = v0.plot_piechart_graph(
    clusters='label',
    cmap='Reds',
    dpi=80,
    show_legend=False,
    ax_text=False,
    fontsize=4
)
fig.savefig('figures/via_fig2.png',dpi=300,bbox_inches = 'tight')
#you can use `v0.model.single_cell_pt_markov` to extract the pseudotime
v0.get_pseudotime(v0.adata)
v0.adata
...the pseudotime of VIA added to AnnData obs named `pt_via`
AnnData object with n_obs × n_vars = 5780 × 14651
    obs: 'clusters', 'palantir_pseudotime', 'palantir_diff_potential', 'label', 'pt_via'
    uns: 'cluster_colors', 'ct_colors', 'palantir_branch_probs_cell_types', 'pca', 'REFERENCE_MANU', 'label_colors'
    obsm: 'tsne', 'MAGIC_imputed_data', 'palantir_branch_probs', 'X_pca'
    varm: 'PCs'

Gene / Feature Graph Visualization#

VIA can display gene expression changes along the inferred graph. Here the HNSW small-world graph built inside VIA is used to accelerate gene-expression smoothing, conceptually similar to MAGIC imputation. Selected marker genes are then projected onto the VIA cluster graph to inspect how hematopoietic programs are distributed along the trajectory.

gene_list_magic = ['IL3RA', 'IRF8', 'GATA1', 'GATA2', 'ITGA2B', 'MPO', 'CD79B', 'SPI1', 'CD34', 'CSF1R', 'ITGAX']
fig,axs=v0.plot_clustergraph(gene_list=gene_list_magic[:4],figsize=(12,4),)
fig.savefig('figures/via_fig2_1.png',dpi=300,bbox_inches = 'tight')
shape of transition matrix raised to power 3 (5780, 5780)
../../../_images/ee9a563f21607bc5381c41a482bf82453a13ddc87b9b5631eee1548d9691d687.png

Trajectory Projection#

Next, we project the VIA-inferred trajectory structure onto a two-dimensional embedding such as UMAP, PHATE, or tSNE. This section includes three views:

  • overlaying the high-level cluster graph abstraction on the embedding;

  • drawing a finer-grained vector-field view of cell directionality on the embedding;

  • drawing high-edge-resolution directed graphs or stream plots.

fig,ax1,ax2=v0.plot_trajectory_gams(basis='tsne',clusters='label',draw_all_curves=False)
fig.savefig('figures/via_fig3.png',dpi=300,bbox_inches = 'tight')
2026-05-23 00:16:48.944818	Super cluster 3 is a super terminal with sub_terminal cluster 3
2026-05-23 00:16:48.948985	Super cluster 4 is a super terminal with sub_terminal cluster 4
2026-05-23 00:16:48.949017	Super cluster 5 is a super terminal with sub_terminal cluster 5
2026-05-23 00:16:48.949041	Super cluster 9 is a super terminal with sub_terminal cluster 9
2026-05-23 00:16:48.949058	Super cluster 12 is a super terminal with sub_terminal cluster 12
2026-05-23 00:16:48.949075	Super cluster 14 is a super terminal with sub_terminal cluster 14
2026-05-23 00:16:48.949090	Super cluster 15 is a super terminal with sub_terminal cluster 15
2026-05-23 00:16:48.949106	Super cluster 16 is a super terminal with sub_terminal cluster 16
../../../_images/277c421bce31050f172d707ebb8fdee40ab4171e04fe463d0c141e128abf8331.png
fig,ax=v0.plot_stream(
    basis='tsne',
    clusters='label',
    density_grid=0.8,
    scatter_size=30,
    scatter_alpha=0.3,
    linewidth=0.5
)
fig.savefig('figures/via_fig4.png',dpi=300,bbox_inches = 'tight')
fig,ax=v0.plot_stream(
    basis='tsne',
    density_grid=0.8,
    scatter_size=30,
    color_scheme='time',
    linewidth=0.5,
    min_mass = 1,
    cutoff_perc = 5,
    scatter_alpha=0.3,
    marker_edgewidth=0.1,
    density_stream = 2,
    smooth_transition=1,
    smooth_grid=0.5
)
fig.savefig('figures/via_fig5.png',dpi=300,bbox_inches = 'tight')

Probabilistic Pathways#

VIA estimates lineage probabilities from the root state to terminal states. A higher lineage probability indicates greater potential for a cell to differentiate toward the corresponding terminal state. We first show probability distributions for all detected terminal lineages.

fig, axs = v0.plot_lineage_probability(figsize=(10, 5),ncol=4)
fig.savefig("figures/via_fig6.png", dpi=300, bbox_inches="tight")
2026-05-23 00:32:09.730409	Marker_lineages: [3, 4, 5, 9, 12, 14, 15, 16]
2026-05-23 00:32:09.732605	The number of components in the original full graph is 1
2026-05-23 00:32:09.732637	For downstream visualization purposes we are also constructing a low knn-graph 
2026-05-23 00:32:12.524489	Check sc pb 1.0 
f getting majority comp
2026-05-23 00:32:12.594621	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 3: [2, 1, 0, 8, 6, 13, 10, 3]
2026-05-23 00:32:12.594638	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 4: [2, 1, 0, 11, 4]
2026-05-23 00:32:12.594645	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 5: [2, 1, 7, 5]
2026-05-23 00:32:12.594650	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 9: [2, 1, 7, 5, 9]
2026-05-23 00:32:12.594655	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 12: [2, 1, 0, 8, 6, 12]
2026-05-23 00:32:12.594660	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 14: [2, 1, 0, 8, 14]
2026-05-23 00:32:12.594665	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 15: [2, 1, 0, 8, 6, 13, 15]
2026-05-23 00:32:12.594670	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 16: [2, 1, 7, 5, 16]
setting vmin to 0.0
2026-05-23 00:32:12.617573	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 3 along path: [2, 2, 2, 1, 10, 3, 3, 3, 3, 3]
setting vmin to 0.0
2026-05-23 00:32:12.624577	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 4 along path: [2, 2, 1, 4]
setting vmin to 0.0
2026-05-23 00:32:12.632017	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 5 along path: [2, 2, 1, 16, 5, 5, 5, 5, 5]
setting vmin to 0.0
2026-05-23 00:32:12.638892	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 9 along path: [2, 2, 2, 2, 2, 1, 7, 9, 9]
setting vmin to 0.0
2026-05-23 00:32:12.645880	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 12 along path: [2, 2, 2, 1, 10, 12, 12, 12, 12, 12]
setting vmin to 0.0
2026-05-23 00:32:12.958839	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 14 along path: [2, 2, 1, 10, 13, 14, 14, 14]
setting vmin to 0.0
2026-05-23 00:32:12.966461	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 15 along path: [2, 2, 2, 10, 13, 15, 15, 15, 15]
setting vmin to 0.0
2026-05-23 00:32:12.973863	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 16 along path: [2, 2, 2, 1, 16, 16, 16, 16]
../../../_images/db83f56b4784131c7acf5bb9fe1bb863575ddccfc82e1e25ce4a073cb7cc7823.png

A subset of terminal lineages can also be visualized. Here we take the first two terminal clusters detected by the fitted model, which avoids hard-coding cluster IDs that may vary across versions or random seeds.

fig,axs=v0.plot_lineage_probability(figsize=(8,4),marker_lineages=via_marker_lineages)
fig.savefig('figures/via_fig7.png',dpi=300,bbox_inches = 'tight')
2026-05-23 00:27:08.829986	Marker_lineages: [3, 4]
2026-05-23 00:27:08.832346	The number of components in the original full graph is 1
2026-05-23 00:27:08.832391	For downstream visualization purposes we are also constructing a low knn-graph 
2026-05-23 00:27:11.752665	Check sc pb 1.0 
f getting majority comp
2026-05-23 00:27:11.821942	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 3: [2, 1, 0, 8, 6, 13, 10, 3]
2026-05-23 00:27:11.821959	Cluster path on clustergraph starting from Root Cluster 2 to Terminal Cluster 4: [2, 1, 0, 11, 4]
setting vmin to 0.0
2026-05-23 00:27:11.833837	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 3 along path: [2, 2, 2, 1, 10, 3, 3, 3, 3, 3]
setting vmin to 0.0
2026-05-23 00:27:11.840987	Revised Cluster level path on sc-knnGraph from Root Cluster 2 to Terminal Cluster 4 along path: [2, 2, 1, 4]
../../../_images/edc85540167c985e8b81fdad94f559762e4c35ac8c9158b64708759971e96b04.png

Gene Dynamics#

VIA automatically infers gene-expression trends along detected lineages. These curves can be interpreted as dynamic marker-gene programs along a specific terminal lineage. Finally, we summarize selected gene trends on specified lineages with a heatmap.

fig,axs=v0.plot_gene_trend(gene_list=gene_list_magic,figsize=(8,6),)
fig.savefig('figures/via_fig8.png',dpi=300,bbox_inches = 'tight')
shape of transition matrix raised to power 3 (5780, 5780)
../../../_images/55f93d90f4b1e0803e4eaef7ff79601a25c57cf3f519b711826a6b1e33716878.png
fig,ax=v0.plot_gene_trend_heatmap(
    gene_list=gene_list_magic,figsize=(4,4),
    marker_lineages=via_heatmap_lineages
)
fig.savefig('figures/via_fig9.png',dpi=300,bbox_inches = 'tight')
shape of transition matrix raised to power 3 (5780, 5780)
../../../_images/4202c68744a8858b3751094e12b029644692392e74b384e6bccc0d1fd79af9de.png