Tutorial 3: Virtual Knockout
scTenifoldKnk builds a wild-type (WT) gene regulatory network from a single sample, creates a perturbed knockout (KO) tensor for the selected genes, and ranks genes by their differential regulation between WT and KO.
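To make the "perturbed knockout tensor" idea concrete, here is a minimal sketch on a toy network. It assumes that a default-style knockout zeroes the knocked-out gene's row in a copy of the WT network; the helper `knock_out`, the toy weights, and the choice of rows as the axis to zero are illustrative assumptions, not the library's confirmed implementation.

```python
import numpy as np
import pandas as pd

genes = ["NG-1", "NG-2", "NG-3", "NG-4"]
rng = np.random.default_rng(0)

# Toy WT network: rows and columns are genes, entries are edge weights.
weights = rng.normal(size=(4, 4))
np.fill_diagonal(weights, 0.0)  # no self-edges
wt = pd.DataFrame(weights, index=genes, columns=genes)


def knock_out(network, ko_genes):
    """Hypothetical helper: return a copy with the KO genes' rows zeroed."""
    ko = network.copy()
    ko.loc[ko_genes, :] = 0.0  # assumption: rows hold the gene's edges
    return ko


ko = knock_out(wt, ["NG-1"])
print(ko.round(2))
```

The WT network is left untouched, so the two matrices can later be compared gene by gene.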
In [1]:
from scTenifold import scTenifoldKnk, virtual_knockout
from scTenifold.data import get_test_df
data = get_test_df(n_cells=80, n_genes=40, random_state=2)
ko_genes = ["NG-1"]
One-call workflow
virtual_knockout() is the simplest entry point: a single call runs every step of the pipeline and returns the differential regulation table.
In [2]:
result = virtual_knockout(
    data,
    ko_genes=ko_genes,
    qc_kws={"min_lib_size": 1, "plot": False, "max_mito_ratio": 1.0, "min_percent": 0, "min_exp_avg": 0, "min_exp_sum": 0},
    network_kws={"n_nets": 2, "n_samp_cells": 25, "q": 0},
    ko_method="default",
    td_kws={"K": 2, "max_iter": 20, "init": "random"},
    ma_kws={"d": 2},
)
result.sort_values("p-value").head()
Removed 0 cells with lib size < 1
Removed 0 outlier cells from original data
Found mitochondrial genes: ['MT-1', 'MT-10', 'MT-2', 'MT-3', 'MT-4', 'MT-5', 'MT-6', 'MT-7', 'MT-8', 'MT-9']
Removed 0 samples from original data (mt genes ratio > 1.0)
Removed 0 genes expressed in less than 0 of data
Removed 0 genes with expression values: average < 0 or sum < 0
finish QC: WT
process qc finished in 0.005553757000598125 secs.
make_networks processing time: 0.11604772100054106
process nc finished in 0.11614250099955825 secs.
Using tensorly (40, 40, 2)
tensor_decomp processing time: 0.004349935999925947
process td finished in 0.004552725999928953 secs.
process ko finished in 0.00037601000076392666 secs.
manifold_alignment processing time: 0.0032367639996664366
process ma finished in 0.0032797840003695455 secs.
d_regulation processing time: 0.014066936999370228
process dr finished in 0.01411015700068674 secs.
Out[2]:
| | Gene | Distance | boxcox-transformed distance | Z | FC | p-value | adjusted p-value |
|---|---|---|---|---|---|---|---|
| 10 | NG-1 | 0.005371 | -3.511645 | 2.796169 | 35.303449 | 2.821321e-09 | 1.128528e-07 |
| 17 | NG-16 | 0.001262 | -4.059385 | 1.615881 | 1.949104 | 1.626834e-01 | 1.000000e+00 |
| 39 | NG-9 | 0.000879 | -4.177127 | 1.362167 | 0.945190 | 3.309468e-01 | 1.000000e+00 |
| 25 | NG-23 | 0.000711 | -4.242841 | 1.220564 | 0.618924 | 4.314474e-01 | 1.000000e+00 |
| 11 | NG-10 | 0.000349 | -4.447445 | 0.779677 | 0.149458 | 6.990540e-01 | 1.000000e+00 |
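The table can be filtered like any pandas DataFrame. The sketch below rebuilds the five rows shown above (a subset of the columns only) and keeps the genes whose adjusted p-value falls below a threshold. The cutoff `alpha = 0.05` is a conventional choice made for this example, not a value taken from the library.

```python
import pandas as pd

ko_genes = ["NG-1"]

# The five rows printed above, restricted to the columns used here.
result = pd.DataFrame(
    {
        "Gene": ["NG-1", "NG-16", "NG-9", "NG-23", "NG-10"],
        "FC": [35.303449, 1.949104, 0.945190, 0.618924, 0.149458],
        "p-value": [2.821321e-09, 1.626834e-01, 3.309468e-01, 4.314474e-01, 6.990540e-01],
        "adjusted p-value": [1.128528e-07, 1.0, 1.0, 1.0, 1.0],
    },
    index=[10, 17, 39, 25, 11],
)

alpha = 0.05  # assumed significance threshold
significant = result[result["adjusted p-value"] < alpha]
hits = significant["Gene"].tolist()
print(hits)  # ['NG-1']

# Genes other than the knockout target that pass the threshold.
downstream = significant[~significant["Gene"].isin(ko_genes)]
print(len(downstream))  # 0
```

On this small synthetic dataset only the knocked-out gene itself passes the threshold, so the list of other significant genes is empty.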
Step-wise workflow
The class API runs the same pipeline one step at a time and exposes the intermediate results: the WT tensor, the KO tensor, the manifold, and the final table.
In [3]:
knk = scTenifoldKnk(
    data=data,
    ko_genes=ko_genes,
    qc_kws={"min_lib_size": 1, "plot": False, "max_mito_ratio": 1.0, "min_percent": 0, "min_exp_avg": 0, "min_exp_sum": 0},
    nc_kws={"n_nets": 2, "n_samp_cells": 25, "q": 0, "backend": "serial"},
    td_kws={"K": 2, "max_iter": 20, "init": "random"},
    ma_kws={"d": 2},
    ko_method="default",
)
# Steps: quality control, network construction, tensor decomposition,
# knockout, manifold alignment, differential regulation.
for step in ["qc", "nc", "td", "ko", "ma", "dr"]:
    knk.run_step(step)
knk.d_regulation.head()
Removed 0 cells with lib size < 1
Removed 0 outlier cells from original data
Found mitochondrial genes: ['MT-1', 'MT-10', 'MT-2', 'MT-3', 'MT-4', 'MT-5', 'MT-6', 'MT-7', 'MT-8', 'MT-9']
Removed 0 samples from original data (mt genes ratio > 1.0)
Removed 0 genes expressed in less than 0 of data
Removed 0 genes with expression values: average < 0 or sum < 0
finish QC: WT
process qc finished in 0.004650214999855962 secs.
make_networks processing time: 0.11670983199928742
process nc finished in 0.11684261200025503 secs.
Using tensorly (40, 40, 2)
tensor_decomp processing time: 0.004174925000370422
process td finished in 0.005494577000717982 secs.
process ko finished in 0.0003595909993237001 secs.
manifold_alignment processing time: 0.0030782730000282754
process ma finished in 0.0031186429996523657 secs.
d_regulation processing time: 0.013182466000216664
process dr finished in 0.013228227000581683 secs.
Out[3]:
| | Gene | Distance | boxcox-transformed distance | Z | FC | p-value | adjusted p-value |
|---|---|---|---|---|---|---|---|
| 10 | NG-1 | 0.005371 | -3.512135 | 2.795610 | 35.303449 | 2.821321e-09 | 1.128528e-07 |
| 17 | NG-16 | 0.001262 | -4.060077 | 1.615683 | 1.949104 | 1.626834e-01 | 1.000000e+00 |
| 39 | NG-9 | 0.000879 | -4.177868 | 1.362032 | 0.945190 | 3.309468e-01 | 1.000000e+00 |
| 25 | NG-23 | 0.000711 | -4.243612 | 1.220461 | 0.618924 | 4.314474e-01 | 1.000000e+00 |
| 11 | NG-10 | 0.000349 | -4.448312 | 0.779663 | 0.149458 | 6.990540e-01 | 1.000000e+00 |
In [4]:
{
    "wt_tensor": knk.tensor_dict["WT"].shape,
    "ko_tensor": knk.tensor_dict["KO"].shape,
    "manifold": knk.manifold.shape,
}
Out[4]:
{'wt_tensor': (40, 40), 'ko_tensor': (40, 40), 'manifold': (80, 2)}
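The manifold has 80 rows because it holds one 2-dimensional embedding per gene for each of the two conditions (2 × 40 genes). The sketch below shows how a per-gene distance can be derived from such an array. It uses random numbers as a stand-in for `knk.manifold` and assumes the first 40 rows are the WT embeddings and the last 40 the KO embeddings in the same gene order; that row layout is an assumption made for illustration.

```python
import numpy as np

n_genes, d = 40, 2
rng = np.random.default_rng(0)

# Stand-in for knk.manifold: 2 * n_genes rows, d columns.
manifold = rng.normal(size=(2 * n_genes, d))

# Assumed layout: WT embeddings first, KO embeddings second.
wt_embed = manifold[:n_genes]
ko_embed = manifold[n_genes:]

# Euclidean distance between each gene's WT and KO embedding.
distance = np.linalg.norm(wt_embed - ko_embed, axis=1)

# Indices of the five genes whose embedding moved the most.
top5 = np.argsort(distance)[::-1][:5]
print(distance.shape, top5)
```

Genes with a large distance are the ones whose position changed most between the two conditions, which matches the role of the Distance column in the tables above.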
For propagation-style knockouts, set ko_method="propagation" and pass options such as ko_kws={"degree": 1}. With this method, the perturbation is first propagated through each WT network, and the PC networks are then rebuilt.