Tutorial 2: scTenifoldNet
This notebook compares two expression matrices with the one-call compare_networks() API and then shows the equivalent step-wise class workflow.
In [1]:
from scTenifold import compare_networks, scTenifoldNet
from scTenifold.data import get_test_df
x = get_test_df(n_cells=80, n_genes=40, random_state=0)
y = get_test_df(n_cells=80, n_genes=40, random_state=1)
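To make the expected input layout concrete, here is a minimal sketch of a genes × cells count table. It assumes, based on the shapes and gene names that appear in this notebook's output (40 genes named NG-… and MT-…, 80 cells), that get_test_df returns a pandas DataFrame with genes as rows and cells as columns. The helper make_toy_counts, the cell labels, and the negative-binomial parameters are hypothetical stand-ins, not the library's implementation.

```python
import numpy as np
import pandas as pd


def make_toy_counts(n_cells=80, n_genes=40, n_mito=10, seed=0):
    """Build a hypothetical genes x cells count table (not scTenifold's own generator)."""
    rng = np.random.default_rng(seed)
    # Gene names mimic the NG-*/MT-* labels seen in the tutorial output.
    genes = [f"NG-{i}" for i in range(1, n_genes - n_mito + 1)]
    genes += [f"MT-{i}" for i in range(1, n_mito + 1)]
    cells = [f"cell-{i}" for i in range(1, n_cells + 1)]
    # Over-dispersed integer counts; the parameters are arbitrary placeholders.
    counts = rng.negative_binomial(n=5, p=0.3, size=(n_genes, n_cells))
    return pd.DataFrame(counts, index=genes, columns=cells)


toy = make_toy_counts()
print(toy.shape)  # (40, 80): genes in rows, cells in columns
```

If your own data are stored as cells × genes, transposing with `.T` would give the layout sketched here.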
One-call workflow
For quick analyses, call compare_networks(). The small network settings below keep the tutorial fast; for real analyses, increase n_nets, n_samp_cells, the tensor rank (K), and the number of tensor iterations (max_iter).
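As a rough sketch of what "increase" could look like, the dictionaries below place the tutorial-sized settings next to a larger profile. The keys are the ones used in this tutorial; the larger values are illustrative placeholders chosen for this example, not documented defaults or recommendations, and should be tuned to the size of your dataset.

```python
# Tutorial-sized settings, matching the values used in this notebook.
tutorial_profile = {
    "network_kws": {"n_nets": 2, "n_samp_cells": 25, "q": 0},
    "td_kws": {"K": 2, "max_iter": 20, "init": "random"},
}

# Hypothetical larger settings; every value here is an assumption.
larger_profile = {
    "network_kws": {"n_nets": 10, "n_samp_cells": 500, "q": 0},
    "td_kws": {"K": 3, "max_iter": 1000, "init": "random"},
}

# Print only the settings that differ between the two profiles.
for group in ("network_kws", "td_kws"):
    for key, small in tutorial_profile[group].items():
        large = larger_profile[group][key]
        if small != large:
            print(f"{group}[{key!r}]: {small} -> {large}")
```

Keeping the two profiles as plain dictionaries makes it easy to switch between a fast smoke test and a full run by passing the chosen entries as keyword arguments.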
In [2]:
result = compare_networks(
    x,
    y,
    x_label="control",
    y_label="condition",
    qc_kws={"min_lib_size": 1, "plot": False, "max_mito_ratio": 1.0, "min_percent": 0},
    network_kws={"n_nets": 2, "n_samp_cells": 25, "q": 0},
    backend="serial",
    td_kws={"K": 2, "max_iter": 20, "init": "random"},
    ma_kws={"d": 2},
)
result.head()
Removed 0 cells with lib size < 1
Removed 0 outlier cells from original data
Found mitochondrial genes: ['MT-1', 'MT-10', 'MT-2', 'MT-3', 'MT-4', 'MT-5', 'MT-6', 'MT-7', 'MT-8', 'MT-9']
Removed 0 samples from original data (mt genes ratio > 1.0)
Removed 0 genes expressed in less than 0 of data
Removed 0 genes with expression values: average < 0 or sum < 0
finish QC: control
Removed 0 cells with lib size < 1
Removed 0 outlier cells from original data
Found mitochondrial genes: ['MT-1', 'MT-10', 'MT-2', 'MT-3', 'MT-4', 'MT-5', 'MT-6', 'MT-7', 'MT-8', 'MT-9']
Removed 0 samples from original data (mt genes ratio > 1.0)
Removed 0 genes expressed in less than 0 of data
Removed 0 genes with expression values: average < 0 or sum < 0
finish QC: condition
process qc finished in 0.009546581999529735 secs.
make_networks processing time: 0.11581167100030143
make_networks processing time: 0.11569302100087953
process nc finished in 0.23233089199948154 secs.
Using tensorly (40, 40, 2)
tensor_decomp processing time: 0.0044343650006339885
Using tensorly (40, 40, 2)
tensor_decomp processing time: 0.0041170049998981995
process td finished in 0.009537382000416983 secs.
manifold_alignment processing time: 0.0026729829996838816
process ma finished in 0.002711033000196039 secs.
d_regulation processing time: 0.011582894000639499
process dr finished in 0.011625193999861949 secs.
Out[2]:
| | Gene | Distance | boxcox-transformed distance | Z | FC | p-value | adjusted p-value |
|---|---|---|---|---|---|---|---|
| 34 | NG-4 | 0.029244 | -1.382743 | 2.033638 | 3.698975 | 0.054446 | 0.66952 |
| 31 | NG-29 | 0.027043 | -1.390411 | 1.787239 | 3.163037 | 0.075323 | 0.66952 |
| 10 | NG-1 | 0.024698 | -1.398821 | 1.516972 | 2.638354 | 0.104312 | 0.66952 |
| 37 | NG-7 | 0.023560 | -1.403003 | 1.382574 | 2.400828 | 0.121271 | 0.66952 |
| 21 | NG-2 | 0.021716 | -1.409936 | 1.159797 | 2.039616 | 0.153248 | 0.66952 |
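The columns of this table are related to one another, and a quick arithmetic check on the printed rows helps with reading them. The sketch below tests two observations: that Distance² / FC is the same constant for every row (so FC behaves like a squared distance scaled by a common factor), and that the p-value matches the upper tail of a chi-square distribution with one degree of freedom evaluated at FC. These are consistency checks on the numbers shown above, not a statement about how the library computes them internally; the helper chi2_sf_df1 is written for this example.

```python
import math

# (Gene, Distance, FC, p-value) copied from the rows printed above.
rows = [
    ("NG-4", 0.029244, 3.698975, 0.054446),
    ("NG-29", 0.027043, 3.163037, 0.075323),
    ("NG-1", 0.024698, 2.638354, 0.104312),
    ("NG-7", 0.023560, 2.400828, 0.121271),
    ("NG-2", 0.021716, 2.039616, 0.153248),
]


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


for gene, dist, fc, p_printed in rows:
    scale = dist ** 2 / fc      # expected to be nearly constant across rows
    p_check = chi2_sf_df1(fc)   # expected to be close to the printed p-value
    print(f"{gene}: Distance^2/FC = {scale:.4e}, "
          f"chi2 tail = {p_check:.6f}, printed p = {p_printed:.6f}")
```

Note that none of the adjusted p-values in this toy run falls below a conventional 0.05 threshold, which is unsurprising for two small synthetic matrices.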
Step-wise workflow
Use scTenifoldNet directly when you want intermediate outputs or when you want to resume from a saved run.
In [3]:
model = scTenifoldNet(
    x,
    y,
    x_label="control",
    y_label="condition",
    qc_kws={"min_lib_size": 1, "plot": False, "max_mito_ratio": 1.0, "min_percent": 0},
    nc_kws={"n_nets": 2, "n_samp_cells": 25, "q": 0, "backend": "serial"},
    td_kws={"K": 2, "max_iter": 20, "init": "random"},
    ma_kws={"d": 2},
)
for step in ["qc", "nc", "td", "ma", "dr"]:
    model.run_step(step)
model.d_regulation.head()
Removed 0 cells with lib size < 1
Removed 0 outlier cells from original data
Found mitochondrial genes: ['MT-1', 'MT-10', 'MT-2', 'MT-3', 'MT-4', 'MT-5', 'MT-6', 'MT-7', 'MT-8', 'MT-9']
Removed 0 samples from original data (mt genes ratio > 1.0)
Removed 0 genes expressed in less than 0 of data
Removed 0 genes with expression values: average < 0 or sum < 0
finish QC: control
Removed 0 cells with lib size < 1
Removed 0 outlier cells from original data
Found mitochondrial genes: ['MT-1', 'MT-10', 'MT-2', 'MT-3', 'MT-4', 'MT-5', 'MT-6', 'MT-7', 'MT-8', 'MT-9']
Removed 0 samples from original data (mt genes ratio > 1.0)
Removed 0 genes expressed in less than 0 of data
Removed 0 genes with expression values: average < 0 or sum < 0
finish QC: condition
process qc finished in 0.008201310000004014 secs.
make_networks processing time: 0.11497028999929171
make_networks processing time: 0.1151920299998892
process nc finished in 0.2309635209994667 secs.
Using tensorly (40, 40, 2)
tensor_decomp processing time: 0.0044216360001883
Using tensorly (40, 40, 2)
tensor_decomp processing time: 0.004125004999878001
process td finished in 0.009382821999679436 secs.
manifold_alignment processing time: 0.00213844199970481
process ma finished in 0.0021745820004070993 secs.
d_regulation processing time: 0.010781072999634489
process dr finished in 0.010820974000125716 secs.
Out[3]:
| | Gene | Distance | boxcox-transformed distance | Z | FC | p-value | adjusted p-value |
|---|---|---|---|---|---|---|---|
| 34 | NG-4 | 0.029244 | -1.382743 | 2.033638 | 3.698975 | 0.054446 | 0.66952 |
| 31 | NG-29 | 0.027043 | -1.390411 | 1.787239 | 3.163037 | 0.075323 | 0.66952 |
| 10 | NG-1 | 0.024698 | -1.398821 | 1.516972 | 2.638354 | 0.104312 | 0.66952 |
| 37 | NG-7 | 0.023560 | -1.403003 | 1.382574 | 2.400828 | 0.121271 | 0.66952 |
| 21 | NG-2 | 0.021716 | -1.409936 | 1.159797 | 2.039616 | 0.153248 | 0.66952 |
In [4]:
{
    "qc_shape": model.QC_dict["control"].shape,
    "n_networks": len(model.network_dict["control"]),
    "tensor_shape": model.tensor_dict["control"].shape,
    "manifold_shape": model.manifold.shape,
}
Out[4]:
{'qc_shape': (40, 80),
'n_networks': 2,
'tensor_shape': (40, 40),
'manifold_shape': (80, 2)}
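The shapes above fit together as follows: the QC matrix is 40 genes × 80 cells, each denoised network is 40 × 40 (gene by gene), and the manifold has 80 rows, which equals 2 × 40 genes, with d = 2 columns. One way to read this is that every gene gets one low-dimensional coordinate per condition. The sketch below illustrates how a per-gene distance could be derived from such an array. It uses a random stand-in rather than model.manifold, and it assumes that the first 40 rows belong to the control and the last 40 to the condition and that the distance is Euclidean; both are assumptions made for illustration and are not confirmed by this notebook.

```python
import numpy as np

n_genes, d = 40, 2

# Random stand-in with the same shape as the manifold reported above: (80, 2).
rng = np.random.default_rng(0)
manifold = rng.normal(size=(2 * n_genes, d))

# Assumed row layout: first block is one condition, second block the other.
control_coords = manifold[:n_genes]
condition_coords = manifold[n_genes:]

# Euclidean distance between the two embeddings of each gene.
distances = np.linalg.norm(control_coords - condition_coords, axis=1)
print(distances.shape)  # (40,): one distance per gene
```

Under these assumptions, genes with the largest distances are the ones whose position in the aligned space changes most between the two conditions.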