pypath.omnipath.app.DatabaseManager
- class pypath.omnipath.app.DatabaseManager(rebuild=False, **kwargs)
Bases: Logger
Builds and serves the databases in OmniPath such as various networks, enzyme-substrate interactions, protein complexes, annotations and inter-cellular communication roles. Saves the databases to and loads them from pickle dumps on demand.
- __init__(rebuild=False, **kwargs)
Make this instance a logger.
- Parameters:
name – The label of this instance that will be prepended to all messages it sends to the logger.
module – Send the messages by the logger of this module.
Methods
__init__
([rebuild])Make this instance a logger.
build
()Builds all built-in datasets.
build_dataset
(dataset[, ncbi_tax_id])Builds a dataset.
compile_table
(dataset)Compiles the summaries table for a dataset.
compile_tables
()Compiles the summaries table for all datasets.
dataset_dependencies
(dataset)Returns the dependencies of a dataset.
define_dataset
(name, module[, args, pickle])Add a new dataset definition.
ensure_dataset
(dataset[, force_reload, ...])Makes sure a dataset is loaded. It loads only if it is not loaded yet or force_reload is True. It builds only if it is not available as a pickle dump or force_rebuild is True.
ensure_dirs
()Checks if the directories for tables, figures and pickles exist and creates them if necessary.
ensure_module
(dataset[, reset])Makes sure the module providing a particular dataset is available and has no default database loaded yet (the db attribute of the module).
foreach_dataset
(method)Applies a method for each dataset.
Returns the arguments for building the curated PPI network dataset.
Returns the arguments for building the lncRNA-mRNA network dataset.
Returns the arguments for building the miRNA-mRNA network dataset.
get_args_small_molecule
()Returns the arguments for building the small molecule-protein network dataset.
Returns the arguments for building the TF-miRNA network dataset.
Returns the arguments for building the TF-target network dataset.
get_build_args
(dataset)Retrieves the default database build parameters for a dataset.
get_db
(dataset[, ncbi_tax_id])Returns a dataset object.
get_param
(key)Retrieves a parameter from the param dict of the current object or from the module settings.
load_dataset
(dataset[, ncbi_tax_id])Loads a dataset, builds it if no pickle dump is available.
network_df
(dataset[, by_source])Creates a data frame of a network dataset where rows aggregate information from all resources describing an interaction.
network_df_by_source
([dataset])Creates a data frame of a network dataset where each row contains information from one resource.
pickle_exists
(dataset[, ncbi_tax_id])Tells if a pickle dump of a particular dataset exists.
pickle_path
(dataset[, ncbi_tax_id])Returns the path of the pickle dump for a dataset according to the current settings.
reload
()Reloads the object from the module level.
reload_module
(dataset)Reloads the module of the database object of a particular dataset.
remove_all
()Removes all loaded datasets.
remove_db
(dataset[, ncbi_tax_id])Removes a dataset.
set_network
(dataset[, by_source])Sets dataset as the default
table_path
(dataset)Returns the full path for a table (to be exported or imported).
- compile_table(dataset)
Compiles the summaries table for a dataset. These tables contain various quantitative descriptions of the data contents.
- compile_tables()
Compiles the summaries table for all datasets. These tables contain various quantitative descriptions of the data contents.
- dataset_dependencies(dataset)
Returns the dependencies of a dataset. For example, to build annotations, complexes must be loaded; hence the former depends on the latter.
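The dependency relation described above implies a build order in which dependencies come first. The following is a minimal standalone sketch of that idea, not pypath's implementation: the `DEPENDENCIES` mapping and the `build_order` helper are hypothetical names, and the only relation taken from the text is that annotations depend on complexes.

```python
# Hypothetical dependency map. Only "annotations depends on complex" comes
# from the description above; nothing here is read from pypath itself.
DEPENDENCIES = {
    'annotations': ['complex'],
    'complex': [],
}


def build_order(dataset, dependencies=DEPENDENCIES, _seen=None):
    """Return datasets in an order where dependencies come first.

    Assumes the dependency graph has no cycles.
    """
    seen = [] if _seen is None else _seen
    for dep in dependencies.get(dataset, []):
        if dep not in seen:
            build_order(dep, dependencies, seen)
    if dataset not in seen:
        seen.append(dataset)
    return seen


print(build_order('annotations'))  # ['complex', 'annotations']
```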
- define_dataset(name: str, module: Literal['annot', 'complex', 'enz_sub', 'intercell', 'network'], args: dict | None = None, pickle: str | None = None, **param)
Add a new dataset definition.
- Args
- name:
Arbitrary name for the dataset.
- module:
A database builder module: this determines the type of the dataset.
- args:
Arguments for the database provider method (typically called get_db) of the above module.
- pickle:
A name for the pickle file; if not provided, it will be named “<name>_<module>.pickle”.
- param:
Further parameters, saved directly into the param dict of this object; however, the three arguments above override values provided this way.
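The default pickle naming described above can be illustrated with a small standalone sketch. The helper `dataset_definition` is hypothetical and only mirrors the documented behaviour (default file name “<name>_<module>.pickle”, extra keyword parameters stored alongside); the `example_option` argument and the `label` parameter are placeholders, not real pypath options.

```python
def dataset_definition(name, module, args=None, pickle=None, **param):
    """Hypothetical helper mirroring the documented defaults."""
    definition = dict(param)            # free-form parameters go in first
    definition['module'] = module       # explicit arguments are set last,
    definition['args'] = args or {}     # so they take precedence
    definition['pickle'] = pickle or f'{name}_{module}.pickle'
    return definition


# 'example_option' and 'label' are placeholders for illustration only.
custom = dataset_definition(
    'my_network',
    'network',
    args={'example_option': True},
    label='demo',
)

print(custom['pickle'])  # my_network_network.pickle
```

With a real manager instance, the analogous call would pass the same `name`, `module`, `args` and `pickle` arguments to `define_dataset`.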
- ensure_dataset(dataset, force_reload=False, force_rebuild=False, ncbi_tax_id=9606)
Makes sure a dataset is loaded. It loads only if it is not loaded yet or force_reload is True. It builds only if it is not available as a pickle dump or force_rebuild is True.
- Parameters:
dataset (str) – The name of the dataset.
ncbi_tax_id (int) – NCBI Taxonomy ID of the organism. Considered only if the dataset is built for one organism and saved to organism-specific pickle files.
- ensure_dirs()
Checks if the directories for tables, figures and pickles exist and creates them if necessary.
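A check-and-create step of this kind is commonly written with `os.makedirs(..., exist_ok=True)`. The sketch below is a standalone illustration rather than the method's actual body; the directory names are placeholders, since the real locations come from the pypath settings.

```python
import os
import tempfile


def ensure_dirs(*paths):
    """Create each directory if it does not exist yet."""
    for path in paths:
        os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as base:
    # Placeholder directory names, not the configured pypath paths.
    dirs = [os.path.join(base, d) for d in ('tables', 'figures', 'pickles')]
    ensure_dirs(*dirs)
    ensure_dirs(*dirs)  # calling again is harmless
    created = all(os.path.isdir(d) for d in dirs)

print(created)  # True
```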
- ensure_module(dataset, reset=True)
Makes sure the module providing a particular dataset is available and has no default database loaded yet (the db attribute of the module).
- get_args_small_molecule()
Returns the arguments for building the small molecule-protein network dataset.
- get_db(dataset, ncbi_tax_id=9606)
Returns a dataset object. Loads and builds the dataset if necessary.
- Parameters:
ncbi_tax_id (int) – NCBI Taxonomy ID of the organism. Considered only if the dataset is built for one organism and saved to organism-specific pickle files.
- get_param(key)
Retrieves a parameter from the param dict of the current object or from the module settings.
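The two-level lookup described above (object parameters first, module settings as fallback) can be sketched as follows. This is a standalone illustration: `MODULE_SETTINGS` and its keys are hypothetical, and the function takes the `param` dict explicitly instead of reading it from an instance.

```python
# Hypothetical module-level defaults; the keys are placeholders.
MODULE_SETTINGS = {'pickle_dir': 'pickles', 'rebuild': False}


def get_param(param, key, settings=MODULE_SETTINGS):
    """Prefer the object's own param dict, fall back to module settings."""
    if key in param:
        return param[key]
    return settings.get(key)


print(get_param({'rebuild': True}, 'rebuild'))  # True (from the param dict)
print(get_param({}, 'pickle_dir'))              # pickles (from the settings)
```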
- load_dataset(dataset, ncbi_tax_id=9606)
Loads a dataset, builds it if no pickle dump is available.
- network_df(dataset, by_source=False)
Creates a data frame of a network dataset where rows aggregate information from all resources describing an interaction.
- network_df_by_source(dataset='omnipath')
Creates a data frame of a network dataset where each row contains information from one resource.
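The difference between the two layouts, one row per interaction versus one row per interaction and resource, can be shown with plain Python structures. The records and the column names (`id_a`, `id_b`, `resource`, `resources`) below are hypothetical and do not reflect the columns of the actual data frames.

```python
from collections import defaultdict

# Hypothetical records: (partner A, partner B, resource reporting the pair).
records = [
    ('A', 'B', 'ResourceX'),
    ('A', 'B', 'ResourceY'),
    ('B', 'C', 'ResourceX'),
]

# "By source" layout: one row per interaction and resource.
by_source = [
    {'id_a': a, 'id_b': b, 'resource': r} for a, b, r in records
]

# Aggregated layout: one row per interaction, collecting all its resources.
grouped = defaultdict(list)
for a, b, r in records:
    grouped[(a, b)].append(r)

aggregated = [
    {'id_a': a, 'id_b': b, 'resources': sorted(rs)}
    for (a, b), rs in grouped.items()
]

print(len(by_source), len(aggregated))  # 3 2
```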
- pickle_exists(dataset, ncbi_tax_id=9606)
Tells if a pickle dump of a particular dataset exists.
- pickle_path(dataset, ncbi_tax_id=9606)
Returns the path of the pickle dump for a dataset according to the current settings.
- reload_module(dataset)
Reloads the module of the database object of a particular dataset. E.g. in case of network datasets the pypath.network module will be reloaded.
- remove_all()
Removes all loaded datasets. Deletes the references to the objects in the module; however, if you hold references elsewhere in your code, the objects remain in memory.
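The note about lingering references follows from ordinary Python object lifetime rules, which the sketch below demonstrates with `weakref`. The `Dataset` class and the `registry` dict are hypothetical stand-ins for a database object and the references held by the module; the behaviour after `del` is as observed under CPython.

```python
import gc
import weakref


class Dataset:
    """Placeholder for a database object."""


registry = {'network': Dataset()}  # stands in for the module's references
external = registry['network']     # a reference held elsewhere in user code
probe = weakref.ref(external)      # lets us observe whether the object lives

registry.clear()                   # analogous to removing all datasets
alive_after_clear = probe() is not None

del external                       # drop the last strong reference
gc.collect()
alive_after_del = probe() is not None

print(alive_after_clear, alive_after_del)  # True False
```

To actually free the memory, drop your own references as well after removing the datasets.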