pypath.omnipath.app.DatabaseManager

class pypath.omnipath.app.DatabaseManager(rebuild=False, **kwargs)

Bases: Logger

Builds and serves the databases in OmniPath such as various networks, enzyme-substrate interactions, protein complexes, annotations and inter-cellular communication roles. Saves the databases to and loads them from pickle dumps on demand.

__init__(rebuild=False, **kwargs)

Make this instance a logger.

Parameters:

name – The label of this instance that will be prepended to all messages it sends to the logger.
module – Send the messages by the logger of this module.

Methods

`__init__`([rebuild])	Make this instance a logger.
`build`()	Builds all built-in datasets.
`build_dataset`(dataset[, ncbi_tax_id])	Builds a dataset.
`compile_table`(dataset)	Compiles the summaries table for a dataset.
`compile_tables`()	Compiles the summaries table for all datasets.
`dataset_dependencies`(dataset)	Returns the dependencies of a dataset.
`define_dataset`(name, module[, args, pickle])	Add a new dataset definition.
`ensure_dataset`(dataset[, force_reload, ...])	Makes sure a dataset is loaded. The dataset is loaded only if it is not loaded yet or `force_reload` is `True`, and it is built only if no pickle dump is available or `force_rebuild` is `True`.
`ensure_dirs`()	Checks if the directories for tables, figures and pickles exist and creates them if necessary.
`ensure_module`(dataset[, reset])	Makes sure the module providing a particular dataset is available and has no default database loaded yet (`db` attribute of the module).
`foreach_dataset`(method)	Applies a method for each dataset.
`get_args_curated`()	Returns the arguments for building the curated PPI network dataset.
`get_args_lncrna_mrna`()	Returns the arguments for building the lncRNA-mRNA network dataset.
`get_args_mirna_mrna`()	Returns the arguments for building the miRNA-mRNA network dataset.
`get_args_small_molecule`()	Returns the arguments for building the small molecule-protein network dataset.
`get_args_tf_mirna`()	Returns the arguments for building the TF-miRNA network dataset.
`get_args_tf_target`()	Returns the arguments for building the TF-target network dataset.
`get_build_args`(dataset)	Retrieves the default database build parameters for a dataset.
`get_db`(dataset[, ncbi_tax_id])	Returns a dataset object.
`get_param`(key)	Retrieves a parameter from the `param` dict of the current object or from the module settings.
`load_dataset`(dataset[, ncbi_tax_id])	Loads a dataset, builds it if no pickle dump is available.
`network_df`(dataset[, by_source])	Creates a data frame of a network dataset where rows aggregate information from all resources describing an interaction.
`network_df_by_source`([dataset])	Creates a data frame of a network dataset where each row contains information from one resource.
`pickle_exists`(dataset[, ncbi_tax_id])	Tells if a pickle dump of a particular dataset exists.
`pickle_path`(dataset[, ncbi_tax_id])	Returns the path of the pickle dump for a dataset according to the current settings.
`reload`()	Reloads the object from the module level.
`reload_module`(dataset)	Reloads the module of the database object of a particular dataset.
`remove_all`()	Removes all loaded datasets.
`remove_db`(dataset[, ncbi_tax_id])	Removes a dataset.
`set_network`(dataset[, by_source])	Sets dataset as the default
`table_path`(dataset)	Returns the full path for a table (to be exported or imported).

build(): Builds all built-in datasets.

build_dataset(dataset, ncbi_tax_id=9606): Builds a dataset.

compile_table(dataset): Compiles the summaries table for a dataset. These tables contain various quantitative descriptions of the data contents.

compile_tables(): Compiles the summaries table for all datasets. These tables contain various quantitative descriptions of the data contents.

dataset_dependencies(dataset): Returns the dependencies of a dataset. For example, building annotations requires complexes to be loaded, so the former depends on the latter.
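The dependency idea can be illustrated with a small, self-contained sketch. The `DEPENDENCIES` mapping, the dataset labels and the `load_order` helper are hypothetical and not part of pypath; only the relation between annotations and complexes comes from the description above. The sketch does no cycle detection.

```python
# Hypothetical dependency table; only the annotations -> complex relation
# is taken from the documentation, the labels themselves are assumptions.
DEPENDENCIES = {
    'annotations': ('complex',),
    'complex': (),
}


def load_order(dataset, dependencies=DEPENDENCIES, _seen=None):
    """Return the datasets in the order they would have to be loaded."""
    seen = [] if _seen is None else _seen
    # Resolve the dependencies first, depth first.
    for dep in dependencies.get(dataset, ()):
        load_order(dep, dependencies, seen)
    if dataset not in seen:
        seen.append(dataset)
    return seen


print(load_order('annotations'))  # ['complex', 'annotations']
```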

define_dataset(name: str, module: Literal['annot', 'complex', 'enz_sub', 'intercell', 'network'], args: dict | None = None, pickle: str | None = None, **param)

Add a new dataset definition.

Parameters:

name (str) – Arbitrary name for the dataset.
module (str) – A database builder module; this determines the type of the dataset.
args (dict | None) – Arguments for the database provider method (typically called get_db) of the above module.
pickle (str | None) – A name for the pickle file; if not provided, it will be named “<name>_<module>.pickle”.
param – Further parameters, saved directly into the param dict of this object; however, the three arguments above override values provided this way.
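As a rough illustration of the argument handling described above, the following sketch mimics the default pickle naming and the way explicit arguments end up next to the extra parameters. `make_definition` is a hypothetical stand-in, not the pypath implementation, and the dataset names and the `note` parameter are made up.

```python
def make_definition(name, module, args=None, pickle=None, **param):
    """Hypothetical helper mirroring the documented argument handling."""
    definition = dict(param)
    # The explicit arguments are written last, so they take precedence.
    definition.update(
        module=module,
        args=args,
        # Default file name follows the documented "<name>_<module>.pickle".
        pickle=pickle or '%s_%s.pickle' % (name, module),
    )
    return definition


demo = make_definition('my_network', 'network', note='demo')
print(demo['pickle'])  # my_network_network.pickle
```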

ensure_dataset(dataset, force_reload=False, force_rebuild=False, ncbi_tax_id=9606)

Makes sure a dataset is loaded. The dataset is loaded only if it is not loaded yet or force_reload is True, and it is built only if no pickle dump is available or force_rebuild is True.

Parameters:

dataset (str) – The name of the dataset.
ncbi_tax_id (int) – NCBI Taxonomy ID of the organism. Considered only if the dataset is built for a single organism and saved to organism-specific pickle files.

ensure_dirs(): Checks if the directories for tables, figures and pickles exist and creates them if necessary.

ensure_module(dataset, reset=True): Makes sure the module providing a particular dataset is available and has no default database loaded yet (db attribute of the module).

foreach_dataset(method): Applies a method for each dataset.

get_args_curated(): Returns the arguments for building the curated PPI network dataset.

get_args_lncrna_mrna(): Returns the arguments for building the lncRNA-mRNA network dataset.

get_args_mirna_mrna(): Returns the arguments for building the miRNA-mRNA network dataset.

get_args_small_molecule(): Returns the arguments for building the small molecule-protein network dataset.

get_args_tf_mirna(): Returns the arguments for building the TF-miRNA network dataset.

get_args_tf_target(): Returns the arguments for building the TF-target network dataset.

get_build_args(dataset): Retrieves the default database build parameters for a dataset.

get_db(dataset, ncbi_tax_id=9606)

Returns a dataset object. Loads and builds the dataset if necessary.

Parameters:

ncbi_tax_id (int) – NCBI Taxonomy ID of the organism. Considered only if the dataset is built for a single organism and saved to organism-specific pickle files.

get_param(key): Retrieves a parameter from the param dict of the current object or from the module settings.
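A minimal sketch of such a two-level lookup is shown below. It assumes that the instance-level `param` dict is consulted first and the module settings serve as a fallback; the documentation does not state the precedence explicitly. `MODULE_SETTINGS`, the `Manager` class and the keys are hypothetical.

```python
# Hypothetical module-level settings; the keys are made up for illustration.
MODULE_SETTINGS = {'pickle_dir': 'pickles', 'table_dir': 'tables'}


class Manager:
    """Hypothetical stand-in with an instance-level ``param`` dict."""

    def __init__(self, **param):
        self.param = param

    def get_param(self, key):
        # Assumption: an instance-level value wins over the module settings.
        if key in self.param:
            return self.param[key]
        return MODULE_SETTINGS.get(key)


manager = Manager(pickle_dir='/tmp/my_pickles')
print(manager.get_param('pickle_dir'))  # /tmp/my_pickles
print(manager.get_param('table_dir'))   # tables
```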

load_dataset(dataset, ncbi_tax_id=9606): Loads a dataset, builds it if no pickle dump is available.

network_df(dataset, by_source=False): Creates a data frame of a network dataset where rows aggregate information from all resources describing an interaction.

network_df_by_source(dataset='omnipath'): Creates a data frame of a network dataset where each row contains information from one resource.
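The difference between the two layouts can be shown on toy data with plain Python. The records, the resource names and the column names below are invented for illustration and do not reflect the actual columns of the pypath data frames.

```python
from collections import defaultdict

# Hypothetical toy records: (source node, target node, resource).
records = [
    ('A', 'B', 'ResourceX'),
    ('A', 'B', 'ResourceY'),
    ('B', 'C', 'ResourceX'),
]

# "By source" layout: one row per interaction and resource.
by_source = [
    {'source': s, 'target': t, 'resource': r}
    for s, t, r in records
]

# Aggregated layout: one row per interaction, resources merged.
grouped = defaultdict(list)
for s, t, r in records:
    grouped[(s, t)].append(r)

aggregated = [
    {'source': s, 'target': t, 'resources': ';'.join(sorted(rs))}
    for (s, t), rs in grouped.items()
]

print(len(by_source), len(aggregated))  # 3 2
```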

pickle_exists(dataset, ncbi_tax_id=9606): Tells if a pickle dump of a particular dataset exists.

pickle_path(dataset, ncbi_tax_id=9606): Returns the path of the pickle dump for a dataset according to the current settings.

reload(): Reloads the object from the module level.

reload_module(dataset): Reloads the module of the database object of a particular dataset. E.g. in case of network datasets the pypath.network module will be reloaded.

remove_all(): Removes all loaded datasets. Deletes the references to the objects in the module; however, if you hold references elsewhere in your code, the objects remain in memory.

remove_db(dataset, ncbi_tax_id=9606): Removes a dataset. Deletes the reference to the object in the module; however, if you hold references elsewhere in your code, the object remains in memory.
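The remark about lingering references is general Python behaviour and can be demonstrated without pypath. In the sketch below, `Dataset` and `Manager` are hypothetical placeholders; a weak reference is used only to observe whether the object is still alive.

```python
import gc
import weakref


class Dataset:
    """Placeholder for a large database object."""


class Manager:
    """Hypothetical holder standing in for the database manager."""

    def remove_db(self, name):
        # Drop only the manager's own reference to the dataset.
        if hasattr(self, name):
            delattr(self, name)


manager = Manager()
manager.demo = Dataset()
probe = weakref.ref(manager.demo)  # does not keep the object alive
own_ref = manager.demo             # a reference held in user code

manager.remove_db('demo')
still_alive = probe() is not None  # the user reference keeps it in memory

del own_ref
gc.collect()                       # not needed on CPython, harmless elsewhere
freed = probe() is None

print(still_alive, freed)  # True True
```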

set_network(dataset, by_source=False, **kwargs): Sets dataset as the default

table_path(dataset): Returns the full path for a table (to be exported or imported).