pypath.omnipath.app.DatabaseManager

class pypath.omnipath.app.DatabaseManager(rebuild=False, **kwargs)

Bases: Logger

Builds and serves the databases in OmniPath such as various networks, enzyme-substrate interactions, protein complexes, annotations and inter-cellular communication roles. Saves the databases to and loads them from pickle dumps on demand.

__init__(rebuild=False, **kwargs)

Make this instance a logger.

Parameters:
  • name – The label of this instance that will be prepended to all messages it sends to the logger.

  • module – Send the messages by the logger of this module.

Methods

__init__([rebuild])

Make this instance a logger.

build()

Builds all built-in datasets.

build_dataset(dataset[, ncbi_tax_id])

Builds a dataset.

compile_table(dataset)

Compiles the summaries table for a dataset.

compile_tables()

Compiles the summaries table for all datasets.

dataset_dependencies(dataset)

Returns the dependencies of a dataset.

define_dataset(name, module[, args, pickle])

Add a new dataset definition.

ensure_dataset(dataset[, force_reload, ...])

Makes sure a dataset is loaded. It is loaded only if it is not loaded yet or force_reload is True. It is built only if it is not available as a pickle dump or force_rebuild is True.

ensure_dirs()

Checks if the directories for tables, figures and pickles exist and creates them if necessary.

ensure_module(dataset[, reset])

Makes sure the module providing a particular dataset is available and has no default database loaded yet (db attribute of the module).

foreach_dataset(method)

Applies a method for each dataset.

get_args_curated()

Returns the arguments for building the curated PPI network dataset.

get_args_lncrna_mrna()

Returns the arguments for building the lncRNA-mRNA network dataset.

get_args_mirna_mrna()

Returns the arguments for building the miRNA-mRNA network dataset.

get_args_small_molecule()

Returns the arguments for building the small molecule-protein network dataset.

get_args_tf_mirna()

Returns the arguments for building the TF-miRNA network dataset.

get_args_tf_target()

Returns the arguments for building the TF-target network dataset.

get_build_args(dataset)

Retrieves the default database build parameters for a dataset.

get_db(dataset[, ncbi_tax_id])

Returns a dataset object.

get_param(key)

Retrieves a parameter from the param dict of the current object or from the module settings.

load_dataset(dataset[, ncbi_tax_id])

Loads a dataset, builds it if no pickle dump is available.

network_df(dataset[, by_source])

Creates a data frame of a network dataset where rows aggregate information from all resources describing an interaction.

network_df_by_source([dataset])

Creates a data frame of a network dataset where each row contains information from one resource.

pickle_exists(dataset[, ncbi_tax_id])

Tells if a pickle dump of a particular dataset exists.

pickle_path(dataset[, ncbi_tax_id])

Returns the path of the pickle dump for a dataset according to the current settings.

reload()

Reloads the object from the module level.

reload_module(dataset)

Reloads the module of the database object of a particular dataset.

remove_all()

Removes all loaded datasets.

remove_db(dataset[, ncbi_tax_id])

Removes a dataset.

set_network(dataset[, by_source])

Sets the given dataset as the default network.

table_path(dataset)

Returns the full path for a table (to be exported or imported).

build()

Builds all built-in datasets.

build_dataset(dataset, ncbi_tax_id=9606)

Builds a dataset.

compile_table(dataset)

Compiles the summaries table for a dataset. These tables contain various quantitative descriptions of the data contents.

compile_tables()

Compiles the summaries table for all datasets. These tables contain various quantitative descriptions of the data contents.

dataset_dependencies(dataset)

Returns the dependencies of a dataset. For example, building the annotations requires the complexes to be loaded, so the annotations dataset depends on the complexes dataset.
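The dependency rule can be illustrated with a minimal, self-contained sketch. It does not use pypath: the dependency map, its keys and the load_order helper are hypothetical stand-ins, and only the annotations-on-complexes relationship comes from the description above.

```python
# Hypothetical dependency map for illustration only; the real one is
# internal to DatabaseManager. Only "annotations depend on complexes"
# is stated in the documentation, the second entry is an assumption.
DEPENDENCIES = {
    'annotations': ('complex',),
    'intercell': ('annotations',),
}


def load_order(dataset, dependencies=DEPENDENCIES, _seen=None):
    """Return dataset names ordered so that dependencies come first."""
    seen = [] if _seen is None else _seen
    # Resolve every dependency (recursively) before the dataset itself.
    for dep in dependencies.get(dataset, ()):
        load_order(dep, dependencies, seen)
    if dataset not in seen:
        seen.append(dataset)
    return seen


print(load_order('annotations'))  # ['complex', 'annotations']
```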

define_dataset(name: str, module: Literal['annot', 'complex', 'enz_sub', 'intercell', 'network'], args: dict | None = None, pickle: str | None = None, **param)

Add a new dataset definition.

Parameters:
  • name – Arbitrary name for the dataset.

  • module – A database builder module; this determines the type of the dataset.

  • args – Arguments for the database provider method (typically called get_db) of the above module.

  • pickle – A name for the pickle file; if not provided, it will be named “<name>_<module>.pickle”.

  • param – Further parameters, saved directly into the param dict of this object; however, the three arguments above override values provided this way.
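The default pickle file name can be sketched as follows. This is a standalone illustration of the “<name>_<module>.pickle” pattern described above; default_pickle_name and the dataset name used are hypothetical and not part of pypath.

```python
def default_pickle_name(name, module, pickle=None):
    """Return the explicit pickle name, or '<name>_<module>.pickle'.

    Hypothetical helper: it only mirrors the naming rule stated in the
    documentation of the `pickle` argument.
    """
    return pickle or '%s_%s.pickle' % (name, module)


# 'my_ppi' is an arbitrary, made-up dataset name.
print(default_pickle_name('my_ppi', 'network'))
print(default_pickle_name('my_ppi', 'network', pickle='custom.pickle'))
```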

ensure_dataset(dataset, force_reload=False, force_rebuild=False, ncbi_tax_id=9606)

Makes sure a dataset is loaded. It is loaded only if it is not loaded yet or force_reload is True. It is built only if it is not available as a pickle dump or force_rebuild is True.

Parameters:
  • dataset (str) – The name of the dataset.

  • ncbi_tax_id (int) – NCBI Taxonomy ID of the organism. Considered only if the dataset is built for a single organism and saved to organism-specific pickle files.
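The load-or-build decision described for ensure_dataset can be read as the small decision function below. It is one interpretation of the description, not pypath code: plan_action and its return labels are hypothetical, and treating force_rebuild as applying even to an already loaded dataset is an assumption.

```python
def plan_action(loaded, pickle_available,
                force_reload=False, force_rebuild=False):
    """Decide what to do with a dataset: 'keep', 'load' or 'build'."""
    # Nothing to do if the dataset is in memory and nothing is forced.
    if loaded and not (force_reload or force_rebuild):
        return 'keep'
    # Build from scratch if forced, or if there is no pickle dump.
    if force_rebuild or not pickle_available:
        return 'build'
    # Otherwise the pickle dump is sufficient.
    return 'load'


for case in [(True, True), (False, True), (False, False)]:
    print(case, plan_action(*case))
```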

ensure_dirs()

Checks if the directories for tables, figures and pickles exist and creates them if necessary.
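A check-and-create step of this kind is commonly written with os.makedirs. The sketch below is a generic illustration under assumed directory names (tables, figures, pickles); where pypath actually places these directories depends on its settings and is not shown here.

```python
import os
import tempfile


def ensure_dirs(base, names=('tables', 'figures', 'pickles')):
    """Create the named subdirectories of `base` if they are missing."""
    paths = [os.path.join(base, name) for name in names]
    for path in paths:
        # exist_ok=True makes the call a no-op for existing directories.
        os.makedirs(path, exist_ok=True)
    return paths


with tempfile.TemporaryDirectory() as base:
    created = ensure_dirs(base)
    ensure_dirs(base)  # calling it again is safe
    all_exist = all(os.path.isdir(path) for path in created)
```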

ensure_module(dataset, reset=True)

Makes sure the module providing a particular dataset is available and has no default database loaded yet (db attribute of the module).

foreach_dataset(method)

Applies a method for each dataset.

get_args_curated()

Returns the arguments for building the curated PPI network dataset.

get_args_lncrna_mrna()

Returns the arguments for building the lncRNA-mRNA network dataset.

get_args_mirna_mrna()

Returns the arguments for building the miRNA-mRNA network dataset.

get_args_small_molecule()

Returns the arguments for building the small molecule-protein network dataset.

get_args_tf_mirna()

Returns the arguments for building the TF-miRNA network dataset.

get_args_tf_target()

Returns the arguments for building the TF-target network dataset.

get_build_args(dataset)

Retrieves the default database build parameters for a dataset.

get_db(dataset, ncbi_tax_id=9606)

Returns a dataset object. Loads and builds the dataset if necessary.

Parameters:
  • ncbi_tax_id (int) – NCBI Taxonomy ID of the organism. Considered only if the dataset is built for a single organism and saved to organism-specific pickle files.

get_param(key)

Retrieves a parameter from the param dict of the current object or from the module settings.
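A two-level lookup like this can be sketched with collections.ChainMap. The example is a generic illustration, not pypath's implementation: the keys and values are made up, and giving the object's param dict precedence over the module settings is an assumption based on the order in which the description names them.

```python
from collections import ChainMap

# Made-up example values; the real keys are defined by pypath.
object_param = {'rebuild': True}
module_settings = {'rebuild': False, 'pickle_dir': 'pickles'}

# Earlier mappings win, so the object-level dict is consulted first.
lookup = ChainMap(object_param, module_settings)


def get_param(key):
    """Return the object-level value if set, else the module setting."""
    return lookup.get(key)


print(get_param('rebuild'))     # taken from object_param
print(get_param('pickle_dir'))  # falls back to module_settings
```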

load_dataset(dataset, ncbi_tax_id=9606)

Loads a dataset, builds it if no pickle dump is available.

network_df(dataset, by_source=False)

Creates a data frame of a network dataset where rows aggregate information from all resources describing an interaction.

network_df_by_source(dataset='omnipath')

Creates a data frame of a network dataset where each row contains information from one resource.
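The difference between the per-resource layout and the aggregated layout produced by network_df can be shown on toy data. The sketch below uses plain tuples rather than real data frames; the node labels, resource names and the aggregate helper are all hypothetical.

```python
from collections import defaultdict

# One row per (interaction, resource): the "by source" layout.
by_source_rows = [
    ('A', 'B', 'resource_1'),
    ('A', 'B', 'resource_2'),
    ('B', 'C', 'resource_1'),
]


def aggregate(rows):
    """Collapse rows so each interaction appears once with all resources."""
    grouped = defaultdict(set)
    for source, target, resource in rows:
        grouped[(source, target)].add(resource)
    return [
        (source, target, sorted(resources))
        for (source, target), resources in grouped.items()
    ]


# One row per interaction: the aggregated layout.
aggregated_rows = aggregate(by_source_rows)
print(aggregated_rows)
```

Here the three per-resource rows collapse into two aggregated rows, because the interaction between A and B is reported by two resources.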

pickle_exists(dataset, ncbi_tax_id=9606)

Tells if a pickle dump of a particular dataset exists.

pickle_path(dataset, ncbi_tax_id=9606)

Returns the path of the pickle dump for a dataset according to the current settings.

reload()

Reloads the object from the module level.

reload_module(dataset)

Reloads the module of the database object of a particular dataset. For example, for network datasets the pypath.network module is reloaded.

remove_all()

Removes all loaded datasets. Deletes the references to the objects in the module; however, if you hold references elsewhere in your code, the objects remain in memory.

remove_db(dataset, ncbi_tax_id=9606)

Removes a dataset. Deletes the reference to the object in the module; however, if you hold references elsewhere in your code, the object remains in memory.
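The note about lingering references follows from how Python manages object lifetimes, and can be demonstrated without pypath. In the sketch below, Dataset and Manager are hypothetical stand-ins, and deleting the attribute plays the role of remove_db; the immediate release after the last reference is dropped assumes CPython's reference counting.

```python
import gc
import weakref


class Dataset:
    """Stand-in for a large database object."""


class Manager:
    """Stand-in for an object holding a reference to the dataset."""


manager = Manager()
manager.db = Dataset()
external = manager.db             # a reference held elsewhere in user code
probe = weakref.ref(manager.db)   # observes the object without keeping it alive

del manager.db                    # analogous to removing the dataset
gc.collect()
still_alive = probe() is not None  # the external reference keeps it alive

del external                      # drop the last remaining reference
gc.collect()
freed = probe() is None           # now the object can be released
```

To actually free the memory, every variable that points to the dataset has to be deleted or rebound as well.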

set_network(dataset, by_source=False, **kwargs)

Sets the given dataset as the default network.

table_path(dataset)

Returns the full path for a table (to be exported or imported).