pypath.utils.orthology.OrthologyManager

class pypath.utils.orthology.OrthologyManager(cleanup_period: int = 10, lifetime: int = 300, **kwargs)

Bases: Logger

__init__(cleanup_period: int = 10, lifetime: int = 300, **kwargs)

Make this instance a logger.

Parameters:

name – The label of this instance that will be prepended to all messages it sends to the logger.
module – Send the messages by the logger of this module.

Methods

`__init__`([cleanup_period, lifetime])	Make this instance a logger.
`get_df`(target[, source, id_type, ...])	Create a data frame for one source organism and ID type.
`get_dict`(target[, source, id_type, ...])	Create a dictionary for one source organism and ID type.
`load`(key)
`reload`()
`translate`(identifiers, target[, source, ...])	Translate one or more identifiers by orthologous gene pairs.
`translate_df`(df, target[, source, cols, ...])	Translate columns in a data frame.
`which_table`(target[, source, ...])

Attributes

`RESOURCE_PARAM`
`TRANSLATION_PARAM`

get_df(target: str | int, source: str | int = 9606, id_type: str = 'uniprot', only_swissprot: bool = True, oma: bool | None = None, homologene: bool | None = None, ensembl: bool | None = None, oma_rel_type: set[Literal['1:1', '1:n', 'm:1', 'm:n']] | None = None, oma_score: float | None = None, ensembl_hc: bool = True, ensembl_types: list[Literal['one2one', 'one2many', 'many2many']] | None = None, full_records: bool = False, **kwargs) → DataFrame

Create a data frame for one source organism and ID type.

Parameters:

target – Name or NCBI Taxonomy ID of the target organism.
source – Name or NCBI Taxonomy ID of the source organism.
id_type – The identifier type to use.
only_swissprot – Use only SwissProt IDs.
oma – Use orthology information from the Orthologous Matrix (OMA). Currently this is the recommended source for orthology data.
homologene – Use orthology information from NCBI HomoloGene.
ensembl – Use orthology information from Ensembl.
oma_rel_type – Restrict relations to certain types.
oma_score – Lower threshold for similarity metric.
ensembl_hc – Use only the high confidence orthology relations from Ensembl.
ensembl_types – Ensembl orthology relation types to use. Possible values are one2one, one2many and many2many. By default only one2one is used.
full_records – Include not only the identifiers, but also some properties of the orthology relationships.
kwargs – Ignored.

Returns:

A data frame with pairs of orthologous identifiers, in two columns: “source” and “target”.
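The filtering parameters and the two-column result can be pictured with a small stand-alone sketch. It does not call pypath: the records, the field names `rel_type` and `score`, and the helper `filter_pairs` are hypothetical. Only the parameter meanings (a set of allowed relation types, a lower score threshold, output in "source" and "target" columns) are taken from the descriptions above.

```python
# Toy orthology records; identifiers and scores are placeholders, not real data.
RECORDS = [
    {"source": "SRC1", "target": "TGT1", "rel_type": "1:1", "score": 0.95},
    {"source": "SRC2", "target": "TGT2", "rel_type": "1:n", "score": 0.80},
    {"source": "SRC2", "target": "TGT3", "rel_type": "1:n", "score": 0.40},
    {"source": "SRC3", "target": "TGT4", "rel_type": "m:n", "score": 0.70},
]


def filter_pairs(records, rel_type=None, score=None):
    """Return rows with 'source' and 'target' keys that pass both filters.

    rel_type: set of allowed relation types, or None for no restriction.
    score: lower threshold on the score, or None for no threshold.
    """
    rows = []
    for rec in records:
        if rel_type is not None and rec["rel_type"] not in rel_type:
            continue
        if score is not None and rec["score"] < score:
            continue
        rows.append({"source": rec["source"], "target": rec["target"]})
    return rows


# Keep 1:1 and 1:n relations scoring at least 0.5.
pairs = filter_pairs(RECORDS, rel_type={"1:1", "1:n"}, score=0.5)
print(pairs)
```

How pypath applies these filters internally is not stated on this page; the sketch only illustrates the intent of `oma_rel_type` and `oma_score`.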

get_dict(target: str | int, source: str | int = 9606, id_type: str = 'uniprot', only_swissprot: bool = True, oma: bool | None = None, homologene: bool | None = None, ensembl: bool | None = None, oma_rel_type: set[Literal['1:1', '1:n', 'm:1', 'm:n']] | None = None, oma_score: float | None = None, ensembl_hc: bool = True, ensembl_types: list[Literal['one2one', 'one2many', 'many2many']] | None = None, full_records: bool = False) → dict[str, set[OrthologBase]]

Create a dictionary for one source organism and ID type.

Parameters:

target – Name or NCBI Taxonomy ID of the target organism.
source – Name or NCBI Taxonomy ID of the source organism.
id_type – The identifier type to use.
only_swissprot – Use only SwissProt IDs.
oma – Use orthology information from the Orthologous Matrix (OMA). Currently this is the recommended source for orthology data.
homologene – Use orthology information from NCBI HomoloGene.
ensembl – Use orthology information from Ensembl.
oma_rel_type – Restrict relations to certain types.
oma_score – Lower threshold for similarity metric.
ensembl_hc – Use only the high confidence orthology relations from Ensembl.
ensembl_types – Ensembl orthology relation types to use. Possible values are one2one, one2many and many2many. By default only one2one is used.
full_records – Include not only the identifiers, but also some properties of the orthology relationships.

Returns:

A dict with identifiers of the source organism as keys, and sets of their orthologs as values.
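The shape of the returned mapping can be illustrated without pypath. In this minimal sketch the identifier pairs are placeholders and `pairs_to_dict` is a hypothetical helper. The values here are plain strings, whereas the documented return type holds `OrthologBase` objects.

```python
from collections import defaultdict

# Placeholder (source, target) identifier pairs, not real orthology data.
PAIRS = [("SRC1", "TGT1"), ("SRC2", "TGT2"), ("SRC2", "TGT3")]


def pairs_to_dict(pairs):
    """Group target identifiers into one set per source identifier."""
    result = defaultdict(set)
    for source_id, target_id in pairs:
        result[source_id].add(target_id)
    return dict(result)


orthologs = pairs_to_dict(PAIRS)
print(orthologs)
```

A source identifier with several orthologs, such as `SRC2` here, maps to a set with more than one element.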

translate(identifiers: str | Iterable[str], target: str | int, source: str | int = 9606, id_type: str = 'uniprot', only_swissprot: bool = True, oma: bool = None, homologene: bool = None, ensembl: bool = None, oma_rel_type: set[Literal['1:1', '1:n', 'm:1', 'm:n']] | None = None, oma_score: float | None = None, ensembl_hc: bool = True, ensembl_types: list[Literal['one2one', 'one2many', 'many2many']] | None = None, full_records: bool = False)

Translate one or more identifiers by orthologous gene pairs.

Parameters:

identifiers – One or more identifiers of the source organism, of ID type id_type.
target – Name or NCBI Taxonomy ID of the target organism.
source – Name or NCBI Taxonomy ID of the source organism.
id_type – The identifier type to use.
only_swissprot – Use only SwissProt IDs.
oma – Use orthology information from the Orthologous Matrix (OMA). Currently this is the recommended source for orthology data.
homologene – Use orthology information from NCBI HomoloGene.
ensembl – Use orthology information from Ensembl.
oma_rel_type – Restrict relations to certain types.
oma_score – Lower threshold for similarity metric.
ensembl_hc – Use only the high confidence orthology relations from Ensembl.
ensembl_types – Ensembl orthology relation types to use. Possible values are one2one, one2many and many2many. By default only one2one is used.
full_records – Include not only the identifiers, but also some properties of the orthology relationships.

Returns:

Set of identifiers of orthologous genes or proteins in the target taxon.
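The input and output contract (a single identifier or an iterable in, one set out) can be sketched with a stand-in function. `translate_ids` and the `ORTHOLOGS` table are hypothetical and use placeholder identifiers. Returning the union over all inputs and silently skipping unknown identifiers are assumptions, not behaviour confirmed on this page.

```python
from typing import Dict, Iterable, Set, Union

# Placeholder lookup table: source identifier -> set of target identifiers.
ORTHOLOGS: Dict[str, Set[str]] = {
    "SRC1": {"TGT1"},
    "SRC2": {"TGT2", "TGT3"},
}


def translate_ids(
    identifiers: Union[str, Iterable[str]],
    table: Dict[str, Set[str]],
) -> Set[str]:
    """Return the union of orthologs for one or more source identifiers."""
    # A bare string is treated as a single identifier, not as characters.
    if isinstance(identifiers, str):
        identifiers = [identifiers]
    result: Set[str] = set()
    for identifier in identifiers:
        # Assumption: identifiers without a known ortholog contribute nothing.
        result |= table.get(identifier, set())
    return result


single = translate_ids("SRC1", ORTHOLOGS)
several = translate_ids(["SRC1", "SRC2", "UNKNOWN"], ORTHOLOGS)
print(single, several)

# With pypath installed, the documented call would have this form
# (not run here; 10090 is the NCBI Taxonomy ID of mouse):
#     OrthologyManager().translate("<identifier>", target=10090)
```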

translate_df(df, target[, source, cols, ...])

Translate columns in a data frame.

Parameters:

df – A data frame.
cols – One or more columns to be translated. It can be a single column name, an iterable of column names, or a dict where keys are column names and values are ID types. Except in this last case, identifiers are assumed to be of type id_type.
target – Name or NCBI Taxonomy ID of the target organism.
source – Name or NCBI Taxonomy ID of the source organism.
id_type – The default identifier type, used for all columns where no ID type is specified.
only_swissprot – Use only SwissProt IDs.
oma – Use orthology information from the Orthologous Matrix (OMA). Currently this is the recommended source for orthology data.
homologene – Use orthology information from NCBI HomoloGene.
ensembl – Use orthology information from Ensembl.
oma_rel_type – Restrict relations to certain types.
oma_score – Lower threshold for similarity metric.
ensembl_hc – Use only the high confidence orthology relations from Ensembl.
ensembl_types – Ensembl orthology relation types to use. Possible values are one2one, one2many and many2many. By default only one2one is used.
kwargs – Same as providing a dict to cols, but beware: keys (column names) must not match existing argument names of this function.

Returns:

A data frame with the same column layout as the input, and the identifiers translated as demanded. Rows that could not be translated are omitted.
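The row-level effect described above can be sketched on plain dictionaries instead of a data frame. `translate_rows`, the column name `protein` and all identifiers are hypothetical. Emitting one output row per ortholog for one-to-many relations is an assumption; this page confirms only that untranslatable rows are omitted and that the column layout is preserved.

```python
# Placeholder lookup table: source identifier -> set of target identifiers.
ORTHOLOGS = {"SRC1": {"TGT1"}, "SRC2": {"TGT2", "TGT3"}}

# Toy table as a list of rows; "protein" is the column to translate.
ROWS = [
    {"protein": "SRC1", "value": 1.0},
    {"protein": "SRC2", "value": 2.0},
    {"protein": "UNKNOWN", "value": 3.0},
]


def translate_rows(rows, col, table):
    """Translate one column, dropping rows that have no ortholog."""
    out = []
    for row in rows:
        # Rows whose identifier is missing from the table yield no output.
        for target_id in sorted(table.get(row[col], ())):
            new_row = dict(row)  # copy so the input rows stay unchanged
            new_row[col] = target_id
            out.append(new_row)
    return out


translated = translate_rows(ROWS, "protein", ORTHOLOGS)
print(translated)
```

The other columns are carried over unchanged, so the output keeps the same keys as the input rows.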