pypath.utils.orthology.EnsemblOrthology

class pypath.utils.orthology.EnsemblOrthology(target: int | str, source: int | str = 9606, id_type: str = 'uniprot', only_swissprot: bool | None = None, hc: bool | None = None, types: list[Literal['one2one', 'one2many', 'many2many']] | None = None)

Bases: ProteinOrthology

Orthology translation with Ensembl data.

Args

target: Name or NCBI Taxonomy ID of the target organism.
source: Name or NCBI Taxonomy ID of the source organism.
id_type: The identifier type to use.
only_swissprot: Use only SwissProt IDs.
hc: Use only high confidence orthology relations from Ensembl. Defaults to True. Can also be set with the ensembl_hc attribute.
types: The Ensembl orthology relationship types to use. Possible values are one2one, one2many and many2many. By default, only one2one is used. Can also be set with the ensembl_types attribute.

Methods

`__init__`(target[, source, id_type, ...])	Orthology translation with Ensembl data.
`asdict`([full_records])	Create a dictionary from the translation table.
`df`([full_records])	Orthologous pairs as data frame.
`get_taxon`(protein[, only_swissprot])
`get_taxon_trembl`(protein)
`has_protein`(protein)
`is_swissprot`(protein)
`load`()
`load_proteome`(taxon[, only_swissprot])
`load_taxonomy`()
`match`(ortholog, **kwargs)	Check an ortholog against filtering criteria.
`reload`()
`translate`(identifier[, full_records])	For one UniProt ID of the source organism, returns all orthologues from the target organism.
`translate_df`(df[, cols, ortho_df])	Translate columns in a data frame.

Attributes

`key`
`pickle_path`
`resource`

asdict(full_records: bool = False, **kwargs) → dict[str, set[OrthologBase]]

Create a dictionary from the translation table.

Parameters:

full_records – Include not only the identifiers, but also some properties of the orthology relationships.
kwargs – Resource specific filtering criteria.

Returns:

A dict with identifiers of the source organism as keys, and sets of their orthologs as values.
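
The shape of the returned mapping can be illustrated without loading any Ensembl data. The following is a minimal sketch, not pypath's implementation: the `pairs` list, the helper `pairs_to_dict` and all identifiers are made-up placeholders, and plain strings stand in for ortholog records.

```python
from collections import defaultdict

# Hypothetical (source, target) identifier pairs; placeholders, not real data.
pairs = [
    ("SRC_A", "TGT_1"),
    ("SRC_A", "TGT_2"),
    ("SRC_B", "TGT_3"),
]


def pairs_to_dict(pairs):
    """Group target identifiers by source identifier, one set per source."""
    table = defaultdict(set)
    for source_id, target_id in pairs:
        table[source_id].add(target_id)
    return dict(table)


table = pairs_to_dict(pairs)
```

Each key is a source identifier and each value is the set of its orthologs, so a source with several orthologs appears once, with all of them collected in one set.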

df(full_records: bool = False, **kwargs) → DataFrame

Orthologous pairs as data frame.

Parameters:

full_records – Include not only the identifiers, but also some properties of the orthology relationships.
kwargs – Resource specific filtering criteria.

Returns:

A data frame with pairs of orthologous identifiers, in two columns: “source” and “target”.

match(ortholog: OrthologBase, **kwargs) → bool

Check an ortholog against filtering criteria.

Parameters:

ortholog – An ortholog record.
kwargs – Override default filtering parameters.

Returns:

True if the ortholog meets the criteria.
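
How default filters and keyword overrides might combine can be sketched as below. This is an illustration under stated assumptions, not the library's code: the `Ortholog` record and its field names (`id`, `rel_type`, `hc`) are hypothetical, and the defaults simply mirror the class description (high confidence only, one2one only).

```python
from typing import NamedTuple


class Ortholog(NamedTuple):
    """Hypothetical stand-in for an ortholog record; field names are assumptions."""
    id: str
    rel_type: str  # 'one2one', 'one2many' or 'many2many'
    hc: bool       # high confidence flag


# Defaults as described for the class: high confidence only, one2one only.
DEFAULTS = {"hc": True, "types": ("one2one",)}


def match(ortholog, **kwargs):
    """Return True if the record passes the filters; kwargs override defaults."""
    params = {**DEFAULTS, **kwargs}
    if params["hc"] and not ortholog.hc:
        return False
    return ortholog.rel_type in params["types"]


record = Ortholog(id="TGT_1", rel_type="one2many", hc=False)
strict = match(record)
relaxed = match(record, hc=False, types=("one2one", "one2many"))
```

With the defaults, the low confidence one2many record is rejected; relaxing both criteria through keyword arguments accepts it.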

translate(identifier: str | Iterable[str], full_records: bool = False, **kwargs) → set[str]

For one UniProt ID of the source organism, returns all orthologues from the target organism.

Parameters:

identifier – An identifier corresponding to the ID type and source organism of the instance.
full_records – Include not only the identifiers, but also some properties of the orthology relationships.
kwargs – Resource specific translation parameters.

Returns:

A set of identifiers of orthologues in the target taxon.
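
The signature accepts either a single identifier or an iterable of identifiers. A minimal sketch of that lookup follows, assuming that several inputs yield the union of their orthologues and that unknown identifiers yield nothing. The `TABLE` dictionary and its identifiers are placeholders, and this standalone `translate` function is not the method itself.

```python
# Hypothetical translation table; identifiers are placeholders.
TABLE = {
    "SRC_A": {"TGT_1", "TGT_2"},
    "SRC_B": {"TGT_3"},
}


def translate(identifier, table=TABLE):
    """Return the union of target identifiers for one or more source IDs."""
    # A single string is one identifier, not an iterable of characters.
    identifiers = (identifier,) if isinstance(identifier, str) else identifier
    result = set()
    for source_id in identifiers:
        # Unknown identifiers contribute nothing to the result.
        result |= table.get(source_id, set())
    return result


single = translate("SRC_A")
several = translate(["SRC_A", "SRC_B", "SRC_UNKNOWN"])
```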

translate_df(df: DataFrame, cols: str | list[str] | None = None, ortho_df: DataFrame | None = None, **kwargs)

Translate columns in a data frame.

Parameters:

df – A data frame.
cols – One or more columns to be translated. It can be a single column name, an iterable of column names or a dict where keys are column names and values are ID types. Except in this last case, identifiers are assumed to be UniProt.
ortho_df – Override the translation data frame. If provided, the parameters in kwargs won’t have an effect. Must have columns “source” and “target”.
kwargs – Resource specific translation parameters.

Returns:

A data frame with the same column layout as the input, and the identifiers translated as requested. Rows that could not be translated are omitted.
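
The row-level effect described above can be sketched with plain Python rows instead of a data frame. This is a simplified illustration, not the library's implementation: `TABLE`, `rows` and `translate_rows` are hypothetical, and the expansion of one-to-many relationships into several output rows is an assumption about the behaviour.

```python
from itertools import product

# Hypothetical translation table and input rows; all identifiers are placeholders.
TABLE = {
    "SRC_A": {"TGT_1", "TGT_2"},
    "SRC_B": {"TGT_3"},
}
rows = [
    {"protein": "SRC_A", "partner": "SRC_B", "score": 0.9},
    {"protein": "SRC_B", "partner": "SRC_X", "score": 0.4},  # SRC_X has no ortholog
]


def translate_rows(rows, cols, table=TABLE):
    """Translate the given columns; drop rows where any column has no ortholog."""
    translated = []
    for row in rows:
        # Sorted so that the output order is deterministic.
        options = [sorted(table.get(row[col], ())) for col in cols]
        # product() yields nothing if any column has no translation,
        # and one combination per ortholog pairing otherwise.
        for combo in product(*options):
            translated.append({**row, **dict(zip(cols, combo))})
    return translated


result = translate_rows(rows, cols=["protein", "partner"])
```

Untranslated columns such as `score` are carried over unchanged, and the second input row disappears because one of its identifiers has no ortholog in the table.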