pypath.utils.mapping.MapReader§

class pypath.utils.mapping.MapReader(param, ncbi_tax_id=None, entity_type=None, load_a_to_b=True, load_b_to_a=False, uniprots=None, lifetime=300, resource_id_types=None)[source]§

Bases: Logger

Reads ID translation data and creates MappingTable instances. When ID conversion tables are initialized for the first time, data is downloaded from UniProt and read into dictionaries, which takes a couple of seconds. The data is then saved to pickle dumps, so that later the tables load much faster.

__init__(param, ncbi_tax_id=None, entity_type=None, load_a_to_b=True, load_b_to_a=False, uniprots=None, lifetime=300, resource_id_types=None)[source]§
Args

param (MappingInput): A mapping table definition, any child of the internals.input_formats.MappingInput class.

ncbi_tax_id (int): NCBI Taxonomy identifier of the organism.

entity_type (str): An optional, custom string showing the type of the entities, e.g. protein. It is not part of what identifies a mapping table, hence the same ID type name can’t be used for different entities. E.g. if both proteins and miRNAs have Entrez gene IDs, then either these should be different ID types (e.g. entrez_protein and entrez_mirna), or both protein and miRNA IDs can be loaded into one mapping table simply called entrez.

load_a_to_b (bool): Load the mapping table for translation from id_type to target_id_type.

load_b_to_a (bool): Load the mapping table for translation from target_id_type to id_type.

uniprots (set): UniProt IDs to query in case the source of the mapping table is the UniProt web service.

lifetime (int): If this table has not been used for longer than this period, it is removed at the next cleanup. Time in seconds. Passed to MappingTable.

resource_id_types: Additional mappings between pypath and resource specific identifier type labels.
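The effect of load_a_to_b, load_b_to_a and lifetime can be illustrated with a minimal, standalone sketch. It does not use pypath: build_tables and is_expired are hypothetical helpers, and the assumption that a table maps each ID to a set of target IDs is made only for illustration.

```python
from collections import defaultdict


def build_tables(pairs, load_a_to_b=True, load_b_to_a=False):
    """Build lookup dicts only for the requested directions.

    Hypothetical helper: `pairs` is an iterable of (id_a, id_b) tuples.
    A direction that was not requested is returned as None.
    """
    a_to_b = defaultdict(set) if load_a_to_b else None
    b_to_a = defaultdict(set) if load_b_to_a else None

    for id_a, id_b in pairs:
        if a_to_b is not None:
            a_to_b[id_a].add(id_b)
        if b_to_a is not None:
            b_to_a[id_b].add(id_a)

    return a_to_b, b_to_a


def is_expired(last_used, now, lifetime=300):
    """True if the table has been unused for longer than `lifetime` seconds."""
    return now - last_used > lifetime


# Gene symbol to UniProt accession pairs, used as example data
pairs = [("EGFR", "P00533"), ("TP53", "P04637")]
forward, reverse = build_tables(pairs, load_a_to_b=True, load_b_to_a=True)
```

With the default arguments only the a-to-b direction is built, which mirrors the defaults in the signature above.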

Methods

__init__(param[, ncbi_tax_id, entity_type, ...])

id_type_side(id_type)

Tells if an ID type is on the "a" or "b" (source or target) side in the current mapping table definition.

load()

The complete process of loading mapping tables.

read()

Reads the ID translation data from the original source.

read_cache()

Reads the ID translation data from a previously saved pickle file.

read_mapping_array()

Loads mapping table between microarray probe IDs and genes.

read_mapping_biomart()

Loads a mapping table using BioMart data.

read_mapping_file()

Reads a mapping table from a local file or a function.

read_mapping_hmdb()

Loads an ID translation table from the Human Metabolome Database.

read_mapping_pro()

read_mapping_ramp()

Loads an ID translation table from RaMP.

read_mapping_unichem()

Loads an ID translation table from UniChem.

read_mapping_uniprot()

Downloads ID mappings directly from UniProt.

read_mapping_uniprot_list()

Builds a mapping table by downloading data from UniProt's upload lists service.

reload()

resource_id_type([side])

Resource specific identifier type.

set_uniprot_space([swissprot])

Sets up a search space of UniProt IDs.

setup_cache()

Constructs the cache file path as md5 hash of the parameters.

tables_loaded()

Tells if the requested tables have been created.

write_cache()

Exports the ID translation data into pickle files.

Attributes

mapping_table_a_to_b

Returns a MappingTable instance created from the already loaded data.

mapping_table_b_to_a

Returns a MappingTable instance created from the already loaded data.

resource_id_type_a

resource_id_type_b

id_type_side(id_type)[source]§

Tells if an ID type is on the “a” or “b” (source or target) side in the current mapping table definition.

Args

id_type (str): An ID type label.

Returns

Returns the string “a” if id_type is on the source side in the mapping table definition, “b” if it is on the target side, None if the id_type is not in the definition.
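The return contract can be sketched as a plain function. This is not the method's actual implementation: the parameters id_type_a and id_type_b are assumed names standing in for the source and target ID types of the mapping table definition.

```python
from typing import Optional


def id_type_side(id_type: str, id_type_a: str, id_type_b: str) -> Optional[str]:
    """Return "a" for the source side, "b" for the target side, else None.

    Hypothetical standalone version: the real method reads the two ID
    types from the mapping table definition instead of taking arguments.
    """
    if id_type == id_type_a:
        return "a"
    if id_type == id_type_b:
        return "b"
    return None


side = id_type_side("uniprot", id_type_a="genesymbol", id_type_b="uniprot")
```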

load()[source]§

The complete process of loading mapping tables. First sets up the paths of the cache files, then loads the tables from the cache files or the original sources if necessary. Upon successful loading from an original source writes the results to cache files.
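The cache-first flow described above can be sketched as follows. This is an illustration under stated assumptions, not pypath code: load_table and fake_source are hypothetical names, and the whole table is assumed to live in a single pickle file.

```python
import os
import pickle
import tempfile


def load_table(cache_path, read_from_source):
    """Return (data, came_from_cache).

    Reads the pickle at `cache_path` if it exists; otherwise calls
    `read_from_source` and, if it returned data, writes the cache.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fp:
            return pickle.load(fp), True

    data = read_from_source()

    if data:
        # Only a successful load is written to the cache
        with open(cache_path, "wb") as fp:
            pickle.dump(data, fp)

    return data, False


calls = []


def fake_source():
    """Stand-in for a download; records how many times it was called."""
    calls.append(1)
    return {"EGFR": {"P00533"}}


cache_path = os.path.join(tempfile.mkdtemp(), "table.pickle")
first, first_cached = load_table(cache_path, fake_source)
second, second_cached = load_table(cache_path, fake_source)
```

The second call is served from the pickle file, so the stand-in source is queried only once.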

property mapping_table_a_to_b§

Returns a MappingTable instance created from the already loaded data.

property mapping_table_b_to_a§

Returns a MappingTable instance created from the already loaded data.

read()[source]§

Reads the ID translation data from the original source.

read_cache()[source]§

Reads the ID translation data from a previously saved pickle file.

read_mapping_array()[source]§

Loads mapping table between microarray probe IDs and genes.

read_mapping_biomart()[source]§

Loads a mapping table using BioMart data.

read_mapping_file()[source]§

Reads a mapping table from a local file or a function.

read_mapping_hmdb()[source]§

Loads an ID translation table from the Human Metabolome Database.

read_mapping_ramp()[source]§

Loads an ID translation table from RaMP.

read_mapping_unichem()[source]§

Loads an ID translation table from UniChem.

read_mapping_uniprot()[source]§

Downloads ID mappings directly from UniProt. See the names of possible identifiers here: http://www.uniprot.org/help/programmatic_access

read_mapping_uniprot_list()[source]§

Builds a mapping table by downloading data from UniProt’s upload lists service.

resource_id_type(side=typing.Literal['a', 'b']) → str | None[source]§

Resource specific identifier type.

set_uniprot_space(swissprot=None)[source]§

Sets up a search space of UniProt IDs.

Args

swissprot (bool): Which UniProt IDs to use: True loads only SwissProt IDs, False only TrEMBL IDs, None loads both.
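The three-valued swissprot argument can be sketched as a filter over locally available records. The real method obtains the IDs from UniProt; here select_uniprots is a hypothetical helper, the (accession, reviewed) tuple layout is an assumption, and "TREMBL_EXAMPLE" is a made-up placeholder rather than a real accession.

```python
def select_uniprots(entries, swissprot=None):
    """Select accessions by review status.

    `entries` is an iterable of (accession, reviewed) tuples, where
    reviewed=True stands for SwissProt and reviewed=False for TrEMBL.
    swissprot=None keeps everything.
    """
    return {
        accession
        for accession, reviewed in entries
        if swissprot is None or reviewed == swissprot
    }


entries = [
    ("P00533", True),           # reviewed (SwissProt) record
    ("TREMBL_EXAMPLE", False),  # placeholder for an unreviewed (TrEMBL) record
]

swissprot_only = select_uniprots(entries, swissprot=True)
trembl_only = select_uniprots(entries, swissprot=False)
both = select_uniprots(entries)
```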

setup_cache()[source]§

Constructs the cache file path as md5 hash of the parameters.
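Deriving a cache file name from an md5 hash of the parameters might look like the sketch below. Which parameters pypath actually hashes, how it serializes them and which file extension it uses are not stated here, so the key layout, the ".pickle" suffix and the name cache_file_path are assumptions.

```python
import hashlib
import os


def cache_file_path(cache_dir, id_type_a, id_type_b, ncbi_tax_id, direction):
    """Build a deterministic cache path from the table parameters.

    Hypothetical helper: the parameters are joined into one string and
    hashed, so the same parameters always give the same file name.
    """
    key = "|".join(str(p) for p in (id_type_a, id_type_b, ncbi_tax_id, direction))
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".pickle")


human = cache_file_path("cache", "genesymbol", "uniprot", 9606, "a_to_b")
mouse = cache_file_path("cache", "genesymbol", "uniprot", 10090, "a_to_b")
```

Because the hash depends on every parameter, tables for different organisms or directions end up in different files without any naming scheme to maintain.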

tables_loaded()[source]§

Tells if the requested tables have been created.

write_cache()[source]§

Exports the ID translation data into pickle files.