pypath.utils.mapping.map_name

pypath.utils.mapping.map_name(name, id_type, target_id_type, ncbi_tax_id=None, strict=False, expand_complexes=True, uniprot_cleanup=True)

Translates a single identifier from one ID type to another. Returns a set of identifiers of the target ID type.

Use this function to convert individual IDs. It handles the details of the lookup, so ideally you do not need to think about them.

How it works: the function looks up the mapping table between the original and the target ID type; if none is found, it attempts to load one from the predefined inputs. If the original ID type is genesymbol, it first searches the preferred gene names from UniProt and, if nothing is found, tries the alternative gene names. If the gene symbol still cannot be found and strict = False, a last attempt matches only the first 5 characters of the gene symbol. If the target ID type is uniprot, all ACs are converted to primary ACs. Then, for TrEMBL IDs, it looks up the preferred gene names and finds the SwissProt IDs with the same preferred gene name.
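The gene symbol fallback described above can be pictured with a small toy sketch. This is not pypath's implementation: the two dictionaries, their contents and the function name `lookup_genesymbol` are hypothetical stand-ins chosen for illustration, and only the order of the attempts follows the description.

```python
# Toy lookup tables standing in for the real mapping tables (illustrative data only).
PREFERRED = {"EGFR": {"P00533"}, "CDKN2A": {"P42771"}}
ALTERNATIVE = {"ERBB1": {"P00533"}}


def lookup_genesymbol(name, strict=False):
    """Return a set of target IDs for a gene symbol, trying looser matches in turn."""
    # 1. Exact match among the preferred gene names.
    result = PREFERRED.get(name, set())
    # 2. Exact match among the alternative gene names.
    if not result:
        result = ALTERNATIVE.get(name, set())
    # 3. Last resort, only when strict is False: compare the first five characters.
    if not result and not strict:
        prefix = name[:5]
        for table in (PREFERRED, ALTERNATIVE):
            for symbol, ids in table.items():
                if symbol[:5] == prefix:
                    result = result | ids
    # Return a copy so callers cannot modify the lookup tables.
    return set(result)


exact = lookup_genesymbol("EGFR")
alias = lookup_genesymbol("ERBB1")
loose = lookup_genesymbol("CDKN2X")
strict_miss = lookup_genesymbol("CDKN2X", strict=True)
```

In this sketch a non-standard symbol such as `CDKN2X` is resolved only through the five-character comparison, and passing `strict=True` turns that step off, so the result is an empty set.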

Args

name (str): The original name to be converted.

id_type (str): The type of the name. Available by default:

  • genesymbol (gene name)

  • entrez (Entrez Gene ID [#])

  • refseqp (NCBI RefSeq Protein ID [NP_*|XP_*])

  • ensp (Ensembl protein ID [ENSP*])

  • enst (Ensembl transcript ID [ENST*])

  • ensg (Ensembl genomic DNA ID [ENSG*])

  • hgnc (HGNC ID [HGNC:#])

  • gi (GI number [#])

  • embl (DDBJ/EMBL/GenBank CDS accession)

  • embl_id (DDBJ/EMBL/GenBank accession)

And many more; see the code of pypath.internals.input_formats.

target_id_type (str): The name type to translate to. More or less the same values are available as for id_type.

ncbi_tax_id (int): NCBI Taxonomy ID of the organism.

strict (bool): When a gene symbol cannot be translated and strict is False, the function tries appending the number “1” to it, or matching only its first five characters. This fallback is rarely needed, but it makes it possible to translate some non-standard gene names typically found in old, unmaintained resources.

expand_complexes (bool): When encountering complexes, translate the IDs of their components and return a set of IDs. The alternative behaviour is to return the Complex objects.

uniprot_cleanup (bool): When the target_id_type is a UniProt ID, call the uniprot_cleanup function at the end.
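Because map_name converts one ID at a time and returns a set, translating a list of names amounts to calling it in a loop. The sketch below shows one way to do that. The helper `translate_many` and the stand-in `stub_map_name` (with its made-up table) are hypothetical names introduced here so the example runs without pypath; only the argument order and the `ncbi_tax_id` keyword follow the signature documented above. The default of 9606 is the NCBI Taxonomy ID for human.

```python
def translate_many(names, id_type, target_id_type, map_name, ncbi_tax_id=9606):
    """Translate each name individually and collect the resulting sets.

    `map_name` is any callable that accepts the arguments documented above.
    """
    return {
        name: set(map_name(name, id_type, target_id_type, ncbi_tax_id=ncbi_tax_id))
        for name in names
    }


def stub_map_name(name, id_type, target_id_type, ncbi_tax_id=None, **kwargs):
    # Hypothetical stand-in for pypath.utils.mapping.map_name, backed by a tiny table.
    table = {("EGFR", "genesymbol", "uniprot"): {"P00533"}}
    return table.get((name, id_type, target_id_type), set())


result = translate_many(["EGFR", "NOT_A_GENE"], "genesymbol", "uniprot", stub_map_name)
```

With pypath installed, passing `pypath.utils.mapping.map_name` in place of the stub should behave the same way, although the first call may need to load mapping tables. Names that cannot be translated come back as empty sets, so they are easy to filter out afterwards.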