pypath.utils.proteomicsdb.ProteomicsDB
- class pypath.utils.proteomicsdb.ProteomicsDB(username, password, output_format='json')
Bases: object
- __init__(username, password, output_format='json')
This is an extensible class for downloading and processing data from ProteomicsDB. Currently, 2 of the 10 available APIs are implemented here; feel free to write functions for the other APIs. To find out more about ProteomicsDB, see Wilhelm et al. 2014, Nature: http://www.nature.com/nature/journal/v509/n7502/full/nature13319.html For a comprehensive description of the APIs, visit: https://www.proteomicsdb.org/proteomicsdb/#api
- @username : str
Username of a registered, API-enabled ProteomicsDB account. To obtain one, first register, and then write an e-mail to the address given on the webpage. Within a couple of days the admins will enable the API for your user.
- @password : str
Password of the user.
- @output_format : str
Either 'json' or 'xml'. Some functions in this module process JSON further and return specific objects.
Methods
__init__(username, password[, output_format]): This is an extensible class for downloading and processing data from ProteomicsDB.
get_expression([normalized, tissue_average]): Extracts normalized or unnormalized expression data from previously downloaded data, stored on disk, and opened for reading in file object ProteomicsDB.result.
get_json(content)
get_pieces([size, delimiters]): A generator for reading huge files (hundreds of MBs).
get_proteins(tissue_id[, ...]): Gets an annotated list of all tissues for which ProteomicsDB has expression data.
load([pfile]): Returns expression data in a pandas matrix.
query(api, param[, silent, large]): Retrieves data from the API.
reload()
save([outf])
tissues_x_proteins([normalized, tissues]): For all tissues downloads the expression of all the proteins.
which_tissues(name, value)
- get_expression(normalized=True, tissue_average=False)
Extracts normalized or unnormalized expression data from previously downloaded data, stored on disk, and opened for reading in file object ProteomicsDB.result. Optionally averages data per tissue.
- @normalized : bool
Read normalized or unnormalized expression values.
- @tissue_average : bool
Read and store data for each sample, or keep only the mean value per tissue.
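The effect of @tissue_average can be illustrated with a minimal, self-contained sketch. It does not use pypath and is not the method's actual implementation; the function name `collect_expression` and the `(tissue, sample, protein, value)` record layout are assumptions made only for this example.

```python
from collections import defaultdict
from statistics import mean


def collect_expression(records, tissue_average=False):
    """Hypothetical helper: organise expression values per sample or per tissue.

    `records` is an iterable of (tissue, sample, protein, value) tuples;
    this layout is assumed for illustration only.
    """
    if not tissue_average:
        # Keep every sample separately.
        return {(tissue, sample, protein): value
                for tissue, sample, protein, value in records}

    # Group values by tissue and protein, then keep only the mean.
    grouped = defaultdict(list)
    for tissue, _sample, protein, value in records:
        grouped[(tissue, protein)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


records = [
    ("liver", "s1", "P1", 2.0),
    ("liver", "s2", "P1", 4.0),
    ("brain", "s3", "P1", 1.0),
]
per_sample = collect_expression(records)
per_tissue = collect_expression(records, tissue_average=True)
```

With `tissue_average=True` the two liver samples collapse into a single mean value, while the default keeps one entry per sample.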
- get_pieces(size=20480, delimiters=('{', '}'))
A generator for reading huge files (hundreds of MBs). Reads segments of @size bytes, searches them for self-contained JSON objects, and yields a list of them.
- @size : int
Size to read at once (in bytes).
- @delimiters : tuple
Starting and closing delimiters. By default, these are curly braces, to return individual JSON objects of the largest possible size.
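The chunked reading described above can be sketched as a standalone generator. This is an illustration of the general technique under simplifying assumptions, not the code behind get_pieces: the name `iter_json_pieces` is hypothetical, the file is assumed to be opened in text mode, and delimiters occurring inside JSON string values are not handled.

```python
import io
import json


def iter_json_pieces(fileobj, size=20480, delimiters=("{", "}")):
    """Hypothetical sketch: yield lists of complete outermost objects.

    Reads `size` characters at a time and keeps any unfinished object
    in a buffer until its closing delimiter arrives.
    """
    opening, closing = delimiters
    buffer = ""
    while True:
        chunk = fileobj.read(size)
        if not chunk:
            break
        buffer += chunk
        pieces, depth, start, consumed = [], 0, None, 0
        for i, char in enumerate(buffer):
            if char == opening:
                if depth == 0:
                    start = i  # an outermost object begins here
                depth += 1
            elif char == closing and depth > 0:
                depth -= 1
                if depth == 0:
                    pieces.append(buffer[start:i + 1])
                    consumed = i + 1
        # Drop everything up to the last completed object.
        buffer = buffer[consumed:]
        if pieces:
            yield pieces


data = io.StringIO('[{"a": 1}, {"b": {"c": 2}}, {"d": 3}]')
objects = [json.loads(piece)
           for pieces in iter_json_pieces(data, size=8)
           for piece in pieces]
```

A tiny `size` is used here only to force objects to span several reads; for real files a value such as the default 20480 keeps the number of reads low.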
- get_tissues()
Gets an annotated list of all tissues for which ProteomicsDB has expression data. The result is stored in ProteomicsDB.tissues.
- query(api, param, silent=False, large=False)
Retrieves data from the API.
- @api : str
Should be one of the 10 API sections available.
- @param : tuple
Tuple of the parameters according to the API.
- @large : bool
Passed to the curl wrapper function. If True, the response is written to disk and a file object open for reading is returned. If False, the raw data is returned: converted to a Python object in the case of JSON, or as a string in the case of XML.
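The return behaviour controlled by @large can be pictured with the following sketch. It is not pypath's curl wrapper; `deliver_response` is a hypothetical function that takes already downloaded bytes, and a temporary file stands in for the file written to disk.

```python
import json
import tempfile


def deliver_response(raw, output_format="json", large=False):
    """Hypothetical sketch of the three possible return types.

    `raw` holds the bytes of an already downloaded response.
    """
    if large:
        # Large responses: write to disk, hand back a readable file object.
        handle = tempfile.TemporaryFile(mode="w+b")
        handle.write(raw)
        handle.seek(0)
        return handle
    text = raw.decode("utf-8")
    if output_format == "json":
        # JSON is converted to Python objects.
        return json.loads(text)
    # XML is returned unparsed, as a string.
    return text


parsed = deliver_response(b'{"a": [1, 2]}')
xml_text = deliver_response(b"<r/>", output_format="xml")
```

Returning a file object for large responses lets the caller read the data piece by piece instead of holding hundreds of megabytes in memory.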