pypath.utils.proteomicsdb.ProteomicsDB

class pypath.utils.proteomicsdb.ProteomicsDB(username, password, output_format='json')

Bases: object

__init__(username, password, output_format='json')

This is an extensible class for downloading and processing data from ProteomicsDB. Currently 2 of the 10 available APIs are implemented here, but feel free to write functions for the other APIs. To find out more about ProteomicsDB, see Wilhelm et al. 2014, Nature: http://www.nature.com/nature/journal/v509/n7502/full/nature13319.html For a comprehensive description of the APIs, visit: https://www.proteomicsdb.org/proteomicsdb/#api

@username (str): Registered, API-enabled user for ProteomicsDB. To obtain such a user, first register, and then write an e-mail to the address given on the webpage. Within a couple of days the admins will enable the API for your user.
@password (str): Password of the user.
@output_format (str): Either ‘json’ or ‘xml’. Some functions in this module process JSON further and return specific objects.
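
The constructor and `get_tissues` described on this page can be combined roughly as follows. This is only a sketch: `fetch_tissues` is a hypothetical helper name, it assumes the import path shown in this page's title, and it assumes that `tissues` is readable as an attribute of the instance after `get_tissues()` has run. Calling it requires an API-enabled ProteomicsDB account.

```python
def fetch_tissues(username, password):
    """Hypothetical helper: return the annotated tissue list from ProteomicsDB."""
    # Imported lazily so that defining this helper does not require pypath.
    from pypath.utils.proteomicsdb import ProteomicsDB

    # 'json' is the documented default; it is passed explicitly for clarity.
    pdb = ProteomicsDB(username, password, output_format="json")

    # According to this page, get_tissues() stores its result in `tissues`
    # (assumed here to be an instance attribute).
    pdb.get_tissues()
    return pdb.tissues
```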

Methods

`__init__`(username, password[, output_format])	This is an extensible class for downloading and processing data from ProteomicsDB.
`get_expression`([normalized, tissue_average])	Extracts normalized or unnormalized expression data from previously downloaded data, stored on disk, and opened for reading in file object ProteomicsDB.result.
`get_json`(content)
`get_pieces`([size, delimiters])	A generator for reading huge files (hundreds of MBs).
`get_proteins`(tissue_id[, ...])
`get_tissues`()	Gets an annotated list of all tissues for which ProteomicsDB has expression data.
`load`([pfile])
`pandas_matrix`()	Returns expression data in a pandas matrix.
`query`(api, param[, silent, large])	Retrieves data from the API.
`reload`()
`save`([outf])
`tissues_x_proteins`([normalized, tissues])	For all tissues downloads the expression of all the proteins.
`which_tissues`(name, value)

get_expression(normalized=True, tissue_average=False)

Extracts normalized or unnormalized expression data from previously downloaded data, stored on disk, and opened for reading in file object ProteomicsDB.result. Optionally averages data per tissue.

@normalized (bool): Read normalized or unnormalized expression values.
@tissue_average (bool): If False, read and store data for each sample; if True, keep only the mean value per tissue.
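
The difference between per-sample storage and per-tissue averaging can be illustrated with a small standalone sketch. It does not use pypath; the record layout `(tissue, sample, protein, value)` and both function names are assumptions made for illustration, not the structures the class uses internally.

```python
from collections import defaultdict
from statistics import mean


def group_by_sample(records):
    """Keep every sample: {sample: {protein: value}}."""
    by_sample = defaultdict(dict)
    for _tissue, sample, protein, value in records:
        by_sample[sample][protein] = value
    return dict(by_sample)


def average_per_tissue(records):
    """Collapse samples: {tissue: {protein: mean value over samples}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for tissue, _sample, protein, value in records:
        grouped[tissue][protein].append(value)
    return {
        tissue: {protein: mean(values) for protein, values in proteins.items()}
        for tissue, proteins in grouped.items()
    }


# Hypothetical records: (tissue, sample, protein, expression value)
records = [
    ("liver", "s1", "P1", 2.0),
    ("liver", "s2", "P1", 4.0),
    ("lung", "s3", "P1", 1.0),
]
per_sample = group_by_sample(records)
per_tissue = average_per_tissue(records)
```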

get_pieces(size=20480, delimiters=('{', '}'))

A generator for reading huge files (hundreds of MBs). Reads segments of @size bytes, searches them for self-contained JSON objects, and yields these as a list.

@size (int): Size to read at once (in bytes).
@delimiters (tuple): Opening and closing delimiters. By default, these are curly braces, to return individual JSON objects of the largest possible size.
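
The chunked reading strategy can be sketched independently of pypath. The function below is not the library's implementation; `iter_json_pieces` is a hypothetical name, and the sketch assumes the file contains a sequence of objects (for example a JSON array of objects) and that string literals use double quotes, so that delimiters inside strings can be ignored.

```python
import io
import json


def iter_json_pieces(fileobj, size=20480, delimiters=("{", "}")):
    """Yield lists of complete outermost objects found as the file is read."""
    opening, closing = delimiters
    buffer = ""
    while True:
        chunk = fileobj.read(size)
        if not chunk:
            break
        buffer += chunk
        pieces, depth, start = [], 0, None
        in_string = escaped = False
        for pos, char in enumerate(buffer):
            if in_string:
                # Inside a string literal, delimiters are plain text.
                if escaped:
                    escaped = False
                elif char == "\\":
                    escaped = True
                elif char == '"':
                    in_string = False
            elif char == '"' and depth > 0:
                in_string = True
            elif char == opening:
                if depth == 0:
                    start = pos  # an outermost object begins here
                depth += 1
            elif char == closing and depth > 0:
                depth -= 1
                if depth == 0:
                    pieces.append(buffer[start:pos + 1])
                    start = None
        # Carry only the unfinished object (if any) over to the next read.
        buffer = buffer[start:] if start is not None else ""
        if pieces:
            yield pieces


# Small in-memory stand-in for a large file on disk.
stream = io.StringIO('[{"id": 1}, {"id": 2, "note": "}"}]')
for pieces in iter_json_pieces(stream, size=8):
    print([json.loads(piece) for piece in pieces])
```

Because only the unfinished tail of the buffer is kept between reads, memory use stays bounded by the chunk size plus the largest single object.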

get_proteins(tissue_id, calculation_method=0, swissprot_only=1, no_isoform=1)

get_tissues(): Gets an annotated list of all tissues for which ProteomicsDB has expression data. The result is stored in ProteomicsDB.tissues.

pandas_matrix(): Returns expression data in a pandas matrix. Not implemented.

query(api, param, silent=False, large=False)

Retrieves data from the API.

@api (str): Should be one of the 10 API sections available.
@param (tuple): Tuple of the parameters according to the API.
@large (bool): Passed to the curl wrapper function. If True, the response is written to disk and a file object open for reading is returned. If False, the raw data is returned: converted to a Python object in the case of JSON, or as a string in the case of XML.

tissues_x_proteins(normalized=True, tissues=None): For all tissues, downloads the expression of all proteins. In the result, a dict of dicts holds the expression values of each protein, grouped by sample.