pypath.utils.proteomicsdb.ProteomicsDB

class pypath.utils.proteomicsdb.ProteomicsDB(username, password, output_format='json')

Bases: object

__init__(username, password, output_format='json')

This is an extensible class for downloading and processing data from ProteomicsDB. Currently 2 of the 10 available APIs are implemented here, but feel free to write functions for the other APIs. To find out more about ProteomicsDB, see Wilhelm et al. 2014, Nature: http://www.nature.com/nature/journal/v509/n7502/full/nature13319.html For a comprehensive description of the APIs, visit: https://www.proteomicsdb.org/proteomicsdb/#api

@usernamestr

A registered, API-enabled ProteomicsDB user. To obtain one, first register, then write an e-mail to the address given on the webpage. Within a couple of days, the admins will enable API access for your user.

@passwordstr

Password of the user.

@output_formatstr

Either ‘json’ or ‘xml’. Some functions in this module process the JSON further and return specific objects.

Methods

__init__(username, password[, output_format])

This is an extensible class for downloading and processing data from ProteomicsDB.

get_expression([normalized, tissue_average])

Extracts normalized or unnormalized expression data from previously downloaded data, stored on disk, and opened for reading in file object ProteomicsDB.result.

get_json(content)

get_pieces([size, delimiters])

A generator for reading huge files (hundreds of MBs).

get_proteins(tissue_id[, ...])

get_tissues()

Gets an annotated list of all tissues for which ProteomicsDB has expression data.

load([pfile])

pandas_matrix()

Returns expression data in a pandas matrix.

query(api, param[, silent, large])

Retrieves data from the API.

reload()

save([outf])

tissues_x_proteins([normalized, tissues])

Downloads the expression of all proteins for all tissues.

which_tissues(name, value)

get_expression(normalized=True, tissue_average=False)

Extracts normalized or unnormalized expression data from previously downloaded data, stored on disk, and opened for reading in file object ProteomicsDB.result. Optionally averages data per tissue.

@normalizedbool

Read normalized or unnormalized expression values.

@tissue_averagebool

If False, read and store data for each sample; if True, keep only the mean value per tissue.
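The per-tissue averaging described above can be sketched as follows. This is only an illustration, not the code pypath uses: the function name `average_per_tissue` and the `(tissue, sample_id, protein, value)` record layout are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def average_per_tissue(records):
    """Collapse per-sample values into one mean value per tissue and protein.

    `records` is a hypothetical iterable of
    (tissue, sample_id, protein, value) tuples; this layout is an
    assumption for illustration, not the ProteomicsDB response format.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for tissue, _sample_id, protein, value in records:
        # Samples of the same tissue are pooled; the sample ID is dropped.
        grouped[tissue][protein].append(value)
    return {
        tissue: {protein: mean(values) for protein, values in proteins.items()}
        for tissue, proteins in grouped.items()
    }
```

With `tissue_average=False` the equivalent step would keep the sample ID as a key instead of pooling the values.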

get_pieces(size=20480, delimiters=('{', '}'))

A generator for reading huge files (hundreds of MBs). Reads segments of @size, searches for self-contained JSON objects, and returns a list of them.

@sizeint

Size to read at once (in bytes).

@delimiterstuple

Starting and closing delimiters. By default, these are curly braces, to return individual JSON objects of the largest possible size.
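The chunked reading idea can be sketched independently of pypath as below. This is a minimal sketch under stated assumptions, not the library's implementation: the function name `iter_json_pieces` is hypothetical, it takes any file-like object opened in text mode, and it ignores delimiters that occur inside JSON strings.

```python
import io


def iter_json_pieces(fileobj, size=20480, delimiters=('{', '}')):
    """Yield lists of outermost delimited substrings, one list per read.

    Simplification: braces inside quoted strings are not treated
    specially, so this is not a full JSON tokenizer.
    """
    opening, closing = delimiters
    buffer = ''
    while True:
        chunk = fileobj.read(size)
        if not chunk:
            break
        buffer += chunk
        pieces = []
        depth = 0      # current nesting level inside the buffer
        start = None   # index where the current outermost object began
        consumed = 0   # end of the last complete object found
        for i, char in enumerate(buffer):
            if char == opening:
                if depth == 0:
                    start = i
                depth += 1
            elif char == closing and depth > 0:
                depth -= 1
                if depth == 0:
                    pieces.append(buffer[start:i + 1])
                    consumed = i + 1
        # Keep only the unfinished tail for the next read.
        buffer = buffer[consumed:]
        if pieces:
            yield pieces
```

Each yielded string can then be passed to `json.loads`. The buffer is rescanned from its start on every read, which keeps the sketch short but is slow when a single object spans many chunks.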

get_proteins(tissue_id, calculation_method=0, swissprot_only=1, no_isoform=1)

get_tissues()

Gets an annotated list of all tissues for which ProteomicsDB has expression data. Result stored in ProteomicsDB.tissues.

pandas_matrix()

Returns expression data in a pandas matrix. Not implemented.

query(api, param, silent=False, large=False)

Retrieves data from the API.

@apistr

Should be one of the 10 available API sections.

@paramtuple

Tuple of the parameters according to the API.

@largebool

Passed to the curl wrapper function. If True, the data is written to disk and a file object open for reading is returned. If False, the raw data is returned: JSON is converted to a Python object, XML is returned as a string.
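The return behaviour controlled by `large` can be illustrated with a small stand-alone function. This is a sketch of the described contract only: the name `deliver` is hypothetical, a temporary file stands in for the file written to disk, and no network request or pypath code is involved.

```python
import json
import tempfile


def deliver(raw, output_format='json', large=False):
    """Mimic the documented return types for an already downloaded payload.

    `raw` is the response body as a string (an assumption for this sketch).
    """
    if large:
        # Large results: write to disk and hand back a readable file object.
        handle = tempfile.TemporaryFile(mode='w+')
        handle.write(raw)
        handle.seek(0)
        return handle
    if output_format == 'json':
        # Small JSON results: parse into Python objects.
        return json.loads(raw)
    # Small XML results: return the text unchanged.
    return raw
```

A caller therefore has to know which mode was used: with `large=True` the result must be read (and eventually closed) like a file, otherwise it can be used directly.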

tissues_x_proteins(normalized=True, tissues=None)

Downloads the expression of all proteins for all tissues. The result is a dict of dicts holding the expression values of each protein, grouped by sample.
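A dict of dicts of this kind might look like the sketch below. The sample names, protein identifiers, and values are made up for illustration, and the helper `protein_profile` is hypothetical; the exact keys pypath uses are not specified here.

```python
# Outer keys: samples; inner keys: proteins; values: expression.
# All identifiers and numbers below are invented for the example.
expression = {
    'sample_1': {'P00533': 4.2, 'P04637': 3.1},
    'sample_2': {'P00533': 5.0},
}


def protein_profile(expression, protein):
    """Collect one protein's value from every sample that reports it."""
    return {
        sample: values[protein]
        for sample, values in expression.items()
        if protein in values
    }
```

Here `protein_profile(expression, 'P04637')` returns only the entry for `sample_1`, because `sample_2` has no value for that protein.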