pypath.core.intercell.IntercellAnnotation

class pypath.core.intercell.IntercellAnnotation(class_definitions=None, excludes=None, excludes_extra=None, cellphonedb_categories=None, baccin_categories=None, hpmr_categories=None, surfaceome_categories=None, gpcrdb_categories=None, icellnet_categories=None, build=True, composite_resource_name=None, **kwargs)

Bases: CustomAnnotation

__init__(class_definitions=None, excludes=None, excludes_extra=None, cellphonedb_categories=None, baccin_categories=None, hpmr_categories=None, surfaceome_categories=None, gpcrdb_categories=None, icellnet_categories=None, build=True, composite_resource_name=None, **kwargs)

Builds a database of the roles of proteins and complexes in intercellular communication. The built-in category definitions that determine the default contents of this database are in the pypath.core.intercell_annot module.

Parameters:
  • class_definitions (tuple) – A series of annotation class definitions, each represented by an instance of pypath.internals.annot_formats.AnnotDef. These definitions carry the attributes and instructions to populate the classes.

  • excludes (dict) – A dict with parent category names (strings) or category keys (tuples) as keys and sets of identifiers as values. The identifiers in this dict will be excluded from the respective categories while building the database. E.g. if the UniProt ID P00533 (EGFR) is in the set under the key adhesion, it will be excluded from the category adhesion and all its direct children.

  • excludes_extra (dict) – Same kind of dict as excludes, but it will be added to the built-in default: the built-in and the provided extra sets will be merged. If you want to overwrite or modify the built-in sets, provide your custom dict as excludes.

  • build (bool) – Execute the build upon instantiation; if False, set up an empty object on which the build can be executed later.
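The relation between the built-in exclusions and excludes_extra can be pictured as a key-by-key union of identifier sets. The following is a minimal sketch under that assumption, using plain dicts only; merge_excludes and the example built-in dict are hypothetical and are not part of pypath.

```python
# Illustrative sketch only: plain dicts of sets, not pypath internals.

# Hypothetical built-in exclusions (the real defaults are defined in pypath).
builtin_excludes = {
    'adhesion': {'P00533'},
}

# Extra exclusions supplied by the user, with the same structure.
excludes_extra = {
    'adhesion': {'P04626'},
    'ligand': {'P00533'},
}


def merge_excludes(builtin, extra):
    """Union the identifier sets key by key, leaving the inputs untouched."""

    # Copy the sets so the built-in defaults are not modified in place.
    merged = {key: set(ids) for key, ids in builtin.items()}

    for key, ids in extra.items():
        merged.setdefault(key, set()).update(ids)

    return merged


merged = merge_excludes(builtin_excludes, excludes_extra)
```

With pypath installed, a dict shaped like excludes_extra would be passed through the excludes_extra argument shown in the signature, whereas a dict passed as excludes replaces the defaults rather than extending them.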

Methods

__init__([class_definitions, excludes, ...])

Builds a database about roles of proteins and complexes in intercellular communication.

add_baccin_categories()

add_cellphonedb_categories()

add_class_definitions(class_definitions)

add_classes_to_df()

add_extra_categories()

add_gpcrdb_categories()

add_hpmr_categories()

add_icellnet_categories()

add_surfaceome_categories()

all_resources()

browse([start])

Print gene information as a table.

build()

class_to_class_connections(**kwargs)

kwargs passed to filter_interclass_network.

class_to_class_connections_directed(**kwargs)

class_to_class_connections_inhibitory(**kwargs)

class_to_class_connections_signed(**kwargs)

class_to_class_connections_stimulatory(**kwargs)

class_to_class_connections_undirected(**kwargs)

classes_by_entity(element[, labels])

Returns a set of class keys with the classes containing at least one of the elements.

collect_classes()

complexes_by_resource()

consensus_score(name, entity)

consensus_score_normalized(name, entity)

count_inter_class_connections([...])

count_inter_class_connections_all([...])

count_inter_class_connections_directed([...])

count_inter_class_connections_inhibitory([...])

count_inter_class_connections_signed([...])

count_inter_class_connections_stimulatory([...])

count_inter_class_connections_undirected([...])

counts([entity_type, labels])

Returns a dict with number of elements in each class.

counts_by_class([entity_type, labels])

Returns a dict with number of elements in each class.

counts_by_resource([entity_types])

counts_df([groupby])

create_class(classdef[, override])

Creates a category of entities by processing a custom definition.

degree_inter_class_network([...])

Counts the degrees for the source or the target class, as selected by degrees_of (str).

degree_inter_class_network_2([degrees_of, ...])

degree_inter_class_network_directed([...])

degree_inter_class_network_directed_2(**kwargs)

degree_inter_class_network_inhibitory([...])

degree_inter_class_network_inhibitory_2(**kwargs)

degree_inter_class_network_stimulatory([...])

degree_inter_class_network_stimulatory_2(...)

degree_inter_class_network_undirected([...])

degree_inter_class_network_undirected_2(**kwargs)

df_add_causality()

df_add_locations([locations])

difference(*args)

ensure_annotdb()

entities_by_resource([entity_types])

export(fname, **kwargs)

filter([entity_type])

Filters the annotated entities by annotation class attributes and entity_type.

filter_classes(classes, **kwargs)

Returns a list of annotation classes filtered by their attributes.

filter_df(annot_df[, category, name, ...])

filter_entity_type(cls[, entity_type])

filter_interclass_network([annot_df, ...])

Combines the annotation data frame and a network data frame.

filtered([annot_df, entities])

get_aspect(name[, parent, resource])

get_class(definition[, parent, resource, ...])

Retrieves a class by its name or definition.

get_class_label(name[, parent, resource])

get_class_scope(name[, parent, resource])

get_complexes()

get_df()

Returns the data frame of custom annotations.

get_entities([entity_types])

get_interclass_network_df(**kwargs)

If an interclass network is already present, the network and other kwargs provided are not considered.

get_mirnas()

get_parent(name[, parent, resource])

get_parents(name[, parent, resource])

As names should be unique for resources, a combination of a name and resource determines the parent category.

get_proteins()

get_resource(name[, parent])

For a category name and its parent returns a single resource name.

get_resources(name[, parent])

Returns a set with the names of all resources defining a category with the given name and parent.

get_source(name[, parent, resource])

inter_class_network([annot_args_source, ...])

inter_class_network_directed([...])

inter_class_network_inhibitory([...])

inter_class_network_signed([...])

inter_class_network_stimulatory([...])

inter_class_network_undirected([...])

intersection(*args)

isdisjoint(*args)

iter_classes(**kwargs)

labels(name[, parent, resource, entity_type])

Same as select but returns a list of labels (more human readable).

load()

load_from_pickle(pickle_file)

make_df()

Creates a pandas.DataFrame where each record assigns a molecular entity to an annotation category.

mirnas_by_resource()

network_df([annot_df, network, combined_df, ...])

Combines the annotation data frame and a network data frame.

numof_classes()

numof_complex_records()

numof_complexes()

numof_entities([entity_types])

numof_mirna_records()

numof_mirnas()

numof_protein_records()

numof_proteins()

numof_records([entity_types])

populate_classes([update])

Creates a classification of proteins according to the custom annotation definitions.

populate_scores()

Creates the consensus score dictionaries based on the number of resources annotating an entity for each composite category.

post_load()

pre_build()

process_annot(classdef)

Processes an annotation definition and returns a set of identifiers.

proteins_by_resource()

quality_check_table([path, fmt, ...])

Exports a table in tsv format for quality check and browsing purposes.

register_network(network)

Sets network as the default network dataset for the instance.

reload()

Reloads the object from the module level.

resources_in_category(key)

Returns a list of resources contributing to the definition of a category.

save_to_pickle(pickle_file)

select(definition[, parent, resource, ...])

Retrieves a class by its name or definition.

set_classes()

set_interclass_network_df(**kwargs)

Creates a data frame of the whole inter-class network and keeps it assigned to the instance in order to make subsequent queries faster.

sets(*args)

show(name[, parent, resource])

Same as select but prints a table to the console with basic information from the UniProt datasheets.

summaries_tab([outfile, return_table])

symmetric_difference(*args)

union(*args)

unset_interclass_network_df()

update_excludes()

update_parents()

Creates a dict children with parent class names as keys and sets of children class keys as values.

update_summaries()

browse(start: int = 0, **kwargs)

Print gene information as a table.

Presents information about annotation classes as ASCII tables printed in the terminal. If one class is provided, prints one table. If multiple classes are provided, prints a table for each of them one by one, proceeding to the next one when you hit return. If no classes are provided, goes through all classes.

kwargs are passed to pypath.utils.uniprot.info.

class_to_class_connections(**kwargs)

kwargs passed to filter_interclass_network.

classes_by_entity(element, labels=False)

Returns a set of class keys with the classes containing at least one of the elements.

Parameters:
  • element (str, set) – One or more elements (entities) to search for in the classes.

  • labels (bool) – Return labels instead of keys.
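The lookup described above amounts to keeping every class whose member set overlaps the query. Here is a minimal sketch on a plain dict; the (name, parent, resource) key layout, the resource names and the standalone function are assumptions for illustration, not the pypath implementation.

```python
# Hypothetical classes: keys are assumed to be (name, parent, resource) tuples.
classes = {
    ('receptor', 'receptor', 'ResourceA'): {'P00533', 'P04626'},
    ('ligand', 'ligand', 'ResourceA'): {'P01133'},
    ('adhesion', 'adhesion', 'ResourceB'): {'P00533'},
}


def classes_by_entity(element, classes):
    """Return the keys of all classes containing at least one element."""

    # Accept a single identifier or any collection of identifiers.
    elements = {element} if isinstance(element, str) else set(element)

    # A class qualifies if its members overlap with the query set.
    return {key for key, members in classes.items() if members & elements}


hits = classes_by_entity('P00533', classes)
```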

counts(entity_type='protein', labels=True, **kwargs)

Returns a dict with number of elements in each class.

Parameters:
  • labels (bool) – Whether to use labels rather than keys as the keys of the returned dict.

All other arguments are passed to iter_classes.

counts_by_class(entity_type='protein', labels=True, **kwargs)

Returns a dict with number of elements in each class.

Parameters:
  • labels (bool) – Whether to use labels rather than keys as the keys of the returned dict.

All other arguments are passed to iter_classes.

create_class(classdef, override=False)

Creates a category of entities by processing a custom definition.

degree_inter_class_network(annot_args_source=None, annot_args_target=None, degrees_of='target', **kwargs)

Parameters:
  • degrees_of (str) – Either source or target. Count the degrees for the source or the target class.

filter(entity_type=None, **kwargs)

Filters the annotated entities by annotation class attributes and entity_type. kwargs are passed to filter_classes.

static filter_classes(classes, **kwargs)

Returns a list of annotation classes filtered by their attributes. kwargs contains attributes and values.
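Filtering by attributes can be sketched as keeping the classes for which every keyword matches the attribute of the same name. The AnnotClass tuple, its attribute names and the example values below are assumptions made for illustration; the real annotation class objects may carry different attributes and use a more permissive comparison.

```python
from collections import namedtuple

# Hypothetical stand-in for an annotation class; attribute names are assumed.
AnnotClass = namedtuple('AnnotClass', ['name', 'scope', 'aspect', 'source'])

classes = [
    AnnotClass('receptor', 'generic', 'functional', 'composite'),
    AnnotClass('ligand', 'generic', 'functional', 'resource_specific'),
    AnnotClass('plasma_membrane', 'generic', 'locational', 'composite'),
]


def filter_classes(classes, **kwargs):
    """Keep the classes whose attributes equal all the requested values."""

    return [
        cls
        for cls in classes
        if all(getattr(cls, attr) == val for attr, val in kwargs.items())
    ]


selected = filter_classes(classes, aspect='functional', source='composite')
```

Calling the sketch without keyword arguments returns every class, because all() over an empty sequence is true.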

filter_interclass_network(annot_df=None, network=None, combined_df=None, network_args=None, annot_args=None, annot_args_source=None, annot_args_target=None, entities=None, only_directed=False, only_undirected=False, undirected_orientation=None, only_signed=None, only_effect=None, only_proteins=False, swap_undirected=True, entities_or=False, transmitter_receiver=False, only_generic=True, only_composite=True, only_functional=True, exclude_intracellular=True)

Combines the annotation data frame and a network data frame. Creates a pandas.DataFrame where each record is an interaction between a pair of molecular entities labeled by their annotations.

Parameters:
  • network (pypath.network.Network, pandas.DataFrame) – A pypath.network.Network object or a data frame with network data.

  • combined_df (pandas.DataFrame) – Optional, a network data frame already combined with annotations, for filtering only.

  • resources (set, None) – Use only these network resources.

  • entities (set, None) – Limit the network only to these molecular entities.

  • entities_source (set, None) – Limit the source side of network connections only to these molecular entities.

  • entities_target (set, None) – Limit the target side of network connections only to these molecular entities.

  • annot_args (dict, None) – Parameters for filtering annotation classes. Note that the defaults might include some filtering; provide an empty dict if you want no filtering at all. However, this might result in a huge data frame and consequently memory issues. Passed to the filtered method.

  • annot_args_source (dict, None) – Same as annot_args but only for the source side of the network connections.

  • annot_args_target (dict, None) – Same as annot_args but only for the target side of the network connections.

  • only_directed (bool) – Use only the directed interactions.

  • only_undirected (bool) – Use only the undirected interactions. Specifically for retrieving and counting the interactions without direction information.

  • undirected_orientation (str, None) – Ignore the direction of all interactions and make sure all of them have a uniform orientation. If id, all interactions will be oriented by the identifiers of the partners; if category, the interactions will be oriented by the categories of the partners.

  • only_effect (int, None) – Use only the interactions with this effect. Either -1 or 1.

  • only_signed (bool) – Use only the interactions with effect sign.

  • only_proteins (bool) – Use only the interactions where each of the partners is a protein (i.e. not complex, miRNA, small molecule or other kind of entity).

  • transmitter_receiver (bool) – On the source side only transmitters, on the target side only receivers.

  • only_generic (bool) – Use only the generic classes. If specific classes are allowed, the size of the combined data frame might be huge.

  • only_composite (bool) – Use only the composite classes. If resource_specific classes are allowed, the size of the combined data frame might be huge.

  • only_functional (bool) – Use only the functional classes. Locational classes are often not relevant and they greatly increase the size of the combined data frame.

  • exclude_intracellular (bool) – Remove the intracellular parent class and its children. These classes are not relevant in intercellular signaling and having them greatly increases the size of the combined data frame.
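To see why the default filters matter, it helps to picture how annotations and interactions combine: each interaction yields one record for every pair of a source category and a target category. The sketch below illustrates this with plain Python containers rather than data frames; the combine function, the identifiers and the category names are hypothetical, and Q99999 is a made-up identifier standing for an entity without annotations.

```python
from itertools import product

# Hypothetical annotations: entity -> set of category names.
annotations = {
    'P01133': {'ligand'},
    'P00533': {'receptor', 'adhesion'},
}

# Hypothetical network: (source, target) pairs.
network = [
    ('P01133', 'P00533'),
    ('P00533', 'Q99999'),  # the target has no annotation, so no records
]


def combine(network, annotations):
    """One record per interaction and per pair of source/target categories."""

    records = []

    for source, target in network:
        pairs = product(
            sorted(annotations.get(source, ())),
            sorted(annotations.get(target, ())),
        )
        for cat_source, cat_target in pairs:
            records.append((source, target, cat_source, cat_target))

    return records


records = combine(network, annotations)
```

Since the record count per interaction is the product of the category counts on both sides, admitting specific, resource-specific or locational classes multiplies the size of the result, which is consistent with the warnings in the parameter descriptions.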

get_class(definition, parent=None, resource=None, entity_type=None, **kwargs)

Retrieves a class by its name or definition. The definition can be a class name (string) or a set of entities, or an AnnotDef object defining the contents based on original resources or an AnnotOp which defines the contents as an operation over other definitions.
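The different kinds of definition can be thought of as inputs to one recursive resolver that always ends in a set of entities. The following is a simplified sketch: Op is a hypothetical stand-in for an operation over other definitions, the named dict replaces the class registry, and the AnnotDef case (contents taken from original resources) is left out because it depends on pypath's annotation database.

```python
from collections import namedtuple

# Hypothetical stand-in for an operation over other definitions.
Op = namedtuple('Op', ['op', 'args'])

# Hypothetical named classes.
named = {
    'receptor': {'P00533', 'P04626'},
    'adhesion': {'P00533', 'P12830'},
}


def resolve(definition):
    """Turn a name, a set of entities or an Op into a set of entities."""

    if isinstance(definition, str):
        # A class name: look it up among the named classes.
        return set(named[definition])

    if isinstance(definition, (set, frozenset)):
        # Already a set of entities.
        return set(definition)

    if isinstance(definition, Op):
        # Resolve the arguments first, then apply the set operation.
        resolved = [resolve(arg) for arg in definition.args]
        return definition.op(*resolved)

    raise TypeError('Unsupported definition: %r' % (definition,))


only_receptor = resolve(Op(set.difference, ('receptor', 'adhesion')))
```

Because the arguments of an operation are resolved recursively, operations can be nested and can mix names with literal sets.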

get_df()

Returns the data frame of custom annotations. If it does not exist yet, builds the data frame.

get_interclass_network_df(**kwargs)

If an interclass network is already present, the network and other kwargs provided are not considered. Otherwise these are passed to network_df.

get_parents(name, parent=None, resource=None)

As names should be unique for resources, a combination of a name and resource determines the parent category. This method looks up the parent for a pair of name and resource.

get_resource(name, parent=None)

For a category name and its parent returns a single resource name. If a category belonging to the composite database matches the name and the parent, the name of the composite database will be returned; otherwise, the resource name that comes first in alphabetical order.
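The selection rule can be sketched as a small standalone function operating on the set of resource names that define a category. The function name pick_resource, the default composite name OmniPath and the None result for an empty set are assumptions made for this illustration.

```python
def pick_resource(resources, composite_name='OmniPath'):
    """Prefer the composite database name, else the alphabetically first."""

    if not resources:
        # Assumption: nothing to choose from yields None.
        return None

    if composite_name in resources:
        return composite_name

    # Fall back to the resource name that sorts first.
    return sorted(resources)[0]


preferred = pick_resource({'HPMR', 'OmniPath', 'CellPhoneDB'})
fallback = pick_resource({'HPMR', 'CellPhoneDB'})
```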

get_resources(name, parent=None)

Returns a set with the names of all resources defining a category with the given name and parent.

labels(name, parent=None, resource=None, entity_type=None)

Same as select but returns a list of labels (more human readable).

make_df()

Creates a pandas.DataFrame where each record assigns a molecular entity to an annotation category. The data frame will be assigned to the df attribute.

network_df(annot_df=None, network=None, combined_df=None, network_args=None, annot_args=None, annot_args_source=None, annot_args_target=None, entities=None, only_directed=False, only_undirected=False, undirected_orientation=None, only_signed=None, only_effect=None, only_proteins=False, swap_undirected=True, entities_or=False, transmitter_receiver=False, only_generic=True, only_composite=True, only_functional=True, exclude_intracellular=True)

Combines the annotation data frame and a network data frame. Creates a pandas.DataFrame where each record is an interaction between a pair of molecular entities labeled by their annotations.

Parameters:
  • network (pypath.network.Network, pandas.DataFrame) – A pypath.network.Network object or a data frame with network data.

  • combined_df (pandas.DataFrame) – Optional, a network data frame already combined with annotations, for filtering only.

  • resources (set, None) – Use only these network resources.

  • entities (set, None) – Limit the network only to these molecular entities.

  • entities_source (set, None) – Limit the source side of network connections only to these molecular entities.

  • entities_target (set, None) – Limit the target side of network connections only to these molecular entities.

  • annot_args (dict, None) – Parameters for filtering annotation classes. Note that the defaults might include some filtering; provide an empty dict if you want no filtering at all. However, this might result in a huge data frame and consequently memory issues. Passed to the filtered method.

  • annot_args_source (dict, None) – Same as annot_args but only for the source side of the network connections.

  • annot_args_target (dict, None) – Same as annot_args but only for the target side of the network connections.

  • only_directed (bool) – Use only the directed interactions.

  • only_undirected (bool) – Use only the undirected interactions. Specifically for retrieving and counting the interactions without direction information.

  • undirected_orientation (str, None) – Ignore the direction of all interactions and make sure all of them have a uniform orientation. If id, all interactions will be oriented by the identifiers of the partners; if category, the interactions will be oriented by the categories of the partners.

  • only_effect (int, None) – Use only the interactions with this effect. Either -1 or 1.

  • only_signed (bool) – Use only the interactions with effect sign.

  • only_proteins (bool) – Use only the interactions where each of the partners is a protein (i.e. not complex, miRNA, small molecule or other kind of entity).

  • transmitter_receiver (bool) – On the source side only transmitters, on the target side only receivers.

  • only_generic (bool) – Use only the generic classes. If specific classes are allowed, the size of the combined data frame might be huge.

  • only_composite (bool) – Use only the composite classes. If resource_specific classes are allowed, the size of the combined data frame might be huge.

  • only_functional (bool) – Use only the functional classes. Locational classes are often not relevant and they greatly increase the size of the combined data frame.

  • exclude_intracellular (bool) – Remove the intracellular parent class and its children. These classes are not relevant in intercellular signaling and having them greatly increases the size of the combined data frame.
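The idea behind undirected_orientation can be illustrated with a tiny helper that puts the two partners of an interaction into a canonical order, so that A-B and B-A become the same record. Sorting is only one plausible way to obtain a uniform orientation; the orient function, its arguments and the use of sorted order are assumptions, not a description of what pypath does internally.

```python
def orient(edge, by='id', category_of=None):
    """Return the pair of partners in a uniform order, ignoring direction."""

    a, b = edge

    if by == 'id':
        # Order the partners by their identifiers.
        return tuple(sorted((a, b)))

    if by == 'category':
        # Order the partners by the category each of them belongs to.
        return tuple(sorted((a, b), key=lambda entity: category_of[entity]))

    raise ValueError('by must be either id or category')


# Hypothetical category assignment for two partners.
category_of = {'P00533': 'receptor', 'P01133': 'ligand'}

by_id = orient(('P01133', 'P00533'))
by_category = orient(('P00533', 'P01133'), by='category', category_of=category_of)
```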

populate_classes(update=False)

Creates a classification of proteins according to the custom annotation definitions.

populate_scores()

Creates the consensus score dictionaries based on the number of resources annotating an entity for each composite category.
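Counting the resources that annotate each entity can be sketched with a Counter over the resource-specific member sets of one composite category. The resource names, the identifiers and the consensus_scores function are hypothetical; how pypath stores and normalizes the scores is not shown here.

```python
from collections import Counter

# Hypothetical resource-specific classes behind one composite category:
# resource name -> set of annotated entities.
receptor_by_resource = {
    'ResourceA': {'P00533', 'P04626'},
    'ResourceB': {'P00533'},
    'ResourceC': {'P00533', 'P04626', 'P12830'},
}


def consensus_scores(by_resource):
    """Count, for each entity, the resources that annotate it."""

    scores = Counter()

    for entities in by_resource.values():
        # Each resource contributes at most one count per entity.
        scores.update(entities)

    return dict(scores)


scores = consensus_scores(receptor_by_resource)
```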

process_annot(classdef)

Processes an annotation definition and returns a set of identifiers.

quality_check_table(path=None, fmt='tsv', only_swissprot=True, top=None, **kwargs)

Exports a table in tsv format for quality check and browsing purposes. Each protein is represented in one row of this table, with basic data from UniProt and the list of annotation categories from this database.

Parameters:
  • path (str) – Path for the exported file.

  • fmt (str) – Format: either tsv or latex.

register_network(network)

Sets network as the default network dataset for the instance. All methods afterwards will use this network. It also discards the interclass network data frame, if present, to make sure future queries will address the network registered here.
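Discarding the derived data frame on registration is a cache invalidation pattern. The toy class below sketches it with a list standing in for the data frame; NetworkHolder and its attributes are hypothetical and only mirror the method names used in this reference.

```python
class NetworkHolder:
    """Toy object that caches a derived table for the registered network."""

    def __init__(self):
        self.network = None
        self._interclass_df = None
        self.builds = 0  # counts how often the expensive step runs

    def register_network(self, network):
        self.network = network
        # Discard the cached table so it is rebuilt from the new network.
        self._interclass_df = None

    def get_interclass_network_df(self):
        if self._interclass_df is None:
            self.builds += 1
            self._interclass_df = [tuple(edge) for edge in self.network]
        return self._interclass_df


holder = NetworkHolder()
holder.register_network([('A', 'B')])
first = holder.get_interclass_network_df()
again = holder.get_interclass_network_df()  # served from the cache
holder.register_network([('C', 'D')])
second = holder.get_interclass_network_df()  # rebuilt for the new network
```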

reload()

Reloads the object from the module level.

resources_in_category(key)

Returns a list of resources contributing to the definition of a category.

select(definition, parent=None, resource=None, entity_type=None, **kwargs)

Retrieves a class by its name or definition. The definition can be a class name (string) or a set of entities, or an AnnotDef object defining the contents based on original resources or an AnnotOp which defines the contents as an operation over other definitions.

set_interclass_network_df(**kwargs)

Creates a data frame of the whole inter-class network and keeps it assigned to the instance in order to make subsequent queries faster.

show(name, parent=None, resource=None, **kwargs)

Same as select but prints a table to the console with basic information from the UniProt datasheets.

update_parents()

Creates a dict children with parent class names as keys and sets of children class keys as values. Also creates a dict parents with children class keys as keys and parent class keys as values.
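The two lookup tables can be sketched from a list of class keys as follows. The (name, parent, resource) key layout, the convention that a parent class carries its own name in both the name and parent slots, and the resource names are all assumptions made for illustration; the actual rules pypath uses to match children with parents may differ.

```python
from collections import defaultdict

# Hypothetical class keys, assumed to be (name, parent, resource) tuples.
class_keys = [
    ('receptor', 'receptor', 'ResourceA'),
    ('receptor_kinase', 'receptor', 'ResourceA'),
    ('ligand', 'ligand', 'ResourceB'),
]

children = defaultdict(set)  # parent class name -> set of child class keys
parents = {}                 # child class key -> parent class key

for name, parent, resource in class_keys:
    key = (name, parent, resource)
    children[parent].add(key)
    # Assumption: a parent class is keyed by its own name in both slots.
    parents[key] = (parent, parent, resource)
```

In this sketch a parent class is listed among its own children; filtering out keys where name equals parent would give the stricter reading.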