<h1>How to build networks with OmniPath & pypath</h1>

This tutorial presents a few basic ideas on how to build a comprehensive prior knowledge network (PKN) for Boolean or other kinds of modeling.

Open a terminal and a text editor. First create a working directory for this session and enter it. Then link the shared cache directory from the ``penelope`` storage into this directory. Finally start a Python shell. Note that these commands should be run in bash:

``mkdir pypath-tests``<br>
``cd pypath-tests``<br>
``ln -s /media/penelopeprime/pypath-cache ./cache``<br>
``python # or python3``

After importing a couple of generic modules, import the ``pypath`` module:

In [25]:
from __future__ import print_function
import rlcompleter, readline
readline.parse_and_bind('tab:complete')
import pypath
from pypath import data_formats as df
Then initialize a ``PyPath`` object:

In [2]:
 
pa = pypath.PyPath()
	=== d i s c l a i m e r ===

	All data coming with this module
	either as redistributed copy or downloaded using the
	programmatic interfaces included in the present module
	are available under public domain, are free to use at
	least for academic research or education purposes.
	Please be aware of the licences of all the datasets
	you use in your analysis, and please give appropriate
	credits for the original sources when you publish your
	results. To find out more about data sources please
	look at `pypath.descriptions` and
	`pypath.data_formats.urls`.

	> New session started,
	session ID: 'hr00h'
	logfile: './log/hr00h.log'
	pypath version: 0.7.9
There are several ways to build a network. For example, ``load_omnipath`` reproduces the workflow described in the paper. Even here certain options can be set; check out ``help(pa.load_omnipath)``. ``pypath.data_formats`` (imported here as ``df``) contains various sets of format definitions. E.g. ``data_formats.pathway`` contains the activity flow type resources, while ``data_formats.mirna_target`` contains the miRNA-mRNA interaction resources. To list the resources in one set, just look at the dictionary keys:

<h2>1: Building a network combined from multiple resources</h2>

In [5]:
df.pathway.keys()
Out[5]:
dict_keys(['trip', 'spike', 'signalink3', 'guide2pharma', 'ca1', 'arn', 'nrf2', 'macrophage', 'death', 'pdz', 'signor'])
 
Each element is a ``ReadSettings`` object:

In [6]:
 
df.pathway['signor']
Out[6]:
<pypath.input_formats.ReadSettings at 0x7ff1f99d70b8>
You can define new ``ReadSettings`` objects; this way you can use any file or method as an input for building your network. For now let's use the predefined inputs and load a set of resources, i.e. build a merged network of all these resources:

In [4]:
 
# pa.load_omnipath()
pa.init_network(df.pathway)
	:: Loading data from cache previously downloaded from www.uniprot.org
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/ec920965677ac83b8805d72853c79d45-`.
	:: Loading data from cache previously downloaded from www.uniprot.org
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/ec920965677ac83b8805d72853c79d45-`.
 > TRIP
	:: Reading from cache: cache/trip.edges.pickle
        Processing nodes -- finished: 100%|██████████| 423/423 [00:00<00:00, 170Kit/s]
        Processing edges -- finished: 100%|██████████| 423/423 [00:00<00:00, 55.0Kit/s]
        Processing attributes -- finished: 100%|██████████| 423/423 [00:00<00:00, 1.04Kit/s]
 > SPIKE
	:: Reading from cache: cache/spike.edges.pickle
        Processing nodes -- finished: 100%|██████████| 3.72K/3.72K [00:00<00:00, 227Kit/s]
        Processing edges -- finished: 100%|██████████| 3.72K/3.72K [00:00<00:00, 88.9Kit/s]
        Processing attributes -- finished: 100%|██████████| 3.72K/3.72K [00:03<00:00, 996it/s]s]
 > SignaLink3
	:: Reading from cache: cache/signalink3.edges.pickle
        Processing nodes -- finished: 100%|██████████| 6.94K/6.94K [00:00<00:00, 228Kit/s]
        Processing edges -- finished: 100%|██████████| 6.94K/6.94K [00:00<00:00, 91.6Kit/s]
        Processing attributes -- finished: 100%|██████████| 6.94K/6.94K [00:05<00:00, 1.22Kit/s]
 > Guide2Pharma
	:: Reading from cache: cache/guide2pharma.edges.pickle
        Processing nodes -- finished: 100%|██████████| 266/266 [00:00<00:00, 126Kit/s]
        Processing edges -- finished: 100%|██████████| 266/266 [00:00<00:00, 72.1Kit/s]
        Processing attributes -- finished: 100%|██████████| 266/266 [00:00<00:00, 611it/s]
 > CA1
	:: Reading from cache: cache/ca1.edges.pickle
        Processing nodes -- finished: 100%|██████████| 1.88K/1.88K [00:00<00:00, 138Kit/s]
        Processing edges -- finished: 100%|██████████| 1.88K/1.88K [00:00<00:00, 75.4Kit/s]
        Processing attributes -- finished: 100%|██████████| 1.88K/1.88K [00:01<00:00, 1.58Kit/s]
 > ARN
	:: Reading from cache: cache/arn.edges.pickle
        Processing nodes -- finished: 100%|██████████| 95.0/95.0 [00:00<00:00, 59.6Kit/s]
        Processing edges -- finished: 100%|██████████| 95.0/95.0 [00:00<00:00, 33.9Kit/s]
        Processing attributes -- finished: 100%|██████████| 95.0/95.0 [00:00<00:00, 483it/s]
 > NRF2ome
	:: Reading from cache: cache/nrf2ome.edges.pickle
        Processing nodes -- finished: 100%|██████████| 109/109 [00:00<00:00, 63.3Kit/s]
        Processing edges -- finished: 100%|██████████| 109/109 [00:00<00:00, 35.2Kit/s]
        Processing attributes -- finished: 100%|██████████| 109/109 [00:00<00:00, 540it/s]]
 > Macrophage
	:: Reading from cache: cache/macrophage.edges.pickle
        Processing nodes -- finished: 100%|██████████| 4.85K/4.85K [00:00<00:00, 251Kit/s]
        Processing edges -- finished: 100%|██████████| 4.85K/4.85K [00:00<00:00, 79.5Kit/s]
        Processing attributes -- finished: 100%|██████████| 4.85K/4.85K [00:01<00:00, 3.98Kit/s]
 > DeathDomain
	:: Reading from cache: cache/deathdomain.edges.pickle
        Processing nodes -- finished: 100%|██████████| 236/236 [00:00<00:00, 79.8Kit/s]
        Processing edges -- finished: 100%|██████████| 236/236 [00:00<00:00, 45.5Kit/s]
        Processing attributes -- finished: 100%|██████████| 236/236 [00:00<00:00, 1.46Kit/s]
 > PDZBase
	:: Reading from cache: cache/pdzbase.edges.pickle
        Processing nodes -- finished: 100%|██████████| 133/133 [00:00<00:00, 83.8Kit/s]
        Processing edges -- finished: 100%|██████████| 133/133 [00:00<00:00, 12.6Kit/s]
        Processing attributes -- finished: 100%|██████████| 133/133 [00:00<00:00, 510it/s]
 > Signor
	:: Loading data from cache previously downloaded from signor.uniroma2.it
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/a6746e1fff57be04ea20f53a3376ef42-download_entity.php`.
        Processing nodes -- finished: 100%|██████████| 9.88K/9.88K [00:00<00:00, 234Kit/s]
        Processing edges -- finished: 100%|██████████| 9.88K/9.88K [00:00<00:00, 104Kit/s]
        Processing attributes -- finished: 100%|██████████| 9.88K/9.88K [00:07<00:00, 1.28Kit/s]
 :: Comparing with reference lists... done.

 > 14293 interactions between 4609 nodes
 from 11 resources have been loaded,
 for details see the log: ./log/hr00h.log
Looking at the last lines of messages we see the size of the network. The network itself is represented as an [``igraph`` object](http://igraph.org/):

In [7]:
 
pa.graph
Out[7]:
<igraph.Graph at 0x7ff1f30aa228>
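
The idea of a merged network can be pictured with a small, self-contained sketch. This is not how ``pypath`` implements ``init_network``: the function ``merge_resources`` and the toy edge lists are hypothetical, and the node names are placeholders, not real interactions. Each interaction is stored once, keyed by the unordered pair of its partners, and the resources reporting it are collected in a set:

```python
from collections import defaultdict


def merge_resources(resources):
    """Merge per-resource edge lists into one undirected network.

    `resources` maps a resource name to an iterable of (a, b) pairs.
    Returns a dict mapping frozenset({a, b}) to the set of resource
    names which report that interaction.
    """
    merged = defaultdict(set)
    for name, edges in resources.items():
        for a, b in edges:
            # The pair is unordered, so (a, b) and (b, a) collapse
            # into the same edge.
            merged[frozenset((a, b))].add(name)
    return dict(merged)


# Toy input with placeholder node names (an assumption for illustration).
toy = {
    'signor': [('A', 'B'), ('B', 'C')],
    'spike': [('B', 'A'), ('C', 'D')],
}

network = merge_resources(toy)
nodes = set().union(*network)

print('%u interactions between %u nodes from %u resources'
      % (len(network), len(nodes), len(toy)))
```

Keeping the set of resources on each edge is what makes it possible later to tell which databases support a given interaction.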
 
<h2>2: Directions and effect signs</h2>

This ``igraph.Graph`` object is undirected by default, because direction information is too complex to be stored in ``igraph`` data structures; hence we keep it in separate objects.

In [8]:
pa.graph.is_directed()
Out[8]:
False
 
Let's see one of these ``Direction`` objects.

In [23]:
print(pa.graph.es[28]['dirs'])
Directions and signs of interaction between P16949 and P17612

	P16949 <=== P17612 :: Signor
	P16949 <=-= P17612 :: Signor

In [26]:
pa.graph.es[28]['dirs'].is_inhibition()
Out[26]:
True
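
To illustrate why such information does not fit into a plain edge attribute, here is a minimal sketch of a per-edge record. The class ``EdgeDirections`` is hypothetical and much simpler than the real ``Direction`` object; it only assumes that every direction and every effect sign is stored together with the resources supporting it. The example values follow our reading of the printout above, i.e. that Signor reports a direction from P17612 to P16949 with an inhibitory sign:

```python
class EdgeDirections(object):
    """Hypothetical per-edge record of directions and effect signs."""

    def __init__(self, node_a, node_b):
        self.nodes = frozenset((node_a, node_b))
        # Keys are (source, target) tuples, values are sets of resources.
        self.directed = {}
        self.signs = {'stimulation': {}, 'inhibition': {}}

    def add(self, source, target, resource, sign=None):
        """Record that `resource` reports the direction source -> target."""
        if frozenset((source, target)) != self.nodes:
            raise ValueError('Nodes do not belong to this edge.')
        self.directed.setdefault((source, target), set()).add(resource)
        if sign is not None:
            self.signs[sign].setdefault((source, target), set()).add(resource)

    def is_directed(self):
        return bool(self.directed)

    def is_stimulation(self):
        return bool(self.signs['stimulation'])

    def is_inhibition(self):
        return bool(self.signs['inhibition'])


dirs = EdgeDirections('P16949', 'P17612')
dirs.add('P17612', 'P16949', 'Signor', sign='inhibition')

print(dirs.is_directed())     # True
print(dirs.is_inhibition())   # True
print(dirs.is_stimulation())  # False
```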
 
The undirected graph object can be converted to a directed ``igraph.Graph`` object using the ``get_directed()`` method. This happens automatically in the background once we start to query the ``PyPath`` object about causal relationships. We can ask which nodes are upstream or downstream of certain nodes, which nodes stimulate or inhibit them, and which nodes are stimulated or inhibited by them. For example, to get the nodes inhibited by ULK1:

In [28]:
pa.gs_inhibits('ULK1')
        Setting directions -- finished: 100%|██████████| 14.3K/14.3K [00:22<00:00, 636it/s]
Out[28]:
<pypath.main._NamedVertexSeq at 0x7ff1ce67b550>
 
The ``gs_`` prefix is necessary to express that ULK1 is a Gene Symbol. The returned ``_NamedVertexSeq`` object is a generator which can be iterated as a series of integer vertex IDs, UniProt IDs or Gene Symbols. Let's make it a list of Gene Symbols:

In [29]:
list(pa.gs_inhibits('ULK1').gs())
Out[29]:
['PRKAA2', 'DYNLL1', 'PRKAA1', 'PRKAG3']
 
I know this reads like "inhibits ULK1", and it is counterintuitive that these are actually the nodes inhibited by ULK1. This is a design mistake, but changing it would be even more confusing. Similarly, the UniProt IDs of the nodes stimulating EGFR:

In [30]:
list(pa.up_stimulated_by('P00533').up())
Out[30]:
['P17252',
 'Q03135',
 'P12931',
 'P22681',
 'P17275',
 'Q13671',
 'P18146',
 'P01135',
 'P08253',
 'P10276',
 'P00519',
 'P01133',
 'P04637',
 'P04626',
 'P52735',
 'Q14289',
 'Q9Y5X1',
 'Q99075',
 'P35070',
 'P15514',
 'Q12913',
 'Q99527',
 'O14944',
 'Q6UW88']
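
The logic of such a query, and of a result that can be read either as UniProt IDs or as Gene Symbols, can be sketched on a plain edge list. Everything below is a simplified stand-in, not ``pypath`` code: ``NamedVertexView`` and ``stimulators_of`` are hypothetical names, chosen so that the direction of the query is explicit. P01133 (EGF) and P12931 (SRC) are taken from the result above; ``X00001`` is a made-up accession added only to show that edges with a different sign are filtered out.

```python
class NamedVertexView(object):
    """Hypothetical helper: one set of vertices, several naming schemes."""

    def __init__(self, uniprots, genesymbols):
        self._uniprots = list(uniprots)
        self._genesymbols = genesymbols  # dict: UniProt ID -> Gene Symbol

    def up(self):
        """Iterate the vertices as UniProt IDs."""
        for uniprot in self._uniprots:
            yield uniprot

    def gs(self):
        """Iterate the vertices as Gene Symbols (UniProt ID if unknown)."""
        for uniprot in self._uniprots:
            yield self._genesymbols.get(uniprot, uniprot)


GENESYMBOLS = {'P00533': 'EGFR', 'P01133': 'EGF', 'P12931': 'SRC'}

# (source, target, sign) triples
EDGES = [
    ('P01133', 'P00533', 'stimulation'),
    ('P12931', 'P00533', 'stimulation'),
    ('X00001', 'P00533', 'inhibition'),  # made-up accession
]


def stimulators_of(target, edges=EDGES):
    """Return the sources of all stimulatory edges pointing to `target`."""
    hits = [
        source
        for source, tgt, sign in edges
        if tgt == target and sign == 'stimulation'
    ]
    return NamedVertexView(hits, GENESYMBOLS)


result = stimulators_of('P00533')
print(list(result.up()))  # ['P01133', 'P12931']
print(list(result.gs()))  # ['EGF', 'SRC']
```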
 
<h2>3: Annotations</h2>

 
Pypath is able to combine many different kinds of annotations with the network, i.e. to assign data to edges or nodes. Among the most often used are post-translational modifications. These are enzyme-substrate interactions assigned to edges. ``pypath.ptm`` is a standalone submodule which is able to collect enzyme-substrate interactions from multiple resources and also to translate them from one organism to another, usually between rodents and human. The collected enzyme-substrate relationships can optionally be assigned to the edges of a network. This can be done with a single call:

In [31]:
 
pa.load_ptms()
        Downloading `DEPOD_201408_human_phosphatase-substrate.txt` from depod.bioss.uni-freiburg.de -- 95.93kB downloaded: 178Kit [00:00, 422Kit/s]               00, 37.8Kit/s]
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/d04d1d3e13e165085a804c8b09b5ab03-DEPOD_201408_human_phosphatase-substrate.txt`.
        Downloading `DEPOD_201405_human_phosphatase-substrate.mitab` from depod.bioss.uni-freiburg.de -- 564.15kB downloaded: 911Kit [00:00, 2.91Mit/s]          0:00, 159Kit/s]
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6a711369ecf9dcff8c5ed88996685b54-DEPOD_201405_human_phosphatase-substrate.mitab`.
	:: Loading data from cache previously downloaded from www.uniprot.org
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/05c7ce2e64bbe145c1f1b6bc34b8a7cd-`.
	:: Loading data from cache previously downloaded from ftp.uniprot.org
	:: Ready. Resulted `gz extracted data` of type unicode string.                                                                                       
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/fdd147089cd934ee56155a95053b0aec-uniprot_sprot_varsplic.fasta.gz`.
        Loading dephosphorylation data from DEPOD -- finished: 100%|██████████| 299/299 [00:00<00:00, 79.9Kit/s]
	:: Loading data from cache previously downloaded from signor.uniroma2.it
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/a6746e1fff57be04ea20f53a3376ef42-download_entity.php`.
        Processing PTMs from Signor -- finished: 100%|██████████| 8.73K/8.73K [00:02<00:00, 3.03Kit/s]
	:: Loading data from cache previously downloaded from genome.cshlp.org
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/8bc6971b54289acf8ab5b960ba9a0fa7-Supplementary_files_S1-S5.xls`.
Processing PTMs from Li2012: initializing:   0%|          | 0.00/349 [00:00<?, ?it/s]
	:: Loading 'genesymbol-syn' to 'uniprot' mapping table
        Processing PTMs from Li2012 -- finished: 100%|██████████| 349/349 [00:00<00:00, 2.57Kit/s]
	:: Loading data from cache previously downloaded from www.hprd.org
	:: Extracting tgz data                                                                                                                               
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/72f76ffea18e507b9d0fd9297226191d-HPRD_FLAT_FILES_041310.tar.gz`.
        Processing PTMs from HPRD -- busy:   0%|          | 23.0/4.67K [00:00<00:20, 226it/s]
	:: Loading 'refseqp' to 'uniprot' mapping table
        Processing PTMs from HPRD -- finished: 100%|██████████| 4.67K/4.67K [00:01<00:00, 3.29Kit/s]
	:: Loading data from cache previously downloaded from mimp.baderlab.org
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/cace5d811a51dd29de00abf9267c08e0-phosphorylation_data.tab`.
	:: Loading data from cache previously downloaded from kinase.com
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/eeb42195630e3f1c3abc7bb4487c102d-Table%20S2.txt`.
        Processing PTMs from MIMP -- finished: 100%|██████████| 17.0K/17.0K [00:13<00:00, 1.29Kit/s]
	:: Loading data from cache previously downloaded from phosphonetworks.org
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/7856f40462d035fc66ec4f3fd74edf05-highResolutionNetwork.csv`.
        Processing PTMs from PhosphoNetworks -- finished: 100%|██████████| 4.42K/4.42K [00:00<00:00, 4.51Kit/s]
	:: Loading data from cache previously downloaded from phospho.elm.eu.org
	:: Extracting tgz data                                                                                                                               
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/90581004e4c132cc50e3c08e4294d3af-phosphoELM_vertebrate_latest.dump.tgz`.
	:: Loading data from cache previously downloaded from phospho.elm.eu.org
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/06a2ce5395900c217a2108aa5dd6e44f-kinases.html`.
        Processing PTMs from phosphoELM -- finished: 100%|██████████| 2.43K/2.43K [00:00<00:00, 3.91Kit/s]
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6bf9e1416d4635031d824080f24525a2-N-linked.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/076038aa317d12b10231ad543983f046-O-linked.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/e6fe5c6b9f3a55330b93627427e03580-C-linked.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/24fe81aadcd4d0d8e0bae7b9d128e989-Phosphorylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/4366939c454b57358be3569abf90e364-Acetylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/5194c620102533019d7352758cb59e96-Methylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/2fbe62a1b4339358124d1c543df70df0-Myristoylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/84a2cdcbc4db85ed031e8029bba07185-Palmitoylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/ba086659818d09cdd19827a635388e85-Prenylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/205a79b3c49ea7dc6c4024adccd98589-Carboxylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/b591f8dd0464a31ae3e9645a2877d4aa-Sulfation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/993079c14c9f731648f797c141967027-Ubiquitylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/33f68ffb9d728824a13c5eadc656fc1c-Sumoylation.tgz`.
	:: Loading data from cache previously downloaded from dbptm.mbc.nctu.edu.tw
	:: Ready. Resulted `tgz extracted data` of type dict of unicode strings.                                                                             
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/a63e0d01c4732c90c07e73f660f3954e-Nitrosylation.tgz`.
        Processing PTMs from dbPTM -- finished: 100%|██████████| 223K/223K [00:00<00:00, 220Kit/s]
	:: Loading data from cache previously downloaded from www.phosphosite.org
	:: Ready. Resulted `gz extracted data` of type file object.                                                                                          
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/8182cf0471d401885aa59b9900a99e60-Kinase_Substrate_Dataset.gz`.
	:: Loading data from cache previously downloaded from ftp.ncbi.nih.gov
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6e6a9dce3cf617fdfa58ce73d6a9b51c-homologene.data`.
	:: Loading 'refseqp' to 'uniprot' mapping table
	:: Loading data from cache previously downloaded from ftp.ncbi.nih.gov
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6e6a9dce3cf617fdfa58ce73d6a9b51c-homologene.data`.
	:: Loading 'refseqp' to 'uniprot' mapping table
	:: Loading data from cache previously downloaded from ftp.ncbi.nih.gov
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6e6a9dce3cf617fdfa58ce73d6a9b51c-homologene.data`.
	:: Loading 'refseqp' to 'uniprot' mapping table
	:: Loading data from cache previously downloaded from ftp.ncbi.nih.gov
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6e6a9dce3cf617fdfa58ce73d6a9b51c-homologene.data`.
	:: Loading 'refseqp' to 'uniprot' mapping table
	:: Loading data from cache previously downloaded from ftp.ncbi.nih.gov
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6e6a9dce3cf617fdfa58ce73d6a9b51c-homologene.data`.
	:: Loading 'refseqp' to 'uniprot' mapping table
	:: Loading data from cache previously downloaded from ftp.ncbi.nih.gov
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6e6a9dce3cf617fdfa58ce73d6a9b51c-homologene.data`.
	:: Loading 'refseqp' to 'uniprot' mapping table
	:: Loading data from cache previously downloaded from ftp.ncbi.nih.gov
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/6e6a9dce3cf617fdfa58ce73d6a9b51c-homologene.data`.
Processing PTMs from PhosphoSite: initializing:   0%|          | 0.00/10.6K [00:00<?, ?it/s]
	:: Loading 'refseqp' to 'uniprot' mapping table
        Processing PTMs from PhosphoSite -- finished: 100%|██████████| 10.6K/10.6K [00:03<00:00, 3.32Kit/s]
	:: Directionality set for 175 interactions
	   based on known (de)phosphorylation events.
 
Let's see how enzyme-substrate interactions are represented:

In [34]:
print(pa.graph.es['ptm'][111][0])
Domain-motif interaction:
  Domain in protein P17612-1:
	Name: unknown
	Range: 0-0
	3D structures: 
  PTM: phosphorylation in protein O95644-1
    Motif: unknown

    Residue: Residue S-245 in protein O95644-1
  Data sources: Signor, PhosphoSite
  References: 
  3D structures: 

 
All details can be accessed as attributes of this object:

In [37]:
pa.graph.es['ptm'][111][0].ptm.residue
Out[37]:
Residue S-245 in protein O95644-1
In [38]:
pa.graph.es['ptm'][111][0].ptm.residue.number
Out[38]:
245
 
Between two proteins multiple enzyme-substrate relationships might occur:

In [40]:
 
print(pa.graph.es['ptm'][111][1])
Domain-motif interaction:
  Domain in protein P17612-1:
	Name: unknown
	Range: 0-0
	3D structures: 
  PTM: phosphorylation in protein O95644-1
    Motif: unknown

    Residue: Residue S-269 in protein O95644-1
  Data sources: Signor, PhosphoSite
  References: 
  3D structures: 

 
We can also load many other annotations. As an example, here we load complex memberships from the CORUM database and cell type specific expression data from the Human Protein Atlas (HPA):

In [41]:
 
pa.load_corum()
pa.load_hpa()
	:: Loading data from cache previously downloaded from mips.helmholtz-muenchen.de
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/3571354f160dc5d03397a0e9813c458f-allComplexes.txt`.
        Processing data -- finished: 100%|██████████| 3.59K/3.59K [00:00<00:00, 32.0Kit/s]
        Downloading `normal_tissue.csv.zip` from www.proteinatlas.org -- 4.52MB downloaded: 6.09Mit [00:01, 5.67Mit/s]             00, 462Kit/s]
	:: Ready. Resulted `zip extracted data` of type dict of file objects.                                                                                
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/162c9fee49adeea10cc2f7b99592a9c9-normal_tissue.csv.zip`.
	:: Loading 'uniprot-sec' to 'uniprot-pri' mapping table
	:: Loading data from cache previously downloaded from www.uniprot.org
	:: Ready. Resulted `plain text` of type unicode string.                                                                                              
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/842e6f2bc63115660aec8aff917330ce-`.
	:: Loading data from cache previously downloaded from ftp.uniprot.org
	:: Ready. Resulted `plain text` of type file object.                                                                                                 
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/49314fe217bf0f2a5544a2c4314b4adf-sec_ac.txt`.
        Reading from file -- finished: 0.00it [00:00, ?it/s]
	:: Loading 'genesymbol' to 'trembl' mapping table
	:: Loading 'genesymbol' to 'swissprot' mapping table
	:: Loading 'genesymbol-syn' to 'swissprot' mapping table
        Downloading `cancer.csv.zip` from www.proteinatlas.org -- 5.05MB downloaded: 5.57Mit [00:02, 2.24Mit/s]5M [00:02<00:00, 345Kit/s]
	:: Ready. Resulted `zip extracted data` of type dict of file objects.                                                                                
	:: Local file at `/home/denes/Dokumentumok/pw/dev/src/cache/8092155dfad3ed61a7ba405960b99a82-cancer.csv.zip`.
 
We can also load the canonical pathways (e.g. RTK, WNT, Hedgehog):

In [ ]:
 
pa.load_all_pathways()
 
Or set functional annotations, i.e. whether a protein is a receptor or a transcription factor:

In [ ]:
pa.set_receptors()
#pa.set_transcription_factors()
#pa.set_kinases()
#pa.set_druggability()
#pa.set_disease_genes()
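
Conceptually, a functional annotation of this kind is a boolean flag set on each node by comparing it against a reference list. A minimal sketch of that idea follows; the function ``set_flag``, the attribute names ``rec`` and ``tf`` and the tiny reference sets are assumptions made for illustration, not the lists or attribute names ``pypath`` uses:

```python
# Tiny illustrative reference sets (not the lists used by pypath).
RECEPTORS = {'P00533'}              # EGFR
TRANSCRIPTION_FACTORS = {'P04637'}  # TP53

# Nodes represented as plain dicts with a UniProt ID as their name.
vertices = [{'name': 'P00533'}, {'name': 'P04637'}, {'name': 'P12931'}]


def set_flag(vertices, attr, reference):
    """Set a boolean attribute on every vertex found in `reference`."""
    for vertex in vertices:
        vertex[attr] = vertex['name'] in reference


set_flag(vertices, 'rec', RECEPTORS)
set_flag(vertices, 'tf', TRANSCRIPTION_FACTORS)

receptors = [v['name'] for v in vertices if v['rec']]
print(receptors)  # ['P00533']
```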
<h2>4: Selecting subnetworks</h2>

 
To extract data from ``pypath`` or OmniPath for your model, you can apply many different methods, and you need to be creative. Here are just a few examples. First let's take the first neighbours network around EGFR:

In [44]:
nb_egfr = pa.neighbourhood_network('P00533')
 
This returns an igraph object:

In [48]:
 
print('Graph with %u nodes and %u edges.' % (nb_egfr.vcount(), nb_egfr.ecount()))
Graph with 114 nodes and 381 edges.
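
To illustrate what a first-neighbour network contains, here is a minimal pure-Python sketch on a toy edge list. It is not how ``pypath`` implements ``neighbourhood_network``; the edge list and the helper ``first_neighbourhood`` are hypothetical. The sketch keeps the edges among the neighbours too, which is an assumption, although the edge count above (381 edges for 114 nodes) suggests the real method does the same.

```python
# Toy undirected interaction list; the selection of pairs is illustrative only.
edges = [
    ('EGFR', 'GRB2'),
    ('EGFR', 'SHC1'),
    ('GRB2', 'SOS1'),
    ('GRB2', 'SHC1'),
    ('SOS1', 'KRAS'),
]


def first_neighbourhood(edges, centre):
    """Return (nodes, edges) induced by `centre` and its direct partners."""
    nodes = {centre}
    for a, b in edges:
        if a == centre:
            nodes.add(b)
        elif b == centre:
            nodes.add(a)
    # Keep every edge whose two endpoints both lie in the neighbourhood,
    # so interactions among the neighbours are retained as well.
    sub_edges = [(a, b) for a, b in edges if a in nodes and b in nodes]
    return nodes, sub_edges


nb_nodes, nb_edges = first_neighbourhood(edges, 'EGFR')
print('Graph with %u nodes and %u edges.' % (len(nb_nodes), len(nb_edges)))
```
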
That was easy. As a next example, suppose we have a set of proteins and want to build a subnetwork containing all of them. First let's see which vertex attributes are available:

In [52]:
pa.graph.vertex_attributes()
Out[52]:
['type',
 'name',
 'nameType',
 'originalNames',
 'ncbi_tax_id',
 'exp',
 'label',
 'sources',
 'references',
 'slk_pathways',
 'g2p_ligand',
 'g2p_receptor',
 'ca1_function',
 'ca1_location',
 'atg',
 'complexes',
 'adrenal gland:glandular cells',
 'appendix:glandular cells',
 'appendix:lymphoid tissue',
 'bone marrow:hematopoietic cells',
 'breast:adipocytes',
 'breast:glandular cells',
 'breast:myoepithelial cells',
 'bronchus:respiratory epithelial cells',
 'caudate:glial cells',
 'caudate:neuronal cells',
 'cerebellum:cells in granular layer',
 'cerebellum:cells in molecular layer',
 'cerebellum:Purkinje cells',
 'cerebral cortex:endothelial cells',
 'cerebral cortex:glial cells',
 'cerebral cortex:neuronal cells',
 'cerebral cortex:neuropil',
 'cervix: uterine',
 'colon:endothelial cells',
 'colon:glandular cells',
 'colon:peripheral nerve/ganglion',
 'duodenum:glandular cells',
 'endometrium 1:cells in endometrial stroma',
 'endometrium 1:glandular cells',
 'endometrium 2:cells in endometrial stroma',
 'endometrium 2:glandular cells',
 'epididymis:glandular cells',
 'esophagus:squamous epithelial cells',
 'fallopian tube:glandular cells',
 'gallbladder:glandular cells',
 'heart muscle:myocytes',
 'hippocampus:glial cells',
 'hippocampus:neuronal cells',
 'kidney:cells in glomeruli',
 'kidney:cells in tubules',
 'liver:bile duct cells',
 'liver:hepatocytes',
 'lung:macrophages',
 'lung:pneumocytes',
 'lymph node:germinal center cells',
 'lymph node:non-germinal center cells',
 'nasopharynx:respiratory epithelial cells',
 'oral mucosa:squamous epithelial cells',
 'ovary:ovarian stroma cells',
 'pancreas:exocrine glandular cells',
 'pancreas:islets of Langerhans',
 'parathyroid gland:glandular cells',
 'placenta:decidual cells',
 'placenta:trophoblastic cells',
 'prostate:glandular cells',
 'rectum:glandular cells',
 'salivary gland:glandular cells',
 'seminal vesicle:glandular cells',
 'skeletal muscle:myocytes',
 'skin 1:fibroblasts',
 'skin 1:keratinocytes',
 'skin 1:Langerhans',
 'skin 1:melanocytes',
 'skin 2:epidermal cells',
 'small intestine:glandular cells',
 'smooth muscle:smooth muscle cells',
 'soft tissue 1:adipocytes',
 'soft tissue 1:fibroblasts',
 'soft tissue 1:peripheral nerve',
 'soft tissue 2:adipocytes',
 'soft tissue 2:fibroblasts',
 'soft tissue 2:peripheral nerve',
 'spleen:cells in red pulp',
 'spleen:cells in white pulp',
 'stomach 1:glandular cells',
 'stomach 2:glandular cells',
 'testis:cells in seminiferous ducts',
 'testis:Leydig cells',
 'thyroid gland:glandular cells',
 'tonsil:germinal center cells',
 'tonsil:non-germinal center cells',
 'tonsil:squamous epithelial cells',
 'urinary bladder:urothelial cells',
 'vagina:squamous epithelial cells',
 'ovary:follicle cells',
 'soft tissue 1:chondrocytes',
 'soft tissue 2:chondrocytes',
 'hair:cells in cortex/medulla',
 'hair:cells in cuticle',
 'hair:cells in external root sheath',
 'hair:cells in internal root sheath',
 'cerebral cortex:processes/nerve bundles',
 'lactating breast:lactating glandular cells',
 'adrenal gland:cells in zona fasciculata',
 'adrenal gland:cells in zona glomerulosa',
 'adrenal gland:cells in zona reticularis',
 'adrenal gland:medullary cells',
 'skin:sebaceous cells',
 'skin:secretory cells',
 'skin:sweat ducts',
 'retina:cells in inner nuclear layer',
 'retina:cells in photoreceptor layer',
 'retina:ganglion cells',
 'retina:nerve fibers in inner plexiform layer',
 'retina:nerve fibers in nerve fiber layer',
 'retina:nerve fibers in outer plexiform layer',
 'pituitary gland:cells in anterior',
 'pituitary gland:cells in posterior',
 'hypothalamus:glial cells',
 'hypothalamus:neuronal cells',
 'hypothalamus:processes/nerve bundles',
 'lactating breast:ductal cells',
 'eye:lens',
 'hippocampus:processes/nerve bundles',
 'rec',
 'tf',
 'kegg_pathways',
 'signor_pathways',
 'signalink_pathways']
 
For example, let's build a network of the genes expressed at high levels in kidney glomeruli. We have already loaded these annotations from the Human Protein Atlas:

In [54]:
glom_h = set([v.index for v in pa.graph.vs if v['kidney:cells in glomeruli'] == 3])
len(glom_h)
Out[54]:
305
In [55]:
 
glom_net = pa.graph.induced_subgraph(glom_h)
In [56]:
glom_net.vcount()
Out[56]:
305
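
The subgraph has exactly as many vertices as the selection, because an induced subgraph keeps all selected vertices and only the edges that run between them. A small pure-Python sketch of that idea follows; the vertex table and edge list are made up for illustration and do not come from ``pypath``.

```python
# Hypothetical vertex table: index -> attributes.
# Expression level is 0-3, or None where no annotation exists.
vertices = {
    0: {'label': 'A', 'kidney:cells in glomeruli': 3},
    1: {'label': 'B', 'kidney:cells in glomeruli': 1},
    2: {'label': 'C', 'kidney:cells in glomeruli': 3},
    3: {'label': 'D', 'kidney:cells in glomeruli': None},
    4: {'label': 'E', 'kidney:cells in glomeruli': 3},
}
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (3, 4)]

# Step 1: select vertex indices by attribute value.
selected = {
    i for i, attrs in vertices.items()
    if attrs['kidney:cells in glomeruli'] == 3
}

# Step 2: keep only the edges with both endpoints inside the selection.
induced_edges = [(a, b) for a, b in edges if a in selected and b in selected]

print(len(selected), 'vertices,', len(induced_edges), 'edges')
```

Vertices that lose all their partners stay in the subgraph as isolated nodes, which is why a plot of such a selection can look sparse.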
In [ ]:
import igraph
igraph.plot(glom_net)
 
Or select only those that are highly expressed in the glomeruli but not in the tubules (tubule expression level at most 1):

In [59]:
 
glom_spec = (set([v.index for v in pa.graph.vs if v['kidney:cells in glomeruli'] == 3]) -
             set([v.index for v in pa.graph.vs if v['kidney:cells in tubules'] > 1]))
In [60]:
len(glom_spec)
Out[60]:
31
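
One caveat with this kind of filter: if some vertices have no value for an attribute (igraph stores ``None`` for unset attributes), a comparison such as ``None > 1`` evaluates to ``False`` in Python 2 but raises ``TypeError`` in Python 3. Whether this affects your network depends on how the annotations were loaded, so the sketch below is only a defensive pattern on made-up data. The helper ``above`` and the protein names are hypothetical.

```python
# Hypothetical expression levels (0-3); None marks a missing annotation.
levels = {
    'P1': {'glomeruli': 3, 'tubules': 0},
    'P2': {'glomeruli': 3, 'tubules': 3},
    'P3': {'glomeruli': 3, 'tubules': None},
    'P4': {'glomeruli': 2, 'tubules': 0},
    'P5': {'glomeruli': 3, 'tubules': 1},
}


def above(value, threshold):
    """True only if an annotation exists and exceeds the threshold."""
    return value is not None and value > threshold


# Highly expressed in glomeruli ...
glom_high = {p for p, lv in levels.items() if lv['glomeruli'] == 3}
# ... minus everything expressed above level 1 in tubules.
tub_present = {p for p, lv in levels.items() if above(lv['tubules'], 1)}
glom_specific = glom_high - tub_present

print(sorted(glom_specific))
```

Treating a missing value as "not above the threshold" reproduces the Python 2 behaviour; you may prefer to exclude unannotated proteins instead.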
In [61]:
glom_spec_net = pa.graph.induced_subgraph(glom_spec)
In [ ]:
 
igraph.plot(glom_spec_net)
As we can see, these 31 genes are barely connected. Maybe we should add other nodes to connect them. Another example: select the autophagy-related genes based on the Autophagy Regulatory Network, and look up all paths between these genes and the glomerulus-specific genes:

In [62]:
atg = set([v.index for v in pa.graph.vs if v['atg']])
In [63]:
 
len(atg)
Out[63]:
76
In [68]:
atg_glom_paths = pa.find_all_paths(list(atg), list(glom_spec))
        Looking up all paths up to length 2 -- finished: 100%|██████████| 2.36K/2.36K [00:03<00:00, 749it/s]
In [69]:
atg_glom_paths
Out[69]:
[[897, 437, 1260],
 [644, 391, 676],
 [644, 1859, 1407],
 [644, 1859, 1096],
 [644, 1096],
 [644, 634, 1096],
 [644, 1712, 1096],
 [644, 1236, 1096],
 [644, 156, 1358],
 [644, 1712, 3023],
 [644, 1859, 1109],
 [644, 1712, 1109],
 [644, 1252, 1109],
 [644, 1265, 1260],
 [644, 1859, 3955],
 [644, 1859, 767],
 [1669, 3161, 676],
 [1669, 1859, 1407],
 [1669, 1358, 1407],
 [1669, 634, 1096],
 [1669, 1682, 1096],
 [1669, 1712, 1096],
 [1669, 240, 1096],
 [1669, 1859, 1096],
 [1669, 1358],
 [1669, 1933, 1358],
 [1669, 1712, 3023],
 [1669, 1712, 1109],
 [1669, 1859, 1109],
 [1669, 1859, 3955],
 [1669, 1859, 767],
 [1927, 1207, 1666],
 [1927, 1332, 1666],
 [1927, 1790, 3718],
 [1927, 1207, 150],
 [1927, 578, 790],
 [1927, 695, 2330],
 [1927, 578, 676],
 [1927, 522, 1407],
 [1927, 1043, 1407],
 [1927, 1358, 1407],
 [1927, 1559, 1407],
 [1927, 79, 2746],
 [1927, 445, 1096],
 [1927, 578, 1096],
 [1927, 634, 1096],
 [1927, 770, 1096],
 [1927, 932, 1096],
 [1927, 1236, 1096],
 [1927, 1712, 1096],
 [1927, 1775, 1096],
 [1927, 1809, 1096],
 [1927, 1358],
 [1927, 1712, 3023],
 [1927, 1065, 1109],
 [1927, 1439, 1109],
 [1927, 1712, 1109],
 [1927, 1790, 219],
 [1927, 1629, 868],
 [1927, 1852, 868],
 [1927, 1104, 1260],
 [1927, 1265, 1260],
 [1927, 1531, 1260],
 [1927, 1790, 1260],
 [1927, 433, 767],
 [1927, 1776, 767],
 [2440, 695, 2330],
 [2440, 24, 676],
 [2440, 240, 1096],
 [2440, 437, 1260],
 [2440, 24, 118],
 [525, 1207, 1666],
 [525, 3, 150],
 [525, 1110, 150],
 [525, 1207, 150],
 [525, 3, 676],
 [525, 24, 676],
 [525, 3891, 676],
 [525, 522, 1407],
 [525, 1859, 1407],
 [525, 634, 1096],
 [525, 644, 1096],
 [525, 1712, 1096],
 [525, 1236, 1096],
 [525, 1809, 1096],
 [525, 277, 1096],
 [525, 1859, 1096],
 [525, 1874, 1096],
 [525, 1402, 1096],
 [525, 509, 1096],
 [525, 1927, 1358],
 [525, 1712, 3023],
 [525, 1712, 1109],
 [525, 1859, 1109],
 [525, 1369, 1260],
 [525, 1531, 1260],
 [525, 1859, 3955],
 [525, 24, 118],
 [525, 3, 3967],
 [525, 4555, 3967],
 [525, 1859, 767],
 [1682, 1344, 790],
 [1682, 578, 790],
 [1682, 1973, 2208],
 [1682, 578, 676],
 [1682, 522, 1407],
 [1682, 1859, 1407],
 [1682, 1878, 180],
 [1682, 1809, 1096],
 [1682, 578, 1096],
 [1682, 1859, 1096],
 [1682, 1096],
 [1682, 1874, 1096],
 [1682, 1878, 1096],
 [1682, 874, 1096],
 [1682, 634, 1096],
 [1682, 673, 1096],
 [1682, 1669, 1358],
 [1682, 1859, 1109],
 [1682, 1629, 868],
 [1682, 1368, 1260],
 [1682, 1265, 1260],
 [1682, 1859, 3955],
 [1682, 1344, 767],
 [1682, 1859, 767],
 [1043, 3083, 150],
 [1043, 1110, 150],
 [1043, 1773, 150],
 [1043, 1054, 790],
 [1043, 578, 790],
 [1043, 695, 2330],
 [1043, 578, 676],
 [1043, 1407],
 [1043, 578, 1096],
 [1043, 1874, 1096],
 [1043, 1407, 1358],
 [1043, 1927, 1358],
 [1043, 1209, 1109],
 [1043, 36, 1260],
 [1043, 1350, 1260],
 [661, 1682, 1096],
 [789, 1248, 150],
 [789, 1248, 1109],
 [1531, 1790, 3718],
 [1531, 3, 150],
 [1531, 578, 790],
 [1531, 3, 676],
 [1531, 578, 676],
 [1531, 923, 676],
 [1531, 1559, 1407],
 [1531, 635, 1407],
 [1531, 2441, 1988],
 [1531, 578, 1096],
 [1531, 2686, 1096],
 [1531, 277, 1096],
 [1531, 874, 1096],
 [1531, 1927, 1358],
 [1531, 1536, 1109],
 [1531, 1209, 1109],
 [1531, 1536, 219],
 [1531, 1790, 219],
 [1531, 1104, 1260],
 [1531, 1255, 1260],
 [1531, 1260],
 [1531, 1265, 1260],
 [1531, 1790, 1260],
 [1531, 2387, 1260],
 [1531, 1368, 1260],
 [1531, 1369, 1260],
 [1531, 437, 1260],
 [1531, 2525, 1260],
 [1531, 3, 3967],
 [1945, 1790, 3718],
 [1945, 1790, 219],
 [1945, 1265, 1260],
 [1945, 1531, 1260],
 [1945, 1790, 1260],
 [539, 1332, 1666],
 [539, 1043, 1407],
 [539, 1531, 1260],
 [667, 1927, 1358],
 [926, 330, 1666],
 [926, 1928, 1988],
 [926, 932, 1096],
 [926, 1927, 1358],
 [926, 1629, 868],
 [673, 1208, 3718],
 [673, 1682, 1096],
 [673, 1096],
 [673, 1236, 1096],
 [802, 3, 150],
 [802, 3, 676],
 [802, 1043, 1407],
 [802, 3, 3967],
 [1315, 1790, 3718],
 [1315, 2142, 150],
 [1315, 1184, 150],
 [1315, 1722, 790],
 [1315, 695, 2330],
 [1315, 24, 676],
 [1315, 923, 676],
 [1315, 522, 1407],
 [1315, 1043, 1407],
 [1315, 1859, 1407],
 [1315, 634, 1096],
 [1315, 240, 1096],
 [1315, 1809, 1096],
 [1315, 1859, 1096],
 [1315, 1209, 1109],
 [1315, 1859, 1109],
 [1315, 1790, 219],
 [1315, 1699, 4188],
 [1315, 1265, 1260],
 [1315, 1790, 1260],
 [1315, 1859, 3955],
 [1315, 24, 118],
 [1315, 1859, 767],
 [549, 1054, 790],
 [549, 1043, 1407],
 [549, 1927, 1358],
 [1320, 1927, 1358],
 [809, 578, 790],
 [809, 695, 2330],
 [809, 578, 676],
 [809, 277, 1096],
 [809, 932, 1096],
 [809, 1712, 1096],
 [809, 578, 1096],
 [809, 1775, 1096],
 [809, 1927, 1358],
 [809, 1712, 3023],
 [809, 1439, 1109],
 [809, 1712, 1109],
 [809, 1209, 1109],
 [809, 1252, 1109],
 [809, 1157, 2276],
 [433, 1248, 150],
 [433, 1344, 790],
 [433, 79, 2746],
 [433, 18, 1096],
 [433, 1133, 1096],
 [433, 240, 1096],
 [433, 1927, 1358],
 [433, 1209, 1109],
 [433, 1248, 1109],
 [433, 1344, 767],
 [433, 1741, 767],
 [433, 2558, 767],
 [433, 767],
 [1969, 644, 1096],
 [1969, 1236, 1096],
 [1969, 1368, 1260],
 [1969, 1265, 1260],
 [1332, 1666],
 [1332, 1207, 1666],
 [1332, 1207, 150],
 [1332, 1809, 1096],
 [1332, 277, 1096],
 [1332, 634, 1096],
 [1332, 1236, 1096],
 [1332, 240, 1096],
 [1332, 1927, 1358],
 [1332, 437, 1260],
 [1332, 596, 767],
 [1085, 1790, 3718],
 [1085, 522, 1407],
 [1085, 1790, 219],
 [1085, 1790, 1260],
 [1341, 3223, 1666],
 [1341, 1207, 1666],
 [1341, 1207, 150],
 [1341, 1809, 1096],
 [1341, 673, 1096],
 [1341, 1712, 1096],
 [1341, 634, 1096],
 [1341, 1712, 3023],
 [1341, 1740, 3023],
 [1341, 1712, 1109],
 [1604, 644, 1096],
 [1604, 1927, 1358],
 [1604, 433, 767],
 [1604, 2558, 767],
 [1859, 330, 1666],
 [1859, 2857, 1666],
 [1859, 1208, 3718],
 [1859, 1790, 3718],
 [1859, 3, 150],
 [1859, 1184, 150],
 [1859, 1248, 150],
 [1859, 1773, 150],
 [1859, 695, 2330],
 [1859, 3, 676],
 [1859, 1407],
 [1859, 3096, 180],
 [1859, 1096],
 [1859, 1236, 1096],
 [1859, 240, 1096],
 [1859, 644, 1096],
 [1859, 1682, 1096],
 [1859, 932, 1096],
 [1859, 1407, 1358],
 [1859, 1669, 1358],
 [1859, 1109],
 [1859, 1209, 1109],
 [1859, 1248, 1109],
 [1859, 1790, 219],
 [1859, 1629, 868],
 [1859, 1852, 868],
 [1859, 1368, 1260],
 [1859, 437, 1260],
 [1859, 1790, 1260],
 [1859, 3955],
 [1859, 3, 3967],
 [1859, 2710, 3967],
 [1859, 767],
 [1859, 2926, 767],
 [330, 1666],
 [330, 3223, 1666],
 [330, 1859, 1407],
 [330, 1809, 1096],
 [330, 1859, 1096],
 [330, 634, 1096],
 [330, 1712, 1096],
 [330, 1712, 3023],
 [330, 1859, 1109],
 [330, 1712, 1109],
 [330, 1350, 1260],
 [330, 1859, 3955],
 [330, 1859, 767],
 [1739, 1207, 1666],
 [1739, 330, 1666],
 [1739, 1790, 3718],
 [1739, 3, 150],
 [1739, 1207, 150],
 [1739, 1248, 150],
 [1739, 3, 676],
 [1739, 522, 1407],
 [1739, 1859, 1407],
 [1739, 79, 2746],
 [1739, 634, 1096],
 [1739, 1236, 1096],
 [1739, 1775, 1096],
 [1739, 240, 1096],
 [1739, 1809, 1096],
 [1739, 1859, 1096],
 [1739, 1874, 1096],
 [1739, 932, 1096],
 [1739, 509, 1096],
 [1739, 1669, 1358],
 [1739, 1248, 1109],
 [1739, 1859, 1109],
 [1739, 1790, 219],
 [1739, 1790, 1260],
 [1739, 1368, 1260],
 [1739, 1369, 1260],
 [1739, 1859, 3955],
 [1739, 3, 3967],
 [1739, 1859, 767],
 [589, 1859, 1407],
 [589, 1859, 1096],
 [589, 1712, 1096],
 [589, 1712, 3023],
 [589, 1859, 1109],
 [589, 1712, 1109],
 [589, 1859, 3955],
 [589, 1859, 767],
 [1107, 3, 150],
 [1107, 3, 676],
 [1107, 1043, 1407],
 [1107, 1809, 1096],
 [1107, 634, 1096],
 [1107, 1252, 1109],
 [1107, 3, 3967],
 [1236, 1332, 1666],
 [1236, 1207, 1666],
 [1236, 1207, 150],
 [1236, 391, 676],
 [1236, 1859, 1407],
 [1236, 1809, 1096],
 [1236, 2084, 1096],
 [1236, 1859, 1096],
 [1236, 1096],
 [1236, 1874, 1096],
 [1236, 634, 1096],
 [1236, 644, 1096],
 [1236, 673, 1096],
 [1236, 1712, 1096],
 [1236, 1927, 1358],
 [1236, 1712, 3023],
 [1236, 1859, 1109],
 [1236, 1712, 1109],
 [1236, 1252, 1109],
 [1236, 1368, 1260],
 [1236, 1859, 3955],
 [1236, 1859, 767],
 [982, 1712, 1096],
 [982, 1669, 1358],
 [982, 1927, 1358],
 [982, 1726, 1358],
 [982, 1712, 3023],
 [982, 1712, 1109],
 [982, 1629, 868],
 [982, 1531, 1260],
 [474, 1332, 1666],
 [474, 1054, 790],
 [474, 24, 676],
 [474, 522, 1407],
 [474, 1043, 1407],
 [474, 1809, 1096],
 [474, 277, 1096],
 [474, 634, 1096],
 [474, 1712, 1096],
 [474, 1927, 1358],
 [474, 1712, 3023],
 [474, 1712, 1109],
 [474, 1209, 1109],
 [474, 24, 118],
 [3547, 634, 1096],
 [3548, 1669, 1358],
 [2529, 236, 150],
 [2529, 1236, 1096],
 [2529, 1252, 1109],
 [996, 1054, 790],
 [996, 923, 676],
 [996, 79, 2746],
 [996, 1927, 1358],
 [996, 1851, 2276],
 [996, 1724, 2276],
 [996, 1531, 1260],
 [1638, 330, 1666],
 [1638, 3223, 1666],
 [1638, 1054, 790],
 [1638, 578, 790],
 [1638, 695, 2330],
 [1638, 578, 676],
 [1638, 1859, 1407],
 [1638, 1878, 180],
 [1638, 578, 1096],
 [1638, 1859, 1096],
 [1638, 1874, 1096],
 [1638, 1878, 1096],
 [1638, 644, 1096],
 [1638, 673, 1096],
 [1638, 1236, 1096],
 [1638, 1775, 1096],
 [1638, 1859, 1109],
 [1638, 1368, 1260],
 [1638, 1265, 1260],
 [1638, 1859, 3955],
 [1638, 1859, 767],
 [1897, 644, 1096],
 [1897, 1236, 1096],
 [1897, 1629, 868],
 [622, 1820, 676],
 [622, 1859, 1407],
 [622, 1878, 180],
 [622, 1859, 1096],
 [622, 1874, 1096],
 [622, 1878, 1096],
 [622, 2686, 1096],
 [622, 915, 1096],
 [622, 932, 1096],
 [622, 1712, 1096],
 [622, 1236, 1096],
 [622, 1669, 1358],
 [622, 1712, 3023],
 [622, 1859, 1109],
 [622, 1712, 1109],
 [622, 1859, 3955],
 [622, 1859, 767],
 [623, 1184, 150],
 [1134, 1790, 3718],
 [1134, 522, 1407],
 [1134, 1712, 1096],
 [1134, 634, 1096],
 [1134, 1712, 3023],
 [1134, 1712, 1109],
 [1134, 1209, 1109],
 [1134, 1790, 219],
 [1134, 1629, 868],
 [1134, 1790, 1260],
 [244, 1790, 3718],
 [244, 3, 150],
 [244, 1248, 150],
 [244, 1973, 2208],
 [244, 3, 676],
 [244, 24, 676],
 [244, 633, 676],
 [244, 923, 676],
 [244, 1043, 1407],
 [244, 1859, 1407],
 [244, 93, 180],
 [244, 74, 1096],
 [244, 1682, 1096],
 [244, 1775, 1096],
 [244, 240, 1096],
 [244, 277, 1096],
 [244, 1859, 1096],
 [244, 1874, 1096],
 [244, 509, 1096],
 [244, 1669, 1358],
 [244, 1726, 1358],
 [244, 1209, 1109],
 [244, 1248, 1109],
 [244, 1252, 1109],
 [244, 1859, 1109],
 [244, 1790, 219],
 [244, 1699, 4188],
 [244, 1265, 1260],
 [244, 1790, 1260],
 [244, 1368, 1260],
 [244, 1369, 1260],
 [244, 1531, 1260],
 [244, 1859, 3955],
 [244, 24, 118],
 [244, 3, 3967],
 [244, 4555, 3967],
 [244, 1859, 767],
 [1141, 1927, 1358],
 [1780, 330, 1666],
 [1780, 1859, 1407],
 [1780, 1878, 180],
 [1780, 644, 1096],
 [1780, 1859, 1096],
 [1780, 1236, 1096],
 [1780, 1878, 1096],
 [1780, 1859, 1109],
 [1780, 1859, 3955],
 [1780, 1859, 767],
 [1780, 2926, 767],
 [759, 1928, 1988],
 [759, 1809, 1096],
 [759, 1712, 1096],
 [759, 634, 1096],
 [759, 1712, 3023],
 [759, 1712, 1109],
 [759, 437, 1260],
 [759, 1265, 1260],
 [1148, 1208, 3718],
 [1148, 1184, 150],
 [1148, 79, 2746],
 [1148, 1809, 1096],
 [1148, 673, 1096],
 [1148, 1712, 1096],
 [1148, 1927, 1358],
 [1148, 1712, 3023],
 [1148, 1593, 1109],
 [1148, 1712, 1109],
 [1148, 1209, 1109],
 [1148, 1629, 868],
 [1148, 1531, 1260],
 [1148, 104, 2290],
 [2558, 1344, 790],
 [2558, 1414, 790],
 [2558, 79, 2746],
 [2558, 18, 1096],
 [2558, 1133, 1096],
 [2558, 240, 1096],
 [2558, 1344, 767],
 [2558, 433, 767],
 [2558, 767]]
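
Each element of the result is a list of vertex indices: the first is a source, the last is a target, and paths have at most two edges. To make the idea concrete, here is a minimal pure-Python sketch of such a bounded path search on a toy directed graph. It is not ``pypath``'s implementation; the function ``find_paths`` and the adjacency dictionary are hypothetical.

```python
def find_paths(adj, sources, targets, maxlen=2):
    """Collect all simple paths of at most `maxlen` edges from any
    source to any target. `adj` maps a node to its successors."""
    targets = set(targets)
    paths = []

    def extend(path):
        last = path[-1]
        # Record the path once it ends in a target (a lone node is no path).
        if last in targets and len(path) > 1:
            paths.append(list(path))
        # Stop growing once the edge limit is reached.
        if len(path) - 1 >= maxlen:
            return
        for nxt in adj.get(last, ()):
            if nxt not in path:  # never revisit a node within one path
                extend(path + [nxt])

    for source in sources:
        extend([source])
    return paths


# Toy directed graph, illustrative only.
adj = {0: [1, 2, 4], 1: [3], 2: [3, 4], 3: [4]}
toy_paths = find_paths(adj, sources=[0], targets=[3, 4])
print(toy_paths)
```

Note that the search continues through a target node, so a path may pass through one target on its way to another, as some of the paths in the output above do.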
In [70]:
import itertools
atg_glom_nodes = set(itertools.chain(*atg_glom_paths))
In [71]:
atg_glom_net = pa.graph.induced_subgraph(atg_glom_nodes)
In [73]:
 
atg_glom_net.vcount()
Out[73]:
759
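
The flattening step can be tried on a few paths in isolation. The sketch below takes three of the paths from the output above and separates the connecting nodes from the endpoints. The variable names are illustrative, and ``itertools.chain.from_iterable`` is used as an equivalent of ``itertools.chain(*paths)`` that avoids unpacking a long list into arguments.

```python
import itertools

# Three paths copied from the output above: [source, ..., target].
paths = [
    [897, 437, 1260],
    [644, 1096],
    [644, 391, 676],
]

# All distinct vertex indices that occur on any path.
path_nodes = set(itertools.chain.from_iterable(paths))

# Endpoints are the first and last element of each path;
# whatever remains was added only to connect them.
endpoints = {p[0] for p in paths} | {p[-1] for p in paths}
connectors = path_nodes - endpoints

print(len(path_nodes), 'nodes, of which', len(connectors), 'are connectors')
```
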
In [ ]:
 
igraph.plot(atg_glom_net)
 
Similarly, we can add connecting nodes within a certain group of proteins:

In [77]:
glom_conn_paths = pa.find_all_paths(list(glom_spec), list(glom_spec), mode = 'ALL')
        Looking up all paths up to length 2 -- finished: 100%|██████████| 961/961 [00:00<00:00, 2.37Kit/s]
In [78]:
glom_conn_nodes = set(itertools.chain(*glom_conn_paths))
In [79]:
glom_conn_net = pa.graph.induced_subgraph(glom_conn_nodes)
In [80]:
 
glom_conn_net.vcount()
Out[80]:
47
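
A natural follow-up question is whether the added nodes actually link the group together. One way to check, sketched here in pure Python on a toy example, is to count connected components before and after adding the connectors. The helper ``connected_components`` and the data are hypothetical and independent of ``pypath``.

```python
from collections import deque


def connected_components(nodes, edges):
    """Return the connected components (as sets) of the undirected
    graph induced by `nodes` from the edge list `edges`."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in adj[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Toy data: node 9 connects 1 and 2; nodes 3 and 4 interact directly.
edges = [(1, 9), (9, 2), (3, 4)]
group = {1, 2, 3, 4}
before = connected_components(group, edges)
after = connected_components(group | {9}, edges)
print(len(before), 'components before,', len(after), 'after')
```
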