Tutorial

Contents
Exercise 1 : Setting up the connection
Exercise 2 : Testing the connection
Exercise 3 : Creating a pathway
Exercise 3.1 : Adding Datanodes
Exercise 3.2 : Adding Lines
Exercise 3.3 : Annotating elements
Exercise 3.4 : Removing elements
Exercise 4 : Visualizing data
Exercise 4.1 : Importing data
Exercise 4.2 : Creating a visualization
Exercise 4.3 : Applying a visualization
Exercise 5 : Exporting Pathways
Exercise 6 : Pathway statistics
Exercise 7 : Query Pathway

Exercise 1 : Setting up the connection

How you set up the connection to PathVisioRPC depends on the programming language you are using and on the module or package that provides the XML-RPC functions. A few examples are given below.

R code:
library("XMLRPC")
server = "http://localhost:7777"

The XMLRPC package is not part of the standard R distribution; it needs to be downloaded and installed separately.

Python code:

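>>> import xmlrpclib  # Python 2 module name
>>> # Create a proxy object; a plain URL string has no PathVisio methods to call
>>> server = xmlrpclib.ServerProxy("http://localhost:7777")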

The xmlrpclib module is part of the standard Python 2 distribution, so no additional packages have to be downloaded. In Python 3 the same module is named xmlrpc.client.
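For readers on Python 3, the connection step might look like the sketch below. The helper name connect is made up for this illustration; the URL is the one used in the examples above. Creating the proxy does not contact the server yet, so this runs even when PathVisioRPC is not started.

```python
import xmlrpc.client


def connect(url="http://localhost:7777"):
    """Return an XML-RPC proxy for the given URL (no request is sent yet)."""
    # 'connect' is a hypothetical helper name, not part of PathVisioRPC.
    return xmlrpc.client.ServerProxy(url)


server = connect()
```

Remote functions are then called as attributes of the proxy, for example server.PathVisio.test().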

Perl code:
Not tried out yet!



Exercise 2 : Testing the connection

To check whether the connection was set up successfully, call the test function. It takes no arguments and returns ‘it works!’.

R code:
xml.rpc(server,"PathVisio.test")
>it works!

Python code:

>>> server.PathVisio.test()
'it works!'
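In a script it can be handy to turn this check into a yes/no answer instead of an exception. The following is a minimal Python 3 sketch; connection_ok and StubServer are hypothetical names used only here, and the stub stands in for a running PathVisioRPC so the example can be run offline.

```python
import xmlrpc.client


def connection_ok(server):
    """Return True if PathVisio.test() answers with the expected string."""
    try:
        return server.PathVisio.test() == "it works!"
    except (OSError, xmlrpc.client.Error):
        # OSError covers a refused connection; xmlrpc.client.Error covers
        # XML-RPC faults and protocol errors.
        return False


class _StubPathVisio:
    """Hypothetical stand-in for the remote PathVisio namespace."""

    def test(self):
        return "it works!"


class StubServer:
    PathVisio = _StubPathVisio()


print(connection_ok(StubServer()))
```

With a real proxy object the call is the same: connection_ok(server).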

Perl code:
Not tried out yet!



Exercise 3 : Creating a pathway

To create a pathway, use the createPathway function. It takes string arguments giving the name of the pathway, the author of the pathway, the name of the organism in which the pathway is found and the folder where the pathway should be saved.
All input arguments are optional. If you pass only empty strings, a pathway named “.gpml” is created in your home directory. It is advisable to provide at least a pathway name, so that you can refer to the pathway later or edit it.

R code:

Python code:

>>> server.PathVisio.createPathway("TestPathway","Anwesha Dutta", "Homo Sapiens", "/home/anwesha/PathVisioRPC/final-version")
'TestPathway pathway GPML file created in /home/anwesha/PathVisioRPC/final-version'
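Because an empty name yields a file called “.gpml”, a thin wrapper that refuses empty names can save some confusion. This is only a sketch: create_pathway and the echoing stub are hypothetical, and the argument order is copied from the call shown above.

```python
from types import SimpleNamespace


def create_pathway(server, name, author="", organism="", folder=""):
    """Call PathVisio.createPathway, refusing an empty pathway name."""
    if not name.strip():
        raise ValueError("a pathway name is needed to refer to the pathway later")
    # Argument order follows the example call: name, author, organism, folder.
    return server.PathVisio.createPathway(name, author, organism, folder)


# Hypothetical stand-in that echoes its arguments, so the sketch runs offline.
stub = SimpleNamespace(PathVisio=SimpleNamespace(createPathway=lambda *args: args))
print(create_pathway(stub, "TestPathway", "Anwesha Dutta", "Homo Sapiens"))
```

With a real connection, pass the server proxy instead of the stub.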

Perl code:

Exercise 3.1 : Adding Datanodes

Once a pathway has been created, you can add DataNodes to it. DataNode is the term used in PathVisio for a gene, metabolite, protein, etc.: essentially any biological entity that forms a node in a pathway and has some biological meaning associated with it.
DataNodes are added with the addDataNode function. It takes two strings as input, one specifying the name of the gene/protein … and the other specifying the biological type of the DataNode, i.e. whether it is a Gene, Protein or Metabolite. It returns a DataNode ID, a unique identifier with which you can refer to the added DataNode later. Optionally the pathway name or ID can be supplied; otherwise the DataNodes are added to the currently open pathway.
DataNodes can also be annotated directly at this point, by providing an identifier and datasource for each gene/protein …. Annotation can also be done later using the annotateElement function.

R code:

Python code:

>>> server.PathVisio.addDataNode("/home/anwesha/PathVisioRPC/final-version/TestPathway.gpml","A","GeneProduct","6262","Entrez Gene","/home/anwesha/PathVisioRPC/final-version")
'A; ID : id93926132 added to TestPathway.gpml'
>>> server.PathVisio.addDataNode("/home/anwesha/PathVisioRPC/final-version/TestPathway.gpml","B","GeneProduct","6265","Entrez Gene","/home/anwesha/PathVisioRPC/final-version")
'B; ID : id93926133 added to TestPathway.gpml'
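The ID is embedded in the returned message, so to reuse it later it has to be pulled out of the string. The sketch below assumes the reply always contains "ID : " followed by the identifier, as in the outputs shown above; parse_element_id is a hypothetical helper, not part of the API.

```python
import re


def parse_element_id(reply):
    """Extract the element ID from a reply such as 'A; ID : id93926132 added to ...'."""
    # Assumption: the ID is the first run of non-space characters after "ID :".
    match = re.search(r"ID\s*:\s*(\S+)", reply)
    return match.group(1) if match else None


print(parse_element_id("A; ID : id93926132 added to TestPathway.gpml"))
```

The helper returns None when the reply does not contain an ID, for example for an error message.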

Perl code:

Exercise 3.2 : Adding Lines

Line is the term used in PathVisio for a reaction or an interaction, e.g. a metabolic reaction or a protein-protein interaction. Once some DataNodes have been added to a pathway, you can add lines describing the relationships between them. To add lines to a pathway, use the addLine function. It takes strings as input specifying the DataNode IDs of the two nodes to be connected, the startlinetype and the endlinetype, and returns a unique Line ID.
The start/endlinetypes can be any one of the following: Line, TBar, Arrow and the many MIM glyphs, a comprehensive list of which can be obtained from the API using the function getLineType. By default both the startlinetype and the endlinetype of a line are set to Line.

Optionally the pathway name or ID can be supplied; otherwise the Lines are added to the currently open pathway.
Lines can also be annotated directly at this point, by providing an identifier and datasource for each reaction/interaction …. Annotation can also be done later using the annotateElement function.

R code:

Python code:

>>> server.PathVisio.addLine("/home/anwesha/PathVisioRPC/final-version/TestPathway.gpml","A->B","A","B","","mim-conversion","REACT_1876","Reactome","/home/anwesha/PathVisioRPC/final-version")
'A->B; ID : idf066edaa added to TestPathway'

Perl code:

Exercise 3.3 : Annotating elements

The elements (DataNodes and Lines) added to the pathway can be annotated with an identifier from a database using the annotateElement function. It takes a number of string inputs, namely the name of the element to be annotated, the identifier and the datasource to be used. Optionally the pathway name or ID can be supplied; otherwise the application looks for elements in the currently open pathway.
R code:

Python code:

>>> server.PathVisio.annotateElement("/home/anwesha/PathVisioRPC/final-version/TestPathway.gpml", "A", "6345","EntrezGene","/home/anwesha/PathVisioRPC/final-version")
'A with ID : id93926132 in TestPathway has been annotated'

Perl code:

Exercise 3.4 : Removing elements

DataNodes and Lines that are no longer needed can be removed from a pathway using the removeElement function. It takes a string specifying the DataNode or Line identifier. Alternatively the name of the element can be used, but that risks unwanted removal of all instances of the element. Optionally the pathway name or ID can be supplied; otherwise the elements are removed from the currently open pathway.
R code:

Python code:

>>> server.PathVisio.removeElement("/home/anwesha/PathVisioRPC/final-version/TestPathway.gpml","A","/home/anwesha/PathVisioRPC/final-version")
'A in TestPathway removed'

Perl code:



Exercise 4 : Visualizing data

PathVisio provides functionality to visualize various types of data on the pathways of interest. Visual representation of numerical data can often ease understanding.

Exercise 4.1 : Importing data

The data to be visualized on the pathway first needs to be imported, which is done with the importData function. The first argument is the data file to be imported; either the name of the file or an object containing the data is accepted. The second argument is the database file to be used for mapping the data. The species also needs to be specified. The function creates a number of files in the working directory: the .pgex file and a log file that reports the details of the import and lists any errors encountered.

R code:

Python code:

>>> server.PathVisio.importData("/home/anwesha/PathVisioRPC/INPUT/testdoc.txt", "/home/anwesha/PathVisioRPC/PathVisio-Data/gene databases/Hs_Derby_20110601.bridge","/home/anwesha/PathVisioRPC/final-version")
'testdoc.txt imported & testdoc.txt.pgex created!'

Perl code:

Exercise 4.2 : Creating a visualization

Data can be visualized on pathways using colours. A gradient colouring scheme can be used to visualize a range of values on a gene (e.g. Fold Change), while a rule can be applied for certain criteria so that only the genes which qualify are coloured (e.g. P Value < 0.05).
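To make the idea of a gradient concrete, the sketch below maps a value to a colour using colour names and stop values written as comma-separated strings, in the style of the arguments "blue,white,red" and "-7,0,4" in the Python call in this section. It assumes plain linear interpolation between neighbouring stops and clamping outside the range; that is an illustration of the concept, not necessarily how PathVisio computes colours. The names COLOURS, gradient_colour and passes_rule are hypothetical.

```python
# RGB values for the colour names used in this illustration only.
COLOURS = {"blue": (0, 0, 255), "white": (255, 255, 255), "red": (255, 0, 0)}


def gradient_colour(value, colours="blue,white,red", stops="-7,0,4"):
    """Return the RGB colour for a value on a multi-stop gradient."""
    names = colours.split(",")
    points = [float(s) for s in stops.split(",")]
    # Assumption: values outside the range take the colour of the nearest end.
    value = max(points[0], min(points[-1], value))
    for lo, hi, c_lo, c_hi in zip(points, points[1:], names, names[1:]):
        if value <= hi:
            t = (value - lo) / (hi - lo)  # position between the two stops, 0..1
            start, end = COLOURS[c_lo], COLOURS[c_hi]
            return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def passes_rule(p_value, threshold=0.05):
    """A rule such as [P.Value]<0.05 selects only the genes that qualify."""
    return p_value < threshold


print(gradient_colour(0))   # the middle stop maps to white
print(passes_rule(0.01))
```

A fold change of 0 comes out white, strongly negative values approach blue and strongly positive values approach red; genes that pass the rule would get the rule colour (green in the call shown in this section).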
R code:

Python code:

>>> server.PathVisio.createVisualization( "/home/anwesha/PathVisioRPC/final-version/testdoc.txt.pgex","Fold Change", "blue,white,red", "-7,0,4", "P.Value", "green", "[P.Value]<0.05")
'/home/anwesha/PathVisioRPC/final-version/testdoc.txt.pgex.xml-visualization file created!'

Perl code:

Exercise 4.3 : Applying a visualization

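The rules and gradients of a visualization are applied to a pathway with the visualizeData function. In the example below it is given the pathway file, the .pgex file, the gene database file and the output folder.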

Python code:
>>> server.PathVisio.visualizeData("/home/anwesha/PathVisioRPC/pathways/WP15_48242.gpml", "/home/anwesha/PathVisioRPC/final-version/testdoc.txt.pgex","/home/anwesha/PathVisioRPC/testnewfunc/Hs_Derby_20110601.bridge","/home/anwesha/PathVisioRPC/final-version")
'Data Visualized on Selenium Pathway, results in /home/anwesha/PathVisioRPC/final-version/Selenium Pathway'



Exercise 5 : Exporting Pathways

The pathways created can be exported in a number of formats, namely png, svg, html, gpml and cytoscape network, using the exportPathway function. Optionally the pathway name or ID can be supplied; otherwise the currently open pathway is exported. Pathways can also be exported with data visualized on them in the different image formats; in that case the pgex file to be used needs to be provided as an argument.
R code:

Python code:

Perl code:



Exercise 6 : Pathway statistics

Over-representation analysis can be performed on a list of pathways to determine which pathways contain the most changed expression, taking into account the number of genes/proteins on the pathway that were measured in the experiment and the number of those that are differentially expressed. For the statistical analysis the user provides a criterion that determines which genes/proteins are of interest, for example a p value threshold and, if desired, a minimum change between experimental groups. A Z score is calculated for each pathway and the pathways are sorted accordingly. The function calculatePathwayStatistics takes the address of the folder containing the pathways to be analysed and a criterion (e.g. P Value < 0.05) on the basis of which the Z score should be calculated.

Z score is calculated by dividing the number of genes/proteins/metabolites matching the above mentioned criteria by the total number of genes/proteins/metabolites present in the dataset and also found in the pathways.
The analysis writes a folder named contents and a hyperlinked HTML page named index to the working directory. The HTML page is divided into 3 sections. The top left panel shows the number of genes/proteins found in the pathways, the number of genes/proteins meeting the criterion set by the user and the criterion used for calculating the Z score. Next, the address of the dataset, the pathway directory and the gene database used for mapping are shown. Finally there is a table with a clickable list of the pathways analysed. Its second column gives the number of genes/proteins in the statistical analysis table that were also found on that particular pathway, the third column gives how many of those meet the criterion for the Z score calculation, and the last column gives the resulting Z score for that pathway.

R code:

Python code:

Perl code:



Exercise 7 : Query Pathway

A pathway can be queried for a list of the elements it contains, i.e. the genes, proteins and metabolites present in it, using the getDataNodes function. It takes the name or ID of the pathway to be queried as an argument; if no pathway name or ID is provided, a list of elements from the currently open pathway is returned.
R code:
Python code:
Perl code:
