Pathway Logic Guided Tour: STM7

STM7: A Pathway Logic Model of Intracellular Signal Transduction

Introduction

STM7 is a formal knowledge base containing information about the changes that occur in the proteins inside a cell in response to exposure to receptor ligands, chemicals, or various stresses. Ultimately, we intend to link these responses to changes in gene expression and cellular morphological phenotypes such as induction of proliferation and differentiation. So far we have been able to produce hypothetical pathways leading to the activation of a few transcription factors and epigenetic regulators. The Pathway Logic Assistant (PLA) provides a graphical representation of the STM7 reaction network and the ability to browse and analyze this network.

The STM7 model is a graph of the changes that occur when peptides, chemicals, or stresses are added to cultured cells. For simplicity the graph is divided into 32 maps, each representing one stimulus. These maps are called "dishes" because they describe an initial state, which corresponds to the state of the cells in a culture dish at the start of an experiment, plus a stimulus. The base initial state in each of these dishes is the same: the cells are untreated and quiescent. Each dish is obtained by adding to the base state one ligand or stimulus and its associated receptor(s) if applicable. The currently available stimuli are: Adriamycin, Anisomycin, Bleomycin, anti-Cd40, Csf1, Egf, Etoposide, Hgf, Hydroxyurea, Ifna or Ifnb family members, Ifng, Igf1, IL1, IL2, IL4, IL6, IL12, Insulin, ionizing radiation, Lipopolysaccharide (Lps), Neocarzinostatin (NCS), Ngf, Pdgf, Phorbol 12-myristate 13-acetate (PMA), PolyIC, Serum, Sorbitol, Tgfb1, various TLR9 ligands, Tnf, UV radiation, and Vegf.

In addition we have combined all the maps into one map with all the stimuli (allDish) and one with no stimuli (NoStimDish). The 'no stim' dish has all of the receptors for the ligands, but as the name suggests, no ligands. Effects caused by more than one ligand can be viewed by creating a new dish using the edit command in the "Select Dish" menu. The 'no stim' dish is a convenient starting point (in the dish editor, ask Maude). An example of dish construction is provided in the SmallKB guide. The network displayed by PLA for a given dish is the network of reactions that have been shown to occur in response to the stimulus.

The experimental evidence for each rule is supplied in datum form. Each datum represents a result from an experiment published in a refereed journal. In some cases, the evidence is weak, incomplete, or simply based on correlations. Some attempt has been made to point out the weak spots, but the bulk of the effort has gone into collecting and presenting the data in such a way that biologists can decide for themselves. Datums are written in a shorthand designed to be both readable by biologists and parsable by a computer. A manual containing the abbreviations and conventions used by the shorthand can be found in the curation notebook. Information about unfamiliar protein names can be found by clicking on the protein representation in the PLA viewer.

The Dishes

Cell biologists spend a lot of time throwing things (stimuli) at cells in culture and studying the results. The size of each dish (its number of rules and datums) depends on the available literature and the inclination of the curator to find it.

• Peptide ligands with specific receptors

In some cases, the stimulus is a peptide ligand that binds specifically to one or more receptors. The results depend on which receptors are present on the cell and in a state to bind the ligand. The dishes that use ligands as stimuli are:

Dish	Ligand(s)	Receptor(s)	# of Rules	# of Datums
Csf1Dish	Csf1	Csf1R	29	106
EgfDish	Egf	EgfR	129	2281
HgfDish	Hgf	HgfR	48	329
IfnabDish	Ifna family, Ifnb1	IfnaR1, IfnaR2	14	161
IfngDish	Ifng	IfngR1, IfngR2	13	194
Igf1Dish	Igf1	Igf1R	16	133
IL1Dish	IL1a, IL1b	IL1R1, IL1Rap	67	1288
IL2Dish	IL2	IL2Ra, IL2Rb, IL2Rg	49	772
IL4Dish	IL4	IL4R, IL2Rg	30	281
IL6Dish	IL6	IL6R, Gp130	23	282
IL12Dish	IL12	IL12Rb1, IL12Rb2	15	313
InsDish	Ins	InsR	39	732
NgfDish	Ngf	NgfR, Trka	46	301
PdgfDish	Pdgfa, Pdgfb	PdgfR	25	203
Tgfb1Dish	Tgfb1	TgfbR1, TgfbR2	57	968
TnfDish	Tnf	TnfR1	87	1915
VegfDish	Vegfa	VegfR2	11	55
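The idea that a dish is a named initial state, and that a new dish for several ligands can be assembled from existing ones, can be sketched in a few lines of Python. This is only an illustration: the `Dish` class and the `make_dish` and `combine` helpers are hypothetical names, not part of PLA, and the sketch tracks only ligands and receptors rather than the full base state. The three dishes are copied from rows of the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dish:
    """A named initial state: the ligands added and the receptors they bind."""
    name: str
    ligands: frozenset
    receptors: frozenset


def make_dish(name, ligands, receptors):
    """Build a Dish from plain lists (hypothetical helper)."""
    return Dish(name, frozenset(ligands), frozenset(receptors))


# Three rows copied from the table above.
EGF = make_dish("EgfDish", ["Egf"], ["EgfR"])
IL6 = make_dish("IL6Dish", ["IL6"], ["IL6R", "Gp130"])
IL4 = make_dish("IL4Dish", ["IL4"], ["IL4R", "IL2Rg"])


def combine(name, *dishes):
    """Union the ligands and receptors of several dishes into a new dish."""
    ligands = frozenset().union(*(d.ligands for d in dishes))
    receptors = frozenset().union(*(d.receptors for d in dishes))
    return Dish(name, ligands, receptors)


# A hypothetical two-ligand dish.
egf_il6 = combine("EgfIL6Dish", EGF, IL6)
print(sorted(egf_il6.ligands))
print(sorted(egf_il6.receptors))
```

In PLA itself the equivalent operation is performed in the dish editor; the sketch only shows why combining dishes amounts to taking the union of their contents.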

• Surface protein ligands with specific receptors

Cd40Lg is a protein that is expressed on the surface of activated T-cells that binds to Cd40 on the surface of antigen presenting B-cells. To avoid confounding factors that arise from using one cell to activate another, the response to Cd40Lg is usually studied by treating cells with an antibody to Cd40.

Dish	Stimulus	Receptor	# of Rules	# of Datums
Cd40Dish	anti-Cd40	Cd40	35	264

• Multiple stimuli with multiple receptors

When studying signal transduction it is usually not helpful to perform experiments in which unknown mixtures of ligands are added to cells. One exception is the addition of serum to cells that have been made quiescent by removal of serum. The addition of serum to cells in G0 phase causes them to re-enter the cell cycle, make new protein, replicate DNA, and divide. This provides researchers with information about the changes in proteins during the cell cycle regardless of the actual stimuli that caused them.

Dish	Stimulus	Receptor(s)	# of Rules	# of Datums
SerumDish	Serum	many	29	303

• Stimuli that turn on Pattern Recognition Receptors

The pattern recognition receptors (PRRs) are used by cells to recognize pathogen-associated molecular patterns such as bacterial carbohydrates, nucleic acids, peptides, and lipids. Certain compounds have been found that mimic bacterial products so that the effects of PRR binding can be studied.

Dish	Stimulus	Receptor(s)	# of Rules	# of Datums
LpsDish	Lps	TLR2, TLR4	98	1072
PolyICDish	PolyIC	TLR3	25	270
TLR9Dish	TLR9 Ligands	TLR9	19	110

• Chemical Stimuli

Researchers often use chemicals to stimulate intracellular events in the absence of receptor binding. Anisomycin and sorbitol cause phosphorylation and activation of the stress responsive Mapks of the Jnk and P38 families. PMA (phorbol 12-myristate 13-acetate) is very similar to diacylglycerol, the natural activator of classic PKC isoforms. Adriamycin is considered to be a mimetic for DNA damage, a genotoxic stress that activates Tp53. Bleomycin induces DNA strand breaks and is commonly used as a radiomimetic to study the acetylation of ATM and activation of Chek2. Etoposide induces breaks in double stranded DNA and prevents repair by topoisomerase II binding. Hydroxyurea has the capacity to generate DSBs only in replicating cells so it is useful in the study of cross talk between Atm and Atr. NCS (Neocarzinostatin) intercalates into DNA and induces double-strand breaks, which makes it another compound that is used to mimic ionizing radiation in signal transduction studies.

Dish	Stimulus	Target(s)/Effect(s)	# of Rules	# of Datums
AdriamycinDish	Adriamycin	Tp53	8	65
AnisoDish	Anisomycin	Jnks and P38s	27	191
BleomycinDish	Bleomycin	DNA breaks	8	39
EtoposideDish	Etoposide	DS-DNA strand breaks	15	159
HydroxyureaDish	Hydroxyurea	DS-DNA strand breaks	7	32
NCSDish	NCS	DS-DNA strand breaks	10	59
PMADish	PMA	Pkcs	33	309
SorbitolDish	Sorbitol	Jnks and P38s	11	66

• Other stresses

Two major stresses that cause DNA damage are ionizing radiation (IR) and ultraviolet light (UV). IR causes double strand DNA breaks and the activation of Atm and Prkdc whereas UV causes single strand DNA damage and the activation of Atr. UV also leads to the activation of Jnks and P38s and phosphorylation of Tp53. Thus, IR and UV are valuable tools for studying the activation of cell cycle checkpoints.

Dish	Stimulus	Target(s)/Effect(s)	# of Rules	# of Datums
UVDish	Ultraviolet Light	ssDNA, Jnks and P38s	45	490
IRadDish	Ionizing Radiation	dsDNA	29	910

The purpose of Pathway Logic models such as STM7 is to:

help biologists keep track of published experimental results
assist in hypothesis making
arrange (and rearrange) the curated data into conclusions (graphically)
locate holes in experimental evidence: things that have been accepted as truth for so long that they have become the truth without any evidence
integrate pathway components into a common framework to see connections between proteins in various "pathways" not discussed in the experimental or review literature
provide accurate maps suitable for computational analysis

This demo assumes that the user is already familiar with the use of PLA as described in the SmallKB guide and the PLA reference manual.

In this guided tour, double click on a figure to see an enlarged version. Single click to restore the figure to its original size.

Guided Tour: Revisiting the Egf Pathway using PLA and the STM7 KB.

Introduction

In this guided tour we focus on the response to Egf stimulation. This version of the Egf Network looks different from others because it is based on experimental data only. An Egf rule is derived from changes observed in response to Egf binding to the EgfR. If assumptions about protein interactions based on over-expression or test-tube reactions or similarity with other tyrosine kinase receptors are used, notes are included in the evidence files. Changes in the state of the components of the pathway are recorded as Pathway Logic rules, which are then assembled into Petri net format by PLA. The resulting graph is loaded into PLA for viewing and analysis. A link to the evidence supporting each rule is provided in the viewer. Once a Petri net is assembled, PLA can be used to find paths through the network to a chosen goal. The path we are going to discuss in this demo is the path between Egf binding to the EgfR and activation of Erk1 and Erk2 (abstracted as Erks as most of the existing evidence does not distinguish between these two members of the family). This path is traditionally considered to be:

Egf → EgfR → Grb2 → Sos1 → Ras → Raf → Meks → Erks

References:

Biocarta
The Cancer Cell Map at Memorial Sloan-Kettering Cancer Center.
Kegg
Reactome

Now let us see what it looks like in the STM7 model.

The STM7 Demo

If you haven't already started the STM7 online demo, you can do so by clicking here and following the instructions.

When PLA starts up, a knowledge base manager window (title: PLA KBManager) appears in the upper left corner of your screen. The KBManager displays a list of available knowledge bases (rule sets). "RKB" is selected by default and will be highlighted in blue.

Start by selecting a predefined dish: press the "Select Dish" button on the right side of the KBManager window. This produces a menu with two options: "Edit" and "PreDefined". Select (click on) "PreDefined". This produces a (sub)menu with a list of pre-defined dishes. Select "EgfDish" (by clicking on the corresponding menu item).

Shortly a PLA viewer window with tab titled graph2:EgfDish will appear on your screen displaying the Egf Network (the network of reactions reachable starting from the occurrences in EgfDish). [Note that the graph numbers in titles may differ if you have created graphs or nets in addition to those discussed in the demo. This is fine.]

Screen shot of the Egf dish graph

A network of reactions is displayed as a graph with two kinds of nodes. Ovals represent occurrences—proteins or chemicals in a specific state and location. For example the oval labeled Cdc42-GDP@CLi represents the protein Cdc42 bound to GDP and located on the inside (cytoplasmic side) of the cell membrane (@CLi).

We use the following abbreviations.

Locations

Loc	Description	Loc	Description
XOut	outside the cell, the medium or supernatant	CLm	in/across the cell membrane
CLi	attached to the inside of the cell membrane	CLc	in the cytoplasm
NUc	in the nucleus	Sig	a way to display comments or cellular phenotypes in a network

Receptor Complexes

Receptor	Description	Receptor	Description
Cd40C	forms in response to Cd40Lg binding to Cd40	Csf1RC	forms in response to Csf1 binding to Csf1R
EgfRC	forms in response to Egf binding to EgfR	GP130C	forms in response to IL6 binding to Gp130 and IL6R
HgfRC	forms in response to Hgf binding to HgfR	IfnaRC	forms in response to members of the Ifna family or Ifnb1 binding to IfnaR2 and IfnaR1
IfngRC	forms in response to Ifng binding to IfngR1, Jak1, IfngR2, and Jak2	Igf1RC	forms in response to Igf1 binding to Igf1R
IL1R1C	forms in response to IL1 binding to IL1R1 and IL1Rap	IL2RC	forms in response to IL2 binding to IL2Rb, IL2Ra, and IL2Rg
IL4RC	forms in response to IL4 binding to IL4R and IL2Rg	IL12Rb1C	A preassembled complex consisting of IL12Rb1 and Tyk2
IL12Rb2C	A preassembled complex consisting of IL12Rb2 and Jak2	IL12RC	A complex that forms in response to IL12 binding to IL12Rb1, Tyk2, IL12Rb2, and Jak2
IL22RC	forms in response to IL22 binding to IL22Ra1 and IL10Rb	InsRC	forms in response to Ins binding to InsR
NgfRC	forms in response to Ngf binding to NgfR	PdgfRC	forms in response to Pdgf binding to PdgfR
Tgfb1RC	forms in response to Tgfb1 binding to TgfbR1 and TgfbR2	TLR2C	forms in response to TLR2 ligands binding to TLR2
TLR3C	forms in response to TLR3 ligands binding to TLR3	TLR4C	forms in response to TLR4 ligands binding to TLR4
TLR9C	forms in response to TLR9 ligands binding to TLR9	TnfR1C	forms in response to Tnf binding to TnfR1
TrkaC	forms in response to Ngf binding to Trka	VegfR2C	forms in response to Vegfa binding to VegfR2

Non Receptor Complexes

Non Receptor	Description	Non Receptor	Description
RNAC	forms around the capped mRNA complex	CHR	forms around Chromatin

Modifications

Modification	Description	Modification	Description
acetyl!site	acetylated on a specific site	act	activated
degraded	degraded	dimer	dimerized
GDP	bound to GDP	GTP	bound to GTP
K48ubiq	covalently bound to ubiquitin polymerized via K48 linkages	K63ubiq	covalently bound to ubiquitin polymerized via K63 linkages
p50	a 50kD cleavage product	phos	phosphorylated
phos!site	phosphorylated at a specific site	sumo	sumoylated
ubiq	ubiquitinated	Yphos	phosphorylated on tyrosine
off	not transcribing mRNA	on	transcribing mRNA
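Occurrence labels such as Cdc42-GDP@CLi follow a regular pattern: a name, zero or more modifications joined by hyphens, and a location after the @ sign. A minimal Python sketch of how such a label might be taken apart is shown below. The function name `parse_occurrence` is hypothetical, and the sketch assumes that the name itself contains no hyphen and that exactly one @ separates the location; the actual PLA parser may work differently.

```python
# Location abbreviations copied from the Locations table above.
LOCATIONS = {
    "XOut": "outside the cell, the medium or supernatant",
    "CLm": "in/across the cell membrane",
    "CLi": "attached to the inside of the cell membrane",
    "CLc": "in the cytoplasm",
    "NUc": "in the nucleus",
}


def parse_occurrence(label):
    """Split 'Name-mod1-mod2@Location' into name, modifications, location."""
    body, sep, location = label.partition("@")
    if not sep or not location:
        raise ValueError(f"no '@location' part in {label!r}")
    # Assumption: the first hyphen-separated field is the name,
    # every later field is a modification.
    name, *modifications = body.split("-")
    return {
        "name": name,
        "modifications": modifications,
        "location": location,
        # Receptor complexes such as EgfRC are also valid locations,
        # so an unknown abbreviation is not treated as an error.
        "location_description": LOCATIONS.get(location),
    }


for label in ["Cdc42-GDP@CLi", "Erks-act-phos(TEY)@CLc", "Gab1-Yphos@EgfRC"]:
    print(parse_occurrence(label))
```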

Darker colored ovals represent occurrences in the initial state (the selected dish when the network is generated by choosing a dish). Lighter colored ovals represent potential states/locations of these components.

Rectangles represent rules. The label in a rectangle is its (abbreviated) identifier in the knowledge base. Solid arrows from an occurrence to a rule indicate that the occurrence is a reactant (rule input). Solid arrows from a rule to an occurrence indicate that the occurrence is a product (rule output). Dashed arrows from an occurrence to a rule indicate that the occurrence is a modifier/enzyme/catalyst—it is necessary for the reaction to take place but is not changed by the reaction. The graph has been simplified by checking the Redraw? box followed by clicking on the 'HideEdges' button in the tool bar. This operation removes redundant edges. Meaning: if there are edges connecting A → B → C and A → C, then the edge leading from A to C will be removed, as it is logically implied by the other two edges. In the sequel all graphs will have been so simplified.
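The edge-hiding step described above is, in graph terms, a transitive reduction. The following Python sketch shows one way to compute it for a small acyclic graph stored as a dictionary of successor sets. It is an illustration of the idea only: the function names are hypothetical, the graph is assumed to be acyclic, and PLA's own implementation may differ.

```python
def reachable(graph, start, skip_edge):
    """Return the nodes reachable from start without using skip_edge."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if (node, nxt) == skip_edge or nxt in seen:
                continue
            seen.add(nxt)
            stack.append(nxt)
    return seen


def hide_redundant_edges(graph):
    """Return a copy of graph without edges that a longer path already implies."""
    reduced = {node: set(targets) for node, targets in graph.items()}
    for src in graph:
        for dst in sorted(graph[src]):
            # The edge src -> dst is redundant if dst can still be reached
            # from src when that edge is ignored.
            if dst in reachable(reduced, src, skip_edge=(src, dst)):
                reduced[src].discard(dst)
    return reduced


# The example from the text: A -> B -> C together with A -> C.
example = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
print(hide_redundant_edges(example))
```

In the example the edge from A to C is dropped, while A to B and B to C are kept.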

Subnets and Pathnets

The network shown in graph2:EgfDish includes all the reactions that fire in response to Egf. Suppose we are only interested in those reactions leading to the activation of Erks. This is accomplished by finding the occurrence that contains activated Erks, marking it as a goal, and asking for the relevant subnet:

Select Occurrences tab (in the lower right panel).
Type Erks in the Find Occurrence window.
Press on the Tab key until "Erks-act-phos(TEY)@CLc" is highlighted in the list below.
Click on the Find button. This will locate and highlight the requested occurrence node in the main view window.
Click on this node to open the context tab in the lower right panel.
Click on the "make occ a goal" button in the context window. The node will turn green.
Press the button labeled Subnet in the tool bar. Shortly a new tab will appear displaying a graph representing a subnet containing all the paths leading to activation of the Erks. (graph4:S(graph2)).
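Conceptually, the steps above ask for every rule that can contribute, directly or indirectly, to the goal. A simplified Python sketch of that backward collection follows. It is not PLA's algorithm: the function name `relevant_subnet` is hypothetical, the toy rules merely encode the traditional chain quoted in the introduction (they are not STM7 rules), and the sketch ignores whether a rule is actually reachable from the initial state.

```python
# Toy encoding of the traditional chain quoted earlier in this tour
# (Egf -> EgfR -> Grb2 -> Sos1 -> Ras -> Raf -> Meks -> Erks).
# Each rule is (inputs, outputs); rule names are hypothetical.
CHAIN = ["Egf", "EgfR", "Grb2", "Sos1", "Ras", "Raf", "Meks", "Erks"]
RULES = {
    f"toy{i}": ([src], [dst])
    for i, (src, dst) in enumerate(zip(CHAIN, CHAIN[1:]), start=1)
}
# A hypothetical side branch that never leads to Erks.
RULES["side"] = (["EgfR"], ["Unrelated"])


def relevant_subnet(rules, goals):
    """Collect the rules that can contribute to any of the goal occurrences."""
    wanted = set(goals)   # occurrences that still need a producing rule
    chosen = set()        # rules kept in the subnet
    changed = True
    while changed:
        changed = False
        for name, (inputs, outputs) in rules.items():
            if name not in chosen and wanted & set(outputs):
                chosen.add(name)
                wanted |= set(inputs)
                changed = True
    return chosen


subnet = relevant_subnet(RULES, ["Erks"])
print(sorted(subnet))
```

The side branch is left out of the result because none of its outputs is needed, directly or indirectly, to produce the goal.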

Screen shot of the subnet relevant to activation of Erks

Graph4 is still a network in that it contains more than one route to the activated Erks goal. Clicking on the FindPath button in the tool bar will find one path (usually the shortest) and display it in a new tab (graph6:P(graph4)).

Screen shot of a pathway activating Erks

Graph6 is called a pathnet because the set of rule nodes represents a particular flow of signal from initial state to goals (called an execution or computation).

This path is particularly simple because it uses "common rules" (indicated by "c" at the end of the rule number) that lack detailed biological context information. To find another path, go back to graph4, click on rule 014c, click on the "hide rule" button in the context tab on the right, and click on "FindPath" again (graph8:P(graph4)).

Another Erks pathway

If the Pathnet in graph8 is compared to the Subnet in graph4 (choose (graph4:S(graph2)) from the Compare menu), nodes of the path will be pink and alternative nodes will be highlighted in teal (graph11:C(graph8,graph4)).

The Erks pathway in the context of the subnet (edges hidden)

Comparison of the subnet with the pathnet shows that there is a choice of two major routes to the activation of Mlk3. Either Cdc42 or Rac1 can activate Mlk3. Cdc42 and Rac1 themselves are activated by different GEFs (guanine nucleotide exchange factors).

A pathnet using Rac1 instead of Cdc42 can be generated by returning to graph4 (the Erks subnet of EgfDish), marking Cdc42-GDP@CLi as an avoid (keeping 014c hidden), and pressing the FindPath button again (graph13:P(graph4)).
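The effect of hiding a rule or marking an occurrence as an avoid can be illustrated with a small forward search. The Python sketch below is a simplification under stated assumptions: `find_path` is a hypothetical name, the four toy rules only mirror the statement that Cdc42 and Rac1 are activated by different GEFs and can each activate Mlk3 (they are not STM7 rules), and occurrences are never consumed, which is looser than true Petri net semantics.

```python
# Hypothetical toy rules: name -> (inputs, outputs).
RULES = {
    "gef_cdc42": (["Egf:EgfR", "Cdc42-GDP"], ["Cdc42-GTP"]),
    "gef_rac1": (["Egf:EgfR", "Rac1-GDP"], ["Rac1-GTP"]),
    "mlk3_by_cdc42": (["Cdc42-GTP", "Mlk3"], ["Mlk3-act"]),
    "mlk3_by_rac1": (["Rac1-GTP", "Mlk3"], ["Mlk3-act"]),
}
INITIAL = {"Egf:EgfR", "Cdc42-GDP", "Rac1-GDP", "Mlk3"}


def find_path(rules, initial, goal, hidden=(), avoid=()):
    """Return the set of rules on one route to goal, or None if unreachable."""
    hidden, avoid = set(hidden), set(avoid)
    state = set(initial) - avoid
    producer = {}                    # occurrence -> rule that first produced it
    fired = True
    while fired and goal not in state:
        fired = False
        for name, (inputs, outputs) in rules.items():
            if name in hidden or avoid & (set(inputs) | set(outputs)):
                continue             # skip hidden rules and avoided occurrences
            if set(inputs) <= state:
                for occ in outputs:
                    if occ not in state:
                        producer[occ] = name
                        state.add(occ)
                        fired = True
    if goal not in state:
        return None
    # Walk back from the goal, keeping only the rules that were needed.
    needed, todo = set(), [goal]
    while todo:
        rule = producer.get(todo.pop())
        if rule is not None and rule not in needed:
            needed.add(rule)
            todo.extend(rules[rule][0])
    return needed


print(find_path(RULES, INITIAL, "Mlk3-act"))
print(find_path(RULES, INITIAL, "Mlk3-act", avoid=["Cdc42-GDP"]))
print(find_path(RULES, INITIAL, "Mlk3-act", hidden=["mlk3_by_cdc42"]))
```

With no restrictions the search settles on the Cdc42 route simply because those rules are listed first; avoiding Cdc42-GDP or hiding the Cdc42 rule forces the Rac1 route instead.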

Screen shot of a pathway activating Erks avoiding Cdc42

You can compare the two pathways by selecting the "Compare" menu from the menu bar, and selecting the original pathnet (the only one available). The pink shows the shared elements of the pathways, while the blue and teal indicate elements unique to the different pathways.
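The comparison amounts to sorting nodes into three groups: those shared by both graphs and those unique to each. A minimal Python sketch using set operations is given below; `compare_nets` and the rule names in the two example pathnets are hypothetical and serve only to illustrate the grouping that PLA displays with colors.

```python
def compare_nets(first, second):
    """Group node names into shared, only-in-first, and only-in-second."""
    first, second = set(first), set(second)
    return {
        "shared": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }


# Two hypothetical pathnets that differ only in how Mlk3 is activated.
path_using_cdc42 = {"gef_cdc42", "mlk3_by_cdc42", "erks_act"}
path_avoiding_cdc42 = {"gef_rac1", "mlk3_by_rac1", "erks_act"}

result = compare_nets(path_using_cdc42, path_avoiding_cdc42)
for group in ("shared", "only_first", "only_second"):
    print(group, sorted(result[group]))
```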

Comparing pathways activating Erks using and avoiding Cdc42

Rules

The little gray rectangular nodes represent rules. A rule describes the transition of the state of the proteins in a cell to another state. Only the proteins involved in the transition are mentioned in the rule. Consider rule 196 at the bottom of the Pathnet (graph8). (You can see this rule in its own window by clicking on the rule node and then clicking on the "exploreRule" button in the resulting information window. The rule is now displayed in an explorer window and the node colors indicate whether there are any connections in the parent network. Place the mouse over the color key in the upper right for details.) The rule says that if the following occurrences are present:

IqGap1@CLc, Braf-act@CLc, Mlk3-act@CLc, Pi3k@EgfRC, Fak2-act@EgfRC, Hras-GTP@CLi, Erks@CLc, Pld1@CLi, Shoc2@CLc, Gab1-Yphos@EgfRC, Git1-Yphos@EgfRC, Shp2-Yphos@EgfRC, Mek1-act@CLc, Shc1-Yphos@EgfRC, Src1-act@EgfRC and the Egf:EgfR complex

then the Erks will become TEY phosphorylated and activated.
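A rule of this kind can be read as a Petri net transition: it is enabled only when all of its required occurrences are present, and firing it replaces the reactants with the products while leaving the modifiers untouched. The Python sketch below illustrates this with a heavily abbreviated version of rule 196. The `Rule` class is hypothetical, only three of the listed requirements are included, and treating every requirement other than Erks@CLc as an unchanged modifier is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    reactants: frozenset   # removed from the state when the rule fires
    modifiers: frozenset   # required but left unchanged
    products: frozenset    # added to the state when the rule fires

    def enabled(self, state):
        """True if every reactant and modifier is present in state."""
        return (self.reactants | self.modifiers) <= state

    def fire(self, state):
        """Return the new state after firing; state itself is not changed."""
        if not self.enabled(state):
            raise ValueError(f"rule {self.name} is not enabled")
        return (state - self.reactants) | self.products


# Abbreviated, illustrative version of rule 196 (most requirements omitted).
rule196 = Rule(
    name="196",
    reactants=frozenset({"Erks@CLc"}),
    modifiers=frozenset({"Mek1-act@CLc", "Braf-act@CLc", "Hras-GTP@CLi"}),
    products=frozenset({"Erks-act-phos(TEY)@CLc"}),
)

state = frozenset({"Erks@CLc", "Mek1-act@CLc", "Braf-act@CLc", "Hras-GTP@CLi"})
after = rule196.fire(state)
print(sorted(after))

# Removing a required occurrence, as in a knock-out, disables the rule.
print(rule196.enabled(state - {"Mek1-act@CLc"}))
```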

Evidence.

Where did all those bizarre requirements come from? Clicking on the rule node will open the information window. Clicking on the Show Evidence button will call up a document containing the experimental evidence that supports (or not) the rule.

The evidence page is written in datum shorthand and may at first look uninterpretable. Instructions on how to read the evidence can be found in the curation notebook. For those who find it easier to read each reference and dig out the evidence for themselves, we have provided the PubMed ids and the Figure/Table numbers of the relevant experiments.

Discussion of rule 196.

The evidence page for rule 196 reports on 237 experiments in which Erk1, Erk2, or both were either phosphorylated on their TEY site (T202/Y204 for Erk1, T185/Y187 for Erk2) or activated in an in vitro kinase assay. The peak response occurred around 5-10 min. Forty-nine different cell lines or primaries were represented.

Things that are required

The requirements listed above for the activation of Erks in response to Egf (rule 196) are determined by performing the stimulation in cells lacking proteins due to knock-out at the gene level [KO], knock-down by RNA interference [RNAi], mutation at the gene level [substitution], or expression of a dominant-negative mutant [addition]. Some of the required proteins such as Mek1, Hras, and Shc1 are familiar and mentioned in most of the published Egf Pathway diagrams.

Surprisingly, although Raf1 is activated in response to Egf, there is plentiful evidence that Raf1 is NOT required for Erk phosphorylation or activation in response to Egf [11296227, 11296228, 12048245, 17314031, 15258589].

References [17314031, 11296228] also show that Mek1/2 activation in response to Egf does not require Raf1.

So what activates Mek1? Other kinases capable of phosphorylating Meks at the SMANS site (S218/S222 for Mek1, S222/S226 for Mek2) are Braf [17314031], Araf [8621729], Pak2 or Pak3 [17314031], and Mlk3 [12738796].

Could Araf substitute for Raf1? Evidence that Araf is activated in response to Egf is equivocal (click here), and although Araf has been shown to be capable of increasing the kinase activity of Mek1 in a cell-free system, no experiments have been found testing the requirement of Araf for either Mek1-phos/act or Erks-phos/act in response to Egf.

Activated Pak1, Pak2, and Pak3 can phosphorylate Mek1 directly on S298 which leads to Mek1 autophosphorylation on (S218/S222) [17314031 Fig-4a].

Although Pak1 has been shown to be activated in response to Egf, phosphorylation of Meks and Erks at their activation sites in response to Egf was not affected by RNAi to Pak1 and Pak2 [16129686-Fig-3a]. In addition, there are reports that Erks are not activated when cotransfected with active Pak1 [8805275 Fig-5a, 9351825 Fig-6b, 16698790 Fig-6b].

Pak3 can be excluded because it is not activated in response to Egf [11259591-Fig-5a].

Although Mlk3 can phosphorylate Mek1 directly [12738796] on an unknown site, the requirement for Mlk3 in the activation of Braf and Erks does not depend on its kinase activity [16537381].

Furthermore, over-expression of Mlk3 actually inhibits the activation of Erks in response to Egf [12738796].

This is indicative of a scaffold or adaptor protein rather than an active member of a phosphorylation cascade. Mlk3 is required for activation of Braf [16537381], Mek1 [15258589], and Erks [15258589, 16537381] in response to Egf.

It is unfortunate that the papers that demonstrated that Braf was required for Erk phosphorylation [15199148, 15258589] did not perform one additional experiment to show that Braf is required for Mek1 activation in response to Egf. One of the objectives of Pathway Logic is to make it easier to discover such omissions.

Src and Fak2 are recruited to the EgfR complex and phosphorylated in response to Egf so it is logical to wonder if they are part of the path that activates Erks. Erk activation is only partially inhibited by expression of dominant-negative Src and Fak2. A partial inhibition suggests that there may be something that substitutes for the proteins. That idea is supported by the information that activation of Src in response to Egf requires Rala [10675331] but Erk phosphorylation does not require either Rala [10675331] or RalGds [11283245, 18285454].

The locations

The interactions of some of the required proteins such as Fak2, Shc1, Sos1, Src, and Git1 with phosphorylated EgfR have been extensively studied so that we have information about their change in location and modifications in response to Egf. Others such as Shoc2 and IqGap1 are recent findings and more curation (or experimentation) is needed to determine location and modifications. These proteins have been placed in the cytoplasm without modifications until more information is available.

Interestingly, we have been unable to find any experiments in which Hras, Braf, Mek1, or Erks have been shown to coprecipitate (or not) with EgfR after Egf stimulation. Thus the accumulation of these proteins into the EgfR complex is merely a supposition of the curator. The lack of evidence for the location is duly noted at the top of each evidence file.

Things that are missing

A noticeable discrepancy is that Mek2 is not mentioned. Out of 43 experiments in which changes to Meks in response to Egf are recorded, only one of them looks at Mek2 specifically. In the cases where an anti-phospho-Mek antibody was used, the antibody was reported to be specific for only phospho-Mek1 or for both phospho-Mek1 and Mek2. There were no experiments that used an antibody specific for phospho-Mek2. Therefore, only Mek1 is represented in this pathnet.

Conclusions

The network presented here is not intended to be cast in stone. It is a set of hypotheses derived from the curated data that provide a logical explanation of how a signal propagates in response to an Egf stimulus.

This demo provides an illustration of:

How PLA can be used to help interpret experimental results.
The ease with which pathways can be assembled, drawn, and navigated.
How collecting experimental results in the form of datums can aid reasoning and discussion.
How gaps in experimental knowledge can be discovered in the process of formulating and checking the hypotheses.