JSON Datums for the Pathway Logic Signal Transduction Model
Here we provide access to a JSON representation of the Pathway Logic (PL) Signal Transduction Model (STM) and datum collection. The latest data was uploaded on December 28, 2016. It is assumed that the user is familiar with JSON and PL concepts. More information is available from the PL website or by request to Carolyn Talcott.
Datums are a computable representation of experimental results, used to infer rules in the STM models. The JSON representation, 16dec28datums.json, is 57 MB and contains over 60K datums. datum-json-scheme.txt is an informal description of the structure of datum objects. See the datums page for a description of datum elements.
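As a rough first look at the datum file, the sketch below loads a JSON document and reports its top-level shape. The top-level layout of 16dec28datums.json is not described on this page, so the code accepts either an array or an object. The helper names `summarize_top_level` and `summarize_file` are hypothetical, and the inline sample is invented so the snippet runs without the real file.

```python
import json


def summarize_top_level(data):
    """Return (kind, size) for parsed JSON: 'array', 'object', or 'scalar'."""
    if isinstance(data, list):
        return ("array", len(data))
    if isinstance(data, dict):
        return ("object", len(data))
    return ("scalar", 1)


def summarize_file(path):
    """Load a JSON file (for example 16dec28datums.json) and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_top_level(json.load(handle))


# Invented stand-in data; real datum objects follow datum-json-scheme.txt.
sample = json.loads('[{"placeholder": 1}, {"placeholder": 2}]')
print(summarize_top_level(sample))
```

With the real file in the working directory, `summarize_file("16dec28datums.json")` would give a quick sanity check against the "over 60K datums" figure, provided the datums sit directly at the top level.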
JSONB
The JSONB directory contains a full controlled vocabulary and is available as a zip file.
- maps PL names to descriptions, including metadata with fields for reference database accession and synonyms; includes all of the proteins, chemicals, and ops curated, whether or not they appear in a rule
- the sort hierarchy for the executable model
- the sort hierarchy for datum parsing; includes the above, plus things like handles, assays, detection ops, etc.
The Rule Knowledge Base (RKB) comes in two flavors: one generated with common rules included and one generated without them. (A common rule's label begins with digits followed by "c."; any other rule label begins with digits followed by ".".) The former is much bigger. Dishes represent experimental setups: what the cells are expressing and any treatment. Dishnets add reachable rules to the dish, making an executable model. In some dishes common rules are essential to fill gaps. In others, like the more highly curated EgfDish, they tend to add noise. Rather than the simple add-them-all approach that the Pathway Logic Assistant (PLA) takes, it may be useful to start with a model generated without common rules and selectively add those that improve the model.
The JSON for the RKB without common rules is contained in the zip file cltJSONUnc.zip. It contains:
- as above restricted to entities used
- descriptions of locations
- descriptions of modifications
- descriptions of occurrences
- each rule has a name/identifier and three lists of occurrence names: controls, consumed, and produced. Consumed and produced are the rule input (left-hand side) and output (right-hand side). Controls represents the required biological context, such as enzymes, cofactors, etc. Each rule also has an evidence field.
- JSON for each dish (experimental setup) and its reachable rules.
- Each dishnet has fields init (names of the occurrences in the initial state), and rids (names of the rules).
- The names of occurrences and rules refer to the descriptions in occs.json and rules.json.
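The cross-references described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: rules.json and occs.json are assumed to parse into objects keyed by name, the rule fields are assumed to be spelled `controls`, `consumed`, and `produced`, and all names in the sample data are invented. Only `init` and `rids` are field names given on this page. The helpers `unresolved_names` and `enabled_rules` are hypothetical, and treating a rule as enabled when all of its controls and consumed occurrences are in the initial state is one plausible reading, not a statement of how PLA executes a model.

```python
# Invented sample data standing in for rules.json, occs.json, and one dishnet.
rules = {
    "rule-1": {"controls": ["occ-c"], "consumed": ["occ-a"],
               "produced": ["occ-b"], "evidence": []},
    "rule-2": {"controls": [], "consumed": ["occ-b"],
               "produced": ["occ-d"], "evidence": []},
}
occs = {"occ-a": {}, "occ-b": {}, "occ-c": {}, "occ-d": {}}
dishnet = {"init": ["occ-a", "occ-c"], "rids": ["rule-1", "rule-2"]}


def unresolved_names(dishnet, rules, occs):
    """Return (missing_rules, missing_occs): names with no description."""
    missing_rules = [rid for rid in dishnet["rids"] if rid not in rules]
    used = set(dishnet["init"])
    for rid in dishnet["rids"]:
        rule = rules.get(rid, {})
        for field in ("controls", "consumed", "produced"):
            used.update(rule.get(field, []))
    missing_occs = sorted(name for name in used if name not in occs)
    return missing_rules, missing_occs


def enabled_rules(dishnet, rules):
    """Rules whose controls and consumed occurrences are all in init."""
    present = set(dishnet["init"])
    enabled = []
    for rid in dishnet["rids"]:
        rule = rules.get(rid)
        if rule is None:
            continue  # unresolved rule name; reported by unresolved_names
        needed = set(rule["controls"]) | set(rule["consumed"])
        if needed <= present:
            enabled.append(rid)
    return enabled


print(unresolved_names(dishnet, rules, occs))
print(enabled_rules(dishnet, rules))
```

Running the reference check first is a cheap way to confirm that a dishnet and the description files come from the same RKB flavor before doing anything more elaborate with them.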
- 1575 rules
- 1378 occurrences
- The EgfDish contains 215 rules, 127 initial occurrences
- STM with Common Rules
- 5220 rules
- 2216 occurrences
- The EgfDish contains 904 rules, 144 initial occurrences
- STM without Common Rules
- 2131 rules
- 1783 occurrences
- The EgfDish contains 297 rules, 150 initial occurrences
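To see how much of a rule set comes from common rules, the labelling convention (digits followed by "c." for common rules, digits followed by "." otherwise) can be checked with a regular expression. The following is a sketch only: the helper names `is_common_rule` and `split_rules` are hypothetical, and the example labels are invented to match the stated convention rather than taken from the RKB.

```python
import re

# A common rule label starts with digits followed by "c.".
COMMON_LABEL = re.compile(r"^\d+c\.")


def is_common_rule(label):
    """True if the label follows the common-rule convention."""
    return COMMON_LABEL.match(label) is not None


def split_rules(labels):
    """Partition labels into (common, other), preserving order."""
    common = [label for label in labels if is_common_rule(label)]
    other = [label for label in labels if not is_common_rule(label)]
    return common, other


# Invented labels that merely follow the stated convention.
labels = ["001.example.rule", "002c.example.common.rule", "003.another.rule"]
common, other = split_rules(labels)
print(len(common), "common;", len(other), "other")
```

Applied to the rule names of a dishnet generated with common rules, a split like this would make it possible to add back only selected common rules instead of all of them.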