Help:WikiPathways Sparql queries


WikiPathways content is replicated at the SPARQL endpoint http://sparql.wikipathways.org/. This endpoint is still under development.

This project is written up in the "Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources" paper.


Resources


Submit ideas

Prefixes

The example queries below omit most prefix declarations for readability. They use the following prefixes (the list is not yet complete):

PREFIX gpml:    <http://vocabularies.wikipathways.org/gpml#>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#> 
PREFIX wprdf:   <http://rdf.wikipathways.org/>
PREFIX biopax:  <http://www.biopax.org/release/biopax-level3.owl#> 
PREFIX cas:     <http://identifiers.org/cas/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/> 
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/> 
PREFIX ncbigene:<http://identifiers.org/ncbigene/>
PREFIX pubmed:  <http://www.ncbi.nlm.nih.gov/pubmed/> 
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos:    <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#> 
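
Because the example queries leave out the prefix declarations, a client has to put them back before sending a query. The following is a minimal Python sketch of that step, not part of WikiPathways itself: the helper name with_prefixes and the PREFIXES dictionary are made up for illustration, and only a handful of the prefixes listed above are included.

```python
# Illustrative helper (hypothetical, not a WikiPathways API): prepend PREFIX
# declarations to a query body that was written without them.
PREFIXES = {
    "wp": "http://vocabularies.wikipathways.org/wp#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def with_prefixes(query_body, prefixes=PREFIXES):
    """Return the query with one PREFIX line per entry in `prefixes`."""
    header = "\n".join(
        f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()
    )
    return header + "\n\n" + query_body.strip() + "\n"


# A prefix-free query body in the style of the examples on this page.
query = with_prefixes("""
SELECT DISTINCT ?pathway
WHERE { ?pathway a wp:Pathway . }
LIMIT 5
""")
print(query)
```

Declaring a prefix that a query does not use is harmless in SPARQL, so one shared header can be reused for all examples.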

Example queries

Queries with a * require a bit more time for results.

Metadata queries

List information about the data sets in the SPARQL endpoint:

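PREFIX void:    <http://rdfs.org/ns/void#>
PREFIX pav:     <http://purl.org/pav/>
PREFIX dcterms: <http://purl.org/dc/terms/>

select distinct ?dataset str(?titleLit) as ?title ?date ?license where {
  ?dataset a void:Dataset ;
    dcterms:title ?titleLit ;
    dcterms:license ?license ;
    pav:createdOn ?date .
}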

Execute


Pathway oriented queries

Get the species currently in WikiPathways with their respective URIs

PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?organism str(?label) as ?name
WHERE {
    ?concept wp:organism ?organism ;
      wp:organismName ?label .
}

Execute

List pathways and their species

PREFIX dc:      <http://purl.org/dc/elements/1.1/> 
PREFIX wp:     <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT str(?title) as ?pathway str(?label) as ?organism 
WHERE {
    ?pw dc:title ?title ;
      wp:organism ?organism ;
      wp:organismName ?label .
} 

Execute

List the species captured in WikiPathways and the number of pathways per species

PREFIX dc:      <http://purl.org/dc/elements/1.1/> 
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?organism str(?label) as ?name count(?pw) as ?pathwayCount
WHERE {
    ?pw dc:title ?title ;
      wp:organism ?organism ;
      wp:organismName ?label .
}
ORDER BY DESC(?pathwayCount)

Execute

List all pathways for species "Mus musculus"

The following query lists all mouse pathways. ?wpIdentifier is the link through identifiers.org, ?pathway points to the RDF version of WikiPathways, and ?page is the revision that is loaded in the SPARQL endpoint.

PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:      <http://purl.org/dc/elements/1.1/> 
PREFIX foaf:    <http://xmlns.com/foaf/0.1/> 
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?wpIdentifier ?pathway ?page
WHERE {
    ?pathway dc:title ?title .
    ?pathway foaf:page ?page .
    ?pathway dc:identifier ?wpIdentifier .
    ?pathway wp:organismName "Mus musculus"^^xsd:string .
 }
ORDER BY ?wpIdentifier

Execute Perl




Get all pathways with a particular gene

List all pathways per instance of a particular gene or protein (wp:GeneProduct)

PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?pathway str(?label) as ?geneProduct
WHERE {
    ?geneProduct a wp:GeneProduct . 
    ?geneProduct rdfs:label ?label .
    ?geneProduct dcterms:isPartOf ?pathway .
    ?pathway a wp:Pathway .
    
    FILTER regex(str(?label), "CYP"). 
}

Execute

Get all groups and complexes containing a particular gene

List all groups and complexes per instance of a particular gene or protein (wp:GeneProduct)

PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?pathway str(?label) as ?geneProduct
WHERE {
    ?geneProduct a wp:GeneProduct . 
    ?geneProduct rdfs:label ?label .
    ?geneProduct dcterms:isPartOf ?pathway .

    FILTER NOT EXISTS { ?pathway a wp:Interaction } .
    FILTER NOT EXISTS { ?pathway a wp:Pathway } .
    FILTER regex(str(?label), "CYP"). 
}

Execute

Get all the genes on a particular pathway

List all the genes and proteins (wp:GeneProduct) associated with a particular pathway WPID.

PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>

select distinct ?pathway str(?label) as ?geneProduct where {
    ?geneProduct a wp:GeneProduct . 
    ?geneProduct rdfs:label ?label .
    ?geneProduct dcterms:isPartOf ?pathway .
    ?pathway a wp:Pathway .
    ?pathway dcterms:identifier "WP1560"^^xsd:string . 
}

Execute

Count the number of pathways per ontology term

In WikiPathways, pathways can be tagged with terms from the Pathway, Cell Line and Disease ontologies. The following query returns a pathway count for each term from any of these ontologies. The terms are collectively modeled as wp:pathwayOntology, which covers all of these ontologies, not just the "Pathway" ontology.

SELECT DISTINCT ?pwOntologyTerm count(?pwOntologyTerm) as ?pathwayCount
 WHERE {
	?pathwayRDF wp:ontologyTag ?pwOntologyTerm .
 }
 ORDER BY DESC(?pathwayCount)

Execute

Get all pathways with a particular ontology term

In WikiPathways, pathways can be tagged with terms from the Pathway, Cell Line and Disease ontologies. The following query returns a list of pathways tagged with a given term from any of the supported ontologies. The terms are collectively modeled as wp:pathwayOntology, which covers all of these ontologies, not just the "Pathway" ontology.

SELECT ?label as ?pwOntologyTerm ?pathway
WHERE {
    ?pathwayRDF wp:pathwayOntology ?o .
    ?pathwayRDF foaf:page ?pathway .
    ?pathwayRDF dc:title ?title .
    ?o rdfs:label ?label .
    ?o rdfs:subClassOf ?superClass .
    ?superClass rdfs:label ?superClassLabel .

    FILTER regex(str(?label), "^cancer$")
}

Execute

Get all ontology terms for a particular pathway

List all the ontology terms tagged on a particular pathway.

SELECT ?o as ?pwOntologyTerm str(?titleLit) as ?title ?pathway 
WHERE {
  ?pathwayRDF wp:ontologyTag ?o ;
    foaf:page ?pathway ;
    dc:title ?titleLit ;
    dcterms:identifier "WP1560"^^xsd:string . 

  FILTER (! regex(str(?pathway), "group"))
}

Execute

Get all pathways with PubMed references

SELECT DISTINCT ?pathway ?pubmed 
WHERE 
     {?pubmed a       wp:PublicationReference . 
      ?pubmed dcterms:isPartOf ?pathway }
ORDER BY ?pathway

Execute


Get all pathways with a particular PubMed reference

SELECT DISTINCT ?pathway ?pubmed 
WHERE {
      ?pubmed a       wp:PublicationReference . 
      ?pubmed dcterms:isPartOf ?pathway .

      FILTER regex(str(?pubmed), "14769483$") 
}
ORDER BY ?pathway

The $ at the end of the PubMed identifier ensures that, for example, 147694831 does not match too; the regex instruction '$' means "end of the string".

Execute
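
The effect of the $ anchor can be tried outside the endpoint. The sketch below uses Python's re module, whose $ behaves the same way for these single-line strings; the two IRIs are assumptions built from the pubmed: prefix listed earlier on this page and are used only for illustration.

```python
import re

# Same pattern as in the FILTER above, plus an unanchored variant for comparison.
anchored = re.compile(r"14769483$")
unanchored = re.compile(r"14769483")

# Illustrative IRIs (assumed form, based on the pubmed: prefix on this page).
exact = "http://www.ncbi.nlm.nih.gov/pubmed/14769483"
longer = "http://www.ncbi.nlm.nih.gov/pubmed/147694831"

for iri in (exact, longer):
    print(iri, bool(anchored.search(iri)), bool(unanchored.search(iri)))
```

Note that the pattern is anchored only at the end, so an identifier that merely ends in the same digits would still match; including the preceding slash in the pattern ("/14769483$") is one way to tighten it.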

Get all pathways and the number of references per pathway

SELECT DISTINCT ?pathway COUNT(?pubmed) AS ?numberOfReferences
WHERE 
     {?pubmed a       wp:PublicationReference . 
      ?pubmed dcterms:isPartOf ?pathway }
ORDER BY DESC(?numberOfReferences) 

Execute

Get a full dump of all pathways and their pathway ontological terms

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX schema: <http://schema.org/>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms:  <http://purl.org/dc/terms/>

SELECT DISTINCT ?depicts str(?titleLit) as ?title str(?speciesLabelLit) as ?speciesLabel ?identifier ?ontology
WHERE {
	?pathway foaf:page ?depicts .
        ?pathway dc:title ?titleLit .
        ?pathway wp:organism ?species .
        ?pathway wp:organismName ?speciesLabelLit .
        ?pathway dc:identifier ?identifier .

        OPTIONAL {?pathway wp:ontologyTag ?ontology .}
} 

Execute


Data statistics oriented queries

Count the number of metabolites per species

Strictly speaking, this query only estimates the number, because it counts unique metabolite identifiers. Normalization in the RDF generation code ensures that metabolites with identifiers from different databases are not counted twice, but different charge states of the same metabolite are still counted separately.

PREFIX gpml:    <http://vocabularies.wikipathways.org/gpml#>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

select (count(distinct ?metabolite) as ?count) (str(?label) as ?species) where {
  ?metabolite a wp:Metabolite ;
    dcterms:isPartOf ?pw .
  ?pw dc:title ?title ;
    wp:organism ?organism ;
    wp:organismName ?label .
} GROUP BY ?label ORDER BY DESC(?count)

Execute

Interaction oriented queries

Get all interactions for a particular datanode

Find all interactions (wp:Interaction) that are connected to a particular datanode.

PREFIX gpml:    <http://vocabularies.wikipathways.org/gpml#>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

#Find all interactions that are connected to a particular datanode.

SELECT DISTINCT ?interaction ?pathway  WHERE {

   ?pathway a wp:Pathway .
   ?interaction dcterms:isPartOf ?pathway . 
   ?interaction a wp:Interaction . 
   ?interaction wp:participants <http://identifiers.org/ensembl/ENSG00000125845> .   
}

Execute

Find all datanodes (GeneProducts, Metabolites, Pathways) that are connected to a particular datanode via any type of interaction (wp:Interaction).

SELECT DISTINCT ?participants ?DataNodeLabel ?interaction WHERE {
   ?interaction a wp:Interaction .
   
   ?interaction wp:participants <http://identifiers.org/ensembl/ENSG00000125845> .   
   ?interaction wp:participants ?participants .
   ?participants a wp:DataNode .
   ?participants rdfs:label ?DataNodeLabel .   
}

Execute

Get all interactions for a particular pathway.

PREFIX wp:    <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?pathway ?interaction   
WHERE {

   ?pathway a wp:Pathway .
   ?pathway dc:identifier <http://identifiers.org/wikipathways/WP1425> .
   ?interaction dcterms:isPartOf ?pathway . 
   ?interaction a wp:Interaction .  
}

Execute

Get all interactions for a particular pathway and their participants.

PREFIX wp:    <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?pathway ?interaction ?participants ?DataNodeLabel
WHERE {

   ?pathway a wp:Pathway .
   ?pathway dc:identifier <http://identifiers.org/wikipathways/WP1425> .
   ?interaction dcterms:isPartOf ?pathway . 
   ?interaction a wp:Interaction .
   ?interaction wp:participants ?participants .
   ?participants a wp:DataNode .
   ?participants rdfs:label ?DataNodeLabel .  
}

Execute

Get all Interactions.

Limited to 1000 interactions.

PREFIX wp:    <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?pathway ?interaction ?participant
WHERE {

   ?pathway a wp:Pathway . 
   ?interaction dcterms:isPartOf ?pathway .
   ?interaction a wp:Interaction .
   ?interaction wp:participants ?participant . 
}

LIMIT 1000


Execute

Get all Interactions for a species (Homo sapiens).

PREFIX wp:    <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?pathway ?interaction   
WHERE {

   ?pathway a wp:Pathway .
   ?pathway wp:organismName "Homo sapiens"^^xsd:string . 
   ?interaction dcterms:isPartOf ?pathway . 
   ?interaction a wp:Interaction .  
}


Execute


Get downstream adjacent nodes from a source.

A directed interaction always runs from source to target (e.g. s --> t).

PREFIX wp:    <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?source ?label ?target ?label1 ?pathway ?interaction
WHERE {
    ?source dc:identifier <http://identifiers.org/ensembl/ENSG00000125845> .
    ?source dcterms:isPartOf ?pathway .
    ?pathway a wp:Pathway .

    ?interaction dcterms:isPartOf ?pathway .
    ?interaction a wp:Interaction .
    ?interaction wp:source ?source .
    ?interaction wp:target ?target .

    ?source rdfs:label ?label .
    ?target rdfs:label ?label1 .
 }


Execute

Get upstream adjacent nodes from a target.

A directed interaction always runs from source to target (e.g. t <-- s).

PREFIX wp:    <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?target ?label1 ?source ?label ?pathway ?interaction
WHERE {
    ?target dc:identifier <http://identifiers.org/ncbigene/659> .
    ?target dcterms:isPartOf ?pathway .
    ?pathway a wp:Pathway .

    ?interaction dcterms:isPartOf ?pathway .
    ?interaction a wp:Interaction .
    ?interaction wp:target ?target .
    ?interaction wp:source ?source .

    ?target rdfs:label ?label1 .
    ?source rdfs:label ?label .  
 }


Execute

Datasource oriented queries

Get all datasources currently captured in WikiPathways

SELECT DISTINCT str(?datasourceLit) as ?datasource  
WHERE {
         ?concept dc:source ?datasourceLit
} 

Execute

Get the number of entries per datasource in WikiPathways

SELECT DISTINCT str(?datasourceLit) as ?datasource count(?dataNode) as ?numberEntries 
WHERE {
  ?concept dc:source ?datasourceLit ;
    wp:isAbout ?dataNode .
} 
ORDER BY DESC(?numberEntries)

Execute

Count the identifiers per data source

SELECT str(?datasourceLit) as ?datasource count(distinct ?identifier) AS ?numberEntries 
WHERE {
  ?concept dc:source ?datasourceLit .
  ?concept dc:identifier ?identifier
} 

Execute

Count the identifiers per data source and order them from high to low

SELECT str(?datasourceLit) as ?datasource count(distinct ?identifier) AS ?numberEntries 
WHERE {
  ?concept dc:source ?datasourceLit .
  ?concept dc:identifier ?identifier
} ORDER BY DESC(?numberEntries)

Execute

Return all compounds annotated with "ChEMBL compound" as data source and the pathways they are in

SELECT DISTINCT ?identifier ?pathway
WHERE {
        ?concept dcterms:isPartOf ?pathway .
        ?concept dc:source "ChEMBL compound"^^xsd:string .
        ?concept dc:identifier ?identifier .
        
} 

Execute

Curator oriented queries

Get the pathway with the erroneous data source "null"

SELECT DISTINCT  ?identifier ?pathway ?label
WHERE {
        ?concept dc:source "null"^^xsd:string .
        ?concept dc:identifier ?identifier .
        ?concept dcterms:isPartOf ?pathway .
        ?concept rdfs:label ?label
} 

sparqlbin Execute

Get all gene products that lack either a DataSource or an Identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select distinct ?pathway ?label where {?geneProduct a wp:GeneProduct . 
      ?geneProduct rdfs:label ?label .
      ?geneProduct dcterms:isPartOf ?pathway .
      
      FILTER regex(str(?geneProduct), "^node"). 
      FILTER regex(str(?pathway), "^http").
      }

Execute

Get entities with more than one identifier

select ?entity count(?identifier) as ?count where {
  ?entity <http://purl.org/dc/terms/identifier> ?identifier .
} order by desc(?count) 

Execute




PubChem-compound 1004

Warning: may time out, which indicates there is no current use of this identifier value.

This identifier is wrongly used for phosphate: CID 1004 is the uncharged compound. Phosphate, and particularly things like "Pi", should instead use CID 1061 for ortho-phosphate, aka [PO4]2-.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select ?pathway ?source
where {
  ?mb dc:source ?source ;
    dcterms:isPartOf ?pathway ;
    dcterms:identifier "1004"^^xsd:string .
}

Execute

Outdated HMDB identifiers

This query lists HMDB identifiers that are used in WikiPathways but have been revoked or have become secondary identifiers.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select distinct str(?identifierStr) as ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "HMDB"^^xsd:string ;
    dcterms:identifier ?identifierStr .
  OPTIONAL { ?mb  wp:bdbHmdb ?bridgedb . }
  FILTER (!BOUND(?bridgedb))
} order by ?identifierStr

Execute

Metabolites not classified as such

One can list all data sources for non-metabolites with this query.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select str(?datasourceLit) as ?datasource count(?identifier) as ?count
where {
  ?mb dc:source ?datasourceLit ;
    dcterms:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
} order by desc(?count)

Execute

This mostly lists gene identifier sources and the like, but watch out for the metabolite identifier data sources: metabolites that are not marked as such but do have a metabolite identifier can be found this way. Further down the list are CAS (but genes are chemicals too...) and a few minor ones:

HMDB 		4
ChEBI 		3
GLYCAN 		3
COMPOUND 	3
PubChem 	2

I would expect GLYCAN and COMPOUND to be misnomers of the matching KEGG subsets.

Metabolites sometimes marked as DataNode@Type Metabolite

Based on label comparisons, we can find data nodes that are not marked as a metabolite but have the same label as a data node that is. Of course, this can give false positives, because genes can be incorrectly marked as metabolite in some pathway, but that is another SPARQL query. Another reason is that genes and metabolites sometimes really do have the same name!

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select ?pathway ?nonmb ?mb str(?labelLit) as ?label
where {
  ?nonmb rdfs:label ?labelLit .
  ?mb rdfs:label ?labelLit .
  ?nonmb dcterms:isPartOf ?pathway .
  FILTER ( ?nonmb != ?mb )
  FILTER NOT EXISTS { ?nonmb a wp:Metabolite }
  FILTER EXISTS { ?pathway a wp:Pathway }
  FILTER EXISTS { ?mb a wp:Metabolite }
}

Execute

Metabolites with too many labels

This query can result in false positives too, particularly with the new RDF.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select distinct ?mb count(distinct ?labelLit) as ?labelCount
where {
  ?mb a wp:Metabolite ;
    rdfs:label ?labelLit ;
    dcterms:isPartOf ?pathway .
  ?pathway a wp:Pathway .
} order by desc(?labelCount) ?mb 

Execute

And get the actual labels (and more) with:

describe <http://identifiers.org/hmdb/HMDB02174>

Execute

Or use this one, but mind that pathway/label combinations are combinatorial, because they share the same node:

select distinct ?pathway str(?labelLit) as ?label
where {
  <http://identifiers.org/hmdb/HMDB01401> a wp:Metabolite;
    rdfs:label ?labelLit ;
    dcterms:isPartOf ?pathway .
  ?pathway a wp:Pathway .
} order by ?pathway

Execute

Metabolites with an Entrez Gene identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb str(?labelLit) as ?label str(?identifierLit) as ?identifier 
where {
  ?mb a wp:Metabolite ;
    rdfs:label ?labelLit ;
    dc:source "Entrez Gene"^^xsd:string ;
    dcterms:identifier ?identifierLit ;
    dcterms:isPartOf ?pathway .
  ?pathway a wp:Pathway .
} order by ?pathway

Execute

Metabolites without a link to Wikidata

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?metabolite WHERE {
  ?metabolite a wp:Metabolite .
  OPTIONAL { ?metabolite wp:bdbWikidata ?wikidata . }
  FILTER (!BOUND(?wikidata))
}

Execute

A variant sorting the metabolites by the number of pathways they occur in:

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?metabolite (count(DISTINCT ?pathwayRes) as ?pathways) WHERE {
  ?metabolite a wp:Metabolite ;
    dcterms:identifier ?id ;
    dcterms:isPartOf ?pathwayRes .
  ?pathwayRes a wp:Pathway .
  OPTIONAL { ?metabolite wp:bdbWikidata ?wikidata . }
  FILTER (!BOUND(?wikidata))
} GROUP BY ?metabolite ORDER BY DESC(?pathways)

Execute

Federated queries - !Under Construction!

Other SPARQL endpoints used in the federated queries



WikiPathways with ChEMBL: all ChEMBL assays for pathways

SELECT ?pathway ?target ?assay WHERE {
{
  SELECT DISTINCT
    ?pathway ?uniprot
    iri(
      bif:concat("http://bio2rdf.org/uniprot:",
      bif:regexp_substr('http://identifiers.org/uniprot/(.*)',?uniprot, 1))
    ) as ?chembluniprot
  WHERE {
    ?s ?p ?uniprot .
    ?s dcterms:isPartOf ?pathway .
    FILTER regex(?uniprot, "uniprot")
  }
}
  SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/> {
    ?target owl:sameAs ?chembluniprot .
    ?score chembl:forTarget ?target .
    ?assay chembl:hasTargetScore ?score .
}
}

Execute

WikiPathways with ChEMBL: all molecules targeting pathways

SELECT ?pathway ?target ?assay ?smiles WHERE {
{
  SELECT DISTINCT
    ?pathway ?uniprot
    iri(
      bif:concat("http://bio2rdf.org/uniprot:",
      bif:regexp_substr('http://identifiers.org/uniprot/(.*)',?uniprot, 1))
    ) as ?chembluniprot
  WHERE {
    ?s ?p ?uniprot .
    ?s dcterms:isPartOf ?pathway .
    FILTER regex(?uniprot, "uniprot")
  }
}
  SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/> {
    ?target owl:sameAs ?chembluniprot .
    ?score chembl:forTarget ?target .
    ?assay chembl:hasTargetScore ?score .
    ?activity chembl:onAssay ?assay ;
      chembl:forMolecule ?molecule .
    ?molecule bo:smiles ?smiles .
    
}
}

Execute

WikiPathways with EBI Atlas RDF - !Under Construction!

Genes differentially expressed in asthma and Pathways

For the genes differentially expressed in asthma, get the gene products associated with a WikiPathways pathway. (Built upon example query 5 in: http://www.ebi.ac.uk/rdf/services/atlas/sparql ). You can substitute the EFO identifier with that of another disease.

PREFIX identifiers:<http://identifiers.org/ensembl/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>

SELECT DISTINCT ?wpURL ?pwTitle ?expressionValue ?pvalue where {

SERVICE <https://www.ebi.ac.uk/rdf/services/atlas/sparql> {
     ?factor rdf:type efo:EFO_0000270 . 
     ?value atlasterms:hasFactorValue ?factor . 
     ?value atlasterms:isMeasurementOf ?probe . 
     ?value atlasterms:pValue ?pvalue . 
     ?value rdfs:label ?expressionValue . 
     ?probe atlasterms:dbXref ?dbXref .
}
     ?pwElement dcterms:isPartOf ?pathway .
     ?pathway dc:title ?pwTitle .
     ?pathway dc:identifier ?wpURL .
     ?pwElement wp:bdbEnsembl ?dbXref .
}
ORDER BY ASC(?pvalue)

Execute

Genes differentially expressed in type II diabetes mellitus and Pathways

PREFIX identifiers:<http://identifiers.org/ensembl/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>

SELECT DISTINCT ?wpURL ?pwTitle ?expressionValue ?pvalue where {

SERVICE <https://www.ebi.ac.uk/rdf/services/atlas/sparql> {
     ?factor rdf:type efo:EFO_0001360 . 
     ?value atlasterms:hasFactorValue ?factor . 
     ?value atlasterms:isMeasurementOf ?probe . 
     ?value atlasterms:pValue ?pvalue . 
     ?value rdfs:label ?expressionValue . 
     ?probe atlasterms:dbXref ?dbXref .
}
     ?pwElement dcterms:isPartOf ?pathway .
     ?pathway dc:title ?pwTitle .
     ?pathway dc:identifier ?wpURL .
     ?pwElement wp:bdbEnsembl ?dbXref .
}
ORDER BY ASC(?pvalue)

Execute

Genes differentially expressed in obesity and Pathways

PREFIX identifiers:<http://identifiers.org/ensembl/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>

SELECT DISTINCT ?wpURL ?pwTitle ?expressionValue ?pvalue where {
SERVICE <https://www.ebi.ac.uk/rdf/services/atlas/sparql> {
     ?factor rdf:type efo:EFO_0001073 . 
     ?value atlasterms:hasFactorValue ?factor . 
     ?value atlasterms:isMeasurementOf ?probe . 
     ?value atlasterms:pValue ?pvalue . 
     ?value rdfs:label ?expressionValue . 
     ?probe atlasterms:dbXref ?dbXref .
}
     ?pwElement dcterms:isPartOf ?pathway .
     ?pathway dc:title ?pwTitle .
     ?pathway dc:identifier ?wpURL .
     ?pwElement wp:bdbEnsembl ?dbXref .
}
ORDER BY ASC(?pvalue)

Execute

WikiPathways with Wikidata - !Under Construction!

Metabolites in WikiPathways with InChIKeys from Wikidata

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?metabolite ?wikidata ?inchikey WHERE {
  ?metabolite a wp:Metabolite ;
    wp:bdbWikidata ?wikidata .
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidata wdt:P235 ?inchikey .
  }
} LIMIT 10

Execute


Identifier to WikiPathways lists

List of WikiPathways for Ensembl identifiers

select distinct ?pathwayRes str(?wpid) as ?pathway str(?title) as ?pathwayTitle fn:substring(?ensId,32) as ?ensembl where {
  ?gene a wp:GeneProduct ;
    dcterms:identifier ?id ;
    dcterms:isPartOf ?pathwayRes ;
    wp:bdbEnsembl ?ensId .
  ?pathwayRes a wp:Pathway ;
    dcterms:identifier ?wpid ;
    dc:title ?title .
}

Execute
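
The offset 32 passed to fn:substring in the query above appears to be chosen to skip the IRI prefix. Assuming the wp:bdbEnsembl values have the form http://identifiers.org/ensembl/ followed by the identifier (an assumption based on the Ensembl IRIs used elsewhere on this page), that prefix is 31 characters long, and XPath substring positions start at 1. The Python sketch below reproduces the arithmetic; xpath_substring is a made-up stand-in for fn:substring, not a library function.

```python
def xpath_substring(value, start):
    """Mimic fn:substring(value, start) for an integer start >= 1 (1-based)."""
    return value[start - 1:]


# Assumed IRI form; the identifier is the one used in the interaction queries.
ENSEMBL_PREFIX = "http://identifiers.org/ensembl/"
iri = ENSEMBL_PREFIX + "ENSG00000125845"

# The first character after the prefix sits at 1-based position len(prefix) + 1.
offset = len(ENSEMBL_PREFIX) + 1
print(offset)                        # 32
print(xpath_substring(iri, offset))  # ENSG00000125845
```

The same reasoning would explain the different offsets in the queries for the other identifier types, since each IRI prefix has a different length.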

List of WikiPathways for HGNC symbols

select distinct ?pathwayRes str(?wpid) as ?pathway str(?title) as ?pathwayTitle fn:substring(?hgncId,36) as ?HGNC where {
  ?gene a wp:GeneProduct ;
    dcterms:identifier ?id ;
    dcterms:isPartOf ?pathwayRes ;
    wp:bdbHgncSymbol ?hgncId .
  ?pathwayRes a wp:Pathway ;
    dcterms:identifier ?wpid ;
    dc:title ?title .
}

Execute

List of WikiPathways for NCBI Gene identifiers

select distinct ?pathwayRes str(?wpid) as ?pathway str(?title) as ?pathwayTitle fn:substring(?ncbiGeneId,33) as ?NCBIGene where {
  ?gene a wp:GeneProduct ;
    dcterms:identifier ?id ;
    dcterms:isPartOf ?pathwayRes ;
    wp:bdbEntrezGene ?ncbiGeneId .
  ?pathwayRes a wp:Pathway ;
    dcterms:identifier ?wpid ;
    dc:title ?title .
}

Execute

List of WikiPathways for HMDB identifiers

select distinct ?pathwayRes str(?wpid) as ?pathway str(?title) as ?pathwayTitle fn:substring(?hmdbId,29) as ?hmdb where {
  ?metabolite a wp:Metabolite ;
    dcterms:identifier ?id ;
    dcterms:isPartOf ?pathwayRes ;
    wp:bdbHmdb ?hmdbId .
  ?pathwayRes a wp:Pathway ;
    dcterms:identifier ?wpid ;
    dc:title ?title .
}

Execute

List of WikiPathways for ChemSpider identifiers

select distinct ?pathwayRes str(?wpid) as ?pathway str(?title) as ?pathwayTitle fn:substring(?csId,35) as ?chemspider where {
  ?metabolite a wp:Metabolite ;
    dcterms:identifier ?id ;
    dcterms:isPartOf ?pathwayRes ;
    wp:bdbChemspider ?csId .
  ?pathwayRes a wp:Pathway ;
    dcterms:identifier ?wpid ;
    dc:title ?title .
}

Execute

List of WikiPathways for PubChem CID identifiers

select distinct ?pathwayRes str(?wpid) as ?pathway str(?title) as ?pathwayTitle fn:substring(?cid,46) as ?PubChem where {
  ?metabolite a wp:Metabolite ;
    dcterms:identifier ?id ;
    dcterms:isPartOf ?pathwayRes ;
    wp:bdbPubChem ?cid .
  ?pathwayRes a wp:Pathway ;
    dcterms:identifier ?wpid ;
    dc:title ?title .
}

Execute

Code examples

Perl

An RDF API is available. The example below encodes the query into a URL and retrieves the results as CSV.

#!/usr/bin/perl

use strict;
use warnings;

use LWP::Simple;
use URI::Escape;

# SPARQL query: elements in mouse pathways that are flagged as requiring curation attention.
my $sparql = "SELECT DISTINCT ?wpIdentifier ?elementneedsattention ?elementLabel
WHERE {
    ?pathway dc:title ?title .
    ?elementneedsattention a gpml:requiresCurationAttention .
    ?elementneedsattention dcterms:isPartOf ?pathway .
    ?elementneedsattention rdfs:label ?elementLabel . 
    ?pathway wp:organism ?organism .
    ?pathway foaf:page ?page .
    ?pathway dc:identifier ?wpIdentifier .
    ?organism rdfs:label \"Mus musculus\"^^<http://www.w3.org/2001/XMLSchema#string> .
 }
ORDER BY ?wpIdentifier";

# Encode the query into the request URL and ask for CSV output.
my $url = 'http://sparql.wikipathways.org/?default-graph-uri=&query='.uri_escape($sparql).'&format=text%2Fcsv&timeout=0&debug=on';

# Fetch the results; get() returns undef on failure.
my $content = get($url);
die "Couldn't get $url" unless defined $content;

print $content;

Java

For Java we recommend the Jena framework.

// In Apache Jena 3.x these classes live in org.apache.jena.query
// instead of com.hp.hpl.jena.query.
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

public class JavaCodeExample {

	public static void main(String[] args) {
		// Fetch ten arbitrary triples from the WikiPathways endpoint.
		String sparqlQueryString = "SELECT * WHERE {?s ?p ?o} LIMIT 10";
		Query query = QueryFactory.create(sparqlQueryString);
		QueryExecution queryExecution = QueryExecutionFactory.sparqlService("http://sparql.wikipathways.org", query);
		try {
			ResultSet resultSet = queryExecution.execSelect();
			// Print each solution as a tab-separated subject, predicate, object line.
			while (resultSet.hasNext()) {
				QuerySolution solution = resultSet.next();
				System.out.print(solution.get("s"));
				System.out.print("\t"+solution.get("p"));
				System.out.println("\t"+solution.get("o"));
			}
		} finally {
			// Release the resources held by the query execution.
			queryExecution.close();
		}
	}
}

PHP

For PHP we recommend ARC2: Easy RDF and SPARQL for LAMP systems.

R

   library(rrdf)
   sparql.remote(
     "http://sparql.wikipathways.org/",
     "SELECT DISTINCT ?p WHERE { ?s ?p ?o }"
   )

Bioclipse

The code below works in both the JavaScript and the Groovy console:

   rdf.sparqlRemote(
     "http://sparql.wikipathways.org/",
     "SELECT DISTINCT ?p WHERE { ?s ?p ?o }"
   )

SPARQL from the command line

For quick and easy querying, we recommend curl (Linux and OS X):

curl -F "query=SELECT * WHERE {?s ?p ?o} LIMIT 10" http://sparql.wikipathways.org
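
The same request can be put together with the Python standard library alone. The sketch below builds a GET URL in the way the Perl example does, using the query and format parameters from that example; the function names build_url and run_query are made up for illustration, and the actual network call is left commented out so the snippet runs without access to the endpoint.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "http://sparql.wikipathways.org/"


def build_url(sparql, result_format="text/csv"):
    """Encode the query into a GET URL, as the Perl example does."""
    params = {"query": sparql, "format": result_format}
    return ENDPOINT + "?" + urlencode(params)


def run_query(sparql, timeout=60):
    """Fetch the result as text; requires network access to the endpoint."""
    with urlopen(build_url(sparql), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_url("SELECT * WHERE {?s ?p ?o} LIMIT 10")
print(url)
# To send the request for real:
# print(run_query("SELECT * WHERE {?s ?p ?o} LIMIT 10"))
```

Using urlencode rather than string concatenation takes care of escaping the braces, question marks and spaces in the query.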


