Help:WikiPathways Sparql queries

WikiPathways content is replicated at the SPARQL endpoint http://sparql.wikipathways.org/. The endpoint is still under development and is updated irregularly.

Resources

Other sparql endpoints

Submit ideas

Prefixes

The example queries below omit prefix declarations for readability. They assume the following prefixes (this list is not yet complete):

PREFIX rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:       <http://purl.org/dc/elements/1.1/>
PREFIX cas:      <http://identifiers.org/cas/>
PREFIX wprdf:    <http://rdf.wikipathways.org/>
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX pubmed:   <http://www.ncbi.nlm.nih.gov/pubmed/>
PREFIX wp:       <http://vocabularies.wikipathways.org/wp#>
PREFIX biopax:   <http://www.biopax.org/release/biopax-level3.owl#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX ncbigene: <http://identifiers.org/ncbigene/>
PREFIX xsd:      <http://www.w3.org/2001/XMLSchema#>
PREFIX gpml:     <http://vocabularies.wikipathways.org/gpml#>
PREFIX skos:     <http://www.w3.org/2004/02/skos/core#>
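
Because the example queries omit their prefixes, the declarations have to be prepended before a query can be sent to an endpoint. The following Python sketch shows one way to do that. The helper name `with_prefixes` and the `PREFIXES` table are illustrative assumptions, not part of WikiPathways; the table only repeats the prefixes listed above.

```python
import re

# Prefix table copied from the list above (which is itself not complete).
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "cas": "http://identifiers.org/cas/",
    "wprdf": "http://rdf.wikipathways.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "pubmed": "http://www.ncbi.nlm.nih.gov/pubmed/",
    "wp": "http://vocabularies.wikipathways.org/wp#",
    "biopax": "http://www.biopax.org/release/biopax-level3.owl#",
    "dcterms": "http://purl.org/dc/terms/",
    "ncbigene": "http://identifiers.org/ncbigene/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "gpml": "http://vocabularies.wikipathways.org/gpml#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def with_prefixes(query, prefixes=PREFIXES):
    """Prepend a PREFIX line for every known prefix that the query uses."""
    # The lookbehind keeps "rdf:" from matching inside "wprdf:".
    used = [p for p in prefixes
            if re.search(r"(?<![\w:])" + re.escape(p) + ":", query)]
    header = "\n".join("PREFIX %s: <%s>" % (p, prefixes[p]) for p in used)
    return header + "\n" + query if header else query
```

Calling `with_prefixes` on the first example query below would add only the `rdfs:` and `wp:` declarations, since those are the only prefixes it uses.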

Example queries

Queries marked with a * take a bit longer to return results.


Pathway oriented queries

Get the species currently in WikiPathways with their respective URIs

SELECT DISTINCT ?organism ?label
WHERE {
    ?concept wp:organism ?organism .
    ?organism rdfs:label ?label .
 } 

Sparqlbin

List pathways and their species

SELECT DISTINCT ?title ?label 
WHERE {
    ?pathway dc:title ?title .
    ?pathway wp:organism ?organism .
    ?organism rdfs:label ?label .
 } 

Execute

List the species captured in WikiPathways and the number of pathways per species

SELECT ?organism ?label (COUNT(?pathway) AS ?noPathways)
WHERE {
    ?pathway dc:title ?title .
    ?pathway wp:organism ?organism .
    ?organism rdfs:label ?label .
 }
GROUP BY ?organism ?label
ORDER BY DESC(?noPathways)

Execute


List all pathways for species "Mus musculus"

The following query lists all mouse pathways. ?wpIdentifier is the link through identifiers.org, ?pathway points to the RDF version of WikiPathways, and ?page is the revision that is loaded in the SPARQL endpoint.

SELECT DISTINCT ?wpIdentifier ?pathway ?page
WHERE {
    ?pathway dc:title ?title .
    ?pathway wp:organism ?organism .
    ?pathway foaf:page ?page .
    ?pathway dc:identifier ?wpIdentifier .
    ?organism rdfs:label "Mus musculus"^^<http://www.w3.org/2001/XMLSchema#string> .
 }
ORDER BY ?wpIdentifier 

Execute Perl

List all mouse pathways that require curation attention

The following query lists, for all mouse pathways, the elements that require curation attention. It returns the canonical pathway identifier (i.e. the page that always points to the latest revision), the element that needs attention, and the label of that element.

SELECT DISTINCT ?wpIdentifier ?elementneedsattention ?elementLabel
WHERE {
    ?pathway dc:title ?title .
    ?elementneedsattention a gpml:requiresCurationAttention .
    ?elementneedsattention dcterms:isPartOf ?pathway .
    ?elementneedsattention rdfs:label ?elementLabel . 
    ?pathway wp:organism ?organism .
    ?pathway foaf:page ?page .
    ?pathway dc:identifier ?wpIdentifier .
    ?organism rdfs:label "Mus musculus"^^<http://www.w3.org/2001/XMLSchema#string> .
 }
ORDER BY ?wpIdentifier 

Execute Perl

Count the pathways per pathway category

SELECT ?category (COUNT(?pathway) AS ?noPathways)
WHERE {
        ?pathway wp:category ?category .
        ?pathway dc:title ?title .
}
GROUP BY ?category
ORDER BY ?category

execute

List all pathways of category Metabolic Process

SELECT DISTINCT  *
WHERE {
        ?pathway wp:category wp:MetabolicProcess .
        ?pathway dc:title ?title .
} 

Sparqlbin execute

Get all pathways with a particular gene

List all pathways per instance of a particular gene or protein (wp:GeneProduct)

    select distinct ?pathway ?label where {
      ?geneProduct a wp:GeneProduct . 
      ?geneProduct rdfs:label ?label .
      ?geneProduct dcterms:isPartOf ?pathway .
    
      FILTER regex(str(?label), "CYP"). 
    }

execute


Get all groups and complexes containing a particular gene

List all groups and complexes per instance of a particular gene or protein (wp:GeneProduct)

  select distinct ?pathway ?label where {
     ?geneProduct a wp:GeneProduct . 
     ?geneProduct rdfs:label ?label .
     ?geneProduct dcterms:isPartOf ?pathway .
   
     FILTER regex(str(?label), "CYP"). 
     FILTER regex(str(?pathway), "group") 
   }

execute

Get all the genes on a particular pathway

List all the genes and proteins (wp:GeneProduct) associated with a particular pathway WPID.

select distinct ?pathway ?label where {
      ?geneProduct a wp:GeneProduct . 
      ?geneProduct rdfs:label ?label .
      ?geneProduct dcterms:isPartOf ?pathway .
    
      FILTER regex(str(?pathway), "WP615"). 
      FILTER (! regex(str(?pathway), "group"))
    }

Execute

Count the number of Pathways per ontology term

In WikiPathways, pathways can be tagged with ontology terms from the Pathway, Cell Line and Disease ontologies. The following query returns a pathway count for each term from any of the available ontologies. These terms are collectively modeled as wp:pathwayOntology, which covers all of these ontologies, not just the "Pathway" ontology.

 SELECT (?label AS ?pwOntologyTerm) (COUNT(?pathway) AS ?pathwayCount)
 WHERE {
        ?pathwayRDF wp:pathwayOntology ?o .
        ?pathwayRDF foaf:page ?pathway .
        ?pathwayRDF dc:title ?title .
        ?o rdfs:label ?label .
        ?o rdfs:subClassOf ?superClass .
        ?superClass rdfs:label ?superClassLabel .
 }
 GROUP BY ?label
 ORDER BY DESC(?pathwayCount)

Sparqlbin

Execute

Get all pathways with a particular ontology term

In WikiPathways, pathways can be tagged with ontology terms from the Pathway, Cell Line and Disease ontologies. The following query returns a list of pathways tagged with a given term from any of the supported ontologies. These terms are collectively modeled as wp:pathwayOntology, which covers all of these ontologies, not just the "Pathway" ontology.

 SELECT (?label AS ?pwOntologyTerm) ?pathway
 WHERE {
	?pathwayRDF wp:pathwayOntology ?o .
        ?pathwayRDF foaf:page ?pathway .
        ?pathwayRDF dc:title ?title .
        ?o <http://www.w3.org/2000/01/rdf-schema#label> ?label .
        ?o rdfs:subClassOf ?superClass .
        ?superClass rdfs:label ?superClassLabel .

        FILTER regex(str(?label), "^cancer$")
 }

execute

Get all ontology terms for a particular pathway

List all the ontology terms tagged on a particular pathway.

SELECT (?label AS ?pwOntologyTerm) ?pathway
 WHERE {
	?pathwayRDF wp:pathwayOntology ?o .
        ?pathwayRDF foaf:page ?pathway .
        ?pathwayRDF dc:title ?title .
        ?o <http://www.w3.org/2000/01/rdf-schema#label> ?label .
        ?o rdfs:subClassOf ?superClass .
        ?superClass rdfs:label ?superClassLabel .

        FILTER regex(str(?pathway), "WP615"). 
        FILTER (! regex(str(?pathway), "group"))
 }

execute

Get all pathways with Pubmed references

SELECT DISTINCT ?pathway ?pubmed 
WHERE 
     {?pubmed a       wp:PublicationReference . 
      ?pubmed dcterms:isPartOf ?pathway }
ORDER BY ?pathway

Execute


Get all pathways with a particular Pubmed reference

SELECT DISTINCT ?pathway ?pubmed 
WHERE {
      ?pubmed a       wp:PublicationReference . 
      ?pubmed dcterms:isPartOf ?pathway .

      FILTER regex(str(?pubmed), "14769483") 
}

ORDER BY ?pathway

execute

Get all pathways and the number of references per pathway

SELECT ?pathway (COUNT(?pubmed) AS ?numberOfReferences)
WHERE {
      ?pubmed a wp:PublicationReference .
      ?pubmed dcterms:isPartOf ?pathway .
}
GROUP BY ?pathway
ORDER BY DESC(?numberOfReferences)

Execute

Get a full dump of all pathways from the analytical set and their pathway ontology terms

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX schema: <http://schema.org/>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms:  <http://purl.org/dc/terms/>

SELECT DISTINCT ?depicts ?title ?speciesLabel ?identifier ?ontology ?label 
WHERE {
	?pathway foaf:page ?depicts .
        ?pathway dc:title ?title .
        ?pathway wp:organism ?species .
        ?species rdfs:label ?speciesLabel .
        ?pathway dc:identifier ?identifier .
        OPTIONAL {?pathway wp:pathwayOntology ?ontology .
        ?ontology rdfs:label ?label .}
} 

Execute

Interaction oriented queries

Get all interactions of a particular datanode

Find all datanodes (GeneProducts, Metabolites, Pathways) that are connected to a particular datanode via any type of interaction (gpml:Line).

SELECT DISTINCT ?wpIdentifier ?dn2Identifier  WHERE {
?pathway dc:identifier ?wpIdentifier .
{SELECT DISTINCT * WHERE {
?datanode2 dc:identifier ?dn2Identifier .
?datanode2 a gpml:DataNode .
?datanode2 dcterms:isPartOf ?pathway .
?datanode2 gpml:graphid ?dn2GraphId .
?line gpml:graphref ?dn2GraphId .
FILTER (?datanode2 != ?datanode1) 
FILTER (?datanode2 != <http://commonchemistry.org/ChemicalDetail.aspx?ref=noIdentifier>)

{SELECT DISTINCT * WHERE {   
   ?datanode1 dc:identifier <http://identifiers.org/hmdb/HMDB01586> .
   ?datanode1 gpml:graphid ?dn1GraphId .
   ?datanode1 a gpml:DataNode .
   ?datanode1 dcterms:isPartOf ?pathway .

   ?line gpml:graphref ?dn1GraphId .
   ?line a gpml:Line .
   ?line gpml:graphid ?lineGraphId .
   ?line dcterms:isPartOf ?pathway .}}

}}
}

Execute
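
The query above links two data nodes through a line: each gpml:Line points with gpml:graphref at the gpml:graphid of the nodes it connects, and both nodes and the line belong to the same pathway. The sketch below replays that join in plain Python on toy data, purely as an illustration of the logic. The function name, the tuple layout and all identifiers in the usage note are made up, and it assumes that graph ids are only unique within one pathway.

```python
def connected_nodes(datanodes, lines, start_identifier):
    """Return sorted (pathway, identifier) pairs connected to start_identifier.

    datanodes: iterable of (identifier, graph_id, pathway) tuples
    lines:     iterable of (pathway, [graph_ref, ...]) tuples
    """
    # Index data nodes by (pathway, graph id), mirroring the
    # gpml:graphid / dcterms:isPartOf triple patterns.
    by_graph_id = {(pw, gid): ident for ident, gid, pw in datanodes}

    result = set()
    for pathway, refs in lines:
        # Resolve the graphrefs of this line to data node identifiers.
        idents = {by_graph_id[(pathway, ref)]
                  for ref in refs if (pathway, ref) in by_graph_id}
        if start_identifier in idents:
            # Equivalent of FILTER (?datanode2 != ?datanode1).
            result |= {(pathway, other)
                       for other in idents if other != start_identifier}
    return sorted(result)
```

With three toy nodes `("hmdb:HMDB01586", "a1", "WP1")`, `("ncbigene:1", "b2", "WP1")` and `("ncbigene:2", "c3", "WP1")`, and lines `("WP1", ["a1", "b2"])` and `("WP1", ["b2", "c3"])`, starting from `"hmdb:HMDB01586"` yields only `("WP1", "ncbigene:1")`, because the second line does not touch the start node.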

Get all interactions per pathway

SELECT DISTINCT ?wpIdentifier ?dn1Identifier ?dn2Identifier  WHERE {
?pathway dc:identifier ?wpIdentifier .
{SELECT DISTINCT * WHERE {
?datanode2 dc:identifier ?dn2Identifier .
?datanode2 a gpml:DataNode .
?datanode2 dcterms:isPartOf ?pathway .
?datanode2 gpml:graphid ?dn2GraphId .
?line gpml:graphref ?dn2GraphId .
FILTER (?datanode2 != ?datanode1) 
FILTER (!regex(str(?datanode2), "noIdentifier")) .
{SELECT DISTINCT * WHERE {   
   ?datanode1 dc:identifier ?dn1Identifier .
   ?datanode1 gpml:graphid ?dn1GraphId .
   ?datanode1 a gpml:DataNode .
   ?datanode1 dcterms:isPartOf ?pathway .
   ?line gpml:graphref ?dn1GraphId .
   ?line a gpml:Line .
   ?line gpml:graphid ?lineGraphId .
   ?line dcterms:isPartOf ?pathway .}}
   FILTER (!regex(str(?datanode1), "noIdentifier")) .

}}
}

Execute


Get all Interactions

SELECT DISTINCT ?dn1Identifier ?dn2Identifier  WHERE {
?pathway dc:identifier ?wpIdentifier .
{SELECT DISTINCT * WHERE {
?datanode2 dc:identifier ?dn2Identifier .
?datanode2 a gpml:DataNode .
?datanode2 dcterms:isPartOf ?pathway .
?datanode2 gpml:graphid ?dn2GraphId .
?line gpml:graphref ?dn2GraphId .
FILTER (?datanode2 != ?datanode1) 
FILTER (!regex(str(?datanode2), "noIdentifier")) .
{SELECT DISTINCT * WHERE {   
   ?datanode1 dc:identifier ?dn1Identifier .
   ?datanode1 gpml:graphid ?dn1GraphId .
   ?datanode1 a gpml:DataNode .
   ?datanode1 dcterms:isPartOf ?pathway .
   ?line gpml:graphref ?dn1GraphId .
   ?line a gpml:Line .
   ?line gpml:graphid ?lineGraphId .
   ?line dcterms:isPartOf ?pathway .}}
   FILTER (!regex(str(?datanode1), "noIdentifier")) .

}}
}


Execute

Datasource oriented queries

Get all datasources currently captured in WikiPathways

SELECT DISTINCT ?datasource 
WHERE {
         ?concept dc:source ?datasource
} 

Execute

Get the number of entries per datasource in WikiPathways

SELECT ?datasource (COUNT(?datasource) AS ?numberEntries)
WHERE {
        ?concept dc:source ?datasource
}
GROUP BY ?datasource
ORDER BY DESC(?numberEntries)

Sparqlbin

Count the identifiers per data source

SELECT ?datasource ?identifier (COUNT(?identifier) AS ?numberEntries)
WHERE {
        ?concept dc:source ?datasource .
        ?concept dc:identifier ?identifier
}
GROUP BY ?datasource ?identifier

Sparqlbin

Count the identifiers per data source and order them from high to low

SELECT ?datasource ?identifier (COUNT(?identifier) AS ?numberEntries)
WHERE {
        ?concept dc:source ?datasource .
        ?concept dc:identifier ?identifier
}
GROUP BY ?datasource ?identifier
ORDER BY DESC(?numberEntries)

Sparqlbin

Return all Chembl compounds in WikiPathways and the pathways they are in

SELECT DISTINCT ?identifier ?pathway
WHERE {
        ?concept dcterms:isPartOf ?pathway .
        ?concept dc:source "ChEMBL compound"^^xsd:string .
        ?concept dc:identifier ?identifier .
        
} 

Sparqlbin

Curators oriented queries

Get the pathway with the erroneous data source "null"

SELECT DISTINCT  ?identifier ?pathway ?label
WHERE {
        ?concept dc:source "null"^^xsd:string .
        ?concept dc:identifier ?identifier .
        ?concept dcterms:isPartOf ?pathway .
        ?concept rdfs:label ?label
} 

sparqlbin Execute

Get all geneproducts that lack either a DataSource or an Identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select distinct ?pathway ?label where {?geneProduct a wp:GeneProduct . 
      ?geneProduct rdfs:label ?label .
      ?geneProduct dcterms:isPartOf ?pathway .
      
      FILTER regex(str(?geneProduct), "^node"). 
      FILTER regex(str(?pathway), "^http").
      }

Execute

Get entities with more than one identifier

select ?entity (count(?identifier) as ?count)
where {
  ?entity <http://purl.org/dc/terms/identifier> ?identifier .
}
group by ?entity
order by desc(?count)
limit 12

Extract contributors

SELECT DISTINCT ?contributor  
WHERE {
       ?pathway dc:contributor ?contributor
}

Sparqlbin

Extract the number of pathways edited per contributor

SELECT ?contributor (COUNT(?pathway) AS ?pathwaysEdited)
WHERE {
       ?pathway dc:contributor ?contributor
}
GROUP BY ?contributor
ORDER BY DESC(?pathwaysEdited)

Sparqlbin

Find the pathways a user has edited so far

SELECT DISTINCT ?pathway ?pathwayLabel
WHERE {
       ?pathway dc:contributor wpuser:Andra .
       ?pathway rdfs:label ?pathwayLabel .
}

Sparqlbin

PubChem-compound 1004

PubChem-compound CID 1004 is wrongly used for phosphate: it is the uncharged compound. Phosphate, and particularly things like "Pi", should instead use CID 1061 for ortho-phosphate, aka [PO4]3-.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select ?pathway ?source
where {
  ?mb dc:source ?source ;
    dcterms:isPartOf ?pathway ;
    dcterms:identifier "1004"^^xsd:string .
}

Run (none at this moment)

Outdated HMDB identifiers

This query lists HMDB identifiers that are used in WikiPathways but have been revoked or have become secondary identifiers.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "HMDB"^^xsd:string ;
    dc:identifier ?identifier .
  OPTIONAL { ?mb  wp:bdbHmdb ?bridgedb . }
  FILTER (!BOUND(?bridgedb))
} order by ?identifier

Run

Metabolites not classified as such

One can list all data sources for non-metabolites with this query.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>

select ?datasource (count(?identifier) as ?count)
where {
  ?mb dc:source ?datasource ;
    dcterms:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
}
group by ?datasource
order by desc(?count)

Run

This mostly lists gene identifier sources, but watch for metabolite identifier data sources in the result: they reveal entities that carry a metabolite identifier but are not marked as metabolites. Further down the list are CAS (but genes are chemicals too...) and a few minor ones:

"CTD Gene"^^<http://www.w3.org/2001/XMLSchema#string> 	5
"HMDB"^^<http://www.w3.org/2001/XMLSchema#string> 	4
"ChEBI"^^<http://www.w3.org/2001/XMLSchema#string> 	3
"GLYCAN"^^<http://www.w3.org/2001/XMLSchema#string> 	3
"COMPOUND"^^<http://www.w3.org/2001/XMLSchema#string> 	3
"PubChem"^^<http://www.w3.org/2001/XMLSchema#string> 	2

I would expect GLYCAN and COMPOUND to be misnomers of the matching KEGG subsets.
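
Scanning such a result list by eye gets tedious as it grows. The Python sketch below picks out the rows whose data source looks like a metabolite database. The set `METABOLITE_SOURCES` is an assumption chosen for this illustration (it is not an official WikiPathways list), and the row pattern assumes the plain-text layout shown above: a quoted, typed literal followed by a count.

```python
import re

# Assumed layout: "<source>"^^<datatype IRI> <whitespace> <count>
ROW = re.compile(r'^"(?P<source>[^"]*)"\^\^<[^>]*>\s+(?P<count>\d+)$')

# Assumption: data source names treated as metabolite databases here.
METABOLITE_SOURCES = {"HMDB", "ChEBI", "PubChem", "CAS"}


def suspicious_sources(rows):
    """Return (source, count) for rows that use a metabolite data source."""
    hits = []
    for row in rows:
        match = ROW.match(row.strip())
        if match and match.group("source") in METABOLITE_SOURCES:
            hits.append((match.group("source"), int(match.group("count"))))
    return hits
```

Applied to the six rows listed above, this keeps the HMDB, ChEBI and PubChem rows and drops "CTD Gene", GLYCAN and COMPOUND, which would still need a manual look.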

Non-Metabolites with CAS identifier

Note that a CAS identifier can also refer to mixtures, compound classes, etc.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb (str(?label) as ?name) (str(?identifier) as ?id)
where {
  ?mb dc:source "CAS"^^xsd:string ;
    rdfs:label ?label ;
    dcterms:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
} order by ?pathway

Run

Non-Metabolites with PubChem identifier

At the time of writing, this results in an empty set.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?label ?identifier 
where {
  ?mb dc:source "PubChem-compound"^^xsd:string ;
    dcterms:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  OPTIONAL { ?mb rdfs:label ?label . }
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
} order by ?pathway

Metabolites sometimes marked as DataNode@Type Metabolite

Based on label comparisons, we can find data nodes that are not typed as metabolite but share their label with a data node that is. Of course, this can give false positives, because genes can be incorrectly marked as metabolite in some pathway, but that is another SPARQL query.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select ?pathway ?nonmb ?mb ?label
where {
  ?nonmb rdfs:label ?label .
  ?mb rdfs:label ?label .
  OPTIONAL { ?nonmb dcterms:isPartOf ?pathway . }
  FILTER ( ?nonmb != ?mb )
  FILTER NOT EXISTS { ?nonmb a wp:Metabolite }
  FILTER EXISTS { ?mb a wp:Metabolite }
  FILTER (!regex(str(?nonmb),  "noIdentifier", "i"))
  FILTER (!regex(str(?mb),  "noIdentifier", "i"))
}

Run

Metabolites with an identifier but undefined data source

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb a wp:Metabolite ;
    dc:source ""^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
} order by ?pathway

Run

Metabolites with a data source but no identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?source 
where {
  ?mb a wp:Metabolite ;
    dcterms:identifier ""^^xsd:string ;
    dc:source ?source ;
    dcterms:isPartOf ?pathway .
  FILTER (str(?source) != "")
  FILTER (!regex(str(?pathway),  "internal.wikipathways.org", "i"))
} order by ?pathway

Run

Metabolites with too many labels

This is mainly caused by metabolite URIs that are based on a non-existent identifier:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>

select (count(?label) as ?count) ?pathway ?mb
where {
  ?mb a wp:Metabolite ;
    rdfs:label ?label ;
    dcterms:isPartOf ?pathway .
}
group by ?pathway ?mb
order by desc(?count) ?pathway ?mb
limit 410

Run

An example of such an entity, which has many labels and is typed as a metabolite, gene, complex, etc.:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>

select distinct (str(?label) as ?name) ?type
where {
  <http://bio2rdf.org/geneid:noIdentifier> a ?type ; rdfs:label ?label .
} order by ?label

Metabolites with an Entrez Gene identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?label ?identifier 
where {
  ?mb a wp:Metabolite ;
    rdfs:label ?label ;
    dc:source "Entrez Gene"^^xsd:string ;
    dcterms:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (str(?identifier) != "")
} order by ?pathway

Run

Federated queries - !Under Construction!

WikiPathways with GeneWiki

SELECT DISTINCT ?wplabel ?identifier ?snp WHERE {
    ?s dc:identifier <http://identifiers.org/ncbigene/53975> .
    ?s dc:identifier ?identifier .
    ?s rdfs:label ?wplabel .
    ?s dc:source ?source .
    SERVICE <http://genewiki.semwebinsi.de/> {
        ?gws dc:identifier ?identifier .
        ?gws rdf:type ?gwtype .
        ?gws <http://genewikiplus.org/wiki/Special:URIResolver/Property-3AHasSNP> ?snp .
    }
}


prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms:  <http://purl.org/dc/terms/>

select distinct * where { 
            ?pwEntity dc:identifier ?identifier . 
            ?pwEntity dcterms:isPartOf ?pathway .
        SERVICE <http://genewiki.semwebinsi.de/> {
            ?concept dc:identifier <http://identifiers.org/ncbigene/12189> .
            ?concept dc:identifier ?identifier .
            ?concept <http://genewikiplus.org/wiki/Special:URIResolver/Property-3AIs_associated_with_disease> ?disease .
        }
}

WikiPathways with ChEMBL: ChEMBL compounds in WikiPathways (without BridgeDB)

SELECT *
  WHERE {{
        SELECT DISTINCT ?pathway ?concept iri(bif:concat("http://linkedchemistry.info/chembl/chemblid/", bif:regexp_substr('http://identifiers.org/chembl.compound/(.*)',?identifier, 1))) as ?ChEMBLId where {
                        ?concept dcterms:isPartOf ?pathway .
                        ?concept dc:source "ChEMBL compound"^^xsd:string .
                        ?concept dc:identifier ?identifier .     
                        FILTER regex(str(?identifier), "^http").      
        }
} SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/>{
        ?ChEMBLId ?p ?o .
} }

Execute
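
The inner SELECT above rewrites each identifiers.org URI into the URI scheme used by the remote ChEMBL endpoint, using the endpoint-specific bif:regexp_substr and bif:concat functions. The same string rewrite can be sketched in Python as follows. The function name is hypothetical, the identifier in the usage note is only a sample value, and, unlike the pattern in the query, this sketch escapes the dots in the regular expression.

```python
import re

# Source and target URI schemes as they appear in the query above.
_CHEMBL_PATTERN = re.compile(r"http://identifiers\.org/chembl\.compound/(.*)")
_TARGET_BASE = "http://linkedchemistry.info/chembl/chemblid/"


def to_linkedchemistry_uri(identifier):
    """Map an identifiers.org ChEMBL compound URI to a linkedchemistry.info URI.

    Returns None when the input does not follow the expected scheme.
    """
    match = _CHEMBL_PATTERN.search(identifier)
    if match is None:
        return None
    return _TARGET_BASE + match.group(1)
```

For instance, `to_linkedchemistry_uri("http://identifiers.org/chembl.compound/CHEMBL25")` returns `"http://linkedchemistry.info/chembl/chemblid/CHEMBL25"`.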

WikiPathways with ChEMBL: all ChEMBL assays for pathways

SELECT ?pathway ?target ?assay WHERE {
{
  SELECT DISTINCT
    ?pathway ?uniprot
    iri(
      bif:concat("http://bio2rdf.org/uniprot:",
      bif:regexp_substr('http://identifiers.org/uniprot/(.*)',?uniprot, 1))
    ) as ?chembluniprot
  WHERE {
    ?s ?p ?uniprot .
    ?s dcterms:isPartOf ?pathway .
    FILTER regex(str(?uniprot), "uniprot")
  }
}
  SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/> {
    ?target owl:sameAs ?chembluniprot .
    ?score chembl:forTarget ?target .
    ?assay chembl:hasTargetScore ?score .
}
}

Execute

WikiPathways with ChEMBL: all molecules targeting pathways

SELECT ?pathway ?target ?assay ?smiles WHERE {
{
  SELECT DISTINCT
    ?pathway ?uniprot
    iri(
      bif:concat("http://bio2rdf.org/uniprot:",
      bif:regexp_substr('http://identifiers.org/uniprot/(.*)',?uniprot, 1))
    ) as ?chembluniprot
  WHERE {
    ?s ?p ?uniprot .
    ?s dcterms:isPartOf ?pathway .
    FILTER regex(str(?uniprot), "uniprot")
  }
}
  SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/> {
    ?target owl:sameAs ?chembluniprot .
    ?score chembl:forTarget ?target .
    ?assay chembl:hasTargetScore ?score .
    ?activity chembl:onAssay ?assay ;
      chembl:forMolecule ?molecule .
    ?molecule bo:smiles ?smiles .
    
}
}

Execute


WikiPathways with EBI Atlas RDF - !Under Construction!

Genes differentially expressed in asthma and Pathways

For the genes differentially expressed in asthma, get the gene products associated to a WikiPathways pathway. (Built upon example query 5 in: http://www.ebi.ac.uk/rdf/services/atlas/sparql ). You can substitute the EFO number for other disease codes.

PREFIX identifiers:<http://identifiers.org/ensembl/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>

SELECT DISTINCT ?wpURL ?pwTitle ?expressionValue ?pvalue where {

SERVICE <http://www.ebi.ac.uk/rdf/services/atlas/sparql> {
     ?factor rdf:type efo:EFO_0000270 . 
     ?value atlasterms:hasFactorValue ?factor . 
     ?value atlasterms:isMeasurementOf ?probe . 
     ?value atlasterms:pValue ?pvalue . 
     ?value rdfs:label ?expressionValue . 
     ?probe atlasterms:dbXref ?dbXref .
}
     ?pwElement dcterms:isPartOf ?pathway .
     ?pathway dc:title ?pwTitle .
     ?pathway dc:identifier ?wpURL .
     ?pwElement wp:bdbEnsembl ?dbXref .
}
ORDER BY ASC(?pvalue)

Execute
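
Since only the EFO class changes between disease-specific variants of this query, the query text can be generated from a template. The Python sketch below does that with the standard library. The names `ATLAS_QUERY` and `atlas_query` are illustrative, and the check that an EFO id is `EFO_` followed by seven digits is an assumption based on the ids used on this page.

```python
import re
from string import Template

# Same query body as above, with the EFO class left as a placeholder.
# As elsewhere on this page, rdf:, rdfs:, dc:, dcterms: and wp: are
# assumed to be declared separately.
ATLAS_QUERY = Template("""\
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>

SELECT DISTINCT ?wpURL ?pwTitle ?expressionValue ?pvalue WHERE {
  SERVICE <http://www.ebi.ac.uk/rdf/services/atlas/sparql> {
    ?factor rdf:type efo:$efo_id .
    ?value atlasterms:hasFactorValue ?factor .
    ?value atlasterms:isMeasurementOf ?probe .
    ?value atlasterms:pValue ?pvalue .
    ?value rdfs:label ?expressionValue .
    ?probe atlasterms:dbXref ?dbXref .
  }
  ?pwElement dcterms:isPartOf ?pathway .
  ?pathway dc:title ?pwTitle .
  ?pathway dc:identifier ?wpURL .
  ?pwElement wp:bdbEnsembl ?dbXref .
}
ORDER BY ASC(?pvalue)
""")


def atlas_query(efo_id):
    """Fill the EFO class into the template after a basic sanity check."""
    # Assumed id shape: "EFO_" plus seven digits, e.g. EFO_0000270.
    if not re.fullmatch(r"EFO_\d{7}", efo_id):
        raise ValueError("unexpected EFO id: %r" % (efo_id,))
    return ATLAS_QUERY.substitute(efo_id=efo_id)
```

`atlas_query("EFO_0000270")` reproduces the asthma query above; passing another EFO id yields the corresponding variant.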

Genes differentially expressed in type II diabetes mellitus and Pathways

PREFIX identifiers:<http://identifiers.org/ensembl/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>

SELECT DISTINCT ?wpURL ?pwTitle ?expressionValue ?pvalue where {

SERVICE <http://www.ebi.ac.uk/rdf/services/atlas/sparql> {
     ?factor rdf:type efo:EFO_0001360 . 
     ?value atlasterms:hasFactorValue ?factor . 
     ?value atlasterms:isMeasurementOf ?probe . 
     ?value atlasterms:pValue ?pvalue . 
     ?value rdfs:label ?expressionValue . 
     ?probe atlasterms:dbXref ?dbXref .
}
     ?pwElement dcterms:isPartOf ?pathway .
     ?pathway dc:title ?pwTitle .
     ?pathway dc:identifier ?wpURL .
     ?pwElement wp:bdbEnsembl ?dbXref .
}
ORDER BY ASC(?pvalue)

Execute

Genes differentially expressed in obesity and Pathways

PREFIX identifiers:<http://identifiers.org/ensembl/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>

SELECT DISTINCT ?wpURL ?pwTitle ?expressionValue ?pvalue where {
SERVICE <http://www.ebi.ac.uk/rdf/services/atlas/sparql> {
     ?factor rdf:type efo:EFO_0001073 . 
     ?value atlasterms:hasFactorValue ?factor . 
     ?value atlasterms:isMeasurementOf ?probe . 
     ?value atlasterms:pValue ?pvalue . 
     ?value rdfs:label ?expressionValue . 
     ?probe atlasterms:dbXref ?dbXref .
}
     ?pwElement dcterms:isPartOf ?pathway .
     ?pathway dc:title ?pwTitle .
     ?pathway dc:identifier ?wpURL .
     ?pwElement wp:bdbEnsembl ?dbXref .
}
ORDER BY ASC(?pvalue)

Execute

Code examples

Perl

There is an RDF API available. The example below instead encodes the query into a URL and retrieves the results as CSV.

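#!/usr/bin/perl
use strict;
use warnings;

use LWP::Simple;
use URI::Escape;

# The query, including the prefix declarations it relies on.
my $sparql = <<'SPARQL';
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX gpml:    <http://vocabularies.wikipathways.org/gpml#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>

SELECT DISTINCT ?wpIdentifier ?elementneedsattention ?elementLabel
WHERE {
    ?pathway dc:title ?title .
    ?elementneedsattention a gpml:requiresCurationAttention .
    ?elementneedsattention dcterms:isPartOf ?pathway .
    ?elementneedsattention rdfs:label ?elementLabel .
    ?pathway wp:organism ?organism .
    ?pathway foaf:page ?page .
    ?pathway dc:identifier ?wpIdentifier .
    ?organism rdfs:label "Mus musculus"^^<http://www.w3.org/2001/XMLSchema#string> .
 }
ORDER BY ?wpIdentifier
SPARQL

# Encode the query into the endpoint URL and request the results as CSV.
my $url = 'http://sparql.wikipathways.org/?default-graph-uri=&query='.uri_escape($sparql).'&format=text%2Fcsv&timeout=0&debug=on';

my $content = get($url);
die "Couldn't get $url" unless defined $content;

print $content;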
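
For comparison, the same URL-based approach can be sketched in Python using only the standard library. The function names are illustrative; the request parameters are copied from the Perl example, and whether the endpoint accepts them unchanged today is an assumption.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "http://sparql.wikipathways.org/"


def build_csv_url(query, endpoint=ENDPOINT):
    """Encode a SPARQL query into a GET URL that asks for CSV results."""
    params = {
        "default-graph-uri": "",
        "query": query,
        "format": "text/csv",
        "timeout": "0",
        "debug": "on",
    }
    return endpoint + "?" + urlencode(params)


def parse_csv(text):
    """Turn a CSV result document into a list of row dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def run_query(query, endpoint=ENDPOINT):
    """Send the query to the endpoint and return the parsed rows."""
    with urlopen(build_csv_url(query, endpoint)) as response:
        return parse_csv(response.read().decode("utf-8"))
```

`run_query("SELECT * WHERE {?s ?p ?o} LIMIT 10")` would then return up to ten dictionaries keyed by `s`, `p` and `o`, provided the endpoint is reachable.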

Java

For Java we recommend the Apache Jena framework.

// Package names depend on the Jena release: Jena 2.x uses com.hp.hpl.jena
// (as below), Jena 3 and later use org.apache.jena instead.
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

public class JavaCodeExample {

	public static void main(String[] args) {
		String sparqlQueryString = "SELECT * WHERE {?s ?p ?o} LIMIT 10";
		Query query = QueryFactory.create(sparqlQueryString);
		QueryExecution queryExecution = QueryExecutionFactory.sparqlService("http://sparql.wikipathways.org", query);
		try {
			// Print each solution as one tab-separated line.
			ResultSet resultSet = queryExecution.execSelect();
			while (resultSet.hasNext()) {
				QuerySolution solution = resultSet.next();
				System.out.print(solution.get("s"));
				System.out.print("\t"+solution.get("p"));
				System.out.println("\t"+solution.get("o"));
			}
		} finally {
			// Release the connection to the endpoint.
			queryExecution.close();
		}
	}
}

PHP

For PHP we recommend ARC2: Easy RDF and SPARQL for LAMP systems.

R

   library(rrdf)
   sparql.remote(
     "http://sparql.wikipathways.org/",
     "SELECT DISTINCT ?p WHERE { ?s ?p ?o }"
   )

Bioclipse

The below code works in both the JavaScript and the Groovy console:

   rdf.sparqlRemote(
     "http://sparql.wikipathways.org/",
     "SELECT DISTINCT ?p WHERE { ?s ?p ?o }"
   )

SPARQL from the command line

For quick and easy querying, we recommend curl (Linux and OS X):

curl -F "query=SELECT * WHERE {?s ?p ?o} LIMIT 10" http://sparql.wikipathways.org


