WikiPathways Metabolomics
This page collects SPARQL queries that show the state of the metabolome in WikiPathways. Triggered by Andra’s RDF / SPARQL work, curation started with metabolites that lack database identifiers. This soon led to the observation that metabolites are often not even annotated as metabolites: they are drawn with <Label> rather than <DataNode>.
The Data
You can look up the latest data revision with:
select (str(?o) as ?version) where {
?pw a void:Dataset ;
dcterms:title ?o .
}
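The queries on this page are written without PREFIX declarations, presumably relying on the endpoint predefining them. On an endpoint that does not, the prefixes have to be declared explicitly. A sketch of the version query in that form, assuming the usual namespace IRIs for these prefixes (the declarations are not part of the original queries):

```sparql
# Namespace IRIs below are assumptions based on the commonly used definitions.
PREFIX void:    <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>

SELECT (str(?o) AS ?version) WHERE {
  ?pw a void:Dataset ;
      dcterms:title ?o .
}
```

The same block of declarations can be put in front of any other query on this page; unused prefixes are harmless.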
Metabolome
The following queries provide an overview of the metabolome captured by WikiPathways.
The key type for metabolites is wp:Metabolite. We can see all available properties with:
select distinct ?p where {
?mb a wp:Metabolite ;
?p ?o .
}
All Metabolites
We can get the count of metabolite datanodes in WikiPathways with:
select (count(distinct ?mb) as ?count) where {
?mb a wp:Metabolite .
}
As a list:
select distinct ?mb ?label where {
?mb a wp:Metabolite ;
rdfs:label ?label .
}
Or metabolites for just zebrafish pathways:
select distinct ?metabolite (str(?titleLit) as ?title) where {
?metabolite a wp:Metabolite ;
dcterms:isPartOf ?pw .
?pw dc:title ?titleLit ;
wp:organismName "Danio rerio" .
}
Metabolic Data Sources
Sorted by use
ChEBI, HMDB, and LIPID MAPS are the main data sources for identifiers:
select (str(?datasource) as ?source) (count(distinct ?identifier) as ?count)
where {
?mb a wp:Metabolite ;
dc:source ?datasource ;
dc:identifier ?identifier .
} group by ?datasource order by desc(?count)
All metabolites from one source
All KEGG identifiers
This SPARQL query lists all metabolite datanodes annotated with a KEGG compound identifier:
select distinct ?identifier
where {
?mb a wp:Metabolite ;
dc:source "KEGG Compound" ;
dc:identifier ?identifier .
} order by ?identifier
All HMDB identifiers
Return all HMDB identifiers with:
select distinct ?identifier
where {
?mb a wp:Metabolite ;
dc:source "HMDB" ;
dc:identifier ?identifier .
} order by ?identifier
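The two queries above differ only in the data source name. As a sketch, they can be folded into one query with a VALUES clause, assuming the source names match the dc:source literals exactly as in the queries above:

```sparql
# Sketch: identifiers for several data sources in a single query.
# The source names must match the dc:source literals exactly.
select distinct ?source ?identifier
where {
  values ?source { "KEGG Compound" "HMDB" }
  ?mb a wp:Metabolite ;
      dc:source ?source ;
      dc:identifier ?identifier .
} order by ?source ?identifier
```

Further sources can be added to the VALUES list using the names reported by the "Sorted by use" query.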
Metabolic Pathways
Of general interest is the number of pathways per species:
select (str(?orgName) as ?organism) (count(?pw) as ?pathways) where {
?pw wp:organismName ?orgName .
} group by ?orgName order by desc(?pathways)
Metabolomes
Human Metabolome
prefix ncbi: <http://purl.obolibrary.org/obo/NCBITaxon_>
select distinct ?mb where {
?mb a wp:Metabolite ;
dcterms:isPartOf ?pw .
?pw wp:organism ncbi:9606 .
} order by ?mb
Arabidopsis thaliana Metabolome
prefix ncbi: <http://purl.obolibrary.org/obo/NCBITaxon_>
select distinct ?mb where {
?mb a wp:Metabolite ;
dcterms:isPartOf ?pw .
?pw wp:organism ncbi:3702 .
} order by ?mb
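Metabolite datanodes per species

Rather than listing one metabolome at a time, the size of every species' metabolome can be compared in one go. The following is a sketch that combines the patterns already used on this page (wp:Metabolite, dcterms:isPartOf, wp:organismName); it counts metabolite datanodes, not unique metabolites:

```sparql
# Sketch: number of distinct metabolite datanodes per species.
select (str(?orgName) as ?organism) (count(distinct ?mb) as ?metabolites)
where {
  ?mb a wp:Metabolite ;
      dcterms:isPartOf ?pw .
  ?pw wp:organismName ?orgName .
} group by ?orgName order by desc(?metabolites)
```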
Pathways with the most metabolites
select ?pathway (count(distinct ?mb) as ?mbCount)
where {
?mb a wp:Metabolite ;
dcterms:isPartOf ?pathway .
} group by ?pathway order by desc(?mbCount)
Metabolites in the most Pathways
Note that BridgeDb identifier mapping is not involved yet, so the results are based on metabolite datanodes, not unique metabolites.
select ?mb (count(distinct ?pathway) as ?pwCount)
where {
?mb a wp:Metabolite ;
dcterms:isPartOf ?pathway .
} group by ?mb order by desc(?pwCount)
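One possible way to approximate unique metabolites is to group by a mapped identifier instead of the datanode. The sketch below uses wp:bdbWikidata, the predicate that also appears in the enzymatic reactions query on this page. It assumes that this mapping is populated for metabolite datanodes; datanodes without a Wikidata mapping are silently left out of the result:

```sparql
# Sketch: pathways per metabolite, grouped by mapped Wikidata identifier.
# Assumes wp:bdbWikidata is available on metabolite datanodes.
select ?wikidata (count(distinct ?pathway) as ?pwCount)
where {
  ?mb a wp:Metabolite ;
      wp:bdbWikidata ?wikidata ;
      dcterms:isPartOf ?pathway .
} group by ?wikidata order by desc(?pwCount)
```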
Enzymatic reactions
SELECT DISTINCT ?wpid ?catalyst ?source ?sourceDb ?target ?targetDb WHERE {
?pathway a wp:Pathway ;
dc:identifier / dcterms:identifier ?wpid .
# ?catalysis a wp:Catalysis .
?catalysis dcterms:isPartOf ?pathway ;
wp:source / rdfs:label ?catalyst ;
wp:participants ?reaction .
?reaction a wp:Interaction .
?reaction wp:source ?source .
?source a wp:Metabolite .
OPTIONAL{?source wp:bdbWikidata ?sourceDb .}
?reaction wp:target ?target .
?target a wp:Metabolite .
OPTIONAL{?target wp:bdbWikidata ?targetDb .}
} ORDER BY ASC(?source)