‘ODAM’ (Open Data for Access and Mining) is experimental data table management software that makes research data accessible and available for reuse with minimal effort on the part of the data provider. Designed to make managing experimental data tables easy for users, ODAM provides a model for structuring both data and metadata that facilitates data handling and analysis. It also encourages data dissemination according to FAIR principles by making the data interoperable and reusable by both humans and machines, allowing a dataset to be explored and then extracted in whole or in part as needed.
The Rodam package has a single class, odamws, which provides methods for retrieving online data through ‘ODAM’ web services. This requires that the data be implemented according to the ‘ODAM’ approach, namely that the data subsets were deposited in a suitable data repository as TSV files, with their metadata also described in TSV files.
The Rodam R package offers a set of functions for retrieving the data and metadata of datasets implemented with the help of the “Experimental Data Table Management System” (EDTMS) called ODAM, which stands for “Open Data for Access and Mining”.
See https://inrae.github.io/ODAM/ for further information.
library(Rodam)
## Loading required package: httr
Initialize the ‘ODAM’ object with the name of the desired dataset and the URL of the corresponding web service
dh <- new('odamws', wsURL='https://pmb-bordeaux.fr/getdata/', dsname='frim1')
options(width=256)
options(warn=-1)
options(stringsAsFactors=FALSE)
show(dh)
## levelName SetID Identifier WSEntry Description Count
## 1 plants 1 PlantID plant Plant features 552
## 2 °--samples 2 SampleID sample Sample features 1287
## 3 ¦--aliquots 3 AliquotID aliquot Aliquots features 530
## 4 ¦ ¦--cellwall_metabo 4 AliquotID aliquot Cell wall Compound quantifications 75
## 5 ¦ ¦--cellwall_metaboFW 5 AliquotID aliquot Cell Wall Compound quantifications (FW) 75
## 6 ¦ ¦--activome 6 AliquotID aliquot Activome Features 266
## 7 ¦ ¦--plato_hexosesP 10 AliquotID aliquot Hexoses Phosphate 266
## 8 ¦ ¦--lipids_AG 11 AliquotID aliquot Lipids AG 57
## 9 ¦ °--AminoAcid 12 AliquotID aliquot Amino Acids 69
## 10 °--pools 7 PoolID pool Pools of remaining pools 195
## 11 ¦--qMS_metabo 8 PoolID pool MS Compounds quantification 25
## 12 °--qNMR_metabo 9 PoolID pool NMR Compounds quantification 64
Get all WebService entries defined in the data subset ‘samples’
dh$getWSEntryByName("samples")
## Subset Attribute WSEntry
## 1 plants PlantID plant
## 2 plants Rank row
## 3 plants PlantNum plantnum
## 4 plants Treatment treatment
## 5 samples SampleID sample
## 6 samples Truss truss
## 7 samples DevStage stage
## 8 samples FruitAge age
## 9 samples FruitPosition position
## 10 samples FruitDiameter diameter
## 11 samples FruitHeight height
## 12 samples FruitFW weightfw
## 13 samples FruitDW weightdw
A ‘WSEntry’ is an alias associated with an attribute; it lets the user query a data subset by putting a filter condition (i.e. a selection constraint) on that attribute. Only a few attributes have a WSEntry, mainly those in the identifier and factor categories. For instance, the WSEntry of the ‘SampleID’ attribute is ‘sample’. Thus, to select only the samples whose ID equals 365, specify the filter condition ‘sample/365’.
data <- dh$getDataByName('samples','sample/365')
data
## PlantID Rank PlantNum Treatment SampleID Truss DevStage FruitAge HarvestDate HarvestHour FruitPosition FruitDiameter FruitHeight FruitFW FruitDW DW
## 1 E35 E 311 Control 365 T6 FR.02 47DPA 40423 0.5 5 55.46 48.98 83.32 5.299152 NA
## 2 A17 A 17 Control 365 T6 FR.02 47DPA 40423 0.5 3 56.59 47.77 82.02 5.216472 NA
## 3 A8 A 8 Control 365 T6 FR.02 47DPA 40423 0.5 5 55.11 44.90 71.82 4.567752 NA
## 4 D3 D 210 Control 365 T6 FR.02 47DPA 40423 0.5 5 49.28 44.35 58.28 3.706608 NA
## 5 H11 H 356 Control 365 T6 FR.02 47DPA 40423 0.5 6 46.68 38.69 49.25 3.132300 NA
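To make the alias mechanism concrete, the following minimal sketch resolves a filter condition such as ‘sample/365’ against a WSEntry table and applies it to a local data frame. It only mimics the idea in base R: the helper `applyFilter`, the toy tables, and the local evaluation are assumptions made for illustration, not a description of how the ‘ODAM’ web service or the Rodam package is implemented.

```r
# Toy WSEntry table (a few rows copied from the listing above)
wsEntries <- data.frame(
  Subset    = c("plants", "plants", "samples", "samples"),
  Attribute = c("PlantID", "Treatment", "SampleID", "Truss"),
  WSEntry   = c("plant", "treatment", "sample", "truss"),
  stringsAsFactors = FALSE
)

# Toy data subset (the third row is invented for illustration)
samples <- data.frame(
  PlantID  = c("E35", "A17", "B2"),
  SampleID = c(365, 365, 12),
  Truss    = c("T6", "T6", "T5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: resolve '<WSEntry>/<value>' to an attribute, then filter
applyFilter <- function(data, entries, condition) {
  parts <- strsplit(condition, "/", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  # Look up the attribute behind the alias (an alias may appear in several subsets)
  attribute <- unique(entries$Attribute[entries$WSEntry == parts[1]])
  if (length(attribute) != 1) stop("Unknown or ambiguous WSEntry: ", parts[1])
  # Compare as character so numeric and text identifiers are handled alike
  data[as.character(data[[attribute]]) == parts[2], , drop = FALSE]
}

selected <- applyFilter(samples, wsEntries, "sample/365")
print(selected)
```

The point of the alias is that the caller never needs to know the column name ‘SampleID’; only the short entry name and the value appear in the condition.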
If the WSEntry concept is unclear, you can instead retrieve the full data subset and then perform a local selection, as shown below:
data <- dh$getDataByName('samples')
data[data$SampleID==365, ]
## PlantID Rank PlantNum Treatment SampleID Truss DevStage FruitAge HarvestDate HarvestHour FruitPosition FruitDiameter FruitHeight FruitFW FruitDW DW
## 658 E35 E 311 Control 365 T6 FR.02 47DPA 40423 0.5 5 55.46 48.98 83.32 5.299152 NA
## 659 A17 A 17 Control 365 T6 FR.02 47DPA 40423 0.5 3 56.59 47.77 82.02 5.216472 NA
## 660 A8 A 8 Control 365 T6 FR.02 47DPA 40423 0.5 5 55.11 44.90 71.82 4.567752 NA
## 661 D3 D 210 Control 365 T6 FR.02 47DPA 40423 0.5 5 49.28 44.35 58.28 3.706608 NA
## 662 H11 H 356 Control 365 T6 FR.02 47DPA 40423 0.5 6 46.68 38.69 49.25 3.132300 NA
data$HarvestDate <- dh$dateToStr(data$HarvestDate)
data$HarvestHour <- dh$timeToStr(data$HarvestHour)
data[data$SampleID==365, ]
## PlantID Rank PlantNum Treatment SampleID Truss DevStage FruitAge HarvestDate HarvestHour FruitPosition FruitDiameter FruitHeight FruitFW FruitDW DW
## 658 E35 E 311 Control 365 T6 FR.02 47DPA 2010-09-02 12h0 5 55.46 48.98 83.32 5.299152 NA
## 659 A17 A 17 Control 365 T6 FR.02 47DPA 2010-09-02 12h0 3 56.59 47.77 82.02 5.216472 NA
## 660 A8 A 8 Control 365 T6 FR.02 47DPA 2010-09-02 12h0 5 55.11 44.90 71.82 4.567752 NA
## 661 D3 D 210 Control 365 T6 FR.02 47DPA 2010-09-02 12h0 5 49.28 44.35 58.28 3.706608 NA
## 662 H11 H 356 Control 365 T6 FR.02 47DPA 2010-09-02 12h0 6 46.68 38.69 49.25 3.132300 NA
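The raw HarvestDate and HarvestHour values look like spreadsheet-style serial numbers: a day count and a fraction of a day. As a minimal sketch, assuming the day count uses the 1899-12-30 origin common to spreadsheet software, the same strings can be reproduced in base R. The helpers `serialToDate` and `fractionToTime` are hypothetical names, and the actual implementation of `dateToStr` and `timeToStr` in Rodam may differ.

```r
# Hypothetical helper: serial day number -> ISO date string
# (assumes the spreadsheet origin 1899-12-30)
serialToDate <- function(serial, origin = "1899-12-30") {
  format(as.Date(serial, origin = origin), "%Y-%m-%d")
}

# Hypothetical helper: fraction of a day -> 'HhM' string
fractionToTime <- function(fraction) {
  minutes <- round(fraction * 24 * 60)
  paste0(minutes %/% 60, "h", minutes %% 60)
}

serialToDate(40423)   # "2010-09-02"
fractionToTime(0.5)   # "12h0"
```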
Get ‘activome’ data subset along with its metadata
ds <- dh$getSubsetByName('activome')
ds$samples # Show the identifier defined in the data subset
## NULL
ds$facnames # Show all factors defined in the data subset
## [1] "Treatment" "DevStage" "FruitAge"
ds$varnames # Show all quantitative variables defined in the data subset
## [1] "PGM" "cFBPase" "PyrK" "CitS" "PFP" "Aconitase" "PFK" "FruK"
## [9] "pFBPase" "GluK" "NAD_ISODH" "Enolase" "NADP_ISODH" "PEPC" "Aldolase" "Succ_CoA_ligase"
## [17] "NAD_MalDH" "AlaAT" "Fumarase" "AspAT" "NADP_GluDH" "NAD_GAPDH" "NADP_GAPDH" "NAD_GluDH"
## [25] "TPI" "PGK" "Neutral_Inv" "Acid_Inv" "G6PDH" "UGPase" "SuSy" "NAD_ME"
## [33] "ShiDH" "NADP_ME" "PGI" "StarchS" "AGPase" "SPS"
ds$qualnames # Show all qualitative variables defined in the data subset
## [1] "Rank" "Truss"
ds$WSEntry # Show all WS entries defined in the data subset
## Subset Attribute WSEntry
## 1 plants PlantID plant
## 2 plants Rank row
## 3 plants PlantNum plantnum
## 4 plants Treatment treatment
## 5 samples SampleID sample
## 6 samples Truss truss
## 7 samples DevStage stage
## 8 samples FruitAge age
## 9 samples FruitPosition position
## 10 samples FruitDiameter diameter
## 11 samples FruitHeight height
## 12 samples FruitFW weightfw
## 13 samples FruitDW weightdw
## 14 aliquots SampleID sample
## 15 aliquots AliquotID aliquot
## 16 activome AliquotID aliquot
## 17 activome PGM Phosphoglucomutase
## 18 activome pFBPase bisphosphatase
## 19 activome PGK kinase
## 20 activome SPS synthase
## 21 activome PFK phosphofructokinase
## 22 activome Aconitase Aconitase
## 23 activome FruK fructokinase
## 24 activome GluK Glucokinase
## 25 activome ShiDH dehydrogenase
## 26 activome Enolase Enolase
## 27 activome PEPC Carboxylase
## 28 activome Aldolase aldolase
## 29 activome Succ_CoA_ligase ligase
## 30 activome AlaAT transaminase
## 31 activome Fumarase fumarase
## 32 activome AspAT aminotransferase
## 33 activome NADP_GAPDH NADP
## 34 activome NAD_GAPDH NAD
## 35 activome NAD_GluDH NAP
## 36 activome PGI isomerase
## 37 activome Acid_Inv invertase
## 38 activome UGPase phosphorylase
## 39 activome NADP_ME enzyme
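# Order of magnitude (rounded mean of log10) of each variable, used as a colour index
Rank <- simplify2array(lapply(ds$varnames, function(x) { round(mean(log10(ds$data[ , x]), na.rm=TRUE)) }))
cols <- c('red', 'orange', 'darkgreen', 'blue', 'purple')
# Horizontal boxplots on a log10 scale; the border colour reflects the order of magnitude
boxplot(log10(ds$data[, ds$varnames]), outline=FALSE, horizontal=TRUE, border=cols[Rank], las=2, cex.axis=0.8)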
Based on the subset network, the common identifier to consider is “SampleID”. The matrix computed here counts the identifiers shared by each pair of subsets.
refID <- "SampleID"
subsetList <- c( "samples", "activome", "qNMR_metabo", "cellwall_metabo" )
n <- length(subsetList)
Mintersubsets <- matrix(data=0, nrow=n, ncol=n)
for (i in 1:(n-1))
    for (j in (i+1):n)
        Mintersubsets[i,j] <- length(dh$getCommonID(refID,subsetList[i],subsetList[j]))
rownames(Mintersubsets) <- subsetList
colnames(Mintersubsets) <- subsetList
Mintersubsets[ -n, -1 ]
## activome qNMR_metabo cellwall_metabo
## samples 254 188 70
## activome 0 188 70
## qNMR_metabo 0 0 23
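The counts above are the sizes of pairwise intersections of identifier sets. The following sketch reproduces that logic with toy identifier vectors and base R's `intersect`; the vectors in `idsBySubset` are invented for illustration and stand in for the identifiers that the web service would supply for each subset.

```r
# Toy identifier sets, one per subset (values invented for illustration)
idsBySubset <- list(
  samples  = c(1, 2, 3, 4, 5, 6),
  activome = c(2, 3, 4, 5),
  qNMR     = c(3, 4, 7)
)

subsetNames <- names(idsBySubset)
n <- length(subsetNames)
common <- matrix(0, nrow = n, ncol = n,
                 dimnames = list(subsetNames, subsetNames))

# Fill the upper triangle with the number of shared identifiers
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    common[i, j] <- length(intersect(idsBySubset[[i]], idsBySubset[[j]]))
  }
}

# Drop the last row and the first column, which hold no pairwise counts
print(common[-n, -1])
```

Only the upper triangle is filled because the intersection is symmetric, which is why the lower triangle of the printed matrix stays at zero.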
setNameList <- c("activome", "qNMR_metabo" )
dsMerged <- dh$getSubsetByName(setNameList)
cols <- c( rep('red', length(dsMerged$varsBySubset[[setNameList[1]]])),
           rep('darkgreen', length(dsMerged$varsBySubset[[setNameList[2]]])) )
boxplot(log10(dsMerged$data[, dsMerged$varnames]), outline=FALSE, horizontal=TRUE, border=cols, las=2, cex.axis=0.8)
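Conceptually, combining two subsets amounts to joining their tables on a shared identifier. A minimal base R sketch with invented toy tables follows; it uses `merge` on SampleID and is only an analogy, since how Rodam or the web service builds the merged dataset is not described here. The names `activomeToy`, `nmrToy`, `varsBySubsetToy`, and the variable `glucose` are hypothetical.

```r
# Toy subsets sharing the 'SampleID' identifier (values invented)
activomeToy <- data.frame(SampleID = c(1, 2, 3), PGM = c(10.5, 12.1, 9.8))
nmrToy      <- data.frame(SampleID = c(2, 3, 4), glucose = c(1.2, 1.5, 0.9))

# Inner join: keep only the samples present in both subsets
mergedToy <- merge(activomeToy, nmrToy, by = "SampleID")

# Track which variables came from which subset, e.g. to colour a boxplot
varsBySubsetToy <- list(activome = "PGM", qNMR_metabo = "glucose")
borderCols <- c(rep("red", length(varsBySubsetToy$activome)),
                rep("darkgreen", length(varsBySubsetToy$qNMR_metabo)))

print(mergedToy)
```

Keeping a per-subset list of variable names, as `varsBySubset` does in the merged object above, is what allows each boxplot border to be coloured by its subset of origin.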
options(width=128)
sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 LC_NUMERIC=C
## [5] LC_TIME=French_France.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Rodam_0.1.14 httr_1.4.2
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.27 R6_2.5.0 jsonlite_1.7.2 magrittr_2.0.1 evaluate_0.14 highr_0.8
## [7] rlang_0.4.11 stringi_1.5.3 curl_4.3.1 jquerylib_0.1.3 bslib_0.2.4 rmarkdown_2.17
## [13] data.tree_1.0.0 tools_4.0.3 stringr_1.4.0 xfun_0.29 yaml_2.2.1 compiler_4.0.3
## [19] htmltools_0.5.1.1 knitr_1.31 sass_0.4.2