To begin with, I will be converting Karl’s cis/trans eQTL plot to pull its data from a triple store dynamically. Currently, two scripts process an R/qtl cross and a few supporting data frames to create a set of static JSON files, which are then loaded into the graph. With a triple store holding the underlying data, however, the values required by the visualization can be retrieved on demand, using queries that follow the structure of the original R objects. So far I have converted each of the necessary datasets to RDF, and I am now writing queries that Karl’s d3 code can use to access the data through a 4store SPARQL endpoint (which supports JSON output).
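To give a rough idea of what a client sees when it talks to such an endpoint, here is a minimal Ruby sketch. It assumes the endpoint returns the standard SPARQL 1.1 JSON results format when asked via an Accept header. The endpoint URL, the query, the sample response, and the helper names (query_uri, fetch_results, rows_from_sparql_json) are all placeholders made up for illustration; they are not the queries or code used in the actual project.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Placeholder endpoint URL (an assumption; substitute your own server).
ENDPOINT = 'http://localhost:8080/sparql/'

# Illustrative query only: list a few properties of Data Cube observations.
QUERY = <<~SPARQL
  PREFIX qb: <http://purl.org/linked-data/cube#>
  SELECT ?obs ?property ?value WHERE {
    ?obs a qb:Observation ;
         ?property ?value .
  } LIMIT 10
SPARQL

# Build the GET URL, passing the query in the standard "query" parameter.
def query_uri(endpoint, query)
  uri = URI(endpoint)
  uri.query = URI.encode_www_form(query: query)
  uri
end

# Fetch raw JSON results. Not called below, since it needs a live endpoint.
def fetch_results(endpoint, query)
  uri = query_uri(endpoint, query)
  request = Net::HTTP::Get.new(uri)
  request['Accept'] = 'application/sparql-results+json'
  response = Net::HTTP.start(uri.hostname, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  response.body
end

# Flatten SPARQL JSON results into an array of { variable => value } hashes.
def rows_from_sparql_json(body)
  doc  = JSON.parse(body)
  vars = doc.fetch('head').fetch('vars')
  doc.fetch('results').fetch('bindings').map do |solution|
    vars.to_h { |name| [name, solution.dig(name, 'value')] }
  end
end

# Hand-written sample response, so the parsing step runs without a server.
SAMPLE_RESPONSE = <<~JSON
  {"head": {"vars": ["obs", "value"]},
   "results": {"bindings": [
     {"obs":   {"type": "uri",     "value": "http://example.org/obs1"},
      "value": {"type": "literal", "value": "3.2"}}
   ]}}
JSON

rows_from_sparql_json(SAMPLE_RESPONSE).each { |row| puts row.inspect }
```

The d3 code would do the equivalent in JavaScript; the point is only that each row of the result comes back as a set of named bindings, which can be reshaped into whatever structure the plot expects.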
The objects involved are quite large, and the Data Cube vocabulary (really RDF in general) is fairly verbose in its representation of information, so I am working on loading what I have into the right databases and reducing redundancy in the output. However, if you’d like some idea of how the data are being represented and accessed, I’ve set up a demo on Dydra with a subset of the data and some example queries.
Testing and Validation
In addition to working with Karl, I’ve taken time to refactor my code toward creating Data Cube RDF for more general structures. Originally the main module worked off of an Rserve object, but I’ve redone everything to use plain Ruby objects, which the generator classes are responsible for creating. To support this refactoring, and the creation of new generators for data types such as CSV files, I’ve begun using RSpec to build the spec for my project. I’ve added tests against reference output and for syntactic correctness, but these are respectively too brittle and too permissive to ensure that novel data sets will generate valid output. To close this gap, I have implemented a number of the official Data Cube Integrity Constraints (ICs) as part of the spec. The ICs are a set of SPARQL queries that can be run on your RDF output to check that various high-level features are present; they go beyond simple syntactic validity to verify that you have properly encoded your data. I’ve had to make a few modifications, since the ICs are slightly out of date, and some of the SPARQL 1.1 facilities they use aren’t fully supported by the RDF.rb SPARQL gem. Beyond their place in the test suite, the ICs could also be useful as part of the main code, giving end users a way to check that their data is compatible with Data Cube tools like CubeViz.
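To illustrate the kind of property an integrity constraint checks, here is a small sketch of the idea behind IC-1, which requires every qb:Observation to have exactly one qb:dataSet. The real ICs are SPARQL ASK queries run against the RDF output; this version swaps that for a plain Ruby check over an array of [subject, predicate, object] triples so it runs without a triple store or any gems. The method name and the example triples are hypothetical, and this is not the code in the project’s spec.

```ruby
RDF_TYPE   = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'
QB_OBS     = 'http://purl.org/linked-data/cube#Observation'
QB_DATASET = 'http://purl.org/linked-data/cube#dataSet'

# Return every observation that does not have exactly one qb:dataSet.
# An empty result means the constraint holds for the given triples.
def observations_without_unique_dataset(triples)
  observations = triples
    .select { |_s, p, o| p == RDF_TYPE && o == QB_OBS }
    .map(&:first)
    .uniq

  observations.reject do |obs|
    datasets = triples
      .select { |s, p, _o| s == obs && p == QB_DATASET }
      .map(&:last)
      .uniq
    datasets.size == 1
  end
end

# Hypothetical data: obs1 is well formed, obs2 is missing its data set.
EXAMPLE_TRIPLES = [
  ['http://example.org/obs1', RDF_TYPE,   QB_OBS],
  ['http://example.org/obs1', QB_DATASET, 'http://example.org/dataset'],
  ['http://example.org/obs2', RDF_TYPE,   QB_OBS]
].freeze

violations = observations_without_unique_dataset(EXAMPLE_TRIPLES)
if violations.empty?
  puts 'All observations have exactly one data set'
else
  puts "Observations violating the constraint: #{violations.join(', ')}"
end
```

In a test suite, a check like this would become an expectation that the list of violations is empty for each generator’s output. The same check could be exposed to end users as a validation step before they load data into other Data Cube tools.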