By adding two new microservices to our regular mu.semte.ch setup (https://github.com/mu-semtech/mu-project) we can create a very nifty workflow that automatically expands semantic files into our graph database.
File uploader service
We need a file uploader service that, after accepting a POST request with a file, saves that file and adds information about it to our triple store. This information can be expressed using the following rather limited vocabulary:
A file is of class http://mu.semte.ch/vocabularies/file-service/File and has the following properties:
- http://mu.semte.ch/vocabularies/file-service/internalFilename
- http://mu.semte.ch/vocabularies/file-service/filename
- http://mu.semte.ch/vocabularies/file-service/uploadedAt
- http://mu.semte.ch/vocabularies/file-service/status
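As a minimal sketch, the metadata the uploader writes for one freshly uploaded file could look like the SPARQL update below. The graph, the file URI and the literal values are hypothetical; only the vocabulary above comes from this post.

# Hypothetical metadata written by the uploader for one file.
# The graph <http://mu.semte.ch/application> and the file URI are made up for this sketch.
PREFIX fs: <http://mu.semte.ch/vocabularies/file-service/>

INSERT DATA {
  GRAPH <http://mu.semte.ch/application> {
    <http://mu.semte.ch/files/5fe2a> a fs:File ;
      fs:internalFilename "5fe2a.ttl" ;
      fs:filename "people.ttl" ;
      fs:uploadedAt "2017-03-01T10:15:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
      fs:status "uploaded" .
  }
}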
Semantic expander service
Next we need a semantic expander service. This service is a little more complicated because it handles two separate functionalities.
The first functionality is consuming deltas as they are generated by the delta service (https://github.com/mu-semtech/mu-delta-service). From these reports we need to pick out the files that contain semantic data and whose status has changed to uploaded. We can achieve this in a rather brute-force way by first making a set of all URIs of the subjects in the insert reports. After this we can run a simple query that looks somewhat like this:
SELECT DISTINCT ?internal_filename WHERE {
  ?uri a <http://mu.semte.ch/vocabularies/file-service/File> .
  ?uri <http://mu.semte.ch/vocabularies/file-service/internalFilename> ?internal_filename .
  ?uri <http://mu.semte.ch/vocabularies/file-service/filename> ?filename .
  FILTER(STRENDS(?filename, ".ttl") || STRENDS(?filename, ".rdf"))
  FILTER(?uri IN ([LIST OF ALL THE URIs FOUND]))
  FILTER NOT EXISTS {
    ?uri <http://mu.semte.ch/vocabularies/semantic-expander/expanded> ?date
  }
}
This query provides us with a list of filenames. Expanding each of these files is the second functionality. This can be done either (1) by converting the files to one or more INSERT queries, (2) by using a graph protocol such as the SPARQL 1.1 Graph Store HTTP Protocol to load an entire file, or (3) by using store-specific logic to load the files (e.g. using iSQL on Virtuoso to fill the load list and then run the bulk loader). Whichever option we pick, after a successful load the service should also set the http://mu.semte.ch/vocabularies/semantic-expander/expanded date on the file, so that the query above skips it on the next delta. A sketch of this follows below.
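As a sketch of option (1), assuming the expander has already parsed the file's triples, the expansion plus the bookkeeping could be a single SPARQL 1.1 update request with two operations. The graph, the file URI and the payload triple are hypothetical:

# Option (1), sketched: wrap the file's parsed triples in an INSERT DATA.
# The graph, file URI and payload triple below are made up for this sketch.
INSERT DATA {
  GRAPH <http://mu.semte.ch/application> {
    <http://example.org/people/1> a <http://xmlns.com/foaf/0.1/Person> .
  }
}
;
# Option (2) could instead use the LOAD operation from SPARQL 1.1 Update
# (where the store allows file IRIs), e.g.
#   LOAD <file:///share/5fe2a.ttl> INTO GRAPH <http://mu.semte.ch/application>
# Afterwards, mark the file as expanded so the delta query skips it next time.
INSERT DATA {
  GRAPH <http://mu.semte.ch/application> {
    <http://mu.semte.ch/files/5fe2a>
      <http://mu.semte.ch/vocabularies/semantic-expander/expanded>
        "2017-03-01T10:16:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
  }
}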
And tada! Whenever we upload a file with semantic content to our backend, the semantic expander service will pick it up automatically and load its contents into the triple store. Almost magic.