Auto-expanding uploaded semantic files

By adding two new microservices to our regular mu.semte.ch setup (https://github.com/mu-semtech/mu-project) we can create a very nifty workflow that automatically expands semantic files into our graph database.

File uploader service

We need a file uploader service that, after accepting a POST request with a file, saves that file and adds information about it to our triple store. This information can be expressed using the following rather limited vocabulary:
A file is of class http://mu.semte.ch/vocabularies/file-service/File
and has the following properties: a filename (http://mu.semte.ch/vocabularies/file-service/filename), an internal filename under which the file is stored (http://mu.semte.ch/vocabularies/file-service/internalFilename), and a status that is set to uploaded once the upload has completed.
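As an illustration, the metadata written by the uploader could look like the following INSERT query. The filename properties are the ones used by the query further down; the graph name and the file URI are made-up examples:

```sparql
PREFIX fs: <http://mu.semte.ch/vocabularies/file-service/>

# Register an uploaded file in the triple store
# (graph and file URI are hypothetical examples)
INSERT DATA {
  GRAPH <http://mu.semte.ch/application> {
    <http://mu.semte.ch/files/1> a fs:File ;
        fs:filename "people.ttl" ;
        fs:internalFilename "a1b2c3d4.ttl" .
  }
}
```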

Semantic expander service

Next we need a semantic expander service. This service is a little more complicated because it handles two separate functionalities.

The first functionality this service should have is support for consuming deltas as they are generated by the delta service (https://github.com/mu-semtech/mu-delta-service). From these reports we need to filter out the files that contain semantic data and whose status has changed to uploaded. We can achieve this in a rather brute-force way by first building a set of all URIs that appear as subjects in the insert reports. After that we can run a simple query that looks somewhat like this:

SELECT DISTINCT ?internal_filename
 WHERE {
   ?uri a <http://mu.semte.ch/vocabularies/file-service/File> .
   ?uri <http://mu.semte.ch/vocabularies/file-service/internalFilename> ?internal_filename .
   ?uri <http://mu.semte.ch/vocabularies/file-service/filename> ?filename .
   FILTER(STRENDS(?filename, ".ttl") || STRENDS(?filename, ".rdf"))
   FILTER(?uri IN ([LIST OF ALL THE URIs FOUND]))
   FILTER NOT EXISTS {
     ?uri <http://mu.semte.ch/vocabularies/semantic-expander/expanded> ?date
   }
 }

This query provides us with a list of filenames. We can now expand each of these files. This can be done either (1) by converting the files to one or more INSERT queries, (2) by using a graph protocol (such as the SPARQL 1.1 Graph Store HTTP Protocol) to load an entire file, or (3) by using store-specific logic to load the files (e.g. using iSQL on Virtuoso to create a load list and then running the bulk loader).
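As a sketch of the second route, a single SPARQL 1.1 Update request can both load the file and mark it as expanded, so that the FILTER NOT EXISTS clause above skips it on the next delta. The file URL, graph, file URI and timestamp are examples, and the store must be able to resolve the file's location:

```sparql
PREFIX se:  <http://mu.semte.ch/vocabularies/semantic-expander/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Load the file's triples into the application graph
# (file path and graph are hypothetical examples)
LOAD <file:///data/uploads/a1b2c3d4.ttl> INTO GRAPH <http://mu.semte.ch/application> ;

# Record that this file has been expanded so it is not processed again
INSERT DATA {
  GRAPH <http://mu.semte.ch/application> {
    <http://mu.semte.ch/files/1> se:expanded "2017-06-01T10:00:00Z"^^xsd:dateTime .
  }
}
```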

And tada! Whenever we upload a file with semantic content to our backend, the semantic expander service will automatically pick it up and load its contents into the triple store. Almost magic.