The funded part of SPAWS is coming to a close, so it’s time for me to write about some of the things I’ve learned so far from working on this project.
Is the LearningRegistry Node infrastructure suitable for sharing paradata between app stores?
The major question this project posed was whether the LearningRegistry infrastructure could be used to effectively share paradata among a set of app stores. While we had some teething issues with the JLern test server, overall the infrastructure performed well, and the documentation was good enough to answer any questions I had.
There are some limitations we had to overcome; for example, there is no normalization of paradata, so I had to code this into SPAWS itself. However, I suspect it’s a useful enough capability to be added to LR as an add-on module in the future.
The only aspect of the LR infrastructure we didn’t get to test was federation/distribution, where paradata is synchronised among a network of nodes rather than through a single node. From a SPAWS perspective, though, I suspect it wouldn’t make any actual difference to the way it operates.
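To make the normalization point concrete, here is a minimal sketch in Python (SPAWS itself is Java). It assumes that "normalization" means mapping ratings from stores with different scales onto one common range; the function names and the 0.0 to 1.0 target range are illustrative assumptions, not the SPAWS implementation.

```python
def normalize_rating(value, scale_min, scale_max):
    """Map a rating from [scale_min, scale_max] onto the range 0.0 to 1.0."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    # Clamp out-of-range values so one bad record cannot skew the result.
    clamped = max(scale_min, min(scale_max, value))
    return (clamped - scale_min) / (scale_max - scale_min)


def average_normalized(ratings):
    """Average (value, scale_min, scale_max) ratings gathered from several stores."""
    if not ratings:
        return None
    total = sum(normalize_rating(v, lo, hi) for v, lo, hi in ratings)
    return total / len(ratings)
```

With something like this in place, a 3 on a 1 to 5 scale and a 10 on a 0 to 10 scale can be averaged without one store's scale dominating the other.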
Which structures to use for Paradata?
When I started I wasn’t sure whether we would be able to use existing JSON “recipes” for paradata, would have to write our own, or should use CAM instead. In the end, I created some very simple recipes for the paradata we needed, published on GitHub. The basic ActivityStreams-inspired format used by LR seems good enough for the purpose, whereas CAM seemed better suited to detailed learning analytics than to the kind of basic usage information I needed to share, so I decided not to use CAM in SPAWS, at least not for the purpose of the initial project.
Something interesting that came up from our advisory group was the use of contextual information in the paradata – so links back to the page where the original comment was made, or the detail page for the widget, and to the public profile page of the user. This allows for stores (or other kinds of agents) to extract additional information where this may be useful, for example for analytics. This meant that we could keep the “core” paradata recipes very light and functionally-oriented (i.e., sticking to data that would actually be used in the UI or in core functions such as popularity ordering).
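As a rough illustration of a light recipe with contextual links, the Python sketch below builds an ActivityStreams-style review record. The field names and the `make_review_paradata` helper are assumptions made for this example; they are not copied from the published SPAWS recipes.

```python
import json


def make_review_paradata(widget_url, reviewer_profile_url, review_page_url, comment):
    """Build a hypothetical ActivityStreams-style review record with context links."""
    return {
        "activity": {
            # Link to the reviewer's public profile page, not to private data.
            "actor": {"objectType": "person", "url": reviewer_profile_url},
            # The context points back to the page where the review was made.
            "verb": {"action": "reviewed", "context": {"id": review_page_url}},
            # The object is the widget's detail page.
            "object": widget_url,
            "content": comment,
        }
    }


record = make_review_paradata(
    "http://store.example.org/widget/42",
    "http://store.example.org/user/alice",
    "http://store.example.org/widget/42#review-7",
    "Works well in our VLE.",
)
serialized = json.dumps(record, indent=2)
```

The core record stays small, and an agent that wants more detail can follow the three URLs.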
Will anyone want to share paradata?
In the proposal I set out a business case for paradata sharing for “niche” app stores, somewhat speculatively at the time. It’s good to see that this business case does seem to have some traction: other commercial app stores have shown an interest, as well as the initial set of educational app stores I focussed on for the project.
Timescales as always are something of a problem, with most of the stores I work with not going live (or even beta) for some months yet. However, there was enough of an opportunity to test out the SPAWS library in integration with the development sites of the stores to see it working OK.
Are there any legal or privacy issues with paradata sharing?
There are potential privacy issues with sharing paradata that is personally generated; however, I stepped around this by only sharing paradata that was either completely depersonalised (e.g. everyone’s total number of downloads/shares/likes for a widget) or completely public (e.g. published reviews). This is pretty superficial but functional as far as stores are concerned. However, by including a context URL to the original store page and user profile pages, it’s possible to harvest additional contextual information as needed; again this is all public rather than protected information. More detailed or personalised analytics are either not necessary for app stores at their current stage of development, or would be highly unlikely to be shared. So, the legal and privacy questions proved quite simple to resolve in this case.
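The depersonalised totals mentioned above could be produced along these lines. This is a Python sketch under assumed data shapes: the event dictionaries and the `depersonalise` helper are hypothetical, and the real stores will hold this data differently.

```python
from collections import Counter


def depersonalise(events):
    """Collapse per-user events into per-widget totals with no user data.

    Each event is assumed to be a dict with 'user', 'widget' and 'action' keys.
    """
    totals = Counter()
    for event in events:
        # Only the widget and the action are kept; the user is never copied.
        totals[(event["widget"], event["action"])] += 1
    return [
        {"widget": widget, "action": action, "count": count}
        for (widget, action), count in sorted(totals.items())
    ]


events = [
    {"user": "alice", "widget": "w1", "action": "download"},
    {"user": "bob", "widget": "w1", "action": "download"},
    {"user": "alice", "widget": "w1", "action": "like"},
]
shared = depersonalise(events)
```

Only the aggregated list would ever leave the store.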
How sustainable is the SPAWS code?
I hosted the code on GitHub from the very outset, not particularly with the idea of attracting contributions during the first six months (though that would be very welcome in future). The SPAWS library itself is published on OSSRH, so it is fairly simple to include in Java projects.
I also ended up contributing a fair number of pull requests to LRJavaLib, a generic LearningRegistry library that SPAWS builds on top of. There is currently quite a small Java community using LR, so it’s especially important to use and contribute to existing libraries.
As it stands, SPAWS is quite a small codebase, with plenty of comments and test cases, so it should be quite straightforward to sustain with minimal effort. It’s also integrated into Edukapp, so it will be updated as needed for that project.
Which APIs to use?
The LearningRegistry has a range of supported APIs; it provides simple HTTP “obtain” and “publish” services that use JSON, and it also supports OAI-PMH and SWORD. I’m not the biggest fan of OAI-PMH, so I was quite relieved to find that the simple obtain service was sufficient for my purposes; likewise, for publishing, the basic POST service was enough. I suspect that OAI-PMH and SWORD only really come into play when synchronising large volumes of metadata and paradata; for the SPAWS scenarios the amount of paradata will always be relatively small, certainly for each request, and not worth the extra hassle of having to maintain state on the client side.
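For a sense of how little is involved in the simple services, this Python sketch builds an obtain request URL and a publish request body without touching the network. The node URL is hypothetical, and the `by_resource_ID` and `request_ID` parameter names and the `documents` wrapper are written from memory of the LR documentation, so treat them as assumptions to check against the node you are using.

```python
import json
from urllib.parse import urlencode

NODE_URL = "https://lr-node.example.org"  # hypothetical node address


def obtain_url(resource_url, node_url=NODE_URL):
    """URL for fetching all documents about one resource (assumed parameter names)."""
    query = urlencode({"by_resource_ID": "true", "request_ID": resource_url})
    return f"{node_url}/obtain?{query}"


def publish_body(envelopes):
    """JSON body for a POST to the publish service (assumed 'documents' wrapper)."""
    return json.dumps({"documents": list(envelopes)}).encode("utf-8")


url = obtain_url("http://store.example.org/widget/1")
body = publish_body([{"doc_type": "resource_data"}])
```

Each request stands alone, so the client has no harvesting state to track between calls.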
In the end, I opted for a caching strategy for “external” paradata rather than actually synchronising it internally with the store’s own data, which avoids any potential conflicts. Perhaps on a larger scale it would make sense to create an actual local repository for external paradata and sync it using OAI-PMH, but that would seem to be quite a way off – the next logical scaling step for Edukapp, for example, would simply be to move from memory cache to disk cache, and increase the cache size.
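A minimal sketch of that kind of in-memory cache, in Python for brevity, is shown here. The class name, the size limit and the time-to-live are illustrative assumptions; the cache used in Edukapp may work quite differently.

```python
import time
from collections import OrderedDict


class ParadataCache:
    """Small in-memory cache with a size limit and a time-to-live per entry."""

    def __init__(self, max_entries=1000, ttl_seconds=3600, clock=time.monotonic):
        self._entries = OrderedDict()  # widget_id -> (stored_at, value)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, widget_id, fetch):
        """Return cached paradata, calling fetch(widget_id) on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(widget_id)
        if entry is not None and now - entry[0] < self._ttl:
            self._entries.move_to_end(widget_id)  # mark as recently used
            return entry[1]
        value = fetch(widget_id)
        self._entries[widget_id] = (now, value)
        self._entries.move_to_end(widget_id)
        # Evict the least recently used entries once the limit is exceeded.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
        return value
```

Because external paradata only ever lives in the cache, it can be thrown away and refetched at any time without touching the store's own records.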
Given that I didn’t need to use OAI-PMH or SWORD, the standards of interest were principally the widget specs themselves – W3C and OpenSocial. The stores already provide an abstraction of the common metadata for the two specs so it wasn’t really necessary for SPAWS to rely on any particular features of the specifications apart from one rather critical one – there is no “identifier” for an OpenSocial gadget.
This is quite an issue to overcome; I opted for relying on the URL for now while punting the issue to the OpenSocial spec community to see if it could be addressed there.
Historically, each OpenSocial gadget was hosted on its own site rather than packaged and hosted elsewhere; however, over time gadgets have increasingly been packaged up and rehosted, so a stable identifier will become more of a requirement – perhaps even using the W3C format to package and transport OpenSocial gadgets.
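The URL fallback described above might look something like the following Python sketch. The metadata dictionary with `type`, `id` and `url` keys is a hypothetical stand-in for the stores' common metadata abstraction, not its real shape.

```python
def paradata_identifier(widget):
    """Pick the identifier to publish paradata against for a widget or gadget."""
    # W3C widgets can declare an id in their config.xml; prefer it when present.
    if widget.get("type") == "w3c" and widget.get("id"):
        return widget["id"]
    # OpenSocial gadgets have no identifier, so fall back to the gadget URL.
    url = widget.get("url")
    if not url:
        raise ValueError("widget has neither an id nor a URL")
    return url
```

The weakness is visible straight away: a gadget that is rehosted at a new URL looks like a different gadget, and its paradata no longer lines up.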
I can also see now that what I’ve worked on here for app/widget stores would also work fine for pretty much any kind of site, so product catalogues or more traditional repositories could reuse it pretty much as-is, including the recipes. I was originally expecting it to be similar but not directly reusable; this is something worth pursuing.
Given what I’ve learned, would I do anything differently? I don’t think so. I think this project has answered the questions we posed initially, so it’s now more a question of what to do next.
We also really need to come up with a solution for PHP-based stores like ROLE; for example, we could use the same algorithms as in the Java version of SPAWS, but build on LRPHP.
However, what I most want to do now is get the code into production in the stores that are being launched and take it from there.