Why CKAN’s datastore_search_sql gives you a syntax error, and how to fix it

If you’re using the DataStore extension with CKAN, one of the first things you’re likely to try is to execute a SQL query on your data. However, you’ll likely see something like this:

["(ProgrammingError) syntax error at or near \"-\"\nLINE 1: SELECT * FROM 2da8c567-9d09-4b00-b098-c3e036170a86

This is because by default CKAN creates resource IDs (UUIDs) that PostgreSQL won’t accept as unquoted table names: the hyphens and leading digits trip up the SQL parser.

To get around this, just put double quotes around the resource ID in your SQL query, like so:
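A minimal sketch in Python of building such a request. The host here is a placeholder (substitute your own CKAN instance), and the resource ID is the one from the error message above:

```python
from urllib.parse import quote

# Resource ID from the error message above
resource_id = "2da8c567-9d09-4b00-b098-c3e036170a86"

# Double-quoting makes PostgreSQL treat the UUID as an identifier,
# rather than trying to parse the hyphens as minus signs.
sql = f'SELECT * FROM "{resource_id}"'

# Build the datastore_search_sql request URL (the host is a placeholder)
url = ("https://demo.ckan.org/api/3/action/datastore_search_sql?sql="
       + quote(sql))
print(url)
```

The key point is simply that the quotes end up inside the `sql` parameter, URL-encoded as `%22`.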

Posted in development | Leave a comment

Sign and encrypt your email. Please.

(This was prompted by the news that Groklaw is shutting down, in large part due to concerns over conducting business by email now that there is no legal or constitutional protection for its privacy. You can find out more about this story here.)

Email is wonderful and terrible. It’s pretty much the one technology that no business or organisation can live without. It’s also, by default, insecure enough that anyone can snoop on it with little more than basic networking tools.

But there are some simple measures that you can take to make it much, much more robust.

Simplest of all is to use servers that use encryption of the communication channel (TLS). This is nice and easy for users because they don’t even need to know about it. It prevents casual eavesdropping over the network. Most providers these days use encrypted communication channels for email.
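As a sketch of what this looks like from a mail client’s point of view, here is how Python’s standard smtplib upgrades a plain connection with STARTTLS. The hosts, addresses, and credentials are placeholders, and the function isn’t actually called here since it needs a live server:

```python
import smtplib
import ssl
from email.message import EmailMessage

def send_over_tls(host, port, user, password, msg):
    """Send a message over a connection upgraded to TLS (a sketch)."""
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port) as server:
        server.starttls(context=context)  # upgrade the plain connection to TLS
        server.login(user, password)      # credentials now travel encrypted
        server.send_message(msg)

# Placeholder message; addresses are hypothetical
msg = EmailMessage()
msg["From"] = "me@example.com"
msg["To"] = "you@example.com"
msg["Subject"] = "Hello"
msg.set_content("Encrypted on the wire, but still readable by the provider.")
```

Note that the content comment in the message is the whole point of the next paragraph: TLS protects the channel, not the message itself.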

However, the big hole in this scheme is that, while your communication is encrypted from others on the network, it’s plain to read for your provider. Not a problem if you trust your provider with your privacy and security. But these days, why would you?

To close this gap, you need to actually encrypt the messages themselves, not just the channel they are sent over. The tool I use for this is GPG, and a handy plugin for Apple’s Mail program called GPGMail. This automatically signs emails you send (preventing forgery) and also automatically encrypts email if you have the public key of the person you are sending it to. (If you’re interested, mine is here).

You can see this working by, for example, sending encrypted email from your GMail account, then looking at the message in the GMail web interface – all you get is a big block of seemingly random characters as Google can’t decipher the message and read it. Even though I’m using their service to deliver it! How cool is that?

The system only really starts to work if more people use it, so that encrypted messages become a significant part of the total traffic. If only a few messages on the network are encrypted, it’s easy enough for Bad People to just target those and break their encryption. If there are billions of encrypted emails flying around, it becomes an untenable and expensive proposition to break them open, and mining all emails by default looks far less attractive for both companies and governments.

So, even if you are of the “I have nothing to hide” point of view, there is still a good reason to use encrypted communications if you can.

Understanding how keys work is the main educational barrier to getting more people using the system. It would help if email applications made encryption and signing easier by default, but I guess they have plenty of incentives not to…

For a much better guide for how to set it all up, try this article on LifeHacker.

Posted in Uncategorized | Leave a comment


Well, this whole Prism business has been quite interesting from a personal perspective, especially the talk of “metadata”.

A long time ago, though not actually that far away from where I am now, I worked as a junior technical writer at a software company called Harlequin (later Xanalys). I worked in the intelligence systems division, and had been involved in some very cool projects for things like crime mapping, network analysis, and homicide case management. We had some great news clippings on the office walls of crimes solved using our technology.

One day my supervisor informed me that I needed to update the manual for one of the company’s more popular but least-liked products, something called CaseCall. Everyone within the company hated CaseCall as far as I could tell, and after a few days working on it I could see why.

What CaseCall did was basically automate its way around some otherwise quite sensible restrictions on police extracting metadata from telecommunication providers.

In principle, any investigating officer could get in touch with any provider and ask for details about who-called-whom over a particular time period and analyse the data, but in practice not many did because the law put in place a number of steps you needed to go through to have your request approved.

What CaseCall did was turn things like “The absence of this information will prejudice this investigation because…” into a drop-down list of boilerplate non-answers so that the officer could press the submit button at one end, and the service provider press the accept button at the other, and the metadata could flow into the very clever analysis tools that the company had developed (and indeed still sell today).

Harlequin won the first Big Brother award for a product in 1998 for CaseCall. Sadly, no-one from the company went to collect it, as it would have looked great in the office:

Product: Software by Harlequin that examines telephone records and is able to compare numbers dialled in order to group users into ‘friendship networks’ won this category. It avoids the legal requirements needed for phone tapping.

(Friendship network analysis in 1998! Pretty good, huh? Mark Zuckerberg would have been about 14 around then.)

Basically, what we were doing was avoiding the whole business of phone tapping and collecting content, and instead going after metadata. After all, the metadata was usually sufficient to identify important network nodes, identify useful patterns of behaviour, and corroborate other types of information acquired by other means, such as interviews and field officer reports.
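The kind of analysis involved can be sketched in a few lines: given only (caller, callee) pairs, a simple degree count already surfaces the hubs of a network. The names and calls below are invented, not from any real product or dataset:

```python
from collections import Counter
from itertools import chain

# Hypothetical who-called-whom metadata: (caller, callee) pairs.
# Note that no call *content* is needed for this kind of analysis.
calls = [
    ("alice", "bob"), ("alice", "carol"), ("dave", "alice"),
    ("bob", "carol"), ("eve", "alice"), ("alice", "bob"),
]

# Count how many calls each party appears in; the highest-degree
# parties are the important nodes in the "friendship network".
degree = Counter(chain.from_iterable(calls))
hub, n_calls = degree.most_common(1)[0]
print(hub, n_calls)  # alice appears in 5 of the 6 calls
```

Real tools layer much more on top (time windows, weighting, visualisation), but the raw material is exactly this: metadata.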

The metadata was actually in many ways more useful than the “data” (the content of the phone calls in this case), which would have taken a lot of work to transcribe and analyse, and may not have actually provided much more analytic value than the metadata alone. (It was great to read this little sketch by Cory Doctorow about metadata today, which makes much the same point.)

So don’t underestimate “metadata”!

Posted in Uncategorized | 3 Comments

Phishing for peer reviewers

Today I got an email from the ICL 2013 conference:

Dear Scott Wilson,

This is a short reminder for you to complete your reviews for the ICL/IGIP 2013 Conference.

Overall 2 submissions were recently assigned to you for reviewing, 2 are not yet completed.

We kindly remind you that we implement a “review-to-present” model. At least one of the authors from each full paper is expected to act as a reviewer of other submissions in order to have their paper(s) published in the conference proceedings.

Please log into your ConfTool account
(http://www.conftool.com/icl-conference/). On the “Overview” page you can find now a review section where you will be able to download the papers and enter your reviews.

We need your reviews latest until 22 May 2013. We can’t exceed this deadline, because we would like to inform the authors about the acceptance in time.

Thank you for your support.

Best regards,

Your organizers of ICL 2013.

OMG! My two reviews are due in! I’d better go complete them!

Except of course I’ve never been asked, let alone accepted, to be on the review committee for ICL. As far as I can remember, I’ve never even been to one.

Thankfully, as soon as I read this it rang a vague bell. Where had I heard this before? Oh, I know, back in 2012:

Dear Scott Wilson,

We are now ready to start the review process of the Full Paper extended abstracts submissions for the ICL/IGIP Conference 2012.

You have been assinged up to three papers to review. To download the paper(s) and to enter your reviews please login to your ConfTool account. On the “Overview” page you can find now a review section.

For all the authors of full papers we kindly remind you that “Full Paper” is a “review-to-present” submission type. Each paper MUST have at least one author participating in the reviewing process.

We would appreciate to receive your reviews by 14 May 2012. We can’t exceed this deadline, because we would like to inform the authors about the cceptance in time.

Thank you for your support during the ICL/IGIP review process.

Best regards,

Danilo G. Zutin
Technical Program Chair

I’d just ignored it at that time. Why? Well, because back in 2011 I received:

On 1 Jun 2011, at 21:23, ICL Conference Secretariat wrote:

Dear Scott Wilson,

may we kindly remind you, that the deadline for submitting the reviews assigned to you is on Monday, 06 June 2011.

We can’t extend the review phase, because we have to inform authors about the results in time.

Thank you for your understanding and kind regards,

Conference Chairs

14th International Conference on Interactive Collaborative Learning

To which I replied, somewhat snarkily:

As far as I’m aware I have never agreed to have any connection with this conference. Please remove me from your mailing list.

Of course they ignored this, as they had when back in 2010 they sent me this:

On 25 May 2010, at 20:41, ICL Conference Secretariat wrote:

Dear Scott Wilson,

this is a gentle reminder that the paper review deadline for the ICL2010 Conference  is in four days (Saturday, 29 May 2010).

For your convenience the input of your reviews via the ConfTool is still possible until Sunday evening.

Please note that this is a hard deadline, so that the chairs can perform their duties in a timely manner and inform the authors about acceptance/rejection in time.

For your information: We have received more than 140 full paper submissions from over 45 countries and we are looking forward to a successful conference.

Calls for some submission types are still open.

Best regards,

Jeanne Schreurs

Michael Auer

13th International Conference on Interactive Computer aided Learning


To which I’d replied, confusedly:

I have no idea why I received this email – I’m not attending the conference, nor have I volunteered to join the programme committee for it?

Now, fake conferences and academic spam are becoming a real nuisance. However, I think ICL is a real conference because some real people I know have had papers accepted there! Also, the very real Sandra Schaffert organised a mashup workshop at ICL 2009 for which I actually was on the programme committee!

But the fact is that in its communications ICL is behaving like some sort of bizarre phishing scheme rather than an academic event.

You normally don’t allocate reviews for papers to random people on the Internet; you invite them onto your programme committee and give them the option of saying “no”.

Maybe this is just a bug in the conference organising software they’re using. Then again, who knows, maybe a spam-bait-phish model of conference peer review actually works?

Posted in Uncategorized | Leave a comment

HtmlCleaner 2.5 is out!

Yesterday we released HtmlCleaner 2.5, which fixes up a few bugs, and also has a rewritten DOCTYPE handler based on the latest guidance issued with HTML5.


Posted in Uncategorized | Leave a comment

Open Source Meets Open Standards

I’ve just published a post over on the OSS Watch team blog on open source and open standards, introducing a new OSS Watch briefing paper on the topic.

This is where my CETIS and OSS Watch roles cross over! The post linked above is from a policy point of view, whereas the briefing paper is more aimed at developers and project managers. But what does this look like from a standards wonk viewpoint?

We often espouse the virtues of having open source implementations (or reference implementations) for driving adoption of standards; however, there can also be barriers to open source implementation that may be less obvious.

The paid publishing model often used by de jure standards organisations is definitely a barrier, in the sense that developers are less likely to browse the standard and decide to use it. However, I think on the whole it’s far less of an obstacle than a lack of clarity on patent licensing, copyright, conformance claims, and trademarks. If the standard is critical for interoperability, paying $120 for the specs isn’t as big a deal as potentially getting sued by Oracle or IBM for patent infringement.

For major standards-setting organisations like the W3C it’s not much of an issue, as developers generally know they can implement W3C standards freely, whether or not they really understand the legal detail. However, for less well-known standards-setting communities and consortia, it’s necessary to clearly spell out their position if they want to encourage open source implementations.

Posted in cetis, standards | 2 Comments

Understanding Glass: An SF perspective

When the iPhone (and later the iPad) arrived, the first things people started comparing it to were the movie Minority Report, for its interface, and the technologies used in Star Trek, with its many glossy black touch panel devices.

Science Fiction sometimes does a good job of – if not predicting the future – exploring the implications of many possible futures.

I think Google Glass is another good example of this.

My first thought on seeing Google Glass was of a fairly recent novel by Vernor Vinge, Rainbows End (2006).

Rainbows End book cover

It’s a novel about augmented reality and virtualization, and well worth reading for its exploration of how an augmented reality layer may affect society and technology. In particular, Vinge’s vision of a locked-down technological future, where the only way of interacting with “black boxes” is via virtual interfaces, has some echoes of today’s trend towards walled-garden networks.

However, the “overlay” aspects of Google Glass are perhaps not the most interesting.

In a recent post, Mark Hurst discusses the “lifebits” recording feature of Glass.  This to me immediately brought to mind two very different SF stories.

In Other Days, Other Eyes (1972), Bob Shaw introduced us to a world of “Slow Glass”, a material that delays the transmission of light from one side to the other.


He takes us through the use of the technology as a means of ubiquitous surveillance – find a piece of glass, however small, and look through it to see what happened in the past. Google Glass, though using a very different technology, has similar properties to going around wearing a piece of Slow Glass, in that it offers the capability of looking through it into the past – via lifestreaming of video (and audio – not something Slow Glass could accomplish!).

Shaw does a very thorough job of exploring the use of this technology, and I highly recommend reading the novel. In particular, there is an exploration of how society adapts to ubiquitous surveillance, which has a kind of ring of the Kübler-Ross stages of grief. Initially, efforts are made to try to avoid Glass, for example a secret meeting in the book is held in a room whose walls are freshly hosed down with a new layer of plastic each day just in case any fragments of Glass had been placed in it. However by the end of the story, everyone just accepts that anything, anywhere, and anywhen may be seen by others, with the vivid image at the end of particles of Glass suspended in raindrops.

Shaw also touches on the nostalgic aspects of Glass, with a story of a man watching his lost family from outside his house.

This nostalgic theme is also a key part of a very interesting short story by John Crowley, Snow (1985).

Snow is told from the perspective of a widower, whose wife’s first husband had bought her a “wasp” that continually recorded video of her for posterity. In the story, the protagonist visits a futuristic memorial park where he compulsively reviews the recordings, until eventually entropy starts to set in.

Again, the technology that Crowley uses here isn’t very much like Google Glass, but the implications of the story feel quite close to the near future. Crowley presents a compelling use for lifestreaming (or lifebits, or whatever you want to call it) and also explores some of the potential downsides. It’s quite short, and worth a read. Would you use Google Glass to do this?

I’m sure there are many other examples of fiction anticipating a Glass-like invention. If you know of any, let me know in the comments!

Posted in augmented reality, google | 4 Comments

Open Source Junction 4: Open Source Hardware meets Open Source Software

(reblogged from the OSS Watch blog)

OSS Watch is delighted to announce a new event in the Open Source Junction series aimed at facilitating knowledge exchange between industry and academic innovation. Open Source Junction brings together the best business and academic minds to explore how the two sectors can jointly innovate, develop and exploit open source software in conjunction with open source hardware.

Open Source Junction 4 is taking place on 14th-15th March at Trinity College, Oxford and focuses on open source hardware.

Open Source Hardware (OSH), like Open Source Software (OSS), is an open approach to technology where the information needed to create hardware artefacts – such as schematics, drawings and bills of materials – is distributed, allowing others to produce artefacts, and to modify and improve on the design. Open Source Hardware has a wide range of applications, including medical appliances, lab equipment, surveillance drones, and toys.

If you’re involved in an Open Source Hardware project, a project that uses Open Source Software in conjunction with hardware, or just enjoy cool hardware hacks, we’d like to see you at Open Source Junction.

We’re keen to exploit the tangible nature of Open Source Hardware at the event by encouraging attendees who are part of a hardware project, whether Open Source Hardware in the strictest sense or an innovative use of commodity hardware in conjunction with Open Source Software, to give a short presentation or demo.  If you’re interested in doing this, please give details on the registration form.

You can read full details of the event and sign up at EventBrite.

Posted in Uncategorized | Leave a comment

How to engage students in real open source projects

I gave a talk at FOSSA 2012 earlier this week sharing some experiences with teaching students using open source projects in a module at the University of Bolton.

Basically, there are 5 tips:

  1. Start with soft skills, not code
  2. Let students pick their own projects
  3. Teach how to “read” a project
  4. Get students interacting with the project community – not the lecturer
  5. Assess public interactions

Here’s the slides:

Hopefully I’ll be working with the teaching team to develop Year 2 & 3 modules building on this work.

Posted in Uncategorized | Tagged , | 11 Comments

Sharing usage data about web apps between stores

The 140 Character Question:

Can different web app stores share usage data such as reviews, ratings, and stats on how often an app has been downloaded or embedded?

That’s the question that we investigated in the SPAWS project. And the answer is: yes.


Building on the Learning Registry and Activity Streams, we connected together several web app stores aimed at sharing web widgets and gadgets. Each time a user visits a store and writes a review about a particular widget/gadget, or rates it, or embeds it, that information is syndicated to other stores in the network.
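To give a flavour of what gets syndicated, here is a sketch of the kind of “paradata” event a store might publish when a user rates a widget, loosely in the shape of an Activity Streams 1.0 activity. All the IDs, URLs, and field values here are hypothetical, not taken from the actual SPAWS library:

```python
import json

# A hypothetical rating event: actor performs a verb on an object.
activity = {
    "actor": {"objectType": "person", "id": "urn:user:12345"},
    "verb": "rate",
    "object": {
        "objectType": "widget",
        "url": "http://store.example.org/widgets/youtube-player",
    },
    "content": "4/5",
    "published": "2013-01-15T10:30:00Z",
}

# Serialise for publishing to the shared network
# (e.g. a Learning Registry node), ready for other stores to consume.
payload = json.dumps(activity)
event = json.loads(payload)
print(event["verb"])  # rate
```

Because the event names the object by URL rather than by a store-local ID, any store in the federation can match it up with its own copy of the widget.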

This means that, even if a store is focussed on a niche market, the web apps in the store can include user reviews and information collected from a wider federation. It also means that web app developers can pull together all the reviews and download stats for their apps to display on their own site, even when they are sharing their work in multiple stores.

We created a software library that developers can use to add “paradata sharing” to web app stores, and integrated it into Edukapp, a “white label web app store”. Edukapp is being used by ITEC for sharing web apps between secondary school teachers, and now by SURFNet to share web apps used in research portals in higher education. Both of these stores should be in production use in 2013, and several other web app stores, both educational and commercial, have also shown strong interest in adopting it.

SPAWS isn’t limited to web apps and app stores – the same approach can be used for all types of resources and repositories, for example to share reviews and usage stats about learning materials, books, 3D printer models or whatever you like.

The SPAWS software library itself is open source and can be readily added to any Java project using Maven or Ivy. You can also fork it on Github.

Thanks go to the JISC/HEA OER Programme for funding this work, and to Amber Thomas for being a great programme manager!


Posted in app stores, apps | Tagged , | 4 Comments