Weekly Update June 19 – June 23

This was a busy one! So what have I been up to?

Working on agile business intelligence with both of my Jisc Analytics Labs teams, with a face-to-face meeting at Wolverhampton plus two online scrums. I also developed an Alteryx workflow for our friends over at HESA to help them join up older versions of the Estates Management Record (EMR), which should help institutions wanting to do some longer-term trend analysis.

Meeting my partners and associates in Cetis LLP at our home base at Halton Mill. We had a great meeting, developing ideas for new service offerings in areas such as Student Journey Transformation, Digital Credentials, and Course Discovery, as well as developing our strategy for the coming year in existing areas such as Business Intelligence, Learning Analytics, Research Data Management, Data Wrangling, and Technology Selection and Evaluation. Oh, and we may finally actually redo our website 🙂

Studying Machine Learning with a Coursera course from Stanford. Mostly I’ve been working on artificial neural networks for classification problems.

Writing bids for new work with OECD and with the European Space Agency.

Ich Lerne noch Deutsch mit Duolingo.

Growing things in the garden and greenhouse. Tomatoes developing nicely!

Watching Wallander and The Expanse on Netflix.

 

In case you don’t know me, I’m a consultant and partner in Cetis LLP, specialising in BI/analytics, open source software, and technology discovery and pre-procurement. Always interested in new clients, so feel free to get in touch.

Posted in Uncategorized | Leave a comment

(Bi-)Weekly Update June 5 – June 16

I’ve been fairly busy lately and missed a weekly update! I was also out Friday night so didn’t get a chance to write then either. However, here we are again. So, what have I been up to?

Working on agile business intelligence with both of my Jisc Analytics Labs teams; I’ve been reshaping data for analysis using Alteryx, generating synthetic data for testing, and also coaching analysts in using Alteryx and Tableau. I’ve also been coaching analysts in reshaping survey data using transposition – in general, people tend to make “short and wide” datasets, whereas machine analysis much prefers “long and narrow” datasets. This week I was in London meeting one of my teams in person (mostly we work remotely) and was really impressed with the visualisations they’ve managed to develop so far.
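To illustrate the “short and wide” versus “long and narrow” point, here is a minimal sketch in plain Python. It is not the Alteryx workflow itself; the survey columns, the values, and the `wide_to_long` helper are all made up for the example.

```python
# Hypothetical survey data in "short and wide" form: one row per respondent,
# one column per question. All names and values here are invented.
wide_rows = [
    {"respondent": "r1", "q1": 4, "q2": 5, "q3": 3},
    {"respondent": "r2", "q1": 2, "q2": 4, "q3": 5},
]


def wide_to_long(rows, id_column):
    """Turn every non-id column into its own (id, question, answer) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id is repeated on each output row instead
            long_rows.append(
                {id_column: row[id_column], "question": column, "answer": value}
            )
    return long_rows


long_rows = wide_to_long(wide_rows, "respondent")
for row in long_rows:
    print(row)
```

Two respondents with three questions each become six narrow rows, which is the shape that tools such as Tableau tend to find easier to filter and aggregate.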

Analysing the new LEO dataset.

Studying Machine Learning with a Coursera course from Stanford. These past two weeks I’ve been working on artificial neural networks for machine learning, implementing the various algorithms used in training networks and generating predictions for classification problems such as image recognition.

Writing bids for new work, including one on research data management for De Montfort University plus a couple of others for the OECD and the European Space Agency.

Ich Lerne noch Deutsch mit Duolingo.

Growing things in the garden and greenhouse. We’ve had lots of strawberries, the chillies, peppers and tomatoes are just starting to form, and I’ve picked the first raspberry!

Reading “Inverted World” by Christopher Priest.

Walking at least 5k every day for the past two weeks. Mostly working from home isn’t great for the figure, so I’ve been making an effort to do a good long walk each day.

Applying patches to HtmlCleaner for some odd cases involving unrecognised namespaces.

In case you don’t know me, I’m a consultant and partner in Cetis LLP, specialising in BI/analytics, open source software, and technology discovery and pre-procurement. Always interested in new clients, so feel free to get in touch.

Weekly Update May 29 – June 2

It’s been half-term this week, so I’ve mostly been spending time with family, especially as one daughter has been ill (now there was bad timing!) but also:

Working on agile business intelligence with one of my Jisc Analytics Labs teams; I’ve been reshaping HESA data for analysis using Alteryx, generating synthetic data for testing, and also coaching analysts in Alteryx and Tableau.

Studying Machine Learning with a Coursera course from Stanford. This week I’ve been studying classification, and implementing logistic regression and regularisation algorithms in Octave. The assignments are surprisingly tough – but I’m developing my ability to solve problems using vectorisation rather than iteration, which is something I can apply to non-ML problems too.

Arranging to do some more teaching next semester. Probably undergraduate cloud computing and algorithms, but it’s still all a bit vague at the moment.

Ich Lerne noch Deutsch mit Duolingo.

Growing things in the garden and greenhouse. New potatoes almost ready for harvesting. Have had to protect my French beans and garlic from the rabbits. (They don’t even especially like garlic, but that doesn’t stop them mowing it down.)

Reading “Smoke” by Dan Vyleta.

In case you don’t know me, I’m a consultant and partner in Cetis LLP, specialising in BI/analytics, open source software, and technology discovery and pre-procurement. Always interested in new clients, so feel free to get in touch.

Weekly Update May 22-26

Well, it’s been an eventful week in Manchester as we all know. But, that aside, it’s time for me to write up what I’ve been doing.

Working again on agile business intelligence with my Jisc Analytics Labs teams; I’ve been reshaping HESA estates data and DfE Apprenticeships data for analysis using Alteryx and also coaching analysts in using some more advanced Tableau features.

Writing a project proposal for the UFI VocTech challenge with our friends at We Are Open Co-op.

Studying Machine Learning with a Coursera course from Stanford. This week I’ve been implementing gradient descent and cost functions in Octave.
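For anyone curious what “gradient descent and cost functions” looks like in practice, here is a minimal sketch for one-variable linear regression. It is written in plain Python rather than Octave, it is not the course assignment, and the data, learning rate and iteration count are assumptions chosen just to make the example converge.

```python
def cost(theta0, theta1, xs, ys):
    """Half the mean squared error of the line theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent: update both parameters together each step."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # d(cost)/d(theta0)
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # d(cost)/d(theta1)
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Made-up training data that lies exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1, cost(theta0, theta1, xs, ys))
```

Because the data sits exactly on a line, the fitted parameters should end up very close to 1 and 2 and the cost close to zero. In Octave the loops over data points would normally be replaced by matrix operations.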

Applying patches to HtmlCleaner to handle some advanced cases of character handling. Looks like I’ll be making a release early in June.

Ich Lerne noch Deutsch mit Duolingo.

Walking to collect my youngest daughter from school every day. It’s about 3 miles each way!

Growing plants with the aid of all this sunshine. Tomatoes, peppers, chillies, potatoes and peas all coming along nicely.

Reading “Occupy Me” by Tricia Sullivan.

Next week is half-term for the kids, so I probably won’t blog until the week after.

In case you don’t know me, I’m a consultant and partner in Cetis LLP, specialising in BI/analytics, open source software, and technology discovery and pre-procurement. Always interested in new clients, so feel free to get in touch.

Weekly Update May 12-19

So, what have I been up to lately?

Working again on agile business intelligence with both of my Jisc Analytics Labs teams; I went to a meeting in London for the team focussed on postgraduate courses, and worked remotely for the team working on estates data. It’s been a busy week – creating synthetic data as placeholders for private data sources, and reshaping HESA data using Alteryx.

Discussing a potential project proposal for the UFI VocTech challenge with partners.

Watching Eurovision. I really wanted Belgium to win this year – the Portuguese thing did nothing for me.

Preparing our new online research data service for launch. Website is up, last few bugs seem to be fixed, and we’re bringing onboard private beta participants.

Ich Lerne noch Deutsch. 46%! Danke, Duolingo.

Studying Machine Learning with a Coursera course from Stanford. Getting my linear algebra on this week using Octave!

Helping one of our clients fill a gap in their development capacity.

In case you don’t know me, I’m a consultant and partner in Cetis LLP, specialising in BI/analytics, open source software, and technology discovery and pre-procurement. Always interested in new clients, so feel free to get in touch.

 

Weekly Update May 8-12

Weekly update time! So what have I been up to? Actually it’s been a pretty quiet week.

Working on agile business intelligence with one of my Jisc Analytics Labs teams, getting them up to speed with data reshaping in Alteryx during our face-to-face meeting in Salford, and supporting another team with Tableau. I’m off to London next week for their face-to-face meeting, so will do a fair amount of coaching then too.

Discussing a proposal for some open source work with the OECD.

Chatting with some lovely people at Microsoft about our use of the Azure platform. Azure is usually my first choice when it comes to cloud services these days.

Releasing HtmlCleaner 2.21. We had a regression that wasn’t picked up by the rather extensive test suite.

Participating in a video interview for a MOOC being developed for the University of Edinburgh – the whole course is aimed at businesses and non-profits and covers utilising open resources, but my bit was on the subject of open source and commercialisation. So I spent an afternoon under the lights being interviewed for four weeks’ worth of course videos.

Watching season 2 of The Killing. Nearly done!

Growing things in the garden. Potting on my two heritage perennial kales, sowing more lettuces, and eagerly watching the chillies, peppers and tomatoes grow in the greenhouse.

Preparing our rather splendid new online research data service for launch soon.

Enrolling on the Stanford machine learning MOOC.

Reading the Fables graphic novel.

Meeting (virtually) with my business partners at Cetis LLP, and welcoming some new partners into the cooperative. So much going on in the company right now, and still no time to update the website since we launched over two years ago. We’re also looking at joining up with other technology-focussed coops in the UK.

Ich Lerne noch Deutsch. 45%! Danke, Duolingo.

Visiting my eldest daughter’s school for parents evening. They think she’s awesome, but we knew that already.

In case you don’t know me, I’m a consultant and partner in Cetis LLP, specialising in BI/analytics, open source software, and technology discovery and pre-procurement. Always interested in new clients, so feel free to get in touch.

Weekly Update May 1-5

Another week goes by … this week I’ve been

Working with my Analytics Labs teams once more. I stood in as Scrum Master for my PGT analysis team, took charge of the Trello board, and helped out with various Tableau-related queries.

Organising a video interview for next week on open source.

Tinkering with machine learning in Weka. This week I’ve been creating some classification models. I’ve started training a model using the UNISTATS dataset with the J48 decision tree algorithm to try to predict future NSS scores based on course structure.

Releasing HtmlCleaner version 2.20. I had 12,000 downloads last month from Maven Central, and 445 direct from Sourceforge.

Watching season 2 of The Killing. No spoilers, please!

Learning German. Jetzt ich kann sage “ich habe ein wichtig Ente”. Vielen danke, Duolingo.

Pondering a bid to the UfI VocTech Seed funding challenge. We have a very cool ready-to-go project on employability skills, but is it a good fit?

Developing the venerable Cetis/K-Int Content Transcoder, a service-based tool for converting various types of e-learning content formats, following a tip off from my Cetis colleague @wilm that people still need it. I’ve created a new fork with updated dependencies. Next, I’ll rewrite it to run as a desktop app as well as an online app.

Potting on various plants in the greenhouse.

Preparing our rather splendid new online research data service for launch. More on this soon…

Reading “Heart of Darkness” by Joseph Conrad.

Weekly Update Apr 24-28

I’m going to try to blog some weekly updates, inspired by Doug. Let’s see how it goes, eh?

This week I’ve been:

Working with a team of HEI data analysts looking into Taught Postgraduate courses as part of Analytics Labs, a fabulous programme from Jisc and HESA that our company is supporting. As usual I’ve been sourcing data and reshaping it for analysis using Alteryx.

Working with another team of HEI data analysts from around the UK looking at apprenticeships, and also estates data. Once again I get to play with the Working Futures datasets! Plus I get to use the estates returns for the first time, possibly to connect up with some open LA planning datasets.

Talking with University of Edinburgh about providing subject matter expertise on FOSS for a new MOOC they’re running.

Talking with researchers at the University of Manchester on making some interesting health data analytics software they’ve developed into a sustainable open source project, under the OSS Watch mantle.

Tinkering with machine learning using Weka.

Working with Manchester Metropolitan University on planning the student systems integration for their major change programme.

Developing an improved attribute handling process for HtmlCleaner with input from community members.

Playing with Mastodon (the microblogging platform, not the band.)

Watching The Killing. OK, everyone else has seen it, but we only just got Netflix in our house.

Learning German. (Zufolge nach Duolingo, ich kenne 45%! Ja, wirklich…)

Growing tomatoes, chillies, and peppers in the greenhouse. Potatoes, garlic, shallots and peas are doing nicely in containers.

Wow, I’ve been busier than I thought. Maybe I’ll keep doing this.

Busy busy busy!

OK, I admit it, I’ve been ignoring this blog for way too long! But that doesn’t mean nothing has been happening – quite the opposite.

About a year ago I became one of the founding partners of Cetis LLP, and since then have been extremely busy working for a range of clients on everything from student systems integration and business intelligence, to research data apps and digital credentialing.

I’m still very much involved in open source and open standards – they’re a key ingredient in almost every solution I’ve been involved in. I’ve also fitted in a small amount of teaching and training on these topics, but haven’t been able to keep up with any academic writing.

My next challenge is working with partners to build our own cloud-based data service offering. Watch this space! Just not too closely, as I may be too busy to write anything for a while 🙂

bee photo by Bill Damon used under CC-BY

How are things going with HtmlCleaner?

HtmlCleaner is the FOSS project I’ve been maintaining since 2013. So, how is it going so far?

Downloads and users

I can get download stats from Sourceforge, which hosts the binaries, but perhaps a better perspective is the number of times the project is being downloaded via Maven Central as a component of other projects – Sonatype handily provide these statistics.

Last month (June 2015), there were 1,184 downloads from Sourceforge. There were also 10,468 downloads from Maven Central. It’s averaged around that number per month over the year. Sourceforge downloads were at their peak in 2014 when they hit 1,500 a month, but have been stable at 1,000 or so a month since.

How meaningful these statistics are I’m not sure; they do seem to show a pretty stable level of interest in HtmlCleaner, which is encouraging. Also, a total of 130,000 downloads a year seems like a lot – there must be quite a few users out there!

Community

Although we haven’t added more committers to the project (something I’m keen to do) we have had a lot more patches over the past year being submitted by users and included in releases. Most recent releases have included at least one user-submitted patch.

(My general philosophy on patches is that, if they work and are well tested, they go in – I don’t have any sort of ideological preferences for how code is written or whether I think a feature is necessary; if a user wants something to the extent they create a patch for it, it’s a valid feature request by definition.)

We have had lots of users submitting bugs and questions too, which is a great sign. (No, really, I like seeing bug reports! Only software with no users has no visible bugs…)

Releases

Release frequency has been a bit patchy, something that’s entirely my fault as all kinds of other priorities get in the way. We’ve had 3 releases so far in 2015, 3 in 2014, and 6 in 2013. Still, before that there was only one in 2010 and two in 2008, so we’re doing well by comparison!

The Code

We added some new features this year, finally updating to full HTML5 tag set support, and adding some much nicer command-line operations. However, I think we’re getting close to the time that I need to strip down and rebuild the engine for a 3.0 version, as we’re coming up against the limits of tweaking the existing engine.

If cleaning up shoddy HTML is something that interests you, pop along to HtmlCleaner and help out!

Posted in development, open source | Tagged | Leave a comment