Troublesome CSV

Another day, another source of bad data. This time, it was a CSV. As you can see, the format changes mid file. This makes importing tricky. The way I tend to handle importing into SQL is to create staging tables with all the columns defined as varchar, then, once its imported into SQL, then convert it. The problem occurs when you try to determine the number of columns required. Traditionally, we let the system auto determine how many columns we have, and if we have headers, define the names, unfortunately because the format changes we can’t do this otherwise it will ignore the additional columns - assuming the first row doesn’t contain all the columns....

December 3, 2017 · 1 min · matt40k

Care Quality Commission (CQC)

Today one of our older SSIS packages failed which loads data into our warehouse, it turns out one of the fields had been expanded as the business had started to use a different standard for one of the codes. Despite not being told about this, it causing it fail and generate extra work on an already busy Monday it was actually a really good thing. Firstly, once we identified the problem (and fixed it), our main business contact was aware of it, was able to explain why it was happening....

September 5, 2016 · 2 min · matt40k

Is today Day Zero or Day One?

This might sound pretty pointless and silly but when talking about a 20 working day SLA a day can mean alot between a massive fine or compliance. During one of my driving lessons with Gary we were talking about his recent streak of passes. The conversation then lead on to pass rates and us wondering how this broken down – for example, by gender, ethnicity, age, driving instructor type (intense driving school \ chain \ individual driving instructor \ family etc)....

June 10, 2016 · 2 min · matt40k

GeoJSON

So I’ve been playing wit mapping data, one of the nice things about GitHub is that it supports GeoJSON and automatically renders it With the mapping data already imported into Geography in SQL Server it was easy enough to convert to GeoJSON (Stackoverflow to the rescue again!)

December 10, 2015 · 1 min · matt40k

Parking fines issued in Ipswich

I made the above video using Microsoft Excel using data from a FOI request to Ipswich Borough Council about the number of Parking fines issued since 2010.

June 16, 2015 · 1 min · matt40k

Technical Analysis on Schools websites in England

So in my last blog post I asked, are you ready for IPv6? The post came about when I was looking at Schools MIS data, which Graham, Joshua and myself have being look at to see who are the big movers and shakers in the Schools administration software (MIS) arena. Data is collected by what software suppliers a school uses to submit the School Census (in England) which is requested under the Freedom of Information (FOI) from Department of Education(DfE) (saving having to FOI every individual school)....

December 5, 2014 · 2 min · matt40k