Tuesday 17 December 2019

Ezproxy harvest now automated

Just in time for the Christmas break, I have got the harvesting of our ezproxy logs fully automated.

From now on, a shell script will run in the early hours of each morning on our ezproxy server.  This gathers the latest day's worth of log data, reformats it, and then loads it all into our MySQL database.
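
For the curious, here is a rough sketch of the kind of nightly job involved, written in Python since that is what the rest of the pipeline uses.  The paths, file names and loader script are illustrative guesses rather than the real setup:

import datetime
import subprocess

# Yesterday's log, following the ezproxy.log.DDMonYYYY naming our server uses
yesterday = (datetime.date.today() - datetime.timedelta(days=1)).strftime('%d%b%Y')
logfile = '/var/log/ezproxy/ezproxy.log.' + yesterday

# Keep only the fields we need (IP, user, timestamp, resource) from the
# lines where the user authenticates their session
subprocess.run(
    "cut -d' ' -f1,3,4,7 " + logfile + " | grep 'connect?session' > /tmp/ezproxy.out",
    shell=True, check=True)

# Hand the day's extract to the loader that writes rows into MySQL
# ('load_ezproxy.py' is a made-up name for the Python script described below)
subprocess.run(['python3', 'load_ezproxy.py', '/tmp/ezproxy.out'], check=True)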

It means I can leave the process running without manual intervention (which had taken the form of weekly FTPs of log files and manual script-running), and start concentrating on working with colleagues to:

  • come up with a useful dashboard of commonly-needed information
  • devise some more challenging questions to pose about usage patterns with our electronic resources
I will save that fun until 2020.

It should also mean that I can start stepping back from the process, leaving colleagues with a useful new facility for exploring patterns in usage data that, until now, have never been available to us in any usable form.


Thursday 12 December 2019

The 'Quickcount' - an easy tally of enquiries, with built-in reporting and graphs



A couple of years ago, we chose to let our contract lapse with the third party that provided our enquiries logging system.

This outside system cost a lot of money, was poorly regarded by staff, and aimed to satisfy such a straightforward need that I was sure I could write something to do the same job - but at no cost.

Also, the available reporting was very limited. There were no graphs, and staff could not ask different questions of the data without contacting the supplier to code up a new report.

Basically, the system needed to provide a simple counter that would allow staff on our desk to record the types of enquiry that came their way (e.g. 'Directional', 'Printer problems', 'Using the catalogue').

The answer I came up with used PHP and a MySQL database.

Using this in-house model, staff recorded queries with a rapid tap of a button on a web page.  This logged the details to a MySQL database.

From the database, we could report back on the enquiries received using a data visualization package of choice.
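
The live page is written in PHP, but the underlying idea is small enough to sketch in a few lines of Python (using the pymysql library; the table and column names here are made up for illustration):

import pymysql

conn = pymysql.connect(host='localhost', user='quickcount',
                       password='secret', database='library')

def log_enquiry(category):
    # One row per button tap: the category chosen and a timestamp
    with conn.cursor() as cur:
        cur.execute("INSERT INTO enquiries (category, logged_at) VALUES (%s, NOW())",
                    (category,))
    conn.commit()

def tally_by_category():
    # The basic report: how many enquiries of each type have been logged
    with conn.cursor() as cur:
        cur.execute("SELECT category, COUNT(*) FROM enquiries GROUP BY category")
        return cur.fetchall()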

To begin with, we exported the data as CSV and looked at it in Excel, but we very quickly saw the potential for creating real-time dashboards using FusionCharts, Google Charts, and latterly Microsoft Power BI.

The end product took a few enjoyable hours to develop, and gave us a system that was quick and easy to use, and which produced as many statistics as we fancied.

You can try out a straightforward demonstration here:

https://www.sussex.ac.uk/library/quickcount/quickcount_demo.php

Feel free to tap away at the 'Category' buttons, then click to see either the real-time reports that use FusionCharts, or the delayed (but much more attractive) ones via Microsoft Power BI.

I would be happy to share the code for this with anyone who might have a use for a no-frills counter that provides real-time visualisations.

Monday 18 November 2019

The Real World comes a-calling: five weeks of coding fun!

As a techie, I have to admit that when we put our new kiosks into place on the first and second floors, they looked rather excellent.

However, as often happens to techies, it didn't take long for the real world to point out what should have been glaringly obvious to me all summer.

As soon as people started to look at these fine kiosks, staff began commenting that everyone had to incline their head to one side to work out where they were.

I had even been doing the same myself, without realising it.

The explanation was simple.  The floorplan graphics had all been drawn with our main entrance at the bottom of the picture.  This meant that, when viewed at a kiosk location on an upper floor, the map was orientated at 90 degrees to the standpoint of the user.

So, a subject that was shelved straight in front of someone appeared as if it should be off to their right.

My subconscious mind must have been suppressing this blooper for at least a year: it had been the first thing my daughter mentioned when I showed her the floorplans (don't you wish you had an interesting Dad like me?).  This time, I couldn't take refuge in being the parent who knows it all.

I was going to have to rotate the floorplans.

Sounds easy, doesn't it?  That is certainly what everyone who suggested it to me thought.  'Why don't you just ...'

However, to pull it off, I had to go back to the beginning with the first and second floor image files, and redraw new versions of the graphics.  This was down to my earlier innocence when converting the floorplans into vector graphics, which had left me with no way to rotate a floor without all the text turning onto its side.

That set me off on five weeks of recoding two floors to reflect the real world.

I finished the job on Friday afternoon (it is now Monday morning), and I can still hardly believe I am free of it.  The end results are what I should have aimed for from the beginning.  Now that I have mapped all the text and building landmarks as proper mathematical co-ordinates, I have plans that I can edit quickly and easily.
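
For anyone facing the same job, the arithmetic underneath is nothing worse than turning each stored co-ordinate through 90 degrees about the centre of the floor (while leaving the text labels themselves upright).  A toy version, assuming conventional x-right, y-up axes:

def rotate_90_anticlockwise(x, y, cx, cy):
    # Rotate the point (x, y) by 90 degrees about the centre (cx, cy)
    dx, dy = x - cx, y - cy
    return (cx - dy, cy + dx)

# e.g. with the floor centred on (500, 500), a shelf stored at (700, 500)
# ends up at (500, 700) after the whole plan is re-orientated
print(rotate_90_anticlockwise(700, 500, 500, 500))   # (500, 700)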

The new plans are worth every second I spent on them.  I just wish I had known eighteen months ago how to avoid the need for them in the first place.


Friday 15 November 2019

Ezproxy - walkthrough final step - cool graphs!

Step Four: analyse the data and make whizzy graphs

All that remains is to point your reporting tool of choice at the MySQL database.

We have been using Microsoft Power BI to quickly produce visualisations that depict usage of our electronic resources in a way that has never been possible before.
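
Power BI connects straight to the MySQL database, but it is just as easy to take a first look at the data from Python.  Something along these lines (the table and column names are my illustration, not necessarily the live schema) gives a count of requests per school per resource:

import pandas as pd
import pymysql

conn = pymysql.connect(host='localhost', user='reports', password='secret',
                       database='ezproxy')

# Usage per school per resource, most-used first
query = """
    SELECT school, resource, COUNT(*) AS requests
    FROM ezproxy_usage
    GROUP BY school, resource
    ORDER BY requests DESC
"""
print(pd.read_sql(query, conn).head(20))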



Ezproxy walkthrough Part Two

Step Three - Python script

I have put a slightly redacted version of the Python script at:
https://github.com/alfi1/ezproxy-harvest
The script goes through the output file I created previously from our log files, and for each line:

  • Makes an Alma API call that gets information about the user from our Library Management system
  • Extracts 'school', 'department', and 'course' from the API results
  • Converts the IP address of the requester to a geographic location (country name)
  • Writes out to a MySQL database:
      • school
      • timestamp
      • resource
      • country
      • usergroup
      • course
From a GDPR point of view, note that the user name is discarded and never sent to the database.  We only want a school/department/course affiliation, and nothing that could identify the individual.
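
The script itself is at the GitHub link above; in outline, each line is handled roughly as below.  The Alma API call, the GeoIP lookup and the table layout shown here are simplified stand-ins - in particular, exactly where school, department and course sit in an Alma user record depends on local configuration:

import requests
import pymysql
import geoip2.database

# Assumed locations and credentials - the real values are redacted on GitHub
ALMA_USERS_API = 'https://api-eu.hosted.exlibrisgroup.com/almaws/v1/users/'
ALMA_API_KEY = 'redacted'

geo = geoip2.database.Reader('GeoLite2-Country.mmdb')
conn = pymysql.connect(host='localhost', user='ezproxy', password='secret',
                       database='ezproxy')

def affiliation(record):
    # Hypothetical helper: pull school/department/course out of the user's
    # statistics categories (the category codes below are placeholders)
    stats = {s['category_type']['value']: s['statistic_category']['desc']
             for s in record.get('user_statistic', [])
             if 'category_type' in s and 'statistic_category' in s}
    return stats.get('SCHOOL'), stats.get('DEPT'), stats.get('COURSE')

with open('ezproxy.out') as extract, conn.cursor() as cur:
    for line in extract:
        ip, username, timestamp, resource = line.split()[:4]

        # Who is this?  Ask Alma, but keep only the affiliation details
        record = requests.get(ALMA_USERS_API + username,
                              params={'apikey': ALMA_API_KEY,
                                      'format': 'json'}).json()
        school, department, course = affiliation(record)
        usergroup = record.get('user_group', {}).get('desc')

        # Where (roughly) are they?  Country name only, from the IP address
        country = geo.country(ip).country.name

        # Store the anonymised row - the user name itself goes no further
        cur.execute(
            "INSERT INTO ezproxy_usage"
            " (school, timestamp, resource, country, usergroup, course)"
            " VALUES (%s, %s, %s, %s, %s, %s)",
            (school, timestamp, resource, country, usergroup, course))

conn.commit()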

Ezproxy harvesting - walk through of the steps so far

I am genuinely happy at the prospect of being able to analyse usage statistics for our electronic resources.  I have heard myself telling colleagues it was impossible for so many years that I feel ashamed that I never seriously tried to pull it off before.

My Python -> MySQL model is shaping up well.

Here is an outline of how the process works so far.  (This will all be automated at a later stage, but at the moment it involves me taking the place of the scheduled jobs.)

Step One: get the Ezproxy logs

We host our own Ezproxy server, so I just FTP the most recent batch to a network drive that allows me to run Python.

The log files I need are named along the lines of:
  • ezproxy.log.04Nov2019
  • ezproxy.log.05Nov2019
  • ezproxy.log.06Nov2019
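
When the automation arrives, even this fetch can be scripted.  A rough sketch using Python's ftplib (the host name, credentials and path are placeholders):

from ftplib import FTP

# Placeholder connection details for our ezproxy server
ftp = FTP('ezproxy.example.ac.uk')
ftp.login('harvest', 'secret')
ftp.cwd('/var/log/ezproxy')

# Pull down one day's log (named as in the list above)
name = 'ezproxy.log.05Nov2019'
with open(name, 'wb') as out:
    ftp.retrbinary('RETR ' + name, out.write)
ftp.quit()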

Step Two: extract the details I need

From these huge logfiles, I only need a tiny subset of information:
  • IP address of the requester
  • User name of the requester
  • Timestamp
  • Which of our electronic resources they viewed
I do this at the command line, by going through the logs and cutting out what I need:

cat ezproxy*.log* | cut -d' ' -f1,3,4,7 |  grep 'connect?session' > ezproxy.out

(This basically retrieves columns 1, 3, 4, and 7 from the log file, from each line that shows the user authenticating their session)
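
The same filtering is easy to reproduce in Python, which will be handy when this step eventually moves onto a scheduler.  A sketch that mirrors the command above (same glob pattern, same output file):

import glob

# Equivalent of the cut/grep one-liner: keep fields 1, 3, 4 and 7 from
# every line that records a session authentication
with open('ezproxy.out', 'w') as out:
    for logname in sorted(glob.glob('ezproxy*.log*')):
        with open(logname) as log:
            for line in log:
                if 'connect?session' not in line:
                    continue
                fields = line.rstrip('\n').split(' ')
                if len(fields) >= 7:
                    out.write(' '.join([fields[0], fields[2],
                                        fields[3], fields[6]]) + '\n')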

With the user names redacted, the output ends up as one line per request: the IP address, the user name, the timestamp, and the resource URL.

Step Three - run it through my Python script

Details in the next post.

Friday 8 November 2019

Kiosks - news flashes and advertising posters

I had coded the kiosk interface as a dynamic webpage: using Bootstrap to handle the formatting, and PHP to gather up the data from outside systems (the PC-availability API; the study rooms; the occupancy total).

While doing so, I hadn't put any thought into the possibility that anyone apart from me would ever want to make changes to the content.  As such, the backend setup was very programmatic, and editable only on a web server to which few of our staff have access.

I had basically created a setup that would leave me hostage to making each and every change!

That would have been fine if we were talking about interesting developments to the floorplans, or additional API calls: but, very quickly, colleagues were asking me to post latest-news style articles at short notice.

To let me retain control of the code, while letting colleagues update the page, I came up with a simple trick: a web form into which colleagues could enter a headline and the text of a news flash.  Whatever they typed into the form was saved to a MySQL database, from which it was inserted live into the kiosk main page - with all the appropriate formatting applied.
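
The live pieces are PHP, but the pattern itself is tiny.  Sketched in Python with a hypothetical newsflash table, the form handler and the kiosk page boil down to an insert and a select:

import pymysql

conn = pymysql.connect(host='localhost', user='kiosk', password='secret',
                       database='kiosk')

def save_newsflash(headline, body):
    # Called when a colleague submits the web form
    with conn.cursor() as cur:
        cur.execute("INSERT INTO newsflash (headline, body, added) "
                    "VALUES (%s, %s, NOW())", (headline, body))
    conn.commit()

def current_newsflash_html():
    # Called by the kiosk page: fetch the latest flash and wrap it in the
    # page's own formatting, so colleagues never touch the code itself
    with conn.cursor() as cur:
        cur.execute("SELECT headline, body FROM newsflash "
                    "ORDER BY added DESC LIMIT 1")
        row = cur.fetchone()
    if row is None:
        return ''
    return '<div class="newsflash"><h2>%s</h2><p>%s</p></div>' % row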

The news flash form proved such a quick win (in sparing me from spade work) that I adapted the web-form->MySQL method to allow colleagues to add a digital poster to the display.  This permitted our Admin team to advertise topical events and news without needing me, while leaving the code of the page safe from accidental change.

In the picture, you can see a poster advertising some events from our Digital Discovery week.  The news flash has been temporarily removed to make space for a scrolling carousel of book jackets that link through to our catalogue during Black History Month.