My Python -> MySQL model is shaping up well.
Here is an outline of how the process works so far. (This will all be automated at a later stage; for now, I am standing in for the scheduled jobs.)
Step One: get the EZproxy logs
We host our own EZproxy server, so I just FTP the most recent batch to a network drive that allows me to run Python.
The log files I need are named along these lines:
- ezproxy.log.04Nov2019
- ezproxy.log.05Nov2019
- ezproxy.log.06Nov2019
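Since the date is baked into each file name, a small helper can put a batch into true chronological order (plain alphabetical order would put 01Dec2019 ahead of 04Nov2019). This is only a rough sketch: the function names are my own invention for illustration, and it assumes every file follows the ezproxy.log.DDMonYYYY pattern shown above, with English month abbreviations.

```python
from datetime import datetime

PREFIX = "ezproxy.log."  # assumed from the file names listed above


def log_date(name):
    """Return the date encoded in a name like ezproxy.log.04Nov2019, or None."""
    if not name.startswith(PREFIX):
        return None
    try:
        # %b matches abbreviated month names; this relies on an English locale
        return datetime.strptime(name[len(PREFIX):], "%d%b%Y").date()
    except ValueError:
        return None


def sort_log_names(names):
    """Keep only names that match the pattern, oldest first."""
    dated = [(log_date(n), n) for n in names]
    matching = [(d, n) for d, n in dated if d is not None]
    return [n for d, n in sorted(matching)]
```

Feeding it the result of os.listdir() on the network drive would return just the log files, oldest first, and skip anything else sitting in the folder.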
Step Two: extract the details I need
From these huge log files, I only need a tiny subset of information:
- IP address of the requester
- User name of the requester
- Timestamp
- Which of our electronic resources they viewed
I do this at the command line, by going through the logs and cutting out what I need:
cat ezproxy*.log* | cut -d' ' -f1,3,4,7 | grep 'connect?session' > ezproxy.out
(This keeps the space-separated columns 1, 3, 4, and 7 of every log line, then keeps only the lines that show the user authenticating their session.)
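For anyone who would rather stay in Python, the same cut-and-grep idea might look roughly like the sketch below. It is not the script I actually use; the function name is made up for illustration, and it assumes the same single-space-delimited layout as the command above. One small difference: it looks for the marker only in column 7 (the URL), whereas grep searches the whole trimmed line, and it skips any line with fewer than seven columns.

```python
MARKER = "connect?session"  # same literal string the grep step looks for


def extract_fields(lines):
    """Yield (ip, user, timestamp, url) for lines that start a session."""
    for line in lines:
        # Split on every single space, like cut -d' '
        parts = line.rstrip("\n").split(" ")
        if len(parts) < 7:
            continue  # too short to hold column 7
        ip, user, timestamp, url = parts[0], parts[2], parts[3], parts[6]
        if MARKER in url:
            yield ip, user, timestamp, url
```

Because it accepts any iterable of lines, an open file object can be passed straight in, so even a huge log is read one line at a time rather than loaded into memory.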
With the user names redacted, the output looks like:
Step Three: run it through my Python script
Details in the next post.