Taking a stab at making a privacy-preserving analytics thing for my blog.
I'm hashing the user's IP and User-Agent, so they're never stored directly. Not 100% accurate, but good enough for me to work out whether a page-view is unique or not.
Country is very rough - using GeoIP.
Referer is either from HTTP or if the linking site uses UTM.
Seems *broadly* accurate compared to other plugins.
I'm not sure of the performance if this hits a million rows. Might need further DB tables.
Getting there! I can now (inefficiently) see how many page views and (roughly) how many unique users have visited my blog.
I can also see where people are coming from on a specific day.
(Remember, this is just for me to noodle about with. This isn't going to be an amazingly accurate, professional-grade plugin.)
As pointed out by @DamonHD - the HTML <meter> element also works well!
Bit of CSS to rotate it, table for layout, job's a goodun!
Might have gone a *little* overboard! I rather like this though.
Self-hosted analytics showing a graph of the last 7 days.
Today's top referers, pages, and countries.
Super simple and suitable for my needs.
Next up - import historic data and work out if I want to keep countries & referers in it.
I'm basically farting around with the design because I don't want to do the hard work of periodically consolidating the database.
I *think* for now I just need a separate table of:
Date, Views, Users.
I can increment the Views every time the page is visited using something like `ON DUPLICATE KEY UPDATE visits = visits + 1;`
But deduplicating the users will be harder. I don't want to run a daily cron job for this.
@Edent be careful with 'UPDATE count = count + 1" patterns. They're unlikely to accurately count as you expect due to concurrency issues. Especially in MVCC systems, it will depend alot on transaction isolation levels.
For this kind of thing, just logging the visits as records and aggregating on analysis / periodically is often the better approach. Removes a lot of the concurrent write headache and give you more flexibility.
@Edent not super familiar with Maria/MySQL, more a PostgreSQL person.
But maybe: https://mariadb.com/docs/server/server-usage/triggers-events/event-scheduler/events
@Edent failing that, good old cron.
@Edent also in general window functions are great for aggregating these kind of stats.
@intrbiz aha! Great, thanks 🙂
@intrbiz thanks. Is there a sensible way to schedule those sorts of reconciliations?