[ANSWERED] re: #464 "Ingest arbitrary data"

Well, consider the internet TV scenario. It’s highly desirable to gather statistics on viewer watching patterns, trends, and monetisation-related activity.

For example:

  • When a video ad is played as part of a playlist, it’s really useful to know not just who clicked on it, but when they clicked on it: not just when relative to the ad playout, but when relative to the whole playlist. That is, we want to know which videos played before the ad, so we can guess at the viewer’s mood at the moment of the click. This allows the advert producer to know what was effective, the advertiser to know when it was effective, and the broadcaster to know how much of the ad played, so as to price ad inventory accordingly.
  • When a playlist item is clicked, that playlist item starts to play. It’s useful, again, to know what was playing at the time, because we can infer that the viewer is no longer in the mood to watch it. Thus we gain insight into viewer watching habits, which we can correlate with time of day, type of viewer (i.e. demographics), etc. This is especially true if we can visualise the viewing habit on a per-video and per-playlist basis, on a graph which presumably starts at the top and then drops away as people leave. We can infer that any peaks or sudden drops are worth investigating, and for bonus points we can show this to the broadcaster at the same time as the content plays, so they can try to see what caused the event. For live events this is even more valuable - one can tailor the content on the fly to real-time audience feedback from such a stats/reporting engine.
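
To make the drop-off graph a bit more concrete, here is a rough sketch of how it could be computed. It assumes each viewing session reports the last second of the video it watched; the function names and the input format are made up for illustration and are not anything Piwik provides.

```python
from collections import Counter


def retention_curve(last_seconds_watched, duration):
    """Return a list where entry t is the number of sessions still watching at second t.

    last_seconds_watched is a hypothetical input: one 0-based integer per
    session, giving the last second of the video that session watched.
    """
    # Count how many sessions ended at each second (clamped to the video length).
    drop_offs = Counter(min(s, duration - 1) for s in last_seconds_watched)
    viewers = []
    remaining = len(last_seconds_watched)
    for t in range(duration):
        viewers.append(remaining)
        remaining -= drop_offs[t]
    return viewers


def sudden_drops(viewers, threshold=0.2):
    """Return the seconds at which the audience shrank by more than `threshold` (a fraction)."""
    return [
        t
        for t in range(1, len(viewers))
        if viewers[t - 1] > 0
        and (viewers[t - 1] - viewers[t]) / viewers[t - 1] > threshold
    ]
```

Calling `retention_curve` once per video gives the curve that starts at the top and falls away; `sudden_drops` then flags the points worth investigating. The 20% default threshold is an arbitrary choice.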

There are many other such facts that could be defined and recorded, and it seems like you have an effective fact-tracking mechanism. For us to leverage it, we would need the flexibility to define our own facts, hence, “arbitrary data”. It also seems like you have a nicely extensible reporting system to write plugins against for visualisation of analytics. We were hoping it would become possible for Piwik to be at least the basis for our statistics ingestion (being web-based and with an API, it’s ideal in that respect), and maybe the reporting platform as well, depending on how well it scales (though, obviously, scaling problems are nice problems to have, business-wise :wink: ).
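
As an illustration of what “define our own facts” might mean in practice, here is a minimal sketch of a fact record with a free-form data field. The class, its field names, and the JSON encoding are all assumptions made up for this example; none of it is an existing Piwik structure or API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Fact:
    # Hypothetical shape for a user-defined fact; not an existing Piwik structure.
    name: str          # what happened, e.g. "ad_clicked"
    visitor_id: str    # who it happened to
    timestamp: float   # Unix time of the event
    data: dict = field(default_factory=dict)  # the "arbitrary data"


def to_payload(fact):
    """Serialise a fact to a JSON string that could be sent to an ingestion endpoint."""
    return json.dumps(asdict(fact), sort_keys=True)


# Example: an ad click, recording its position relative to the ad and the playlist.
click = Fact(
    name="ad_clicked",
    visitor_id="visitor-1",
    timestamp=1200000000.0,
    data={"ad_offset_s": 12, "played_before": ["video-a", "video-b"]},
)
payload = to_payload(click)
```

The point is only that the fixed part (name, visitor, time) stays small, while everything specific to our use case lives in `data`.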

I hope that has removed at least some of the vagueness from the original ticket? It’s clearly a blue-sky feature - but at the same time, it would be really useful to know of a product we could just drop in to handle this aspect of what we’re currently building. If any vagueness remains, please say, and I shall expand on things further.

Maybe what you would need is: http://dev.piwik.org/trac/ticket/47 ?