Information on Piwik capacity and responsiveness given various configs?

I’m investigating Piwik to get away from Google Analytics (and Analytics 360). I need to be able to…

  • handle hundreds of millions of events per day, easily exceeding 10k QPS (queries, i.e. events, per second)
  • have solid US responsiveness/coverage (e.g. distributing endpoints across West, Central, and East; international coverage is not a concern right now), meaning no crazy HTTP RTT delays or anything like that
  • analyze user information with <5 min latency (basically I want something that competes with GA’s “Real-Time” info)
  • retain data for years; I’m fine with significantly slower reporting performance for data older than 13 months, meaning automatic data warehousing and things like Glacier are fine with me
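
To put rough numbers on the throughput requirement, here is a back-of-the-envelope sketch that converts a daily event count into an average event rate and compares it with the 10k QPS peak. The daily volumes are illustrative assumptions (I only know it's "hundreds of millions"), and the function names are made up for this post; nothing here is Piwik-specific.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(events_per_day: float) -> float:
    """Average events per second if traffic were spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


def peak_to_average_ratio(events_per_day: float, peak_qps: float) -> float:
    """How many times larger the peak rate is than the daily average."""
    return peak_qps / average_qps(events_per_day)


if __name__ == "__main__":
    PEAK_QPS = 10_000  # peak rate from the requirement above
    # Assumed daily volumes, chosen only to bracket "hundreds of millions".
    for events_per_day in (100_000_000, 300_000_000, 864_000_000):
        avg = average_qps(events_per_day)
        ratio = peak_to_average_ratio(events_per_day, PEAK_QPS)
        print(f"{events_per_day:>12,} events/day -> "
              f"avg {avg:,.0f} QPS, peak is {ratio:.1f}x average")
```

For example, at 300 million events per day the average works out to about 3,472 events per second, so a 10k QPS peak is roughly 2.9 times the average. A sustained 10k QPS would be 864 million events per day.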

I understand this might have “significant cost” - that’s fine, I just want it clearly quantified (I can’t be the first guy asking about this, can I?). I don’t need to manage complex transactions. I don’t need to do any real customization of the Piwik system. I’m hoping there’s some sort of KB article about “if you choose X, Y, or Z AWS systems, you’ll get A, B, or C level of performance, as defined by QPS or whatever, and you’ll fill up your disks at D, E, or F rate.”
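
The kind of "disk fill rate" answer I'm after can be sketched like this. The bytes-per-event figure is a placeholder assumption, not a measured Piwik number (that is exactly the value I'd want the KB article to supply), and the function names are hypothetical.

```python
def storage_growth_gb_per_day(events_per_day: float, bytes_per_event: float) -> float:
    """Raw storage growth in GB/day (1 GB = 1e9 bytes), ignoring indexes and compression."""
    return events_per_day * bytes_per_event / 1e9


def days_until_full(disk_gb: float, events_per_day: float, bytes_per_event: float) -> float:
    """Days until a disk of the given size fills at a constant event rate."""
    return disk_gb / storage_growth_gb_per_day(events_per_day, bytes_per_event)


if __name__ == "__main__":
    EVENTS_PER_DAY = 300_000_000   # assumed volume, "hundreds of millions"
    BYTES_PER_EVENT = 500          # ASSUMPTION: placeholder, not a Piwik measurement
    DISK_GB = 3_000                # assumed disk size, for illustration only

    growth = storage_growth_gb_per_day(EVENTS_PER_DAY, BYTES_PER_EVENT)
    print(f"growth: {growth:,.0f} GB/day")
    print(f"a {DISK_GB:,} GB disk fills in "
          f"{days_until_full(DISK_GB, EVENTS_PER_DAY, BYTES_PER_EVENT):.0f} days")
```

With those placeholder inputs the raw growth is 150 GB per day, so the real answer depends almost entirely on the actual per-event storage cost for a given Piwik configuration.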

I’m happy to use any AWS products that make sense, but I do not want to have to code custom stuff (meaning I’m interested in “how it’s supposed to work out of the box”). I’m also happy to use other systems (e.g. Rackspace) and I have the ability to trivially CDN anything that needs wide distribution for HTTP reasons.

Thanks!

For what you want you are going to need a quote from a significant company that can handle all of your needs. I recommend AT&T as they are what we use for our internet sites. My much smaller intranet site is done by me and does not require the numbers you are talking about.

Does AT&T rep Piwik as a product? Can you share any contact information like a website or person I could talk to? Happy to contact them.

I’m fairly sure AT&T recommends IBM metrics rather than Piwik. I’m not really involved with it so don’t have a contact for you. If Piwik could be clustered it could work but the Piwik devs would have to chime in.

Got it. Well, I didn’t realize IBM had a “competitive” product (it doesn’t really have to be fully or directly competitive, just something that can compete in my case). Thanks for the tip.