Medium Website with Zero Hits Last Week

I run a medium-sized website that gets about 50k hits each week, but Piwik has really been struggling lately. I’ve been a devoted Piwik user for years, I’ve filed bugs, and I’ve even worked on documentation, but I’m getting really fed up. I’m tired of fighting with what should be a reliable part of my infrastructure, so I’m hoping for some help before admitting defeat and jumping ship.

Some background:

  1. Our server has LOTS of power and I can’t believe how much Piwik slows it down. The server has 24 cores, 128GB of RAM and SSD drives mounted in a RAID array. There is NO reason Piwik should struggle on this machine, but I regularly see Piwik archiving chugging along at 99% of a CPU. Something is very wrong about that, since 50k hits/week is not really very many. I regularly do data work on this server to the tune of changing a million records – that’s no big deal…why is Piwik’s 10k/day a big deal?

  2. I’ve read the FAQ about setting up archives, and the FAQ about tweaking MySQL for performance (http://piwik.org/faq/troubleshooting/faq_183/).

I’ve tweaked my settings so that:

  • PHP’s memory_limit = -1,
  • MySQL’s wait_timeout = 20600 (about six hours) and
  • MySQL’s max_allowed_packet = 128M.

Six hours and unlimited memory should be enough for just about anything related to tracking users. I can’t imagine a MySQL packet bigger than 128MB.

  3. The FAQ about setting up cron jobs suggests an hourly cron job, but since it takes ages to run and chugs along at 99% CPU, I changed my cron jobs to run only once per day at five minutes past midnight:

5 0 * * * /usr/bin/php5 /var/www/piwik/console core:archive --url=https://www.my-website.com/piwik/ > /dev/null

  4. In Piwik’s settings webpage I’ve set the archiving to trigger no more than every 18000 seconds and I’ve disabled the web-based archiving.

I believe these are all of the recommended settings for making Piwik work…and yet…I got zero results in my archive for last week. I could go and delete the MySQL tables for last week, as I did for the week prior and then re-run the archives, watching to make sure it works, but this all feels very wrong.

Questions:

  • I don’t ever want to see zero results again. I didn’t get an error email from cron this time. What do I need to do to make Piwik work reliably as we scale to 100k hits/week?
  • The amount of processing Piwik does is unbelievable when it runs its archive. Is this expected behavior? It’s only 50k people per week, less than 10k/day. That’s nothing, right? Why should Piwik struggle with that on a server like this? What can we do to make it do less work? It seems a bit unbelievable that I can’t have hourly results without pegging a CPU at 99%, but that’s where I’m at right now.
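On the missing error email: the cron line earlier in this post sends standard output to /dev/null, so a run that fails quietly leaves no record. A minimal sketch of a wrapper that logs each run's output, exit status and duration is shown below. The function name `run_logged` and the log path are illustrative assumptions, not part of Piwik.

```shell
#!/bin/sh
# Hypothetical helper (not part of Piwik): run a command, append its output
# to a log file, then record the exit status and elapsed seconds.
run_logged() {
    log_file=$1
    shift
    start=$(date +%s)
    "$@" >>"$log_file" 2>&1
    status=$?
    end=$(date +%s)
    echo "$(date '+%Y-%m-%d %H:%M:%S') exit=$status seconds=$((end - start)) cmd=$*" >>"$log_file"
    # Pass the command's exit status back to the caller (and to cron).
    return "$status"
}

# Example use (the log path is an assumption):
# run_logged /var/log/piwik-archive.log /usr/bin/php5 /var/www/piwik/console core:archive --url=https://www.my-website.com/piwik/
```

With a log like this, a week that archives to zero can at least be matched against the run that produced it and how long that run took.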

Really appreciate any help. It’s the community that’s gotten me this far.

Mike

Hi Mike,

24 cores, 128GB of RAM and SSD drives mounted in a RAID array

Clearly you have a problem with this server. Maybe it’s not configured properly or there’s something else going on. With 24 cores and 128GB of RAM, the archiver should never use 99% of your total CPU; at most it should use 99% of 2 or 4 cores.
So I’m confused… maybe your MySQL server is misconfigured? Have you asked an experienced sysadmin to take a look at it?

The amount of processing Piwik does is unbelievable when it runs its archive. Is this expected behavior?

Sure, it’s expected: Piwik pre-processes all data. Archiving reports will take several minutes when you do 200K hits per month, but it shouldn’t impact your server since it has so much power. The fact that your server becomes slow or overloaded is the problem, not that Piwik takes several minutes to process data. Maybe your monitoring could show you what’s wrong? Maybe there is not enough memory allocated to MySQL. It could be so many things that it’s hard to debug or help much here.
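For scale, the 50k hits per week from the original post can be converted to other rates with a few lines of shell. This is only a sketch: it assumes traffic is spread evenly and uses a 30-day month, and the function names are made up for illustration.

```shell
#!/bin/sh
# Convert a weekly hit count into other average rates.
per_day()    { echo $(( $1 / 7 )); }                 # integer average per day
per_month()  { echo $(( $1 * 30 / 7 )); }            # assumes a 30-day month
per_second() { awk -v h="$1" 'BEGIN { printf "%.3f\n", h / 604800 }'; }  # 604800 s per week

per_day 50000      # prints 7142
per_month 50000    # prints 214285
per_second 50000   # prints 0.083
```

So the site in question sits close to the 200K hits per month figure, at well under one hit per second on average.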

Thanks for the response, and sorry, I guess I wasn’t entirely clear. The slowdown I’m referring to is Piwik pegging four cores of one CPU at 99% for several minutes. The server itself keeps going smoothly, but even removing four of its cores every hour for several minutes seems weird for something like Piwik. As mentioned above, I’ve changed Piwik to only do archives once daily, but this still surprises me.

That aside though, my larger issue is that I get these reports with zero hits in them. I’ve done everything in the FAQs and yet I still have zero hits from last week.

You said that it only should take a few minutes for the archive to run, so after my last response I re-ran it like so:

/usr/bin/php5 /var/www/piwik/console core:archive --force-all-websites --force-all-periods --url=https://www.my-website.com/piwik/

It ran for about an hour, pegging at least two cores and often four of them. Today is Monday so I expect the weekly report should be fast, but it’s also the 30th, so I suppose that’ll make the monthly report slow. Still, an hour with several cores pegged seems crazy.

There are many options if you want to investigate but it’s not easy as it requires sysadmin expertise.

For example, you can enable the MySQL slow query log, enable monitoring, or ask a sysadmin to check the graphs.

Also check that all system checks are green in Settings > System Check.

I have experienced the same issue. Some websites have zero hits in the week overview.
If I switch to period “day” or “month”, everything works fine.
(Our Piwik server is also using the cron archiver.)

Another issue I noticed is that the e-commerce overview widget is sometimes buggy.
Sometimes it shows only zero values while other widgets are working. After re-adding it to the dashboard it works for a while.

Hi again, thanks for the responses.

I did a few more tests today:

  1. We only use MySQL for Piwik and I turned on the slow query log and re-ran the archive script. Watching the log, I can easily see that the log is not getting anything added to it by Piwik (I tested with SELECT sleep(15); to make sure it was working). When I look in top, I can see that the program using CPU is PHP, not MySQL, so it makes sense that we don’t have any slow queries. Some sort of processing that’s happening in PHP appears to be the problem, but I’m not sure how to proceed to debug that, since I’m not a PHP developer. I am happy to look into it though, if somebody can tell me how.

  2. I checked in Settings > System Check and it’s green all the way down.

I’m fairly convinced that something non-trivial is happening in PHP. What can we do to test this?

Thanks again,

Mike

Piwik is processing your report data, which is CPU intensive. Maybe you could solve your problems by giving “less priority” to the PHP process so that Piwik does not overload your server while it’s processing reports daily?
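One common way to give a process “less priority” on Linux is the standard `nice` utility. The sketch below wraps it in a helper; the function name `run_low_priority` is an illustrative assumption, and the example command is the one posted earlier in this thread.

```shell
#!/bin/sh
# Run a command at the lowest CPU scheduling priority (niceness 19).
# The command can still use CPU time that nothing else wants; it only
# yields when other processes compete for the same cores.
run_low_priority() {
    nice -n 19 "$@"
}

# Example:
# run_low_priority /usr/bin/php5 /var/www/piwik/console core:archive --url=https://www.my-website.com/piwik/
```

Niceness changes who wins when the CPU is contended, not how much work the archiver has to do, so on a mostly idle machine the run time should be largely unchanged.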

I could do that, sure, but the problem isn’t exactly that Piwik is overwhelming my server, it’s that on an extremely powerful machine it took Piwik 67 minutes to run the archive yesterday, during which time, PHP kept 2-4 of my cores fully occupied. For the number of hits we get, that’s just really a LONG time to process something. I could tune Piwik to have a lower CPU priority, but then who knows if it would ever finish archiving if I did that.

I guess what I’m hoping for is a recognition that this level of processing isn’t a desired behavior and some thoughts on features I could disable or other things I or we could do to make Piwik less CPU intensive.

I just want to say that last week I got 200k page views (from mainly 4 sites) and I’m archiving every 15 minutes with Cron using Piwik 2.4b6 and archiving takes 300 seconds on average.

The virtualized PHP server, with XCache enabled, has 8 GB of memory and 2 CPUs. MySQL is on another machine with similar specs and it’s “bored”.

Do you use a PHP opcode accelerator? Even with many CPUs, it cuts PHP CPU time by at least a factor of 4 to 10.

Dali

Thanks for those suggestions Dali. I’ve run five more tests on this.

  1. I never got complete results archived for June, so I deleted their tables from MySQL, and re-ran the archiver. This took 163 minutes to complete.

  2. I installed XCache with default settings (apt-get install php5-xcache; sudo service apache2 restart), and ran the report for July. It’s currently Tuesday, July 8th, so this was a partial-month, partial-week report. It took 9 minutes to run.

  3. I didn’t like how tests one and two were comparing a full month (June) to a partial month (July), so I deleted the MySQL tables again for June and re-ran that report with XCache installed. In this case I aborted the test after 33 minutes, in order to alter XCache’s settings.

  4. I upped xcache.size to 128M from the default of 16M, and I changed xcache.count to 24 (one per core) instead of the default of 1 and re-ran the June report, again deleting the MySQL tables before beginning. This time it took 128 minutes.

  5. I re-ran the July report with the new XCache settings. This took 18 seconds, compared to 9 minutes without the XCache settings changes.

So:

  • June, no opcode accelerator: 163 minutes
  • June, XCache with default settings: 33m before I killed it
  • June, XCache with performance settings: 128 minutes
  • July, XCache with default settings: 9 minutes
  • July, XCache with performance settings: 18 seconds
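The ratios between the completed runs above can be worked out with a one-line helper. This is just arithmetic on the figures listed; the function name `speedup` is made up for illustration, and the ratios say nothing about whether two runs actually did the same amount of work.

```shell
#!/bin/sh
# Print how many times faster the second duration is than the first.
# Both arguments must be in the same unit.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

speedup 163 128   # June, minutes: no accelerator vs. tuned XCache; prints 1.3
speedup 540 18    # July, seconds: 9 minutes (540 s) vs. 18 s; prints 30.0
```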

I’m a bit confused how to interpret this. The 18 seconds, obviously, is a huge – massive – improvement, but I’ve never seen performance like that before and XCache definitely can’t account for that much change. The change from 163 minutes to 128 minutes feels more appropriate for XCache, but that’s still way too slow.

XCache here is 3.0.1 (Apache 2.2.x and PHP 5.3.13)

xcache.size = 128M (at the moment 99M is still available)
PHP memory_limit is at 1024M

That virtual machine just hosts Piwik (Apache-PHP) and the other server is just MySQL with Piwik as its only DB.

The Apache server is under 10% load when not archiving and looks like this while archiving:

[attachment 1812 PiwikCPU.jpg]

And here is my Cron output:


Running Piwik 2.4.0-b6 as Super User
---------------------------
NOTES
- Reports for today will be processed at most every 600 seconds. You can change this value in Piwik UI > Settings > General Settings.
- Reports for the current week/month/year will be refreshed at most every 3600 seconds.
- Archiving was last executed without error 13 min 10s ago
- Will process 4 websites with new visits since 13 min 10s , IDs: 2, 4, 5, 6
---------------------------
START
Starting Piwik reports archiving...
Archived website id = 2, period = day, 234 visits in last 2 days, 39 visits today, Time elapsed: 0.870s
Skipped website id 2 periods processing, already done 15 min 1s ago, Time elapsed: 0.873s
Archived website id = 4, period = day, 356 visits in last 2 days, 49 visits today, Time elapsed: 0.904s
Skipped website id 4 periods processing, already done 14 min 49s ago, Time elapsed: 0.907s
Archived website id = 5, period = day, 1415 visits in last 2 days, 155 visits today, Time elapsed: 1.299s
Skipped website id 5 periods processing, already done 14 min 35s ago, Time elapsed: 1.301s
Archived website id = 6, period = day, 6845 visits in last 2 days, 624 visits today, Time elapsed: 2.970s
Archived website id = 6, period = week, 39153 visits in last 2 weeks, 17931 visits this week, Time elapsed: 13.062s
Archived website id = 6, period = month, 152569 visits in last 2 months, 35876 visits this month, Time elapsed: 39.086s

It’s funny because having Piwik Cron running every 15 minutes always looked speedier than having it running every hour or so…

Keep us informed of your new discoveries!

Dali

And about the MySQL server, I strongly suggest you try MySQL Tuning Primer. It can give you hints if MySQL is a bottleneck…

https://launchpad.net/mysql-tuning-primer

Are your tables InnoDB?

Dali

I’m fairly certain it’s not MySQL. I set up the slow query log, and it’s not getting any new entries. Plus, the extreme processing occurs when PHP is eating up several cores, not when MySQL queries are being run.

I’m also fairly certain now that XCache doesn’t provide any performance boost to CLI. Two reasons:

  1. Logically, it doesn’t make much sense. All the opcode accelerators do (as I understand them) is cache the compiled version of the program. If you’re loading a website 1,000 times, yes, that makes a difference. If you’re loading a CLI script once, not so much.
  2. I dug into this, and found a ticket that appears to say there’s no caching for CLI: #317 (Why XCache doesn’t cache files in CLI?) – XCache
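A quick way to see whether an extension is even loaded by the command-line PHP binary is to inspect the output of `php -m`. The sketch below is a small filter for that output; the helper name `has_module` is an illustrative assumption, and it assumes the extension is listed under the name XCache.

```shell
#!/bin/sh
# Succeed if the module named in $1 appears, case-insensitively and as a
# whole line, in a module list read from standard input (e.g. `php -m`).
has_module() {
    grep -qixF -- "$1"
}

# Example with the CLI binary used in this thread:
# /usr/bin/php5 -m | has_module xcache && echo "loaded in CLI" || echo "not loaded in CLI"
# /usr/bin/php5 --ini    # lists the php.ini files the CLI reads
```

Being loaded does not by itself mean that caching is active for CLI runs; it only rules out the case where the CLI never sees the extension at all.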

So:

  • XCache - Waste of time.
  • MySQL - Not the problem.
  • PHP - Too slow most of the time, but very fast once.

I don’t know what to do next in order to debug. Any help understanding the tests I ran or ideas for next steps would be more than appreciated.

Did another test today and had the archive script finish in just one minute. I haven’t changed anything, so I’m quite baffled why it would change so dramatically. I’ll keep my eye on it and report back if I can figure out any other trends.

In the meantime, I still have several reports with a value of zero. This seems to be a really common thing. I wonder, is it worth creating a script that simply finds and fixes the zero reports? That’d be significantly easier to work with than the current solution of deleting MySQL tables…
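The “find” half of such a script could start from Piwik’s HTTP Reporting API, which can return visit counts per week for a site. The sketch below only builds the request URL; the helper name `weekly_visits_url`, the site ID and the token are placeholders, and it does not fix anything.

```shell
#!/bin/sh
# Build a Reporting API URL that returns visit counts for the last N weeks.
# Arguments: Piwik base URL, site ID, number of weeks, token_auth.
weekly_visits_url() {
    printf '%s/index.php?module=API&method=VisitsSummary.getVisits&idSite=%s&period=week&date=last%s&format=json&token_auth=%s\n' \
        "${1%/}" "$2" "$3" "$4"
}

# Example (site ID 1 and YOUR_TOKEN are placeholders):
# curl -s "$(weekly_visits_url https://www.my-website.com/piwik/ 1 8 YOUR_TOKEN)"
```

Any week that comes back as 0 for a site known to have traffic would be a candidate for re-archiving.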

It’s important to understand that the archiving cron script will archive different things every time. If you want to do a proper performance test you need to:

  1. delete piwik_archive_* tables for 1 month or several (and do this before every test)
  2. run the command to tell Piwik to archive everything; see this FAQ: How do I reprocess all websites, all dates and all periods, after initial import of logs?
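The two steps can be sketched in shell as follows. The helper name `archive_drop_sql` is made up for illustration, and it assumes the default `piwik_` table prefix with one numeric and one blob archive table per month. The database name and the year in the example are assumptions as well, so review the generated SQL before running it.

```shell
#!/bin/sh
# Print the SQL that drops the archive tables for one month.
# Arguments: four-digit year, two-digit month.
archive_drop_sql() {
    year=$1
    month=$2
    printf 'DROP TABLE IF EXISTS piwik_archive_numeric_%s_%s, piwik_archive_blob_%s_%s;\n' \
        "$year" "$month" "$year" "$month"
}

# Step 1 (database name "piwik" and year 2014 are assumptions):
# archive_drop_sql 2014 06 | mysql piwik
# Step 2 (command as posted earlier in this thread):
# /usr/bin/php5 /var/www/piwik/console core:archive --force-all-websites --force-all-periods --url=https://www.my-website.com/piwik/
```

Repeating step 1 before every timed run keeps the comparison fair, since each run then has to rebuild the same month from scratch.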