Segment data is wrong

Hello,

I have been running Piwik for two years now and we are pretty happy with it, but I recently noticed that something is wrong with our segment data. We have a few dozen segments, and their page counts, visit counts, etc. appear to be wrong. For example, one segment used to get around 90k visits a month, and since July the numbers have been as low as 200 visits.

I know those numbers are wrong because if I look at the number of visits or pageviews for one URL inside the segment and compare it with the same URL in the default “all visits” segment, that single page gets more visits in the “all visits” segment than the whole segment gets in total.

I know you’ll tell me that most of the views for that page may happen in a different context from the segment I set up. That might be true for a tiny percentage of the visits, but I double-checked by analysing the Apache logs of my Piwik server: where Piwik reports only 200 pageviews in my segment for last month, I find more than 1000 calls to piwik.php coming from that page with the proper segment data every day of the past month.
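For anyone who wants to repeat this kind of log check, here is a minimal sketch of counting matching tracking requests per day. It assumes the Apache common or combined log format and that page-scoped custom variables arrive in the `cvar` query parameter as URL-encoded JSON; the function name `countSegmentHits` and the sample lines are made up for illustration and are not taken from my real logs.

```javascript
// Count tracking requests per day whose page-scoped custom variable in slot 1
// equals the expected [name, value] pair.
// Assumptions: Apache common/combined log format, `cvar` sent as URL-encoded JSON.
function countSegmentHits(logLines, name, value) {
  const perDay = {};
  // Matches e.g.: [14/Oct/2015:09:19:01 +0200] "GET /piwik.php?... HTTP/1.1"
  const linePattern = /\[(\d{2}\/\w{3}\/\d{4}):[^\]]*\]\s+"(?:GET|POST)\s+(\S*piwik\.php\?\S*)/;
  for (const line of logLines) {
    const match = linePattern.exec(line);
    if (!match) continue; // not a tracking request
    const [, day, requestPath] = match;
    const query = requestPath.slice(requestPath.indexOf('?') + 1);
    const rawCvar = new URLSearchParams(query).get('cvar');
    if (!rawCvar) continue; // request carries no page-scoped custom variables
    let cvar;
    try {
      cvar = JSON.parse(rawCvar);
    } catch (e) {
      continue; // malformed JSON, skip the line
    }
    const slotOne = cvar && cvar['1'];
    if (Array.isArray(slotOne) && slotOne[0] === name && slotOne[1] === value) {
      perDay[day] = (perDay[day] || 0) + 1;
    }
  }
  return perDay;
}

// Hypothetical sample lines, for illustration only.
const sampleLines = [
  '10.0.0.1 - - [14/Oct/2015:09:19:01 +0200] "GET /piwik.php?idsite=9&rec=1&cvar=%7B%221%22%3A%5B%22Site%22%2C%22SET%22%5D%7D HTTP/1.1" 200 43',
  '10.0.0.2 - - [14/Oct/2015:09:20:10 +0200] "GET /piwik.php?idsite=9&rec=1&cvar=%7B%221%22%3A%5B%22Site%22%2C%22OTHER%22%5D%7D HTTP/1.1" 200 43',
  '10.0.0.3 - - [14/Oct/2015:09:21:30 +0200] "GET /index.html HTTP/1.1" 200 512',
];
console.log(countSegmentHits(sampleLines, 'Site', 'SET'));
```

Comparing the per-day counts with what the segment report shows for the same days gives a quick idea of how much is being lost.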

I also set up a Google Analytics account with the same segment, and its data matches what we were expecting. For example, for that big segment we got 88K visits in September according to GA and… 234 according to Piwik.

Whatever happened (did I change something in my tracking code or configuration, did an upgrade break something?), it must have happened either at the end of June or in the first week of July, since all the segments have been inconsistent from then on.

What I don’t really understand is why I still have a few visits in my segments. If something went wrong with my tracker code or whatever, I would have expected the segments to be empty rather than still getting a few hits. Why are those hits still counted when 99.99% are just lost? Weird.

Does anyone have an idea of what could have gone wrong?

Thanks.

Hello

That’s strange. Can you try upgrading to the latest beta version? See the FAQ “I would like to test early beta and RC releases, how do I enable automatic updates to use these development versions?” (Analytics Platform - Matomo).

I cloned my Piwik server, including the database, and upgraded the clone to the latest build (after I switched the automatic upgrade to the beta channel, it kept saying I had the latest release…). My site also uses both trackers now. I’ll keep an eye on the segment data and see if there’s a difference.

All my reports are pre-generated using a cron call to /usr/bin/php5 /var/www/piwik/console core:archive --url=http://localhost/ every hour. I noticed these bits during the archiving process (using the latest build, forcing the segment):

DEBUG CoreConsole[2015-10-14 07:19:01] Earliest created time of segment 'customVariablePageValue1==SET' w/ idSite = 9 is found to be 2015-02-26.
DEBUG CoreConsole[2015-10-14 07:19:01] process_new_segments_from set to beginning_of_time or cannot recognize value

Also, the timestamps are wrong: my time zone is UTC+0200, so it should have been “2015-10-14 09:19:01”. No idea if that means anything here…
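A quick sanity check on that offset, as a small sketch. It assumes the log timestamps are written in UTC, which I have not confirmed; `shiftUtcTimestamp` is just a helper name I made up.

```javascript
// Shift a "YYYY-MM-DD HH:MM:SS" timestamp, assumed to be in UTC,
// by a fixed number of hours and return it in the same format.
function shiftUtcTimestamp(timestamp, offsetHours) {
  const [datePart, timePart] = timestamp.split(' ');
  const [year, month, day] = datePart.split('-').map(Number);
  const [hour, minute, second] = timePart.split(':').map(Number);
  const utcMillis = Date.UTC(year, month - 1, day, hour, minute, second);
  const shifted = new Date(utcMillis + offsetHours * 3600 * 1000);
  // toISOString() always renders in UTC, so the shifted instant reads as local wall-clock time.
  return shifted.toISOString().slice(0, 19).replace('T', ' ');
}

console.log(shiftUtcTimestamp('2015-10-14 07:19:01', 2)); // 2015-10-14 09:19:01
```

If that assumption holds, the logged time and my local time are the same instant, and the two-hour difference is only a display matter.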

Is there a way for me to check whether the data in the database is right? I mean, how could I check whether the reports and the database are consistent with each other? I’m trying to figure out if the problem is in the reports or in the captured data itself (a problem with my tracker, maybe…).

Thanks

I do not see any interesting change in the segments with Piwik 2.15.0-rc2.

I tried different things:

  • removed the donottrack support (didn’t change anything)
  • removed anonymization (didn’t change anything)
  • removed customization of url (didn’t change anything)
  • tried to set a segment based on the url instead of a custom variable (didn’t change anything)
  • tried to set a segment based on a custom variable name since all my segments are based on a custom variable’s value (didn’t change anything)
  • created a segment “browser contains Firefox” (didn’t work, empty)
  • created a bogus segment “browser is not blablablah”, and this time it actually worked and showed a decent number of visits.

my tracker:


<!-- Piwik -->
			var _paq = _paq || [];
			<!-- recommendation CNIL -->
			_paq.push([function() {
				var self = this;
				function getOriginalVisitorCookieTimeout() {
					var now = new Date(),
					nowTs = Math.round(now.getTime() / 1000),
					visitorInfo = self.getVisitorInfo();
					var createTs = parseInt(visitorInfo[2]);
					var cookieTimeout = 33696000; // 13 months in seconds
					var originalTimeout = createTs + cookieTimeout - nowTs;
					return originalTimeout;
				}
				this.setVisitorCookieTimeout( getOriginalVisitorCookieTimeout() );
			}]);

			_paq.push(['setCookieDomain','*.xxxxxx.xx']);
			_paq.push(['setCustomVariable', 1,"Site","SET",'page']);
			_paq.push(['setCustomVariable', 2,"Redacteur","xxxxx",'page']);
			_paq.push(['setCustomVariable', 3,"RubriqueNiveau1","SET_FR1",'page']);
			_paq.push(['setCustomVariable', 4,"RubriqueCourante","SET_FR1",'page']);
						
			_paq.push(['setCustomUrl','http://xxxxxxxxxx/jsp/fiche_pagelibre.jsp?CODE=90032948&LANGUE=0']);
			
			_paq.push(["trackPageView"]);
			_paq.push(["enableLinkTracking"]);

			(function() {
				var u=(("https:" == document.location.protocol) ? "https" : "http") + "://analytics.xxxxx.xx/";
				_paq.push(["setTrackerUrl", u+"piwik.php"]);
				_paq.push(["setSiteId", "9"]);
				var d=document, g=d.createElement("script"), s=d.getElementsByTagName("script")[0]; g.type="text/javascript";
				g.defer=true; g.async=true; g.src=u+"piwik.js"; s.parentNode.insertBefore(g,s);
			})();

After activating the debug mode on my tracker I noticed this in the logs:

DEBUG Actions[2015-10-16 07:41:09] [be7d5] Invalid custom variables detected (id=1)
DEBUG Actions[2015-10-16 07:41:09] [be7d5] Invalid custom variables detected (id=2)
DEBUG Actions[2015-10-16 07:41:09] [be7d5] Invalid custom variables detected (id=3)
DEBUG Actions[2015-10-16 07:41:09] [be7d5] Invalid custom variables detected (id=4)

in core/Tracker/Request.php line 555:


if ($id < 1
    || $id > $maxCustomVars
    || count($keyValue) != 2
    || (!is_string($keyValue[0]) && !is_numeric($keyValue[0]))
) {
    Common::printDebug("Invalid custom variables detected (id=$id)");
    continue;
}

That snippet expects $keyValue to be an array, but it is actually a JSON string: '["Site", "SET"]'. Hence the invalid value of $keyValue[0], which is always '['!
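To make the failure easier to see, here is a rough JavaScript analogue of that check. It is not Piwik’s code: `phpLikeCount` and `isValidCustomVar` are names I made up, the limit of 5 custom variables is an assumption, and the sketch assumes PHP 5 behaviour, where count() on a plain string returns 1.

```javascript
// Emulates PHP 5's count(): element count for arrays, 1 for a plain string.
function phpLikeCount(value) {
  return Array.isArray(value) ? value.length : 1;
}

// Hypothetical analogue of the quoted check; returns true when the value would be accepted.
function isValidCustomVar(id, keyValue, maxCustomVars) {
  return !(id < 1
    || id > maxCustomVars
    || phpLikeCount(keyValue) !== 2
    || (typeof keyValue[0] !== 'string' && typeof keyValue[0] !== 'number'));
}

const stillEncoded = '["Site", "SET"]'; // what the check appears to receive
console.log(stillEncoded[0]);                                  // [
console.log(isValidCustomVar(1, stillEncoded, 5));             // false
console.log(isValidCustomVar(1, JSON.parse(stillEncoded), 5)); // true
```

In this sketch the still-encoded string is rejected by the count test (1 instead of 2), while the same value decoded into an array passes, which fits the “Invalid custom variables detected” lines in the log.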

I’m opening a bug report on GitHub.