Interpreting rows from log_visit

Hey guys, I’m working on a custom report (which involves going straight to the DB), and I’m trying to make sense of some potential inconsistencies in the way Piwik logs data. Here are four rows from log_visit, all from the same visitor:

(sorry, this is probably hard to read)



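+-------------------+----------------------+-------------------------+--------------------------+-------------------------+------------------------+
| visitor_returning | visitor_count_visits | visitor_days_since_last | visitor_days_since_first | visit_first_action_time | visit_last_action_time |
+-------------------+----------------------+-------------------------+--------------------------+-------------------------+------------------------+
|                 0 |                    1 |                       0 |                        0 | 2012-07-10 08:36:14     | 2012-07-10 08:36:14    |
|                 1 |                    1 |                       0 |                        0 | 2012-07-10 10:59:55     | 2012-07-10 10:59:55    |
|                 0 |                    1 |                       8 |                        8 | 2012-07-18 15:10:38     | 2012-07-18 15:10:38    |
|                 0 |                    1 |                      22 |                       22 | 2012-08-01 18:53:53     | 2012-08-01 18:53:53    |
+-------------------+----------------------+-------------------------+--------------------------+-------------------------+------------------------+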


Basically, it’s the same user, but visitor_returning is set to 1 only for the second visit, not for the third and fourth. I would also expect the days since first visit and days since last visit to differ: for the last row, I’d expect visitor_days_since_last to be 14 (the number of days from 07/18 to 08/01) instead of 22, which is the number of days since the first visit on 07/10.
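For reference, the expected gaps can be double-checked in MySQL itself. This is only a quick sanity check using the timestamps from the rows above; DATEDIFF compares the date parts and ignores the time of day.

```sql
-- Days between the 3rd and 4th visit (what visitor_days_since_last should hold for row 4)
SELECT DATEDIFF('2012-08-01 18:53:53', '2012-07-18 15:10:38') AS days_since_last;   -- 14

-- Days between the 1st and 4th visit (what visitor_days_since_first should hold for row 4)
SELECT DATEDIFF('2012-08-01 18:53:53', '2012-07-10 08:36:14') AS days_since_first;  -- 22
```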

Am I interpreting these values incorrectly?

It looks like visitor_days_since_last and visitor_returning indeed have wrong values for some visits.

Do you see this behaviour for all visitors, or only for some?

So, it seems to happen only for a relatively small number of users. To quantify the behavior I mentioned above, I ran the following query:


SELECT 
	visitor_returning,
	COUNT(visitor_returning),
	AVG(visitor_days_since_last) 
FROM 
	log_visit 
WHERE 
	visitor_days_since_last > 0 
GROUP BY 
	visitor_returning

… and the result was:


+-------------------+--------------------------+------------------------------+
| visitor_returning | COUNT(visitor_returning) | AVG(visitor_days_since_last) |
+-------------------+--------------------------+------------------------------+
|                 0 |                    13514 |                     193.9778 |
|                 1 |                  2381745 |                      13.7203 |
+-------------------+--------------------------+------------------------------+
2 rows in set (9.85 sec)

(Notes: we’re only tracking one site, so I didn’t filter on idsite. log_visit has about 12.5 million rows, so there are a lot of people who don’t come back :wink: ).

In this case, I’d expect visitor_days_since_last to always be 0 when visitor_returning = 0, but it doesn’t seem to be the case.

Perhaps I can safely assume that if “visitor_days_since_last” is greater than 0, then they’re a return visitor?

“Perhaps I can safely assume that if visitor_days_since_last is greater than 0, then they’re a return visitor?”

It should indeed be a safe assumption. Maybe we could even do that in the Piwik code. I took note of it!
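Until the tracker itself is changed, a custom report could apply that assumption at query time. The following is a minimal sketch, not anything Piwik ships: the alias visitor_returning_fixed is made up for illustration, and the table name assumes no prefix, as in the query above. It leaves rows already marked 1 (returning) or 2 (returning customer) untouched and only corrects rows where the flag is 0 but a previous visit was evidently recorded.

```sql
SELECT
	idvisit,
	visitor_returning,
	visitor_days_since_last,
	-- Hypothetical derived column: treat the visit as returning
	-- whenever days-since-last is positive, otherwise keep the stored flag.
	CASE
		WHEN visitor_returning = 0 AND visitor_days_since_last > 0 THEN 1
		ELSE visitor_returning
	END AS visitor_returning_fixed
FROM
	log_visit
LIMIT 20
```

Wrapping the CASE expression in a GROUP BY gives the corrected new vs. returning counts without modifying any stored rows.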

Heya, thanks for all your responses. I hope you can indulge me with one more answer to one more question :wink:

To perhaps clean up this minor discrepancy on my end in the Piwik code, I was thinking about a few different solutions.

The first way I was thinking is to modify Piwik_Tracker_Visit’s handleNewVisit method. I’m running 1.8.4, and there’s a bit of code that puts info into the visitorInfo array; notably this line in handleNewVisit() :


'visitor_returning' 	=> $isReturningCustomer ? 2 : ($visitCount > 1 || $this->isVisitorKnown() ? 1 : 0),

… I was thinking about modifying it like this:


'visitor_returning' 	=> $isReturningCustomer ? 2 : ($visitCount > 1 || $this->isVisitorKnown() || $daysSinceLastVisit > 0 ? 1 : 0),

… so that a visitor is also marked as returning whenever visitor_days_since_last is greater than 0.

The second approach I thought about was writing a custom plugin that implements the Tracker.newVisitorInformation event hook and modifies the referenced $notification object. I thought this way might be cleaner, since I try to avoid modifying Piwik core code when I can.

Anyway, I guess my question is: am I missing anything here (if there are more areas I’ll need to change), and am I approaching this correctly?

That’s what I had in mind :slight_smile: Can you please test the patch and create a new ticket with the patch attached? I’ll commit it for the next release.

Thank you for the report! Fixed in issue #3615 of matomo-org/matomo on GitHub: Tracker: when "visitor_days_since_last" is greater than 0, assume visitor_returning = 1