EDIT! – read note at footer – issues with overcounting using this method!
We have a blog report that used a custom variable to log the author's name. This was fairly standard practice for grouping blog posts and content together. We have a wide range of bloggers, both guests and on the payroll, so there are plenty of reasons why we need to know which bloggers' contributions are performing and which are not.
During our recent transfer to Universal Analytics, we followed best practice and reconfigured all of these custom variables as custom dimensions, working on the same basis.
So during the crossover, custom variables died a death on the report. Cue ‘where have our stats gone? / GA broken please fix’ email requests, and us pointing back to the transfer notification, which explained that UA would now report these values under custom dimensions and that people should look there. The issue is one of convergence. Is it possible to set up a report that ‘munges’ these two elements together seamlessly to give continuity? Or does the business have to run two reports until the new one is the only one of interest? Obviously the former is the preferred option, as it allows historical comparison.
So the first stage was to set up a custom segment that lets both options be driven from the data, with a condition rule that returns all traffic that had a blog author attached to it, whether as a custom variable or a custom dimension.
This materialised as a simple OR query using regex = ‘.’ (no quote marks in the actual regex!) in order to bring back entries where there were any characters attached for both variables and dimensions:
Edited – because it's always a lower-case slug of the blog author's name, I could use matches regex [a-z] instead of ‘.’…
This works inasmuch as it generates an overall visit statistic across both types of data, and therefore allows comparison with previous periods:
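To make the OR condition concrete, here is a minimal Python sketch of the segment logic run against a handful of made-up hits. It is not how GA evaluates segments internally. The field names `custom_var` and `custom_dim`, the sample pages and the author slugs are all hypothetical, and it assumes ‘matches regex’ behaves as a partial match.

```python
import re

# Hypothetical hits: each carries an old custom variable value, a new
# custom dimension value, or neither (None = not set on the hit).
hits = [
    {"page": "/blog/post-a", "custom_var": "jane-doe", "custom_dim": None},
    {"page": "/blog/post-b", "custom_var": None, "custom_dim": "john-smith"},
    {"page": "/", "custom_var": None, "custom_dim": None},
    {"page": "/blog/post-c", "custom_var": None, "custom_dim": "2015"},
]


def in_segment(hit, pattern):
    """OR condition: keep the hit if either field matches the regex.

    re.search is a partial match, which is the behaviour assumed here
    for 'matches regex' in the segment builder.
    """
    return any(
        value is not None and re.search(pattern, value)
        for value in (hit["custom_var"], hit["custom_dim"])
    )


# '.' keeps any hit with a non-empty value in either field.
any_char = [h["page"] for h in hits if in_segment(h, r".")]

# '[a-z]' additionally drops values with no lower-case letter at all.
lower_only = [h["page"] for h in hits if in_segment(h, r"[a-z]")]
```

With this sample data the ‘.’ version keeps the three blog pages, while the [a-z] version also drops the hit whose only value is the stray numeric string.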
However, the table-based output will not allow the two to be joined. The method attempted was to output the custom variable and then add the custom dimension as a secondary dimension, but as the custom variable does not return non-matching data, there is nothing for the custom dimension to reference against. If you open up the segment to include non-custom-variable data, this effectively says ‘return all data’ and nullifies the segment. It is the output that will not allow you to say ‘return all custom variable entries, and group up all page views that did not have a custom variable value under a blank value’.
So a possible answer would be to set the custom dimension on EVERY page, with a blank as the value for pages without an author. However, the historical page views have already been logged without this, so it would only work moving forward. As the problem lies with the historical data rather than with data collected from now on, this is not a feasible solution!
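For illustration, the grouping the report table refuses to do looks like this when written against exported rows. This is only a sketch in Python: the row layout, the field names and the ‘(not set)’ label are assumptions, and it says nothing about whether the exported numbers themselves can be trusted.

```python
from collections import Counter

# Hypothetical exported page views spanning the crossover: older rows
# carry the custom variable, newer rows carry the custom dimension.
pageviews = [
    {"page": "/blog/post-a", "custom_var": "jane-doe", "custom_dim": None},
    {"page": "/blog/post-a", "custom_var": None, "custom_dim": "jane-doe"},
    {"page": "/blog/post-b", "custom_var": None, "custom_dim": "john-smith"},
    {"page": "/", "custom_var": None, "custom_dim": None},
]


def author_of(row):
    """Prefer the custom variable, fall back to the custom dimension,
    and group everything else under a placeholder label."""
    return row["custom_var"] or row["custom_dim"] or "(not set)"


# One combined count per author, regardless of which field held the name.
views_by_author = Counter(author_of(row) for row in pageviews)
```

Here both rows for /blog/post-a land under the same author, and the home page falls into the ‘(not set)’ bucket rather than disappearing.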
EDIT! – this does not actually work. There seems to be a fundamental difference between the way GA adds the custom variable to the page hit (and then recalls a count of all page hits by custom variable) and the way GA allocates a custom dimension to a page hit (and then subsequently samples the results in order to attribute that custom dimension across all page hits). The end result is that blog author counts are overcounted by approximately 25%, and blog authors are being credited with the same blog entry, as well as with pages they did not edit (home page, training lists etc).
This does not appear resolvable, so the only avenues left are to extract the information using SSIS and then match up the page views to the authors in a back office function, or to add the blog authors to the page names. Shame, that was a nice use of custom variables, and I’m sure that as I look further into this the limitations will become more and more apparent.
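As a rough idea of what the back office matching could look like, here is a small Python stand-in (not SSIS, and not something we have built). It assumes a page-to-author lookup can be exported from the blog platform and that page views are extracted per page path. Every name and figure below is made up.

```python
from collections import defaultdict

# Hypothetical lookup exported from the blog platform: one author per page.
page_authors = {
    "/blog/post-a": "jane-doe",
    "/blog/post-b": "john-smith",
}

# Hypothetical analytics extract: (page path, page views) rows.
page_views = [
    ("/blog/post-a", 120),
    ("/blog/post-b", 80),
    ("/", 500),
    ("/blog/post-a", 30),
]


def views_per_author(page_views, page_authors):
    """Sum page views per author, crediting each page to exactly one
    author and skipping pages that have no author in the lookup."""
    totals = defaultdict(int)
    for path, views in page_views:
        author = page_authors.get(path)
        if author is None:
            continue  # e.g. home page or training lists
        totals[author] += views
    return dict(totals)


totals = views_per_author(page_views, page_authors)
```

Because the author comes from the page path rather than from a value attached to the hit, a page can only ever be credited to the one author it is mapped to, and pages without an author are left out.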