There’s a war going on my fellow SEOs and Search Marketers. Has been for a couple of years now. The war on organic data.
It was a war that started off very covertly, almost without incident, as noted by Jon Henshaw over 2 years ago on Raven Tool’s Blog. Google, by way of deprecated APIs, quietly pulled SERP [Search Engine Results Page] ranking data, which led to more and more companies scraping the results from Google and placing an extra burden on their servers. And perhaps Google was banking on the fact that, though kept somewhat quiet, Google Webmaster Tools has had “ranking data” for over 3 years now. Maybe this was Google’s evolutionary step? Nonetheless, it was one of the first assaults on organic data: Google’s conscious and deliberate action to close off a major pipeline to SEOs and Webmasters. It registered as nothing more than a blip to most of the community, myself included, but the SEO tool companies probably had a good idea where it was headed. Maybe they decided to just “wait and see”?
2011: Google Kicks in the Door
A year after Google shut off the ranking data APIs, they got brazen. They gathered up the troops and kicked down the door to the SEO house, fingers hugging the triggers. It was akin to enacting Google’s own personal Patriot Act: they black-boxed the organic search data. “Not provided.” Keywords from users who were signed in or searching over SSL would appear as “not provided” in your organic search keyword data, all under the guise of protecting user privacy. Immediately, this change was said to affect less than 10% of data in analytics. “Single digits” was the quote from Matt Cutts.
2012: One Year of “Not Provided”
Danny Sullivan’s excellent write-up “Dark Google” tells the story. Single digits? It’s hard to imagine that Matt could say it and keep a straight face. I don’t know about you, but my sites are consistently between 10% and 15% “not provided”, and in some extreme cases they range near 25%. For a small business, 25% shielded keyword data is pretty big, and pretty crappy. That’s an awfully large gaping hole to be missing out on. How can they [small businesses] help it that people are signed in or starting off in “https” search? That’s a ton of valuable data that small businesses could be using to analyze customer behavior, to write better, more targeted content for their customers, and to expand and refine their consumer funnels to convert more and stay afloat in an economy that’s still moving sideways. I think of it this way: if Google suddenly lost 15% of its collected search data overnight, don’t you think they’d be pretty pissed off? Just “poof”, and it’s gone.
Losing That Data Might Be the Best Thing For SEOs
I know that it sounds backwards, but hear me out. Also, let’s put aside the obvious here: of course this move was intended to get every business involved in Paid Search (not just those who spend millions upon millions every year). Because like SEOs, Google knows it’s the thousands of small accounts [mid-tail/long-tail] that add up. If you ensure the data that was once free has to route through a paid resource, you’re going to get a large swath of folks to jump on board and buy in.
Perhaps it was all part of their design (can’t discount that theory), but losing organic data has forced SEOs and Search marketers to expand their tool-kit to get that data. When you can’t rely on a single source, you’ve got to employ multiple channels (i.e. social, content, CRO, etc.) to piece together the story again. I won’t lie, that’s a stretch (even to this writer). However, there’s a nugget of truth here: this assault on organic data has forced us to become better marketers.
2013 and Beyond
The more Google ties its own products (i.e. Gmail, Places, Google Drive, Insights & Trends) to its social platform (Google+) to create a fluid SUPER-DATA-HIVE, the more users must be signed in to interact. Luckily, Google+ hasn’t quite pulled off the interaction and engagement with users it hoped for (so far). Then add in Firefox moving to Google SSL search by default, and iOS 6 doing the same. What you get is a “black box” on organic data that is the size of Utah, and it is only going to get larger and more vacuous. It is their data to do with as they please, after all. The hypocritical precedent set a year ago by Google will continue onward under the banner of “user privacy”. They clearly don’t want to be viewed under the same lens as Facebook.
I think that by the end of 2013, organic site data will be reduced to drips from a leaky faucet as SSL search becomes the rule and not the exception. I can’t say what the end-game is here; whether it’s Google constructing a service to “buy back” organic data that they anonymize, or making SEOs piece together the puzzle from several different platform strands. Or, just so it can be said aloud, pushing every business into the AdWords platform to get “all the consumer data”. It certainly seems like that’s the objective with all these maneuvers: squeeze SEOs’ organic data into a corner so small that it becomes non-representative of overall searcher/consumer behavior.
It’s even more than looking at metrics to see where site performance excels or declines. The most important step, and perhaps the biggest hurdle, in analytics is the implementation on existing sites, especially on big sites. What I’m talking about specifically is goal implementation on a live site while maintaining the integrity and continuity of data.
There is a Difference
It seems simple enough: you track the funnel of each goal, URL by URL, and drop them in. Most times it’s never that simple. I’ve implemented analytics on both kinds of sites: big corporate sites and small and small-ish sites. No two are ever alike and no two ever do things the same way. There’s definitely a difference between implementing analytics on live big sites and on live small sites.
Small Site Goal Implementation Planning
With small sites, there’s a very good chance the site was never tracking anything to begin with. No analytics, period. And if they did have analytics in place, chances are the only thing they ever saw was traffic volumes, because either they had never implemented tracking on goal conversions, or they had no conversions to speak of. In this instance, it’s easy. You implement the goals where there were none before and let analytics go to work.
And even if the site was tracking goals prior, the chances of having to track more than one or two goals (outside of creating on-click events or virtual page views) are slim. So preparation will be limited, and one-to-one goal match-ups are easy to accomplish, as is maintaining the continuity of data on the site. To be cautious with small sites, you may want to export all the data as far back as it goes in case something does go wrong. In a worst-case scenario, you can piece the puzzle back together.
Large Site Goal Implementation Planning
It’s best to start off this section by being honest: I’ve screwed one up in the past. But it’s that screw-up that led me to write this post initially, and it was the catalyst to create a game plan for tracking implementation.
Large site goal implementation planning is a different animal altogether. These sites are complex, with lots of spinning wheels and cogs, and implementation is a formidable task to be sure. But there are ways you can make this project easier on yourself and make the implementation more manageable. Based on my own experience (both the failure and the successes), here’s how I like to go about breaking down the implementation:
- Take at least a year’s worth of data from the site for safekeeping. With big sites, a lot of things can happen that are simply beyond your control. You might not be left holding the bag should something go awry, but having the data in case something does go wrong is the only way to be able to stitch the puzzle back together.
If you’re entering the game late and you don’t have time to grab that data, I’d suggest that you dive into the analytics themselves. Reverse Goal Path is a wonderful way to track down the destination URLs of all the past goals. Moreover, assuming new profiles haven’t been created, you can set date ranges to see that past activity and make a record of it.
- Talk with the company point-person to make sure you have ALL the goals they want tracked.
Just looking through the site isn’t enough in this case. You may not find all the trackable goals on your own, you may not know the client wants certain events to be tracked, or you may be tracking something they don’t care about (wasting valuable goals). It sounds like common sense, but when you get into time deadlines and pre-launch mode, things get missed. And the last thing any of us wants to do is miss critical trackables.
- Examine all your client’s current goals.
I’ve run into a couple of situations where clients unknowingly had been “double-counting” goals, or incorrectly tracking them. The sole purpose of examining current goals is to leave no surprises for the client. When you implement anew, site conversion rates change (for better or worse). What you don’t want is to see the conversion rate for a goal or two plummet and have no explanation. This way you can prepare a client for what they may be likely to see once the double-counting stops.
- Map out current goals with new goals and match them up
This is the mistake I made: I didn’t do this. And it’s critical. Because many larger sites will be close to exhausting all 20 goals in a profile (if not utilizing a second profile for tracking), not making one-to-one match-ups where possible will crush the integrity of the data. While you may retain the continuity of the data, it’ll be worthless. Profiles live on forever, and as such, mismatched goals can, and likely will, wipe out the data integrity.
- Finally, discuss with your client how they want the goals implemented. Do they want new profiles, or to use existing profiles?
It’s a big decision that can affect how reporting is done, and communication is key here. As the SEO/SEM, you need to lay out the ramifications of each option. On one hand, if you create new profiles, the data continuity starts from the day you create them. On the other hand, if there isn’t much match-up between old goal tracking and new goal tracking, does it make sense to make the data mushy? Every situation will be different, so this is a conversation that should be had.
Hopefully this helps you start putting a game plan together when you are about to implement new goal tracking on both small and big sites. I’d love to hear your suggestions on how you game plan your analytics implementation.
Nearly every single SEO/SEM knows that using analytics is absolutely essential, but when you’re not measuring and uncovering the most valuable data, whether through analytics reporting or traffic segments, you’re not using the tool to your advantage.
Regular Expressions (RegEx) are one of the most powerful and least used tools in an analytics tool belt. It certainly walks the line of SEM geek-craft, but there will come a time when you need to pull out the pocket protector and thick, taped glasses to get the data you need. Here’s how Google defines a regular expression:
Regular Expressions are a set of characters you can use to match one or more strings of text. A key benefit of Regular Expressions is that they support wildcard matching, letting you capture a lot of variations (in URLs, for example) using a single string of characters.
The Basics of RegEx Characters: A Quick Guide
If you are familiar with or write advanced query operators for Google, Yahoo, or Bing, writing RegEx for analytics is along the same vein. Below are the most common characters I use when writing regular expressions in analytics reports to refine the data.
- ” ^ ” (caret): placing the caret before a letter or keyword anchors the match to the start of the string; it matches a position rather than a character
- Example: (^keyword) or (^www)
- ” $ ” (dollar sign): placing the dollar sign after a letter or keyword anchors the match to the end of the string; it matches a position rather than a character
- Example: (y$)
- ” | ” (pipe delimiter): used to string together multiple items into a series of options
- Example: (keyword|keyword1|keyword2|etc.)
- ” + ” (plus sign): matches one or more of the preceding character
- Example: (go+gle) matches “gogle”, “google”, “gooogle”, and so on
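Because GA’s regex flavor shares these basics with most regex engines, you can sanity-check each character in plain Python before pasting a pattern into a report filter. The keywords below are made up for illustration:

```python
import re

# Hypothetical keywords, as they might appear in an analytics keyword report.
keywords = ["keyword research", "best keyword tool", "buy widgets", "www.example.com"]

# " ^ " (caret): anchor the match to the START of the string.
starts_with = [k for k in keywords if re.search(r"^keyword", k)]
# → ["keyword research"]

# " $ " (dollar sign): anchor the match to the END of the string.
ends_with = [k for k in keywords if re.search(r"tool$", k)]
# → ["best keyword tool"]

# " | " (pipe): match ANY one of several alternatives.
either = [k for k in keywords if re.search(r"widgets|www", k)]
# → ["buy widgets", "www.example.com"]

# " + " (plus): one or more of the PRECEDING character ("to+l" = "tol", "tool", ...).
one_or_more = [k for k in keywords if re.search(r"to+l", k)]
# → ["best keyword tool"]
```

The same strings you test here can go straight into a GA filter box, since none of these four characters behave differently there.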
Why You Should Be Using Regular Expressions in Your Analytics
As an SEO/SEM you need to get granular. While Google does a good job of getting you high-level data in their standard reporting, it is a bit harder to dig deep if you don’t have your handy-dandy RegEx operators. Beyond that, there are times when the sites you work with will require RegEx in order to track goals properly. It’s an essential tool in your toolkit for measuring correctly, tracking SEO strategy progress, and measuring ROI.
Using Regular Expressions to Filter Keywords in GA
The keywords report is one of those live and die by reports in GA. Google is kind enough to let you segment by paid and organic, but if you want to get to the meat of your efforts, you’ll need a regular expression.
Keywords Before RegEx
Your initial keyword list will more than likely be dominated by branded keywords, either from users typing the website address into Google (yes, that still happens) or querying the corporation’s name or brand. That’s not very helpful if you need to find out whether your SEO/SEM strategy is actually working. For instance, have the onsite tweaks you made generated more traffic, or have the link building efforts you started on “Keyword X” begun boosting traffic and conversions?
The initial list you’re presented with from GA isn’t going to help you prove any of that. Using a RegEx made up of the caret (^), pipe delimiter (|), and possibly the dollar sign ($), you can create a filter that will get you to the heart of your organic traffic.
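As a sketch of what such a filter does (the brand names here are hypothetical), the same pattern you would paste into a GA keyword filter can be tried out in Python first:

```python
import re

# Hypothetical keyword report, heavy on branded queries.
keywords = [
    "acme", "acme widgets", "www.acme.com", "acme.com",
    "blue widgets", "buy blue widgets online", "widget comparison",
]

# Caret anchors the match to the start of the query;
# the pipe joins the branded variations we want to screen out.
brand_pattern = re.compile(r"^(acme|www\.acme)")

# Keep only the non-branded queries -- the ones that reflect SEO effort.
non_branded = [k for k in keywords if not brand_pattern.search(k)]
# → ["blue widgets", "buy blue widgets online", "widget comparison"]
```

In GA you would use that pattern in an “Exclude” filter on the keyword report; the survivors are the queries your optimization work actually earned.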
Regular Expressions to Track Dynamic Goals
On occasion it can’t be avoided: websites have dynamic URLs generated on the “Thank You” page. When this happens you can’t simply add in the thank-you page URL. You have to write a regular expression to capture all the goals.
Let’s say you have a “Thank You” URL that looks something like this: http://[primary domain]/thank-you/contact-us?sid=259. Using a RegEx that anchors to the static portion of the URL (^/thank-you/contact-us), we’ve effectively taken the dynamic element out of the goal conversion and are now able to track as accurately as possible. If the URL structure is more segmented and delineates more definitive sections, you can create multiple goals using the same anchored approach to combat the dynamic session id.
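As a sketch of that idea (the URLs and the session ids below are hypothetical), a goal pattern anchored with the caret matches every contact-us thank-you page no matter which session id the query string carries:

```python
import re

# Hypothetical goal URLs with a dynamic session id in the query string.
request_uris = [
    "/thank-you/contact-us?sid=259",
    "/thank-you/contact-us?sid=1084",
    "/thank-you/newsletter?sid=77",
    "/products/contact-us",
]

# Anchor at the start of the request URI: the dynamic sid is ignored,
# and pages that merely contain "contact-us" elsewhere do not match.
contact_goal = re.compile(r"^/thank-you/contact-us")

conversions = [u for u in request_uris if contact_goal.search(u)]
# → ["/thank-you/contact-us?sid=259", "/thank-you/contact-us?sid=1084"]
```

In GA, setting the goal’s match type to “Regular Expression match” with that pattern accomplishes the same thing, since the newsletter thank-you page can get its own anchored goal.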
If you’re concerned that dynamic portions of the URL helped to designate what section or what particular form, you can use “Reverse Goal Path” for each goal to see the exact path a user took through the site to complete the conversion. It will take more front-loaded legwork, but the peace of mind you’ll create for your client is worth the extra hour or two.
The Power to Slice Your Data Deeper
Those are two very common examples of how regular expressions can be used within your analytics reporting. With RegEx, don’t be afraid to fail, because you will. The trick is to keep testing your expressions until you get exactly the data you need.
There are great tools out there that will allow you to test your RegEx to make sure it works prior to implementing it on live analytics; Epik One’s Regular Expression Filter Tester is perfect for this task (it’s built specifically for Urchin 5 and Google Analytics). RegEx can help you slice your data any way you want; you just have to know the questions you want to ask.