Regex to Know for Screaming Frog and Google Analytics

Every marketer knows that data-driven campaigns are the key to better audience insights and increased conversions. But what happens when you’re drowning in too much data? With 51% of marketers saying they want to make more data-driven decisions, it’s clear that they have an uphill battle to make meaning from this data.

To make sense of this immense mound of information, marketers are increasingly taking notes from the programming department. The answer to effectively sifting through data is simple: regex.

Regular expressions, or regex, are an incredibly useful tool. But since most marketers don’t have a programming background, many haven’t heard of regex.

Regex is a pattern-matching syntax that helps you sift through data, and it gives you far more flexibility than a plain text search. Stop weeding through pages of URLs in Google Analytics, or scrolling for 30 minutes to find a URL in Screaming Frog. Regex helps you find the needle in a haystack in mere seconds.

There’s a learning curve to regex, but it can ultimately save hours of time by returning searches more quickly and helping you run more effective reports.

Use regex to find patterns and matches to save more time and increase accuracy. Let’s take a page from programmers by implementing regex in Google Analytics and Screaming Frog.

Regex basics
Before we can go in depth about regex for Google Analytics and Screaming Frog, we have to first understand how to write regex.

While we cover the basics of regex here, keep in mind that the best way to learn is to dive in and get your hands dirty. Use tools like Regex101 to test your regex before putting it into your system.

This guide covers 13 core regex expressions. You can combine them to drill down and create custom expressions, too.

Pipe: |
The pipe expression means ‘or.’ Let’s say you’re searching for data that contains certain keywords. If you’re looking for data on hot dogs OR hamburgers, you would write the expression below. Leave out the spaces around the pipe, because a space counts as a character to match:

hotdogs|hamburgers
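
To see the pipe in action, here is a minimal Python sketch. The page titles are made up for illustration, and Python’s re module stands in for whatever tool you end up using.

```python
import re

# Hypothetical page titles, used only for illustration.
titles = ["grilled hotdogs", "veggie hamburgers", "garden salad"]

# No spaces around the pipe: a space would become part of the pattern.
pattern = re.compile(r"hotdogs|hamburgers")

# Keep every title that contains either keyword.
pipe_matches = [t for t in titles if pattern.search(t)]
print(pipe_matches)
```

Only the titles containing one of the two keywords are kept.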

Dot: .
Think of Dot as a plug. You can plug it in to fill in a single missing character. For example, if you’re looking for items numbered 10 – 19, you could use the regex 1. to pull that data. Keep in mind that Dot matches any character, not just digits, so 1. would also match 1a. This regex command is more powerful when you combine it with other expressions.
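
A quick Python sketch shows what 1. picks up. The item labels are hypothetical, and the sketch assumes the whole label has to match the pattern.

```python
import re

# Hypothetical item labels, used only for illustration.
items = ["10", "15", "19", "1a", "20", "1"]

# "1." means a 1 followed by exactly one character of any kind.
pattern = re.compile(r"1.")

# fullmatch requires the entire label to fit the pattern.
dot_matches = [i for i in items if pattern.fullmatch(i)]
print(dot_matches)
```

Notice that 1a comes back alongside 10, 15, and 19, because the dot is not limited to digits.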

Asterisk: *
Asterisks match zero or more of the previous character. So, if you wrote pot*ato, you would pull results for poato, potato, and pottttttato.
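
Here is the potato example as a small Python sketch. The word list is invented for illustration.

```python
import re

# Hypothetical words to test against the pattern.
words = ["poato", "potato", "pottttttato", "tomato"]

# "t*" allows zero, one, or many t characters at that position.
pattern = re.compile(r"pot*ato")

asterisk_matches = [w for w in words if pattern.fullmatch(w)]
print(asterisk_matches)
```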

Dot-asterisk: .*
The dot-asterisk helps you match zero or more of any character.

Let’s say you have an online cupcake shop. If you use the regex /products/.*cupcakes/, you can pull all categories matching this expression. It’s an easy way to compile all of your product pages into one easy-to-understand view.
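
As a rough sketch, this is how the cupcake pattern would behave against a handful of URL paths. The paths are hypothetical and stand in for your own site structure.

```python
import re

# Hypothetical URL paths for an online cupcake shop.
paths = [
    "/products/chocolate-cupcakes/",
    "/products/vegan/vanilla-cupcakes/",
    "/products/cookies/",
    "/blog/cupcakes/",
]

# ".*" fills in whatever sits between "/products/" and "cupcakes/".
pattern = re.compile(r"/products/.*cupcakes/")

cupcake_pages = [p for p in paths if pattern.search(p)]
print(cupcake_pages)
```

The blog URL is skipped because it never passes through /products/.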

Backslash: \
There’s one problem with regex: the special characters used in the commands are also used in site data or URLs. That can cause confusion when pulling data, and that’s why the backslash command is available. Backslash keeps you from translating legitimate characters into mistaken regex commands.

It’s most helpful when pulling IP addresses. Simply add a backslash before each dot (\.) to indicate that it is a literal period, not the Dot regex command.
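
As a minimal sketch, assuming a made-up address of 192.168.0.1, the Python below compares the same pattern with and without backslashes. The second test string is deliberately wrong to show what an unescaped dot lets through.

```python
import re

# Hypothetical address plus a look-alike string that is not an IP at all.
candidates = ["192.168.0.1", "192x168y0z1"]

# Without backslashes, each dot matches any character.
unescaped = re.compile(r"192.168.0.1")
# With backslashes, each dot must be a literal period.
escaped = re.compile(r"192\.168\.0\.1")

unescaped_hits = [c for c in candidates if unescaped.fullmatch(c)]
escaped_hits = [c for c in candidates if escaped.fullmatch(c)]
print(unescaped_hits)
print(escaped_hits)
```

The unescaped version accepts both strings, while the escaped version accepts only the real address.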


Caret: ^
A caret means you only want to pull data starting with certain text. If you type ^cupcake, then the regex will only pull data starting with the word cupcake.

Dollar sign: $
The dollar sign is the opposite of the caret; it means ‘ends with.’ You can type /cupcake$ to search for a URL that ends after the word cupcake, and no other versions of the URL.
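
The caret and dollar sign are easiest to understand side by side. This Python sketch runs both against a few hypothetical page values.

```python
import re

# Hypothetical page values, used only for illustration.
pages = ["/cupcake", "/cupcake/recipes", "/best/cupcake", "cupcake-news"]

# "^cupcake" only matches values that begin with "cupcake".
starts_with = [p for p in pages if re.search(r"^cupcake", p)]

# "/cupcake$" only matches values that end right after "/cupcake".
ends_with = [p for p in pages if re.search(r"/cupcake$", p)]

print(starts_with)
print(ends_with)
```

Note that /cupcake does not count as starting with cupcake, because its first character is a slash.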

Question mark: ?
A question mark means that the preceding character is optional. It’s often used to find misspellings. You would normally combine it with other regex, like the Pipe:

cupp?cakes|cuu?pcakes
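
Here is a small Python sketch of that misspelling pattern, written in lowercase and without spaces around the pipe. The word list is hypothetical.

```python
import re

# Hypothetical spellings found in site copy.
words = ["cupcakes", "cuppcakes", "cuupcakes", "cupcake"]

# "p?" and "u?" make the extra letter optional in each alternative.
pattern = re.compile(r"cupp?cakes|cuu?pcakes")

spelling_matches = [w for w in words if pattern.fullmatch(w)]
print(spelling_matches)
```

The correct spelling and both misspellings match; cupcake without the s does not.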

Parentheses: ()
Remember PEMDAS from math class? It works the same way in regex. With parentheses, you’re grouping like items together.

If you wrote ^/products/(cupcakes|cakes)/$, it means you would pull the product pages for either cupcakes or cakes.
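
A short Python sketch of the grouping idea follows. It assumes URL paths begin with a slash, and the paths themselves are hypothetical.

```python
import re

# Hypothetical URL paths, assuming each path starts with a slash.
paths = [
    "/products/cupcakes/",
    "/products/cakes/",
    "/products/cookies/",
    "/products/cakes/vanilla/",
]

# The parentheses limit the "or" to just the category name.
pattern = re.compile(r"^/products/(cupcakes|cakes)/$")

category_pages = [p for p in paths if pattern.search(p)]
# group(1) returns whichever option inside the parentheses matched.
categories = [pattern.search(p).group(1) for p in category_pages]
print(category_pages)
print(categories)
```

Without the parentheses, the pipe would split the whole expression in two, which is rarely what you want.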

Square brackets: []
Square brackets match any one character from a simple list. You can write the regex s[iou]p to match the words sip, sop, and sup. They are best paired with dashes.

Dashes: -
If you need a more advanced list, use dashes within square brackets. Use [a-z] to match on all lowercase letters, [A-Z] for uppercase matches, and [0-9] for number matches.

This is particularly helpful if your items are dated or chronicled. For example, if you ran a magazine, you could search for Marketing Magazine 201[0-9] to search for data ranging from 2010 to 2019.
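
The two bracket examples can be sketched in Python like this. The words and issue titles are made up for illustration.

```python
import re

# Hypothetical words and magazine issue titles.
words = ["sip", "sop", "sup", "sap"]
issues = [
    "Marketing Magazine 2009",
    "Marketing Magazine 2014",
    "Marketing Magazine 2019",
    "Marketing Magazine 2020",
]

# [iou] accepts exactly one of the listed letters.
list_matches = [w for w in words if re.fullmatch(r"s[iou]p", w)]

# [0-9] accepts any single digit, so 201[0-9] covers 2010 through 2019.
decade_matches = [i for i in issues if re.search(r"Marketing Magazine 201[0-9]", i)]

print(list_matches)
print(decade_matches)
```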

Plus sign: +
The plus sign isn’t used very often, but it’s worth a mention. You can use it to match one or more of the previous character. The regex cupcakes+ would match cupcakes, cupcakess, cupcakesss, and so on.
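
A quick Python sketch makes the difference from the asterisk clear: the plus sign needs at least one of the character. The word list is hypothetical.

```python
import re

# Hypothetical spellings to test.
words = ["cupcake", "cupcakes", "cupcakess", "cupcakesss"]

# "s+" requires one or more s characters at the end.
pattern = re.compile(r"cupcakes+")

plus_matches = [w for w in words if pattern.fullmatch(w)]
print(plus_matches)
```

The singular cupcake is left out because it has no s at all.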

Curly brackets: {}
Curly brackets specify how many times the preceding character or group repeats. This expression is great for searching through numbers, like looking up IP addresses.

The regex with curly brackets might look like:

^12\.345\.678\.[0-9]{1,2}$

In layman’s terms, this would help you look for IP addresses 12.345.678.0 to 12.345.678.99.
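
One way to write a pattern for that range is sketched below in Python. The pattern itself is an assumption, built from the placeholder addresses in the text, and the test strings are invented.

```python
import re

# Placeholder addresses modeled on the range described in the text.
addresses = [
    "12.345.678.0",
    "12.345.678.42",
    "12.345.678.99",
    "12.345.678.100",
]

# [0-9]{1,2} means one or two digits; the backslashes keep the dots literal.
pattern = re.compile(r"^12\.345\.678\.[0-9]{1,2}$")

range_matches = [a for a in addresses if pattern.search(a)]
print(range_matches)
```

The address ending in 100 is rejected because its last block has three digits.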

Remember to practice your regex expressions first. It can take some time to master these expressions, but it’s well worth the investment when you can pull data from Screaming Frog or Google Analytics at warp speed.

Let’s look at how you can apply regex when working in Screaming Frog and Google Analytics.

Regex for Screaming Frog
Screaming Frog is a fantastic resource for SEO and marketing. However, if you’ve ever used this website crawler, you know that it can pull an overwhelming amount of data. Don’t spend hours digging through data. Use regex to find the information you need in mere minutes. Check out these common uses of regex in Screaming Frog.

Correct misspellings or update text
Most companies update spellings or the naming conventions of their products over time. Because of that, they often need to go back and change spellings. Don’t do this manually! Regex can help you identify where language needs to be changed.

If you need to update a term or name, search for the regex (example|Example|EXAMPLE), replacing ‘example’ with the incorrect versions of the term that need fixing.

You can also use this regex for spell check. Use the Pipe regex to search for commonly misspelled words. Remember to list the misspellings themselves, not the proper spellings of the word. For a list of common misspellings, check out this resource from Oxford Dictionary.

An example of this regex would be:
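
As a minimal sketch, assuming three common misspellings (recieve, seperate, definately) chosen purely for illustration, a spell-check pattern could be tested in Python like this. The page text is hypothetical, and the loop only imitates a search across crawled pages.

```python
import re

# Hypothetical page copy keyed by URL path.
pages = {
    "/about": "We recieve orders daily.",
    "/faq": "Shipping is a separate charge.",
    "/blog": "This is definately our best recipe.",
}

# List the misspellings themselves, joined with pipes.
pattern = re.compile(r"recieve|seperate|definately")

# Collect the pages whose copy contains any of the misspellings.
flagged = [url for url, text in pages.items() if pattern.search(text)]
print(flagged)
```

The FAQ page is not flagged because separate is spelled correctly there.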


Exclude feature
Screaming Frog pulls a massive amount of data. Save time by preventing the crawler from accessing certain pages with the Exclude feature. For example, if you want to exclude your thank you page, use the Exclude feature with this regex:
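
As a rough sketch, assuming the thank you page lives at a hypothetical /thank-you/ path and that the pattern is compared against the full URL, an exclude pattern could be tried out in Python first. This only imitates the filtering idea and is not how Screaming Frog itself is implemented.

```python
import re

# Hypothetical URLs on an example domain.
urls = [
    "https://example.com/",
    "https://example.com/thank-you/",
    "https://example.com/thank-you/?order=123",
    "https://example.com/products/",
]

# Assumed exclude pattern: any URL that contains "/thank-you/".
exclude = re.compile(r".*/thank-you/.*")

# Keep only the URLs that the exclude pattern does not cover.
to_crawl = [u for u in urls if not exclude.fullmatch(u)]
print(to_crawl)
```

Both versions of the thank you URL drop out, leaving the home page and the products page.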


Include feature
The opposite of Exclude is Include, which can be used to narrow your search to specific pages. This is perfect if you have a large site or an eCommerce site with complex URLs.

For example, you can search for URLs with the word ‘shoes’ in them by using the regex .*shoes.* in the Include feature.

If you use Screaming Frog to pull any data, you need to be familiar with regex. It can not only save time but also prevent errors by pulling only the data you need.

Let’s take a look at how you can also use regex when analyzing your site traffic in Google Analytics.

Regex for Google Analytics
Google Analytics is one of the most popular analytics platforms for marketers. It’s a fairly intuitive system, although, like Screaming Frog, Analytics pulls in a mountain of data. Use regex to minimize the time spent pulling information.

Filter data
Filter out certain data in Google Analytics using regex. For example, you can view certain page URLs with the regex ^/cupcakes. You can also set up a filter that will hide visits made from your own IP address, showing only data from your audience visits.

Set goals
Goals in Google Analytics help you measure conversions. Regex lets you have a bit more control over what counts as a goal. In the Destination area of your goal, enter your regex command. If you want to measure thank you page conversions, you might use ^/thank-you/$ for an exact match, or ^/thank-you/ if you also want to count URLs that carry query parameters or order IDs.
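
The difference between the two goal patterns is easy to check before you save the goal. This Python sketch uses hypothetical page paths and simply compares what each pattern would count.

```python
import re

# Hypothetical page paths as they might appear in a report.
paths = ["/thank-you/", "/thank-you/?order=123", "/blog/thank-you/"]

# Exact match: the path must be "/thank-you/" and nothing more.
exact = re.compile(r"^/thank-you/$")
# Prefix match: the path must start with "/thank-you/".
prefix = re.compile(r"^/thank-you/")

exact_goals = [p for p in paths if exact.search(p)]
prefix_goals = [p for p in paths if prefix.search(p)]
print(exact_goals)
print(prefix_goals)
```

The blog URL is ignored by both patterns because the caret pins the match to the start of the path.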

Create custom segments
Data-driven marketing is all about understanding your audience. Use regex to segment your site visitors into measurable data points. Analytics shows all of your sessions by default. Use regex to drill down and see which personas are visiting which URLs.

Regex has so many applications for Google Analytics. It’s tempting to set up regex all over your account to make things run more smoothly. But remember: keep regex as simple as possible, and always test your regex before executing it. Regex is meant to make your life simpler, not cause unnecessary headaches.

The bottom line
As the lines blur between coding and marketing, marketers have to step up to the plate. Take a page from programmers to stay ahead of the game. The most successful marketing departments will embrace tactics like regex to find the data they need more quickly, resulting in faster, better data-driven decisions.
