One thing I like to do when doing a competitive analysis for a client is rifle through their backlinks vis-a-vis their competitors’ and reverse engineer their competitors’ marketing strategies. But sometimes that means extracting text from another column of data to organize your data for maximum pivoting. Since this is a common task I do a lot, I’m going to show you how to pull domains out of a list of URLs using a combo meal of the LEFT and SEARCH functions.
First Batter Up: LEFT Function
The LEFT function follows the following structure:
LEFT( text, [number_of_characters] )
text: For this you can either put text in quotation marks inside the formula or point to the cell where the text is. Most people choose the latter. It’s much more scalable.
number_of_characters: So anytime you see something in brackets like this, it means that parameter is optional. In the case of the LEFT function, I think it’s kind of silly to make the number of characters optional. If you leave it out, Excel will just grab one character. Can’t really think of many examples that would be useful, but knock yourself out if you find one. If you have a column of data and the text has the same number of characters in each cell, you’re in luck. You can just enter the number of characters.
Most of us aren’t that lucky though, especially Internet marketers who have to do messy tasks like extract the domain from a group of links from different domains. In such occasions the SEARCH or FIND functions are really helpful.
SEARCH vs FIND Functions in Excel
I like SEARCH over find because it’s not case sensitive (and FIND is). Plus, SEARCH allows you to use wildcards and FIND doesn’t. One of my fave trainers, Mike Girvin, did a YouTube video demonstrating the differences between the two. I can only think of one time I needed to use FIND (because I needed it to be case sensitive). But in this case either will do.
Next Up: SEARCH Function
So what the SEARCH and FIND functions do is return the position of a character you search for. This works swimmingly with the LEFT function because you can find a character’s position and return that for the number of characters you want to extract. If you need to stop at the character before the one you can isolate using the SEARCH function, you can subtract 1 (or however many you need) from the value returned by the SEARCH function. Or you might need to include the character(s) just after it; in that case you would just add 1 to the value.
The SEARCH function follows the following structure:
SEARCH( substring, string, [start_position] )
substring: Pure, unbridled geek speak that means whatever you’re searching for (we’re going to use “/”)
string: Typically the cell this text string is in, though you could enter text as long as you flank it with quotation marks
start_position: This is optional because oftentimes you’ll start looking for your character in the beginning of the string. In our case, the two forward slashes at the beginning of each URL (e.g., http:// and https://) prove to be problematic. So what we’ll need to do is tell Excel to start counting at the 9th character (because http:// has 7 characters and https:// has 8).
When I’m finished the formula looks like this:
Note: If you’re doing this in a formatted table, Excel will used structured references instead. That’s what I always do. One other little benefit is you don’t have to double-click the top cell to send the formula down. It gets applied to the column automatically.
I just didn’t use a table for this tutorial because the structured references can make the formulas look unnecessarily complicated when you’re just learning.
Watch It In Action
Still a little squeamish? Download the workbook and check out the video.
Stupid Feedback Noise
In quiet spots in the video I get feedback. The only solution I’ve had any success with is to unplug the cord from my PC, but I forgot to do that. Sorry.
How To Use This Data
In my next post I’ll show you how you can then take this data, pop it into a pivot table, and analyze it to
steal glean from all of your competitors’ best ideas.