When I first joined Moz as Principal Search Scientist in 2015, I held the same attitude towards web spam as most marketers: spam was a mostly harmless annoyance to webmasters and was directed primarily at search engines such as Google. However, with the launch of Link Explorer (our new link index) in 2018, I became acutely aware of the impact of link manipulation — not just on search engines, but on webmasters and marketers in ways not yet fully articulated.
The negative impact of link manipulation on the web
Link manipulation, for those not yet familiar, is the use of tactics to increase the number of links pointing to your website in order to achieve some goal (normally improved rankings). However, it turns out that link manipulation has played a darker, more sinister role over the past half decade or so, as Domain Authority (DA), a proprietary Moz metric that predicts a site’s ability to rank in Google, has become a virtual currency. Guest blog posts, comment spam, Private Blog Networks (PBNs), link farms, and directories across the board sell their wares based on how high their Domain Authority score is. As a result, some webmasters use link manipulation techniques to artificially inflate their DA, allowing them to fraudulently charge high prices.
It was a real and present problem, and we were uniquely suited to solve it.
How we adjusted Domain Authority to combat spammers
When my colleagues at Moz and I sat down to reconsider the Domain Authority metric, we knew we wanted it not only to predict search rankings (which is and will always be the primary intent), but also to address the artificial Domain Authority inflation causing harm across the web. We began by identifying the types of link manipulation regularly used to inflate Domain Authority. These methods included comment spam, link buying, and several varieties of link networks. Most importantly, we needed to devalue these techniques without impacting the predictive power of Domain Authority. We needed to rebuild DA from the ground up.
How did we teach DA to recognize link manipulation?
Historically, we built Domain Authority by applying Machine Learning to link data and search engine results pages (SERPs): The algorithm would learn what causes one domain to outrank another. Unfortunately, this methodology had its limits. Training against SERPs meant that the resulting metric would only learn to discriminate between sites that already rank — not sites that don’t rank at all. Since link manipulation often results in penalties or bans by search engines, the neural network had few bad examples from which to learn. To answer this, we seeded our training set with known unranked sites in position 10 on the SERPs. The neural network powering Domain Authority could now distinguish between sites that rank and those that don’t. We also introduced “profile variables” from which the neural network could learn. I’ll explain a few below:
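Before getting to those variables, the seeding idea can be sketched in a few lines. This is an illustration under assumptions, not Moz's actual pipeline: the function names, the example domains, and the choice to turn each SERP into (winner, loser) pairs are all hypothetical. It simply places a known unranked domain in position 10 of a SERP so that a learner sees at least one example of a site that does not rank.

```python
import random


def seed_serp_with_unranked(serp, unranked_pool, rng, position=10):
    """Return a copy of `serp` (ranked domains, best first) cut to
    `position` entries, with a known unranked domain in the last slot.

    Hypothetical helper: `position` is 1-indexed, and the real seeding
    procedure may differ.
    """
    candidates = [d for d in unranked_pool if d not in serp]
    if not candidates:
        return list(serp)
    seeded = list(serp[: position - 1])
    seeded.append(rng.choice(candidates))
    return seeded


def pairwise_examples(seeded_serp):
    """Yield (winner, loser) pairs: each domain outranks every domain below it."""
    for i, winner in enumerate(seeded_serp):
        for loser in seeded_serp[i + 1:]:
            yield winner, loser


rng = random.Random(0)  # fixed seed so the sketch is repeatable
serp = [f"site{i}.example" for i in range(1, 11)]    # made-up ranked results
unranked = ["spam-a.example", "spam-b.example"]      # made-up unranked sites
seeded = seed_serp_with_unranked(serp, unranked, rng)
pairs = list(pairwise_examples(seeded))
```

In this sketch the unranked domain only ever appears as the loser of a pair, which is the kind of "bad example" the original training data lacked.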
Spam Score as a Profile Variable
We created a unique variable that represents the distribution of links to your site based on the Spam Scores of those linking domains. The average website has a negative power relationship between Spam Scores and linking domains. That is to say, most of their links come from very low Spam Score domains, and very few of their links come from high Spam Score domains.
On the graph below, you can see the natural curve highlighted in green. You can also see examples of two sites which do not conform to this curve. They don’t conform because they are using a particular type of link manipulation tactic that tends to source domains that are slightly spammier than the average domain on the web. As a result, their link profiles stick out like a sore thumb.
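To make the idea concrete, here is a minimal sketch of how such a profile could be computed and compared against a reference curve. Everything in it is an assumption for illustration: the 0 to 100 Spam Score scale, the ten buckets, the exponent of the reference power curve, and the use of total variation distance as the deviation measure. The real variable fed to the neural network may look quite different.

```python
def spam_profile(spam_scores, n_buckets=10):
    """Fraction of linking domains falling in each Spam Score bucket.

    Assumes scores lie on a 0-100 scale (an assumption of this sketch).
    """
    counts = [0] * n_buckets
    width = 100 / n_buckets
    for score in spam_scores:
        idx = min(int(score / width), n_buckets - 1)
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_buckets
    return [c / total for c in counts]


def reference_curve(n_buckets=10, exponent=2.0):
    """A normalized negative power curve; the exponent is an arbitrary choice."""
    weights = [(i + 1) ** -exponent for i in range(n_buckets)]
    total = sum(weights)
    return [w / total for w in weights]


def deviation(profile, reference):
    """Total variation distance: 0 means identical, 1 means no overlap."""
    return 0.5 * sum(abs(p - r) for p, r in zip(profile, reference))


# Made-up link profiles: one Spam Score per linking domain.
natural = [2] * 60 + [15] * 20 + [25] * 10 + [35] * 5 + [55] * 3 + [85] * 2
manipulated = [2] * 20 + [25] * 30 + [35] * 40 + [45] * 10

ref = reference_curve()
natural_dev = deviation(spam_profile(natural), ref)
manipulated_dev = deviation(spam_profile(manipulated), ref)
```

With these made-up inputs, the profile concentrated in mid-range Spam Scores sits much further from the reference curve than the natural-looking one, which is the kind of signal a model could pick up on.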
Traffic and Linking Domains as a Profile Variable
We created another unique variable that represents the distribution of links to your site based on the traffic to those linking domains. Much like Spam Scores, the average website has a negative power relationship between traffic and linking domains. Most links come from low-trafficked sites, while only a handful of links come from high-trafficked sites. Any deviation from this pattern is a signal the neural network could use to identify link manipulation.
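One simple way to check for a negative power relationship is to bucket linking domains by order of magnitude of traffic and fit a straight line in log-log space. The sketch below does that with the standard library only. The bucket scheme, the function names, and the sample traffic figures are assumptions made for illustration, not the method Moz uses.

```python
import math


def traffic_bucket_counts(monthly_visits, n_buckets=6):
    """Count linking domains per order-of-magnitude traffic bucket.

    Bucket k holds domains with roughly 10**k to 10**(k+1) visits;
    anything larger lands in the last bucket.
    """
    counts = [0] * n_buckets
    for visits in monthly_visits:
        k = int(math.log10(visits)) if visits >= 1 else 0
        counts[min(k, n_buckets - 1)] += 1
    return counts


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(bucket rank).

    Empty buckets are skipped. A clearly negative slope is consistent with
    a negative power relationship; a flat or positive slope is not.
    """
    points = [(math.log(i + 1), math.log(c)) for i, c in enumerate(counts) if c > 0]
    if len(points) < 2:
        raise ValueError("need at least two non-empty buckets")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Made-up traffic figures: one value per linking domain.
natural_visits = ([5] * 400 + [50] * 100 + [500] * 44
                  + [5000] * 25 + [50000] * 16 + [500000] * 11)
manipulated_visits = [5] * 5 + [50] * 10 + [500] * 20 + [5000] * 40 + [50000] * 80

natural_slope = loglog_slope(traffic_bucket_counts(natural_visits))
manipulated_slope = loglog_slope(traffic_bucket_counts(manipulated_visits))
```

In this toy data the natural profile yields a strongly negative slope, while the profile dominated by high-traffic linking domains yields a positive one. A fitted slope like this is just one possible way to summarize the distribution as a single feature.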
With these novel variables and unique training set in place, we were confident the new Domain Authority would perform far better than it had previously. Not only did we establish the highest DA correlations with SERPs in recent history, we successfully devalued link manipulation in a number of industries:
- On average, the Domain Authority of link buyers dropped 15.7%
- High-, medium-, and low-quality auction domains dropped in DA by 61%, 95%, and 98%, respectively
- On average, link sellers lost 56% of DA
- Comment spammers lost 34% of DA
- Link networks lost 79% of DA
- Domainer networks lost 98% of DA
Why is this important to the marketing industry?
Well, there are several important outcomes delivered by the new Domain Authority.
- Webmasters can feel more comfortable that the Domain Authority metric is not being manipulated when considering content placement or domain acquisition.
- Unscrupulous SEO practitioners will have a harder time fooling customers with inflated DA.
- SEO practitioners who use dubious techniques will not be able to rely on increasing DA scores as customer KPIs.
- Webmasters can rely on Domain Authority as a strong indicator of potential rankings.
We’ve taken strides to ensure Domain Authority is used as a fair and trustworthy metric for digital marketers, hopefully doing our part to rid the web of spam while we’re at it. But there’s always more room for improvement, and we plan to keep innovating to provide the most valuable and reliable metrics for the digital marketing industry.