1. The Example Scenario
I’ve been reading up on SEO lately and have stumbled upon the concept of Private Blog Networks (PBNs). I thought I’d do a fun little exercise and show how Google can potentially identify a PBN as well as the sites that use it for link building. This is not a formal proof, just a quick writeup that introduces the idea and explores an example.
Let’s set the stage:
- we have a small/medium PBN network – say – 20 domains (PBN 1-20)
- we have 20 clients with 20 money sites (MS 1-20) who sign up for the service and start using the PBN to build links
- each client builds 10 links from within the PBN back to their money site
I’m using these numbers simply to drive the point home. They can of course be entirely different without affecting the simple technique of identifying the PBN domains.
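To make the setup concrete, here is a minimal sketch of the scenario as a tiny link graph. The function name `build_scenario`, the domain labels and the rule that each client picks its 10 PBN domains uniformly at random are my assumptions for illustration; the scenario itself only fixes the counts.

```python
import random

def build_scenario(num_pbn=20, num_clients=20, links_per_client=10, seed=0):
    """Return {pbn_domain: set of money sites it links to} for the toy setup."""
    rng = random.Random(seed)
    pbn_domains = [f"PBN{i}" for i in range(1, num_pbn + 1)]
    outlinks = {domain: set() for domain in pbn_domains}
    for client in range(1, num_clients + 1):
        # Assumption: each client places its links on distinct PBN domains,
        # chosen uniformly at random.
        for domain in rng.sample(pbn_domains, links_per_client):
            outlinks[domain].add(f"MS{client}")
    return outlinks

outlinks = build_scenario()
total_links = sum(len(targets) for targets in outlinks.values())
print(total_links)  # 200: 20 clients x 10 links each
```

Under that uniform-random assumption, two given PBN domains are both used by a particular client with probability (10/20) x (9/19) = 9/38. So two PBN domains that both link to MS1 would have, on average, 19 x 9/38 = 4.5 of the other Money Sites in common.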
Now – let’s see how difficult it would be for Google to identify some or all of these 20 PBN sites.
2. Identifying the PBN
Let’s now ask the $100 question: has Money Site 1 (MS1) been link built using a PBN?
The first step is to look at the link profile of MS1. Let’s only consider the more authoritative domains, since these are more likely to be PBN links. MS1 will have a number of these authoritative links, say 50, out of which 10 are from the PBN. At this point Google has no idea which of these are PBN links, if any.
What they do know however is the full link graph. Looking at these 50 linking domains we can analyze the sites they’re linking out to. Our MS1 is of course among them, along with some of the other Money Sites – MS 2-20. Not all of them, but enough, since our PBN has 20 sites and probably a lot more than 20 clients.
At this point, all of these 50 domains are potentially part of the PBN (some of them actually are), so all we need to do is identify which is which. Let’s do that.
The signal we’re going to use is the intersection of the sites these 50 domains are linking out to, which lets us identify which domains look natural and which don’t. The criteria are simple:
- a non-PBN site that links to MS1 naturally is very unlikely to point to any of the other 19 Money Sites
- a PBN site on the other hand will have a sizable intersection since it does link out to some of the Money Sites
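Let’s look at these 2 criteria in depth.
Why is a non-PBN site not going to have any intersection with the other non-PBN sites among the 50 domains? First, the Money Sites are in entirely different niches, so they wouldn’t naturally get links from the same sites. Second, these sites are small(ish): they’re not Wikipedia and they’re not the Huffington Post, so the chance that they get common links naturally is similarly very small.
A PBN site, on the other hand, links out to several clients by design. In our scenario the 200 client links are spread over 20 PBN domains, so each domain points to 10 Money Sites on average, and any two PBN domains are very likely to share some targets.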
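The intersection signal can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google actually does it: the function `shared_target_scores`, the `.example` domains and the hand-made link data are all hypothetical. For every domain linking to MS1, it counts how many of that domain's other outbound targets are also linked from at least one other domain in the same set.

```python
from collections import Counter

def shared_target_scores(outlinks, money_site):
    """Score each domain linking to money_site by its number of shared targets."""
    linkers = {d: t for d, t in outlinks.items() if money_site in t}
    # For each target, how many of the linking domains point to it.
    popularity = Counter(t for targets in linkers.values() for t in targets)
    return {
        domain: sum(1 for t in targets if t != money_site and popularity[t] > 1)
        for domain, targets in linkers.items()
    }

# Hypothetical toy data: three PBN domains and two natural sites link to MS1.
outlinks = {
    "pbn-a.example": {"MS1", "MS2", "MS3"},
    "pbn-b.example": {"MS1", "MS2", "MS4"},
    "pbn-c.example": {"MS1", "MS3", "MS4"},
    "blog-x.example": {"MS1", "cooking-site.example"},
    "news-y.example": {"MS1", "garden-site.example"},
}

scores = shared_target_scores(outlinks, "MS1")
for domain, score in sorted(scores.items()):
    print(domain, score)
```

In this toy data the three PBN domains each score 2 and the two natural sites score 0. A real link graph would need extra care for targets that everybody links to (the Wikipedias of the web), which this sketch ignores.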
3. Wrap-up
So – at this point we have identified two things:
- what the PBN sites are: out of the 50 domains, the ones that link to the same sites
- what the Money Sites are: the common sites those domains point to
Of course little on the web is black and white and, depending on what the linking patterns look like, sections of the PBN or the whole network can be identified. But the signal is there and clear as day. With enough link building, there’s no reason for any of the PBN sites to slip by.
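As a rough sketch of how both sets fall out of the same pass, and of why one starting point may only reveal a section of the network, consider the snippet below. The function name `split_pbn_and_money_sites`, the `min_shared` threshold and the toy data are my own assumptions, chosen only to illustrate the idea.

```python
from collections import Counter

def split_pbn_and_money_sites(outlinks, money_site, min_shared=2):
    """Return (suspected PBN domains, suspected money sites) seen from money_site.

    min_shared is a hypothetical threshold: a target counts as shared when at
    least that many of the domains linking to money_site also point to it.
    """
    linkers = {d: t for d, t in outlinks.items() if money_site in t}
    popularity = Counter(t for targets in linkers.values() for t in targets)
    shared = {t for t, n in popularity.items() if n >= min_shared and t != money_site}
    suspects = {d for d, targets in linkers.items() if targets & shared}
    money_sites = (shared | {money_site}) if suspects else set()
    return suspects, money_sites

# Hypothetical toy data: pbn4 never links to MS1, so MS1 alone cannot expose it.
outlinks = {
    "pbn1.example": {"MS1", "MS2", "MS3"},
    "pbn2.example": {"MS1", "MS2", "MS4"},
    "pbn3.example": {"MS1", "MS3", "MS4"},
    "pbn4.example": {"MS2", "MS3", "MS4"},
    "blog.example": {"MS1", "recipes.example"},
    "forum.example": {"MS1", "hiking.example"},
}

from_ms1, money_sites = split_pbn_and_money_sites(outlinks, "MS1")
from_ms2, _ = split_pbn_and_money_sites(outlinks, "MS2")
print(sorted(from_ms1))             # pbn1, pbn2, pbn3
print(sorted(money_sites))          # MS1 through MS4
print(sorted(from_ms1 | from_ms2))  # all four PBN domains
```

Starting from MS1 finds only the three PBN domains that link to it. Repeating the same pass from one of the recovered Money Sites (MS2 here) picks up the remaining one.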
Hope this quick writeup sheds some light on the risk of using PBN links versus relevant editorial links. If you’re aiming to build a long-term site and asset – using a PBN may be something you want to stay clear of.
Nice! It is about time someone pointed out just how fragile these PBNs are. I am sure it is only a question of time before Google comes up with a formula which knocks out 95% of the private PBNs. Most niche type sites that I look at and are doing well can easily be evaluated manually to see if they are using a PBN.
Hey Neale – yeah, I do think it’s not very difficult to analyze a site and figure out if it’s using a PBN or not once you start with the full link graph. That being said, it’s a very risky strategy for an authority site you’re aiming to grow into a real asset. If you’re growing a niche site that you don’t have big plans for, the risk is probably acceptable. Cheers,
Eugen.
I don’t see how you’re going to spot a PBN if it blocks bots… (Ahrefs, Majestic, Moz)
Well, we’re not talking about Moz or Ahrefs – these may indeed be blocked. Google however has the whole picture – since the Google crawler is not blocked – and so they can perform the analysis with full accuracy. Cheers,
Eugen.
Interesting thoughts. I know that platforms like reddit do similar things to discover vote-rings. Banks do similar things to discover fraud-rings. That’s where graph databases and graph analyses really shine. One thing I’m wondering about is whether Google can also link content to their authoring entities. E.g. can Google discover whether: 1. An author has written this AND that content (e.g. blog, Stack Overflow, etc.) 2. An author belongs to this OR that bigger entity (e.g. employer) Some information is obviously known to Google via OAuth (i.e. “login with Google”), but I’m sure there are other means to discover author…
It’s standard practice and quite effective – I’m surprised they haven’t taken action before. However – a few days ago – they have and in a big way.
I think Google can tell who the author of something is by also using the metadata (rel publisher, author) mostly. However, in the case of a PBN – there’s no way to know who the author is. As for SEO – author is supposedly gaining weight but … how much? It’s not the kind of thing that has an easy answer.
Yep, there’s certainly no easy answer. And you only have one shot at experimenting before you “kill” your real identity 🙂