Archive for the 'Marketing Insights' Category

Blogs Worth Reading

Monday, December 15th, 2008

I’ve never done a round-up of the blogs I read before, which I guess is a bit selfish. So, in no particular order (and this isn’t a complete list) some of my favourite blogs, if you’re looking for some inspiration.

Dark SEO Programming is run by Harry. As he puts it, “SEO Tools. I make ’em”. A great guy if you need help with coding and somewhat of a captcha guru, with a sense of humour. Definitely worth keeping up with. I wouldn’t be surprised if this guy starts making big Google waves in the next few years.

Ask Apache is a blog I absolutely love. Great, detailed tutorials on script optimisation, advanced SEO and mod_rewrite. AskApache’s blog posts are the kind of ones that live in your bookmarks, rather than your RSS Reader.

Andrew Girdwood is a great chap from BigMouthMedia I met last year (although I very much doubt he remembers that). Andrew seems to be a vigilante web bug hunter. What I like about his blog is that he is usually the first to find weird things with Google that are going down. This usually gets my brain rolling in the right direction of my next nefarious plan. ^_^

Blackhat SEO Blog run by busin3ss is always worth checking out. He was even kind enough to give me a pre-release copy of YACG mass installer to review (it’s coming soon – I’m still playing!). Apart from his excellent tools, his blog features the darker side of link building, which of course, interests me greatly.

Kooshy is a blog run by a guy I know, who.. Well I think he wants to remain anonymous (at least a little). He’s just got started again after closing down his last blog and moving Internet personas (doesn’t the mystery just rivet you?). Anyway, get in early, I think we can expect some good stuff from here. He’s already done a cool post on Pimpin’ Duplicate Content For Links.

Jon Waraas is run by.. Can you guess? Jon has something that a lot of even really smart Internet entrepreneurs are missing: good old-fashioned elbow grease. This guy is a workaholic and it pays off in a big way. Apart from time-saving posts on loads of different ways to monetise your site, build backlinks and flush out your competitors, I get quite a lot of inspiration from his constant stream of effort and ideas. I could definitely take a leaf out of his work ethic book.

Blue Hat SEO is becoming one of the usual suspects really. If you’re here, you probably already know about Eli. Being part of my “let’s only do a post every few months club”, I love Eli’s blog because there is absolutely no fluff. He gets straight down to the business of overthrowing Wikipedia, exploiting social media and answering specific SEO questions. You’ll struggle to find higher quality out there.

SEO Book is probably the most “famous” blog I’m going to mention here. Aaron started off at a disadvantage, because to be honest, I thought he was a massive waste of space for quite a while. (I guess that’s what happens when you spend your SEO youth on Sitepoint listening to the people with xx,xxx posts on there). I bought his SEO Book and for me, at least, it was way too fluffy. I’m pleased he’s started an SEO training service now as it represents much better value. I’m sure he was making a lot of money from his SEO Book, but perhaps milked it too long (like I probably would have). Anyway, I kept with his blog and I’ve been impressed with his attitude and posts. He’s done some really cool stuff, like the SEO Mindmap and more recently, a keyword strategy flowchart which would be useful for those looking for a more structured search approach. He’s also written about algorithm weightings for different types of keywords and of course has some useful SEO Tools.

Slightly Shady SEO – Great name, great blog. Although XMCP will probably take it as an insult, I’ve always regarded Slightly Shady as the blog most similar to mine on this list. Maybe it’s because I wish I’d written some of the posts he has, before he did, hehe. Again, a no BS approach to effective SEO, whether he’s writing about Google’s User Data Empire, hiding from it or site automation it’s all gravy.

The Google Cache is a great blog for analytical approaches to SEO. There are some awesome posts on Advanced Whitehat SEO and using proxies with search position trackers. I like.

SEOcracy is run by a lovely database overlord called Rob. Rob’s a cool guy, he was kind enough to donate some databases to include in the Digerati Blackbox a while back. Most of his databases are stashed away in his content club now, which is well worth a look in. He’s also done some enlightening posts on keyword research, stuffing website inputs and Google Hacking.

This is all I’ve got time for now, apologies if I’ve missed you. There may be a Part II in the near future.

Posted in Affiliate Marketing, Approved Services, Black Hat, Blogging, Digerati News, Google, Grey Hat, Marketing Insights, Research & Analytics, Search Engine Optimisation, Social Marketing, Splogs, Viral Marketing, White Hat, Yahoo | 7 Comments

Blackhat SEO Tools & Scripts – The Digerati Blackbox

Thursday, June 12th, 2008

buenos dias, friends!

I’ve put together a little treat for all of you budding and new blackhats out there. I got quite annoyed this week with the whitehattards on Sphinn.

Those of you who actually know me will know I believe whitehat stuff is very important to building a web business. However, I also believe there is a strong case for at least experimenting with gray/blackhat (whatever you want to call it). There are some markets you literally cannot touch without getting off your rainbow shitting whitehat unicorn of light. Unfortunately, there’s a lot of, erm, “dedicated” whitehats out there that refuse to even learn what blackhat is. I’d like to take this opportunity to dispel some myths (AKA venting) about blackhat. For those of you who don’t enjoy reading pissed off rants (I believe the whitehat word for pissed is “snarky” – Thanks Matt.C), feel free to skip down the page to the goodies.

Things that whitehattards believe to be true:

1. That “on page” SEO is some uber-skill which takes years to learn.

False. If you actually get a good web developer, the chances are he (or she!) will make a decent crawlable website. You might be able to help them out with some keyword research to help target title/header tags, or give them a little advice on PR sculpting for large sites with nofollow. Good internal linking structures are pretty much common knowledge – at least with the web developers I know. If any pure whitehat starts talking about precise keyword density, just laugh in their face.

2. The main thing about SEO is creating good content.
Good content gets links, yes. Well done. Why are you doing SEO when so many crimes are going unsolved around the world? Good content is important for a whitehat site, yes. However, good content is not bloody SEO! How do I know this? Would you bother writing good content if search engines didn’t exist? Yes, you would. Therefore it is actually a component of web design, not SEO!

3. There’s no point in blackhat, you’ll just get banned.
This little corker comes from two types of people. Firstly, people who have never tried blackhat (glad they’re qualified to comment, why not go give a lecture on brain surgery while you’re at it). Or, secondly, people who have tried some very, very basic blackhat, done it badly and left footprints like a crack-addicted yeti storming around the web. I know of many blackhat sites that have enjoyed top positions for years without getting caught, for competitive key phrases those whitehats couldn’t touch with a NASA sized hard drive full of great content.

4. I’m a good whitehat SEO because “I know” where to get links from
Aww now, c’mon. Not really a “core” SEO skill is it? I’ll give it to you, that it helps. I think what you’re trying to say is “I understand how the web works and where it is possible to drop links” or “I use social news/community sites”. I know people who have never built a link in their life and would make great whitehat SEO link builders because they spend ages writing content for blogs and taking part in Digg, Reddit, Stumble, blaahh, blahhh. At best, it’s a transferable skill.

5. Blackhat SEOs only resort to blackhat because they can’t produce good websites
This one (which I saw several times on Sphinn), just leaves my jaw dropped. Generally, blackhats are far more accomplished programmers than whitehats and can build much cleaner and more efficient websites (and a lot do) if they wish. The fact is, by scripts and automation they’ve found a way to make a decent income without burning the midnight oil writing content about their new “diamond goat hoof jewellery” niche they’ve found. This comment normally comes from whitehats who wouldn’t know a blackhat if they spammed them in the face.

There is however, advanced white hat SEO, as Eli kindly demonstrates in his painfully bastardish always right way.

Ahem. Anyway…..

The Digerati Blackbox

So, I’ve collected together a set of tools, scripts, databases and tutorials which will help the beginner blackhat find their feet. Some of the stuff is pretty good, albeit fairly basic. You should be able to make something decent if you combine some of these scripts, or strip out some of the code into your own creations.

Blackbox Contents:

Cloaking & Content Generation:

cloakgen1.zip:
This is a cloak / dynamic content generation script. To use it you simply add a small piece of code to the top of each page you wish to be cloaked. When someone accesses your page then cloakgen is run and if the user-agent suggests the visitor is a standard user then they are simply shown your standard page. However if the user-agent suggests that the visitor is a search engine then it will start doing the business. It will start by finding out what page called it, then it will open this page and find out what the most common words on the page are. Once it has worked this out then it will scrape some content about that word from wikipedia and add it with your normal page content. Each keyword will be emphasised in a random way. For example the keyword could be bold or red font etc. The final page will be output in the following way:

Title of the page in capital letters
Large title at the top of the page
Content of the website with emphasization and wiki content

padkit.zip:
PAD is the Portable Application Description, and it helps authors provide product descriptions and specifications to online sources in a standard way, using a standard data format that will allow webmasters and program librarians to automate program listings. PAD saves time for both authors and webmasters. This is what you want to use with the below databases.

yacg.zip:
You should have heard of Yet Another Content Generator (YACG). It’s a beautifully easy way to get websites up and running in minutes with mashed up scraped content.

Databases:

articles.zip:
A database of 23,770 different articles on a variety of topics.

bashquotes.zip:
This is a database of every quote on Bash.org. This huge Database has every single quote as of May 1st, 2007!

KJV_bible.zip:
The whole King James Bible – Old & New Testament.

medical_dictionary.sql.zip:
Over 130,000 rows of medical A-Z

Keyword Scripts:

ask-single-keyword-scraper.zip:
This script allows you to scrape a range of similar keywords to your original keyword from Ask.com.

google-single-keyword-scraper.zip:
This script will take a base keyword and then scrape similar keywords from google.

msn-live-api-scraper.zip:
This script uses php cURL to scrape search results from the MSN LIVE Search API.

overture-single-keyword-scraper.zip:
Enter one base keyword and scrape similar keywords from overture.

Linkbuilding Scripts:

dity.zip:
A very easy to use (and old) multi guestbook spammer.

logscraper.zip:
Nifty little internal linker (read more about it here)

trackback.zip:
Very powerful trackback poster. Trackback Solution is 100% multithreaded and very efficient at automatically locating and posting trackback links on blogs.

xml-feed-link-builder-z.zip:
Very nice script to generate links to your site from people scraping RSS.

Misc Scripts:

alexa-rank-cheater1.zip:
Automate the false increase of your Alexa rating/rank.

typo-generator-esruns.zip:
Create typos of a competitive keyword and rank easy!

Scraping:

feedwordpress.0.993.zip:
Wordpress plugin that makes scraping the easiest thing in the world.

Proxies:

proxy_url_maker.zip:
Create a list of web proxy URLs used for negative seo purposes or spam

proxygrabber.zip:
A script to download proxies from the samair proxy list site.

CAPTCHAs:

delicious.zip:
Delicious CAPTCHA broken. In Python.

smfcaptchacrack.zip:
Simple machines forums captcha breaker compiled and designed to run on Linux but portable to Windows.

Tutorials:

curl_multi_example.zip:
What it says on the tin. Examples of m-m-m-multi curl!

superbasiccurl.zip:
4 super basic tutorials on using curl/regex.

I’d like to give special thanks to all donators and people who included their stuff here:

Steve – For the majority of scripting here.
Rob – For the databases
Eli – For delicious CAPTCHA breaker
Rob – For trackback magic
Harry – For proxygrabber/linux captcha scripts

Here it is:

Download Digerati Blackbox Toolkit (51.4Mb)

Disclaimer: I’m not offering support on any of these tools or scripts, although I might do a couple of tutorial posts on how to use them. So don’t ask me how to use them, check out the respective author’s website if you get stuck. Obviously Digerati Marketing Ltd, I, my dog, or anyone else cannot be held responsible for any type of loss or damages of any kind (even an act of God Google) if you choose to use them. At your own risk blah blah blah. Zzzzzz. Enjoy.

Posted in Black Hat, Grey Hat, Marketing Insights, Research & Analytics, Search Engine Optimisation, Social Marketing, Splogs, White Hat | 64 Comments

SEO Guerrilla Warfare

Monday, June 9th, 2008

I get a lot of e-mails and questions about trying to SEO against big companies and established websites. A lot of people seem to get stuck in the mindset of “Oh, no MegaCorp(tm) has a $79 billion SEO budget per month, there’s no way we’ll ever beat them!”

The fact is, you can. You can eat them for breakfast, then wipe your mouth with their over-inflated legalese Terms & Conditions page(s). This can be especially satisfying if you decide to take down a company you’re not particularly fond of. While you’re sitting on your balcony at home having breakfast taking in some sun (or rain, if you’re a Brit), you can daydream about them running around their shiny boardroom pointing at the big graph on the wall that’s going down, generally shrieking at each other as their search empire crumbles at their feet.

Let’s get going, comrade.

Dedication to SEOs everywhere
I would like to dedicate this post to all SEOs who work on their own projects and have to tackle big businesses trying to elbow them out of the game at every stage. While you may have the better sites, the better content and more passion for what you’re doing, your messages are oppressed by greedy companies who want to fill the Internet with their mediocre content, their brand and of course, their ads. I give no apologies to the companies we will disrupt, but sorry guys, that space belongs to us (and our ads).

General Principles of Guerrilla Warfare
To win a Google war, you must have a detailed understanding of your own strength and weaknesses, as well as your enemy’s. Everyone has their own areas of the web they excel in, whether it’s programming, design, content writing or networking, however to be consistently successful, you will need to develop a wide range of these skills either yourself, or in your guerrilla band. Large enemy armies (err, I mean companies) share many common weaknesses. It is these weaknesses which you will learn to relate to your own skill sets and exploit until the enemy is utterly demoralised, scattered, beleaguered and exhausted.

If you attempt to face your enemy overtly on the battlefield, you’ll lose. However, there are many advantages to being a small entity: you can operate unseen, you can move quickly and your personal interaction can sow the seeds of dissent which will turn the enemy’s own populace against them.

Guerrilla Warfare Strategies

Weakness #1: A Large Army Requires Lots Of Supplies

Okay, so MegaCorp has dozens, or even hundreds, of staff and you’re on your own, or it’s only you and a couple of comrades working from your small base in the woods (or whatever you call home). This can actually be an advantage!

Large businesses only want to get involved in projects which are, of course, profitable for them. However, for these businesses to make a profit, they have to pay for office buildings, web developers, designers, agencies, sales staff, editorial staff, marketing staff, the coffee machine and keep replacing the tea spoons that staff keep nicking. You have none of those overheads, which means you can have a better core offering than your enemy.

Have a look at their business model, how are they making money? Are they filling pages with advertising? Are they selling advert space? Perhaps they’re offering a service that other companies pay to be included in? When it comes to monetization you can: show fewer adverts, charge the same advertisers less to be advertised on your site, or offer the same service for cheaper or free!

So take the model where companies are being paid to be listed on your enemy’s website. As an individual, you could quite easily make a decent amount of money showing some contextual advertising, or selling some advert space. So collect all the names of your enemy’s allies and e-mail them, offering the same benefits – but for free (or cheaper!).

For instance, “I noticed you are listed on website xxxx and you are paying $250 a month for this service. I am running xxxx website and I am prepared to offer free listings for your company for life, if you will display this badge showing you are listed here.”

Most companies would much rather chuck you a link than pay a monthly subscription, so in this instance, you’ve gained exposure (from the link), a couple of steps forward in terms of SEO (you’ve got some quality links from relevant sites) and you’ve made your enemy look bad by offering the same service for free (or a lot less).

The key here is research. Use the fact you’re small and unknown to research, spy and gain information. Pose as a potential client or advertiser and contact your enemy, asking for rate-cards or prices, ask for visitors stats. Use all of this information to build up a picture of their revenue model. From this you can calculate their revenue, and work on a counter-revenue model, which offers better value to visitors or participating entities.

Whatever money they’re making, you can afford to make less and still be far, far more profitable than them. Use this to your advantage to out-do their offer on all fronts and make your website more appealing.

Weakness #2: The Army Must Control The Populace

Big companies have a big brand to protect. Despite their dominant appearance, most companies are absolutely terrified of damaging their brand and will do anything to avoid taking risks. This is war, and risks need to be taken! From talking to hundreds of companies about their websites, one of the most common fears is UGC (User Generated Content). They are terrified to let people speak their minds for fear they might speak against the current regime! The CEO sits quivering in his chair that someone might say “FUCK” on his website and he’ll have angry people writing letters and bashing on the doors of the ivory tower.

You can really press the advantage hard here. It seems common sense to most savvy web developers and entrepreneurs nowadays, but open your site to the masses. Let them submit content, comment on content, talk in forums, whatever way you can allow them to have some interaction and control over their website. That’s right, it’s not your website, it’s the people’s website – so let them have some control!

Some more forward-thinking enemies do allow UGC on their sites, but it is typically heavily moderated to give the impression of free speech, when in reality, everyone is suspicious about the 5 out of 5 star user reviews on every product going. I recently saw a keynote about 2 very similar forums that launched at the same time, one moderated and one not. 1 year on, the forum that wasn’t moderated had six times the monthly traffic. People like some freedom when they’re giving you content.

Depending how you’re operating, you can take as many risks as you like, all the way to making black hat versions of your website (suicide sites). Make sure you separate these entities well away from your core troops (different servers and WHOIS) so they are not traceable, but any noise you can make will disrupt and demoralise the enemy. This tactic may not be appropriate in all circumstances, however if you’re competing with an e-commerce site, why not make your genuine article site while working on a few blackhat suicide versions? So what if they get banned after a few months? You’ll have made some money and damaged the enemy.

Weakness #3: Large Armies Are Slow To Manoeuvre

Large companies regularly have this trait in common. It takes them absolutely fucking ages, to do the most simple of things. If the colour of the text is going to be changed you’ll need a pre-meeting, a meeting, a post-meeting debrief, a spec produced, changed, put in a developer queue, tested, have a review meeting, blah blah.

The enemy is likely to be dependent on multiple sources when they need something changed. To have changes done quickly it will likely cost them an arm and leg, infrastructure changes are avoided like trench foot. Exploit this weakness to the fullest.

Take the time to evolve your website, if there are beneficial changes, make them. If there’s something in the news about your niche, respond to it. If there’s breaking news, get it out first. You can move quickly without encumbrance and seize the initiative while they’re still packing their bergens. Carpe Diem, Comrade.

Weakness #4: Lots Of Soldiers = Lots Of Cannon Fodder

While chasing profits and having to maintain a large work force, many companies try to save money by hiring slightly cheaper staff, or “just as good as” guys. That’s right, they’re taking rookie soldiers and putting them on the front line.

As any good General knows, sending untrained troops into battle is no better than herding sheep onto the front line, you’re going to lose. Hard. It may be that the enemy has got a lot of adept people working for them, but there have been communication problems between the ideas guy and the end developer. I have yet to see a website that has been built by a non-web specialist company that is flawless.

Spend time looking around the enemy’s terrain, see what they have done well and do it yourself. Immediately benefit from their expensive end-user research, at no cost to yourself. Find what they’ve done badly and improve it on your site. It seems that coming second carries with it, its own set of advantages.

Large armies tend to be sloppy, assuming victory by sheer size. Take all of their small weaknesses (poor internal linking, non-SEO friendly URLs, no use of “nofollow” tags) and stack them so you have a distinct advantage. The underdog leaves no bone unscavenged.

Weakness #5: Large Armies Leave Big Tracks

A large army cannot move undetected and thus it is easy to track their movements. Once you have your website battle ready, why not check out the enemy’s backlink profile in Yahoo! Site Explorer? A lovely, juicy list of their entire link building activity. You’ll want to get on that procuring links from every source they received links from, so you’ll very quickly draw even. If they are actively link building, take note of the kind of sites they are targeting.

Pay special attention if there are any “suspicious” links in there. You know the type, site-wide links from ring tone websites, MySpace Layout websites or obvious link networks. If there are, it is your civic duty to report these war crimes under the Google Convention! You’ll find cash-rich companies tend to involve themselves in these tactics quite quickly as it seems the most cost-efficient way for them to operate, so if you catch em, get em in stocks, pronto.

Weakness #6: Large Armies Have Slower Communications

It is likely that a lot of the time, the left hand won’t know what the right is doing. Staff working at large companies won’t be able to communicate their detailed daily operations to each other. Use this along with any skill shortages and your previously gathered intelligence on what sites they link up with.

Set some booby-traps for them to walk right into! Create a few quick websites with some mashed up content that fits the profile of sites they want linking to them. Get in contact and offer a link from your homepage to their website, if they link to an obscure article on your website.

Of course, on your homepage you can use “X-Robots” in your header-delivery to nofollow any links on that page, which will be totally undetectable by nofollow plugins, or even by viewing the source code. The only way they’ll discover it is if they view the header information being sent by the site, which they won’t of course. Once you’ve done your link exchange, you’ve got 2 options:

1) Spring the booby trap! Why not 301 that page they’ve linked to, to a spammy blackhat website. Google will love that, along with their visitors!

2) Use their own resources against them! Or you could 301 that page to your own website, so the enemy is very kindly helping your efforts.

The great thing is that this will work dozens of times. Dealing with different people each time, the enemy won’t know what’s going on until it’s too late and they’ll soon start fearing other websites, not knowing who will help or harm them!

The Ongoing War

These are just a few of the many weaknesses that plague large company websites. I hope I have inspired you to take up arms against your would-be oppressors. When you divide and conquer, you’ll find that you can win a lot of battles against seemingly impervious web-giants and eventually bring them to their knees.

Ernesto Che Guevara
(Just stay out of Bolivia)

Posted in Black Hat, Google, Grey Hat, Marketing Insights, Research & Analytics, Search Engine Optimisation | 18 Comments

SEO Ranking Factors

Saturday, May 31st, 2008

Right, let’s kick this thing in the nuts. Wouldn’t it be great if you could have a decent list of SEO Ranking Factors and, more specifically, be told exactly what you need to rank for a key phrase?

Well, SEOMoz went and done this.

You’ve probably all seen it before, the famous SEOMoz Search Ranking Factors, the highly regarded opinions of 37 leaders of search spread over a bunch of questions. It sounds slick, it looks cool and it’s a great introduction to SEO. There is, however, a rather major problem. None of them pissing agree! 37 leaders in search, closed ended questions, yet almost ALL of the answers have only “average agreement”. Just look at the pie charts at the end, there is massive dispute over the correct answer.

I find this interesting. It leaves two possibilities

1) SEOMoz’s questions are flawed and there is no “correct” answer – this kind of kills the whole point of the project.

2) If there is a “correct” answer, then it would seem that 25%-50% of “leading people in search” don’t know WTF they are talking about.

Now before I continue, I’m not going to claim I have all the answers, far, far from it. I do some stuff and that stuff works well for me. The other thing I would like to point out is that I actually really like the SEOMoz blog and I think they provide extremely high quality content in high frequency, which is bloody hard to do. So please no flaming when I seem to be bashing their hard work, I’m simply pointing out a few things rather crudely. Oh, they’re nice people too, Jane is very polite when I stalk her on Facebook IM.

Anyway, back to slating. I think it is very hard to give quality answers to questions such as, how does page update frequency affect ranking? From my experience, I’ve found Google quite adaptive in knowing, based on my search query, whether it should serve me a “fresh” page or one that’s collecting dust. Eli from BlueHatSEO has also made some convincing arguments that the “optimum” update frequency of a page depends on your sector/niche.

Also, these things change. Regularly. Those clever beardies at Google are playing with those knobs and dials all the time. Bastards.

Okay, I now hate you for slating SEOMoz, do you have anything useful to say?
Maybe? Maybe not. As I mentioned in my last post, I’m going to talk about some projects I’m working on at the moment and one of these is specifically aimed at getting some SEO Ranking Factors answers.

I could of course just give what I believe to be the “correct” answers to the SEO Ranking Factors questions, but like everyone else, I’d be limited to my own SEO experience. We need more data, more testing, more evidence.

There’s loads of little tools floating around the net that will tell you little things like, if you have duplicate meta descriptions, your “keyword density” (hah), how many links you have, all that stuff. Then you’ll get some really helpful advice like “ShitBOT has detected your keyword only 3.22% on this page, you should mention your keyword 4.292255% for optimum Googleness”. Yes, well. Time to fuck off ShitBOT. These tools are kind of fragmented over the net, so it would take ages to run all 101 to build up a complete “profile” of your website, which really… Wouldn’t tell you all that much. It wouldn’t tell you much because you’re only looking at your own website, your own ripples in the pond. You need to zoom out a bit, get in a ship and sail back a bit, then maybe put your ship in a shuttle, blast off until you can see the entire ocean.

Well, crap. It all looks different from here..

Creating a Technological Terror
I can’t do this project alone. Fortunately, one of the smartest SEO people I know moved all the way across the country to my fine city and is going to help.

Here we go….

1) Enter the keyword you would like to rank for.

2) We will grab the top 50 sites in Google for this search term.

2) i) First of all, we will do a basic profile of these sites, very similar, but a bit more depth than the data SEOQuake will give you. So things like domain age, number of sites linking to domain, how these links are spread within the site, page titles, amount of content, update frequency, PageRank etc. We’ll also dig a bit deeper and take titles and content from pages that rank for these key phrases and store them for later.

2) ii) The real work begins here. For each one of these sites that rank, we are going to look at the second tier, which I don’t see many people doing. We are going to analyse all of the types of sites that link to these sites that rank well. This will involve: Doing the basics, such as looking at their vital stats, so their PR, links, age of domain, TLD and indexed pages.

Then we’re going to take this a step further. We are going to be scanning for footprints to work out the type of link. This means, is it an image link? Is it a link from a known social news site like Digg or Reddit? Is it a link from a social bookmarking site like StumbleUpon or Delicious? Is it a link from a blog? Is it a link from a forum? A known news site? Is it a link from a generic content page? If so, let’s use some language processing and try and determine if it’s a link from a related content page, or a random ringtones page. Cache all of this data.

3) We have a huge amount of data now, so we need to process it. Take the sites ranking for the keyterm “casino”: let’s put them onto a graph showing their actual ranking for this keyterm vs their on-page vital stats. Let’s see the ranking vs the types of links they have. Let’s see how the sites rank vs the number of links, the age of links, etc.

4) We can take this processing to any level needed. Let’s pool together all the data we have on the 50 sites and take averages. What do they have in common for this search term? Are these common ranking factors shared between totally different niches and keywords?
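To make steps 2) ii) and 4) a bit more concrete, here’s a rough Python sketch of the footprint scan and the pooling. Everything in it is an assumption for illustration: the footprint lists are tiny hand-picked examples, and `classify_link`, `link_mix` and `pool_profiles` are made-up names, not part of any finished tool. It only matches exact hostnames and a few URL substrings, so a real version would need much bigger lists and the language processing mentioned above.

```python
from collections import Counter
from statistics import mean
from urllib.parse import urlparse

# Hypothetical footprints: tiny example lists, not a maintained database.
SOCIAL_NEWS = {"digg.com", "reddit.com"}
SOCIAL_BOOKMARKING = {"stumbleupon.com", "delicious.com"}
FORUM_HINTS = ("/forum", "viewtopic", "showthread")
BLOG_HINTS = ("/blog", "wordpress", "blogspot")


def classify_link(source_url, is_image=False):
    """Guess the type of a backlink from simple URL footprints."""
    if is_image:
        return "image"
    parsed = urlparse(source_url.lower())
    host = parsed.netloc
    if host.startswith("www."):
        host = host[4:]
    if host in SOCIAL_NEWS:
        return "social_news"
    if host in SOCIAL_BOOKMARKING:
        return "social_bookmarking"
    # Look for forum/blog hints anywhere in the host, path or query string.
    target = host + parsed.path + "?" + parsed.query
    if any(hint in target for hint in FORUM_HINTS):
        return "forum"
    if any(hint in target for hint in BLOG_HINTS):
        return "blog"
    return "generic"


def link_mix(source_urls):
    """Count how many backlinks fall into each link type."""
    return Counter(classify_link(url) for url in source_urls)


def pool_profiles(profiles):
    """Average each numeric stat across all the ranking sites."""
    keys = profiles[0].keys()
    return {key: mean(profile[key] for profile in profiles) for key in keys}
```

Feeding `link_mix` the backlinks of each ranking site, and `pool_profiles` the per-site stats, gives the kind of per-keyword summary that can then be compared between niches.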

This is the type of information that I think I know. I think it would be valuable to actually know the information I think I know (=

So I guess you can expect a lot of playing with the Google Charts API, scatter graphs showing link velocity against domain age and total links and all that shit.
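Before drawing any scatter graphs, the same question (“does this stat move with ranking?”) can be boiled down to a single number. The sketch below is my stand-in for the charts, not the charts themselves: it computes a Spearman rank correlation in plain Python, with hypothetical function names. A value near 1 or -1 means the stat and the ranking position move together; near 0 means no obvious relationship. It will divide by zero if every site has the same value for a stat, so those columns would need filtering out first.

```python
from math import sqrt


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

For example, `spearman(positions, domain_ages)` over the 50 sites would show whether older domains tend to sit higher or lower in the results for that keyword.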

You get the idea.

There are actually all kinds of other secondary analysis that can be pumped into this data. For instance, even though it’s a kind of made up term, I think “TrustRank” has some sauce behind it. (There’s a good PDF on TrustRank here). Let’s think of it in very, very simple, non-mathematical terms for a moment.

One fairly basic rule of thumb for the web can be that a trusted (“good”) site will generally not link to a “bad” (spam, malware, crap) site. It makes sense: generally, very high quality websites vet the other sites that they link to. So it makes sense that Google select a number of “seed” sites and give them a special bit of “trust” juice, which says that whatever site this one links to is very likely to be of good quality. This trend continues down the chain, but obviously the further down this chain you get, the more and more likely it is that this rule will be broken and someone (maybe even accidentally) will link to what Google considers a “bad” website. For this reason, the (and I use this terminology loosely) “Trust” that is passed on will be dampened at each tier. This allows a margin for calculated error, so if the chain is in essence broken, the algorithm maintains its quality, because it allows for this.
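The dampening idea above is easy to sketch in Python. To be clear about what this is: a toy version of the simple description in the last paragraph, not Google’s algorithm and not the published TrustRank method (which works more like a PageRank calculation biased towards the seed sites). The function name, the `damping` value of 0.85 and the dictionary-of-links input are all assumptions for illustration.

```python
from collections import deque


def propagate_trust(links, seeds, damping=0.85, max_tiers=4):
    """Spread trust outwards from seed sites, dampened at every tier.

    links: dict mapping each site to the sites it links to.
    seeds: hand-picked trusted sites, which start with a trust of 1.0.
    """
    trust = {seed: 1.0 for seed in seeds}
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        site, tier = queue.popleft()
        if tier == max_tiers:
            continue  # too far down the chain to pass anything on
        for target in links.get(site, ()):
            # Breadth-first order means the first visit to a site is via
            # the shortest chain from a seed, so it keeps its highest score.
            if target not in trust:
                trust[target] = trust[site] * damping
                queue.append((target, tier + 1))
    return trust
```

With `damping=0.5`, a site one link away from a seed scores 0.5, two links away 0.25, and so on, which is the “margin for calculated error” in action: a bad link deep in the chain only ever passes on a sliver of trust.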

I think most people could name some big, trusted websites. Why not take time to research these sites, really trusted authority sites – ones that it’s at least a fair bet have some of this magical Trust? Say we have a list of ten of these sites, why not crawl them and get a list of every URL that they link to? Why not then crawl all of these URLs and get a list of all the sites THEY link to? Why not grab the first 3 or 4 “tiers” of sites? Great, now you’ve probably got a few million URLs. Why not let Google help us? Let’s query these URLs against the keywords we’re targeting. What you’re left with is a list of pages from (hopefully) trusted domains that are related to your niche. The holy grail of whitehat link building. Now pester them like a bastard for links! Offer content, blowjobs, whatever it takes!
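The tiered crawl described above might look something like this in Python. It’s a sketch under loud assumptions: `fetch_links` is a hypothetical function you would supply that returns the outbound links for a URL (no real crawling happens here), and `is_relevant` is a stand-in for querying the URLs against your target keywords, however you end up doing that.

```python
def crawl_tiers(seeds, fetch_links, max_tiers=3):
    """Collect every URL within max_tiers hops of the seed sites.

    Returns a dict of URL -> the tier it was first seen at (seeds are tier 0).
    fetch_links is a caller-supplied (hypothetical) function: URL -> list of URLs.
    """
    seen = {url: 0 for url in seeds}
    frontier = list(seeds)
    for tier in range(1, max_tiers + 1):
        next_frontier = []
        for url in frontier:
            for target in fetch_links(url):
                if target not in seen:
                    seen[target] = tier
                    next_frontier.append(target)
        frontier = next_frontier
    return seen


def link_prospects(seen, is_relevant):
    """Keep non-seed URLs that pass the relevance check, closest tiers first."""
    hits = [(tier, url) for url, tier in seen.items()
            if tier > 0 and is_relevant(url)]
    return [url for tier, url in sorted(hits)]
```

Sorting by tier puts the pages closest to the trusted seeds at the top of the pestering list, on the theory that they carry the least-dampened trust.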

Wouldn’t it be interesting if we took this list of possible Trusted sites and tied in this theory with how many of our tendrils of trusted networks link to our high-ranking pages? There’s a lot of possibilities here.

This project will be taking up a significant chunk of my time over the next months. Maybe the data will be shit and we won’t find any patterns and it will be a giant waste of time. At least then I can say with confidence that SEO is actually just the charm-clasping, pointy-hat-wearing, pole-chanting black art that so many businesses seem to think it is. At least I’ll be one step closer to finding out.

Apologies once again to SEOMoz if you took offense. I love you x

Posted in Blogging, Google, Marketing Insights, Research & Analytics, Search Engine Optimisation, Social Marketing, White Hat | 10 Comments

I ask you.

Tuesday, May 27th, 2008

This blog is changing focus. I haven’t posted since I returned from Tulum (yes, I know I said Cancun, but it turns out I don’t listen as well as I should) because I’ve stopped myself from doing so. There’s been a lot of things I’ve wanted to talk about, such as Lyndon’s run-in with Google over hoax linkbait, Google really getting to grips with forms, big sites that are cloaking and even geo-hashing. It’s like a trap for me to fall into, seeing all these opinions flying around and wanting to throw my 2 cents in. Of course, I have opinions on all of these topics, but one of my only objectives when starting this blog was to keep every post as informative as possible and try and dig up some strategies, techniques, theories or research that isn’t in a million other places. There’s an incredibly annoying echo effect on my RSS reader as these stories reverberate around the (gag) “blogosphere”. So, rather than post what everyone else is posting, I’d rather post nothing at all, so when you are here, hopefully you’ll get something really….. Nice.

That being said, any decent-sized post I do takes around 4 hours in total as I try and decode the gibberish noise in my head into something tangible enough to put on display. The process helps me organise my thoughts; however, it is time-consuming and at the moment I’m incredibly time-starved.

At the moment, my time is broken down between a few major projects:

1) A massive (as of yet unnamed) project to analyse search ranking factors
2) The further development of my currently released SEO Tools
3) The growth and refinement of AutoStumble (now over 300 users!)
4) I’ve also just started work on a niche, white-hat community site which will need a lot of attention.
5) Various other websites/maintaining current web property

The change of focus for this blog is going to be looking more closely at the SEO, programming, technical and marketing principles behind these projects – which will benefit everyone, as once again, blogging will become “integrated” in what I do and there’s still a lot of valuable stuff to share.

I ask you.
You’ve taken the time to subscribe to (or at least visit) my blog, thank you. If you have a specific topic you’d like to see me write about, let me know and I’ll see what I can do. Would you guys like these single, sporadic and very detailed posts on more advanced SEO concepts on their own? Or would you like “lighter” reading for insights into current issues as well? Answers on a postcard, or in the comment box – whichever is easiest for you. I’m going to be checking out the sites and blogs of everyone who comments here so I can see who’s really reading this stuff :)

I’ll shortly be posting detailed overviews of the above projects.

Posted in Blogging, Community Sites, Digerati News, Marketing Insights, Search Engine Optimisation, Social Marketing | 9 Comments