Does Blackhat SEO still work?
Monday, December 7th, 2009
If anyone tells you blackhat SEO doesn’t work, get them to comment on this.
Posted in Black Hat, Search Engine Optimisation | 6 Comments »
Tuesday, June 9th, 2009
Quick heads up, if you want a free link from http://www.further.co.uk/blog all you have to do is Tweet an SEO Tip to the #fseo hashtag on Twitter.
Full details here: http://www.further.co.uk/blog/Tweet-SEO-Tips-Get-A-Link-From-Us-144
Doesn’t get much easier than that.
I’m doing a lot of blogging over there at the moment (one reason why there are fewer updates here). So if you want more solid advice (with less blackhat), I’d recommend:
How Much Is An SEO Site Audit Worth?
SEO Keyword Selection And Calculating Value
Posted in Search Engine Optimisation | 8 Comments »
Tuesday, March 3rd, 2009
Good afternoon and a happy square root day to you. (C’mon it’s no more made up than Valentine’s Day).
Despite my initial reservations, I’m actually finding Twitter moderately useful for content and link discovery. The trick is really just following the right people and ditching the time wasters. I’m not going to bore you with a lecture on how Twitter is the next big thing; in fact, I’m pretty sure we’re fast approaching the point on Gartner’s Hype Cycle where interest crashes into disillusionment.

Well, maybe, maybe not – argue it amongst yourselves, it’s not what I really want to talk about. I want to talk about…
Twitter and Spam
Although I’ve only really talked about parasite hosting indirectly, when looking at ranking factors to do with age and trust, I think it’s worth mentioning briefly.
I saw Quadzilla posted today about parasite hosting on Twitter. Hopefully that hasn’t eluded you: aside from other methods of finding places to parasite host, all you need to look for are trusted domains that allow you to post content with little moderation. Even a basic search for Viagra shows that the #2 position is essentially a parasite hosted page on the hotfroguk directory (thanks Ryan for your dedication in trawling Viagra results).
As Quadzilla rightly points out, with Twitter being almost totally unmoderated, the sad fact is it’s going to get bombed to hell over the next 12 months by blackhat SEOs and then Google will do something about it and game over.
There are however (slightly) more legitimate uses for Twitter if you’ve got your heart set on some easy rankings.
Twitter and content generation
Content generation can be a tricky game, you can plain scrape it (not really generation), scrape it and spin it, you can use synonym replacement, Markov chaining, or if you’re really smart – come up with your own way to do it.
There are several problems inherent in content generation, whether it’s duplicate content, poor quality or your algorithm getting skewed by internet randomness. I’ve seen a lot of people trying to generate websites based on data they can pull from keyword trends or “hot” trends. The problem is that most of these services give you the information you need after the fact. The news has come, the search spike has been and gone, and your content generation system has given you a crummy bit of content which now has to compete with established sites with real content. Oh, and there’s the fact that nobody cares anymore.
Twitter, on the other hand, is instant. It’s not uncommon for me to discover new “hot” things on Twitter hours before mainstream news (i.e. authoritative sites) publish them (and days before Seth Godin makes an informed-in-hindsight comment).
Without spoon feeding, I put this to you: why not let tweeting twits find your content for you? There are many ways you can do this:
1) There are lovely people that get this information for you. For instance: http://twitturly.com/ will give you the most tweeted links. There’s all your early breaking generic news for you, just set your cURL bot to follow those tinyurls and discover the source and scrape away.
2) If you’re in a niche, find everyone who tweets in that niche, use cURL to crawl all of the links they tweet, log them to a database, use a little intelligent keyword selection to make sure they’re relevant, then repost.
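To make the second option a bit more concrete, here is a rough sketch of the two non-network pieces: pulling the links out of a tweet and doing a crude keyword relevance check. The function names (extractUrls, isRelevant), the keyword list and the sample tweet are all made up for illustration; this is one simple way to do it under those assumptions, not a finished system.

```php
<?php
// Hypothetical helper: pull http/https URLs out of a chunk of text.
function extractUrls($text)
{
    preg_match_all('~https?://[^\s<>"\']+~i', $text, $m);
    $urls = array();
    foreach ($m[0] as $u) {
        // Trim trailing punctuation that usually belongs to the sentence, not the URL
        $urls[] = rtrim($u, '.,;:!?)');
    }
    // Drop duplicates and reindex from 0
    return array_values(array_unique($urls));
}

// Hypothetical helper: true if at least $minHits of the niche keywords
// appear in the text (case-insensitive substring match, nothing clever).
function isRelevant($text, array $keywords, $minHits = 1)
{
    $hits = 0;
    foreach ($keywords as $kw) {
        if (stripos($text, $kw) !== false) {
            $hits++;
        }
    }
    return $hits >= $minHits;
}

// Made-up sample data
$tweet = 'Google tweaks its ranking algorithm again: http://tinyurl.com/example1, more at http://example.com/seo-news.';
$keywords = array('seo', 'ranking', 'serp');

$urls = extractUrls($tweet);
$relevant = isRelevant($tweet, $keywords);
$offTopic = isRelevant('Cute cat pictures', $keywords);
print_r($urls);
```

From there, each URL can be fetched with cURL. With CURLOPT_FOLLOWLOCATION switched on, curl_getinfo() with CURLINFO_EFFECTIVE_URL should give you the final address behind a shortened link.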
Then of course, ping the world with your new content, break some captchas and submit to a list of social sites and drop a few links here and there. Aside from services such as Google Blog Search, which work on an almost exclusively chronological basis, you stand a good chance of getting a healthy amount of visitors since you’re one of the first few to get content up.
Added note for clarity: I’m talking about scraping titles/content from URLs you have followed from tweets – not tweets themselves. The majority of the links to new breaking / interesting stories will come inside a very small window. So if you can post this content up while there is still interest / searches and before someone has link dominance, you should even be able to give the duplicate content penalty the slip, even if you’ve 100% scraped – so you’re on a winner – you could even retweet it (:
Oh, don’t forget to jam it full of ads or something. Who cares? It’s all automated. Think of it as a weekend project at least, but don’t break Twitter, it’s growing on me (:
Posted in Black Hat, Scripting, Search Engine Optimisation, Social Marketing, Splogs | 5 Comments »
Tuesday, December 16th, 2008
Using cURL and page scraping for specific data is one of the most important things I do when creating databases. I’m not just talking about scraping pages and reposting here, either.
You can use cURL to grab the HTML of any viewable page on the web and then, most importantly, take that data and pick out the bits you need. This is the basis for link analysis scripts, training scripts and compiling databases from sources around the web; there are almost limitless things you can do.
I’m providing a simple PHP class here, which will use cURL to grab a page then pull out any information between user specified tags, into an array. So for instance, in our example you can grab all of the links from any web page.
The class is quite simple – I had to get rid of the lovely indentation to make it fit nicely onto the blog, but it’s fairly well commented.
In a nutshell, it does this:
1) Goes to specified URL
2) Uses cURL to grab the HTML of the URL
3) Takes the HTML and scans for every instance of the start and end tags you provide (e.g. <a> and </a>)
4) Returns these in an array for you.
Download taggrab.class.zip
<?php
class tagSpider
{
// set variable to hold the curl handle (used as $this->ch below)
var $ch;
// this is where we dump the html we get
var $html;
// set for binary type transfer
var $binary;
// this is the url we are going to do a pass on
var $url;
// constructor: runs automatically on "new tagSpider()" to clear variables
// (a method named after the class is no longer a constructor in PHP 8)
function __construct()
{
$this->html = "";
$this->binary = 0;
$this->url = "";
}
// takes url passed to it and.. can you guess?
function fetchPage($url)
{
// set the URL to scrape
$this->url = $url;
// only fetch if we were actually given a non-empty URL
if (!empty($this->url)) {
// start cURL instance
$this->ch = curl_init ();
// this tells cUrl to return the data
curl_setopt ($this->ch, CURLOPT_RETURNTRANSFER, 1);
// set the url to download
curl_setopt ($this->ch, CURLOPT_URL, $this->url);
// follow redirects if any
curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, true);
// tell cURL if the data is binary data or not
curl_setopt($this->ch, CURLOPT_BINARYTRANSFER, $this->binary);
// grabs the webpage from the internets
$this->html = curl_exec($this->ch);
// closes the connection
curl_close ($this->ch);
}
}
// function takes html, puts the data requested into an array
function parse_array($beg_tag, $close_tag)
{
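// match data between specified tags; preg_quote() escapes any regex
// special characters in the tags (s = dot matches newlines, i = case-insensitive, U = ungreedy)
preg_match_all('/' . preg_quote($beg_tag, '/') . '.*' . preg_quote($close_tag, '/') . '/siU', $this->html, $matching_data);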
// return data in array
return $matching_data[0];
}
}
?>
So that is your basic class, which should be fairly easy to follow (you can ask questions in comments if needed).
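If you want to see what the matching step does without hitting the network, here is a small sketch that runs the same kind of ungreedy regex against a hard-coded HTML string. The function name grabBetween and the sample markup are my own for illustration; they are not part of the class.

```php
<?php
// Hypothetical helper (not part of the class): returns every substring
// that starts with $beg_tag and runs to the nearest $close_tag.
function grabBetween($html, $beg_tag, $close_tag)
{
    // preg_quote() escapes regex metacharacters in the tags.
    // s = dot matches newlines, i = case-insensitive, U = ungreedy
    $pattern = '/' . preg_quote($beg_tag, '/') . '.*' . preg_quote($close_tag, '/') . '/siU';
    preg_match_all($pattern, $html, $matches);
    return $matches[0];
}

// Made-up sample markup
$sample = '<p><a href="/one">One</a> and <A HREF="/two">Two</A></p>';
$links = grabBetween($sample, '<a href=', '</a>');
print_r($links);
```

Because of the U modifier, each match stops at the first closing tag it finds rather than swallowing everything up to the last one on the page. The i modifier is why the upper-case link is picked up too.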
To use this, we need to call it from another PHP file to pass the variables we need to it.
Below is tag-example.php which demonstrates how to pass the URL, start/end tag variables to the class and pump out a set of results.
Download tag-example.zip
<?php
// Include our tag grab class
require("taggrab.class.php"); // class for spider
// Enter the URL you want to run
$urlrun="http://www.techcrunch.com/";
// Specify the start and end tags you want to grab data between
$stag="<a href=";
$etag="</a>";
// Make a title spider
$tspider = new tagSpider();
// Pass URL to the fetch page function
$tspider->fetchPage($urlrun);
// Enter the tags into the parse array function
$linkarray = $tspider->parse_array($stag, $etag);
echo "<h2>Links present on page: ".$urlrun."</h2><br />";
// Loop to pump out the results
foreach ($linkarray as $result) {
echo $result;
echo "<br/>";
}
?>
So this code will pass the Techcrunch website to the class, looking for any standard a href links. It will then simply echo these out. You could use this in conjunction with SearchStatus Firefox Plugin to quickly see what links Techcrunch is showing bots and what they are following and nofollowing.
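If you would rather do the follow/nofollow split in code than with a plugin, something like the following sketch could work. The function name splitLinksByNofollow and the sample HTML are made up for illustration. It is regex-based and assumes quoted attributes, so treat it as a rough cut; PHP's DOMDocument would be the sturdier choice for messy real-world markup.

```php
<?php
// Hypothetical helper: sorts the hrefs found in $html into
// 'follow' and 'nofollow' buckets based on each link's rel attribute.
function splitLinksByNofollow($html)
{
    $result = array('follow' => array(), 'nofollow' => array());
    // Grab every opening <a ...> tag
    preg_match_all('/<a\b[^>]*>/i', $html, $tags);
    foreach ($tags[0] as $tag) {
        // Skip anchors without a quoted href attribute
        if (!preg_match('/\bhref\s*=\s*["\']([^"\']*)["\']/i', $tag, $href)) {
            continue;
        }
        // rel can hold several space-separated values, e.g. "external nofollow"
        $isNofollow = preg_match('/\brel\s*=\s*["\'][^"\']*\bnofollow\b[^"\']*["\']/i', $tag);
        $result[$isNofollow ? 'nofollow' : 'follow'][] = $href[1];
    }
    return $result;
}

// Made-up sample markup
$sample = '<a href="/about">About</a> '
    . '<a rel="external nofollow" href="http://example.com/">Ad</a> '
    . '<a name="top">no href here</a>';
$split = splitLinksByNofollow($sample);
print_r($split);
```

To run it on a live page, you could pass in the HTML that fetchPage() stores in $tspider->html instead of the sample string.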
You can view a working example of the code here.
As I said, there’s so much you can do from a base like this, so have a think. I might post some proper tutorials on extracting data methodically, saving it to a database then manipulating it to get some interesting results.
Enjoy.
Edit: You’ll of course need the cURL library installed on your server for this to work!
Posted in Grey Hat, Research & Analytics, Scripting, Search Engine Optimisation | 16 Comments »
Monday, December 15th, 2008
I’ve never done a round-up of the blogs I read before, which I guess is a bit selfish. So, in no particular order (and this isn’t a complete list), here are some of my favourite blogs, if you’re looking for some inspiration.
Dark SEO Programming is run by Harry. As he puts it, “SEO Tools. I make ‘em”. A great guy if you need help with coding and somewhat of a captcha guru, with a sense of humour. Definitely worth keeping up with. I wouldn’t be surprised if this guy starts making big Google waves in the next few years.
Ask Apache is a blog I absolutely love. Great, detailed tutorials on script optimisation, advanced SEO and mod_rewrite. AskApache’s blog posts are the kind of ones that live in your bookmarks, rather than your RSS Reader.
Andrew Girdwood is a great chap from BigMouthMedia I met last year (although I very much doubt he remembers that). Andrew seems to be a vigilante web bug hunter. What I like about his blog is that he is usually the first to find weird things with Google that are going down. This usually gets my brain rolling in the right direction of my next nefarious plan. ^_^
Blackhat SEO Blog run by busin3ss is always worth checking out. He was even kind enough to give me a pre-release copy of YACG mass installer to review (it’s coming soon – I’m still playing!). Apart from his excellent tools, his blog features the darker side of link building, which of course, interests me greatly.
Kooshy is a blog run by a guy I know, who.. Well I think he wants to remain anonymous (at least a little). He’s just got started again after closing down his last blog and moving Internet personas (doesn’t the mystery just rivet you?). Anyway, get in early, I think we can expect some good stuff from here. He’s already done a cool post on Pimpin’ Duplicate Content For Links.
Jon Waraas is run by.. Can you guess? Jon has something that a lot of even really smart Internet entrepreneurs are missing: good old fashioned elbow grease. This guy is a workaholic and it pays off in a big way. Apart from time saving posts on loads of different ways to monetise your site, build backlinks and flush out your competitors, I get quite a lot of inspiration from his constant stream of effort and ideas. I could definitely take a leaf out of his work ethic book.
Blue Hat SEO is becoming one of the usual suspects really. If you’re here, you probably already know about Eli. Being part of my “let’s only do a post every few months club”, I love Eli’s blog because there is absolutely no fluff. He gets straight down to the business of overthrowing Wikipedia, exploiting social media and answering specific SEO questions. You’ll struggle to find higher quality out there.
SEO Book is probably the most “famous” blog I’m going to mention here. Aaron started off at a disadvantage, because to be honest, I thought he was a massive waste of space for quite a while. (I guess that’s what happens when you spend your SEO youth on Sitepoint listening to the people with xx,xxx posts on there). I bought his SEO Book and for me, at least, it was way too fluffy. I’m pleased he’s started an SEO training service now as it represents much better value. I’m sure he was making a lot of money from his SEO Book, but perhaps milked it too long (like I probably would have). Anyway, I kept with his blog and I’ve been impressed with his attitude and posts. He’s done some really cool stuff, like the SEO Mindmap and more recently, a keyword strategy flowchart which would be useful for those looking for a more structured search approach. He’s also written about algorithm weightings for different types of keywords and of course has some useful SEO Tools.
Slightly Shady SEO – Great name, great blog. Although XMCP will probably take it as an insult, I’ve always regarded Slightly Shady as the blog most similar to mine on this list. Maybe it’s because I wish I’d written some of the posts he has, before he did, hehe. Again, a no BS approach to effective SEO, whether he’s writing about Google’s User Data Empire, hiding from it or site automation it’s all gravy.
The Google Cache is a great blog for analytical approaches to SEO. There are some awesome posts on Advanced Whitehat SEO and using proxies with search position trackers. I like.
SEOcracy is run by a lovely database overlord called Rob. Rob’s a cool guy, he was kind enough to donate some databases to include in the Digerati Blackbox a while back. Most of his databases are stashed away in his content club now, which is well worth a look in. He’s also done some enlightening posts on keyword research, stuffing website inputs and Google Hacking.
This is all I’ve got time for now, apologies if I’ve missed you. There may be a Part II in the near future.
Posted in Affiliate Marketing, Approved Services, Black Hat, Blogging, Digerati News, Google, Grey Hat, Marketing Insights, Research & Analytics, Search Engine Optimisation, Social Marketing, Splogs, Viral Marketing, White Hat, Yahoo | 7 Comments »