Duplicate posts
May 11, 2009
Duplicate posts are an issue that seems to constantly plague autoblogging and similar autoposting WordPress plugins. This problem has had numerous causes in the past, and we are constantly working on ways to fix them. We have had good success combating it, but occasionally a server configuration or WordPress update will reintroduce the bug.
Background
When AutoBlogged parses items in a feed, the first thing it does is grab the item’s title and link. Next, it takes those two values and checks the database to see whether a post with that title or link already exists in your WordPress blog. AutoBlogged will search the database for duplicate titles and/or links, depending on which options you have checked on the Filtering Options page.
When it searches for duplicate titles, it tries searching for the literal title as well as the sanitized form of the title using the WordPress sanitize_title and the sanitize_title_with_dashes functions. If duplicate link checking is enabled, it will search the database for the exact link the post uses. Depending on which options you have enabled, it could do up to five checks for each post.
AutoBlogged should get one of three responses from the database: a result showing that the title and/or link already exists, an empty recordset indicating that they don’t exist, or an error message indicating that something failed while processing the request.
The problem is that sometimes AutoBlogged gets back an empty recordset when a post exists rather than a positive result or an error message.
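To make the three possible responses concrete, here is a minimal sketch in PHP of how the results of the individual checks for one feed item could be combined. This is not AutoBlogged's actual code: the function name, the constants, and the shape of the $checks array are all hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical sketch: names and data shapes are assumptions, not AutoBlogged's code.

const DUP_FOUND     = 'found';      // a matching title or link exists
const DUP_NOT_FOUND = 'not_found';  // every check came back empty
const DUP_ERROR     = 'error';      // at least one check reported a failure

/**
 * Combine the results of the duplicate checks run for one feed item.
 *
 * Each element of $checks is an array with two keys:
 *   'rows'  => int     number of matching rows the database returned
 *   'error' => string  error text reported by the database, '' if none
 */
function classify_duplicate_checks(array $checks): string
{
    $sawError = false;
    foreach ($checks as $check) {
        if ($check['rows'] > 0) {
            return DUP_FOUND;   // any positive match means a duplicate
        }
        if ($check['error'] !== '') {
            $sawError = true;   // remember the failure, keep looking for a match
        }
    }
    return $sawError ? DUP_ERROR : DUP_NOT_FOUND;
}

// A query that fails without reporting an error looks exactly like "no match".
$silentFailure = [['rows' => 0, 'error' => '']];
echo classify_duplicate_checks($silentFailure), "\n"; // prints not_found
```

The last two lines show the failure mode described above: when a check fails but reports nothing, it falls into the "not found" branch, and the item would be posted again.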
Causes
Most often, the database returns a false result because an error or timeout occurred but the database did not return an error message. We have found this to be the case with some web server configurations. Because no error is returned, the problem is very difficult to debug.
Another possible cause is that the character set WordPress uses does not match that of the database, so the duplicate checks never match any titles.
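The character set mismatch can be hard to picture, so here is a small illustration in PHP, assuming the mbstring extension is available. The sample title and the same_bytes helper are made up for the example, and the real comparison happens inside MySQL, which this sketch only approximates.

```php
<?php
// Illustration only: the same title stored under two character sets
// is a different sequence of bytes, so an exact-match lookup can miss it.

$titleUtf8 = "Caf\u{00E9} review";  // UTF-8: the accented letter takes two bytes

// Re-encode the same text as ISO-8859-1 (requires the mbstring extension).
$titleLatin1 = mb_convert_encoding($titleUtf8, 'ISO-8859-1', 'UTF-8');

// Hypothetical helper standing in for an exact, byte-for-byte comparison.
function same_bytes(string $a, string $b): bool
{
    return $a === $b;
}

echo strlen($titleUtf8), "\n";    // 12 bytes
echo strlen($titleLatin1), "\n";  // 11 bytes
var_dump(same_bytes($titleUtf8, $titleLatin1)); // bool(false)
```

Both strings read the same to a person, but they do not compare as equal, which is the kind of mismatch that can make a duplicate check come back empty.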
Solutions
To solve the problem with duplicates, we suggest you first make sure you have the latest AutoBlogged release. We are always improving the dupe checking code and this might be all you need to fix the problem. If that doesn’t do it, try the following solutions in order:
- Try optimizing (or even repairing) all of your database tables using phpMyAdmin or your hosting control panel. About half the time this fixes the duplicates problem.
- Check to make sure your database is not overloaded. If you ever see database errors on your WordPress site, this is very likely what is causing duplicate posts as well. You might also want to check the runtime configuration and system variables for possible problems using phpMyAdmin. There are links on the phpMyAdmin main page to show this information. There are a lot of settings here that affect MySQL performance and you may need an expert to help you out here if you suspect this is the problem.
- Check the resource load on the server itself to make sure it isn’t overworked or maxing out its resources. This is fairly common with cheap shared hosting accounts.
- If you have 10,000 or more posts in your blog or your site gets more than 3,000 visitors per day, you may simply need more powerful hardware. WordPress can be sluggish when there are too many posts or when it gets too busy and this could cause database connection failures.
- Try configuring WordPress to use the default database character set by opening wp-config.php and removing or commenting out this line:
define('DB_CHARSET', 'utf8');
If none of these solutions work for you, perhaps the easiest solution is to find a different hosting company. We actually very rarely see the problem on any of our test servers and we have noticed that it often occurs with cheap or oversold hosting companies. We have always had good results with HostNine and A Small Orange.
If you manage your own web server, you may need to find a MySQL expert to help you optimize the database for your type of load.
As we mentioned before, this is a problem that we are constantly monitoring and always working to eliminate. If you are unable to fix the problem yourself, we are happy to help you debug it, especially if it helps us find new fixes to prevent it.





