Uploading WordPress posts using Perl and MySql

The previous article showed how to remotely update a WordPress post. It works, but it is very slow.

This article will show how to update a WordPress post by directly manipulating the underlying database. The code works with MySQL, and should be easy to adapt to other databases, as they are likely to use identical table structures.

The first thing we must know is how the database changes when a post is updated. Reading the documentation is so unlikely to answer this that I didn’t even try it. To find out what changes an update made in the online editor makes:

  • Open an existing post in the online editor.
  • Save the post, after adding and then removing a space character.
  • Dump the whole database in text form.
  • Once again save the post, this time after som minor change.
  • Dump the whole database in text form.
  • Compare the two dumps.

Here is a shell command for dumping the database, imaginatively called “wordpress”:

for t in $( echo "show tables" | mysql -u $MUSERNAME -p$MPASSWORD wordpress | sed -n '2,$p' ); do
    echo "select * from $t"'\G;'
done |  mysql -u $MUSERNAME -p$MPASSWORD wordpress

Comparing the two dumps shows that the tree fields that get updated are post_content, post_modified and post_modified_gmt. In addition, there may be a “revision” record, with post_type equal to revision. No reason to bother with the latter, since version management, if we want it, applies to the .post files. It might be a good idea to turn off the revision feature, as that will save disk space, increase speed and improve database hygiene.

A MySQL statement to update the three files looks like:

   update wp_posts set post_content = LOAD_FILE('/full/path/to/post.html'),
                       post_modified = '2016-06-09 12:32:31',
                       post_modified_gmt = '2016-06-09 10:32:31' where ID=13;

Adding this to wordpress-upload-posts.pl was quite easy:

    (my $pfile = $filename) =~ s/html$/post/;
    my $t = (stat("$pfile"))[9];
    my $g = strftime("%Y-%m-%d %H:%M:%S",    gmtime($t));
    my $l = strftime("%Y-%m-%d %H:%M:%S", localtime($t));

    $filename = File::Spec->rel2abs($filename);
    print("update wp_posts set post_content = LOAD_FILE('$filename'), post_modified = '$l', post_modified_gmt = '$g' where ID=$id;\n");

This code uses the file time of the .post file for setting the modification time in the database. Using this time, instead of e.g. the file time of the .html file or the time when the database update is performed, seems the most relevant choice. If e.g. we implement some minor cosmetic change in post.pm, then the modification time will not change, even though the .html file changes. As the factual content of the posts do not change, there is no need to update the modification time.

To try this, use the -m option:

$ wordpress-upload-posts.pl  -m  |  mysql -v -uusername  -ppassword  databasename

Updating a post now takes less than one millisecond, compared to more than one second using XMLRPC. Note that the script still uses XMLRPC for building the map from “slugs” to IDs, so the startup time is quite long, about 600 milliseconds on my machine.

It is also possible to supply a map file to the script, using the -M option:

$ wordpress-upload-posts.pl  -m  -M mapfile  |  mysql -v -uusername  -ppassword  databasename

If the map file is given, then XMLRPC will not be used at all, saving some 550 milliseconds on my machine. Each line is an id/slug pair:

98      creating-web-pages-using-perl
99      creating-web-pages-using-perl-upload-1
100     creating-web-pages-using-perl-upload-2

This file can be created using SQL:

select ID, post_name from wp_posts where post_type = 'post' and length(post_name) > 2;

The length test is primarily meant to filter out articles with empty “slugs”.

To be on the safe side, regenerate the map file every time. Makefile extract:

/tmp/map:
        @echo "select ID, post_name from wp_posts where post_type = 'post' and length(post_name) > 2;" \
                | mysql -u$$MUSERNAME -p$$MPASSWORD $(WPDATABASE) \
                | grep '^[0-9]' > $@

mupload: /tmp/map
        wordpress-upload-posts.pl -m -M $< \
                | mysql -v -u$$MUSERNAME -p$$MPASSWORD $(WPDATABASE) \
                | grep -v -- --- || true
        @rm -f $<

There is an example Makefile, cleverly named Makefile.example, in the archive.

Download the archive or view wordpress-upload-posts.pl.

Take heed

It works for me. It might work for you. Or it might totally eff up your WordPress database.

You really, Really, REALLY should backup your database before trying this.

Happy updating!

You can reach me by email at “lars dash 7 dot sdu dot se” or by telephone +46 705 189090

View source for the content of this page.