Improving wp_delete_post in WordPress Core

One of the common processes that runs in Store Locator Plus is deleting locations.    For sites with a few dozen locations the process runs smoothly.  For sites with thousands of locations but deleting one or two at a time, not a big deal.   But for sites that are deleting tens-of-thousands of locations at a time the process becomes painfully slow.   A mere 2,500 locations can take up to a full minute to be removed on a fairly decent performance server.   That’s not the type of performance I like to see from our product.

After digging into the performance of the PHP stack initial indicators point to the custom post types as the primary culprit.   It turns out that deleting a single custom post type in the WP_Posts table runs through a dozen gyrations to delete the post.  Multiple filters are called, associated taxonomies are delete, taxonomy meta is deleted.  It is a TON of extra overhead.    But even with removing records from a half-dozen tables the data queries seem out-of-control.

Removing just 9 locations generates over 190 data queries.     If there is one thing that has not changed in decades of writing software it is that data queries are costly.  They may run on solid-state drives with advanced memory caching but doing 190 data queries is still far slower than nearly any other part of the application.

Some insight into the delete posts process

The process of deleting a page or post, whether a default post or page type or a custom post type like the one in Store Locator Plus will call the wp_delete_post() function.

The first thing this does is execute a direct WPDB select query against the posts table.   The value is returned as an object and stored in the $post variable which is passed to a get_post( $post ) call.

When get_post() is called and standard PHP object, as is done in this case, and the default ‘raw’ filter parameter is used the object is sanitized a a new WP_Post object is created from it.    So far so good.   Some extra “homework” is done and the WP_Post() object is returned.   We are , in essence, using $post = get_post( $post) to typecast a standard MySQL object returned from the data query to a more WordPress-friendly WP_Post object type.

Sidebar: the get_post() call with different parameters passed in will call WP_Post::get_instance( <id> ) which will call wp_cache_add() to eliminate subsequent SQL select queries for the same post — the performance discovery below may serve more effectively if added to the get_post( <obj> , ‘raw’ ) processing instead of being specific to wp_delete_post().

So far we’ve done a single SQL select call for the delete process (many other data queries have already been run, but that is a topic for another day) and cast the result to a WP_Post object.  No harm, no foul.

As we move down the stack we encounter the ‘before_delete_post’ action hook.   As of WordPress 3.7, ‘before_delete_post’ runs the _reset_front_page_settings_for_post() function. This is one of the first possible places our SQL performance takes a hit, but only because it is “the first of many” — if we address the issue down in this “rabbit-hole” we are only kicking the can down the road.     It turns out that _reset_front_page_settings_for_post() calls… you guessed it, get_post( <id> ) but in this case the <id> is passed as a STRING that came straight out of the prior SELECT and get_post() call run earlier.    Yes, the code calls get_post() after get_post() was just called moments ago.   This is not unique to this one action , in case you are wondering, many other methods called later in the delete post process do the same.

Sidebar: _reset_front_page_settings_for_post() is only calling get_post() so it can check the post type and see if it is a page, and if it is make sure if the home page was deleted it is managed properly.   If the before_delete_posts action passed the entire post object instead of only the post ID this could be avoided.     Now that the before_delete_posts action is “out in the wild” this would need to be added as an optional second parameter that _reset_front_page_settings_for_post() could latch onto.   Pass it by reference if you are worried about memory overhead and you avoid the extra processing of the get_post() method to retrieve data you already have available.

Since many other processes in the delete stack will call get_post(), let’s focus on that for a moment.  After our initial get_post( $post ) called above with $post as an OBJECT, all the subsequent calls use get_post( <id::string> ). This  causes get_post() to “fall through” the initial logic and call  WP_Post::get_instance( <id> ).    WP_Post::get_instance() checks the global wp_cache hash to see if the entry for the post ID is already there, however it was never added to the cache — this calls, as you likely guessed, a nearly-identical select to the one we called to kick off wp_delete_posts() with a simple addition of a LIMIT 1 to the query and this time around adds the post object to the wp_cache.

This code from WP_Post::get_instance() will look familiar when reading “A discovery” below.

...
            $_post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d LIMIT 1", $post_id ) );

            if ( ! $_post )
                return false;

            $_post = sanitize_post( $_post, 'raw' );
            wp_cache_add( $_post->ID, $_post, 'posts' );
...

Now we’ve done TWO queries to pull the same exact data from the posts table.   That’s not very efficient

A discovery

While analyzing the code that was executing it appears as though one extra SELECT query is made against the WordPress posts table for every post to be deleted.    It all starts with wp_delete_post running a SELECT to fetch post data directly from the posts table as you can see at the start of the method in the WP 5.0 code posted below.

It turns out that in my local copy of WordPress 5.0 a single line of code can eliminate that extra query on every post being deleted: wp_cache_add( $post->ID, $post, ‘posts’ );

While this will not likely impact many sites there are other implications here.     It is likely that WordPress Core can be revised in other areas to make the posts queries more efficient.    1 less SQL query x 10 posts deleted per week x 27 MILLION websites is a lot of CPU power being wasted.

The original wp_delete_post function:

/**
 * Trash or delete a post or page.
 *
 * When the post and page is permanently deleted, everything that is tied to
 * it is deleted also. This includes comments, post meta fields, and terms
 * associated with the post.
 *
 * The post or page is moved to trash instead of permanently deleted unless
 * trash is disabled, item is already in the trash, or $force_delete is true.
 *
 * @since 1.0.0
 *
 * @global wpdb $wpdb WordPress database abstraction object.
 * @see wp_delete_attachment()
 * @see wp_trash_post()
 *
 * @param int  $postid       Optional. Post ID. Default 0.
 * @param bool $force_delete Optional. Whether to bypass trash and force deletion.
 *                           Default false.
 * @return WP_Post|false|null Post data on success, false or null on failure.
 */
function wp_delete_post( $postid = 0, $force_delete = false ) {
    global $wpdb;

    $post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d", $postid ) );

    if ( ! $post ) {
        return $post;
    }

    $post = get_post( $post );

    if ( ! $force_delete && ( 'post' === $post->post_type || 'page' === $post->post_type ) && 'trash' !== get_post_status( $postid ) && EMPTY_TRASH_DAYS ) {
        return wp_trash_post( $postid );
    }

    if ( 'attachment' === $post->post_type ) {
        return wp_delete_attachment( $postid, $force_delete );
    }

    /**
     * Filters whether a post deletion should take place.
     *
     * @since 4.4.0
     *
     * @param bool    $delete       Whether to go forward with deletion.
     * @param WP_Post $post         Post object.
     * @param bool    $force_delete Whether to bypass the trash.
     */
    $check = apply_filters( 'pre_delete_post', null, $post, $force_delete );
    if ( null !== $check ) {
        return $check;
    }

    /**
     * Fires before a post is deleted, at the start of wp_delete_post().
     *
     * @since 3.2.0
     *
     * @see wp_delete_post()
     *
     * @param int $postid Post ID.
     */
    do_action( 'before_delete_post', $postid );

    delete_post_meta($postid,'_wp_trash_meta_status');
    delete_post_meta($postid,'_wp_trash_meta_time');

    wp_delete_object_term_relationships($postid, get_object_taxonomies($post->post_type));

    $parent_data = array( 'post_parent' => $post->post_parent );
    $parent_where = array( 'post_parent' => $postid );

    if ( is_post_type_hierarchical( $post->post_type ) ) {
        // Point children of this page to its parent, also clean the cache of affected children.
        $children_query = $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE post_parent = %d AND post_type = %s", $postid, $post->post_type );
        $children = $wpdb->get_results( $children_query );
        if ( $children ) {
            $wpdb->update( $wpdb->posts, $parent_data, $parent_where + array( 'post_type' => $post->post_type ) );
        }
    }

    // Do raw query. wp_get_post_revisions() is filtered.
    $revision_ids = $wpdb->get_col( $wpdb->prepare( "SELECT ID FROM $wpdb->posts WHERE post_parent = %d AND post_type = 'revision'", $postid ) );
    // Use wp_delete_post (via wp_delete_post_revision) again. Ensures any meta/misplaced data gets cleaned up.
    foreach ( $revision_ids as $revision_id )
        wp_delete_post_revision( $revision_id );

    // Point all attachments to this post up one level.
    $wpdb->update( $wpdb->posts, $parent_data, $parent_where + array( 'post_type' => 'attachment' ) );

    wp_defer_comment_counting( true );

    $comment_ids = $wpdb->get_col( $wpdb->prepare( "SELECT comment_ID FROM $wpdb->comments WHERE comment_post_ID = %d", $postid ));
    foreach ( $comment_ids as $comment_id ) {
        wp_delete_comment( $comment_id, true );
    }

    wp_defer_comment_counting( false );

    $post_meta_ids = $wpdb->get_col( $wpdb->prepare( "SELECT meta_id FROM $wpdb->postmeta WHERE post_id = %d ", $postid ));
    foreach ( $post_meta_ids as $mid )
        delete_metadata_by_mid( 'post', $mid );

    /**
     * Fires immediately before a post is deleted from the database.
     *
     * @since 1.2.0
     *
     * @param int $postid Post ID.
     */
    do_action( 'delete_post', $postid );
    $result = $wpdb->delete( $wpdb->posts, array( 'ID' => $postid ) );
    if ( ! $result ) {
        return false;
    }

    /**
     * Fires immediately after a post is deleted from the database.
     *
     * @since 2.2.0
     *
     * @param int $postid Post ID.
     */
    do_action( 'deleted_post', $postid );

    clean_post_cache( $post );

    if ( is_post_type_hierarchical( $post->post_type ) && $children ) {
        foreach ( $children as $child )
            clean_post_cache( $child );
    }

    wp_clear_scheduled_hook('publish_future_post', array( $postid ) );

    /**
     * Fires after a post is deleted, at the conclusion of wp_delete_post().
     *
     * @since 3.2.0
     *
     * @see wp_delete_post()
     *
     * @param int $postid Post ID.
     */
    do_action( 'after_delete_post', $postid );

    return $post;
}

One-line performance-boost suggestion for WP Core for all post/page delete calls.

/**
 * Trash or delete a post or page.
 *
 * When the post and page is permanently deleted, everything that is tied to
 * it is deleted also. This includes comments, post meta fields, and terms
 * associated with the post.
 *
 * The post or page is moved to trash instead of permanently deleted unless
 * trash is disabled, item is already in the trash, or $force_delete is true.
 *
 * @since 1.0.0
 *
 * @global wpdb $wpdb WordPress database abstraction object.
 * @see wp_delete_attachment()
 * @see wp_trash_post()
 *
 * @param int  $postid       Optional. Post ID. Default 0.
 * @param bool $force_delete Optional. Whether to bypass trash and force deletion.
 *                           Default false.
 * @return WP_Post|false|null Post data on success, false or null on failure.
 */
function wp_delete_post( $postid = 0, $force_delete = false ) {
    global $wpdb;

    $post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d", $postid ) );

    if ( ! $post ) {
        return $post;
    }

    $post = get_post( $post );

    // @charlestonsw - caches the post, eliminating an extra select from follow-on get_post calls
    // for example,  from _reset_front_page_settings_for_post fired by the 'before_delete_post' action hook
    //   which calls get_post( <id> ) which eventually call WP_Post::get_instance() firing an excess SELECT like the one 3 lines up
    //   and then puts the results af that through $post = get_post( $post ) like the one above this comment
    //   and then calls wp_cache_add() like the one below this comment
    //   so why do that again?  Add the cache here and stop all that nonsense.
    //
    wp_cache_add( $post->ID, $post, 'posts' );

    ...

    return $post;
}

 

A better and more efficient method that came about when writing this article is posted below.   This allows get_post() to manage the data query by passing in the post ID and triggering the WP_Post::get_instance( <id> ) method far earlier in the process.  This gives more consistency to the process and adds the cache like as done previously.

function wp_delete_post( $postid = 0, $force_delete = false ) {
    global $wpdb;

    $post = get_post( $postid );

    ...

    return $post;
}

 

Small scale test analysis

This is one specific discovery that has proven to be a simple yet effective way to boost performance.    A small scale test shows that processing n locations with the code patch above will yield exactly n fewer data I/O calls through mysqli.    Deleting 200 old posts from a site = 200 less data I/O calls.

In multiple runs of the small scale deletion of 9 custom posts the Store Locator Plus app shows a 15% performance increase on average.    A 2,500 location test is being run , but if the results bear out the minute-long process will take 50 seconds.   For 25,000 locations we jump from a 10 minute process to an 8 minute process.     This may just be the tip of the iceberg.

General Thoughts

Unfortunately there is no effective method of getting this performance boost from outside WordPress Core modification that we’ve discovered.

Removing the _reset_front_page_settings_for_post from the before delete action does not do anything useful.    The overhead to add and remove the action around our delete processing adds as much overhead as it saves.   That is because the get_post() => WP_Post::get_instance() is picked up just one process later as the cache is still unseeded at this point.

Pre-loading the wp_cache() by running a SELECT in the plugin and loading up the post with our own wp_cache_add() does not work because wp_delete_post() is calling a SELECT directly and casting it to a WP_Post() type instead of just calling get_post( id::string ) which completely bypasses the cache at this stage.   This is the second patch noted above comes from this realization and is likely a better patch.

While I was drilling down in the code I realized that when WP_Post::get_instance() was being called it was running a SELECT statement nearly identical to a SELECT statement run just moments ago.   Literally less than a second ago in CPU time.    wp_delete_post called SELECT * FROM wp_posts WHERE ID = ‘1112’ and WP_Posts::get_instance was building a query about to be executed as SELECT * FROM wp_posts WHERE ID = ‘1112’ LIMIT 1.   First of all the wp_delete_post should have the LIMIT 1. Second,   if it did have the LIMIT 1 it seems that the wpdb engine should be intelligent enough to see that the query about to be run is IDENTICAL to the query just run AND return results with no error.   A simple “process stack nonce” attached to the wpdb object could make it intelligent enough to skip the data I/O if back-to-back queries are called with identical SQL from the same session and thus return the already-set result.    Maybe it is unique to this particular circumstance, but I doubt it.    The content of the result in the wpdb global was identical before and after the WP_Posts::get_instance() call EVERY SINGLE TIME wp_delete_post() was called.

Store Locator Plus Sidebar

Why does SLP have custom post types when it uses a trio of custom data tables to manage location data, extended data fields, and category maps?

The decision was made a few years ago to stop using our custom category system which was very difficult to maintain and added a lot of weight to the plugin and instead use the built-in WordPress taxonomy (categories in layman’s terms) system.   It was obvious early on that the best way to do that would be to associate our custom locations data table with a “hidden” custom “Store Page” post type that would serve as the “glue” between our location data and the WordPress taxonomies.

 

One thought on “Improving wp_delete_post in WordPress Core

  1. Great analysis, thank you! Very very helpful. A plugin I use (Rest API Log) produced millions of post and meta records, removing them on the other hand, is quite the problem.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.