Performance Optimization With Caching: Do-It-Yourself Caching Methods With WordPress

There are different ways to make your website faster: specialized plugins to cache entire rendered HTML pages, plugins to cache all SQL queries and data objects, plugins to minimize JavaScript and CSS files and even some server-side solutions.

But even if you use such plugins, using internal caching methods for objects and database results is a good development practice, so that your plugin doesn’t depend on which cache plugins the end user has. Your plugin needs to be fast on its own, not depending on other plugins to do the dirty work. And if you think you need to write your own cache handling code, you are wrong. WordPress comes with everything you need to quickly implement varying degrees of data caching. Just identify the parts of your code to benefit from optimization, and choose a type of caching.

WordPress implements two different caching methods:

  1. Non-persistent
    The data remains in the cache during the loading of the page. (WordPress uses this to cache most database query results.)
  2. Persistent
    By default, this relies on the database to work, and cached data can auto-expire after some time. (WordPress uses this to cache RSS feeds, update checks, etc.)

Non-Persistent Cache

When you use functions such as get_posts() or get_post_meta(), WordPress first checks to see whether the data you require is cached. If it is, then you will get data from the cache; if not, then a database query is run to get the data. Once the data is retrieved, it is also cached. A non-persistent cache is recommended for database results that might be reused during the creation of a page.

The code for WordPress’ internal non-persistent cache is located in the cache.php file in the wp-includes directory, and it is handled by the WP_Object_Cache class. We need to use two basic functions: wp_cache_set() and wp_cache_get(), along with the additional functions wp_cache_add(), wp_cache_replace(), wp_cache_flush() and wp_cache_delete(). Cached storage is organized into groups, and each entry needs its own unique key. To avoid mixing with WordPress’ default data, using your own unique group names is best.
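
To make the functions listed above concrete, here is a minimal sketch of how they behave with WordPress' default object cache. It assumes WordPress is already loaded (for example, inside a plugin file); the function name d4p_cache_demo(), the group name d4p_demo and the key answer are hypothetical and used only for illustration.

```php
<?php
// Sketch only: assumes WordPress is loaded. The function name, the group
// 'd4p_demo' and the key 'answer' are hypothetical.
function d4p_cache_demo() {
    $group = 'd4p_demo';

    // wp_cache_set() stores a value and overwrites any existing entry.
    wp_cache_set( 'answer', 41, $group );

    // wp_cache_add() stores a value only if the key is not cached yet,
    // so this call leaves 41 in place and returns false.
    $added = wp_cache_add( 'answer', 99, $group );

    // wp_cache_replace() stores a value only if the key already exists.
    $replaced = wp_cache_replace( 'answer', 42, $group );

    // wp_cache_get() returns the cached value, or false if nothing is cached.
    $value = wp_cache_get( 'answer', $group );

    // wp_cache_delete() removes a single entry from the group.
    wp_cache_delete( 'answer', $group );
    $after_delete = wp_cache_get( 'answer', $group );

    return array( $added, $replaced, $value, $after_delete );
}
```

Because wp_cache_get() returns false on a miss, compare its result with false using === rather than relying on a loose truthiness check; otherwise a cached empty array or 0 looks like a miss.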

Example

For this example, we will create a function named d4p_get_all_post_meta(), which will retrieve all meta data associated with a post. This first version doesn’t involve caching.

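function d4p_get_all_post_meta($post_id) {
    global $wpdb;

    $data = array();
    // prepare() forces $post_id into the query as an integer, which guards against SQL injection.
    $raw = $wpdb->get_results( $wpdb->prepare( "SELECT meta_key, meta_value FROM $wpdb->postmeta WHERE post_id = %d", $post_id ), ARRAY_A );

    foreach ( $raw as $row ) {
        // A meta key can hold several values, so collect them in a list per key.
        $data[$row['meta_key']][] = $row['meta_value'];
    }

    return $data;
}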

Every time you call this function for the same post ID, an SQL query will be executed. Here is the modified function that uses WordPress’ non-persistent cache:

function d4p_get_all_post_meta($post_id) {
    global $wpdb;

    if ( false === ( $data = wp_cache_get( $post_id, 'd4p_post_meta' ) ) ) {
        // Cache miss. Comparing with false lets an empty result (a post without meta) count as a hit later.
        $data = array();
        $raw = $wpdb->get_results( $wpdb->prepare( "SELECT meta_key, meta_value FROM $wpdb->postmeta WHERE post_id = %d", $post_id ), ARRAY_A );

        foreach ( $raw as $row ) {
            $data[$row['meta_key']][] = $row['meta_value'];
        }

        wp_cache_add( $post_id, $data, 'd4p_post_meta' );
    }

    return $data;
}

Here, we are using a cache group named d4p_post_meta, and post_id is the key. With this function, we first check whether the data is already in the cache (the wp_cache_get() call). If it is not, we run the normal code to get the data and then add it to the cache with wp_cache_add(). So, if you call this function more than once for the same post, only the first call will run an SQL query; all other calls will get the data from the cache. We are using the wp_cache_add() function here, so if the key-group combination already exists in the store, it will not be replaced. Compare this with wp_cache_set(), which will always overwrite an existing value without checking.

As you can see, we’ve made just a small change to the existing code but potentially saved a lot of repeated database calls during the page’s loading.

Important Notes

  1. Non-persistent cache is available only during the loading of the current page; once the next page loads, it will be blank once again.
  2. The storage size is limited by the total available memory for PHP on the server. Do not store large data sets, or you might end up with an “Out of memory” message.
  3. Using this type of cache makes sense only for operations repeated more than once in the creation of a page.
  4. It works with WordPress since version 2.0.

Database-Driven Temporarily Persistent Cache

This type of cache relies on a feature built into WordPress called the Transient API. By default, transients are stored in the database (similar to most WordPress settings, in the wp_options table). (If a persistent object cache plugin is installed, WordPress keeps transients in that object cache instead.) In the database, a transient with an expiration time needs two records: one to store the expiration time and one to store the data. When cached data is requested, WordPress checks the timestamp and does one of two things. If the expiration time has passed, WordPress removes the data and returns false as a result. If the data has not expired, another query is run to retrieve it. The good thing about this method is that the cache persists even after the page has loaded, and it can be used for other pages for as long as the transient’s expiration time has not passed.
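
The two records described above can be inspected directly. The following is a minimal sketch, assuming WordPress is loaded and the default database storage is in use (no persistent object cache plugin); the helper name d4p_inspect_transient() is hypothetical.

```php
<?php
// Sketch only: assumes WordPress is loaded and transients live in wp_options.
// The helper name is hypothetical.
function d4p_inspect_transient( $key ) {
    return array(
        // Unix timestamp after which the transient counts as expired
        // (false if the transient does not exist or never expires).
        'expires_at' => get_option( '_transient_timeout_' . $key ),
        // The cached data itself (false if the transient does not exist).
        'value'      => get_option( '_transient_' . $key ),
    );
}
```

In normal code you would still call get_transient(), which performs the expiry check for you; reading the options directly is only useful for debugging.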

If your database queries are complex and/or produce results that might not change often, then storing them in the transient cache is a good idea. This is an excellent solution for most widgets, menus and other page elements.

Example

Let’s say we wanted an SQL query to retrieve 20 posts from the previous month, along with some basic author data such as name, email address and URL. But we want posts from only the top 10 authors (sorted by their total number of posts in that month). The results will be displayed in a widget.

When tested on my local machine, this SQL query took 0.1710 seconds to run. If we had 1000 page views per day, this one query would take 171 seconds every 24 hours, or 5130 seconds per month. Relatively speaking, that is not much time, but we could do much better by using the transient cache to store these results with an expiration time of 30 days. Because the results of this query will not change during the month, the transient cache is a great way to optimize resources.

Returning to my local machine, fetching the same data from the transient cache now takes only 0.0006 seconds per page view, or 18 seconds per month. The advantage of this method is obvious in this case: we’ve saved 85 minutes each month with this one widget. Not bad at all. There are cases in which you could save much, much more (such as with very complex menus). The more complex the SQL queries or operations, the more resources caching saves.
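
The arithmetic behind these figures can be reproduced with a few lines of plain PHP. This is only a worked calculation using the timings quoted above and an assumed 30-day month; the variable names are made up for the example.

```php
<?php
// Worked calculation using the timings quoted in the text.
$views_per_day  = 1000;
$days_per_month = 30;      // assumption: a 30-day month
$uncached_query = 0.1710;  // seconds per page view without caching
$cached_lookup  = 0.0006;  // seconds per page view with the transient cache

$uncached_month = $uncached_query * $views_per_day * $days_per_month;
$cached_month   = $cached_lookup * $views_per_day * $days_per_month;
$saved_minutes  = ( $uncached_month - $cached_month ) / 60;

printf(
    "%.0f seconds uncached, %.0f seconds cached, %.1f minutes saved\n",
    $uncached_month,
    $cached_month,
    $saved_minutes
);
// prints "5130 seconds uncached, 18 seconds cached, 85.2 minutes saved"
```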

Let’s look at the actual code, both before and after implementing the transient cache. Below is the normal function to get the data. In this example, the SQL query is empty (because it is long and would take too much space here), but the entire widget is linked to at the end of this article.

function d4p_get_query_results() {
    global $wpdb;

    $data = $wpdb->get_results(' // SQL query // ');

    return $data;
}

And here is the function using the transient cache, with a few extra lines to check whether the data is cached.

function d4p_get_query_results() {
    global $wpdb;

    $data = get_transient('my_transient_key');

    if ($data === false) {
        $data = $wpdb->get_results(' // SQL query // ');
        set_transient('my_transient_key', $data, 3600 * 24 * 30); // expire after 30 days
    }

    return $data;
}

The function get_transient (or get_site_transient for a network) needs a name for the transient record key. If the key is not found or the record has expired, then the function will return false. To add a new transient cache record, you will need the record key, the object with the data and the expiration time (in seconds), and you will need to use the set_transient function (or set_site_transient for a network).

If your data changes, you can remove it from the cache. You will need the record key and the delete_transient function (or delete_site_transient for a network). In this example, if the post in the cache is deleted or changed in some way, you could delete the cache record with this:

delete_transient('my_transient_key');
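
One way to trigger that deletion automatically is to hook it to post changes. This is a minimal sketch, assuming WordPress is loaded; the callback name d4p_flush_query_results() is hypothetical, while save_post and deleted_post are standard WordPress actions.

```php
<?php
// Sketch only: assumes WordPress is loaded. The callback name is hypothetical.
function d4p_flush_query_results() {
    // Drop the cached results; the next call to d4p_get_query_results()
    // will run the SQL query again and store a fresh copy.
    delete_transient( 'my_transient_key' );
}

// Fires whenever a post is created or updated.
add_action( 'save_post', 'd4p_flush_query_results' );
// Fires after a post has been deleted.
add_action( 'deleted_post', 'd4p_flush_query_results' );
```

On a busy editorial site this would clear the cache on every save, so it is a trade-off: you get fresher data in exchange for more frequent regeneration.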

Important Notes

  1. The theoretical maximum size of data you can store in this way is 4 GB. But usually you would keep much smaller amounts of data in a transient (up to a couple of MB).
  2. Use this method only for data (or operations) that do not change often, and set the expiration time to match the cycle of data changes.
  3. You can also use it to cache rendered output: generate the HTML from a series of database queries once, and store the resulting HTML in the cache.
  4. The name of the transient record may not be longer than 45 characters, or 40 characters for “site transients” (used with multi-sites to store data at the network level).
  5. It works with WordPress since version 2.8.

Widget Example: Using Both Types Of Cache

Based on our SQL query, we can create a widget that relies on both caching methods. These are two approaches to the problem, and the two widgets will produce essentially the same output, but using different methods for data retrieval and results caching. As the administrator, you can set a title for the widget and the number of days to keep the results in the cache.

Both versions are simple and can be improved further (such as by selecting the post’s type or by formatting the output), but for this article they are enough.

Raw Widget

The “raw” widget version stores an object with the SQL query results in the transient cache. In this case, the SQL query would return all columns from the wp_posts table and some columns from the wp_users table, along with information about the authors. Every time the widget loads, each post from our results set is stored in the non-persistent object cache in the standard posts group, which is the same one used to store posts for normal WordPress operations. Because of this, functions such as get_permalink() can use the cached object to generate the post’s URL. Information about the authors from the wp_users table is used to generate the URL for the archive of each author’s posts.

This widget is located in the method_raw.php file in the d4p_sa_method_raw class. The function get_data() is the most important part of the widget. It attempts to get data from the transient cache (on line 52). If that fails, get_data_real() is called to run the SQL query and return the data. The data is then saved into the transient cache (line 56). After we have the data, we store each post from the set into the non-persistent cache. The render function is simple; it displays the results as an unordered list.
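
The flow just described can be sketched roughly as follows. This is not the plugin’s actual code: the function name d4p_sa_get_data_sketch(), the transient name d4p_sa_raw_results and the $run_query callback (standing in for get_data_real()) are assumptions made for illustration, and the sketch assumes WordPress is loaded and that each result row is an object with an ID property.

```php
<?php
// Sketch only, not the plugin's real implementation. Assumes WordPress is
// loaded. $run_query is any callable that returns an array of post rows
// (objects with an ID property); $days is the cache lifetime in days.
function d4p_sa_get_data_sketch( $run_query, $days ) {
    $key  = 'd4p_sa_raw_results'; // hypothetical transient name
    $data = get_transient( $key );

    if ( false === $data ) {
        // Transient missing or expired: run the expensive query once
        // and keep the result for the configured number of days.
        $data = call_user_func( $run_query );
        set_transient( $key, $data, $days * 24 * 3600 );
    }

    // Prime the non-persistent object cache on every page load, so later
    // lookups of these posts in the same request can skip the database.
    foreach ( $data as $post ) {
        wp_cache_add( $post->ID, $post, 'posts' );
    }

    return $data;
}
```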

Rendered Widget

The previous method works well, but it could have one problem. What if your permalink depends on categories (or other taxonomies) or you are running a query for a post type in a hierarchy? If that is the case, then generating a permalink for each post would require additional SQL queries. For example, to display 20 posts, you might need another 20 or more SQL queries. To fix the problem, we’ll change how we get the data and what is stored in the transient cache.

The second widget is located in the method_rendered.php file in the d4p_sa_method_rendered class. Within it, the names of the class methods are the same, so you can easily see the difference between the two widgets. In this case, the transient cache is used in the render() method. We check for cached data, and if that fails we use get_data() to get the data set and generate a rendered list of results. Now, we are caching the rendered HTML output! No matter how many extra SQL queries are needed to generate the HTML (for permalinks or whatever else you might need in the widget), they are run only once, and the complete HTML is cached. Until the cache expires, we are always displaying HTML rendered without the need for any additional SQL queries or processing.
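
A rough sketch of this approach is shown below. Again, this is not the plugin’s actual code: the function name d4p_sa_render_sketch(), the transient name d4p_sa_rendered_html and the $get_data callback are hypothetical, and the sketch assumes WordPress is loaded and that each row has ID and post_title properties.

```php
<?php
// Sketch only, not the plugin's real implementation. Assumes WordPress is
// loaded. $get_data is any callable returning an array of post rows;
// $days is the cache lifetime in days.
function d4p_sa_render_sketch( $get_data, $days ) {
    $key  = 'd4p_sa_rendered_html'; // hypothetical transient name
    $html = get_transient( $key );

    if ( false === $html ) {
        // Build the list once. Any extra queries that get_permalink()
        // needs happen only here, not on cached page loads.
        $html = '<ul>';
        foreach ( call_user_func( $get_data ) as $post ) {
            $html .= sprintf(
                '<li><a href="%s">%s</a></li>',
                esc_url( get_permalink( $post->ID ) ),
                esc_html( $post->post_title )
            );
        }
        $html .= '</ul>';

        set_transient( $key, $html, $days * 24 * 3600 );
    }

    echo $html;
}
```

Because the same HTML is served to every visitor until the transient expires, this pattern suits output that does not vary by user or contain restricted content.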

Download the Widget

You can download this D4P Smashing Authors plugin [2], which contains both widgets.

Conclusion

As you can see, implementing one or both caching methods is easy and could significantly improve the performance of your plugin. If a user of your plugin decides to use a specialized caching plugin, all the better, but make sure that your code is optimized.

(al)

Footnotes

  1. http://www.smashingmagazine.com/wp-content/uploads/2012/06/doityourself-cache-splash.png
  2. http://www.smashingmagazine.com/wp-content/uploads/2012/06/d4p-smashing-authors.1.0.0.zip

Founder of Dev4Press, dedicated to development for WordPress, focusing on premium plugins. The Dev4Press website offers a wide selection of practical tutorials for WordPress. Milan is the author of many popular WordPress plugins, including GD Star Rating, GD Press Tools and GD CPT Tools. Milan has also developed several plugins for bbPress, for WordPress-powered forums, including GD bbPress Toolbox.

  1. 1

    Great post!

    -7
  2. 2

    Konstantin Kovshenin

    June 26, 2012 3:57 am

    Nice overview!

    What you call “Non-Persistent Cache” is actually called Object Caching. Whether it is persistent or not, depends on whether a persistent caching plugin is installed. Prior to version 2.5, you could use the WP_CACHE constant in your config file to enable simple file-based caching for object cache too. Persistent object caching doesn’t necessarily store your data in the database. More common approaches are memcached, file-based cache, etc.

    More on object caching: http://codex.wordpress.org/Class_Reference/WP_Object_Cache

    And what you call “Database-Driven Temporarily Persistent Cache” is better known as Transient Caching, and is more often used with WP_Query/get_posts and not direct SQL. Running SQL to get a set of posts is non-future proof, difficult to read and maintain, and insecure. *Always* use WP_Query or get_posts to query for posts in WordPress. If you need any custom SQL at all, do it through appropriate filters: posts_where, posts_orderby, etc.

    Back to the topic, transient caching is also very useful to cache HTTP requests, which are generally more expensive than a dozen SQL queries against your own servers, so for example your latest tweets from Twitter, or your current Facebook status is much more common for transient cache.

    More on transient caching: http://codex.wordpress.org/Transients_API

    Also, transients were introduced in WordPress 2.8 and not 3.0.

    ~ Konstantin

    14
    • 3

      Mssr. Kovshenin… Please don’t just do a “hit and run” on this. Are you implying that for “best practices” Milan’s code should be altered in some way? If so – please give the code that YOU would use. I, for one, don’t follow the specifics of what you are implying.

      1
      • 4

        Konstantin Kovshenin

        June 27, 2012 4:08 am

        Looks like my other comment hit the spam bee, so we’ll have to wait for an approval ;) hang in there!

        -1
      • 5

        Konstantin Kovshenin

        June 28, 2012 10:37 pm

        @dj Yes it should. It should get rid of all SQL queries, $wpdb should never be used unless you really, *really* need to. You can find a complete example of transient caching with WP_Query on this Codex page: http://codex.wordpress.org/Transients_API#Complete_Example

        Here’s how Milan’s query could have changed into a proper WP_User_Query to get top authors, combined with a WP_Query to get their posts. Also, non-deterministic MySQL functions such as NOW() will by-pass MySQL’s query caching, so it’s better to generate that last month’s number outside of MySQL or maybe even use BETWEEN.

        And what about password protected posts? Milan’s query will select them too, and he’s very lucky he’s not printing the contents of those posts. If he was, an admin user visit (who can view the contents of all password protected posts) would cause the post contents to be cached in a transient, making them available for x amount of days, even to anonymous users. Instead, you should cache the query, but not the output if it can contain sensitive data like password protected posts, that way the_content will do its job and output a password field for anonymous users. Caching content is fine with non-sensitive data though, like a Twitter feed or something, which is a *very* good example for transient caching with wp_remote_get. This is the main reason why many page caching plugins will not cache the output buffer for logged in users.

        Milan’s d4p_get_all_post_meta() function is a rather bad copy of get_post_custom(), and it’s worth noting that by the time a WP_Query or get_posts is run, all metadata and taxonomy is *already* inside your object cache, so functions like get_post_meta and get_post_custom will never hit the database, unless cache_results is turned off for WP_Query. Here’s more on WP_Query and object caching: http://codex.wordpress.org/Class_Reference/WP_Query#Parameters_relating_to_caching

        It’s also wrong saying persistent caching relies on the database, because it doesn’t. In fact, transient caching doesn’t actually rely on the database either, how’s that! Surprised? Let’s take a closer look at set_transient, see that $_wp_using_ext_object_cache global? Now let’s take a look at wp_start_object_cache. So transients will switch to using object caching if an object caching plugin is enabled, and will not use the options api, and thus, they will never hit the database.

        What I’m trying to say here is, that although the article is fine for starters, you shouldn’t believe everything mentioned here. It’s far more complicated than just set() and get(). I would also rewrite the examples to illustrate easier, yet more effective usage of all types of caches. As I already mentioned, a wp_remote_get would be nice to cache, and a simple WP_Query with a random order.

        Let me know if you have any questions about any of this and I’ll be more than happy to help you out when I can. Good luck!

        ~ Konstantin

        9
        • 6

          Forgive the ignorance, but what do you mean by “Instead, you should cache the query…” How do you use the Transient API to cache a query? Or are you talking about another method to cache the query?

          0
      • 7

        Konstantin Kovshenin

        June 29, 2012 2:21 am

        Ha! Looks like somebody deleted my pending comment, so I published another one, which is now pending again. I don’t think it’s harsh enough to not approve it, but in any case, if somebody wants to know the details about why I think this article is misleading, you’re free to ping me on Twitter: @kovshenin

        Cheers!

        3
    • 8

      Thanks for the comment! I have mentioned the Transient API, and the example here is used to illustrate the method of caching for an operation that returns the same results over a period of time. That is why there are two examples for it: one for caching results from the query, and the other to cache rendered results. The same method can be applied to any other operation that has slow-changing results.

      6
      • 9

        Hey this is a great way of flushing the dns cache, I’ve been using alternate ways of doing it which seemed to work fine but it’s handy to know there’s more ways of doing it. I have a bunch of commands for various other OS’s if anybody is looking for tips with how to do it.

        0
  3. 10

    Gennady Kovshenin

    June 26, 2012 4:31 am

    Sanitize that SQL query, the deuce knows where $post_id will eventually come from and end up in.

    2
  4. 11

    Pankaj Parashar

    June 26, 2012 9:21 am

    Interesting stuff!! I never really imagined so many aspects of caching.

    0
  5. 12

    Kieran Masterton

    June 26, 2012 10:22 am

    Great overview, thanks Milan :)

    0
  6. 13

    Another great post mate :)

    1
  7. 14

    Awesome, finally a clear explanation. Thx

    1
  8. 15

    good one :-)

    -1
  9. 16

    That was a great read !

    0
  10. 17

    As you discuss, out of the box the Persistent API can save resources by storing result sets but the biggest benefit to the caching API isn’t to save computation over the course of a page load. It’s to let your plugins leverage those third party object-caching plugins to save repeated database calls for multiple users on a site.

    Good habits like this are key to designing scalable plugins and separate the quality plugins from the mediocre.

    One of my favorite aspects of the WordPress platform is that it’s so easy to harness that power in a way that others can benefit from as well.

    1
  11. 18

    Hi,

    One of the most overlooked and important aspects of speeding up a user’s experience of a website is to address browser caching correctly. This is where a huge proportion of time is lost and the fixes are simple to configure, but so few people implement them correctly.

    You can tune the nuts out of the underlying platform technology, (WordPress, MySQL, PHP etc etc) but if the web server (Apache) is not configured in a performance oriented manner then I wouldn’t bother. You’re only as fast as your weakest link and that tends to be the bit between the web server and the user.

    I would estimate that less than 10% of the websites I look at have the web server configured correctly. I’d be interested in your thoughts?

    Regards,

    Dan

    -1
  12. 19

    Very technical overview that’s attractive for some people. A far easier way to speed up your page loading times is to use managed servers which don’t require any plugins.

    0
  13. 20

    HI,

    “Your plugin needs to be fast on its own, not depending on other plugins to do the dirty work”.
    Can you please explain this one better?

    Thanks
    -db

    0
    • 21

      I think he means that your plugin should be able to operate as expected on its own, without being dependent upon other plugins for optimal operation. For example, there are many plugins that explicitly extend and use WP Super Cache or similar plugins, but some plugins will just use that plugin’s functionality if it is already present.

      In that sense, the plugin is either completely dependent upon another plugin, or just improved by it. As a general design principle, your plugin will be considered less usable if it requires many components that a raw installation of WordPress does not have. In other words, it is always preferable for a plugin to build upon the WordPress core functionality, and only secondarily consider using other plugins’ functionality as a substitute.

      0
  14. 22

    “When tested on my local machine, this SQL query took 0.1710 seconds to run. If we had 1000 page views per day, this one query would take 171 seconds every 24 hours, or 5130 seconds per month.”

    You don’t get the point of caching. Calling MySQL is expensive. So instead of making another request, we use a framework like PHP APC to cache the result. The result can be a MySQL query or any other kind of computation we are doing.

    The problem with MySQL is not the time required. MySQL is generally quite fast, and even 1 second is not a very long time. The problem is generally with the number of requests you make. Each new request is a new connection to MySQL, and a connection consumes a lot of RAM and CPU.

    The WP_Object_Cache is not made only for MySQL, but also supports any other kind of computation you want to make. It also helps if you are integrating with PHP APC or a similar extension.

    1
  15. 23

    Thanks for this overview! A very good way of using the Transient API is when you’re making calls to third-party APIs. I’m building a plugin that relies on an external API to fetch and store data, and since these data don’t change frequently there’s really no point at all in making repeated requests to get the same result twenty times a day… Better to store it in cache for a couple of days.

    0
