(Moderator’s note: Title was originally: “query/database optimisation”)

I’m written a function for a custom “filter” search panel that allows users to select terms from up to four custom taxonomies. I’m running the queries directly against the database and the query is averaging half a second to execute (with terms for all four taxonomies and one result returned).

This seems pretty slow to me. I was wondering if there’s anything I can do to either optimise the query or even the database to make this more efficient/faster. Perhaps writing a view, even? I have experience with MS-SQL but not much with MySQL and I’m not sure how things are different.

Here’s my function code:

    function filter_resources($phase,$wa,$aus,$topics){
    global $wpdb;
    $querystr="
    SELECT * 
        FROM $wpdb->posts A
            LEFT JOIN $wpdb->term_relationships B ON(A.ID = B.object_id)
            LEFT JOIN $wpdb->term_taxonomy C ON(B.term_taxonomy_id = C.term_taxonomy_id)
            LEFT JOIN $wpdb->terms D ON(C.term_id = D.term_id)

        LEFT JOIN $wpdb->term_relationships BB ON(A.ID = BB.object_id)
            LEFT JOIN $wpdb->term_taxonomy CC ON(BB.term_taxonomy_id = CC.term_taxonomy_id)
            LEFT JOIN $wpdb->terms DD ON(CC.term_id = DD.term_id)

        LEFT JOIN $wpdb->term_relationships BBB ON(A.ID = BBB.object_id)
            LEFT JOIN $wpdb->term_taxonomy CCC ON(BBB.term_taxonomy_id = CCC.term_taxonomy_id)
            LEFT JOIN $wpdb->terms DDD ON(CCC.term_id = DDD.term_id)

        LEFT JOIN $wpdb->term_relationships BBBB ON(A.ID = BBBB.object_id)
            LEFT JOIN $wpdb->term_taxonomy CCCC ON(BBBB.term_taxonomy_id = CCCC.term_taxonomy_id)
            LEFT JOIN $wpdb->terms DDDD ON(CCCC.term_id = DDDD.term_id)

        WHERE A.post_type="resources" 
            AND A.post_status="publish"
            AND C.taxonomy = 'phase-of-learning'
            AND D.term_id = '$phase'
            AND CC.taxonomy = 'wa-curriculum'
            AND DD.term_id = '$wa'
            AND CCC.taxonomy = 'australian-curriculum'
            AND DDD.term_id = '$aus'
            AND CCCC.taxonomy = 'topics'
            AND DDDD.term_id = '$topics'
        ORDER BY A.post_date DESC";
    return $wpdb->get_results($querystr,OBJECT);
}

Thanks!

3 Answers
3

While this is really a MySQL question it does help to understand the WordPress SQL schema and also I love trying to optimize SQL queries so rather than send you off to StackOverflow I’ll try to answer you here. You may still want to post it over there to get some other opinions.

And while I don’t fully understand your requirement I think I understand what you are asking I think I do so I’d like to present the following to see if it meets your needs better. I don’t have your data so it is a little hard for me to very that it indeed works but like I said, I think it meets your needs:

function filter_resources($phase,$wa,$aus,$topics){
  global $wpdb;
  $sql =<<<SQL
SELECT
  t.slug,p.*
FROM
  wp_posts p
  INNER JOIN wp_term_relationships tr ON p.ID=tr.object_id
  INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
  INNER JOIN wp_terms t ON tt.term_id = t.term_id
WHERE 1=1
  AND p.post_type="resources"
  AND p.post_status="publish"
  AND t.term_id IN (%d,%d,%d,%d)
  AND CONCAT(tt.taxonomy,"https://wordpress.stackexchange.com/",t.term_id) IN (
    'phase-of-learning/%s',
    'wa-curriculum/%s',
    'australian-curriculum/%s',
    'topics/%s'
  )
GROUP BY
  p.ID
HAVING
  COUNT(*)=4
ORDER BY
  p.post_date DESC
SQL;
  $sql = $wpdb->prepare($sql,
    $phase,$wa,$aus,$topics,  // For the %d replacements
    $phase,$wa,$aus,$topics   // For the %s replacements
  );
  $results = $wpdb->get_results($sql,OBJECT);
  return $results;
}

Basically this gives you all posts where all of your taxonomy terms are applied and it does so by doing a freeform query to match all posts that have the taxonomy/terms applied but limits to only those posts that have all terms applied grouping by wp_post.ID and finding all records for which the post is joined 4 times. When you run a MySQL EXPLAIN the optimization looks pretty good compared to what you had; many fewer tables joined. Hopefully this was the logic you needed.

Caching with the Transients API

And if you are trying to improve performance you might also consider caching the results in a “transient” for a limited amount of time (1 hour, 4 hours, 12 hours or more?) This blog post describes how to use the WordPress Transients API:

  • Overview of WordPress’ Transients API

Here’s the basic logic for transients:

define('NUM_HOURS',4); // Time to cache set for your use case
$data = get_transient( 'your_transient_key' );
if( !$data ) {
  $data = // Do something to get your data here
  set_transient( 'your_transient_key', $data, 60 * 60 * NUM_HOURS );
}  

To use transients in your filter_resources() function it might instead look like this:

define('RESOURCE_CACHE_HOURS',4);
function filter_resources($phase,$wa,$aus,$topics){
  $resources = get_transient( 'yoursite_filtered_resources' );
  if(!$resources) {
    global $wpdb;
    $sql =<<<SQL
SELECT
  t.slug,p.*
FROM
  wp_posts p
  INNER JOIN wp_term_relationships tr ON p.ID=tr.object_id
  INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
  INNER JOIN wp_terms t ON tt.term_id = t.term_id
WHERE 1=1
  AND p.post_type="resources"
  AND p.post_status="publish"
  AND t.term_id IN (%d,%d,%d,%d)
  AND CONCAT(tt.taxonomy,"https://wordpress.stackexchange.com/",t.term_id) IN (
    'phase-of-learning/%s',
    'wa-curriculum/%s',
    'australian-curriculum/%s',
    'topics/%s'
  )
GROUP BY
  p.ID
HAVING
  COUNT(*)=4
ORDER BY
  p.post_date DESC
SQL;
    $sql = $wpdb->prepare($sql,
      $phase,$wa,$aus,$topics,  // For the %d replacements
      $phase,$wa,$aus,$topics   // For the %s replacements
    );
    $resources = $wpdb->get_results($sql,OBJECT);
    $hours = RESOURCE_CACHE_HOURS * 60 * 60;
    set_transient( 'yoursite_filtered_resources', $resources, $hours);
  }  
  return $resources;
}

UPDATE

Here’s another take on the code that is attempting to handle the cases where less than four criteria are selected by the user:

define('RESOURCE_CACHE_HOURS',4);
function filter_resources($phase,$wa,$aus,$topics){
  $resources = get_transient( 'yoursite_filtered_resources' );
  if(!$resources) {
    $terms = $taxterms = array();
    if (!empty($phase))
      $taxterms[$phase] = 'phase-of-learning/%s';
    if (!empty($wa)) 
      $taxterms[$wa] = 'wa-curriculum/%s';
    if (!empty($aus))
      $taxterms[$aus] = 'axustralian-curriculum/%s';
    if (!empty($topics))
      $taxterms[$topics] = 'topics/%s';
    $count = count($taxterms);
    $having = ($count==0 ? '' : "HAVING COUNT(*)={$count}");
    $values = array_keys(array_flip($tax_terms));
    $values = array_merge($values,$values);  // For %d and $s
    $taxterms =  implode("','",$taxterms);
    $terms = implode(',',array_fill(0,$count,'d%'));
    global $wpdb;
    $sql =<<<SQL
SELECT
  t.slug,p.*
FROM
  wp_posts p
  INNER JOIN wp_term_relationships tr ON p.ID=tr.object_id
  INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
  INNER JOIN wp_terms t ON tt.term_id = t.term_id
WHERE 1=1
  AND p.post_type="resources"
  AND p.post_status="publish"
  AND t.term_id IN ({$terms})
  AND CONCAT(tt.taxonomy,"https://wordpress.stackexchange.com/",t.term_id) IN ('{$taxterms}')
GROUP BY
  p.ID
{$having}
ORDER BY
  p.post_date DESC
SQL;
    $sql = $wpdb->prepare($sql,$values);
    $resources = $wpdb->get_results($sql,OBJECT);
    $hours = RESOURCE_CACHE_HOURS * 60 * 60;
    set_transient( 'yoursite_filtered_resources', $resources, $hours);
  }  
  return $resources;
}

Leave a Reply

Your email address will not be published. Required fields are marked *