Better way to get tag stats?

I have over 4,000 posts. On the dashboard, I am trying to query all posts, count the tags on each post, and total the number of posts by tag count. The totals display correctly when posts_per_page is below 2000, but beyond 2000 the query times out and shows '0' for every total.

Code

  $args = array(
    'posts_per_page' => 4000,
    'post_status'  => 'publish',
  );

  $zerotags = 0;
  $onetag = 0;
  $twotags = 0;
  $morethantwo = 0;
  $sixtags_plus = 0;

  $query = new WP_Query( $args );
  while ( $query->have_posts() ) : $query->the_post();

     $posttags = get_the_tags();
     // get_the_tags() returns false (or a WP_Error), not an empty array, when a post has no tags
     $tag_count = is_array( $posttags ) ? count( $posttags ) : 0;

     if ($tag_count == 0) {
       $zerotags++;
     }
     if ($tag_count == 1) {
       $onetag++;
     }
     if ($tag_count == 2) {
       $twotags++;
     }
     if ($tag_count > 2 && $tag_count < 6) {
       $morethantwo++;
     }
     if ($tag_count >= 6) {
       $sixtags_plus++;
     }

 endwhile;
 wp_reset_postdata(); // restore the global $post after the custom loop

 echo 'Zero Tags : '.$zerotags.' posts';
 echo 'One Tag : '.$onetag.' posts';
 echo 'Two Tags : '.$twotags.' posts';
 echo 'More than 2 and less than 6 : '.$morethantwo.' posts';
 echo 'Six or more tags : '.$sixtags_plus.' posts';

Is there a better approach to query this so that the timeout doesn’t occur?

3 Answers

I addressed a similar problem not long ago – it’s all in the memory:

$post_ids = get_posts(
    array(
        'posts_per_page' => -1,
        'post_status'  => 'publish',
        'fields' => 'ids', // Just grab IDs instead of pulling 1000's of objects into memory
    )
);

update_object_term_cache( $post_ids, 'post' ); // Cache all the post terms in one query, memory should be ok

// Counters, initialized as in the question
$zerotags = $onetag = $twotags = $morethantwo = $sixtags_plus = 0;

foreach ( $post_ids as $post_id ) {
    $tags      = get_object_term_cache( $post_id, 'post_tag' );
    $tag_count = $tags ? count( $tags ) : 0; // false or an empty array both mean no tags

    if ( $tag_count === 0 ) {
        $zerotags++;
    } elseif ( $tag_count === 1 ) {
        $onetag++;
    } elseif ( $tag_count === 2 ) {
        $twotags++;
    } elseif ( $tag_count < 6 ) {
        $morethantwo++; // 3 to 5 tags
    } else {
        $sixtags_plus++;
    }
}

Update: Switched get_the_tags to get_object_term_cache – otherwise we lose all our hard work! (the former hits get_post, which will hit the db on every iteration and chuck the post object into memory – props @Pieter Goosen).

Update 2: The second argument for update_object_term_cache should be the post type, not the taxonomy.
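
If it helps, the bucketing step can be pulled out of the loop into a small pure function, which makes it easy to check without touching the database. This is only a sketch: bucket_tag_counts() and its array keys are hypothetical names of my own, not part of WordPress, and it assumes you have already built a plain array holding one tag count per post.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): tally posts by how many tags they have.
 *
 * @param int[] $tag_counts One entry per post: the number of tags on that post.
 * @return array Totals keyed by bucket name.
 */
function bucket_tag_counts( array $tag_counts ) {
    $buckets = array(
        'zero'          => 0,
        'one'           => 0,
        'two'           => 0,
        'three_to_five' => 0,
        'six_plus'      => 0,
    );

    foreach ( $tag_counts as $count ) {
        if ( $count === 0 ) {
            $buckets['zero']++;
        } elseif ( $count === 1 ) {
            $buckets['one']++;
        } elseif ( $count === 2 ) {
            $buckets['two']++;
        } elseif ( $count < 6 ) {
            $buckets['three_to_five']++; // 3, 4 or 5 tags
        } else {
            $buckets['six_plus']++;      // 6 or more tags
        }
    }

    return $buckets;
}
```

Inside WordPress, the input array could be filled in the foreach above by appending count( $tags ) for each post when get_object_term_cache() returns a non-empty array, and 0 otherwise.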
