We are running into an issue where content on our website is being found too early. For example, we have a post named ‘Client X’ to which we have assigned the category ‘unpublished’. We don’t want this post to be found by any site; however, some websites do find it.
What we have done so far:
- Added noindex to the post when it is in the category ‘unpublished’ (this works: Google does not pick it up)
- Removed the post from sitemap generation
- Removed the post from oursite.com/feed (all possible feeds)
- Removed the post from the REST API (/wp-json/wp/v2/posts)
The post is still found by some crawlers. For example, http://explore.finchline.nl/ picks up the post when you search for our website.
PS: We don’t want to unpublish or password-protect the post (for other reasons). It needs to keep the WP status ‘published’.
I just found out 2 other ways this post could be found:
- oursite.com/2017/
- oursite.com/?post_type=post
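
To keep track of the places checked so far, something like the following could be used to probe each URL for the post. This is only a rough sketch, not a complete audit: the paths are the ones mentioned above, `oursite.com` is the placeholder domain, and the names `check_site`, `leaks_marker` and `fetch_text` are made up for illustration. It can only confirm leaks at URLs that are already known; it does not discover new ones.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen

# Paths taken from the lists above; extend this as more entry points turn up.
CANDIDATE_PATHS = [
    "/feed",
    "/wp-json/wp/v2/posts",
    "/2017/",
    "/?post_type=post",
]


def leaks_marker(body: str, marker: str) -> bool:
    """Return True if the marker text appears in the body (case-insensitive)."""
    return marker.lower() in body.lower()


def fetch_text(url: str) -> str:
    """Download a URL and return its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def check_site(base_url: str, marker: str, paths=CANDIDATE_PATHS, fetch=None) -> dict:
    """Map each candidate URL to True (marker found), False (not found)
    or None (the URL could not be fetched)."""
    fetch = fetch or fetch_text
    results = {}
    for path in paths:
        url = urljoin(base_url, path)
        try:
            results[url] = leaks_marker(fetch(url), marker)
        except (URLError, OSError):
            results[url] = None  # unknown: request failed
    return results
```

Calling `check_site("https://oursite.com", "Client X")` (the `https` scheme is an assumption) would return a dictionary of URL to result. Searching for the post title is a simplification; a unique phrase from the post body may give fewer false positives.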
Since I have already stumbled upon about five different ways this post can be found, I am afraid that closing these won’t solve it either. What other ways are there to find the content of a site? In other words, what do I need to do to make sure the post is never found as long as it is in the category ‘unpublished’?
Thanks!