Whether it’s a few pages you’re trying to hide or an entire staging site – almost everyone eventually has stuff showing up in Google that shouldn’t be there. Before we explore methods to prevent pages marked as “noindex” from appearing in search results, let’s take a closer look at what a “noindex” tag is and why it’s important.
What is a noindex meta tag?
A “noindex” tag is a specific HTML meta tag, known as a meta robots tag, that is used to instruct search engines like Google not to index a particular web page. When you add a noindex meta tag to the HTML code of a webpage, you are essentially telling search engines that you don’t want that page to be included in their search results. This is often used for pages that you want to keep hidden from search engine results, such as private or outdated content, but still want to keep on your website for other purposes.
A noindex meta tag is placed in the head of a page and looks like this:
<meta name="robots" content="noindex" />
Why are noindex meta tags important?
The “noindex” tag is crucial for website owners and digital marketers to control what content appears in search engine results, enhance SEO strategies, protect sensitive data, and improve the overall user experience. By selectively choosing which pages to index, sites can prevent sensitive or irrelevant content from showing up in search results while ensuring that their most valuable content is prioritized for indexation and ranking. Noindexing content also helps to avoid duplicate content issues, preserves crawl budget, and helps websites comply with legal requirements related to privacy and data protection.
Technical SEO is complicated. Let the experts help!
Why Hasn’t Google Removed Pages with Noindex Meta Tags from the Search Results?
So let’s say you added a noindex meta tag (<meta name="robots" content="noindex"> in the <head>) to the relevant pages and you’re still seeing them in Google (usually via a site: search). Now what?
Almost always, the issue is one of two things.
1. Google can’t crawl the content
This is usually because you either blocked crawling via robots.txt or put the pages behind a login. Both are decent ways to prevent Google from seeing pages in the first place, but neither is helpful after it has already indexed them.
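For example, a robots.txt rule like this one (the /staging/ path is a hypothetical example) stops Googlebot from crawling those pages at all – which means it can never fetch them and see the noindex tags you’ve added:

```
User-agent: *
Disallow: /staging/
```

If pages under that path are already indexed, this rule actively works against you: Google keeps the stale listing because it isn’t allowed to revisit the page.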
2. Google hasn’t crawled the content
Common for dev sites, or for pages that were once linked on your site but have since been removed.
How to Ensure Google Recognizes Your Noindex Meta Tag
In both cases mentioned above, the issue is that Google doesn’t know you’ve added a noindex meta tag. Google doesn’t revisit everything in its index every day or even every week, so we need to draw its attention.
- Open up crawling – remove logins or robots.txt crawl restrictions
- For more urgent issues, use the URL Removal Tool in Search Console (note – this is temporary!)
- Inspect the URL using the search bar at the top of every page of Search Console.
- Click Request Indexing
You read that right. You’re not really telling Google to index the URL – you’re telling it to crawl it. That crawl is what gets Google to notice your noindex tag.
Once your page is removed from Google you can go back to putting the content behind a login or blocking via robots.txt.
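Before requesting indexing, it’s worth double-checking that the tag is actually present in the HTML. Here’s a minimal sketch in Python of one way to check a page’s source for a robots noindex directive (a naive regex check for illustration – a thorough check would also parse the HTML properly and inspect the X-Robots-Tag response header):

```python
import re

def html_has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with a noindex directive."""
    # Look at each <meta ...> tag in the document
    for tag in re.findall(r"<meta\b[^>]*>", html, flags=re.IGNORECASE):
        # Tag must target robots (or googlebot) and include "noindex" in its content
        if re.search(r'name\s*=\s*["\'](robots|googlebot)["\']', tag, re.IGNORECASE) and \
           re.search(r'content\s*=\s*["\'][^"\']*noindex', tag, re.IGNORECASE):
            return True
    return False
```

Fetch the page source (with your tool of choice) and pass it to this function to confirm the tag made it into production before you ask Google to recrawl.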
How to Noindex Multiple URLs
Got a ton of pages? Maybe an entire site? No need to worry.
The easiest way to do this is to create an XML sitemap of your noindexed pages and submit it via Search Console.
If you have a list of URLs, there are sitemap generators out there. You can also use Screaming Frog to generate the file for you (and verify your noindex tags are all present at the same time).
This accomplishes the same thing (getting Google to crawl your pages and see the noindex tags) at scale. You can also monitor the sitemap in Search Console and watch the number of indexed pages drop to zero.
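If you’d rather script the sitemap yourself than use a generator, here’s a minimal Python sketch that turns a list of URLs into a sitemap file (the example URLs are placeholders; it assumes the URLs don’t contain characters that need XML escaping):

```python
def build_sitemap(urls):
    """Build a minimal XML sitemap string from a list of page URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # One <url><loc> entry per page you want Google to recrawl
        lines.append(f"  <url><loc>{url}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)

# Placeholder URLs for illustration:
sitemap_xml = build_sitemap([
    "https://example.com/old-page",
    "https://example.com/private-page",
])
```

Save the output as something like noindex-sitemap.xml, upload it to your site, and submit it in Search Console.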
Simple as that! Go forth and make your content disappear.