What is Custom Robots txt Use For?.

In search preferences section, you will see 3 area that need an advance blogging knowledge to play with. They are meta tags, errors/redirections, and crawlers & indexing. In this post I’m gonna try to give you an easy explanation about what is custom robots txt use for.

Robots.txt is a command path for search engine to crawl or not to crawl a specific path or content on your blog or site. Its function is like a filter. Every blogspot site have a default robots.txt like below:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://blogname.blogspot.com/feeds/posts/default?orderby=UPDATED

If you want to see yours, go to your browser address bar and type:

http://yourblogname.blogspot.com/robots.txt

What Are These Lines Means?.

User-agent: Mediapartners-Google
– Google Adsense Robot will crawl your blog content. If you add adsense on your blog, this robot will help your blog get the right advertise display on your pages.

Disallow:
– Command that tell SE robot to not visit this pages, post, or categories. There is no / sign after that, so it’s mean GA robot can crawl all your content.

User-agent: *
– All internet search engine robot.

Disallow: /search
– Robot have no permission to crawl folder search such as /search/label and search/search?updated, etc. Why?, because label is not a real url. Google wants you search topic from search engine box, not just click randomly on label or categories. This also to avoiding duplicate content.

Allow: /
– Allow all pages to get crawl except the path on disallow above.

Sitemap: http://blogname.blogspot.com/feeds/posts/default?orderby=UPDATED
– Your blog feed address.

How to Prevent a Certain Page?.

If you have been blogging for quite long time, there is some point, you want to block robots to crawl a certain page. Maybe it’s a private content or top secret that you just want to share with your co-worker. On user-agent: *, disallow: /search part, you can add this example path below:

User-agent: *
Disallow: /search
Disallow: /p/secret-1.html
Disallow: /p/secret-2.html
Allow: /

The image below is an example friendly robots.txt for blogspot. If you want to change it like below, go to settings —> search preferences —> crawlers and indexing —> enable custom robots.txt. Don’t forget to save changes when you done.

Click on the image for larger size.

2 thoughts on “What is Custom Robots txt Use For?.”

Imjsu

What is Custom Robots txt Use For?.

What Are These Lines Means?.

How to Prevent a Certain Page?.

2 thoughts on “What is Custom Robots txt Use For?.”

Leave a Reply Cancel reply

Sharing knowledge in cgi, blogging, and coding.