My robots.txt:
User-agent: googlebot
disallow: /xxx/y.html
y.html has lots of links like "/mmm/a.html" and "/asd/b.html".
My question: will Google index "/mmm/a.html" and "/asd/b.html"?
These links appear only in "/xxx/y.html".
Note that your robots.txt must not have blank lines within a record (i.e., between User-agent and Disallow), so it should be:
User-agent: googlebot
Disallow: /xxx/y.html
This record will disallow "googlebot" from crawling URLs whose paths start with /xxx/y.html. So it will block URLs like:
http://example.com/xxx/y.html
http://example.com/xxx/y.html.zip
http://example.com/xxx/y.html5
http://example.com/xxx/y.html/foo
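The prefix matching described above can be checked with a minimal sketch using Python's standard-library urllib.robotparser. This is Python's parser, not Google's own crawler, so treat it as an approximation of the rule rather than a statement of how Googlebot is implemented. The helper name is_blocked and the last URL in the list are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The record from the question, with no blank line between the two fields.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /xxx/y.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(url, agent="googlebot"):
    """Return True if the parsed rules disallow `agent` from fetching `url`."""
    return not parser.can_fetch(agent, url)


urls = [
    "http://example.com/xxx/y.html",
    "http://example.com/xxx/y.html.zip",
    "http://example.com/xxx/y.html5",
    "http://example.com/xxx/y.html/foo",
    "http://example.com/xxx/z.html",  # hypothetical URL that does not share the prefix
]

for url in urls:
    print(url, "blocked" if is_blocked(url) else "allowed")
```

Under this parser, the first four URLs are reported as blocked because their paths start with /xxx/y.html, and the last one is allowed.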
This means that "googlebot" will never visit these pages. So if you have a link on one of these pages, the bot will not find it in the first place.
However, if Google learns about such a link in a different way, it will probably visit it (unless it is also blocked by robots.txt). Such other ways could be, for example, using tools that send statistics to Google (like Google Toolbar, Google Analytics, etc.), having other pages include a link, having the link in a sitemap, submitting the link to Google, and so on.
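To illustrate that the linked pages themselves are not covered by the record, here is a small sketch, again using Python's standard-library urllib.robotparser as a stand-in (it only evaluates the robots.txt rules and says nothing about whether Google actually discovers or indexes a URL). The example.com host is assumed, following the URLs used above.

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: googlebot",
    "Disallow: /xxx/y.html",
])

# The pages that y.html links to, on an assumed example.com host.
linked_pages = [
    "http://example.com/mmm/a.html",
    "http://example.com/asd/b.html",
]

# Map each linked page to whether the rules permit "googlebot" to fetch it.
crawlable = {url: rules.can_fetch("googlebot", url) for url in linked_pages}

for url, allowed in crawlable.items():
    print(url, "allowed" if allowed else "blocked")
```

Both pages come out as allowed, so whether they get visited depends only on whether the bot learns about them somewhere other than y.html.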