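How you write the robots.txt file for a WordPress site depends on the site itself, but you can borrow from the robots.txt of well-known sites. For example, the robots.txt used by WordPress教程网 can be fetched from http://www.e363.com/robots.txt and reads as follows: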
User-agent:YisouSpider
User-agent:BLEXBot
User-agent:heritrix
User-agent:MJ12bot
User-agent:AhrefsBot
User-agent:SemrushBot
Disallow: /
Disallow:/wp-admin/
Disallow:/bf/
Disallow:/wp-content/
Disallow:/wp-includes/
Disallow:/*/trackback
Disallow:/feed
Disallow:/*/feed
Disallow:/comments/feed
Disallow:/comments
Disallow:/?s=*
Disallow:/*/?s=*
Disallow:/?r=*
Disallow:/?p=*
Disallow:/?orderby=views
Disallow:/*/comment-page-*
Disallow:/*?replytocom*
Disallow:/page/*
Disallow:/*/*/page/
Disallow:/page/1$
Disallow:/tag/*/*
Disallow:/tag/*/page/*
Disallow:/date/
Disallow:/date/*
Disallow:/author/
Disallow:/author/*
Disallow:/category/
Disallow:/category/*
Disallow:/category/*/page/*
Disallow:/?p=*&preview=true
Disallow:/?page_id=*&preview=true
Disallow:/wp-login.php
Disallow:/*.jpg$
Disallow:/*.jpeg$
Disallow:/*.gif$
Disallow:/*.png$
Disallow:/*.bmp$
Disallow:/wp-login.php?*
Disallow:/download1?id=*
Disallow:http://www.e363.com:8181/
Sitemap:http://www.e363.com/sitemap.xml
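In the file above, several User-agent lines are stacked in front of a single Disallow: /, which bans every one of those crawlers from the whole site. As a rough illustration of that grouping, the Python sketch below feeds a shortened, hypothetical sample to the standard-library urllib.robotparser. The sample's User-agent: * group and the test paths are assumptions made for the example, and this parser only does plain prefix matching, so wildcard rules such as /*.jpg$ are left out.

```python
from urllib.robotparser import RobotFileParser

# Shortened, hypothetical sample: two banned bots share one group,
# and a catch-all group (an assumption) holds the general rules.
SAMPLE = """\
User-agent: MJ12bot
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /wp-admin/
Disallow: /feed
"""

rp = RobotFileParser()
rp.parse(SAMPLE.splitlines())

# A bot named in the first group is blocked everywhere.
print(rp.can_fetch("AhrefsBot", "/hello-world/"))        # False
# Any other crawler falls back to the "*" group.
print(rp.can_fetch("Googlebot", "/wp-admin/options.php"))  # False
print(rp.can_fetch("Googlebot", "/hello-world/"))        # True
```

Checking a draft this way before uploading it is a cheap guard against accidentally locking out the crawlers you do want.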
The following explains in detail what each kind of rule means:
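- User-agent: * Applies the rules that follow to every crawler, so the site stays open to all search engines except where a Disallow rule says otherwise.
- Disallow: /wp- Blocks crawlers from every URL whose path begins with "/wp-", such as /wp-admin/, /wp-content/, /wp-includes/ and /wp-login.php.
- Disallow: /? Blocks crawlers from URLs that begin with "/?", that is, query-string URLs on the site root such as /?s=keyword. Blocking a "?" anywhere in the URL needs the wildcard form /*?.
- Disallow: /feed/ Blocks crawlers from the site's RSS feed page.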
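- Disallow: /*/feed/ Blocks crawlers from the RSS feeds of every category, tag and post.
- Disallow: /trackback/ Blocks crawlers from the site's trackback URL.
- Disallow: /*/trackback/ Blocks crawlers from the trackback URLs of every category, tag and post.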
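- Disallow: /page/ Keeps search engines such as Baidu from crawling the home page's pagination, so that the home page's ranking weight is not diluted.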
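- Disallow: /a-category/*/page/ Likewise, keeps search engines from crawling the pagination of category archives.
- Disallow: /a-tag/*/page/ Likewise, keeps search engines from crawling the pagination of tag archives.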
- Sitemap: http://www.e363.com/sitemap.xml Adds the sitemap URL to robots.txt so that crawlers can find it.
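Disallow patterns are matched from the start of the URL path, with * standing for any run of characters and a trailing $ pinning the end of the URL. The Python sketch below is a simplified model of that matching, not any search engine's actual implementation. The helper names rule_to_regex and is_blocked and the sample paths are hypothetical, and the sketch ignores Allow rules and percent-encoding.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Turn a robots.txt path pattern into a regex (simplified model).

    '*' becomes '.*', a trailing '$' anchors the end of the URL,
    and every other character is treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, url_path: str) -> bool:
    # re.match only succeeds at the start of the string, which mirrors
    # the prefix semantics of Disallow rules.
    return rule_to_regex(pattern).match(url_path) is not None


# "/wp-" is a prefix, not a substring test.
print(is_blocked("/wp-", "/wp-admin/"))            # True
print(is_blocked("/wp-", "/blog/wp-notes/"))       # False
# "/?" only covers query strings on the site root.
print(is_blocked("/?", "/?s=hello"))               # True
print(is_blocked("/?", "/archives/?s=hello"))      # False
print(is_blocked("/*?", "/archives/?s=hello"))     # True
# "$" restricts a rule to URLs that end right there.
print(is_blocked("/*.jpg$", "/uploads/a.jpg"))     # True
print(is_blocked("/*.jpg$", "/uploads/a.jpg?v=2")) # False
```

The same model shows why a file tends to list both /feed and /*/feed: the wildcard form needs at least one path segment before /feed, so it does not cover the top-level feed on its own.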