Need some help.
I have a separate file for keywords named robots.txt. It is in the root directory, the same one as index.html. It looks like this:

    User-agent:*
    Disallow:/
    Allow:/robots.txt
    /Investigators, Private Detective, etc

When I test my robots.txt I get the message: "missing / at start of file or folder name". I don't know what this means or how to fix it. It looks like I don't have the correct coding?
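For anyone hitting a similar message: robots.txt is made of `Field: value` records, so a bare keyword line is not something a checker can interpret. The sketch below is a hypothetical mini-linter (the `lint_robots` name, the list of known fields, and the messages are my own assumptions, not the tester used above) that flags lines which are not records, and Allow/Disallow paths that do not start with `/`.

```python
def lint_robots(text):
    """Return (line_number, message) pairs for lines that look wrong."""
    # Assumed set of commonly seen field names; real checkers may accept more.
    known = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, sep, value = line.partition(":")
        field = field.strip().lower()
        if not sep or field not in known:
            problems.append((number, "not a 'Field: value' record"))
            continue
        value = value.strip()
        if field in {"allow", "disallow"} and value and not value.startswith(("/", "*")):
            problems.append((number, "path should start with /"))
    return problems


sample = """User-agent:*
Disallow:/
Allow:/robots.txt
/Investigators, Private Detective, etc
"""

for number, message in lint_robots(sample):
    print(f"line {number}: {message}")
```

On the file from the post, only the keyword line is reported; the three records above it parse fine.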
|
||||
|
If it helps, here is mine:

    # /robots.txt file for http://cleandeck.net/
    # mail webmaster@cleandeck.net for constructive criticism
    User-agent: webcrawler
    Disallow:

    User-agent: lycra
    Disallow:

    User-agent: *
    Disallow: /tmp
    Disallow: /logs

Just remember that most of the bad spider crawlers ignore robots.txt. The email address is bogus; it is there more as bait. The baddies harvest the address and stop crawling my site.
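To see how a well-behaved crawler would read a file like this, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ExampleBot` agent name and the test paths are made up for illustration; an empty `Disallow:` means "nothing is off limits" for that agent.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# /robots.txt file for http://cleandeck.net/
User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow:

User-agent: *
Disallow: /tmp
Disallow: /logs
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# (agent, path) pairs chosen for illustration only.
checks = [
    ("webcrawler", "/tmp/cache.html"),
    ("ExampleBot", "/tmp/cache.html"),
    ("ExampleBot", "/index.html"),
]
for agent, path in checks:
    print(agent, path, parser.can_fetch(agent, path))
```

This only models crawlers that choose to obey the file; as noted above, the bad ones simply ignore it.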
|
|||
|
OK, I understand your piece:

    User-agent: *
    Disallow: /tmp
    Disallow: /logs

Now I want to add keywords here to be indexed from robots.txt. Is it possible to put keywords in robots.txt, or do I have the entire concept wrong? (I don't want the keywords in index.html.)
|
||||
|
robots.txt is not for keywords; your HTML and PHP files hold them.
I don't know why you don't want them there, but in that case just don't put any in. robots.txt was used in the old days to tell the spider what to index and what to leave alone. The nice ones still respect this file. If you want real protection, use the .htaccess file. Something like this:

    Options -Indexes

    <Files 403.shtml>
    <Limit GET POST PUT>
    order allow,deny
    allow from all
    </Limit>
    </Files>

    ErrorDocument 404 /oops.html

    # get rid of bad bots
    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} ^BadBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EvilScraper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FakeUser
    RewriteRule ^(.*)$ http://go.away/

    # block individual addresses and whole ranges
    deny from 217.199.217.3
    deny from 98.131.11.144
    deny from 63.251.179.32
    deny from 193.46.236.151
    deny from 195.251.117.228
    deny from 190.72.184.105
    deny from 89.149.241.126
    deny from 85.140.206.177
    deny from 195.251.117.0/24
    deny from 85.140.0.0/16
    deny from 89.15.191.25
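The three `RewriteCond` lines are regular-expression tests on the User-Agent header, joined with OR. As a rough illustration of which requests they would catch, here is a small Python sketch of the same matching logic; the `is_blocked` helper is hypothetical and is not how Apache itself is implemented. Note that `^` anchors the pattern to the start of the header and that the match is case-sensitive because no `[NC]` flag is used.

```python
import re

# Patterns copied from the RewriteCond lines; the bot names are placeholders.
BLOCKED_AGENT_PATTERNS = [
    re.compile(pattern) for pattern in (r"^BadBot", r"^EvilScraper", r"^FakeUser")
]


def is_blocked(user_agent):
    """Return True if any pattern matches, mirroring the [OR] chain."""
    return any(pattern.search(user_agent) for pattern in BLOCKED_AGENT_PATTERNS)


for agent in ("BadBot/1.0", "Mozilla/5.0 (compatible; BadBot)", "badbot"):
    print(repr(agent), is_blocked(agent))
```

A bot that sends a different or forged User-Agent slips past this check, which is why the post pairs it with `deny from` rules on IP addresses.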
|
|||
|
I was looking to maintain a separate robots.txt file to do two things:
1. to disallow all directories
2. to use the keywords.
I don't see that this is possible. I have to use robots.txt to disallow all directories and put all keywords in meta tags in my index.html. Also, is this the correct statement for my index.html?

    <meta name="robots" contents="noindex,nofollow">

Sorry, I am a newbie. I just started this last month and have a lot to learn.
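On the meta tag: the standard attribute name is `content` (singular), so a tag written with `contents` carries no directives as far as a parser is concerned. A minimal sketch with Python's standard `html.parser` shows the difference; the `RobotsMetaParser` class and `robots_directives` helper are names I made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        # Only the standard "content" attribute is read.
        content = attributes.get("content") or ""
        self.directives.extend(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


print(robots_directives('<meta name="robots" content="noindex,nofollow">'))
print(robots_directives('<meta name="robots" contents="noindex,nofollow">'))
```

The first call finds both directives; the second finds none because of the misspelled attribute.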
|
|||
|
My understanding of "noindex,nofollow" is that the crawler will not index my index.html page and will not follow its links and pick up junk.
Is my statement wrong? Also, by providing a meta keywords tag in my index.html, crawlers will index the keywords. At least this is how I now have it set up. Corrections would be appreciated.
|
||||
|
The NOFOLLOW directive only applies to links on the page where it is written. It's entirely likely that a robot will find the same links on some other page without a NOFOLLOW (perhaps on some other site), and so still arrive at your undesired page.