#1 - 07-15-2009, 10:36 PM - narinesa (Junior Member)
robots.txt

Need some help.
I have a separate file for keywords, named robots.txt.
I have this file in the root directory, the same as index.html.

As in this example:

User-agent:*
Disallow:/
Allow:/robots.txt
/Investigators,
Private Detective,
etc

I try to test my robots.txt and I get a message:
"missing / at start of file or folder name"
I don't know what this means or how to fix it.

Looks like I don't have the correct coding?

#2 - 07-15-2009, 10:51 PM - ishkey (Moderator)

If it helps, here is mine:

# /robots.txt file for http://cleandeck.net/
# mail webmaster@cleandeck.net for constructive criticism

User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow:

User-agent: *
Disallow: /tmp
Disallow: /logs

Just remember that most of the bad spiders ignore robots.txt. The email address is bogus; it's used more as bait. The baddies get their email address and stop crawling my site.
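
To see how a well-behaved crawler would read the rules above, here is a minimal sketch using Python's standard urllib.robotparser module. The agent name "SomeOtherBot" and the sample paths are made up for illustration. This only models a polite crawler; as noted, bad spiders simply never consult the file.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, minus the comment lines.
ROBOTS_TXT = """\
User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow:

User-agent: *
Disallow: /tmp
Disallow: /logs
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An empty "Disallow:" means nothing is off limits for that agent.
print(parser.can_fetch("webcrawler", "/tmp/cache.html"))    # True
# Any other agent falls under "User-agent: *" (hypothetical agent name).
print(parser.can_fetch("SomeOtherBot", "/tmp/cache.html"))  # False
print(parser.can_fetch("SomeOtherBot", "/index.html"))      # True
```

Each rule is a path prefix that starts with "/", one per Disallow line. There is no field for keywords anywhere in the format.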
__________________

Consultant - Programmer - WebMaster
cleandeck - lawn mower undercoating
wilmargraphite - graphite lubricants

#3 - 07-15-2009, 11:19 PM - narinesa (Junior Member)

OK, I understand your piece:

User-agent: *
Disallow: /tmp
Disallow: /logs

Now I want to add keywords here to be indexed from robots.txt.

Is it possible to put keywords in robots.txt, or do I have the entire concept wrong?

(Don't want the keywords in index.html)

#4 - 07-15-2009, 11:41 PM - ishkey (Moderator)

robots.txt is not for keywords; your HTML and PHP files hold them.
I don't know why you don't want them there, but don't put any in robots.txt.
robots.txt was used in the old days to tell spiders what to index and what to leave alone. The nice ones still respect this file.
If you want real protection, use the .htaccess file.
Something like this:

Options -Indexes

# always allow the 403 error page itself to be served
<Files 403.shtml>
order allow,deny
allow from all
</Files>

ErrorDocument 404 /oops.html

# get rid of bad bots
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^BadBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EvilScraper [OR]
RewriteCond %{HTTP_USER_AGENT} ^FakeUser
RewriteRule ^(.*)$ http://go.away/

# block specific addresses and ranges
<Limit GET POST PUT>
order deny,allow
deny from 217.199.217.3
deny from 98.131.11.144
deny from 63.251.179.32
deny from 193.46.236.151
deny from 195.251.117.228
deny from 190.72.184.105
deny from 89.149.241.126
deny from 85.140.206.177
deny from 195.251.117.0/24
deny from 85.140.0.0/16
deny from 89.15.191.25
</Limit>
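
For anyone unsure what the RewriteCond patterns and "deny from" lines above actually match, here is a rough Python sketch of the same checks. It is not how Apache implements them; the function names are hypothetical, the sample user agents and test addresses are made up, and only a few of the listed addresses are copied over.

```python
import ipaddress
import re

# Patterns copied from the RewriteCond lines. "^" anchors the match at the
# start of the User-Agent header, and matching is case-sensitive.
BAD_AGENT_PATTERNS = [r"^BadBot", r"^EvilScraper", r"^FakeUser"]

# A few of the "deny from" entries. A bare address is a one-host network.
DENIED_NETWORKS = [
    ipaddress.ip_network("217.199.217.3"),
    ipaddress.ip_network("195.251.117.0/24"),
    ipaddress.ip_network("85.140.0.0/16"),
]


def is_blocked_agent(user_agent: str) -> bool:
    """True if any pattern matches, like the [OR]-chained conditions."""
    return any(re.search(p, user_agent) for p in BAD_AGENT_PATTERNS)


def is_denied_ip(address: str) -> bool:
    """True if the address falls inside any denied network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DENIED_NETWORKS)


print(is_blocked_agent("BadBot/1.0"))                      # True
print(is_blocked_agent("Mozilla/5.0 (compatible; BadBot)"))  # False, not at the start
print(is_denied_ip("85.140.206.177"))                      # True, inside 85.140.0.0/16
print(is_denied_ip("192.0.2.10"))                          # False
```

Note that the user-agent check depends entirely on what the client chooses to send, so a bot that changes its User-Agent string gets past it, while the address rules do not depend on anything the client claims.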

#5 - 07-15-2009, 11:52 PM - narinesa (Junior Member)

I was looking to maintain a separate robots.txt file to do 2 things:
1. to disallow all directories
2. to use the keywords.

I don't see that this is possible.

So I have to use robots.txt to disallow all directories and
put all keywords in meta tags in my index.html.

Also, is this the correct statement for my index.html?
<meta name="robots" content="noindex,nofollow">

Sorry, I am a newbie - just started this last month - a lot to learn.

#6 - 07-16-2009, 12:10 AM - ishkey (Moderator)

Yep, you are right, but all you are keeping out are the good ones like Google, Yahoo, and MS, just to name a few. The baddies do not give a S#%* about your meta tags or robots.txt file; they will find the weak spot.

I don't understand your reasons for noindex, or for not wanting to put keywords where they belong. If your content is written well enough, you may not need keywords. Some say they are on the way out; I say about halfway out.

#7 - 07-16-2009, 03:50 AM - narinesa (Junior Member)

My understanding of "noindex,nofollow" is that the crawler will not index my index.html page and will not follow its links and pick up junk.
Is my statement wrong?

Also, by providing meta keywords in my index.html, crawlers will index the keywords. At least this is how I now have it set up.

Correction would be appreciated.

#8 - 07-16-2009, 10:34 AM - ishkey (Moderator)

The NOFOLLOW directive only applies to links on the page where it is written. It's entirely likely that a robot might find the same links on some other page without a NOFOLLOW (perhaps on some other site), and so still arrive at your undesired page.
Quote:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
Quote:
Also by providing a meta keywords in my index.html, crawlers will index the keywords. At least this is how I now have it setup.
you got it
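
As a rough illustration of how a crawler might read the robots meta tag, here is a sketch using Python's standard html.parser. The class and function names are hypothetical, and real crawlers have their own parsers. It shows that the directives come from the content attribute and that upper or lower case does not matter, while a misspelled attribute name (for example contents) is simply not seen.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


good = '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">'
typo = '<meta name="robots" contents="noindex,nofollow">'

print(sorted(robots_directives(good)))  # ['nofollow', 'noindex']
print(sorted(robots_directives(typo)))  # []
```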

#9 - 07-16-2009, 04:48 PM - narinesa (Junior Member)

Thanks for your assistance. I am online.

www.ForYourEyesOnlyAgency.com

#10 - 08-13-2009, 08:28 PM - ishkey (Moderator)

No, it is not necessary to use this file, and there will not be any problems without it. But if you do use it, you are right, it has to be in the root directory.

All times are GMT.