
Spiders on Prime

Joined
10 September 2002
Messages
7,128
Location
Phoenix
I caught some spiders browsing Prime! I wonder if they are responsible for some of the performance problems we've had recently.
 

Attachments

  • spider1.jpg (5.7 KB)
  • spider2.jpg (5.1 KB)
PHOEN$X said:
I caught some spiders browsing Prime! I wonder if they are responsible for some of the performance problems we've had recently.

Hi

What does this spider do? Index the whole site for Jeeves users or what?

I am not that into these things, so I may have misunderstood something.

How did you catch it? Have they registered at this site first?

Regards
 
They index the content of the site. How they appear as "Ask Jeeves Spider" is interesting. I suspect something built into the forum software recognizes SE spiders. Spiders routinely visit this forum; the evidence is in the relevant SE listings for this forum.
 
Martin, your understanding of spiders is impeccable for someone who claims to not be into these things. :D

You can see the Ask Jeeves Spider in action here. It's still indexing as we speak.

Gene, I was also surprised that the spider was so readily identifiable. I doubt it had to register first (I don't see it in the Members List). I had assumed that Lud would've implemented a robots.txt file to disallow indexing of these message boards, since it could dramatically increase traffic and bandwidth usage.
 
PHOEN$X said:
Gene, I was also surprised that the spider was so readily identifiable. I doubt it had to register first (I don't see it in the Members List). I had assumed that Lud would've implemented a robots.txt file to disallow indexing of these message boards, since it could dramatically increase traffic and bandwidth usage.
I can't think of a reason not to use a robots.txt file, and I would think Lud uses one. As far as bandwidth use by the spiders goes, I would make sure that all the uploaded images, videos and any other large files were stored in a directory that was prohibited by the robots.txt file. That way, only text is indexed, which is what I would want for SE placement purposes.
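
For anyone curious, here is a rough sketch of the robots.txt idea above, checked with Python's standard urllib.robotparser module. The directory names (/attachments/, /images/), the thread URL and the "Teoma" agent name are made up for illustration; I have no idea how Prime's files are actually laid out or whether Lud uses anything like this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the directory names are assumptions,
# not the real layout of this site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /attachments/
Disallow: /images/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer crawl questions."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True if the rules allow this user agent to fetch the path."""
    return parser.can_fetch(user_agent, path)


parser = build_parser(ROBOTS_TXT)

# Text pages stay indexable; large uploads are off limits to spiders.
thread_allowed = may_crawl(parser, "Teoma", "/forums/showthread.php?t=1")
image_allowed = may_crawl(parser, "Teoma", "/attachments/spider1.jpg")
```

With rules like these a well-behaved spider would still index the threads but skip the uploaded files. Note that robots.txt is only a request; it relies on the spider choosing to honor it.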

Most all SE robots are identifiable when viewing webstats, and I suspect the forum is displaying something similar when viewing the "who's online" thingy.

BTW, didn't I recently hear about Ask Jeeves being purchased (huge dollars)?
 
Neo, remember when you posted about spiders indexing our profiles? I just checked to see what Jeeves was up to. Yep, viewing (indexing) user profiles.
 
PHOEN$X said:
Martin, your understanding of spiders is impeccable for someone who claims to not be into these things. :D

You can see the Ask Jeeves Spider in action here. It's still indexing as we speak.


Hi

Maybe I am more into it than I know. :)

I know what they do, but my question should have been more on how they do it.

I can see it digging its way through this site.

Regards
 
KGP said:
Neo, remember when you posted about spiders indexing our profiles? I just checked to see what Jeeves was up to. Yep, viewing (indexing) user profiles.


Yup, I remember that. The really frustrating bit is despite what my username implies, I'm not a leet hacker that could hack our details out of the search engines! :( I guess I should be called "NeoDoofus" :D And you know there's something just plain wrong when you do a search for PHOEN$X and you get this: http://www.borntomotivate.com/DavidHasslehoff.html :rolleyes:
 
KGP said:
I can't think of a reason not to use a robots.txt file
***
well, first, i think i'd want to ask, is there any benefit to lud/users in letting spiders index prime?

if so, what's the problem with letting spiders index prime compared to the benefit of them doing so?

performance? relatively intermittent after the initial indexing, i'd bet.

it's the best nsx site on the internet and it's FREE, where else are we *really* going to invest our nsx-related time?

otoh, perhaps there's an economic advantage (to lud *and* users) in letting askjeeves/similar ilk collect data from this site. can't say/don't know, but if that were the case, who could blame lud for letting the system pay for itself/his years of hard work?

not i, as this may also relieve us of paying membership fees, april fools day jokes notwithstanding :)

as a matter of fact, although i don't really care for them much, i can't help but notice that prime has advertising, a la google, in various locations... and i think it's a good thing for him - cash in the pocket to offset his efforts/costs or DARE I SAY IT - allow prime to provide cash flow in return for his great work.

so, if there's some economic benefit (long or short term) in letting spiders crawl here, it makes perfect sense to me.

how 'bout we ask lud?

lud - what's the spider story here at prime?

hal
 
I allow spiders to index the forums. The whole point of this site is to make NSX information available, and if people are using search engines to look for something NSX-related I think they should be able to find that information if it is posted on the forums.

The spiders do not cause a problem resource-wise (server load or bandwidth). They are designed intelligently in that regard. They are all identified in the server log analysis reports. vBulletin also recognizes some of them.
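
As an illustration of how spiders can be picked out of a server log or a "who's online" list, here is a minimal sketch that matches the User-Agent string a visitor sends against a table of known signatures. This is not vBulletin's actual code; the signature table, the labels and the function name are assumptions for the example.

```python
from typing import Optional

# Hypothetical signature table: lowercase substring of the User-Agent
# header mapped to the label to display. The entries are illustrative.
SPIDER_SIGNATURES = {
    "ask jeeves": "Ask Jeeves Spider",
    "googlebot": "Google Spider",
}


def identify_spider(user_agent: str) -> Optional[str]:
    """Return a display label if the User-Agent looks like a known spider."""
    ua = user_agent.lower()
    for signature, label in SPIDER_SIGNATURES.items():
        if signature in ua:
            return label
    # No signature matched, so treat the visitor as an ordinary browser.
    return None


jeeves_label = identify_spider("Mozilla/2.0 (compatible; Ask Jeeves/Teoma)")
browser_label = identify_spider("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)")
```

This would also explain why the spider shows up without appearing in the Members List: it is recognized by the header it sends, not by a registered account.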
 