Search engine question . . .



coldveins
November 20th, 2007, 21:09
My site is on Google and has an "inline frame" design. The search engine is bringing up results that link directly to a file, like:

www.urlhere.com/somefolder/news.html

I'd like it to recognize only the subdomain pages in the search results, because if visitors load a file like "news.html" directly, they get a page that was built to sit inside a 300x400 <iframe>, so they have no access to the menu.

This will not work, and I figure users who land there will be quick to turn away from my site.

cheveirtua
December 8th, 2007, 12:23
Can't you just block robots from crawling that section of your internal pages? There's a way to deny crawlers or robots access to that specific section of your website only.

sep
December 8th, 2007, 18:51
Here is some info on how to do that: http://www.robotstxt.org/robotstxt.html

cfhdev
December 8th, 2007, 19:18
You can remove a page from Google if you do not want it to show up.
To do this, log in to Google if you have an account.
Go to Webmaster Tools.
If you do not have a Google webmaster account, you will need to create one.
Add your domain.
Verify the domain.
Go to the domain.
Click on Tools.
Click on Remove URLs.
Click on New Removal Request.

cfhdev
December 8th, 2007, 19:20
You could also block or remove pages using meta tags.

Rather than use a robots.txt file to block crawler access to pages, you can add a <META> tag to an HTML page to tell robots not to index the page. This standard is described at http://www.robotstxt.org/wc/exclusion.html#meta.

To prevent all robots from indexing a page on your site, you'd place the following meta tag into the <HEAD> section of your page:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

To allow other robots to index the page on your site, preventing only Google's robots from indexing the page, you'd use the following tag:

<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">
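
If you want to double-check that a page really carries the tag, here is a rough Python sketch using only the standard library. The class and function names (RobotsMetaFinder, is_noindex) and the sample page are made up for illustration; this only looks for the tag in the HTML, it says nothing about how any particular search engine will treat it.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the CONTENT values of <META NAME="ROBOTS"> / "GOOGLEBOT" tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # e.g. {"robots": {"noindex", "nofollow"}}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            parts = attr.get("content", "").split(",")
            values = {p.strip().lower() for p in parts if p.strip()}
            self.directives.setdefault(name, set()).update(values)


def is_noindex(html, bot="robots"):
    """Return True if the page has a NOINDEX directive for the given meta name."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives.get(bot, set())


# Hypothetical sample page standing in for something like news.html
page = (
    '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></head>'
    "<body>news</body></html>"
)
print(is_noindex(page))               # True
print(is_noindex(page, "googlebot"))  # False, there is no GOOGLEBOT tag on this page
```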

cfhdev
December 8th, 2007, 19:24
You could also block or remove pages using a robots.txt file.

The simplest robots.txt file uses two rules:

User-Agent: the robot the following rule applies to
Disallow: the pages you want to block
These two lines are considered a single entry in the file. You can include as many entries as you want. You can include multiple Disallow lines and multiple User-Agents in one entry.

You can set an entry to apply to a specific bot (by listing the name) or you can set it to apply to all bots (by listing an asterisk). An entry that applies to all bots looks like this:

User-Agent: *

To block a page, list it in a Disallow line:

Disallow: /private_file.html
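
Applied to the page from the original question, an all-bots entry could be tested like this before uploading it. This is only a sketch using Python's standard urllib.robotparser; the helper name (allowed) and the index.html URL are assumptions, and Python's parser may not match every detail of how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# One entry: applies to all bots, blocks the framed page from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /somefolder/news.html
"""


def allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The framed page is blocked for every bot...
print(allowed(ROBOTS_TXT, "Googlebot", "http://www.urlhere.com/somefolder/news.html"))  # False
# ...while other pages (index.html is a hypothetical example) stay crawlable.
print(allowed(ROBOTS_TXT, "Googlebot", "http://www.urlhere.com/index.html"))  # True
```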

Google uses several different bots (user agents). The bot they use for web search is Googlebot. Their other bots, such as Googlebot-Mobile (for mobile search) and Googlebot-Image, follow the rules you set up for Googlebot, but you can set up additional rules for these specific bots as well.
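
As a rough illustration of separate entries for specific bots, the sketch below gives Googlebot-Image its own rule next to a Googlebot rule and checks both with Python's urllib.robotparser. The /images/ path and the helper name are hypothetical. One assumption to be aware of: Python's parser uses the first entry whose name matches the bot, so the more specific entry is listed first here; Google's own matching may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Two entries: one only for the image bot, one for the web search bot.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /images/

User-agent: Googlebot
Disallow: /private_file.html
"""


def allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


site = "http://www.urlhere.com"
print(allowed(ROBOTS_TXT, "Googlebot-Image", site + "/images/logo.png"))  # False
print(allowed(ROBOTS_TXT, "Googlebot", site + "/images/logo.png"))        # True
print(allowed(ROBOTS_TXT, "Googlebot", site + "/private_file.html"))      # False
```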