Friday, July 15, 2011

Activity 4: Searching the Internet

1. What are the advantages and disadvantages of using search engines?
Advantages:
  • The indexes of search engines are usually vast, representing significant portions of the Internet, offering a wide variety and quantity of information resources.
  • The growing sophistication of search engine software enables us to precisely describe the information that we seek.
  • The large number and variety of search engines enriches the Internet, making it at least appear to be organized.

Disadvantages:
  • Despite the growing sophistication, many well-thought-out search phrases produce list after list of irrelevant web pages. The typical search still requires sifting through dirt to find the gems.
  • Using search engines also involves a learning curve. Because of these disadvantages, many beginning Internet users become discouraged and frustrated.

2. Compare and contrast individual search engines and metasearch engines.
  • Individual search engines crawl the web and compile their own searchable databases of web pages. Examples include Google, Yahoo!, and Ask.
  • Metasearch engines do not crawl the web compiling their own searchable databases. Instead, they search the databases of multiple sets of individual search engines simultaneously, from a single site and using the same interface. Metasearchers provide a quick way of finding out which engines are retrieving the best results for you in your search.
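The idea behind a metasearcher can be sketched in a few lines of Python. The two "engine" functions and their example URLs below are stand-ins invented for illustration (a real metasearcher would query each engine's actual web interface); the sketch only shows the merging step: pages that more engines return, at higher ranks, score better.

```python
# Conceptual sketch of a metasearch engine: forward one query to
# several individual engines, then merge and re-rank their results.
# The "engines" here return canned lists; real ones would be queried
# over the web.

from collections import Counter

def engine_a(query):
    # Stand-in for an individual search engine's result list.
    return ["example.org/1", "example.org/2", "example.org/3"]

def engine_b(query):
    return ["example.org/2", "example.org/4"]

def metasearch(query, engines):
    """Merge result lists, ranking pages found by more engines first."""
    votes = Counter()
    for engine in engines:
        for rank, url in enumerate(engine(query)):
            # Appearing in more engines, and nearer the top of each
            # list, both raise a page's score.
            votes[url] += 1.0 / (rank + 1)
    return [url for url, _ in votes.most_common()]

results = metasearch("deep web", [engine_a, engine_b])
print(results[0])  # the page both engines returned ranks first
```

This is why metasearchers are a quick way to see which engines retrieve the best results: agreement between engines floats shared pages to the top.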



3. When is it appropriate to use a search engine?
  • Search engines are best at finding unique keywords, phrases, quotes, and information buried in the full-text of web pages. Because they index word by word, search engines are also useful in retrieving tons of documents. If you want a wide range of responses to specific queries, use a search engine.

4. When is it appropriate to use a search/subject directory?
  • Like the yellow pages of a telephone book, subject directories are best for browsing and for searches of a more general nature. They are good sources for information on popular topics, organizations, commercial sites and products. When you'd like to see what kind of information is available on the Web in a particular field or area of interest, go to a directory and browse through the subject categories.

5. What is an invisible web or “Deep Web”?
The "visible web" is what you can find using general web search engines. It's also what you see in almost all subject directories. The "invisible web" is what you cannot find using these types of tools.

6. How do you find the invisible web?

  • Simply think "databases" and keep your eyes open. You can find searchable databases containing invisible web pages in the course of routine searching in most general web directories. Of particular value in academic research are:
  • ipl2
  • Infomine

  • Use Google and other search engines to locate searchable databases by searching a subject term and the word "database". If the database uses the word database in its own pages, you are likely to find it in Google. The word "database" is also useful in searching a topic in the Google Directory or the Yahoo! directory, because they sometimes use the term to describe searchable databases in their listings.

  • Examples:

  • plane crash database

  • languages database

  • toxic chemicals database
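Queries like those above can also be handed to a search engine programmatically. Google's search endpoint accepts the query string in its `q` parameter, so a small Python sketch (the `search_url` helper name is my own) can build the URL for any "subject term + database" search:

```python
# Build a Google search URL for a "subject term + database" query.
# Google's search endpoint takes the query in the "q" parameter;
# urlencode handles the escaping of spaces and punctuation.
from urllib.parse import urlencode

def search_url(subject):
    query = subject + " database"
    return "https://www.google.com/search?" + urlencode({"q": query})

print(search_url("plane crash"))
# https://www.google.com/search?q=plane+crash+database
```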

  • Remember that the Invisible Web exists. In addition to what you find in search engine results (including Google Scholar) and most web directories, there are other gold mines you have to search directly. This includes all of the licensed article, magazine, reference, and news archives, and other research resources that libraries and some industries buy for those authorized to use them.

  • As part of your web search strategy, spend a little time looking for databases in your field or topic of study or research. The contents of these may not be freely available: libraries and corporations buy the rights for their authorized users to view the contents. If they appear free, it's because you are somehow authorized to search and read the contents (library card holder, company employee, etc.).



  • The Ambiguity Inherent in the Invisible Web:

  • It is very difficult to predict what sites or kinds of sites or portions of sites will or won't be part of the Invisible Web. There are several factors involved:
  • Which sites replicate some of their content in static pages (hybrid of visible and invisible in some combination)?
  • Which replicate it all (visible in search engines if you construct a search matching terms in the page)?
  • Which databases replicate none of their dynamically generated pages in links and must be searched directly (totally invisible)?
  • Search engines can change their policies on what they exclude and include.

7. Why are these web pages not available in search engines or subject directories?

  • There are still some hurdles search engine crawlers cannot leap. Here are some examples of material that remains hidden from general search engines:
  • The Contents of Searchable Databases. When you search in a library catalog, article database, statistical database, etc., the results are generated "on the fly" in answer to your search. Because the crawler programs cannot type or think, they cannot enter passwords on a login screen or keywords in a search box. Thus, these databases must be searched separately.
  • A special case: Google Scholar is part of the public or visible web. It contains citations to journal articles and other publications, with links to publishers or other sources where one can try to access the full text of the items. This is convenient, but results in Google Scholar are only a small fraction of all the scholarly publications that exist online. Much more - including most of the full text - is available through article databases that are part of the invisible web. The UC Berkeley Library subscribes to over 200 of these, accessible to our students, faculty, staff, and on-campus visitors through our Find Articles page.


  • Excluded Pages. Search engine companies exclude some types of pages by policy, to avoid cluttering their databases with unwanted content.
  • Dynamically generated pages of little value beyond single use. Think of the billions of possible web pages generated by searches for books in library catalogs, public-record databases, etc. Each of these is created in response to a specific need. Search engines do not want all these pages in their web databases, since they generally are not of broad interest.


  • Pages deliberately excluded by their owners. A web page creator who does not want his/her page showing up in search engines can insert special "meta tags" that will not display on the screen, but will cause most search engines' crawlers to avoid the page.
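The standard mechanism for this is the robots meta tag, placed in the page's head. A minimal example (the surrounding head element is just context):

```html
<head>
  <!-- Tells search engine crawlers not to index this page
       and not to follow the links on it -->
  <meta name="robots" content="noindex, nofollow">
</head>
```

The tag never displays on screen, but compliant crawlers read it and skip the page.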


Sources:

http://www.gsn.org/web/research/internet/disadse.htm
http://www.sc.edu/beaufort/library/pages/bones/lesson1.shtml
http://websearch.about.com/od/invisibleweb/a/invisible_web.htm
