Appendices
How to Search the Internet for Information
The Internet may be one of those things you keep hearing about but never quite get around to. Or maybe it's something you've dabbled in, but you don't yet feel like you're using it to its full capacity. This short tutorial is meant to familiarize you with the basics and point you towards people and other resources that will hopefully leave you more familiar and comfortable with using the Internet in general, and the World Wide Web in particular.
What is the Internet? A small answer
Physically, the Internet is simply a collection of thousands of computers that communicate through networks. It started about 25 years ago with only a handful of computers, as an experiment to see what happens when a computer network is not given much central control. Originally funded by the US military, this computer network was a way to ensure that information could continue to be transmitted even if one line of communication was bombed away. Sometimes called an "information superhighway", the Internet can be thought of as many roads, and many destinations. All sorts of information can now be sent along these "roads": words, pictures, sounds, even movies. Different ways of sending this information have been evolving since the Internet became available to the general public, and the most powerful and widely used method is called the World Wide Web (also referred to as the Web, or the WWW).
What's so Special About the World Wide Web (WWW)?
What makes the web the web is its own particular way of accessing the tons of information that's out there. It uses a method called HyperText Transfer Protocol (HTTP) to transmit and receive information (a protocol is simply an agreed set of standards for communicating information across a network). What's special about HTTP is that it can send not only words, but also pictures, sounds, and movies, and does so relatively quickly. The letters HTTP form part of every address for a web page (sometimes called a home page), as in
You may have seen these mysterious letters form part of advertisements and TV commercials, and they are simply these addresses for web pages, sometimes called URLs, or Uniform Resource Locators. The other parts of the URL actually mean something as well. The "www" refers to the World Wide Web, "brown" means Brown University, and "edu" means education. There's also "com" for commercial, "gov" for government, "org" for non-profit organization, and occasionally two-letter abbreviations that tell you what country the web page is stored in, such as "jp" for Japan. It can be useful to take a closer look at the URL of a web page, especially if you're not sure whether it's a business or a non-profit organization putting out information.
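For readers who like to see things spelled out, here is a minimal sketch in Python of how a web address breaks into the parts described above. The function name describe_url and the small table of endings are made up for this illustration, and the table covers only the endings mentioned in the text.

```python
from urllib.parse import urlsplit

# Endings mentioned in the text; this table is illustrative, not complete.
ENDINGS = {
    "edu": "education",
    "com": "commercial",
    "gov": "government",
    "org": "non-profit organization",
    "jp": "Japan (country code)",
}

def describe_url(url):
    """Split a web address into the parts discussed above (hypothetical helper)."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")  # e.g. ["www", "brown", "edu"]
    ending = labels[-1]
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "ending": ending,
        "meaning": ENDINGS.get(ending, "unknown"),
    }

info = describe_url("http://www.brown.edu")
print(info)
```

Looking up only the last part of the host name is a simplification, but it is usually enough to tell a business from a school or a government office.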
The computer programs that make it possible for you to access the web are called "web browsers". The two most commonly used web browsers are Netscape, made by Netscape Communications, and Internet Explorer, made by Microsoft. People tend to have many opinions about which of these two programs is "better", in the same way that some will argue that Macintoshes are preferable to PCs, but in general they do exactly the same thing. Much of what has made the web so successful is that it is a way to make information available in a consistent fashion, so this means that no two web browsers can be too different.
Ok, Ok... How do I get connected?
There are a number of ways to get connected to the Internet and the web, but most of them involve a considerable investment in personal computer hardware and research on different companies that provide Internet services. By far the easiest way to begin your Internet adventures is a visit to the local library. Many libraries have kept up with technology and often provide public computers with connections to the Internet (and thus, the web). Often there are introductory computer classes offered at the library as well, to get you acquainted with the software (the programs that run on the computer) and hardware (the computer itself, including monitor, keyboard and CPU) you'll need.
A full discussion of the programs needed to use the web is beyond the scope of this introduction, but often it is enough to know the name of the program you'll need (a "web browser") and a librarian can point you in the right direction. Netscape Navigator and Internet Explorer are the two most common web browsers, and both pride themselves on being easy to use. Once you are familiar with using a web browser, and have some way of connecting to the Internet, you can move on to finding things on the web.
How do I find things on the web?
The tools you use to look for what you want on the web are called search engines. Many different companies run search engines, and all are organized around similar ideas, each with a few quirks of its own. Search engines in general have huge databases full of information about web sites: their addresses, and what they are about. Since the web changes at such a rapid rate, it is crucial that the different search engines keep up. The way each one keeps up with so many new sites being added is what distinguishes it from the others. Two examples of the most widely used methods of obtaining and storing information about the web are listed below. A more comprehensive discussion and comparison of web search engines can be found at
www.yahoo.com (subject based)
Yahoo is one of the most popular search engines around, due to its easy-to-use look and manageable collection of websites. Unlike most other search engines, your search will not return billions of results, simply because Yahoo doesn't have that many web sites in its database. This may seem like a disadvantage, until you consider the reasons why this is the case. Yahoo takes pride in claiming that every web site in its database has been reviewed by a human being and placed in an intuitive category. So what, you say? Well, as it turns out, human beings are not the only creatures searching the web (more on this below), and it can be pretty valuable for Yahoo's staff to sift through web sites and put them into categories.
Their search engine is called a subject based catalog, since web sites are not added to their database until placed in the appropriate category. You'll notice this when first visiting their site, which has the top-level categories right on the first page. You can go right into subjects like Education, Government, Entertainment, and find all sorts of websites related to your interests.
You can bypass all this organized structure and go right into a search, but each website that comes back as a match will have its particular category listed. So again, you can browse around the category after you've looked at that one website you were looking for. This is much like thumbing through the books on the same shelf as the one you're interested in.
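The idea of a subject based catalog can be sketched in a few lines of Python. This is only an illustration, not how Yahoo actually stores its data: the CATALOG table, the example.com addresses, and the search and browse functions are all invented for the example.

```python
# Hypothetical catalog: category -> list of (site title, address).
CATALOG = {
    "Education/Universities": [
        ("Brown University", "http://www.brown.edu"),
    ],
    "Entertainment/Movies": [
        ("A James Bond fan page", "http://example.com/bond"),
        ("Classic film reviews", "http://example.com/classics"),
    ],
}

def search(word):
    """Return (category, title, address) for every site whose title contains word."""
    word = word.lower()
    return [
        (category, title, address)
        for category, sites in CATALOG.items()
        for title, address in sites
        if word in title.lower()
    ]

def browse(category):
    """List the titles of every site filed under a category (the 'same shelf')."""
    return [title for title, _ in CATALOG.get(category, [])]

hits = search("bond")
for category, title, address in hits:
    print(category, "|", title, "|", address)
    print("  same shelf:", browse(category))
```

Because every match carries its category with it, a single search result doubles as a doorway to the neighboring sites on that shelf.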
http://altavista.digital.com (text based)
Altavista is the search engine to use when you're wondering whether there's a website for that very unique subject you are interested in (there is). Your search should be specific enough to stump the many millions of sites Altavista has stored in its database. Altavista is a good example to mention with Yahoo since the two companies work in partnership on the web. When Yahoo can't find a match for your search, it will pass along your information to Altavista.
Part of the reason Altavista has so many more sites to search through than Yahoo is that it relies much less on human beings to fill its database with websites. As has been demonstrated many times before, computers can often do things much faster than humans, particularly repetitive tasks. Programmers have written tools called robots or spiders to coast through the web as quickly as possible and bring back all the web addresses they can. A robot also brings some text from each web site back with it, to summarize what's available on the site. The text can come from the title, from the first few words on the website, or (in the smarter robots) the words that are repeated most often in the site. Some search engines boast they index (store) every word on a site. Either way, the little bit of text that appears with your results can be very uninformative, mostly because it's been gathered arbitrarily by non-humans.
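To make the three kinds of summary text a little more concrete, here is a rough sketch in Python. It is not the code of any real robot: the PageText class, the summarize function, and the sample page are invented for illustration, and the rule of ignoring words of three letters or fewer is an assumption made to keep filler words out of the count.

```python
from collections import Counter
from html.parser import HTMLParser
import re

class PageText(HTMLParser):
    """Collect the <title> text and the visible words of one page."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data.strip()
        else:
            self.words.extend(re.findall(r"[a-z']+", data.lower()))

def summarize(html, first_n=5, top_n=3):
    """Gather the title, the first few words, and the most repeated words."""
    page = PageText()
    page.feed(html)
    # Assumption: skip words of three letters or fewer ("the", "and", ...).
    counts = Counter(w for w in page.words if len(w) > 3)
    return {
        "title": page.title,
        "first_words": page.words[:first_n],
        "most_repeated": [w for w, _ in counts.most_common(top_n)],
    }

SAMPLE = """<html><head><title>Bond Films</title></head>
<body><p>Welcome to my page about Bond films. Bond drinks martinis.
The martinis are shaken, and Bond films are fun.</p></body></html>"""

summary = summarize(SAMPLE)
print(summary)
```

Notice that the first few words of the sample page say almost nothing about it, which is exactly why the snippets shown with search results can be so uninformative.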
The main drawback to searching with a text based search engine is the sheer number of results it can bring back for each of your searches. Part of this is due to the large number of websites stored in the text-based databases, but many of these results are repeats of the same website, or pages within the same web site.
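One way to picture the problem of repeats is to fold a list of results together by the site they come from. The sketch below is only an illustration of that idea, not a feature of any particular search engine; the group_by_site function and the example addresses are made up.

```python
from urllib.parse import urlsplit

def group_by_site(addresses):
    """Keep the first result from each site and count how many results it had."""
    grouped = {}
    for address in addresses:
        host = urlsplit(address).hostname
        if host not in grouped:
            grouped[host] = {"first": address, "count": 0}
        grouped[host]["count"] += 1
    return grouped

results = [
    "http://example.com/bond/index.html",
    "http://example.com/bond/films.html",
    "http://example.org/martini.html",
    "http://example.com/bond/index.html",
]
grouped = group_by_site(results)
for host, entry in grouped.items():
    print(host, entry["first"], entry["count"])
```

Four results shrink to two sites here, which gives a feel for how much of a long result list can be the same place seen several times.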
Finding what you want: some tricks...
Most of the search engines will have a space for you to enter text when you first arrive at their web site. A few tricks can narrow down your search results to get at what you're after, rather than the few million sites that are vaguely related to what you are looking for.
the quotation marks trick
Putting a phrase in quotation marks means a search engine will search for exactly that entire phrase. If it's something a little more specific, like "shaken not stirred", you'll probably get a good number of James Bond sites. The text based search engines do better with these than the subject based search engines.
the plus trick
Putting a plus sign directly before a word tells the search engine that the word must appear in every result, which lets you insist on more than one word at a time. So
+james +bond
would be a much less frustrating search than
james bond
The second search would bring back all the sites that have both words, but also the ones that have only one of the words. So you'd get many, many sites that have the name "james" in them, and probably just as many sites about stocks and financial bonds.
the minus trick
The reverse of the above holds true for placing a minus sign before a word. The search engine will leave out any page that contains that word.
All of these tricks can be used together to really narrow down your search, such as
+"shaken not stirred" +james +bond -"her majesty"
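The three tricks can be sketched together in Python. This is a simplified model of the rules described above, not the actual code of any search engine: parse_query and matches are hypothetical helpers, and matching is done by plain substring search, which real search engines refine considerably.

```python
import re

def parse_query(query):
    """Split a query into required (+), excluded (-), and optional terms."""
    required, excluded, optional = [], [], []
    # Each token is an optional +/- sign followed by a "quoted phrase" or a bare word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]*)"|(\S+))', query):
        term = (phrase or word).lower()
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def matches(text, query):
    """True if text satisfies the query under the simplified rules above."""
    text = text.lower()
    required, excluded, optional = parse_query(query)
    if any(term in text for term in excluded):
        return False
    if not all(term in text for term in required):
        return False
    # With no required terms, any one optional term is enough.
    return bool(required) or not optional or any(term in text for term in optional)

QUERY = '+"shaken not stirred" +james +bond -"her majesty"'
page = "James Bond orders a martini, shaken not stirred."
print(parse_query(QUERY))
print(matches(page, QUERY))
```

Trying matches("Savings bond rates", "james bond") and then the same text with "+james +bond" shows the difference the plus sign makes: the first query accepts the page on the strength of one word, the second does not.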
WEB GLOSSARY
HTML (HyperText Markup Language)
One of many possible markup languages, and an application of SGML (Standard Generalized Markup Language), HTML is the markup language that web browsers read when displaying a text file as a web page. HTML "tags" sit alongside the text in this text file and describe the formatting, where images will go, and what documents links point to.
HTTP (HyperText Transfer Protocol)
This is the protocol used to transfer information over the World Wide Web. A protocol refers to an agreed upon set of standards. This protocol can describe a wide variety of media (movies, audio, images, hence the term multimedia) and hyperlink them.
link
A part of your web page that points either to another part of the same page (a local link) or to another page altogether. It is usually underlined and blue, and is the great magic of the web.
URL (Uniform Resource Locator)
The address of a web page. Usually looks something like this:
http://www.brown.edu
http stands for HyperText Transfer Protocol (see above), www for World Wide Web, and edu refers to the type of institution hosting the web site, in this case, educational. (org is non-profit organization, com is commercial, gov is government)
web server
a computer which stores information that can be accessed using a web browser.
web browser
a software program which can take text files written in HTML and display them as web pages (with images, sounds, etc.). The most popular are Netscape Navigator and Microsoft's Internet Explorer.
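As a small illustration of the glossary entries for HTML and link, the Python sketch below reads a scrap of HTML and sorts its links into those that stay on the same page and those that lead elsewhere. The LinkFinder class and the sample page are invented for this example, and treating any address that begins with "#" as a same-page link is a simplifying assumption.

```python
from html.parser import HTMLParser

class LinkFinder(HTMLParser):
    """Record the target of every <a href="..."> tag on a page."""
    def __init__(self):
        super().__init__()
        self.local = []   # targets inside the same page, e.g. "#top"
        self.other = []   # targets that lead to another page

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        # Assumption: an address starting with "#" stays on the same page.
        (self.local if href.startswith("#") else self.other).append(href)

PAGE = """<h1 id="top">My Page</h1>
<p>Visit <a href="http://www.brown.edu">Brown University</a>
or go <a href="#top">back to the top</a>.</p>"""

finder = LinkFinder()
finder.feed(PAGE)
print("same page:", finder.local)
print("other pages:", finder.other)
```

The tags in angle brackets are the HTML; everything a web browser shows as underlined blue text comes from tags like these.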