ADD more FIB.
UPDATE SAQs.
Must include:  SGML, ActiveX.
New concepts not incorporated into PA: WAP/WML.
{Add excerpts/links to Part I of Niederst}
{Expand compression in MIME with pictures; add to preassessments}
{Incorporate emerging standards from W3C}
LAST UPDATE: 2/11/03
Constantly being Updated!

COSC 330

LEARNING MODULE I
REVIEW OF WEB FUNDAMENTALS

    This learning module is a review of the concepts associated with the Internet in general and the World Wide Web in particular.  It is a concise summary of the online course COSC 120, Introduction to Cyberspace.  This is not a replacement for COSC 120, whose content is a prerequisite for COSC 330, but it will serve as a summary of COSC 120 for those who did not take the course but have Web development experience and enrolled in COSC 330 via permission of the instructor.  Not all of the information contained in this learning module is directly relevant to COSC 330, but it is still essential for understanding the content of COSC 330 because in presentations and discussions I assume that students understand this background material.  Advice for studying this LM is given in the Study Guide for LM I; this is an online substitute for comments typically made in an on-campus course.  If you haven't already done so, read the Basic Study Guide, general advice for study of my online courses.

The Objectives of this learning module are:

  1. To survey the fundamentals of cyberspace, the Internet, and the Web that are necessary for efficient Web development; these are covered in the course COSC120.
  2. To survey the basic features of Web pages.
  3. To preview the Web Development facilities to be covered later in the course.
  4. To illustrate the techniques for studying this online, independent learning course.
TPQ 1: Rewrite the preceding objectives in terms of personal accomplishments to be attained after finishing the study of this learning module.  (Note that this will be a standard exercise at the beginning of each learning module that is very important in order to "get you focused".  {The following will be developed later} For a hint, and link to Tony's answer, click on the link "Hints, TPQs" in the "Navigation Panel" along the left border of this Web page; this will be a standard facility throughout the course.)

The sequence of presentations in this learning module is as follows.  You can click on any link to jump directly to that section.

  1. CONCEPTS (summary of the COSC 100  INTRODUCTION TO COMPUTER SCIENCE content relevant to COSC 330.)
  2. THE INTERNET
  3. THE WORLD WIDE WEB
  4. OVERVIEW OF WEB DEVELOPMENT
  5. SUMMARY
   INTRODUCTION

        In his landmark, high-tech noir novel, Neuromancer (1984; reviews at Amazon.com), William Gibson coined the word "cyberspace", which has come to represent the abstract computer workspace where all knowledge and information sources are linked via ubiquitous digital networks. Gibson christened this cyberspace "the matrix", the conduit for interactive, virtual multimedia. Since then, terms like "Information superhighway", NII (National Information Infrastructure, the future "super-network" of the U.S.A.), the "infobahn", etc. have appeared to hype the vision of the future where every individual has access to all the world's information via computer. All of these words lack concise, universally accepted definitions, so in this class we will use (1) "the matrix" to represent the totality of present-day computer networks (see *FIGURE LM1-1) and (2) the "information space" to represent all electronically accessible knowledge, which includes the matrix plus television, radio, the telephone network, etc. (Note that this latter definition is not limited to computer networks as it often is!) Several spectacular views of Cyberspace are illustrated by the *Atlas of Cyberspace, and fascinating animations (Java Applets) of Internet traffic for the world and the U.S.A. are provided by Matrix Information and Directory Services, Inc. (MIDS).

FIGURE LM1-1
The Relationships Between Various Networks of Cyberspace
(For a larger version of this illustration click here. You might want to open another browser window to view this; if so, right click (on a PC) or hold the mouse button down (on the Mac) and select Open Frame in a New Window from the pop-up menu.)
 




The Internet (often simply called "The Net") is, by far, the dominant network of cyberspace. It began as a way to communicate text-based data (e-mail, text documents, etc.) and programs (binary files, sometimes called executable files), but it has evolved dramatically, especially with the development, within the Internet, of the World Wide Web (also called WWW, W3, or simply "The Web") during the 1990s. Today one can communicate via multimedia in video conferences or even enter mutual "virtual worlds" where multiple users interact in an environment that exists only in a computer's memory. These virtual worlds can be anything the creator can imagine! Such facilities are provided by the Web, a subnet of the Internet, that is the prototype of the cyberspace of the future.

The following presentation is a preview of the material to be covered in this course. It consists of (1) a review/preview of the basic computer concepts used to describe the Internet (section 1), (2) a summary of the Internet components (section 2), and (3) overviews of the World Wide Web (section 3) and Web development facilities (section 4).  The content is presented concisely here as a review of prerequisite material as well as a preview of Web development techniques to be covered in more detail in subsequent learning modules. NOTE: You should refer back to this Overview when studying later details to see how they fit into the overall context of cyberspace.

1. CONCEPTS (Summary of COSC100 Content Relevant to COSC330):

     The following basic computer concepts are essential to the discussion of cyberspace. They are covered in detail in courses like COSC 100, Introduction to Computer Science.  (You can access my online version of this course by clicking here.  You should do this in a separate window: right click on the frame and select "Open Frame in a New Window" from the popup box; otherwise you will get two navigation panels on the page!)  They can also be learned by outside reading or by looking them up on the World Wide Web (e.g. click on the links to Webopaedia, Computer Desktop Encyclopedia, Whatis, or FOLDOC in the Navigation Panel to the left.  Click here and read comment #4.).

SAQ 1: To see what they are like, look up the definition of "Cyberspace" in each of the four on-line references. (For a hint, and link to Tony's answer, click on the link "Hints, SAQs" in the "Navigation Panel" along the left border of this Web page; this will be a standard facility throughout the course.)

1.1 Computer Concepts:

  1. Computer = __________(1) (For a hint, and link to Tony's answer, click on the link "Hints, FIBs" in the "Navigation Panel" along the left border of this Web page; this will be a standard facility throughout the course) electronic machine that (a) processes digital data into information (numeric, text, or multimedia) and (b) controls electrical devices.
  2. Microcomputer = computer based on a __________(2), a "processor on a chip".
  3. Computer System = people, hardware, software, data, and procedures.
  4. Hardware = physical equipment of a computer system.
  5. Software = __________(3) that "run" the computer.
  6. Program = set of step-by-step instructions, in a _________ __________(4), that causes a computer to execute a specific task in finite time.
TPQ 2: What is the difference between a calculator and a computer?  (For a hint, and link to Tony's answer, click on the link "Hints, TPQs" in the "Navigation Panel" along the left border of this Web page; this will be a standard facility throughout the course.)
SAQ 2: What is the difference between hardware and software?

STUDY GUIDE NOTES:

  1. SAQs (Self Assessment Questions) and TPQs (Thought Provoking Questions) are learning aids that will be used throughout my learning material.   Both types of questions are designed to help you focus on the essential characteristics of fundamental concepts. SAQs act as "traffic lights"; if you can't answer one, it is a symptom of a misunderstanding, and you should review the notes to correct it. TPQs may have more than one correct answer; they may not even have any correct answer; they are simply there to make you think! You are strongly urged to think up your own SAQs and TPQs, using these as guides.  (The "Cyber Jeopardy" exercise in the PREASSESSMENTS formalizes this exercise by asking you to think up questions for each of the multiple choice answers.)  Searching your mind for such questions helps you to identify important concepts and think about them; thought is essential to obtaining understanding!
  2. You should work continuously on the PREASSESSMENT associated with each learning module as you study.  PREASSESSMENT 120-1 is associated with learning modules I and II; you should read questions 1-20 because the answers to those questions are in this learning module I.  For now, answer the questions by circling the answers, then, when you have to submit the PREASSESSMENT you can easily transfer your answers to the scantron form that will be provided the day before the preassessment is due.
  3. The blanks in the text, like the SAQs and TPQs, are learning aids. As such, the answers for them should NOT be written in the blanks; that simply turns the learning aids back into normal text (you are a spectator). Instead, if you feel you must write the answer down, place it in the margin or at the end of the chapter; then reviewing the FIBs (Fill in the Blanks), SAQs, and TPQs will still make you think. (You become a PARTICIPANT instead of simply a spectator.)
1.2 Data Processing Concepts:

The following flowchart representation of the Input-Process-Output (I-P-O) process, FIGURE LM1-2, can be used to illustrate virtually any computing concept or process!  In this section this representation is used to visualize the conceptual operations involved in data processing.  In FIGURE LM1-3 this same schematic format is used to relate different parts of computer hardware.

FIGURE LM1-2: The "I-P-O" SCHEMATIC

  1. The schematic shows that information is processed __________(5), (facts, values, etc. organized for computer consumption); information is presented for __________(6) consumption.
    1. Direct input includes data as well as the programs that process the data (in word processing the data would be text and the program would be the word processor), which are typically input from a keyboard, mouse, or some other direct input device.  In order to be processed the input must be encoded, i.e. translated from human language into machine (computer) language; this is done transparently (unseen by the user) as the input is read by the computer.
    2. Local output goes directly to the user, typically via the computer monitor, speakers, printer, etc. and involves decoding from machine language back into a form understandable by humans.
    3. Before being output to the user, processing may have intermediate output and return input involving disk storage or  communications.
      1. Store operations save output to a data file, e.g. a text file from a word processor or an HTML file from a Web browser.
      2. Communicate operations involve interactions with other computers; this is called "remote" input/output to distinguish it from "local" input/output.  Communications usually involve network transmissions, most often via the Internet. Unfortunately, many introductory texts still ignore the communicate activity (and miss the nice symmetry of the I-P-O schematic), so if you memorized a PC-centric version of this schematic you missed out on the fact that "the network is the computer" (Sun Microsystems' motto); be sure to remember the COMMUNICATE component and the nice balance of this schematic!
  2. Virtually all computers are digital, i.e. they can only process digital data (discrete electronic signals). Digital data is stored in memory as collections of electronic switches (transistors), each either on or off; these primitive data elements are called bits (binary digits) and are represented by humans as 1 or 0. A collection of eight bits is called a byte, which can represent a single alphanumeric character.
  3. Computer data can have various forms including numeric (integer or "mixed"), text, and multimedia (audio, visual, etc.), but they are all digital and thus represented by precise collections of bits.
  4. Most "real world" data is analog (continuous rather than discrete); therefore, it must be converted to digital (A/D conversion) when encoded and vice versa (D/A conversion) when being decoded. (For the distinction between analog and digital data see section 1.C in Learning Module of COSC 120, REVIEW/OVERVIEW OF COMMUNICATIONS AND NETWORKING; however, this distinction is not critical to the following discussion.)
  5. Data and programs are stored (i.e. "saved") in files located in secondary storage. (See section 1.3.C, below.)
    1. Data files contain digital data that is the "raw material" for the computer programs.  Examples include numeric data stored as binary numbers, text stored as binary codes, etc.
    2. Program files contain the instructions that manipulate the data in data files. Program files that contain machine language instructions (in a binary format), which can be executed by the computer without translation, are usually called "executable files".
  6. In order to complete a processing task, a computer might need to use data or run programs on other computers. This can be accomplished by communication via networks to which the client or server may not even be physically connected. (See section 1.5, below.)
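
The encoding and decoding steps described above can be illustrated with a short Python sketch. It assumes plain ASCII text (the notes do not name a character code), and the function names `encode` and `decode` are invented here purely for illustration.

```python
# A minimal sketch, assuming ASCII text: each character becomes one byte (eight bits).

def encode(text: str) -> list[str]:
    """Translate human-readable text into bytes, each shown as a string of eight bits."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


def decode(bit_groups: list[str]) -> str:
    """Translate groups of eight bits back into human-readable text."""
    return bytes(int(bits, 2) for bits in bit_groups).decode("ascii")


encoded = encode("Web")
print(encoded)          # ['01010111', '01100101', '01100010']
print(decode(encoded))  # Web
```

Each string of eight 1s and 0s is one byte; the user never sees this form because the translation happens transparently on input and output.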
TPQ 3: How can computers be networked without being physically connected?
SAQ 3:

1.3 Hardware Concepts:

     The following is a greatly oversimplified survey of the concepts associated with the interactions of the CPU with its peripheral devices.  It is intended only to familiarize the beginner with basic hardware terms needed to talk about computers used in telecommunications.  It is equivalent to the OVERVIEW OF COMPUTERS, part of my on-line course COSC 100, INTRODUCTION TO COMPUTERS; for a more detailed treatment see CENTRAL PROCESSING UNIT & PRIMARY MEMORY and INPUT/OUTPUT HARDWARE learning modules of that same course.

  1. Computer Classifications:
    1. A simplistic classification of computers can be made according to whether they are utilized by individuals or multiple users.
      1. Personal computers (PCs) are designed for the single user, and are the most common means of Internet access; in such cases they are called "clients" (See below.) which access the services available on "servers" on the Internet.  PCs are microcomputers (computers based on microprocessors) which have subclassifications like desktops, portables, notebooks, etc.
      2. Multi-user computers can be loosely categorized, according to decreasing power and price, under the following types: supercomputers, mainframes, and minicomputers.  Mainframes and minicomputers are used as Internet nodes where they route communications traffic.  They are also used as Internet servers, in which case they provide a "service" (See below.) like a Web site; however, current, powerful microcomputers can also act as servers.
    2. In this course it is unnecessary to fully understand the distinctions between computer types, so further discussion of this topic is omitted.  As far as this course is concerned, it is only necessary to realize that users typically access cyberspace via microcomputers and that mainframes and minicomputers are typically used as Internet nodes.
  2. Generic Organization of the CPU and Peripheral Devices:
    FIGURE LM1-3


    1. The arrows within the CPU schematic above simply dramatize the complex interaction of the two conceptual components of the CPU (Control Unit (CU), and Arithmetic/Logic Unit (ALU)) and primary memory; this schematic really reflects the organization of a microcomputer, but is less true of large, multi-user computers like minicomputers and mainframes.  WARNING: There is a discrepancy in the way different people define the CPU; some texts include primary memory as part of the CPU.  (I believe this is the most accurate description, but few introductory courses, which focus on microcomputers, use this terminology.)  (For more details read Section 3 of LM IIIB, of COSC 100.)
    2. Input, output, communications, and secondary storage equipment are called peripheral devices.  These may be on-line (directly connected to the CPU) or off-line (often called auxiliary devices).
      1. Direct I/O hardware allows the user to interact directly with the computer; this distinguishes it from Indirect I/O described in the next section. Direct input hardware includes keyboards, pointing devices, etc., and direct output hardware includes monitors, printers, speakers, etc.
      2. Indirect I/O involves multiple outputs and inputs from devices connected to a computer before the final output goes to the user.  This has two basic subcategories, secondary storage and communications, which are briefly explained in the following sections.
      (For more details read LM V, of COSC 100.)
  1. *Secondary Storage is currently dominated by magnetic media (hard disks, removable hard disks, and floppies), but magneto-optical and read/write optical media (DVD, DVD-RAM, and DVD+RW) promise to revolutionize storage technologies.  (For more details read LM IV, of COSC 100.)  An excellent article on the near future of removable storage was published in the 5/21/98 issue of PC Magazine; this article also illustrates the ever-present problem of vaporware with the hyped 200MB floppy from Sony and the 20GB rewritable magneto-optical disk from TeraStore Corp, both of which have yet to appear!   A really neat Web site for comparison shopping for hardware is PRICE WATCH, whose URL is www.pricewatch.com/
  2. Data communications is the background theme of this course, so knowledge of basic communications hardware, especially that associated with Internet access, is a prerequisite for COSC 330. (For a review, check out  LM II, of COSC 120.)  The overall picture includes the following.
    1. Data communications is a general term that has two subcategories:
      1. Networks involve groups of computers.  (See section 1.5, below.)
      2. Telecommunications is the technology that facilitates long distance communications between computers.  This overlaps with networking when more than two computers are involved.
    2. Advances in data communications have reoriented computing from a centralized system based on mainframes to distributed systems in which data and computing power are made available to numerous, non-local users and all resources may be shared.  This trend will continue towards a goal of optimal distribution that is dynamic, i.e. systems will reconfigure themselves so that they offer the maximum facilities to the users currently on-line.
SAQ 4:
SAQ 5: What is the opposite of a distributed computer system?

1.4 Software Concepts:

      Software is a generic term for instructions that a computer can execute. Self-contained software is essentially synonymous with computer programs. Most textbooks classify software into two categories.  (I prefer three; see the concluding paragraph of this section.)

  1. Application software includes programs that turn the computer (a general purpose tool) into a special purpose tool.  Those relevant to this course include:
    1. productivity software includes:
      1. general productivity like word processors, electronic spreadsheets, database management systems, graphics packages, etc.
      2. Web development software including
        1. WYSIWYG Web authoring software like FrontPage, Dreamweaver, etc.
        2. Multimedia development software like Macromedia Flash or Fireworks.
      3. Software development tools (if these are part of a multiuser computer system this is more properly categorized as system development software; see section 1.4.B.c, below) including
        1. Scripting languages like JavaScript, which we will learn in this course; JavaScript is a "special purpose" language designed to embed code directly within HTML documents.
        2. Java, a general purpose, object oriented language optimized for distributed environments.
    2. education/entertainment software like tutorials, training programs, games, etc.  I plan to make extensive use of online examples of this genre in this course. To find and evaluate the best of these online learning resources will be an overwhelming undertaking, so I would GREATLY APPRECIATE your keeping an eye on candidates and recommending them to me -- even after you finish this course!
    3. professional software for use in business, science, medicine, etc.
  2. System software includes programs that allow users and their application software to utilize the computer resources (the computer itself, all its peripheral devices, and networks to which it is connected).  In general, system software has three subcategories:
    1. system management software, e.g. the operating system (OS), networking, telecommunications, etc.,
    2. system support software, e.g. utilities, device drivers, system monitors, maintenance, etc., and
    3. system development software, e.g. programming languages, Integrated Development Environments (IDEs), software engineering tools, etc.
SAQ 6:

1.5 Communications and Networking Concepts:

(For more detail, see LM II of COSC 120,  REVIEW/OVERVIEWS OF COMM. & NETWORKING.)

  1. "Communications" is a general word for the transmission of signals between two or more points via a communications channel.
    1. "Data communications" refers to computer data.
    2. "Telecommunications" pertains to transmissions over a distance in one of two forms:
      1. electronic transmission (via electrons) occurs through physical media such as wires and
      2. electromagnetic wave transmission (via laser, radio, TV, microwave, etc.) requires no media (the term "wireless" is used), except in the case of fiber optics in which light carries data through cables.
    3. Networking links computers so they can communicate, as well as share hardware and software.  The consequent unification of processing power leads to the goal of distributed computing (see section 1.5.N, below), which is the optimum, dynamic spreading of computing resources among users.
  2. Data Communications, in general:
    1. Types of transmission signals (See Figure C&N-3.):
      1. An analog signal is a continuous wave pattern that varies in frequency or amplitude to convey data. Most "real-world" data has an analog format.
      2. A digital signal is a pattern of discrete high or low amplitude pulses that represents binary data and is therefore used to transmit computer data.
    2. A carrier signal is a base signal for transporting data; the data is superimposed on the carrier signal by modulating (altering) it.  The most basic forms include Amplitude modulation (AM), Frequency modulation (FM), and Phase modulation (PM).  (Figure C&N-3 illustrates the AM and FM concepts.)
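
The idea of superimposing data on a carrier can be sketched numerically in Python. This is only an illustration under stated assumptions: the frequencies, modulation depth, and frequency deviation below are hypothetical values chosen for readability, and a simple sine wave stands in for the data being carried.

```python
import math

CARRIER_HZ = 20.0    # hypothetical carrier frequency
SIGNAL_HZ = 2.0      # hypothetical frequency of the data (message) signal
SAMPLE_RATE = 1000   # samples per second used to approximate the continuous wave


def message(t: float) -> float:
    """The data to be carried; a plain sine wave stands in for real data."""
    return math.sin(2 * math.pi * SIGNAL_HZ * t)


def am_sample(t: float, depth: float = 0.5) -> float:
    """Amplitude modulation: the message varies the carrier's amplitude."""
    return (1 + depth * message(t)) * math.sin(2 * math.pi * CARRIER_HZ * t)


def fm_sample(t: float, deviation_hz: float = 5.0) -> float:
    """Frequency modulation: the message varies the carrier's frequency.

    The phase is the carrier phase plus the running integral of the message,
    which for a sine message has the closed form used below.
    """
    integral = (1 - math.cos(2 * math.pi * SIGNAL_HZ * t)) / (2 * math.pi * SIGNAL_HZ)
    return math.sin(2 * math.pi * CARRIER_HZ * t + 2 * math.pi * deviation_hz * integral)


times = [n / SAMPLE_RATE for n in range(SAMPLE_RATE)]  # one second of samples
am = [am_sample(t) for t in times]
fm = [fm_sample(t) for t in times]

print(max(abs(x) for x in am))  # amplitude swings with the message, up to 1 + depth
print(max(abs(x) for x in fm))  # amplitude stays constant; only the frequency changes
```

In the AM wave the height of the carrier follows the message, while in the FM wave the height is fixed and the spacing of the peaks follows the message.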
SAQ 7: What is the difference between analog data and digital data?
  1. Transmission Characteristics:
    1. Transmission parameters:
      1. The transmission speed is the amount of data transmitted per unit time, e.g. bits per second, bps or bytes per second, Bps.
      2. Digital Signal Classifications (for North America) and Speeds:
        1. DS (digital signal) is a data transmission classification system based on multiples of 64 Kbps, the theoretical bandwidth of a single "voice channel" on the "plain old telephone service" ("POTS").
        2. OC (optical carrier) speed is a fiber optics classification system that is based on multiples of 51.84 Mbps.
           
          COMMON CARRIER CLASSIFICATIONS

          Service     Voice Channels   Speed (Mbps)
          DS0         1                0.064
          DS1 (T1)    24               1.544
          DS3 (T3)    672              44.736
          DS4         4032             274.1xx
          OC-12       9150             622.xxx
      1. The  network is only as fast as its slowest component (often called a "bottleneck").  The relative speeds depend on both the type of media and type of equipment used.
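
The arithmetic behind these transmission parameters can be sketched in Python. The 64 Kbps voice channel and 51.84 Mbps optical base rate come from the text above; the function names, the OC-3 example level, and the three-link path are assumptions made only for illustration.

```python
VOICE_CHANNEL_KBPS = 64   # one voice channel (DS0), from the text
OC_BASE_MBPS = 51.84      # base rate of the optical carrier system, from the text


def ds_payload_mbps(voice_channels: int) -> float:
    """Capacity of the voice channels alone, in Mbps."""
    return voice_channels * VOICE_CHANNEL_KBPS / 1000


def oc_rate_mbps(level: int) -> float:
    """Speed of an optical carrier level as a multiple of the base rate."""
    return level * OC_BASE_MBPS


def bottleneck_mbps(link_speeds_mbps: list[float]) -> float:
    """A path is only as fast as its slowest link."""
    return min(link_speeds_mbps)


def transfer_seconds(size_bytes: int, speed_mbps: float) -> float:
    """Time to send a file: bytes are converted to bits (8 bits per byte)."""
    return size_bytes * 8 / (speed_mbps * 1_000_000)


print(ds_payload_mbps(24))          # 1.536; DS1's listed 1.544 also includes framing overhead
print(ds_payload_mbps(672))         # 43.008; DS3's listed 44.736 also includes overhead
print(round(oc_rate_mbps(3), 2))    # 155.52 for a hypothetical OC-3 link

# Hypothetical path: a 128 Kbps access line feeding a T1 and then a T3.
path = [0.128, 1.544, 44.736]
slowest = bottleneck_mbps(path)
print(slowest)                                          # 0.128
print(round(transfer_seconds(1_000_000, slowest), 1))   # 62.5 seconds for a 1,000,000-byte file
```

Note the factor of eight between bps and Bps: a speed quoted in bits per second must be divided by eight to estimate bytes per second.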
    1. Transmission channels include simplex (one way), half-duplex (two way, not simultaneously), and full-duplex (simultaneously two way).  There are three basically different types of channels.
      1. Analog lines, e.g. POTS which carry analog signals via electrons.  To transmit data, the digital data must be superimposed, by a modem, on the telephone's analog carrier signal.
      2. Digital Lines carry digital signals and thus avoid the analog/digital conversions necessary for digital transmission over POTS.  There are currently two types of digital lines:
        1. ISDN (Integrated Services Digital Network) is a circuit-switched, dial-up service for transmitting digital data via a single wire or fiber optics cable.   Basic Rate service (BRI) can provide 128 Kbps bandwidth; Primary Rate Service (PRI) can provide 1.5 Mbps, equivalent to T1 transmissions.
        2. Digital Subscriber Lines (DSL) also transmit completely digital data over POTS.  DSL is a dedicated point-to-point technology that provides a practical maximum of over 6 Mbps using current technologies and up to 52 Mbps in the future.
      3. Wireless communication typically uses microwaves (electromagnetic waves with frequencies between Radio/TV and light; see Figure C&N - 4A.) or radio waves to provide high-capacity transmission (over 3 million bps) over line-of-sight channels.
SAQ 8: (a) Is all wireless data transmission electromagnetic?  (b) Is the reverse true, i.e. is all electromagnetic data transmission wireless?
    1. Transmission Techniques: *See FIGURE C&N-5 for a comparison of Baseband and Broadband
      1. Baseband transmission provides digital transmission without change in modulation; simultaneous transmission of multiple sets of data is accomplished by interleaving pulses using TDM (time division multiplexing).
      2. Broadband transmission is used to send multimedia over long distances. It modulates data, voice, and video onto different frequencies using FDM (frequency division multiplexing).
      3. Multiplicity governs the number of people involved in a network communication session.  There are five categories: Unicast (1 to 1), Anycast (to the nearest of several receivers), Multicast (to a selected group of receivers), Broadcast (to multiple receivers), and Datacast (allows computer data to be transmitted simultaneously with a TV broadcast).
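
The interleaving used by TDM can be sketched in Python as a round-robin over several data streams. This is a simplified illustration, not a description of any real multiplexer: the function names and the use of `None` to mark an idle time slot are assumptions of the sketch.

```python
from itertools import zip_longest

IDLE = None  # marks a time slot with nothing to send (an assumption of this sketch)


def tdm_multiplex(streams: list[list[str]]) -> list:
    """Interleave the streams round-robin: each stream gets one slot per frame."""
    frames = zip_longest(*streams, fillvalue=IDLE)
    return [slot for frame in frames for slot in frame]


def tdm_demultiplex(channel: list, stream_count: int) -> list[list[str]]:
    """Recover each stream by reading every stream_count-th slot."""
    return [
        [slot for slot in channel[i::stream_count] if slot is not IDLE]
        for i in range(stream_count)
    ]


streams = [list("AAAA"), list("BB"), list("CCC")]
channel = tdm_multiplex(streams)
print(channel)
# ['A', 'B', 'C', 'A', 'B', 'C', 'A', None, 'C', 'A', None, None]
print(tdm_demultiplex(channel, 3))
# [['A', 'A', 'A', 'A'], ['B', 'B'], ['C', 'C', 'C']]
```

The idle slots show a cost of this simple scheme: a stream with nothing to send still occupies its turn on the shared channel.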
SAQ 9: What is the most important difference between baseband and broadband transmission?
  1. Communications Hardware:
    1. A modem is a device that transmits digital data over an analog channel by modulating the analog carrier signal.
    2. A codec transmits analog data over a digital channel.  (Note that "codec" which, in this case, stands for coder/decoder, has several other definitions when used in other contexts, e.g. compressor/decompressor in multimedia transmissions.)
    3. Multiplexers interleave multiple communications so that they can share a single communications channel. The two common multiplexing techniques are FDM and TDM.
    4. Controllers supervise data transfer between the CPU and terminals on a multiuser system.
    5. Concentrators perform the functions of both controllers and multiplexers, among other things.
    6. Fax (facsimile machine) transmits images (text, pictures, etc.) over telephone wires.
    7. Network hardware; see section 1.5.H.a below.
SAQ 10: What is the difference between a modem and a codec?
  1. Communications Media:
    1. Electronic Cables transmit data, via electrons, through copper wires. These include Twisted pair wiring, Coaxial cable, and Cable television (CATV) cables which can be used with cable modems to rival  DSL technology for the future of high bandwidth data transmission for the general public.
    2. Fiber Optics Cables transmit data, via light, through glass wire bundles; they outperform electronic cables in transmission speed, bandwidth, interference avoidance, and inhibition of wire tapping.
  2. Communication software controls a computer’s access to system resources and stored data.
    1. A communications program manages the transmission of data between a computer and another computer or network.

    2. A communications application performs a specific communications service or, in the case of Browsers, several communications services.
    3. Other types of communication software include Terminal Emulation and Data-encryption.
  3. Communications Protocols:
    1. Communications protocols are standards  that govern the communications between computing devices.
    2. There are, currently, three basic categories of protocols:
      1. Basic protocols are either synchronous or asynchronous and govern error detection and correction ("parity"), etc.
      2. Modem protocols govern transfer of files via modem.
      3. Network protocols include WAN protocols (communications within complex distributed systems) and LAN protocols.
    3. TCP/IP is a suite of protocols that govern the Internet; see section 2.3, below.
    4. The OSI model is a standard, theoretical, seven layer, network model of protocols.
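
The parity check mentioned under basic protocols can be sketched in Python. This sketch assumes even parity over seven data bits; the function names are invented for illustration, and real protocols may use odd parity or stronger error-detection methods.

```python
def add_even_parity(data_bits: str) -> str:
    """Append a parity bit so that the total number of 1s is even."""
    parity = data_bits.count("1") % 2
    return data_bits + str(parity)


def passes_even_parity(received: str) -> bool:
    """The receiver accepts the data only if the count of 1s is still even."""
    return received.count("1") % 2 == 0


sent = add_even_parity("1010111")   # five 1s, so the parity bit is 1
print(sent)                          # 10101111
print(passes_even_parity(sent))      # True

corrupted = "0" + sent[1:]           # one bit flipped in transit
print(passes_even_parity(corrupted)) # False, so the error is detected
```

A single parity bit detects any odd number of flipped bits but misses an even number of them, and it cannot say which bit was wrong.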
  4. Generic network architecture is a collection of linked "nodes" that form channels, clients, servers and supporting hardware/software. They provide the infrastructure for a distributed computing environment with its client/server processing model. This is the essence of the provocative statement,  "The network IS the computer".
    1. Network Components ("Nodes") :
      1. A terminal is any end point of the network.
      2. A server is a computer that provides network services.
      3. A host computer coordinates terminals connected to it.
      4. A hub connects several network nodes together, sharing the total bandwidth.
      5. A switch allows a non-shared connection between two network devices.
      6. A repeater facilitates data transfer between distant devices by regenerating an attenuated or distorted signal.
      7. A bridge is an interface linking two similar networks.
      8. A router is a computer that manages the efficient routing of a transmission by selecting the "fastest" link to the destination.
      9. A gateway is a network computer that links two different types of networks.
      10. A firewall is a computer that controls access to a private network in order to maintain security.
    2. Basic network topologies include the star (uses polling access), bus (uses contention access), ring (uses token passing access), and hierarchical.
SAQ 11: What kinds of network nodes are "invisible" to the network user?
  1. Computer Networks are the result of the reorientation of computing design from early isolated, centralized systems based on huge, expensive mainframe computers with numerous user terminals to distributed systems in which data and computing power are spread over all networked users, thus allowing all networked resources to be shared. Distributed computing is based on the idea that "the network IS the computer" (Sun Microsystems' motto)!  This profound phrase means that, when you are connected to the Internet, your "computer" is not just your PC, but all the computers of the Internet, a mind-boggling concept!!
  2. Networks consist of interconnected "nodes" that interact via a client-server model.
    1. Servers are network computers which provide resources to the users of the network. Server software consists of applications that are stored on servers but can be accessed by users without downloading them to their local hard disk.
    2. Clients are computers at which users access servers on a network. Client software, running on a networked computer, is specifically designed to access server software, pass requests to it, and communicate results to the user.  In FIGURE LM1-4 the particular client software is a database management system; when a query is made, instead of downloading the whole database and searching on the client, the query is processed on the server and only the results are passed back to the client, a much more efficient use of resources.
FIGURE LM1-4
Simplified Client/Server Schematic
NOTE: The terms "client" and "server" are confusingly used to refer to the software as well as the computers on which they run.
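The database example above can be imitated with a small Python sketch. It is a simplification: both "computers" are just functions in one program, the function call stands in for the network, and the records and function names are hypothetical. What it shows is the division of labor, i.e. the search runs on the server and only the matching results travel back to the client.

```python
# Hypothetical "database" that lives only on the server.
SERVER_DATABASE = [
    {"name": "Ada", "major": "COSC"},
    {"name": "Ben", "major": "MATH"},
    {"name": "Cy", "major": "COSC"},
    {"name": "Dee", "major": "PHYS"},
]

def server_handle_query(major):
    """Server side: search the whole database, return only the matching names."""
    return [row["name"] for row in SERVER_DATABASE if row["major"] == major]

def client_request(major):
    """Client side: pass the request to the server and hand the results to the user."""
    results = server_handle_query(major)   # stands in for a network round trip
    return results

print(client_request("COSC"))
```

The client never sees the four-record database, only the names that satisfied the query.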
SAQ 12: Modify FIGURE LM1-5 so that it illustrates the client-server interaction on the Web.
  1. Types of Computer Networks:
    1. A Local Area Network (LAN) is the smallest kind of network designed to serve users within a confined geographical space, like a room or building.
    2. A Wide Area Network (WAN), e.g. the __________(7), covers a wide geographic area such as a state, a country, a dispersed corporation, or the world. It usually consists of subnetworks and incorporates common carriers, which are licensed and regulated by government agencies and provide telecommunication services to the public.
    3. A Metropolitan Area Network (MAN) is a less frequently used term that refers to networks larger than LANs but smaller than WANs, e.g. large corporate networks at a single location.
    4. Value-added networks (VANs) (e.g. GTE's Telenet and Tymshare's Tymnet) are public data networks, accessible via modem, for organizations that find private networks unfeasible. They make long distance connection to computing services less expensive than normal telephone service.
    5. In a switched network a temporary connection is established between two network terminals for each individual communication. Data is transmitted from sender to receiver by three types of switching:
      1. circuit switching (transmission only if receiver is ready) requires that a constant sender to receiver circuit be maintained for the duration of a transmission.
      2. message switching is permanent, like circuit switching, but the connection is automatic, and
      3. packet switching (message components, called "packets", may follow different routes). Unlike ____________(8) switching, which requires a constant point-to-point connection to be maintained, packet switching places in each packet the destination address and a number specifying its position in the message sequence. This allows each packet to be "dynamically routed" over any network link as links become available or less congested. The destination computer reassembles the packets back into their proper sequence. The dynamic routing capability of the Internet makes it virtually indestructible, because when any link "goes down" the network itself will automatically reroute the message packets, unknown to the sender or receiver.
    6. Dedicated (nonswitched) lines may be leased as network channels for the exclusive use of organizations transmitting large amounts of data.
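The packet switching described above can be sketched in Python. This is a toy model, not a real network: the packet layout (a dictionary with "dest", "seq", and "data" fields) and the packet size are assumptions made for illustration, and shuffling the list stands in for packets arriving out of order after following different routes.

```python
import random

def to_packets(message, destination, size=8):
    """Split a message into packets carrying a destination address and a sequence number."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [{"dest": destination, "seq": n, "data": chunk}
            for n, chunk in enumerate(chunks)]

def reassemble(packets):
    """Destination side: put the packets back into their proper sequence."""
    ordered = sorted(packets, key=lambda packet: packet["seq"])
    return "".join(packet["data"] for packet in ordered)

message = "The network IS the computer!"
packets = to_packets(message, "131.118.80.1")
random.shuffle(packets)      # packets may arrive in any order
print(reassemble(packets))
```

Because every packet carries its own sequence number, the receiver can rebuild the message no matter which route each packet took.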
SAQ 13: Give an analogy to circuit switching and message switching in today's telephone use.
SAQ 14: The combined networks at FSU would be called a _______; each computer lab at FSU would be called a _______; the combined networks of the University of Maryland System would be called a ______.
TPQ 15: Why would one say that the Internet is a more "efficient" communications network than the telephone network?
  1. Network Technologies:
    1. LAN Technologies:
      1. Ethernet is a bus technology that comes in several varieties:  twisted-pair, switched, and fiber optic.
      2. Token Ring networks implement ring technologies that are available in two types: Type 1 connects up to 255 stations via shielded twisted pair wiring; Type 3 connects up to 72 devices via unshielded twisted pair.
      3. FDDI is a ring technology for fiber optic LANs that has a range of 124 miles and can support thousands of users.
      4. ATM (Asynchronous Transfer Mode) is a dedicated-connection switching technology available for LANs as well as WANs that provides realtime multimedia transmission.
    2. WAN Technologies:
      1. Unswitched technologies: The T-carrier system is entirely digital and provides full-duplex capability via coaxial cable, optical fiber, digital microwave, and other media. The most common are the T-1 line that provides 1.5 Mbps  and the T-3 line, that provides almost 45 Mbps.
      2. Switched services:
        1. Modem dial-up is the least sophisticated but most common service.
        2. ISDN
        3. Frame relay is a new technology optimized for cost-efficient packet switching for intermittent telecommunications throughout WANs at bandwidths between .065-45 Mbps.
        4. SMDS is a new public, connectionless, packet-switched service offered by telephone companies for interconnecting LANs in different locations, providing large bandwidth exchanges between enterprises over a WAN.
        5. ATM  for WANs is the same technology as that for LANs.
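To get a feeling for the T-1 and T-3 bandwidths quoted above, the following Python sketch estimates how long a file would take to cross each line. The 10 MB file size is a hypothetical example, one megabyte is taken as 1,000,000 bytes, and protocol overhead and congestion are ignored, so real transfers would be somewhat slower.

```python
def transfer_seconds(file_megabytes, line_mbps):
    """Idealized transfer time: file size in bits divided by line speed in bits per second."""
    bits = file_megabytes * 8_000_000          # 1 MB taken as 1,000,000 bytes
    return bits / (line_mbps * 1_000_000)

for name, mbps in [("T-1", 1.5), ("T-3", 45)]:
    print(f"{name}: {transfer_seconds(10, mbps):.1f} seconds for a 10 MB file")
```

The same function can be reused with any of the other bandwidths mentioned in this section.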
  2. Internet Connections:
    1. There are three basic Internet access methods:
      1. modem connections and Online Services (like AOL) provide temporary IP addresses that are reassigned after you disconnect.
      2. LAN connections are permanent because they have permanent IP addresses and can be left on indefinitely.
    2. Private networks can restrict access to their networks.
      1. Intranets are private networks that are restricted to users inside an enterprise.
      2. Extranets are private networks that are restricted to outside organizations that are associated with an enterprise, e.g. people and corporations that deal with the enterprise, like customers, suppliers, etc.
  3. Distributed computer systems, the ultimate goal of networking, offer a robust alternative to multiuser computers.   In a multiuser system, if the central computer "goes down" every user is out of luck; in a distributed computing environment when a computer malfunctions only the user of that computer is affected.  (See FIGURE LM1-5 for a comparison of distributed computer systems versus the PC.)  Three versions of distributed PC systems are:
    1. The new "Network Computers" (NCs as opposed to PCs) are computers which have no secondary storage of their own but access all applications from and store all projects on network servers.
    2. Networked workstations, e.g. Windows NT workstations, are PCs that are interconnected as well as connected to printers, servers (e.g. file servers which are computers whose hard disk is accessible to everyone in the network), net modems, etc.
    3. NetPCs and WebPCs are stripped down PCs (but containing local secondary storage) designed specifically to be part of a network via which they access data, application software, etc.  Their locally stored software is installed, maintained, and updated, via the network, under centralized control.


    FIGURE LM1-5

SAQ 16:

2. THE INTERNET:

(See the nice Internet description at How Stuff Works.)
(How Stuff Works is a COOL site; I suggest you explore it!)

2.1 The Internet is a Wide Area Network (WAN):

  1. The Internet (with a capital "I") is a network of networks within which all devices communicate via the TCP/IP protocol suite.  (The terms "intranet" and "extranet"  refer to private networks and extensions of private networks based on TCP/IP.)  It is a "meganetwork" linking (as of 1999) over 100,000 networks, at least 44 million hosts and approximately 150 million people in virtually every country in the world. (These numbers, from the Internet Society, are "guesstimates" because it is virtually impossible to measure them, and they increase daily; it is estimated that the Internet population increases 15% per month! See the MIDS graph of Internet growth.) The latest density of computers on the Internet is shown in Figure OOC-5. The Internet links government agencies, educational institutions, businesses, libraries, science foundations, non-profit organizations, etc.  (Also check out the various fascinating maps from An Atlas of Cyberspace; however, be aware that some of these pages take a long time to access because of their complex graphics.)
    1. No one runs the Internet; it is like a cooperative, i.e. a federation of independent networks. The Internet Society, a non-profit group in Reston, Va., promotes the use of the Internet.
    2. It has an open architecture, meaning anyone can connect up and use it.
    3. It is a chaotic source of undisciplined information, an often bewildering maze to navigate.


    FIGURE LM1-6
    The Density of Computers in the Internet
    (For a larger version of this illustration click here.)

  2. The Internet can be thought of from three viewpoints: a huge, dynamic network of computers, a collection of protocols, or a collection of dynamic services.  Each of these views is briefly described below.
    1.  A physical network: it is a World Wide Network (i.e. a __________(9)) that is a maze of telecommunication lines which interconnect smaller networks.  For example, our Compton laboratory networks are part of the FSU network, which is part of the University of Maryland System network, which is part of the Internet; technically, every FSU network computer is part of the Internet.
      1. Internet access is provided by ISPs (Internet Service Providers), companies that maintain Internet connections and rent their services to other ISPs or individuals.  In general, there are three categories of ISPs, local, regional, and national. (See Figure LM1-7.)
      2. The national ISPs, like MCI, Sprint, AT&T, etc. maintain "backbones" that act as "trunklines" that carry huge composite transmissions over long distances. In the U.S., access points to these backbones and the places where data moves from one backbone to another are one of two types:
        1. NAPs (Network Access Points), also called Internet Exchanges (IXs), are junction points where national ISPs interconnect with each other.
        2. MAEs (metropolitan area exchanges) are NAPs that are strategically located to facilitate efficient transfers between different backbones.
        More information about ISPs and backbones can be found at Boardwatch's informative Web site,
      http://boardwatch.internet.com/
      1. In the idealized illustration below, a user would access their local ISP in Doylestown via a modem.  The local ISP links to the regional ISP which, in turn, links to the backbone of a national ISP.  Every computer in this schematic is part of the Internet (the individual using a modem is only temporarily a part); this graphically illustrates that the Internet is a network of networks.  For a thorough comparison of commercial ISPs see CNET's analysis.
      FIGURE LM1-7
      Subnetworks of the Internet and Their ISPs
      1. For a better idea of the backbones in operation in the U.S. click here.
      2. Every device connected to the Internet has an Internet address that has two forms:
        1. The numeric IP address is used by the computer system and network.  It is a four byte number expressed, for humans, as four decimal numbers separated by periods, such as "131.118.80.1", the IP address of the DNS (Domain Name System; see section 3.5.C) server at FSU. Valid addresses thus range from 0.0.0.0 to 255.255.255.255, a total of about 4.3 billion addresses!
        2. The URL (Uniform Resource Locator) is a more understandable text address, used by humans, that contains the "name" of the computer that corresponds to its IP address.  For example, the URL of this Web page that you are reading contains "www.frostburg.edu", which is the domain name of the server on which the Web site of this course is stored.  This name must be translated to its IP address before it can be used by networked computers; this translation is the job of the DNS server (mentioned above). (Note: the rest of the text in the URL specifies the protocol (http) used and the specific location of this page in the computer's files.  This will be covered in section 3.6, below.)
        NOTE: Internet addresses should not be confused with e-mail addresses.
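The two forms of an Internet address can be explored with Python's standard library. The sketch below treats the dotted address quoted above as the single four-byte (32-bit) number it really is, and then splits a URL into its protocol, domain name, and file location. The path "/dept/cosc/index.htm" is a made-up example, not an actual page of this course.

```python
import ipaddress
from urllib.parse import urlsplit

address = ipaddress.IPv4Address("131.118.80.1")
print(int(address))             # the same address written as one 32-bit integer
print(address.packed.hex())     # its four bytes in hexadecimal
print(2 ** 32)                  # how many distinct 32-bit addresses exist

# Hypothetical URL: only the domain name comes from the text above.
url = urlsplit("http://www.frostburg.edu/dept/cosc/index.htm")
print(url.scheme)               # the protocol
print(url.hostname)             # the domain name that DNS must translate
print(url.path)                 # the location of the page in the server's files
```

The count printed by the third line is where the figure of "about 4.3 billion addresses" comes from.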
    1. A collection of protocols which are conventions (rules) that govern the translation of digital data into and out of "packets" of binary data which can be transmitted over a network, e.g. the Internet. Protocols govern format, timing, sequencing, and error control. Without these rules, a computer cannot "understand" a stream of bits coming to its network connection. The protocols particular to the Internet are part of TCP/IP (Transmission Control Protocol / Internet Protocol) which is actually a collection, or "suite", of protocols which form the basis of communications over the Internet. They are routable (i.e. __________(10) switching) protocols, which means transmissions are broken into packets which may be sent over different routes before arriving at a single destination where the packets are reassembled into the original message.

    2. Note that other network protocols, e.g. NetBIOS (IBM networks), NetBEUI (Microsoft), IPX (Novell networks), DECNet (DEC), etc., will be ignored in this course because they are not associated with the Internet.
    3. An ever increasing, conceptual network of Internet resources accessed by Internet services. (See section 2.2.) The resources are typical client-server environments.
SAQ 17: What is the (a) similarity and (b) difference between an IP address and a URL?

2.2 The Internet provides a wide variety of "Services":

        Internet services are provided by application programs that implement protocols that are components of the TCP/IP suite. (NOTE: Most of these services are not unique to the Internet, e.g. e-mail, chat, etc., but others are specific to the Internet, e.g. the World Wide Web.) They fall into three categories:

  1. Communication Services.  (For more details see Learning Module III, section 3.)
    1.  E-mail enables individuals to exchange electronic messages; it is a network facility that provides users with a "mailbox" file, where messages are stored. Correspondence can be directed to specific users (with security) as well as to specified groups. Local mail is sent via the "mailer" program in system software. Non-local e-mail is routed over a __________(11) such as the Internet.
      1. E-mail includes "Talk" or "Phone" services which, like "chat" (See 2.2.A.d, below.), facilitate real-time, interactive text transfers (not voice) between two Internet users.
      2. SMTP (Simple Mail Transfer Protocol), POP (Post Office Protocol), and IMAP (Internet Message Access Protocol) are e-mail protocols of the TCP/IP suite.  SMTP is used to send e-mail, while POP and IMAP are used by the e-mail client to retrieve it from the server, which makes e-mail more user friendly.  POP allows users to download e-mail from a mail server to a PC where it can be read, answered, and stored on a hard disk.  IMAP is even better because it allows you to manipulate your e-mail account on the server.
      3. Note that Web based e-mail accounts, like Yahoo Mail and FSU's Sun Interface, use the Web protocol, HTTP, as an interface to their e-mail servers.
    2. Newsgroup Services (e.g. Usenet or Internet News) exchange messages called articles arranged according to specific categories called newsgroups. Here the messages are passed from one system to another, not between individuals using e-mail. Unlike mailing lists these transmissions are not automatic, they must be requested by the user via local client software.
    3. Mailing lists allow computers to subscribe to the mass communications on a specified subject. Any e-mail received by a mailing list server is automatically forwarded to all subscribers.
    4. Chat/IM applications facilitate real-time group communication by enabling users to join rooms or "channels" where all members receive a copy of a message sent to the channel they are visiting. (Private conversations can be arranged.) IRC (Internet Relay Chat) was the first such application but is limited to text messages.
      1. Instant Messaging (IM or IMing) is a modern extension of chat technology that adds features like "buddy lists", automatic notification when a buddy comes online, multiperson conferences, user profiles, filters, message histories, etc. Popular IM applications include AIM (AOL IM), ICQ (for "I seek you"), Yahoo Messenger, and Microsoft Network Messenger Service (MSNMS).   A public domain IM is Jabber.
      2. Some chat applications utilize multimedia to create virtual reality (VR) environments where users can assume an identity, called an "avatar", which moves through the chat environment interacting with the avatars of other users.
    5. Teleconferencing refers to real-time computer-based, audio/video interaction of two or more remote stations. Current chat applications apparently will evolve into full featured teleconferencing software.
      1.  Audio communication became possible using microphones and computer speakers.
      2. Graphics communications allow both users to type or draw on a common "whiteboard" or even modify an image loaded from a graphics file. Netscape Conference is Communicator's teleconferencing facility that allows audio and whiteboard communication.
      3. Video communication is possible using images from digital cameras. The freeware applications Microsoft NetMeeting (which we will use during this course) and iVisit provide this between microcomputers. Multimedia transmissions require huge bandwidth so at present teleconferencing applications and "Video Phones" are rather primitive, especially if they involve color video transmissions between microcomputers.
      4. A good resource on all types of Internet conferencing (including chat, IM, etc.) is About Internet Conferencing.
SAQ 18: What are the similarities and differences between e-mail and voice mail?
SAQ 19: Distinguish between (a) e-mail, (b) mailing lists, and (c) newsgroups.
SAQ 20: (a) What is the difference between chat, on one hand, and e-mail, Usenet, and mailing lists on the other?
SAQ 21: What is the difference between chat and teleconferencing?
  1. Resource access services. (For more details see Learning Module III, section 2.)
    1. File Transfer allows a network user to copy a file from one computer to another. It is typically used to "download" public domain (free) software or shareware (minimal cost paid, on an honor system, after a trial period) which has been "uploaded" (copied from a user's computer to the file server). FTP (File Transfer Protocol) is part of the TCP/IP suite. Archie is FTP's associated search engine; it indexes FTP sites so that the user can determine what is available. An Archie search scans FTP sites and then offers a searchable database of the files it finds. These can then be downloaded via FTP. Archie has lost significance with the growth of the Web, but FTP is still the vehicle used to move files on the Internet.
    2. Remote Logon allows a computer user to access another (multiuser) computer, i.e. to log on to and use that computer as if his/her computer were directly connected to that computer. The user's CPU and operating system are "bypassed" and the user's computer simply becomes a terminal connected to the remote computer. The Telnet protocol provides this in TCP/IP.
  2. Information retrieval services unique to the Internet.  (For more details see Learning Module III, section 1.):
    1. The World Wide Web, the focus of this course, is called "THE Internet Killer Application" because its popularity is literally exploding!  Since 1994 it has not only dominated all other WANs (See the next section.) but all other services of the Internet, itself. "The Web" enables users to "browse" documents on remote servers using the HTTP (hypertext transfer protocol, a member of the TCP/IP suite). Everything (documents, menus, pictures, etc.) is represented to the user as a hypertext object (where clicking on the object activates a link to another object which can be within the document, in another file, or on another Internet resource).
      1. Typically, Web "pages" are accessed by a "browser" (e.g. Netscape Navigator), which interprets documents written in HTML (Hypertext Markup Language). "Search engines", like Google, and "Search Directories", like Yahoo, are programs that allow browsers to search for Web pages with specified key words. Browsers actually provide many of the other TCP/IP services such as e-mail and FTP, which are usually built in, and remote logon which is added by "plug-in applications".
      2. VRML (Virtual Reality Modeling Language) is a developing standard that is designed to allow users to view the Web as a 3D virtual environment. The WWW has been
    2. Gopher/Veronica allows the user to access files on remote servers; the file names are presented as hierarchical menus. Veronica is a "search engine" which allows one to look for specific information on gopher servers, but, like Archie, is insignificant compared to the Web.
    3. WAIS (Wide Area Information System) is an automated Internet search service that allows users to locate documents containing key words or phrases, but, like Archie and Gopher/Veronica, has been almost completely superseded by the Web.

TPQ 5: Think up a comprehensive collection of WITS/DB questions (See examples at the end of section 2.2.A.) that will help you distinguish Internet services of sections B and C, above.

2.3 The Internet is Governed by the suite of TCP/IP protocols:

(For more detail, see LM IV of COSC 120,  an overview of TCP/IP.)

    TCP/IP makes it possible for two computers which are part of different networks, that are connected by routers or gateways, to exchange data. This complex process involves the collective, cooperative interactions of several protocols of the TCP/IP suite, depending on the particular service being used.  (An outstanding, detailed illustration of the TCP/IP protocols and network services in their associated OSI levels is available from http://www.whatis.com/osifig.htm.)
 In the following presentation, we begin at the highest level with a client sending a message to a server.

  1. Application protocols occupy the highest protocol layers and  provide specific services.  Unfortunately the application protocols of the TCP/IP suite do not fit nicely into one of the OSI layers.  The WhatIs diagram (referenced above) places them in the sixth (presentation) layer, but adds the caveat that they overlap the adjacent layers.  I prefer to simply place them in the top three layers of the OSI model, i.e. ignore the distinction in these layers as done in COSC120 LMIV, Figure TCP/IP-1.
    1. FTP (File Transfer Protocol) permits files to be transferred from one computer to another using a TCP connection. Transferring files from a server to a client is called ___________(a) and from client to server is called __________(b).  A related but less common file transfer protocol, Trivial File Transfer Protocol (TFTP), uses UDP rather than TCP to transfer file data.
    2. HTTP (hypertext transfer protocol) facilitates the viewing of multimedia files (text, graphic images, sound, video, etc.) from the World Wide Web. The essential  feature of HTTP is that it manages files that can contain hyperlinks to other files whose selection will produce additional transfer requests. To accomplish this, all Web servers contain an HTTP daemon, a program that is designed to wait for HTTP requests and handle them when they arrive.
    3. SMTP (Simple Mail Transfer Protocol) specifies the format of messages that an e-mail client on one computer can use to send (or receive) electronic mail to (from) an SMTP server on another computer.  Now SMTP is usually used to send e-mail while POP (Post Office Protocol) and IMAP (Internet Message Access Protocol), two other e-mail protocols, are used to read it.  E-mail clients that read mail with POP or IMAP still use SMTP to send it, but these two protocols make e-mail more user friendly.  POP allows users to download e-mail from a mail server to a PC where it can be read, answered, and stored on a hard disk.  IMAP is even better because it allows you to manipulate your e-mail account on the server.
    4. SNMP (Simple Network Management Protocol) is the protocol governing network management and the monitoring of network devices and their operation. It is not necessarily limited to TCP/IP networks.
    5. NNTP (Network News Transfer Protocol) allows client software, called "newsreaders", to access, read, reply to, or post messages on Usenet newsgroup servers, the electronic equivalent of a bulletin board.  NNTP servers, typically provided by ISPs, store the Usenet messages and provide the software to manage them.  NNTP client software is typically integrated into your browser, but it can be implemented in a separate newsreader, which you may prefer to your browser implementation. NNTP replaced the original Usenet protocol, UUCP (UNIX-to-UNIX Copy Protocol).  NOTE: this was misleadingly omitted in the WhatIs diagram where they used "UseNet" (which is the service) instead of this protocol.
    6. Telnet is the TCP/IP protocol for remote logon.  Using Telnet, one can log on to a remote network computer as a regular user with whatever privileges that have been granted on the host computer.  Before the advent of the Web, Telnet was more frequently used, but now, with Web page "front ends" to services like e-mail servers, it is not needed.   For example, e-mail users used to have to actually log on to their e-mail server in order to use their account, but with a Web page front end, they can access their account via a browser.  Therefore, Telnet is now only needed by users who want to use specific applications or data stored on a particular host computer.
    NOTE: The WhatIs diagram includes two services (DNS and NFS, which are not, themselves, protocols) in the same level as the preceding protocols.  Do not let this confuse you; all protocols, except Telnet, end in "P".
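To make the HTTP item above more concrete, here is a hedged Python sketch of the text that passes between a browser and a Web server's HTTP daemon. Nothing is actually sent over a network; the helper names, the path "/index.htm", and the sample reply are hypothetical, and only the general request-line and status-line layout of HTTP/1.0 is being illustrated.

```python
def build_get_request(host, path="/"):
    """Compose the text a browser would send to request one file."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"

def parse_status_line(response):
    """Split the first line of a server reply into version, status code, and reason."""
    first_line = response.split("\r\n", 1)[0]
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason

request = build_get_request("www.frostburg.edu", "/index.htm")   # hypothetical path
print(request)

# A made-up reply of the kind an HTTP daemon returns.
sample_reply = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
print(parse_status_line(sample_reply))
```

Each hyperlink a user selects causes the browser to compose another request of this form, which is why a single page visit can produce many transfers.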
  2. Other emerging Internet protocols include:
    1. WAP (Wireless Application Protocol) is actually a family of protocols, developed by Ericsson, Motorola, Nokia, and Unwired Planet, that standardize communications between wireless devices, e.g. cellular telephones, PDAs (personal digital assistants), etc.  WAP facilitates Internet access, including e-mail, the World Wide Web, newsgroups, IRC, etc., on wireless devices.  The family of WAP protocols includes:
      1. Wireless Application Environment (WAE)
      2. Wireless Session Layer (WSL)
      3. Wireless Transport Layer Security (WTLS)
      4. Wireless Transport Layer (WTP)
SAQ 22: What are the applications within Netscape Communicator suite that implement a particular protocol?
  1. TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) facilitate the transmission of data streams (e.g. a complete e-mail message) between applications running on different hosts. They are end-to-end (transport) protocols that manage the exchange between sender and receiver without reference to the network path between them (that is the job of _______(12)); of the two, only TCP is connection-oriented.
    1. TCP is a "reliable" protocol because it guarantees reliable delivery of the complete transmission by performing the error checking and handshaking necessary to verify that data makes it to its destination intact.
      1. TCP divides data streams into blocks called TCP segments and transmits them using IP. In most cases, each TCP segment is sent in a single IP datagram. If necessary, however, TCP will split segments into multiple IP datagrams that are compatible with the physical data frames that carry bits and bytes between hosts on a network. Because IP doesn't guarantee that datagrams will be received in the same order in which they were sent, TCP reassembles TCP segments at the other end to form an uninterrupted data stream. FTP and telnet are two examples of popular TCP/IP applications that rely on TCP.
      2. TCP sets up a connection at both ends of a transmission and uses checksums to verify the data integrity and handshaking.  It also manages the division of the message into uniform packets.  These packets are independent and may be sent via different paths through a network; when they are received by the TCP layer of the receiving computer it reassembles the packets into the original message.
      3. With TCP, data is transmitted in packets called TCP segments, which contain TCP headers and data from a higher level application.
    2. UDP is an "unreliable" protocol because it doesn't guarantee that UDP packets will arrive in the order in which they were sent or even that they will arrive at all. If reliability is desired, it's up to the application to provide it.
      1. UDP is a simpler, more primitive alternative to TCP.   However, UDP does have a place in the TCP/IP suite, and a number of applications use it, e.g. SNMP (Simple Network Management Protocol) applications which are provided with most implementations of TCP/IP.
      2. Unlike TCP,  UDP does not divide its data packets nor does it provide sequencing of packets. This means that the application program that uses UDP must be able to make sure that the entire transmission has arrived and is in the right order.
      3. Network applications, like streaming audio or video, prefer UDP because TCP's error checking and retransmission would interrupt the real-time continuous flow that streaming technologies require. Also applications that need to save processing time because they have very small data units to exchange (and therefore very little message reassembling to do) may prefer UDP to TCP.
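The checksums that TCP uses to verify data integrity can be illustrated with a short Python sketch. It computes a 16-bit ones' complement checksum of the kind carried in TCP and UDP headers, but it is a simplification: a real TCP checksum also covers the TCP header and some IP address information, and the payload and the way the checksum is appended here are assumptions made for the demonstration.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of the data."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                        # fold any carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = b"Hello, COSC 330!"                 # 16 bytes, so no padding is needed
segment = payload + internet_checksum(payload).to_bytes(2, "big")

print(internet_checksum(segment) == 0)        # receiver's test on the intact segment
damaged = b"J" + segment[1:]                  # one byte altered in transit
print(internet_checksum(damaged) == 0)        # receiver's test on the damaged segment
```

When the receiver's test fails, TCP arranges for the data to be sent again; with UDP the damaged datagram is simply dropped, and any recovery is left to the application.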
  2. IP (Internet Protocol), a lower-level protocol than TCP or UDP, governs the transmission of data packets throughout a computer network.
    1. IP is responsible for packet routing, i.e. selecting the path that data packets (called IP datagrams) will follow to efficiently reach their destination.  This involves utilizing routers to "hop" between different networks, i.e. separate networks are tied together by the routers thus forming the Internet or an intranet.
    2. IP manages the address part of each IP datagram, ensuring that it is sent to the correct destination. Each gateway or router the packet traverses checks this address and forwards the message along the most efficient route.  Connections in a TCP/IP network are specified by 32-bit IP addresses, which are represented, for humans, as dotted decimal numbers, expressed as four decimal numbers separated by periods.  Valid addresses thus range from 0.0.0.0 to 255.255.255.255, a total of about 4.3 billion addresses.  (For example, Tony's Office Mac is 131.118.83.3 and PC is 131.118.74.21).
    3. IP could be called "the most fundamental of the TCP/IP protocols" because every other protocol depends on it; it is the foundation of the TCP/IP stack (of protocols).
    4. Other network layer protocols, that play less visible but equally important roles in TCP/IP networks, include:
      1. ARP (Address Resolution Protocol): A protocol for converting an IP address to the actual address of the computer that is recognized in the local network. For example, if the computer is on an Ethernet LAN, the 32-bit IP address must be converted to a 48-bit Ethernet address. (The physical machine address is also known as a Media Access Control or MAC address.) A table, usually called the ARP cache, is used to maintain an association between each MAC address and its corresponding IP address. ARP provides the protocol rules for making this connection and providing address conversion in both directions.
      2. RARP (Reverse Address Resolution Protocol): It converts physical network addresses into IP addresses, i.e. it is the reverse of ________(13).
      3. ICMP (Internet Control Message Protocol) is an extension to the Internet Protocol (IP) that allows for the generation of error messages, test packets and informational messages related to IP.  ICMP is a "support protocol" that uses IP to communicate control and error information regarding IP packet transmissions.  It allows IP routers to send error and control messages to other IP routers and hosts. If a router is unable to forward an IP datagram, for example, it uses ICMP to inform the sender that there's a problem. ICMP messages travel in the data fields of IP datagrams and are a required part of all IP implementations.
    5. A rather advanced tutorial on IP addresses and routing is found at http://www.sangoma.com/fguide.htm.  (There is no need to read this unless you really want to know what all the numbers of an IP address mean.)
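The relationship between a dotted decimal address and the underlying 32-bit number can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the address used is the example quoted above.

```python
import ipaddress

def dotted_to_int(dotted: str) -> int:
    """Convert a dotted decimal IPv4 address to its 32-bit integer value."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid dotted decimal address: {dotted!r}")
    value = 0
    for p in parts:
        value = (value << 8) | p  # each decimal number fills one 8-bit field
    return value

def int_to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

address = "131.118.83.3"  # example address quoted in the text
as_int = dotted_to_int(address)
print(address, "=", format(as_int, "032b"))  # the 32 bits the network sees
print(int_to_dotted(as_int))                 # and back again
print(2 ** 32, "possible addresses")         # size of the address space

# The standard library's ipaddress module performs the same conversion.
assert as_int == int(ipaddress.IPv4Address(address))
```

Each of the four decimal numbers occupies one byte, which is why no number in an address can exceed 255.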
SAQ 23 : What are the significant (a) similarities and (b) differences between TCP and UDP?
  1. SLIP and PPP are two protocols that allow two computers to communicate via a serial connection (in which bits are transmitted sequentially); thus they correspond to OSI layer 2. Both transmit packets over serial links (either dedicated or dial-up lines). They are most commonly used to allow modem/telephone connections to the Internet via an ISP, but they can also be used to provide dial-up access between any two networks. For example, an ISP that provides users with SLIP or PPP access to its server gives them Internet access as long as the dial-up connection is maintained. However, a modem connection to the server via a serial line is typically slower than the parallel or multiplex lines (such as a T-1 line) of any network that is used to access the Internet directly.
    1. SLIP (the older of the two protocols) was invented to be used for communication between two computers that can be previously configured for communication with each other.  Basically it encapsulates TCP/IP packets with headers and trailers, thus allowing them, for example, to be sent via a modem/POTS to your ISP.
    2. PPP (Point-to-Point Protocol) provides a similar facility to SLIP, but, being more sophisticated, has largely replaced the older protocol.  PPP works with IP, but is designed to manage other protocols as well. Therefore, it is not necessarily part of the TCP/IP suite but is usually considered to be so.
      1. PPP is a full duplex protocol that can be utilized with various kinds of media, including twisted pair, fiber optic lines, or satellite links.
      2. The advantages of PPP over SLIP include the facts that PPP:
        1. can establish and terminate a communication session, as well as hang up and redial if a low-quality channel occurs,
        2. can manage both synchronous and asynchronous communications,
        3. can share a communications channel with other protocols,
        4. provides address notification, via which a server informs a dial-up client of its IP address for the current session, and
        5. has built-in error detection.
      Connected: An Internet Encyclopedia, has a more detailed (but still concise) description of PPP at
      http://cth.ccsl.com.np/CIE/Topics/65.htm.
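To make the idea of framing packets for a serial line concrete, here is a minimal sketch of SLIP-style framing. It assumes the special byte values defined in RFC 1055 (they are not given in this module), and the function names are hypothetical. It marks packet boundaries only; it has none of PPP's session management or error detection.

```python
# Special byte values, assumed from RFC 1055 (not stated in this module).
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD

def slip_encode(packet: bytes) -> bytes:
    """Frame one packet for a serial line: delimit it with END bytes and
    escape any END or ESC bytes that occur inside the packet itself."""
    out = bytearray([END])
    for b in packet:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)

def slip_decode(frame: bytes) -> bytes:
    """Recover the original packet from a single framed packet."""
    out = bytearray()
    escaped = False
    for b in frame:
        if escaped:
            out.append(END if b == ESC_END else ESC if b == ESC_ESC else b)
            escaped = False
        elif b == ESC:
            escaped = True
        elif b != END:  # END bytes only mark the frame boundaries
            out.append(b)
    return bytes(out)

packet = bytes([0x45, 0xC0, 0x00, 0xDB, 0x1C])  # made-up bytes for illustration
frame = slip_encode(packet)
print(frame.hex(" "))
assert slip_decode(frame) == packet
```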


    NOTE: There are no TCP/IP protocols that correspond to the OSI layer 1.  The TCP/IP suite must use separate layer 1 protocols such as ISDN, ADSL, ATM, etc. to provide the actual connection to the physical medium over which the message is to be transmitted.

SAQ 24: What are the most commonly used TCP/IP protocols?

2.4 THE TCP/IP TRANSMISSION SEQUENCE (TCP/IP ARCHITECTURE):

  1. FIGURE TCP/IP-1 illustrates TCP/IP's layered design, showing the relationships among its most important protocols.   FIGURE TCP/IP-3 illustrates how data, in preparation for transmission, is encapsulated at each TCP/IP layer with "headers" and "trailers" and, after reception, how these are stripped off, interpreted, and acted upon in the receiving computer.
    1. FIGURE TCP/IP-3 shows that, as a unit of data "flows downward" (a figure of speech) from a client application to the network interface card, it is encapsulated at each of a succession of TCP/IP layers until it forms a "packet" that can be successfully routed over the internet to its destination.
    2. At each layer, it is encapsulated with layer data required by the equivalent TCP/IP layer of the receiver computer.
    3. If the network being used is Ethernet, the Ethernet card creates a standard Ethernet frame that encapsulates the data unit and its TCP and IP headers.
    4. The operations of the layers of the destination computer on the Ethernet frame are the reverse of those of the sender.  The data link layer strips off the Ethernet header and trailer and passes the IP datagram to the IP layer; the data unit is then passed up, with each layer removing and interpreting its own header, until the original data is supplied to the receiving application, which can then process it.
  2. Example: To illustrate the process of sending a transmission via TCP/IP consider a Web transmission, i.e. a Web browser (the client) uses HTTP to request the download of a Web page (HTML data) from a Web server attached to the Internet.
    1. The browser first creates a virtual connection (called a "socket") to the server where the Web page is stored.
    2. To download a Web page, the client sends an HTTP GET command (a sequence of bits) to the server by writing the command to the socket. Figure TCP/IP-4 shows that:
      1. the socket software uses TCP to add a header to the GET command thus forming a TCP segment and
      2. the segment is "passed" to the IP module, which in turn adds its header forming an IP datagram
      3. the datagram is then "passed" on to the data link layer of the particular network (e.g. Ethernet) which ultimately encapsulates the datagram with a header and trailer forming a frame
      4. the frame is finally forwarded, over the network,  to the Web server.
    3. If the browser and the Web server are running on computers connected to different physical networks (as is usually the case), the set of frames that make up the whole message go from network to network until they reach the one to which the server is physically connected. The different frames can follow different routes over the network.  Ultimately, the frames are delivered to their destination and reassembled so that the Web server, which reads chunks of data by performing reads on its socket, sees a continuous stream of data.
    4. To the browser and the server, data written to the socket at one end shows up at the other end, as if by magic. However, underneath, all sorts of complex interactions have taken place to create an illusion of seamless data transfer across networks.
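The encapsulation sequence described in this section can be mimicked with a toy model. The "headers" and "trailer" below are just labelled placeholders (real TCP, IP, and Ethernet headers are binary structures with many fields), and the file name in the GET command is hypothetical.

```python
# Placeholder headers/trailer: stand-ins for the real binary structures.
TCP_HEADER = b"[TCP-HDR]"
IP_HEADER = b"[IP-HDR]"
ETH_HEADER = b"[ETH-HDR]"
ETH_TRAILER = b"[ETH-TRL]"

def encapsulate(data: bytes) -> bytes:
    """Sender side: wrap application data layer by layer."""
    segment = TCP_HEADER + data                  # TCP forms a segment
    datagram = IP_HEADER + segment               # IP forms a datagram
    frame = ETH_HEADER + datagram + ETH_TRAILER  # data link layer forms a frame
    return frame

def strip(unit: bytes, header: bytes, trailer: bytes = b"") -> bytes:
    """Remove one layer's header (and trailer, if any) from a data unit."""
    if not (unit.startswith(header) and unit.endswith(trailer)):
        raise ValueError("unexpected header or trailer")
    return unit[len(header):len(unit) - len(trailer)]

def decapsulate(frame: bytes) -> bytes:
    """Receiver side: the reverse sequence, ending with the original data."""
    datagram = strip(frame, ETH_HEADER, ETH_TRAILER)
    segment = strip(datagram, IP_HEADER)
    return strip(segment, TCP_HEADER)

request = b"GET /index.html HTTP/1.0\r\n\r\n"  # hypothetical GET command
frame = encapsulate(request)
print(frame)
assert decapsulate(frame) == request
```

Note that the outermost wrapper (the frame) is added last by the sender and removed first by the receiver.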
SAQ 25: List, in sequence, the TCP/IP headers and trailers that are added to an e-mail message.
SAQ 26: In FIGURE TCP/IP-3, an HTTP header corresponds to what?

2.5 USING TCP/IP:

  1. The TCP/IP software on a computer provides platform-specific implementations of TCP, IP, and other members of the TCP/IP suite. Modern PC operating systems have TCP/IP applications bundled within the O.S.; older operating systems like Windows 3.1/DOS required that TCP/IP software be installed before Internet connections could be established.
  2. Modern software bundles all the TCP/IP protocols in a "TCP/IP stack"; this term reflects the hierarchy of these integrated protocols. The application layer protocols include (but are not limited to) the World Wide Web's Hypertext Transfer Protocol (HTTP), the File Transfer Protocol (FTP), Telnet, and the Simple Mail Transfer Protocol (SMTP).
  3. When you are given access to the Internet (e.g. by your ISP) you will be provided with software that incorporates TCP/IP applications.  Every other computer on the Internet (or on corporate intranets or extranets) has a similar TCP/IP stack, although the stacks may come from different companies.  The operations of this stack of programs are completely invisible to the user. In other words TCP/IP, as far as the user is concerned, simply turns innumerable small, unknown networks into one big one (the Internet or an intranet) and provides all the services needed for applications to communicate with each other over that network.
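From a program's point of view, the bundled stack is reached through the operating system's socket interface. The sketch below is only an illustration under stated assumptions: it uses a locally connected pair of sockets (so no network or real Web server is involved), and the request text is hypothetical. It shows that an application simply writes bytes at one end and reads them at the other; everything in between is handled by the operating system.

```python
import socket

# A connected pair of local sockets stands in for a browser/server connection.
browser_end, server_end = socket.socketpair()

request = b"GET /index.html HTTP/1.0\r\n\r\n"  # hypothetical request
browser_end.sendall(request)                   # the "browser" writes to its socket

# The "server" reads chunks until the whole request has arrived.
received = b""
while len(received) < len(request):
    chunk = server_end.recv(1024)
    if not chunk:
        break
    received += chunk

browser_end.close()
server_end.close()

print(received.decode("ascii"))
assert received == request
```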
3. THE WORLD WIDE WEB:

        In section 2.2, we specified three "information retrieval services",_____________________ (14), _______(15), and _______(16) that are unique to the Internet.  The latter two are no longer important because their sites have by now been almost completely replaced by equivalent Web sites.  Therefore information presentation and retrieval, for the foreseeable future, will be centered on the Web;  COSC 120 is mainly based on the search and retrieval aspects whereas COSC 330 focuses on the presentation aspect.

3.1 The Web Concept:

  1. The World Wide Web (Web, WWW, or W3) is a distributed, hypermedia information retrieval system. It is neither an application nor a protocol like Telnet, FTP, and Gopher (HTTP is the protocol of the Web).  Instead, it is an invisible network (or web) within the larger network of the Internet. It can be thought of in at least two ways:
    1. as a network of computers, i.e. a subnet of the Internet whose protocol is ______(17) and
    2. as a web of documents, i.e. a distributed "virtual database" of multimedia documents, written in ______(18), whose content is accessed by hyperlinks.
  2. The nonlinear nature of documents accessed by hyperlinks puts the "web" into the Web. (See FIGURE LM1-8.)  A location (text phrase or graphic) in any document can be linked to
    1. another location within the same HTML document, i.e. a "target" in the same HTML file.
    2. another document on the same computer (typically, but not necessarily another HTML document (file)) , or
    3. another document on another computer (________(19) server) on the Internet.
    All these documents are accessed by a client program, called a __________(20).
FIGURE LM1-8
Hypertext vs. Normal text

3.2 History of the World Wide Web:

  1. The concept of the Web is attributed to Tim Berners-Lee of CERN, the European Laboratory for Particle Physics in Geneva, Switzerland, who first proposed it in 1989; CERN developed the first WWW prototype in 1990. (Streaming multimedia interview on ZDTV's "Big Thinkers") In the document About the World Wide Web, he wrote about his vision of the Web, "the universe of network-accessible information, an embodiment of human knowledge." You can access that document at http://www.w3.org/hypertext/WWW/WWW
  2. Berners-Lee wanted a single means of access (one client) to the diverse services of the Internet. (See FIGURE LM1-7.)
     
    FIGURE LM1-9
    Web Access to Net Services
  1. To overcome problems of incompatibility between different sorts of computers, the WWW introduced the principle of "universal readership," which states that networked information should be accessible from any type of computer in any country, with one easy-to-use program.
  2. The first Web documents were only hypertext, and thus not so inspiring as the multimedia documents that make up the Web of today. The first multimedia browser, Mosaic, was developed by Marc Andreessen, Eric Bina, and others at the National Center for Supercomputing Applications (NCSA) at the University of Illinois. However, it was not until Andreessen left NCSA, co-founded Netscape Communications, and developed the browser, __________ __________(21) that the popularity of the Web really exploded.
3.3 Advantages of the Web:
  1. The Web facilitates multiple protocol support. (See FIGURE LM1-9.) To access any Internet service, all one needs to do is type the URL type (associated protocol or keyword) followed by the domain name (file location), e.g.

    http://www.fsu.umd.edu/<path to some HTML File>

    accesses an unspecified Web page on FSU’s web server; the http designates the URL type. (Sometimes, as in the case of http, this is the same as the protocol.) The www.fsu.umd.edu identifies the server and <path to some HTML File> is a generic symbol for a sequence of directory names followed by a specific file name.

SAQ  27: Give the equivalent of <path to some HTML File> for this page you are reading.
    Other URL types include  ftp, telnet, mailto, news, gopher,  wais, etc.; when they are typed into a browser, it invokes the associated protocol and accesses that Internet service.
  1. The Web is designed to provide access to distributed, dynamic, and platform independent information.
    1. A distributed system is one in which computer resources are distributed throughout a communications network.  Each of the networked computers is designed to handle its local workload but has access to all the resources of the network.   The network itself supports the system as a whole, based on the client-server model.  This is the opposite of a centralized multi-user computer like a mainframe. The amount of information which can be stored on the Internet is limited only by the number of computers and their collective storage space. Thus the Net effectively has an infinite storage capacity!
    2. The content of the Net is constantly changing and evolving. This dynamic nature of the Internet means that users have access to the most up-to-date information possible, like a living, unlimited, multimedia encyclopedia. The disadvantage of dynamic information is that it can disappear if the network connection is blocked or the file is moved (or removed) from its server; the resulting "dead links" are the bane of the Web user!
    3. What makes the Web so radically different from other computer facilities is that it is "platform independent", i.e. it can be accessed from any kind of computer and any operating system. All one needs is a browser designed for the operating system you use; the browser GUI is thus the same on all computers. The Web documents are written in HTML, a platform independent language, which means they can be stored on and accessed from any kind of computer system, as long as it implements TCP/IP.
  2. Unlike most Internet services, access to Web information is user friendly in that it is interactive and easily explored.
    1. What makes the Web so interactive is its ability to accept information from users and perform various actions based on these responses. This is accomplished by using various techniques including:
      1. forms, a special Web page that includes text fields, check boxes, radio buttons, menus, and popup lists that give the user the ability to interact with the Web server.  (See the text, Chapter 12.)
      2. JavaScript (see below)
      3. Java (see below)
      4. proprietary technologies such as
        1. Macromedia Flash (see below)  (See the text, Chapter 21.)
        2. Director Shockwave (see below) (See the text, Chapter 21.)
    2. Web access is based on hypertext which allows hyperlinks to be embedded in text; this has been extended, in "hypermedia", to embed hyperlinks in graphic images as well. It is now possible to move between Net documents by pointing and clicking, without needing to know the physical name of the file or even the address of the computer on which it is stored.
  3. The Web facilitates nonlinear access thus providing user control over the sequencing of information retrieval.  HTML makes it possible to embed hyperlinks into the text, thus creating "hypertext", i.e. text that also is linked to other text so that the information sequence depends on choices of the user.  The hyperlinks can use different protocols making it possible to access documents with various Internet protocols.  Thus the browser concept integrated the use of all Internet protocols into one client.
3.4 Basic Web Concepts:
  1. Web information is normally contained in HTML documents. HTML (Hypertext Markup Language; see below) allows one to "program" a document by describing its layout, contents, and hyperlinks with "style tags" embedded in text files. At first, HTML documents were created using plain ASCII text editors; the style tags were typed in along with the regular text. Now, sophisticated HTML editors (e.g. Macromedia Dreamweaver, Microsoft FrontPage, and Netscape Composer, part of the Netscape Communicator suite) can generate HTML using WYSIWYG GUIs.
    1. An HTML document is any text document written in the prescribed HTML format with embedded tags.
    2. A Web page is an HTML document that is made available, by a Web server, for access via the Internet.
    3. A home page is the default starting point or organizational center for any collection of Web pages.  It typically has the name index.htm or index.html and is opened automatically by the Web server when a Web site (See next.) is accessed.
    4. A Web site is an integrated collection of Web pages which is normally collected in a single directory (folder) called a Web account.
  2. A hyperlink is text (hypertext) or an image (hypergraphic)  that is distinguishable as a link to another location in the same document or to another HTML document. The browser is designed to detect when the user clicks the mouse on a hyperlink; it then locates the destination and downloads it into the browser. The convention for designating hypertext is the underline, so underlines should not be used in hypermedia documents for other reasons. There are three basic types of hyperlinks:
    1. absolute links are used to access a different Web page and thus must give the absolute URL, i.e. the complete URL, of that document.  These typically lead to the beginning of the Web page unless they contain a target. (See the next item.)
    2. target links point to a named "target" placed within a Web page; when incorporated into an absolute link, this allows a link to go to any point within any Web page.
    3. relative links are pointers to another file relative to the location of the current file, i.e. the document where the link originates.
      1. When a link is created between documents within a Web site it is a relative link because it is specified relative to the document in which the link originates. This is typically done, in an HTML authoring program, by selecting a link button and browsing to the HTML document to which the link is made.
      2. When an HTML document is published, the relative links still work as long as the organization of the files on the server is the same as that on the computer where they were created.   Therefore, all developers need to do is make their links work on their local storage, then, if the file structure is uploaded intact (i.e. it is a perfect reproduction of the original file structure) the relative links between all files on the server will work.
    Illustrations of these different link types can be found in this document.  Investigate the source of this page (You will have to open the page in its own window and then select Page Source from the View menu.) and you will find examples of all the above links.  Look for the tags that begin with <a href=; these are all hyperlinks.  If what follows the equal sign is (1) a complete URL, it is an absolute link, (2) # followed by text, it is a target link, or (3) a path name followed by a file name, it is a relative link.
  3. Bookmarks (sometimes called "hot links") are links that are saved in an HTML file so they can be retrieved and traversed in the future.
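The three link types described above can be told apart mechanically. The following sketch uses Python's standard HTML parser to collect the href values of the <a> tags in a small, made-up page and classifies each one; the page content, the example.edu URLs, and the function and class names are all hypothetical. It also shows how a browser would resolve each link against the URL of the page in which it appears.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)

def classify(href: str) -> str:
    """Classify a link using the three categories described in the text."""
    if urlsplit(href).scheme:      # starts with a URL type such as http:
        return "absolute"
    if href.startswith("#"):       # points to a target in the same page
        return "target"
    return "relative"              # a path relative to the current file

# A made-up page fragment and the (hypothetical) URL it was loaded from.
page = """
<p><a href="http://www.example.edu/index.html">Home</a>
<a href="#section2">Jump to section 2</a>
<a href="notes/lm2.html">Next module</a></p>
"""
base = "http://www.example.edu/course/lm1.html"

collector = LinkCollector()
collector.feed(page)
for href in collector.hrefs:
    print(f"{classify(href):8} {href:38} -> {urljoin(base, href)}")
```

Because the relative link is resolved against the location of the current page, it keeps working when the whole directory structure is moved to another server intact.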
SAQ 28: What is the difference between a relative link and an absolute link in an HTML document?
  1. Hypertext Transfer Protocol (HTTP) is a member of the TCP/IP protocol suite that defines how to identify, send, and retrieve Web documents.
  2. A browser is ________(22) software for viewing HTML documents and navigating hyperlinks to other documents, not necessarily on the Web.
  3. Plug-ins and helper applications are programs that can be used by a browser to overcome its inadequacies.
    1. Plug-ins typically are software components that are added to the browser itself.  For example, if a browser does not support the format of an image or sound file (See Embedded Files in the next section.) that is embedded in an HTML file, the browser may use a plug-in specifically designed to view that type of image or play that sound. A popular example is RealPlayer, which allows one to access streaming multimedia. Although browsers typically come bundled with some plug-ins, most have to be downloaded and installed in the browser.  Modern browsers will prompt the user when a plug-in is needed and will even automatically access the server where that plug-in can be downloaded.
    2. Helper applications are separate, stand-alone programs that perform a task the browser cannot. (These are not as prevalent now that browsers come with more built-in facilities.)  Helper applications are typically used when a browser does not support a particular communications protocol.  In this case an application that provides that service can be executed by the browser. For example, telnet access was not built into Netscape Navigator 3.0 so a separate telnet application, available on the same computer, had to be run by the browser.  Usually the user specifies, in the browser preferences, the particular application to be used in a particular situation.
  4. Embedded Files: In addition to text, HTML documents can contain links to graphic images, video clips, and sounds. These elements are stored in separate files (not necessarily on the same server as the original HTML file) called MIME (Multipurpose Internet Mail Extensions) files. (See section 4.3; for more information click here.)  When the HTML document is displayed by a browser, the browser shows those elements that it can handle and passes off (to plug-ins or helper applications) those that it cannot. There are numerous MIME file formats discussed in LM V, but the most common are:
    1. Of the image files GIF (a simple format used for basic pictures) is the most common, but the newer JPEG (a compressed format that stores high quality images in relatively small files) is used for information rich images.
    2. MPEG is a motion image format for displaying images and sound.
    3. AU and WAV are digital audio file formats for playing sounds.
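The association between a file's extension and its MIME type can be explored with Python's standard mimetypes module. This is a small sketch with hypothetical file names and a hypothetical helper function; the exact type reported for some extensions (audio formats in particular) can vary between systems, so no output is shown.

```python
import mimetypes

def mime_type_of(filename: str):
    """Guess a file's MIME type from its filename extension."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type  # None if the extension is not recognized

# Hypothetical file names covering the formats listed above.
for filename in ("index.html", "logo.gif", "photo.jpg",
                 "movie.mpeg", "sound.au", "sound.wav"):
    print(f"{filename:12} -> {mime_type_of(filename)}")
```

A browser makes a similar decision, although it relies primarily on the MIME type the server sends with the file rather than on the extension alone.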
SAQ 29:  "Embedded" files is a misleading term when used to describe HTML documents!  Why?
  1. "Push technology" is a way of automatically delivering Web pages to a browser without the user selecting them.  Instead some program, called an "agent", selects the pages, usually based on preferences pre-specified by the user. Push is the opposite of "Pull", the normal Web access, in which the user selects a page by actually clicking a hyperlink.  This technology, pioneered by Pointcast Network, blends the Web with TV (which automatically delivers content to the user).  Push was hyped as a way of providing an intelligent software "adviser" (the agent) that would recommend Web pages to the user, thus reducing the need to search through an overwhelming number of Web sites to find pages of interest.  However, some consider it an invasion of privacy.
3.5  The WWW as a Subnet of the Internet:
  1. The WWW is the Network of Web Servers, Accessible by HTTP.
  2. WWW Clients Access Internet Resources via URLs.
    1. URLs (Uniform Resource Locators) are the addressing system of the WWW. This system was developed to allow browsers to access any information currently available on the Net (provided by Gopher and WAIS, in addition to _____(23)); in fact, it was designed to incorporate future developments in Internet technology as well.
      1. A URL is the Internet-wide address of any document you can read with a WWW client, i.e. a _________(24). A URL can describe any file on the Internet, even though different files may require different protocols to access them.
      2. The URL (1) instructs the client program how to contact the server, (2) tells the server to transfer the designated document to the computer on which the client resides, where (3) the client displays the document. All of these activities require just one action from the user: typing the URL or clicking on a link.
    2. A URL can have, at most, five distinct parts.
      1. The left-most part of a URL is the URL type or protocol prefix used to access the Internet address. The types recognized by a browser include:
        1. http:// which designates HTTP and accesses Web sites.  (This is the browser "default" so if the prefix is not typed, the browser will assume http and automatically insert it in front of the URL.)  https:// designates a Web document on a secure server.
        2. ftp:// which designates file transfer protocol used to upload and download files via TCP/IP.
        3. telnet:// which designates the telnet protocol used to log on to a remote computer or run applications on a network server.  (rlogin://  and tn3270 are infrequently used alternates to telnet.)
        4. wais:// which designates Wide Area Information Server, an infrequently used information service.
        5. gopher:// which designates a Gopher server, another information service that is virtually obsolete now.
        6. news: which opens the newsreader client associated with the browser and accesses a Usenet newsgroup. snews: accesses a newsgroup at a secure news server.
        7. mailto: which opens the e-mail client associated with the browser so that e-mail can be read or sent.
        8. file:/// which opens a file on the local computer system.
        Note that the part after the colon is interpreted according to the access scheme. In general, two slashes after the colon introduce a host name (host:port is also valid, or for FTP user:passwd@host or user@host). The port number is usually omitted and defaults to the standard port for the scheme, e.g. port 80 for HTTP.
      2. The domain name of the server (or ______(25) name) on which the Internet document resides. (See section C below.) This ends with a slash, followed by . . .
      3. the directory path or sequence of directories (or folders) separated by slashes which precede . . .
      4. the file name of the document to be accessed (which is not always required). The file can contain any type of data, but only certain types are interpreted directly by most browsers. These include HTML and images in GIF or JPEG format. The file's type is given by a MIME type (See section 4.3, below) in the HTTP headers returned by the server, e.g. "text/html", "image/gif", and is usually also indicated by its filename extension. A file whose type is not recognized directly by the browser may be passed to an external "viewer" application, e.g. a sound player.
      5. The last (optional) part of the URL may be either
        1. a "target" preceded by "#"; this indicates a particular position within the specified document, or
        2. a query string preceded by "?" which activates a CGI script and allows the user to enter a query.  (You can see an example of a query string if you access FOLDOC and type in a term to look up; e.g. if you type in "FTP" you will see the query string ?query=FTP&action=Search at the end of the URL displayed in the Location box when the answer appears.)
      Only alphanumerics, reserved characters (:/?#"<>%+) used for their reserved purposes, and "$", "-", "_", ".", "&", "+" are safe and may be transmitted unencoded. Other characters are encoded as a "%" followed by two hexadecimal digits.
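The parts of a URL listed above can be picked out with Python's standard urllib.parse module, as sketched below. The example.edu URLs and the function name are hypothetical; the dictionary keys simply reuse the terms from this section. The last lines show the "%" plus two hexadecimal digits encoding applied to a file name containing a space.

```python
from urllib.parse import quote, unquote, urlsplit

def url_parts(url: str) -> dict:
    """Split a URL into the parts described in this section."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    return {
        "type": parts.scheme,            # e.g. http, ftp, mailto
        "host": parts.hostname,          # domain name of the server
        "port": parts.port,              # None when the default port is used
        "directory path": directory + "/",
        "file name": filename,
        "query string": parts.query,     # the part after "?"
        "target": parts.fragment,        # the part after "#"
    }

# Hypothetical URLs: one ending in a query string, one ending in a target.
for url in ("http://www.example.edu:80/dept/cosc/index.htm?query=FTP&action=Search",
            "http://www.example.edu/dept/cosc/index.htm#top"):
    print(url)
    for name, value in url_parts(url).items():
        print(f"    {name:15} {value!r}")

# Characters that are not safe are sent as "%" plus two hexadecimal digits.
encoded = quote("lecture notes.htm")
print(encoded)            # the space becomes %20
print(unquote(encoded))   # decoding restores the original name
```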
SAQ 30: Which URL types are not written as protocols, e.g. "http"?
SAQ 31: Identify the parts of the URL, http://www.frostburg.edu/dept/cosc/htracy/cosc120/MODULES120/servicesIR.htm.
SAQ 32: The sequence of directories and file name, when taken together are called what?
SAQ 33: Give analogies between similar parts of a street address and a Web address.
  1. The Domain Name System (DNS) is a way of associating arcane numeric IP addresses with more memorable "domain names" used in URLs. The Internet Protocol (the "IP" in TCP/IP) uses Internet address information to access every node (client, server, printer, etc.) on the Internet. Every IP address is a series of four integers separated by periods (called "dots"), for example, 131.118.95.254, the unique address of the FSU gateway (to the UMS network).
    1. There are two big problems with IP addresses. (1) It is difficult to remember pure numeric addresses and (2) sometimes these IP addresses change. To solve these problems the DNS was designed to handle the addresses of Internet nodes.
    2. The DNS establishes a hierarchy of domains (groups of nodes on the Internet). The domain at the top level of the hierarchy maintains a database of addresses of the subdomains beneath it. Each subdomain has similar responsibilities for its own subdomains, and so on. For example, the domain name of one of the administrative computers at FSU is fra00.fsu.umd.edu; the top domain is edu, which stands for _________(26); just below that is umd which stands for _____________(27); below that is the fsu domain; fra00 is the ________________(28).
    3. Top-level domains (TLD) specify the general category of the domain.  Until 1998 TLD names were restricted to:
      1. gov for Government agencies
      2. edu for Educational institutions
      3. org for Organizations (nonprofit)
      4. mil for Military
      5. com for commercial business
      6. net for Network organizations
      7. country abbreviations, e.g. uk for the United Kingdom, de for Germany, etc.
      The limitations resulting from these restricted categories were removed in 1998 when the Internet Ad Hoc Committee (IAHC) proposed six new top-level domains (However, I have yet to see any of these and haven't heard any discussion for a long time!):
      1. store for merchants
      2. web for parties emphasizing Web activities
      3. arts for arts and cultural-oriented entities
      4. rec for recreation/entertainment sources
      5. info for information services
      6. nom for individuals
    4. The easily recognizable domain names and their associated IP addresses are maintained on DNS name servers, which also perform the conversion from domain names to actual IP addresses. The DNS at FSU is maintained on a name server with the IP address 131.118.80.1; it has the domain name freris.fsu.umd.edu.
    5. When the IP address of a node changes, the database of the DNS name server is updated but the domain name remains the same. Thus one never has to worry about the actual address of an Internet resource or whether it has been changed.
    6. The Internet Registry, a part of the Internet Activities Board (IAB), currently maintains the DNS.
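The two ideas in this section, the hierarchy encoded in a domain name and the lookup table kept by a name server, can be illustrated with a toy model. This is not how a real DNS server is implemented: the table is an ordinary dictionary, the function names are hypothetical, and the "new" IP address is invented to show what happens when a node's address changes. The single table entry uses the name and address quoted above.

```python
def domain_hierarchy(name: str) -> list:
    """List the labels of a domain name from the top-level domain downward."""
    return list(reversed(name.lower().split(".")))

def resolve(table: dict, name: str) -> str:
    """Toy name-server lookup: domain name -> IP address."""
    return table[name.lower()]

# Indent each level of the hierarchy one step further than its parent.
for depth, label in enumerate(domain_hierarchy("fra00.fsu.umd.edu")):
    print("  " * depth + label)

# One entry taken from the text; a real name server holds many such records.
name_table = {"freris.fsu.umd.edu": "131.118.80.1"}
print(resolve(name_table, "freris.fsu.umd.edu"))

# If the node's IP address changes, only the table is updated; users keep
# using the same domain name. (The new address below is invented.)
updated_table = dict(name_table)
updated_table["freris.fsu.umd.edu"] = "131.118.80.99"
print(resolve(updated_table, "freris.fsu.umd.edu"))
```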
4. OVERVIEW OF WEB DEVELOPMENT:

       The essence of Web development is (currently) the "generation" of HTML documents and publishing them on a Web server.  However, there are a growing number of techniques that complement HTML, adding multimedia and interactivity to Web sites.  These are a primary focus of COSC 330 and are previewed in the following sections.

4.1 Web Development involves several overlapping techniques:

  1. HTML "documents" are actually HTML programs that are a composite of
    1. simple text formatting,
    2. hyperlinks to other documents or multimedia files,
    3. embedded code of other languages, typically authoring languages (See section 4.2.A.), and scripting languages (See section 4.2.B.)
    4. invocations of separate programs written in other languages, typically Java or CGI scripts.
    The HTML, embedded code, and external programs collectively direct the browser to display formatted text and dynamic multimedia in an interactive format.
  2. Web development activities can be categorized under five broad classifications:
    1. Authoring involves creating HTML documents, which are composites of text and HTML tags. Such documents can be created and modified by
      1. typing in a simple ASCII text processor,
      2. converting from other word processed documents,
      3. using WYSIWYG HTML authoring systems, which are widely available and allow developers to generate HTML documents by normal typing while incorporating HTML tags through a variety of menus and dialog boxes, exactly like word processing.  The HTML tags are invisibly generated but they can be edited and "tweaked" in an HTML editor. (See the comparative analysis of current WYSIWYG Editors by CNET.)
      4. proprietary authoring tools.  These may supersede, but will probably augment HTML authoring.  (See section 4.1.C.)
      Whatever the authoring tool, all generate programs in an authoring language; see these in section 4.2.A.
    2. Scripting is a programming technique for embedding code into a Web document in order to
      1. add dynamic content and
      2. facilitate interactivity between the client and server.  The scripts govern the processing of data exchanged between the client and server.
    3. Programming allows a developer to create applications, completely distinct from the HTML document, using standard programming languages. Such programs extend the functionality of browsers to operations that would otherwise have to be processed on the server, such as handling user input or searching databases.
    4. Publishing is the storing of HTML documents on a Web server.  This can be done three different ways:
      1. One can use the operating system of the server directly to save the HTML documents on the server's secondary storage.
      2. One does not have to have access to the server itself; the server operating system can be accessed remotely, over a TCP/IP network, using telnet.
      3. Another alternative is to use FTP (file transfer protocol) to "upload" files from a PC, where they were authored, to a Web server.
    5. Maintaining means the creation and upkeep of a Web site as a whole, thus ensuring the integrity of the site.  This essentially involves making sure, as the site evolves, that the files, their names, and the links within the Web pages all work properly. Maintenance activities include editing, testing, and debugging throughout the development process and rechecking when the Web site is updated or expanded.  Also, improvements suggested by user feedback are often implemented during maintenance.
    Not all Web development projects involve all of these activities; in fact, effective Web sites can be created, by nonprogrammers, using only HTML.  Also there are usually several different ways of achieving the same effects with a Web page, so there is no preferred, general procedure for Web development.
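The maintenance activity described above (making sure that links within the Web pages still work as the site evolves) can be partly automated. The following is only a minimal sketch in Python, not a tool used in this course: the class name LinkCollector, the function names, and the file names in the demonstration are all made up for illustration, and only links to local files are checked.

```python
from html.parser import HTMLParser
from pathlib import Path
import tempfile


class LinkCollector(HTMLParser):
    """Collect the href/src targets found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def broken_local_links(page: Path) -> list[str]:
    """Return the relative links in `page` whose target file does not exist."""
    collector = LinkCollector()
    collector.feed(page.read_text(encoding="utf-8"))
    broken = []
    for link in collector.links:
        # Skip absolute URLs, mail links, and in-page targets;
        # this sketch checks local files only.
        if "://" in link or link.startswith(("#", "mailto:")):
            continue
        target = page.parent / link.split("#")[0]
        if not target.exists():
            broken.append(link)
    return broken


def demo() -> list[str]:
    """Build a tiny hypothetical site in a temporary folder and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp)
        (site / "about.html").write_text("<p>About</p>", encoding="utf-8")
        (site / "index.html").write_text(
            '<a href="about.html">About</a> '
            '<a href="missing.html">Gone</a> '
            '<a href="http://www.w3.org/">W3C</a>',
            encoding="utf-8",
        )
        return broken_local_links(site / "index.html")


if __name__ == "__main__":
    print(demo())
```

Running a check like this after each update is one way to catch renamed or deleted files before users do; checking external links would additionally require network access, which is left out here.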
  3. Proprietary Web technologies:
    1. (From Webopedia; update and rewrite) ActiveX is a proprietary Microsoft specification, based on Microsoft's Component Object Model (COM) architecture, that allows Windows-based programs to run within a Web page.  It was not designed for Web development but has been applied to it in Microsoft-specific applications.
      1. ActiveX enables any Windows-based program to add functionality by calling ready-made components, called ActiveX controls, that blend in and appear as normal parts of the program. They are typically used to add user interface functions, such as 3-D toolbars, a notepad, a calculator, or even a spreadsheet.
      2. On the Internet or on an intranet, ActiveX controls can be linked to a Web page and, like Java applets, downloaded by an ActiveX-compliant Web browser. ActiveX controls turn Web pages into software pages that can perform just like any program that is launched from a server.
        1. Unlike applets, ActiveX controls can be saved on the client, eliminating the need to download them each time a Web page is opened.  However, this can be dangerous: Java does not allow it because, for security reasons, applets are not permitted to access the client's storage.  ActiveX does have a signature feature that allows you to specify the servers from which you will allow controls to be downloaded; however, this cannot prevent a control, once downloaded, from damaging your system.
    2. Cold Fusion, by Allaire Corporation, facilitates the creation of Web interfaces to database management systems.  It includes a server and a development toolset designed to integrate databases and Web pages. For example, with Cold Fusion, a user could enter a request on a Web page, and the server would query a database for relevant information and return the response to the Web page. Cold Fusion Web pages include tags written in Cold Fusion Markup Language (CFML) that simplify integration with databases and avoid the use of more complex solutions involving CGI scripting, Java applets, etc.
    3. Shockwave is a technology developed by Macromedia, Inc. that enables Web pages to include multimedia objects, especially animated sequences. To create a shockwave object, you use Macromedia's multimedia authoring tool called Director, and then compress the object with a program called Afterburner. You then insert a reference to the "shocked" file in your Web page. To see a Shockwave object, you need the Shockwave plug-in, a program that integrates seamlessly with your Web browser. It also lets output created by Macromedia's Authorware and Freehand tools be viewed on the Web. The plug-in is freely available from Macromedia's Web site as either a Netscape Navigator plug-in or an ActiveX control.  Shockwave supports audio, animation, video and even processes user actions such as mouse clicks. It runs on all Windows platforms as well as the Macintosh.
    4. Flash (also called "Shockwave Flash"), by Macromedia, is a user-friendly technique for producing vector-graphic based Web sites.  Specifically, it is a file format for delivering interactive vector graphics and animation on the World Wide Web. View the Flash animations of great Superbowl plays at: http://superbowl.com/poc.html#  See the excellent demo (7 min. video) of Flash technology on ZDTV's Call for Help, 2/3/00. A good reference for Flash information is: http://www.flashkit.com/index.shtml
SAQ 34: What is the (a) similarity and (b) difference between scripting and programming?
SAQ 35: The proprietary Web technologies are most similar to which of the activities in section 4.1.B?

4.2  Programming Languages of the Web:

NOTE: All of these make good candidates for the Project of this course, especially for C.S. majors!
You can develop an Introduction or, better yet, a tutorial on one of these languages.

  1. Authoring languages are powerful, special purpose programming languages that are designed to allow nonprogrammers to create Multimedia (text, graphics, animation, audio, and video) applications. These have great potential for developing individualized learning system software.
    1. Hypertext Markup Language (HTML) was the original language used to create Web pages; it has been updated via several versions.   HTML is really used only to create       (29) which, when inserted into regular text, tell a Web browser how to format text, insert multimedia, link to another location, or link to other programs written in VRML, Java, JavaScript, or other languages (CGI applications).
    2. Dynamic HTML (DHTML) is not a programming language itself, but an augmentation of HTML that enhances the animation and interactivity of Web pages by providing scripting, cascading style sheets, layering, dynamic fonts, etc.  (From the W3C Stylesheet page, http://www.w3.org/Style/: "Dynamic HTML is a term used to describe HTML pages with dynamic content. CSS is one of three components in dynamic HTML; the other two are HTML itself and JavaScript (See section 4.2.B.a, below.).  The three components are glued together with DOM, the Document Object Model. Dynamic HTML is still in its infancy and current implementations are experimental.")  DHTML is covered in more detail in LM IX.
    3. Extensible Markup Language (XML), a platform-independent Web document formatting language, is another improvement on HTML.  It is covered in LM X.
      1. XML is "extensible" because, unlike HTML, the ability to define new markup tags makes it virtually unlimited and self-defining.  XML allows the developer to define new tags  specifically for new data types thus dramatically expanding the variety of customized data that can be handled in a Web page.   In fact, it has been said that "XML is to data what HTML is to text", but since text is a specific form of data, XML is a more general markup language than HTML.
      2. XML is being supported by the United Nations as a premier standard for e-business.
      3. XML is the centerpiece of Microsoft's ".Net" (Dot Net) initiative as well as virtually all other distributed computing activities.
    4. Synchronized Multimedia Integration Language (SMIL) is an XML-based markup language, similar in style to HTML, that facilitates the synchronization of the multimedia elements of a streaming Web page, enabling its audio, video, and graphics elements to be coordinated.
    5. {Add other W3C markup languages based on XML: XHTML, MathML, and SVG.}
    6. WML (Wireless Markup Language), part of the WAP protocol, is an XML-based language specifically designed to program wireless devices so that they can display the text portions of Web pages.  {EXPAND!}... More information can be found on the WAP Specification from OASIS (Organization for the Advancement of Structured Information Standards), and an introductory tutorial is presented by The Wireless Developer's Network.
    7. Virtual Reality Modeling Language (VRML) is used to create Web pages with computer generated 3D environments that can be "explored" as if they are the real world. This promotes a highly interactive human-computer interface.
      1. VRML allows the developer to describe, in program code, three-dimensional (3D) image sequences and define user interactions with them. Using VRML, you can build a sequence of visual images into Web settings with which a user can interact by viewing, moving, rotating, and otherwise interacting with an apparently 3D environment. For example, you can view a room and use controls to move through the room as you would experience it if you were walking through it in real space.
      2. VRML programs can be "plugged into" HTML programs.
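The extensibility of XML described in this section (the developer defines new tags for new kinds of data) can be illustrated with a short sketch. It is written in Python using the standard xml.etree.ElementTree module; the tag names course, learningModule, and title, and the function names, are invented for this example and are not part of any standard.

```python
import xml.etree.ElementTree as ET


def build_course_record() -> str:
    """Create a small XML document using tags defined just for this example."""
    course = ET.Element("course", attrib={"code": "COSC330"})
    module = ET.SubElement(course, "learningModule", attrib={"number": "I"})
    ET.SubElement(module, "title").text = "Review of Web Fundamentals"
    return ET.tostring(course, encoding="unicode")


def read_module_title(xml_text: str) -> str:
    """Parse the document again and pull out the data by tag name."""
    root = ET.fromstring(xml_text)
    return root.find("learningModule/title").text


if __name__ == "__main__":
    record = build_course_record()
    print(record)
    print(read_module_title(record))
```

Note that, unlike HTML tags, these tags say nothing about how the data should be displayed; they only describe what the data is, which is the sense of "XML is to data what HTML is to text".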
  2. Scripting tools are primarily used to create programs that produce dynamic, interactive Web pages.  This can be accomplished by either client-side or server-side scripting. In computer programming, a script is a sequence of instructions that is usually interpreted, i.e. executed by another program rather than by the CPU.  In other words, unlike languages like Java and C++ (whose programs must be compiled into object code in order to be executed), "scripts" are typically executed directly, line by line, by an interpreter that is associated with the particular Web browser.  Some languages have been conceived expressly as scripting languages, e.g. JavaScript and VBScript, while others, like Perl, are more general but may be used in a scripting environment.  In Web applications, scripting languages are often used in two contexts: to write "scripts" that (1) are "server-side", i.e. reside and run on a Web server, or (2) are "client-side", i.e. reside in applications on the client, e.g. JavaScripts embedded within an HTML document.
    1. JavaScript is the primary client-side scripting language.  It is a simple, object based (not object oriented), cross-platform, World Wide Web scripting language.  It is called a "very high level language" (i.e. special purpose) because it is not a general purpose language like Java but is specifically designed to be used with the World Wide Web.   In fact, it currently runs in only three environments.  It is mainly used for client-side scripting, in which the scripts are embedded in the HTML and run by the browser.  However, it can also be used as a server-side scripting language and as an embedded language in server-parsed HTML. (The following is adapted from FOLDOC's definition of JavaScript.)
      1. JavaScript is a very high level language (VHLL), and as such has a focused area of application, HTML documents.  While JavaScript programs can function independently, they are designed to be embedded within HTML documents, giving them functionality not available in HTML itself, e.g. dynamic, interactive multimedia, forms, simple web databases, etc.
      2. JavaScript was originally created by Netscape (now part of AOL Time-Warner) and was proprietary, but its popularity led to its becoming a de facto standard.  Microsoft adopted it, but (typically for Microsoft) "cloned" its own version called JScript.  The consequent inconsistencies, caused by Microsoft's self-serving refusal to use platform independent standards, made it difficult to write JavaScript that behaves the same in both Netscape Navigator and Microsoft Internet Explorer.  Fortunately an international standard, ECMAScript, has been adopted for the core language, so basic JavaScript is now platform independent, although the support of advanced features still depends on the browser being used. The endeavor to create an open standard for JavaScript is an effort to prevent Microsoft from monopolizing web software as they have PC operating systems.
      3. JavaScript is an important emphasis of this course and will be covered, in detail, in later learning modules.  See the expanded description in LM VI, section 3 and detailed presentation in LM VII.
    2. REBOL (Relative Expression-Based Object Language) is a new approach to Web scripting that introduces the idea of a messaging language, one that "provides highly-integrated connectivity (networking) along with context sensitivity (called "dialecting", the ability to create variations, or sub-languages, for domain-specific communication.)". It provides a broad range of approaches to the common challenges of Internet computing.  While designed to be simple and productive for novices, the language extends a new dimension of powerful, practical solutions that facilitate advanced Web development.  Specifically, the language offers a significant new approach to the seamless exchange and interpretation of network-based information over a wide variety of computing platforms.  A message can be as simple as a single line or as complex as an entire application. (REBOL's creator on ZDTV, 8 min.)  For more information on REBOL, see the home page of the language at: http://www.rebol.com/
    3. Python is a simple, high-level interpreted language that combines ideas from ABC, C, Modula-3, and Icon. It bridges the gap between C and shell programming, making it suitable for rapid prototyping and Web scripting. It is object-oriented and supports packages, modules, classes, user-defined exceptions, a good C interface, and dynamic loading of C modules, and it has no arbitrary restrictions. (This is ZDTV's Leo Laporte's favorite learning language for beginning programmers.) See the home page of the Python language.
    4. Tcl (Tool Command Language) is a powerful, extensible, interpreted string processing language for issuing commands to interactive programs.   The extensibility of Tcl means that it can be easily extended through the addition of custom Tcl libraries. It is used for prototyping applications as well as for developing CGI scripts.  It has a peculiar but simple syntax. It may be used as an embedded interpreter in application programs. Tcl has an associated GUI toolkit, Tk, so it is sometimes referred to as Tcl/Tk.  For more information see the Tcl Developer Xchange.
    5. Common Gateway Interface (CGI) scripting is the traditional scripting technique used for server-side scripting.  CGI is not a programming language but a standard protocol for running external programs that are stored on a Web server.  Such programs, called "CGI scripts", can be written in virtually any computer language (the most suitable is probably Perl), but interpreted languages are preferred because the object code of compiled languages is not as flexible.
      1. Scripts are typically written to manage input via forms, providing feedback, and performing searches.
      2. The CGI approach is being superseded by languages specifically designed for writing programs to be run by Web browsers or Web servers, e.g. Java (applets and servlets) or JavaScript.
      3. A nice set of simple examples is given on the Web site of TechTV's Leo Laporte, www.leoville.com/perltest.shtml.  The online preassessments of my course COSC 120 are written with CGI using Perl, e.g. Preassessment 1.
    6. ASP (Active Server Page) is not a programming language, but is called, by Microsoft, "a server-side scripting environment." To create ASP scripts, developers typically use the languages VBScript or JScript, both of which are automatically supported by ASP.  ASP and HTML are tightly integrated.  In HTML, tags are delimited with angle brackets; similarly, in ASP one uses the <% %> delimiters to define the beginning and end of a script. ASP scripts can be inserted anywhere in an HTML page, and vice versa. One of the major assets of ASP is that it facilitates access to SQL databases.  For a nice, introductory tutorial on ASP, access the CNET site:
    http://www.builder.com/Programming/ASPIntro/?dd.cn.txt.0701.10
      1. Visual Basic is specifically designed to quickly and easily develop code that provides interaction between different Microsoft applications, e.g. Word and Access or Internet Explorer.   It is actually more of a "programming environment" in which a graphical user interface is utilized to select and modify "library code" previously written in the BASIC programming language.
      2. VBScript is an interpreted scripting language that is a subset of Microsoft's Visual Basic programming language. To learn about VBScript try the online tutorial at: http://www.intranetjournal.com/corner/wrox/progref/vbt/
  3. General purpose, "high level languages" (HLL) are the staple of software development.  Any HLL can be used to write network applications; however, some make that job much easier than others do.  The two most common ways of writing Web specific programs are:
    1. using a HLL and following CGI protocols.  Such programs, called "CGI applications" can be written in virtually any computer language but the most frequently used is Perl.
    2. using Java, perhaps the "hottest" language today.  Java is a purely object oriented, HLL designed to facilitate reliable, platform-independent, distributed processing.  It is a general purpose language useful in writing any type of application; however, it has special capabilities for writing object-oriented programs that run on networks. The most visible of these capabilities are "applets" (specialized small applications) that can be called directly from within an HTML document. This makes it the "language of choice" for most programmers of network applications.
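To make the phrase "following CGI protocols" more concrete, here is a minimal sketch of a CGI-style program. It is written in Python rather than Perl only to keep the examples in this module in one language; the function name cgi_response and the form field called name are assumptions made for this illustration. The convention it relies on is that the Web server passes the request's query string to the program in the QUERY_STRING environment variable, and the program writes back a header line, a blank line, and then the HTML document.

```python
import os
from html import escape
from urllib.parse import parse_qs


def cgi_response(environ) -> str:
    """Build the complete CGI output (header, blank line, HTML) for a request."""
    # The server supplies the part of the URL after "?" in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; fall back to a default if absent.
    name = query.get("name", ["visitor"])[0]
    # Escape user input before placing it into the generated HTML.
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a Web server as a CGI program, the output goes to the browser.
    print(cgi_response(os.environ), end="")
```

For example, a request whose query string is name=Tony would produce a page containing "Hello, Tony!". How a particular server is configured to run such programs varies and is not shown here.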
SAQ 36: List and distinguish the markup languages mentioned in the preceding section.
SAQ 37: What is the difference between client-side scripting and server-side scripting?

4.3  MIME Files:

  1. MIME (Multipurpose Internet Mail Extensions) types are different file formats that may be transmitted via the Internet's SMTP (Simple Mail Transfer Protocol). They are identified by file extensions (See Table 330/I-1.), which enable SMTP to transport non-ASCII data over the Internet using ASCII mail protocols.
TABLE 330/I-1: COMMON MIME FILE FORMATS
FILE FORMAT | DESCRIPTION | EXAMPLE APPLICATION SOFTWARE
.au | most common sound format on the Web | NN; Sound Player (Mac)
.aiff | another common sound format on the Web | NN; Sound Player (Mac)
.bin | binary file for Mac computers | Use Stuffit Expander to convert to executable
.doc | MS Word document | MS Word (Mac or PC)
.exe | DOS/Windows program or Archive | Application that created it; self-extracting
.gif | GIF (Graphics Interchange Format), the | NN; JPEGView (Mac); Lview Pro (PC)
.html/.htm | HTML (Hypertext Markup Language) | Any Web browser
.hqx | BinHex 4.0; most Mac files are in .hqx | Stuffit Expander (Mac); encodes Mac files in 7-bit text for transfer; use BinHex13 (PC) to un-binhex it
.jpg/.jpeg/.JP2 | A 24 bit graphic format; this will be updated to JPEG2000 with the extension .JP2 | NN; JPEGView (Mac); Lview Pro (PC)
.mid/.midi | MIDI files for electronic musical instruments | audio/x-midi (Netscape plug-in)
.mpg/.mpeg | MPEG, standard movie format of the Net | Sparkle (Mac); VMPEG (PC)
.mov/.moov/.movie/.qt | QuickTime Movie; native Mac movie format | QuickTime (Mac/PC); Sparkle (Mac)
.pdf | Adobe Acrobat Portable Document format | Adobe Acrobat Reader
.ps | Postscript file; plain text file; not readable | Ghostscript (Mac/PC); prints to laser printers
.sit | Stuffit Archive | Use Stuffit Expander (Mac) or UnSit (PC) to convert
.sea | Macintosh Self Extracting Archive | Download as MacBinary, launch; self-extracting
.tiff/.tif | TIFF, very large high quality image format | JPEGView (Mac); Lview Pro (PC)
.txt | plain ASCII text | Any word processor
.wav | Windows Wave format sound file | Windows Media Player; SoundApp (Mac); any PC sound
.zip | pkzip, a common DOS/Windows compression format | ZipIt/MacUnZip/Stuffit Expander (Mac); WinZIP (PC)
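The association between file extensions, such as those in the table above, and MIME types can be looked up programmatically. The sketch below uses Python's standard mimetypes module; the function name mime_type_for and the sample file names are made up for this illustration. It also shows that the MIME type itself is a label of the form type/subtype (for example text/html), which is what is actually sent along with the data.

```python
import mimetypes


def mime_type_for(filename: str) -> str:
    """Return the MIME type that Python's built-in table gives for a file name."""
    # A fresh MimeTypes object uses the module's built-in default table.
    table = mimetypes.MimeTypes()
    mime_type, _encoding = table.guess_type(filename)
    # Unknown extensions are conventionally treated as generic binary data.
    return mime_type or "application/octet-stream"


if __name__ == "__main__":
    for name in ("index.html", "logo.gif", "photo.jpg", "notes.txt"):
        print(name, "->", mime_type_for(name))
```

The exact contents of the lookup table can differ between systems and software versions, so this is best read as an illustration of the idea rather than a definitive list.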
  1. Browsers, themselves, can read some MIME types, just as Internet mail programs do. For those types not supported by the browser, "plug-ins" can be configured to handle them. Multimedia data, of course, is originally analog; it must be converted to digital form (binary code) in order to be processed by computers; this is done by "analog to digital converters".
  2. Internet mail programs use MIME to transmit the above file types in the form of text-only mail messages.
    1. When a mail program receives a message, it examines the special message header to determine whether the message contains text or MIME data.
    2. If the message contains a MIME file, e.g. an MPEG video clip, it reconstructs the original binary data from the ASCII of the mail message and then plays it back using the plug-in for the application that created it.
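The round trip just described (binary data carried as ASCII text inside a mail message, then reconstructed by the receiver) can be sketched with Python's standard email package. This is only an illustration under stated assumptions: the addresses, the subject, the file name dot.gif, and the function name round_trip are all invented, and no mail is actually sent.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def round_trip(binary_data: bytes) -> tuple[bool, bytes]:
    """Attach binary data to a message, serialize it, and recover the data."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # hypothetical addresses
    msg["To"] = "receiver@example.com"
    msg["Subject"] = "MIME demo"
    msg.set_content("A small binary file is attached.")
    # The attachment is labeled with a MIME type (image/gif) in its header.
    msg.add_attachment(binary_data, maintype="image", subtype="gif",
                       filename="dot.gif")

    # This is the text form that would travel through the mail system.
    wire_bytes = msg.as_bytes()
    is_ascii = all(b < 128 for b in wire_bytes)

    # The receiving side reads the headers and rebuilds the binary data.
    received = message_from_bytes(wire_bytes, policy=policy.default)
    attachment = next(received.iter_attachments())
    return is_ascii, attachment.get_content()


if __name__ == "__main__":
    ok, data = round_trip(bytes(range(256)))
    print(ok, data == bytes(range(256)))
```

The encoding used here for the attachment is base64, which represents the binary data with ASCII characters only, at the cost of making the message somewhat larger than the original file.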
  3. File compression is a reversible process that reduces the size of a file (for storage or transmission) and restores the file when it is displayed (graphics file) or executed (program file).   In networked systems, file compression is used to increase the efficiency of distributed computing by reducing the size of files before they are transmitted over a network, thus reducing the required bandwidth. In general, compression is accomplished by a program that implements a compression algorithm that rewrites a file into a smaller storage space. {EXPAND THIS WITH ILLUSTRATIONS.}
    1. In general, there are two types of compression:
      1. Lossless compression is completely reversible, i.e. the original file can be regenerated perfectly from the compressed file.
      2. Lossy compression is NOT reversible, i.e. during lossy compression some "unimportant" data is discarded to maximize the compression ratio.  Obviously, recompression of a lossy compressed file compounds the loss of detail, so it is advisable to retain the original file for later editing rather than trying to edit a compressed file.
    2. Compression and decompression are performed by a codec (compressor/decompressor). A codec can be either hardware or software that uses complex algorithms to compress the file for network transmission and then decompress it to regenerate the original file for processing or execution. (Be aware, however, that the term "codec" is used for several fundamentally different functions, e.g. code/decode in the transmission of digital data over an analog channel.)
    3. MIME types themselves are usually designed to efficiently compress their data. This is particularly true of graphic, audio, and video files because they tend to be very large.
    4. In network transmissions, compression can be applied to an entire packet or to only the data component.  When files are transmitted over the Internet, either singly or as part of a composite "archive file", they may first be reduced using a zip, gzip, or other compressed format. WinZip is a popular Windows program that compresses files when it packages them in an archive.
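Lossless compression, as described above, can be demonstrated in a few lines. The sketch below uses Python's standard zlib module (the same family of algorithm used by the zip and gzip formats); the function name compress_round_trip and the sample data are made up for this example.

```python
import zlib


def compress_round_trip(data: bytes) -> tuple[int, int, bool]:
    """Compress data, decompress it, and report sizes and whether it survived."""
    compressed = zlib.compress(data, level=9)   # 9 = strongest compression
    restored = zlib.decompress(compressed)
    return len(data), len(compressed), restored == data


if __name__ == "__main__":
    # Repetitive data, like much HTML, compresses very well.
    sample = b"<p>Hello, Web!</p>\n" * 200
    original_size, compressed_size, identical = compress_round_trip(sample)
    print(original_size, compressed_size, identical)
```

Because the restored bytes are identical to the original, this is lossless compression; a lossy scheme such as JPEG could not pass the final comparison. The size reduction depends heavily on the data, and data that is already compressed may not shrink at all.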
SAQ 38:  (a) Which of the files listed in Table 330/I-1 are data files?  (b) What are the others?

5. SUMMARY OF THIS LEARNING MODULE:

You might find that a good application of "cloning" is to clone the LM summaries, adding your comments and information from, or links to, other sources.  The simplest way to do this is to open a blank HTML page using an HTML editor like Netscape Composer (Open this from the Communicator menu in Navigator.), then "copy and paste" the summary onto your blank page and begin modifying it.   See the Checklist discussion of cloning.   Note: I might be doing you a disservice in providing you a summary.  I found it very thought provoking when I edited this LM in order to create the following summary, so maybe I should require you to write your own summary instead of providing one for you.  However, I think a better solution is to have both, so I encourage you to write your own summary and compare it to mine; see the suggested use of cloning in the Checklist.
  1. CONCEPTS:
    1. Cyberspace is an abstract computer workspace where all knowledge and information sources are linked via ubiquitous digital networks.  In COSC330 we will use "information space" to represent all electronically accessible knowledge which includes the matrix plus television, the telephone network, etc.  The Internet is, by far, the dominant WAN of cyberspace, but the focus of this course is the World Wide Web, the subnet of the Internet based on the HTTP protocol.
    2. A Computer System can be analyzed on the basis of the Input-Process-Output model.
      1. Local I/O involves only the actual user.
      2. Indirect I/O involves saving and retrieving files on secondary storage.
      3. Remote I/O involves communications, typically via a network.  This is the essence of Web access which is based on the client-server model of computing; therefore, this is the focus of COSC 330.
    3. Computer Networks facilitate remote access to distributed computer systems in which data and computing power are spread over all networked users.
      1. Networks consist of interconnected "nodes" that interact via a client-server model.  Servers are network computers which provide resources to the users of the network. Clients are computers or computer applications that provide users access to network servers.
      2. Networks are classified as LANs, WANs, or (less frequently used) MANs.
    {UPDATE THIS!!}
  2. THE INTERNET:
    1. The Internet is a public, packet switching, wide area network that is based on the TCP/IP protocol suite.  Internet access is provided by Internet service providers (ISPs). Internet services are divided (in this course) into three categories:
      1. Communication services which include e-mail, newsgroups, mailing lists, chat, and teleconferencing.
      2. Resource access services which include file transfer protocol (FTP) and Telnet.
      3. Information retrieval services which include the World Wide Web, Gopher, and WAIS.
    1. The Internet is the WAN based on the TCP/IP suite of protocols.
      1. There are a great many TCP/IP details, but it is only really important to remember four points:
        1. TCP/IP is a suite of communication protocols that permit physical networks to be joined together to form a network of networks. TCP/IP combines the individual networks to form a virtual network in which individual network nodes are identified by IP addresses instead of by physical network addresses.
        2. TCP/IP has a multilayered architecture that clearly defines each protocol's services and responsibilities.
          1. TCP and UDP provide high-level data transmission services to network application programs; they encapsulate application data in segments and pass them to . . .
          2. IP, which adds its own data, turning the segments into packets (or datagrams).  IP is responsible for routing the packets to their destination.
        3. Data moving between two applications running on Internet hosts travels down and up the hosts' TCP/IP stacks. Layer data added by the TCP/IP modules on the sending end is stripped off by the corresponding TCP/IP modules on the receiving end and used to re-create the original data.
        4. All of this is invisible to the user!!
      2. HTTP is the only protocol that is really relevant to COSC 330.
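The idea of data traveling "down and up" the TCP/IP stacks, with each layer adding its own data on the sending end and the corresponding layer stripping it off on the receiving end, can be pictured with a toy model. The sketch below is deliberately NOT real TCP/IP: the text headers, the port numbers, the addresses (taken from a range reserved for documentation), and all function names are invented purely to illustrate encapsulation.

```python
def tcp_wrap(data: bytes, src_port: int, dst_port: int) -> bytes:
    """Toy transport layer: prefix the application data with a 'TCP' header."""
    header = f"TCP {src_port}>{dst_port}|".encode("ascii")
    return header + data          # the result plays the role of a segment


def ip_wrap(segment: bytes, src_ip: str, dst_ip: str) -> bytes:
    """Toy network layer: prefix the segment with an 'IP' header."""
    header = f"IP {src_ip}>{dst_ip}|".encode("ascii")
    return header + segment       # the result plays the role of a packet


def strip_header(unit: bytes) -> tuple[str, bytes]:
    """Remove the outermost toy header and return (header, payload)."""
    header, _, payload = unit.partition(b"|")
    return header.decode("ascii"), payload


def send_and_receive(data: bytes) -> tuple[str, str, bytes]:
    # Sending host: data moves DOWN the stack, gaining a header at each layer.
    segment = tcp_wrap(data, 49152, 80)
    packet = ip_wrap(segment, "192.0.2.10", "192.0.2.20")
    # Receiving host: data moves UP the stack, losing a header at each layer.
    ip_header, segment_in = strip_header(packet)
    tcp_header, data_in = strip_header(segment_in)
    return ip_header, tcp_header, data_in


if __name__ == "__main__":
    print(send_and_receive(b"GET /index.html"))
```

Real headers are compact binary structures carrying much more information (sequence numbers, checksums, and so on), but the layering principle is the same, and, as noted above, all of it is invisible to the user.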
  3. THE WORLD WIDE WEB:
    1. The World Wide Web (Web, WWW, or W3), the subnet of the Internet governed by HTTP,  is a distributed, hypermedia information retrieval system.  The Web facilitates multiple protocol access via a single client called a browser.
    2. Web information is contained in HTML documents containing hyperlinks and made available on a Web server.  A Web site is an integrated collection of Web pages which is normally collected in a single directory (folder) called a Web account.
    3. The key to the hyperlinked nature of the Web is the URL, which can have, at most, five distinct parts:
      1. the URL type or protocol prefix used to access the Internet address,
      2. the domain name,
      3. the directory path separated by slashes,
      4. the file name of the document to be accessed, and
      5. optionally a "target" or a query string.
    4. The Domain Name System (DNS) is a way of associating arcane numeric IP addresses with more memorable "domain names" used in URLs.
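The five parts of a URL listed above can be picked apart with Python's standard urllib.parse module. This is only a sketch: the function name url_parts, the dictionary keys, and the sample URL (on the made-up host www.example.edu) are assumptions for illustration.

```python
import posixpath
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the parts named in the summary above."""
    parts = urlsplit(url)
    # Separate the directory path from the file name at the last slash.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,     # URL type or protocol prefix
        "domain": parts.hostname,     # domain name
        "directory": directory,       # directory path
        "file": filename,             # file name of the document
        "target": parts.fragment,     # optional target (after "#")
        "query": parts.query,         # optional query string (after "?")
    }


if __name__ == "__main__":
    print(url_parts("http://www.example.edu/cosc330/lm1/index.html#summary"))
```

For the sample URL, the protocol is http, the domain is www.example.edu, the directory path is /cosc330/lm1, the file is index.html, and the target is summary. Translating the domain name into a numeric IP address is the separate job of DNS and requires a network lookup, so it is not shown.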
  4. OVERVIEW OF WEB DEVELOPMENT:
    1. HTML "documents" are actually HTML programs that are a composite of simple text formatting, hyperlinks to other documents or multimedia files, embedded code of other languages (typically authoring languages and scripting languages), and invocations of separate programs written in other languages, typically Java or CGI scripts.
    2. Proprietary Web technologies include ActiveX, Cold Fusion, Shockwave, and Flash.
    3. Web development activities can be categorized under five broad classifications: authoring HTML documents, scripting specialized programs that augment the HTML, programming separate applications that enhance browser presentations, publishing HTML documents on a Web server, and maintaining the integrity of the Web site.
    4. Web programming languages include:
      1. Authoring languages that allow nonprogrammers to create Multimedia applications include:
        1. HTML, used to tell a Web browser how to format text, insert multimedia, link to another location, or link to other programs written in other languages.
        2. DHTML, an extension of HTML that can add animation and interactivity to Web pages.
        3. XML which allows new tags to be defined for new data types thus extending the data types that can be displayed by a browser.  XML is "extensible" because new markup tags can be defined making it virtually unlimited and self-defining.
        4. SMIL which facilitates the synchronization of streaming multimedia.
        5. VRML is used to create virtual 3D environments that can be "explored"
      2. Scripting tools are used to make Web pages dynamic and/or interactive by either client-side or server-side scripting.
        1. JavaScript  is the primary client-side scripting language specifically designed to be used with the World Wide Web.
        2. REBOL (Relative Expression-Based Object Language) is a new approach to Web scripting that utilizes messaging  and dialecting to augment HTML.
        3. Python is a simple, high-level interpreted object-oriented language that bridges the gap between application programming and shell programming, making it suitable for rapid prototyping and Web scripting.
        4. Tcl is an interpreted string processing language for issuing commands to interactive programs that is extensible via custom Tcl libraries. It is used for prototyping applications as well as for developing CGI scripts.
        5. Common Gateway Interface (CGI) scripting is the traditional scripting technique used for server-side scripting in virtually any computer language.
        6. ASP (Active Server Page) is a Microsoft-specific server-side scripting environment typically used with VBScript or JScript.
      3. General purpose HLLs that are typically used to write Web specific programs include:
        1. any HLL (commonly Perl), following CGI protocols.  Such programs are called "CGI applications".
        2. Java, a purely object oriented, HLL designed to facilitate reliable, platform-independent, distributed processing is currently the language of choice for Web programming.  It has special capabilities for writing object-oriented programs that run on networks including the unique concept of "applets".
    5. MIME types are a set of file formats, defined by the IETF, that are designed to be transmitted over the Internet using SMTP; each file format has a unique extension that identifies the application that is needed to output it.
NOTES:
  1. You have now covered the material required to answer questions 1-20 on PREASSESSMENT 330-1.  There are two versions of the preassessments, an HTML version (which you can clone and add questions to) and an interactive version (which is self checking so you can make mistakes and correct them without anyone even knowing!)  Be sure to read the preambles and associated descriptions of the preassessments so that you fully understand their purpose.
  2. I have also published the "best Assessment 1 I can write" called the "PROFICIENCY EVALUATION" which you can begin answering now.  (These also have a clonable HTML version.)  It has a format that is identical to that of the gradeable Assessment 1.  If you begin an assessment having 100% understanding of the associated Proficiency Evaluation, you should have a fundamental understanding of the most important concepts covered and be in good shape to take the assessment.  Be sure to read the preambles and associated descriptions of the proficiency evaluations so that you fully understand their purpose.
  3. Note that within a Learning Module, you can find any word by utilizing the Find utility of your browser. (One of the big advantages of browser-based learning material!)  In Netscape Communicator you use the Find in Page... or Find in Frame... items in the Edit menu.  This is a very powerful tool that can also be a big help on PreAssessments and Assessment reworks; just copy part of a question or answer and paste it in the Find... dialog box and this will take you to each occurrence in that learning module.