
LAST UPDATE: 2/11/03 (Constantly being Updated!)
COSC 330
LEARNING MODULE I
REVIEW OF WEB FUNDAMENTALS
This learning module is a review of the concepts associated with the Internet
in general and the World Wide Web in particular. It is a concise
summary of the online course COSC 120, Introduction to Cyberspace. It is not
a replacement for COSC 120, whose content is a prerequisite for COSC 330,
but it will serve as a concise summary of COSC 120 for those who did not
take that course but have Web development experience and enrolled in COSC
330 with permission of the instructor.
Not
all of the information contained in this learning module is directly relevant
to COSC330, but it is still essential in order to understand the content
of COSC 330 because in presentations and discussions I assume
that students understand this background material.
Advice for studying this LM is given in the Study
Guide for LM I; this is an online substitute for comments typically
made in an on-campus course. If you haven't already done so, read
the Basic
Study
Guide, general advice for study of my online courses.
The
Objectives of this learning module are:
-
To
survey the fundamentals of cyberspace, the Internet, and the Web that are
necessary for efficient Web development; these are covered in the course
COSC120.
-
To
survey the basic features of Web pages.
-
To
preview the Web Development facilities to be covered later in the course.
-
To
illustrate the techniques for studying this online, independent learning
course.
TPQ 1: Rewrite the preceding objectives in terms of personal accomplishments
to be attained after finishing the study of this learning module.
(Note that this will be a standard exercise at the beginning of each learning
module; it is very important in order to "get you focused". For
a hint, and a link to Tony's answer, click on the link "Hints, TPQs"
in the "Navigation Panel" along the left border of this Web page; this
will be a standard facility throughout the course.)
The
sequence of presentations in this learning module is as follows.
You can click on any link to jump directly to that section.
-
CONCEPTS (summary of the COSC 100 INTRODUCTION TO
COMPUTER SCIENCE content relevant to COSC 330.)
-
THE INTERNET
-
THE WORLD WIDE WEB
-
OVERVIEW OF WEB DEVELOPMENT
-
SUMMARY
INTRODUCTION
In his landmark, high-tech noir novel, Neuromancer
(1984;
reviews
at Amazon.com), William Gibson coined the word "cyberspace"
which has come to represent the abstract computer workspace where all knowledge
and information sources are linked via ubiquitous digital networks. Gibson
christened this cyberspace "the matrix", the conduit for interactive,
virtual multimedia. Since then, terms like "Information superhighway",
NII (National Information Infrastructure, the future "super-network" of
the U.S.A.), the "infobahn", etc. have appeared to hype the vision of the
future where every individual has access to all the world's information
via computer. All of these words lack concise, universally accepted definitions,
so in this class we will use (1) "the matrix" to represent the
totality of present-day computer networks (see FIGURE LM1-1)
and (2) the "information space" to represent all electronically
accessible knowledge, which includes the matrix plus television, radio,
the telephone network, etc. (Note that this latter definition is not limited
to computer networks, as it often is!) Several
spectacular views of Cyberspace are illustrated by the *Atlas
of Cyberspace, and fascinating animations (Java Applets) of Internet
traffic for the world
and the U.S.A.
are provided by Matrix Information and Directory
Services, Inc. (MIDS).
FIGURE
LM1-1
The Relationships Between
Various Networks of Cyberspace
(For a larger version
of this illustration click here.
You
might want to open another browser window to view this; if so, right click
(on a PC) or hold the mouse button down (on the Mac) and select Open
Frame in a New Window from the pop-up
menu.)

The Internet (often
simply called "The Net") is, by far, the dominant network of cyberspace.
It began as a way to communicate text-based data (e-mail, text documents,
etc.) and programs (binary files, sometimes called executable files), but
has evolved dramatically, especially with the development, within the Internet,
of the World Wide Web (also called WWW, W3, or simply
"The Web") during the 1990s. Today one can communicate via
multimedia in video conferences or even enter mutual "virtual worlds"
where the multiple users interact in an environment that exists only in
a computer's memory. These virtual worlds can be anything the creator can
imagine! Such facilities are provided by the Web, a subnet of the Internet,
that is the prototype of the cyberspace of the future.
The following presentation is a preview of the material to be covered in this
course. It consists of a review/preview of the basic computer concepts
used to describe the Internet (section 1), a summary of the Internet components
(section 2), and overviews of the World Wide Web (section 3) and Web development
facilities (section 4). This content is presented concisely
here as a review of prerequisite material as well as a
preview of Web development techniques to be covered in more detail in subsequent
learning modules.
NOTE:
You should refer back to this Overview when studying later details to see
how they fit into the overall context of cyberspace.
1.
CONCEPTS (Summary of COSC100 Content Relevant to COSC330):
The
following basic computer concepts are essential to the discussion of cyberspace.
They are covered in detail in courses like COSC 100, Introduction
to Computer Science (You
can access my online version of this course by clicking here.
You should do this in a separate window
(Right click on the frame and select "Open Frame in a New Window"
from the popup box.); otherwise you will get two navigation panels on the
page!).
They can also be learned by outside reading or by looking them up on the World Wide
Web (e.g. click on the links to Webopaedia, Computer Desktop Encyclopedia,
Whatis, or FOLDOC in the Navigation Panel to the left;
click here and read comment #4).
SAQ 1: To see what they are like, look up the definition of "Cyberspace" in
each of the four on-line references.
(For a hint, and a link to Tony's answer, click on the link "Hints, SAQs"
in the "Navigation Panel" along the left border of this Web page; this
will be a standard facility throughout the course.)
1.1
Computer Concepts:
-
Computer = __________(1)
(For a hint, and a link to Tony's answer, click on the link "Hints, FIBs"
in the "Navigation Panel" along the left border of this Web page; this
will be a standard facility throughout the course.) electronic
machine that (a) processes digital data into information (numeric,
text, or multimedia) and (b) controls electrical devices.
-
Microcomputer = computer
based on a __________(2), a "processor on a chip".
-
Computer System = people,
hardware, software, data, and procedures.
-
Hardware = physical equipment
of a computer system.
-
Software = __________(3)
that "run" the computer.
-
Program = set of step-by-step
instructions, in a _________ __________(4), that causes a computer
to execute a specific task in finite time.
TPQ 2: What is the difference between a calculator and a computer?
(For a hint, and a link to Tony's answer, click on the link "Hints, TPQs"
in the "Navigation Panel" along the left border of this Web page; this
will be a standard facility throughout the course.)
SAQ
2: What is the difference between hardware and software?

STUDY
GUIDE NOTES:
-
SAQs
(Self Assessment Questions) and TPQs (Thought Provoking Questions)
are learning aids that will be used throughout my learning material.
Both types of questions are designed to help you focus on the essential
characteristics of fundamental concepts. SAQs act as "traffic lights";
if you can't answer one, it is a symptom of a misunderstanding, and you
should review the notes to correct it. TPQs may have more than one correct
answer; they may not even have any correct answer; they are simply there
to make you think! You are strongly encouraged to think
up your own SAQs and TPQs, using these as guides.
(The "Cyber Jeopardy"
exercise in the PREASSESSMENTS formalizes this exercise by
asking you to think up questions for each of the multiple choice answers.)
Searching your mind for such questions helps you to identify important
concepts and think about them; thought is essential to obtaining understanding!
-
You
should
work continuously on the PREASSESSMENT associated with each learning
module as you study. PREASSESSMENT 120-1 is associated with learning
modules I and II; you should read questions 1-20 because the answers to
those questions are in this learning module I. For now, answer the
questions by circling the answers, then, when you have to submit the PREASSESSMENT
you can easily transfer your answers to the scantron form that will be
provided the day before the preassessment is due.
-
The blanks in
the text, like the SAQs and TPQs, are learning aids. As such, the answers
for them should NOT be written in the blanks; that simply turns
the learning aids back into normal text (you are a spectator). Instead,
if you feel you must write the answer down, place it in the margin or at
the end of the chapter; then, when you review, the FIBs (Fill in the
Blanks), SAQs, and TPQs will make you think. (You become a PARTICIPANT
instead of simply a spectator.)
1.2
Data Processing Concepts:
The following flowchart representation
of the Input-Process-Output (I-P-O) process,
FIGURE LM1-2,
can
be used to illustrate virtually any computing concept or process!
In this section this representation is used to visualize the conceptual
operations involved in data processing. In FIGURE
LM1-3 this same schematic format is used to relate different parts
of computer hardware.
FIGURE
LM1-2: The "I-P-O" SCHEMATIC
-
The schematic shows that information
is processed __________(5), (facts, values, etc. organized for
computer consumption); information is presented for __________(6)
consumption.
-
Direct input includes
data as well as the programs that process the data (in word processing
the data would be text and the program would be the word processor), which
are typically input from a keyboard, mouse, or some other direct input
device. In order to be processed the input must be encoded, i.e.
translated from human language into machine (computer) language; this is
done transparently (unseen by the user) as the input is read by the computer.
-
Local output goes directly
to the user, typically via the computer monitor, speakers, printer, etc.
and involves decoding from machine language back into a form understandable
by humans.
-
Before being output to the user,
processing may have intermediate output and return input involving disk
storage or communications.
-
Store operations save
output to a data file, e.g. a text file from a word processor or an HTML
file from a Web browser.
-
Communicate operations
involve interactions with other computers; this is called "remote" input/output
to distinguish it from "local" input/output. Communications usually
involves network transmissions, most often via the Internet.
Unfortunately,
many introductory texts still ignore the communicate activity (and miss
the nice symmetry of the I-P-O schematic), so if you memorized a PC-centric
version of this schematic you missed out on the fact that "the network
is the computer" (Sun Microsystems' motto); be sure to remember the COMMUNICATE
component and the nice balance of this schematic!
-
Virtually
all computers are digital, i.e. they can only process digital data
(discrete electronic signals). Digital data is stored in memory as collections
of electronic switches (transistors), each either on or off; these primitive
data elements are called bits (binary digits) and are represented
by humans as 1 or 0; a collection of eight bits is called one byte,
which is used to represent a single alphanumeric character.
-
Computer data can have various
forms including
numeric (integer or "mixed"), text, and multimedia (audio,
visual, etc.), but they are all digital and thus represented by precise
collections of bits.
-
Most "real world" data is
analog (continuous rather than discrete); therefore, it must be converted
to digital (A/D conversion) when encoded and vice versa (D/A conversion)
when being decoded.
(
For
the distinction between analog and "digital" data see section
1.C in Learning Module of COSC 120, REVIEW/OVERVIEW OF COMMUNICATIONS
AND NETWORKING; however, this distinction is not critical to the following
discussion.)
-
Data and programs are stored
(i.e. "saved") in files located in secondary storage. (See
section
1.3.C, below.)
-
Data files contain digital data
that is the "raw material" for the computer programs. Examples include
numeric data stored as binary numbers, text stored as binary codes, etc.
-
Program files contain
the instructions that manipulate the data in data files. Program files
that contain machine language instructions (in a binary format), which can be
executed by the computer without translation, are usually called "executable
files".
-
In order to complete a processing
task, a computer might need to use data or run programs on other computers.
This can be accomplished by communication via networks to which
the client or server may not even be physically connected. (See section
1.5, below.)
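The bit and byte ideas above can be made concrete with a few lines of Python. This is only a minimal sketch: it assumes the ASCII character code (these notes do not name a specific code), and the helper name to_bits is hypothetical, chosen just for illustration.

```python
def to_bits(text):
    """Return one 8-bit string per character, assuming ASCII encoding."""
    data = text.encode("ascii")      # each ASCII character becomes one byte
    return [format(byte, "08b") for byte in data]


if __name__ == "__main__":
    # Show the byte (eight bits) that stores each character of a short word.
    for char, bits in zip("Web", to_bits("Web")):
        print(char, bits)
```

Each printed row pairs one character with the eight on/off values (bits) that make up its byte.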
TPQ
3: How can computers be networked without being physically connected?
SAQ
3:
1.3
Hardware Concepts:
The
following is a greatly oversimplified survey of the concepts associated
with the interactions of the CPU with its peripheral devices. It
is intended only to familiarize the beginner with basic hardware terms
needed to talk about computers used in telecommunications. It is
equivalent to the
OVERVIEW
OF COMPUTERS, part of my on-line course COSC
100, INTRODUCTION TO COMPUTERS; for a more detailed treatment
see CENTRAL
PROCESSING UNIT & PRIMARY MEMORY and INPUT/OUTPUT
HARDWARE learning modules of that same course.
-
Computer Classifications:
-
A simplistic classification
of computers can be made according to whether they are utilized by individuals
or multiple users.
-
Personal computers (PCs)
are designed for the single user, and are the most common means of Internet
access; in such cases they are called "clients" (see
below), which access the services available on "servers" on
the Internet. PCs are microcomputers (computers based on microprocessors),
which have subclassifications like desktops, portables, notebooks, etc.
-
Multi-user computers
can be loosely categorized, according to decreasing power and price, under
the following types: supercomputers, mainframes, and minicomputers.
Mainframes and minicomputers are used as Internet nodes where they route
communications traffic. They are also used as Internet servers
in which case they provide a "service" (See
below.) like a Web site; however current, powerful microcomputers can
also act as servers.
-
In
this course it is unnecessary to fully understand the distinctions between
computer types, so further discussion of this topic is omitted. As
far as this course is concerned, it is only necessary to realize that users
typically access cyberspace via microcomputers and that mainframes
and minicomputers are typically used as Internet nodes.
-
Generic Organization of
the CPU and Peripheral Devices:
FIGURE
LM1-3
-
The arrows
within the CPU schematic above simply dramatize the complex interaction
of the two
conceptual components of the CPU (Control Unit (CU),
and
Arithmetic/Logic Unit (ALU)) and primary memory; this
schematic really reflects the organization of a microcomputer, but is less
true of large, multi-user computers like minicomputers and mainframes.
WARNING:
There is a discrepancy in the way different people define the CPU; some
texts include primary memory as part of the CPU (I believe this is the
most accurate description, but few introductory courses, which focus on
microcomputers, use this terminology.) (For
more details read Section
3 of LM IIIB, of COSC 100.)
-
Input,
output, communications, and secondary storage equipment are called peripheral
devices. These may be on-line (directly connected to the
CPU) or off-line (often called auxiliary devices).
-
Direct
I/O hardware allows the user to interact directly with the computer;
this distinguishes it from Indirect I/O described in the next section.
Direct
input hardware includes keyboards, pointing devices, etc., and direct
output hardware includes monitors, printers, speakers, etc.
-
Indirect
I/O involves multiple outputs and inputs from devices connected to
a computer before the final output goes to the user. This has two
basic subcategories, secondary storage and communications, which are briefly
explained in the following sections.
(For more details read LM V, of COSC 100.)
-
*Secondary
Storage is currently dominated by magnetic media (hard disks,
removable hard disks, and floppies), but magneto-optical and read/write
optical media (DVD, DVD-RAM, and DVD+RW) promise to revolutionize storage
technologies. (For
more details read LM IV,
of COSC 100.) An excellent article on the near future of removable
storage was published in the
5/21/98 issue of PC Magazine; this
article also illustrates the ever-present problem of vaporware with the hyped
200MB floppy from Sony and the 20GB rewritable magneto-optical disk from
TeraStore Corp, both of which have yet to appear!
A really neat Web site for comparison shopping for hardware is PRICE
WATCH, whose URL is
www.pricewatch.com/
-
Data
communications is the background theme of this course, so knowledge of
basic communications hardware, especially that associated with Internet
access, is a prerequisite for COSC 330. (For
a review, check out LM
II, of COSC 120.) The overall picture includes the following.
-
Data communications
is a general term that has two subcategories:
-
Networks
involve groups of computers. (See section
1.5, below.)
-
Telecommunications
is the technology that facilitates long distance communications between
computers. This overlaps with networking when more than two computers
are involved.
-
Advances
in data communications have reoriented computing from a centralized system
based on mainframes to distributed systems in which data and computing
power are made available to numerous, non-local users and all resources
may be shared. This trend will continue towards a goal of optimal
distribution that is dynamic, i.e. systems will reconfigure themselves
so that they offer the maximum facilities to the users currently on-line.
SAQ
4:
SAQ
5: What is the opposite of a distributed computer system?
1.4
Software Concepts:
Software
is a generic term for instructions that a computer can execute.
Self-contained software is essentially synonymous with
computer programs.
Most textbooks classify software into two categories. (I prefer three;
see the concluding paragraph of this section.)
-
Application software
includes programs that turn the computer (a general purpose tool) into
a special purpose tool. Those relevant to this course include:
-
productivity software
includes:
-
general productivity like word
processors, electronic spreadsheets, database management systems, graphics
packages, etc.
-
Web development software
including
-
WYSIWYG Web authoring
software like FrontPage, Dreamweaver, etc.
-
Multimedia development
software like Macromedia Flash or Fireworks.
-
Software development tools
(if these are part of a multiuser computer system this is more properly
categorized as system development software; see section 1.4.B.c,
below) including
-
Scripting languages like JavaScript,
which we will learn in this course; JavaScript is a "special purpose" language
designed to embed code directly within HTML documents.
-
Java, a general purpose,
object oriented language optimized for distributed environments
-
education/entertainment software
like tutorials, training programs, games, etc. I plan to make extensive
use of online examples of this genre in this course.

To
find and evaluate the best of these online learning resources will be an
overwhelming undertaking, so I would appreciate
your keeping an eye on candidates and recommending them to me -- even after
you finish this course!
-
professional software
for use in business, science, medicine, etc.,
-
System
software includes programs that allow users and their application
software to utilize the computer resources (the computer itself, all its
peripheral devices, and networks to which it is connected). In general,
system software has three subcategories:
-
system management software,
e.g. the operating system (OS), networking, telecommunications, etc.,
-
system support software,
e.g. utilities, device drivers, system monitors, maintenance, etc., and
-
system development software,
e.g. programming languages, Integrated Development Environments (IDEs),
software engineering tools, etc.
SAQ
6:
1.5
Communications and Networking Concepts:
(For more detail, see LM
II of COSC 120, REVIEW/OVERVIEWS OF COMM. & NETWORKING.)
-
"Communications" is a
general word for the transmission of signals between two or more points
via a communications channel.
-
"Data communications"
refers to computer data.
-
"Telecommunications"
pertains to transmissions over a distance in one of two forms:
-
electronic transmission
(via electrons) occurs through physical media such as wires and
-
electromagnetic wave
transmission (via laser, radio, TV, microwave, etc.) requires no media
(the term "wireless" is used), except in the case of fiber optics
in which light carries data through cables.
-
Networking links computers
so they can communicate, as well as share hardware and software.
The consequent unification of processing power leads to the goal of distributed
computing (see section 1.5.N, below), which is the optimum,
dynamic spreading of computing resources among users.
-
Data Communications, in general:
-
Types of transmission signals
(See
Figure
C&N-3.):
-
An analog signal is a
continuous
wave pattern that varies in frequency or amplitude to convey data. Most
"real-world" data has an analog format.
-
A digital signal is a
pattern of discrete high or low amplitude pulses that represents
binary data and is therefore used to transmit computer data.
-
A carrier signal
is a base signal for transporting data; the data is superimposed on the carrier by
modulating (altering)
the carrier signal. The most basic forms include Amplitude
modulation (AM), Frequency modulation (FM), and Phase modulation (PM).
(Figure
C&N-3 illustrates the AM and FM concepts.)
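The idea of superimposing data on a carrier by amplitude modulation can be sketched numerically. The following is a simplified illustration only, not a description of any real modem: the function name am_samples, the modulation depth of 0.5, and the 100 Hz / 1000 Hz / 8000 samples-per-second figures are all assumptions picked to keep the example small.

```python
import math


def am_samples(message, carrier_hz, sample_rate, depth=0.5):
    """Amplitude-modulate a sine carrier with message samples in [-1, 1]."""
    samples = []
    for n, m in enumerate(message):
        t = n / sample_rate                               # time of this sample
        carrier = math.sin(2 * math.pi * carrier_hz * t)  # unmodulated carrier
        samples.append((1 + depth * m) * carrier)         # message scales the amplitude
    return samples


if __name__ == "__main__":
    rate = 8000
    # A slow 100 Hz tone plays the role of the data to be carried.
    message = [math.sin(2 * math.pi * 100 * n / rate) for n in range(80)]
    signal = am_samples(message, carrier_hz=1000, sample_rate=rate)
    print(max(abs(s) for s in signal))
```

The carrier's frequency never changes; only its height (amplitude) follows the message, which is what distinguishes AM from FM and PM.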
SAQ
7: What is the difference between analog data and digital data?
-
Transmission Characteristics:
-
Transmission parameters:
-
The transmission speed is
the amount of data transmitted per unit time, e.g. bits per second, bps
or bytes per second, Bps.
-
Digital Signal Classifications
(for North America) and Speeds:
-
DS (digital signal) is a data
transmission classification system based on multiples of 64 Kbps, the theoretical
bandwidth of a single "voice channel" on the "plain old telephone service"
("POTS").
-
OC (optical carrier) speed is
a fiber optics classification system that is based on multiples of 51.84
Mbps.
COMMON CARRIER CLASSIFICATIONS
| Service | Voice Channels | Speed (Mbps) |
| DS0 | 1 | 0.064 |
| DS1 (T1) | 24 | 1.544 |
| DS3 (T3) | 672 | 44.736 |
| DS4 | 4032 | 274.1xx |
| OC-12 | 9150 | 622.xxx |
-
The network is only
as fast as its slowest component
(often called a "bottleneck").
The relative speeds depend on both the type of media and type of equipment
used.
-
Transmission channels include
simplex (one way), half-duplex (two way, not simultaneously), and full-duplex
(simultaneously two way). There are three basically different types
of channels.
-
Analog lines, e.g. POTS,
carry analog signals via electrons. To transmit data, the digital
data must be superimposed, by a modem, on the
telephone's analog carrier signal.
-
Digital Lines carry digital
signals and thus avoid the analog/digital conversions necessary for digital
transmission over POTS. There are currently two types of digital
lines:
-
ISDN
(Integrated
Services Digital Network) is a circuit-switched,
dial-up
service for transmitting digital data via a single wire or fiber optics
cable. Basic Rate service (BRI) can provide 128 Kbps bandwidth;
Primary Rate Service (PRI) can provide 1.5 Mbps, equivalent to T1 transmissions.
-
Digital Subscriber Lines
(DSL) also
transmit completely digital data over POTS. DSL is a dedicated
point-to-point technology that provides a practical maximum of over
6 Mbps using current technologies and up to 52 Mbps in the future.
-
Wireless communication
typically uses
microwaves (electromagnetic waves with frequencies
between radio/TV and light; see
Figure
C&N-4A) or radio waves to
provide high-capacity transmission (over 3 million bps) over line-of-sight
channels.
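The DS and OC speed classifications and the "bottleneck" rule described earlier in this section come down to simple arithmetic, sketched below. The function names and the sample link speeds (including the 56,000 bps dial-up link) are hypothetical values chosen for illustration. Note that 24 voice channels at 64 Kbps give 1.536 Mbps; the DS1 figure of 1.544 Mbps is slightly higher because 8 Kbps of framing overhead is added, a detail not covered in these notes.

```python
DS0_KBPS = 64        # one voice channel, as stated in the notes
OC1_MBPS = 51.84     # base optical carrier rate, as stated in the notes


def ds_payload_kbps(voice_channels):
    """Payload speed of a DS service: voice channels times 64 Kbps."""
    return voice_channels * DS0_KBPS


def oc_mbps(level):
    """Speed of an OC-n service: n times 51.84 Mbps."""
    return level * OC1_MBPS


def path_speed_bps(link_speeds_bps):
    """A network path is only as fast as its slowest link (the bottleneck)."""
    return min(link_speeds_bps)


def transfer_seconds(size_bytes, speed_bps):
    """Ideal transfer time: bytes are converted to bits, then divided by speed."""
    return size_bytes * 8 / speed_bps


if __name__ == "__main__":
    print(ds_payload_kbps(24))                        # DS1 payload in Kbps
    print(oc_mbps(12))                                # OC-12 speed in Mbps
    # Hypothetical path: T1 backbone, dial-up modem link, office LAN.
    bottleneck = path_speed_bps([1_544_000, 56_000, 10_000_000])
    print(transfer_seconds(7000, bottleneck))         # seconds for a 7000-byte file
```

However fast the backbone is, the example transfer is limited by the slowest link on the path.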
SAQ
8: (a) Is all wireless data transmission electromagnetic? (b) Is
the reverse true, i.e. is all electromagnetic data transmission wireless?
-
Transmission Techniques:
*See FIGURE
C&N-5 for a comparison of Baseband and Broadband
-
Baseband transmission
provides digital transmission without change in modulation; simultaneous
transmission of multiple sets of data is accomplished by interleaving
pulses using
TDM (time division multiplexing).
-
Broadband transmission
is used to send multimedia over long distances. It modulates
data, voice,
and video onto different frequencies using FDM
(frequency division multiplexing).
-
Multiplicity
governs the number of people involved in a network communication session.
There are five categories: Unicast (1 to 1), Anycast (to
the
nearest of several receivers), Multicast (to a
selected
group of receivers), Broadcast (to multiple receivers),
and Datacast (allows computer data to be transmitted simultaneously
with a TV broadcast).
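The interleaving performed by time division multiplexing (TDM), mentioned above under baseband transmission, can be sketched as a round-robin over several data streams. This is a toy model under stated assumptions: each "pulse" is represented by one character, an idle time slot is represented by None, and the function names are hypothetical.

```python
from itertools import zip_longest


def tdm_multiplex(streams, idle=None):
    """Interleave streams round-robin: one time slot per stream per frame."""
    frames = zip_longest(*streams, fillvalue=idle)   # pad short streams with idle slots
    return [slot for frame in frames for slot in frame]


def tdm_demultiplex(channel, n_streams, idle=None):
    """Recover each stream by taking every n-th slot and dropping idle slots."""
    return [[slot for slot in channel[i::n_streams] if slot is not idle]
            for i in range(n_streams)]


if __name__ == "__main__":
    shared = tdm_multiplex(["AAA", "BB", "CCC"])
    print(shared)                       # the single shared channel
    print(tdm_demultiplex(shared, 3))   # the three original streams
```

Each sender gets the whole channel for its own time slot, in contrast to FDM, where each sender gets its own frequency all of the time.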
SAQ
9: What is the most important difference between baseband and broadband
transmission?
-
Communications Hardware:
-
A modem is a device that
transmits digital data over an analog channel by modulating the analog
carrier signal.
-
A codec transmits analog
data over a digital channel. (
Note
that "codec" which, in this case, stands for coder/decoder, has several
other definitions when used in other contexts, e.g. compressor/decompressor
in multimedia transmissions.)
-
Multiplexers interleave
multiple communications so that they can share a single communications channel.
The two common multiplexing techniques are FDM and TDM.
-
Controllers supervise
data transfer between the CPU and terminals on a multiuser system.
-
Concentrators perform
the functions of both controllers and multiplexers, among other things.
-
Fax (facsimile machine)
transmits images (text, pictures, etc.) over telephone wires.
-
Network hardware; see
section
1.5.H.a below.
SAQ
10: What is the difference between a modem and a codec?
-
Communications Media:
-
Electronic Cables transmit
data, via
electrons, through copper wires. These include Twisted
pair wiring, Coaxial cable, and Cable television (CATV) cables
which can be used with cable modems to rival DSL technology
for the future of high bandwidth data transmission for the general public.
-
Fiber Optics Cables transmit
data, via light, through glass wire bundles; they outperform electronic
cables in transmission speed, bandwidth, interference avoidance, and inhibition
of wire tapping.
-
Communication software
controls
a computer’s access to system resources and stored data.
-
A communications program
manages
the transmission of data between a computer and another computer or network.
A communications application
performs a specific communications service or, in the case of Browsers,
several communications services.
-
Other types of communication
software include Terminal Emulation and Data-encryption.
-
Communications Protocols:
-
Communications protocols
are
standards that govern the communications between computing devices.
-
There are, currently, three
basic categories of protocols:
-
Basic protocols are either
synchronous or asynchronous and govern error detection and correction ("parity"),
etc.
-
Modem protocols govern
transfer of files via modem.
-
Network protocols include
WAN
protocols (communications within complex distributed systems) and LAN
protocols.
-
TCP/IP is a suite of
protocols that govern the Internet; see section 2.3,
below.
-
The OSI
model is a standard, theoretical, seven layer, network model of
protocols.
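The "parity" error detection mentioned above under basic protocols can be illustrated with a short sketch. It assumes even parity applied to a 7-bit character, which is one common arrangement but is not specified in these notes; the function names are hypothetical. Note that a single parity bit can only detect an odd number of flipped bits; it cannot correct them.

```python
def add_even_parity(bits7):
    """Append a parity bit so that the total number of 1s is even."""
    parity = bits7.count("1") % 2      # 1 only if the count of 1s is currently odd
    return bits7 + str(parity)


def passes_even_parity(bits):
    """Receiver's check: an even number of 1s means no error was detected."""
    return bits.count("1") % 2 == 0


if __name__ == "__main__":
    sent = add_even_parity("1000011")          # 7-bit code for the letter C
    print(sent, passes_even_parity(sent))
    damaged = "0" + sent[1:]                   # simulate one bit flipped in transit
    print(damaged, passes_even_parity(damaged))
```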
-
Generic network architecture
is
a collection of linked "nodes" that form channels, clients, servers
and
supporting hardware/software. They provide the infrastructure for a distributed
computing environment with its client/server processing model.
This is the essence of the provocative statement, "The network
IS the computer".
-
Network
Components ("Nodes") :
-
A terminal is any end
point of the network.
-
A server is a computer
that provides network services.
-
A host computer coordinates
terminals connected to it.
-
A hub connects several
network nodes together, sharing the total bandwidth.
-
A switch allows a non-shared
connection between two network devices.
-
A repeater facilitates
data transfer between distant devices by regenerating an attenuated or
distorted signal.
-
A bridge is an interface
linking two similar networks.
-
A router is a computer
that manages the efficient routing of a transmission by selecting the "fastest"
link to the destination.
-
A gateway is a network
computer that links two different types of networks.
-
A firewall is a computer
that controls access to a private network in order to maintain security.
-
Basic network topologies
include the
star (uses polling access), bus (uses contention access),
ring
(uses token passing access), and hierarchical.
SAQ
11: What kinds of network nodes are "invisible" to the network user?
-
Computer
Networks
are the result of the reorientation of computing design
from early isolated, centralized systems based on huge, expensive mainframe
computers with numerous user terminals to distributed systems in
which data and computing power is spread over all networked users thus
allowing all networked resources to be shared.
Distributed
computing is based on the idea that "the network
IS the computer" (Sun Microsystems' motto)! This profound
phrase means that, when you are connected to the Internet, your "computer"
is not just your PC, but all the computers of the Internet, a mind-boggling
concept!!
-
Networks consist of interconnected
"nodes" that interact via a client-server model.
-
Servers are network computers
which
provide resources to the users of the network. Server software
consists of applications that are stored on servers but which can be accessed by
users without downloading them to their local hard disk.
-
Clients are computers
at which users access servers on a network. Client software, running
on a networked computer, is specifically designed to access server software,
pass requests to it, and communicate results to the user. In FIGURE
LM1-4 the particular client software is a database management
system; when a query is made, instead of downloading the whole database
and searching on the client, the query is processed on the server and only
the results are passed back to the client, a much more efficient use of
resources.
FIGURE
LM1-4
Simplified Client/Server
Schematic
NOTE:
The terms "client" and "server" are confusingly used to refer to the software
as well as the computers on which they run.
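The database example described above (the query is processed on the server and only the results travel back to the client) can be sketched in a few lines. This is an in-process stand-in for a real network exchange: the class names, the request format, and the sample records are all hypothetical, and an ordinary method call takes the place of the network transmission.

```python
class DatabaseServer:
    """Holds the full database and answers queries on behalf of clients."""

    def __init__(self, records):
        self._records = records

    def handle(self, request):
        # The search runs here, on the server; only the matches are returned.
        field, value = request["field"], request["equals"]
        return [r for r in self._records if r.get(field) == value]


class Client:
    """Builds a request, passes it to the server, and reports the results."""

    def __init__(self, server):
        self._server = server

    def find(self, field, value):
        request = {"field": field, "equals": value}   # a small request goes out
        return self._server.handle(request)           # a small result comes back


if __name__ == "__main__":
    server = DatabaseServer([
        {"name": "Ann", "course": "COSC 330"},
        {"name": "Bob", "course": "COSC 120"},
        {"name": "Cy", "course": "COSC 330"},
    ])
    print(Client(server).find("course", "COSC 330"))
```

The client never sees the whole database, which is the efficiency gain the schematic is meant to show.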
SAQ
12: Modify FIGURE LM1-4 so that it illustrates the client-server interaction
on the Web.
-
Types
of Computer Networks:
-
A Local Area Network
(LAN) is the smallest kind of network designed to serve users within
a confined geographical space, like a room or building.
-
A Wide Area Network (WAN)
, e.g. the __________(7), covers a wide geographic area such as a state,
a country, a dispersed corporation, or the world. They usually consist
of subnetworks and incorporate common carriers that are licensed
and regulated by government agencies providing telecommunication services
for the public.
-
A Metropolitan Area Network (MAN)
is a less frequently used term that refers to networks larger than LANs
but smaller than WANs, e.g. large corporate networks at a single location.
-
Value-added networks (VAN)
(e.g. GTE's Telenet and Tymshare's Tymnet) are public data
networks, accessible via modem, for organizations that find private networks
unfeasible. They make long distance connection to computing services less
expensive than normal telephone service.
-
In a switched network
a temporary connection is established between two network terminals
for each individual communication. Data is transmitted from sender to receiver
by three types of switching:
-
circuit switching (transmission
only if receiver is ready) requires that a constant sender to receiver
circuit be maintained for the duration of a transmission.
-
message switching is
permanent, like circuit switching, but the connection is automatic, and
-
packet switching (message
components, called "packets", may follow different routes). Unlike ____________(8)
switching, which requires a constant point-to-point connection to be maintained,
packet switching places in each packet the destination address and a number specifying its
position in the message sequence. This allows each packet to be "dynamically
routed" over any network link as links become available or less congested.
The destination computer reassembles the packets back into their proper
sequence. The dynamic routing capability of the Internet makes it
virtually indestructible, because when any link "goes down" the network
itself will automatically reroute the message packets, unknown to the
sender or receiver.
-
Dedicated (nonswitched)
lines may be leased as network channels for the exclusive use of organizations
transmitting large amounts of data.
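The packet-switching behavior described above can be sketched in Python. This is a minimal illustration, not a real network stack: the function names packetize and reassemble, the packet fields, and the destination address 192.0.2.10 (an address reserved for documentation) are all assumptions made for the example. The shuffle simply imitates packets arriving out of order after taking different routes.

```python
import random


def packetize(message, dest, size):
    """Split a message into packets, each carrying the destination
    address and a sequence number giving its position in the message."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [{"dest": dest, "seq": n, "data": c} for n, c in enumerate(chunks)]


def reassemble(packets):
    """Put the packets back into sequence order and rejoin the data."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return "".join(p["data"] for p in ordered)


message = "The network IS the computer"
packets = packetize(message, "192.0.2.10", 5)

# Simulate dynamic routing: packets arrive in an unpredictable order.
random.Random(0).shuffle(packets)

restored = reassemble(packets)
print(restored)
```

Because every packet carries its own address and sequence number, the receiver needs no fixed circuit to rebuild the message.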
SAQ
13: Give an analogy to circuit switching and message switching in today's
telephone use.
SAQ
14: The combined networks at FSU would be called a _______; each computer
lab at FSU would be called a _______; the combined networks of the University
of Maryland System would be called a ______.
TPQ
15: Why would one say that the Internet is a more "efficient" communications
network than the telephone network?
-
Network Technologies:
-
LAN Technologies:
-
Ethernet
is
a bus technology that comes in several varieties: twisted-pair,
switched, and fiber optic.
-
Token Ring networks implement
ring technologies that are available in two types: Type 1 connects
up to 255 stations via shielded twisted pair wiring; Type 3 connects
up to 72 devices via unshielded twisted pair.
-
FDDI is a ring technology
for fiber-optic LANs that has a range of 124 miles and can support
thousands of users.
-
ATM
(Asynchronous Transfer Mode) is a dedicated-connection switching technology
available for LANs as well as WANs that provides realtime multimedia transmission.
-
WAN Technologies:
-
Unswitched technologies:
The T-carrier system is entirely digital and provides full-duplex
capability via coaxial cable, optical fiber, digital microwave, and other
media. The most common are the T-1 line that provides 1.5 Mbps
and the T-3 line, that provides almost 45 Mbps.
-
Switched services:
-
Modem dial-up is the least sophisticated
but most common service.
-
ISDN
-
Frame
relay is a new technology optimized for cost-efficient packet
switching for intermittent telecommunications throughout WANs
at bandwidths
between .065-45 Mbps.
-
SMDS is a new public,
connectionless, packet-switched service offered by telephone companies
for interconnecting LANs in different locations, providing large bandwidth
exchanges between enterprises over a WAN.
-
ATM for WANs is
the same technology as that for LANs.
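To get a feel for the bandwidth figures quoted above, the following Python sketch works out how long a file would take to transfer at each rate (time = bits / bits per second). The T-1 and T-3 rates come from the text; the 10 MB file size and the 56 kbps modem rate are assumptions chosen only for illustration, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes, rate_mbps):
    """Ideal transfer time: total bits divided by the line rate in bits/s."""
    bits = size_bytes * 8
    return bits / (rate_mbps * 1_000_000)


FILE_SIZE = 10 * 1_000_000  # hypothetical 10 MB file

rates_mbps = {
    "dial-up modem (assumed 56 kbps)": 0.056,
    "T-1": 1.5,
    "T-3": 45.0,
}

for name, mbps in rates_mbps.items():
    print(f"{name}: {transfer_seconds(FILE_SIZE, mbps):.1f} s")
```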
-
Internet Connections:
-
There
are three basic Internet access methods:
-
modem
connections and Online Services (like AOL) provide temporary IP
addresses that are reassigned after you disconnect.
-
LAN connections
are permanent because they have permanent IP addresses and can be left
on indefinitely.
-
Private networks can
restrict access to their networks.
-
Intranets are private networks
that are restricted to users inside an enterprise.
-
Extranets are private networks
that are restricted to outside organizations that are associated with an
enterprise, e.g. people and corporations that do business with the enterprise, like
customers, suppliers, etc.
-
Distributed
computer systems, the ultimate goal of networking, offer a robust
alternative to multiuser computers. In a multiuser system,
if the central computer "goes down" every user is out of luck; in a distributed
computing environment, when a computer malfunctions, only the user of that
computer is affected. (See FIGURE LM1-5
for a comparison of distributed computer systems versus the PC.)
Three versions of distributed PC systems are:
-
The new "Network Computers"
(NCs
as opposed to PCs) are computers which have no secondary storage
of their own but access all applications from and store all projects on
network servers.
-
Networked workstations,
e.g. Windows NT workstations, are PCs that are interconnected as well as
connected to printers, servers (e.g. file servers which are computers whose
hard disk is accessible to everyone in the network), net modems, etc.
-
NetPCs and WebPCs are
stripped down PCs (but containing local secondary storage) designed specifically
to be part of a network via which they access data, application software,
etc. Their locally stored software is installed, maintained, and
updated, via the network, under centralized control.
FIGURE
LM1-5
SAQ
16:
2.
THE INTERNET:
(
See
the nice Internet
description at How Stuff Works.)
(How
Stuff Works is a COOL site; I suggest you explore it!)
2.1
The Internet is a Wide Area Network (WAN):
-
The Internet (with a capital
"I") is a network of networks within which all devices communicate
via the TCP/IP protocol suite. (The terms "intranet" and
"extranet" refer to private networks and extensions of private
networks based on TCP/IP.) It is a "meganetwork" linking (as of
1999) over 100,000 networks, at least 44 million hosts and approximately
150 million people in virtually every country in the world. (These
numbers,
from the Internet Society, are "guesstimates" because it is virtually
impossible to measure them, and they increase daily; it is estimated that
the Internet population increases 15% per month! See the MIDS
graph of Internet growth.) The latest density of computers on the
Internet is shown in FIGURE LM1-6. The Internet links government
agencies, educational institutions, businesses, libraries, science foundations,
non-profit organizations, etc. (Also
check out the various fascinating maps
from An Atlas of Cyberspace; however, be aware that some of these pages
take a long time to access because of their complex graphics.)
-
No one runs the Internet; it
is like a cooperative, i.e. a federation of independent networks.
The
Internet Society, a non-profit group in Reston, Va., promotes the
use of the Internet.
-
It has an open architecture,
meaning anyone can connect up and use it.
-
It is a chaotic source of undisciplined
information, an often bewildering maze to navigate.
FIGURE LM1-6
The Density of Computers
in the Internet
-
The Internet can be thought of
from three viewpoints:
a huge, dynamic network of computers, a collection of protocols, or a collection
of dynamic services. Each of these views is briefly described below.
-
A physical network:
it is a World Wide Network (i.e. a ________(9))
that is a maze of telecommunication lines which interconnect smaller networks.
For example our Compton laboratory networks are part of the FSU network
which is part of the University of Maryland System network which is part
of the Internet, but technically every FSU network computer is part of
the Internet.
-
Internet access is provided
by ISPs (Internet Service Providers), companies that maintain Internet
connections and rent their services to other ISPs or individuals.
In general, there are three categories of ISPs, local, regional, and national.
(See Figure LM1-7.)
-
The national ISPs, like MCI,
Sprint, AT&T, etc. maintain "backbones" that act as "trunklines"
that carry huge composite transmissions over long distances. In the U.S.,
access
points to these backbones and the places where data moves from one
backbone to another are one of two types:
-
NAPs (Network Access
Points), also called Internet Exchanges (IXs), are junction points
where national ISPs interconnect with each other.
-
MAEs (Metropolitan Area
Exchanges) are NAPs that are strategically located to facilitate efficient
transfers between different backbones.
More information about
ISPs and backbones can be found at Boardwatch's informative Web site,
http://boardwatch.internet.com/
-
In the idealized illustration
below, a user would access their local ISP in Doylestown via a modem.
The local ISP links to the regional ISP which, in turn, links to the backbone
of a national ISP. Every computer in this schematic is part of the
Internet (the individual using a modem is only temporarily part of it); this graphically
illustrates that the Internet is a network of networks. For a thorough
comparison of commercial ISPs see CNET's
analysis.
FIGURE LM1-7
Subnetworks of the Internet
and Their ISPs
-
For a
better idea of
the backbones in operation in the U.S. click here.
-
Every device connected to the
Internet has an Internet address that has two forms:
-
The numeric IP address
is used by the computer system and network. It is a four byte number
expressed, for humans, as four decimal numbers separated by periods, such
as "131.118.80.1", the IP address of the DNS
(Domain Name System; see section 3.5.C) server
at FSU. Valid addresses thus range from 0.0.0.0 to
255.255.255.255, a total of about 4.3 billion addresses!
-
The URL (Uniform Resource
Locator) is a more understandable text address, used by humans, that contains
the "name" of the computer that corresponds to its IP address. For
example the URL of this Web page that you are reading contains "www.frostburg.edu"
which is the domain name of the server on which the Web site of
this course is stored. This name must be translated to its IP address
before it can be used by networked computers; this translation is the
job of the DNS server (mentioned above). (Note:
the rest of the text in the URL specifies the protocol (http) used and
the specific location of this page in the computer's files. This
will be covered in section 3.6, below.)
NOTE:
Internet addresses should not be confused with e-mail addresses.
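The two forms of Internet address discussed above can be explored with Python's standard library. This is a small sketch, not part of the course materials: the IP address is the FSU DNS server quoted in the text, while the URL path "/dept/cosc/example.htm" is hypothetical and used only to show how a URL breaks into protocol, domain name, and file location.

```python
import ipaddress
from urllib.parse import urlparse

# A numeric IP address is a four-byte (32-bit) number.
addr = ipaddress.IPv4Address("131.118.80.1")
octets = list(addr.packed)   # the four bytes, as decimal numbers
as_int = int(addr)           # the same address as one 32-bit integer
total = 2 ** 32              # how many distinct 32-bit addresses exist

print(octets, as_int, total)

# A URL carries the protocol, the domain name, and the file location.
url = "http://www.frostburg.edu/dept/cosc/example.htm"  # hypothetical path
parts = urlparse(url)
print(parts.scheme, parts.hostname, parts.path)
```

Note that the sketch does not perform the DNS translation itself; it only separates the domain name that a DNS server would be asked to translate.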
-
A collection of protocols
which
are conventions (rules) that govern the translation of digital data into
and out of "packets" of binary data which can be transmitted over a network,
e.g. the Internet. Protocols govern format, timing, sequencing, and error
control. Without these rules, a computer cannot "understand" a stream of
bits coming to its network connection. The protocols particular to the
Internet are part of TCP/IP
(Transmission
Control Protocol / Internet Protocol) which is actually a collection,
or "suite", of protocols which form the basis of communications over
the Internet. They are routable (i.e.
________(10) switching) protocols, which means transmissions are broken
into packets which may be sent over different routes before arriving at
a single destination where the packets are reassembled into the original
message.
Note
that other network protocols, e.g. NetBIOS (IBM networks), NetBEUI (Microsoft),
IPX (Novell networks), DECNet (DEC), etc., will be ignored in this course
because they are not associated with the Internet.
-
An ever increasing, conceptual
network of Internet resources accessed by Internet services.
(See
section
2.2.) The resources are typical client-server environments.
SAQ
17: What is the (a) similarity and (b) difference between an IP address and
a URL?
2.2
The Internet provides a wide variety of "Services":
Internet services are provided by application programs that
implement protocols that are components of the TCP/IP suite. (NOTE:
Most of these services are not unique to the Internet, e.g. e-mail,
chat, etc. but others are specific to the Internet, e.g. the World Wide
Web.) They fall into three categories:
-
Communication Services.
(For more details see Learning
Module III, section 3.)
-
E-mail enables
individuals
to exchange electronic messages; it is a network facility that provides
users with a "mailbox" file, where messages are stored. Correspondence
can be directed to specific users (with security) as well as to specified
groups. Local mail is sent via the "mailer" program in system software.
Non-local e-mail is routed over a
________(11) such as the Internet.
-
E-mail includes "Talk"
or "Phone" services which, like "chat" (see 2.2.A.d, below), facilitate
real-time, interactive text transfers (not voice) between two Internet
users.
-
SMTP (Simple Mail Transfer
Protocol),
POP
(Post Office Protocol), and IMAP (Internet
Message Access Protocol) are e-mail protocols of the TCP/IP suite.
POP and IMAP are used to retrieve mail from the server, while SMTP is used
to send it; together they make e-mail more user friendly. POP allows users
to download e-mail from a mail server to a PC where it can be read, answered,
and stored on a hard disk. IMAP is even better because it allows
you to manipulate your e-mail account on the server.
-
Note that Web based e-mail accounts, like Yahoo
Mail and FSU's Sun Interface,
use the Web protocol, HTTP, as an interface to their e-mail servers.
-
Newsgroup Services (e.g.
Usenet
or
Internet News) exchange messages called articles arranged according
to specific categories called newsgroups. Here the messages are
passed from one system to another, not between individuals using e-mail.
Unlike mailing lists, these transmissions are not automatic; they must be
requested by the user via local client software.
-
Mailing lists allow users
to subscribe to mass communications on a specified subject. Any e-mail
received by a mailing list server is automatically forwarded to all subscribers.
-
Chat/IM applications facilitate
real-time group communication by enabling users to join rooms or "channels"
where all members receive a copy of a message sent to the channel they
are visiting. (Private conversations can be arranged.)
IRC (Internet
Relay Chat) was the first such application but is limited to text messages.
-
Instant Messaging (IM or
IMing) is a modern extension of chat technology that adds features
like "buddy lists", automatic notification when a buddy comes online, multiperson
conferences, user profiles, filters, message histories, etc. Popular IM
applications include AIM
(AOL IM),
ICQ (for "I seek you"),
Yahoo
Messenger, and Microsoft Network
Messenger Service (MSNMS). A public domain IM is
Jabber.
-
Some chat applications
utilize multimedia to create virtual reality (VR) environments where
users can assume an identity, called an "avatar", which moves through the
chat environment interacting with the avatars of other users.
-
Teleconferencing refers
to real-time
computer-based, audio/video interaction of two or more
remote stations.
Current
chat applications apparently will evolve into full featured teleconferencing
software.
-
Audio communication
became possible using microphones and computer speakers.
-
Graphics communications
allow both users to type or draw on a common "whiteboard" or even
modify an image loaded from a graphics file. Netscape Conference is
Communicator's teleconferencing facility that allows audio and whiteboard
communication.
-
Video communication is
possible using images from digital cameras. The freeware applications Microsoft
NetMeeting (which we will use during this course) and iVisit provide
this between microcomputers. Multimedia transmissions require huge bandwidth
so at present teleconferencing applications and "Video Phones" are rather
primitive, especially if they involve color video transmissions between
microcomputers.
-
A good resource on all types
of Internet conferencing (including chat, IM, etc.) is About
Internet Conferencing.
SAQ
18: What are the similarities and differences between e-mail and voice mail?
SAQ
19: Distinguish between (a) e-mail, (b) mailing lists, and (c) newsgroups.
SAQ
20: (a) What is the difference between chat, on one hand, and e-mail,
Usenet, and mailing lists on the other?
SAQ
21: What is the difference between chat and teleconferencing?
-
Resource access services.
(For
more details see Learning
Module III, section 2.)
-
File Transfer allows
a network user to copy a file from one computer to another. It is typically
used to "download"
public domain (free) software or shareware
(minimal cost paid, on an honor system, after a trial period) which has
been "uploaded" (copied from a user's computer to the file server). FTP
(File Transfer Protocol) is part of the TCP/IP suite. Archie
is FTP's associated search engine; it indexes FTP sites so that the user
can determine what is available. An Archie search scans FTP sites and then
offers a searchable database of the files it finds. These can then be downloaded
via FTP. Archie has lost significance with the growth of the Web, but FTP
is still the vehicle used to move files on the Internet.
-
Remote Logon allows a
computer user to access another (multiuser) computer, i.e. to log on to
and use that computer as if his/her computer were directly connected to
that computer. The user's CPU and operating system are "bypassed" and the
user's computer simply becomes a terminal connected to the remote computer.
The Telnet protocol provides this in TCP/IP.
-
Information retrieval services
unique to the Internet. (For more details see Learning
Module III, section 1.):
-
The World
Wide Web, the focus of this course, is called "THE
Internet Killer Application" because its popularity is literally exploding!
Since 1994 it has not only dominated all other WANs (See the next section.)
but all other services of the Internet, itself. "The Web" enables
users to "browse" documents on remote servers using the HTTP
(hypertext transfer protocol, a member of the TCP/IP suite). Everything
(documents, menus, pictures, etc.) is represented to the user as a hypertext
object (where clicking on the object activates a link to another
object which can be within the document, in another file, or on another
Internet resource).
-
Typically, Web "pages"
are accessed by a "browser" (e.g. Netscape Navigator) that interprets documents written in
HTML
(Hypertext
Markup Language). "Search engines", like Google, and "Search
Directories", like Yahoo, are programs that allow browsers to search
for Web pages with specified key words. Browsers actually provide many
of the other TCP/IP services such as e-mail and FTP, which are usually
built in, and remote logon which is added by "plug-in applications".
-
VRML (Virtual Reality
Modeling Language) is a developing standard that is designed to allow users
to view the Web as a 3D virtual environment. The WWW has been
-
Gopher/Veronica
allows the user to access files on remote servers; the file names are presented
as hierarchical menus. Veronica is a "search engine" which allows
one to look for specific information on gopher servers, but, like Archie,
is insignificant compared to the Web.
-
WAIS
(Wide Area Information System) is an automated Internet search service
that allows users to locate documents containing key words or phrases,
but, like Archie and Gopher/Veronica, has been almost completely superseded
by the Web.
TPQ
5: Think up a comprehensive collection of WITS/DB questions (See examples
at the end of section 2.2.A.) that will help you distinguish Internet
services of sections B and C, above.
2.3
The Internet is Governed by the suite of TCP/IP protocols:
(For more detail,
see LM
IV of COSC 120, an overview of TCP/IP.)
TCP/IP makes it possible for two computers which are part of different
networks, that are connected by routers or gateways, to exchange data.
This complex process involves the collective, cooperative interactions
of several protocols of the TCP/IP suite, depending on the particular service
being used. (An outstanding,
detailed
illustration of the TCP/IP protocols and network services at their
associated OSI levels is available from http://www.whatis.com/osifig.htm.)
In
the following presentation, we begin at the highest level with a
client
sending a message to a server.
-
Application
protocols occupy the highest protocol layers and provide specific
services. Unfortunately the application protocols of the TCP/IP suite
do not fit nicely into one of the OSI layers. The WhatIs diagram
(referenced above) places them in the sixth (presentation) layer, but adds
the caveat that they overlap the adjacent layers. I prefer to simply
place them in the top three layers of the OSI model, i.e. ignore the distinction
in these layers as done in COSC120
LMIV, Figure TCP/IP-1.
-
FTP
(File Transfer Protocol) permits files to be transferred from one computer
to another using a TCP connection. Transferring files from a server to
a client is called ___________(a) and from client to server is called __________(b).
A related but less common file transfer protocol, Trivial File Transfer
Protocol (TFTP), uses UDP rather than TCP to transfer file data.
-
HTTP
(hypertext transfer protocol) facilitates the viewing of multimedia
files (text, graphic images, sound, video, etc.) from the World Wide Web.
The essential feature of HTTP is that it manages files that can contain
hyperlinks to other files whose selection will produce additional transfer
requests. To accomplish this, all Web servers contain an HTTP daemon,
a program that is designed to wait for HTTP requests and handle them when
they arrive.
-
SMTP
(Simple Mail Transfer Protocol) specifies the format of messages that
an e-mail client on one computer can use to send (or receive) electronic
mail to (from) an SMTP server on another computer. Now SMTP is usually
used to send e-mail while POP
(Post Office Protocol)
and IMAP (Internet Message Access Protocol), two other e-mail protocols,
are used to read it. Neither POP nor IMAP sends mail (that remains SMTP's
job), but they make reading e-mail more user friendly.
POP allows users to download e-mail from a mail server to a PC where it
can be read, answered, and stored on a hard disk. IMAP is even better
because it allows you to manipulate your e-mail account on the server.
-
SNMP
(Simple Network Management Protocol) is the protocol governing network
management and the monitoring of network devices and their operation. It
is not necessarily limited to TCP/IP networks.
-
NNTP
(Network News Transfer Protocol) allows
client software, called "newsreaders", to access, read, reply to,
or post messages on Usenet newsgroup servers, the electronic equivalent
of a bulletin board. NNTP
servers, typically provided by ISPs, store the Usenet messages and provide
the software to manage them. NNTP
client software is typically integrated into your browser, but it can
be implemented in a separate
newsreader, which you may prefer to your browser implementation.
NNTP
replaced the original Usenet protocol, UUCP (UNIX-to-UNIX Copy Protocol).
NOTE: NNTP was misleadingly
omitted from the WhatIs diagram, which uses "UseNet" (the service)
instead of this protocol.
-
Telnet
is the TCP/IP protocol for remote logon. Using Telnet, one can log
on to a remote network computer as a regular user with whatever privileges
that have been granted on the host computer. Before the advent of
the Web, Telnet was more frequently used, but now, with Web page "front
ends" to services like e-mail servers, it is not needed. For
example, e-mail users used to have to actually log on to their e-mail server
in order to use their account, but with a Web page front end, they can
access their account via a browser. Therefore, Telnet is now only
needed by users
who want to use specific applications or data stored on a particular host
computer.
The WhatIs diagram includes two
services (DNS and NFS, which are not, themselves, protocols) in the same
level as the preceding protocols. Do not let this confuse you; all
protocols, except Telnet, end in "P".
-
Other
emerging Internet protocols include:
-
WAP
(Wireless Application Protocol) is actually a family
of protocols, developed by Ericsson, Motorola, Nokia, and Unwired Planet,
that standardize communications between wireless devices, e.g. cellular
telephones, PDAs (personal digital assistants), etc. WAP facilitates
Internet access, including e-mail, the World Wide Web, newsgroups, IRC,
etc., on wireless devices. The family of WAP protocols includes:
-
Wireless Application Environment (WAE)
-
Wireless Session Protocol (WSP)
-
Wireless Transport Layer Security (WTLS)
-
Wireless Transaction Protocol (WTP)
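Returning to HTTP from the list of application protocols above: the "transfer requests" that a browser sends to a Web server's HTTP daemon are plain text. The Python sketch below builds such a request and reads the first line of a reply, without touching the network. The function names, the path "/index.htm", and the sample response are hypothetical and serve only to show the general shape of an HTTP/1.0 exchange.

```python
def build_get_request(host, path):
    """Return the text a client would send to ask a server for one file."""
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"  # a blank line marks the end of the request headers
    )


def parse_status_line(response_text):
    """Split the first line of a response into version, code, and reason."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("www.frostburg.edu", "/index.htm")  # hypothetical path

# A made-up reply of the kind an HTTP daemon might return.
sample_response = (
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>...</html>"
)
status = parse_status_line(sample_response)
print(request)
print(status)
```

Each hyperlink selected in a page would cause the browser to build and send another request of this kind.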
SAQ
22: What are the applications within Netscape Communicator suite that implement
a particular protocol?
-
TCP
(Transmission Control Protocol) and UDP (User Datagram Protocol)
facilitate the transmission of data streams (e.g. a complete e-mail message)
between applications running on different hosts. They are transport
protocols (TCP is connection-oriented; UDP is connectionless) that manage the
exchange between sender and receiver without reference
to the network path between them (that is the job of _______(12)).
-
TCP
is a "reliable" protocol because it guarantees reliable delivery of
the complete transmission by performing the error checking and handshaking
necessary to verify that data makes it to its destination intact.
-
TCP divides
data streams into blocks called TCP segments and transmits them
using IP. In most cases, each TCP segment is sent in a single IP datagram.
If necessary, however, TCP will split segments into multiple IP datagrams
that are compatible with the physical data frames that carry bits and bytes
between hosts on a network. Because IP doesn't guarantee that datagrams
will be received in the same order in which they were sent, TCP reassembles
TCP segments at the other end to form an uninterrupted data stream. FTP
and telnet are two examples of popular TCP/IP applications that rely on
TCP.
-
TCP sets
up a connection at both ends of a transmission and uses checksums to verify
the data integrity and handshaking. It also manages the division
of the message into uniform packets. These packets are independent
and may be sent via different paths through a network; when they are received
by the TCP layer of the receiving computer it reassembles the packets into
the original message.
-
With TCP,
data
is transmitted in packets called TCP segments, which contain TCP
headers and data from a higher level application.
-
UDP
is an "unreliable" protocol because it doesn't guarantee that UDP packets
will arrive in the order in which they were sent or even that they will
arrive at all. If reliability is desired, it's up to the application to
provide it.
-
UDP
is a simpler alternative to TCP: similar to, but more primitive
than, TCP. However, UDP does have a place in the TCP/IP suite,
and a number of applications use it, e.g. SNMP (Simple Network Management
Protocol) applications which are provided with most implementations of
TCP/IP.
-
Unlike
TCP, UDP does not divide its data packets nor does it provide
sequencing of packets. This means that the application program that uses
UDP must be able to make sure that the entire transmission has arrived
and is in the right order.
-
Network
applications, like streaming audio or video, prefer UDP because
TCP's error checking and retransmission would interrupt the real-time continuous
flow that streaming technologies require. Also applications that need to
save processing time because they have very small data units to exchange
(and therefore very little message reassembling to do) may prefer UDP to
TCP.
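The checksums that TCP uses to verify data integrity, mentioned above, are based on a 16-bit ones' complement sum (described in RFC 1071). The Python sketch below is a simplified illustration of that calculation: the function name is hypothetical, and a real TCP checksum also covers a pseudo-header containing the IP addresses, which is omitted here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word

    # Fold any carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


payload = b"a complete e-mail message"
sent_checksum = internet_checksum(payload)

# The receiver recomputes the checksum and compares it with the one sent.
intact = internet_checksum(payload) == sent_checksum
damaged = bytes([payload[0] ^ 0x01]) + payload[1:]  # flip one bit in transit
detected = internet_checksum(damaged) != sent_checksum
print(intact, detected)
```

When the comparison fails, a reliable protocol like TCP arranges retransmission; with UDP, any such recovery is left to the application.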
-
IP
(Internet
Protocol), a lower-level protocol than TCP or UDP, governs the transmission
of data packets throughout a computer network.
-
IP is
responsible for packet routing, i.e. selecting the path that data
packets (called IP datagrams) will follow to efficiently
reach their destination. This involves utilizing routers to "hop"
between different networks, i.e. separate networks are tied together by
the routers thus forming the Internet or an intranet.
-
IP manages
the address part of each IP datagram, ensuring that it is sent to
the correct destination. Each gateway or router the packet traverses checks
this address and forwards the message along the most efficient route.
Connections in a TCP/IP network are specified by 32-bit IP addresses,
which are represented, for humans, as dotted decimal numbers, expressed
as four decimal numbers separated by periods. Valid addresses thus
range from 0.0.0.0 to 255.255.255.255, a total of about 4.3 billion addresses.
(For example, Tony's Office Mac is 131.118.83.3 and PC is 131.118.74.21).
-
IP could
be called "the most fundamental of the TCP/IP protocols" because every
other protocol depends on it; it is the foundation of the TCP/IP stack
(of protocols).
-
Other
network layer protocols, that play less visible but equally important roles
in TCP/IP networks, include:
-
ARP
(Address Resolution Protocol): A protocol for converting an IP address
to the actual address of the computer that is recognized in the local network.
For example, if the computer is on an Ethernet LAN, the 32-bit IP address
must be converted to a 48-bit Ethernet address. (The physical machine address
is also known as a Media Access Control or MAC address.) A table,
usually called the ARP cache, is used to maintain an association between
each MAC address and its corresponding IP address. ARP provides the protocol
rules for making this connection and providing address conversion in both
directions.
-
RARP
(Reverse Address Resolution Protocol): It converts physical network
addresses into IP addresses, i.e. it is the reverse of ________(13).
-
ICMP
(Internet Control Message Protocol) is an extension to the Internet
Protocol (IP) that allows for the generation of error messages, test packets
and informational messages related to IP. ICMP is a "support protocol"
that uses IP to communicate control and error information regarding IP
packet transmissions. It allows IP routers to send error and control
messages to other IP routers and hosts. If a router is unable to forward
an IP datagram, for example, it uses ICMP to inform the sender that there's
a problem. ICMP messages travel in the data fields of IP datagrams and
are a required part of all IP implementations.
-
A rather
advanced tutorial on IP addresses and routing is found at http://www.sangoma.com/fguide.htm.
(There is no need to read this unless you really want to know what all
the numbers of an IP address mean.)
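The ARP cache described above is, in essence, a lookup table. The Python sketch below models it as a dictionary so that the two directions of conversion (ARP: IP address to MAC address; RARP: MAC address to IP address) can be seen side by side. The class and method names are hypothetical, the IP addresses are the two office machines quoted earlier in the text, and the MAC addresses are invented for the example; a real ARP implementation fills its table by broadcasting queries on the local network.

```python
class ArpCache:
    """Toy table associating IP addresses with MAC (physical) addresses."""

    def __init__(self):
        self._ip_to_mac = {}

    def learn(self, ip, mac):
        # Record an association, normalizing the MAC address to lowercase.
        self._ip_to_mac[ip] = mac.lower()

    def resolve(self, ip):
        """ARP direction: IP address -> MAC address (None if unknown)."""
        return self._ip_to_mac.get(ip)

    def reverse(self, mac):
        """RARP direction: MAC address -> IP address (None if unknown)."""
        for ip, known_mac in self._ip_to_mac.items():
            if known_mac == mac.lower():
                return ip
        return None


cache = ArpCache()
cache.learn("131.118.83.3", "00:00:5E:00:53:01")   # invented MAC address
cache.learn("131.118.74.21", "00:00:5E:00:53:02")  # invented MAC address

print(cache.resolve("131.118.83.3"))
print(cache.reverse("00:00:5e:00:53:02"))
```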
SAQ
23 : What are the significant (a) similarities and (b) differences between
TCP and UDP?
-
SLIP
and PPP are two protocols that allow two computers to communicate
via a serial connection (in which bits are transmitted sequentially),
thus they correspond to the OSI layer 2. Both transmit packets over
serial links (either dedicated or dial up lines). They are most commonly
used to allow modem/telephone connections to the Internet via an ISP but
they can also be used to provide dial-up access between any two networks.
For example, an ISP may provide users with SLIP or PPP access to its server, which
gives Internet access as long as the dial-up connection is maintained.
However, a modem connection to the server via a serial line is typically
slower than the parallel or multiplex lines (such as a T-1 line) of any
network that is used to access the Internet directly.
-
SLIP
(the older of the two protocols) was
invented to be used for communication between
two computers that can be
previously configured for communication with each other. Basically
it encapsulates TCP/IP packets with headers and trailers, thus allowing
them, for example, to be sent via a modem/POTS to your ISP.
-
PPP (Point-to-Point
Protocol) provides a similar facility to SLIP, but, being more sophisticated,
has largely replaced the older protocol. PPP works with IP, but is
designed to manage other protocols as well.
Therefore,
it is not necessarily part of the TCP/IP suite but is usually considered
to be so.
-
PPP is
a full duplex protocol that can be utilized with various kinds of
media, including twisted pair, fiber optic lines, or satellite links.
-
The advantages
of PPP over SLIP include the facts that PPP:
-
can establish and terminate
a communication session as well as hang up and redial if a low quality
channel occurs.
-
can manage
both synchronous and asynchronous communications,
-
can share
a communications channel with other protocols,
-
provides
address notification, via which a server informs a dial-up client of its
IP address for the current session, and
-
has
built-in error detection.
Connected:
An Internet Encyclopedia has a more detailed (but still concise) description
of PPP at
http://cth.ccsl.com.np/CIE/Topics/65.htm.
There
are no TCP/IP protocols that correspond to the OSI layer 1.
The TCP/IP suite must use separate layer 1 protocols such as ISDN, ADSL,
ATM, etc. to provide the actual connection to the physical medium over
which the message is to be transmitted.
SAQ
24: What are the most commonly used TCP/IP protocols?
2.4
THE TCP/IP TRANSMISSION SEQUENCE (TCP/IP ARCHITECTURE):
-
FIGURE
TCP/IP-1 illustrates TCP/IP's layered design, showing the
relationships among its most important protocols. FIGURE
TCP/IP-3 illustrates how data, in preparation for transmission,
is encapsulated at each TCP/IP layer with "headers" and "trailers" and,
after reception, how these are stripped off, interpreted, and acted upon
in the receiving computer.
-
FIGURE
TCP/IP-3 shows that, as a unit of data "flows downward" (a figure
of speech) from a client application to the network interface card, it
is
encapsulated at each of a succession of TCP/IP layers until it
forms a "packet" that can be successfully routed over the internet to its
destination.
-
At each
layer, it is encapsulated with layer data required by the equivalent
TCP/IP layer of the receiver computer.
-
If the
network being used is Ethernet, the Ethernet card creates a standard Ethernet
frame that encapsulates the data unit and its TCP and IP headers.
-
The operations
of the layers of the destination computer on the Ethernet frame
are the reverse of those of the sender. The data link layer strips
off the Ethernet headers and trailers and passes the IP datagram to the
IP layer; it is passed up, with headers removed and interpreted, until
the original data is supplied to the receiving application, which can then
process it.
-
Example:
To
illustrate the process of sending a transmission via TCP/IP consider a
Web
transmission, i.e. a Web browser (the client) uses HTTP to request
the download of a Web page (HTML data) from a Web server attached to the
Internet.
-
The browser
first creates a virtual connection (called a "socket") to the server
where the Web page is stored.
-
To download
a Web page, the client sends an HTTP GET command (a sequence of
bits) to the server by writing the command to the socket. Figure
TCP/IP-4
shows that:
-
the socket
software uses TCP to add a header to the GET command thus forming a TCP
segment and
-
the segment
is "passed" to the IP module, which in turn adds its header forming an
IP
datagram
-
the datagram
is then "passed" on to the data link layer of the particular network (e.g.
Ethernet) which ultimately encapsulates the datagram with a header and
trailer forming a frame
-
the frame
is finally forwarded, over the network, to the Web server.
-
If the
browser and the Web server are running on computers connected to different
physical networks (as is usually the case), the set of frames that
make up the whole message go from network to network until they reach the
one to which the server is physically connected. The different frames
can follow different routes over the network. Ultimately, the
frames are delivered to their destination and reassembled so that the Web
server, which reads chunks of data by performing reads on its socket, sees
a continuous stream of data.
-
To
the browser and the server, data written to the socket at one end shows
up at the other end, as if by magic. However, underneath, all sorts
of complex interactions have taken place to create an illusion of seamless
data transfer across networks.
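The encapsulation sequence described above can be sketched in a few lines of Python. This is only a toy model: the bracketed labels stand in for the real TCP, IP, and Ethernet headers and trailers, and the function names are my own inventions for illustration, not part of any TCP/IP software.

```python
# Toy model of TCP/IP encapsulation. The "headers" and "trailer" below are
# placeholder labels (assumptions), not real protocol formats.
TCP_HEADER = b"[TCP]"
IP_HEADER = b"[IP]"
ETH_HEADER, ETH_TRAILER = b"[ETH]", b"[/ETH]"


def encapsulate(data: bytes) -> bytes:
    """Wrap application data as the sender's layers do, from the top down."""
    segment = TCP_HEADER + data                  # TCP header -> segment
    datagram = IP_HEADER + segment               # IP header -> datagram
    frame = ETH_HEADER + datagram + ETH_TRAILER  # header and trailer -> frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Strip the wrappers in reverse order, as the receiver's layers do."""
    datagram = frame[len(ETH_HEADER):-len(ETH_TRAILER)]
    segment = datagram[len(IP_HEADER):]
    return segment[len(TCP_HEADER):]


request = b"GET /index.html HTTP/1.0\r\n\r\n"  # hypothetical HTTP GET command
frame = encapsulate(request)
print(frame)
print(decapsulate(frame) == request)
```

Note that each layer only adds or removes its own wrapper and treats everything inside it as opaque data; that is what lets the layers be designed independently.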
SAQ
25: List, in sequence, the TCP/IP headers and trailers that are added to
an e-mail message.
SAQ
26: In FIGURE
TCP/IP-3,
an HTTP header corresponds to what?
2.5
USING TCP/IP:
-
The TCP/IP
software on a computer provides platform-specific implementations of
TCP, IP, and other members of the TCP/IP suite. Modern PC operating
systems have TCP/IP applications bundled within the O.S.; older operating systems
like Windows 3.1/DOS required that TCP/IP software be installed before
Internet connections could be established.
-
Modern
software bundles all the TCP/IP protocols in a "TCP/IP stack"; this
term reflects the hierarchy of these integrated protocols. The application
layer protocols include (but are not limited to) the World Wide Web's Hypertext
Transfer Protocol (HTTP), the File Transfer Protocol (FTP), Telnet,
and the Simple Mail Transfer Protocol (SMTP).
-
When you
are given access to the Internet (e.g. by your ISP) you will be provided with
software that incorporates TCP/IP applications. Every other computer
on the Internet (or on corporate intranets or extranets) has a similar TCP/IP
stack, although the stacks may come from different companies. The operations
of this stack of programs are completely invisible to the user. In
other words TCP/IP, as far as the user is concerned, simply turns innumerable
small, unknown networks into one big one (the Internet or an intranet)
and provides all the services needed for applications to communicate with
each other over that network.
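To make the idea of an invisible stack concrete, here is a minimal sketch in which a client and a throwaway server, both on the same machine (the loopback address 127.0.0.1), exchange bytes through sockets. The function names and the reply text are assumptions chosen for illustration; the program only writes to and reads from sockets, and the operating system's TCP/IP stack does everything else.

```python
import socket
import threading


def read_all(sock: socket.socket) -> bytes:
    """Read from a socket until the other end signals it has finished sending."""
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read the request, and send back a fixed reply."""
    conn, _addr = listener.accept()
    with conn:
        request = read_all(conn)
        conn.sendall(b"server saw: " + request)


def loopback_exchange(message: bytes) -> bytes:
    """Send message to a temporary local server and return its reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the O.S. pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server the request is complete
            reply = read_all(client)
    finally:
        worker.join()
        listener.close()
    return reply


if __name__ == "__main__":
    print(loopback_exchange(b"GET / HTTP/1.0\r\n\r\n"))
```

Nothing in this code mentions segments, datagrams, or frames; as far as the two programs are concerned, data written at one end simply appears at the other.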
3.
THE WORLD WIDE WEB:
In section 2.2, we specified three "information retrieval services",_____________________
(14), _______(15), and _______(16) that are unique to the Internet.
The latter two are no longer important because their sites have by now been almost
completely replaced by equivalent Web sites. Therefore information
presentation and retrieval, for the foreseeable future, will be centered
on the Web; COSC 120 is mainly based on the search and retrieval aspects
whereas COSC 330 focuses on the presentation aspect.
3.1
The Web Concept:
-
The World Wide Web (Web, WWW,
or W3) is a distributed, hypermedia information retrieval system.
It is not an application or protocol like Telnet, FTP and Gopher (HTTP
is the protocol of the Web). Instead, it is an invisible network (or web)
within the larger network of the Internet. It can be thought of in at least
two ways:
-
as a network of computers, i.e.
a subnet of the Internet whose protocol is ______(17) and
-
as a web of documents, i.e.
a distributed "virtual database" of multimedia documents, written
in ______(18), whose content is accessed by hyperlinks.
-
The nonlinear nature of documents
accessed by hyperlinks puts the "web" into the Web. (See
FIGURE
LM1-8.) A location (text phrase
or graphic) in any document can be linked to
-
another location within the
same
HTML document, i.e. a "target" in the same HTML file.
-
another document on the same
computer (typically, but not necessarily, another HTML document (file)),
or
-
another document on another
computer (________(19) server) on the Internet.
All these documents are accessed
by a client program, called a __________(20).
FIGURE LM1-8: Hypertext vs. Normal text
3.2
History of the World Wide Web:
-
The concept of the Web is attributed
to Tim Berners-Lee of CERN, the European Laboratory for Particle Physics
in Geneva, Switzerland, who first proposed it in 1989; CERN developed the
first WWW prototype in 1990. (See the streaming
multimedia interview on ZDTV's "Big Thinkers".) In the document
About
the World Wide Web, he wrote about his vision of the Web, "the universe
of network-accessible information, an embodiment of human knowledge." You
can access that document at
http://www.w3.org/hypertext/WWW/WWW
Berners-Lee wanted a single
means of access (one client) to the diverse services of the Internet
(See FIGURE LM1-7.)
FIGURE LM1-9: Web Access to Net Services
-
To overcome problems of incompatibility
between different sorts of computers, the WWW introduced the principle
of "universal readership," which states that networked information
should be accessible from any type of computer in any country, with one
easy-to-use program.
-
The first Web documents were
only hypertext, and thus not so inspiring as the multimedia documents
that make up the Web of today. The first multimedia browser, Mosaic,
was developed by Marc Andreessen, Eric Bina, and others at the National
Center for Supercomputing Applications (NCSA) at the University of Illinois.
However, it was not until Andreessen left NCSA, co-founded Netscape Communications,
and developed the browser, __________ __________(21) that the popularity
of the Web really exploded.
3.3
Advantages of the Web:
-
The Web facilitates multiple
protocol support. (See FIGURE LM1-9.) To access any Internet
service, all one needs to do is type the URL type (associated protocol
or keyword) followed by the domain name (file location), e.g.
http://www.fsu.umd.edu/<path
to some HTML File>
accesses an unspecified Web
page on FSU’s web server; the http designates the URL type. (Sometimes,
as in the case of http, this is the same as the protocol.) The
www.fsu.umd.edu identifies the server and <path to some HTML File> is
a generic symbol for a sequence of directory names followed by a specific
file name.
SAQ
27: Give the equivalent of <path to some HTML File> for this page you
are reading.
Other URL
types include ftp, telnet, mailto, news, gopher, wais, etc.;
when they are typed into a browser, it invokes the associated protocol
and accesses that Internet service.
-
The Web is designed to provide
access to distributed, dynamic, and platform independent information.
-
A distributed system
is one in which computer resources are distributed throughout a communications
network. Each of the networked computers is designed to handle its
local workload but has access to all the resources of the network.
The network itself supports the system as a whole, based on the
client-server model. This is the opposite of a centralized multi-user
computer like a mainframe. The amount of information which can be stored
on the Internet is limited only by the number of computers and their collective
storage space. Thus the Net effectively has an infinite storage capacity!
-
The content of the Net is constantly
changing and evolving. This dynamic nature of the Internet means
that users have access to the most up-to-date information possible,
like a living, unlimited, multimedia encyclopedia. The disadvantage of
dynamic information is that it can disappear if the network connection
is blocked or the file is moved (or removed) from its server; the resulting
"dead links" are the bane of the Web user!
-
What makes the Web so radically
different from other computer facilities is that it is "platform
independent", i.e. it can be accessed from any kind of computer
and
any
operating system. All one needs is a browser designed for the operating
system you use; the browser GUI is thus the same on all computers. The
Web documents are written in HTML, a platform independent language,
which means they can be stored on and accessed from any kind of computer
system, as long as it implements TCP/IP.
-
Unlike most Internet services,
access to Web information is user friendly in that it is interactive
and easily explored.
-
What makes the Web so interactive
is its ability to accept information from users and perform various
actions based on these responses. This is accomplished by using various
techniques including:
-
forms, special Web
pages that include text fields, check boxes, radio buttons, menus, and
popup lists that give the user the ability to interact with the Web server.
(See the text, Chapter 12.)
-
JavaScript
-
Java
-
proprietary technologies such
as
-
Macromedia Flash (See
the text, Chapter 21.)
-
(See
the
text, Chapter 21.)
-
Web access is based on hypertext
which allows hyperlinks to be embedded in text; this has been extended,
in "hypermedia", to embed hyperlinks in graphic images as well.
It is now possible to move between Net documents by pointing and clicking,
without needing to know the physical name of the file or even the address
of the computer on which it is stored.
-
The Web facilitates nonlinear
access thus providing user control over the sequencing of information
retrieval. HTML makes it possible to embed hyperlinks into the
text, thus creating "hypertext", i.e. text that also is linked to
other text so that the information sequence depends on choices of the user.
The hyperlinks can use different protocols making it possible to
access documents with various Internet protocols. Thus the browser
concept integrated the use of all Internet protocols into one client.
3.4
Basic Web Concepts:
-
Web information is normally
contained in HTML documents. HTML (Hypertext Markup Language)
allows one to "program" a document by describing its layout, contents,
and hyperlinks with "style tags" embedded in text files. At first, HTML
documents were created using pure ASCII text processors; the style tags
were typed in along with the regular text. Now, sophisticated
HTML editors
(e.g.
Macromedia Dreamweaver, Microsoft FrontPage, and Netscape Composer, part
of the Netscape Communicator suite) can generate HTML using WYSIWYG GUIs.
-
An HTML document is any
text document written in the prescribed HTML format with embedded tags.
-
A Web page is an HTML
document that is made available, by a Web server, for access via the Internet.
-
A home page is the default
starting point or organizational center for any collection of Web pages.
It typically has the name index.htm
or index.html and is
opened automatically by the Web server when a Web site (See next.) is accessed.
-
A Web site is an integrated
collection of Web pages which is normally collected in a single directory
(folder) called a Web account.
-
A hyperlink is text (hypertext)
or an image (hypergraphic) that is distinguishable as a link to another
location in the same document or to another HTML document. The browser
is designed to detect when the user clicks the mouse on a hyperlink; it
then locates the destination and downloads it into the browser. The convention
for designating hypertext is the underline, so underlines should not be
used in hypermedia documents for other reasons. There are three basic types
of hyperlinks:
-
absolute links are used
to access a different Web page and thus must give the absolute URL, i.e.
the complete URL, of that document. These typically lead to the beginning
of the Web page unless they contain a target. (See the next item.)
-
target links point to
a named "target" placed within a Web page; when incorporated into
an absolute link, this allows a link to go to any point within any Web
page.
-
relative links are pointers
to another file relative to the location of the current file, i.e. the
document where the link originates.
-
When a link is created between
documents within a Web site it is a relative link because it is
specified relative to the document in which the link originates. This is
typically done, in an HTML authoring program, by selecting a link button
and browsing to the HTML document to which the link is made.
-
When
an HTML document is published, the relative links still work as long as
the organization of the files on the server is the same as that on the
computer where they were created. Therefore, all developers
need to do is make their links work on their local storage, then, if
the file structure is uploaded intact (i.e. it is a perfect reproduction
of the original file structure) the relative links between all files on
the server will work.
Illustrations of these different
link types can be found in this document. Investigate the source
of this page (You will have to open the page in its own window and then
select Page Source
from the View menu.)
and you will find examples of all the above links. Look for the tags
that begin with <a href=; these are all hyperlinks. If what follows
the equal sign is (1) a complete URL, it is an absolute link, (2) # followed
by text, it is a target link, or (3) a path name followed by a file name,
it is a relative link.
-
Bookmarks (sometimes
called "hot links") are links that are saved in an HTML file so they
can be retrieved and traversed in the future.
SAQ
28: What is the difference between a relative link and an absolute link
in an HTML document?
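The three link types described above can be told apart mechanically, which is roughly what a browser does with the value of each href. The following is a minimal sketch using Python's standard urllib.parse module; the function name link_type and the example.edu addresses are assumptions made up for illustration.

```python
from urllib.parse import urljoin, urlsplit


def link_type(href: str) -> str:
    """Classify an href value as a target, absolute, or relative link."""
    if href.startswith("#"):
        return "target"    # a named location in the same document
    if urlsplit(href).scheme:
        return "absolute"  # starts with a URL type such as http:
    return "relative"      # a path interpreted relative to the current file


# Hypothetical address of the page in which the links appear.
current_page = "http://www.example.edu/course/lm1/index.htm"

for href in ["http://www.w3.org/", "#saq28", "lm2.htm", "../images/logo.gif"]:
    # urljoin shows the complete URL the browser would actually request.
    print(href, "->", link_type(href), "->", urljoin(current_page, href))
```

Because a relative link is resolved against the location of the current file, the same href keeps working after publishing as long as the directory structure is uploaded intact.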
-
Hypertext Transfer Protocol
(HTTP) is a member of the TCP/IP protocol suite that defines how to
identify, send, and retrieve Web documents.
-
A
browser
is ________(22) software for viewing HTML documents and navigating hyperlinks
to other documents, not necessarily on the Web.
-
Plug-ins
and Helper applications are programs that can be used by a browser
to overcome its inadequacies.
-
Plug-ins typically are
software components that are added to the browser itself.
For example, if a browser does not support the format of an image or sound
file (See Embedded Files in the next section.) that is embedded in an HTML
file, the browser may use a plug-in specifically designed to view
that type of image or play that sound. A popular example is Real
Player, which allows one to access streaming multimedia. Although
browsers typically come bundled with some plug-ins, they usually have to
be downloaded and installed in the browser. Modern browsers will
prompt the user when a plug-in is needed and will even automatically access
the server where that plug-in can be downloaded.
-
Helper applications are
separate, stand-alone programs that perform a task the browser
cannot. (These are not as prevalent now that browsers come with more built-in
facilities.) Helper applications are typically used when a browser
does not support a particular communications protocol. In this case
an application that provides that service can be executed by the browser.
For example, telnet access was not built into Netscape Navigator
3.0 so a separate telnet application, available on the same computer, had
to be run by the browser. Usually the user specifies, in the browser
preferences, the particular application to be used in a particular situation.
-
Embedded Files: In addition
to text, HTML documents can contain links to graphic images,
video clips, and sounds. These elements are stored in separate files
(not necessarily on the same server as the original HTML file) called
MIME (Multipurpose Internet Mail Extensions) files. (See section
4.3.)
When the HTML document is displayed by a browser, the browser shows those
elements that it can handle and passes off (to plug-ins or helper applications)
those that it cannot. There are numerous MIME file formats discussed
in LM V, but the most common are:
-
Of the image files GIF
(a simple format used for basic pictures) is the most common, but the newer
JPEG
(a compressed format that stores high quality images in relatively small
files) is used for information rich images.
-
MPEG is a motion image
format for displaying images and sound.
-
AU and WAV are
digital audio file formats for playing sounds.
SAQ
29: "Embedded" files is a misleading term when used to describe HTML
documents! Why?
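The decision a browser makes for each linked file (display it directly or pass it off) can be sketched with Python's standard mimetypes module, which maps filename extensions to MIME types. The set of "natively handled" types and the function names below are assumptions for illustration only; real browsers differ in what they support.

```python
import mimetypes

# Assumption: types a typical browser displays without outside help.
BROWSER_NATIVE = {"text/html", "image/gif", "image/jpeg"}


def mime_type(filename: str):
    """Guess a file's MIME type from its extension (None if unknown)."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed


def handler_for(filename: str) -> str:
    """Say whether the browser or a plug-in/helper application handles the file."""
    if mime_type(filename) in BROWSER_NATIVE:
        return "browser"
    return "plug-in or helper application"


# Hypothetical file names.
for name in ["index.html", "logo.gif", "photo.jpeg", "clip.mpeg"]:
    print(name, mime_type(name), handler_for(name))
```

A Web server normally reports the MIME type itself in the HTTP headers; guessing from the extension, as done here, is only the fallback.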
-
"Push technology" is
a way of automatically delivering Web pages to a browser without
the user selecting them. Instead some program, called an "agent",
selects the page, usually based on preferences pre-specified by the user.
Push is the opposite of "Pull", the normal Web access, in which
users select a page by actually clicking a hyperlink. This technology,
pioneered by Pointcast Network, blends the Web with TV (which automatically
delivers content to the user). Push was hyped as a way of providing
an intelligent software "adviser" (the agent) that would recommend
Web pages to the user thus reducing the need to search through an overwhelming
number of Web sites to find pages of interest. However, some consider
it an invasion of privacy.
3.5
The WWW as a Subnet of the Internet:
-
The WWW is the Network of Web
Servers, Accessible by
-
WWW Clients
Access Internet Resources via URLs.
-
URLs (Uniform Resource Locators)
are the addressing system of the WWW. This system was developed to
allow browsers to access any information currently available on the Net
(provided by Gopher and WAIS, in addition to _____(23)); in fact, it was
designed to incorporate future developments in Internet technology as well.
-
A URL is the Internet-wide address
of any document you can read with a WWW client, i.e. a _________(24).
A URL can describe any file on the Internet, even though different files
may require different protocols to access them.
-
The URL (1) instructs the client
program how to contact the server, (2) tells the server to transfer the
designated document to the computer on which the client resides, where
(3) the client displays the document. All of these activities require just
one action from the user: typing the URL or clicking on a link.
-
A URL can have, at most,
five distinct parts.
-
The left-most part of a URL
is the URL type or protocol prefix used to access the Internet
address. The types recognized by a browser include:
-
http:// which designates
HTTP and accesses Web sites. (This is the browser "default"
so if the prefix is not typed, the browser will assume http and automatically
insert it in front of the URL.) https:// designates a Web
document on a secure server.
-
ftp:// which designates
file transfer protocol used to upload and download files via TCP/IP.
-
telnet:// which designates
the telnet protocol used to log on to a remote computer or run applications
on a network server. (rlogin:// and tn3270 are infrequently
used alternatives to telnet.)
-
wais:// which designates
Wide Area Information Server, an infrequently used information service.
-
gopher:// which designates
a Gopher server, another information service that is virtually obsolete
now.
-
news: which opens the
newsreader client associated with the browser and accesses a Usenet newsgroup.
snews:
accesses a newsgroup at a secure news server.
-
mailto: which opens the
e-mail client associated with the browser so that e-mail can be read or
sent.
-
file:/// which opens
a file on the local computer system.
Note that the part after the
colon is interpreted according to the access scheme. In general, two slashes
after the colon introduce a host name (host:port is also valid, or for
FTP user:passwd@host or user@host). The port number is usually omitted
and defaults to the standard port for the scheme, e.g. port 80 for HTTP.
-
The domain name of the
server (or ______(25) name) on which the Internet document resides. (See
section
C below.) This ends with a slash, followed by . . .
-
the directory path or
sequence
of directories (or folders) separated by slashes which precede . .
.
-
the file name of the
document to be accessed (which is not always required). The file can contain
any type of data, but only certain types are interpreted directly by most
browsers. These include HTML and images in GIF or JPEG format.
The file's type is given by a MIME type (See
section
4.3, below) in the HTTP headers returned by the server, e.g. "text/html",
"image/gif", and is usually also indicated by its filename extension. A
file whose type is not recognized directly by the browser may be passed
to an external "viewer" application, e.g. a sound player.
-
The last (optional) part of
the URL may be either
-
a "target" preceded by "#";
this indicates a particular position within the specified document, or
-
a query string preceded
by "?" which activates a CGI script and allows the user to enter a query.
(You can see an example of a query string if you access FOLDOC
and type in a term to look up; e.g. if you type in "FTP" you will see the
query string ?query=FTP&action=Search at
the end of the URL displayed in the Location box when the answer appears.)
Only alphanumerics, reserved
characters (:/?#"<>%+) used for their reserved purposes, and "$", "-",
"_", ".", "&", "+" are safe and may be transmitted
unencoded. Other characters are encoded as a "%" followed by two
hexadecimal digits.
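The parts of a URL listed above can be pulled apart with Python's standard urllib.parse module. This is a sketch under stated assumptions: the example.edu address, the url_parts function, and the small table of default ports are mine, chosen for illustration; only urlsplit and quote are real library calls.

```python
from urllib.parse import quote, urlsplit

# Standard default ports for a few URL types (used when no port is given).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def url_parts(url: str) -> dict:
    """Split a URL into the parts described in this section."""
    parts = urlsplit(url)
    directory, _slash, filename = parts.path.rpartition("/")
    return {
        "type": parts.scheme,            # URL type or protocol prefix
        "host": parts.hostname,          # domain name of the server
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "directory": directory + "/",    # directory path
        "file": filename,                # file name (may be empty)
        "query": parts.query,            # text after "?"
        "target": parts.fragment,        # text after "#"
    }


# Hypothetical URL with a query string like the FOLDOC example.
example = "http://www.example.edu/dept/cosc/page.htm?query=FTP&action=Search"
for name, value in url_parts(example).items():
    print(name, "=", value)

# Characters that are not safe are sent as "%" plus two hexadecimal digits.
print(quote("my file.htm"))
```

Running url_parts on the URL in SAQ 31 is a reasonable way to check your answer after you have worked it out by hand.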
SAQ
30: Which URL types are not written as protocols, e.g. "http"?
SAQ
31: Identify the parts of the URL, http://www.frostburg.edu/dept/cosc/htracy/cosc120/MODULES120/servicesIR.htm.
SAQ
32: The sequence of directories and file name, when taken together are
called what?
SAQ
33: Give analogies between similar parts of a street address and a Web
address?
-
The
Domain Name System (DNS) is a way of associating arcane numeric IP
addresses with more memorable "domain names" used in URLs. The Internet
Protocol (the "IP" in TCP/IP) uses Internet address information to access
every node (client, server, printer, etc.) on the Internet. Every IP address
is a series of four integers separated by periods (called "dots"), for
example, 131.118.95.254, the unique address of the FSU gateway (to the
UMS network).
-
There are two big problems with
IP addresses. (1) It is difficult to remember pure numeric addresses and
(2) sometimes these IP addresses change. To solve these problems the DNS
was designed to handle the addresses of Internet nodes.
-
The DNS establishes a hierarchy
of domains (groups of nodes on the Internet). The domain at the
top level of the hierarchy maintains a database of addresses of the subdomains
beneath it. Each subdomain has similar responsibilities for its subdomains,
and so on. For example, the domain name of one of the administrative computers
at FSU is fra00.fsu.umd.edu;
the top domain is edu, which stands for _________(26);
just below that is umd which stands for _____________(27); below
that is the fsu domain; fra00 is the ________________(28).
-
Top-level domains (TLD)
specify the general category of the domain. Until 1998 TLD names
were restricted to:
-
gov for Government agencies
-
edu for Educational institutions
-
org for Organizations
(nonprofit)
-
mil for Military
-
com for commercial business
-
net for Network organizations
-
country abbreviations e.g. uk
for Great Britain, de for Germany, etc.
The limitations resulting from
these restricted categories were removed in 1998 when the Internet Ad Hoc
Committee (IAHC) proposed six new top-level domains (however,
I have yet to see any of these and haven't heard any discussion for a long
time!):
-
store for merchants
-
web for parties emphasizing
Web activities
-
arts for arts and cultural-oriented
entities
-
rec for recreation/entertainment
sources
-
info for information
services
-
nom for individuals
-
The easily recognizable domain
names and their associated IP addresses are maintained on DNS name servers
which also perform the conversion from domain names to actual IP addresses.
The DNS at FSU is maintained on a name server with the IP address 131.118.80.1;
it has the domain name freris.fsu.umd.edu.
-
When the IP address of a node
changes, the database of the DNS name server is updated but the domain
name remains the same. Thus one never has to worry about the actual address
of an Internet resource or whether it has been changed.
-
The Internet Registry, a part
of the Internet Activities Board (IAB), currently maintains the
DNS.
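The two ideas behind the DNS (a hierarchy of domains, and a lookup table that hides address changes) can be illustrated with a toy model. This sketch does not contact any real name server: the table holds only the freris.fsu.umd.edu entry quoted above, the function names are assumptions, and the "changed" address in the last lines is invented purely to show the effect of an update.

```python
import ipaddress

# Toy name-server database; the single entry is the example from the text.
NAME_TABLE = {"freris.fsu.umd.edu": "131.118.80.1"}


def domain_hierarchy(name: str) -> list:
    """List the domains that contain a node, from the top-level domain down."""
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


def resolve(name: str, table=NAME_TABLE) -> ipaddress.IPv4Address:
    """Convert a domain name to an IP address by table lookup."""
    return ipaddress.IPv4Address(table[name.lower()])


print(domain_hierarchy("fra00.fsu.umd.edu"))
print(resolve("freris.fsu.umd.edu"))

# If the node's address changes, only the table is updated; the name that
# users type stays the same. (The new address here is hypothetical.)
updated = dict(NAME_TABLE)
updated["freris.fsu.umd.edu"] = "131.118.80.2"
print(resolve("freris.fsu.umd.edu", updated))
```

In a real program the lookup is done by the operating system's resolver (for example through Python's socket.gethostbyname), which asks actual DNS name servers instead of a local dictionary.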
4.
OVERVIEW OF WEB DEVELOPMENT:
The
essence of Web development is (currently)
the "generation" of HTML documents and publishing them on
a Web server. However, there are a growing number of techniques that
complement HTML, adding multimedia and interactivity to Web sites.
These are a primary focus of COSC 330 and are previewed in the following
sections.
4.1
Web Development involves several overlapping techniques:
-
HTML "documents" are actually
HTML programs that are a composite of
-
simple text formatting,
-
hyperlinks to other documents
or multimedia files,
-
embedded code of other
languages, typically authoring languages (See section 4.2.A.), and
scripting languages (See section 4.2.B.)
-
invocations of separate
programs written in other languages, typically Java or CGI scripts.
The HTML, embedded code,
and external programs collectively direct the browser to display formatted
text and dynamic multimedia in an interactive format.
-
Web development
activities
can be categorized under five broad classifications:
-
Authoring
involves
creating HTML documents, which are composites of text and HTML tags. Such
documents can be created and modified by
-
typing
in a simple ASCII text processor,
-
converting
from other word processed documents,
-
WYSIWYG
HTML authoring systems are widely available that allow developers
to generate HTML documents by normal typing while incorporating HTML tags
by using a variety of menus and dialog boxes, exactly as in word processing.
The HTML tags are invisibly generated but they can be edited and "tweaked"
in an HTML editor. (See the comparative
analysis of current WYSIWYG Editors by CNET.)
-
proprietary
authoring tools. These may supersede, but will probably augment
HTML authoring. (See section 4.1.C.)
Whatever
the authoring tool, all generate programs in an authoring language; see
these in section 4.2.A
-
Scripting
is
a programming technique for embedding code into a Web document in order
to
-
add dynamic
content and
-
facilitate
interactivity
between the client and server. The scripts govern the processing
of exchanges of data between the client and server.
-
Programming
allows
a developer to create applications, completely distinct from the HTML document,
using standard programming languages. Such
programs extend the functionality of browsers to operations that would
otherwise have to be processed on the server, such as handling user input
or searching databases.
-
Publishing
is the storing of HTML documents on a Web server. This can be done
three different ways:
-
One can
use
the operating system of the server directly to save the HTML documents
on the server's secondary storage.
-
One does
not have to have access to the server itself; the server operating system
can be accessed remotely, over a TCP/IP network, using telnet.
-
Another
alternative is to use FTP (file transfer protocol) to "upload" files
from a PC, where they were authored, to a Web server.
-
Maintaining
means
the creation and upkeep of a Web site as a whole, thus ensuring the integrity
of the site. This essentially involves making sure, as the site evolves,
that files, their names, and links, within the Web pages, all work properly.
Maintenance activities include editing, testing, and debugging throughout
the development process and rechecking when the Web site is updated or
expanded. Also, improvements, suggested by user feedback, are often
implemented during maintenance.
Not all
Web development projects involve all of these activities; in fact, effective
Web sites can be created, by nonprogrammers, using only HTML. Also
there are usually several different ways of achieving the same effects
with a Web page, so there is no preferred, general procedure for Web development.
-
Proprietary
Web technologies:
-
(From
Webopedia)
ActiveX
is a proprietary Microsoft
specification,
based on Microsoft's Component Object Model (COM) architecture,
that allows Windows-based programs to run within a Web page. It was not
designed for Web development but has been applied to it in Microsoft-specific
applications.
-
ActiveX
enables any Windows-based program
to add functionality by calling ready-made components,
called ActiveX controls, that blend
in and appear as normal parts of the program. They are typically used to
add user interface functions, such as 3-D toolbars, a notepad, calculator
or even a spreadsheet.
-
On the Internet
or on an intranet, ActiveX controls can be linked to a Web page
and, like Java applets, downloaded
by an ActiveX-compliant Web browser. ActiveX controls turn Web pages into
software pages that can perform just like any program that is launched
from a server.
-
Unlike applets,
ActiveX
controls can be saved on the client, eliminating the need to download
them each time a Web page is opened; however, this can be dangerous.
Java does not allow it because, for security reasons, applets are not
permitted to access the client's storage. ActiveX does
have a signature feature that allows you to specify the servers
from which you will allow controls to be downloaded; however, this
cannot prevent a control, once downloaded, from damaging your system.
-
Cold
Fusion, by
Allaire Corporation, facilitates
the creation of Web interfaces to database management systems.
It includes a server and a development toolset designed to integrate databases
and Web pages. For example, with Cold Fusion, a user could enter a request
on a Web page, and the server would query a database for relevant information
and return the response to the Web page. Cold Fusion Web pages include
tags written in Cold Fusion Markup Language (CFML) that simplify integration
with databases and avoid the use of more complex solutions involving CGI
scripting, Java applets, etc.
-
Shockwave
is a technology developed by Macromedia, Inc.
that enables Web pages to include multimedia objects,
especially animated sequences. To create a
Shockwave object, you use Macromedia's multimedia authoring tool called
Director,
and then compress the object with a program called
Afterburner.
You
then insert a reference to the "shocked" file in your Web page. To see
a Shockwave object, you need the Shockwave plug-in, a program that integrates
seamlessly with your Web browser. It also
lets output created by Macromedia's Authorware and Freehand tools be viewed
on the Web. The plug-in is freely available
from Macromedia's Web site as either a Netscape Navigator plug-in or an
ActiveX control. Shockwave supports audio, animation, video and even
processes user actions such as mouse clicks. It runs on all Windows platforms
as well as the Macintosh.
-
Flash
(also
called "Shockwave Flash"), by Macromedia, is a user-friendly technique
for producing vector-graphics-based Web sites. Specifically,
it is a file format for delivering interactive vector graphics and animation
on the World-Wide Web.
View
the Flash animations of great Superbowl plays at: http://superbowl.com/poc.html#
See
the excellent demo (7 min. video) of Flash technology on ZDTV's Call for
Help, 2/3/00. A good reference for Flash information is: http://www.flashkit.com/index.shtml
SAQ
34: What is the (a) similarity and (b) difference between scripting and
programming?
SAQ
35: The proprietary Web technologies are most similar to which of the activities
in section 4.1.B?
4.2
Programming Languages of the Web :

NOTE:
All of these make good candidates for the Project of this course, especially
for C.S. majors!
You
can develop an Introduction or, better yet, a tutorial on one of these
languages.
-
Authoring
languages are powerful, special purpose programming languages that
are designed to allow nonprogrammers to create Multimedia (text,
graphics, animation, audio, and video) applications. These have great potential
for developing individualized learning system software.
-
Hypertext
Markup Language (HTML) was the original language used to create
Web pages; it has been updated via several versions. HTML
is really used only to create (29)
which, when inserted into regular text, tell a Web browser how to format
text, insert multimedia, link to another location, or link to other programs
written in VRML, Java, JavaScript, or other languages (CGI applications).
-
Dynamic HTML (DHTML)
is not a programming language itself,
but an augmentation of HTML that facilitates the enhancement
of animation and
interactivity of Web pages by providing
scripting, cascading style sheets, layering, dynamic fonts, etc.
(From W3C Stylesheet page,
http://www.w3.org/Style/:
"Dynamic HTML is a term used to describe
HTML pages with dynamic content. CSS
is one of three components in dynamic HTML; the other two are HTML itself
and JavaScript (See section 4.2.B.a,
below.). The three components are glued together with DOM, the Document
Object Model. Dynamic HTML is still in its infancy and current implementations
are experimental.") DHTML is covered
in more detail in LM IX.
-
Extensible
Markup Language (XML), a platform-independent Web document formatting
language, is another improvement on HTML. It is covered in LM
X.
-
XML is
"extensible" because, unlike HTML, the ability to define new
markup tags makes it virtually unlimited and self-defining. XML
allows the developer to define new tags specifically for new data
types thus dramatically expanding the variety of customized data
that can be handled in a Web page. In fact, it has
been said that "XML is to data what HTML is to text", but since
text is a specific form of data, XML is a more general markup language
than HTML.
-
XML is
being supported
by the United Nations as a premier standard for e-business.
-
XML is
the centerpiece of Microsoft's ".Net"
(Dot Net) intitiative as well as virtually all other distributed
computing activities.
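To make the idea of "extensible" markup concrete, the following sketch reads a small XML document with Python's standard `xml.etree.ElementTree` module. The tag names `<course>`, `<module>`, and `<title>`, and the data inside them, are invented for this illustration; no HTML or XML standard defines them.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every tag and attribute here was made up by the
# author of the data, which is exactly what XML permits.
DOCUMENT = """
<course code="COSC330">
  <module number="1"><title>Review of Web Fundamentals</title></module>
  <module number="9"><title>Dynamic HTML</title></module>
</course>
"""


def module_titles(xml_text):
    """Map each module number to its title using the custom tags."""
    root = ET.fromstring(xml_text)
    return {int(m.get("number")): m.findtext("title") for m in root.iter("module")}


print(module_titles(DOCUMENT))
```

A generic XML parser can process these tags without knowing in advance what they mean; the meaning is supplied by the program that uses the data.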
-
Synchronized
Multimedia Integration Language (SMIL) is an XML-based markup language, similar in style to HTML, that
facilitates the synchronization of the multimedia elements of a streaming Web
page, enabling its audio, video, and graphics elements to be coordinated.
-
{Add
other W3C markup languages
based on XML: XHTML, MathML, and SVG.}
-
WML (Wireless Markup Language), part of the WAP (Wireless Application Protocol) specification,
is an XML-based language specifically designed to program wireless devices so
that they can display the text portions of Web pages. ...
More information can be found in the WAP
Specification from OASIS (Organization for the Advancement of Structured
Information Standards), and an introductory
tutorial is presented by The Wireless Developer's Network.
-
Virtual Reality Modeling Language
(VRML) is used to create Web pages with computer generated 3D
environments that can be "explored" as if they are the real world.
This promotes a highly interactive human-computer interface.
-
VRML allows the developer to
describe, in program code, three-dimensional (3D) image sequences and
define user interactions with them. Using VRML, you can build a sequence
of visual images into Web settings with which a user can interact by viewing,
moving, rotating, and otherwise interacting with an apparently 3D environment.
For example, you can view a room and use controls to move through the room
as you would experience it if you were walking through it in real space.
-
VRML programs can be "plugged
into" HTML programs.
-
Scripting
tools are primarily used to create programs that produce dynamic,
interactive Web pages. This
can be accomplished by either client-side or server-side scripting.
In
computer programming, a script is a sequence of instructions
that is usually interpreted, i.e. executed by another program rather
than directly by the CPU. In other words, unlike languages like Java
and C++ (whose programs must be compiled into object code in order to be executed),
"scripts" are typically executed directly, line by line, by an interpreter
that is associated with the particular Web browser. Some languages
have been conceived expressly as scripting languages, e.g. JavaScript
and VBScript, while others, like Perl, are more general but may be used
in scripting environments. In Web applications, scripting languages
are often used in two contexts: to write "scripts" that (1) are "server-side",
i.e. reside and run on a Web server or (2) are "client-side", i.e. reside
in applications on the client, e.g. JavaScripts embedded within an HTML
document.
-
JavaScript
is the primary client-side scripting language. It is
a simple, object-based (not object-oriented), cross-platform, World-Wide
Web scripting language. It is called a "very high level language"
(i.e. special purpose) because it is not a general purpose language like
Java but is specifically designed to be used with the World Wide Web.
In fact, it currently runs in only three environments. It
is mainly used for client-side scripting, in which the scripts are embedded
in the HTML and run by the browser. However, it can also be used
as a server-side scripting language and as an embedded language in server-parsed
HTML. (The following is adapted from FOLDOC's
definition of JavaScript)
-
JavaScript
is a very high level language (VHLL), and as such has a focused area of
application, HTML documents. While JavaScript programs can function
independently, they are designed to be embedded within HTML documents,
giving them functionality not available in HTML itself, e.g. dynamic,
interactive multimedia, forms, simple web databases, etc.
-
JavaScript
was originally created by Netscape
(now part of AOL Time-Warner) and was proprietary, but its popularity led
to its becoming a de facto standard. Microsoft adopted it, but
(typically for Microsoft) "cloned" its own version called JScript.
The consequent inconsistencies, caused by Microsoft's self-serving refusal
to use platform independent standards, made it difficult to write JavaScript
that behaves the same in both Netscape Navigator and Microsoft Internet
Explorer. Fortunately an international standard, ECMAScript,
has been adopted for the core language, so basic JavaScript is now platform
independent, although the support of advanced features still depends on the
browser being used. The endeavor to create an open standard for JavaScript
is an effort to prevent Microsoft from monopolizing web software as they
have PC operating systems.
-
JavaScript
is an important emphasis of this course and will be covered, in detail,
in later learning modules. See the expanded description in LM
VI, section 3 and detailed presentation in LM
VII.
-
REBOL
(Relative Expression-Based Object Language) is a new approach
to Web scripting that introduces the idea of a messaging
language, one that "provides highly-integrated connectivity (networking)
along with context sensitivity (called "dialecting", the ability
to create variations, or sub-languages, for domain-specific communication.)".
It provides a broad range of approaches to the common challenges of Internet
computing. While designed to be simple and productive for novices,
the language extends a new dimension of powerful, practical solutions that
facilitate advanced Web development. Specifically, the language offers
a significant new approach to the seamless exchange and interpretation
of network-based information over a wide variety of computing platforms.
A message can be as simple as a single line or as complex as an entire
application.
(REBOL's
creator on ZDTV, 8 min.) For more information on REBOL, see
the home page of the language at: http://www.rebol.com/
-
Python
is a simple, high-level
interpreted
language
that combines ideas from ABC, C, Modula-3, and Icon. It bridges the gap
between C and shell programming, making it suitable for rapid prototyping
and Web scripting. It is object-oriented and supports packages, modules,
classes, user-defined exceptions, a good C interface, dynamic loading of
C modules and has no arbitrary restrictions. (This
is ZDTV's Leo Laporte's favorite learning language for beginning programmers.)
See
the home page of the Python language.
-
Tcl
(Tool
Command
Language)
is a powerful, extensible, interpreted string
processing language for issuing commands to interactive programs.
The extensibility of Tcl means that it can be easily extended through the
addition of custom Tcl libraries. It is used for prototyping applications
as well as for developing CGI scripts. It has a peculiar but simple
syntax. It may be used as an embedded interpreter in application programs.
Tcl
has an associated GUI toolkit, Tk, so it is sometimes referred to as Tcl/Tk.
For more information see the Tcl Developer
Xchange.
-
Common
Gateway Interface (CGI) scripting is the traditional scripting
technique used for server-side scripting. CGI
is not a programming language but a standard protocol for running external
programs that are stored on a Web server. Such programs, called
"CGI scripts" can be written in virtually any computer language
(the most suitable is probably Perl), but interpreted languages
are preferred because the object code of compiled languages is not as flexible.
-
Scripts
are typically written to manage input via forms, provide feedback, and
perform searches.
-
The CGI approach is being superseded
by languages specifically designed for writing programs to be run by Web
browsers or Web servers, e.g. Java (applets and servlets) or JavaScript.
-
A nice set of simple examples
is given on the Web site of TechTV's Leo Laporte, www.leoville.com/perltest.shtml.
The online preassessments of my course COSC 120 are written with CGI
using Perl, e.g. Preassessment
1.
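As a rough illustration of the CGI protocol described above, here is a minimal sketch written in Python rather than Perl (any language will do, as noted). Under CGI the Web server passes the part of the URL after "?" to the program in the `QUERY_STRING` environment variable, and the program writes a header, a blank line, and the page to standard output. The form field `name` and the greeting are hypothetical and used only for this example.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def build_response(query_string):
    """Return the complete CGI output: header, blank line, then the HTML body."""
    fields = parse_qs(query_string)
    # "name" is a hypothetical form field chosen for this illustration.
    name = fields.get("name", ["stranger"])[0]
    # escape() keeps user input from being interpreted as markup.
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The Web server sets QUERY_STRING before it starts the script.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

Keeping the logic in `build_response` means the script can be checked without a Web server; the server only supplies the environment and collects the output.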
-
ASP
(Active Server Page) is not a programming language, but is called,
by Microsoft, "a server-side scripting environment." To create ASP
scripts developers typically use the languages VBScript or JScript, both
of which are automatically supported by ASP. ASP and HTML are tightly
integrated. In HTML, tags are delimited with angle brackets; similarly,
in ASP one uses the <% %> delimiters to define the beginning and end
of a script. ASP scripts can be inserted anywhere in an HTML page, and
vice versa. One of the major assets of ASP is that it facilitates access to
SQL databases. For a nice, introductory tutorial on ASP, access the
CNET site,
http://www.builder.com/Programming/ASPIntro/?dd.cn.txt.0701.10
-
Visual Basic is specifically
designed to quickly and easily develop code that provides interaction between
different Microsoft applications, e.g. Word and Access or Internet Explorer.
It is actually more of a "programming environment" in which a graphical
user interface is utilized to select and modify "library code" previously
written in the BASIC programming language.
-
VBScript is an interpreted
scripting language that is a subset of Microsoft's Visual Basic programming language.
To
learn about VBScript try the online tutorial at:
http://www.intranetjournal.com/corner/wrox/progref/vbt/
-
General purpose, "high level
languages" (HLL) are the staple of software development. Any
HLL can be used to write
network applications; however, some make
that job much easier than others do. The two most common ways of
writing Web specific programs are:
-
using a HLL and following CGI
protocols. Such programs, called "CGI applications"
can be written in virtually any computer language but the most frequently
used is Perl.
-
using Java, perhaps the
"hottest" language today. Java is a purely object oriented,
HLL designed to facilitate reliable, platform-independent, distributed
processing. It is a general purpose language useful in
writing any type of application; however, it has special capabilities
for writing object-oriented programs that run on networks. The
most visible of these capabilities are "applets" (specialized small
applications) that can be called directly from within an HTML document.
This makes it the "language of choice" for most programmers of network
applications.
SAQ
36: List and distinguish the markup languages mentioned in the preceding
section.
SAQ
37: What is the difference between client-side scripting and server-side
scripting?
4.3
MIME Files:
-
MIME (Multipurpose Internet Mail Extensions) types are different
file formats that may be transmitted via the Internet's SMTP (Simple
Mail Transfer Protocol). They are identified by file extensions (see Table
330/I-1). MIME enables SMTP to transport non-ASCII data over the
Internet using ASCII mail protocols.
TABLE 330/I-1: COMMON MIME FILE FORMATS
| FILE | FORMAT DESCRIPTION | EXAMPLE APPLICATION SOFTWARE |
| .au | most common sound format on the Web | NN; Sound Player (Mac) |
| .aiff | another common sound format on the Web | NN; Sound Player (Mac) |
| .bin | binary file for Mac computers | Use Stuffit Expander to convert to executable |
| .doc | MS Word document | MS Word (Mac or PC) |
| .exe | DOS/Windows program or archive | Application that created it; self-extracting |
| .gif | GIF (Graphics Interchange Format), the | NN; JPEGView (Mac); Lview Pro (PC) |
| .html/.htm | HTML (Hypertext Markup Language) | Any Web browser |
| .hqx | BinHex 4.0; most Mac files are in .hqx; encodes Mac files in 7-bit text for transfer | Stuffit Expander (Mac); use BinHex13 (PC) to un-binhex it |
| .jpg/.jpeg/.JP2 | A 24-bit graphic format; this will be updated to JPEG2000 with the extension .JP2 | NN; JPEGView (Mac); Lview Pro (PC) |
| .mid/.midi | MIDI files for electronic musical instruments | audio/x-midi (Netscape plug-in) |
| .mpg/.mpeg | MPEG, standard movie format of the Net | Sparkle (Mac); VMPEG (PC) |
| .mov/.moov/.movie/.qt | QuickTime Movie; native Mac movie format | QuickTime (Mac/PC); Sparkle (Mac) |
| .pdf | Adobe Acrobat Portable Document Format | Adobe Acrobat Reader |
| .ps | PostScript file; plain text file; not readable | Ghostscript (Mac/PC); prints to laser printers |
| .sit | Stuffit Archive | Use Stuffit Expander (Mac) or UnSit (PC) to convert |
| .sea | Macintosh Self Extracting Archive | Download as MacBinary, launch; self-extracting |
| .tiff/.tif | TIFF, very large high quality image format | JPEGView (Mac); Lview Pro (PC) |
| .txt | plain ASCII text | Any word processor |
| .wav | Windows Wave format sound file | Windows Media Player; SoundApp (Mac); any PC sound |
| .zip | pkzip, a common DOS/Windows compression format | ZipIt/MacUnZip/Stuffit Expander (Mac); WinZIP (PC) |
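The link between a file extension and its MIME type can be explored with Python's standard `mimetypes` module, which holds a lookup table similar in spirit to the one above. This is a small sketch; the file names are hypothetical, and the exact table consulted can vary slightly from one system to another.

```python
import mimetypes


def mime_type_for(filename):
    """Return the MIME type guessed from the file extension, or None if unknown."""
    guessed_type, _encoding = mimetypes.guess_type(filename)
    return guessed_type


# Hypothetical file names, chosen to match a few rows of the table.
for name in ("index.html", "logo.gif", "photo.jpg", "notes.txt", "paper.pdf"):
    print(name, "->", mime_type_for(name))
```

Note that the type has two parts separated by a slash (for example `image/gif`): a general category and a specific format.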
-
Browsers, themselves, can
read some MIME types, just as Internet mail programs do. For those
types not supported by the browser, "plug-ins" can be configured
to handle them. Multimedia data, of course, is originally analog; it must
be converted to digital form (binary code) in order to be processed by
computers; this is done by "analog to digital converters".
-
Internet mail programs use MIME
to transmit the above file types in the form of text-only mail messages.
-
When a mail program receives
a message, it examines the special message header to determine whether
the message contains text or MIME data.
-
If the message contains a MIME
file, e.g. an MPEG video clip, it reconstructs the original binary data
from the ASCII of the mail message and then plays it back using the
plug-in for the application that created it.
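The round trip described in the steps above (binary data carried as ASCII text, then reconstructed by the receiver) can be sketched with Python's standard `email` package. The addresses, subject, and file name are placeholders invented for this example, and the "video" is just a run of arbitrary bytes.

```python
from email import message_from_bytes
from email.message import EmailMessage


def wrap_in_mail(binary_data, filename):
    """Build a mail message that carries binary_data as a MIME attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Video clip"           # placeholder header values
    msg["From"] = "sender@example.com"
    msg["To"] = "receiver@example.com"
    msg.set_content("The clip is attached.")
    # Bytes attachments are base64-encoded, i.e. rewritten as plain ASCII text.
    msg.add_attachment(binary_data, maintype="video", subtype="mpeg",
                       filename=filename)
    return msg.as_bytes()


def unwrap_attachment(raw_mail):
    """Find the video/mpeg part and rebuild the original binary data."""
    msg = message_from_bytes(raw_mail)
    for part in msg.walk():
        if part.get_content_type() == "video/mpeg":
            return part.get_payload(decode=True)
    return None
```

Printing the result of `wrap_in_mail` shows the headers a mail program examines (such as `MIME-Version` and `Content-Type`) followed by the encoded attachment as ordinary text.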
-
File compression
is a reversible process that reduces the size of a file (for storage or
transmission) and restores the file when it is displayed (graphics file)
or executed (program file). In networked systems, file
compression is used to increase the efficiency of distributed computing
by reducing the size of files before they are transmitted over a network,
thus reducing the required bandwidth. In general, compression
is accomplished by a program that implements a compression algorithm
that rewrites a file into a smaller storage space.
-
In general,
there are two types of compression:
-
Lossless
compression is completely reversible, i.e. the original file can be regenerated
perfectly from the compressed file.
-
Lossy
compression is NOT reversible, i.e. during lossy compression some "unimportant"
data is discarded to maximize the compression ratio. Obviously, recompression
of a lossy compressed file compounds the loss of detail, so it is advisable
to retain the original file for later editing rather than trying to edit
a compressed file.
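The difference between the two types can be seen in a short Python sketch. The lossless half uses the standard `zlib` module. The lossy half is only a toy stand-in invented for this example (it rounds sample values down to a multiple of 16); it is not the method used by JPEG or any other real format, but it shows why the discarded detail cannot be recovered.

```python
import zlib


def lossless_round_trip(data):
    """Compress and decompress data; the restored copy equals the original."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return compressed, restored


def lossy_quantize(samples, step=16):
    """Toy lossy step: round each 0-255 sample down to a multiple of step."""
    return bytes((value // step) * step for value in samples)


text = b"Web page " * 200
compressed, restored = lossless_round_trip(text)
print(len(text), "bytes compressed to", len(compressed), "bytes")
print("restored exactly:", restored == text)

samples = bytes([3, 18, 200, 255])
print(list(lossy_quantize(samples)))  # the low-order detail is gone for good
```

Repetitive data such as the sample text compresses very well; data that is already compressed (for example a .zip or .jpg file) usually gains little from a second pass.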
-
Compression
and decompression are performed by a codec (compressor/decompressor).
A
codec can be either hardware or software that uses complex algorithms to
compress the file for network transmission and then decompress it to
regenerate the original file for processing or execution.
(Be
aware, however, that the term "codec" is used for several fundamentally
different functions, e.g. code/decode in the transmission of digital data
over an analog channel.)
-
MIME
types themselves are usually designed to efficiently compress their
data. This is particularly true of graphic, audio, and video files because
they tend to be very large.
-
In network
transmissions, compression can be applied to an entire packet or to only
the data component. When files are transmitted over the Internet,
either singly or as part of a composite "archive
file",
they may
first be reduced using a zip,
gzip,
or other compressed format. WinZip is a popular Windows program that compresses
files when it packages them in an archive.
SAQ
38: (a) Which of the files listed in Table 330/I-1 are data files?
(b) What are the others?
5.
SUMMARY OF THIS LEARNING MODULE:

You
might find a good application of "cloning" is to clone the LM summaries,
adding your comments and information from, or links to, other sources.
The simplest way to do this is to open a blank HTML page using an HTML
editor like Netscape Composer (Open this from the Communicator menu in
Navigator.) then "copy and paste" the summary onto your blank page and
begin modifying it. See the Checklist
discussion of cloning. Note:
I might be doing you a disservice in providing you a summary. I found
it very thought provoking when I edited this LM in order to create the
following summary, so maybe I should require you to write your own summary
instead of providing one for you. However, I think a better solution
is to have both, so I encourage you to write your own summary and compare
it to mine; see the suggested
use of clone in the Checklist.
-
CONCEPTS:
-
Cyberspace is an abstract
computer workspace where all knowledge and information sources are linked
via ubiquitous
digital
networks. In COSC330 we will
use "information space" to represent all electronically accessible
knowledge which includes the matrix plus television, the telephone network,
etc. The Internet is, by far, the dominant WAN of cyberspace, but
the focus of this course is the World Wide Web, the subnet of the Internet
based on the HTTP protocol.
-
A Computer System can
be analyzed on the basis of the Input-Process-Output model.
-
Local I/O involves only
the actual user
-
Indirect I/O involves
saving and retrieving files on secondary storage
-
Remote I/O involves communications,
typically via a network. This is the essence of Web access which
is based on the client-server model of computing; therefore, this is the
focus of COSC 330.
-
Computer
Networks facilitate remote access to distributed computer
systems in which data and computing power are spread over all networked
users.
-
Networks consist of interconnected
"nodes" that interact via a client-server model. Servers
are network computers which provide resources to the user of the network.
Clients
are computers or computer applications that provide users access to network
servers.
-
Networks are classified as
LANs,
WANs, or (less frequently used) MANs.
-
THE INTERNET:
-
The Internet
is a public, packet switching, wide area network that is based on the TCP/IP
protocol suite. Internet access is provided by internet service providers
called ISPs. Internet services
are divided (in this course) into three categories:
-
Communication services
which include
e-mail, newsgroups, mailing lists, chat, and teleconferencing.
-
Resource access services
which
include
file transfer protocol (FTP) and Telnet.
-
Information retrieval services
which include the World Wide Web, Gopher, and WAIS.
-
The Internet
is the WAN based on the TCP/IP suite of protocols.
-
There
are a great many TCP/IP details, but it is only really important to remember
four points:
-
TCP/IP
is a suite of communication protocols that permit physical networks
to be joined together to form a network of networks. TCP/IP combines the
individual networks to form a virtual network in which individual network
nodes are identified by IP addresses instead of by physical network addresses.
-
TCP/IP
has a multilayered architecture that clearly defines each protocol's
services and responsibilities.
-
TCP
and UDP provide high-level data transmission services to network
application programs; they encapsulate application data in segments
and pass them to . . .
-
IP
which adds its data turning the segments into packets (or datagrams).
IP is responsible for routing the packets to their destination.
-
Data moving
between two applications running on Internet hosts travels down and up
the hosts' TCP/IP stacks. Layer data added by the TCP/IP modules
on the sending end is stripped off by the corresponding TCP/IP modules
on the receiving end and used to re-create the original data.
-
All
of this is invisible to the user!!
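The "down and up the stack" idea can be pictured with a toy Python model. This is a deliberately simplified sketch: real TCP and IP headers are binary structures with many more fields, and the function names, dictionary keys, addresses, and port numbers below are all invented for illustration.

```python
def tcp_wrap(data, src_port, dst_port):
    """Transport layer: wrap application data in a (toy) segment."""
    return {"src_port": src_port, "dst_port": dst_port, "payload": data}


def ip_wrap(segment, src_ip, dst_ip):
    """Internet layer: wrap the segment in a (toy) packet."""
    return {"src_ip": src_ip, "dst_ip": dst_ip, "payload": segment}


def send(data, src, dst):
    """Travel down the sender's stack: application data -> segment -> packet."""
    segment = tcp_wrap(data, src[1], dst[1])
    return ip_wrap(segment, src[0], dst[0])


def receive(packet):
    """Travel up the receiver's stack, stripping one layer at a time."""
    segment = packet["payload"]   # IP layer removes its header
    return segment["payload"]     # TCP layer removes its header


packet = send(b"GET /", ("192.0.2.1", 49152), ("192.0.2.80", 80))
print(packet)
print(receive(packet))
```

Each layer only reads the information added by its counterpart on the other host, which is why the application at the receiving end sees exactly the data that was sent.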
-
HTTP
is the only protocol that is really relevant to COSC 330.
-
THE WORLD WIDE WEB:
-
The World Wide Web (Web, WWW,
or W3), the subnet of the Internet governed by HTTP, is a
distributed,
hypermedia information retrieval system. The Web facilitates
multiple
protocol access via a single client called a browser.
-
Web information is contained
in HTML documents containing hyperlinks and made available
on a Web server. A Web site is an integrated collection
of Web pages which is normally collected in a single directory (folder)
called a Web account.
-
The key to the hyperlinked nature
of the Web is the URL which can have, at most, five distinct
parts.
-
the URL type or protocol
prefix used to access the Internet address.
-
The domain name
-
the directory path separated
by slashes
-
the file name of the
document to be accessed, and
-
optionally a "target" or a query
string
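The parts of a URL listed above can be pulled apart with Python's standard `urllib.parse` module. The URLs shown are made-up examples, and the dictionary keys are simply labels chosen to match the list.

```python
import posixpath
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the parts named in the list above."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,    # URL type or protocol prefix
        "domain": parts.netloc,      # domain name
        "directory": directory,      # directory path
        "file": filename,            # file name of the document
        "target": parts.fragment,    # optional target within the page
        "query": parts.query,        # optional query string
    }


# Hypothetical URLs used only for illustration.
print(url_parts("http://www.example.edu/courses/cosc330/lm1.html#summary"))
print(url_parts("http://www.example.edu/cgi-bin/search.pl?topic=mime"))
```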
-
The Domain Name System (DNS)
is a way of associating arcane numeric IP addresses with more memorable
"domain names" used in URLs.
-
OVERVIEW OF WEB DEVELOPMENT:
-
HTML "documents" are actually
HTML programs that are a composite of simple text formatting,
hyperlinks
to other documents or multimedia files, embedded code in other languages
(typically authoring and scripting languages), and invocations
of separate programs written in other languages, typically Java or CGI
scripts.
-
Proprietary
Web technologies include Cold Fusion, Shockwave, and Flash.
-
Web
development activities can be categorized under five broad classifications:
authoring
HTML
documents, scripting specialized programs that augment the HTML,
programming
separate applications that enhance browser presentations,
publishing HTML documents on a Web server, and maintaining
the integrity of the Web site.
-
Web
programming languages include:
-
Authoring languages that
allow nonprogrammers to create Multimedia applications include:
-
HTML, used to tell a Web
browser how to format text, insert multimedia, link to another location,
or link to programs written in other languages.
-
DHTML, an extension of
HTML that can add animation and interactivity to Web pages.
-
XML
which allows new tags to be defined for new data types thus extending the
data types that can be displayed by a browser. XML is "extensible"
because new markup tags can be defined making it virtually unlimited and
self-defining.
-
SMIL
which facilitates the synchronization of streaming multimedia.
-
VRML is used to create
virtual 3D environments that can be "explored"
-
Scripting tools are used
to make Web pages dynamic and/or interactive by
either client-side or server-side scripting.
-
JavaScript is the primary
client-side scripting language specifically designed to
be used with the World Wide Web.
-
REBOL
(Relative Expression-Based Object Language) is a new approach
to Web scripting that utilizes messaging and dialecting
to augment HTML.
-
Python
is a simple, high-level interpreted object-oriented
language that bridges the gap between application programming and shell
programming, making it suitable for rapid prototyping and Web scripting.
-
Tcl
is an interpreted string processing
language for issuing commands to interactive programs that is extensible
via custom Tcl libraries. It is used for prototyping applications as well
as for developing CGI scripts.
-
Common
Gateway Interface (CGI) scripting is the traditional scripting
technique used for server-side scripting
in virtually any computer language.
-
ASP
(Active Server Page) is a Microsoft specific server-side scripting environment
typically used with VBScript or JScript
-
General purpose HLLs
that are typically used to write Web specific programs include:
-
any HLL (commonly Perl), following
CGI protocols. Such programs are called "CGI applications".
-
Java, a purely object
oriented, HLL designed to facilitate reliable, platform-independent,
distributed processing is currently the language of choice for Web
programming. It has special capabilities for writing object-oriented
programs that run on networks including the unique concept of "applets".
-
MIME types are a set
of file formats, defined by the IETF, that are designed to be transmitted over
the Internet using SMTP; each file format has a unique extension
that identifies the application that is needed to output it.
NOTES:
-
You
have now covered the material required to answer questions 1-20 on PREASSESSMENT
330-1. There are two versions of the preassessments, an HTML
version (which you can clone and add questions to) and an interactive version
(which is self-checking so you can make mistakes and correct them without
anyone even knowing!)
Be
sure to read the preambles and associated descriptions of the preassessments
so that you fully understand their purpose.
-
I
have also published the "best Assessment 1 I can write"
called the "PROFICIENCY
EVALUATION" which you can begin answering now. (These also have
a clonable HTML version.) It has a format that is identical to that
of the gradeable Assessment 1. If you begin an assessment having
100% understanding of the associated Proficiency Evaluation, you should
have a fundamental understanding of the most important concepts covered and
be in good shape to take the assessment.
Be
sure to read the preambles and associated descriptions of the proficiency
evaluations so that you fully understand their purpose.
-
Note that within
a Learning Module, you can find any word by utilizing the Find
utility of your browser. (One of the big advantages of a browser-based
learning material!) In Netscape Communicator you use the
Find
in Page... or
Find in Frame... items in the
Edit
menu.
This is a very powerful tool that can also be a big help on PreAssessments
and Assessment reworks; just copy a question answer and paste it in
the Find... dialog box and this will take you to each occurrence in that
learning module.