{MAKE THIS LAB 9}
alert_red.gifDrafted: 10/12/03updated.gif 12/6/03alert_red.gif
Currently under construction
This is the most current version of Learning Module VIII; however, the FIB/SAQ/TPQ tools have yet to be included.
Also the study guide needs to be written (for the independent learner) and
some of the content sections, links, and assessment tools need to be fine-tuned.  (Nothing is wrong, just imperfect!)
Note that the blinking text designates things that I need to work on; the material is not wrong, but can be improved.
(Don't worry, I don't like blinking text, either, so there will not be any in the finished product!)


COSC 101

LEARNING MODULE IX:
ABSTRACT DATA TYPES AND DATA STRUCTURES

        The following presentation is an overview of language independent, classic ADTs (abstract data types) with minimal coverage of how they are implemented in Java.   This is a different approach than D&L who focus on how ADTs (e.g. linked lists) are implemented in imperative languages, a rather complex subject which isn't covered until the current COSC 201; it may not be covered at all in the new COSC 241.  Anyway, hopefully, the two perspectives (D&Lvs. this LM) will complement each other, giving you a depth of understanding that is not possible from a single viewpoint. .... Study Guide for this learning module {NOT WRITTEN YET}.

The Objectives of this learning module are to help the student:

  1. understand the distinctions between 
    1. abstrract data types (ADTs), types, and data structures.
    2. different kinds of ADTs based on the data access mechanisms.
    3. built-in types of imperative languages and those of declarative languages
    4. array based and linked node implementations of ADTs in imperative languages.
    5. sorted and unsorted ADTs.
    6. three different kinds of sorts (selection, bubble, and Quicksort)
  2. learn how classic operations (Put, Get, and Delete) differ among ADTs.
  3. learn to trace algorithms associated with ADT operations, including
    1. sorts (selection, bubble, and Quicksort)
    2. binary search
    3. other applications of ADTs
  4. learn how to create a binary search tree.

TPQ 1: Rewrite the preceding objectives in terms of personal accomplishments to be attained after finishing the study of this learning module.

The sequence of presentations in this learning module is as follows.  You can click on any link to jump directly to that section.

  1. ABSTRACT DATA TYPES (ADT)
  2. IMPERATIVE LANGUAGE DATA STRUCTURES
  3. LISTS
  4. SORTING
  5. BINARY SEARCH
  6. STACKS AND QUEUES
  7. TREES AND GRAPHS
  8. JAVA'S BUILT IN TYPES AND DATA STRUCTURES
  9. SUMMARY {NOT WRITTEN YET}

1. ABSTRACT DATA TYPES:

  1. A type (also called "data type") is the kind of values that a constant or variable may have or a function may return.  Types are important in computer programs because they classify data so that a translator (compiler or interpreter) can reserve appropriate memory storage to hold all possible values, e.g. integers, real numbers, characters, strings, and Boolean values all have very different representations in memory.  We learned, in LM VIII, that imperative languages typcially have strong typing, i.e. a variable's type must be declared, and if a value of another type is assigned to variable, an error is generated.
  2. Abstract data elements (Tony's term) is anything that represents a single thing, i.e. a number, a word, a boolean value, etc.  Different languages have the following types, but they have different names and implementations, so "abstract data elements" refer to generic element types, i.e. types that are independent of any language or computer platform.
ABSTRACT DATA ELEMENTS
ELEMENT TYPEe
VALUES
     integer
 signed integer values
     real
 Mixed number including fractions
     Boolean
 "true" or "false" (or equivalents)
     character
 ASCII or Unicode character
     string
 sequence of characters
  1. An abstract data type is a collection of data objects characterized by how the objects are accessed; it is an abstract human concept meaningful outside of computer science.  (Note that "object", here, is a general abstract concept as well, i.e. it can be an "element" (like an integer), a data structure (e.g. an list of lists), or an instance of a class.(e.g. a list of circles).  ( There is a inconsistency of termanology in this area so be careful.  The following is not necessarily univerally used.)  I consider a "data structure" to be a programming language implementation of an abstract data type.  There are two fundamental ways to distinguish categories of data structures:
    1. The size of a data structure can be either static or dynamic
      1. Static data structures have a fixed size; if elements are empty, the container still exists.  Most arrays and records, in imperative languages, are static.  (However, it is not necessary that all arrays are static; several languages have dynamic arrays, of various forms.)
      2. Dynamic data structures grow or shrink to accomodate their elements; there are no empty containers.  Linked data stuctures (see below.), e.g. lists, queues, stacks, trees, and graphs are typically dynamic.
    2. The access mechanism can be categorized several ways:
      1. Direct access means elements within a collection of data can be accessed directly, e.g. in an array of grades, the seventh grade would be accessed by writing grade[7].  (In languages based on ANSI C (like Java, C++, C#, etc.), where the first element of an array has the index 0, the seventh element would be Grade[6].)
      2. Linked access means that an element can only be accessed via its nearest neighors, i.e. some means of traversing the elements must be provided in order to allow all elements to be accessed.  Linked access can be subdivided into linear data structures and nonlinear data structures:
        1. linear data structures (lists, queues, stacks, etc.) require sequential access, i.e. via either (or both) the element in front or the element behind the current element.
        2. Nonlinear data structures (trees, graphs, etc.) require procedures/functions that traverse the whole collection of elements in an ordered manner.
    3. Classic abstract data types are summarized in the following table:
    CLASSIC ABSTRACT DATA TYPES
    DATA ORGANIZATION
    ACCESS MECHANISM
    array = collection of data (typically homogeneous* and static), of one or more dimensions. Direct access via indices, e.g. Grade(index) or DistanceTable(row, column) 
    record1 = sequence of data (typically inhomogeneous* and static) Direct access via names (using "dot" notation,e.g. Student.name)
    List, singly linked = sequence of data (dynamic, by definition)
    Sequential access from front, e.g. Front(NameList)
    List, doubly linked = sequence of data (dynamic, by definition) Sequential access from front or back, e.g. Front(NameList)and Last(NameList)
    Queue = sequence of data (dynamic, by definition) Sequential access, First In First Out (FIFO), e.g. Enqueue (X,Line)at end of Line, and Dequeue(Line)from front of Line
    Stack = sequence of data (dynamic, by definition) Sequential access, First In Last Out (LIFO), e.g. Push(X,NameList)and Pop(NameList)
    Tree = hiearchical collection of data (dynamic, by definition) Depth First traversal or Breadth First traversal
    Graph = linked collection of nodes containing data (dynamic, by definition) Nonlinear traversal via adjacent nodes
    NOTES:
    1. "record" is the name used in DBMS and languages based on Pascal.  C/C++ calls them structs; Java doesn't have them (no need with classes).


    1. Remember (from LM VIII) that:
      1. Arrays and records are characteristic of imperative langauges; these are usually built-in data structures which can be used, by the developer, to create other data structures. This is accomplished by writing procedures and or functions which, using records and pointers, implement the characteristics of the particular access mechanism of the particular ADT. See section 2, below.
      2. Lists are characteristic of declaritive (functional and logical) langauges; these are usually built-in data structures which can be used, by the developer, to create other data structures.  This is accomplished by writing  functions (only) which, using lists, implement the characteristics of the particular access mechanism of the particular ADT.
    2. The classic operations required to implement any ADT (and whose details characterize the particular ADT) are:
      1.  put which places an element (which, though singular, could be another ADT) into the ADT, in its proper place (defined by the access mechanism of the particular ADT).
      2. get which retrieves an element and returns it to the calling routine
      3. delete which removes an element without returning it to anything
Other operations, that conceptually the same for all ADTs, include create and destroysearch (neither gets or deletes; only determines if an element is contained) print, empty which is a Boolean function that returns true when the ADT is empty,... Most of the operations associated with an ADT, if not built-in, can be implemented as either procedures or functions; examples of the classics are shown in the following table: (unnecessary in languages which have "garbage collection", like Java),
COMPARISON OF FUNCTIONAL AND PROCEDURAL IMPLEMENTATION OF GENERIC ADT OPERATIONS
(Notation: e
stands for element; DS stands for "data structure"; i stands for index;
in stands for value parameter; out and  in/out stand for var parametehrs)

FUNCTION
PROCEDURE
Put(e, DS) returns DS Put(in: e, in/out: DS)
Get(DS) returns e
Get(out: e; in/out: DS)
Delete(e,DS) retruns DS; e is removed
Delete(i, DS) returns DS; element at location i is removed
Delete(in: e; in/out: DS); e is removed
Delete(in: i; in/out: DS); element at location i is removed
Empty(DS) returns Boolean value
 N/A (Procedures can not be used in control structures.)


COMPARISON OF THE USE OF FUNCTIONAL AND PROCDURAL  ADT OPERATIONS
USE OF FUNCTIONAL OPERATIONS
USE OF PROCEDURAL OPERATIONS
number <- Get(NumberList)
number <- number + 1
NumberList <- Put(number, NumberList)
Get(number; NumberList)
number <- number + 1
Put(number, NumberList)



The generic operations in these tables have various names, depending on the particular ADT implemented (sometimes there are several names within a single ADT).  See the table of classic ADT operations in the summary.
    1. The class is THE OOP "construct" for implementing an abstract data type in a OOPL.
  1. A data structure is the implementation of an ADT in a particular programming language.   Although this definition is standard, the term "data structure" is used, carelessly, as a synonym for ADT, and visa versa; this is understandable in the "gray area" where the abstraction (ADT) overlaps with the implementation (data structure), especially when referring to data types in similar languages, e.g. imperative langauges.
  2. A generic data type is, according to  D&L, a data structure that is implemented as a class in an OOPL, i.e. a generic data type is a template for a data structure where the operations are specified but the type of the data it contains is not, e.g. a generic stack has Push and Pop operations defined but the type of elements is undeclared; the type would be declared when a stack object is instantiated.  Generic data types are not avaialble in all languages; C++ has them (They are called "templates".) and although early versions of Java do not provide them, it generic data types have been incorporated in Java, version 1.5 where they will be called "generics".
SAQ: In the preceding table of classic abstract data types, what is the (a) similarity and (b) difference between (1) array and record,  (2) array and record on the one hand and all the other data structures, (3) the first six and the last two.

TPQ: Why would an array, in an imperative languge, NOT be a good consturct with which to implement a List, or Queue, or Stack?

2. BUILT-IN DATA STRUCTURES OF IMPERATIVE LANGUAGES INCLUDE ARRAYS AND RECORDS:

2.1 Arrays (Arrays were introduced in Chapter  8 of D&L):

  1. Arrays are named collections of data that are directly accesed via one or more indices.  The syntax varies between languages; see D&L, page 260 for the syntax of array declarations and access in several languages.  
  2. Arrays may be multidimensional (Not mentioned in D&L.):
    1. Arrays of two or more dimensions may be considered arrays of arrays, e.g. a two dimensional array can be viewed as a one dimensional array whose elements are one dimensional arrays.  The syntax varies between languages, but the following examples illustrate arrays as denoted in the two most common families of languages:
      1. In Pascal-like languages (Pascal, Modula, Ada, etc.) an array of numbers would be written NumberArray(row, column).
      2. In languages based on ANSI standard C (C, C++, Java, JavaScript, etc.) an array of numbers would be written NumberArray[row][column].
    2. There is no syntax standard for representing language-independent arrays, but I prefer square brackets enclosing a comma delimited indices, e.g. a two-dimensional array of numbers, in pseudocode, a SAC, flowchart, etc. is typically denoted NumberArray[row, column].
  3. Arrays are typically homogeneous in imperative languages. i.e. all its data are of the same type.   (However, this is NOT necessary, just typical.)
  4. Arrays typically have a constant size in imperative languages. i.e. the maximim of the indices can not change, dynamically, during execution.  However, some languages (especially newere languages like Java) have the equivalent of dynamic arrays, e.g. the Vector data type in Java.
  5. Array operations provided in most imperative languages include the following.
    1. Direct access is possible by simply giving a value of the indices, e.g. NumberArray[2][3] would access the value in the second row, third column of the NumberArray[row][column array.  See D&L, page 260 for the syntax of one-dimensional array access in several languages
    2. Arrays may be passed via parameters to functions, procedures, and methods.
  6. The for loop with its built-in index are ideal for processing arrays.
JAVA EXAMPLE:
PROCESSING AN ARRAY WITH A FOR LOOP
for (i = 0; i < n-1; i++)
  NumArray[i] =
NumArray[i] + 1;
}//end for

SAQ: (a) What does the preceding example do?  (b) How big is NumArray[]?  (c) Why does the for index, i, complicate things by starting with 0 and going to n-1? (This is "one" of the things I HATE about C based languages!)
SAQ: Write a Java fragment that would initialize all the elements of
NumArray [row][column] to zero; assume the array has
n
rows and m columns.

{MAKE THIS LAB 9}EXERCISE: Let's consider a modification of the gradebook problem of LM VI; this exercise will dramatically illustrate how using data structures (in this case, arrays) can simplify and streamline an algorithm.   (a) Modify the "student gradebook" so that is uses a single one-dimensional array to hold the grades that are entered; represent this with pseudocode or a SAC.  Indicate how this simplifies your algorithm and makes it more efficient.  (b) Develop, using a top-down algrothm development strategy, the pseudocode or SAC for an electornic "class gradebook" that allows grades, between 0 and 100, to be entered into a list of student records.  Each student record begins with his/her name followed by a single grade.  The gradebook should automatically calculate the average of the set of grades as well as determine the maximum and minimum grades; then the grades should be normalized to 78.  Finally the list of the student's names and normalized grades should be printed, followed by the names and normalized grade of the student with the maximum and minimum grades.  (Note: you only need to develop the algorithm for one set of grades.)     (HINTS: Develop the algorithm using two separate one-dimensional arrays, Name(index) and Grade(index); the fact that both arrays uses the same index will result in "parallel arrays" - THINK about this, until you can justify the statement.  Be sure to trace (deskcheck) your developing algorithm, OFTEN, to check your logic and make sure that each grades is associated with the correct name.)

SAQ: (a) Why couldn't the preceding algorithm be written in terms of a single, two dimensional array?  (b) Could it be designed using an array of records?  If so, how?
SAQ: How could the algorithms of the student gradebook and class gradebook be combined to have a list of grades for each student in a list of students?

2.2 Records (called "structs" in C-based PL; these were introduced in Chapter  8 of D&L):

  1. Records are named collections of data that are directly accesed via an a name.  The syntax varies between languages, but the fields of a record of student might be denoted Student.NameStudent.SSN, etc.
  2. Records may be multidimensional (Not mentioned in D&L.):
    1. The fields of a record may, themselves be records or arrays, resulting in two dimensional data structures.
    2. Arrays may contain records; such arrays of records are two dimensional data structures.
  3. Records are typically inhomogeneous in imperative languages, in fact, they are distinguished from arrays by holding inhomogeneous data.
  4. Record operations provided in most imperative languages include the following. (Java does not have records, but they can be simulated with classes that have no operations; however, it would be better to add specific user defined operations, to the class, for processing the particular type of record.)
    1. Direct access is possible by simply using "dot notation", e.g. <record name>. <field name>.  See D&L, page 259 for simple examples of the syntax of assigning values to fields of a record in several languages.
    2. Records may be passed via parameters to functions, procedures, and methods.

SAQ: What type of  Microsoft Office application would be implemented using (a) a two dimensional array and (b) an array of records?

3. LINEAR DATA STRUCTURES: LINKED LISTS:

3.1 Linked Lists in Declarative PL:

  1. The linked list is THE built-in data structure of delcaritive languages like Lisp (which stands for List processing), Logo, and Prolog, probably because they "fit" so nicely with recursion as a means of repetition (although I've never seen this reason explicitly stated).  Thus, in declarative langauges, lists are used to create all user-defined data structures; because lists are essentially dynamic (expand and contract with the number of elements), other dynamic data structures are easy to create using lists.
  2. Lists, because they are built into a declarative language, are easy to manipulate; furthermore, other linked data structures (stacks, queues, trees, graphs, etc.) are easy to create.  On the other hand, imperative language data structures (like arrays and records (neither really needed)) are clumbsy to create using declarative language lists.
  3. Surprisingly (amazingly??) the fundamental list operations of the declarative "list processing" languages (e.g. First and ButFirst) are not those created in imperative language implementations of lists (e.g. Insert and Delete; see seciton , below).
    1. In Logo, the function First(list) returns the first element of the list being processed and ButFirst(list) returns a list that contains all but the first element of list; note that First(list) is an "element", not a list, but ButFirst(list) is a "list".
    2. In Lisp, the prototypical list processing language and THE language for artificial intelligence applications, uses car(list), which is equivalent to Logo's First(list) and cdr(list) which is equivalent to Logo's ButFirst(list).  The arcane names car and cdr originate in the machine architecture on which Lisp was developed and have never been changed to more meaningful names.  The interesting thing about car and cdr is that they are the only primitive list operations of basic Lisp, i.e. ALL other operations can be built using only the fundamental operations, car and cdr!
  4. Declarative languages are not the focus of D&L, so we will not consider declarative language data structures further.

3.2 Linked Lists in Imperative PL:

  1. Normally, lists are not built in data structures of imperative langauges. Instead, the array and record is THE built-in data structure of imperative languages like Pascal, Ada, C, Java, etc., probably because they "fit" so nicely with indexed iteration (for loops) as a means of repetition (although I've never seen this reason explicitly stated).  Therefore, in imperative langauges, arrays and records are used to create all user-defined data structures, including linked lists. (See D&L, Chapter 9, section 2.) 
    1. Arrays are not good structures to use to implement lists, because arrays are typically static, and lists, by definition, are dynamic.  However, many texts still use arrays (I think unfortunately) to create lists.
    2. The best way to create a list data structure (that is truely dynamic), in traditional imperative languages, is to use records for each node of the list; each node will have at least one field to hold the data and a field for each pointer to an adjacent node.  For example, a singly linked list would have one data field and one pointer field that points to the next node in the list; see D&L, pp. 279-281.   In OOPL, classes are used instead of records; however, conceptually they are equivalent.  ( In fact, ANY dynamic data structure like stacks, quesues, trees, graphs, etc. (discussed below) is most accurately represented with linked nodes created from records or classes.)
  2. Operations defined on imperative language lists, that must be implemented using built in data structures:
    1. Operations that are conceptually equivalent to those for other data structuesCreate, Initialize, Search, Empty (returns true if list is empty, false otherwise), Length, Print, etc.
    2. Unique operations: Insert, Delete, Retrieve.   (As I have warned previously, these are not the same operations as given in declarative languages, e.g. the  First(list) and ButFirst(list) of Logo.)
  3. Modifications of the basic list data structure:  Sorted Lists, Rings
4. SORTING:
I do not understand why D&L give such an
oversimplified presentation of algorithm development (a fundamental software development tool)
and then goe into such detail on an advanced subject like sorting.
(However, sorting algorithms are classics, so we can use them as examples of algorithms and learn to trace them.)
  1. "Sorting" is a term used to describe the ordering of the elements in a data structure, a fundamental operation in software development.  There are two different kinds of sorted elements, ascending (smallest to largest) or descending (largest to smallest).   D&L only discuss the sorting of arrays, probably because this is the easiest to illustrate, but it should be remembered that sorting is a fundamental operation on any data structure that can have elements arranged in order.
    1. Using arrays limits D&L's sorting discussion to imperative languages. Sorting in declarative languages is accomplished using lists and recursion.  They have equivalent sorts to those discussed below, but are beyond the scope of COSC 101.
    2. There are numerous sort algorithms, but D&L limit (??) their coverage to three classics, selection sort, bubble sort, and quick sort, all of which use the swap(A,B) procedure we have been dealing with for a long time.
  2. Selection sort is the simplest (and most inefficient) sorting algorithm.   In a selection sort, one element is put into its proper place, in the original, unsorted array, on each pass through the array.  This involves two, nested, definite loops; the outer loop makes swaps the current element with the smallest element in the rest of the array; the inner loop identifies the smallest element in the rest of the array.  (See the following table for the pseudocode for an selection sort.)
    1. Basic operation: The array is traversed repeatedly until it is sorted. On each pass the smallest element in the unsorted part of the array is swapped with the first element of that unsorted part of the current array.
    2. (Not in D&L.) There are several variations of the selection sort that use the same basic idea.  These include:
      1. building the sorted array from the bottom up instead of the top down.
      2. sorting the array from the largest down to the smallest (descending) instead of from smallest to largest (ascending)
  3. Bubble sort, an improvement on selection sort:  The bubble sort is a selection sort that uses a different mechanism for determining the minimum (or maximum) value in an array. 
    1. For an ascending bubble sort, each repetition through the unsorted elements puts the smallest unsorted value in its correct place, but bubble sort also makes changes in the locations of the other elements in the array.  This happens because swapping occurs when the bottom element of an adjacent pair is smaller than the one above it; this causes the smallest element to be swapped up and thus "bubbles up" (hence the name "Bubble sort") during each repetition of the loop until it reaches its proper, sorted position.  
    2. Bubble sort has variations that are similar to those in selection sort.  (See the following table for the pseudocode for an bubble sort.)
  4. Quicksort, an elegant, recursive, divide and conquer sort is based on the fact that it is more efficient to sort two small arrays than one large one.  
    1. Quicksort first splits an array into two "subarrays", so that all of the elements in the first subarray are smaller than any of the elements in the second subarray; then quicksort calls itself on both arrays thus creating four, eight, ... subarrays each with smaller elements than the array following it.  This is repeated until the subarrays result in the whole array being sorted. 
    2. Quicksort is significantly faster than selection sort and bubble sort on truely random arrays, but if an array is already sorted or almost sorted, Quicksort is not efficient; however, the odds that this occurs is small. 
    3. Quicksort has variations that are similar to those in selection sort.   (See the following table for the pseudocode for an bubble sort where the splitting mechanism is not specified because it is too complex and confusing.)
SAQ: To understand SELECTION SORT (for ascending order), "play like" you are developing it.  Analyze your task by "hand sorting" the integers 2, 3, 1, 2, i.e. go through all the unsorted numbers like a computer would, one number at a time, over and over until the numbers are sorted from lowest to highest.

PSEUDOCODE COMPARISON OF ARRAY BASED SORTING ALGORITHMS
(Note that, to simplify the presentation of these algorithm, I have assumed that the arrays start at 1,  as in Pascal or Ada (instead of 0, as in Java and C++).
SELECTION SORT BUBBLE SORT QUICK SORT
for i <- 1 to n
  min <-
1 //initialize index of minimum value
  for j
<- i+1 to n //find minimum value
    if DA[j] < DA[min]
      min
<- j
    endif
  endfor
  Swap(
DA[min], DA[i])
endfor
swapped <- true //initialize looping condition
i <- 1
while i < n and swapped
  swapped <- false; //initialize looping condition
  for j
<- n to i+1, step-1 //loop backwards
    if DA[j] < DA[j-1]     
       Swap(
DA[j], DA[j-1])
      
swapped <- true
    endif
  endfor
 
i <- i + 1 //shrinks the unsorted part
endwhile
procedure QuickSort(DA[],first,last)
  if first < last
    Split (
DA[],first,last,splitPoint)
   
QuickSort(DA[],first,splitPoint-1)
   
QuickSort(DA[],splitPoint+1,last)
  endif
end procedure

procedure Split
(DA[],frst,lst,sP)
//this chooses a value V where the list
//is to be split into two parts so that:
//DA[frst]..DA[sP-1] <= V and
//DA[sP+1]..DA[lst] > V
//This is done using Swap
to exchange
//elements if right value > left value




SAQ: Draw the SAC for each of the sorts tabulated above.
SAQ: Do a trace for a simple set of data, e.g. 3, 1, 3, 2.
SAQ: Rewrite the sorting algorithms, above, so that the first array element has an index of zero, as C based languages do.


5. BINARY SEARCH (OF ORDERED ARRAYS):
  1. The sequential search algorithm (See SAC Example 8) is simple but inefficient for sorted arrays, especially if the element you are looking for is at the end of the array.
  2. The classic binary search is, like Quicksort, a recursive, divide and conquer search of sorted arrays that is based on the fact you can eleminate half of the elements each time your compare the element you are looking for to an element in the array.
  3. You probably already know the fastest way to guess an unknown number between (say) 0 and 100 - if you are told each time you guess whether your are high or low.  You would be using a mental binary search!
  4. The following table shows the recursive concept and its nonrecursive equivalent (which is more efficient):
BINARY SEARCH DATA ARRAY, DA, FOR A "SEARCHVALUE", SV:
CONCEPT (RECURSIVE)
PSEUDOCODE OF ITERATIVE EQUIVALENT
Compare the middle array element to SV
If the middle element = SV
    then stop searching
    else if the middle element > SV
        then binary search the first half of the array
        else binary search the second half of the array
found <- false //initialize looping condition
first <- 1
last <- NumElements
while first < last and not found
  mid <-(first+last) div 2//absolute value
    if DA[mid] = SV    
      then
found <- true
      else if
DA[mid] > SV
       
then last <- mid - 1
        else
first <- mid + 1
      endif
   
endif
endwhile
if found
 
then location = mid
  else
location = 0
endif



SAQ: Do a trace of the pseudocode of the binary search for a search value, SV = 31 on an ordered array of integers from 0 to 100.

6. MORE DYNAMIC, LINEAR ADTS: STACKS AND QUEUES:
  1. A Stack is a destructive, LIFO (last in, first out) ADT.
    1. Stack elements can only be accessed (inserted or retrieved) for the "top" of the Stack.
    2. Operations defined on imperative language stacks, that must be implemented using built in data structures:
      1. Operations that are conceptually equivalent to those for other data structues:  Create, Empty, Length, Sort, etc.
      2. Unique operations:
        1. Push, inserts an element at the "top" of a stack.
        2. Pop  retrieves an element from the "top" of a stack.
    3. A stack, like a queue (below), is "destructive" in that the stack is its characteristic access operations, e.g. the stack after the operations Push or Pop is not the same as the original stack.  This "destructive nature" of operations is NOT characteristic of data structures other than the stack and queue.
  2. A Queue is a destructive FIFO (first in, first out) ADT.
    1. Queue elements can only be inserted at the "end" of the queue and retrieved from the "front" of the queue, like the line at a check out counter.
    2. Operations defined on imperative language queues, that must be implemented using built in data structures:
      1. Operations that are conceptually equivalent to those for other data structues:  Create, Empty, Length, Sort, etc.
      2. Unique operations:
        1. Enqueue inserts an element at the "end" of a queue.
        2. Dequeue retrieves an element from the "front" of a queue.
    3. Modifications of the basic Queue data structure: Priority Queue
    4. A queue, like a stack (above), is "destructive" in that the queue is changed by its characteristic access operations, e.g. the queue after the operations  or  is not the same as the original queue.  This "destructive nature" of operations is NOT characteristic of data structures other than the stack and queue.
7. NONLINEAR DATA STRUCTURES: TREES AND GRAPHS:
  1. Trees:
    1. The tree is a nonlinear, hierarchical ADT of nodes which contain data elements as well as multiple links to other nodes;  they have the following organization:
      1. The tree is accessed via its root node.
      2. The hierarchy is defined by parent-child relationships where
        1. a parent can have any number of children. (However, binary trees, where a parent may have only one or two children are the most commonly used trees.)  Parents and parents of parents are collectively called ancestors.
        2. a child can have only one parent.  (This is what distinguishes a tree from the more general "graph", i.e. a tree is the special case of a graph (a graph where all nodes have only one parent); see section B, below.)
        3. if a child, itself has children, they and all descendants (children and children of children) form a subtree.
        4. The last node in any branch of a tree is called a leaf node.  A branch is the sequence of nodes is a distinct sequence of nodes usually defining a path to a leaf.
        5. The level of a node is the distance (number of nodes) it is from the root; all nodes with the same distance are said to be on the same level.  The height is defined as the highest level of the tree.
        6. For a given number of nodes are numerous possible trees.  (See two possibilities of an 10 node tree in D&L, Figure 9.17.)  The most efficient trees are balanced, sorted trees, with a minimum height
    2. Operations defined on imperative language queues, that must be implemented using built in data structures:
      1. Operations that are conceptually equivalent to those for other data structues:  Create, Empty, etc.
      2. Characteristic operations:
        1. IsLeaf is a Boolean function that returns true is a leaf, i.e. if the node has no children.
        2. DepthFirstSearch traverses every branch completely before backtracking to traverse adjacent branches, until the search value is located (or the complete tree is traversed without finding the value, in which case the search fails).
        3. BreadthFirstSearch traverses every level completely before moving to the next level until the search value is located (or the complete tree is traversed without finding the value, in which case the search fails).
    3. The binary search tree (BST) is very useful special cases of the basic tree ADT which provides a very efficient search for random, sorted data.  A BST is defined by the fact that the value in any node is greater than any value in its left subtree and less than any value in its right subtree.  (Note that there is normally gaps in the values in BSTs.)  D&L, 303-309 gives a rather detailed introduction to the creations and use of BST, with examples, so I will not got into these.  {EXPAND?}
  2. Graphs are the most general dynamic data structure avaialble; in fact tree, lists, stacks, and queues are all special cases of a graph.
    1. A graph is a nonlinear ADT that consists of a collection of vertices and edges.
      1. vertices are nodes that contain data
      2. edges (also called "archs") connect some of the vertices and define relationships between the vertices; see the examples in D&L, Figure 9.21.
        1. undirected graphs have edges that can be traversed in both directions; see D&L, Figure 9.21a.
        2. directed graphs (also called digraphs) have edges that can be traversed in only one direction; see D&L, Figure 9.21c.; the directionality are indicated by arrows.
    2. Operations defined on imperative language queues, that must be implemented using built in data structures:
      1. Operations that are conceptually equivalent to those for other data structues:  Create, Empty, etc.
      2. Characteristic operations: Traverse, DepthFirstSearch, and BreadthFirstSearch all of which are beyond the scope of COSC 101.
SAQ: Draw a Distance table that is equivalent to the graph in D&L, Figure 9.21b.  (Compare this to the distance table in Lab 6A.)
SAQ: Draw the graph that is equivalent to the table that is equivalent to the graph in Lab 6A, item 6 (in lab manual).


8. PROGRAMMING LIBRARIES:
  1. Programming libraries are collections of software modules with well defined APIs (application programming interfaces) that allow modules to be "plugged into" software cod without knowing the details of those modules.  (The names and kinds of modules that are available for use are characteristic of the particular language., but they are, typically, very similar.)   Unfortunately, typical software development courses do not emphasize the use of these libraries; instead they constantly "reinvent the wheel" supposedly for the reason to help the student understand various algorithms and data structures.  There are basically two types of libriaries:
    1. Libraries for structured programming languages consists of procedures and functions.
    2. Libraries for OOPL consists of packages of classes and interfaces; see the Java Class Library Packages in the next section.
  2. Java Class Library Packages:  
    1. Perhaps the primary key to harnessing the power of Java is accessing the vast and growing collection of "packages of classes and interfaces available" as libraries. ("Package" is a reserved word in Java; a package contains classes, other packages, etc.) These effectively extend the small, efficient set of language primitives allowing the developer to efficiently customize the language for the project planned.
      1. The core Java API is the set of built-in packages of the Java language. These allow the developer to select only those packages (or specific classes or interfaces within a package) that are needed for the project being developed. 
      2. The developer can define his/her user-defined packages and use them in the same way a built-in packages.
      3. This idea is extended and refined in the concept of "components" in Java Beans.
    2. The Java Development Kit (built into Borland's JBuilder and all Java IDEs) comes with a set of packages that are built into the language, called the "core Java API" (where API, as usual, stands for "application programming interface"). Any package that begins with "java." is part of the core Java API and is, therefore, available on any platform that supports Java. 
    3. The growth of java's standard class library is illustrated in the following table, taken from the Byte.com article " Getting Started with Java ".

GROWTH OF JAVA'S STANDARD CLASS LIBRARY
Java Version:
1.0
1.1
1.2
Packages
8
23
62
Classes
172
391
1287
Interfaces
40
113
305
Members
2125
5478
18837
    1. Since the JVM is Internet enabled, if a specific class is not locally available, Java will search the Internet to find it.
      1. See the  java.util package, in the next section, that is provided by Java

9. JAVA'S BUILT IN TYPES AND DATA STRUCTURES:

        The built-in data structures are implemented as classes and interfaces collected together in the java.util package. , which contain "the collections framework, legacy collection classes, event model, date and time facilities, internationalization, and miscellaneous utility classes (a string tokenizer, a random-number generator, and a bit array)". Treatment of data structures is beyond the scope of this course (covered a course on its own, COSC 310!), so only the most useful ones are summarized in the following section.  However, one should look over the java.util package  to see how data structures are provided in Java; it is interesting to note which ones are implemented via class and which ones via interfaces.  Java has extensive class libraries of virtually all basic data structures.  User-defined data structures are efficiently implemented via the class construct which encapsulates the                            () and                            ()  characteristic of the object to be instantiated.  


  1. All Java types, other than the preceding primitive types, are classes, either provided by the Core Java Classes or user-defined classes.  For example, strings and arrays are classes; therefore they differ from those in C++, which are primative data types.
  2. Strings, although they are actually instance of the String class, derived from the object class of the fundamental java.lang package, are functionally equivalent to those of C++ and most imperative languages.  (However, they are NOT arrays of characters as in C.) A few of the most useful string characteristics are given below; for a complete specifications, see the references at the end of this section.
    1. String literals (constants) are delimited by double quotes.
    2. String variables can be declared and initialized in one statement, e.g.
      1. String stringName = "Hello, World"
    3. String concatenation is accomplished with the "+" operator, e.g. String stringName = "Hello " + "World!"
      1. If one of the arguments of the "+" operator is a string, the other is automatically to a string.
    4. For a complete, definitive description of strings see Sun's API Specification of the:
      1. String class and
      2. StringBuffer class.
SAQ: 
  1. Arrays, although they are instances of a built-in Java class, are functionally equivalent to those of C++ and most imperative languages, i.e. they are a sequence of values or objects accessible via an integer index.   Compare the array to the vector, in the next section.
    1. Important points to note are that Java arrays...
      1. instances of the Array class derived directly from the fundamental Object class of the  java.lang package.  
      2. are homogeneous, i.e. every element has the same type, which can be Java primative types or instances of classes.
      3. are manipulated by reference, as are all identifers in Java,
      4. are dynamically instantiated by the keyword new Note that creating an array (instantiating an array object) does NOT instantiate the elements of the array; it only creates an instance of the class array. i.e. storage is allocated for object references, not the objects themselves.  Therefore, the constructors of the array elements are not called by the array declaration.  
      5. can be multidimensional (implemented, as in ANSI C, as arrays of arrays), but only the first dimension must have its number of elements specified, e.g.
      6. int matrixOfIntegers[][][] = new int[10][][];
        is legal.
      7. are automatically garbage collected like all unreferenced instances are.
    2. For a typical, detailed introduction to Java arrays, see Arrays in the Java Tutorial.
    3.  For a complete, definitive description of the array class see Sun's API Specification of the Array class.
  2. The   Vector, an innovative, new integration of array and list facilities, is a dynamic, direct access data structure for storing instances (but not primative data types, which must be stored in an array). This is the construct that will be used to implement most user-defined data structures involving instances (objects).
    1. The class Vector is derived from the fundamental Object class of the java.lang package.  The Vector class is part of the java.util package.  Note that although vectors are conceptually similar to arrays, there is no relationship between the Vector class and the Array class in Java.
    2. Vectors are dynamic, i.e. the number of elements can vary.
    3. Elements can be stored and retrieved at the front and back of a vector (like the _______-____________ () data structure of other languages) as well as inserted at any location designated by an index without replacing an existing element.
    4. To declare a vector variable, v, and to create an instance, the syntax is:
Vector v = new Vector();
Note the use of parantheses instead of the brackets that designate arrays in Java (and C++).  For a complete, definitive description of vectors see Sun's API Specification of the Vector class.
  1. The built in class Stack, implementing the LIFO data structure, is derived directly from the Vector class within the java.util packageStack extends Vector by adding methods for ______, ______(6) , peek, etc.
  2. The versatile List data structure is implemented, in Java, via the LinkedList class.  There is also a List interface which is a realization of the more general Collection interface. All of these are, as expected, in the java.util package.   These are, at present, beyond the scope of this course.
SAQ: What is the (a) similarity and (b) differences between an array and a vector in Java?
  1. JAR (Java Archive) Files: JAR is a unique file format, identified by a ".jar" file extension (e.g. a file called myfile.jar would be a JAR file), designed to allow encapsulation of all the files (.class, graphics, audio, etc.) associated with a Java applet This simplifies file transfers because everything needed by an applet is in a single file.  Also, JAR supports data compression, which further decreases download times.  {Note that JAR is a file format, not a data structure used in an algorithm, so it "belongs" in LM XI, FILE SYSTEMS. See PC Magazine, 2/23/99.}
SAQ : There are no records (or structures of C++) in Java. Why?

10. SUMMARY:
  1. ....
  2. The following table summarizes the class operations of a generalized data structure:
IMPERATIVE IMPLEMENTATIONS OF CLASSIC OPERATIONS FOR CLASIC ADTS
(Note that, with the exception of arrays and records, operations are defined as functions (or, optionally, procedures).)

DIRECT ACCESS,
 STATIC (Typically)

SEQUENTIAL ACCESS,
DYNAMIC

NONLINEAR ACCESS, DYNAMIC
GENERIC
ARRAY, A