: 10/26/04; 2/9/05; 8/5/05; 1/15/06; 11/9/06; 11/30/07
MODULARIZATION OF SOFTWARE
ALGORITHM-ORIENTED PARADIGM vs OBJECT-ORIENTED PARADIGM

(Note that, in the following presentation, "algorithm-oriented programming (AOP)" is not a universally recognized term, but
 it is more characteristic of the programming paradigm than the commonly used term "structured programming" is.)

"Divide and conquer" is the foundation of all problem-solving activities, from writing (beginning with an outline) to engineering (dividing a task into subtasks).  In computer science, the subdivision of software into highly cohesive, loosely coupled modules has always been the essence of the divide and conquer approach to software development (especially by cooperating teams).  In the early days of computing (as well as in current university introductory computer science curricula), programming was essentially an individual undertaking.  However, as software became more sophisticated, real-world software development evolved into a team-oriented activity.  Consequently, "computer programming" has largely been replaced by "Software Engineering", a strategy based on reusing existing software and a team approach to software development.  This emphasizes the role of modules (self-contained subdivisions of software) because engineers "build" things with modules, e.g. electrical engineers use "off the shelf" chips to build electronic circuits.  In an analogous way, software engineers have always used software modules (often "off the shelf" themselves) to "build" software applications - after all, "why reinvent the wheel?"

There are two basically different software modularization strategies: (1) algorithm-oriented programming (AOP) and (2) object-oriented software development (OOSD).  In the AO paradigm (view or strategy), often called "structured programming" (SP), the modules are distinct subdivisions of the overall algorithm; each subdivision, called a "procedure" or "function", represents a distinct task or subalgorithm.  Therefore, modularization in AOP involves developing a hierarchy of tasks, i.e. identifying distinct subtasks within the overall algorithm and encapsulating them as procedures or functions.  (Note the focus on "tasks", i.e. modules which can be represented by algorithms.)

Modularization in OOSD involves modeling the real world with abstract objects (called "classes"), which encapsulate an object's state (the data specifying it) and behavior.  (Note the focus on "modeling" of "objects" (things) in the OO paradigm rather than the delineation of "tasks" (analogous to recipes) characteristic of AOP.)  However, both of these approaches have the same goal: the separation of software into distinct modules that can be developed independently and later integrated into complex software applications.  In AOP, procedures and functions are integrated in a program; in OOSD, classes are integrated in software architectures (collections of related classes).  In the following presentation we illustrate these two approaches to modularization with a very simple, easily visualized example, the separation of input/output from the processing - something that can be done in any software.   ...

1. TWO MODULARIZATION STRATEGIES, ALGORITHM-ORIENTED PROGRAMMING (AOP) AND OO SOFTWARE DEVELOPMENT (OOSD):

The most fundamental category of software modularization, in any paradigm, is the separation of the input and output (I/O) of data from the actual processing of that data.  How one obtains data and displays results is independent of how one processes that data, so the separation of I/O and processing into different modules can be done in ALL software development.  Usually, both the input/output and processing modules are, themselves, subdivided into distinct submodules, but we will wait to consider such subdivisions until the next section in order to emphasize the critical point: modularization is an essential feature of software development in any paradigm.  Modularization is most effectively illustrated with language-independent representations, i.e. representations that are not characteristic of any particular computer language.  In "programming", language-independent modularization of algorithms is best presented with flowcharts, illustrated in the left column of the following figure.  This illustration shows a very basic input-process-output algorithm.  The object-oriented equivalent is a software architecture, illustrated in the right column of the same figure; it consists of an input/output object interacting with a processing object, i.e. it "models" the relationship between the two objects.  Thus, the programming and OO (modeling) approaches to modularization of software are pictorially compared in Figure MOS-1.

FIGURE MOS-1
Comparison of the "Program" and "OO Architecture" Concepts

Flowchart of a Traditional Program
  1. This diagram represents a simplified, top level algorithm of a program.
  2. Each of these elements represents a procedure (subalgorithm) which can contain subalgorithms itself, e.g. Input could have a subroutine that validates the data entered and requests reentry of invalid data.
  3. This representation of our software illustrates the algorithm of the program, a __-this approach. (Hint: it's a two-letter verb.)
  4. A real world analogy for a program would be a ______. (Hint: it's used in the kitchen.)
  5. TPQ: What part of speech are the words "Input", "Process", and "Output" in the context of an algorithm?

Equivalent Object Architecture
  1. This diagram represents a simplified model of a software architecture; the UML is a sophisticated, standard way of doing this.
  2. Each of these rectangles represents an abstract object which encapsulates (1) attributes (Data) that specify the state of the object and (2) operations (e.g. Data input and output) that specify the behavior of the object.  (In OOSD, abstract objects are called classes.)
  3. This representation of our software illustrates the static structure of our architecture, a _____-this approach. (Hint: it's the verb form of a noun used above.)
  4. A real world analogy for an OO architecture would be a _______________.
  5. TPQ: What part of speech are the words "Processing" and "Input/Output" in the context of an object diagram?

These schematics emphasize the two fundamentally different approaches to modularization.  The older programming approach modularizes the "algorithm" (analogous to a recipe, or a "do this" approach) whereas the OO approach modularizes the "software architecture" (analogous to building blocks, or a "model this" approach).  The OO modeling approach is accomplished by a "software engineer" building a software architecture.  However, both approaches have the same goal, the separation of software into highly cohesive (self-contained), loosely coupled (relatively independent) modules; this allows any module to be tested, modified, or even replaced with minimal effect on the rest of the software.  See the definition of loose coupling in the "Web Services Glossary" of the W3C.  High cohesion and loose coupling can be exemplified by considering a very simple task, the calculation of the area and circumference of a circle.  This example is modularized two ways in the following two sections.

The hints in item 3 of both of the preceding illustrations provide a segue to a guideline for naming modules, i.e. procedures (which "do" things) should be named with verbs whereas objects (which "are" things) should be named with nouns.  It is a bit premature, at this point, to discuss comprehensive naming guidelines, but we will return to this subject in AN INTUITIVE INTRODUCTION TO OBJECT ORIENTED SOFTWARE DEVELOPMENT USING THE UML.

2. A SIMPLE EXAMPLE OF MODULARIZATION IN STRUCTURED PROGRAMMING:

In this purposefully simplified example, we develop software, named CIRCLE ANALYZER, that, given the value of the radius, outputs information about a circle.  The monolithic algorithm (where everything is lumped together in a single algorithm) is illustrated by the following flowchart.

  

FIGURE MOS-2
FLOWCHART FOR THE PROGRAM "CIRCLE ANALYZER"




However, it makes sense to separate the input and output instructions from the processing instructions.  The calculations are independent of how the value of the radius is obtained or how the results are presented, e.g. the radius value could be input by a human or it could be automatically read from a file of data.  In fact, human I/O can be accomplished via a command line interface, graphical user interface, audio interface, etc.; it makes no difference to the processing as long as the data is supplied to it.  This is illustrated by the following modularized flowchart, where the calculations are done by the module CALCULATE, called a procedure.

 
FIGURE MOS-3
FLOWCHART OF A MODULARIZED ALGORITHM, "CIRCLE ANALYZER",
WITH ITS SUBALGORITHM "CALCULATE"





Figure MOS-3 shows the flowcharts of the main program, CIRCLE ANALYZER, and its procedure CALCULATE.  The value of a circle's radius is input via the procedure INPUT and passed, via the input parameter r, to the procedure CALCULATE, in which the area and circumference are calculated and passed back to the main program via the output parameters area and circum.  Finally, the main program outputs the values of area and circum to the user.  The key point is that the algorithm is modularized via subalgorithms, but there is no fundamental association between the data (the radius r), which is specified in the program, and the operations (the calculation of area and circum), which are defined in the procedure, i.e. the data and the operations performed on the data are essentially independent.  The association between data and behavior is formalized in the object-oriented paradigm, illustrated in the next section.
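
The modularized algorithm of Figure MOS-3 might be sketched in code roughly as follows.  This is only an illustrative sketch in Python, not part of the original flowcharts: the names input_radius, output_results, and circle_analyzer are hypothetical stand-ins for the Input step, the Output step, and the main program, and, because Python has no output parameters, calculate returns area and circum as a pair instead of passing them back through parameters.  The radius is supplied as text rather than read from a keyboard so that the sketch stays independent of any particular I/O mechanism.

```python
import math


def calculate(r):
    """Procedure CALCULATE: receives the radius r, hands back area and circum.

    Assumption: the two results are returned as a tuple, standing in for
    the output parameters area and circum of the flowchart.
    """
    area = math.pi * r ** 2
    circum = 2 * math.pi * r
    return area, circum


def input_radius(text):
    """Hypothetical stand-in for the Input step: converts entered text to a number."""
    return float(text)


def output_results(area, circum):
    """Hypothetical stand-in for the Output step: formats the results for display."""
    return f"area = {area:.2f}, circumference = {circum:.2f}"


def circle_analyzer(text):
    """Main program: input, then processing, then output."""
    r = input_radius(text)                 # input
    area, circum = calculate(r)            # processing, done by the procedure
    return output_results(area, circum)    # output


print(circle_analyzer("1.0"))  # prints: area = 3.14, circumference = 6.28
```

Note that r, area, and circum are ordinary variables that the main program must hand to, and collect from, the procedure; nothing in the code ties them to any particular circle.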

3. EQUIVALENT SIMPLE EXAMPLE OF MODULARIZATION  USING CLASSES:

The equivalent OO model of Figure MOS-3 is given in the following UML class diagram, Figure MOS-4, where the user interacts with the UserInterface class instead of with the program as in Figure MOS-3.  Actually it is the main() operation of UserInterface that connects the user with the Circle class, or, more accurately, with circle objects.  Although it is not obvious from the class diagram, when main() is executed, the user creates a circle object (an "instance" of the class Circle) which has a particular value for its radius.  In this way UserInterface "uses-a" Circle; this is the "dependency" relationship between classes (i.e. UserInterface depends on Circle), which is represented by the dashed arrow pointing from UserInterface to Circle.  (NOTE: we could have included values for other data, besides r, such as the location of the circle's center, or the circle's color, etc., but we are keeping this example as simple as possible in order to illustrate and emphasize the important distinctions between modularization in AOP and modularization in OOSD.)

The most obvious difference between the two representations of the circle analyzer software is that the flowcharts represent algorithms (programs, procedures, or functions) whereas the class diagrams represent abstract objects (classes) and their relationships with each other.  It should be noted that the class architecture is a "higher level view" than the algorithm, i.e. it illustrates the relationships between classes and the encapsulation of state and behavior within each class.  (The algorithmic view does not, indeed cannot, illustrate such characteristics; algorithms are only "recipes" of what to do, not properties of objects.)  In our example, there is only one relationship between classes, dependency, i.e. the class UserInterface depends on the class Circle to get the results to be output to the user.  The attribute r is encapsulated (with circum() and area()) within Circle, and main() is encapsulated within UserInterface.  The processing, which occurs in the operations within Circle, is largely independent of (only loosely related, via dependency, to) UserInterface.  On the other hand, each class must be highly cohesive, i.e. contain everything essential to describe objects of that type.  The fact that the data and operations are encapsulated (fundamentally "bound" together) makes it even easier to modify the modules (or replace them) than in structured programming.  In this example, this means that either class (UserInterface or Circle) can be "unplugged" and replaced by another class that does the same things in different ways, without affecting the other classes in the architecture, e.g. the class UserInterface could be replaced by classes like GraphicalUserInterface, CommandLineInterface, AudioInterface, etc. as long as they interact with Circle in the same way.

FIGURE MOS-4
UML CLASS DIAGRAM OF THE
OO SOFTWARE ARCHITECTURE OF A CIRCLE ANALYZER



   

It is very important to recognize that the "operations" circum() and area() have algorithms that are very similar to the instructions of the procedure CALCULATE (therefore the rules of structured algorithm development apply in OOSD as well); however, the essential difference is that circum() and area() are distinct and they "belong" to the class Circle, i.e. they are the "behaviors" of Circle.  To find the circumference or area of a "particular" circle "object", e.g. testCircle (an instance of Circle that the user creates while running main()), you send a message to testCircle asking it to report its values, i.e. testCircle.circum() or testCircle.area().  On the other hand, in Figure MOS-3, CALCULATE(r, area, circum) belongs to the program, i.e. it is not associated with any circle; even the radius r is not associated with any particular circle.  It should also be noted that, as defined, the operations circum() and area() of Circle do not have an argument, r, like the procedure CALCULATE(r, area, circum) does; the operations of Circle do not need arguments to supply the data because the data (the value of r) and the operations are encapsulated together within the class/object.  (It is also worth noting that circum() and area() have a "functional" format, i.e. they return values via their names, unlike the "procedure" CALCULATE(r, area, circum), which returns values via the parameters area and circum.  However, the distinction between functions and procedures is irrelevant to the purpose of this presentation, which is to distinguish the two types of software modularization.)
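
The class architecture of Figure MOS-4 might be sketched in code roughly as follows.  This is only an illustrative sketch in Python, under a few assumptions: main() is given a text argument in place of real user input, the object testCircle is spelled test_circle to follow Python naming habits, and the output format is invented for the example.  Python also requires each operation to name the object explicitly (self), but a caller still supplies no radius when asking a circle for its area or circumference.

```python
import math


class Circle:
    """Abstract object (class): encapsulates the attribute r with its operations."""

    def __init__(self, r):
        self.r = r  # attribute: the state of this particular circle

    def circum(self):
        """Operation: the circle reports its own circumference."""
        return 2 * math.pi * self.r

    def area(self):
        """Operation: the circle reports its own area."""
        return math.pi * self.r ** 2


class UserInterface:
    """Depends on ("uses-a") Circle; plays the role of the main program."""

    @staticmethod
    def main(text):
        # Assumption: the radius arrives as text instead of from a keyboard.
        test_circle = Circle(float(text))  # create an instance of Circle
        # Send messages to the object; no radius is passed as an argument.
        return (f"area = {test_circle.area():.2f}, "
                f"circumference = {test_circle.circum():.2f}")


print(UserInterface.main("1.0"))  # prints: area = 3.14, circumference = 6.28
```

The formulas inside circum() and area() are the same ones a procedure like CALCULATE would use; what differs is that they read the radius from the object they belong to.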

Since it is independent of the input mechanism, Circle could be used with any kind of I/O interface instead of the UserInterface class defined above.  For example, a Web page can be used to input data and display results; for the languages JavaScript and Java, such I/O facilities are actually built into HTML.  High level programming languages like C++, Java, etc. have "constructs" for creating different kinds of I/O objects.  However, regardless of the differences in I/O mechanisms, the processing class, Circle, is still the same.  In C++ the most basic, built-in module is the "main" function, analogous to the main program in older "programming" languages, so, in C++, the UserInterface class above would be replaced by the function main.  (Actually main, in C++, is a legacy of the language C, which is NOT object oriented!)  In Java, a pure OO language, main is a method that must be encapsulated within a class; therefore, the UML class diagram above could be implemented directly in Java, and there could be different classes for a GUI, a CLI, etc. - all using the same class Circle.
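
The idea of "unplugging" one interface and substituting another can be sketched roughly as follows.  This is a hedged illustration in Python rather than C++ or Java; CommandLineInterface is one of the class names suggested in this presentation, whereas BatchInterface, the main() signatures, and the output formats are hypothetical choices made only for the example.  Both interface classes use one and the same Circle class.

```python
import math


class Circle:
    """The processing class; it knows nothing about how its data arrives."""

    def __init__(self, r):
        self.r = r

    def circum(self):
        return 2 * math.pi * self.r

    def area(self):
        return math.pi * self.r ** 2


class CommandLineInterface:
    """One interface: takes a single radius as text, returns a line of text."""

    def main(self, text):
        c = Circle(float(text))
        return f"r={c.r}: area={c.area():.2f}, circum={c.circum():.2f}"


class BatchInterface:
    """Hypothetical second interface: takes many radii, e.g. lines of a data file."""

    def main(self, lines):
        circles = [Circle(float(s)) for s in lines]
        # Return one (radius, area, circumference) row per circle.
        return [(c.r, c.area(), c.circum()) for c in circles]


print(CommandLineInterface().main("1.0"))
for row in BatchInterface().main(["1.0", "2.0"]):
    print(row)
```

Swapping one interface class for the other requires no change to Circle, because each interface depends on Circle only through its constructor and the operations circum() and area().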

The fact that an object has an identity (its name), data (values of its attributes), and a set of behaviors (its operations) hints at the strategy for developing OO software.  It suggests that one first has to identify the abstract objects of the software architecture, then identify the attributes required to specify the data essential to each object, and finally identify the operations that define the required behaviors of each object; during this process one will, of necessity, identify relationships between the abstract objects that specify interactions among the objects.  It is worth noting that software developers rarely try to model "everything" in a real world system; they only model the objects that are relevant to the limited world of the software.  It is also important to recognize that, once identified, operations need to be completely specified; this requires algorithm development, so algorithm development doesn't disappear in OOSD; it only takes on a secondary, supportive role in the larger strategy of OOSD.  This strategy will be amplified in subsequent presentations.

4. SUMMARY:

The separation of monolithic software into highly cohesive, loosely coupled modules is the essence of the "divide and conquer" approach to software development (where complex tasks are divided into subtasks for more efficient, team-oriented development).  In any software development strategy, all modules should be "highly cohesive" (self-contained) and "loosely coupled" (relatively independent of each other) so that they can be tested, modified, or even replaced with minimal effect on the rest of the software.  The preceding presentation introduced and illustrated the concept of modularization of software by looking at the simplest and most fundamental modularization technique, the separation of input and output from processing.  Not all software development environments have distinct I/O mechanisms; however, all software can be "viewed" in terms of an I/O module and one or more processing modules, any of which can be subdivided into "lower level" modules.  There are two fundamental strategies for modularizing software, structured programming and object-oriented software development.

  1. In structured programming, modularization is accomplished by dividing the main algorithm into subalgorithms called procedures or functions.  These procedures and functions can, themselves, be modularized by creating procedures or functions within them.
    1. In structured programming, there is no fundamental relationship between data and behavior, i.e. there is no programming language facility that binds associated data and operations together.  In fact, before the advent of object technology, data and operations were distinct quantities of equal importance.  Therefore the association of data and its behavior must be controlled by the program itself, i.e. data must be passed to procedures or functions via arguments or parameters.
  2. In object-oriented software development, a  real world system is modeled by a software architecture which is modularized into classes (abstract templates of real objects) that are "highly cohesive and loosely related". When we "model" a system in OO software, we specify the state of an object with its attributes and the behavior of that object with its operations/methods; we only need to specify the attributes and operations/methods that are necessary to our model.  For example if we are modeling a system that manipulates geometric objects, we could omit their colors if they are irrelevant to our task.
    1. This presentation uses a very simplified example that only uses one (albeit the most important) OO  classifier, the class.  There are other classifiers, e.g. interfaces, packages, etc. that are beyond the scope of this presentation; they will be discussed, briefly, in the following presentation.
    2. The OO paradigm is a higher level view of software than the algorithmic view of structured programming.
    3. Unlike structured programming, there is a fundamental connection between data and behavior in OOSD, i.e. the class construct encapsulates related data and operations.  Consequently, there is no need for arguments or parameters to pass an object's own data to its operations, because that data is "bound" to the operations that define its behavior by being encapsulated together with them within a class.

SAQ: Summarize the essential point of this presentation in two sentences (semicolons allowed).

See the ONLINE ASSESSMENT  for this presentation.  It has hints and answers.  Working through this should help you "UNDERSTAND" the knowledge contained in this presentation.

{NOT PART OF THIS PRESENTATION.  Use as an exercise or place this elsewhere, when talking about generic I/O modules}   In my example, the operations InputData() and OutputInfo() are the interface to the class Process; other classes can use the Process class if (and only if) they access it via these "interface operations".  In fact, there can be several versions of the class UserInterface, e.g. a GUI, CommandLineInterface, AudioInterface, etc.; all they have to do is use InputData() and OutputInfo() to access Process. ...  testProcess is a real object (an "instance" of the class Process); it actually has specific values for the Data. ...The role of the program, in the OO architecture, is carried out by the operation/method  Main()....