Feature-oriented programming

Feature Oriented Programming (FOP) or Feature Oriented Software Development (FOSD) is a general paradigm for program synthesis in software product lines.

FOSD arose out of layer-based designs of network protocols and extensible database systems in the late-1980s ^[1] . A program was defined as a stack of layers. Each layer added functionality to previously composed layers and different compositions of layers produced different programs. Not surprisingly, there was a need for a compact language to express such designs. Elementary algebra fit the bill: each layer was function that added new code to an existing program to produce a new program, and a program's design was modeled by an expression, i.e., a composition of functions (layers).

Over time, the idea of layers was generalized to features, where a feature is an increment in program development or functionality. The paradigm for program design and synthesis was recognized to be a generalization of relational query optimization, where query evaluation programs were defined as relational algebra expressions, and query optimization was expression evaluation ^[2]

A software product line (SPL) is a family of programs where each program is defined by a unique composition of features, and no two programs have the same combination of features. FOSD has since evolved into the study of feature modularity, tools, analyses, and design techniques to support feature-based program synthesis.

Further advances in FOSD arose from recognizing the following facts: Every program has multiple representations (e.g., source, makefiles, documentation, etc.) and adding a feature to a program could elaborate each of its representations so that all representations are consistent. Additionally, some of these representations could be generated (or derived) from other representations. In this article, the mathematics of the three most recent generations of FOSD, namely GenVoca^[1], AHEAD^[3], and FOMDD^[4]^[5], are progressively described, and links to product-lines that have been developed using FOSD tools are provided.

Please note this page is under construction (it currently has both omissions and errors). A first draft should be finished by mid-June 2008.

GenVoca

GenVoca (a meld of the names Genesis and Avoca) is a compositional paradigm for defining programs of a product lines ^[1]. Base programs are 0-ary functions or transformations called values:

f -- base program with feature f

h -- base program with feature h

and features are unary functions that elaborate (modify, extend, refine) a program:

i • x -- adds feature i to program x

j • x -- adds feature j to program x

where • denotes function composition. The design of a program is a named expression, e.g.:

p₁ = j • f -- program p₁ has features j and f

p₂ = j • h -- program p₂ has features j and h

p₃ = i • j • h -- program p₃ has features i, j, and h

A GenVoca model of a domain or software product line is a set of values and functions. The programs (expressions) that can be created defines a product line. Expression optimization is program design optimization, and expression evaluation is program synthesis.

Note: GenVoca is based on the step-wise development of programs: a process that emphasizes design simplicity and understandability, which are essential for program comprehension and the automation of program design and development. Consider program p₃: it begins with base program h, then feature j is added (read: the functionality of feature j is added to the codebase of h), and finally feature i is added (read: the functionality of feature i is added to the codebase of j•h).

Note: A more recent formulation of GenVoca is symmetric: there is only one base program, 0 (the empty program), and all features are unary functions. This gives rise to the interpretation that GenVoca composes program structures by superposition, the idea that complex structures are composed by superimposing simpler structures.^[6]

Note: not all combinations of features are meaningful. Feature diagrams (which can be translated into propositional formulas) are graphical representations that define legal combinations of features. ^[7]

Implementation

To illustrate the basic ideas, consider the following Java base class foo, which we will call value r:

class foo {
   int x = 0;
   void inc() { x++; }
}

A feature (a.k.a refinement, extension) of r is shown below in AHEAD syntax. Let's call this feature w, which adds to class foo an 'int y' field, a 'void set()' method, and a refinement of the 'void inc()' method. Method refinements or deltas are written just like method overrides in Java subclassing hierarchies: the phrase 'super.inc()' effectively means substitute the previously-defined body of the 'void inc()' method.

refines class foo {
   int y;
   void set() { y=x; }
   void inc() { y++; super.inc(); }
}

The composition w•r is shown below: the method and field of w is added to foo, and the inc() method is refined:

class foo {
   int x = 0;
   int y;
   set() { y=x; }
   void inc() { y++; x++;  }
}

The AHEAD ^[3] notion of 'refinement' or 'delta' is quite general, and can apply to non-Java program representations as well. For example, below is a base grammar q and its token definitions:

“+” PLUS

Expr : Val
     | Val Opr Expr ;
Val  : INTEGER ;
Opr  : PLUS ;

A refinement of q, here called t, that adds a token MINUS and a new right-hand side to production Opr is:

“-” MINUS

Opr : super
    | MINUS ;

The “super” construct refers to the prior right-hand sides of a production (in this case, Opr). The composition t•q is the composite grammar:

“+” PLUS
“-” MINUS

Expr : Val
     | Val Opr Expr ;
Val  : INTEGER ;
Opr  : PLUS 
     | MINUS ;

The benefit of using similar refinement concepts for different program representations is pragmatic. If each representation had a completely different way to express refinements, the ability of any individual to understand all of them and use them effectively rapidly diminishes. Our experience is that uniformity contributes to understandability and simplicity.

Note: FOSD does not preclude other and more sophisticated ways of defining refinements. Aspects and rule sets of transformation systems are examples. Both technologies could be (and have been) uniformly applied to all kinds of program representations. So GenVoca captures the essence of these technologies, where the examples above are just one possible implementation.

AHEAD

Algebraic Hierarchical Equations for Application Design (AHEAD) ^[3] generalizes GenVoca in two ways. First it reveals the internal structure of GenVoca values as tuples. Every program has multiple representations, such as source, documentation, bytecode, and makefiles. A GenVoca value is a tuple of program representations. In a product line of parsers, for example, a base parser f is defined by its grammar g_f, Java source s_f, and documentation d_f. Program f is modeled by the tuple f=[g_f, s_f, d_f]. Each program representation may have subrepresentations, and they too may have subrepresentations, recursively. In general, a GenVoca value is a tuple of nested tuples that define a hierarchy of representations for a particular program.

Example. Suppose terminal representations are files. In AHEAD, grammar g_f corresponds to a single BNF file, source s_f corresponds to a tuple of Java files [c₁…c_n], and documentation d_f is a tuple of HTML files [h₁…h_k]. GenVoca values (nested tuples) can be depicted as directed graphs: the graph for program f is shown in the figure below. Arrows denote projections, i.e., mappings from a tuple to one of its components. AHEAD implements tuples as file directories, so f is a directory containing file g_f and subdirectories s_f and d_f. Similarly, directory s_f contains files c₁…c_n, and directory df contains files h₁…h_k.

Hierarchical Relationships among Program Artifacts

Note: Files can be hierarchically decomposed further. Each Java class can be decomposed into a tuple of members and other class declarations (e.g., initialization blocks, etc.).

Second, AHEAD expresses features as nested tuples of unary functions called deltas. Deltas can be program refinements (semantics-preserving transformations), extensions (semantics-extending transformations), or interactions (semantics-altering transformations). We use the neutral term “delta” to represent all of these possibilities, as each occurs in FOSD.

As an example, suppose feature j extends a grammar by $\Delta$ g_j (new rules and tokens are added), extends source code by $\Delta$ s_j (new classes and members are added and existing methods are modified), and extends documentation by $\Delta$ d_j. The tuple of deltas for feature j is modeled by j=[ $\Delta$ g_j, $\Delta$ s_j, $\Delta$ d_j], which we call a delta tuple. Elements of delta tuples can themselves be delta tuples. For example, $\Delta$ s_j represents the changes that are made to each class in s_f by feature j, i.e., $\Delta$ s_j=[ $\Delta$ c₁… $\Delta$ c_n]. The representations of a program are computed recursively by composing tuples element-wise. The representations for parser p (whose GenVoca expression is j•f) are:

  p₂ = j • f -- GenVoca expression

  = [  $\Delta$ g_j,  $\Delta$ s_j,  $\Delta$ d_j] • [g_f, s_f, d_f] -- substitution

  = [ $\Delta$ g_j•g_f,  $\Delta$ s_j•s_f,  $\Delta$ d_j•d_f] -- compose tuples element-wise

That is, the grammar of p is the base grammar composed with its extension ( $\Delta$ g_j•g_f), the source of p is the base source composed with its extension ( $\Delta$ s_j•s_f), and so on. As elements of delta tuples can themselves be delta tuples, composition recurses, e.g., $\Delta$ s_j•s_f= [ $\Delta$ c₁… $\Delta$ c_n]•[c₁…c_n]=[ $\Delta$ c₁•c₁… $\Delta$ c_n•c_n]. Summarizing, GenVoca values are nested tuples of program artifacts, and features are nested delta tuples, where • recursively composes them. This is the essence of AHEAD.

FOMDD

Feature Oriented Model Driven Design (FOMDD) ^[4] ^[5] combines the ideas of AHEAD with Model Driven Design (MDD) (a.k.a. Model-driven architecture (MDA)). AHEAD functions capture the lockstep update of program artifacts. But there are other functional relationships among program artifacts that express derivations. For example, the relationship between a grammar g_f and its parser source s_f is defined by the tool javacc. Similarly, the relationship between Java source s_f and its bytecode b_f is defined by the javac compiler. A commuting diagram expresses these relationships. Objects are program representations, downward arrows are derivations, and horizontal arrows are deltas. The figure below shows the commuting diagram for program p₃ = i•j•h = [g₃,s₃,b₃].

Derivational and Refinement Relationships among Program Artifacts

A fundamental property of a commuting diagram is that all paths between two objects are equivalent. For example, one way to derive the bytecode b₃ of parser p₃ (lower right object in the above figure) from grammar g_f of parser f (upper left object) is to derive the bytecode b_f and refine to b₃, while another way refines g_f to g₃, and then derive b₃:

   $\Delta$ b_i •  $\Delta$ b_j • javac • javacc = javac • javacc •  $\Delta$ g_i •  $\Delta$ g_j

There are ${\tbinom {4}{2}}$ possible paths to derive the bytecode b₃ of parser p₃ from the grammar g_f of parser f. Each path represents a metaprogram whose execution synthesizes the target object (b₃) from the starting object (g_f). There is a potential optimization: traversing each arrow of a commuting diagram has a cost. The cheapest (i.e., shortest) path between two objects in a commuting diagram is a geodesic, which represents the most efficient metaprogram that produces the target object from a given object.

Note: the basic ideas we sketch here align with elementary ideas from Category theory ^[4]^[5]

Applications

To be supplied.

References

^ ^a ^b ^c The Design and Implementation of Hierarchical Software Systems with Reusable Components ftp://ftp.cs.utexas.edu/pub/predator/tosem-92.pdf
^ Access Path Selection In Relational Databases http://portal.acm.org/citation.cfm?id=582099
^ ^a ^b ^c Scaling Step-Wise Refinement ftp://ftp.cs.utexas.edu/pub/predator/TSE-AHEAD.pdf
^ ^a ^b ^c Feature Oriented Model Driven Development: A Case Study for Portlets ftp://ftp.cs.utexas.edu/pub/predator/ICSE07.pdf
^ ^a ^b ^c Generative Metaprogramming http://portal.acm.org/citation.cfm?id=1289971.1289990
^ Superimposition: A Language-Independent Approach to Software Composition http://www.infosun.fim.uni-passau.de/cl/publications/docs/SC2008.pdf
^ Feature Models, Grammars, and Propositional Formulas ftp://ftp.cs.utexas.edu/pub/predator/splc05.pdf

[genvoca-1] The Design and Implementation of Hierarchical Software Systems with Reusable Components ftp://ftp.cs.utexas.edu/pub/predator/tosem-92.pdf

[selinger-2] Access Path Selection In Relational Databases http://portal.acm.org/citation.cfm?id=582099

[ahead-3] Scaling Step-Wise Refinement ftp://ftp.cs.utexas.edu/pub/predator/TSE-AHEAD.pdf

[fomdd-4] Feature Oriented Model Driven Development: A Case Study for Portlets ftp://ftp.cs.utexas.edu/pub/predator/ICSE07.pdf

[genmeta-5] Generative Metaprogramming http://portal.acm.org/citation.cfm?id=1289971.1289990

[symmetric-6] Superimposition: A Language-Independent Approach to Software Composition http://www.infosun.fim.uni-passau.de/cl/publications/docs/SC2008.pdf

[splc-7] Feature Models, Grammars, and Propositional Formulas ftp://ftp.cs.utexas.edu/pub/predator/splc05.pdf

[1]

[2]

[3]

[4]

[5]

[6]

[7]