Dienstag, 31. Juli 2007

OCL Tools proposal in Eclipse

I've submitted (along with A. Jibran Shidqie) a proposal to create an OCL Tools component at the MDT project. The official announcement has been made today on the eclipse.modeling.mdt newsgroup; you can join the discussion there.

The text of the proposal follows:


OCL Tools


I. Introduction


OCL Tools is a proposed open source component
in the Model Development Tools (MDT) Project to support the editing, refactoring, code generation, execution, and interactive debugging of the OCL constraints given for some underlying (Ecore or UML2) class model, building upon the infrastructure provided by the MDT OCL component.

This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process document) and is written to declare its intent and scope. This proposal is also written to solicit additional participation and input from the Eclipse community. You are invited to comment on and/or join the component. Please send all feedback to the eclipse.modeling.mdt newsgroup.



II. Background


The Eclipse infrastructure for modeling is based on EMF and UML2, with support
for OCL 2.0 (Object Constraint Language) provided by the Model Development Tools (MDT) project. The expressive power of OCL allows capturing a sizable amount of development
requirements in a declarative manner:

  • (a.1) data modeling aspects can be expressed as an (Ecore or UML2) data
    schema further constrained by OCL invariants;


  • (a.2) a number of functional requirements can be specified as
    operation pre- and postconditions, together with


  • (a.3) side-effect-free queries (defining the values of instance
    fields and operations).

As of now, OCL constraints can be evaluated at runtime by:

  • (b.1) parsing their textual representation into an Abstract Syntax Tree (AST)
    representation, and


  • (b.2) invoking an EvaluationVisitor (also provided by MDT OCL)
    on the AST representation. Evaluation takes place in the context of an
    object population.


The steps required to realize (b.1) and (b.2) are explained in the references listed below. Carrying out such steps presupposes knowledge well beyond writing OCL expressions.
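To give a flavor of what (b.1) and (b.2) involve, here is a minimal sketch using the Ecore binding of MDT OCL (along the lines of [3]); the context classifier, the instance, and the invariant text are placeholders supplied by the caller:

import org.eclipse.emf.ecore.EClassifier;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.ocl.ParserException;
import org.eclipse.ocl.ecore.Constraint;
import org.eclipse.ocl.ecore.EcoreEnvironmentFactory;
import org.eclipse.ocl.ecore.OCL;
import org.eclipse.ocl.helper.OCLHelper;

public class OclEvaluationSketch {
    /** (b.1) parses an invariant given as text, (b.2) evaluates it on one object. */
    public static boolean checkInvariant(EClassifier context, EObject instance,
            String invariantText) throws ParserException {
        OCL ocl = OCL.newInstance(EcoreEnvironmentFactory.INSTANCE);
        OCLHelper<EClassifier, ?, ?, Constraint> helper = ocl.createOCLHelper();
        helper.setContext(context);
        Constraint invariant = helper.createInvariant(invariantText); // text -> CST -> AST
        return ocl.check(instance, invariant); // an EvaluationVisitor runs under the hood
    }
}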

In order to bring usability of OCL specs on par with that of EMF and UML2,
at the very least an OCL compiler is necessary, building upon (Ecore,UML2) ->
Java code generation and weaving (a.1) through (a.3) into the
resulting Java code. In particular, the generated code can then be
directly used as the Model component in an MVC architecture (for
example, as part of a graphical editor generated with Eclipse GMF).
Additionally, compiled OCL expressions execute faster (twice as fast on
average) than their interpreted counterparts. In order to compare the performance of alternative compilation algorithms, a benchmark suite of OCL expressions would be desirable.

For OCL specifications to be seamlessly integrated in an MDSD toolchain, tools beyond a batch compiler are necessary: a custom text editor and source-level debugging support, i.e. IDE capabilities.


III. Scope


The objectives of the OCL Tools component are to:
  • (d.1) provide a working implementation of a compilation algorithm
    targeting both EMF and UML2. That algorithm (in its EMF specialization)
    has been described in a position paper [2] submitted to the Modeling Symposium
    at Eclipse Summit Europe 2007 and has been implemented in the code contribution mentioned below. A related Eclipse Technical Article ("How to Process OCL Abstract Syntax Trees") [4] has been contributed by one of the authors of
    this proposal, easing the learning curve for future contributors.


  • (d.2) provide sample (Ecore and UML2) + OCL models to speed up the adoption by
    first-time users of OCL Tools. Initially, two non-trivial models are to be made available:


    • (d.2.a) Royal & Loyal, described in the book on OCL by Warmer and Kleppe; and


    • (d.2.b) a metamodel of JPQL (the Java Persistence Query Language), which encodes as OCL invariants all the well-formedness rules ("static semantics")
      stated as English sentences in JSR-220 (described in this paper).



  • (d.3) solicit contributions to adapt the compilation algorithm
    embedded in our implementation to other domains:



    • (d.3.a) first and foremost, to accept UML2 class models as input,


    • (d.3.b) targeting object models other than Java
      (for example, by translating OCL constraints into the
      query language of Java Persistence API, the Object-Relational Mapping
      technology standardized by JSR-220).




  • (d.4) provide IDE capability:

    • (d.4.a) a text editor supporting usability features such as syntax-directed completion, markers for violations of well-formedness, use-defs navigation, folding, and structural views (e.g., OCLASTView [4])


    • (d.4.b) refactoring support

    • (d.4.c) source-level, interactive debugging




IV. Description

This section focuses on the functionality available in the code contribution.
Pointers to related work are included in the next section ("Relationship with other Eclipse-based projects").

An OCL compiler is available, which follows the classical software architecture of a
batch compiler [1]:

  • (e.1) a front-end component, shared among all
    "target architectures" (EMF and UML2). During the front-end phase,
    textual input is parsed into Concrete Syntax Trees
    (CSTs), which once validated are used to build the ASTs
    that compilation proper takes as input. These two activities involve a
    number of sub-steps, such as resolving mutual forward-references and
    collecting error markers. These sub-steps are realized with
    building blocks provided by MDT OCL, while bookkeeping and glue code
    has been added to hold everything together.


  • (e.2) the in-memory copy of the input model (our compiler does not
    modify the actual input) is decorated with EAnnotations that are
    taken later by EMF CodeGen (and in the future by UML2 CodeGen)
    as the source of method bodies and
    additional declarations: helper methods, helper types (e.g.
    those resulting from OCL tuples), and helper utilities (e.g. those to
    track all instances in a ResourceSet, realizing OCL's allInstances()).
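As an illustration of (e.2), the following sketch attaches a body annotation to an EOperation using the standard GenModel annotation source that EMF CodeGen picks up; the operation and the body text are placeholders, not actual output of the contributed compiler:

import org.eclipse.emf.ecore.EAnnotation;
import org.eclipse.emf.ecore.EOperation;
import org.eclipse.emf.ecore.EcoreFactory;

public class BodyAnnotationSketch {
    private static final String GEN_MODEL = "http://www.eclipse.org/emf/2002/GenModel";

    /** Records a Java method body on an EOperation so that EMF CodeGen emits it. */
    public static void setBody(EOperation operation, String javaBody) {
        EAnnotation ann = operation.getEAnnotation(GEN_MODEL);
        if (ann == null) {
            ann = EcoreFactory.eINSTANCE.createEAnnotation();
            ann.setSource(GEN_MODEL);
            operation.getEAnnotations().add(ann);
        }
        ann.getDetails().put("body", javaBody); // e.g. Java translated from an OCL body expression
    }
}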


Some patterns exhibited by the resulting code are (a more
comprehensive exposition can be found in [2]):



  • (f.1) OCL preconditions and postconditions are translated into Java
    assert statements


  • (f.2) OCL define and derive statements are translated into getters
    (for properties) and into method bodies (for operations)


  • (f.3) OCL invariants are translated into dedicated Java methods. It is
    the responsibility of the user to invoke them at times deemed
    appropriate.
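To make (f.1) through (f.3) concrete, here is a hand-written approximation for a hypothetical class Person with an attribute age; it is illustrative only and is not the actual output of the contributed compiler:

public class Person {
    protected int age;

    /** (f.2): the derived property "def: adult : Boolean = self.age >= 18" becomes a getter body. */
    public boolean isAdult() {
        return age >= 18;
    }

    /** (f.1): pre- and postconditions become assert statements around the operation body. */
    public void birthday() {
        assert age >= 0 : "pre: self.age >= 0";
        age = age + 1;
        assert age >= 1 : "post: self.age > 0";
    }

    /** (f.3): the invariant "inv: self.age >= 0" becomes a dedicated method;
     *  it is up to the caller to decide when to invoke it. */
    public boolean checkAgeIsNonNegative() {
        return age >= 0;
    }
}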



The OCL Compiler is packaged as an Eclipse plugin with a basic UI (an
action for an IFile to start compilation, a preferences page to
choose compilation options). The input (as of now) is assumed
to be a single .ecore file accompanied by an .ocl file with the same name.


V. Relationship with Other Eclipse-based Projects

The SAFARI project aims at generating an IDE out of the description of an object-oriented, imperative programming language, in principle covering the whole spectrum from structured editing to interactive debugging.

Other efforts have focused on the structured editing and AST-building phases,
with an emphasis on using, as the internal representation of ASTs,
an instantiation of an Ecore + OCL language metamodel ("eat your own dog food").

Emfatic has been extended by one of the authors of this proposal, as documented in http://www.sts.tu-harburg.de/~mi.garcia/SoC2007/ImprovementsToTheEmfaticEditor.pdf and in http://www.sts.tu-harburg.de/~mi.garcia/SoC2007/draftreport.pdf


Refactoring of OCL expressions is the goal of the Roclet tool (http://www.roclet.org).

As resources such as the above on IDE-related issues become available,
their application in the context of this component will be considered.



VI. Organization



VI.a) Initial committers



The initial committers will focus on evolving and hardening the OCL
compiler, as input models are contributed by the community. The
initial committers are also willing to provide know-how to the
community on OCL tooling in general (e.g., for those developing
translations from OCL into other languages, such as JPQL). Our agile
development process will follow eclipse.org's standards for openness
and transparency. The initial committers are:

VI.b) Code contributions


The OCL Compiler for EMF described by its authors in [2].

VI.c) Interested parties


The team of initial committers will explore statements of interest
from persons experienced in OCL processing or willing to gain such
experience.

VI.d) Developer community



We expect the initial set of committers to grow, as the initial contribution
establishes a platform which automates the front-end phases of
OCL processing (a common need for most, if not all, OCL processing tasks).

There is an active OCL community, with participants from industry and
academia attending (among others) the OCL workshop series at the
MoDELS conference (http://st.inf.tu-dresden.de/Ocl4All2007/). The
techniques advanced by this community can naturally build upon and extend the
tooling proposed here.

Similarly, members of the growing QVT community usually have a
background in OCL processing, and would be willing to have OCL play a
bigger role in their techniques provided the proposed infrastructure is
available.


VI.e) User community



Feedback from a large user community of commercial developers is
critical to harvesting real-world and complete OCL
specifications. Actually, harvesting such specifications reveals a
chicken-and-egg problem: those specifying a system have a reduced
incentive to invest effort in preparing OCL specs if they are to
remain paper-only and thus not automatically enforceable. On the other
hand, the developers of OCL tooling are reluctant to target a small
audience. The OCL compiler component is well positioned to break this
cycle, providing immediate benefit to the authors of OCL
specifications in the form of a portable and efficient implementation
of the (Ecore,UML2) + OCL spec given as input.

The availability of OCL tooling will also expand the user base
of directly related projects/components (OCL, EMF, UML2) thus
accelerating the synergies of the Eclipse ecosystem.

VII. Tentative Plan


  • 2007-07 v0.1: internal release for study in the EMF, UML2 and MDT OCL teams

  • 2007-08 v0.2: public release for community study

References

[1] A. W. Appel and J. Palsberg. Modern Compiler Implementation in Java. Cambridge University Press, New York, NY, USA, 2003. http://www.cs.princeton.edu/~appel/modern/java/

[2] M. Garcia and A. J. Shidqie. OCL Compiler for EMF. Submission to the Modeling Symposium at Eclipse Summit Europe 2007.

[3] C. W. Damus. Implementing Model Integrity in EMF with MDT OCL. Eclipse Technical Article.

[4] M. Garcia. How to Process OCL Abstract Syntax Trees. Eclipse Technical Article.



Sonntag, 13. Mai 2007

cross-artifact queries

While extremely useful in themselves, the tools discussed in the previous entry are limited to querying source-code artifacts only: configuration files, plugin.xml, etc. are left out (precisely those for which checks such as referential integrity would be most useful). Examples of referential integrity for Struts and Eclipse Workbench plugins are provided by Michał Antkiewicz here (not in OCL, unfortunately, although OCL has enough expressive power to encode the consistency rules he considers).

Software artifacts, once available as instances of an Ecore-based metamodel, can be queried with OCL. Some useful links for this task:
  1. an Ecore-based metamodel of Java (without OCL well-formedness rules) has been contributed by Mikael Barbero (INRIA) and can be found here (for a tree-based visualization follow this link)

  2. the metamodel above can be used in conjunction with this plugin, which instantiates the metamodel for a compilation unit of choice (using the JDT infrastructure)

  3. more sample code for querying the “raw” AST offered by the JDT can be found in ASTView. In order to understand in detail the type information of a given Java file, these tutorial slides prove extremely useful.

Querying a Java AST is usually cumbersome given its detailed nature. As in databases, having a stock of parameterized queries (to filter only those items useful for the task at hand) goes a long way towards improving usability.
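As a tiny example of such a parameterized query, the following sketch (assuming the JDT Core DOM API) collects the public, zero-argument get* methods of a compilation unit; the source string is a placeholder:

import java.util.ArrayList;
import java.util.List;

import org.eclipse.jdt.core.dom.AST;
import org.eclipse.jdt.core.dom.ASTParser;
import org.eclipse.jdt.core.dom.ASTVisitor;
import org.eclipse.jdt.core.dom.CompilationUnit;
import org.eclipse.jdt.core.dom.MethodDeclaration;
import org.eclipse.jdt.core.dom.Modifier;

public class GetterQuery {
    /** Returns the names of public get* methods that take no parameters. */
    public static List<String> publicGetters(String javaSource) {
        ASTParser parser = ASTParser.newParser(AST.JLS3);
        parser.setSource(javaSource.toCharArray()); // defaults to parsing a compilation unit
        CompilationUnit cu = (CompilationUnit) parser.createAST(null);

        final List<String> result = new ArrayList<String>();
        cu.accept(new ASTVisitor() {
            public boolean visit(MethodDeclaration m) {
                if (Modifier.isPublic(m.getModifiers())
                        && m.parameters().isEmpty()
                        && m.getName().getIdentifier().startsWith("get")) {
                    result.add(m.getName().getIdentifier());
                }
                return true;
            }
        });
        return result;
    }
}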

It looks to me that the following critical mass is necessary:
  • libraries of metamodels (with WFRs)
  • plugins to parse software artifacts into those metamodels
  • pre-defined queries for repetitive tasks on those metamodels
  • efficient mechanisms to evaluate queries, author them, and to visualize their results
in order to realize the whole potential of software repositories. Wow, that sounds like a "research agenda" ... spooky!

"Watch this space!" :)

querying source code and other artifacts

Now that agreement exists on the parsed representation of Java source code (the JDT ASTs), the lack of standardization has moved to the languages used to query such ASTs. What is querying over ASTs good for? Examples:

  • checking coding styles (such as naming conventions)
  • fault detection (to discover bugs at development time)
  • refactoring (to detect code smells, optimise and improve code design)
  • metrics (for measuring the complexity of the code)
  • aspect weaving (to identify join point shadows of interest)

Let's compare some Eclipse plugins addressing this field. For each alternative below an example is included to convey a flavor of the language (I haven’t been able to find the same query expressed in all languages, which would ease comparison):

a) SCL
"Constructors must not invoke overridable methods, directly or indirectly, that access a (potentially uninitialized) instance variable"


b) PMD
"While loops must use braces"

// Imports assumed for PMD 3.x, where rule classes live in net.sourceforge.pmd(.ast)
import net.sourceforge.pmd.AbstractRule;
import net.sourceforge.pmd.ast.ASTBlock;
import net.sourceforge.pmd.ast.ASTWhileStatement;
import net.sourceforge.pmd.ast.SimpleNode;

public class WhileLoopsMustUseBracesRule extends AbstractRule {

    public Object visit(ASTWhileStatement node, Object data) {
        // child 0 is the loop condition, child 1 the loop body
        SimpleNode firstStmt = (SimpleNode) node.jjtGetChild(1);
        if (!hasBlockAsFirstChild(firstStmt)) {
            addViolation(data, node);
        }
        return super.visit(node, data);
    }

    private boolean hasBlockAsFirstChild(SimpleNode node) {
        return node.jjtGetNumChildren() != 0
                && node.jjtGetChild(0) instanceof ASTBlock;
    }
}



c) JQuery
"List all public Getters"



d) CodeQuest
"Lookup all implementations of an abstract method"


?query3(M1,M2) :- hasStrModifier(M1,'abstract'),
                  overrides(M2,M1),
                  NOT(hasStrModifier(M2, 'abstract')).

overrides(M1,M2) :- strongLikeThis(M1,M2),
                    hasChild(C1,M1), hasChild(C2,M2),
                    inheritableMethod(M2),
                    hasSubtypePlus(C2,C1).


The above list is incomplete, and that’s a sign of duplicate work: queries to detect particular code-smells are being expressed (slightly differently?) in different notations.

Bringing it down to a few words, the pros of the different approaches are:
  • SCL: harmonizing formality (the definition of SCL is formal) and readability
  • PMD: letting the user add new queries (if familiar with the detailed structure of ASTs)
  • JQuery: providing Eclipse-integrated structured viewers to display (and thus navigate) results
  • CodeQuest: efficient query answering (Datalog queries are effectively optimized by industrial-strength RDBMSs)

So each product has its strong suit. In the next blog entry, I’ll comment on yet another approach to querying software artifacts, comparing how it fares against those mentioned here. We’re dealing with conflicting requirements (for example, expressiveness vs. efficiency), so any choice will involve some trade-offs.

References

[1] SCL, http://people.clarkson.edu/%7Edhou/projects/SCL.html
[2] PMD, http://pmd.sourceforge.net/
[3] JQuery, http://jquery.cs.ubc.ca
[4] CodeQuest, http://progtools.comlab.ox.ac.uk/projects/codequest

Additional projects similar to all of the above are listed at:
http://pmd.sourceforge.net/similar-projects.html

Mittwoch, 9. Mai 2007

an OCL bibliography

Some time ago (two years in fact) I compiled a bibliography of papers, with a focus on showing best practices around the usage of OCL. Surprisingly, most references are still relevant. I hope this list can be useful as a starting point for further exploration. The categories are listed in no particular order, with a selection of papers for each.

  • Configuration Management
  • OCL and AspectJ
  • Refactoring OCL, OCL rewriting
  • OCL for software artifacts
  • OCL for modeling of Software Architecture
  • ORDBMS-based software repositories
  • OCL codifying business rules
  • Translation of OCL to XQuery
  • Generation of proof conditions

verbalizing OCL

Tools to "read aloud" a formal specification belong to the basic repertoire of the (Controlled) Natural Language research community. "Reading aloud" more like paraphrasing, e.g.

∀x.Number(x)∧Prime(x)∧LessThan(x,3)⇒Even(x)

can be rewritten into
Every prime number less than 3 is even

A tool along these lines for OCL is OCLNL. Given this OCL as input:

context Copy

inv: Copy.allInstances()->forAll(c1, c2 |
  not (c1 = c2) implies not (c1.barCode = c2.barCode))


The verbalization computed by OCLNL is:

for the class Copy the following invariants hold :

  • for all copies c2 , c1 in the set of all instances of Copy

if it is not the case that c1 is equal to c2 then this implies that it is not the case that the bar code of c1 is equal to the bar code of c2

OCLNL has been applied to JavaCard APIs (security aspects mostly) and similar real-world case studies. Papers and theses can be found on the OCLNL homepage; the technical documentation makes us aware of the fact that the core of OCLNL has been implemented in Haskell (but hey, if you've mastered Java's type system then you've accomplished more than you need to learn Haskell). The input models are not Ecore-based.

One quick way to get your hands dirty in this playground (verbalization of Ecore + OCL models) consists in detecting patterns in OCL Abstract Syntax Trees (ASTs) for which there's an (empirically convenient) English-language paraphrasing. For example, there's one such pattern behind the invariant above, which allows translating it into the more compact:


Two different copies have different bar codes


In a way, that's not playing fair, because covering just some of the patterns that may show up in valid OCL ASTs does not get the full job done. Still, it's interesting if you want to try your hand at processing OCL ASTs ... which brings me to the real plunge I wanted to make: take a look at my article on that very subject, How to Process OCL Abstract Syntax Trees, and its accompanying Eclipse-based plugin. With it, you can do cool stuff along these lines (a rough sketch follows):
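The sketch below assumes MDT OCL's Ecore binding (org.eclipse.ocl.ecore); the helper class is hypothetical. It recognizes constraint bodies of the shape T.allInstances()->forAll(a, b | ...), i.e. the pattern behind the Copy invariant above. A real verbalizer would of course need to cover many more shapes.

import org.eclipse.emf.ecore.EClassifier;
import org.eclipse.emf.ecore.EOperation;
import org.eclipse.ocl.ecore.IteratorExp;
import org.eclipse.ocl.ecore.OperationCallExp;
import org.eclipse.ocl.expressions.OCLExpression;

public class AllInstancesForAllPattern {
    /** Hypothetical helper: true if the body looks like T.allInstances()->forAll(a, b | ...). */
    public static boolean matches(OCLExpression<EClassifier> body) {
        if (!(body instanceof IteratorExp)) {
            return false;
        }
        IteratorExp forAll = (IteratorExp) body;
        // the iterator name distinguishes forAll from exists, collect, etc.
        if (!"forAll".equals(forAll.getName()) || forAll.getIterator().size() != 2) {
            return false;
        }
        if (!(forAll.getSource() instanceof OperationCallExp)) {
            return false;
        }
        OperationCallExp source = (OperationCallExp) forAll.getSource();
        EOperation op = source.getReferredOperation();
        return op != null && "allInstances".equals(op.getName());
    }
}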



'nuff said! Actually, a big advantage of turning a formal spec into controlled natural language is the larger audience that can review it (and thus find errors). This aspect is highlighted in David Burke's master's thesis; an excerpt follows:


Maybe it's just me, but I think it would be great if someone took up the task of performing similar OCL AST processing for Eclipse!

Montag, 7. Mai 2007

OCL in Topcased 1.0.0M3, very first impressions

It all started with exploring the OCL editor:



(including the context menu "Evaluate rule on model", which appears after right-clicking an OCL expression)

Then I noticed other use cases:
  • OCL Model Statistics
  • OCL Model Checker
which can be accessed through



Hopefully more details will appear once I'm done with the source code.



My conclusion so far is: Hey, this kind of blogging is not that time consuming!