Application data are an integral part of business processes. Data can be created,
modified, and deleted during the execution of business processes. Since
business processes consist of a set of activities that are related, these activities
operate on an integrated set of application data.
Data in business process models has two aspects, both of which need to be covered:
• Data that activity instances manipulate by invoking applications or services.
• Data dependencies between process activities.
The former issue is dealt with in the operations subdomain. In service-oriented
systems architectures, for instance, the parameters of service invocations are
specified, so that data can be communicated correctly with software systems
at run time.
At the process level, data dependencies between process activities are typically
described by data flow. An example of data flow in a business process in
the financial sector is given. A credit approval business process contains activities
to enter a credit request, to assess the risks of granting the credit, and
to inform the customer about the decision made by the financial institution.
The activities of this process model operate on case data, in particular,
the credit request. The credit request can be represented by a record data
type with fields for the name and address of the credit requester, the amount
requested, and other information, such as the risk related to granting the credit.
There are data dependencies between the activities mentioned. The Collect
Credit Info activity is the first activity performed. Only when this data
is available can the risk be assessed in the Assess Risk activity, the final
decision be made (Decide), and the requester be notified (Notify). Therefore,
the ordering of the activities in the business process is strongly related to the
data dependencies of the activities.
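The credit request record described above could be sketched as follows. This is only an illustration: the field names, the types, and the sample values are assumptions, since the text names the kinds of information but not a concrete schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreditRequest:
    """Case data shared by the activities of the credit approval process."""
    name: str                      # name of the credit requester
    address: str                   # address of the credit requester
    amount: float                  # amount requested
    risk: Optional[float] = None   # assumed type; unknown until the risk is assessed


# After Collect Credit Info, the requester data is present but the risk is not.
request = CreditRequest(name="A. Customer", address="1 Example Road", amount=5000.0)
```

The optional risk field mirrors the dependency in the text: it can only be filled in once the collected credit information is available.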
The process model is illustrated using a graph-based process
language that explicitly represents input and output parameters of activities
and data dependencies. Observe that the actual data transfer can be performed
by passing references to data objects or values of data objects, as
described in the context of workflow data patterns.
This diagram shows that data dependencies have implications on the ordering
of activities in the process: the Assess Risk activity can be started only
when the credit information is available. Since this data object is provided as
output parameter CreditInfo of the Collect Credit Info activity, this activity
needs to complete before the risk can be assessed, implying an ordering
between these activities.
This example shows that data dependencies between process activities are
reflected by data flow. A data flow edge between an output parameter of one
activity and an input parameter of another activity represents the fact that
the latter activity requires a data value that the former generates. In the
example, the Collect Credit Info activity generates an output parameter
CreditInfo that the Assess Risk activity requires for its start.
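To make the link between data flow edges and ordering concrete, the following minimal sketch derives an activity order from data flow edges alone. It assumes Python 3.9 or later for the standard graphlib module. Only the CreditInfo parameter is named in the text; the names Risk and Decision, and the helper implied_order, are hypothetical.

```python
from graphlib import TopologicalSorter

# Each data flow edge: (producing activity, data object, consuming activity).
# Only CreditInfo appears in the text; Risk and Decision are assumed names.
credit_data_flow = [
    ("Collect Credit Info", "CreditInfo", "Assess Risk"),
    ("Assess Risk", "Risk", "Decide"),
    ("Decide", "Decision", "Notify"),
]


def implied_order(edges):
    """Return one activity order that respects every data dependency."""
    sorter = TopologicalSorter()
    for producer, _data, consumer in edges:
        # The consumer can start only after the producer has completed.
        sorter.add(consumer, producer)
    return list(sorter.static_order())


order = implied_order(credit_data_flow)
```

For this chain of dependencies the order is unique and matches the sequence of activities described for the credit approval process.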
If, as assumed so far, output parameter values are only available when the
respective activity terminates, there is a direct implication of data flow on
control flow. This property is known as control flow follows data flow, and it
is explained as follows.
Control flow needs to follow data flow, since otherwise the process instance
would come to a halt. This observation is illustrated in an example
where a data dependency from the Assess Risk activity to the
Decide activity is shown, while the control flow constraint exists, for some
reason, in the opposite direction.
As a result, neither of these activities can be started, because control flow
defines that Assess Risk can only start after Decide has completed, and Decide
can only start after Assess Risk has generated the risk factor data value.
Because the risk factor value is only available when the activity terminates,
both activities are stuck in a permanent waiting condition, and a deadlock
situation has occurred. The process model results from a
modelling mistake, and the control flow follows data flow rule can be used to
detect these kinds of modelling mistakes.
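One way such a check might look is sketched below. It reports every data flow edge whose consuming activity is ordered before its producing activity by control flow. The function names and the edge-list representation are assumptions made for illustration, not part of any particular process language or tool.

```python
def reachable(graph, start):
    """Return all activities reachable from start via control flow edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def violations(control_flow, data_flow):
    """List data flow edges whose consumer is ordered before the producer."""
    graph = {}
    for before, after in control_flow:
        graph.setdefault(before, []).append(after)
    return [
        (producer, consumer)
        for producer, consumer in data_flow
        if producer in reachable(graph, consumer)
    ]


# The faulty model from the text: control flow runs from Decide to Assess Risk,
# while the data dependency runs from Assess Risk to Decide.
faulty = violations([("Decide", "Assess Risk")], [("Assess Risk", "Decide")])

# The corrected model: control flow follows data flow.
correct = violations([("Assess Risk", "Decide")], [("Assess Risk", "Decide")])
```

The faulty model yields one violating edge, while the corrected model yields none.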
These considerations hold only if it is assumed that an activity instance
requires its input parameters at the start. If this constraint is relaxed and
input parameters can be consumed after an activity instance has started,
then in a process with a data flow A → B, B can actually start before A
terminates. At some point, when the input values are required, B needs to
wait for A to deliver the required data.
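The relaxed consumer side can be sketched with two threads and a queue from the Python standard library. Activity B starts first and blocks only at the point where it needs the value from A. The activity bodies, the value 42, and the log are hypothetical and serve only to make the waiting behaviour visible.

```python
import queue
import threading


def activity_a(channel, log):
    log.append("A working")
    channel.put(42)  # A delivers its output value


def activity_b(channel, log, started):
    log.append("B started")          # B begins before A has terminated
    started.set()
    value = channel.get(timeout=5)   # B waits here until A delivers the data
    log.append(f"B received {value}")


log = []
channel = queue.Queue()
b_started = threading.Event()

b = threading.Thread(target=activity_b, args=(channel, log, b_started))
b.start()
b_started.wait(timeout=5)  # make sure B is running before A begins

a = threading.Thread(target=activity_a, args=(channel, log))
a.start()
a.join()
b.join()
```

The log records that B started before A did any work, and that B continued only after A had delivered the value.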
This assumption can also be relaxed at the producer side of data. If we
allow activities to generate data while they are running, then the generated
data can be taken by the follow-up activity, so that activities can execute
concurrently, realizing a data value stream between them.
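A data value stream between two activities can be approximated with a Python generator, as in the sketch below. The producer hands over each value while it is still running, and the consumer processes it immediately. This is cooperative interleaving in a single thread rather than true concurrency, and the activity names and values are assumptions.

```python
def activity_a(log):
    """Producer: emits each value while it is still running."""
    for value in (1, 2, 3):
        log.append(f"A produced {value}")
        yield value
    log.append("A terminated")


def activity_b(stream, log):
    """Consumer: processes each value as soon as the producer emits it."""
    log.append("B started")
    total = 0
    for value in stream:
        log.append(f"B consumed {value}")
        total += value
    log.append("B terminated")
    return total


log = []
total = activity_b(activity_a(log), log)
```

In the resulting log, B consumes the first value before A has terminated, which is the overlap that the relaxed assumption permits.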
While most workflow management systems assume that input data is available
up front and that an activity instance writes output data values only on
completion, some approaches, for instance BPMN, relax this assumption.
The use of data dependencies for process enactment control will be discussed
in more detail in the context of case handling, where
data dependencies—and not the proce