\subsection{Multiple-Valued Decision Diagrams}

The representation produced by the previous section for
a function $f:D^{k}\rightarrow R$ with $D$ finite is
called a \emph{multiple-valued decision diagram} or MDD \cite{srinivasan90}
when $D$ is not boolean. If $R$ is also not boolean, the structure
produced is sometimes called a \emph{multi-terminal multiple-valued decision
diagram} (MTMDD) \cite{sasao96}; where this term is used, the term ``MDD''
is used to refer to the case where $R$ is boolean.

Alternatively, the boolean-output \emph{characteristic function} $\chi_f$ of
a multiple-valued-output function $f$ can be represented by a boolean-terminal
decision diagram. This characteristic function is defined as follows:
$$\chi_f(\vec{x}, d)= \left\{
	\begin{array}{ll}
	1 & \mbox{if}\; f(\vec{x})=d \\
	0 & \mbox{otherwise}
	\end{array}
	\right.
$$
	
In order to have the advantages of binary-terminal decision diagrams
for multiple-valued-output functions -- notably constant-time
satisfiability checking and mature implementations -- \emph{shared
decision diagrams} represent 
The key idea behind shared decision diagrams is to represent
the range $R$ in binary, with a bit-vector corresponding to each
element. For instance, a range of 8 values can
be represented in 3 bits, splitting any single function onto
that range into 3 binary-valued functions. The motivation is that this will
increase space efficiency by promoting sharing of
common subexpressions, and this, in practice, turns out
to be justified \cite{minato90}.
 
More formally, a shared decision-diagram representation for
a function $f:D^{k}\rightarrow R$ is based on a \emph{binary decomposition}
$\delta:R\rightarrow \{0, 1\}^t$ for the range $R$. This decomposition splits
$f$ into a vector of $t$ binary-output functions $f_i:D^{k}\rightarrow \{0,1\}$
for $1 \leq i \leq t$, where $f_i(\vec{d})$ is the $i$th element of
$\delta(f(\vec{d}))$:
$$f_i(\vec{d})=\delta(f(\vec{d}))_i$$

The resultant decision diagram has a vector of root nodes, rather than
a single one; all of the reducedness and orderedness properties hold in it.

\subsection{Edge Valuation}

Ordinary BDDs are commonly optimized with \emph{complement edges}.
These edges indicate that the node they terminate in is to
be complemented; if it represents the function $f$ by itself, then
with the complement edge it represents $\neg f$. This allows
for the sharing of common subexpressions and their complements:
so $x\wedge y$ would be represented by the same subgraph as
$\neg x \vee \neg y$. In order to maintain the property of 
canonicity, it is required
that the low edge always be non-complemented. Details can be found
in \cite{somenzi99}.

This idea has been generalized \cite{lai92,tafertshofer97} to
DDs representing functions with a numeric range; in edge-valued
BDDs (EVDDs),
an edge transformer adds a specified quantity to its
terminus, whereas in factored edge-valued BDDs (FEVBDDs),
the edge transformer multiplies by a specified quantity $a$,
and then adds a quantity $m$. We give here a very general
definition for decision diagrams annotated with such edge-valuations:

\begin{definition}
Let $\eta$ be a set of functions from $R\rightarrow R$. For $e, e' \in
\eta$, we denote function composition using $\star$: $(e \star e')(r)=
e(e'(r))$. An
edge-valued (reduced and ordered) $D$-input $R$-output decision diagram is a tuple
$U=(D,R, \eta, E, EV, U_0, N, T,  k, Lab, Val, Child)$ where:
\begin{itemize}
\item $(D,R,U_0, N, T,  k, Lab, Val, Child)$ is a $D$-input $R$-output
(reduced and ordered) decision diagram;
\item $E:N \times (N \cup T)$ is the derived set of edges where $(v, v') \in E$
if and only if $Child(v, i)=v'$ for some $i$;
\item $EV:E \cup \{U_0\} \rightarrow \eta$ maps all edges to some edge-valuation; as well, the root node is assigned an incoming edge.
\end{itemize}

\end{definition}

For arbitrary $\eta$, it is \emph{not} guaranteed that the diagram
is canonical even if it is reduced and ordered. Also, ordinary decision
diagrams can be thought of as the trivial case of edge-valued decision
diagrams, where $\eta$ is the singleton set containing the identity function
on $R$.

A function is described by a node and its incoming edge, since several
functions are represented by the same node. More formally:
$$
\begin{array}{rclll}
\dbrkt{U}= & & & & \\
\dbrkt{EV(U_0), U_0}_e & = &
 \lambda \vec{x}:D^{k}& \mbox{if} & \; (U_0 \in T) \; \mbox{then} \; EV(U_0)(Val(U_0)) \\
& & &  \mbox{else} & \; case((\dbrkt{EV(U_0)\star EV((U_0, Child(U_0, 1), Child(U, 1)}_e, \\
& & & & \ldots, \dbrkt{EV(U_0)\star EV(U_0, Child(U_0,m)), Child(U_0, m)}_e), \\
& & & &  x_{Label(U_0)})(\vec{x})
\end{array}
$$
The mod-$p$ decision diagrams of Sack et al. \cite{sack00}
uses no edge valuations, but another class of nodes
that perform a cyclic shift on the values returned by
their children. This approach sacrifices canonicity; we argue that
by considering these ``shift nodes'' as edge transformers,
and enforcing normality -- the low edge must be the identity --
we can obtain a
representation which may be slightly less compact, but will
be canonical. This variety of decision diagram is described in
the following section.


