A generic DAG walk algorithm

This is an example of a simple walk through a dag data structure, written in the paradigm of generic programming. Since a tree is always a dag, this can be easily applied to a tree walk, with or without further simplification.

You can access this as a downloadable version here. The core generic dag walk header file is here.

Introduction

The contribution of this implementation here is two unrelated functionalities.

Using generic programming, split data structure and algorithm, there by promoting code re-use
Expose the api to preserve user defined stackable information during the walk, through program stack and recursion

We encounter these data structures in multiple occasions in our projects. In projects related to games programming, examples of tree traversal can be, traversing a bounding volume hierarchy tree of a polygon soup, or, traversing the bone hierarchy of a character. An example of a DAG traversal would be, traversing the dependency DAG of resources. When we write code for each of these cases of traversals, we usually succumb to using a separate pieces of code. This is because, the data structures corresponding to the bounding hierarchy traversal is not exactly the same as the data structures for the traversal of animation hierarchy.

One might be tempted to use object oriented programming here, and make base and derived classes, so that generic code can be hoisted to the base class. But sadly OOP, demands unrelated classes to be put into hierarchies, just for the sake of code re-use.

Generic programming doesn’t demand such hierarchies. Generic programming aims to bring re-usabilty by splitting the algorithm and the data structure. The traditional way to implement a generic tree walk or dag walk is through an iterator. But such an implementation, necessitates the iterator class to maintain a stack of where it was, so that it can find the next node to visit, which is kind of unweildy. Our implementation doesnt need an explicit stack of remembering where it is, for the purpose of traversing.

The generic dag_walk algorithm, takes a dag Node class and a Walk class as template arguments.

Eventhough our algorithm doesnt need a stack to keep track of where the current position is, for utility purposes, we expose the abilty to preserve stackable information in the program stack. Many times when you walk a tree or dag, you often feel the necessity to maintain a stack for storing Walk related data. For eg: suppose if you are traversing a scene graph, you may need to store the cumulative transformation matrix. This implementation of the dag_walk, in adition to splitting the algorithm and data structure, allows the walk-implementer to designate an explicit stack_type and a stack variable. The dag walk is done in a recursive way, which means that the program stack can be used to store the stack variable designated by the Walk class.

User's point of view

The user has to supply a dag Node class and a Walk class as template arguments.
Additionally the user has to specialize a node_traits struct and walk_traits struct with the Node and Walk classes that he is supplying.

The folowing requirements need be met for the DAG Node class.

Absolute requirements are requirements necessary for the dag_walk to compile. Functional requirements are requirements necessary for the dag_walk to function.

Absolute requirements of Node class:

None

Functional Requirements of the Node class:

The Node should have the provision of setting and unsetting a visit flag, which signifies whether the Node has already been visited or not.
The requirement can be relaxed, if we anticipate only tree traversals on a dag (ie visiting many times a Node with multiple parents) , or if the dag is guaranteed to be a tree.

Here is an example for a typical user supplied Node

  struct Node {
      Node( int data):_data(data),_visited(false){}
      int _data;
      list< Node *> _children;
      bool _visited;
  };

For the second template argument, the Walk class embodies the work done on the Node. Often the Node class is a rigid, predetermined class supplied by a thrid party, where as the second template argument, the Walk class is a glue class written by the person who uses the Node class. The Walk class must meet the following requirements.

Absolute rquirements of the Walk class:

The Walk class must define a stack_type, which should be copy_constructible and assignable.
The Walk class should have a member variable of the type stack_type.

If the walk doesnt need to use stackable information, then a dummy stack_type, and a dummy stack_type variable should be defined.

Functional requirements of the Walk class:

The Walk class should store the stackable information during the dag traversal in the stack variable.Eg: If the dag was a scene graph, the stack_type of the Walk could contain the current cumulative matrix, at each Node. This trick, enables us to use the program stack to store the stack_type variable while we do recursive traversal of the DAG. Even though the existance of the stack_type and the stack_tpe variable are a compilation necessity, if we dont need them, you can just use dummy values here.
The Walk class should have a pre( Node *) and a post (Node *) function which does the main work on the Node, before and after traversing the children of the Node.

Here is an example of the Walk data structure, which prints the dag depth at each Node.

struct Walk {
  Walk():_level(0){}
  typedef int stack_type;
  stack_type _level;
  stack_type & get_stack() { return _level;}
  const stack_type & get_stack() const { return _level;}
  void print( Node * pNode ) {
      cout << pNode->_data << endl;             }
  bool pre( Node * p){
      inc_level();
      print(p);
      return true;
 }
 bool post( Node *) {
      dec_level();
      return true;
 }
 void inc_level() {
      ++_level;
 }
 void dec_level() {
      --_level;
 }
};

We have defined the following return values for the algorithm to return.

  typedef enum { eOK, ePrune, eStop, eError } walk_result;

In order to facilitate generic access of the Node and Walk classes, one should specialize the node_traits and walk_traits structs with the Node and the Walk classes.

The node_traits struct which takes the Node as a template parameter should define

the set and get functions to access the visit flag of the Node
get the begin and end iterators of the container that hold references to the children of a Node. ( If the child container of the Node deosnt have an iterator class, one should define one )

Here is a specialization of the node_traits struct.

template < >
struct node_traits < Node > {
  typedef Node N;

  //get the visit tag from the node
  static bool get_visit_tag(const N * pNode ) { return pNode->_visited;}
  //set the visit flag
  static void set_visit_tag( N *pNode, bool b ) { pNode->_visited= b; }

  //how the children of the nodes are defined
  typedef list<Node*> node_container;
  typedef list::iterator node_iterator;
  typedef list::const_iterator node_const_iterator;

  //get the begin and end iterators for the children
  static node_iterator get_node_container_begin( N *pNode ) { return pNode->_children.begin();}
  static node_const_iterator get_node_container_begin(const  N *pNode ) { return pNode->_children.begin();}
  static node_iterator get_node_container_end( N *pNode ) { return pNode->_children.end();}
  static node_const_iterator get_node_container_end(const N *pNode ) { return pNode->_children.end();}

};

The walk_traits class, which takes both the Node and the Walk class as template arguments should

define

whether multiple visits to the Node are allowed (ie: it can behave like a tree or is it strictly a dag)
call the appropriate delegataion codes in the Walk class for the pre and post calls
access functions for get-ing and set-ing the current stack

template < >
struct walk_traits<> : public walk_traits_base {
  typedef Walk W;
  typedef Node N;
  typedef W::stack_type stack_type;

  //does this walk allows multiple visits for
  //the dag node. generally during a dag walk, the nodes
  // visisted once are not visited again
  static const bool is_multiple_visits_allowed = false;

  //main operation that need be performed on the node
  //This is called in dag_walk, for each node,
  //before we recurse into the dscendents
  static walk_result pre(  N *pNode, W & w )
      { return (w.pre( pNode )) ? eOK : eStop; }

  //get the curent stack variable of W
  static stack_type & get_stack( W & w){ return w.get_stack(); }
  static const stack_type &get_stack( const W & w) { return w.get_stack();}

  //set the curent stack on W
  static void set_stack( W &w, stack_type &s ){ w._level = s; }

  //main operatrion that will be called after visiting the descendants
  //this function is called after visiting the children
  static walk_result post( N * pNode, W &w, const walk_result &res )
      { return (w.post( pNode) ) ?  eOk : eStop;}
};

Core algorithm

Finally here is the data structure indpendent algorithm code for dag_walk.

/**
*    Generic dag_walk procedure
*
*   - do the pre operation on the node
*   - visit the children and call dag_walk recursively
*   - restore the stack variable after each child visit.
*   - do the post operation on the node
*   @param pNode[in, out] pointer to node.
*          Note that pNode is a not a const pointer and is mutable
*
*   @param w [in, out] the walk operation on the node
*   @return walk_result
**/

template <typename N, typename W>
walk_traits_base::walk_result dag_walk( N *pNode, W &w)
{
  typedef typename walk_traits<W> wtraits;
  typedef typename node_traits<N> ntraits;
  typedef walk_traits_base::walk_result walk_result;
  using walk_traits_base::eOK;
  using walk_traits_base::ePrune;
  using walk_traits_base::eStop;
  using walk_traits_base::eError;

  walk_result res = eOK;
  bool b_visit_children = true;

  //there is no need to visit this node
  //if multiple visits are not allowed and it has already been visited earlier
  bool b_should_do_main_operations = wtraits::is_multiple_visits_allowed || !ntraits::get_visit_tag( pNode);
  if( b_should_do_main_operations)
  {
      res = wtraits::pre( pNode, w );

      //set the visit flag for this node
      ntraits::set_visit_tag( pNode, true );
      //if the result of the main operation
      //is prune, stop or error, the not visit children
      if ( res > eOK)
      {
          b_visit_children = false;
      }
  } else {
      //there is no need to continue
      //this  node has been visited once
      return res;
  }
  //if the children ought to be visited
  if ( b_visit_children )
  {
      //copy the current stack
      //we need to rstore it after visiting chilren
      wtraits::stack_type s_copy( wtraits::get_stack( w) );
      ntraits::node_iterator nit, nbegin, nend;
      nbegin = ntraits::get_node_container_begin( pNode );
      nend = ntraits::get_node_container_end( pNode );
      for( nit = nbegin; nit != nend; ++ nit )
      {
          N * pChild = *nit;
          assert( pChild );

              res = dag_walk( pChild, w );

          //restore the stack on w
          //so that the next child gets the same context
          wtraits::set_stack( w, s_copy);

          //if the rsult of this walk is more serious
          //than ePrune, let us quit
          if ( res >= wtraits::eStop )
          {
              break;
          }
      }
  }
  //finally do the post operation
  //The post operation is the one done after vsisting the children.
  //It is passed the previous value of res.
  //So the first step in post, has to check the result so far,
  //and treat it accordingly
  if( res <= ePrune )     {
      res = wtraits::post( pNode, w,  res );
 }
return res;
}

Here is the version of the above stored at code/google .

blog comments powered by Disqus

Published

22 November 2008

A generic DAG walk algorithm

Introduction

User's point of view

Core algorithm

Published

Tags