Added Multiple Expression Support to LKit

LKit Language

I wanted to be able to add in the support for the user-defined functions, but the first step in that was to allow for multiple expressions, and evaluate them all, in order, but return only the last value. This appears to be pretty easy, and the basics weren't too bad, but I had a nasty little logic problem that was leading to another double-free in the data structures I was using.

What I essential did was to take the single value for an expression and make it an STL vector of pointers. So this:

  value         *_expr;

as an ivar on parser, became:

  expr_list_t   _expr;

and then I needed to really reformulate things a bit, as I had a list of sub-expressions and a mutex for that, but it made more sense to have both lists controlled by the one mutex:

  /**
   * These are all the expressions and sub-expressions that this
   * parser knows about - in two simple lists. These are created
   * in the parsing of the source, but really don't hold much as
   * all the values, constants, and functions are owned by the
   * parser.
   */
  expr_list_t                       _expr;
  expr_list_t                       _subs;
  // ...and a simple spinlock to control access to it
  mutable boost::detail::spinlock   _expr_mutex;

then I had to update all the uses of the old _expr in the code. Not horrible, but there were a few things to keep straight.

Also, then I needed to make methods to add to the expression list, remove from it, and then clear it out, but that last one made more sense to clear both of the lists out:

  /**
   * This method will remove ALL the known expressions from the
   * list for this parser.
   */
  void parser::clearExpr( bool includeSubExpr )
  {
    spinlock::scoped_lock       lock(_expr_mutex);
    /**
     * First, delete all the subexpressions in the list. We need
     * to get rid of them all in order not to leak.
     */
    if (includeSubExpr) {
      for (expr_list_t::iterator it = _subs.begin(); it != _subs.end(); ++it) {
        if (*it != NULL) {
          delete (*it);
        }
      }
      // now we can clear out the list as everything is deleted
      _subs.clear();
    }
 
    /**
     * Next, delete the top-level expressions as they too, need to
     * go as we don't want to leak.
     */
    for (expr_list_t::iterator it = _expr.begin(); it != _expr.end(); ++it) {
      if (*it != NULL) {
        delete (*it);
      }
    }
    // now we can clear out the list as everything is deleted
    _expr.clear();
  }

where the method's definition had the default value set to:

    virtual void clearExpr( bool includeSubExpr = true );

so that by default, we'd clear out everything, but if called by a subclass, we could clear out just the high-level expressions and leave those sub-expressions that might be used in other places as-is for the duration of the parser.

And all seemed to be going pretty well after I made the change to the compile() method to really loop through all the possible expressions in the source. Specifically, we went from:

  bool parser::compile()
  {
    bool      error = false;
    // make sure they don't change the source while we work...
    spinlock::scoped_lock       lock(_src_mutex);
    // see if we need to compile anything at all
    if (_expr == NULL) {
      uint32_t     pos = _src.find('(');
      if (pos < _src.length()) {
        if ((_expr = parseExpr(_src, pos)) == NULL) {
          error = true;
        }
      } else {
        // no start of an expression - bad news
        error = true;
      }
    }
    return !error;
  }

to:

  bool parser::compile()
  {
    bool      error = false;
    // make sure they don't change the source while we work...
    spinlock::scoped_lock       lock(_src_mutex);
    // see if we need to compile anything at all
    if (!isCompiled()) {
      value       *e = NULL;
      char        c = '\0';
      uint32_t    len = _src.length();
      for (uint32_t pos = 0; pos < len; ++pos) {
        // skip all white space - it's unimportant
        if (isspace((c = _src[pos]))) {
          continue;
        }
        // see if we have another expression to process
        if (c == '(') {
          if ((e = parseExpr(_src, pos)) == NULL) {
            // can't parse it, drop everything and bail
            error = true;
            clearExpr();
            break;
          } else {
            /**
             * Looks good, so add it to the list - but ONLY
             * if it's not a variable definition. Then it's
             * already accounted for in the variable list.
             */
            if (e->isExpression()) {
              addExpr((expression *)e);
            }
            // correct for the fact we're on the right spot
            --pos;
          }
        }
      }
    }
    return !error;
  }

One of the real tricks I had to put into this code was the test for the result of parseExpr() to be an expression. After all, if it's a variable, then it's already going to be in the list of variables, and deleting it from the variables list and then the expression list is going to get you a double-free every time.

With this slight change, and a few other things to make it nicer to use, I was able to handle multiple expressions in the code, and therefore have a series of variable definitions (or looking to the future, function definitions) in the code and then have a single evaluation step that would return the one value we were really looking for.

Fun, and painful at times, these pointer gymnastics are nasty, but they have to be done in order to take advantage of the polymorphism necessary for the design.