2.9k
views2
comments

Hi,

I'm trying to write some numerical routines, where the underlying matrix and vector representations are independent of the algorithms (e.g. QR, LP, QP) - the motivation primarily being able to reuse algorithms on strided submatrices/subvectors, and possibly more generally on sparse vectors/matrices. My first attempt, listed below, is to have a parametric type vector<'a> where the type parameter 'a is constrained to be some storage scheme. Then an algorithm would simply take arguments of the type vector<'a>.
Unfortunately when the algorithm explicitly indexes the underlying storage (rather than using vectorized operations) it's about twice as slow as the 'native' version (which isn't so bad). Presumably this is because of the virtual function call overhead when calling storage.item . My question is, is there any way to ensure that the specializations accumulate<basic> and accumulate<strided> inline the relevant method calls and (hopefully) get closer to the 'native' version? I imagine this isn't an easy problem, but any thoughts would be welcome!

Thanks,

David

#light

(* abstract interface to a vector *)
type storage =
  interface
    abstract member item : int -> float
    abstract member size : int
  end

(* a vector based on an array *)
type basic =
  class 
    interface storage with
        member m.item i = m.x.(i)
        member m.size = Array.length m.x
    end
    (* the underlying array *)
    val x : float array
    
    (* initializer *)
    new(y) = { x = y; }
  end

(* a vector strided over an array *)
type strided =
  class
    interface storage with
        member m.item i = m.x.(i*m.stride + m.offset)
        member m.size = m.length
    end
    
    val x : float array
    val offset : int
    val stride : int
    val length : int
    
    new (y,off,stride,length) = { x = y; offset = off; stride = stride; length = length}
  end
  
type vector<'a> when 'a :> storage =
  class
    val u : 'a
    new(x : 'a) = { u = x }
  end
    
let accumulate2 (arg : float array) =
    let N = Array.length arg in
    let x = ref 0. in
    for i = 0 to N - 1 do
        x := !x + (arg.(i))
    done
    !x
let accumulate (arg : vector<'a>) =
    let N = arg.u.size in
    let x = ref 0. in
    let u = arg.u in
    for i = 0 to N - 1 do
        x := !x + (u.item i)
    done
    !x

Hi David,

If you want complete representation independence and complete extensibility then what you have is correct. However, I personally prefer designs that "close-the-world" of possible storage types, and hide the representations away. This is because there are significant interactions when you do binary operations on different representations, e.g. adding sparse + dense. In my mind it's better to control the complexity of these interactions by having a limited set of storage types known to the library designer. It's possible to still permit extensibility through some "generic" storage view like yours, but the major storage types are typically known and fixed and if you internalize them then the complexities of their interactions can be handled and isolated. In other words, it is often better to sacrifice complete extensibility in the name of performance and localization of interactions between extensions.

Here's how that would apply to your situation. The set of possible representations is internalized, usually through a discriminated union whose constructors are hidden by a signature:

 

type FloatVector =
  // these data constructors can be hidden by a signature, since we may want to later add to the
  // representation choices.
    | Basic of BasicFloatVector
    | Strided of StridedFloatVector
    with 
        static member CreateBasic(arr) = Basic(BasicFloatVector.Create(arr))
        static member CreateStrided(offset,stride,arr) = Strided(StridedFloatVector.Create(offset,stride,arr))

        member x.Sum() =
            match x with 
            | Basic(bfv) -> bfv.Sum()
            | Strided(sfv) -> sfv.Sum()
    end

The representations are then implemented just as you would expect:

 

/// A basic vector type using float[] for storage
type BasicFloatVector =
  { data : float[] }
  with 
     static member Create(arr) =
        { data = arr }
        
     member x.Sum()  =
            let mutable res = 0.0 in
            for i = 0 to x.data.Length do
                res <- res + x.data.[ i ]
            done
            res
  end

/// A strided vector type using float[] for storage
type StridedFloatVector =
  { data : float[] 
    offset: int;
    stride: int; }
  with 
     static member Create(offset,stride,arr) = failwith "fill this in"
     member x.Sum() = failwith "fill this in"
  end

You will probably eventually want to parameterize by element type too. You of course need a dictionary of operations for the element type to do this, best represented through a value of type Microsoft.FSharp.Math.INumeric<'elem> and its extensions. You can make users pass in values of this explicitly, or you can look up one through the table of numeric associations at Microsoft.FSharp.Math.GlobalAssociations. The F# Matrix library uses the latter technique.

Of course, invoking dictionaries of element operations come with costs you will absolutely have to avoid when the element type is "float". For this reason you may still want to have (internal) representations "BasicFloatVector" and "StridedFloatVector". Alternatively you can use the trick that the F# Matrix library uses and use a (cheap) runtime type test to detect when the element type is "float", and when it is switch over to an optimized implementation. Here's the outline of how this might look:

 

/// A generic basic vector type using 'elem[] for storage with optimized implementations of operations
/// when 'elem is float
type BasicVector<'elem> =
  { ops: Math.INumeric<'elem>;
    data : 'elem [] }
  with 
     static member Create(arr) =
        // Note: in the next version of F# you will be able to use simply
        // let ops = Math.GlobalAssociations.GetNumericAssociation<'elem>() in
        let ops = Math.GlobalAssociations.getNumericAssociation(typeof() : ReifiedType<'elem>) |> Option.get in
        { ops = ops; data = arr }
        
     /// You now implement the basic functions here.
     /// Making the element type generic has non-trivial costs when the element type
     /// is "float", which is the common case, so we give optimized implementations 
     /// for floats by doing a type test. The cost of this type test is in practice very small.
     member x.Sum() : 'elem =
         match box(x.data) with 
         | :? (float []) as arr -> 
            let mutable res = 0.0 in
            for i = 0 to arr.Length do
                res <- res + arr.[ i ]
            done
            /// return the result - we know it has type "float" but now "prove" it has type "'elem".
            unbox(box(res))
         | _ -> 
            /// Here's the generic version of the algorithm
            let mutable res = x.ops.Zero in
            for i = 0 to x.data.Length do
                res <- x.ops.Add(res, x.data.[ i ])
            done
            res
     
  end

type StridedVector<'elem> =
    { ops: Math.INumeric<'elem>;
      offset: int;
      stride: int;
      data : 'elem array }
    with 
       static member Create(offset,stride,arr) = failwith "fill this in"
       member x.Sum() = failwith "fill this in"

    end
      
type Vector<'elem> =
  // these data constructors should be hidden by a signature
  | Basic of BasicVector<'elem>
  | Strided of StridedVector<'elem>
    with
        static member CreateBasic(arr) = 
            Basic(BasicVector.Create(arr))
        static member CreateStrided(offset,stride,arr) = 
            Strided(StridedVector.Create(offset,stride,arr))

        member x.Sum() =
            match x with 
            | Basic(bfv) -> bfv.Sum()
            | Strided(sfv) -> sfv.Sum()
    end

By dsyme on 2/24/2007 3:28 AM (permalink)

Hi Don,

Thanks for your help. I agree it makes sense to limit the number of representations in this way - there are only a small number of choices after all - and pattern matching will help when adding a new one. My first attempt was actually similar to the one you listed. However it seemed that the performance profile for the 'generic' Sum only using Vector.Item() was slightly worse than the version I gave above. My guess is this is solely due to the overhead of the 'match x with' expression for each index. This is fine in the case of Sum because as you point out it's quite easy to implement it for each type. For more involved numerical codes it would be nice if these came for free, but this may now becoming a DSL rather than a F# library. It may be interesting to know that something similar is done in OOQP, a quadratic programming library, though their linear algebra layer is quite complex to support all the functionality that BLAS offers.

Cheers,

Dave

P.S. Great work on F# ! Being a fan of OCaml, it's great to see these kind of language features become 'mainstream' - and the .NET and VS integration is awesome...

By davidr on 2/25/2007 3:05 AM (permalink)

Topic tags

Built with WebSharper

Home

Answers

Events

Courses

Groups and Conferences

Blogs

Jobs

Developers

Topic tags