Performance with Yeppp! is slower than native implementation

Asked by SunnyDark on 22 October 2014 at 09:17

Hi, I am trying to improve the performance of vector algebra in my code using the Yeppp! library; however, the performance is actually getting worse. Here is a piece of the Vector class code:

```cpp
#include "Vector3.h"
#include <cmath>
#include "yepCore.h"

Vector3::Vector3()
{
    //ctor
}

Vector3::~Vector3()
{
    //dtor
}

Vector3::Vector3(float X, float Y, float Z)
{
    x = X;
    y = Y;
    z = Z;
}


float& Vector3::operator[](int idx)
{
    return (&x)[idx];
}

Vector3& Vector3::normalize()
{
#if USE_YEPPP
    float inf;
    yepCore_SumSquares_V32f_S32f(&x, &inf, 3);
    yepCore_Multiply_IV32fS32f_IV32f(&x, 1.0f / sqrt(inf), 3);
#else
    float inf = 1.0f / sqrt((x * x) + (y * y) + (z * z));
    x *= inf;
    y *= inf;
    z *= inf;
#endif
    return *this;

}

Vector3 Vector3::cross(Vector3& rh)
{
    return Vector3 (
                (y * rh.z) - (z * rh.y),
                (z * rh.x) - (x * rh.z),
                (x * rh.y) - (y * rh.x)
    );
}

float Vector3::dot(Vector3& rh)
{
#if USE_YEPPP
    float ret = 0;
    yepCore_DotProduct_V32fV32f_S32f(&x, &rh.x, &ret, 3);
    return ret;
#else
    return x*rh.x+y*rh.y+z*rh.z;
#endif
}

Vector3 Vector3::operator*(float scalar)
{
#if USE_YEPPP
    Vector3 ret;
    yepCore_Multiply_V32fS32f_V32f(&x, scalar, &ret.x , 3);
    return ret;
#else
    return Vector3(x*scalar, y*scalar,z*scalar);
#endif
}

Vector3 Vector3::operator+(Vector3 rh)
{
#if USE_YEPPP
    Vector3 ret;
    yepCore_Add_V32fV32f_V32f(&x, &rh.x, &ret.x, 3);
    return ret;
#else
    return Vector3(x+rh.x, y+rh.y, z+rh.z);
#endif
}

Vector3 Vector3::operator-(Vector3 rh)
{
#if USE_YEPPP
    Vector3 ret;
    yepCore_Subtract_V32fV32f_V32f(&x, &rh.x, &ret.x, 3);
    return ret;
#else
    return Vector3(x-rh.x, y-rh.y, z-rh.z);
#endif
}

Vector3 operator*(float s, const Vector3& v)
{
#if USE_YEPPP
    Vector3 ret;
    yepCore_Multiply_V32fS32f_V32f(&v.x, s, &ret.x , 3);
    return ret;
#else
    return Vector3(s*v.x,s*v.y,s*v.z);
#endif
}
```

I am using the g++ compiler.

Compiler options: g++ -Wall -fexceptions -fPIC -Wl,--no-as-needed -std=c++11 -pthread -ggdb

Linker options: g++ -shared -lpthread -lyeppp -ldl

Any idea what I am doing wrong?


**Marat Dukhan** · Accepted Answer · 2014-10-22T14:07:36+00:00

Yeppp! is optimized for processing arrays of 100+ elements.

It is not efficient on small arrays (like the length-3 arrays in your example) because there is little room to use SIMD, and the overheads of the function call, dynamic dispatch, and parameter checks dominate the actual arithmetic.
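To make the overhead argument concrete, here is a minimal sketch in plain C++. It does not use Yeppp! and is not its actual implementation: `scale_array`, `g_scale_impl`, `g_calls`, and the `Status` enum are hypothetical stand-ins that only model a library-style entry point with a parameter check and a run-time-selected kernel. The sketch assumes the vectors are stored contiguously as x, y, z floats, and it counts how often the fixed per-call cost is paid when scaling one vector at a time versus the whole array at once.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical status codes, modelled loosely on a C-style library API.
enum Status { StatusOk, StatusNullPointer };

// Counts how many times the fixed per-call cost (checks + dispatch) is paid.
static std::size_t g_calls = 0;

// The actual arithmetic: a simple loop a compiler or library could vectorize
// when n is large enough.
static void scale_kernel(const float* in, float s, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * s;
    }
}

// Function pointer standing in for dynamic dispatch: a real library might
// select a CPU-specific kernel at start-up (an assumption for this sketch).
using ScaleFn = void (*)(const float*, float, float*, std::size_t);
static ScaleFn g_scale_impl = scale_kernel;

// Library-style entry point: every call pays for the parameter check and the
// indirect call, no matter how small n is.
Status scale_array(const float* in, float s, float* out, std::size_t n) {
    ++g_calls;
    if (in == nullptr || out == nullptr) {
        return StatusNullPointer;
    }
    g_scale_impl(in, s, out, n);
    return StatusOk;
}

// One call per 3-component vector, as in the Vector3 operators above.
// Returns the number of entry-point calls made.
std::size_t scale_per_vector(const std::vector<float>& xyz, float s,
                             std::vector<float>& out) {
    g_calls = 0;
    for (std::size_t i = 0; i + 3 <= xyz.size(); i += 3) {
        scale_array(&xyz[i], s, &out[i], 3);
    }
    return g_calls;
}

// One call for the whole contiguous array of components.
// Returns the number of entry-point calls made.
std::size_t scale_batched(const std::vector<float>& xyz, float s,
                          std::vector<float>& out) {
    g_calls = 0;
    scale_array(xyz.data(), s, out.data(), xyz.size());
    return g_calls;
}
```

For 100 vectors (300 floats), `scale_per_vector` goes through the entry point 100 times to do 3 multiplications each, while `scale_batched` goes through it once for all 300. If the application can keep many vectors in one contiguous buffer, a single array call of that shape is where a function such as `yepCore_Multiply_V32fS32f_V32f` is more likely to pay off; for individual `Vector3` operations the plain inline arithmetic is the better fit.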
