Afternoon,
I'm writing a custom median function (without looking at existing solutions, i like the challenge), after lots of fiddling I'm most of the way there. I cannot however pass in a column that contains a null value. I'm handling this in the c# Code but it seems to be being stopped by SQL before it gets there.
You get this error...
Msg 6569, Level 16, State 1, Line 11 'Median' failed because parameter 1 is not allowed to be null.
C#:
namespace SQLMedianAggregate
{
[System.Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
Microsoft.SqlServer.Server.Format.UserDefined,
IsInvariantToDuplicates = false, // duplicates may change results
IsInvariantToNulls = true, // receiving a NULL is handled later in code
IsInvariantToOrder = true, // is sorted later
IsNullIfEmpty = true, // if no values are given the result is null
MaxByteSize = -1,
Name = "Median" // name of the aggregate
)]
public struct Median : IBinarySerialize
{
public double Result { get; private set; }
public bool HasValue { get; private set; }
public DataTable DT_Values { get; private set; } //only exists for merge essentially
public static DataTable DT_Final { get; private set; } //Need a static version so its accesible within terminate
public void Init()
{
Result = double.NaN;
HasValue = false;
DT_Values = new DataTable();
DT_Values.Columns.Add("Values", typeof(double));
DT_Final = new DataTable();
DT_Final.Columns.Add("Values", typeof(double));
}
public void Accumulate(double number)
{
if (double.IsNaN(number))
{
//skip
}
else
{
//add to tables
DataRow NR = DT_Values.NewRow();
NR[0] = number;
DT_Values.Rows.Add(NR);
DataRow NR2 = DT_Final.NewRow();
NR2[0] = number;
DT_Final.Rows.Add(NR2);
HasValue = true;
}
}
public void Merge(Median group)
{
// Count the product only if the other group has values
if (group.HasValue)
{
DT_Final.Merge(group.DT_Values);
//DT_Final = DT_Values;
}
}
public double Terminate()
{
if (DT_Final.Rows.Count == 0) //Just to handle roll up so it doesn't crash (doesnt actually work
{
DataRow DR = DT_Final.NewRow();
DR[0] = 0;
DT_Final.Rows.Add(DR);
}
//Sort Results
DataView DV = DT_Final.DefaultView;
DV.Sort = "Values asc";
DataTable DTF = new DataTable();
DTF = DV.ToTable();
////Calculate median and submit result
double MiddleRow = (DT_Final.Rows.Count -1.0) / 2.0;
if (MiddleRow % 2 != 0)
{
double upper = (double)(DT_Final.Rows[Convert.ToInt32(Math.Ceiling(MiddleRow))]["Values"]);
double lower = (double)(DT_Final.Rows[Convert.ToInt32(Math.Floor(MiddleRow))]["Values"]);
Result = lower + ((upper - lower) / 2);
} else
{
Result = (double)(DT_Final.Rows[Convert.ToInt32(MiddleRow)]["Values"]);
}
return Result;
}
public void Read(BinaryReader SerializationReader)
{
//Needed to get this working for some reason
}
public void Write(BinaryWriter SerializationWriter)
{
//Needed to get this working for some reason
}
}
}
SQL:
DROP AGGREGATE dbo.Median
DROP ASSEMBLY MedianAggregate
CREATE ASSEMBLY MedianAggregate
AUTHORIZATION dbo
FROM 'C:\Users\#######\Documents\Visual Studio 2017\Projects\SQLMedianAggregate\SQLMedianAggregate\bin\Debug\SQLMedianAggregate.dll'
WITH PERMISSION_SET = UNSAFE;
CREATE AGGREGATE dbo.Median (@number FLOAT) RETURNS FLOAT
EXTERNAL NAME [MedianAggregate]."SQLMedianAggregate.Median";
Any ideas of what setting or code i'm missing that will allow this. I pretty much just want it to ignore nulls.
SQL Version is SQL2008 R2 btw
The problem is your datatype. You need to use the
Sql*
types for SQLCLR parameters, return values, and result set columns. In this case, you need to change:into:
Then, you access the
double
value using theValue
property that allSql*
types have (i.e.number.Value
in this case).And then, at the beginning of the
Accumulate
method, you need to check forNULL
using theIsNull
property:Also, for more information on using SQLCLR in general, please see the series I am writing on this topic on SQL Server Central: Stairway to SQLCLR (free registration is required to read content on that site, but it's worth it :-).
And, since we are talking about median calculations here, please see the article I wrote (also on SQL Server Central) on the topic of UDAs and UDTs that uses Median as the example: Getting The Most Out of SQL Server 2005 UDTs and UDAs. Please keep in mind that the article was written for SQL Server 2005 which has a hard limit of 8000 bytes of memory for UDTs and UDAs. That limit was lifted in SQL Server 2008, so rather than using the compression technique shown in that article, you could simply set
MaxByteSize
in theSqlUserDefinedAggregate
to-1
(as you are currently doing) orSqlMetaData.MaxSize
(or something very close to that).Also,
DataTable
is a bit heavy-handed for this type of operation. All you need is a simpleList<Double>
:-).Regarding the following line of code (broken into 2 lines here to prevent the need to scroll):
This is a huge misunderstanding of how UDAs and UDTs work. Please do NOT use static variables here. Static variables are shared across Sessions, hence your current approach is not thread-safe. So you would either get errors about it already being declared or various Sessions would alter the value unbeknownst to other Sessions, as they would all share the single instance of
DT_Final
. And the errors and/or odd behavior (i.e. erroneous results that you can't debug) might happen in a single session if a parallel plan is used.UDTs and UDAs get serialized to a binary value stored in memory, and then are deserialized which keeps their state intact. This is the reason for the
Read
andWrite
methods, and why you needed to get those working.Again, you don't need (or want)
DataTables
here as they are over-complicating the operation and take up more memory than is ideal. Please see the article I linked above on UDAs and UDTs to see how the Median operation (and UDAs in general) should work.