For some strange reason, this generic method won't get inlined inside another method unless the other method contains a loop. What could explain this odd behavior? For non-generic methods, inlining happens in both cases, with and without loops.
Code:
using System;
using System.Runtime.CompilerServices;
using SharpLab.Runtime;

[JitGeneric(typeof(int))]
public static class GenericOps<T> where T : unmanaged
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool Less(T left, T right)
    {
        if (typeof(T) == typeof(byte)) return (byte)(object)left < (byte)(object)right;
        if (typeof(T) == typeof(sbyte)) return (sbyte)(object)left < (sbyte)(object)right;
        if (typeof(T) == typeof(ushort)) return (ushort)(object)left < (ushort)(object)right;
        if (typeof(T) == typeof(short)) return (short)(object)left < (short)(object)right;
        if (typeof(T) == typeof(uint)) return (uint)(object)left < (uint)(object)right;
        if (typeof(T) == typeof(int)) return (int)(object)left < (int)(object)right;
        if (typeof(T) == typeof(ulong)) return (ulong)(object)left < (ulong)(object)right;
        if (typeof(T) == typeof(long)) return (long)(object)left < (long)(object)right;
        if (typeof(T) == typeof(float)) return (float)(object)left < (float)(object)right;
        if (typeof(T) == typeof(double)) return (double)(object)left < (double)(object)right;
        return default;
    }
}

[JitGeneric(typeof(int))]
public static class C<T> where T : unmanaged
{
    public static bool M1(T a, T b)
    {
        return GenericOps<T>.Less(a, b);
    }

    public static bool M2(T a, T b)
    {
        for (int i = 0; i < 0; i++) { }
        return GenericOps<T>.Less(a, b);
    }
}
JIT output (disassembled using SharpLab):
// All the type checks are eliminated, since the type is known when the method is JIT-compiled.
// The generated code is identical to that of a direct int < int comparison.
GenericOps`1[[System.Int32, System.Private.CoreLib]].Less(Int32, Int32)
L0000: cmp ecx, edx
L0002: setl al
L0005: movzx eax, al
L0008: ret
// No Inlining
C`1[[System.Int32, System.Private.CoreLib]].M1(Int32, Int32)
L0000: mov rax, GenericOps`1[[System.Int32, System.Private.CoreLib]].Less(Int32, Int32)
L000a: jmp rax
// Direct Inline
C`1[[System.Int32, System.Private.CoreLib]].M2(Int32, Int32)
L0000: cmp ecx, edx
L0002: setl al
L0005: movzx eax, al
L0008: ret
Takeaways:
- The weird thing is that it won't inline the method call in C.M1(), even though inlining would make the generated code smaller. I've tested with different methods as well.
- This odd behavior happens only when the method is generic; a direct non-generic implementation is always inlined.
- It has something to do with the type switches in the generic method. If the generic method does not contain these type switches, it gets inlined in both cases (M1 and M2), even without the AggressiveInlining attribute, as long as the method is short.
- The loop triggers some heuristic that causes the inlining to happen.
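For comparison, the non-generic case mentioned in the takeaways can be sketched roughly as follows. This is only a minimal illustration: the names IntOps and D are hypothetical and do not appear in the code above. According to the observation above, a call like this one is inlined in both callers, with and without the loop.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical non-generic counterpart of GenericOps<T>.Less (illustrative name).
public static class IntOps
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool Less(int left, int right)
    {
        return left < right;
    }
}

// Hypothetical callers mirroring C<T>.M1 and C<T>.M2.
public static class D
{
    // No loop in the caller.
    public static bool M1(int a, int b)
    {
        return IntOps.Less(a, b);
    }

    // Same call, preceded by an empty loop that never executes.
    public static bool M2(int a, int b)
    {
        for (int i = 0; i < 0; i++) { }
        return IntOps.Less(a, b);
    }
}
```

Pasting both versions into SharpLab side by side should make it easy to compare the disassembly of D.M1 against C<int>.M1.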
The questions that arise from this example are:
- Is this behavior intentional or is it a bug?
- Is there a way to guarantee the inlining of the Less() method without using weird loops in the caller method?
- Does this behavior also occur in the System.Numerics.Vector<T> class, since it uses the same generic type switches that get optimized away?
Given that this was fixed in .NET 5, I would call it a bug. Verified in SharpLab with the following .NET versions:
So, to answer your questions:
<T> types. The rules/heuristics are probably quite complex.
Vector<T> would suffer from inlining problems.