Why string.Contains performs better than string.IndexOf?

Question

Why string.Contains performs better than string.IndexOf?

289 views Asked by NirMH At 31 August 2023 at 17:09

I am trying to optimize a method.

Here is the original method:

private const string Separator = "::";
private char[] SeparatorChars = Separator.ToCharArray();

public string GetSubName(string name)
{
    if (name.Contains(Separator))
    {
        return name.Split(SeparatorChars)[0];
    }
    
    return "INVALID";
}

if name doesn't contain ::, return INVALID else return the first element in the array by generated by Split with ::.

I have wrote the following optimized method:

public string GetSubNameOpt(string name)
{
    var index = name.IndexOf(Separator, StringComparison.Ordinal);
    if (index >= 0)
    {
        return name.AsSpan().Slice(0, index).ToString();
    }

    return "INVALID";
}

The goal was to omit the second O(N) iteration over the string in order to split it in case it contains the :: substring.

I benchmarked both method using BenchmarkDotNet and here are the results

Now.. while, as expected, the Optimized method is better in time and memory in the "name contains :: case" due usage of AsSpan and removing of 1 O(N) iteration of the string, it was surprise to me that the Non Optimized method was better for the INVALID case.

EDIT Running with a non empty, not containing :: case.

Again, the Optimized method is slower...

Can you explain what is causing this behavior?

Original Q&A

There are 2 answers

radfast On 17 January 2024 at 10:19

Worth noting that by default, IndexOf(string needle) uses StringComparison.CurrentCulture which is almost certainly going to be lower performance than StringComparison.Ordinal. And probably also not what the code author wants, unless the two strings happen to be localized text strings.

**Guru Stron** · Accepted Answer · 2023-09-01T06:02:37+00:00

Because string.Contains(string) (i.e. without explicitly specified StringComparison) is a very special beast in recent implementations of .NET and looks something like the following:

public bool Contains(string value)
{
    if (value == null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.value);
 
    return SpanHelpers.IndexOf(
        ref _firstChar,
        Length,
        ref value._firstChar,
        value.Length) >= 0;
}

While the string.Contains(string, StringComparison) overload just call IndexOf:

public bool Contains(string value, StringComparison comparisonType)
{
    return IndexOf(value, comparisonType) >= 0;
}

i.e. name.Contains(Separator, StringComparison.Ordinal) will result in "expected" performance.

And IndexOf for ordinal comparison currently uses internal Ordinal.IndexOf which contains quite a lot of checks and different optimizations for more general use-case.

One thing you can try is using AsSpan().IndexOf which on my machine results in better performance for the optimized version:

public string name { get; set; } = "qwerty";
private const string Separator = "::";
private char[] SeparatorChars = Separator.ToCharArray();

[Benchmark]
public string GetSubName()
{
    if (name.Contains(Separator))
    {
        return name.Split(SeparatorChars)[0];
    }
    
    return "INVALID";
}

[Benchmark]
public string GetSubNameOpt()
{
    var index = name.AsSpan().IndexOf(SeparatorChars);
    if (index >= 0)
    {
        return name.AsSpan().Slice(0, index).ToString();
    }
    
    return "INVALID";
}

Method	Mean	Error	StdDev
GetSubName	8.850 ns	0.2388 ns	0.3856 ns
GetSubNameOpt	8.430 ns	0.2337 ns	0.2400 ns

TechQA.

Why string.Contains performs better than string.IndexOf?

There are 2 answers

Related Questions in C#

Related Questions in STRING

Related Questions in PERFORMANCE

Related Questions in BENCHMARKING

Related Questions in .NET-4.6.2

Popular Questions

Trending Questions