Text Processing in .NET #3: Using the Buffer Pool

by bob on January 4, 2021

In the previous installment in this series, we described a method, RemoveConsecutiveCharacters(), that demonstrates some techniques for minimizing string allocation and considerably improving performance as a result. This method begins, as should most any string transformation or concatenation method on a hot path, by creating a buffer in the form of a char[] array of sufficient size to hold the expected result — or, alternatively, the largest possible expected result, some of which can be discarded in the returned string.

We now turn to the matter of eliminating the char[] allocation itself.

You might wonder why the method creates the array via the char[] constructor, and then copies the contents of the string to it — rather than simply using string.ToCharArray() to do both at once. There’s a method to this madness. Although ToCharArray() should be about as fast (an examination of the relevant source code shows that it basically does an unsafe memory copy via pointer techniques and a call to an internal wstrcpy() method), the allocation overhead remains. Every call to this method allocates a char[] and then discards it, leaving it at the garbage collector’s doorstep.

Array allocation for small arrays is quite fast, even with GC overhead, but when we need larger arrays, this overhead becomes problematic. The answer to this is to reuse a preallocated array in some fashion. We can create a non thread-safe system for this quite easily, but let’s go with something more generic and thread-safe: the shared pool managed by System.Buffers and available to us in .NET Core / 5, and via a Nuget package, in CLR 4.5+. We can request an array from the pool of any desired type (in this case, char) and any desired size.

There are some minor limitations. First, any array obtained from the pool must be returned to it after use. While failure to do so does not cause a memory leak as such, it is a performance suck and may eventually cause the pool to run out of resources.

Secondly, while you are guaranteed to get an array of at least your requested size, you usually get a larger size than you ask for. The smallest array I’ve ever seen returned is one of 16 elements.

Thirdly, an array you get from the System.Buffers pool is not guaranteed to have any particular contents. You may get an array of zero/”null” characters, but don’t count on it. It may well contain leftover characters from prior usage. This is no problem for RemoveConsecutiveCharacters() because it immediately populates the array with the characters from the input string anyway.

Lastly, from the published benchmarks I’ve seen, obtaining an array from the pool is actually SLOWER than straight-up allocation if you request less than 1K worth of array (in the case of char[], 512 elements, since each UTF16 character occupies two bytes). At around 1K, the performance is comparable. Larger sizes are faster than conventionally allocating a new array. These benchmarks include GC overhead, so there is not an advantage below 1K requests where reduced GC activity overcomes slower allocation times.

From this, we end up with four scenarios:

  • Thread safe arrays smaller than 1K in size should be conventionally allocated.
  • Thread safe arrays >= 1K in size should be obtained from the array pool.
  • Single threaded scenarios, within a single method or its called private methods, can generally reuse static arrays. This is super fast, faster than either conventional allocation or requesting arrays above a certain size from the buffers pool.
  • Single threaded scenarios with complex requirements might be better served with either conventional allocation or with static pre-allocated and reusable arrays specific to a particular method. As always, long-lived allocations or allocations outside of hot paths (especially of small arrays) can simply be done in the conventional manner.

Let’s first explore how we’d straight-up substitute requesting an array from the array pool, rather than a conventional allocation. Then we’ll abstract away the above decisions about the “cheapest” way to obtain an array by substituting a call to a method of our own that will sort that all out.

So … after adding a using System.Buffers; statement at the top of the file, our simple allocation call:

char[] newChars = new char[s.Length];

… becomes a request to the buffer pool:

char[] newChars = ArrayPool<char>.Shared.Rent(s.Length);

At the very end of the method, we return the buffer to the pool like so:

ArrayPool<char>.Shared.Return(newChars);

And … voila! … there goes another allocation.

There’s an optional bool argument to ArrayPool.Shared.Return, which if true, will clear the array when it goes back into the pool. I don’t recommend using it, however, outside of perhaps returning an array that might contain security-sensitive information like an unencrypted password. Requesting that a returned string be cleared is performance overhead because you’re clearing an array that you probably haven’t even used most of. If you have an algorithm that needs to assume unused bytes are full of zeros, or you simply don’t want random clutter making it harder to single step through your logic during development, use Array.Clear(arrayName,int,int) right after you rent the array, to clear just the portion you actually need to use.

In RemoveConsecutiveCharacters(), we need change no other code due to the extra / unused space of our oversized array, as we are already maintaining the newLength variable and using only the first part of the character array to allocate the final string. In general, we only need to make sure we use the string constructor overload that takes a character array, start position and length argument, so that the resulting string is no larger than the data it needs to hold.

If your method returns a string instance rather than adjusting a by-reference passed string, you’ll need to store the final string in a variable, return the char[] to the pool, and then return the string variable.

Here’s our RemoveConsecutiveCharacters method from our previous article in this series, modified to use the “rented” char array:

using System.Buffers;

public static void RemoveConsecutiveCharacters(ref string s, char c) {
    int state = 0;
    int pos = -1;
    int runLength = 0;
    int newLength = s.Length;
    char[] newChars = ArrayPool<char>.Shared.Rent(s.Length);
    s.AsSpan().CopyTo(newChars.AsSpan());
    int i = 0;
             
    while (i < newLength) {

        if (newChars[i] == c) {
 
            if (state == 0) {
                state = 1; // encountered start of a run of characters
                pos = i; // pointer to first character in run
                runLength = 1;
            } else {
                runLength++;
            }

        } else {

            if (state == 1) {

                if (runLength > 1) {
                    // exited run of characters; copy rest of string over top of all but one
                    int copyLength = newLength - pos - runLength + 1;
                    newLength -= runLength;
                    newLength++;
                    newChars.AsSpan(pos + runLength - 1, copyLength).CopyTo(newChars.AsSpan(pos, copyLength));
                    i -= (runLength - 1); // adjust pointer for next character to examine, as we've shortened the evolving new string value
                    newChars[newLength] = '\0'; // This is just for debugging / visualization ease, it can be commented out after testing
                }
 
                state = 0;
            }
 
         }
 
         i++;
     }
 
     if (newLength < s.Length) {

        if (state == 1 && runLength > 1) {
            // We're inside a run at the end of the string.
            newLength -= runLength;
            newLength++;
        }
 
        s = new string(newChars, 0, newLength);
     }
 
    ArrayPool<char>.Shared.Return(newChars);
}

As you can see, there’s not much difference. Now that you understand the basic concept, we can replace the direct calls into ArrayPool<char>.Shared with methods of our own, which will decide based on the requested size and whether we want thread safety, the best way to return a usable array. We’ll encapsulate these in a static CharArrayPool class, and call these methods Rent(int size) and Return(array).

First. we need an IsThreadSafe property, which you can default to true or false according to your typical need (True is the safer general purpose default). You can set this as appropriate before making any requests for arrays.

When thread safety is requested, we simply do a conventional allocation for requests of arrays smaller than 1K in memory size. Otherwise we return an array from the pool.

When thread safety isn’t needed, we reuse an array of characters of generous size (4K elements or 8K of memory, in our case) which nevertheless will be resized dynamically if a larger request is ever made. Initially our list of reusable arrays is empty. The first array is allocated on the first Rent() call. More arrays are added to the list as needed. Return() calls are assumed to be in reverse order of the original requests, or “last out / first in”.

Here’s how one might implement this within a static library class:

public static class CharArrayPool {
    private static bool _isThreadSafe = false;
    public static bool IsThreadSafe { get => _isThreadSafe; set => _isThreadSafe = value; }

    // _chars is a shared list of char arrays used to obtain temporary buffers.
    // This is used when _isThreadSafe = false for super fast performance.
    private static List<char[]> _chars = new List<char[]>(6); // initially empty, but capacity of 6 so the backing array likely won't have to be re-grown
    private static int _charsPointer = 0; // always points to next _chars index to obtain string from
    private static int MIN_POOL_REQUEST_SIZE = 512;
    private static int SINGLE_THREAD_BASE_SIZE = 4096;

    public static char[] Rent(int minsize) {

        if (_isThreadSafe) {

            if (minsize >= MIN_POOL_REQUEST_SIZE) {
                return ArrayPool<char>.Shared.Rent(minsize);
            } else {
                return new char[minsize];
            }

        } else {

            if (_charsPointer == _chars.Count) {
                // requesting more arrays than currently in the _chars list
                _chars.Add(new char[SINGLE_THREAD_BASE_SIZE]);
            }

            if (minsize > _chars[_charsPointer].Length) {
                // request is larger than current size; grow it.
                int bufferLength = _chars[_charsPointer].Length;

                while (minsize > bufferLength) {
                    bufferLength *= 2;
                }

                _chars[_charsPointer] = new char[bufferLength];
            }

            return _chars[_charsPointer++];
        }

    }

    public static void Return(char[] buffer) {

        if (_isThreadSafe) {

            if (buffer.Length >= MIN_POOL_REQUEST_SIZE) {
                ArrayPool<char>.Shared.Return(buffer);
            }

        } else {

            if (_charsPointer == 0) {
                throw new ApplicationException("ReturnBufferToPool(): no buffers to return.");
            } else {
                _charsPointer--;
            }

        }

    }

}

Now in our RemoveConsecutiveCharacters() method, the ArrayPool<char>.Shared.Rent() call would be replaced with CharArrayPool.Rent() and the ArrayPool<char>.Shared.Return() call would be replaced with CharArrayPool.Return(). In conjunction with the properly set CharArrayPool.IsThreadSafe property, you’ll always get the most efficiently allocated arrays.

We’ve now covered basic string manipulation concepts, as well as some fairly advanced techniques for minimizing allocations that are especially useful on hot paths. In my next post in this series, I’ll discuss some of the other tricks available to us in the latest .NET runtime, and review a useful open source library of routines that implement the techniques I’ve discussed.

Leave a Comment

Previous post: