[Source] [Repost] .NET performance optimization - quickly traversing List collections

Posted on 2022-8-28 20:51:16
Brief introduction

System.Collections.Generic.List<T> is a generic collection class in .NET that can store any type of data. Thanks to its convenience and rich API, it is used widely in everyday development and is arguably the most commonly used collection class.

When writing code, we often need to iterate over a List<T> to get its elements for some business processing. Normally a collection does not hold many elements and traversal is very fast. But for big data processing, statistics, real-time computing and similar workloads, how do you quickly traverse a List<T> holding tens or hundreds of thousands of items? That is what I want to share with you today.

Traversal methods

Let's look at the performance of different traversal methods. We build the following performance benchmark, which traverses collections of different orders of magnitude to compare the methods. The code snippet looks like this:
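A minimal sketch of such a harness, assuming a plain Stopwatch rather than whatever benchmarking tool the author actually used. The class name TraversalTimer, the collection sizes and the method names are all hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class TraversalTimer
{
    // Hypothetical sizes; the article only says "different orders of magnitude".
    public static readonly int[] Sizes = { 1_000, 10_000, 100_000, 1_000_000 };

    // Builds a list containing 0, 1, ..., size - 1.
    public static List<int> Build(int size) => Enumerable.Range(0, size).ToList();

    // Times one traversal, after a warm-up call so JIT compilation is not measured.
    public static TimeSpan Measure(List<int> list, Action<List<int>> traverse)
    {
        traverse(list);
        var stopwatch = Stopwatch.StartNew();
        traverse(list);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Prints one line per collection size for the given traversal method.
    public static void Report(string name, Action<List<int>> traverse)
    {
        foreach (int size in Sizes)
        {
            TimeSpan elapsed = Measure(Build(size), traverse);
            Console.WriteLine($"{name,-12} n={size,9}: {elapsed.TotalMilliseconds:F3} ms");
        }
    }
}
```

A call such as `TraversalTimer.Report("foreach", l => { foreach (int x in l) { } });` would print one timing per size. A single Stopwatch run is noisy, so a dedicated benchmarking library is the better tool for real measurements.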

Use the foreach statement

foreach is the most common way to traverse a collection. It is syntactic sugar for the iterator pattern, and it serves as the baseline for this comparison.

Because the foreach statement is syntactic sugar, the compiler ultimately emits a while loop that calls GetEnumerator() and MoveNext(). The compiled code looks like this:
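A rough sketch of that lowering, written by hand for illustration rather than taken from actual compiler output. The class and method names are hypothetical, and a sum is computed only so that the two forms can be compared:

```csharp
using System;
using System.Collections.Generic;

public static class ForeachLowering
{
    // The loop as it is written in source code.
    public static long SumWithForeach(List<int> list)
    {
        long sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }

    // Hand-written approximation of what the loop above is lowered to.
    public static long SumLowered(List<int> list)
    {
        long sum = 0;
        List<int>.Enumerator enumerator = list.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int item = enumerator.Current;
                sum += item;
            }
        }
        finally
        {
            enumerator.Dispose();
        }
        return sum;
    }
}
```

Both methods return the same result for the same list; the second one just makes the GetEnumerator(), MoveNext() and Current calls visible.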



The MoveNext() implementation checks the collection's version number on every step so that a modification made during iteration is detected; if one occurs, it throws an InvalidOperationException. It also performs a bounds check to verify that the current index is valid, and it has to assign the corresponding element to the enumerator's Current property. So its performance is in fact not the best. The code snippet looks like this:
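The version check can be observed from the outside with a small experiment. This is a demonstration of the behavior, not the MoveNext() source, and the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class VersionCheckDemo
{
    // Returns true if mutating the list inside foreach raised InvalidOperationException.
    public static bool MutationIsDetected()
    {
        var list = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int item in list)
            {
                list.Add(item); // changes the list's internal version number
            }
        }
        catch (InvalidOperationException)
        {
            return true; // raised by the MoveNext() call that follows the Add
        }
        return false;
    }
}
```

The exception comes from the enumerator, not from Add itself, which is why the same mutation inside a plain for loop goes unnoticed.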



Let's take a look at how it performs across different collection sizes. The results look like this:



The elapsed time grows linearly with the size of the collection. Even traversing 1,000,000 elements without any processing logic takes at least 1 s.

Use the ForEach method of List

Another common way is the List<T>.ForEach() method, which takes an Action<T> delegate and invokes it for each element as it iterates.

Because it is implemented inside List<T>, it can access the private array directly and avoid the bounds check. In theory it should be fast. But in our scenario the delegate is just an empty method, and calling it through a delegate may not perform as well as the foreach loop, whose body can be fully inlined. Below is the source code of the ForEach method, which shows that it has no bounds check but still keeps the version number check.
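To illustrate the shape of such a method, here is a heavily simplified stand-in. TinyList<T> is a hypothetical class written for this sketch and is not the actual List<T> source; it only mirrors the two properties described above, direct array access and a version number check:

```csharp
using System;

// Hypothetical, simplified stand-in for List<T>; not the real library source.
public sealed class TinyList<T>
{
    private T[] _items = new T[4];
    private int _size;
    private int _version;

    public int Count => _size;

    public void Add(T item)
    {
        if (_size == _items.Length)
        {
            Array.Resize(ref _items, _items.Length * 2);
        }
        _items[_size++] = item;
        _version++; // every mutation changes the version number
    }

    public void ForEach(Action<T> action)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));

        int version = _version; // snapshot taken before iterating
        for (int i = 0; i < _size; i++)
        {
            if (version != _version) break; // list was modified by the callback
            action(_items[i]);              // reads the private array, no indexer call
        }

        if (version != _version)
        {
            throw new InvalidOperationException("Collection was modified during ForEach.");
        }
    }
}
```

In this model the cost per element is one version comparison plus one delegate invocation, which is the part that cannot be inlined the way a foreach body can.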



In addition, since a delegate has to be passed to the ForEach method, the calling code checks on every call whether the delegate object cached in the compiler-generated closure class is null and, if it is, creates a new Action<T>(), as shown below:
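A hand-written approximation of that caching pattern. The real compiler-generated class and field have unspeakable names, so s_cachedAction and EmptyBody below are hypothetical stand-ins:

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class DelegateCachingSketch
{
    // Stand-in for the compiler-generated cache field (hypothetical name).
    private static Action<int>? s_cachedAction;

    // Stand-in for the compiler-generated method that holds the lambda body.
    private static void EmptyBody(int item) { }

    // What the developer writes.
    public static void CallAsWritten(List<int> list)
    {
        list.ForEach(item => { });
    }

    // Roughly what runs: a null check on every call, an allocation only on the first.
    public static void CallAsLowered(List<int> list)
    {
        list.ForEach(s_cachedAction ??= new Action<int>(EmptyBody));
    }

    public static bool IsCached => s_cachedAction != null;
}
```

The null check is cheap, but it is extra work on every call that a plain foreach loop does not have.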



Let's take a look at how it compares to the foreach keyword in terms of performance. The following image shows the results of the benchmark:



Judging from the test results, it is 40% slower than using the foreach keyword directly. Unless there is a specific need for it, foreach is the better choice. So is there a faster way?

for loop traversal

Let's go back to the oldest way: using the for keyword to traverse the collection. It should be the best-performing method so far, because it does not need the extra code of the previous ones (although the indexer still performs a bounds check). It also does not check the version number, so if the collection is changed during the loop, for example from another thread, no exception is thrown. The test code looks like this:
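A minimal sketch of index-based traversal. The class and method names are hypothetical, and a sum is computed only to give the loop an observable result:

```csharp
using System.Collections.Generic;

public static class ForLoopTraversal
{
    public static long Sum(List<int> list)
    {
        long sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += list[i]; // the indexer still range-checks i, but no version check runs
        }
        return sum;
    }
}
```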

Let's see how it turns out.



This matches our expectation. Using the for loop directly is 60% faster than foreach: a collection that used to take 1 second to traverse now takes only 400 milliseconds. So is there a faster way?

Use CollectionsMarshal

Starting with .NET 5, the dotnet community added the CollectionsMarshal class to improve the performance of collection operations. This class provides access to the underlying arrays of collection types (if you have read my article [.NET Performance Optimization - You Should Set the Initial Size for Collection Types], you know that many data structures are backed by arrays). It can therefore skip all of the checks and access the underlying array directly, which should be the fastest. The code looks like this:
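A minimal sketch using CollectionsMarshal.AsSpan, which requires .NET 5 or later. The class and method names are hypothetical; both a foreach and a for loop over the span are shown:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class SpanTraversal
{
    // Iterates the span over the list's backing array with foreach.
    public static long SumWithForeach(List<int> list)
    {
        Span<int> span = CollectionsMarshal.AsSpan(list);
        long sum = 0;
        foreach (int item in span)
        {
            sum += item;
        }
        return sum;
    }

    // Iterates the same span with an index-based for loop.
    public static long SumWithFor(List<int> list)
    {
        Span<int> span = CollectionsMarshal.AsSpan(list);
        long sum = 0;
        for (int i = 0; i < span.Length; i++)
        {
            sum += span[i];
        }
        return sum;
    }
}
```

The span covers only the first Count elements of the backing array, so unused capacity is never visited.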

You can see that the code generated by the compiler is very efficient.



Accessing the underlying array directly is very dangerous: you must know exactly what every line of code does, and you must test thoroughly. The benchmark results are as follows:
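One concrete form of that danger: a span obtained before the list grows keeps pointing at the old backing array. The sketch below assumes .NET 5 or later, and the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class StaleSpanDemo
{
    // Returns list[0] after writing through a span that has gone stale.
    public static int FirstElementAfterStaleWrite()
    {
        var list = new List<int> { 1, 2, 3 };
        Span<int> span = CollectionsMarshal.AsSpan(list);

        list.AddRange(new int[100]); // exceeds the capacity, so the list switches to a new array
        span[0] = 42;                // writes into the old array, which the list no longer uses

        return list[0];              // the list itself never sees the write
    }
}
```

No exception is thrown anywhere in this code, which is exactly why the list should not be added to or removed from while a span over it is in use.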



Wow, using CollectionsMarshal is 79% faster than using foreach on the List. Probably thanks to JIT optimization, there is no big difference between looping over the Span with the foreach keyword and with the for keyword.

Summary

Today I talked about how to quickly traverse a List collection. In most cases the foreach keyword is recommended: it has both bounds checking and version number checking, which makes it easier to write correct code.

If you need high performance with large data volumes, it is recommended to traverse the collection with for or with CollectionsMarshal.AsSpan directly.

Posted on 2022-9-4 22:15:52 |
Learn to learn
Posted on 2022-9-8 10:33:05 |
Learn to learn
Posted on 2023-6-27 22:39:13 |
Hello 12306 Can you send me a private message with data