We all know that CPU and memory are the two most important metrics for a program, but how many of us have really thought about this question: how many bytes does an instance of a type (value type or reference type) occupy in memory? Many of us cannot answer it. C# provides several operators and APIs for calculating sizes, but none of them fully answers the question. This article presents a method for calculating the number of bytes of memory occupied by instances of value types and reference types. The source code can be downloaded from here.
1. sizeof operator
2. Marshal.SizeOf method
3. Unsafe.SizeOf method
4. Can it be calculated based on the type of field member?
5. Layout of value types and reference types
6. Ldflda instruction
7. Calculate the number of bytes of the value type
8. Calculate the number of bytes of the reference type
9. Complete calculation
1. sizeof operator
The sizeof operator determines the number of bytes occupied by an instance of a type, but it can only be applied to unmanaged types. The unmanaged types are limited to:
Primitive types: Boolean, Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, UInt64, IntPtr, UIntPtr, Char, Double, and Single
The Decimal type
Enumeration types
Pointer types
Structs that contain only data members of unmanaged types
As the name suggests, an unmanaged type is a value type whose instances cannot contain any reference to a managed object. If we define a generic method that calls the sizeof operator, the generic parameter T must carry the unmanaged constraint and the method must be marked unsafe.
Only primitive and enumeration types can use the sizeof operator directly. When it is applied to other types (pointers and custom structs), the /unsafe compiler option must be added and the expression must be placed in an unsafe context.
Since the following struct Foobar is not an unmanaged type, the program fails to compile.
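To make the rules above concrete, here is a minimal sketch. The field layout of Foobar (an int plus a string) is an assumption made for illustration, and SizeofDemo is a hypothetical helper name. The sketch only uses sizeof where no unsafe context is needed and uses RuntimeHelpers.IsReferenceOrContainsReferences to show why Foobar does not qualify as an unmanaged type.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical struct for illustration: the string field is a managed reference,
// so the struct is not an unmanaged type.
public struct Foobar
{
    public int Foo;
    public string Bar;
}

public static class SizeofDemo
{
    // Built-in and enum types: sizeof is a compile-time constant, no unsafe context needed.
    public static int IntSize() => sizeof(int);
    public static int DecimalSize() => sizeof(decimal);
    public static int EnumSize() => sizeof(DayOfWeek); // size of the underlying int

    // Applying sizeof to Foobar is the case the text describes as a compile-time
    // error, so it is deliberately not attempted here. This check shows the reason:
    // the struct contains a managed reference.
    public static bool FoobarHoldsReferences() =>
        RuntimeHelpers.IsReferenceOrContainsReferences<Foobar>();
}
```

Calling SizeofDemo.IntSize(), DecimalSize(), and EnumSize() yields 4, 16, and 4, while FoobarHoldsReferences() reports true.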
2. Marshal.SizeOf method
The static type Marshal defines a series of APIs that help us allocate and copy unmanaged memory, convert between managed and unmanaged types, and perform other operations on unmanaged memory (in computer science, "marshal" refers to converting in-memory objects into a format suitable for storage or transfer). It includes the following four SizeOf overloads for determining the number of bytes of a given type or object.
The Marshal.SizeOf method does not require the specified type to be an unmanaged type, but it still requires a value type. If an object is passed in, it must be a boxed value type.
Since the following Foobar is defined as a class, calls to both SizeOf methods throw an ArgumentException with the message: Type 'Foobar' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed.
The Marshal.SizeOf method does not support generic types, and it also places requirements on the layout of the struct: only the Sequential and Explicit layout modes are supported. Since the Foobar struct shown below adopts the Auto layout mode (because memory layout requirements are stricter in unmanaged environments, Auto, which lets the runtime plan the layout dynamically based on the field members, is not supported), calls to the SizeOf method still throw the same ArgumentException as above.
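The behaviour described in this section can be sketched as follows. The three types and the TryMarshalSize helper are hypothetical names introduced for illustration; only Marshal.SizeOf<T>() is the real API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical types for illustration.
[StructLayout(LayoutKind.Sequential)]
public struct SequentialPair { public byte A; public int B; }

[StructLayout(LayoutKind.Auto)]
public struct AutoPair { public byte A; public int B; }

public class PlainClass { public byte A; public int B; }

public static class MarshalSizeDemo
{
    // Returns the marshaled size, or null when Marshal.SizeOf rejects the type
    // with an ArgumentException.
    public static int? TryMarshalSize<T>()
    {
        try { return Marshal.SizeOf<T>(); }
        catch (ArgumentException) { return null; }
    }
}
```

TryMarshalSize<SequentialPair>() returns 8 (the byte is padded so that the int starts at offset 4), while AutoPair and PlainClass both produce null because of their Auto layout.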
3. Unsafe.SizeOf method
The static type Unsafe provides lower-level operations on unmanaged memory, and it defines a similar SizeOf method. The method places no restrictions on the specified type, but if you specify a reference type, it returns the number of bytes of a pointer (IntPtr.Size).
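A short sketch of this behaviour, assuming a runtime where System.Runtime.CompilerServices.Unsafe is available (it is built into recent .NET versions and shipped as a NuGet package for older ones). NameAndAge and UnsafeSizeDemo are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical struct that contains a managed reference.
public struct NameAndAge { public string Name; public int Age; }

public static class UnsafeSizeDemo
{
    public static int OfInt() => Unsafe.SizeOf<int>();

    // Reference type: only the size of the reference itself is reported.
    public static int OfString() => Unsafe.SizeOf<string>();

    // Structs with reference fields are accepted, unlike with sizeof.
    public static int OfStructWithReference() => Unsafe.SizeOf<NameAndAge>();
}
```

OfString() equals IntPtr.Size no matter how long the string is, which is why this method cannot tell us how much memory a reference type instance occupies.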
4. Can it be calculated based on the type of field member?
We know that instances of both value types and reference types are mapped to a contiguous sequence of bytes in memory (or stored directly in a register). The purpose of a type is to specify the memory layout of an object: instances of the same type have the same layout, so their number of bytes is naturally the same (a field of a reference type stores only the address of the referenced object in this byte sequence). Since the byte length is determined by the type, if we know the type of each field member, shouldn't we be able to calculate the number of bytes for the type? In fact, we cannot.
For example, we know that byte, short, int, and long occupy 1, 2, 4, and 8 bytes respectively, so two byte fields together occupy 2 bytes. For the combinations byte + short, byte + int, and byte + long, however, the sizes are not 3, 5, and 9 but 4, 8, and 16, because memory alignment comes into play: each field starts at an offset that is a multiple of its own alignment, and the total size is rounded up to a multiple of the largest alignment.
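The effect of alignment can be observed with a few hypothetical two-field structs and Unsafe.SizeOf. This is a sketch, and the value for byte + long assumes a 64-bit runtime.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical structs, one per field combination discussed above.
public struct ByteByte  { public byte A; public byte B; }
public struct ByteShort { public byte A; public short B; }
public struct ByteInt   { public byte A; public int B; }
public struct ByteLong  { public byte A; public long B; }

public static class AlignmentDemo
{
    // Sizes in the order: byte+byte, byte+short, byte+int, byte+long.
    public static int[] Sizes() => new[]
    {
        Unsafe.SizeOf<ByteByte>(),
        Unsafe.SizeOf<ByteShort>(),
        Unsafe.SizeOf<ByteInt>(),
        Unsafe.SizeOf<ByteLong>(),
    };
}
```

On a 64-bit runtime the array is expected to contain 2, 4, 8, and 16, rather than the naive sums 2, 3, 5, and 9.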
5. Layout of value types and reference types
Even with exactly the same data members, instances of a reference type and of a value type occupy a different number of bytes. As shown in the following image, the byte sequence of a value type instance consists entirely of the bytes that store its field members. For an instance of a reference type, the address of the type's method table is additionally stored in front of the field bytes. The method table provides almost all the metadata describing the type, and this pointer is how the runtime determines which type an instance belongs to. At the very front there are some extra bytes, which we will call the Object Header. It is used not only to store the lock state of the object; the hash code can also be cached there. When we create a variable of a reference type, the variable does not point to the first byte of the memory occupied by the instance, but to the location where the method table address is stored.
6. Ldflda instruction
As described above, neither the sizeof operator nor the SizeOf methods provided by the static types Marshal and Unsafe really solve the problem of calculating the number of bytes occupied by an instance. As far as I know, the problem cannot be solved within C# alone, but the Ldflda instruction available at the IL level can help. As the name suggests, Ldflda stands for Load Field Address: it gives us the address of a field of an instance. Since this IL instruction has no corresponding API in C#, we can only use it in the following form, through IL Emit.
As shown in the code snippet above, the SizeCalculator type has a GenerateFieldAddressAccessor method, which generates a delegate of type Func<object?, long[]> from the field list of the specified type. The delegate returns the memory address of the given object and of each of its fields. With the address of the object itself and the address of each field, we can derive the offset of each field and then calculate the number of bytes of memory occupied by the entire instance.
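As a simplified illustration of emitting Ldflda, the sketch below is not the article's GenerateFieldAddressAccessor. It is a hypothetical, smaller variant that emits two Ldflda instructions and subtracts the resulting managed pointers, returning the distance in bytes between two fields of a class instance. The names LdfldaSketch, BuildFieldDistance, and TwoInts are assumptions made for this example, and it requires a runtime that supports System.Reflection.Emit.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical class used to try the sketch.
public class TwoInts { public int First; public int Second; }

public static class LdfldaSketch
{
    // Builds a delegate that returns (address of 'second') - (address of 'first')
    // for an instance of the class that declares both fields.
    public static Func<object, long> BuildFieldDistance(FieldInfo first, FieldInfo second)
    {
        Type owner = first.DeclaringType;
        var method = new DynamicMethod(
            "FieldDistance", typeof(long), new[] { typeof(object) },
            typeof(LdfldaSketch).Module, true);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Castclass, owner);
        il.Emit(OpCodes.Ldflda, second);  // managed pointer to the second field
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Castclass, owner);
        il.Emit(OpCodes.Ldflda, first);   // managed pointer to the first field
        il.Emit(OpCodes.Sub);             // pointer difference as a native int
        il.Emit(OpCodes.Conv_I8);
        il.Emit(OpCodes.Ret);

        return (Func<object, long>)method.CreateDelegate(typeof(Func<object, long>));
    }
}
```

For example, building the delegate from typeof(TwoInts).GetField("First") and GetField("Second") and invoking it on a new TwoInts gives a distance of 4 bytes in absolute value. Subtracting the two managed pointers directly, instead of first converting each to an integer, keeps both addresses tracked by the garbage collector until the subtraction happens.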
7. Calculate the number of bytes of the value type
Since value types and reference types have different layouts in memory, we need different calculations for them. Because the bytes of a struct are exactly the contents of all its fields in memory, we can use a small trick. Suppose we need to calculate the number of bytes of a struct of type T: we create a ValueTuple<T,T> tuple, and the offset of its second field, Item2, is the number of bytes of struct T. The calculation is implemented in the following CalculateValueTypeInstance method.
As shown in the code snippet above, assuming the struct type to be measured is T, we call the GetDefaultAsObject method to obtain the default(T) object through reflection, and then create a ValueTuple<T,T> tuple from it. After calling GenerateFieldAddressAccessor to obtain the Func<object?, long[]> delegate that computes the addresses of an instance and its fields, we invoke the delegate with the tuple as its argument. Of the three addresses returned, the address of the tuple and the address of its first field are the same, so subtracting the first address from the third one, which represents Item2, gives the result we want.
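The ValueTuple<T,T> idea can also be tried without IL Emit. The sketch below swaps in Unsafe.ByteOffset to measure the distance between Item1 and Item2; it is an alternative illustration, not the article's CalculateValueTypeInstance, and TupleOffsetSketch and SizeViaTuple are hypothetical names. It assumes the runtime lays out Item1 before Item2.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class TupleOffsetSketch
{
    // The offset of Item2 inside ValueTuple<T, T> equals the number of bytes
    // that one T occupies, including any trailing padding.
    public static int SizeViaTuple<T>() where T : struct
    {
        var pair = default(ValueTuple<T, T>);
        return (int)Unsafe.ByteOffset(ref pair.Item1, ref pair.Item2);
    }
}
```

SizeViaTuple<int>() gives 4 and SizeViaTuple<Guid>() gives 16, matching Unsafe.SizeOf for the same types.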
8. Calculate the number of bytes of the reference type
The calculation for reference types is more complicated and works as follows: after obtaining the address of the instance itself and of each field, we sort the addresses to find the offset of the last field. We add the number of bytes of the last field itself to this offset, and then add the necessary leading and trailing bytes to obtain the result. This is implemented in the following CalculateReferneceTypeInstance method.
As shown in the code snippet above, if the specified type does not define any fields, CalculateReferneceTypeInstance returns the minimum number of bytes of a reference type instance: 3 times the number of bytes of an address pointer. On the x86 architecture, a reference type object takes up at least 12 bytes: the ObjectHeader (4 bytes), the method table pointer (4 bytes), and at least 4 bytes of field content (these 4 bytes are required even if the type defines no fields). On the x64 architecture, the minimum is 24 bytes, because the method table pointer and the minimum field content each become 8 bytes, and although the valid content of the ObjectHeader occupies only 4 bytes, 4 bytes of padding are added in front of it.
Calculating the bytes occupied by the last field is also simple: if its type is a value type, the CalculateValueTypeInstance method defined earlier is called; if it is a reference type, the field stores only the memory address of the target object, so its length is IntPtr.Size. Since reference type instances are aligned to IntPtr.Size in memory by default, the result is rounded up to a multiple of IntPtr.Size here as well. Finally, don't forget that a reference to a reference type instance does not point to the first byte of its memory but to the bytes that store the method table pointer, so the number of bytes of the ObjectHeader (IntPtr.Size) has to be added.
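The arithmetic described in this section can be sketched as a pure function. The names ReferenceSizeMath, MinimumSize, AlignUp, and InstanceSize are hypothetical, and the pointer size is passed in explicitly so that both the x86 and x64 cases can be worked through. The last field's offset is measured from the address the reference points to, that is, from the method table pointer.

```csharp
using System;

public static class ReferenceSizeMath
{
    // Smallest possible instance: header + method table pointer + minimum field area.
    public static long MinimumSize(int pointerSize) => 3L * pointerSize;

    // Rounds value up to the next multiple of alignment.
    public static long AlignUp(long value, int alignment) =>
        (value + alignment - 1) / alignment * alignment;

    // lastFieldOffset: offset of the last field relative to the object reference.
    // lastFieldSize:   number of bytes occupied by that field.
    public static long InstanceSize(long lastFieldOffset, long lastFieldSize, int pointerSize)
    {
        long endOfFields = AlignUp(lastFieldOffset + lastFieldSize, pointerSize);
        return endOfFields + pointerSize; // add the ObjectHeader in front of the reference
    }
}
```

For a class with a single int field on x64, the field sits at offset 8, so InstanceSize(8, 4, 8) gives 12 rounded up to 16, plus 8 for the header, which is 24 and equals the minimum size. With two long fields, InstanceSize(16, 8, 8) gives 32.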
9. Complete calculation
The two methods for calculating the number of bytes of value type and reference type instances are used by the following SizeOf method. Since the Ldflda instruction needs an instance to operate on, this method accepts, in addition to the target type, a delegate that supplies such an instance. The delegate parameter is optional: for a value type we use its default value, and for a reference type we try to create the target object with its default constructor. If the delegate is not provided and the target instance cannot be created, the SizeOf method throws an exception. Although a target instance is required, the calculated result depends only on the type, so we cache it. For convenience, we also provide a generic SizeOf<T> method.
In the code snippet below, we use it to output the number of bytes of a struct and a class with the same field definitions. In the next article, we will go further and obtain the full binary content of an instance in memory based on the calculated number of bytes, so stay tuned.
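The instance resolution and caching described above might look roughly like the following sketch. It is not the article's SizeOf method: the actual byte calculation is passed in as a hypothetical measure delegate, and SizeCacheSketch and ResolveInstance are names assumed for this example.

```csharp
using System;
using System.Collections.Concurrent;

public static class SizeCacheSketch
{
    // The result depends only on the type, so it is cached per type.
    private static readonly ConcurrentDictionary<Type, int> Sizes =
        new ConcurrentDictionary<Type, int>();

    // measure: hypothetical stand-in for the value/reference type calculations.
    // instanceAccessor: optional delegate that supplies an instance to measure.
    public static int SizeOf(Type type, Func<Type, object, int> measure,
                             Func<object> instanceAccessor = null)
    {
        return Sizes.GetOrAdd(type, t => measure(t, ResolveInstance(t, instanceAccessor)));
    }

    private static object ResolveInstance(Type type, Func<object> instanceAccessor)
    {
        object instance = instanceAccessor?.Invoke();
        if (instance != null) return instance;
        try
        {
            // Default value for value types, default constructor for reference types.
            return Activator.CreateInstance(type);
        }
        catch (Exception ex)
        {
            throw new InvalidOperationException(
                "No instance was supplied and " + type + " could not be created.", ex);
        }
    }
}
```

Calling SizeOf twice for the same type invokes measure only once. A type without a default constructor, such as string, needs an instanceAccessor; otherwise an InvalidOperationException is thrown.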