.NET Foundations – Memory model (Part 1)
The previous two parts of the .NET Foundations series covered the structure of a managed assembly and the .NET execution model. Originally I planned this to be one long, complete post about the stack and heap, but it got too long, so I decided to split it into two parts:
- The first part sets the stage by explaining the concepts that need to be understood in order to follow the stack- and heap-related examples. Questions covered here are:
- what the stack and heap are (with a little more detail than is usually found in most of the posts I have come across)
- what processes, application domains and threads are, and what the relationship between them is
- what reference and value types are, and what the conceptual difference between them is
- In the second part, I will use those explanations to try to answer some common stack- and heap-based questions:
- what memory allocation looks like for reference and value types
- what boxing/unboxing is and why we have to be careful with it
- what the differences between static and instance members are from a memory allocation perspective, and which one we should prefer in our code
- what the difference between value types and reference types is, etc.
As you can tell from this list, this will be a long post, but with lots of useful information (I hope 🙂 )
Setting the stage up
Process, Application domains and threads
As we saw in the .NET execution model post, execution of a .NET assembly starts with the Win32 process primary thread calling the shim (mscoree.dll), which in turn loads the CLR itself (mscorwks.dll). As a part of the CLR bootstrapping process, a default application domain and two additional application domains are created: the system and shared application domains.
An application domain is a concept .NET uses to isolate different .NET applications running in the same process by giving each of them its own logically isolated memory boundary and scoped resources. The main advantages of the CLR application domain model are that:
- it enables multiple application domains to exist in a single process, which has a lower system cost than creating multiple processes,
- “what happens in the domain stays in the domain”, which means that domains
- are not affected by faults or exceptions occurring in other domains
- communicate with each other through well-defined, decoupled mechanisms, without direct access to objects living outside the current domain
- separate domains can have separate configuration settings and security models
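The isolation described in the list above can be sketched with a static field. This is a minimal illustration, assuming the classic .NET Framework (on .NET Core and later, AppDomain.CreateDomain throws PlatformNotSupportedException); the names DomainIsolationDemo, Counter and "WorkerDomain" are made up for the example.

```csharp
using System;

public static class DomainIsolationDemo
{
    // Static state is kept per application domain, not per process.
    public static int Counter;

    public static void Main()
    {
        Counter = 42; // set in the default domain only

        // "WorkerDomain" is a hypothetical name chosen for this sketch.
        AppDomain worker = AppDomain.CreateDomain("WorkerDomain");
        try
        {
            // Executes ReportCounter inside the worker domain.
            worker.DoCallBack(ReportCounter);
        }
        finally
        {
            AppDomain.Unload(worker);
        }

        ReportCounter(); // back in the default domain
    }

    public static void ReportCounter()
    {
        Console.WriteLine("{0}: Counter = {1}",
            AppDomain.CurrentDomain.FriendlyName, Counter);
    }
}
```

On the .NET Framework, the worker domain should report Counter = 0 while the default domain reports 42, because each domain gets its own copy of the static field.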
As I’ve mentioned, during the CLR bootstrapping process the shared and system domains are created alongside the default application domain.
The purpose of SharedDomain is to contain all the assemblies that are used from multiple application domains. The advantage of hosting them in the shared domain is better performance, because each method is JIT compiled only once regardless of the number of application domains using it. Without the shared domain, JIT compilation happens “once per method per domain”, and memory consumption is also higher because the same assembly is loaded redundantly into each application domain. During CLR bootstrapping, Mscorlib.dll (the system library) and the fundamental types from the System namespace, such as Object, ValueType, Array, Enum, String and Delegate, are preloaded into the shared domain and shared across all the other domains. Console programs can load code into SharedDomain by annotating the app’s Main method with System.LoaderOptimizationAttribute.
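As a small sketch of that last point, the attribute is applied to the entry point like this. The class name LoaderOptimizationDemo is made up for the example, and the choice of LoaderOptimization.MultiDomain is an assumption about what a console program would typically pick.

```csharp
using System;

public static class LoaderOptimizationDemo
{
    // MultiDomain asks the loader to share code across application domains
    // where possible. The attribute is only meaningful on the Main method
    // of an executable.
    [LoaderOptimization(LoaderOptimization.MultiDomain)]
    public static void Main()
    {
        Console.WriteLine("Running in: " + AppDomain.CurrentDomain.FriendlyName);
    }
}
```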
The purpose of SystemDomain is to create and initialize SharedDomain and the default AppDomain, handle string interning, etc.
(In case you are interested in more details on application domains, check out this excellent post)
The CLR bootstrap process creates only one default application domain, but if needed, multiple application domains can coexist inside the process without impacting each other, optionally with different security models. During the original design of the .NET Framework, tests showed that up to 1000 separate application domains, each containing a small application, can coexist effectively in one process (which doesn’t mean that this is something we should strive for 🙂 ).
Finally, once the CLR is initialized and loaded, the shim calls the entry point defined in the CLR header, and from that point on the managed application is running and processing IL code in the default application domain.
The whole bootstrap process is driven by the process primary thread, but if needed, multiple threads can execute inside the process in one application domain. The only constraint enforced by the .NET application model is that a thread can execute in only one application domain at a time, although it can switch application domains if needed.
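To make the thread/domain relationship a bit more tangible, here is a minimal sketch in which two threads of the same process report the application domain they are currently executing in. ThreadDomainDemo and its method names are hypothetical; Thread.GetDomain() is the framework call that returns the current thread's domain.

```csharp
using System;
using System.Threading;

public static class ThreadDomainDemo
{
    public static void Main()
    {
        Console.WriteLine(Describe("primary"));

        // A second thread in the same process and the same (default) domain.
        Thread worker = new Thread(() => Console.WriteLine(Describe("worker")));
        worker.Start();
        worker.Join();
    }

    // Builds a description of the calling thread and the domain it runs in.
    public static string Describe(string label)
    {
        return string.Format("{0} thread {1} runs in domain '{2}'",
            label,
            Thread.CurrentThread.ManagedThreadId,
            Thread.GetDomain().FriendlyName);
    }
}
```

Both lines should name the same domain but different managed thread ids.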
Stack and heap
As we just said, every Win32 process hosting the CLR can have multiple threads, and each of those threads gets 1 MB of stack memory allocated by default when it is created. As the name implies, reads and writes on the stack are done in a Last-In-First-Out manner, without any additional bookkeeping overhead. Combine the nature of stack operations with the fact that the stack is “only” 1 MB in size, and it becomes clear why stack operations are very fast and efficient.
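A rough sketch of both points: every nested call pushes a frame and frames are popped in reverse order as calls return, and the stack size is a per-thread setting that can be chosen when the thread is created. StackDemo, Depth and the 256 KB figure are arbitrary choices made for this illustration.

```csharp
using System;
using System.Threading;

public static class StackDemo
{
    // Each call pushes one frame; frames are popped in last-in-first-out
    // order as the recursion unwinds.
    public static int Depth(int remaining)
    {
        return remaining == 0 ? 0 : 1 + Depth(remaining - 1);
    }

    public static void Main()
    {
        int result = 0;

        // The second constructor argument is the maximum stack size in bytes
        // (256 KB here instead of the default).
        Thread small = new Thread(() => result = Depth(1000), 256 * 1024);
        small.Start();
        small.Join();

        Console.WriteLine(result); // 1000
    }
}
```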
Even though the stack is very efficient, .NET cannot rely on stack memory alone (due to its size and sequential nature), so the CLR also manages a memory space called the heap, where types and instances are created and then accessed directly through references (pointers) held on the stack, without any constraints on the order in which reads and writes are performed, and with automatic de-allocation (garbage collection) of heap allocations that are no longer used.
The heap is created by the GC (garbage collector) using the VirtualAlloc() Win32 API function as a large, contiguous memory block in which all .NET managed memory allocations are placed one after another. That greatly improves allocation time in .NET, because there is no need to search for a free memory block of an appropriate size.
To increase the performance of heap memory operations, the CLR splits the heap into the “regular” heap and the Large Object Heap, both of which are often referred to as “GC heaps”.
The regular object heap contains the type instances that are subject to garbage collection and defragmentation (I will write a separate .NET Foundations post about the GC, so no more details here). The Large Object Heap contains instances whose size is 85,000 bytes or more (roughly 85 KB), which are treated as generation 2, are not defragmented by default, and are collected only during a full GC. All of this is a GC performance optimization for large objects.
Besides the GC heaps holding object instances, there are also loader heaps. While the GC heaps contain object instances, the loader heaps contain type information, support method JIT compilation, enable the CLR to make inheritance-based decisions, etc. There are three types of loader heap: the high-frequency loader heap, the low-frequency loader heap and the stub heap.
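One way to observe the split between the regular heap and the Large Object Heap is to ask the GC which generation a freshly allocated object belongs to. This is only a sketch; LargeObjectHeapDemo is a made-up name and the array sizes are arbitrary values on either side of the threshold.

```csharp
using System;

public static class LargeObjectHeapDemo
{
    public static void Main()
    {
        byte[] small = new byte[1000];    // well under the LOH threshold
        byte[] large = new byte[100000];  // above 85,000 bytes

        // A new small object normally starts in generation 0, while a large
        // object is reported as belonging to the oldest generation.
        Console.WriteLine("small: generation " + GC.GetGeneration(small));
        Console.WriteLine("large: generation " + GC.GetGeneration(large));
    }
}
```

With the default GC settings, the small array is typically reported as generation 0 and the large one as generation 2.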
The high-frequency loader heap contains, among other things, the method table with the MethodDesc table, which we briefly mentioned in the JIT compilation part of the .NET execution model post as the table containing the list of methods, with a column pointing either to the JIT compiler function or to the address where the already JIT-compiled native CPU instructions are located. The method table also contains the method slot table, built from the linearized list of implementation methods laid out in the following order: inherited virtuals, introduced virtuals, instance methods, and static methods. Those values are used by the CLR when deciding how a member should be executed.
In this example, the type HelloHelper is loaded in the high-frequency loader heap and contains one static field and three methods, where:
- the GetDate method has already been called once and JIT compiled, so its address points to native instructions, while the other two methods point to the address of the JIT compiler internal function
- the GetHelloText method slot is defined as an introduced virtual, which means that the method is marked with the virtual keyword, so the CLR has to check whether any derived type overrides it
- the GetDate method is marked as static, which signals to the CLR that no instance is needed to invoke it
- GetVirtualText is marked as an instance method, which signals to the CLR that the method requires an instance context for its execution.
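The source of HelloHelper is not shown in this post, so the following is only a hypothetical reconstruction of a class matching the description above. The static field name, the method bodies and the HelloHelperDemo class are assumptions; only the type name and the three method names come from the text.

```csharp
using System;

public class HelloHelper
{
    // Hypothetical static field; its real name and type are not known.
    private static string prefix = "Hello";

    // Introduced virtual: derived types may override it.
    public virtual string GetHelloText()
    {
        return prefix + ", world";
    }

    // Static method: no instance is needed to call it.
    public static DateTime GetDate()
    {
        return DateTime.Now;
    }

    // Plain (non-virtual) instance method: needs an instance to run.
    public string GetVirtualText()
    {
        return "text from " + GetType().Name;
    }
}

public static class HelloHelperDemo
{
    public static void Main()
    {
        // Only GetDate is called first, which matches the state described
        // above: it is JIT compiled while the other two are not yet.
        Console.WriteLine(HelloHelper.GetDate());

        HelloHelper helper = new HelloHelper();
        Console.WriteLine(helper.GetHelloText());
        Console.WriteLine(helper.GetVirtualText());
    }
}
```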
In my posts, I will use just the term “heap” without explicitly specifying the type of heap, and by that I will mean the GC (“small object, instance”) heap. Once again, in case you are interested in more details on heaps and the stack, check out this excellent post.
Everything in .NET is an object, but not everything behaves as an object
The System.Object type is the base type of all types in the .NET Framework class library, so in that sense it is correct to say that everything in .NET is an object. But although everything being an object provides numerous benefits, the designers of the .NET Framework were also aware that it has a performance cost: some “objects” are so simple and so frequently used that it makes no sense to treat them the same way as “real, complex objects”, which are allocated on the heap and garbage collected, with the performance hit those activities cause.
Therefore, the .NET architects decided to diverge from Java and introduced value types into the FCL: types inheriting from System.ValueType, which are allocated either on the stack (for example, local variables) or inline inside their containing object on the heap (a value type member of a class). Value types are most of the primitive types (bool, int, byte, etc.) and structures.
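The conceptual difference shows up most clearly in assignment: copying a value type copies the data itself, while copying a reference type copies only the reference. A minimal sketch, with PointValue, PointRef and CopySemanticsDemo being made-up names:

```csharp
using System;

public struct PointValue { public int X; }  // value type
public class PointRef { public int X; }     // reference type

public static class CopySemanticsDemo
{
    public static void Main()
    {
        PointValue v1 = new PointValue { X = 1 };
        PointValue v2 = v1;  // copies the whole value
        v2.X = 99;           // v1 is not affected

        PointRef r1 = new PointRef { X = 1 };
        PointRef r2 = r1;    // copies only the reference
        r2.X = 99;           // r1 sees the change, same object

        Console.WriteLine(v1.X); // 1
        Console.WriteLine(r1.X); // 99
    }
}
```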
It is very important to mention that when we use a value type by casting it to a reference type, we pay the same performance hit as if the type were a reference type, because an object wrapping the value from the stack has to be created on the heap and later garbage collected. That transition of a value from the stack to the heap is called boxing, and the transition from the heap to the stack is called unboxing.
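A short sketch of what boxing and unboxing look like in code (BoxingDemo is a made-up name). Note that the boxed object holds its own copy of the value, so later changes to the original variable do not affect it.

```csharp
using System;

public static class BoxingDemo
{
    public static void Main()
    {
        int number = 42;
        object boxed = number;     // boxing: the value is copied into a heap object
        number = 43;               // changes only the stack copy

        int unboxed = (int)boxed;  // unboxing: the value is copied back out

        Console.WriteLine(number);  // 43
        Console.WriteLine(unboxed); // 42
    }
}
```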
I forgot to mention in the previous two posts that the best book I have read so far on C#/CLR topics is CLR via C# by Jeffrey Richter, where most of the things I am presenting in my blog posts are explained in great detail. In case you haven’t read it already, do yourself a favor and buy it. That book is pure gold!
This blog post gives basic explanations of some very common .NET terms, a proper understanding of which is (IMHO) very important for every .NET developer. The next blog post will use the terms explained here and dive into the .NET memory model through a couple of examples and interesting use cases.
Quote of the day:
I no longer prepare food or drink with more than one ingredient. – Cyra McFadden