Categories
Technology Writing

Creating C# Applications: Chapter 4

One of the most innovative features of C# is the concept of managed and unmanaged code. It is this that truly differentiates this language from its more commonly used cousins, C++ and Java.

What is unmanaged code? Well, put simply, unmanaged code is any C# code that uses pointers. Conversely, managed code could be considered to be any use of C# that doesn’t include the use of pointers. However, the concept goes beyond these simple statements.

In C++ if you want to work with a block of data, such as a large text string (anything that isn’t a simple scalar value), you must allocate the resources for the string, initialize and use the resource, and then free or deallocate it manually when you’re finished. To pass a reference to the string in a function you wouldn’t need to pass the entire string, but only a pointer to the string’s location in memory, its address. Using pointers is efficient, but one of the problems with manual allocation of resources and the use of pointers is that the developer may forget to free the resource, causing a loss of the resource commonly referred to as a memory leak. After many iterations of the code, the application’s performance can degrade, and may even stop performing.

Java eliminated this problem by eliminating the need to manually allocate and deallocate resources. The Java Virtual Machine (VM) provides the resources necessary to support the application and also provides garbage collection (defined in more detail in the next section) to clean up and free the resources after the application is finished. This approach prevents problems of lost resources but also prevents the use of pointers as the support for garbage collection and pointers are mutually exclusive.

C# eliminates the problem of having to choose between the use of pointers and the associated problems of resource loss, and automatic garbage collection but without support for pointers. The language does this by providing support for both garbage collection and pointers — if you follow rules carefully defined within the language that allows both of these seemingly incompatible capabilities.

In addition to the support for managed and unmanaged code, C# also has other unique concepts available to it based on the runtime language support provided by the Common Language Runtime (CLR), such as the WeakReference wrapper class, useful with larger objects and collections.

First, though, let’s take a closer look in the next section at why the use of pointers and garbage collection are so contradictory. If you already know this, you can skip to the section titled “Managed and Unmanaged code,” following.

Garbage Collection and Pointers

As stated earlier, in a programming language that doesn’t provide automatic management of resources, you must allocate and initialize the resource and then deallocate it when finished. However, when garbage collection is supported, you don’t have to manage the resources as the garbage collection mechanism does this for you.

Within the CLR environment, storage space for a new object is pulled from a managed heap. As a new object is created, it’s allocated space within the heap and an address is assigned to the object that references the starting address of its location in the heap. The object’s given the next available space within the heap that can contain it completely.

The amount of space in the heap assigned to the object depends on the structure of the object. A pointer is used to traverse the heap until a contiguous block of space large enough to fit the object is found.

Garbage collection is used to find objects that are no longer being used and to free up the heap space the object is occupying. In the Common Language Infrastructure (CLI) architecture, the garbage collection mechanism is triggered when the pointer used to search for an available heap location travels past the boundaries of the heap.

During the garbage collection process, the collection mechanism visits each address on the heap, checks to see if the object is still being referenced (can be reached and is therefore accessible by existing process) and then freeing up the space if the object is no longer within the scope of the existing process or application scope. The mechanism also optimizes the heap by shifting the addresses (the pointers) of existing objects to provide contiguous blocks of free space for new objects as they are allocated.

This last statement actually explains why manual creation of pointers and garbage collection are not compatible within the same development environment. You can’t create an object and reference it via a pointer to an address that could then be modified by an automatic garbage collection mechanism, thereby making your reference — the pointer — meaningless.

In addition, in this process the garbage collection mechanism must know how much space an object takes up — its structure — in order to manage it efficiently. In C++, it isn’t unusual to cast a pointer from one object/structure type to another, making garbage collection virtually impossible as the mechanism wouldn’t be able to determine the exact structure of the object, and therefore how much space is needed or occupied.

The problem of using pointers and garbage collection in one language is solved through the use of managed and unmanaged code, discussed next.