Tuesday, July 29, 2008

C# - Notes - Part 1

Assemblies
The assembly is an important element of .NET programming. On the .NET platform, an assembly is a unit of reuse, versioning, security and deployment. When .NET code is compiled, it is converted into Intermediate Language (IL) code and stored in an assembly. In addition to the IL code, an assembly also contains assembly metadata (the manifest), type metadata and resources. Assemblies are therefore self-describing.
Private and shared assemblies
We can create two types of assemblies—private assemblies and shared assemblies. A private assembly is used by only one application while a shared assembly is shared amongst different applications.
By default, when a C# program is compiled, the assembly produced will be a private assembly.

Boxing And Unboxing C#

Boxing and unboxing are essential concepts in C#’s type system. They link value types and reference types by allowing any value of a value type to be converted to and from type object. Boxing and unboxing enable a unified view of the type system wherein a value of any type can ultimately be treated as an object.
Converting a value type to a reference type is called boxing; it is an implicit operation.
Converting a boxed value back to its value type is called unboxing; it is an explicit operation that requires a cast.
C# provides a “unified type system”. All types—including value types—derive from the type object. It is possible to call object methods on any value, even values of “primitive” types such as int.
int i = 1;
object o = i; // boxing
int j = (int) o; // unboxing

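Note: we can unbox only a boxed value, and only to the same value type that was boxed; a cast to any other type throws an InvalidCastException at run time.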
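A minimal sketch of the unboxing rule in the note above. The class name BoxingDemo and the helper UnboxToLongFails are hypothetical names chosen for this illustration; only the cast behaviour itself comes from the language.

```csharp
using System;

class BoxingDemo
{
    // Returns true when unboxing 'boxed' to long throws InvalidCastException.
    public static bool UnboxToLongFails(object boxed)
    {
        try
        {
            long l = (long)boxed;       // valid only if 'boxed' really holds a long
            Console.WriteLine(l);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        int i = 1;
        object o = i;                   // boxing: the int is copied into an object
        int j = (int)o;                 // unboxing to the exact type succeeds
        Console.WriteLine(j);           // 1

        Console.WriteLine(UnboxToLongFails(o));   // True: o holds an int, not a long

        long widened = (int)o;          // unbox to int first, then widen to long
        Console.WriteLine(widened);     // 1
    }
}
```

The last two lines of Main show the usual workaround: unbox to the boxed type first, then convert.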




Understanding Access Modifiers

Access modifiers decide the accessibility of your class or class member. There are five accessibility levels, listed here by their VB.NET keywords with the C# keyword in parentheses where it differs:
§ Public
§ Private
§ Protected
§ Friend (internal in C#)
§ Protected friend (protected internal in C#)

1. public: Accessible to all other classes, in any assembly. Note that public is not the default: a top-level class declared without an access modifier is internal, and a class member declared without one is private.
2. private: Accessible only by the class in which it is declared.
3. protected: Accessible only by the class in which it is declared, as well as any derived classes.
4. internal: Accessible only from within the same assembly (in C#, an assembly is a package of inter-related data that contains both code and meta data).
5. protected internal: Accessible from any code in the same assembly, and also from derived classes in other assemblies (the union of protected and internal).
The remaining three keywords are not access modifiers, but they are class and member modifiers that often appear alongside them:
6. sealed: Prevents a class from ever being derived from. If another class tries to use this class as its base class, either directly or indirectly, then the C# compiler will raise an error.
7. abstract: An abstract class can't be instantiated; it can only be used as the base class of a derived class. Its abstract members, similar to pure virtual functions in C++, contain only a signature, and a non-abstract derived class must implement them.
8. new: Used as a modifier on a member (including a nested class), new hides an inherited member of the same name by providing the compiler with a new version of that member.
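The modifiers listed above can be seen side by side in a small sketch. Account, SavingsAccount and AccessDemo are hypothetical types invented for this illustration.

```csharp
using System;

public class Account
{
    private decimal balance;                  // visible only inside Account
    protected string owner = "unknown";       // Account and its derived classes
    internal int branchCode = 7;              // any code in the same assembly
    protected internal bool flagged;          // same assembly, or derived classes elsewhere

    public decimal Balance { get { return balance; } }   // visible everywhere

    public void Deposit(decimal amount) { balance += amount; }
    public string Describe() { return "Account"; }
}

// sealed: no class may derive from SavingsAccount.
public sealed class SavingsAccount : Account
{
    public string Owner { get { return owner; } }        // protected member is reachable here
    public new string Describe() { return "Savings"; }   // 'new' hides Account.Describe
}

class AccessDemo
{
    static void Main()
    {
        SavingsAccount s = new SavingsAccount();
        s.Deposit(10m);
        Console.WriteLine(s.Balance);       // 10
        Console.WriteLine(s.branchCode);    // 7 (same assembly, so internal is visible)

        Account a = s;
        Console.WriteLine(a.Describe());    // Account: hiding is not polymorphic
        Console.WriteLine(s.Describe());    // Savings
    }
}
```

Because new hides rather than overrides, the method that runs depends on the compile-time type of the variable, not on the run-time type of the object.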

Reflection in C#
Reflection is the ability of managed code to read its own metadata in order to find assembly, module and type information at runtime. In other words, reflection provides objects that encapsulate assemblies, modules and types. A program reflects on itself by extracting metadata from its assembly and using that metadata either to inform the user or to modify its own behavior.
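As a rough illustration of the paragraph above, the sketch below reads type metadata and then invokes a method by name. Sample, ReflectionDemo and InvokeGreet are hypothetical names; Type.GetMethods, Type.GetMethod and MethodInfo.Invoke are the standard System.Reflection calls.

```csharp
using System;
using System.Reflection;

class Sample
{
    public int Number { get { return 42; } }
    public string Greet(string name) { return "Hello, " + name; }
}

class ReflectionDemo
{
    // Looks up the public method "Greet" by name at run time and calls it.
    public static object InvokeGreet(object target, string name)
    {
        MethodInfo method = target.GetType().GetMethod("Greet");
        return method.Invoke(target, new object[] { name });
    }

    static void Main()
    {
        Type t = typeof(Sample);
        Console.WriteLine(t.Assembly.FullName);   // the assembly that contains Sample

        // List the public instance methods declared directly on Sample.
        BindingFlags flags = BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly;
        foreach (MethodInfo m in t.GetMethods(flags))
        {
            Console.WriteLine(m.Name);
        }

        Console.WriteLine(InvokeGreet(new Sample(), "world"));   // Hello, world
    }
}
```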

Strings in C#
A string is basically a sequence of characters. Each character is a Unicode character in the range U+0000 to U+FFFF (strictly, a UTF-16 code unit). The string type (I'll use the C# shorthand rather than putting System.String each time) has the following characteristics:
It is a reference type
It's a common misconception that string is a value type. That's because its immutability (see next point) makes it act sort of like a value type. It actually acts like a normal reference type. See my articles on parameter passing and memory for more details of the differences between value types and reference types.
It's immutable
You can never actually change the contents of a string, at least with safe code which doesn't use reflection. Because of this, you often end up changing the value of a string variable. For instance, the code s = s.Replace ("foo", "bar"); doesn't change the contents of the string that s originally referred to - it just sets the value of s to a new string, which is a copy of the old string but with "foo" replaced by "bar".
It can contain nulls
C programmers are used to strings being sequences of characters ending in '\0', the nul or null character. (I'll use "null" because that's what the Unicode code chart calls it in the detail; don't get it confused with the null keyword in C# - char is a value type, so can't be a null reference!) In .NET, strings can contain null characters with no problems at all as far as the string methods themselves are concerned. However, other classes (for instance many of the Windows Forms ones) may well think that the string finishes at the first null character - if your string ever appears to be truncated oddly, that could be the problem.
It overloads the == operator
When the == operator is used to compare two strings, the Equals method is called, which checks for the equality of the contents of the strings rather than the references themselves. For instance, "hello".Substring(0, 4)=="hell" is true, even though the references on the two sides of the operator are different (they refer to two different string objects, which both contain the same character sequence). Note that operator overloading only works here if both sides of the operator are string expressions at compile time - operators aren't applied polymorphically. If either side of the operator is of type object as far as the compiler is concerned, the normal == operator will be applied, and simple reference equality will be tested.
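The characteristics above can be checked with a short sketch. StringDemo and its two helper methods are hypothetical names for this illustration; the second string is built from a char array so that it is guaranteed to be a separate object from the literal.

```csharp
using System;

class StringDemo
{
    // Both operands are typed as string, so the overloaded == compares contents.
    public static bool EqualAsStrings(string a, string b) { return a == b; }

    // Both operands are typed as object, so == compares references.
    public static bool EqualAsObjects(object a, object b) { return a == b; }

    static void Main()
    {
        // Immutability: Replace builds a new string and leaves the old one untouched.
        string s = "foo fighters";
        string original = s;
        s = s.Replace("foo", "bar");
        Console.WriteLine(original);     // foo fighters
        Console.WriteLine(s);            // bar fighters

        // Equality: same contents, different objects.
        string built = new string(new char[] { 'h', 'e', 'l', 'l' });
        Console.WriteLine(EqualAsStrings(built, "hell"));   // True
        Console.WriteLine(EqualAsObjects(built, "hell"));   // False

        // Embedded null characters count as ordinary characters.
        Console.WriteLine("a\0b".Length);                   // 3
    }
}
```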
Literals
Literals are how you hard-code strings into C# programs. There are two types of string literals in C# - regular string literals and verbatim string literals. Regular string literals are similar to those in many other languages such as Java and C - they start and end with ", and various characters (in particular, " itself, \, and carriage return (CR) and line feed (LF)) need to be "escaped" to be represented in the string. Verbatim string literals allow pretty much anything within them, and end at the first " which isn't doubled. Even carriage returns and line feeds can appear in the literal! To obtain a " within the string itself, you need to write "". Verbatim string literals are distinguished by having an @ before the opening quote. Here are some examples of the two types of literal, and what they amount to:
Regular literal            Verbatim literal           Resulting string
"Hello"                    @"Hello"                   Hello
"Backslash: \\"            @"Backslash: \"            Backslash: \
"Quote: \""                @"Quote: """               Quote: "
"CRLF:\r\nPost CRLF"       @"CRLF:                    CRLF:
                           Post CRLF"                 Post CRLF


Understanding Properties in C#
In C#, properties are a natural extension of data fields. They are often known as 'smart fields' in the C# community. Data encapsulation and hiding are two fundamental characteristics of any object-oriented programming language. In C#, data encapsulation is possible through either classes or structures. By using access modifiers such as private, public, protected and internal, it is possible to control the accessibility of the class members.
Usually inside a class, we declare a data field as private and provide public set and get methods to access it. This is good programming practice, since the data field is not directly accessible outside the class; callers must go through the set/get methods. A property packages such a pair of accessors behind field-like syntax:
using System;
class MyClass
{
    private int x;
    public int X
    {
        get
        {
            return x;
        }
        set
        {
            x = value;
        }
    }
}
class MyClient
{
    public static void Main()
    {
        MyClass mc = new MyClass();
        mc.X = 10;
        int xVal = mc.X;
        Console.WriteLine(xVal); // Displays 10
    }
}

Remember that a property should have at least one accessor, either set or get. The set accessor has a free variable available in it called value, which gets created automatically by the compiler. We can't declare any variable with the name value inside the set accessor.
Static Properties
C# also supports static properties, which belong to the class rather than to the objects of the class.
The properties of a base class are inherited by a derived class.
Abstract Properties
A property inside a class can be declared as abstract by using the keyword abstract.
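A brief sketch of a static property and an abstract property working together. Shape, Square and PropertyDemo are hypothetical types made up for this example.

```csharp
using System;

abstract class Shape
{
    private static int count;

    // Static property: belongs to the class, not to any one Shape.
    public static int Count { get { return count; } }

    protected Shape() { count++; }

    // Abstract property: only the signature; derived classes supply the accessor.
    public abstract double Area { get; }
}

class Square : Shape
{
    private double side;

    public Square(double side) { this.side = side; }

    public override double Area { get { return side * side; } }
}

class PropertyDemo
{
    static void Main()
    {
        Shape s = new Square(3);
        Console.WriteLine(s.Area);        // 9
        Console.WriteLine(Shape.Count);   // 1
    }
}
```

Note that Count has only a get accessor, which makes it read-only from outside the class.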


const vs. readonly

const and readonly perform a similar function on data members, but they have a few important differences.
const
A constant member is defined at compile time and cannot be changed at runtime. Constants are declared as fields, using the const keyword, and must be initialized as they are declared. For example:
public class MyClass
{
    public const double PI = 3.14159;
}
PI cannot be changed in the application anywhere else in the code as this will cause a compiler error.
Constants must be of a built-in type (sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, bool, or string), an enumeration, or a reference to null.
Since classes or structures are initialized at run time with the new keyword, and not at compile time, you can't set a constant to a class or structure.
Constants can be marked as public, private, protected, internal, or protected internal.
Constants are accessed as if they were static fields, although they cannot use the static keyword.
To use a constant outside of the class that it is declared in, you must fully qualify it using the class name.

readonly
A read-only member is like a constant in that it represents an unchanging value. The difference is that a readonly member can be initialized at runtime, in a constructor, as well as at its declaration. For example:
public class MyClass
{
    public readonly double PI = 3.14159;
}
or
public class MyClass
{
    public readonly double PI;

    public MyClass()
    {
        PI = 3.14159;
    }
}
Because a readonly field can be initialized either at the declaration or in a constructor, readonly fields can have different values depending on the constructor used. A readonly field can also be used for runtime constants as in the following example:
public static readonly uint l1 = (uint)DateTime.Now.Ticks;
Notes
readonly members are not implicitly static, and therefore the static keyword can be applied to a readonly field explicitly if required.
A readonly member can hold a complex object by using the new keyword at initialization.
readonly members can also hold enumerations; unlike const, readonly places no restriction on the type of the field.
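To contrast the two keywords, here is a small sketch. The Circle class, its members and ConstReadonlyDemo are hypothetical and exist only for this illustration.

```csharp
using System;

class Circle
{
    public const double PI = 3.14159;                         // compile-time constant, accessed like a static
    public static readonly DateTime Created = DateTime.Now;   // run-time constant, set once per program run
    public readonly double Radius;                            // set once per instance

    public Circle() { Radius = 1; }                           // each constructor may choose a different value
    public Circle(double radius) { Radius = radius; }

    public double Area() { return PI * Radius * Radius; }
}

class ConstReadonlyDemo
{
    static void Main()
    {
        Console.WriteLine(Circle.PI);            // qualified by the class name, no instance needed
        Console.WriteLine(new Circle().Radius);  // 1
        Console.WriteLine(new Circle(2).Radius); // 2

        // Neither of the following lines would compile:
        // Circle.PI = 3;             // a const cannot be assigned
        // new Circle().Radius = 5;   // a readonly field can be assigned only in a constructor or initializer
    }
}
```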

static
Using the static modifier to declare a member means that the member is no longer tied to a specific object, so it can be accessed without creating an instance of the class. Only one copy of static fields and events exists, and static methods and properties can directly access only static fields and static events. For example:
public class Car
{
    public static int NumberOfWheels = 4;
}
The static modifier can be used with classes, fields, methods, properties, operators, events and constructors, but cannot be used with indexers, destructors, or types other than classes.
Static fields are initialized before the static member is accessed for the first time, and before the static constructor, if any, is called. To access a static class member, use the name of the class instead of a variable name to specify the location of the member. For example:
int i = Car.NumberOfWheels;
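The initialization order described above can be observed with a slightly extended Car. The members Count and WheelsSeenByStaticCtor, and the class StaticDemo, are hypothetical additions made for this sketch.

```csharp
using System;

class Car
{
    public static int NumberOfWheels = 4;       // field initializer runs before the static constructor
    public static int Count;                    // one copy, shared by every Car
    public static int WheelsSeenByStaticCtor;

    static Car()
    {
        // By the time this runs, NumberOfWheels has already been set to 4.
        WheelsSeenByStaticCtor = NumberOfWheels;
    }

    public Car() { Count++; }
}

class StaticDemo
{
    static void Main()
    {
        Console.WriteLine(Car.WheelsSeenByStaticCtor);   // 4
        new Car();
        new Car();
        Console.WriteLine(Car.Count);                    // 2
    }
}
```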


Types in C#

Types are primarily divided into value types and reference types; pointer types form a third category, available only in unsafe code.



C# Data Types
    Value types
        Predefined types: integers, real numbers, booleans, characters
        User-defined types: enumerations, structures
    Pointers
    Reference types
        Predefined types: objects, strings
        User-defined types: classes, arrays, delegates, interfaces



The table below lists the predefined types, and shows how to write literal values for each of them.
Type     Description                                                 Example
object   The ultimate base type of all other types                   object o = null;
string   String type; a string is a sequence of Unicode code units   string s = "hello";
sbyte    8-bit signed integral type                                  sbyte val = 12;
short    16-bit signed integral type                                 short val = 12;
int      32-bit signed integral type                                 int val = 12;
long     64-bit signed integral type                                 long val1 = 12; long val2 = 34L;
byte     8-bit unsigned integral type                                byte val1 = 12;
ushort   16-bit unsigned integral type                               ushort val1 = 12;
uint     32-bit unsigned integral type                               uint val1 = 12; uint val2 = 34U;
ulong    64-bit unsigned integral type                               ulong val1 = 12; ulong val2 = 34U; ulong val3 = 56L; ulong val4 = 78UL;
float    Single-precision floating point type                        float val = 1.23F;
double   Double-precision floating point type                        double val1 = 1.23; double val2 = 4.56D;
bool     Boolean type; a bool value is either true or false          bool val1 = true; bool val2 = false;
char     Character type; a char value is a Unicode code unit         char val = 'h';
decimal  Precise decimal type with at least 28 significant digits    decimal val = 1.23M;


Pointers
Pointer Notation
A pointer is a variable that holds the memory address of a variable of another type. In C#, pointers can only be declared to hold the memory addresses of value types (except in the case of arrays).
Pointers are declared explicitly, using the 'dereferencer' symbol *, as in the following example:
int* p;
Ex:
int age = 32;
// Declare a pointer to an int
int* age_ptr;
age_ptr = &age; // age_ptr now holds the memory address of age
// A pointer can also point to a structure (Coords is assumed to be a struct with a public field z)
Coords x = new Coords();
Coords* y = &x;
One can then use the declared pointer y to access a public field of x (say z). This would be done using either the expression
(*y).z
or the equivalent expression, which uses the -> operator:
y -> z
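A pointer can be declared in relation to an array, but the array must first be pinned with a fixed statement so that the garbage collector cannot move it:
int[] a = {4, 5};
fixed (int* b = a) { /* b points to a[0] inside this block */ }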


Unsafe Code
A major problem with using pointers in C# is that C# operates a background garbage collection process. In freeing up memory, this garbage collection is liable to change the memory location of a current object without warning. So any pointer which previously pointed to that object will no longer do so. Such a scenario leads to two potential problems. Firstly, it could compromise the running of the C# program itself. Secondly, it could affect the integrity of other programs.
Because of these problems, the use of pointers is restricted to code which is explicitly marked by the programmer as 'unsafe'. Because of the potential for malicious use of unsafe code, programs which contain unsafe code will only run if they have been given full trust.


Memory Management
Memory  Contents                    Item Order          Item Lifetime    Item Removal        Timing
Stack   value types, stack frames   sequential (LIFO)   scope            pop                 deterministic
Heap    objects                     random              while reachable  Garbage Collection  nondeterministic


Garbage Collection

1) The garbage collector first assumes that everything on the managed heap is garbage until it is shown to be in use.
2) When the garbage collector performs a collection, it checks for objects in the managed heap that are no longer being used by the application and performs the necessary operations to reclaim their memory:
· It searches for managed objects that are referenced in managed code.
· It then attempts to finalize those objects that are not referenced in the code.
· Lastly, it frees the unreferenced objects and reclaims the memory occupied by them.

Garbage Collection Algorithm

Building Application Roots

Roots identify storage locations that either refer to objects on the managed heap or are set to null.
Roots consist of:
· Global/static pointers. One way to make sure our objects are not garbage collected is to keep a reference to them in a static variable.
· Pointers on the stack. We don't want to throw away what our application's threads still need in order to execute.
· CPU register pointers. Anything in the managed heap that is pointed to by a memory address held in a CPU register should be preserved (don't throw it out).

Phase I: Mark
1. The GC identifies live object references or application roots.
2. It starts walking the roots and building a graph of all objects reachable from the roots.
3. If the GC attempts to add an object already present in the graph, then it stops walking down that path.

Phase II: Compact
Move all the live objects to the bottom of the heap, leaving free space at the top.
Weak References
Weak references are of two types:
Short weak reference: does not track resurrection. Its target becomes null as soon as the garbage collector finds the object unreachable, without waiting for the object's finalization method to run.
Long weak reference: tracks resurrection. Its target is retained even after the object's Finalize method has been called, so it can still reach an object that was resurrected.
The garbage collector clears a long weak reference only after determining that the object's storage is reclaimable: if the object has a Finalize method, that method has been called and the object was not resurrected.
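A minimal sketch of creating both kinds of weak reference with System.WeakReference. WeakDemo and AliveWhileReferenced are hypothetical names. The example only demonstrates the case that is deterministic: while a strong reference is still held, neither weak reference is cleared. What happens after the strong reference is dropped depends on when the collector runs, so it is not asserted here.

```csharp
using System;

class WeakDemo
{
    // Returns true if both weak references still see the object
    // after a forced collection, while a strong reference is held.
    public static bool AliveWhileReferenced()
    {
        object strong = new object();

        WeakReference shortWeak = new WeakReference(strong);         // short: trackResurrection = false
        WeakReference longWeak = new WeakReference(strong, true);    // long: trackResurrection = true

        GC.Collect();

        bool alive = shortWeak.IsAlive && longWeak.IsAlive && shortWeak.Target != null;

        GC.KeepAlive(strong);   // guarantees the strong reference is live up to this point
        return alive;
    }

    static void Main()
    {
        Console.WriteLine(AliveWhileReferenced());   // True
    }
}
```

Code that uses a weak reference should copy Target into a local variable and test that local for null, since the object may be collected at any moment once no strong reference remains.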
Generations
One feature of the garbage collector that exists purely to improve performance is called generations. A generational garbage collector takes into account two facts that have been empirically observed in most programs in a variety of languages:
1. Newly created objects tend to have short lives.
2. The older an object is, the longer it will survive.
Generational collectors group objects by age and collect younger objects more often than older objects.
When initialized, the managed heap contains no objects. All new objects added to the heap are in generation 0 until the heap fills up, which invokes garbage collection.
As most objects are short-lived, only a small percentage of young objects are likely to survive their first collection.
Once an object survives its first garbage collection, it gets promoted to generation 1.
Objects created after that collection are again in generation 0.
The garbage collector is next invoked only when the sub-heap of generation 0 fills up again.
All objects in generation 1 that survive get compacted and promoted to generation 2.
All survivors in generation 0 also get compacted and promoted to generation 1.
Generation 0 then contains no objects, and all objects created after the collection go into generation 0.
Thus, as objects "mature" (survive multiple garbage collections) in their current generation, they are moved to the next older generation.
Generation 2 is the maximum generation supported by the runtime's garbage collector. When future collections occur, any surviving objects currently in generation 2 simply stay in generation 2.
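The promotion described above can be watched with GC.GetGeneration and GC.MaxGeneration. This is only a sketch: GenerationDemo and GenerationAfter are hypothetical names, and the exact generation reported after each forced collection depends on the runtime and its GC settings, so no particular output is claimed.

```csharp
using System;

class GenerationDemo
{
    // Reports the generation of a freshly created object after the
    // given number of forced collections.
    public static int GenerationAfter(int collections)
    {
        object survivor = new object();

        for (int i = 0; i < collections; i++)
        {
            GC.Collect();
        }

        int generation = GC.GetGeneration(survivor);
        GC.KeepAlive(survivor);   // keep the object reachable until it has been measured
        return generation;
    }

    static void Main()
    {
        Console.WriteLine("Max generation: " + GC.MaxGeneration);

        for (int n = 0; n <= 3; n++)
        {
            Console.WriteLine(n + " collection(s): generation " + GenerationAfter(n));
        }
    }
}
```

One would expect the reported generation to rise with each collection the object survives and then stay at GC.MaxGeneration.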

Note :
Finalization
The .NET Framework's garbage collection implicitly keeps track of the lifetime of the objects that an application creates, but it knows nothing about the unmanaged resources (e.g. a file, a window or a network connection) that those objects encapsulate.
The unmanaged resources must be explicitly released once the application has finished using them. The .NET Framework provides the Object.Finalize method: a method that the garbage collector runs on the object to clean up its unmanaged resources, prior to reclaiming the memory used by the object. Since the Finalize method does nothing by default, it must be overridden if explicit cleanup is required.
The garbage collector compacts the reclaimable memory, and a special runtime thread empties the freachable queue, executing each object's Finalize method.
Destructor
The destructor should only release unmanaged resources that your object holds on to, and it should not reference other objects. If you have only managed references you do not need to (and should not) implement a destructor. You never call an object’s destructor directly. The garbage collector will call it for you.
How destructors work
The garbage collector maintains a list of objects that have a destructor. This list is updated every time such an object is created or destroyed.
When an object on this list is first collected, it is placed on a queue with other objects waiting to be destroyed. After the destructor executes, the garbage collector then collects the object and updates the queue, as well as its list of destructible objects.

You declare a C# destructor with a tilde as follows:

~MyClass(){}
In C#, however, this syntax is simply a shortcut for declaring a Finalize() method that chains up to its base class. Thus, when you write:

~MyClass()
{
    // do work here
}
the C# compiler translates it to:

protected override void Finalize()
{
    try
    {
        // do work here
    }
    finally
    {
        base.Finalize();
    }
}



Note :
It is not legal to call a destructor explicitly. Your destructor will be called by the garbage collector. If you do handle precious unmanaged resources (such as file handles) that you want to close and dispose of as quickly as possible, you ought to implement the IDisposable interface.
The IDisposable interface requires its implementers to define one method, named Dispose(), to perform whatever cleanup you consider to be crucial. The availability of Dispose() is a way for your clients to say, "Don’t wait for the destructor to be called; do it right now."
If you provide a Dispose() method, you should stop the garbage collector from calling your object’s destructor once Dispose() has run. To do so, call the static method GC.SuppressFinalize(), passing in the this reference for your object. Your destructor can then share the same cleanup code as Dispose(), so that the cleanup happens exactly once whichever path runs first.
Ex:
using System;
class Testing : IDisposable
{
    bool is_disposed = false;
    protected virtual void Dispose(bool disposing)
    {
        if (!is_disposed) // only dispose once!
        {
            if (disposing)
            {
                Console.WriteLine("Not in destructor, OK to reference other objects");
            }
            // perform cleanup for this object
            Console.WriteLine("Disposing...");
        }
        this.is_disposed = true;
    }
    public void Dispose()
    {
        Dispose(true);
        // tell the GC not to finalize
        GC.SuppressFinalize(this);
    }
    ~Testing()
    {
        Dispose(false);
        Console.WriteLine("In destructor.");
    }
}
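Client code usually calls Dispose() through the using statement, which guarantees the call even if an exception is thrown inside the block. The sketch below uses a hypothetical Resource class with a public Disposed flag so that the effect is visible; it stands in for any IDisposable type such as the Testing class above.

```csharp
using System;

class Resource : IDisposable
{
    public bool Disposed;

    public void Dispose()
    {
        Disposed = true;   // a real class would release its unmanaged resources here
    }
}

class UsingDemo
{
    // Returns the Disposed flag after the using block has ended.
    public static bool DisposedAfterUsing()
    {
        Resource r = new Resource();
        using (r)
        {
            Console.WriteLine(r.Disposed);   // False: still usable inside the block
        }
        return r.Disposed;                   // Dispose() ran when the block was left
    }

    static void Main()
    {
        Console.WriteLine(DisposedAfterUsing());   // True
    }
}
```

The using statement is shorthand for a try/finally block whose finally clause calls Dispose().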

Difference between Destructor, Finalize, Dispose and Collect
Destructors are used to de-allocate resources, i.e. to clean up after an object is no longer in use.
Dispose() is called by user code, that is, the code that is using your class.
Finalize/Destructor cannot be called by user code; it is called by the garbage collector.

System.GC.Collect() forces the garbage collector to run. This is not recommended, but it can be used if the situation calls for it.
