Designing a small API: Bit manipulation in C#

This article originally appeared on earthli News and has been cross-posted here.


A usable API doesn't usually spring forth in its entirety on the first try. A good, usable API generally arises iteratively, improving over time. Naturally, when using words like good and usable, I'm obliged to define what exactly I mean by that. Here are the guidelines I use when designing an API, in decreasing order of importance:

Static typing & Compile-time Errors

Wherever possible, make the compiler stop the user from doing something incorrectly instead of letting the runtime handle it.

Integrates into standard practices

That is, do not invent whole new ways of doing things; instead, reuse or build on the paradigms already present in the language.

Elegance

Ideally, using the API should be intuitive, read like natural language and not involve a bunch of syntactic tricks or hard-to-remember formulations or parameter lists.

Clean Implementation

The internals should be as generalized and understandable as possible and involve as little repetition as possible.

CLS-Compliance

Cross-language compliance is also interesting and easily achieved for all but the most low-level of APIs.

Using those guidelines, I designed an API to manage bits and sets of bits in C#. Having spent a lot of time using Delphi Pascal, I'd become accustomed to set and bit operations with static typing. In C#, the .NET framework provides the generic HashSet<T> type, but that seems like overkill when the whole idea behind using bits is to use less space. That means using enumerated types and the FlagsAttribute; however, there are some drawbacks to using the native bit-operations directly in code:

  1. Bit-manipulation is more low-level than most of the rest of the coding a C#-developer typically does. That, combined with doing it only rarely, makes direct testing of bits error-prone.
  2. The syntax for testing, setting and removing bits is heavy with special symbols and duplicated identifiers.

To demonstrate, here is a sample:

[Flags]
enum TestValues
{
  None = 0,
  One = 1,
  Two = 2,
  Three = 4,
  Four = 8,
  All = 15,
}

// Set bits one and two:
var bitsOneAndTwo = TestValues.One | TestValues.Two;

// Remove bit two:
var bitOneOnly = bitsOneAndTwo & ~TestValues.Two;

// Testing for bit two:
if ((bitsOneAndTwo & TestValues.Two) == TestValues.Two)
{
  ...
}

As you can see in the example above, setting a bit is reasonably intuitive (though it's understandable to get confused about using | instead of & to combine bits). Removing a bit is more esoteric, as the combination of & with the ~ (inverse) operator is easily forgotten if not often used. Testing for a bit is quite verbose and extending to testing for one of several flags even more so.
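For example, a test for one of several flags ends up looking something like this:

// Testing for bit one or bit three:
if ((bitsOneAndTwo & (TestValues.One | TestValues.Three)) != TestValues.None)
{
  ...
}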

Version One

Therefore, to make things easier, I decided to make some extension methods for these various functions and ended up with something like the following:

public static void Include<T>(this T flags, T value) { ... }
public static void Exclude<T>(this T flags, T value) { ... }
public static bool In<T>(this T flags, T value) { ... }
public static void ForEachFlag<T>(this T flags, Action<T> action) { ... }

These definitions compiled and worked as expected, but had the following major drawbacks:

  • At the time, we were only using them with enum values, but code completion was offering the methods for all objects because there was no generic constraint on T.
  • Not only that, but much of the bit-manipulation code needed to know the base type of the arguments in order to be able to cast it to and from the correct types. There were a lot of checks, but it all happened at runtime.
  • The ForEachFlag() function took an Action<T> callback when the operation is clearly an iteration. Using a callback instead of a true iterator makes it impossible to use break or continue with this method.

This version, although it worked, broke several of the rules outlined above: while it did offer compile-time checking, the implementation had a lot of repetition in it and the iteration did not make use of the common library enumeration support (IEnumerable and foreach). That the operations were available for all objects and polluted code-completion only added insult to injury.
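To illustrate just how much happened at runtime, here is a hypothetical sketch of Include(), reconstructed from the description above (note that such a method has to return the combined value, since value types are copied when passed to extension methods):

public static T Include<T>(this T flags, T value)
{
  // None of this can be verified by the compiler; the type-check and the
  // conversions all happen at runtime (ulong-based enums are ignored here).
  if (!typeof(T).IsEnum)
  {
    throw new ArgumentException("T must be an enumerated type.");
  }

  var result = Convert.ToInt64(flags) | Convert.ToInt64(value);

  return (T)Enum.ToObject(typeof(T), result);
}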

Version Two

A natural solution to the namespace-pollution problem is to add a generic constraint to the methods, restricting the operations to objects of type Enum, as follows:

public static void Include<T>(this T flags, T value)
  where T : Enum
{ ... }

public static void Exclude<T>(this T flags, T value)
  where T : Enum
{ ... }

public static bool In<T>(this T flags, T value)
  where T : Enum
{ ... }

public static void ForEachFlag<T>(this T flags, Action<T> action)
  where T : Enum
{ ... }

.NET enum-declarations, however, are defined in terms of an underlying integral type -- Int32, by default, but also a handful of other base types (e.g. byte, Int16) -- so that enum-values can be freely converted to and from these base values. Whatever the relationship between an enum type and the Enum class at runtime, the C# 3.5 compiler explicitly disallows Enum in a generic constraint. So, that's a dead-end.

The other, more obvious way of restricting the target type of an extension method is to change the type of the first parameter from T to something else. But to what? Well, it turns out that Enum is a strange type, indeed. It can't be used in a generic constraint but, when used as the target of an extension method, it magically applies to all enumerated types!

I took advantage of this loophole to build the next version of the API, as follows:

public static void Include<T>(this Enum flags, T value) { ... }
public static void Exclude<T>(this Enum flags, T value) { ... }
public static bool In<T>(this Enum flags, T value) { ... }
public static void ForEachFlag<T>(this Enum flags, Action<T> action) { ... }

This version had two advantages over the first version:

  1. The methods are only available for enumerated types instead of for all types, which cleans up the code-completion pollution.
  2. The implementation could take advantage of the Enum.GetTypeCode() method instead of the is and as-operators to figure out the type and cast the input accordingly, as sketched below.
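A hypothetical sketch of Include() in this version, again reconstructed from the description (returning the new value as before and showing only two of the base types):

public static T Include<T>(this Enum flags, T value)
{
  switch (flags.GetTypeCode())
  {
    case TypeCode.Int32:
      return (T)(object)((int)(object)flags | (int)(object)value);
    case TypeCode.Int64:
      return (T)(object)((long)(object)flags | (long)(object)value);
    default:
      throw new NotSupportedException("Unsupported base type.");
  }
}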

Version Three

After using this version for a little while, it became obvious that there were still problems with the implementation:

  1. Though using Enum as the target type of the extension method was a clever solution, it turns out to be a huge violation of the first design-principle outlined above: The type T for the other parameters is not guaranteed to conform to Enum. That is, the compiler cannot statically verify that the bit being checked (value) is of the same type as the bit-set (flags).
  2. The solution only works with Enum objects, whereas it would also be appropriate for Int32, Int64 and the other integral types.
  3. The ForEach method still has the same problems it had in the first version; namely, that it doesn't allow the use of break and continue and therefore violates the second design-principle above.

A little more investigation showed that the Enum.GetTypeCode() method is not unique to Enum but implements a method initially defined in the IConvertible interface. And, as luck would have it, this interface is implemented not only by the Enum class, but also by Int32, Int64 and all of the other types to which we would like to apply bit- and set-operations.

Knowing that, we can hope that the third time's a charm and redesign the API once again, as follows:

public static void Include<T>(this T flags, T value)
  where T : IConvertible
{ ... }

public static void Exclude<T>(this T flags, T value)
  where T : IConvertible
{ ... }

public static bool In<T>(this T flags, T value)
  where T : IConvertible
{ ... }

public static void ForEachFlag<T>(this T flags, Action<T> action)
  where T : IConvertible
{ ... }

Now we have methods that apply only to those types that support set- and bit-operations (more or less1). Not only that, but the value and action arguments are once again guaranteed to be statically compliant with the flags arguments.

That one change eliminated two of the drawbacks; to address the third, we converted the ForEachFlag method to return an IEnumerable<T> instead, as follows:

public static IEnumerable<T> GetEnabledFlags<T>(this T flags)
  where T : IConvertible
{ ... }

The result of this method can now be used with foreach and works with break and continue, as expected. Since the method also now applies to non-enumerated types, we had to re-implement it to return the set of possible bits for the type instead of simply iterating the possible enumerated values returned by Enum.GetValues().2
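A hypothetical sketch of the reimplementation, which walks the individual bits of the value instead of asking Enum.GetValues() (ulong-based values are again ignored for brevity):

public static IEnumerable<T> GetEnabledFlags<T>(this T flags)
  where T : IConvertible
{
  var bits = flags.ToInt64(null);

  // Visit each bit position; yield the positions that are set, converted
  // back to the original type.
  for (var bit = 1L; bit != 0; bit <<= 1)
  {
    if ((bits & bit) != 0)
    {
      yield return (T)Convert.ChangeType(bit, flags.GetTypeCode());
    }
  }
}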

This version satisfies the first three design principles (statically-typed, standard practice, elegant) relatively well, but is still forced to make concessions in implementation and CLS-compliance. It turns out that the IConvertible interface is somehow not CLS-compliant, so I was forced to mark the whole class as non-compliant. On the implementation side, I was avoiding the rather clumsy is-operator by using the IConvertible.GetTypeCode() method, but still had a lot of repeated code, as shown below in a sample from the implementation of Is:

switch (flags.GetTypeCode())
{
  case TypeCode.Byte:
    return (byte)(object)flags == (byte)(object)value;
  case TypeCode.Int32:
    return (int)(object)flags == (int)(object)value;
  ...
}

Unfortunately, bit-testing is so low-level that there is no (obvious) way to refine this implementation further. In order to compare the two convertible values, the compiler must be told the exact base type to use, which requires an explicit cast for each supported type, as shown above. Luckily, this limitation is in the implementation, which affects the maintainer and not the user of the API.

Since implementing the third version of these "BitTools", I've added support for Is (shown partially above), Has and HasOneOf, and it looks like the third time might indeed be the charm, as the saying goes.
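With these methods, the examples from the introduction read much more naturally (a sketch; it assumes, as above, that the modifying methods return the new value):

var bits = TestValues.One | TestValues.Two;

var hasTwo = bits.Has(TestValues.Two);       // instead of (bits & Two) == Two
var hasAny = bits.HasOneOf(TestValues.All);  // instead of (bits & All) != None
var more = bits.Include(TestValues.Three);   // instead of bits | Three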



  1. The IConvertible interface is actually implemented by other types, to which our bit-operations don't apply at all, like double, bool and so on. The .NET library doesn't provide a more specific interface -- like "INumeric" or "IIntegralType" -- so we're stuck constraining to IConvertible instead.

  2. Which, coincidentally, fixed a bug in the first and second versions, which had returned all detected enumerated values -- including combinations -- instead of individual bits. For example, given the following type, we only ever expect the values One and Two, and never None, OneOrTwo or All:

[Flags]
enum TestValues
{
  None = 0,
  One = 1,
  Two = 2,
  OneOrTwo = 3,
  All = 3,
}

That is, foreach (var flag in TestValues.Two.GetEnabledFlags()) { ... } should yield only Two and foreach (var flag in TestValues.All.GetEnabledFlags()) { ... } should yield One and Two.

Waiting for C# 4.0: A casting problem in C# 3.5

This article originally appeared on earthli News and has been cross-posted here.


C# 3.5 has a limitation where generic classes don't necessarily conform to each other in the way that one would expect. This problem manifests itself classically in the following way:

class A { }
class B : A { }
class C : A { }

class Program
{
  static void ProcessListOfA(IList<A> list) { }
  static void ProcessListOfB(IList<B> list) { }
  static void ProcessSequenceOfA(IEnumerable<A> sequence) { }
  static void ProcessSequenceOfB(IEnumerable<B> sequence) { }

  static void Main()
  {
    var bList = new List<B>();
    var aList = new List<A>();

    ProcessListOfA(aList); // OK
    ProcessListOfB(aList); // Compiler error, as expected
    ProcessSequenceOfA(aList); // OK
    ProcessSequenceOfB(aList); // Compiler error, as expected

    ProcessListOfA(bList); // Compiler error, unexpected!
    ProcessListOfB(bList); // OK
    ProcessSequenceOfA(bList); // Compiler error, unexpected!
    ProcessSequenceOfB(bList); // OK
  }
}

Why are those two compiler errors unexpected? Why shouldn't a program be able to provide an IList<B> where an IList<A> is expected? Well, that's where things get a little bit complicated. Whereas at first it seems that there's no downside to allowing the assignment -- B can do everything expected of A, after all -- further investigation reveals a potential source of runtime errors.

Expanding on the example above, suppose ProcessListOfA() were to have the following implementation:

void ProcessListOfA(IList<A> list)
{
  if (SomeCondition(list))
  {
    list.Add(new C());
  }
}

With such an implementation, the call to ProcessListOfA(bList), which passes an IList<B> would cause a runtime error if SomeCondition() were to return true. So, the dilemma is that allowing co- and contravariance may result in runtime errors.

A language design includes a balance of features that permit good expressiveness while restricting bad expressiveness. C# has implicit conversions, but requires potentially dangerous conversions to be made explicit with casts. Similarly, the obvious type-compatibility outlined in the first example is forbidden and requires a call to the System.Linq.Enumerable.Cast<T>(this IEnumerable) method instead. Other languages -- most notably Eiffel -- have always allowed the logical conformance between generic types, at the risk of runtime errors.1

Some of these limitations will be addressed in C# 4.0 with the introduction of covariance. See Covariance and Contravariance (C# and Visual Basic) and LINQ Farm: Covariance and Contravariance in C# 4.0 for more information.
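To give a flavor of what changes (a sketch based on the linked documentation): in C# 4.0, IEnumerable<T> is declared with a covariant (out) type parameter, whereas IList<T> remains invariant because T appears in both input and output positions.

// C# 4.0: IEnumerable<out T> is covariant, so this now compiles:
ProcessSequenceOfA(bList); // OK in C# 4.0

// IList<T> remains invariant, so this is still a compiler error:
ProcessListOfA(bList);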

A (Partial) Solution for C# 3.5

Until then, there's the aforementioned System.Linq.Enumerable.Cast<T>(this IEnumerable) method in the system library. However, that method, while very convenient, makes no effort to statically verify that the input and output types are compatible with one another. That is, a call such as the following is perfectly legal:

var numbers = new [] { 1, 2, 3, 4, 5 };
var objects = numbers.Cast<object>(); // OK
var strings = numbers.Cast<string>(); // runtime error (when enumerated)!

Instead of an unchecked cast, a method with a generic constraint on the input and output types would be much more appropriate in those situations where the program is simply avoiding the generic-typing limitation described in detail in the first section. The method below does the trick:

public static IEnumerable<TOutput> Convert<TInput, TOutput>(this IEnumerable<TInput> input)
  where TInput : TOutput
{
  if (input == null) { throw new ArgumentNullException("input"); }

  if (input is IList<TOutput>) { return (IList<TOutput>)input; }

  return input.Select(obj => (TOutput)(object)obj);
}

While it's entirely possible that the Cast() function from the Linq library is more highly optimized, it's not as safe as the method above. A check with Redgate's Reflector would probably reveal just how that method actually works. Correctness comes before performance, but YMMV.2

The initial examples can now be rewritten to compile without casting:

ProcessListOfA(bList.Convert<B, A>()); // OK
ProcessListOfB(bList); // OK
ProcessSequenceOfA(bList.Convert<B, A>()); // OK
ProcessSequenceOfB(bList); // OK

One More Little Snag

Unlike the Enumerable.Cast<TOutput>() method, which has no restrictions and can be used on any IEnumerable, there will be places where the compiler will not allow an application to use Convert<TInput, TOutput>(). This is because the conformance of TInput to TOutput required by the generic constraint is, in some cases, not statically provable (i.e. at compile-time). A concrete example is shown below:

abstract class A
{
  public abstract IEnumerable<TResult> GetObject<TResult>();
}

class B<T> : A
{
  public override IEnumerable<TResult> GetObject<TResult>()
  {
    return _objects.Convert<T, TResult>(); // Compile error!
  }

  private IList<T> _objects;
}

The example above does not compile because T does not provably conform to TResult. The needed generic constraint cannot be applied to TResult because it would have to be applied to the original, abstract function, which knows nothing of T. In these cases, the application is forced to use System.Linq.Enumerable.Cast<T>(this IEnumerable) instead.



  1. I've addressed this issue before in Static-typing for languages with covariant parameters, which reviewed the paper, Type-safe covariance: Competent compilers can catch all catcalls, a proposal for statically identifying potential runtime errors and requiring them to be addressed with a recast definition. Similarly, another runtime plague -- null-references -- is also addressed in Eiffel, a feature extensively documented in the paper, Attached types and their application to three open problems of object-oriented programming.

  2. YMMV = "Your Mileage May Vary", but remember, Donald Knuth famously said: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

Creating fluent interfaces with inheritance in C#

This article originally appeared on earthli News and has been cross-posted here.


Fluent interfaces -- or "method chaining" as it's also called -- provide an elegant API for configuring objects. For example, the Quino query API provides methods to restrict (Where or WhereEquals), order (OrderBy), join (Join) and project (Select) data. The first version of this API was very traditional and applications typically contained code like the following:

var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller");
query.WhereEquals(Person.Fields.FirstName, "Hans");
query.OrderBy(Person.Fields.LastName, SortDirection.Ascending);
query.OrderBy(Person.Fields.FirstName, SortDirection.Ascending);
var contactsTable = query.Join(Person.Relations.ContactInfo);
contactsTable.Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse");

(This example gets all people named "Hans Müller" that live on a street with a name that ends in "Strasse" (case-insensitive) sorted by last name, then first name. Fields and Relations refer to constants generated from the Quino metadata model.)

Fluent Examples

The syntax above is very declarative and relatively easy-to-follow, but is a bit wordy. It would be nice to be able to chain together all of these calls and remove the repeated references to query. The local variable contactsTable also seems kind of superfluous here (it is only used once).

A fluent version of the query definition looks like this:

var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller")
  .WhereEquals(Person.Fields.FirstName, "Hans")
  .OrderBy(Person.Fields.LastName, SortDirection.Ascending)
  .OrderBy(Person.Fields.FirstName, SortDirection.Ascending)
  .Join(Person.Relations.ContactInfo)
    .Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse");

The example uses indenting to indicate that restriction after the join on the "ContactInfo" table applies to the "ContactInfo" table instead of to the "Person" table. The call to Join logically returns a reference to the joined table instead of the query itself. However, each such table also has a Query property that refers to the original query. Applications can use this to "jump" back up and apply more joins, as shown in the example below where the query only returns a person if he or she also works in the London office:

var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller")
  .WhereEquals(Person.Fields.FirstName, "Hans")
  .OrderBy(Person.Fields.LastName, SortDirection.Ascending)
  .OrderBy(Person.Fields.FirstName, SortDirection.Ascending)
  .Join(Person.Relations.ContactInfo)
    .Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse").Query
  .Join(Person.Relations.Office)
    .WhereEquals(Office.Fields.Name, "London");

A final example shows how even complex queries over multiple table levels can be chained together into one single call. The following example joins on the "ContactInfo" table to dig even deeper into the data by restricting to people whose web sites are owned by people with at least 10 years of experience:

var query = new Query(Person.Metadata);
query.WhereEquals(Person.Fields.Name, "Müller")
  .WhereEquals(Person.Fields.FirstName, "Hans")
  .OrderBy(Person.Fields.LastName, SortDirection.Ascending)
  .OrderBy(Person.Fields.FirstName, SortDirection.Ascending)
  .Join(Person.Relations.ContactInfo)
    .Where(ContactInfo.Fields.Street, ExpressionOperator.EndsWithCI, "Strasse")
    .Join(ContactInfo.Relations.WebSite)
      .Join(WebSite.Relations.Owner)
        .Where(Owner.Fields.YearsExperience, ExpressionOperator.GreaterThan, 10).Query
  .Join(Person.Relations.Office)
    .WhereEquals(Office.Fields.Name, "London");

This API might still be a bit too wordy for some (.NET 3.5 Linq would be less wordy), but it's refactoring-friendly and it's crystal-clear what's going on.

Implementation

When there's only one class involved, it's not that hard to conceive of how this API is implemented: each method just returns a reference to this when it has finished modifying the query. For example, the WhereEquals method would look like this:

IQuery WhereEquals(IMetaProperty prop, object value)
{
  Where(CreateExpression(prop, value));

  return this;
}

This isn't rocket science and the job is quickly done.

However, what if things in the inheritance hierarchy aren't that simple? What if, for reasons known to the Quino framework architects, IQuery actually inherits from IQueryCondition, which defines all of the restriction and ordering operations. The IQuery provides projection and joining operations, which can easily just return this, but what type should the operations in IQueryCondition return?

The problem area is indicated with question marks in the example below:

public interface IQueryCondition
{
  ??? WhereEquals(IMetaProperty prop, object value);
}

public interface IQueryTable : IQueryCondition
{
  IQueryTable Join(IMetaRelation relation);
}

public interface IQuery : IQueryTable
{
  IQueryTable SelectDefaultForAllTables();
}

The IQueryCondition can't simply return IQueryTable because it might be used elsewhere1, but it also can't return IQueryCondition: then the table couldn't perform a join after a restriction, because applying the restriction would have narrowed the fluent interface to an IQueryCondition instead of an IQueryTable.

The solution is to make IQueryCondition generic and pass it the type that it should return instead of hard-coding it.

public interface IQueryCondition<TSelf>
{
  TSelf WhereEquals(IMetaProperty prop, object value);
}

public interface IQueryTable : IQueryCondition<IQueryTable>
{
  IQueryTable Join(IMetaRelation relation);
}

public interface IQuery : IQueryTable
{
  IQueryTable SelectDefaultForAllTables();
}

That takes care of the interfaces, on to the implementation. The standard implementation runs into a small problem when returning the generic type:

public class QueryCondition<TSelf> : IQueryCondition<TSelf>
{
  TSelf WhereEquals(IMetaProperty prop, object value)
  {
    // Apply restriction

    return (TSelf)this; // causes a compile error
  }
}

public class QueryTable : QueryCondition<IQueryTable>, IQueryTable
{
  IQueryTable Join(IMetaRelation relation) 
  {
    // Perform the join

    return result;
  }
}

public class Query : IQuery
{
  IQueryTable SelectDefaultForAllTables()
  {
    // Perform the select

    return this;
  }
}

One simple solution to the problem is to cast down to object and back up to TSelf, but this is pretty bad practice as it short-circuits the static checker in the compiler and defers the problem to a potential runtime one.

public class QueryCondition<TSelf> : IQueryCondition<TSelf>
{
  TSelf WhereEquals(IMetaProperty prop, object value)
  {
    // Apply restriction

    return (TSelf)(object)this;
  }
}

In this case, it's guaranteed by the implementation that this is compliant with TSelf, but it would be even better to solve the problem without resorting to the double-cast above. As it turns out, there is a simple and quite elegant solution, using an abstract method called ThisAsTSelf, as illustrated below:

public abstract class QueryCondition<TSelf> : IQueryCondition<TSelf>
{
  TSelf WhereEquals(IMetaProperty prop, object value)
  {
    // Apply restriction

    return ThisAsTSelf();
  }

  protected abstract TSelf ThisAsTSelf();
}

public class QueryTable : QueryCondition<IQueryTable>, IQueryTable
{
  protected override IQueryTable ThisAsTSelf()
  {
    return this;
  }
}

The compiler is now happy without a single cast at all because QueryTable returns this, which the compiler knows conforms to IQueryTable, the TSelf for that class. The power of a fluent API is now at your disposal without restricting inheritance hierarchies or making end-runs around the compiler. Naturally, the concept extends to multiple levels of inheritance (e.g. if all calls had to return IQuery instead of IQueryTable), but it gets much uglier, as it requires nested generic types in the return types, which makes it much more difficult to understand. With a single level, as in the example above, the complexity is still relatively low and the resulting API is very powerful.



  1. And, in Quino, it is used elsewhere, for the IQueryJoinCondition.

Pre-generating Entity Framework (EF) Views

These instructions apply to the 1.x release of EF and its designer integration into Visual Studio 2008.

Overview

The Entity Framework requires what it calls "views" in order to access a database. EF generates these views automatically if they are not available. In order to avoid generating these views at application startup, they can be pre-generated and stored as C# code.

A post-build step in the compilation process would be the ideal place for this, but there's a snag: the view generator needs the entity model data stored as files in the output directory whereas the deployment would rather have them stored as resources. An EF model has an option that indicates whether it is stored as files or as a resource; knowing that, we could set up the build like this:

  1. Toggle all model files to generate model data as files
  2. Build
  3. Generate the views
  4. Toggle all model files to generate model data as resources
  5. Build

Generating the Views

However, if the model has not changed (as is usually the case when a project is mature), there is no need to waste time regenerating the views and building twice. Therefore, we've settled on the following manual method for updating the views:

  1. Open each model (.edmx) file and change the "Metadata Artifact Processing" property to "Copy to Output Directory".
  2. Build the application to force the model files to be generated.
  3. Run a batch file to generate the views (shown below).
  4. Change the "Metadata Artifact Processing" property back to "Embed in Output Assembly".
  5. Build again to embed the model as a resource and bind the newly generated views.

The Batch File

Here is a sample command you can execute in order to generate your views:

"%windir%\Microsoft.NET\Framework\v3.5\EdmGen.exe" ^
  /mode:ViewGeneration ^
  /language:CSharp ^
  /nologo ^
  "/inssdl:D:\Encodo\projects\customer\project\bin\Debug\Models\EntityModel.ssdl" ^
  "/incsdl:D:\Encodo\projects\customer\project\bin\Debug\Models\EntityModel.csdl" ^
  "/inmsl:D:\Encodo\projects\customer\project\bin\Debug\Models\EntityModel.msl" ^
  "/outviews:D:\Encodo\projects\customer\project\Models\EntityModel.Views.cs"
  • Change "EntityModel" to the name of your own model file.
  • Make sure to include the "EntityModel.Views.cs" in your project to actually use the generated views.
Microsoft Code Contracts: Not with a Ten-foot Pole

This article originally appeared on earthli News and has been cross-posted here. In the meantime, a lot has changed and the major complaint -- a lack of explicit contracts in C# -- will finally be addressed in the next version of C#, 4.0.


After what seems like an eternity, a mainstream programming language will finally dip its toe in the Design-by-contract (DBC) pool. DBC is a domain amply covered in one less well-known language called Eiffel (see ISE Eiffel Goes Open-Source for a good overview), where preconditions, postconditions and invariants of various stripes have been available for over twenty years.

Why Contracts?

Object-oriented languages already include contracts; "classic" signature-checking involves verification of parameter counts and type-conformance. DBC generally means extending this mechanism to include assertions on a higher semantic level. A method's signature describes the obligations calling code must fulfill in order to execute the method. The degree of enforcement varies from language to language. Statically-typed languages verify types according to conformance at compile-time, whereas dynamically-typed languages do so at run-time. Even the level of conformance-checking differs from language to language, with statically-typed languages requiring hierarchical conformance via ancestors and dynamically-typed languages verifying signatures via duck-typing.

And that's only for individual methods; methods are typically collected into classes that also have a semantic meaning. DBC is about being able to specify the semantics of a class (e.g. can property A ever be false when property B is true?) as well as those of method parameters (can parameter a ever be null?) using the same programming language.

Poor-man's DBC

DBC is relatively tedious to employ without framework or language support. Generally, this takes the form of using Debug.Assert1 at the start of a method call to verify arguments, throwing ArgumentExceptions when the caller did not satisfy the contract. Post-conditions can also be added in a similar fashion, at the end of the function. Naturally, without library support, post-conditions must be added before any return-statements or enclosed in an artificial finally-clause around the rest of the method body. Class invariants are even more tedious, as they must be checked both at the beginning and end of every single "entering" method call, where the "entering" method call is the first on the given object. A proper implementation must not check the invariant for method calls that an object calls on itself because it's perfectly all right for an object to be in an invalid state until the "entering" method returns.
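A sketch of what this looks like in practice (hypothetical types; the names anticipate the transmitter example later in this article):

public void Send(Data data)
{
  // Poor-man's preconditions:
  if (data == null) { throw new ArgumentNullException("data"); }
  Debug.Assert(Server.IsReachable, "server_reachable");

  try
  {
    Server.Send(data);
  }
  finally
  {
    // The "artificial finally-clause" holding the postcondition:
    Debug.Assert(data.State == DataState.Sent, "data_sent");
  }
}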

One assertion that arises quite often is that of requiring that a parameter be non-null in a precondition. An analysis of most code bases that used poor-man's DBC will probably reveal that the majority of its assertions are of this form. Therefore, it would be nice to handle this class of assertion separately using a language feature that indicates that a particular type can statically never be null. Eiffel has added this support with a separate notation for denoting "attached" types (types that are guaranteed to be attached to a non-null reference). Inclusion of such a feature not only improves the so-called "provability" of programs written in that language, it also transforms null-checking contracts to another notation (e.g. in Eiffel, objects are no longer nullable by default and the ?-operator is used to denote nullability) and removes much of the clutter from the precondition block.

Without explicit language support, a DBC solution couched in terms of assertions and/or exceptions quickly leads to clutter that obscures the actual program logic. Contracts should be easily recognizable as such by both tools and humans. Ideally, the contract can be extracted and included in documentation and code completion tooltips. Eiffel provides such support with separate areas for pre- and post-conditions as well as class invariants. All assertions can be labeled to give them a human-readable name, like "param1_not_null" or "list_contains_at_most_one_element". The Eiffel tools provide various views on the source code, including what they call the "short" view, showing method signatures and contracts without implementation, as well as the "short flat" view, which is the "short" view, but includes all inherited methods to present the full interface of a type.

Looking at "Code Contracts"

Other than Eiffel, no close-to-mainstream programming language2 has attempted to make the implicit semantics of a class explicit with DBC. Until now. Code Contracts will be included in C# 4.0, which will be released with Visual Studio 2010. It is available today as a separate assembly and compatible with C# 3.5 and Visual Studio 2008, so no upgrade is required to start using it. Given the lack of an upgrade requirement, we can draw the conclusion that this contracting solution is library-only without any special language support.

That does not bode well; as mentioned above, such implementations will be limited in their support of proper DBC. The user documentation provides an extensive overview of the design and proper use of Code Contracts.

There are, as expected, no new keywords or language support for contracts in C# 4.0. That means that tools and programmers will have to rely on convention in order to extract semantic meaning from the contracts. Pre- and postconditions are mixed together at the top of the method body. Post-conditions have support for accessing the method result and original values of arguments. Contracts can refer to fields not visible to other classes and there is an attribute-based hack to make these fields visible via a proxy property.
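For example (a minimal sketch using the documented Contract API; the Balance property is hypothetical):

public int Withdraw(int amount)
{
  // Pre- and postconditions are declared together at the top:
  Contract.Requires(amount > 0);
  Contract.Requires(amount <= Balance);
  Contract.Ensures(Balance == Contract.OldValue(Balance) - amount);
  Contract.Ensures(Contract.Result<int>() == Balance);

  Balance -= amount;

  return Balance;
}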

Abstract classes and Interfaces

Contracts for abstract classes and interfaces are, simply put, a catastrophe. Since these constructs don't have method implementations, they can't contain contracts. Therefore, in order to attach contracts to these constructs -- and, to be clear, the mechanism would be no improvement over the current poor-man's DBC if there was no way to do this -- there is a ContractClass attribute. Attaching contracts to an interface involves making a fake implementation of that interface, adding contracts there, hacking expected results so that it compiles, presumably adding a private constructor so it can't be instantiated by accident, then referencing it from the interface via the attribute mentioned above. It works, but it's far from pretty and it moves the contracts far from the place where it would be intuitive to look for them.
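A sketch of the mechanism with a hypothetical interface (the attributes are the documented ones):

[ContractClass(typeof(TransmitterContract))]
public interface ITransmitter
{
  Data Receive();
}

[ContractClassFor(typeof(ITransmitter))]
internal abstract class TransmitterContract : ITransmitter
{
  public Data Receive()
  {
    Contract.Ensures(Contract.Result<Data>() != null);

    return default(Data); // the "hacked" result; never actually executed
  }
}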

No Support for Precondition Weakening

Just as the specification side is not so pretty, the execution side also suffers. Contracts are, at least, inherited, but preconditions cannot be weakened. That is, a sub-type -- and implementations of interfaces with contracts are sub-types -- cannot add preconditions; end of story. As soon as a type contains at least one contract on one method, all methods in that type without contracts are interpreted as specifying the "empty" contract.

Instead of simply acknowledging that precondition weakening could be a useful feature, the authors state:

While we could allow a weaker precondition, we have found that the complications of doing so outweigh the benefits. We just haven't seen any compelling examples where weakening the precondition is useful.

Let's have an example, where we want to extend an existing class with support for a fallback mechanism. In the following case we have a transmitter class that sends data over a server; the contracts require that the server be reachable before sending data. The descendant adds support for a second server over which to send, should the first be unreachable. All examples below have trimmed initialization code that guarantees non-null properties for clarity's sake. All contracts are included.

class Transmitter
{
  public Server Server { get; }

  public virtual void SendData(Data data)
  {
    Contract.Requires(data != null);
    Contract.Requires(Server.IsReachable);
    Contract.Ensures(data.State == DataState.Sent);

    Server.Send(data);
  }

  [ContractInvariantMethod]
  protected void ObjectInvariant()
  {
    Contract.Invariant(Server != null);
  }
}

class TransmitterWithFallback : Transmitter
{
  public Server FallbackServer { get; }

  public override void SendData(Data data)
  {
    // contract violation!

    // If "Server" is not reachable, we will never be given
    // the opportunity to send using the fallback server
  }

  [ContractInvariantMethod]
  protected void ObjectInvariant()
  {
    Contract.Invariant(FallbackServer != null);
  }
}

We can't actually implement the fallback without adjusting the original contracts. With access to the code for the base class, we could address this shortcoming by moving the check for server availability to a separate method, as follows:

class Transmitter
{
  public Server Server { get; }

  [Pure]
  public virtual bool ServerIsReachable
  {
    get { return Server.IsReachable; }
  }

  public virtual void SendData(Data data)
  {
    Contract.Requires(data != null);
    Contract.Requires(ServerIsReachable);
    Contract.Ensures(data.State == DataState.Sent);

    Server.Send(data);
  }

  [ContractInvariantMethod]
  protected void ObjectInvariant()
  {
    Contract.Invariant(Server != null);
  }
}

class TransmitterWithFallback : Transmitter
{
  public Server FallbackServer { get; }

  [Pure]
  public override bool ServerIsReachable
  {
    get { return Server.IsReachable || FallbackServer.IsReachable; }
  }

  public override void SendData(Data data)
  {
    if (Server.IsReachable)
    {
      base.SendData(data);
    }
    else
    {
      FallbackServer.Send(data);
    }
  }

  [ContractInvariantMethod]
  protected void ObjectInvariant()
  {
    Contract.Invariant(FallbackServer != null);
  }
}

With careful planning in the class that introduces the first contract -- where precondition contracts are required to go -- we can get around the lack of extensibility of preconditions. Let's take a look at how Eiffel would address this. In Eiffel, the example above would look something like the following3:

class TRANSMITTER
  feature
    server: SERVER

    send_data (data: DATA) is
      require
        server.reachable
      do
        server.send (data)
      ensure
        data.state = DATA_STATE.sent
      end
end

class TRANSMITTER_WITH_FALLBACK
  inherit
    TRANSMITTER
      redefine
        send_data
      end

  feature
    fallback_server: SERVER

    send_data (data: DATA) is
      require else
        fallback_server.reachable
      do
        if server.reachable then
          Precursor (data)
        else
          fallback_server.send (data)
        end
      end
end

The Eiffel version has clearly separated boundaries between contract code and implementation code. It also did not require a change to the base implementation in order to implement a useful feature. The author of the library has that luxury, whereas users of the library would not and would be forced to use less elegant solutions.

To sum up, it seems that, once again, the feature designers have taken the way out that makes it easier on the compiler, framework and library authors rather than providing a full-featured design-by-contract implementation. It was the same with the initial generics implementation in C#, without co- or contra-variance. The justification at the time was also that "no one really needed it". C# 4.0 will finally include this essential functionality, belying the original assertion.

Thumbs Up or Thumbs Down?

The implementation is so easy-to-use that even the documentation leads off by warning that:

a word of caution: Static code checking or verification is a difficult endeavor. It requires a relatively large effort in terms of writing contracts, determining why a particular property cannot be proven, and finding a way to help the checker see the light. [...] If you are still determined to go ahead with contracts [...] To not get lost in a sea of warnings [...] (emphasis added)

Not only is that not ringing, that's not even an endorsement.

Other notes on implementation include:

  • Testing frameworks require scaffolding to redirect contract exceptions to the framework instead of an assertion dialog.
  • There is no support for edit-and-continue in contracted assemblies. Period. Contracting injects code into assemblies during the compile process, which makes them unusable for the edit-and-continue debugger.4
  • Because of this instrumentation, expect medium to massive slowdowns during compilation; the authors recommend enabling contracts in a special build instead of in all DEBUG builds. This is a ridiculous restriction as null-checks and other preconditions are useful throughout the development process, not just for pre-release testing. Poor-man's DBC is currently enabled in all builds; a move to MS Contracts with the recommended separate build would remove this support, weakening the development process.
  • Some generated code (e.g. Windows Forms code) currently causes spurious errors that must be suppressed by manually editing that generated code. Such changes will be wiped out as soon as a change is made in the Winforms designer.

Because the feature is not a proper language extension, the implementation is forced within the bounds of the existing language features. A more promising implementation was Spec# -- which extended the C# compiler itself -- but there hasn't been any activity on that project from Microsoft Research in quite some time. There are, however, a lot of interesting papers available there which offer a more developer-friendly insight into the world of design-by-contract than the highly compiler-oriented point-of-view espoused by the Contracts team.

This author will be taking a pass on the initial version of DBC as embodied by Microsoft Contracts.



  1. With which this author is acquainted.

  2. Examples use C# 3.5 unless otherwise noted.

  3. Please excuse any and all compile errors, as I haven't got access to a current Eiffel installation and am piecing this example together from documentation and what I remember about writing Eiffel code.

  4. This admission goes a long way toward explaining why code with generics and lambdas cannot be changed in an edit-and-continue debugging session. These language features presumably also rely on rewriting, instrumentation and code-injection.

An analysis of C# language design

This article originally appeared on earthli News in 2004 and has been cross-posted here. In the meantime, a lot has changed and the major complaint -- a lack of explicit contracts in C# -- will finally be addressed in the next version of C#, 4.0.


A Conversation with Anders Hejlsberg (Part I: The C# Design Process --- the process used by the team that designed C#, and the relative merits of usability studies and good taste in language design.) is a four-part series on the ideas that drove the design of C#. (The link is to the first page of the first section; browse to Artima.com Interviews to see a list of all the sections.)

Virtual vs. Static

I found some points of interest in Part IV, Versioning, Virtual, and Override (Part IV: Versioning, Virtual, and Override --- why C# instance methods are non-virtual by default and why programmers must explicitly indicate an override.), in which Anders Hejlsberg (designer of both Delphi Pascal and C#) chats about the reasoning behind making methods non-virtual by default.

One answer is performance; he cites method usage in Java: "We can observe that as people write code in Java, they forget to mark their methods final. ... Because they're virtual, they don't perform as well." Another cited reason is 'versioning', which seems to be another term for formal contracts. Lack of versioning accounts for API instability in most software systems and C#'s approach, or lack thereof, is discussed in more detail later. First, let's examine the arguments supporting performance as a reason to make methods static by default.

In Java's case, methods are virtual by necessity; since classes can always be loaded into the namespace and their bytecode interpreted, methods must be virtual in case a descendant is loaded that overrides the method. In C#'s case, assemblies are built with a known 'universe' of classes (to borrow a term from the Eiffel world) -- there is no need to leave methods virtual in case other classes are loaded.

Leaving methods as statically linked by default puts the burden on the developer. That is, the developer must explicitly decide whether a method should be virtual or not. This prevents you from designing, then optimizing; you are immediately faced with the question: can a descendant legitimately redefine this method?

Private data

There are those who claim one can always answer this question. They are the same ones who squirrel variables away in 'private' areas, right when you would need them most in a descendant. Private features (data or methods visible only to the current class) limit the number of uses to which a class can be put: if a class has the correct interface, but an unacceptable implementation, a programmer is forced to define an entirely new, non-conforming class or, at the very least, to duplicate code in order to get the desired effect. Inheritance provides 'is a' semantics; if a class is another class, why is it valid that it can't see parts of itself?

Marking methods as 'final' (Java) or leaving them non-virtual (C#) and using private fields is akin to saying "I have created an infallible design and the private implementation is beyond reproach".

This is an especially dangerous attitude to take in library code. Library code is incorporated into other products; clients of the library will often define classes based on library classes. What if some part of a class doesn't function correctly, or works differently than expected, or desired? Can a client class alter the behavior of the library class enough? Or does the client need to alter library source code, or, worse yet, do without functionality, because the library class doesn't allow that kind of modification? Is a client forced to simply rewrite the entire class in a non-conforming class to get functionality that the library almost provides?

Implicit contracts

To this you may say "there are certain things you should not be able to do with a class". Fair enough, a good design imbues every class with a purpose and provides it with an API that fulfills it. However, what does it mean to say "you should not be able to do" something with a class? Does your class explicitly define an intent? The intent of a class is, at best, stored explicitly in external documentation. At worst, it is defined implicitly in the API; the secret of a class's purpose is stored in the visibility of features (private/protected/public) and in the redefinability of methods. Even if the documentation is defined in the class itself, it is stored in comments that are beyond the reach of the compiler. The purpose of a class can't be verified or enforced at compile-time or run-time.

We come once again to the notion of contracts. Contracts to help the compiler, to help the developer, and to help the end-user or client. Contracts to make documentation simpler and clearer and explicit rather than implicit. All software enforces contracts; almost no programming language provides mechanisms for making these contracts explicit -- C# is no exception.

Easy Way Out

Language designers today have no imagination, creating clone after clone after clone. There's a reason C# looks so much like Java: given the choice between making writing software in the language easy and writing a compiler for the language easy, they go for an easy compiler every time. Neither of these languages lets a programmer express a design without immediately worrying about implementation. Anders Hejlsberg explains why C# took the easy way out:

The academic school of thought says, "Everything should be virtual, because I might want to override it someday." The pragmatic school of thought, which comes from building real applications that run in the real world, says, "We've got to be real careful about what we make virtual."

Now it's clear: whiners who are sick of working for their compilers are "academic, ... [not] pragmatic" and don't know about the "real world". That's a pretty specious argument, since he hasn't backed up his assertion with any evidence (other than the performance argument, which, while perfectly valid, is still addressable on the compiler side, as explained below).

Consider the question of whether to make methods static or dynamic by default. A compiler-friendly language makes everything static, forcing the programmer to explicitly mark redefinable methods with a keyword. A nicer language would make all methods dynamic. If a descendant redefines a method, it is compiled as dynamic. All methods in the program (the universe of classes available at compile time) that are not redefined can safely be statically compiled.

Helpful features

A corollary to this is how a language handles function inlining. Inlining replaces a function call with the body of the function itself to increase performance for smaller functions. C and C++ still have an explicit 'inline' keyword. C# thankfully does not and has rightly chosen to put the burden of choosing which functions to inline on a compiler that examines the heuristics of the entire program. Since it's a newer compiler, there are still a few kinks to work out (Poor inline optimization of C#/JIT), but C# is headed in the right direction.

Another issue affecting a language's usability is its redefinition policy. When is a method considered a redefinition of another method? C++ has the absolute worst policy in this respect, assuming a redefinition as soon as a method with the same signature in an ancestor is marked as 'virtual'. If the signature of the 'virtual' method or the redefinition changes, it is simply assumed to no longer be a redefinition. What fun!

C# has thankfully adopted the policy of explicit redefinition, forcing a method with the same signature to be marked as an 'override'. The method being redefined must, of course, be marked as 'virtual' when defined (as explained above).
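A minimal illustration:

class Base
{
  public virtual void Process() { }
}

class Derived : Base
{
  // Omitting "override" here produces a warning that Process() hides the
  // inherited member; C++ would silently assume a redefinition.
  public override void Process() { }
}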

These are the language features that lighten the load for a programmer. Garbage collection is another such feature that C# got right. Given garbage collection, a developer can freely design structures without immediately considering which object is responsible for what. The accompanying complications of 'has' and 'uses' falls away in almost all cases and a design can be much more easily mapped to the language without accommodating memory management so early in the process. It is still possible to have memory problems with garbage collection (a dangling pointer no longer causes a crash, but instead causes inconsistent logic and excessive memory usage). Nevertheless, languages that provide garbage collection allow elegant designs that require a lot of scaffolding code in non-memory-managed languages.

Back to versioning

Anders goes on at length about the problem of 'versioning':

When we make something virtual ... we're making an awful lot of promises about how it evolves in the future. ... When we publish a virtual method in an API, we not only promise that when you call this method, x and y will happen. We also promise that when you override this method, we will call it in this particular sequence with regard to these other ones and the state will be in this and that invariant.

What promises? C# has no contracting mechanism, so discussion of promises is limited to non-functional documentation and perhaps the method name, which implies what it does. Though he mentions an "invariant", which is presumably the class invariant, there is no mechanism for specifying one: how can you prove that code broke an implicit contract?

He continues talking about contracts, noting that "[v]irtual has two sides to it: the incoming and the outgoing". He talks all around the notion of contracts and documentation and the pitfalls associated with trusting developers to write documentation that shows "what you're supposed to do when you override a virtual method". Documentation should include information about "[w]hat are the invariants before you're called? What should be true after?". At this point, you're screaming with frustration that a man so seemingly knowledgeable of Design-by-Contract decided to leave everything implicit in his language. He acknowledges the problem of contracting, then, in the same breath, leaves the entirety of the solution up to the developer. Not only does C# have no way of specifying these obviously important and troublesome contracts, its designer has invented whole new terms (ingoing/outgoing instead of precondition/postcondition) in a seemingly willful ignorance of existing Design-by-Contract theory.

As justification for this somewhat fuzzy 'versioning' concept he's espousing, he mentions that "[w]henever [Java or C++ languages] introduce a new method in a base class, if someone in a derived class had a method of that same name, that method is now an override". Honestly, that has nothing to do with contracts or making sure redefinitions enforce the same contracts; that's simply about explicit redefinition rather than implicit signature-matching. It's a trivial language feature that C# got exactly right, but, lacking contracts of any sort, how is C# any better at managing valid redefinitions than Java or C++? If a method is marked as 'virtual' in C#, a redefinition can do whatever it likes, including nothing.

The 'versioning' problem is not solved; it is simply no longer applicable to all methods because many more methods are static. That's not an advancement; that's removing functionality in order to prevent programmers from breaking things. Putting the burden on the developer limits the expressiveness of the language and constrains the power of solutions you can build in that language. Just because you might break a window with a hammer doesn't mean it's better to build a house without one.

Given a proper contracting mechanism in the language, "ingoing and outgoing" contracts could be explicitly specified in the language. A redefinition of such a method would inherit the ancestor method's contracts and be forced to support them. A method redefinition is free to weaken the precondition, but must support or strengthen the inherited postcondition. In addition, contracts at the class scope should be defined in a class invariant, which is checked before and after each method call, to ensure that the class is in a stable state before executing code on it.

There is a Design by Contract Framework for C# available, but it's only a library extension, and like all non-language implementations of Design-by-Contract, is only a pale imitation of the power afforded by a language-level solution. It's a real shame to see a language designer who knows so much about the pitfalls of programming and does so little to help the users of his language avoid them.

It's not the first time this has happened and it won't be the last. So many programmers are sticking with C++ because it has at least some form of generics (C++ templates are not truly generic, but are nonetheless extremely useful). Java, a language whose programs are littered with typecasts because of a lack of generics, plans to finally introduce generics after ten years. C# also skipped generics in the first version, and introduces them in the next revision, 2.0. One can only wonder when and if either will ever support contracting or whether we have to sit back and wait another ten years for a new language.

If you can't wait that long and want a language that has real generics, allows no private data, compiles non-redefined methods as static, has automatic inlining, explicit redefinition, garbage collection and incorporates a rich contract mechanism, try Eiffel.

Remote Debugging with [ASP].NET

When a .NET application exhibits behavior on a remote server that cannot be reproduced locally, you'll need to debug the application directly on the server. The following article includes specific instructions for debugging ASP.NET applications, but applies just as well to standalone executables.

Prerequisites

There are several prerequisites for remote debugging; don't even bother trying until you have all of the items on the following list squared away or the Remote Debugger will just chortle at your naiveté.

  • The SERVER must have the Visual Studio Remote Debugging Monitor installed.
  • The firewall must be opened for Visual Studio on the client (which means that ReSharper sees other instances); remote debugging involves two-way communication.
  • A local user, BOB, with administrator rights on the client machine.
  • A server user, BOB, with administrator rights on the SERVER machine.
  • The names must match.
  • The monitor must be started on the SERVER using BOB (using "Run as...")
  • If you're not debugging in the same domain, then you have to change the server name in the monitor's options to "BOB@SERVER".

Before you think you can get all fancy and simply debug remotely without authentication, know this: unauthenticated, native debugging does not support breakpoints, so forget it. You'll technically be able to connect to a running application but, without breakpoints, you'll only be able to watch any pre-existing debug output appear on the console, if that.

Firewall ports

The following ports must be open in order for Remote Debugging to function correctly in all situations:

**Protocol**    **Port**    **Service Name**
TCP             139         File and Printer Sharing
TCP             445         File and Printer Sharing
UDP             137         File and Printer Sharing
UDP             138         File and Printer Sharing
UDP             4500        IPsec (IKE NAT-T)
UDP             500         IPsec (IKE)
TCP             135         RPC Endpoint Mapper and DCOM infrastructure

Additionally, the application "Microsoft Visual Studio 2008" must be in the exceptions list on the client and "Visual Studio Remote Debugging Monitor" must be in the exceptions list on the server.

Recommendations

Once you've satisfied the requirements above, you should probably also heed the following tips; it's best to read about them now rather than learn them the hard way later:

  • Make sure to turn off recycling and auto-shutdown for the AppPool while debugging, so you don't run the risk of your PID suddenly being gone.
  • Make sure that you're using debug versions of all assemblies where you want to debug or you'll be staring at IL assembly code more often than you'd like.
  • Make sure your local source code is in-sync with the source code on the SERVER or you'll be debugging on the wrong lines at best or be staring at IL assembly at worst.
  • It's best if the path to your local symbols is also valid and writable on the server so that symbols cached during remote debugging can be stored on the server. Check the "Options..Debugging..Symbols" to change that path if you need to. (there's more about this below if this doesn't make sense)

Test Run

Here are steps you can follow to debug an application remotely. These steps worked for me, but the remote debugging situation seems to be extremely hit-or-miss, so your mileage may vary.

  1. Open your web project in Visual Studio and compile it in debug mode.
  2. Outside of Visual Studio, build a deployment version of the web site and copy it to the SERVER.
  3. If this is the first time setting it up, move the application to its own ApplicationPool, so you can detect it more easily later.
  4. If you haven't already, install the Visual Studio Remote Debugging Monitor to the SERVER.
  5. Make sure you have a local user on that machine with your own user name, USER.
  6. Start the Visual Studio Remote Debugging Monitor using "Run as..." and entering USER on that server.
  7. When it has started, select "File..Options" from the menu and change the server name to USER@SERVERNAME.
  8. From within Visual Studio, select "Debug..Attach to Process" from the menu.
  9. In the dialog, specify the USER@SERVERNAME you used in the debugging monitor above and hit Refresh.
  10. Scroll down until you see the "w3wp.exe" processes.

You've set up the server and connected to it so far. If anything has gone wrong, check the troubleshooting section below to see if your problem is addressed there. The next steps are optional if you think you can identify your process without knowing its PID (Process ID). This is generally the case only when yours is the only .NET application deployed to that server; in that case, your process is the "w3wp.exe" process which includes "managed code". If you don't know your PID, follow the optional instructions below to figure out which one is yours.

  1. From the client machine, download the attached script "ASP.NET PID Detector" and open it in a text editor.
  2. Change the machineName, appPoolName and url to match the settings for your application on the SERVER. (this is the reason we put our application into its own application pool at the beginning.)
  3. Save the file as a different name (probably with the machine name and server in the title).
  4. Execute the file and follow instructions; it will probably launch your web site in IE. It will probably also claim to have failed. Run it again and it will give you the PID of your application on the server.

If that didn't work, then you probably aren't configured to query WMI remotely; your only options are to try to run it remotely using the instructions and tips below or to run it from the server.

  • If you have remote desktop access to the server, then copy the script to the server and configure the batch file to query the local script and server (recommended).
  • Turn off the Windows Firewall on the server completely (not recommended if the server is open to the internet).
  • Follow instructions at Enable WMI (Windows Management Instrumentation) to enable remote administration through the firewall. Not only must you execute a special command to configure the Firewall (only available from the command line) but your user must also have the correct permissions (also not recommended, as enabling WMI can open up the server in unexpected ways if you don't know what you're doing). I did not attempt this, as I could simply run the PID-detector from the server.
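The attached script itself isn't reproduced here, but the following is a rough C# sketch of the kind of WMI query such a detector performs. The server name, pool name and the reliance on the -ap command-line switch are assumptions for illustration, and a reference to System.Management.dll is required:

using System;
using System.Management;

class AspNetPidDetector
{
  static void Main()
  {
    // Query the server's WMI for all IIS worker processes; each w3wp.exe
    // names its application pool on the command line via the -ap switch.
    ManagementScope scope = new ManagementScope(@"\\SERVER\root\cimv2");
    scope.Connect();

    ObjectQuery query = new ObjectQuery(
      "SELECT ProcessId, CommandLine FROM Win32_Process WHERE Name = 'w3wp.exe'");

    using (ManagementObjectSearcher searcher = new ManagementObjectSearcher(scope, query))
    {
      foreach (ManagementObject process in searcher.Get())
      {
        string commandLine = (string)process["CommandLine"];

        if (commandLine != null && commandLine.Contains("-ap \"MyAppPool\""))
        {
          Console.WriteLine("PID: {0}", process["ProcessId"]);
        }
      }
    }
  }
}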

Once you have the PID in hand, continue:

  1. Select the "w3wp.exe" process with your PID and double-click it to attach to that process.
  2. It will ask whether remote symbols can be stored on the server in the given location. You should say yes but it will try to save those symbols on the server using the same path you use for storing symbols locally on your development machine.1
  3. Set a breakpoint where desired; the breakpoint should be solid red. If it is, you're done.
  4. Browse the application in IE to trigger the breakpoint and debug away.

Troubleshooting

As you can probably tell from the massive list of prerequisites and recommendations as well as the 20-step guide to triggering a breakpoint, there's a lot that can go wrong with Remote Debugging. It's not insurmountable, but it's not something you're going to want to attempt unless your job pretty much depends on it. These are some of the errors I encountered along the way and how I addressed them.

Unable to connect to the Microsoft Visual Studio Remote Debugging Monitor named 'USER@SERVER'. The Visual Studio Remote Debugger on the target computer cannot connect back to this computer. Authentication failed. Please see Help for assistance.

You need to create a local administrator with the same password as the one you're using on the server to run the debugging monitor.

Unable to connect to the Microsoft Visual Studio Remote Debugging Monitor named 'USER@SERVER'. The Visual Studio Remote Debugger on the target computer cannot connect back to this computer. A firewall may be preventing communication via DCOM to the local computer. Please see Help for assistance.

You opened the firewall, but only for computers on the same subnet. The computer to which you are connecting is probably not on the same subnet, so you'll need to go to the firewall settings and open them up all the way (Visual Studio will not ask again). To edit the firewall settings, do the following:

  1. Open the "Windows Firewall" control panel.
  2. Select the "Exceptions" tab.
  3. Scroll to the "Microsoft Visual Studio 2008" entry and double-click it.
  4. From the dialog, press the "Change Scope" button.
  5. Change it to "Any computer (including those on the Internet)".
  6. Press "Ok" three times to save changes.

It's also possible that the Remote Debugger is being blocked on the server side. To address this, run the "Visual Studio 2008 Remote Debugger Configuration Wizard" again; if the wizard wants to adjust firewall settings, let it do so (for internal or external networks, as appropriate to your situation -- if you're not sure, use external). To make sure that the settings were applied, run the wizard again; it should ask you about running the service, but should no longer complain about the firewall.

If it still complains about the firewall, then you've got another problem: the wizard is failing to adjust the firewall settings but isn't telling you so. Verify that you're running the wizard as a user with permission to change the firewall settings.

Unable to connect to the Microsoft Visual Studio Remote Debugging Monitor named 'USER@SERVER'. Logon failure: unknown user name or bad password. See help for more information.

The user with which you are executing Visual Studio on the client does not exist on the server or has a different password. In order to avoid adding useless user accounts to the server's domain, you should restart your IDE using "Run as..." to set the security context to the same user as you have on the server.

You can impersonate other users, but you have to set a registry key; see Remote Debugging Under Another User Account for more information. This doesn't help, though, if the user you are trying to use doesn't even have an account on the remote machine.

Conclusion

Remote debugging sounds way cool and is the major difference between the Standard and Professional versions of Visual Studio, but it's not for the faint of heart or the inexperienced. If you Google around a bit, you'll notice that most people get a big heap of epic fail when they try it; I've tried to make as comprehensive a guide to remote debugging as my own experience and time constraints allowed.

Here's hoping you never have to do remote debugging (write a test instead!) but, if you do, I wish you the best of luck.




  1. I'm honestly not sure whether this is required or not, but I allowed it and it worked. It may also work without caching the symbols if the path can't be written.

The Dark Side of Entity Framework: Mapping Enumerated Associations

At Encodo, we're using the Microsoft Entity Framework (EF) to map objects to the database. EF treats everything -- and I mean everything -- as an object; the foreign-key fields by which objects are related aren't even exposed in the generated code. But I'm getting ahead of myself a bit. We wanted to figure out the most elegant way of mapping what we are going to call enumerated associations in EF. These are associations from a source table to a target table, where the target is a lookup table keyed by an int. That is, the enumerated association could be mapped to a C# enum instead of an object. We already knew what we wanted the solution to look like, as we'd implemented something similar in Quino, our metadata framework (see below for a description of how that works).

The goals are as follows:

  1. Properties of the enumerated type are stored in the database, including its identifier, its value and a mapping to translations.
  2. Relations to the enumerated value are defined in the database as constraints.
  3. The database is therefore internally consistent.
  4. C# code can work with an enumerated type rather than a sub-object; this avoids joining the enumerated type tables when retrieving the main object or restricting to the enumerated type's value.

EF encourages -- nay, requires -- that one develop the application model in the database. A database model consists of tables, fields and relationships between those tables. EF will map those tables, fields and relationships to classes, properties and sub-objects in your C# code. The properties used to map an association -- the foreign keys -- are not exposed by the Entity Framework and are simply unavailable in the generated code. You can, however, add custom code to your partial classes to expose those values1:

return Child.ParentReference.ID;
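In context, such a property might look like the following sketch (the entity and property names are illustrative, not from the actual model):

public partial class Child
{
  // Expose the otherwise-hidden foreign key by reading it from the
  // related object; note that this forces the parent entity to be loaded.
  public int ParentId
  {
    get { return Parent.ID; }
  }
}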

However, you can't use those properties in LINQ queries because they cannot be mapped to the database by EF. Since you can neither restrict nor order on them, they're as good as useless, so we'll have to work within EF itself.
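For example, a query along these lines (hypothetical names again) compiles but fails at runtime, because EF cannot translate the unmapped property into a store expression:

// Throws NotSupportedException: LINQ to Entities cannot translate the
// unmapped 'ParentId' property into SQL.
var children = context.Children.Where(c => c.ParentId == 5).ToList();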

Even though EF has already mapped the constraint from the database as a navigational property, let's add the property to the model as a scalar property anyway. You'll immediately be reprimanded for mapping the property twice, with something like the following error message:

Error 3007: Problem in Mapping Fragments starting at lines 1383, 1617: Non-Primary-Key column(s) [ColumnName] are being mapped in both fragments to different conceptual side properties - data inconsistency is possible because the corresponding conceptual side properties can be independently modified.

Since we're feeling adventurous, we open the XML file directly (instead of inside the designer) and remove the navigational property and association, then add the property to the conceptual model by hand. Now, we're reprimanded for not having mapped the association EF found in the database, with something like the following error message:

Error 11008: Association 'FOREIGN_KEY_NAME' is not mapped.

Not giving up yet, we open the model in the designer again and delete the offending foreign key from the diagram. Now, we get something like the following error message:

Error 3015: Problem in Mapping Fragments starting at lines 6680, 6699, 6716, 6724, 6801, 6807, 6815: Foreign key constraint 'FOREIGN_KEY_NAME' from table Source (SourceId) to table TargetType (Id):: Insufficient mapping: Foreign key must be mapped to some AssociationSet on the conceptual side.

The list of line numbers indicates where the foreign key we've deleted is still being referenced. Despite having used the designer to delete the key, EF has neglected to maintain consistency in the model, so it's time to re-open the model as XML and delete the remaining references to 'FOREIGN_KEY_NAME' manually.

We're finally in the clear as far as the designer and compiler are concerned, with the constraint defined as we want it in the database and EF exposing the foreign key as an integer -- to which we can assign a typecast enum -- instead of an object. This was the goal, so let's run the application and see what happens.

Everything works as expected and there are no nasty surprises waiting for us at runtime. We've got a much more comfortable way of working with the special case of enumerated types working in EF. This special case, arguably, comes up quite a lot; in the model for our application, about half of the tables contain enumerated data, which are used as lookups for reports.

It wasn't easy and the solution involved switching from designer to XML-file and back a few times2, but at least it works. However, before we jump for joy that we at least have a solution, let's pretend we've changed our database again and update the model from the database.

Oops.

The EF-Designer has detected the foreign key we so painstakingly deleted and re-established it without asking for so much as a by-your-leave, giving us the error of type 3007 shown above. We're basically back where we started ... and will be whenever anyone changes the database and updates the model automatically. At this point, it seems that the only way to actually expose the foreign key in the EF model is to remove the association from the database! Removing the constraint in the database, however, is unacceptable as that would destroy the relational integrity just to satisfy a crippled object mapper.

In a last-ditch effort, we can fool EF into thinking that the constraint has been dropped not by removing the constraint but by removing the related table from the EF model. That is, once EF no longer maps the destination table -- the one containing the enumerated data -- it will no longer try to map the constraint, mapping the foreign key as just another integer field.

This solution finally works and the model can be updated from the designer without breaking it -- as long as no one re-adds the table with the enumerated data. This is the solution we've chosen for all of our lookup data, establishing a second EF-model to hold those tables.

  • The main model contains non-enumerated data; relations to enumerated data end in integer fields instead of objects.
  • The lookup model contains a list of enumerated data tables; these are queried for the contents of drop-down lists and so on.
  • We defined an enumerated type in C# for each table in the lookup model, with values corresponding to the values that go in the lookup table.
  • We wrote a synchronizer to keep the data in the lookup tables synchronized with the enum-values in C#.
  • Business logic uses these enumerated types to assign the values to the foreign-key integer fields (albeit with a cast).
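A minimal sketch of the resulting pattern, with invented names:

// The C# enum mirrors the rows of the lookup table; the synchronizer
// keeps the two in step.
public enum InvoiceState
{
  Open = 1,
  Paid = 2,
  Cancelled = 3
}

// Business code assigns and restricts via the plain integer foreign key:
invoice.InvoiceStateId = (int)InvoiceState.Paid;

var paidInvoices = context.Invoices.Where(
  i => i.InvoiceStateId == (int)InvoiceState.Paid);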

Using Quino to Solve the Problem

It's not a beautiful solution, but it works better than the alternative (using objects for everything). Quino, Encodo's metadata framework, includes an ORM that addresses this problem much more elegantly. In Quino, if you have the situation outlined above -- a data table with a relation to a lookup table -- you define two classes in the metadata, pretty much as you do with EF. However, in Quino, you can specify that one class corresponds to an enumerated type, and both the code generator and the schema migrator will treat that meta-class accordingly.

  • The code generator maps relations with the enumerated class as the target to the C# enum instead of an object, automatically converting the underlying integer foreign key to the enumerated type and back.
  • The schema migrator detects differences between the C# enumerated type and the values available in the lookup table in the database and keeps them synchronized.
  • As simple integer enums, the values can be easily restricted and ordered without joining extra tables.
  • Generated code uses the C# enumerated type, which ensures type-safety and code-completion, including documentation, in business code.

EF has a graphical designer, whereas Quino does not, but the designer only gets in the way for the situation outlined above. Quino offers an elegant solution for lookup values with only two lines of code: one to create the lookup class and indicate which C# enum it represents and one to create a property of that type on the target class. The Quino Demo (not yet publicly available) contains an example.



  1. You can also try to modify the T4 templates used to generate code, but that would be futile for reasons that follow.

  2. Which is, frankly, appalling, but hardly unexpected for a 1.0 product from Microsoft, which usually needs a few tries to get things working smoothly.

Elegant Code vs.(?) Clean Code

Eric Lippert, a developer on the Microsoft C# compiler team, recently made a post called Comma Quibbling, asking readers to post their solutions to a programming exercise. The requirements are as follows:

  1. If the sequence is empty then the resulting string is "{}".
  2. If the sequence is a single item "ABC" then the resulting string is "{ABC}".
  3. If the sequence is the two item sequence "ABC", "DEF" then the resulting string is "{ABC and DEF}".
  4. If the sequence has more than two items, say, "ABC", "DEF", "G", "H" then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)

On top of that, he stipulated "I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer."

Before doing anything else, let's nail down the specification above with some tests, using the NUnit testing framework:

[TestFixture]
public class SentenceComposerTests
{
  [Test]
  public void TestZero()
  {
    var parts = new string[0];
    var result = parts.ConcatenateWithAnd();

    Assert.AreEqual("{}", result);
  }

  [Test]
  public void TestOne()
  {
    var parts = new[] { "one" };
    var result = parts.ConcatenateWithAnd();

    Assert.AreEqual("{one}", result);
  }

  [Test]
  public void TestTwo()
  {
    var parts = new[] { "one", "two" };
    var result = parts.ConcatenateWithAnd();

    Assert.AreEqual("{one and two}", result);
  }

  [Test]
  public void TestThree()
  {
    var parts = new[] { "one", "two", "three" };
    var result = parts.ConcatenateWithAnd();

    Assert.AreEqual("{one, two and three}", result);
  }

  [Test]
  public void TestTen()
  {
    var parts = new[] { "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten" };
    var result = parts.ConcatenateWithAnd();

    Assert.AreEqual("{one, two, three, four, five, six, seven, eight, nine and ten}", result);
  }
}

The tests assume that the method ConcatenateWithAnd() is declared as an extension method. With the tests written, I figured I'd take a crack at the solution, keeping the last condition foremost in my mind instead of compactness, elegance or cleverness (as often predominate). Instead, I wanted to make the special cases given in the specification as clear as possible in the code. On top of that, I added the following conditions to the implementation:

  1. Do not create a list or array out of the enumerator. That is, do not invoke any operation that would involve reading the entire contents of the enumerator at once (e.g. the extension methods Count() or Last() are verboten).
  2. Avoid comments; instead, make the code comment itself.
  3. Make the code as clearly efficient as possible without invoking any potentially costly library routines whose asymptotic order is unknown.

That said, here's my version:

public static string ConcatenateWithAnd(this IEnumerable<string> words)
{
  var enumerator = words.GetEnumerator();

  if (!enumerator.MoveNext())
  {
    return "{}";
  }

  var firstItem = enumerator.Current;

  if (!enumerator.MoveNext())
  {
    return "{" + firstItem + "}";
  }

  var secondItem = enumerator.Current;

  if (!enumerator.MoveNext())
  {
    return "{" + firstItem + " and " + secondItem + "}";
  }

  var builder = new StringBuilder("{");
  builder.Append(firstItem);
  builder.Append(", ");
  builder.Append(secondItem);

  var item = enumerator.Current;

  while (enumerator.MoveNext())
  {
    builder.Append(", ");
    builder.Append(item);
    item = enumerator.Current;
  }

  builder.Append(" and ");
  builder.Append(item);
  builder.Append("}");

  return builder.ToString();
}

Looking at this from a maintenance or understanding point-of-view, I have the following notes:

  • More novice users will probably not immediately grasp the use of the enumerator. Though it's part of the .NET library, its use is usually hidden by the syntactic sugar of the foreach-statement (see the sketch after this list).
  • The formatting instructions for curly brackets and separators are included several times, which decreases maintainability should the output specification change.
  • The multiple calls to the string-concatenation operator and to StringBuilder.Append() are intentional. I wanted to avoid having to use escaped {} in the format string (e.g. String.Format("{{{0} and {1}}}", firstItem, secondItem) is confusing if you're not aware how curly brackets are escaped in a format string).
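For reference, here is roughly what the compiler generates for a foreach-statement; Process() is a hypothetical stand-in for whatever the loop body does:

foreach (string word in words)
{
  Process(word);
}

// ...is expanded into (approximately):
IEnumerator<string> enumerator = words.GetEnumerator();
try
{
  while (enumerator.MoveNext())
  {
    string word = enumerator.Current;
    Process(word);
  }
}
finally
{
  enumerator.Dispose();
}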

Other than those things, it seems relatively compact and efficient. With my own version written, I looked through the comments on the post to see if any other interesting solutions were available. I came up with two that caught my eye: one by Jon Skeet and another by Hristo Deshev, who submitted his in F#.

Hristo's example in F# is as follows:

#light
let format (words:list<string>) =
   let rec makeList (words: list<string>) =
       match words with
           | [] -> ""
           | first :: [] -> first
           | first :: second :: [] -> first + " and " + second
           | first :: rest -> first + ", " + (makeList rest)
   "{" + (makeList words) + "}"

That's so cool: the formulation in F# is almost plain English! That's pretty damned maintainable, I'd say. I have no way of judging the performance of the pattern matching, but the function does make use of recursion: lists with thousands of items will incur thousands of nested calls.

Next up is Jon Skeet's version in C#:

public static string JonSkeetVersion(this IEnumerable<string> words)
{
  var builder = new StringBuilder("{");
  string last = null;
  string penultimate = null;
  foreach (string word in words)
  {
    // Shuffle existing words down
    if (penultimate != null)
    {
      builder.Append(penultimate);
      builder.Append(", ");
    }
    penultimate = last;
    last = word;
  }
  if (penultimate != null)
  {
    builder.Append(penultimate);
    builder.Append(" and ");
  }
  if (last != null)
  {
    builder.Append(last);
  }
  builder.Append("}");
  return builder.ToString();
}

This one is very clever and handles all cases in a single loop rather than addressing special cases outside of a loop (as mine did). Also, all of the formatting elements -- the curly brackets and item separators -- are mentioned only once, improving maintainability. I immediately liked it better than my own solution from a technical standpoint. While I'm drawn to the cleverness and elegance of the solution, I'm not the target audience. Skeet's version forces you to reason out the special cases; it's not immediately obvious how the special cases for zero, one and two elements are handled. Also, while I am tickled pink by the aptness of the variable name penultimate, I wonder how many non-native English speakers would understand its intent without a visit to an online dictionary. The name secondToLast would have been a better, though far less sexy, choice.

It's very easy to underestimate how little people are willing to actually read code that they didn't write. If the code requires a certain amount of study to understand, they may just leave it well enough alone and seek out the original developer. If, however, it looks quite easy and the special cases are made clear -- as in my version -- they are far more likely to dig in and work with it. Since the problem is defined as three special cases and a general case, it is probably best to offer a solution where these cases are immediately obvious -- as long as you don't sacrifice performance unnecessarily. Cleverness is wonderful, but you may end up severely limiting the number of people willing -- or able -- to work on that code.

Encodo's Development Environment

For the software developers in our audience, we've put together a list of the most essential .Net tools that we use daily and without which we wouldn't want to work.

Visual Studio 2008

Many months ago, we moved our entire .Net development to Visual Studio 2008. VS2008 supports .Net 2.0, 3.0 and 3.5 projects, which made the transition both quick and easy. Given the choice, we use .Net 3.5.1, as we've grown quite attached to the new language features like Linq, lambda-expressions and so on and wouldn't want to do without them anymore.

Resharper (R#) and Agent Smith

The ReSharper addon developed by JetBrains quickly revealed itself to be an extremely useful addition to Visual Studio. Its main strength is recommending ways to clean up and improve source code, but it also includes an excellent unit-testing client (NUnit-compatible), enhanced code-navigation, analysis tools and much more.

ReSharper also supports plugins of its own and we've installed Agent Smith, which performs spell-checking and enforces other naming and coding conventions.

GhostDoc

GhostDoc is a freeware Visual Studio addon that improves source-code documentation in Visual Studio. It not only generates a complete documentation scaffolding based on method and property signatures, but also fills in the text with actual documentation that can often be used as-is.

DPack

DPack is also a freeware Visual Studio addon that includes several useful functions. We use it primarily for the two functions "File Browser" and "Code Browser", both of which improve code navigation speed. You use the file browser to find a file in the solution by typing a part of its name, then you use the code browser to search for an identifier within that file (matching classes, methods, properties, fields, interfaces and so on) and jump to it by hitting enter.

Encodo Perforce Plugin

Perforce has been Encodo's version control system (VCS) of choice for a while now, but the Visual Studio addon provided by Perforce themselves isn't very good, so we decided to make our own addon (implemented as a so-called Source Control Package). After a minimum of development time, we've got an addon that we've been using for months both in the office and remotely and that fully integrates with Visual Studio.

Developer Express Components

The UI components provided with the .Net framework are good, but we prefer the much more powerful components available from Developer Express for both our Windows and web applications. These range from simpler components like toolbars, ribbons, grids and treelists to more advanced ones like full schedulers, as well as reporting & printing. All components support full skinning.

Other component libraries

Though we use Developer Express almost exclusively, we've also worked with components from Telerik and ComponentArt. Naturally, though all three providers have extremely powerful components, each has its strengths and weaknesses.

Jing

And lastly, we have one of the newer tools that we find ourselves using more and more lately: Jing. Jing is an application that both takes screenshots and records screencasts. It can store recordings locally or publish them directly to screencast.com under your user account. This tool is extremely helpful for both writing documentation and providing support (e.g. showing a user how to do something); it's easy to use and free.

That should give you a good overview of the most important tools that we currently use at Encodo. Of course, every workstation has numerous other small tools that help us get our jobs done; if you want to know more, feel free to ask!