Improving performance in GenericObject

Quino is Encodo's metadata framework, written in C#/.NET 4.0. Since its inception four years ago, we've used it in several products and the code base has been updated continuously.

However, it was only in a recent product that one of the central features of the framework came under scrutiny for performance issues. It turned out that reading and writing to Quino data objects was a bit slower than we needed it to be.

How Data Objects are Implemented

A typical ORM (like Hibernate or Microsoft's Entity Framework) uses a C# class as the base entity in the model, decorating those classes with attributes to add to the model. The ORM then uses this information to communicate with the database, reading and writing values through reflection. Creating objects and getting and setting values -- including default values -- is all done through direct calls to property getters and setters.
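
For illustration, here is a minimal sketch of that attribute-decorated style, using the standard System.ComponentModel.DataAnnotations attributes as a stand-in for whatever a particular ORM actually requires; the exact attributes and conventions vary from ORM to ORM.

using System.ComponentModel.DataAnnotations;

// A typical ORM entity: the C# class is the model and the attributes
// carry the metadata, which the ORM reads via reflection.
public class Person
{
  [Key]
  public int Id { get; set; }

  [Required]
  [StringLength(100)]
  public string LastName { get; set; }

  [StringLength(100)]
  public string FirstName { get; set; }
}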

Quino took a different approach, putting the model at the center of the framework and defining an in-memory structure for the model that is accessible through a regular API rather than reflection. The actual C# classes used by business logic are then generated from this model -- instead of the other way around.

This decoupling of metadata from the classes has a lot of advantages, not the least of which is that Quino provides generalized access to any of these business objects. Components that work with Quino data do not need to be aware of the actual classes: instead, those components use the metadata and an API to read and write values. Since the interface is generalized, these values are read and written through Quino code rather than through direct getters and setters.

As you would expect, there is a base class from which all Quino data objects inherit that provides the support for this interface, called GenericObject. It was in this central class that we had to go to work with a profiler to squeeze out some more speed.

Improving Performance

The actual use case for our data objects didn't even use our ORM, as such. Instead, we were generating the objects from a data stream with 0 to n columns defined (a perfect situation to use an object that supports a flexible interface).

Once those objects were created, they were handed off to the user interface, which applied them to a grid, replacing rows or updating values as required.

So, we needed to improve things on several fronts:

  • We needed to improve speed when creating objects because data was arriving at a serious clip.
  • We needed to improve speed when applying values because there were often several grids open at once, and they all needed to be updated as quickly as possible.1
  • We also needed to decrease the memory footprint because when the data flow was heavy, there were a lot of objects in memory and the application was reaching the limit of its address space.2

As mentioned above, the data object we had worked fine. It was fast enough and slim enough that we never noticed any performance or memory issues in more classical client applications. It was only when using the data object in a very high-demand, high-performance product that the issue arose. That's actually the way we prefer working: get the code running correctly first, then make it faster if needed.

And how do you make it faster and slimmer without breaking everything else you've already written? You run each subsequent version against your unit, regression and integration tests to verify it, that's how. Quino has several thousand automated tests that we ran each step of the way to make sure that our performance improvements didn't break behavior.

Charts and Methodology

The charts below indicate a relative improvement in speed and memory usage. The numbers are not meant to be compared in absolute terms to any other numbers. In fact, the application being tested was a simple console application we wrote that created a bunch of objects with a bunch of random data. Naturally we built the test to adequately approximate the behavior of the real-world application that was experiencing problems. This test application emitted the numbers you see below.

We used the YourKit Profiler for .NET to find code points that still needed improvement and iterated until we were happy with the result. We are very happy with YourKit as a profiler. It's fast and works well for sampling and tracing as well as detecting memory leaks and tracking memory usage. To test performance, we would execute part of the tests below with tracing enabled (no recompilation necessary), show "Hot Spots" and fix those.

The tests focused on creating a certain number of objects with a certain number of columns (with total data fields = #objects * #columns), corresponding to the first two columns in the table. The other columns are v0 (the baseline) and v1--v3, which are various versions we made as we tried to hone performance. The final three columns show the speed of v1--v3 vs. v0.

[Chart: object-creation times, v0 vs. v1--v3]

[Chart: value-update times, v0 vs. v1--v3]

Finally, not only did we make creating objects over 3 times faster and changing values more than twice as fast, but we also decreased the memory footprint of each object to just over 1/3 of the original size.

[Chart: memory footprint per object, v0 vs. v1--v3]

These improvements didn't come by magic: the major change we made was to move from using a dictionary as an internal representation to using arrays and direct indexing. The dictionary is the more natural choice as the generalized API maps property and relation names to values, but it uses more space and is slower than an array. It is, however, much easier to use if you don't have to worry about extreme performance situations. Using an array gives us the speed we need, but it also requires that we be much more careful about index-out-of-bounds situations. That's where our rich suite of tests came to the rescue and let us have our cake and eat it too.
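
The following is a simplified sketch of the difference between the two representations -- not the actual GenericObject code; the names and shapes are purely illustrative.

using System.Collections.Generic;

// Dictionary-backed storage: flexible and forgiving, but every access
// pays for a hash lookup and every entry carries bucket overhead.
public class DictionaryBackedObject
{
  private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

  public object GetValue(string propertyName)
  {
    object result;
    return _values.TryGetValue(propertyName, out result) ? result : null;
  }

  public void SetValue(string propertyName, object value)
  {
    _values[propertyName] = value;
  }
}

// Array-backed storage: the metadata assigns each property a fixed
// index, so reads and writes become direct array accesses. The price
// is that out-of-range indexes are now the caller's problem.
public class ArrayBackedObject
{
  private readonly object[] _values;

  public ArrayBackedObject(int propertyCount)
  {
    _values = new object[propertyCount];
  }

  public object GetValue(int propertyIndex)
  {
    return _values[propertyIndex];
  }

  public void SetValue(int propertyIndex, object value)
  {
    _values[propertyIndex] = value;
  }
}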

These improvements are available to any application using Quino 1.6.2.0 and higher.



  1. In a subsequent version of this product, we would move each grid/window into its own UI thread in order to parallelize the work and use all 8 cores on the target machine to make updates even faster.

  2. Because of the parallelization mentioned in the footnote above, the subsequent version was still reaching the limit of the 32-bit address space, even with the decreased memory footprint per object. So we compiled as 64-bit to remove that limitation as well.

Encodo C# Handbook 7.17 -- Using System.Linq

I'm currently revising the Encodo C# Handbook to update it for the last year's worth of programming experience at Encodo, which includes a lot more experience with C# 4.0 features like optional parameters, dynamic types and more. The following is an expanded section on working with Linq. A final draft should be available by the middle of April or so.

7.17 -- Using System.Linq

When using Linq expressions, be careful not to sacrifice legibility or performance simply in order to use Linq instead of more common constructs. For example, the following loop sets a property for those elements in a list where a condition holds.

foreach (var pair in Data)
{
  if (pair.Value.Property is IMetaRelation)
  {
    pair.Value.Value = null;
  }
}

This seems like a perfect place to use Linq; assuming an extension method ForEach(this IEnumerable<T>), we can write the loop above using the following Linq expression:

Data.Where(pair => pair.Value.Property is IMetaRelation).ForEach(pair => pair.Value.Value = null);

This formulation, however, is more difficult to read because the condition and the loop are now buried in a single line of code, but a more subtle performance problem has been introduced as well. We have made sure to evaluate the restriction (Where) first so that ForEach touches as few elements as possible, but we have still layered a second, chained iterator -- plus a delegate call per element -- on top of the original loop. This could cause performance problems in border cases where the list is large and a large number of elements satisfy the condition.
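
One possible middle ground -- a sketch, not a rule -- keeps the single foreach but lets Linq express the restriction, so that only matching pairs ever reach the loop body:

foreach (var pair in Data.Where(p => p.Value.Property is IMetaRelation))
{
  pair.Value.Value = null;
}

This keeps a single pass over Data and keeps the assignment visible as a statement rather than hiding it in a lambda.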

7.17.1 -- Lazy Evaluation

Linq is mostly a blessing, but you always have to keep in mind that Linq expressions are evaluated lazily. Therefore, be very careful when using the Count() method because it will iterate over the entire collection (if the backing collection is of base type IEnumerable<T>). Linq is optimized to check the actual backing collection, so if the IEnumerable<T> you have is a list and the count is requested, Linq will use the Count property instead of counting elements naively.
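
The following small illustration shows the difference; the numbers are arbitrary.

var numbers = Enumerable.Range(1, 1000);

// Lazy: the filter runs again on every enumeration.
var bigNumbers = numbers.Where(n => n > 900);

var firstCount = bigNumbers.Count();   // iterates (and filters) all 1000 source elements
var secondCount = bigNumbers.Count();  // ...and does the same work again

// Materialized: the filter runs once; Count() sees a List<T> and
// simply returns its Count property.
var bigNumberList = bigNumbers.ToList();
var listCount = bigNumberList.Count();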

A few concrete examples of other issues that arise due to lazy evaluation are illustrated below.

7.17.2 -- Capturing Unstable Variables/Access to Modified Closure

You can accidentally change the value of a captured variable before the sequence is evaluated. Since ReSharper will complain about this behavior even when it does not cause unwanted side-effects, it is important to understand which cases are actually problematic.

var data = new[] { "foo", "bar", "bla" };
var otherData = new[] { "bla", "blu" };
var overlapData = new List<string>();

foreach (var d in data)
{
  if (otherData.Where(od => od == d).Any())
  {
    overlapData.Add(d);
  }
}

// We expect one element in the overlap, bla
Assert.AreEqual(1, overlapData.Count);

The reference to the variable d will be flagged by ReSharper and marked as an access to a modified closure. This is a reminder that a variable referenced -- or captured -- by the lambda expression (the closure) will have the last value assigned to it rather than the value that was assigned to it when the lambda was created. In the example above, the lambda is created with the first value in the sequence, but since we only use the lambda once, and always before the variable has been changed, we don't have to worry about side-effects. ReSharper can only detect that a variable referenced in a closure is being changed within the scope that it checks; it lets you know so that you can verify that there are no unwanted side-effects.

Even though there isn't a problem, you can rewrite the foreach-statement above as the following code, eliminating the "Access to modified closure" warning.

var overlapData = data.Where(d => otherData.Where(od => od == d).Any()).ToList();

The example above was tame in that the program ran as expected despite capturing a variable that was later changed. The following code, however, will not run as expected:

var data = new[] { "foo", "bar", "bla" };
var otherData = new[] { "bla", "blu" };

var threshold = 2;
var results = data.Where(d => d.Length == threshold);
var overlapData = data.Where(d => otherData.Where(od => od == d).Any());
if (overlapData.Any())
{
  threshold += 1;
}

// All elements are three characters long, so we expect no matches
Assert.AreEqual(0, results.Count());

Here we have a problem because the closure is evaluated after a local variable that it captured has been modified, resulting in unexpected behavior. Whereas it's possible that this is exactly what you intended, it's not a recommended coding style. Instead, you should move the calculation that uses the lambda after any code that changes the variables that it captures:

var threshold = 2;
var overlapData = data.Where(d => otherData.Where(od => od == d).Any());
if (overlapData.Any())
{
  threshold += 1;
}
var results = data.Where(d => d.Length == threshold);

This is probably the easiest way to get rid of the warning and make the code clearer to read.

Encodo C# Handbook 7.30 -- Loose vs. Tight Coupling

I'm currently revising the Encodo C# Handbook to update it for the last year's worth of programming experience at Encodo, which includes a lot more experience with C# 4.0 features like optional parameters, dynamic types and more. The following is an expanded section on loose vs. tight coupling. A final draft should be available by the middle of April or so.

7.30 -- Loose vs. Tight Coupling

Whether to use loose or tight coupling for components depends on several factors. If a component on a lower-level must access functionality on a higher level, this can only be achieved with loose coupling: e.g. connecting the two by using one or more delegates or callbacks.

If the component on the higher level needs to be coupled to a component on a lower level, then it's possible to have them be more tightly coupled by using an interface. The advantage of using an interface over a set of one or more callbacks is that changes to the semantics of how the coupling should occur can be enforced. The example below should make this much clearer.

Imagine a class that provides a single event to indicate that it has received data from somewhere.

public class DataTransmitter
{
  public event EventHandler<DataBundleEventArgs> DataReceived;
}

This is the classic way of loosely coupling components; any component that is interested in receiving data can simply attach to this event, like this:

public class DataListener
{
  public DataListener(DataTransmitter transmitter)
  {
    transmitter.DataReceived += TransmitterDataReceived;
  }

  private void TransmitterDataReceived(object sender, DataBundleEventArgs args)
  {
    // Do something when data is received
  }
}

Another class could combine these two classes in the following, classic way:

var transmitter = new DataTransmitter();
var listener = new DataListener(transmitter);

The transmitter and listener can be defined in completely different assemblies and need no dependency on any common code (other than the .NET runtime) in order to compile and run. If this is an absolute must for your component, then this is the pattern to use for all events. Just be aware that the loose coupling may introduce semantic errors -- errors in usage that the compiler will not notice.

For example, suppose the transmitter is extended to include a new event, NoDataAvailableReceived.

public class DataTransmitter
{
  public event EventHandler<DataBundleEventArgs> DataReceived;
  public event EventHandler NoDataAvailableReceived;
}

Let's assume that the previous version of the interface threw a timeout exception when it had not received data within a certain time window. Now, instead of throwing an exception, the transmitter triggers the new event instead. The code above will no longer indicate a timeout error (because no exception is thrown) nor will it indicate that no data was transmitted.

One way to fix this problem (once detected) is to hook the new event in the DataListener constructor. If the code is to remain highly decoupled -- or if the interface cannot be easily changed -- this is the only real solution.
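
For example (a sketch based on the listener above; the handler bodies remain placeholders):

public class DataListener
{
  public DataListener(DataTransmitter transmitter)
  {
    transmitter.DataReceived += TransmitterDataReceived;

    // Without this subscription, a timeout now passes by unnoticed.
    transmitter.NoDataAvailableReceived += TransmitterNoDataAvailableReceived;
  }

  private void TransmitterDataReceived(object sender, DataBundleEventArgs args)
  {
    // Do something when data is received
  }

  private void TransmitterNoDataAvailableReceived(object sender, EventArgs args)
  {
    // React to the timeout (e.g. warn the user)
  }
}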

Imagine now that the transmitter becomes more sophisticated and defines more events, as shown below.

public class DataTransmitter
{
  public event EventHandler<DataBundleEventArgs> DataReceived;
  public event EventHandler NoDataAvailableReceived;
  public event EventHandler ConnectionOpened;
  public event EventHandler ConnectionClosed;
  public event EventHandler<DataErrorEventArgs> ErrorOccurred;
}

Clearly, a listener that attaches and responds appropriately to all of these events will provide a much better user experience than one that does not. The loose coupling of the interface thus far requires all clients of this interface to be proactively aware that something has changed and, once again, the compiler is no help at all.

If we can change the interface -- and if the components can include references to common code -- then we can introduce tight coupling by defining an interface with methods instead of individual events.

public interface IDataListener
{
  void DataReceived(IDataBundle bundle);
  void NoDataAvailableReceived();
  void ConnectionOpened();
  void ConnectionClosed();
  void ErrorOccurred(Exception exception, string message);
}

With a few more changes, we have a more tightly coupled system, but one that will enforce changes on clients:

  • Add a list of listeners to the DataTransmitter.
  • Add code to copy and iterate the listener list instead of triggering events from the DataTransmitter.
  • Make DataListener implement IDataListener.
  • Add the listener to the transmitter's list of listeners.

Now when the transmitter requires changes to the IDataListener interface, the compiler will enforce that all listeners are also updated.
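
A sketch of what those changes might look like (simplified; AddListener and the field names are illustrative, not an actual API):

public class DataTransmitter
{
  private readonly List<IDataListener> _listeners = new List<IDataListener>();

  public void AddListener(IDataListener listener)
  {
    _listeners.Add(listener);
  }

  protected void NotifyDataReceived(IDataBundle bundle)
  {
    // Copy the list so that listeners can add or remove themselves
    // while being notified.
    foreach (var listener in _listeners.ToArray())
    {
      listener.DataReceived(bundle);
    }
  }
}

public class DataListener : IDataListener
{
  public DataListener(DataTransmitter transmitter)
  {
    transmitter.AddListener(this);
  }

  public void DataReceived(IDataBundle bundle) { /* ... */ }
  public void NoDataAvailableReceived() { /* ... */ }
  public void ConnectionOpened() { /* ... */ }
  public void ConnectionClosed() { /* ... */ }
  public void ErrorOccurred(Exception exception, string message) { /* ... */ }
}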

Troubleshooting a misbehaving designer in Visual Studio 2010

This article originally appeared on earthli News and has been cross-posted here.


Anyone who's used Visual Studio 20101 for a non-trivial Windows Forms project has run into situations wherein the designer can no longer be opened. Usually, it's because the class encounters null-reference exceptions when referencing data that is unavailable until runtime. Those are easy to fix: just avoid referencing that data in the constructor or load-routine while in design-mode.

However, sometimes Visual Studio has problems loading assemblies that it seems it should have available. Sometimes Visual Studio seems to have a devil of a time loading assemblies whose location it has quite explicitly been told.

If you like, there is a walkthrough -- with screenshots! -- at the end of this article, which shows how to solve even the most intractable designer problems.

A Tale of Two Platforms

One of the troubles is that many developers have moved to 64-bit Windows in order to take advantage of the higher RAM limits. The move to 64-bit causes some issues with many .NET assemblies in that the developer (i.e. probably YOU) didn't remember to take into account that an assembly might be loaded by x86 code or x64 code or some combination thereof. The designer will sometimes be unable to load an assembly because it has been compiled in a way that cannot be loaded by the runtime currently being used by the designer as explicitly requested in the project settings. That said, the request is explicit as far as Visual Studio is concerned, but implicit as far as the developer is concerned.

The only long-lasting solution is to learn how assemblies are loaded and what the best compile settings are for different assemblies so that you will run into as few problems as possible.

There are several considerations:

  1. It would be nice to have class libraries that can be loaded by any executable instead of having separate versions for x64 and x86.
  2. It would also be nice to be able to benefit from as many debugging features of the environment as possible (e.g. the Edit & Continue feature does not work with x64 builds).
  3. It would be nice to have the most optimal executable for the target platform. (This is usually taken to mean an executable compiled to run natively on the target, but turns out not necessarily to be so, as shown below.)

In order to help decide what to do, it's best to go to the source: Microsoft. To that end, the article AnyCPU Exes are usually more trouble than they're worth by Rick Byers provides a lot of guidance.

  • "Running in two very different modes increases product complexity and the cost of testing": two different platforms equals two times as much testing. Build servers have to compile and run tests for all configurations because there can be subtle differences.2
  • "32-bit tends to be faster anyway": the current version of the WOW (Windows-on-Windows) runtime on 64-bit systems actually runs code faster than the native 64-bit runtime. That still holds true as of this writing.
  • "Some features aren't avai[l]able in 64-bit": the aforementioned Edit & Continue counts among these, as does historical debugging if you're lucky enough to have a high-end version of Visual Studio.

Given all of the points made above and assuming that your application does not actually need to be 64-bit (i.e. it needs to address more RAM than is available in the 32-bit address space), your best bet is to use the following rules as your guide when setting up default build and release settings.

  • Pure class libraries should always be compiled for "Any CPU" (i.e. able to be loaded by both x86 and x64 assemblies).
  • Executables should always be compiled as x86.
  • Unit-test assemblies should also be compiled as x86 in order to be able to use Edit & Continue.

Where Did You Find That?!

Once you've set up your build configuration appropriately and rebuilt everything, you will avoid many design-time errors.

Though not all of them.

Visual Studio has a nasty habit of loading assemblies wherever it can find one that matches your requirements, regardless of the location from which you linked in the assembly. If you look in the project file for a C# Visual Studio project (the .csproj-file), you'll actually see an XML element called <HintPath> after each assembly reference. The name is appropriately chosen: Visual Studio will look for an assembly in this location first, but will continue looking elsewhere if it's not there. It will look in the GAC and it will look in the bin/Debug or bin/x86/Debug folder to see if it can scrounge up something against which to link. Only if the assembly is not to be found anywhere will Visual Studio give up and actually emit an error message.

At Encodo, we stopped using the GAC entirely, relying instead on local folders containing all required third-party libraries. In this way, we try to control the build configuration and assemblies used when code is downloaded to a new environment (such as a build server). However, when working locally, it is often the case that a developer's environment is a good deal dirtier than that of a build server and must be cleaned.

Though Visual Studio offers an option to clean a project or solution, it doesn't do what you'd expect: assemblies remain in the bin/Debug or bin/x86/Debug folders. We've added a batch command that we use to explicitly delete all of these folders so that Visual Studio once again must rely on the HintPath to find its assemblies.

If you find yourself switching between x86 and x64 assemblies with any amount of frequency, you will run into designer loading errors when the designer manages to find an assembly compiled for the wrong platform. When this happens, you must shut down Visual Studio, clean all output folders as outlined above and re-open the solution.

Including References with ReSharper

A final note on references: if you adopt the same policy as Encodo of very carefully specifying the location of all external references, you have to watch out for ReSharper. If ReSharper offers to "reference assembly X" and "include the namespace Y", you should politely decline and reference the assembly yourself. ReSharper will reference the assembly as expected but will not include a HintPath so the reference will be somewhere in the bin/Debug or bin/x86/Debug folder and will break as soon as you clean all of those directories (as will be the case on a build server).

Designer Assemblies

This almost always works, but Visual Studio can still find ways of loading assemblies over which you have little to no control: the designer assemblies.

In all likelihood, you won't be including the designer assemblies in your third-party binaries folder for several reasons:

  1. They are not strictly required for compilation
  2. They are usually a good deal larger than the assembly that they support and are only used during design-time
  3. Design-time assemblies are usually associated with visual component packages that must be installed anyway in order for a compiled executable to be considered licensed.3

For all of the reasons above, it's best not to even try to get Visual Studio to load designer assemblies out of a specific folder and just let it use the GAC instead.

Walkthrough: Solving a Problem in the Designer

Despite all of the precautions mentioned above, it is still possible to have a misbehaving designer. The designer can be so mischievous that it simply refuses to load, showing neither a stack nor an error message, keeping its reasons to itself. How do we solve such a problem?

You know you have a problem when the designer presents the following view instead of your form or user control.

[Screenshot: the designer error page shown instead of the form or user control]

In the worst case, you will be given neither a useful error message nor a stack from which to figure out what happened.

[Screenshot: the error page with neither a useful message nor a stack]

There's a little link at the top -- right in the middle -- that you can try that may provide you with more information.

[Screenshot: the error page with the details link at the top]

The designer will try to scare you off one last time before giving up its precious secrets; ignore it.

[Screenshot: the warning dialog shown before the details are revealed]

At this point, the designer will finally show the warnings and errors that describe the reason it cannot load.4

[Screenshot: the designer warnings and errors]

The text is a bit dense, but one thing pops out immediately:

[Screenshot: the error text, with the cached assembly location highlighted]

It looks like Visual Studio is checking some cached location within your application settings to find referenced assemblies and their designer assemblies.5 This is a bit strange as Visual Studio has been explicitly instructed to load those assemblies from the third-party folder that we carefully prepared above. Perhaps this cache represents yet another location that must be cleared manually every once in a while in order to keep the designer running smoothly.

[A]DevExpress.XtraLayout.LayoutControl cannot be cast to [B]DevExpress.XtraLayout.LayoutControl. 
Type A originates from 'DevExpress.XtraLayout.v10.2, Version=10.2.5.0, Culture=neutral, PublicKeyToken=b88d1754d700e49a' 
in the context 'LoadNeither' at location 
'C:\Documents and Settings\Marco\Local Settings\Application Data\Microsoft\VisualStudio\10.0\ProjectAssemblies\kn8q9qdt01\DevExpress.XtraLayout.v10.2.dll'. 
Type B originates from 'DevExpress.XtraLayout.v10.2, Version=10.2.4.0, Culture=neutral, PublicKeyToken=b88d1754d700e49a'
in the context 'Default' at location
'C:\WINDOWS\assembly\GAC_MSIL\DevExpress.XtraLayout.v10.2\10.2.4.0__b88d1754d700e49a\DevExpress.XtraLayout.v10.2.dll'.

This will turn out to be a goose chase, however.6 The problem does not lie in the location of the assemblies, but rather in the version. We can see that the designer was attempting to load version 10.2.4.0 of the third-party component library for DevExpress. However, the solution and all projects were referencing the 10.2.5.0 version, which had not been officially installed on that workstation. It was unofficially available because the assemblies were included in the solution-local third-party folder, but the designer files were not.

Instead of simply showing an error message that the desired version of a required assembly could not be loaded, Visual Studio chose instead to first hide the warnings quite well, then to fail to mention the real reason the assembly could not be loaded (i.e. that it conflicted with a newer version already in memory). Instead, the designer left it up to the developer to puzzle out that the error message only mentioned versions that were older than the current one.7

From there, a quick check of the installed programs and the GAC confirmed that the required version was not installed, but the solution was eminently non-obvious.

That's about all for Visual Studio Designer troubleshooting tips. Hopefully, they'll be useful enough to prevent at least some hair from being torn out and some keyboards from being thrown through displays.



  1. All tests were performed with the SP1 Beta version available as of Mid-February 2010.

  2. One such difference is how hash-codes are generated by the default implementation of GetHashCode(): the .NET implementation is optimized for speed, not portability, so the codes generated by the 32-bit and 64-bit runtimes are different.

  3. In the case of SyncFusion, this means the application won't even compile; in the case of DevExpress, the application will both compile and run, but will display a nag screen every once in a while.

  4. If you're lucky, of course. If you're unlucky, Visual Studio will already have crashed and helpfully offered to restart itself.

  5. Then it encountered a null-reference exception, which we can only hope will actually get fixed in some service pack or other.

  6. I tried deleting this folder, but it was locked by Visual Studio. I shut down Visual Studio and could delete the folder. When I restarted and reloaded the project as well as the designer, I found to my surprise that Visual Studio had exactly recreated the folder structure that I had just deleted. It appears that this is a sort of copy of the required assemblies, but the purpose of copying assemblies out of the GAC to a user-local temporary folder is unclear. It stinks of legacy workarounds.

  7. In the case of DevExpress, this didn't take too long because it's a large component package and the version number was well-known to the developers in the project. However, for third-party components that are not so frequently updated or which have a less recognizable version number, this puzzle could have remained insoluble for quite some time.

Overriding Equality Operators: A Cautionary Tale

This article originally appeared on earthli News and has been cross-posted here.


tl;dr: This is a long-winded way of advising you to always be sure what you're comparing when you build low-level algorithms that will be used with arbitrary generic arguments. The culprit in this case was the default comparator in a HashSet<T>, but it could be anything. It ends with cogitation about software processes in the real world.

Imagine that you have a framework (The Quino Metadata framework from Encodo Systems AG) with support for walking arbitrary object graphs in the form of a GraphWalker. Implementations of this interface complement a generalized algorithm.

This algorithm generates nodes corresponding to various events generated by the graph traversal, like beginning or ending a node or edge or encountering a previously processed node (in the case of graphs with cycles). Such an algorithm is eminently useful for formatting graphs into a human-readable format, cloning said graphs or other forms of processing.

A crucial feature of such a GraphWalker is to keep track of the nodes it has seen before in order to avoid traversing the same node multiple times and going into an infinite loop in graphs with cycles. For subsequent encounters with a node, the walker handles it differently -- generating a reference event rather than a begin node event.
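
The following sketch shows the tracking logic; the interface and names here are hypothetical simplifications, not the actual Quino types.

using System.Collections.Generic;
using System.Linq;

public interface INode
{
  IEnumerable<INode> Children { get; }
}

public interface IGraphWalker
{
  void BeginNode(INode node);
  void EndNode(INode node);
  void BeginEdge(INode node);
  void EndEdge(INode node);
  void HandleReference(INode node);
}

public class GraphTraversal
{
  private readonly IGraphWalker _walker;
  private readonly HashSet<INode> _visitedNodes = new HashSet<INode>();

  public GraphTraversal(IGraphWalker walker)
  {
    _walker = walker;
  }

  public void Walk(INode node)
  {
    // Membership is decided by the set's equality comparer -- the
    // detail at the heart of this article.
    if (!_visitedNodes.Add(node))
    {
      _walker.HandleReference(node);
      return;
    }

    _walker.BeginNode(node);
    if (node.Children.Any())
    {
      _walker.BeginEdge(node);
      foreach (var child in node.Children)
      {
        Walk(child);
      }
      _walker.EndEdge(node);
    }
    _walker.EndNode(node);
  }
}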

A common object graph is the AST for a programming language. The graph walker can be used to quickly analyze such ASTs for nodes that match particular conditions.

Processing a Little Language

Let's take a look at a concrete example, with a little language that defines simple boolean expressions:

OR(
  (A < 2)
  (B > A)
)

It's just an example and we don't really have to care about what it does, where A and B came from or the syntax. What matters is the AST that we generate from it:

1 Operator (OR)
2  Operator (<)
3    Variable (A)
4    Constant (2)
5  Operator (>)
6    Constant (B)
7    Variable (A)

When the walker iterates over this tree, it generates the following events (note that the numbers at the front of each line correspond to the objects in the diagram above):

1 begin node
1  begin edge
2    begin node
2      begin edge
3        begin node
3        end node
4        begin node
4        end node
2      end edge
2    end node
5    begin node
5      begin edge
6        begin node
6        end node
7        begin node
7        end node
5      end edge
5    end node
1  end edge
1 end node

Now that's the event tree we expect. This is also the event tree that we get for the objects that we've chosen to represent our nodes (Operator, Variable and Constant in this case). If, for example, we process the AST and pass it through a formatter for this little language, we expect to get back exactly what we put in (namely the code in Listing 1). Given the event tree, it's quite easy to write such a formatter -- namely, by handling the begin node (output the node text), begin edge (output a "(") and end edge (output a ")") events.
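
In terms of the hypothetical IGraphWalker sketched above, such a formatter might look like this (real formatting details like infix operators and spacing are omitted):

using System.Text;

public class ExpressionFormatter : IGraphWalker
{
  private readonly StringBuilder _output = new StringBuilder();

  public string Result
  {
    get { return _output.ToString(); }
  }

  public void BeginNode(INode node) { _output.Append(node); }
  public void BeginEdge(INode node) { _output.Append("("); }
  public void EndEdge(INode node) { _output.Append(")"); }

  public void EndNode(INode node) { }

  // Reference events are ignored here -- which is exactly the hole
  // that the equality change described below will expose.
  public void HandleReference(INode node) { }
}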

So far, so good?

Running Into Trouble

However, now imagine that we discover a bug in other code that uses these objects and we discover that when two different objects refer to the same variable, we need them to be considered equal. That is, we update the equality methods -- in the case of .NET, Equals() and GetHashCode() -- for Variable.

As soon as we do, however, the sample from Listing 1 now formats as:

OR(
  (A < 2)
  (B > )
)

Now we have to figure out what happened. A good first step is to see what the corresponding event tree looks like now. We discover the following:

1 begin node
1  begin edge
2    begin node
2      begin edge
3        begin node
3        end node
4        begin node
4        end node
2      end edge
2    end node
5    begin node
5      begin edge
6        begin node
6        end node
7        reference
5      end edge
5    end node
1  end edge
1 end node

The change affects the seventh node -- the second occurrence of Variable (A) -- which has now become a reference because we changed how equality is handled for Variables. The algorithm now considers any two Variables with the same name to be equivalent even if they are two different object references.

Fix #1 -- Hack the Application Code

If we look back at how we wrote the simple formatter above, we only handled the begin node, begin edge and end edge events. If we throw in a handler for the reference event and output the text of the node, we're back in business and have "fixed" the formatter.

Fix #2 -- Fix the Algorithm

But we ignore the more subtle problem at our own peril: namely, that the graph walking code is fragile in that its behavior changes due to seemingly unrelated changes in the arguments that it is passed. Though we have a quick fix above, we need to think about providing more stability in the algorithm -- especially if we're providers of low-level framework functionality.1

The walker algorithm uses a HashSet<T> to track the nodes that it has previously encountered. However, the default comparator -- again, in .NET -- leans on the equality functions of the objects stored in the set to determine membership.

The first solution -- or rather, the second one, as we already "fixed" the problem with what amounts to a hack above by outputting references as well -- is to change the equality comparator for the HashSet<T> to explicitly compare references. We make that change and we can once again remove the hack because the algorithm no longer generates references for subsequent variable encounters.
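
A reference-equality comparer is only a few lines; this is a sketch of one possible implementation.

using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Considers two objects equal only if they are the same instance,
// regardless of any Equals()/GetHashCode() overrides on T.
public class ReferenceEqualityComparer<T> : IEqualityComparer<T>
  where T : class
{
  public bool Equals(T x, T y)
  {
    return ReferenceEquals(x, y);
  }

  public int GetHashCode(T obj)
  {
    return RuntimeHelpers.GetHashCode(obj);
  }
}

The walker's set then becomes new HashSet<INode>(new ReferenceEqualityComparer<INode>()), and an equal-but-distinct Variable is no longer reported as a reference.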

Fix #3 -- Giving the Caller More Control

However, we're still not done. We've now not only gotten our code running but we've fixed the code for the algorithm itself so the same problem won't crop up again in other instances. That's not bad for a day's work, but there's still a nagging problem.

What happens if the behavior that was considered unexpected in this case is exactly the behavior that another use of the algorithm expects? That is, it may well be that other types of graph walker will actually want to be able to control what is and is not a reference by changing the equivalence functions for the nodes.2

Luckily, callers of the algorithm already pass in the graph walker itself, the methods of which the algorithm already calls to process nodes and edges. A simple solution is to add a method to the graph walker interface to ask it to create the kind of HashSet<T> that it would like to use to track references.
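
In terms of the sketches above, the extension could be as small as one method (again hypothetical):

public interface IGraphWalker
{
  // ...the existing event methods from the sketch above...

  // Lets each walker decide what "already seen" means for its graph:
  // reference equality for the AST walker, value equality for others.
  ISet<INode> CreateVisitedSet();
}

The traversal then replaces its own new HashSet<INode>() with a call to walker.CreateVisitedSet(), and a walker that wants reference semantics simply returns a set built with the reference-equality comparer shown earlier.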

Tough Decisions: Which Fix to Use?

So how much time does this all take to do? Well, the first solution -- the hack in application code -- is the quickest, with time spent only on writing the unit test for the AST and verifying that it once again outputs as expected.

If we make a change to the framework, as in the second solution where we change the equality operator, we have to create unit tests to test the behavior of the AST in application code, but using test objects in the framework unit tests. That's a bit more work and we may not have time for it.

The last suggestion -- to extend the graph walker interface -- involves even more work because we then have to create two sets of test objects: one set that tests a graph walker that uses reference equality (as the AST in the application code) and one that uses object equality (to make sure that works as well).

It is at this point that we might get swamped and end up working on framework code and unit tests that verify functionality that isn't even being used -- and certainly isn't being used by the application with the looming deadline. However, we're right there, in the code, and will never be better equipped to get this all right than we are right now. But what if we just don't have time? What if there's a release looming and we should just thank our lucky stars that we found the bug? What if there's no time to follow the process?

Well, sometimes the process has to take a back seat, but that doesn't mean we do nothing. Here are a few possibilities:

  1. Do nothing in the framework; add an issue to the issue tracker explaining the problem and the work that needs to be done so that it can be fixed at a more opportune time (or by a developer with time). This costs a few minutes of time and is the least you should do.
  2. Make the fix in the framework to prevent others from getting bitten by this relatively subtle bug and add an issue to the issue tracker describing the enhanced fix (adding a method to the graph walker) and the tests that need to be written.
  3. Add the method to the graph walker interface so that not only do others not get bitten by the bug but, should they need to control equivalence, they can do so. Add an issue describing the tests that need to be written to verify the new functionality.

What about those who quite rightly frown at the third possibility because it would provide a solution for what amounts to a potential -- as opposed to actual -- problem? It's really up to the developer here and experience really helps. How much time does it take to write the code? How much does it change the interface? How many other applications are affected? How likely is it that other implementations will need this fix? Are there potential users who won't be able to make the fix themselves? Who won't be able to recompile and just have to live with the reference-only equivalence? How likely is it that other code will break subtly if the fix is not made? It's not an easy decision either way, actually.

Though purists might be appalled at the fast and loose approach to correctness outlined above, pragmatism and deadlines play a huge role in software development. The only way to avoid missing deadlines is to have fallback plans to ensure that the code is clean as soon as possible rather than immediately as a more stringent process would demand.

And thus ends the cautionary tale of making assumptions about how objects are compared and how frameworks are made.



  1. Which we are (The Quino Metadata framework from Encodo Systems AG).

  2. This possibility actually didn't occur to me until I started writing this blog post, which just goes to show how important it is to document and continually think about the code you write/have written.

Getting Started with .NET / C# 4.0

Every once in a while, I get asked if I have any suggestions as to how to get started with .NET/C# development. Generally, the person asking has a good background in programming and design. In order to really use a language and library, though, you want to know which concepts and patterns are used in a language as well as what those patterns are called, how to use them and so on. If you don't know what's already available, you end up reinventing the wheel or doing things in an inefficient way.

Generally, you can find information about many of these topics at Microsoft's MSDN documentation or you can search for articles on working with them. Just remember to always look for existing functionality in the .NET framework or the Quino framework -- only if you have the pleasure of working with it, naturally -- or by searching for solutions online. A good place to look for such solutions is Stack Overflow.

Without further ado, here's a list of topics and links that I think would be helpful in getting started with .NET / C# 4.0.1



  1. This list of links is intended for new Encodo employees, to get them up to speed on what kind of technology we use and what kind of problems they will be expected to be able to solve. Some of the links will not necessarily be relevant to your own programming requirements (e.g. the DevExpress documentation).

  2. If you see the name Jon Skeet on a post or article, then you will be well-served. The guy is pretty consistently well-informed.

Sealed classes and methods

This article originally appeared on earthli News and has been cross-posted here.


According to the official documentation, the sealed keyword in C# serves the following dual purpose:

When applied to a class, the sealed modifier prevents other classes from inheriting from it. [...] You can also use the sealed modifier on a method or property that overrides a virtual method or property in a base class. This enables you to allow classes to derive from your class and prevent them from overriding specific virtual methods or properties.

Each inheritable class and overridable method in an API is part of the surface of that API. Functionality on the surface of the API costs money and time because it implies a promise to support that API through subsequent versions. The provider of the API more-or-less guarantees that potential modifications -- through inheritance or overriding -- will not be irrevocably broken by upgrades. At the very least, it implies that so-called breaking changes are well-documented in a release and that an upgrade path is made available.

In C#, the default setting for classes and methods is that classes are not sealed and methods are sealed (non-virtual, which amounts to the same thing). Additionally, the default visibility in C# is internal, which means that the class or method is only visible to other classes in the assembly. Thus, the default external API for an assembly is empty. The default internal API allows inheritance everywhere.

Some designers recommend the somewhat radical approach of declaring all classes sealed and leaving methods as non-virtual by default. That is, they recommend reducing the surface area of the API to only that which is made available by the implementation itself. The designer should then carefully decide which classes should be extensible -- even within the assembly, because designers have to support any API that they expose, even if it's only internal to the assembly -- and unseal them, while deciding which methods should be virtual.

From the calling side of the equation, sealed classes are a pain in the ass. The framework designer, in his ineffable wisdom, usually fails to provide an implementation that does just what the caller needs. With inheritance and virtual methods, the caller may be able to get the desired functionality without rewriting everything from scratch. If the class is sealed, the caller has no recourse but to pull out Reflector(tm) and make a copy of the code, adjusting the copy until it works as desired.

Until the next upgrade, when the original version gets a few bug fixes or changes, and the copied version begins to diverge from it. It's not so clear-cut whether to seal classes or not, but the answer is -- as with so many other things -- likely a well-thought-out balance of both approaches.

Sealing methods, on the other hand, is simply a way of reverting that method back to the default state of being non-virtual. It can be quite useful, as I discovered in a recent case, shown below.

I started with a class for which I wanted to customize the textual representation -- a common task.

class Expression
{
  public override string ToString()
  {
    // Output the expression in human-readable form
  }
}

class FancyExpression : Expression
{
  public override string ToString()
  {
    // Output the expression in human-readable form
  }
}

So far, so good; extremely straightforward. Imagine dozens of other expression types, each overriding ToString() and producing custom output.

Time passes and it turns out that the formatting for expressions should be customizable based on the situation. The most obvious solution is to declare an overloaded version of ToString() and then call the new overload from the overload inherited from the library, like this:

class Expression
{
  public override string ToString()
  {
    return ToString(ExpressionFormatOptions.Compact);
  }

  public virtual string ToString(ExpressionFormatOptions options)
  {
    // Output the expression in human-readable form
  }
}

Since the new overload is a more powerful version of the basic ToString(), we just redefine the latter in terms of the former, choosing appropriate default options. That seems simple enough, but now the API has changed, and in a seemingly unenforceable way. Enforceable, in this context, means that the API can use the semantics of the language to force callers to use it in a certain way. Using the API in non-approved ways should result in a compilation error.

This new version of the API now has two virtual methods, but the overload of ToString() without a parameter is actually completely defined in terms of the second overload. Not only is there no longer any reason to override it, but it would be wrong to do so -- because the API calls for descendants to override the more powerful overload and to be aware of and handle the new formatting options.

But, this is the second version of the API and there are already dozens of descendants that override the basic ToString() method. There might even be descendants in other application code that isn't even being compiled at this time. The simplest solution is to make the basic ToString() method non-virtual and be done with it. Descendants that overrode that method would no longer compile; maintainers could look at the new class declaration -- or the example-rich release notes! -- to figure out what changed since the last version and how best to return to a compilable state.

But ToString() comes from the object class and is part of the .NET system. This is where the sealed keyword comes in handy. Just seal the basic method to prevent overrides and the compiler will take care of the rest.

class Expression
{
  public override sealed string ToString()
  {
    return ToString(ExpressionFormatOptions.Compact);
  }

  public virtual string ToString(ExpressionFormatOptions options)
  {
    // Output the expression in human-readable form
  }
}

Even without release notes, a competent programmer should be able to figure out what to do. A final tip, though, is to add documentation so that everything's crystal clear.

class Expression
{
  /// <summary>
  /// Returns a text representation of this expression.
  /// </summary>
  /// <returns>
  /// A text representation of this expression.
  /// </returns>
  /// <remarks>
  /// This method can no longer be overridden; instead, override <see cref="ToString(ExpressionFormatOptions)"/>.
  /// </remarks>
  /// <seealso cref="ToString(ExpressionFormatOptions)"/>
  public override sealed string ToString()
  {
    return ToString(ExpressionFormatOptions.Compact);
  }

  /// <summary>
  /// Gets a text representation of this expression using the given <paramref name="options"/>.
  /// </summary>
  /// <param name="options">The options to apply.</param>
  /// <returns>
  /// A text representation of this expression using the given <paramref name="options"/>.
  /// </returns>
  public virtual string ToString(ExpressionFormatOptions options)
  {
    // Output the expression in human-readable form
  }
}

Cross MonoTouch off the list

This article originally appeared on earthli News and has been cross-posted here.


Apple presented the iPhone OS 4.0 late last week. The new version includes hundreds of new API calls for third-party developers, including long-sought-after support for multi-tasking. The changes extended to the licensing agreement for iPhone developers, with section 3.3.1 getting considerable modification, as documented in the article, Adobe man to Apple: 'Go screw yourself' by Cade Metz. That section now reads:

Applications must be originally written in Objective-C, C, C++, or JavaScript as executed by the iPhone OS WebKit engine, and only code written in C, C++, and Objective-C may compile and directly link against the Documented APIs (e.g., Applications that link to Documented APIs through an intermediary translation or compatibility layer or tool are prohibited).

That doesn't sound too good for Adobe, which had planned to allow direct compilation of iPhone applications from Flash in CS5. And it doesn't sound too good for MonoTouch either, which allows developers to write iPhone applications using the .Net framework and the C# language. The license for iPhone 3.2 prevented applications from using interpreters or virtual machines, but both CS5 and MonoTouch steered clear of those problems by compiling directly to iPhone OS machine code.

The new wording in section 3.3.1 seems to be Apple's attempt to exclude these technologies with about as much subtlety as a five-year-old making up new rules during a game he invented. The official response, MonoTouch and iPhone OS 4, is understandably upbeat: they've already invested way too much time and effort to give up now. Their optimism that "[a]pplications built with MonoTouch are native applications indistinguishable from native applications" (whatever that means) seems suspiciously desperate since MonoTouch applications are written against the .NET framework in the C# language, which means that they are most certainly not "written in C, C++, and Objective-C".

Maybe the MonoTouch project will continue to be able to build iPhone applications that have a hope of being accepted by the iPhone App Store. But the rewording of section 3.3.1 puts the power to discontinue support wholly in Apple's hands. Developers would be silly to get on board with MonoTouch now without a far more explicit show of support from Apple. MonoTouch is putting on a brave face and promises that "[s]upport for iPhoneOS 4.0 on MonoTouch will be arriving soon."

A typically well-thought-out article, Why Apple Changed Section 3.3.1 by John Gruber, details what the new wording means for Apple. And the answer, as usual, is control. It "makes complete sense" from Apple's perspective of "ruthless competitiveness". Apple is using the popularity of its platform to force developers to only spend time developing for Apple's platform instead of for multiple platforms simultaneously.

Flash CS5 and MonoTouch aren't so much cross-platform as meta-platforms. Adobe's goal isn't to help developers write iPhone apps. Adobe's goal is to encourage developers to write Flash apps that run on the iPhone (and elsewhere) instead of writing iPhone-specific apps. Apple isn't just ambivalent about Adobe's goals in this regard -- it is in Apple's direct interest to thwart them.

There are aesthetic arguments to be made that cross-platform applications sully an operating system. There are very few of them that are truly well-integrated -- and those that are take a tremendous amount of time, patience and versions to get that far. On the OS X platform especially, it's incredibly easy to spot applications that were made exclusively for OS X and those that were ported from another operating system. It's truly like night and day. Preferring native applications, however, is a good deal different than banning non-native ones. As a C# developer with a large library of code I'd like to use, I can no longer assure clients that an iPhone application is easily achievable -- not without spending a lot of time and money learning Objective-C, the XCode toolset and the Cocoa APIs. Jobs and Co. would argue that I have no business developing applications for a platform without an intimate knowledge of its APIs, but that's philosophical until they see the end-product.

Simply banning a procedure for building applications because the end-product may be unsatisfactory seems arbitrarily iron-fisted. Apple has always reserved the right to determine which Apps show up in the App Store and which do not. (As of this writing, Apple has been "evaluating" Opera Mini for the iPhone for almost 20 days.) That's why Gruber's analysis probably does get the real reason right: Apple's doing it because (A) they can and (B) they retain more control and (C) most of their users don't care one way or the other and (D) there are enough iPhone developers willing to follow Apple's rules and make mountains of money for Apple.

Backing up this impression is an actual, honest-to-God response from El Jobso, as documented in the post Steve Jobs response on Section 3.3.1 by Greg Slepak, where Jobs says that "Gruber's post is very insightful" and goes on to say that Apple prefers native applications because:

[...] intermediate layers between the platform and the developer ultimately produces sub-standard apps and hinders the progress of the platform.

As discussed above, though such layers may produce sub-standard apps -- and often do -- one does not necessarily follow from the other. That is, Jobs is merely hand-waving, arguing that a decision made for cut-throat business reasons was made in the interests of quality. There will always be developers writing bad software with Apple's tools and there would have been developers writing insanely great software using CS5 or MonoTouch.

Apple actually already had what could be considered a user-friendly and customer-oriented program in place: They were able to reject bad applications individually. Is Jobs arguing that cross-platform tools were creating so many bad applications that Apple was losing profits just from the time and effort involved in rejecting them? Or does Jobs fear the flood of Flash-to-iPhone applications descending on Cupertino with the advent of CS5?

Maybe Apple will bow to pressure and modify the section again -- it wouldn't be the first time a company tried to get away with something and had to backtrack. In the end, though, Apple can do what it wants with its platform -- and it plans to.

Building pseudo-DSLs with C# 3.5

This article originally appeared on earthli News and has been cross-posted here.


DSL is a buzzword that's been around for a while and it stands for [D]omain-[S]pecific [L]anguage. That is, some tasks or "domains" are better described with their own language rather than using the same language for everything. This gives a name to what is actually already a standard practice: every time a program assumes a particular format for an input string (e.g. CSV or configuration files), it is using a DSL. On the surface, it's extremely logical to use a syntax and semantics most appropriate to the task at hand; it would be hard to argue with that. However, that's assuming that there are no hidden downsides.

DSL Drawbacks

And the downsides are not inconsequential. As an example, let's look at the DSL "Linq", which arrived with C# 3.5. What's the problem with Linq? Well, nothing, actually, but only because a lot of work went into avoiding the drawbacks of DSLs. Linq was written by Microsoft and they shipped it at the same time as they shipped a new IDE -- Visual Studio 2008 -- which basically upgraded Visual Studio 2005 in order to support Linq. All of the tools to which .NET developers have become accustomed worked seamlessly with Linq.

However, it took a little while before JetBrains released a version of ReSharper that understood Linq...and that right there is the nub of the problem. Developer tools need to understand a DSL or you might as well just write it in Notepad. The bar for integration into an IDE is quite high: developers expect a lot these days, including:

  • The DSL must include a useful parser that pinpoints problems exactly.
  • The DSL syntax must be clear and must support everything a developer may possibly want to do with it.1
  • The DSL must support code-completion.
  • ReSharper should also work with the DSL, if possible.
  • And so on...

What sounds, on the surface, like a slam-dunk of an idea, suddenly sounds like a helluva lot more work than just defining a little language2. That's why Encodo decided early on to just use C# for everything in its Quino framework, wherever possible. The main part of a Quino application is its metadata, or the model definition. However, instead of coming up with a language for defining the metadata, Encodo lets the developer define the metadata using a .NET-API, which gives that developer the full power of code-completion, ReSharper and whatever other goodies they may have installed to help them get their jobs done.

Designing a C#-based DSL

Deciding to use C# for APIs doesn't mean, however, that your job is done quickly: you still have to design an API that not only works, but is intuitive enough to let developers use it with as little error and confusion as possible.

I recently extended the metadata-building API to support grouping other metadata into hierarchies called "layouts". Though the API is implementation-agnostic, its primary use will initially be to determine how the properties of a meta-class are laid out in a form. That is, most applications will want more control over the appearance than simply displaying the properties of a meta-class in a form from first-to-last, one to a line.

In the metadata itself, a layout is a group of other elements; an element can be a meta-property or another group. A group can have a caption. Essentially, it should look like this when displayed (groups are surrounded by []; elements with <>):

[MainTab]
-----------------------------------
|  <Company>
|  [MainFieldSet]
|  --------------------------------
|  |  <Contact>
|  |  [ <FirstName> <LastName> ]
|  |  <Picture>
|  |  <Birthdate>
|  --------------------------------
|  [ <IsEmployee> <Active> ]
-----------------------------------

From the example above, we can extract the following requirements:

  1. Groups can be nested.
  2. Groups can have captions, but a caption is not required.
  3. An element can be an anonymous group, a named group or an individual metadata element.

Design Considerations

One way of constructing this in a traditional programming language like C# is to create each group explicitly, using a constructor with or without a caption. However, I also wanted the API to read like a DSL, with as little cruft as possible; that is, I wanted to avoid redundant parameters and unnecessary constructors. I also wanted to avoid forcing the developer to provide direct references to meta-property elements where it would be more comfortable to just use the name of the property instead.

To that end, I decided to avoid making the developer create or necessarily provide the actual destination objects (i.e. the groups and elements); instead, I would build a parallel set of throwaway objects that the developer would either implicitly or explicitly create. The back-end could then use those objects to resolve references to elements and create the target object-graph with proper error-checking and so on. This approach also avoids getting the target metadata "dirty" with properties or methods that are only needed during this particular style of construction.

Defining the Goal

I started by writing some C# code that I thought was concise and that offered visual hints about what was being built. That is, I used whitespace to indicate the grouping of elements, exactly as in the diagram from the requirements above.

Here's a simple example, with very little grouping:

builder.AddLayout(
  personClass, "Basic", 
  Person.Relations.Contact,
  new LayoutGroup(Person.Fields.FirstName, Person.Fields.LastName),
  Person.Fields.Picture,
  Person.Fields.Birthdate,
  new LayoutGroup(Person.Fields.IsEmployee, Person.Fields.Active)
);

The code above creates a new "layout" for the class personClass named "Basic". That takes care of the first two parameters; the much larger final parameter is an open array of elements. These are primarily the names of properties to include from personClass (or they could also be the properties themselves). In order to indicate that two properties are on the same line, the developer must group them using a LayoutGroup object.
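Based on this usage, the builder method behind these calls presumably has a shape like the following -- a sketch inferred from the examples, not the actual Quino declaration:

// Sketch only: the open array is a params parameter of the throwaway item type,
// so property names, properties and groups all convert implicitly to LayoutItem.
public void AddLayout(IMetaClass metaClass, string name, params LayoutItem[] items)
{
  ...
}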

Here's a more complex sample, with nested groups (this one corresponds to the original requirement from above):

builder.AddLayout(
  personClass, "Details", 
  new LayoutGroup("MainTab",
    Person.Relations.Company,
    new LayoutGroup("MainFieldSet",
      Person.Relations.Contact,
      new LayoutGroup(Person.Fields.FirstName, Person.Fields.LastName),
      Person.Fields.Picture,
      Person.Fields.Birthdate
    ),
    new LayoutGroup(Person.Fields.IsEmployee, Person.Fields.Active)
  )
);

In this example, we see that the developer can also use a LayoutGroup to attach a caption to a group of other items, but that otherwise everything pretty much stays the same as in the simpler example.

Finally, a developer should also be able to refer to other layout definitions in order to avoid repeating code (adhering to the D.R.Y. principle3). Here's the previous example redefined to use a reference to a separately defined "Basic" layout:

builder.AddLayout(
  personClass, "Basic", 
  Person.Relations.Contact,
  new LayoutGroup(Person.Fields.FirstName, Person.Fields.LastName),
  Person.Fields.Picture,
  Person.Fields.Birthdate
);

builder.AddLayout(
  personClass, "Details", 
  new LayoutGroup("MainTab",
    Person.Relations.Company,
    new LayoutGroup("MainFieldSet",
      new LayoutReference("Basic")
    ),
    new LayoutGroup(Person.Fields.IsEmployee, Person.Fields.Active)
  )
);

Implementation

Now that I had an API I thought was good enough to use, I had to figure out how to get the C# compiler to not only accept it, but also to give me the opportunity to build the actual target metadata I wanted.

The trick ended up being to define a few objects for the different possibilities -- groups, elements, references, etc. -- and make them implicitly convert to a basic LayoutItem. Using implicit operators allowed me to even convert strings to meta-property references, like this:

public static implicit operator LayoutItem(string identifier)
{
  return new LayoutItem(identifier);
}
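The other throwaway types convert in the same way. For example, a conversion like the following -- an assumed sketch rather than the actual Quino code -- lets a group be passed wherever an item is expected:

// Assumed companion conversion: a group also converts implicitly to a LayoutItem,
// so named properties and nested groups can be mixed freely in one argument list.
public static implicit operator LayoutItem(LayoutGroup group)
{
  return new LayoutItem(group);
}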

Each of these items has a reference to each possible type of data and a flag to indicate which of these data are valid and can be extracted from this item. The builder receives a list of such items, each of which may have a sub-list of other items. Processing the list is now as simple as iterating them with foreach, something like this:

private void ProcessItems(IMetaGroup group, IMetaClass metaClass, LayoutItem[] items)
{
  foreach (var item in items)
  {
    if (!String.IsNullOrEmpty(item.Identifier))
    {
      // The item was created from a string: look up the named property
      var element = metaClass.Properties[item.Identifier];
      group.Elements.Add(element);
    }
    else if (item.Items != null)
    {
      // The item wraps an anonymous group: create a sub-group and recurse
      var subGroup = CreateNextSubGroup(group);
      group.Elements.Add(subGroup);
      ProcessItems(subGroup, metaClass, item.Items.Items);
    }
    else if (item.Group != null)
    {
      // The item wraps a captioned group; handled analogously
      ...
    }
    else
    {
      ...
    }
  }
}

If the item was created from a string, the builder looks up the property to which it refers in the meta-class and adds that property to the current group. If the item corresponds to an anonymous group, the builder creates a new sub-group and adds the item's children to it recursively. Here we can see how this solution spares the application developer the work of looking up each and every referenced property in application code; instead, the developer's code stays clean and short.

Naturally, my solution has many more cases but the sample above should suffice to show how the full solution works.

Cleaning it up

The story didn't end there, though, as there are limits to what C# can be forced to do. The primary problem came from distinguishing the string that is a caption from strings that are references to meta-properties. To resolve this ambiguity, I was forced to introduce a LayoutItems class for anonymous groups and to reserve LayoutGroup for groups with captions.
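A minimal sketch of that distinction might look like the following; the constructor shapes are assumptions based on the example below, not the actual Quino signatures:

// Illustrative sketch: an anonymous group is just a list of items...
public class LayoutItems
{
  public LayoutItems(params LayoutItem[] items)
  {
    Items = items;
  }

  public LayoutItem[] Items { get; private set; }
}

// ...while a captioned group pairs a caption string with such a list, so a lone
// string argument can no longer be mistaken for a property reference.
public class LayoutGroup
{
  public LayoutGroup(string caption, LayoutItems items)
  {
    Caption = caption;
    Items = items;
  }

  public string Caption { get; private set; }
  public LayoutItems Items { get; private set; }
}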

I was not able to get the implementation to support my requirements exactly as I'd designed them, but it ended up being pretty close. Below is the example corresponding to the original requirement, adjusted to accommodate the final API; the changes are the LayoutItems wrappers.

builder.AddLayout(
  personClass, "Details", 
  new LayoutGroup("MainTab", new LayoutItems(
    Person.Relations.Company,
    new LayoutGroup("MainFieldSet", new LayoutItems(
      Person.Relations.Contact,
      new LayoutItems(Person.Fields.FirstName, Person.Fields.LastName),
      Person.Fields.Picture,
      Person.Fields.Birthdate
    )),
    new LayoutItems(Person.Fields.IsEmployee, Person.Fields.Active)
  ))
);

All in all, I'm pretty happy with how things turned out: the API is clear enough that the developer should be able to both visually debug the layouts and easily adjust them to accommodate changes. For example, it's quite obvious how to add a new property to a group, move a property to another line or put several properties on the same line. Defining this pseudo-DSL in C# lets the developer use code-completion, popup documentation and the full power of ReSharper and frees me from having to either write or maintain a parser or development tools for a DSL.



  1. Even Linq has its limitations, of course, notably when using it together with Linq-to-Entities in the Entity Framework. One obvious limitation in the first version is that "Contains" or "In" are not directly supported, requiring the developer to revert to yet another DSL, ESQL (Entity-SQL).

  2. Before getting the moniker "DSL", the literature referred to such languages as "little languages".

  3. On a side note, Encodo recently looked into the Spark View Engine for .NET MVC. Though we decided not to use it because we don't really need it yet, we were also concerned that it has only nascent support for code-completion and ReSharper in its view-definition language.

Designing a small API: Bit manipulation in C#

This article originally appeared on earthli News and has been cross-posted here.


A usable API doesn't usually spring forth in its entirety on the first try. A good, usable API generally arises iteratively, improving over time. Naturally, when using words like good and usable, I'm obliged to define what exactly I mean by that. Here are the guidelines I use when designing an API, in decreasing order of importance:

Static typing & Compile-time Errors

Wherever possible, make the compiler stop the user from doing something incorrectly instead of letting the runtime handle it.

Integrates into standard practices

That is, do not invent whole new ways of doing things; instead, reuse or build on the paradigms already present in the language.

Elegance

Ideally, using the API should be intuitive, read like natural language and not involve a bunch of syntactic tricks or hard-to-remember formulations or parameter lists.

Clean Implementation

The internals should be as generalized and understandable as possible and involve as little repetition as possible.

CLS-Compliance

Cross-language compliance is also interesting and easily achieved for all but the most low-level of APIs.

Using those guidelines, I designed an API to manage bits and sets of bits in C#. Having spent a lot of time using Delphi Pascal, I'd become accustomed to set and bit operations with static typing. In C#, the .NET framework provides the generic HashSet<T> type, but that seems like overkill when the whole idea behind using bits is to use less space. That means using enumerated types and the FlagsAttribute; however, there are some drawbacks to using the native bit-operations directly in code:

  1. Bit-manipulation is more low-level than most of the rest of the coding a C#-developer typically does. That, combined with doing it only rarely, makes direct testing of bits error-prone.
  2. The syntax for testing, setting and removing bits is heavy with special symbols and duplicated identifiers.

To demonstrate, here is a sample:

[Flags]
enum TestValues
{
  None = 0,
  One = 1,
  Two = 2,
  Three = 4,
  Four = 8,
  All = 15,
}

// Set bits one and two:
var bitsOneAndTwo = TestValues.One | TestValues.Two;

// Remove bit two:
var bitOneOnly = bitsOneAndTwo & ~TestValues.Two;

// Testing for bit two:
if ((bitsOneAndTwo & TestValues.Two) == TestValues.Two)
{
  ...
}

As you can see in the example above, setting a bit is reasonably intuitive (though it's understandable to get confused about using | instead of & to combine bits). Removing a bit is more esoteric, as the combination of & with the ~ (inverse) operator is easily forgotten if not often used. Testing for a bit is quite verbose and extending to testing for one of several flags even more so.
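For example, testing whether at least one of several flags is set combines both patterns:

// Testing whether at least one of bits one or two is set:
if ((bitsOneAndTwo & (TestValues.One | TestValues.Two)) != 0)
{
  ...
}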

Version One

Therefore, to make things easier, I decided to make some extension methods for these various functions and ended up with something like the following:

public static void Include<T>(this T flags, T value) { ... }
public static void Exclude<T>(this T flags, T value) { ... }
public static bool In<T>(this T flags, T value) { ... }
public static void ForEachFlag<T>(this T flags, Action<T> action) { ... }

These definitions compiled and worked as expected, but had the following major drawbacks:

  • At the time, we were only using them with enum values, but code completion was offering the methods for all objects because there was no generic constraint on T.
  • Not only that, but much of the bit-manipulation code needed to know the base type of the arguments in order to be able to cast it to and from the correct types. There were a lot of checks, but it all happened at runtime.
  • The ForEachFlag() function took a callback (an Action<T>) when the operation is clearly an iteration. Using a callback instead of an enumerator makes it impossible to use break or continue with this method.

This version, although it worked, broke several of the rules outlined above: while it did offer compile-time checking, the implementation had a lot of repetition in it, and the iteration did not make use of the standard enumeration support in the library (IEnumerable and foreach). That the operations were available for all objects and polluted code-completion only added insult to injury.
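To illustrate the iteration problem concretely, here is a hypothetical call site for the callback-based version, using the enum from the earlier example:

// With an Action<T> callback there is no way to stop the iteration early;
// break and continue are not available inside the lambda.
bitsOneAndTwo.ForEachFlag(flag => Console.WriteLine(flag));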

Version Two

A natural solution to the namespace-pollution problem is to add a generic constraint to the methods, restricting the operations to objects of type Enum, as follows:

public static void Include<T>(this T flags, T value)
  where T : Enum
{ ... }

public static void Exclude<T>(this T flags, T value)
  where T : Enum
{ ... }

public static bool In<T>(this T flags, T value)
  where T : Enum
{ ... }

public static void ForEachFlag<T>(this T flags, Action<T> action)
  where T : Enum
{ ... }

C# enum declarations, however, cannot be constrained this way. An enum declaration names only its underlying integral type -- Int32 by default, but optionally one of a handful of other base types (e.g. byte, Int16) -- so that its values can be freely converted to and from those base values; Enum itself never appears in the declaration, and the compiler explicitly disallows it in a generic constraint. So, that's a dead-end.

The other, more obvious way of restricting the target type of an extension method is to change the type of the first parameter from T to something else. But to what? Well, it turns out that Enum is a strange type, indeed: it can't be used in a generic constraint, yet every enumerated value converts to it, so an extension method whose target is Enum magically applies to all enumerated types.

I took advantage of this loophole to build the next version of the API, as follows:

public static void Include<T>(this Enum flags, T value) { ... }
public static void Exclude<T>(this Enum flags, T value) { ... }
public static bool In<T>(this Enum flags, T value) { ... }
public static void ForEachFlag<T>(this Enum flags, Action<T> action) { ... }

This version had two advantages over the first version:

  1. The methods are only available for enumerated types instead of for all types, which cleans up the code-completion pollution.
  2. The implementation could take advantage of the Enum.GetTypeCode() method instead of the is- and as-operators to figure out the underlying type and cast the input accordingly (one possible approach is sketched below).
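For instance, the membership test might have been implemented along these lines -- a hedged sketch of the approach rather than the original code:

// Sketch only: GetTypeCode() reveals the underlying integral type, so the values
// can be converted and compared without is/as checks on the arguments.
public static bool In<T>(this Enum flags, T value)
{
  switch (flags.GetTypeCode())
  {
    case TypeCode.Int32:
    {
      var bits = Convert.ToInt32(value);
      return (Convert.ToInt32(flags) & bits) == bits;
    }
    case TypeCode.Int64:
    {
      var bits = Convert.ToInt64(value);
      return (Convert.ToInt64(flags) & bits) == bits;
    }
    // ...and so on for the remaining supported type codes.
    default:
      throw new NotSupportedException();
  }
}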

Version Three

After using this version for a little while, it became obvious that there were still problems with the implementation:

  1. Though using Enum as the target type of the extension method was a clever solution, it turns out to be a huge violation of the first design-principle outlined above: The type T for the other parameters is not guaranteed to conform to Enum. That is, the compiler cannot statically verify that the bit being checked (value) is of the same type as the bit-set (flags).
  2. The solution only works with Enum objects, whereas it would also be appropriate for Int32 and Int64 objects and so on.
  3. The ForEach method still has the same problems it had in the first version; namely, that it doesn't allow the use of break and continue and therefore violates the second design-principle above.

A little more investigation showed that the Enum.GetTypeCode() method is not unique to Enum but implements a method initially defined in the IConvertible interface. And, as luck would have it, this interface is implemented not only by the Enum class, but also by Int32, Int64 and all of the other types to which we would like to apply bit- and set-operations.

Knowing that, we can hope that the third time's a charm and redesign the API once again, as follows:

public static void Include<T>(this T flags, T value)
  where T : IConvertible
{ ... }

public static void Exclude<T>(this T flags, T value)
  where T : IConvertible
{ ... }

public static bool In<T>(this T flags, T value)
  where T : IConvertible
{ ... }

public static void ForEachFlag<T>(this T flags, Action<T> action)
  where T : IConvertible
{ ... }

Now we have methods that apply only to those types that support set- and bit-operations (more or less1). Not only that, but the value and action arguments are once again guaranteed to be statically compliant with the flags arguments.

With two of the drawbacks eliminated with one change, we converted the ForEachFlag method to return an IEnumerable<T> instead, as follows:

public static IEnumerable<T> GetEnabledFlags<T>(this T flags)
  where T : IConvertible
{ ... }

The result of this method can now be used with foreach and works with break and continue, as expected. Since the method also now applies to non-enumerated types, we had to re-implement it to return the set of possible bits for the type instead of simply iterating the possible enumerated values returned by Enum.GetValues().2
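As a usage sketch (assuming the TestValues enum from the earlier examples), the result can be consumed like any other sequence:

// break and continue now work, unlike with the Action<T>-based ForEachFlag().
foreach (var flag in (TestValues.One | TestValues.Four).GetEnabledFlags())
{
  if (flag == TestValues.Four)
  {
    break;
  }

  Console.WriteLine(flag);
}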

This version satisfies the first three design principles (statically-typed, standard practice, elegant) relatively well, but is still forced to make concessions in implementation and CLS-compliance. It turns out that the IConvertible interface is somehow not CLS-compliant, so I was forced to mark the whole class as non-compliant. On the implementation side, I was avoiding the rather clumsy is-operator by using the IConvertible.GetTypeCode() method, but still had a lot of repeated code, as shown below in a sample from the implementation of Is:

switch (flags.GetTypeCode())
{
  case TypeCode.Byte:
    return (byte)(object)flags == (byte)(object)value;
  case TypeCode.Int32:
    return (int)(object)flags == (int)(object)value;
  ...
}

Unfortunately, bit-testing is so low-level that there is no (obvious) way to refine this implementation further. In order to compare the two convertible values, the compiler must be told the exact base type to use, which requires an explicit cast for each supported type, as shown above. Luckily, this limitation is in the implementation, which affects the maintainer and not the user of the API.

Since implementing the third version of these "BitTools", I've added support for Is (shown partially above), Has and HasOneOf, and it looks like the third time might indeed be the charm, as the saying goes.
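A Has() operation, for example, can reuse the same cast-per-type-code pattern as Is() but test bits instead of comparing for equality; the following case is a sketch, not necessarily the actual BitTools code:

// Has(): are all bits of value set in flags?
case TypeCode.Int32:
  return ((int)(object)flags & (int)(object)value) == (int)(object)value;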




  1. The IConvertible interface is actually implemented by other types, to which our bit-operations don't apply at all, like double, bool and so on. The .NET library doesn't provide a more specific interface -- like "INumeric" or "IIntegralType" -- so we're stuck constraining to IConvertible instead.

  2. Which, coincidentally, fixed a bug in the first and second versions that had returned all detected enumerated values -- including combinations -- instead of individual bits. For example, given the type shown below, we only ever expect values One and Two, and never None, OneOrTwo or All:

     [Flags]
     enum TestValues
     {
       None = 0,
       One = 1,
       Two = 2,
       OneOrTwo = 3,
       All = 3,
     }

     That is, TestValues.Two.GetEnabledFlags() should yield only Two and TestValues.All.GetEnabledFlags() should yield One and Two.