Profiling: that critical 3% (Part II)

In part I of this series, we discussed some core concepts of profiling. We discussed not only the problem at hand, but also how to think about fixing performance problems -- and about reducing the likelihood that they get out of hand in the first place.

In this second part, we'll go into detail and try to fix the problem.

Reevaluating the Requirements

Since we have new requirements for an existing component, it's time to reconsider the requirements for all stakeholders. In terms of requirements, the IScope can be described as follows:

  1. Hold a list of objects in LIFO order
  2. Hold a list of key/value pairs with a unique name as the key
  3. Return the value/reference for a key
  4. Return the most appropriate reference for a given requested type. The most appropriate object is the one that was added with exactly the requested type. If no such object was added, then the first object that conforms to the requested type is returned
  5. These two piles of objects are entirely separate: if an object is added by name, we do not expect it to be returned when a request for an object of a certain type is made

There is more detail, but that should give you enough information to understand the code examples that follow.
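
To make those examples easier to follow, here's a hypothetical sketch of the contract. This is not the actual Quino interface, which has more members, but the names AddUnnamed() and GetInstances() are the ones you'll see in the code below:

public interface IScope
{
  // Requirement 1: hold unnamed objects in LIFO order. Add() also indexes
  // the object by type to support requirement 4; AddUnnamed() just appends.
  void Add(object value);
  void AddUnnamed(object value);

  // Requirement 2: hold key/value pairs with a unique name as the key
  void Add(string key, object value);

  // Requirement 3: return the value/reference for a key (null if not found)
  object this[string key] { get; }

  // Requirement 4: the exact-type match first, then conforming types
  IEnumerable<TService> GetInstances<TService>();
}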

Usage Patterns

There are many ways of implementing the functional requirements listed above. While you can implement the feature from the requirements alone, it's very helpful to know the usage patterns when trying to optimize code.

Therefore, we'd like to know exactly what kind of contract our code has to implement -- and to not implement any more than was promised.

Sometimes a hopeless optimization task gets a lot easier when you realize that you only have to optimize for a very specific situation. In that case, you can leave the majority of the code alone and optimize a single path through the code to speed up 95% of the calls. All other calls, while perhaps a bit slow, will at least still yield the correct results.

And "optimized" doesn't necessarily mean that you have to throw all of your language's higher-level constructs out the window. Once your profiling tool tells you that a particular bit of code has introduced a bottleneck, it often suffices to just examine that particular bit of code more closely. Just picking the low-hanging fruit will usually be more than enough to fix the bottleneck.1

Create scopes faster2

I saw in the profiler that creating the ExpressionContext had gotten considerably slower. Here's the code in the constructor.

foreach (var value in values.Where(v => v != null))
{
  Add(value);
}

I saw a few potential problems immediately.

  • The call to Add() had gotten more expensive in order to support returning the most appropriate object from the GetInstances() method
  • The Linq code had replaced a call to AddRange()

The faster version is below:

var scope = CurrentScope;
for (var i = 0; i < values.Length; i++)
{
  var value = values[i];
  if (value != null)
  {
    scope.AddUnnamed(value);
  }
}

Why is this version faster? The code now exploits the fact that we're dealing with an indexable list: it avoids allocating an enumerator and checks for null without allocating anything. While the Linq code is highly optimized, the plain for loop is faster here because it's guaranteed not to allocate. Furthermore, we now call AddUnnamed() to use the faster registration method, because the more involved method is never needed for these objects.

The optimized version is less elegant and harder to read, but it's not terrible. Still, you should use these techniques only if you can prove that they're worth it.

Optimizing CurrentScope

Another minor improvement is that the call to retrieve the scope is made only once regardless of how many objects are added. On the one hand, we might expect only a minor improvement since, as we'll see below, most use cases only ever add one object anyway. On the other, however, we know that we call the constructor 20 million times in at least one test, so it's worth examining.

The call to CurrentScope gets the last element of the list of scopes. Even something as innocuous as calling the Linq extension method Last() can get more costly than it needs to be when your application calls it millions of times. Microsoft has decorated its Linq calls with all sorts of compiler hints for inlining and, if you decompile, you can see that Last() checks whether the target of the call is a list and uses indexing if it can -- but it's still slower. There is still an extra stack frame (unless inlined) and there is still a type-check with as.

Replacing a call to Last() with getting the item at the index of the last position in the list is not recommended in the general case. However, making that change in a provably performance-critical area shaved a percent or two off a test run that takes about 45 minutes. That's not nothing.

// Before
protected IScope CurrentScope
{
  get { return _scopes.Last(); }
}

// After
protected IScope CurrentScope
{
  get { return _scopes[_scopes.Count - 1]; }
}

That takes care of the creation & registration side, where I noticed a slowdown when creating the millions of ExpressionContext objects needed by the data driver in our product's test suite.

Get objects faster

Let's now look at the evaluation side, where objects are requested from the context.

The offending, slow code is below:

public IEnumerable<TService> GetInstances<TService>()
{
  var serviceType = typeof(TService);
  var rawNameMatch = this[serviceType.FullName];

  var memberMatches = All.OfType<TService>();
  var namedMemberMatches = NamedMembers.Select(
    item => item.Value
  ).OfType<TService>();

  if (rawNameMatch != null)
  {
    var nameMatch = (TService)rawNameMatch;

    return
      nameMatch
      .ToSequence()
      .Union(namedMemberMatches)
      .Union(memberMatches)
      .Distinct(ReferenceEqualityComparer<TService>.Default);
  }

  return namedMemberMatches.Union(memberMatches);
}

As you can readily see, this code isn't particularly concerned with performance. It is, however, relatively easy to read, and the logic behind which objects are returned is easy to follow. As long as no-one really needs this code to be fast -- if it's not used that often and not used in tight loops -- it doesn't matter. What matters more is legibility and maintainability.

But we now know that we need to make it faster, so let's focus on the most-likely use cases. I know the following things:

  • Almost all Scope instances are created with a single object in them and no other objects are ever added.
  • Almost all object-retrievals are made on such single-object scopes
  • Though the scope should be able to return all matching instances, sorted by the rules laid out in the requirements, all existing calls get the FirstOrDefault() object.

These extra bits of information will allow me to optimize the already-correct implementation to be much, much faster for the calls that we're likely to make.

The optimized version is below:

public IEnumerable<TService> GetInstances<TService>()
{
  var members = _members;

  if (members == null)
  {
    yield break;
  }

  if (members.Count == 1)
  {
    if (members[0] is TService)
    {
      yield return (TService)members[0];
    }

    yield break;
  }

  object exactTypeMatch;
  if (TypedMembers.TryGetValue(typeof(TService), out exactTypeMatch))
  {
    yield return (TService)exactTypeMatch;
  }

  foreach (var member in members.OfType<TService>())
  {
    if (!ReferenceEquals(member, exactTypeMatch))
    {
      yield return member;
    }
  }
}

Given the requirements, the handful of use cases and decent naming, you should be able to follow what's going on above. The code contains many more escape clauses for common and easily handled conditions, handling them in an allocation-free manner wherever possible. And because the method is implemented with yield return, a caller that only wants the FirstOrDefault() object never pays for computing the rest of the matches.

  1. Handle empty case
  2. Handle single-element case
  3. Return exact match
  4. Return all other matches3

You'll notice that returning a value added by-name is not a requirement and has been dropped. Improving performance by removing code for unneeded requirements is a perfectly legitimate solution.

Test Results

And, finally, how did we do? I created tests for the following use cases:

  • Create scope with multiple objects
  • Get all matching objects in an empty scope
  • Get first object in an empty scope
  • Get all matching objects in a scope with a single object
  • Get first object in a scope with a single object
  • Get all matching objects in a scope with multiple objects
  • Get first object in a scope with multiple objects

Here are the numbers from the automated tests.


  • Create scope with multiple objects -- 12x faster
  • Get all matching objects in an empty scope -- almost 2.5x faster
  • Get first object in an empty scope -- almost 3.5x faster
  • Get all matching objects in a scope with a single object -- over 3x faster
  • Get first object in a scope with a single object -- over 3.25x faster
  • Get all matching objects in a scope with multiple objects -- almost 3x faster
  • Get first object in a scope with multiple objects -- almost 2.25x faster

This looks amazing but remember: while the optimized solution may be faster than the original, all we really know is that we've just managed to claw our way back from the atrocious performance characteristics introduced by a recent change. We expect to see vast improvements versus a really slow version.

Since I know that these calls showed up as hotspots and were made millions of times in the test, the performance improvement shown by these tests is enough for me to deploy a pre-release of Quino via TeamCity, upgrade my product to that version and run the tests again. Wish me luck!4



  1. The best approach at this point is to create issues for the other performance investigations you could make. For example, I opened an issue called Optimize allocations in the data handlers (start with IExpressionContexts), documented everything I had analyzed and quickly got back to the issue on which I'd started.

  2. For those with access to the Quino Git repository, the diffs shown below come from commit a825d5030ce6f65a452e1db85a308e1351288b96.

  3. If you're following along very, very carefully, you'll recall at this point that the requirement stated above is that objects are returned in LIFO order. The faster version of the code returns objects in FIFO order. You can't tell from the code shown here, but the original, slow version did guarantee LIFO ordering -- and only because the call to get All members contained a hidden call to the Linq method Reverse(), which slowed things down even more! I removed the call to reverse all elements because (A) I don't actually have any tests for the LIFO requirement and (B) no other code expects it to happen. I wasn't about to make the code even more complicated and possibly slower just to satisfy a purely theoretical requirement. That's the kind of behavior that got me into this predicament in the first place.

  4. Spoiler alert: it worked. ;-) The fixes cut the testing time from about 1:30 to about 1:10 (hh:mm) for all tests on the build server, so we won back the lost 25%.

Profiling: that critical 3% (Part I)

An oft-quoted bit of software-development sagacity is

Premature optimization is the root of all evil.

As is so often the case with quotes -- especially those on the Internet1 -- this one has a slightly different meaning in context. The snippet above invites developers to overlook the word "premature" and interpret the received wisdom as "you don't ever need to optimize."

Instead, Knuth's full quote actually tells you how much of your code is likely to be affected by performance issues that matter.

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

An Optimization Opportunity in Quino2

In other articles, I'd mentioned that we'd upgraded several solutions to Quino 2 in order to test that the API was solid enough for a more general release. One of these products is both quite large and has a test suite of almost 1500 tests. The product involves a lot of data-import and manipulation and the tests include several scenarios where Quino is used very intensively to load, process and save data.

These tests used to run in a certain amount of time, but started taking about 25% longer after the upgrade to Quino 2.

Measuring Execution Speed

Before doing anything else -- making educated guesses as to what the problem could be, for example -- we measure. At Encodo, we use JetBrains DotTrace to collect performance profiles.

There is no hidden secret: the standard procedure is to take a measurement before and after the change and to compare them. However, so much had changed from Quino 1.13 to Quino 2 -- e.g. namespaces and type names had changed -- that while DotTrace was able to show some matches, the comparisons were not as useful as usual.

A comparison between codebases that hadn't changed so much is much easier, but I didn't have that luxury.

Tracking the Problem

Even excluding the less-than-optimal comparison, it was an odd profile. Ordinarily, one or two issues stick out right away, but the slowness seemed to suffuse the entire test run. Since the direct profiling comparison was difficult, I downloaded test-speed measurements as CSV from TeamCity for the product where we noticed the issue.

How much slower, you might ask? The test that I looked at most closely took almost 4 minutes (236,187ms) in the stable version, but 5:41 (about 44% slower) in the latest build.


This test was definitely one of the largest and longest tests, so it was particularly impacted. Most other tests that imported and manipulated data ranged anywhere from 10% to 30% slower.

When I looked for hot-spots, the profile unsurprisingly showed me that database access took up the most time. The issue was more subtle: while database-access still used the most time, it was using a smaller percentage of the total time. Hot-spot analysis wasn't going to help this time. Sorting by absolute times and using call counts in the tracing profiles yielded better clues.

The tests were slower when saving and also when loading data. But I knew that the ORM code itself had barely changed at all. And, since the product was using Quino so heavily, the stack traces ran quite deep. After a lot of digging, I noticed that creating the ExpressionContext to hold an object while evaluating expressions locally seemed to be taking longer than before. This was my first, real clue.

Once I was on the trail, I found that when evaluating calls (getting objects) that used local evaluation, it was also always slower.

Don't Get Distracted

Once you start looking for places where performance is not optimal, you're likely to start seeing them everywhere. However, as noted above, 97% of them are harmless.

To be clear, we're not optimizing because we feel that the framework is too slow but because we've determined that the framework is now slower than it used to be and we don't know why.

Even after we've finished restoring the previous performance (or maybe even making it a little better), we might still be able to easily optimize further, based on other information that we gleaned during our investigation.

But we want to make sure that we don't get distracted and start trying to FIX ALL THE THINGS instead of just focusing on one task at a time. While it's somewhat disturbing that we seem to be creating 20 million ExpressionContext objects in a 4-minute test, that is also how we've always done it, and no-one had complained about the speed until now.

Sure, if we could reduce that number to only 2 million, we might be even faster, but the point is that we used to be faster on the exact same number of calls -- so fix that first.

A Likely Culprit: Scope

I found a likely candidate in the Scope class, which implements the IScope interface. This type is used throughout Quino, but the two use-cases that affect performance are:

  1. As a base for the ExpressionContext, which holds the named values and objects to be used when evaluating the value of an IExpression. These expressions are used everywhere in the data driver.
  2. As a base for the poor-man's IOC used in Stage 2 of application execution.3

The former usage has existed unchanged for years; its implementation is unlikely to be the cause of the slowdown. The latter usage is new and I recall having made a change to the semantics of which objects are returned by the Scope in order to make it work there as well.

How could this happen?

You may already be thinking: smooth move, moron. You changed the behavior of a class that is used everywhere for a tacked-on use case. That's definitely a valid accusation to make.

In my defense, my instinct is to reuse code wherever possible. If I already have a class that holds a list of objects and gives me back the object that matches a requested type, then I will use that. If I discover that the object that I get back isn't as predictable as I'd like, then I improve the predictability of the API until I've got what I want. If the improvement comes at no extra cost, then it's a win-win situation. However, this time I paid for the extra functionality with degraded performance.

Where I really went wrong was that I'd made two assumptions:

  1. I assumed that all other usages were also interested in improved predictability.
  2. I assumed that all other usages were not performance-critical. When I wrote the code you'll see below, I distinctly remember thinking: it's not fast, but it'll do and I'll make it faster if it becomes a problem. Little did I know how difficult it would be to find the problem.

Preventing future slippage

Avoid changing a type shared by different systems without considering all stakeholder requirements.

I think a few words on process here are important. Can we improve the development process so that this doesn't happen again? One obvious answer would be to avoid changing a type shared by different systems without considering all stakeholder requirements. That's a pretty tall order, though. Including this in the process will most likely lead to less refactoring and improvement out of fear of breaking something.

We discussed above how completely reasonable assumptions and design decisions led to the performance degradation. So we can't be sure it won't happen again. What we would like, though, is to be notified quickly when there is performance degradation, so that it appears as a test failure.

Notify quickly when there is performance degradation

Our requirements are captured by tests. If all of the tests pass, then the requirements are satisfied. Performance is a non-functional requirement. Where we could improve Quino is to include high-level performance tests that would sound the alarm the next time something like this happens.4
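
Such a test might look like the sketch below -- the fixture and the budget are invented for illustration; the idea is simply to time a hot operation against a budget calibrated on a known-good build:

using System.Diagnostics;
using NUnit.Framework;

[TestFixture]
public class ExpressionContextPerformanceTests
{
  [Test]
  public void CreateManyExpressionContextsWithinBudget()
  {
    var stopwatch = Stopwatch.StartNew();

    for (var i = 0; i < 1000000; i++)
    {
      // The hot path from the profile: create a context holding one object
      new ExpressionContext(new object());
    }

    stopwatch.Stop();

    // An invented budget, calibrated against a known-good build; a regression
    // like the one described in this article would trip this assertion
    Assert.That(stopwatch.ElapsedMilliseconds, Is.LessThan(2000));
  }
}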

Enough theory: in part II, we'll describe the problem in detail and take a crack at improving the speed. See you there.



  1. In fairness, the quote is at least properly attributed. It really was Donald Knuth who wrote it.

  2. By "opportunity", of course, I mean that I messed something up that made Quino slower in the new version.

  3. See the article Quino 2: Starting up an application, in detail for more information on this usage.

  4. I'm working on this right now, in issue Add standard performance tests for release 2.1.

v2.1: API-smoothing and performance

The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.

Highlights

Quino 2 is finally ready and will go out the door with a 2.1 rather than a 2.0 version number. The reason is that we released 2.0 internally and tested the hell out of it; 2.1 is the result of that testing. It includes a lot of bug fixes as well as API tweaks to make things easier for developers.

On top of that, I've gone through the backlog and found many issues that had either been fixed already, were obsolete or had been inadequately specified. The Quino backlog dropped from 682 to 542 issues.

Breaking changes

The following changes are marked with Obsolete attributes, so you'll get a hint as to how to fix the problem. Since these are changes from an unreleased version of Quino, they cause a compile error.

  • UseMetaSchemaWinformDxFeedback() has been renamed to UseMetaschemaWinformDx()
  • UseSchemaMigrationSupport() has been renamed to UseIntegratedSchemaMigration()
  • MetaHttpApplicationBase.MetaApplication has been renamed to BaseApplication
  • The IServer.Run() extension method is no longer supported.
  • GetStandardFilters(), GetStandardFiltersForFormsAuthentication() and GetStandardFiltersForUnrestrictedAuthentication() are no longer supported. Instead, you should register filters in the IOC and use IWebFilterAttributeFactory.CreateFilters() to get the list of supported filters
  • The ToolRequirementAttribute is no longer supported or used.
  • AssemblyExtensions.GetLoadableTypesWithInterface() is no longer supported
  • AssemblyTools.GetValidAssembly() has been replaced with AssemblyTools.GetApplicationAssembly(); GetExecutableName() and GetExecutablePath() have been removed.
  • All of the constant expressions on the MetaBuilderBase (e.g. EndOfTimeExpression) are obsolete; use MetaBuilderBase.ExpressionFactory.Constants.EndOfTime instead (see the example after this list).
  • All of the global values on MetaObjectDescriptionExtensions are obsolete; instead, use the IMetaObjectFormatterSettings from the IOC to change settings on startup.
  • Similarly, the set of extension methods that included GetShortDescription() has been moved to the IMetaObjectFormatter. Obtain an instance from the IOC, as usual.
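
For example, the constant-expression change looks like this in practice (the property paths come from the list above; the variable is invented):

// Before (now obsolete)
var expression = Builder.EndOfTimeExpression;

// After
expression = Builder.ExpressionFactory.Constants.EndOfTime;
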
v2.0: Logging, Dependencies, New Assemblies & Nuget

The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.

Highlights

In the beta1 and beta2 release notes, we read about changes to configuration, dependency reduction, the data driver architecture, DDL commands, security and access control in web applications and a new code-generation format.

In 2.0 final -- which was actually released internally on November 13th, 2015 (a Friday) -- we made the following additional improvements:

These notes are being published for completeness and documentation. The first publicly available release of Quino 2.x will be 2.1 or higher (release notes coming soon).

Breaking changes

As we've mentioned before, this release is absolutely merciless in regard to backwards compatibility. Old code is not retained as Obsolete. Instead, a project upgrading to 2.0 will encounter compile errors.

The following notes serve as an incomplete guide that will help you upgrade a Quino-based product.

As I wrote in the release notes for beta1 and beta2, if you arm yourself with a bit of time, ReSharper and the release notes (and possibly keep an Encodo employee on speed-dial), the upgrade is not difficult. It consists mainly of letting ReSharper update namespace references for you.

Global Search/Replace

Instead of going through the errors (example shown to the right) one by one, you can take care of a lot of errors with the following search/replace pairs.

  • Encodo.Quino.Data.Persistence => Encodo.Quino.Data
  • IMetaApplication => IApplication
  • ICoreApplication => IApplication
  • GetServiceLocator() => GetServices()
  • MetaMethodTools.GetInstance => DataMetaMethodExtensions.GetInstance
  • application.ServiceLocator.GetInstance => application.GetInstance
  • Application.ServiceLocator.GetInstance => Application.GetInstance
  • application.ServiceLocator => application.GetServices()
  • Application.ServiceLocator => Application.GetServices()
  • application.Recorder => application.GetLogger()
  • Application.Recorder => Application.GetLogger()
  • session.GetRecorder() => session.GetLogger()
  • Session.GetRecorder() => Session.GetLogger()
  • Session.Application.Recorder => Session.GetLogger()
  • FileTools.Canonicalize() => PathTools.Normalize()
  • application.Messages => application.GetMessageList()
  • Application.Messages => Application.GetMessageList()
  • ServiceLocator.GetInstance => Application.GetInstance
  • MetaLayoutTools => LayoutConstants
  • GlobalContext.Instance.Application.Configuration.Model => GlobalContext.Instance.Application.GetModel()
  • IMessageRecorder => ILogger
  • GetUseReleaseSettings() => IsInReleaseMode()
  • ReportToolsDX => ReportDxExtensions

Although you can't just search/replace everything, it gets you a long way.

Model-Building Fixes

These replacement pairs, while not recommended for global search/replace, are a handy guide for how the API has generally changed.

  • *Generator => *Builder
  • SetUpForModule => CreateModule
  • Builder.SetElementVisibility(prop, true) => prop.Show()
  • Builder.SetElementVisibility(prop, false) => prop.Hide()
  • Builder.SetElementControlIdentifier(prop, ControlIdentifiers...) => prop.SetInputControl(ControlIdentifiers...)
  • Builder.SetPropertyHeightInPixels(prop, 200); => prop.SetHeightInPixels(200);

Constructing a module has also changed. Instead of using the following syntax,

var module = Builder.SetUpForModule<AuditModule>(Name, "ApexClearing.Alps.Core", Name, true);

replace it with the following direct replacement,

var module = Builder.CreateModule(Name, "ApexClearing.Alps.Core", Name);

Or use this replacement, with the recommended style for the v2 format (no more class prefix for generated classes and a standard namespace):

var module = Builder.CreateModule(Name, typeof(AuditModuleBuilder).GetParentNamespace());

Standard Modules (e.g. Reporting, Security, etc.)

Because of how the module class-names have changed, the standard module ORM classes all have different names. The formula is that the ORM class-name no longer has its module name prepended.

  • ReportsReportDefinition => ReportDefinition
  • SecurityUser => User
  • And so on...

Furthermore, all modules have been converted to use the v2 code-generation format, which has the metadata separate from the ORM object. Therefore, instead of referencing metadata using the ORM class-name as the base, you use the module name as the base.

  • ReportReportDefinition.Fields.Name => ReportModule.ReportDefinition.Name.Identifier
  • ReportReportDefinition.MetaProperties.Name => ReportModule.ReportDefinition.Name
  • ReportReportDefinition.Metadata => ReportModule.ReportDefinition.Metadata
  • And so on...

There's an upcoming article that will show more examples of the improved flexibility and capabilities that come with the v2-metadata.

Action names

The standard action names have moved as well.

  • ActionNames => ApplicationActionNames
  • MetaActionNames => MetaApplicationActionNames

Any other, more rarely used action names have been moved back to the actions themselves, so for example

SaveApplicationSettingsAction.ActionName

If you created any actions of your own, then the API there has changed as well. As previously documented in API Design: To Generic or not Generic? (Part II), instead of overriding the following method,

protected override int DoExecute(IApplication application, ConfigurationOptions options, int currentResult)
{
  return base.DoExecute(application, options, currentResult);
}

you instead override in the following way,

public override void Execute()
{
  base.Execute();
}

Using NuGet

If you're already using Visual Studio 2015, then the NuGet UI is a good choice for managing packages. If you're still on Visual Studio 2013, then the UI there is pretty flaky and we recommend using the console.

The examples below assume that you have configured a source called "Local Quino" (e.g. a local folder that holds the nupkg files for Quino).

install-package Quino.Data.PostgreSql.Testing -ProjectName Punchclock.Core.Tests -Source "Local Quino"
install-package Quino.Server -ProjectName Punchclock.Server -Source "Local Quino"
install-package Quino.Console -ProjectName Punchclock.Server -Source "Local Quino"
install-package Quino.Web -ProjectName Punchclock.Web.API -Source "Local Quino"

Debugging Support

We recommend using Visual Studio 2015 if at all possible. Visual Studio 2013 is also supported, but we have all migrated to 2015 and our knowhow about 2013 and its debugging idiosyncrasies will deteriorate with time.

These are just brief points of interest to get you set up. As with the NuGet support, these instructions are subject to change as we gain more experience with debugging with packages as well.

  • Hook up to a working symbol-source server (e.g. TeamCity)
  • Get the local sources for your version
  • If you don't have a source server or it's flaky, then get the PDBs for the Quino version you're using (provided in Quino.zip as part of the package release)
  • Add the path to the PDBs to your list of symbol sources in the VS debugging options
  • Tell Visual Studio where the sources are when it asks during debugging
  • Tell R# how to map from the source folder (c:\BuildAgent\work\9a1bb0adebb73b1f for Quino 2.0.0-1765) to the location of your sources

Quino packages are no different than any other NuGet packages. We provide both standard packages as well as packages with symbols and sources. Any complications you encounter with them are due to the whole NuGet experience still being a bit in-flux in the .NET world.

An upcoming post will provide more detail and examples.

Creating Nuget Packages

We generally use our continuous integration server to create packages, but you can also create packages locally (it's up to you to make sure the version number makes sense, so be careful). These instructions are approximate and are subject to change. I provide them here to give you an idea of how packages are created. If they don't work, please contact Encodo for help.

  • Open PowerShell
  • Change to the %QUINO_ROOT%\src directory
  • Run nant build pack to build Quino and packages
  • Set up a local NuGet source named "Local Quino" that points to %QUINO_ROOT%\nuget (one-time only; see the example after this list)
  • Change to the directory where your Quino packages are installed for your solution.
  • Delete all of the Encodo/Quino packages
  • Execute nant nuget from your project directory to get the latest Quino build from your local folder
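
For the one-time source setup mentioned above, nuget.exe can register the folder from the command line (adjust the path to your environment):

nuget sources add -Name "Local Quino" -Source %QUINO_ROOT%\nuget
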
Limited drive-space chronicles #2: Why is Visual Studio installed on my machine?

If you're like us at Encodo, you moved to SSDs years ago...and never looked back. However, SSDs are generally smaller because the price (still) ramps up quickly as you increase size. We've almost standardized on 512GB, but some of us still have 256GB drives.

Unfortunately, knowing that we all have giant hard drives started a trend among manufacturers to just install everything, just in case you might need it. This practice didn't really cause problems when we were still using HDs, which had by then grown to terabyte sizes. But now we are, once again, more sensitive to unnecessary installations.

If you're a Windows .NET developer, you'll feel the pinch more quickly as you've got a relatively heavyweight Visual Studio installation (or three...) as well as Windows 8.1 itself, which weighs in at about 60GB after all service packs have been installed.

Once you throw some customer data and projects and test databases on your drive, you might find that you need, once again, to free up some space on your drive.

I wrote a similar post last year and those tips & tricks still apply as well.

System Cleanup is back

One additional tip I have is to use Win + S to search for "Free up disk space by deleting unnecessary files"1 and run that application in "clean up system files" mode: the latest version will throw out as much Windows Update detritus as it can, which can clean up gigabytes of space.


Remove Old Visual Studios

The other measure you can take is to remove programs that you don't use anymore: for .NET developers that means you should finally toss out Visual Studio 2010 -- and possibly even 2013, if you've made the move to the new and improved 2015 already.2 Removing these versions also has the added benefit that extensions and add-ons will no longer try to install themselves into these older Visual Studios anymore.

However, even if you do remove VS2010, for example, you might find that it just magically reappears again. Now, I'm not surprised when I see older runtimes and redistributables in my list of installed programs -- it makes sense to keep these for applications that rely on them -- but when I see the entire VS2010 SP1 has magically reappeared, I'm confused.


Imagine my surprise when I installed SQL Server Management Studio 2016 -- the November 2015 Preview -- and saw the following installation item:

image

Naturally, if you do remove this item again, then SQL Server Management Studio will no longer run (no surprise there, now that we know that it installed it). However, if you're just doing cleanup and don't know about this dependency3, you might accidentally break tools. So be careful: if you're too aggressive, you'll end up having to re-install some stuff.4



  1. The reason I write that "it's back" is that for a couple of versions of Windows, Microsoft made it an optional download/feature instead of installing it by default.

  2. Be careful about removing Visual Studio 2013 if you have web projects that still rely on targets installed with VS2013 but not included in VS2015. I uninstalled 2013 on my laptop and noticed a warning about an MS target that the compiler could no longer find.

  3. The fact that Windows still can't tell you about dependencies is a story for another day. We should have had a package manager on Windows years ago. And, no, while Choco is a lovely addition, it's not quite the full-fledged package manager that aptitude is on Ubuntu.

  4. Speaking from experience. Could you tell?

Improving NUnit integration with testing harnesses

These days nobody who's anybody in the software-development world is writing software without tests. Just writing them doesn't help make the software better, though. You also need to be able to execute tests -- reliably and quickly and repeatably.

To do that, you'll have to get yourself a test runner, which is a different tool from the compiler or the runtime. That is, just because your tests compile (satisfy all of the language rules) and could be executed doesn't mean that you're done writing them yet.

Testing framework requirements

Every testing framework has its own rules for how the test runner selects methods for execution as tests. The standard configuration options are:

  • Which classes should be considered as test fixtures?
  • Which methods are considered tests?
  • Where do parameters for these methods come from?
  • Is there startup/teardown code to execute for the test or fixture?

Each testing framework will offer different ways of configuring your code so that the test runner can find and execute setup/test/teardown code. To write NUnit tests, you decorate classes, methods and parameters with C# attributes.

The standard scenario is relatively easy to execute -- run all methods with a Test attribute in a class with a TestFixture attribute on it.
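
In NUnit, that standard scenario looks like the following minimal example:

using NUnit.Framework;

[TestFixture]
public class CalculatorTests
{
  [SetUp]
  public void SetUp()
  {
    // Startup code, executed before each test in the fixture
  }

  [Test]
  public void TestAddition()
  {
    Assert.AreEqual(4, 2 + 2);
  }
}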

Test-runner Requirements

There are legitimate questions for which even the best specification does not provide answers.

When you consider multiple base classes and generic type arguments, each of which may also have NUnit attributes, things get a bit less clear. In that case, not only do you have to know what NUnit offers as possibilities but also whether the test runner that you're using also understands and implements the NUnit specification in the same way. Not only that, but there are legitimate questions for which even the best specification does not provide answers.

At Encodo, we use Visual Studio 2015 with ReSharper 9.2 and we use the ReSharper test runner. We're still looking into using the built-in VS test runner -- the continuous-testing integration in the editor is intriguing1 -- but it's quite weak when compared to the ReSharper one.

So, not only do we have to consider what the NUnit documentation says is possible, but we must also know how the R# test runner interprets the NUnit attributes and what is supported.

Getting More Complicated

Where is there room for misunderstanding? A few examples,

  • What if there's a TestFixture attribute on an abstract class?
  • How about a TestFixture attribute on a class with generic parameters?
  • Ok, how about a non-abstract class with Tests but no TestFixture attribute?
  • And, finally, a non-abstract class with Tests but no TestFixture attribute, but there are non-abstract descendants that do have a TestFixture attribute?

In our case, the answer to these questions depends on which version of R# you're using. Even though it feels like you configured everything correctly and it logically should work, the test runner sometimes disagrees.

  • Sometimes it shows your tests as expected, but refuses to run them (Inconclusive FTW!)
  • Or other times, it obstinately includes generic base classes that cannot be instantiated into the session, then complains that you didn't execute them. When you try to delete them, it brings them right back on the next build. When you try to run them -- perhaps not noticing that it's those damned base classes -- then it complains that it can't instantiate them. Look of disapproval.

Throw the TeamCity test runner into the mix -- which is ostensibly the same as that from R# but still subtly different -- and you'll have even more fun.

Improving Integration with the R# Test Runner

At any rate, now that you know the general issue, I'd like to share the ground rules we've come up with to avoid all of the issues described above. The text below comes from the issue I created for the impending release of Quino 2.

Environment

  • Windows 8.1 Enterprise
  • Visual Studio 2015
  • ReSharper 9.2

Expected behavior

Non-leaf-node base classes should never appear as nodes in test runners. A user should be able to run tests in descendants directly from a fixture or test in the base class.

Observed behavior

Non-leaf-node base classes are shown in the R# test runner in both versions 9 and 10. A user must navigate to the descendant to run a test. The user can no longer run all descendants or a single descendant directly from the test.

Analysis

Relatively recently, in order to better test a misbehaving test runner and accurately report issues to JetBrains, I standardized all tests to the same pattern:

  • Do not use abstract anywhere (the base classes don't technically need it)
  • Use the TestFixture attribute only on leaf nodes

This worked just fine with ReSharper 8.x but causes strange behavior in both R# 9.x and 10.x. We discovered recently that not only did the test runner act strangely (something that they might fix), but also that the unit-testing integration in the files themselves behaved differently when the base class is abstract (something JetBrains is unlikely to fix).

You can see that R# treats a non-abstract class with tests as a testable entity, even when it doesn't actually have a TestFixture attribute and even when it requires a generic type parameter in order to be instantiated.

Here it's not working well in either the source file or the test runner. In the source file, you can see that it offers to run tests in a category, but not the tests from actual descendants. If you try to run or debug anything from this menu, it shows the fixture with a question-mark icon and marks any tests it manages to display as inconclusive. This is not surprising, since the test fixture may not be abstract, but does require a type parameter in order to be instantiated.

image

Here it looks and acts correctly:

image

I've reported this issue to JetBrains, but our testing structure either isn't very common or it hasn't made it to their core test cases, because neither 9 nor 10 handles it as well as the 8.x runner did.

Now that we're also using TeamCity a lot more to not only execute tests but also to collect coverage results, we'll capitulate and just change our patterns to whatever makes R#/TeamCity the happiest.

Solution

  • Make all testing base classes that include at least one Test or Category attribute abstract. Base classes that do not have any testing attributes do not need to be made abstract.

Once more, to recap our ground rules for making tests (illustrated in the sketch after this list):

  • Include TestFixture only on leafs (classes with no descendants)
  • You can put Category or Test attributes anywhere in the hierarchy, but you must then declare the class as abstract.
  • Base classes that have no testing attributes do not need to be abstract
  • If you feel you need to execute tests in both a base class and one of its descendants, then you're probably doing something wrong. Make two descendants of the base class instead.
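
Here's the sketch promised above: a hypothetical hierarchy that follows all of the rules.

using NUnit.Framework;

// Has a Test attribute, so it must be abstract; no TestFixture here
public abstract class DataDriverTests
{
  [Test]
  public void TestLoadObjects()
  {
    // ...common test logic, parameterized via abstract/virtual members...
  }
}

// TestFixture appears only on the leaf classes
[TestFixture]
public class PostgreSqlDataDriverTests : DataDriverTests { }

[TestFixture]
public class SqlServerDataDriverTests : DataDriverTests { }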

When you make the change, you can see the improvement immediately.

image


  1. ReSharper 10.0 also offers continuous testing, but our experiments with the EAP builds and the first RTM build left us underwhelmed and we downgraded to 9.2 until JetBrains manages to release a stable 10.x.

Quino 2: Starting up an application, in detail

As part of the final release process for Quino 2, we've upgraded 5 solutions1 from Quino 1.13 to the latest API in order to shake out any remaining API inconsistencies or even just inelegant or clumsy calls or constructs. A lot of questions came up during these conversions, so I wrote the following blog to provide detail on the exact workings and execution order of a Quino application.

I've discussed the design of Quino's configuration before, most recently in API Design: Running an Application (Part I) and API Design: To Generic or not Generic? (Part II) as well as the three-part series that starts with Encodos configuration library for Quino: part I.

Quino Execution Stages

The life-cycle of a Quino 2.0 application breaks down into roughly the following stages:

  1. Build Application: Register services with the IOC, add objects needed during configuration and add actions to the startup and shutdown lists
  2. Load User Configuration: Use non-IOC objects to bootstrap configuration from the command line and configuration files; IOC is initialized and can no longer be modified after action ServicesInitialized
  3. Apply Application Configuration: Apply code-based configuration to IOC objects; ends with the ServicesConfigured action
  4. Execute: execute the loop, event-handler, etc.
  5. Shut Down: dispose of the application, shutting down services in the IOC, setting the exit code, etc.

Stage 1

The first stage is all about putting the application together with calls to Use various services and features. This stage is covered in detail in three parts, starting with Encodos configuration library for Quino: part I.

Stage 2

Let's tackle this one last because it requires a bit more explanation.

Stage 3

Technically, an application can add code to this stage by adding an IApplicationAction before the ServicesConfigured action. Use the Configure<TService>() extension method in stage 1 to configure individual services, as shown below.

application.Configure<IFileLogSettings>(
  s => s.Behavior = FileLogBehavior.MultipleFiles
);

Stage 4

The execution stage is application-specific. This stage can be short or long, depending on what your application does.

For desktop applications or single-user utilities, stage 4 is executed in application code, as shown below, in the Run method, which is called by the ApplicationManager after the application has started.

var transcript = new ApplicationManager().Run(CreateApplication, Run);

IApplication CreateApplication() { ... }
void Run(IApplication application) { ... }

If your application is a service, like a daemon or a web server or whatever, then you'll want to execute stages 1--3 and then let the framework send requests to your application's running services. When the framework sends the termination signal, execute stage 5 by disposing of the application. Instead of calling Run, you'll call CreateAndStartUp.

var application = new ApplicationManager().CreateAndStartUp(CreateApplication);

IApplication CreateApplication() { ... }

Stage 5

Every application has certain tasks to execute during shutdown. For example, an application will want to close any open connections to external resources, close files (especially log files) and perhaps inform the user of the shutdown.

Instead of exposing a specific "shutdown" method, a Quino 2.0 application can simply be disposed to shut it down.

If you use ApplicationManager.Run() as shown above, then you're already sorted -- the application will be disposed and the user will be informed in case of catastrophic failure; otherwise, you can shut down and get the final application transcript from the disposed object.

application.Dispose();
var transcript = application.GetTranscript();
// Do something with the transcript...

Stage 2 Redux

We're finally ready to discuss stage 2 in detail.

An IOC has two phases: in the first phase, the application registers services with the IOC; in the second phase, the application uses services from the IOC.

An application should use the IOC as much as possible, so Quino keeps stage 2 as short as possible. Because it can't use the IOC during the registration phase, code that runs in this stage shares objects via a poor-man's IOC built into the IApplication that allows modification and only supports singletons. Luckily, very little end-developer application code will ever need to run in this stage. It's nevertheless interesting to know how it works.

Obviously, any code in this stage that uses the IOC will cause it to switch from phase one to phase two, after which subsequent attempts to register services will fail. Therefore, while application code in stage 2 has to be careful, you'll at least find out immediately when you've screwed up.

Why would we have this stage? Some advocates of using an IOC claim that everything should be configured in code. However, it's not uncommon for applications to want to run very differently based on command-line or other configuration parameters. The Quino startup handles this by placing the following actions in stage 2:

  • Parse and apply command-line
  • Import and apply external configuration (e.g. from file)

An application is free to insert more actions before the ServicesInitialized action, but they have to play by the rules outlined above.

"Single" objects

Code in stage 2 shares objects by calling SetSingle() and GetSingle(). There are only a few objects that fall into this category.

The calls UseCore() and UseApplication() register most of the standard objects used in stage 2. Actually, while they're mostly used during stage 2, some of them are also added to the poor man's IOC in case of catastrophic failure, in which case the IOC cannot be assumed to be available. A good example is the IApplicationCrashReporter.
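
As a sketch of what that sharing looks like in a stage-2 action -- SetSingle() and GetSingle() are the actual calls; the CustomLocationManager implementation is invented:

// Replace the shared location manager for the rest of stage 2
application.SetSingle<ILocationManager>(new CustomLocationManager());

// ...later, in another stage-2 action, retrieve the shared instance
var locationManager = application.GetSingle<ILocationManager>();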

Executing Stages

Before listing all of the objects, let's take a rough look at how a standard application is started. The following steps outline what we consider to be a good minimum level of support for any application. The Quino configuration is modular, so you can take as much or as little as you like: you can use a naked Application, which has absolutely nothing registered, or call UseCore() to get a bit more (it registers a handful of low-level services but no actions), but we recommend calling at least UseApplication() to add most of the functionality outlined below.

  1. Create application: This involves creating the IOC and most of the IOC registration as well as adding most of the application startup actions (stage 1)
  2. Set debug mode: Get the final value of RunMode from the IRunSettings to determine whether the application should catch all exceptions or let them go to the debugger. This involves getting the IRunSettings from the application and getting the final value using the IApplicationManagerPreRunFinalizer, which is commonly an implementation that allows setting the value of RunMode from the command line in debug builds. This further depends on the ICommandSetManager (which depends on the IValueTools) and possibly the ICommandLineSettings (to set the CommandLineConfigurationFilename if it was set by the user).
  3. Process command line: Set the ICommandProcessingResult, possibly setting other values and adding other configuration steps to the list of startup actions (e.g. many command-line options are switches that are handled by calling Configure<TSettings>() where TSettings is the configuration object in the IOC to modify).
  4. Read configuration file: Load the configuration data into the IConfigurationDataSettings, involving the ILocationManager to find configuration files and the ITextValueNodeReader to read them.
  5. The ILogger is used throughout by various actions to log application behavior
  6. If there is an unhandled error, the IApplicationCrashReporter uses the IFeedback or the ILogger to notify the user and log the error
  7. The IInMemoryLogger is used to include all in-memory messages in the IApplicationTranscript

The next section provides detail to each of the individual objects referenced in the workflow above.

Available Objects

You can get any one of these objects from the IApplication in several ways: by using GetSingle<TService>() (safe in all situations), by using GetInstance<TService>() (safe only in stage 3 or later) or -- almost always -- via a method that starts with "Get" and ends in the service name.

The example below shows how to get the ICommandSetManager2 if you need it.

application.GetCommandSetManager();
application.GetSingle<ICommandSetManager>(); // Prefer the one above
application.GetInstance<ICommandSetManager>();

All three calls return the exact same object, though. The first two from the poor-man's IOC; the last from the real IOC.

Only applications that need access to low-level objects or need to mess around in stage 2 need to know which objects are available where and when. Most applications don't care and will just always use GetInstance().

The objects in the poor-man's IOC are listed below.

Core

  • IValueTools: converts values; used by the command-line parser, mostly to translate enumerate values and flags
  • ILocationManager: an object that manages aliases for file-system locations, like "Configuration", from which configuration files should be loaded or "UserConfiguration" where user-specific overlay configuration files are stored; used by the configuration loader
  • ILogger: a reference to the main logger for the application
  • IInMemoryLogger: a reference to an in-memory message store for the logger (used by the ApplicationManager to retrieve the message log from a crashed application)
  • IMessageFormatter: a reference to the object that formats messages for the logger

Command line

  • ICommandSetManager: sets the schema for a command line; used by the command-line parser
  • ICommandProcessingResult: contains the result of having processed the command line
  • ICommandLineSettings: defines the properties needed to process the command line (e.g. the Arguments and CommandLineConfigurationFilename, which indicates the optional filename to use for configuration in addition to the standard ones)

Configuration

  • IConfigurationDataSettings: defines the ConfigurationData which is the hierarchical representation of all configuration data for the application as well as the MainConfigurationFilename from which this data is read; used by the configuration-loader
  • ITextValueNodeReader: the object that knows how to read ConfigurationData from the file formats supported by the application3; used by the configuration-loader

Run

  • IRunSettings: an object that manages the RunMode ("release" or "debug"), which can be set from the command line and is used by the ApplicationManager to determine whether to use global exception-handling
  • IApplicationManagerPreRunFinalizer: a reference to an object that applies any options from the command line before the decision of whether to execute in release or debug mode is taken.
  • IApplicationCrashReporter: used by the ApplicationManager in the code surrounding the entire application execution and therefore not guaranteed to have a usable IOC available
  • IApplicationDescription: used together with the ILocationManager to set application-specific aliases to user-configuration folders (e.g. AppData\{CompanyTitle}\{ApplicationTitle})
  • IApplicationTranscript: an object that records the last result of having run the application; returned by the ApplicationManager after Run() has completed, but also available through the application object returned by CreateAndStartUp() to indicate the state of the application after startup.

Each of these objects has a very compact interface and has a single responsibility. An application can easily replace any of these objects by calling UseSingle() during stage 1 or 2. This call sets the object in both the poor-man's IOC as well as the real one. For those rare cases where a non-IOC singleton needs to be set after the IOC has been finalized, the application can call SetSingle(), which does not touch the IOC. This feature is currently used only to set the IApplicationTranscript, which needs to happen even after the IOC registration is complete.


For example, the ITextValueNodeReader is available in the same three ways, though here the recommended call doesn't match the interface name 1-to-1:

application.GetSingle<ITextValueNodeReader>();
application.GetInstance<ITextValueNodeReader>();
application.GetConfigurationDataReader(); // Recommended

  1. Two large customer solutions, two medium-sized internal solutions (Punchclock and JobVortex) as well as the Demo/Sandbox solution. These solutions include the gamut of application types:

    * 3 ASP.NET MVC applications
    * 2 ASP.NET WebAPI applications
    * 2 Windows services
    * 3 Winform/DevExpress applications
    * 2 Winform/DevExpress utilities
    * 4 Console applications and utilities
    

  2. I originally used ITextValueNodeReader as an example, but that's one case where the recommended call doesn't match 1-to-1 with the interface name.

  3. Currently only XML, but JSON is on the way when someone gets a free afternoon.

`IServer`: converting hierarchy to composition

Quino has long included support for connecting to an application server instead of connecting directly to databases or other sources. The application server uses the same model as the client and provides modeled services (application-specific) as well as CRUD for non-modeled data interactions.

We wrote the first version of the server in 2008. Since then, it's acquired better authentication and authorization capabilities as well as routing and state-handling. We've always based it on the .NET HttpListener.

Old and Busted

As late as Quino 2.0-beta2 (which we had deployed in production environments already), the server hierarchy looked like the screenshot below, pulled from issue QNO-4927:

image

This screenshot was captured after a few unneeded interfaces had already been removed. As you can see by the class names, we'd struggled heroically to deal with the complexity that arises when you use inheritance rather than composition.

The state-handling was welded onto an authentication-enabled server, and the base machinery for supporting authentication was spread across three hierarchy layers. The hierarchy only hints at composition in its naming: the "Stateful" part of the class name CoreStatefulHttpServerBase<TState> had already been moved to a state provider and a state creator in previous versions. That support is unchanged in the 2.0 version.

Implementation Layers

We mentioned above that implementation was "spread across three hierarchy layers". There's nothing wrong with that, in principle. In fact, it's a good idea to encapsulate higher-level patterns in a layer that doesn't introduce too many dependencies and to introduce dependencies in other layers. This allows applications not only to be able to use a common implementation without pulling in unwanted dependencies, but also to profit from the common tests that ensure the components works as advertised.

In Quino, the following three layers are present in many components:

  1. Abstract: a basic encapsulation of a pattern with almost no dependencies (generally just Encodo.Core).
  2. Standard: a functional implementation of the abstract pattern with dependencies on non-metadata assemblies (e.g. Encodo.Application, Encodo.Connections and so on)
  3. Quino: an enhancement of the standard implementation that makes use of metadata to fill in implementation left abstract in the previous layer. Dependencies can include any of the Quino framework assemblies (e.g. Quino.Meta, Quino.Application and so on).
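As an illustration only -- the component and all of the type names below are invented, not actual Quino types -- the three layers for a hypothetical value-store component might look like this:

using System.Collections.Generic;

// 1. Abstract (lives in something like Encodo.Core): the pattern itself
public interface IValueStore
{
  object GetValue(string key);
}

// 2. Standard (lives in something like Encodo.Application): a functional
//    implementation with non-metadata dependencies only
public class InMemoryValueStore : IValueStore
{
  private readonly Dictionary<string, object> _values =
    new Dictionary<string, object>();

  public virtual object GetValue(string key)
  {
    return _values[key];
  }
}

// 3. Quino (lives in something like Quino.Application): fills in the parts
//    left abstract in the previous layer, using metadata (elided here)
public class MetaValueStore : InMemoryValueStore
{
}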

The New Hotness1

The diagram below2 shows the new hotness in Quino 2.0.

image

The hierarchy is now extremely flat. There is an IServer interface and a Server implementation, both generic in TListener, of type IServerListener. The server manages a single instance of an IServerListener.

The listener, in turn, has an IHttpServerRequestHandler, the main implementation of which uses an IHttpServerAuthenticator.

As mentioned above, the IServerStateProvider is included in this diagram, but is unchanged from Quino 2.0-beta3, except that it is now used by the request handler rather than directly by the server.

You can see how the abstract layer is enhanced by an HTTP-specific layer (the Encodo.Server.Http namespace) and how the metadata-specific layer is nicely encapsulated in three classes in the Quino.Server assembly.
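The member signatures below are assumptions -- the article names only the types -- but the flattened shape is roughly this:

public interface IServerListener
{
  void Start();
  void Stop();
}

public interface IServerListenerFactory<TListener>
  where TListener : IServerListener
{
  TListener CreateListener();  // method name assumed
}

public interface IServer<TListener>
  where TListener : IServerListener
{
  TListener Listener { get; }
}

public class Server<TListener> : IServer<TListener>
  where TListener : IServerListener
{
  public Server(IServerListenerFactory<TListener> factory)
  {
    // The server manages a single listener, created by the listener factory
    Listener = factory.CreateListener();
  }

  public TListener Listener { get; private set; }
}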

Server Components and Flow

This type hierarchy has decoupled the main elements of the workflow of handling requests for a server:

  • The server manages listeners (currently a single listener), created by a listener factory
  • The listener, in turn, dispatches requests to the request handler
  • The request handler uses the route handler to figure out where to direct the request
  • The route handler uses a registry to map requests to response items
  • The request handler asks the state provider for the state for the given request
  • The state provider checks its cache for the state (the default support uses persistent states to cache sessions for a limited time); if not found, it creates a new one
  • Finally, the request handler checks whether the user for the request is authenticated and/or authorized to execute the action and, if so, executes the response items

It is important to note that this behavior is unchanged from the previous version -- it's just that now each step is encapsulated in its own component. The components are small and easily replaced, with clear and concise interfaces.
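Here's a compact sketch of how the request handler ties these steps together. All of the type and member names below are invented for illustration -- the real Quino interfaces (IHttpServerRequestHandler and friends) are richer:

public interface IRequest { }
public interface IState { }
public interface IResponseItems { void Execute(IRequest request, IState state); }
public interface IRouteHandler { IResponseItems GetRoute(IRequest request); }
public interface IStateProvider { IState GetState(IRequest request); }
public interface IAuthenticator { bool IsAuthenticated(IRequest request, IState state); }

public class SketchRequestHandler
{
  private readonly IRouteHandler _routeHandler;
  private readonly IStateProvider _stateProvider;
  private readonly IAuthenticator _authenticator;

  public SketchRequestHandler(IRouteHandler routeHandler, IStateProvider stateProvider, IAuthenticator authenticator)
  {
    _routeHandler = routeHandler;
    _stateProvider = stateProvider;
    _authenticator = authenticator;
  }

  public void HandleRequest(IRequest request)
  {
    // The route handler uses its registry to map the request to response items
    var items = _routeHandler.GetRoute(request);

    // The state provider returns a cached state or creates a new one
    var state = _stateProvider.GetState(request);

    // Check authentication/authorization, then execute the response items
    if (_authenticator.IsAuthenticated(request, state))
    {
      items.Execute(request, state);
    }
  }
}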

Note also that the current implementation of the request handler is for HTTP servers only. Should the need arise, however, it would be relatively easy to abstract away the HttpListener dependency and generalize most of the logic in the request handler for any kind of server, regardless of protocol and networking implementation. Only the request handler is affected by the HTTP dependency, though: authentication, state-provision and listener-management can all be re-used as-is.
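To make the HTTP dependency concrete: the listener is the component that wraps .NET's HttpListener, roughly along these lines (a sketch only; the class name and prefix are invented, and the real HttpServerListener is surely more involved):

using System.Net;

// Wraps .NET's HttpListener; routing, state and authentication are
// handled by the other components described above
public class SketchHttpServerListener : IServerListener
{
  private readonly HttpListener _listener = new HttpListener();

  public void Start()
  {
    _listener.Prefixes.Add("http://localhost:8080/");  // prefix chosen for illustration
    _listener.Start();
  }

  public void Stop()
  {
    _listener.Stop();
  }
}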

Also of note is that the only full-fledged implementation is for metadata-based applications. At the bottom of the diagram, you can see the metadata-specific implementations for the route registry, state provider and authenticator. This is reflected in the standard registration in the IOC.

These are the service registrations from Encodo.Server:

return handler
  .RegisterSingle<IServerSettings, ServerSettings>()
  .RegisterSingle<IServerListenerFactory<HttpServerListener>, HttpServerListenerFactory>()
  .Register<IServer, Server<HttpServerListener>>();

And these are the service registrations from Quino.Server:

handler
  .RegisterSingle<IServerRouteRegistry<IMetaServerState>, StandardMetaServerRouteRegistry>()
  .RegisterSingle<IServerStateProvider<IMetaServerState>, MetaPersistentServerStateProvider>()
  .RegisterSingle<IServerStateCreator<IMetaServerState>, MetaServerStateCreator>()
  .RegisterSingle<IHttpServerAuthenticator<IMetaServerState>, MetaHttpServerAuthenticator>()
  .RegisterSingle<IHttpServerRequestHandler, HttpServerRequestHandler<IMetaServerState>>();

As you can see, the registration is extremely fine-grained and allows very precise customization as well as easy mocking and testing.
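For example, a test harness could swap in a fake authenticator while leaving everything else standard -- assuming, as with most IOCs, that a later registration overrides an earlier one (FakeAuthenticator is hypothetical):

// Replace only the authenticator; all other server components remain standard
handler.RegisterSingle<IHttpServerAuthenticator<IMetaServerState>, FakeAuthenticator>();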



  1. Any Men in Black fans out there? Tommy Lee Jones was "old and busted" while Will Smith was "the new hotness"? No? Just me? All righty then...

  2. This diagram brought to you by the diagramming and architecture tools in ReSharper 9.2. Just select the files or assemblies you want to diagram in the Solution Explorer and choose the option to show them in a diagram. You can right-click any type or assembly to show dependent or referenced modules or types. For type diagrams, you can easily control which relationships are to be shown (e.g. I hide aggregations to avoid clutter) and how the elements are to be grouped (e.g. I grouped by namespace to include the boxes in my diagram).

Iterating with NDepend to remove cyclic dependencies (Part II)

In the previous article, we discussed the task of Splitting up assemblies in Quino using NDepend. In this article, I'll discuss both the high-level and low-level workflows I used with NDepend to efficiently clear up these cycles.

Please note that what follows is a description of how I have used the tool -- so far -- to get my very specific tasks accomplished. If you're looking to solve other problems or want to solve the same problems more efficiently, you should take a look at the official NDepend documentation.

What were we doing?

To recap briefly: we are reducing dependencies among top-level namespaces in two large assemblies, in order to be able to split them up into multiple assemblies. The resulting assemblies will have dependencies on each other, but the idea is to make at least some parts of the Encodo/Quino libraries opt-in.

The plan of attack

At a high level, I tackled the task in the following loosely defined phases.

Remove direct, root-level dependencies

This is the big first step -- to get rid of the little black boxes. I made NDepend show only direct dependencies at first, to reduce clutter. More on specific techniques below.

Remove indirect dependencies

imageCrank up the magnification to show indirect dependencies as well. This will help you root out the remaining cycles, which can be trickier to find if you're not showing enough detail. Conversely, if you turn on indirect dependencies too soon, you'll be overwhelmed by darkness (see the depressing initial state of the Encodo assembly to the right).

Examine dependencies between root-level namespaces

Even once you've gotten rid of all cycles, you may still have unwanted dependencies that hinder splitting namespaces into the desired constellation of assemblies.

For example, the plan is to split all logging and message-recording into an assembly called Encodo.Logging. However, the IRecorder interface (with a single method, Log()) is used practically everywhere. For some very central interfaces and support classes, it quickly becomes necessary to split the interface from its implementation -- which carries many more potential dependencies -- into two assemblies. In this specific case, I moved IRecorder to Encodo.Core, as sketched below.
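A sketch of the split -- the Log() signature and the implementing class are assumptions; all we know from the text is that IRecorder has a single Log() method:

namespace Encodo.Core
{
  // The interface lives in the core assembly, so everything can depend on it cheaply
  public interface IRecorder
  {
    void Log(string message);  // signature assumed for illustration
  }
}

namespace Encodo.Logging
{
  // The implementation -- with its heavier dependencies -- stays in Encodo.Logging
  public class ConsoleRecorder : Encodo.Core.IRecorder  // hypothetical implementation
  {
    public void Log(string message)
    {
      System.Console.WriteLine(message);
    }
  }
}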

Even after you've conquered the black hole, you might still have quite a bit of work to do. Never fear, though: NDepend is there to help root out those dependencies as well.

Examine cycles in non-root namespaces

Because we can split off smaller assemblies regardless, these dependencies are less important to clean up for our current purposes. However, once this code is packed into its own assembly, its namespaces become root namespaces of their own and -- voilà! -- you have more potentially nasty dependencies to deal with. Granted, the problem is less severe because you're dealing with a logically smaller component.

In Quino, we use non-root namespaces more for organization and less for defining components. Still, cycles are cycles and they're worth examining, if only to pluck the low-hanging fruit.

Removing root-level namespace cycles

With the high-level plan described above in hand, I repeated the following steps for the many dependencies I had to untangle. Don't despair if it looks like your library has a ton of unwanted dependencies. If you're smart about the ones you untangle first, you can make excellent -- and, most importantly, rewarding -- progress relatively quickly.1

  1. Show the dependency matrix
  2. Choose the same assembly in the row and column
  3. Choose a square that's black
  4. Click the name of the namespace in the column to show sub-namespaces
  5. Do the same in a row
  6. Keep zooming until you can see where there are dependencies that you don't want
  7. Refactor/compile/run NDepend analysis to show changes
  8. GOTO 1

Once again, with pictures!

The high-level plan of attack sounded interesting, but might have left you cold with its abstraction. Then there was the promise of detail with a focus on root-level namespaces, but alas, you might still be left wondering just how, exactly, you reduce these much-hated cycles.

I took some screenshots as I worked on Quino, to document my process and point out parts of NDepend I thought were eminently helpful.

Show only namespaces

imageimageI mentioned above that you should "[k]eep zooming in", but how do you do that? A good first step is to zoom all the way out and show only direct namespace dependencies. This focuses only on using references instead of the much more frequent member accesses. In addition, I changed the default setting to show dependencies in only one direction -- when a column references a row (blue), but not vice versa (green).

As you can see, the diagrams are considerably less busy than the one shown above. Here, we can see a few black spots that indicate cycles, but it's not so many as to be overwhelming.2 You can hover over the offending squares to show more detail in a popup.

Show members

imageimageIf you don't see any more cycles between namespaces, switch the detail level to "Members". Another very useful feature is "Bind Matrix", which forces the columns and rows to be shown in the same order and concentrates the cycles in a smaller area of the matrix.

As you can see in the diagram, NDepend then highlights the offending area and you can even click the upper-left corner to focus the matrix only on that particular cycle.

Drill down to classes

imageimageOnce you're looking at members, it isn't enough to know just the namespaces involved -- you need to know which types are referencing which types. The powerful matrix view lets you drill down through namespaces to show classes as well.

If your classes are large -- another no-no, but one thing at a time -- then you can drill down to show which method is calling which method to create the cycle. In the screenshot to the right, you can see where I had to do just that in order to finally figure out what was going on.

In that screenshot, you can also see something that I only discovered after using the tool for a while: the direction of usage is indicated with an arrow. With the arrows shown, you can turn off the tooltips -- which are informative, but can be distracting for this task -- and you no longer have to remember which color (blue or green) corresponds to which direction of usage.

Indirect dependencies

imageimageOnce you've drilled your way down from namespaces only, to member-level dependencies, to individual classes and even methods, your diagram should be shaping up quite well.

On the right, you'll see a diagram of all direct dependencies for the remaining problem area. There are no black boxes, which means that all of the direct cycles are gone. So we have to turn up the power of our microscope further and show indirect dependencies.

On the left, you can see that the scary, scary black hole from the start of our journey has been whittled down to a small, black spot. And that's with all direct and indirect dependencies as well as both directions of usage turned on (i.e. the green boxes are back). This picture is much more pleasing, no?

Queries and graphs

imageimageimageFor the last cluster of indirect dependencies shown above, I had to unpack another feature: NDepend queries. You can select any element and run a query to show the assemblies/namespaces that it uses or that use it.3 The results are shown in a panel, where you can edit the query and see live updates immediately.
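The queries are CQLinq, i.e. LINQ over NDepend's code model. As a sketch from memory -- treat the exact property and method names as assumptions -- a query for the types in one namespace that use another namespace looks something like this:

// Which types under the Culture namespace use anything in Enums?
from t in Application.Types
where t.ParentNamespace.Name.Contains("Culture")
   && t.IsUsing("Encodo.Enums")  // full namespace name assumed for illustration
select t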

Even with a highly zoomed-in view on the cycle, I still couldn't see the problem, so I took NDepend's suggestion and generated a graph of the final indirect dependency between Culture and Enums (through Expression). At this zoom level, the graph becomes more useful (for me) and illuminates problems that remain muddy in the matrix (see right).

Crossing the finish line

In order to finish the job efficiently, here are a handful of miscellaneous tips that are useful, but didn't fit into the guide above.

image

  • I set NDepend to automatically re-run an analysis on a successful build. The matrix updates automatically to reflect changes from the last analysis and won't lose your place.
  • If you have ReSharper, you'll generally be able to tell whether you've fixed the dependencies because the usings will be grayed out in the offending file. You can make several fixes at once before rebuilding and rerunning the analysis.
  • At higher zoom levels (e.g. having drilled down to methods), it is useful to toggle display of row dependencies back on because the dependency issue is only clear when you see the one green box in a sea of blue.
  • Though Matrix Binding is useful for localizing, remember to toggle it off when you want to drill down in the row independently of the namespace selected in the column.

And BOOM! just like that4, phase 1 (root namespaces) for Encodo was complete! Now, on to Quino.dll...

Conclusion

imageDepending on what shape your library is in, do not underestimate the work involved. Even with NDepend riding shotgun and barking out the course like a rally navigator, you still have to actually make the changes. That means lots of refactoring, lots of building, lots of analysis, lots of running tests and lots of reviews of at-times quite-sweeping changes to your code base. The destination is worth the journey, but do not embark on it lightly -- and don't forget to bring the right tools.5



  1. This can be a bit distracting: you might get stuck trying to figure out which of all these offenders to fix first.

  2. I'm also happy to report that my initial forays into maintaining a relatively clean library -- as opposed to cleaning it -- with NDepend have been quite efficient.

  3. And much more: I don't think I've even scratched the surface of the analysis and reporting capabilities offered by this ability to directly query the dependency data.

  4. I'm just kidding. It was a lot of time-consuming work.

  5. In this case, in case it's not clear: NDepend for analysis and good ol' ReSharper for refactoring. And ReSharper's new(ish) architecture view is also quite good, though not even close to detailed enough to replace NDepend: it shows assembly-level dependencies only.

Splitting up assemblies in Quino using NDepend (Part I)

imageA lot of work has been put into Quino 2.01, with almost no stone left unturned. Almost every subsystem has been refactored and simplified, including but not limited to the data driver, the schema migration, generated code and metadata, model-building, security and authentication, service-application support and, of course, configuration and execution.

Two of the finishing touches before releasing 2.0 are to reorganize all of the code into a more coherent namespace structure and to reduce the size of the two monolithic assemblies: Encodo and Quino.

A Step Back

The first thing to establish is: why are we doing this? Why do we want to reduce dependencies and reduce the size of our assemblies? There are several reasons, but a major reason is to improve the discoverability of patterns and types in Quino. Two giant assemblies are not inviting -- they are, in fact, daunting. Replace these assemblies with dozens of smaller ones and users of your framework will be more likely to (A) find what they're looking for on their own and (B) build their own extensions with the correct dependencies and patterns. Neither of these is guaranteed, but smaller modules are a great start.

Another big reason is portability. .NET Core was released as open-source software some time ago and more and more .NET source code is added to it each day. There are portable targets, non-Windows targets, Universal-build targets and much more. It makes sense to split code up into highly portable units with as few dependencies as possible. That is, the dependencies should be explicit and intended.

Not only that, but NuGet packaging has come to the fore more than ever. Quino was originally designed to keep third-party boundaries clear, but we wanted to make it as easy as possible to use Quino. Just include Encodo and Quino and off you went. However, with NuGet, you can now say you want to use Quino.Standard and you'll get Quino.Core, Encodo.Core, Encodo.Services.SimpleInjector, Quino.Services.SimpleInjector and other packages.

With so much interesting code in the Quino framework, we want to make it available as much as possible not only for our internal projects but also for customer projects where appropriate and, also, possibly for open-source distribution.

NDepend

I've used NDepend before2 to clean up dependencies. However, the last analysis I did about a year ago showed quite deep problems3 that needed to be addressed before any further dependency analysis could bear fruit at all. With that work finally out of the way, I'm ready to re-engage with NDepend and see where we stand with Quino.

As luck would have it, NDepend is in version 6, released at the start of summer 2015. As was the case last year, NDepend has generously provided me with an upgrade license to allow me to test and evaluate the new version with a sizable and real-world project.

Here is some of the feedback I sent to NDepend:

I really, really like the depth of insight NDepend gives me into my code. I find myself thinking "SOLID" much more often when I have NDepend shaking its head sadly at me, tsk-tsking at all of the dependency snarls I've managed to build.

  • It's fast and super-reliable. I can work these checks into my workflow relatively easily.
  • I'm using the matrix view a lot more than the graphs because even NDepend recommends I don't use a graph for the number of namespaces/classes I'm usually looking at
  • Where the graph view is super-useful is for examining indirect dependencies, which are harder to decipher with the matrix
  • I've found so many silly mistakes/lazy decisions that would lead to confusion for developers new to my framework
  • I'm spending so much time with it and documenting my experiences because I want more people at my company to use it
  • I haven't even scratched the surface of the warnings/errors but want to get to that, as well (the Dashboard tells me of 71 rules violated; 9 critical; I'm afraid to look :-)

Use Cases

Before I get more in-depth with NDepend, please note that there are at least two main use cases for this tool4:

  1. Clean up a project or solution that has never had a professional dependency checkup
  2. Analyze and maintain separation and architectural layers in a project or solution

These two use cases are vastly different. The first is like cleaning a gas-station bathroom for the first time in years; the second is more like the weekly once-over you give your bathroom at home. The tools you'll need for the two jobs are similar, but quite different in scope and power. The same goes for NDepend: how you'll use it to claw your way back to architectural purity is different than how you'll use it to occasionally clean up an already mostly-clean project.

Quino is much better than it was the last time we peeked under the covers with NDepend, but we're still going to need a bucket of industrial cleaner before we're done.5

The first step is to make sure that you're analyzing the correct assemblies. Show the project properties to see which assemblies are included. You should remove all assemblies from consideration that don't currently interest you (especially if your library is not quite up to snuff, dependency-wise; afterwards, you can leave as many clean assemblies in the list as you like).6

Industrial-strength cleaner for Quino

Running an analysis with NDepend 6 generates a nice report, which includes the following initial dependency graph for the assemblies.

image

As you can see, Encodo and Quino depend only on system assemblies, but there are components that pull in other references where they might not be needed. The initial dependency matrices for Encodo and Quino both look much better than they did when I last generated one. The images below show what we have to work with in the Encodo and Quino assemblies.

imageimage

It's not as terrible as I've made out, right? There is far less namespace-nesting, so it's much easier to see where the bidirectional dependencies are. There are only a handful of cyclic dependencies in each library, with Encodo edging out Quino because of (A) the nature of the code and (B) the extra effort I'd put into Encodo so far.

I'm not particularly surprised to see that this is relatively clean because we've put effort into keeping the external dependencies low. It's the internal dependencies in Encodo and Quino that we want to reduce.

Small and Focused Assemblies

imageimageimage

The goal, as stated in the title of this article, is to split Encodo and Quino into separate assemblies. While removing cyclic dependencies is required for such an operation, it's not sufficient. Even without cycles, it's still possible that a given assembly is too dependent on other assemblies.

Before going any further, I'm going to list the assemblies we'd like to have. By "like to have", I mean the list that we'd originally planned plus a few more that we added while doing the actual splitting.7 The images on the right show the assemblies in Encodo, Quino and a partial overview of the dependency graph (calculated with the ReSharper Architecture overview rather than with NDepend, just for variety).

Of these, the following assemblies and their dependencies are of particular interest8:

  • Encodo.Core: System dependencies only
  • Encodo.Application: basic application support9
  • Encodo.Application.Standard: configuration methods for non-metadata applications that don't want to pick and choose packages/assemblies
  • Encodo.Expressions: depends only on Encodo.Core
  • Quino.Meta: depends only on Encodo.Core and Encodo.Expressions
  • Quino.Meta.Standard: Optional, but useful metadata extensions
  • Quino.Application: depends only on Encodo.Application and Quino.Meta
  • Quino.Application.Standard: configuration methods for metadata applications that don't want to pick and choose packages/assemblies
  • Quino.Data: depends on Quino.Application and some Encodo.* assemblies
  • Quino.Schema: depends on Quino.Data
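Once split, these constraints are exactly the kind of thing NDepend can watch over. A rough CQLinq sketch -- again, treat the property and method names as assumptions -- of a rule guarding Quino.Meta's dependencies might be:

// Warn when a type in Quino.Meta reaches into a namespace it shouldn't use
warnif count > 0
from t in Application.Types
where t.ParentNamespace.Name.StartsWith("Quino.Meta")
   && t.IsUsing("Quino.Data")  // an example of a forbidden dependency
select t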

This seems like a good spot to stop, before getting into the nitty-gritty detail of how we used NDepend in practice. In the next article, I'll discuss both the high-level and low-level workflows I used with NDepend to efficiently clear up these cycles. Stay tuned!


Articles about design:

    * [Encodos configuration library for Quino: part I](/blogs/developer-blogs/encodos-configuration-library-for-quino-part-i/)
    * [Encodos configuration library for Quino: part II](/blogs/developer-blogs/encodos-configuration-library-for-quino-part-ii/)
    * [Encodos configuration library for Quino: part III](/blogs/developer-blogs/encodos-configuration-library-for-quino-part-iii/)
    * [API Design: Running an Application (Part I)](/blogs/developer-blogs/api-design-running-an-application-part-i/)
    * [API Design: To Generic or not Generic? (Part II)](/blogs/developer-blogs/api-design-to-generic-or-not-generic-part-ii/)


  1. Release notes for 2.0 betas:

    * [v2.0-beta1: Configuration, services and web](/blogs/developer-blogs/v20-beta1-configuration-services-and-web/)
    * [v2.0-beta2: Code generation, IOC and configuration](/blogs/developer-blogs/v20-beta2-code-generation-ioc-and-configuration/)
    

  2. I published a two-parter in August and November of 2014.

    * [The Road to Quino 2.0: Maintaining architecture with NDepend (part I)](/blogs/developer-blogs/the-road-to-quino-20-maintaining-architecture-with-ndepend-part-i/)
    * [The Road to Quino 2.0: Maintaining architecture with NDepend (part II)](/blogs/developer-blogs/the-road-to-quino-20-maintaining-architecture-with-ndepend-part-ii/)
    

  3. You can see a lot of the issues associated with these changes in the release notes for Quino 2.0-beta1 (mostly the first point in the "Highlights" section) and Quino 2.0-beta2 (pretty much all of the points in the "Highlights" section).

  4. I'm sure there are more, but those are the ones I can think of that would apply to my project (for now).

  5. ...to stretch the gas-station metaphor even further.

  6. Here I'm going to give you a tip that confused me for a while, but that I think was due to particularly bad luck and is actually quite a rare occurrence. If you already see the correct assemblies in the list, you should still check that NDepend picked up the right paths. That is, if you haven't followed the advice in NDepend's white paper and still have a different `bin` folder for each assembly, you may see something like the following in the tooltip when you hover over the assembly name:

    "Several valid .NET assemblies with the name have been found. They all have the same version. The one with the biggest file has been chosen."

    If NDepend has accidentally found an older copy of your assembly, you must delete that assembly. Even if you add an assembly directly, NDepend will not honor the path from which you added it. This isn't as bad as it sounds, since it's a very strange constellation of circumstances that led to this assembly hanging around anyway:

    * The project is no longer included in the latest Quino but lingers in my workspace
    * The version number is unfortunately the same, even though the assembly is wildly out of date

    I only noticed because I knew I didn't have that many dependency cycles left in the Encodo assembly.

  7. Especially for larger libraries like Quino, you'll find that your expectations about dependencies between modules will be largely correct, but will still have gossamer filaments connecting them that prevent a clean split. In those cases, we just created new assemblies to hold these common dependencies. Once an initial split is complete, we'll iterate and refactor to reduce some of these ad-hoc assemblies.

  8. Screenshots, names and dependencies are based on a pre-release version of Quino, so while the likelihood is small, everything is subject to change.

  9. Stay tuned for an upcoming post on the details of starting up an application, which is the support provided in Encodo.Application.