The Road to Quino 2.0: Maintaining architecture with NDepend (part I)

Full disclosure

A while back -- this last spring, I believe -- I downloaded NDepend to analyze code dependencies. The trial license is fourteen days; needless to say, I got only one afternoon in before I was distracted by other duties. That was enough, however, to convince me that it was worth the $375 to continue to clean up Quino with NDepend.

I decided to wait until I had more time before opening my wallet. In the meantime, however, Patrick Smacchia of NDepend approached me with a free license if I would write about my experiences using NDepend on Encodo's blog. I'm happy to write about how I used the tool and what I think it does and doesn't do.1

History & Background

image

We started working on Quino in the fall of 2007. As you can see from the first commit, the library was super-small and comprised a single assembly.

Fast-forward seven years and Version 1.13 of Quino has 66 projects/assemblies. That's a lot of code and it was long past time to take a more structured look at how we'd managed the architecture over the years.

I'd already opened a branch in our Quino repository called feature/dependencyChanges and checked in some changes at the beginning of July. Those changes had come as a result of the first time I used NDepend to find a bunch of code that was in the wrong namespace or the wrong assembly, architecturally speaking.

I wasn't able to continue using this branch, though, for the following reasons.

  1. I got the hang of NDepend relatively quickly and got a bit carried away. Using ReSharper, I was able to make a lot of changes and fixes in a relatively short amount of time.
  2. I checked in all of these changes in one giant commit.
  3. I did this all five months ago.
  4. There have been hundreds of subsequent commits on the master branch, many of which also include global refactoring and cleanup.
  5. As a result of the above, merging master into feature/dependencyChanges is more trouble than it's worth.

Release Methodology

With each Quino change and release, we try our hardest to balance backward-compatibility with maintainability and effort. If it's easy enough to keep old functionality under an old name or interface, we do so.

We mark members and types obsolete so that users are given a warning in the compiler but can continue using the old code until they have time to upgrade. These obsolete members are removed in the next major or minor upgrade.

Developers who have not removed their references to obsolete members will at this point be greeted with compiler errors. In all cases, the user can find out from Quino's release notes how they should fix a warning or error.
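As a concrete illustration of that pattern (the type and member names below are invented for illustration, not actual Quino API), an obsolete member stays in place and points at its replacement:

using System;

public static class Files
{
  // The old member remains usable but produces a compiler warning; it is removed
  // entirely in the next major or minor release.
  [Obsolete("Use GetTempPath() instead; this member will be removed in the next release.")]
  public static string GetTemporaryPath()
  {
    return GetTempPath();
  }

  public static string GetTempPath()
  {
    return System.IO.Path.GetTempPath();
  }
}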

The type of high-level changes that we have planned necessitate that we make a major version-upgrade, to Quino 2.0. In this version, we have decided not to maintain backward-compatibility in the code with Obsolete attributes. However, where we do make a breaking change -- either by moving code to new or different assemblies or by changing namespaces -- we want to maintain a usable change-log for customers who make the upgrade. The giant commit that I'd made previously was not a good start.

Take Two

Since some of these changes will be quite drastic departures in structure, we want to come up with a plan to make merging from the master branch to the feature/dependencyChanges branch safer, quicker and all-around easier.

I want to include many of the changes I started in the feature/dependencyChanges branch, but would like to re-apply those changes in the following manner:

  • Split the giant commit into several individual commits, each of which encapsulates exactly one change; smaller commits are much easier to merge
  • Document breaking changes in the release notes for Quino 2.0
  • Blog about/document the process of using NDepend to clean up Quino2

So, now that I'm ready to start cleaning up Quino for version 2.0, I'll re-apply the changes from the giant commit, but in smaller commits. At the same time, I'll use NDepend to find the architectural breaks that caused me to make those changes in the first place and document a bit of that process.

Setting up the NDepend Project

I created an NDepend project and attached it to my solution. Version 1.13 of Quino has 66 projects/assemblies, of which I chose the following "core" assemblies to analyze.

image

I can change this list at any time. There are a few ways to add assemblies. Unfortunately, the option to "Add Assemblies from VS Solution(s)" showed only 28 of the 66 projects in the Quino solution. I was unable to determine the logic that led to the other 38 projects not being shown. When I did select the projects I wanted from the list, the assemblies were loaded from unexpected directories. For example, it added a bunch of core assemblies (e.g. Encodo.Imaging) from the src/tools/Quino.CodeGenerator/bin/ folder rather than the src/libraries/Encodo.Imaging/bin folder. I ended up just taking the references I was offered by NDepend and adding references to Encodo and Quino, which it had not offered to add.3

The NDepend Dashboard

Let's take a look at the initial NDepend Dashboard.

image

There's a lot of detail here. The initial impression of NDepend can be a bit overwhelming, I suppose, but you have to remember the sheer amount of interdependent data that it shows. As you can see on the dashboard, not only are there a ton of metrics, but those metrics are also tracked on a time-axis. I only have one measurement so far.

Any assemblies not included in the NDepend project are considered to be "third-party" assemblies, so you can see external dependencies differently than internal ones. There is also support for importing test-coverage data, but I haven't tried that yet.

There are a ton of measurements in there, some of which interest me and others that don't, or with which I disagree. For example, over 1400 warnings are in the Quino* assemblies because the base namespace -- Encodo.Quino -- doesn't correspond to a file-system folder -- it expects Encodo/Quino, but we use just Quino.

Another 200 warnings are for "Avoid public methods not publicly visible", which generally means that we've declared public methods on internal, protected or private classes. The blog post Internal or public? by Eric Lippert covered this adequately and came to the same conclusion that we have: you actually should make methods public if they are public within their scope.
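A contrived example of the kind of code that trips this rule (the class is invented for illustration):

using System;

// NDepend flags Run() as a "public method not publicly visible" because the declaring
// class is internal; within the assembly, though, public is the natural visibility for
// a method that is part of the class's API, which is the conclusion drawn above.
internal class ImportScheduler
{
  public void Run()
  {
    Console.WriteLine("Running scheduled imports...");
  }
}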

There are some White Books about namespace and assembly dependencies that are worth reading if you're going to get serious about dependencies. There's a tip in there about turning off "Copy Local" on referenced assemblies to drastically increase compilation speed that we're going to look into.

Dependencies and cycles

One of the white books explains how to use namespaces for components and how to "levelize" an architecture. This means that the dependency graph is acyclic -- that there are no dependency cycles and that there are certainly no direct interdependencies. The initial graphs from the Encodo and Quino libraries show that we have our work cut out for us.

image image image

The first matrix shows the high-level view of dependencies in the Encodo and Quino namespaces. Click the second and third to see some initial dependency issues within the Encodo and Quino assemblies.

That's as far as I've gotten so far. Tune in next time for a look at how we managed to fix some of these dependency issues and how we use NDepend to track improvement over time.



  1. I believe that takes care of full disclosure.

  2. This is something I'd neglected to do before. Documenting this process will help me set up a development process where we use NDepend more regularly -- more than every seven years -- and don't have to clean up so much code at once.

  3. After having read the recommendations in the NDepend White Book -- Partitioning code base through .NET assemblies and Visual Studio projects (PDF) -- it's clear why this happens: NDepend recommends using a single /bin folder for all projects in a solution.

v1.13.0: Schema migration, remoting, services and web apps

The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.

Highlights

Data & Schema

Remoting & services

  • Fixed several issues in the remoting driver (client and server parts). (QNO-4626, QNO-4630, QNO-4631, QNO-4388, QNO-4575, QNO-4629, QNO-4573, QNO-4625, QNO-4633, QNO-4575)
  • Added a runner for Windows services that allows debugging and shows logging output for applications that use the CoreServiceBase, which extends the standard .NET ServiceBase. The runner is available in the Encodo.Service assembly.

Web

  • Improved default and custom authentication in web applications and the remoting server. Also improved support for authorization for remote-method routes as well as MVC controllers.
  • Improved configuration, error-handling and stability of the HttpApplicationBase, especially in situations where the application fails to start. Error-page handling was also improved, including handling for Windows Event Log errors.
  • Improved appearance of the web-based schema migrator. (QNO-4559, QNO-4561, QNO-4563, QNO-4548, QNO-4487, QNO-4486, QNO-4488)

Winform

  • Data-provider statistics: improved the WinForm-based statistics form. (QNO-4231, QNO-4545, QNO-4546)
  • Standard forms: updated the standard WinForm about window and splash screen to use Encodo web-site CI. (QNO-4529)

System & Tools

  • Removed the dependency on the SmartWeakEvents library from Quino (QNO-4645); the Quino and Encodo assemblies no longer have any external dependencies.
  • Image handling: the Encodo and Quino libraries now use the Windows Imaging Components instead of System.Drawing. (QNO-4536)
  • Windows 8.1: fixed culture-handling for en-US and de-CH that is broken in Windows 8.1. (QNO-4534, QNO-4553)
  • R# annotations have been added to the Encodo assembly. Tell R# to look in the Encodo.Core namespace to use annotations like NotNull and CanBeNull with parameters and results. (QNO-4508)
  • Generated code now includes a property that returns a ValueListObject for each enum property in the metadata. For example, for a property named State of type CoreState, the generated code includes the former properties for the enum and the foreign key backing it, but now also includes the ValueListObject property. This new property provides easy access to the captions.
public CoreState State { ... }
public ValueListObject StateObject { ... }
public int? CoreStateIdId { ... }
  • Improved the **nant fix** command in the default build tools to fix the assembly name as well. The build tools are available in bin/tools/build. See the `src/demo/Demo.build` file for an example of how to use the Nant build scripts for your own solutions. To change the company name used by the "fix" command, add a corresponding task override in your own build file.

  • Fixed the implementation of `IntegrateRemotableMethods` to avoid a race condition with **remote methods**. Also improved the stability of the `DataProvider` statistics. ([QNO-4599](https://secure.encodo.ch/jira/browse/QNO-4599))

Breaking changes

  • The generic argument TRight has been removed from all classes and interfaces in the Encodo.Security.* namespace. In order to fix this code, just remove the int generic parameter wherever it was used. For example, where before you used the interface IUser<int>, you should now use IUser (QNO-4576).
  • The overridable method MetaAccessControl.DoGetAccessChecker() has been renamed to MetaAccessControl.GetAccessChecker().
  • Renamed the Encodo.ServiceLocator.SimpleInjector.dll to Encodo.Services.SimpleInjector.dll and Quino.ServiceLocator.SimpleInjector.dll to Quino.Services.SimpleInjector.dll. Also changed the namespace Quino.ServiceLocator to Encodo.Quino.Services.
  • Renamed HttpApplicationBase.StartMetaApplication() to CreateAndStartUpApplication().
  • Classes may no longer contain properties with names that conflict with properties of IMetaReadable (e.g. Deleted, Persisted). The model will no longer validate until the properties have been renamed and the code regenerated. (QNO-4185)
  • Removed StandardIntRights with integer constants and replaced it with StandardRights with string constants.
  • The IAccessControl.Check() and other related methods now accept a sequence of string rights rather than integers.
  • IMetaConfiguration.ConfigureSession() has been deprecated. The method will still be called but may have undesired side-effects, depending on why it was overridden. The common use was to initialize a custom AccessControl for the session. Continuing to do so may overwrite the current user set by the default Winform startup. Instead, applications should use the IDataSessionAccessControlFactory and IDataSessionFactory to customize the data sessions and access controls returned for an application. In order to attach an access control, take care to only set your custom access control for sessions that correspond to your application model.[^1]
internal class JobVortexDataSessionAccessControlFactory : DataSessionAccessControlFactory
{
  public override IAccessControl CreateAccessControl(IDataSession session)
  {
    if (session.Application.Model.MetaId == JobVortexModelGenerator.ModelGuid)
    {
      return new JobVortexAccessControl(session);
    }

    return base.CreateAccessControl(session);
  }
}

  • The default length of the UserModule.User.PasswordHash property has been increased from 100 characters to 1000. This default is more sensible for implementations that use much longer validation tokens instead of passwords. To avoid the schema migration, revert the change by setting the property's default length back to 100 in your application model, after importing the security module, as shown below.

var securityModule = Builder.Include<SecurityModuleGenerator>();      
securityModule.Elements.Classes.User.Properties[
  Encodo.Quino.Models.Security.Classes.SecurityUser.Fields.PasswordHash
].MaximumSize = 100;
  * `Application.Credentials` has been removed. To fix references, retrieve the `IUserCredentialsManager` from the service locator. For example, the following code returns the current user:

Session.Application.Configuration.ServiceLocator.GetInstance<IUserCredentialsManager>().Current


If your application uses the `WinformMetaConfigurationTools.IntegrateWinformPackages()` or `WinformDxMetaConfigurationTools.IntegrateWinformDxPackages()`, then the  `IDataSession.AccessControl.CurrentUser` will continue to be set correctly. If not, add the `SingleUserApplicationConfigurationPackage` to your application's configuration. The user in the remoting server will be set up correctly. Add the `WebApplicationConfigurationPackage` to web applications in order to ensure that the current user is set up correctly for each request. ([QNO-4596](https://secure.encodo.ch/jira/browse/QNO-4596))
  * `IDataSession.SyncRoot` has been removed as it was no longer needed or used in Quino itself. Sessions should *not* be used in multiple threads, so there is no need for a `SyncRoot`. Code that uses it should be reworked to use a separate session for each thread.
  * Moved `IMetaApplication.CreateSession()` to an extension method. Add `Encodo.Quino.App` to the using clauses to fix any compile errors.
  * Removed `IMetaApplication.DataProvider`; use `IMetaApplication.Configuration.DataProvider` instead. ([QNO-4604](https://secure.encodo.ch/jira/browse/QNO-4604))
  * The schema migration API has been completely overhauled. `ISchemaChange` and its descendants have been completely removed. `ISchemaAction` is no longer part of the external API, although it is still used internally. The `ISchemaChangeFactory` has been renamed to `ISchemaCommandFactory` and, instead of creating change objects, which are then applied directly, returns `ISchemaCommand` objects, which can be either executed or transformed in some other way. `IMigrateToolkit.GetActionFor()` has also been replaced with `CreateCommands()`, which mirrors the rest of the API by returning a sequence of commands to address a given `ISchemaDifference`. This release still has some commands that cannot be transformed to pure SQL, but the goal is to be able to generate pure SQL for a schema migration. ([QNO-993](https://secure.encodo.ch/jira/browse/QNO-993), [QNO-4579](https://secure.encodo.ch/jira/browse/QNO-4579), [QNO-4581](https://secure.encodo.ch/jira/browse/QNO-4581), [QNO-4588](https://secure.encodo.ch/jira/browse/QNO-4588), [QNO-4591](https://secure.encodo.ch/jira/browse/QNO-4591), [QNO-4594](https://secure.encodo.ch/jira/browse/QNO-4594))
  * `IMigrateSchemaAspect.Apply()` has been removed. All aspects will have to be updated to implement `GetCommands()` instead, or to use one of the available base classes, like `UpdateDataAspectBase` or `ConvertPropertyTypeSchemaAspect`. The following example shows how to use the `UpdateDataAspectBase` to customize migration for a renamed property.

internal class ArchivedMigrationAspect : UpdateDataAspectBase
{
  public ArchivedMigrationAspect()
    : base("ArchivedMigrationAspect", DifferenceType.RenamedProperty, ChangePhase.Instead)
  {
  }

  protected override void UpdateData(IMigrateContext context, ISchemaDifference difference)
  {
    using (var session = context.CreateSession(difference))
    {
      session.ChangeAndSaveAll(UpdateArchivedFlag);
    }
  }

  private void UpdateArchivedFlag(Project obj)
  {
    // ...
  }
}


The base aspects should cover most needs; if your functionality is completely customized, you can easily pass your previous implementation of `Apply()` to a `DelegateSchemaCommand` and return that from your implementation of `GetCommands()`. See the implementation of `UpdateDataAspectBase` for more examples. ([QNO-4580](https://secure.encodo.ch/jira/browse/QNO-4580))
  * `MetaObjectIdEqualityComparer<T>` can no longer be constructed directly. Instead, use `MetaObjectIdEqualityComparer<Project>.Default`.
  * Renamed `MetaClipboardControlDx.UpdateColorSkinaware()` to `MetaClipboardControlDx.UpdateSkinAwareColors()`.
  * `IMetaUnique.LogicalParent` has been moved to `IMetaBase`. Since `IMetaUnique` inherits from `IMetaBase`, it is unlikely that code is affected (unless reflection or some other direct means was used to reference the property). ([QNO-4586](https://secure.encodo.ch/jira/browse/QNO-4586))
  * `IUntypedMessage` has been removed; the `AssociatedObject` formerly found there has been moved to `IMessage`.
  * `ITypedMessage.AssociatedObject` has been renamed to `ITypedMessage.TypedAssociatedObject`. ([QNO-4647](https://secure.encodo.ch/jira/browse/QNO-4647))
  * Renamed `MetaObjectTools` to `MetaReadableTools`.
  * Redefined the protected methods `GenericObject.GetAsGuid()` and `GenericObject.GetAsGuidDefault` as extension methods in `MetaWritableTools`.
  * `IMetaFeedback.CreateGlobalContext()` has been removed. Instead the `IGlobalContext` is created using the service locator.

------------------------------------------------------------------------


[^1]: The schema migration creates a metadata model for your model -- meta-metadata -- and uses the Quino ORM to load data when importing a model from a database. If you aren't careful, as shown in the code example, then you'll attach your custom access control to the sessions created for the schema migration's data-access, which will more than likely fail when it tries to load user data from a table that does not exist in that model.
Schema migration in Quino 1.13

Quino is a metadata framework for .NET. It provides a means of defining an application-domain model in the form of metadata objects. Quino also provides many components and support libraries that work with that metadata to automate many services and functions. A few examples are an ORM, schema migration, automatically generated user interfaces and reporting tools.

The schema-migration tool

The component we're going to discuss is the automated schema-migration for databases. A question that recently came up with a customer was: what do all of the options mean in the console-based schema migrator?

Here's the menu you'll see in the console migrator:

Advanced Options
(1) Show migration plan
(2) Show significant mappings
(3) Show significant mappings with unique ids
(4) Show all mappings
(5) Show all mappings with unique ids

Main Options
(R) Refresh status
(M) Migrate database
(C) Cancel

The brief summary is:

  • The only action that actually makes changes is (M)
  • Option (1) is the only advanced option you will likely ever use; use it to show the changes that were detected

The other advanced options are more for debugging the migration recommendation if something looks wrong. In order to understand what that means, we need to know what the migrator actually does.

image

  1. Provide the application model as input
  2. Import a model from the database as input
  3. Generate a mapping between the two models
  4. Create a migration plan to update the database to reflect the application model
  5. Generate a list of commands that can be applied to the database to enact the plan
  6. Execute the commands against the database

The initial database-import and final command-generation parts of migration are very database-specific. The determination of differences is also partially database-specific (e.g. some databases do not allow certain features so there is no point in detecting a difference that cannot ever be repaired). The rest of the migration logic is database-independent.

Gathering data for migration

The migrator works with two models: the target model and a source model.

  • The target model is provided as part of the application and is usually loaded from a core assembly.
  • The source model is imported from the database schema by the "import handler".

Given these two models, the "mapping builder" creates a mapping. In the current implementation of Quino, there is no support for allowing the user to adjust the mapping before a migration plan is built from it. However, it would be possible to allow the user to verify and possibly adjust the mapping. Experience has shown that this is not necessary. Anytime we thought we needed to adjust the mapping, the problem was instead that the target model had been configured incorrectly. That is, each time we had an unexpected mapping, it led us directly to a misconfiguration in the model.

The options to show mappings are used to debug exactly such situations. Before we talk about mapping, though, we should talk about what we mean by "unique ids". Every schema-relevant bit of metadata in a Quino model is associated with a unique id, in the form of a Guid and called a "MetaId" in Quino.

Importing a model from a database

What happens when the import handler generates a model?

The importer runs in two phases:

  1. Extract the "raw model" from the database schema
  2. Enhance the "raw model" with data pulled from the application-specific Quino metadata table in the same database

A Quino application named "demo" will have the following schema:

  • All modeled tables are named "demo__*"
  • The metadata table is named "demometadata__elementdescription"

The migrator reads the following information into a "raw model":

  • Tables => MetaClasses
  • Fields/Columns => MetaProperties
  • Indexes => MetaIndexes
  • Foreign Keys => MetaPaths

If there is no further information in the database, then the mapper will have to use the raw model only. If, however, the database was created or is being maintained by Quino, then there is additional information stored in the metadata table mentioned above. The importer enhances the raw model with this information in order to improve mapping and difference-recognition. The metadata table contains all of the Quino modeling information that is not reflected in a standard database schema (e.g. the aforementioned MetaId).

The data available in this table is currently the following (a sketch of one row as a C# class appears after the list):

  • SchemaIdentifier: the identifier used in the raw model/database schema
  • Identifier: the actual identifier of the metadata element that corresponds to the element identified by the SchemaIdentifier
  • MetaId: the unique id for the metadata element
  • ObjectType: the type of metadata (one of: class, property, index, path, model)
  • ParentMetaId: the unique id of the metadata element that is the logical parent of this one; only allowed to be empty for elements with ObjectType equal to "model"
  • Data: Custom data associated with the element, as key/value pairs
  • DataVersion: Identifies the format type of the "Data" element (1.0.0.0 corresponds to CSV)
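To summarize the shape of that data, here is one row of the metadata table written out as a plain C# class; the member types are assumptions for illustration, not the actual Quino declarations.

using System;

// One row of the "*metadata__elementdescription" table described above.
public class ElementDescriptionRow
{
  public string SchemaIdentifier { get; set; } // identifier used in the raw model/database schema
  public string Identifier { get; set; }       // actual identifier of the corresponding metadata element
  public Guid MetaId { get; set; }             // unique id of the metadata element
  public string ObjectType { get; set; }       // one of: class, property, index, path, model
  public Guid? ParentMetaId { get; set; }      // may be empty only for ObjectType == "model"
  public string Data { get; set; }             // custom data as key/value pairs
  public string DataVersion { get; set; }      // format of Data ("1.0.0.0" corresponds to CSV)
}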

For each schema element in the raw model, the importer does the following:

  1. Looks up the data associated with that SchemaIdentifier and ObjectType (e.g. "punchclock__person" and "class")
  2. Updates the "Identifier"
  3. Sets the "MetaId"
  4. Loads the key/value pairs from the Data field and applies that data to the element

Generating a mapping

At this point, the imported model is ready and we can create a mapping between it and the application model. The imported model is called the source model while the application model is called the target model because we're migrating the "source" to match the "target".

We generate a mapping by iterating the target model, as sketched in code after this list:

  1. Find the corresponding schema element in the source model using MetaIds1
  2. If an element can be found, create a mapping for those two elements
  3. If no element can be found, create a mapping with the target element. This will cause the element to be created in the database.
  4. For all elements in the source model that have no corresponding element in the target model, create a mapping with only the source element. This will cause the element to be dropped from the database.
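The following is a self-contained sketch of that mapping pass. The types are minimal stand-ins (the real Quino mapping works on full metadata elements and richer mapping objects), but the logic follows the four steps above.

using System;
using System.Collections.Generic;
using System.Linq;

public class SchemaElement
{
  public Guid MetaId { get; set; }
  public string Identifier { get; set; }
}

public class Mapping
{
  public SchemaElement Source { get; set; } // null => the element will be created in the database
  public SchemaElement Target { get; set; } // null => the element will be dropped from the database
}

public static class MappingSketch
{
  public static List<Mapping> CreateMappings(
    IEnumerable<SchemaElement> sourceModel,
    IEnumerable<SchemaElement> targetModel)
  {
    var unmatchedSources = sourceModel.ToDictionary(e => e.MetaId);
    var result = new List<Mapping>();

    foreach (var target in targetModel)
    {
      SchemaElement source;

      if (unmatchedSources.TryGetValue(target.MetaId, out source))
      {
        // Found the corresponding source element; the pair will be compared for differences.
        unmatchedSources.Remove(target.MetaId);
      }

      result.Add(new Mapping { Source = source, Target = target });
    }

    // Source elements with no counterpart in the target model will be dropped.
    result.AddRange(
      unmatchedSources.Values.Select(s => new Mapping { Source = s, Target = null }));

    return result;
  }
}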

Creating a migration plan

The important decisions have already been made in the mapping phase. At this point, the migrator just generates a migration plan, which is a list of differences that must be addressed in order to update the database to match the target model.

  • If the mapping has a source and target element
    • Create a difference if the element has been renamed
    • Create a difference if the element has been altered (e.g. a property has a different type or is now nullable; an index has new properties or is no longer unique; etc.)
  • If the mapping has only a source, generate a difference that the element is unneeded and should be dropped.
  • If the mapping has only a target, generate a difference that the element is missing and should be created.

This is the plan that is shown to the user by the various migration tools available with Quino.2

The advanced console-migrator commands

At this point, we can now understand what the advanced console-migrator commands mean. Significant mappings are those mappings which correspond to a difference in the database (create, drop, rename or alter).

  • Show significant mappings: show significant mappings to see more detail about the names on each side
  • Show significant mappings with unique ids: same as above, but also include the MetaIds for each side. Use this to debug when you suspect that you might have copy/pasted a MetaId incorrectly or inadvertently moved one.
  • Show all mappings: Same detail level as the first option, but with all mappings, including those that are 100% matches
  • Show all mappings with unique ids: same as above, but with MetaIds

As already stated, the advanced options are really there to help a developer see why the migrator might be suggesting a change that doesn't correspond to expectations.

Generating commands for the plan

At this point, the migrator displays the list of differences that will be addressed by the migrator if the user chooses to proceed.

What happens when the user proceeds? The migrator generates database-specific commands that, when executed against the database, will modify the schema of the database.3

Commands are executed for different phases of the migration process. The phases are occasionally extended but currently comprise the following; a compact enumeration of the same phases follows the list.

  • Initialize: perform any required initialization before doing anything to the schema
  • DropConstraintsAndIndexes: drop all affected constraints and indexes that would otherwise prevent the desired modification of the elements involved in the migration.
  • AddUpdateOrRenameSchema: Create new tables, columns and indexes and perform any necessary renaming. The changes in this phase are non-destructive.
  • UpdateData: Perform any necessary data updates before any schema elements are removed. This is usually the phase in which custom application code is executed, to copy existing data from other tables and fields before they are dropped in the next phase. For example, if there is a new required 1--1 relation, the custom code might analyze the other data in the rows of that table to determine which value that row should have for the new foreign key.
  • DropSchema: Drop any unneeded schema elements and data
  • CreatePrimaryKeys: Create primary keys required by the schema. This includes both new primary keys as well as reestablishing primary keys that were temporarily dropped in the second phase.
  • CreateConstraintsAndIndexes: Create constraints and indexes required by the schema. This includes both new constraints and indexes as well as reestablishing constraints and indexes that were temporarily dropped in the second phase.
  • UpdateMetadata: Update the Quino-specific metadata table for the affected elements.
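Restated as a plain enumeration (the ordering mirrors the list above; the actual Quino type may be named differently):

// The migration phases in execution order, as described above.
public enum MigrationPhase
{
  Initialize,
  DropConstraintsAndIndexes,
  AddUpdateOrRenameSchema,
  UpdateData,
  DropSchema,
  CreatePrimaryKeys,
  CreateConstraintsAndIndexes,
  UpdateMetadata
}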

Executing the migration plan

The commands are then executed and the results logged.

Afterward, the schema is imported again, to verify that there are no differences between the target model and the database. In some (increasingly rare) cases, there will still be differences, in which case you can execute the new migration plan to repair those differences as well.

In development, this works remarkably well and often, without further intervention.

Fixing failed migrations

In some cases, there is data in the database that, while compatible with the current database schema, is incompatible with the updated schema. This usually happens when a new property or constraint is introduced. For example, a new required property is added that does not have a default value or a new unique index is added which existing data violates.

In these cases, there are two things that can be done:

  • Either the database data is cleaned up in a way that makes it compatible with the target schema4
  • Or the developer must add custom logic to the metadata elements involved. This usually means that the developer must set a default value on a property. In rarer cases, the developer must attach logic to the affected metadata (e.g. the property or index that is causing the issue) that runs during schema migration to create new data or copy it from elsewhere in order to ensure that constraints are satisfied when they are reestablished at the end of the migration.

In general, it's strongly advised to perform a migration against a replica of the true target database (e.g. a production database) in order to guarantee that all potential data situations have been anticipated with custom code, if necessary.

Quino Migration versus EF Migrations

It's important to point out that Quino's schema migration is considerably different from that employed by EF (which it picked up from Active Record migrations in Ruby, often used with Ruby on Rails). In those systems, the developer generates specific migrations to move from one model version to another. There is a clear notion of upgrading versus downgrading. Quino only recognizes migrating from an arbitrary model to another arbitrary model. This makes Quino's migration exceedingly friendly when moving between development branches, unlike EF, whose deficiencies in this area have been documented.



  1. The default is to use only MetaIds. There is a mode in which identifiers are used as a fallback but it is used only for tools that import schemas that were not generated by Quino. Again, if the Quino metadata table hasn't been damaged, this strict form of mapping will work extremely well.

  2. The Winform and Web user interfaces for Quino both include built-in feedback for interacting with the schema migration. There are also two standalone tools to migrate database schemas: a Winform application and a Windows console application.

  3. The form of these commands is currently a mix of SQL and custom C# code. A future feature of the migration will be to have all commands available as SQL text so that the commands, instead of being executed directly, could be saved as a file and reviewed and executed by DBAs instead of letting the tool do it. We're not quite there yet, but proceeding nicely.

  4. This is generally what a developer does with his or her local database. The data contained therein can usually be more or less re-generated. If there is a conflict during migration, a developer can determine whether custom code is necessary or can sometimes determine that the data situation that causes the problem isn't something that comes up in production anyway and just remove the offending elements or data until the schema migration succeeds.

Optimizing compilation and execution for dynamic languages

The long and very technical article Introducing the WebKit FTL JIT provides a fascinating and in-depth look at how a modern execution engine optimizes code for a highly dynamic language like JavaScript.

To make a long story short: the compiler(s) and execution engine optimize by profiling and analyzing code and lowering it to runtimes of ever decreasing abstraction to run as the least dynamic version possible.

A brief history lesson

What does it mean to "lower" code? A programming language has a given level of abstraction and expressiveness. Generally, the more expressive it is, the more abstracted it is from code that can actually be run in hardware. A compiler transforms or translates from one language to another.

When people started programming machines, they used punch cards. Punch cards did not require any compilation because the programmer was directly speaking the language that the computer understood.

The first layer of abstraction that most of us -- older programmers -- encountered was assembly language, or assembler. Assembly code still has a more-or-less one-to-one correspondence between instructions and machine-language codes but there is a bit of abstraction in that there are identifiers and op-codes that are more human-readable.

Procedural languages introduced more types of statements like loops and conditions. At the same time, the syntax was abstracted further from assembler and machine code to make it easier to express more complex concepts in a more understandable manner.

At this point, the assembler (which assembled instructions into machine op-codes) became a compiler, which "compiled" a set of instructions from the more abstract language. A compiler made decisions about how to translate these concepts, and could make optimization decisions based on registers, volatility and other settings.

In time, we'd graduated to functional, statically typed and/or object-oriented languages, with much higher levels of abstraction and much more sophisticated compilers.

Generally, a compiler still used assembly language as an intermediate format, which some may remember from their days working with C++ or Pascal compilers and debuggers. In fact, .NET languages are also compiled to IL -- the "Intermediate Language" -- which corresponds to the instruction set that the .NET runtime exposes. The runtime compiles IL to the underlying machine code for its processor, usually in a process called JIT -- Just-In-Time compilation. That is, in .NET, you start with C#, for example, which the compiler transforms to IL, which is, in turn, transformed to assembler and then machine code by the .NET runtime.

Static vs. Dynamic compilation

A compiler and execution engine for a statically typed language can make assumptions about the types of variables. The set of possible types is known in advance and types can be checked very quickly in cases where it's even necessary. That is, the statically typed nature of the language allows the compiler to reason about a given program without making assumptions. Certain features of a program can be proven to be true. A runtime for a statically typed language can often avoid type checks entirely. It benefits from a significant performance boost without sacrificing any runtime safety.

The main characteristic of a dynamic language like JavaScript is that variables do not have a fixed type. Generated code must be ready for any eventuality and must be capable of highly dynamic dispatch. The generated code is highly virtualized. Such a runtime will execute much more slowly than a comparable statically compiled program.

Profile-driven compilation

Enter the profile-driven compiler, introduced in WebKit. From the article,

The only a priori assumption about web content that our engine makes is that past execution frequency of individual functions is a good predictor for those functions' future execution frequency.

Here a "function" corresponds to a particular overload of a set of instructions called with parameters with a specific set of types. That is, suppose a JavaScript function is declared with one parameter and is called once with a string and 100 times with an integer. WebKit considers this to be two function overloads and will (possibly) elect to optimize the second one because it is called much more frequently. The first overload will still handle all possible types, including strings. In this way, all possible code paths are still possible, but the most heavily used paths are more highly optimized.

All of the performance is from the DFG's type inference and LLVM's low-level optimizing power. [...]

Profile-driven compilation implies that we might invoke an optimizing compiler while the function is running and we may want to transfer the function's execution into optimized code in the middle of a loop; to our knowledge the FTL is the first compiler to do on-stack-replacement for hot-loop transfer into LLVM-compiled code.

Depending on the level of optimization, the code contains the following broad sections:

  • Original: code that corresponds to instructions written by the author
  • Profiling: code to analyze which types actually appear in a given code path
  • Switching: code to determine when a function has been executed often enough to warrant further optimization
  • Bailout: code to abandon an optimization level if any of the assumptions made at that level no longer apply

image

While WebKit has included some form of profile-driven compilation for quite some time, the upcoming version is the first to carry the same optimization to LLVM-generated machine code.

I recommend reading the whole article if you're interested in more detail, such as how they avoided LLVM compiler performance issues and how they integrated this all with the garbage collector. It's really amazing how much of what we take for granted the WebKit JS runtime treats as "hot-swappable". The article is quite well-written and includes diagrams of the process and underlying systems.

Configure IIS for passing static-file requests to ASP.Net/MVC

At Encodo we had several ASP.Net MVC projects that needed to serve some files with a custom MVC Controller/Action. The general problem with this is that IIS tries hard to serve simple files like PDFs, pictures, etc. with its static-file handler, which is generally fine, but not for files (or rather file content) served by our own action.

The goal is to switch off the static-file handling of IIS for some paths. One of the current projects came up with the following requirements, so I did some research into how we can do this better than we did in past projects.

Requirements:

  1. Switch it off only for /Data/...
  2. Switch it off for ALL file types, as we don't yet know what kinds of files the authors will store there.

This means that the default static-file handling of IIS must be switched off by some "magic" IIS config. In other apps, we switched it off on a per-file-type basis for the entire application. I finally came up with the following IIS config (in web.config). It sets up a local configuration for the "data" location only. Then I used a simple "*" wild-card as the path (yes, this is possible) to transfer requests to ASP.Net. It looks like this:

<location path="data">
  <system.webServer>
    <handlers>
      <add name="nostaticfile" path="*" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
    </handlers>
  </system.webServer>
</location>

Alternative: instead of a controller, one could also use a custom HttpHandler to serve such special URLs/resources. In this project, I decided to use an action because of the central custom security, which I needed for the /Data/... requests as well and got for free by using an action instead of an HttpHandler.
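For completeness, here is a rough sketch of the kind of controller action that ends up serving the /Data/... requests once IIS hands them over to MVC. The controller name, route parameter and storage lookup are invented for illustration; the real project also applies its central security checks here.

using System.Web;
using System.Web.Mvc;

public class DataController : Controller
{
  // Assumes a catch-all route such as "data/{*path}" maps these requests to this action.
  public ActionResult Index(string path)
  {
    // The central, custom security checks for /Data/... requests would run here (or in a filter).
    byte[] content = LoadFileContent(path);

    if (content == null)
    {
      return HttpNotFound();
    }

    return File(content, MimeMapping.GetMimeMapping(path));
  }

  private byte[] LoadFileContent(string path)
  {
    // Placeholder: read the file content from wherever the authors store it
    // (database, blob storage, file share, etc.).
    return null;
  }
}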

iTunes: another tale of woe in UX

I know that pointing out errors in iTunes is a bit passé but Apple keeps releasing new versions of this thing without addressing the fundamental problems that it has as a synchronization client.

The software has to synchronize with hardware from only one manufacturer -- the same one that makes iTunes. I'll leave off complaints about the horrific, very old and utterly non-scaling UI and just regale you with a tale of a recent interaction in which I restored my phone from a backup. In that sense, it's a "user experience".

In this tale, we will see that two of the main features of the synchronization part of the iTunes software -- backup and sync -- seem to be utterly misinterpreted.

Spoiler alert: it all works out in the end, but it's mind-boggling that this is the state of Apple's main software after almost 15 years.1

10 million new iPhones were sold over the weekend. Their owners will all have the pleasure of working with this software.

Restore from backup

Me: attaches phone
iTunes: Restore from backup?
Me: Sure!
iTunes: shows almost full iPhone There you go!
Me: Thanks! That was fast!
Me: Wait...my phone is empty (no apps, no music, no contacts)
iTunes: blushes Yeah, about that...
Me: reconnects phone
iTunes: shows nearly empty iPhone What's the problem?
Me: Seriously, RESTORE FROM BACKUP (select EXACT SAME backup as before)
iTunes: On it! Sir, yes sir!
Me: OK. Apps are back; contacts are back. No music, iTunes? What part of the word "backup" is causing difficulties here?
iTunes: blushes (again) Ummm, dunno what happened there
Me: Fine. It was randomly selected anyway.
Me: Select random music from this playlist
iTunes: Here ya go!
Me: Sync
iTunes: Nothing to do
Me: Sync
iTunes: Seriously, dude, there's nothing to do
Me: SYNC
iTunes: Done
Me: No music on phone. Do you understand the word "sync" differently as well? You know, like how you have trouble with the word "backup"?
iTunes: ...
Me: notices that size of playlist exceeds capacity of iPhone
Me: that's 17GB of music. For a 16GB iPhone.
iTunes: Yep! Awesome, right?
Me: Is that why you won't sync?
iTunes: Error messages are gauche. I don't use them. Everything is intuitive.
Me: Fine. Reserve space when selecting music: 1GB (don't need more extra space than that)
iTunes: NP! Here's 15GB of music.
Me: Wait, what? You're supposed to leave 1GB empty of the available space not the total size of the device
iTunes: Math is hard. ... You do it.
Me: Fine. Reserve 4.2GB?
iTunes: Done.
Me: Now I have a 28GB playlist.
iTunes: pats self on back
Me: Reserve 3.2GB ... and "delete all existing" and "replace"? Now does it work?
iTunes: 9GB for you
Me: tweaks settings 2 or 3 more times
iTunes: 10.5GB
Me: Perfect. That was totally easy.
Me: Sync
iTunes: On it! hums to self
Me: Why are you only syncing 850 songs when the playlist has 1700 of them?
iTunes: continues humming
Me: Fine. wanders away
iTunes: Done
Me: Sync
iTunes: syncing 250 more songs
Me: What the hell?
iTunes: Done.
Me: Sync
iTunes: syncs remaining songs
Me: This is ridiculous
iTunes: Done



  1. It has been pointed out to me that I am using this software in a somewhat archaic way: to wit, I am not allowing iTunes to synchronize all of my data to the cloud first. Had I done that, it is claimed, I would have had fewer problems. I am, however, skeptical. I think that a company that can't even get local sync working properly after 15 years has no business getting any of my data.

An introduction to PowerShell

On Wednesday, August 27th, Tymon gave the rest of Encodo1 a great introduction to PowerShell. I've attached the presentation but a lot of the content was in demonstrations on the command-line.

  1. Download the presentation
  2. Unzip to a local folder
  3. Open index.html in a modern web browser (Chrome/Opera/Firefox work the best; IE has some rendering issues)

We learned a few very interesting things:

  • PowerShell is pre-installed on every modern Windows computer
  • You can use PowerShell to connect to other machines (almost like ssh!)
  • Windows developers should definitely learn how to use PowerShell.
  • Unix administrators who have to work on Windows machines should definitely learn how to use PowerShell. The underlying functionality of the operating system is much more discoverable via command line, get-command and get-member than the GUI.
  • You should definitely install ConEmu
  • When running ConEmu, make sure that you start a PowerShell session rather than the default Cmd session.
  • If you're writing scripts, you should definitely install and use the ISE, which is an IDE for PowerShell scripts with debugging, code-completion, lists of available commands and much better copy/paste than the standard console.
  • The PowerShell Language Reference v3 is a very useful and compact reference for beginners and even for more advanced users

ConEmu Setup

The easiest way to integrate PowerShell into your workflow is to make it eminently accessible by installing ConEmu. ConEmu is a Windows command-line with a tabbed interface and offers a tremendous number of power-user settings and features. You can tweak it to your heart's content.

image

I set mine up to look like the one that Tymon had in the demonstrations (shown on my desktop to the right).

  1. Download ConEmu; I installed version 140814, the most recent version marked as "beta". There is no official release yet, but the software is quite mature.
  2. Install it and run it. I didn't allow the Win + Num support because I know that I'd never use it. YMMV and you can always change your choice from the preferences.
  3. Show the settings to customize your installation. There are a ton of settings, so I listed the ones I changed below.
  4. Set the window size to something a bit larger than the standard settings, especially if you have a larger monitor. I use 120 x 40.
  5. Choose the color scheme you want to use. I'm using the standard PowerShell colors but a lot of popular, darker schemes are also available (e.g. Monokai).
  6. Check out the hotkeys and set them up accordingly. The only key I plan on using is the one to show ConEmu. On the Swiss-German keyboard, it's Ctrl + ¨.
  7. The default console is not transparent, but there are those of us who enjoy a bit of transparency. Again, YMMV. I turned it on and left the slider at the default setting.
  8. And, finally, you can turn on Quake-style console mode to make it drop down from the top of your primary monitor instead of appearing in a free-floating window.


  1. and one former Encodo employee -- hey Stephan!

ASP.Net MVC Areas

After some initial skepticism regarding Areas, I now use them more and more when building new Web-Applications using ASP.Net MVC. Therefore, I decided to cover some of my thoughts and experiences in a blog post so others may get some inspiration out of it.

Before we start, here's a link to a general introduction to the area feature of MVC. Check out this article if you are not yet familiar with Areas.

Furthermore, this topic is based on MVC 5 and C# 4 but may apply to older versions too, as Areas are not really a new thing and were first introduced with MVC 2.

Introduction

Areas are intended to structure an MVC web application. Let's say you're building a line-of-business application. You may want to separate your modules on one hand from each other and on the other hand from central pieces of your web application like layouts, HTML helpers, etc.

Therefore, an area should be pretty much self-contained and should have as little interaction with other areas as possible; otherwise, the dependencies between the modules and their implementations grow more and more, undercutting separation and resulting in less maintainability.

How one draws the borders of each area depends on the needs of the application and company. For example, modules can be separated by functionality or by developer/developer-team. In our line-of-business application, we may have an area for "Customer Management", one for "Order Entry", one for "Bookkeeping" and one for "E-Banking", as they have largely separate functionality and will likely be built by different developers or even teams.

The nice thing about areas -- besides modularization/organization of the app -- is that they are built into MVC and therefore are supported by most tools like Visual Studio, R# and most libraries. On the negative side, I count the lack of good support in the standard HtmlHelpers, as one needs to specify the area as a routing-object property and use the name of the area as a string. There is no dedicated parameter for the area. But to put that downside into perspective, this is only needed when creating a URL to an area other than the current one.

Application modularization using Areas

In my point of view, modularization using areas has two major advantages. The first one is the separation from other parts of the application and the second is the fact that area-related files are closer together in the Solution Explorer.

The separation -- apart from the usual separation advantages -- is helpful for reviews, as the reviewer can easily see what has changed within the area and what has changed at the core level of the application and therefore needs an even closer look. Another point of the separation is that, for larger applications and teams, it results in fewer merge conflicts when pushing to the central code repository, as each team has its own playground. Last but not least, it's nice for me as an application developer because I know that when I make changes only to my area, I will not break other developers' work.

As I am someone who uses the Solution Explorer a lot, I like the fact that with areas I normally have to scroll and search less and have a good overview of the folder- and file-tree of the feature-set I am currently working on. This happens because I move all area-related stuff into the area itself and leave the general libraries, layouts, base-classes and helpers outside. This results in a less cluttered folder-tree for my areas, where I normally spend the majority of my time developing new features.

Tips and tricks

  • Move all files related to the area into the area itself including style-sheets (CSS, LESS, SASS) and client-side scripts (Javascript, TypeScript).
  • Configure bundles for your area or even for single pages within your area in the area itself. I do this in an enhanced area-registration file.
  • Enhance the default area registration to configure more aspects of your area.
  • When generating links in global views/layouts, add the area="" routing attribute so the URL always points to the central stuff instead of being area-relative.

For example: if your application uses '@Html.ActionLink()' in your global _layout.cshtml, use:

@Html.ActionLink("Go to home", "Index", "Home", new { area = "" });

Area Registration / Start Up

And here is a sample of one of my application's area registrations:

public class BookkeepingAreaRegistration : AreaRegistration
{
  public override string AreaName
  {
    get { return "Bookkeeping"; }
  }

  public override void RegisterArea(AreaRegistrationContext context)
  {
    RegisterRoutes(context);
    RegisterBundles(BundleTable.Bundles);
  }

  private void RegisterRoutes(AreaRegistrationContext context)
  {
    if (context == null) { throw new ArgumentNullException("context"); }

    context.MapRoute(
      "Bookkeeping_default",
      "Bookkeeping/{controller}/{action}/{id}",
      new { controller = "Home", action = "Index", id = UrlParameter.Optional }
    );
  }

  private void RegisterBundles(BundleCollection bundles)
  {
    if (bundles == null) { throw new ArgumentNullException("bundles"); }

    // Bookings Bundles
    bundles.Add(new ScriptBundle("~/bundles/bookkeeping/booking")
      .Include("~/Areas/bookkeeping/Scripts/booking.js"));

    bundles.Add(new StyleBundle("~/bookkeeping/css/booking")
      .Include("~/Areas/bookkeeping/Content/booking.css"));

    // Account Overview Bundle
    ...
  }
}

As you can see in this example, I enhanced the area registration a little so that area-specific bundles are registered in the area registration too. I try to place all area-specific start-up code here.

Folder Structure

As I wrote in one of the tips, I strongly recommend storing all area-related files within the area's folder. This includes style-sheets, client-side scripts (JS, TypeScript), content, controller, views, view-models, view-model builders, HTML Helpers, etc. My goal is to make an area self-contained so all work can be done within the area's folder.

So the folder structure of my MVC Apps look something like this:

  • App_Start
  • Areas
    • Content
    • Controllers
    • Core
    • Models
      • Builders
    • Scripts
    • Views
  • bin
  • Content
  • Controllers
  • HtmlHelpers
  • Models
    • Builders
  • Scripts
  • Views

As you can see, each area is something like a mini-application within the main application.

Add-ons using Areas deployed as NuGet packages

Besides structuring an entire MVC application, another nice use of Areas is hosting add-ons in their own area.

For example, I recently wrote a web user interface for the database-schema migration of our metadata framework Quino. Instead of pushing it all into old-school web resources and deploying them as binaries, I built it as an area. I packed this area into a NuGet package (.nupkg) and published it to our local NuGet repo.

Applications that want to use the web-based schema-migration UI can just install that package using the NuGet UI or console. The package adds the area with all the sources I wrote and, because the area registration is called by MVC automatically, it's ready to go without any manual action required. If I publish an update to the NuGet package, applications can get it as usual with NuGet. A nice side effect of this deployment is that the web application contains all the sources, so developers can have a look at them if they like; it doesn't just include some binary files. Another nice thing is that the add-on can define its own bundles, which get hosted the same way as the MVC app's own bundles. No fancy web resources or custom bundling and minification are needed.

To keep conflicts to a minimum with such add-on areas, the name should be unique and the area should be self-contained as written above.

Is Encodo a .NET/C# company?

Encodo has never been about maintaining or establishing a monoculture in either operating system, programming language or IDE. Pragmatism drives our technology and environment choices.1

Choosing technology

Each project we work on has different requirements and we choose the tools and technologies that fit best. A good fit involves considering:

  • What exists in the project already?
  • How much work needs to be done?
  • What future directions could the project take?
  • How maintainable is the solution/are the technologies?
  • How appropriate are various technologies?
  • What do our developers know how to do best?
  • What do the developers who will maintain the project know best? What are they capable of?
  • Is there framework code available that would help?

History: Delphi and Java

When we started out in 2005, we'd also spent years writing frameworks and highly generic software. This kind of software is not really a product per se, but more of a highly configurable programmable "engine", in which other programmers would write their actual end-user applications.

A properly trained team can turn around products very quickly using this kind of approach. It is not without its issues, though: maintaining a framework involves a lot of work, especially producing documentation, examples and providing support. While this is very interesting work, it can be hard to make lucrative, so we decided to move away from this business and focus on creating individual products.

Still, we stuck to the programming environment and platform that we knew best2 (and that our customers were requesting): we developed software mostly in Delphi for projects that we already had.3 For new products, we chose Java.

Why did we choose Java as our "next" language? Simply because Java satisfied a lot of the requirements outlined above. We were moving into web development and found Delphi's offerings lacking, both in the IDE as well as the library support. So we moved on to using Eclipse with Jetty. We evaluated several common Java development libraries and settled on Hibernate for our ORM and Tapestry for our web framework (necessitating HiveMind as our IoC container).

History: .NET

A few years later, we were faced with the stark reality that developing web applications on Java (at the time) was fraught with issues, the worst of which was extremely slow development-turnaround times. We found ourselves incapable of properly estimating how long it would take to develop a project. We accept that this may have been our fault, of course, but the reality was that (1) we were having trouble making money programming Java and (2) we weren't having any fun anymore.

We'd landed a big project that would be deployed on both the web and Windows desktops, with an emphasis on the Windows desktop clients. At this point, we needed to reëvaluate: such a large project required a development language, runtime and IDE strong on the Windows Desktop. It also, in our view, necessitated a return to highly generic programming, which we'd moved away from for a while.

Our evaluation at the time included Groovy/Grails/Gtk, Python/Django/Gtk, Java/Swing/SWT/web frameworks, etc. We made the decision based on various factors (tools, platform suitability, etc.) and moved to .NET/C# for developing our metadata framework Quino, upon which we would build the array of applications required for this big project.

Today (2014)

We're still developing a lot of software in C# and .NET but also have a project that's built entirely in Python.4 On another project, we're not at all opposed to a customer's suggestion that we add services to their Java framework, because that's what fits best there.

We've had projects that run on a Linux/Mono stack on dedicated hardware. For one of those, we built a Linux-based build-server infrastructure that creates the embedded OS with our software in it.

Most of our infrastructure runs on Linux, with a few Windows VMs where needed to host or test software. We use PostgreSQL wherever we can and MS-SQL when the customer requires it.5

We've been doing a lot of web projects lately, which means the usual client-side mix of technology (JS/CSS/HTML). We use jQuery, but prefer Knockout for data-binding. We've evaluated the big libraries -- Angular, Backbone, Ember -- and found them to be too all-encompassing for our needs.

We've evaluated both Dart and TypeScript to see if those are useful yet. We've since moved to TypeScript for all of our projects but are still keeping an eye on Dart.

We use LESS instead of pure CSS. We've used SCSS as well, but prefer LESS. We're using Bootstrap in some projects but find it to be too restrictive, especially where we can use Flexbox for layout on modern browsers.

And, with the web comes development, support and testing for iOS and other mobile devices, which to some degree necessitates a move away from pure .NET/C# and toward a broader mix of technologies.

We constantly reëvaluate our tools, as well. We use JetBrains WebStorm instead of Visual Studio for some tasks: it's better at finding problems in JavaScript and LESS. We also use PhpStorm for our corporate web site, including these blogs. We used the Java-based Jenkins build server for years but moved to JetBrains TeamCity because it better supports the kind of projects we need to build.

Conclusion

The description above is meant to illustrate flexibility, not chaos. We are quite structured and, again, pragmatic in our approach.

Given the choice, we tend to work in .NET because we have the most experience and supporting frameworks and software for it. We use .NET/C# because it's the best choice for many of the projects we have, but we are most definitely not a pure Microsoft development shop.

I hope that gives you a better idea of Encodo's attitude toward software development.



  1. If it's not obvious, we employ the good kind of pragmatism, where we choose the best tool for the job and the situation, not the bad kind, founded in laziness and unwillingness to think about complex problems. Just so we're clear.

  2. Remo had spent most of his career working with Borland's offerings, whereas I had started out with Borland's Object Pascal before moving on to the first version of Delphi, then Microsoft C++ and MFC for many years. After that came the original version of ASP.NET with the "old" VB/VBScript and, finally, back to Delphi at Opus Software.

  3. We were actually developing on Windows using Delphi and then deploying on Linux, doing final debugging with Borland's Linux IDE, Kylix. The software to be deployed on Linux was headless, which made it much easier to write cross-platform code.

  4. For better or worse: we inherited a Windows GUI in Python, which is not very practical, but I digress.

  5. Which is almost always, unfortunately.

Should you return `null` or an empty list?

I've seen a bunch of articles addressing this topic of late, so I've decided to weigh in.

The reason we frown on returning null from a method that returns a list or sequence is that we want to be able to use these sequences or lists freely in a functional manner.

It seems to me that the proponents of "no nulls" are generally those who have a functional language at their disposal and the antagonists do not. In functional languages, we almost always return sequences instead of lists or arrays.

In C# and other languages with functional constructs, we want to be able to do this:

var names = GetOpenItems()
  .Where(i => i.OverdueByTwoWeeks)
  .SelectMany(i => i.GetHistoricalAssignees()
    .Select(a => new { a.FirstName, a.LastName })
  );

foreach (var name in names)
{
  Console.WriteLine("{1}, {0}", name.FirstName, name.LastName);
}

If either GetHistoricalAssignees() or GetOpenItems() might return null, then we'd have to write the code above as follows instead:

var openItems = GetOpenItems();
if (openItems != null)
{
  var names = openItems
    .Where(i => i.OverdueByTwoWeeks)
    .SelectMany(i => (i.GetHistoricalAssignees() ?? Enumerable.Empty<Person>())
      .Select(a => new { a.FirstName, a.LastName })
    );

  foreach (var name in names)
  {
    Console.WriteLine("{1}, {0}", name.FirstName, name.LastName);
  }
}

This seems like exactly the kind of code we'd like to avoid writing, if possible. It's also the kind of code that calling clients are unlikely to write, which will lead to crashes with NullReferenceExceptions. As we'll see below, there are people who seem to think that's perfectly OK. I am not one of those people, but I digress.

The post Is it Really Better to 'Return an Empty List Instead of null'? / Part 1 by Christian Neumanns serves as a good example of an article that looks like it's providing information but mostly just asks to be accepted as a source of genuine information. He introduces his topic with the following vagueness.

If we read through related questions in Stackoverflow and other forums, we can see that not all people agree. There are many different, sometimes truly opposite opinions. For example, the top rated answer in the Stackoverflow question Should functions return null or an empty object? (related to objects in general, not specifically to lists) tells us exactly the opposite:

Returning null is usually the best idea ...

The statement "we can see that not all people agree" is a tautology. I would split the people into groups of those whose opinions we should care about and everyone else. The statement "There are many different, sometimes truly opposite opinions" is also tautological, given the nature of the matter under discussion -- namely, a question that can only be answered as "yes" or "no". Such questions generally result in two camps with diametrically opposed opinions.

As the extremely long-winded pair of articles points out: sometimes you can't be sure of what an external API will return. That's correct. You have to protect against those cases with ugly, defensive code. But don't use that as an excuse to produce even more methods that may return null. Otherwise, you're just part of the problem.

The second article Is it Really Better to 'Return an Empty List Instead of null'? - Part 2 by Christian Neumanns includes many more examples.

I just don't know what to say about people who write things like "Bugs that cause NullPointerExceptions are usually easy to debug because the cause and effect are short-distanced in space (i.e. location in source code) and time." While this is kind of true, it's even more true that you can't tell the difference between an exception caused by a savvy programmer who's using it to his advantage and one caused by a non-savvy programmer whose code is buggy as hell.

He has a ton of examples that try to distinguish a method that returns an empty sequence from a method that cannot properly answer the question at all. This is a legitimate concern and a very real distinction to make, but the answer is not to return null to indicate nonsensical input. The answer is to throw an exception.

The method providing the sequence should not be making decisions about whether an empty sequence is acceptable for the caller. For sequences that cannot logically be empty, the method should throw an exception instead of returning null to indicate "something went wrong".

A caller may impart semantic meaning to an empty result and also throw an exception (as in his example with a cycling team that has no members). If the display of such a sequence on a web page is incorrect, then that is the fault of the caller, not of the provider of the sequence.

  • If data is not yet available, but should be, throw an exception.
  • If data is not available but the provider isn't qualified to decide whether that's a problem, return an empty sequence.
  • If the caller receives an empty sequence and knows that it should not be empty, then it is responsible for indicating an error (see the sketch below).
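Here is a minimal sketch of these guidelines in C#; the Team and Person types, the Members property and the business rule are all hypothetical:

public static IEnumerable<Person> GetTeamMembers(Team team)
{
  if (team == null)
  {
    // Nonsensical input: fail loudly instead of returning null.
    throw new ArgumentNullException("team");
  }

  // The provider can't know whether an empty team is an error for the caller,
  // so it returns an empty sequence rather than null.
  return team.Members ?? Enumerable.Empty<Person>();
}

The caller, which does know its own business rules, reacts accordingly:

var members = GetTeamMembers(team).ToList();
if (!members.Any())
{
  throw new InvalidOperationException("A cycling team must have at least one member.");
}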

That there exists calling code that makes assumptions about return values that are incorrect is no reason to start returning values that will make calling code crash with a NullPointerException.

All of his examples are similar: he tries to make the pure-data call to retrieve a sequence of elements simultaneously validate some business logic. That's not a good idea. If this is really necessary, then the validity check should go in another method.

The example he cites for getting the amount from a list of PriceComponents is exactly why most aggregation functions in .NET throw an exception when the input sequence is empty. But that's a much better way of handling it -- with a precise exception -- than by returning null to try to force an exception somewhere in the calling code.
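For example, Enumerable.Average() is one of the aggregations that throws on an empty input (the PriceComponent type and its Amount property are hypothetical here):

var components = Enumerable.Empty<PriceComponent>();

// Throws InvalidOperationException ("Sequence contains no elements")
// instead of silently returning a meaningless value.
var average = components.Average(c => c.Amount);

// If an empty input is legitimate, the caller has to say so explicitly.
var safeAverage = components.Select(c => c.Amount).DefaultIfEmpty(0m).Average();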

But the upshot for me is: I am not going to write code that, when I call it, forces me to litter other code with null-checks. That's just ridiculous.