Quino Data Driver architecture, Part II: Command types & inputs

In part I, we discussed applications -- which provide the model and data provider -- and sessions -- which encapsulate high-level data context.

In this article, we're going to take a look at the command types & inputs.

  1. Applications & Sessions
  2. Command types & inputs1
  3. The Data Pipeline
  4. Builders & Commands
  5. Contexts and Connections
  6. Sessions, resources & objects

Overview

image

Before we can discuss how the pipeline processes a given command, we should discuss what kinds of commands the data driver supports and what kind of inputs the caller can pass to it. As you can well imagine, the data driver can be used for CRUD -- to create, read, update and delete data -- and also to refresh it.

In the top-right corner of the diagram to the right, you can see that the only input to the pipeline is an IDataCommandContext. This object comprises the inputs provided by the caller as well as command-specific state used throughout the driver for the duration of the command.

Command types

A caller initiates a command with either a query or an object graph, depending on the type of command. The following commands and inputs are supported:

  • Load: returns a cursor for the objects that match a query
  • Count: returns the number of objects that match a query
  • Save: saves an object graph
  • Reload: refreshes the data in an object graph
  • Delete: deletes an object graph or the objects that match a query
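
Using the session API from part I, these commands surface roughly as in the sketch below. GetList, Delete and Save appear in part I's example, while the GetCount and Reload calls are assumed names, shown only to illustrate the remaining command types.

using (var session = application.CreateSession())
{
  // Load: returns the objects that match a query
  var people = session.GetList<Person>();

  // Delete: accepts an object graph (or a query)
  session.Delete(people);

  // Save: persists an object graph
  session.Save(new Person { FirstName = "bob", LastName = "doe" });

  // Count and Reload are exposed through similar calls (names assumed):
  // var total = session.GetCount<Person>();
  // session.Reload(somePerson);
}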

Queries

A query includes information about the data to return (or delete).

  • Metadata: The meta-class represents the type of the root object for the command. For example, a "person" or "company".
  • Filtering: Filters restrict the objects to return. A filter can address properties of the root object, but also properties of objects related to the root object. A caller can query for people whose first names start with the letter "m" -- FirstName %~ 'm'2 -- or the caller can find all people who belong to a company whose name starts with the letter "e" -- Company.Name %~ 'e'. The context for these expressions is naturally the meta-class mentioned above. Additionally, the metadata/model can also define default filters to include.
  • Ordering: Orderings determine the order in which the data is returned. Orderings are also specified with the expression language, but are usually simpler, like ordering first by LastName and then by FirstName. More complex expressions are supported -- for example, you could use the expression "{LastName}, {FirstName}", which sorts by a formatted string3 -- but be aware that many data stores have limited support for complex expressions in orderings. Orderings are ignored in a query when used to delete objects. The sketch below shows how filters and orderings come together in a query.
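
Here is that sketch, based on the query API from part I: WhereEquals and Person.Fields.FirstName appear in that example, while the commented-out calls are hypothetical stand-ins for the begins-with operator and the orderings described above.

using (var session = application.CreateSession())
{
  var people = session.GetList<Person>();

  // Filtering: an exact-match filter using the API from part I.
  people.Query.WhereEquals(Person.Fields.FirstName, "john");

  // Hypothetical calls standing in for the expression language described above:
  // people.Query.Where("FirstName %~ 'm'");         // case-insensitive begins-with
  // people.Query.OrderBy(Person.Fields.LastName);   // order by LastName, then FirstName
  // people.Query.OrderBy(Person.Fields.FirstName);
}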

Queries are a pretty big topic and we've only really scratched the surface so far. Quino has its own query language -- QQL -- the specification for which weighs in at over 80 pages, but that's a topic for another day.

Object graphs

An object graph consists of a sequence of root objects and the sub-objects available along relations defined in the metadata.

It's actually simpler than it perhaps sounds.

Let's use the example above: a person is related to a single company, so the graph of a single person will include the company as well (if the object is loaded and/or assigned). Additionally, the company defines a relation that describes the list of people that belong to it. The person=>company relationship is complementary to the company=>person relationship. We call person=>company a 1-1 relation, while company=>person is a 1-n relation.

The following code creates two new companies, assigns them to three people and saves everything at once.

var encodo = new Company { Name = "Encodo Systems AG" };
var other = new Company { Name = "Not Encodo" };
var people = new [] 
{
  new Person { FirstName = "John", LastName = "Doe", Company = other },
  new Person { FirstName = "Bob", LastName = "Smith", Company = encodo },
  new Person { FirstName = "Ted", LastName = "Jones", Company = encodo }
};

Session.Save(people);

The variable people above is an object graph. The variables encodo and other are also object graphs, but they cover only parts of the first one. From people, a caller can look up people[0].Company, which is other. The graph contains cycles, so people[0].Company.People[0].Company is also other. From encodo, the caller can get to other people in the same company, but not to people in the other company: for example, encodo.People[0] gets "Bob Smith" and encodo.People[0].Company.People[1] gets "Ted Jones".
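
Expressed as code, the navigation described above looks like this (the People collection property on Company is implied by the company=>person relation):

// Starting from the "people" roots:
var johnsCompany = people[0].Company;                    // "Not Encodo"
var sameCompany = people[0].Company.People[0].Company;   // the cycle leads back to "Not Encodo"

// Starting from "encodo", only people in that company are reachable:
var bob = encodo.People[0];                              // "Bob Smith"
var ted = encodo.People[0].Company.People[1];            // "Ted Jones"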

As with queries, object graphs are a big topic and are strongly bound to the kind of metadata available in Quino. Another topic for another day.

Determining Inputs

Phew. We're almost to the point where we can create an IDataCommandContext to send into the data pipeline.

  • We have an IDataSession and know why we need it
  • We know what type of command we want to execute (e.g. "Load")
  • We have either a query or an object graph

With those inputs, Quino has all it needs from the caller. A glance at the top-left corner of the diagram above shows us that Quino will determine an IMetaClass and an IMetaObjectHandler from these inputs and then use them to build the IDataCommandContext.

An IQuery has a MetaClass property, so that's easy. With the meta-class and the requested type of object, the data driver checks a list of registered object-handlers and uses the first one that says it supports that type. If the input is an object graph, though, the object-handler is determined first and then the meta-class is obtained from the object-handler using a root object from the graph.
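
In rough, illustrative C# -- objectHandlers stands for the list of registered object-handlers, and the members SupportsType and GetMetaClass are assumed names, not the actual Quino API -- the resolution described above looks something like this:

// Query input: the meta-class is available directly on the query.
var metaClassFromQuery = query.MetaClass;
var handlerForQuery = objectHandlers.First(h => h.SupportsType(requestedObjectType));

// Object-graph input: determine the object-handler first, then ask it
// for the meta-class of a root object from the graph.
var handlerForGraph = objectHandlers.First(h => h.SupportsType(rootObject.GetType()));
var metaClassFromGraph = handlerForGraph.GetMetaClass(rootObject);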

Most objects will inherit from GenericObject which implements the IPersistable interface required by the standard object handler. However, an application is free to implement an object handler for other base classes -- or no base class at all, using reflection to get/set values on POCOs. That is, however, an exercise left up to the reader.

At this point, we have all of our inputs and can create the IDataCommandContext.

In the next part, we'll take a look at the "Data Pipeline" through which this command context travels.



  1. You'll notice, perhaps, that this topic is new to this article. I'm expanding the series as I go along, trying to provide enough information to understand the process while keeping the individual blog entries to a digestible size.

  2. "%~" is actually the case-insensitive begins-with operator. You can find out more about comparison operators in the Quino documentation. Browse to "Encodo Base Library" and then "Expressions".

  3. For more information on how to use Quino's unique take on interpolated strings, see the documentation in the footnote above.

Quino Data Driver architecture, Part I: Applications & Sessions

One part of Quino that has undergone quite a few changes in the last few versions is the data driver. The data driver is responsible for CRUD: create, read, update and delete operations. One part of this is the ORM -- the object-relational mapper -- that marshals data to and from relational databases like PostgreSql, SQL Server and SQLite.

We're going to cover a few topics in this series:

  1. Applications & Sessions
  2. The Data Pipeline
  3. Builders & Commands
  4. Contexts and Connections
  5. Sessions, resources & objects

But first let's take a look at an example to anchor our investigation.

Introduction

An application makes a request to the data driver using commands like Save() to save data and GetObject() or GetList() to get data. How are these high-level commands executed? Quino does an excellent job of shielding applications from the details but it's still very interesting to know how this is achieved.

The following code snippet retrieves some data, deletes part of it and saves a new version.

using (var session = application.CreateSession())
{
  var people = session.GetList<Person>();
  people.Query.WhereEquals(Person.Fields.FirstName, "john");
  session.Delete(people);
  session.Save(new Person { FirstName = "bob", LastName = "doe" });
}

In this series, we're going to answer the following questions...and probably many more.

  • Where does the data come from?
  • What kind of sources are supported? How?
  • Is at least some of the data cached?
  • Can I influence the cache?
  • What is a session? Why do I need one?
  • Wait...what is the application?

Let's tackle the last two questions first.

Application

The application defines common configuration information. The most important bits for the ORM are as follows:

  • Model: The model is the central part of any Quino application. The model defines entities, their properties, relationships between entities and so on. Looking at the example above, the model will include a definition for a Person, which has at least the two properties LastName and FirstName. There is probably an entity named Company as well, with a one-to-many relationship to Person. As you can imagine, Quino uses this information to formulate requests to data stores that contain data in this format.1 For drivers that support it, Quino also uses this information in order to create that underlying data schema.2
  • DataProvider: The data provider encapsulates all of the logic and components needed to map the model to data sources. This is the part of the process on which this series will concentrate.
  • ConfigurationData: The configuration data describes which parts of the model are connected to which parts of the data provider. The default is, of course, that the entire model is mapped to a single data source. However, even in that case, the configuration indicates which data source: Sql Server? PostgreSql? A remote application server (2nd tier)? With a high-level API as described above, all of these decisions can be made in the configuration rather than assumed throughout the application. Yes, this means that you can change your Quino application from a two-tier to a three-tier application with a single configuration change.

Sessions

So that's the application. There is a single shared application for a process.

But in any non-trivial application -- and any non-desktop application -- we will have multiple data requests running, possibly in different threads of execution.

  • Each request in a web application is a separate data context. Changes made in one request should not affect any other request. Each request may be authenticated as a different user.
  • A remote application-server is very similar to a web application. It handles requests from multiple users. Since it's generally the second layer, it will most likely have direct connections to one or more databases. In this case, it will probably be in charge of executing business logic, most likely in a database transaction. In that case, we definitely don't want one request using the transaction context from another request.
  • Even a non-web client-side application may want to execute some logic in the background or in a separate thread. In those cases, we probably want to keep the data used there separate from the data or objects used to render the other parts of the application.

That's where sessions come in. The session encapsulates a data context, which contains the following information:

  • Application: The application will, as described above, tell the session which model and data provider to use.
  • Current user: For those familiar with ASP.NET, this is very similar to the HttpContext.Current.User but generalized to be available in any Quino application. All data requests over a session are made in the context of this user.
  • Access control: The access control provides information about the security context of an application. An application generally uses the access control to perform authorization checks.
  • Cache: Each session also has its own cache. There are global caches, but those are for immutable data. The session's cache is always available, even when using transactions.
  • ConnectionManager: Many external data sources have transactable/shared state in the form of a connection. As with data, connections can sometimes be shared between sessions and sometimes they can't. The connection manager takes care of knowing all of that for you.
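
Pulling these pieces together, here's a small sketch that uses only members appearing elsewhere in this series (Application, Model, AccessControl.CurrentUser and GetList):

using (var session = application.CreateSession())
{
  var model = session.Application.Model;          // the shared application model
  var user = session.AccessControl.CurrentUser;   // the user for this data context

  // All data requests made with this session run in the context of that user
  // and use this session's cache and connections.
  var people = session.GetList<Person>();
}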

If we go back to the original code sample, we now know that creating a new session with CreateSession() creates a new data context, with its own user and its own data cache. Since we didn't pass in any credentials, the session uses the default credentials for the application.3 All data access made on that session is nicely shielded and protected from any data access made in other sessions (where necessary, of course).

So now we're no closer to knowing how Quino works with data on our behalf, but we've taken the first step: we know all about one of the main inputs to the data driver, the session.

In the next part, we'll cover the topic "The Data Pipeline".


  1. The domain model is used for everything in a Quino application -- not just for the ORM and schema migration. We use the model to generate C# code like concrete ORM objects, metadata references (e.g. the Person.Fields.FirstName in the example), view models, DTOs or even client-side TypeScript definitions. We also use the model to generate user interfaces -- both entire desktop-application interfaces and HTML helpers for building MVC views.

  2. See the article Schema migration in Quino 1.13 for more information on how that works.

  3. This is code that you might use in a single-user application. In a server application, you would most likely just use the session that was created for your request by Quino. If an application wants to create a new session, but using the same user as an existing session, it would call:

var requestCredentials = requestSession.AccessControl.CurrentUser.CreateCredentials();
using (var session = application.CreateSession(requestCredentials))
{
  // Work with session
}

Are you ready for ReSharper 9? Not for testing, you aren't.

We've been using ReSharper at Encodo since version 4. And we regularly use a ton of other software from JetBrains1 -- so we're big fans.

How to Upgrade R#

As long-time users of ReSharper, we've become accustomed to the following pattern of adoption for new major versions:

EAP

  1. Read about cool new features and improvements on the JetBrains blog
  2. Check out the EAP builds page
  3. Wait for star ratings to get higher than 2 out of 5
  4. Install EAP of next major version
  5. Run into issues/problems that make testing EAP more trouble than it's worth
  6. Re-install previous major version

RTM

  1. Major version goes RTM
  2. Install immediately; new features! Yay!
  3. Experience teething problems in x.0 version
  4. Go through hope/disappointment cycle for a couple of bug-fix versions (e.g. x.0.1, x.0.2)
  5. Install first minor-version release immediately; stability! Yay!

This process can take anywhere from several weeks to a couple of months. The reason we do it almost every time is that the newest version of ReSharper almost always has a few killer features. For example, version 8 had initial TypeScript support. Version 9 carries with it a slew of support improvements for Gulp, TypeScript and other web technologies.

Unfortunately, if you need to continue to use the test-runner with C#, you're in for a bumpy ride.

History of the Test Runner

Any new major version of ReSharper can be judged by its test runner. The test runner seems to be rewritten from the ground-up in every major version. Until the test runner has settled down, we can't really use that version of ReSharper for C# development.

The 6.x and 7.x versions were terrible at handling the NUnit TestCase and Values attributes. They were so bad that we actually converted tests away from using those attributes. While 6.x had trouble reliably compiling and executing those tests, 7.x was better at noticing that something had changed without forcing the user to manually rebuild everything.

Unfortunately, this new awareness in 7.x came at a cost: it slowed editing in larger NUnit fixtures down to a crawl, using a tremendous amount of memory and sending VS into a 1.6GB+ memory-churn that made you want to tear your hair out.

8.x fixed all of this and, by 8.2.x, was a model of stability and usefulness, getting the hell out of the way and reliably compiling, displaying and running tests.

The 9.x Test Runner

And then along came 9.x, with a whole slew of sexy new features that just had to be installed. I tried the new features and they were good. They were fast. I was looking forward to using the snazzy new editor to create our own formatting template. ReSharper seemed to be using less memory, felt snappier, it was lovely.

And then I launched the test runner.

And then I uninstalled 9.x and reinstalled 8.x.

And then I needed the latest version of DotMemory and was forced to reinstall 9.x. So I tried the test runner again, which inspired this post.2

So what's not to love about the test runner? It's faster and seems much more asynchronous. However, it gets quite confused about which tests to run, how to handle test cases and how to handle abstract unit-test base classes.

Just like 6.x, ReSharper 9.x can't seem to keep track of which assemblies need to be built based on changes made to the code and which test(s) the user would like to run.

imageimage

To be fair, we have some abstract base classes in our unit fixtures. For example, we define all ORM query tests in multiple abstract test-fixtures and then create concrete descendants that run those tests for each of our supported databases. If I make a change to a common assembly and run the tests for PostgreSql, then I expect -- at the very least -- that the base assembly and the PostgreSql test assemblies will be rebuilt. 9.x isn't so good at that yet, forcing you to "Rebuild All" -- something I no longer had to do with 8.2.x.

TestCases and the Unit Test Explorer

It's the same with TestCases: whereas 8.x was able to reliably show changes and to make sure that the latest version was run, 9.x suffers from the same issue that 6.x and 7.x had: sometimes the test is shown as a single node without children and sometimes it's shown with the wrong children. Running these tests results in a spinning cursor that never ends. You have to manually abort the test-run, rebuild all, reload the runner with the newly generated tests from the explorer and try again. This is a gigantic pain in the ass compared to 8.x, which just showed the right tests -- if not in the runner, then at least very reliably in the explorer.

imageimage

And the explorer in 9.x! It's a hyperactive, overly sensitive, eager-to-please puppy that reloads, refreshes, expands nodes and scrolls around -- all seemingly with a mind of its own! Tests wink in and out of existence, groups expand seemingly at random, the scrollbar extends and extends and extends to accommodate all of the wonderful things that the unit-test explorer wants you to see -- needs for you to see. Again, it's possible that this is due to our abstract test fixtures, but this is new to 9.x. 8.2.x is perfectly capable of displaying our tests in a far less effusive and frankly hyperactive manner.

One last thing: output-formatting

Even the output formatting has changed in 9.x, expanding all CR/LF pairs from single-spacing to double-spacing. It's not a deal-breaker, but it's annoying: copying text is harder, reading stack traces is harder. How could no one have noticed this in testing?

image

Conclusion

The install/uninstall process is painless and supports jumping back and forth between versions quite well, so I'll keep trying new versions of 9.x until the test runner is as good as the one in 8.2.x. For now, I'm back on 8.2.3. Stay tuned.



  1. In no particular order, we have used or are using:

    * DotMemory
    * DotTrace
    * DotPeek
    * DotCover
    * TeamCity
    * PHPStorm
    * WebStorm
    * PyCharm
    

  2. Although I was unable to install DotMemory without upgrading to ReSharper 9.x, I was able to uninstall ReSharper 9.x afterwards and re-install ReSharper 8.x.

Who's using up my entire SSD?

Hard drives => SSDs

image

In the old days, we cleaned up our hard drives because we didn't have enough space for all of our stuff. Our operating systems, applications and caches took up a reasonable portion of that hard drive.

Then we had gigantic hard drives with more than enough space for everything. Operating systems, applications and caches grew. Parsimonious software was no longer in vogue because it was a waste of time and money.

SSDs replaced hard drives, improving speeds drastically and ushering in a new era in performance. This did not come without cost, though. SSDs were much more expensive to make, so the affordable ones were necessarily much smaller than our existing hard drives. Our operating systems, applications and caches have not made the adjustment, though, at least not on Windows.

We are left with drives 70-80% smaller than the ones we had a couple of years ago -- 256GB vs. 1TB. Developers, in particular, tend to have software that uses space indiscriminately.

Drive space: critical

I recently noticed that my system drive had filled up to almost 80%, so I took a little time to do something about it. I downloaded TreeSize Free from Jam Software to get an idea of which folders took up the most space. I also referred to Guide to Freeing up Disk Space under Windows 8.1 by Scott Hanselman: there are a lot of great tips in there.

Without further ado, here are the locations that struck me as being "space hogs" -- locations that were large but didn't seem to offer much utility or seemed to be logs, caches or backups.

C:\Windows\Installer

image

This folder is almost 22GB on my machine. It seems to contain MS installers, updates, service packs and hot-fixes. There are a few tips online -- some from Microsoft -- on how to clean up this folder. Even after running a couple of them, I didn't notice a significant difference in size. I didn't spend a lot of time here, but cleaning up this folder would yield significant savings.

SQL Server

There were several gigabytes -- I had 2.8GB -- of older versions and installers in the main SQL Server folder, located at /Program Files/Microsoft SQL Server/110/Setup Bootstrap. If you have large databases, consider moving them to another drive or location and setting the default data directory to somewhere other than the Program Files directory on the system drive.

Miro

I use this player for podcasts. It stores almost 1GB in something called the "icon cache", located at /Users/<username>/Roaming/Participatory Culture Foundation/Miro/icon-cache.

SmartGit

SmartGit updates itself automatically now and they have very regular builds and updates, especially if you use preview releases. It never seems to delete these updates, instead retaining them in /Users/<username>/Roaming/syntevo/SmartGit/updates.

TimeSnapper

I use this to keep track of my day, referring to it to fill out my timesheet. Screen captures are located in /Users/<username>/Local/TimeSnapper/Snapshots. The default settings are to capture 100%-quality PNG files for all monitors every ten seconds. I have two large monitors and the default 5GB cache fills up in less than a day. This is not very helpful and wastes a lot of space. Instead, I recommend these settings:

  • File Type: JPG
  • Resolution: 50%
  • Quality: 50%
  • Interval: 60 seconds
  • Remove images older than: (not set)
  • Maximum allowed space: 1000MB

Sandcastle

If you build XML documentation locally, you might have a sizable cache left over from the last build. I had over 800MB in the \Users\<username>\AppData\Local\EWSoftware\Sandcastle Help File Builder\Cache directory.

Java

image

Java also likes to update itself regularly and never throws away its older versions. Unless you know that you absolutely need a specific version, you can throw away the older versions found in C:\Users\marco\AppData\LocalLow\Sun\Java.

GhostDoc

The Visual Studio documentation extension keeps quite an extensive cache in the \Users\<username>\AppData\Local\SubMain\Cache directory.

JetBrains

image

This is another company that squirrels away all of its installers for its various products -- I use DotPeek, DotCover, DotTrace, ReSharper, PhpStorm and 0xDBE -- in this folder \Users\marco\AppData\Local\JetBrains. Feel free to throw away old installations and installers.

MSOCache

This mysterious folder located at the root of the system drive has been around since time immemorial. It appears to be 0 bytes when examined with a standard user. When you run TreeSize in administrator mode, though, you'll see that it's 2.4GB of ... stuff. This stuff is apparently installers for all of the office products that you have installed on your machine. They are cached in this folder in order to avoid requesting installation media if Office decides to install something on-the-fly. That's right: if you elect not to install certain features to avoid wasting drive space, Office obliges by putting all of the stuff you didn't install into a 2.5GB directory that you can't delete. Documentation is spotty, but this article claims that you can remove it by using the standard disk-cleanup tool.

This list is meant to show where space is being wasted on a Windows developer machine. I wasn't able to find a way to remove all of these, but I cleaned up those that were quick to clean up.

If you're really tight on space, you can turn off hibernation -- which uses 13GB on my machine -- or reduce the size of the page file -- which is 6GB on my machine. And, as mentioned above, Scott Hanselman's guide is quite helpful.

The Road to Quino 2.0: Maintaining architecture with NDepend (part II)

In the previous article, I explained how we were using NDepend to clean up dependencies and the architecture of our Quino framework. You have to start somewhere, so I started with the two base assemblies: Quino and Encodo. Encodo only has dependencies on standard .NET assemblies, so let's start with that one.

The first step in cleaning up the Encodo assembly is to remove dependencies on the Tools namespace. There seems to be some confusion as to what belongs in the Core namespace versus what belongs in the Tools namespace.

There are too many low-level classes and helpers in the Tools namespace. Just as a few examples, I moved the following classes from Tools to Core:

  • BitTools
  • ByteTools
  • StringTools
  • EnumerableTools

The names kind of speak for themselves: these classes clearly belong in a core component and not in a general collection of tools.

Now, how did I decide which elements to move to core? NDepend helped me visualize which classes are interdependent.

Direct Dependencies

image

We see that EnumerableTools depends on StringTools. I'd just moved EnumerableTools to Encodo.Core to reduce dependence on Encodo.Tools. However, since StringTools is still in the Tools namespace, the dependency remains. This is how examining dependencies really helps clarify a design: it's now totally obvious that something as low-level as StringTools belongs in the Encodo.Core namespace and not in the Encodo.Tools namespace, which has everything but the kitchen sink in it.

image

Another example in the same vein is shown to the left, where we examine the dependencies of MessageTools on Encodo.Tools. The diagram explains that the colors correspond to the two dependency directions.1

We would like the Encodo.Messages namespace to be independent of the Encodo.Tools namespace, so we have to consider either (A) removing the references to ExceptionTools and OperatingSystemTools from MessageTools or (B) moving those two dependencies to the Encodo.Core namespace.

Choice (A) is unlikely while choice (B) beckons with the same logic as the example above: it's now obvious that tools like ExceptionTools and OperatingSystemTools belong in Encodo.Core rather than the kitchen-sink namespace.

Indirect Dependencies

Once you're done cleaning up your direct dependencies, you still can't just sit back on your laurels. Now, you're ready to get started looking at indirect dependencies. These are dependencies that involve more than just two namespaces that use each other directly. NDepend displays these as red bounding blocks. The documentation indicates that these are probably good component boundaries, assuming that the dependencies are architecturally valid.

NDepend can only show you information about your code but can't actually make the decisions for you. As we saw above, if you have what appear to be strange or unwanted dependencies, you have to decide how to fix them. In the cases above, it was obvious that certain code was just in the wrong namespace. In other cases, it may simply be that a few bits of code are defined at too low a level.

Improper use of namespaces

For example, our standard practice for components is to put high-level concepts for the component at the Encodo.<ComponentName> namespace. Then we would use those elements from sub-namespaces, like Encodo.<ComponentName>.Utils. However, we also ended up placing types that then used that sub-namespace in the upper-level namespace, like ComponentNameTools.SetUpEnvironment() or something like that. The call to SetUpEnvironment() references the Utils namespace which, in turn, references the root namespace. This is a direct dependency, but if another namespace comes between, we have an indirect dependency.
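
A stripped-down illustration of that kind of cycle, using the hypothetical names from the example above (EnvironmentHelper and ComponentDescription are invented for the sketch):

namespace Encodo.ComponentName
{
  public static class ComponentNameTools
  {
    public static void SetUpEnvironment()
    {
      // The root namespace calls down into its own sub-namespace...
      Encodo.ComponentName.Utils.EnvironmentHelper.Configure();
    }
  }

  public class ComponentDescription { }
}

namespace Encodo.ComponentName.Utils
{
  public static class EnvironmentHelper
  {
    public static void Configure()
    {
      // ...and the sub-namespace uses a type from the root namespace,
      // closing the cycle between the two namespaces.
      var component = new Encodo.ComponentName.ComponentDescription();
    }
  }
}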

This happens quite quickly for larger components, like Encodo.Security.

The screenshots below show a high-level snapshot of the indirect dependencies in the Encodo assembly and then also a detail view, with all sub-namespaces expanded. The detail view is much larger but shows you much more information about the exact nature of the cycle. When you select a red bounding box, another panel shows the full details and exact nature of the dependency.

imageimageimage

Base Camp Two: base library almost cleaned up

imageimage

After a bunch of work, I've managed to reduce the dependencies to a set of interfaces that are clearly far too dependent on many subsystems.

  • ICoreConfiguration: references configuration options for optional subsystems like the software updater, the login, the incident reporter and more
  • ICoreFeedback: references feedbacks for several optional processes, like software-update, logins and more
  • ICoreApplication: references both the core configuration and feedback

The white books for NDepend claim that "[t]echnically speaking, the task of merging the source code of several assemblies into one is a relatively light one that takes just a few hours." However, this assumes that the code has already been properly separated into non-interdependent namespaces that correspond to components. These components can then relatively easily be extracted to separate assemblies.

The issue that I have above with the Encodo assembly is a thornier one: the interfaces themselves embody a pattern that is inherently non-decoupling. I need to change how the configuration and feedback work completely in order to decouple this code.

Roadmap for startup and configuration

To that end, I've created an issue in the issue-tracker for Quino, QNO-46592, titled "Re-examine how the configuration, feedback and application work together". The design of these components predates our introduction of a service locator, which means it's much more tightly coupled (as you can see above).

After some internal discussion, we've decided to change the design of the Encodo and Quino library support for application-level configuration and state.

Merge the configuration and application

To date, the configuration has contained all of the information necessary to run an application. The configuration was more-or-less stateless and corresponded to the definition of an application, akin to how a class is the underlying stateless definition, while an object is an instance of that definition. In practice, though, we always use a single application per configuration, so the distinction is irrelevant. This will simplify all referencing code, as we will no longer need to pass around an IApplication<TConfiguration, TFeedback>.

Move the feedback to the service locator

Instead of treating the feedback like a first-class citizen, with a direct reference on the application, make consumers use the service locator to retrieve an instance. This will remove the remaining generic argument in the definition of IApplication, leaving us with a base interface that is free of generic arguments.

Move specific configuration objects to the service locator

The specific sub-interfaces that introduce dependencies are as follows:

  • IncidentReporter
  • SoftwareUpdater
  • CommandSetManager
  • LocationManager
  • ConnectionSettingsManager

Any components that currently reference the properties on the ICoreConfiguration can use the service locator to retrieve an instance instead.
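
For example, a consumer that previously read a property like ICoreConfiguration.SoftwareUpdater would resolve the component through the service locator instead. The property path below follows the service-locator pattern used elsewhere in Quino; ISoftwareUpdater and CheckForUpdates are assumed names used only for illustration.

// Before: a direct property on the configuration (tightly coupled)
// var updater = application.Configuration.SoftwareUpdater;

// After: resolved through the service locator
var updater = application.Configuration.ServiceLocator.GetInstance<ISoftwareUpdater>();
updater.CheckForUpdates();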

Move specific settings to sub-objects

The configuration object is not only dependent on sub-objects, but is also overloaded with individual settings that are only used by very few specific sub-components. These will also be extracted into interfaces and moved into the service locator.

  • ILoginConfiguration
  • ISoftwareUpdateConfiguration
  • IFileLogConfiguration

As you can see, while NDepend is indispensable for finding dependencies, it can -- along with a good refactoring tool (we use ReSharper) -- really only help you clean up the low-hanging fruit. While I started out trying to split assemblies, I've now been side-tracked into cleaning up an older and less well-designed component -- and that's a very good thing.

There are some gnarly knots that will feel nearly unsolvable -- but with a good amount of planning, those can be re-designed as well. As I mentioned in the previous article, though, we can do so only because we're making a clean break from the 1.x version of Quino instead of trying to maintain backward compatibility.

It's worth it, though: the new design already looks much cleaner and is much more easily explained to new developers. Once that rewrite is finished, the Encodo assembly should be clean and I'll use NDepend to find good places to split up that rather large assembly into sensible sub-assemblies.



  1. There is a setting to turn off showing the green dependencies -- where the row depends on the column -- to make it easier to read the matrix. If you do that, though, you have to make sure to select the class from which you're trying to remove dependencies in the column. For example, if class A and B are interdependent, but A should not rely on B, you should make sure A is showing in the column. You can then examine dependencies on row B -- and then remove them. This works very nicely with both direct and indirect dependencies.

  2. This link is to the Quino issue tracker, which requires a login.

The Road to Quino 2.0: Maintaining architecture with NDepend (part I)

Full disclosure

A while back -- this last spring, I believe -- I downloaded NDepend to analyze code dependencies. The trial license is fourteen days; needless to say, I got only one afternoon in before I was distracted by other duties. That was enough, however, to convince me that it was worth the $375 to continue to clean up Quino with NDepend.

I decided to wait until I had more time before opening my wallet. In the meantime, however, Patrick Smacchia of NDepend approached me with a free license if I would write about my experiences using NDepend on Encodo's blog. I'm happy to write about how I used the tool and what I think it does and doesn't do.1

History & Background

image

We started working on Quino in the fall of 2007. As you can see from the first commit, the library was super-small and comprised a single assembly.

Fast-forward seven years and Version 1.13 of Quino has 66 projects/assemblies. That's a lot of code and it was long past time to take a more structured look at how we'd managed the architecture over the years.

I'd already opened a branch in our Quino repository called feature/dependencyChanges and checked in some changes at the beginning of July. Those changes had come as a result of the first time I used NDepend to find a bunch of code that was in the wrong namespace or the wrong assembly, architecturally speaking.

I wasn't able to continue using this branch, though, for the following reasons.

  1. I got the hang of NDepend relatively quickly and got a bit carried away. Using ReSharper, I was able to make a lot of changes and fixes in a relatively short amount of time.
  2. I checked in all of these changes in one giant commit.
  3. I did this all five months ago.
  4. There have been hundreds of subsequent commits on the master branch, many of which also include global refactoring and cleanup.
  5. As a result of the above, merging master into feature/dependencyChanges is more trouble than it's worth.

Release Methodology

With each Quino change and release, we try our hardest to balance backward-compatibility with maintainability and effort. If it's easy enough to keep old functionality under an old name or interface, we do so.

We mark members and types obsolete so that users are given a warning in the compiler but can continue using the old code until they have time to upgrade. These obsolete members are removed in the next major or minor upgrade.

Developers who have not removed their references to obsolete members will at this point be greeted with compiler errors. In all cases, the user can find out from Quino's release notes how they should fix a warning or error.

The type of high-level changes that we have planned necessitate that we make a major version-upgrade, to Quino 2.0. In this version, we have decided not to maintain backward-compatibility in the code with Obsolete attributes. However, where we do make a breaking change -- either by moving code to new or different assemblies or by changing namespaces -- we want to maintain a usable change-log for customers who make the upgrade. The giant commit that I'd made previously was not a good start.

Take Two

Since some of these changes will be quite drastic departures in structure, we want to come up with a plan to make merging from the master branch to the feature/dependencyChanges branch safer, quicker and all-around easier.

I want to include many of the changes I started in the feature/dependencyChanges branch, but would like to re-apply those changes in the following manner:

  • Split the giant commit into several individual commits, each of which encapsulates exactly one change; smaller commits are much easier to merge
  • Document breaking changes in the release notes for Quino 2.0
  • Blog about/document the process of using NDepend to clean up Quino2

So, now that I'm ready to start cleaning up Quino for version 2.0, I'll re-apply the changes from the giant commit, but in smaller commits. At the same time, I'll use NDepend to find the architectural breaks that caused me to make those changes in the first place and document a bit of that process.

Setting up the NDepend Project

I created an NDepend project and attached it to my solution. Version 1.13 of Quino has 66 projects/assemblies, of which I chose the following "core" assemblies to analyze.

image

I can change this list at any time. There are a few ways to add assemblies. Unfortunately, the option to "Add Assemblies from VS Solution(s)" showed only 28 of the 66 projects in the Quino solution. I was unable to determine the logic that led to the other 38 projects not being shown. When I did select the projects I wanted from the list, the assemblies were loaded from unexpected directories. For example, it added a bunch of core assemblies (e.g. Encodo.Imaging) from the src/tools/Quino.CodeGenerator/bin/ folder rather than the src/libraries/Encodo.Imaging/bin folder. I ended up just taking the references I was offered by NDepend and adding references to Encodo and Quino, which it had not offered to add.3

The NDepend Dashboard

Let's take a look at the initial NDepend Dashboard.

image

There's a lot of detail here. The initial impression of NDepend can be a bit overwhelming, I suppose, but you have to remember the sheer amount of interdependent data that it shows. As you can see on the dashboard, not only are there a ton of metrics, but those metrics are also tracked on a time-axis. I only have one measurement so far.

Any assemblies not included in the NDepend project are considered to be "third-party" assemblies, so you can see external dependencies differently than internal ones. There is also support for importing test-coverage data, but I haven't tried that yet.

There are a ton of measurements in there: some interest me, others don't, and with some I disagree. For example, over 1400 warnings are raised in the Quino* assemblies because the base namespace -- Encodo.Quino -- doesn't correspond to a file-system folder -- NDepend expects Encodo/Quino, but we use just Quino.

Another 200 warnings are to "Avoid public methods not publicly visible", which generally means that we've declared public methods on internal, protected or private classes. The blog post Internal or public? by Eric Lippert covered this adequately and came to the same conclusion that we have: you actually should make methods public if they are public within their scope.

There are some White Books about namespace and assembly dependencies that are worth reading if you're going to get serious about dependencies. There's a tip in there about turning off "Copy Local" on referenced assemblies to drastically increase compilation speed that we're going to look into.

Dependencies and cycles

One of the white books explains how to use namespaces for components and how to "levelize" an architecture. This means that the dependency graph is acyclic -- that there are no dependency cycles and that there are certainly no direct interdependencies. The initial graphs from the Encodo and Quino libraries show that we have our work cut out for us.

imageimageimage

The first matrix shows the high-level view of dependencies in the Encodo and Quino namespaces. Click the second and third to see some initial dependency issues within the Encodo and Quino assemblies.

That's as far as I've gotten so far. Tune in next time for a look at how we managed to fix some of these dependency issues and how we use NDepend to track improvement over time.



  1. I believe that takes care of full disclosure.

  2. This is something I'd neglected to do before. Documenting this process will help me set up a development process where we use NDepend more regularly -- more than every seven years -- and don't have to clean up so much code at once.

  3. After having read the recommendations in the NDepend White Book -- Partitioning code base through .NET assemblies and Visual Studio projects (PDF) -- it's clear why this happens: NDepend recommends using a single /bin folder for all projects in a solution.

v1.13.0: Schema migration, remoting, services and web apps

The summary below describes major new features, items of note and breaking changes. The full list of issues is also available for those with access to the Encodo issue tracker.

Highlights

Data & Schema

Remoting & services

  • Fixed several issues in the remoting driver (client and server parts). (QNO-4626, QNO-4630, QNO-4631, QNO-4388, QNO-4575, QNO-4629, QNO-4573, QNO-4625, QNO-4633, QNO-4575)
  • Added a runner for Windows services that allows debugging and shows logging output for applications that use the CoreServiceBase, which extends the standard .NET ServiceBase. The runner is available in the Encodo.Service assembly.

Web

  • Improved default and custom authentication in web applications and the remoting server. Also improved support for authorization for remote-method routes as well as MVC controllers.
  • Improved configuration, error-handling and stability of the HttpApplicationBase, especially in situations where the application fails to start. Error-page handling was also improved, including handling for Windows Event Log errors.
  • Improved appearance of the web-based schema migrator. (QNO-4559, QNO-4561, QNO-4563, QNO-4548, QNO-4487, QNO-4486, QNO-4488)

Winform

  • Data-provider statistics: improved the WinForm-based statistics form. (QNO-4231, QNO-4545, QNO-4546)
  • Standard forms: updated the standard WinForm about window and splash screen to use Encodo web-site CI. (QNO-4529)

System & Tools

  • Removed the dependency on the SmartWeakEvents library from Quino (QNO-4645); the Quino and Encodo assemblies no longer have any external dependencies.
  • Image handling: the Encodo and Quino libraries now use the Windows Imaging Components instead of System.Drawing. (QNO-4536)
  • Windows 8.1: fixed culture-handling for en-US and de-CH that is broken in Windows 8.1. (QNO-4534, QNO-4553)
  • R# annotations have been added to the Encodo assembly. Tell R# to look in the Encodo.Core namespace to use annotations like NotNull and CanBeNull with parameters and results. (QNO-4508)
  • Generated code now includes a property that returns a ValueListObject for each enum property in the metadata. For example, for a property named State of type CoreState, the generated code includes the same properties as before for the enum and the foreign key backing it, but now also includes the ValueListObject property. This new property provides easy access to the captions.

public CoreState State { ... }
public ValueListObject StateObject { ... }
public int? CoreStateIdId { ... }

  • Improved the **nant fix** command in the default build tools to fix the assembly name as well. The build tools are available in bin/tools/build. See the `src/demo/Demo.build` file for an example of how to use the Nant build scripts for your own solutions. To change the company name used by the "fix" command, for example, add a task override in your build file.
  • Fixed the implementation of `IntegrateRemotableMethods` to avoid a race condition with **remote methods**. Also improved the stability of the `DataProvider` statistics. ([QNO-4599](https://secure.encodo.ch/jira/browse/QNO-4599))

Breaking changes

  • The generic argument TRight has been removed from all classes and interfaces in the Encodo.Security.* namespace. In order to fix this code, just remove the int generic parameter wherever it was used. For example, where before you used the interface IUser<int>, you should now use IUser (QNO-4576).
  • The overridable method MetaAccessControl.DoGetAccessChecker() has been renamed to MetaAccessControl.GetAccessChecker().
  • Renamed the Encodo.ServiceLocator.SimpleInjector.dll to Encodo.Services.SimpleInjector.dll and Quino.ServiceLocator.SimpleInjector.dll to Quino.Services.SimpleInjector.dll. Also changed the namespace Quino.ServiceLocator to Encodo.Quino.Services.
  • Renamed HttpApplicationBase.StartMetaApplication() to CreateAndStartUpApplication().
  • Classes may no longer contain properties with names that conflict with properties of IMetaReadable (e.g. Deleted, Persisted). The model will no longer validate until the properties have been renamed and the code regenerated. (QNO-4185)
  • Removed StandardIntRights with integer constants and replaced it with StandardRights with string constants.
  • The IAccessControl.Check() and other related methods now accept a sequence of string rights rather than integers.
  • IMetaConfiguration.ConfigureSession() has been deprecated. The method will still be called but may have undesired side-effects, depending on why it was overridden. The common use was to initialize a custom AccessControl for the session. Continuing to do so may overwrite the current user set by the default Winform startup. Instead, applications should use the IDataSessionAccessControlFactory and IDataSessionFactory to customize the data sessions and access controls returned for an application. In order to attach an access control, take care to only set your custom access control for sessions that correspond to your application model.[^1]
internal class JobVortexDataSessionAccessControlFactory : DataSessionAccessControlFactory
{
  public override IAccessControl CreateAccessControl(IDataSession session)
  {
    if (session.Application.Model.MetaId == JobVortexModelGenerator.ModelGuid)
    {
      return new JobVortexAccessControl(session);
    }

    return base.CreateAccessControl(session);
  }
}

  • The default length of the UserModule.User.PasswordHash property has been increased from 100 characters to 1000. This default is more sensible for implementations that use much longer validation tokens instead of passwords. To avoid the schema migration, revert the change by setting the property's default length back to 100 in your application model, after importing the security module, as shown below.

var securityModule = Builder.Include<SecurityModuleGenerator>();      
securityModule.Elements.Classes.User.Properties[
  Encodo.Quino.Models.Security.Classes.SecurityUser.Fields.PasswordHash
].MaximumSize = 100;
  • `Application.Credentials` has been removed. To fix references, retrieve the `IUserCredentialsManager` from the service locator. For example, the following code returns the current user:

Session.Application.Configuration.ServiceLocator.GetInstance<IUserCredentialsManager>().Current

If your application uses the `WinformMetaConfigurationTools.IntegrateWinformPackages()` or `WinformDxMetaConfigurationTools.IntegrateWinformDxPackages()`, then the `IDataSession.AccessControl.CurrentUser` will continue to be set correctly. If not, add the `SingleUserApplicationConfigurationPackage` to your application's configuration. The user in the remoting server will be set up correctly. Add the `WebApplicationConfigurationPackage` to web applications in order to ensure that the current user is set up correctly for each request. ([QNO-4596](https://secure.encodo.ch/jira/browse/QNO-4596))
  • `IDataSession.SyncRoot` has been removed as it was no longer needed or used in Quino itself. Sessions should *not* be used in multiple threads, so there is no need for a `SyncRoot`. Code that uses it should be reworked to use a separate session for each thread.
  • Moved `IMetaApplication.CreateSession()` to an extension method. Add `Encodo.Quino.App` to the using clauses to fix any compile errors.
  • Removed `IMetaApplication.DataProvider`; use `IMetaApplication.Configuration.DataProvider` instead. ([QNO-4604](https://secure.encodo.ch/jira/browse/QNO-4604))
  • The schema migration API has been completely overhauled. `ISchemaChange` and its descendants have been completely removed. `ISchemaAction` is no longer part of the external API, although it is still used internally. The `ISchemaChangeFactory` has been renamed to `ISchemaCommandFactory` and, instead of creating change objects, which are then applied directly, returns `ISchemaCommand` objects, which can be either executed or transformed in some other way. `IMigrateToolkit.GetActionFor()` has also been replaced with `CreateCommands()`, which mirrors the rest of the API by returning a sequence of commands to address a given `ISchemaDifference`. This release still has some commands that cannot be transformed to pure SQL, but the goal is to be able to generate pure SQL for a schema migration. ([QNO-993](https://secure.encodo.ch/jira/browse/QNO-993), [QNO-4579](https://secure.encodo.ch/jira/browse/QNO-4579), [QNO-4581](https://secure.encodo.ch/jira/browse/QNO-4581), [QNO-4588](https://secure.encodo.ch/jira/browse/QNO-4588), [QNO-4591](https://secure.encodo.ch/jira/browse/QNO-4591), [QNO-4594](https://secure.encodo.ch/jira/browse/QNO-4594))
  • `IMigrateSchemaAspect.Apply()` has been removed. All aspects will have to be updated to implement `GetCommands()` instead, or to use one of the available base classes, like `UpdateDataAspectBase` or `ConvertPropertyTypeSchemaAspect`. The following example shows how to use the `UpdateDataAspectBase` to customize migration for a renamed property.

internal class ArchivedMigrationAspect : UpdateDataAspectBase
{
  public ArchivedMigrationAspect()
    : base("ArchivedMigrationAspect", DifferenceType.RenamedProperty, ChangePhase.Instead)
  {
  }

  protected override void UpdateData(IMigrateContext context, ISchemaDifference difference)
  {
    using (var session = context.CreateSession(difference))
    {
      session.ChangeAndSaveAll(UpdateArchivedFlag);
    }
  }

  private void UpdateArchivedFlag(Project obj)
  {
    // ...
  }
}


The base aspects should cover most needs; if your functionality is completely customized, you can easily pass your previous implementation of `Apply()` to a `DelegateSchemaCommand` and return that from your implementation of `GetCommands()`. See the implementation of `UpdateDataAspectBase` for more examples. ([QNO-4580](https://secure.encodo.ch/jira/browse/QNO-4580))
  • `MetaObjectIdEqualityComparer<T>` can no longer be constructed directly. Instead, use `MetaObjectIdEqualityComparer<Project>.Default`.
  • Renamed `MetaClipboardControlDx.UpdateColorSkinaware()` to `MetaClipboardControlDx.UpdateSkinAwareColors()`.
  • `IMetaUnique.LogicalParent` has been moved to `IMetaBase`. Since `IMetaUnique` inherits from `IMetaBase`, it is unlikely that code is affected (unless reflection or some other direct means was used to reference the property). ([QNO-4586](https://secure.encodo.ch/jira/browse/QNO-4586))
  • `IUntypedMessage` has been removed; the `AssociatedObject` formerly found there has been moved to `IMessage`.
  • `ITypedMessage.AssociatedObject` has been renamed to `ITypedMessage.TypedAssociatedObject`. ([QNO-4647](https://secure.encodo.ch/jira/browse/QNO-4647))
  • Renamed `MetaObjectTools` to `MetaReadableTools`.
  • Redefined the protected methods `GenericObject.GetAsGuid()` and `GenericObject.GetAsGuidDefault` as extension methods in `MetaWritableTools`.
  • `IMetaFeedback.CreateGlobalContext()` has been removed. Instead the `IGlobalContext` is created using the service locator.

------------------------------------------------------------------------


[^1]: The schema migration creates a metadata model for your model -- meta-metadata -- and uses the Quino ORM to load data when importing a model from a database. If you aren't careful, as shown in the code example, then you'll attach your custom access control to the sessions created for the schema migration's data-access, which will more than likely fail when it tries to load user data from a table that does not exist in that model.

Schema migration in Quino 1.13

Quino is a metadata framework for .NET. It provides a means of defining an application-domain model in the form of metadata objects. Quino also provides many components and support libraries that work with that metadata to automate many services and functions. A few examples are an ORM, schema migration, automatically generated user interfaces and reporting tools.

The schema-migration tool

The component we're going to discuss is the automated schema-migration for databases. A question that recently came up with a customer was: what do all of the options mean in the console-based schema migrator?

Here's the menu you'll see in the console migrator:

Advanced Options
(1) Show migration plan
(2) Show significant mappings
(3) Show significant mappings with unique ids
(4) Show all mappings
(5) Show all mappings with unique ids

Main Options
(R) Refresh status
(M) Migrate database
(C) Cancel

The brief summary is:

  • The only action that actually makes changes is (M)
  • Option (1) is the only advanced option you will ever likely use; use this to show the changes that were detected

The other advanced options are more for debugging the migration recommendation if something looks wrong. In order to understand what that means, we need to know what the migrator actually does.

image

  1. Provide the application model as input
  2. Import a model from the database as input
  3. Generate a mapping between the two models
  4. Create a migration plan to update the database to reflect the application model
  5. Generate a list of commands that can be applied to the database to enact the plan
  6. Execute the commands against the database
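
In rough code terms -- the object and member names here are illustrative only, not the exact Quino API -- the pipeline looks something like this:

// 1. and 2. Gather the two models: the application (target) model and
//    the model imported from the database schema (source).
var targetModel = application.Model;
var sourceModel = importHandler.ImportModel(databaseConnection);

// 3. Map elements of the source model to elements of the target model.
var mapping = mappingBuilder.BuildMapping(sourceModel, targetModel);

// 4. Create a plan that describes how to bring the database in line with the target model.
var plan = planBuilder.BuildPlan(mapping);

// 5. Generate the commands that enact the plan...
var commands = commandFactory.CreateCommands(plan);

// 6. ...and execute them against the database.
migrator.Execute(commands, databaseConnection);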

The initial database-import and final command-generation parts of migration are very database-specific. The determination of differences is also partially database-specific (e.g. some databases do not allow certain features so there is no point in detecting a difference that cannot ever be repaired). The rest of the migration logic is database-independent.

Gathering data for migration

The migrator works with two models: the target model and a source model.

  • The target model is provided as part of the application and is usually loaded from a core assembly.
  • The source model is imported from the database schema by the "import handler".

Given these two models, the "mapping builder" creates a mapping. In the current implementation of Quino, there is no support for allowing the user to adjust mapping before a migration plan is built from it. However, it would be possible to allow the user to verify and possibly adjust the mapping. Experience has shown that this is not necessary. Anytime we thought we needed to adjust the mapping, the problem was instead that the target model had been configured incorrectly. That is, each time we had an unexpected mapping, it led us directly to a misconfiguration in the model.

The options to show mappings are used to debug exactly such situations. Before we talk about mapping, though, we should talk about what we mean by "unique ids". Every schema-relevant bit of metadata in a Quino model is associated with a unique id in the form of a Guid, called a "MetaId" in Quino.

Importing a model from a database

What happens when the import handler generates a model?

The importer runs in two phases:

  1. Extract the "raw model" from the database schema
  2. Enhance the "raw model" with data pulled from the application-specific Quino metadata table in the same database

A Quino application named "demo" will have the following schema:

  • All modeled tables are named "demo__*"
  • The metadata table is named "demometadata__elementdescription"

The migrator reads the following information into a "raw model":

  • Tables => MetaClasses
  • Fields/Columns => MetaProperties
  • Indexes => MetaIndexes
  • Foreign Keys => MetaPaths

If there is no further information in the database, then the mapper will have to use the raw model only. If, however, the database was created or is being maintained by Quino, then there is additional information stored in the metadata table mentioned above. The importer enhances the raw model with this information in order to improve mapping and difference-recognition. The metadata table contains all of the Quino modeling information that is not reflected in a standard database schema (e.g. the aforementioned MetaId).

The data available in this table is currently:

  • SchemaIdentifier: the identifier used in the raw model/database schema
  • Identifier: the actual identifier of the metadata element that corresponds to the element identified by the SchemaIdentifier
  • MetaId: the unique id for the metadata element
  • ObjectType: the type of metadata (one of: class, property, index, path, model)
  • ParentMetaId: the unique id of the metadata element that is the logical parent of this one; only allowed to be empty for elements with ObjectType equal to "model"
  • Data: Custom data associated with the element, as key/value pairs
  • DataVersion: Identifies the format type of the "Data" element (1.0.0.0 corresponds to CSV)
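To make that concrete, here is a minimal sketch of what a row of this table might look like when modeled in code. The property names mirror the columns listed above; the class itself is invented and is not a Quino type.

using System;

// Hypothetical sketch of a row in the "demometadata__elementdescription" table (not Quino API).
public class ElementDescriptionRow
{
  public string SchemaIdentifier { get; set; } // identifier used in the raw model/database schema
  public string Identifier { get; set; }       // actual identifier of the corresponding metadata element
  public Guid MetaId { get; set; }             // unique id of the metadata element
  public string ObjectType { get; set; }       // one of: class, property, index, path, model
  public Guid? ParentMetaId { get; set; }      // logical parent; empty only for the "model" element
  public string Data { get; set; }             // custom data as key/value pairs
  public string DataVersion { get; set; }      // format of Data (1.0.0.0 corresponds to CSV)
}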

For each schema element in the raw model, the importer does the following:

  1. Looks up the data associated with that SchemaIdentifier and ObjectType (e.g. "punchclock__person" and "class")
  2. Updates the "Identifier"
  3. Sets the "MetaId"
  4. Loads the key/value pairs from the Data field and applies that data to the element
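A sketch of that loop might look like the following, reusing the hypothetical ElementDescriptionRow from above. RawElement and ParseKeyValuePairs are likewise invented for illustration; the real importer is naturally more involved.

// Hypothetical sketch of the enhancement phase (not the actual Quino importer).
// Requires: using System.Collections.Generic; using System.Linq;
void EnhanceRawModel(IEnumerable<RawElement> rawElements, IReadOnlyList<ElementDescriptionRow> rows)
{
  foreach (var element in rawElements)
  {
    // 1. Look up the row for this schema identifier and object type (e.g. "punchclock__person" / "class")
    var row = rows.FirstOrDefault(
      r => r.SchemaIdentifier == element.SchemaIdentifier && r.ObjectType == element.ObjectType);

    if (row == null)
    {
      continue; // no Quino metadata for this element; keep the raw values
    }

    // 2./3. Update the identifier and set the unique id
    element.Identifier = row.Identifier;
    element.MetaId = row.MetaId;

    // 4. Apply the custom key/value data
    foreach (var pair in ParseKeyValuePairs(row.Data, row.DataVersion))
    {
      element.SetCustomData(pair.Key, pair.Value);
    }
  }
}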

Generating a mapping

At this point, the imported model is ready and we can create a mapping between it and the application model. The imported model is called the source model while the application model is called the target model because we're migrating the "source" to match the "target".

We generate a mapping by iterating the target model:

  1. Find the corresponding schema element in the source model using MetaIds1
  2. If an element can be found, create a mapping for those two elements
  3. If no element can be found, create a mapping with only the target element. This will cause the element to be created in the database.
  4. For all elements in the source model that have no corresponding element in the target model, create a mapping with only the source element. This will cause the element to be dropped from the database.
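As a rough sketch -- Mapping and the model/element types below are invented names, not the actual Quino classes -- the iteration looks something like this:

// Hypothetical sketch of mapping generation; matching is done by MetaId.
var mappings = new List<Mapping>();

// 1.-3. Iterate the target (application) model and look for the matching source element
foreach (var target in targetModel.Elements)
{
  var source = sourceModel.Elements.FirstOrDefault(e => e.MetaId == target.MetaId);

  // Both sides present: checked later for renames/alterations.
  // Target only: the element will be created in the database.
  mappings.Add(new Mapping(source, target));
}

// 4. Source elements without a match: the element will be dropped from the database
foreach (var source in sourceModel.Elements)
{
  if (targetModel.Elements.All(e => e.MetaId != source.MetaId))
  {
    mappings.Add(new Mapping(source, target: null));
  }
}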

Creating a migration plan

The important decisions have already been made in the mapping phase. At this point, the migrator just generates a migration plan, which is a list of differences that must be addressed in order to update the database to match the target model.

  • If the mapping has both a source and a target element
    • Create a difference if the element has been renamed
    • Create a difference if the element has been altered (e.g. a property has a different type or is now nullable; an index has new properties or is no longer unique; etc.)
  • If the mapping has only a source, generate a difference that the element is unneeded and should be dropped.
  • If the mapping has only a target, generate a difference that the element is missing and should be created.
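Continuing the sketch from the previous section (Mapping, Difference and StructurallyEquals are still invented names), plan generation is essentially a projection of the mappings to differences:

// Hypothetical sketch of turning mappings into a migration plan (a list of differences).
var differences = new List<Difference>();

foreach (var mapping in mappings)
{
  if (mapping.Source != null && mapping.Target != null)
  {
    if (mapping.Source.Identifier != mapping.Target.Identifier)
    {
      differences.Add(Difference.Rename(mapping));   // element has been renamed
    }

    if (!mapping.Source.StructurallyEquals(mapping.Target))
    {
      differences.Add(Difference.Alter(mapping));    // e.g. type, nullability, index uniqueness
    }
  }
  else if (mapping.Source != null)
  {
    differences.Add(Difference.Drop(mapping));       // unneeded element; drop from the database
  }
  else
  {
    differences.Add(Difference.Create(mapping));     // missing element; create in the database
  }
}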

This is the plan that is shown to the user by the various migration tools available with Quino.2

The advanced console-migrator commands

At this point, we can now understand what the advanced console-migrator commands mean. Significant mappings are those mappings which correspond to a difference in the database (create, drop, rename or alter).

  • Show significant mappings: shows more detail about the names on each side of each significant mapping
  • Show significant mappings with unique ids: same as above, but also include the MetaIds for each side. Use this to debug when you suspect that you might have copy/pasted a MetaId incorrectly or inadvertently moved one.
  • Show all mappings: Same detail level as the first option, but with all mappings, including those that are 100% matches
  • Show all mappings with unique ids: same as above, but with MetaIds

As already stated, the advanced options are really there to help a developer see why the migrator might be suggesting a change that doesn't correspond to expectations.

Generating commands for the plan

At this point, the migrator displays the list of differences that will be addressed by the migrator if the user chooses to proceed.

What happens when the user proceeds? The migrator generates database-specific commands that, when executed against the database, will modify the schema of the database.3

Commands are executed for different phases of the migration process. The phases are occasionally extended but currently comprise the following.

  • Initialize: perform any required initialization before doing anything to the schema
  • DropConstraintsAndIndexes: drop all affected constraints and indexes that would otherwise prevent the desired modification of the elements involved in the migration.
  • AddUpdateOrRenameSchema: Create new tables, columns and indexes and perform any necessary renaming. The changes in this phase are non-destructive.
  • UpdateData: Perform any necessary data updates before any schema elements are removed. This is usually the phase in which custom application code is executed, to copy existing data from other tables and fields before they are dropped in the next phase. For example, if there is a new required 1-1 relation, the custom code might analyze the other data in the rows of that table to determine which value that row should have for the new foreign key.
  • DropSchema: Drop any unneeded schema elements and data
  • CreatePrimaryKeys: Create primary keys required by the schema. This includes both new primary keys as well as reestablishing primary keys that were temporarily dropped in the second phase.
  • CreateConstraintsAndIndexes: Create constraints and indexes required by the schema. This includes both new constraints and indexes as well as reestablishing constraints and indexes that were temporarily dropped in the second phase.
  • UpdateMetadata: Update the Quino-specific metadata table for the affected elements.
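For orientation, the phases could be represented roughly as an enumeration. The names below simply mirror the list above; this is a sketch and not necessarily how Quino models the phases internally.

// Hypothetical sketch of the migration phases, in execution order.
public enum MigrationPhase
{
  Initialize,                  // required initialization before touching the schema
  DropConstraintsAndIndexes,   // drop constraints/indexes that would block the changes
  AddUpdateOrRenameSchema,     // non-destructive creation and renaming
  UpdateData,                  // copy/compute data before anything is dropped
  DropSchema,                  // drop unneeded schema elements and data
  CreatePrimaryKeys,           // new and reestablished primary keys
  CreateConstraintsAndIndexes, // new and reestablished constraints and indexes
  UpdateMetadata               // update the Quino-specific metadata table
}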

Executing the migration plan

The commands are then executed and the results logged.

Afterward, the schema is imported again to verify that there are no differences between the target model and the database. In some (increasingly rare) cases, there will still be differences, in which case you can execute the new migration plan to repair those differences as well.

In development, this works remarkably well and often requires no further intervention.

Fixing failed migrations

In some cases, there is data in the database that, while compatible with the current database schema, is incompatible with the updated schema. This usually happens when a new property or constraint is introduced. For example, a new required property is added that does not have a default value or a new unique index is added which existing data violates.

In these cases, there are two things that can be done:

  • Either the database data is cleaned up in a way that makes it compatible with the target schema4
  • Or the developer must add custom logic to the metadata elements involved. This usually means that the developer must set a default value on a property. In rarer cases, the developer must attach logic to the affected metadata (e.g. the property or index that is causing the issue) that runs during schema migration to create new data or copy it from elsewhere in order to ensure that constraints are satisfied when they are reestablished at the end of the migration.
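For the common case of a new required property, the fix often amounts to a single default value in the model definition. Purely as an illustration -- the builder calls below are invented and are not the actual Quino metadata API:

// Purely illustrative, not the actual Quino API: give the new required property a default value
// so that existing rows remain valid when the NOT NULL constraint is (re)established.
var salutation = personClass.AddProperty("Salutation", typeof(string), nullable: false);
salutation.SetDefaultValue("Unknown");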

In general, it's strongly advised to perform a migration against a replica of the true target database (e.g. a production database) in order to guarantee that all potential data situations have been anticipated with custom code, if necessary.

Quino Migration versus EF Migrations

It's important to point out that Quino's schema migration is considerably different from that employed by EF (which it picked up from Active Record migrations in Ruby, often used with Ruby on Rails). In those systems, the developer generates specific migrations to move from one model version to another. There is a clear notion of upgrading versus downgrading. Quino only recognizes migrating from an arbitrary model to another arbitrary model. This makes Quino's migration exceedingly friendly when moving between development branches, unlike EF, whose deficiencies in this area have been documented.



  1. The default is to use only MetaIds. There is a mode in which identifiers are used as a fallback but it is used only for tools that import schemas that were not generated by Quino. Again, if the Quino metadata table hasn't been damaged, this strict form of mapping will work extremely well.

  2. The Winform and Web user interfaces for Quino both include built-in feedback for interacting with the schema migration. There are also two standalone tools to migrate database schemas: a Winform application and a Windows console application.

  3. The form of these commands is currently a mix of SQL and custom C# code. A future feature of the migration will be to have all commands available as SQL text so that the commands, instead of being executed directly, could be saved as a file and reviewed and executed by DBAs instead of letting the tool do it. We're not quite there yet, but proceeding nicely.

  4. This is generally what a developer does with his or her local database. The data contained therein can usually be more or less re-generated. If there is a conflict during migration, a developer can determine whether custom code is necessary or can sometimes determine that the data situation that causes the problem isn't something that comes up in production anyway and just remove the offending elements or data until the schema migration succeeds.

Optimizing compilation and execution for dynamic languages

The long and very technical article Introducing the WebKit FTL JIT provides a fascinating and in-depth look at how a modern execution engine optimizes code for a highly dynamic language like JavaScript.

To make a long story short: the compiler(s) and execution engine optimize by profiling and analyzing code and lowering it to runtimes of ever-decreasing abstraction so that it runs as the least dynamic version possible.

A brief history lesson

What does it mean to "lower" code? A programming language has a given level of abstraction and expressiveness. Generally, the more expressive it is, the more abstracted it is from code that can actually be run in hardware. A compiler transforms or translates from one language to another.

When people started programming machines, they used punch cards. Punch cards did not require any compilation because the programmer was directly speaking the language that the computer understood.

The first layer of abstraction that most of us -- older programmers -- encountered was assembly language, or assembler. Assembly code still has a more-or-less one-to-one correspondence between instructions and machine-language codes but there is a bit of abstraction in that there are identifiers and op-codes that are more human-readable.

Procedural languages introduced more types of statements like loops and conditions. At the same time, the syntax was abstracted further from assembler and machine code to make it easier to express more complex concepts in a more understandable manner.

At this point, the assembler (which assembled instructions into machine op-codes) became a compiler, which "compiled" a set of instructions from the more abstract language. A compiler made decisions about how to translate these concepts and could make optimization decisions based on registers, volatility and other settings.

In time, we'd graduated to functional, statically typed and/or object-oriented languages, with much higher levels of abstraction and much more sophisticated compilers.

Generally, a compiler still used assembly language as an intermediate format, which some may remember from their days working with C++ or Pascal compilers and debuggers. In fact, .NET languages are also compiled to IL -- the "Intermediate Language" -- which corresponds to the instruction set that the .NET runtime exposes. The runtime compiles IL to the underlying machine code for its processor, usually in a process called JIT -- Just-In-Time compilation. That is, in .NET, you start with C#, for example, which the compiler transforms to IL, which is, in turn, transformed to assembler and then machine code by the .NET runtime.

Static vs. Dynamic compilation

A compiler and execution engine for a statically typed language can make assumptions about the types of variables. The set of possible types is known in advance and types can be checked very quickly in cases where it's even necessary. That is, the statically typed nature of the language allows the compiler to reason about a given program without making assumptions. Certain features of a program can be proven to be true. A runtime for a statically typed language can often avoid type checks entirely. It benefits from a significant performance boost without sacrificing any runtime safety.

The main characteristic of a dynamic language like JavaScript is that variables do not have a fixed type. Generated code must be ready for any eventuality and must be capable of highly dynamic dispatch. The generated code is highly virtualized. Such a runtime will execute much more slowly than a comparable statically compiled program.

Profile-driven compilation

Enter the profile-driven compiler, introduced in WebKit. From the article,

The only a priori assumption about web content that our engine makes is that past execution frequency of individual functions is a good predictor for those functions' future execution frequency.

Here a "function" corresponds to a particular overload of a set of instructions called with parameters with a specific set of types. That is, suppose a JavaScript function is declared with one parameter and is called once with a string and 100 times with an integer. WebKit considers this to be two function overloads and will (possibly) elect to optimize the second one because it is called much more frequently. The first overload will still handle all possible types, including strings. In this way, all possible code paths are still possible, but the most heavily used paths are more highly optimized.

All of the performance is from the DFG's type inference and LLVM's low-level optimizing power. [...]

Profile-driven compilation implies that we might invoke an optimizing compiler while the function is running and we may want to transfer the function's execution into optimized code in the middle of a loop; to our knowledge the FTL is the first compiler to do on-stack-replacement for hot-loop transfer into LLVM-compiled code.

Depending on the level of optimization, the code contains the following broad sections:

  • Original: code that corresponds to instructions written by the author
  • Profiling: code to analyze which types actually appear in a given code path
  • Switching: code to determine when a function has been executed often enough to warrant further optimization
  • Bailout: code to abandon an optimization level if any of the assumptions made at that level no longer apply

image

While WebKit has included some form of profile-driven compilation for quite some time, the upcoming version is the first to carry the same optimization to LLVM-generated machine code.

I recommend reading the whole article if you're interested in more detail, such as how they avoided LLVM compiler-performance issues and how they integrated this all with the garbage collector. It's really amazing how much of what we take for granted the WebKit JS runtime treats as "hot-swappable". The article is quite well-written and includes diagrams of the process and underlying systems.

Configure IIS for passing static-file requests to ASP.Net/MVC

At Encodo, we had several ASP.Net MVC projects that needed to serve some files with a custom MVC Controller/Action. The general problem is that IIS tries hard to serve simple files like PDFs, pictures, etc. with its static-file handler, which is generally fine -- but not for file content served by our own action.

The goal is to switch off the static-file handling of IIS for some paths. One of our current projects came up with the following requirements, so I did some research into how we could do this better than we had in past projects.

Requirements:

  1. Switch it off only for /Data/...
  2. Switch it off for ALL file types, as we don't yet know what kinds of files the authors will store there.

This means that the default static-file handling of IIS must be switched off with some "magic" IIS config. In other apps, we had switched it off on a per-file-type basis for the entire application. I finally came up with the following IIS config (in web.config). It sets up a local configuration for the "data" location only. Then I used a simple "*" wildcard as the path (yes, this is possible) to transfer requests to ASP.Net. It looks like this:

<location path="data">
  <system.webServer>
    <handlers>
      <add name="nostaticfile" path="*" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
    </handlers>
  </system.webServer>
</location>

Alternative: Instead of a controller, one could also use a custom HttpHandler to serve such special URLs/resources. In this project, I decided to use an action because of the central custom security that I needed for the /Data/... requests as well, and which I got for free by using an action instead of an HttpHandler.
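For completeness, the receiving side might look something like the sketch below: an MVC controller action that serves content for /Data/... requests. DataController, UserMayAccess and MapToStoragePath are invented names; the point is only that, with the handler above in place, these requests now reach MVC instead of IIS's static-file handler.

using System.Security.Principal;
using System.Web;
using System.Web.Mvc;

// Sketch of a controller action serving file content for /Data/... requests.
public class DataController : Controller
{
  // e.g. GET /Data/documents/report.pdf, with a catch-all route mapped to "path"
  public ActionResult Get(string path)
  {
    // Central custom security check -- the reason an action was used instead of an HttpHandler
    if (!UserMayAccess(User, path))
    {
      return new HttpStatusCodeResult(403);
    }

    var fullPath = MapToStoragePath(path);                   // resolve to the actual storage location
    var contentType = MimeMapping.GetMimeMapping(fullPath);  // best-effort content type from the file name

    return File(fullPath, contentType);
  }

  // Invented placeholders for the project's central security check and storage layout
  private bool UserMayAccess(IPrincipal user, string path) { return true; }
  private string MapToStoragePath(string path) { return Server.MapPath("~/App_Data/" + path); }
}

A catch-all route such as "data/{*path}" would then map these URLs to the action, so the central security check applies to everything under /Data/... .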