Munjari News

Published by Marco on in Munjari
We would really love to be able to tell you that Munjari is ready and that you’re all invited to a private beta test. Things don’t always work out as planned and we haven’t made as much progress on Munjari as we would have liked. Though this is partially due to our size and the amount of other work we had (it was a pretty good year ;-), it is mostly due to some pretty sobering experiences with quality problems in industry-standard tools.

Reboot

After more than a year’s time, we have decided that we can’t keep going in the same direction anymore – it costs too much time and money and frustration. Munjari as it was conceived stays exactly the same – a dead-simple way for anyone to get their data online – but, since the first of the year, we rethought the project from the ground up and made quite a few changes.

With more experience, we now know that basic components we expected to exist already either simply aren’t available or suffer from such massive quality problems that they can’t be used. Given that, we have come to the conclusion that we must write much more of the software ourselves instead of being able to follow in others’ footsteps. We looked for and found a new programming language, some new libraries (including a fair share that we are writing ourselves) and, most of all, new hope and drive to get Munjari into your hands as quickly as possible.

Building Munjari

When we started Munjari, we were also starting our business. Since Munjari is a privately-funded internal project, we had to figure out how to make it manageable in terms of both time and money invested. We decided on a strategy of building software for clients using the same technologies as we used in Munjari, so each could benefit from experience gained while working on the other.

At the time, we made what seemed like a relatively safe choice: we decided on Java. The backend software would be written in Java, as would the web site itself. All the cool kids were doing it.

We knew which components we needed:

  • A database, with an ORM (Object Relational Mapper) to read and store our data
  • An application framework, with which we could build Munjari without having to start from scratch
  • A web framework, for Munjari as well as for client projects
  • A development environment in which to develop everything

Though an ORM is necessary for most projects, Munjari itself didn’t require one, as both the project files and user data were far too generic to warrant using one.

Java

We were drawn to Java because it’s one of the industry-standard languages – and we sure weren’t going to use C++! With the addition of annotations and rudimentary generics support, it had matured into a language worth using. The sheer number of libraries and open-source projects available is impressive. It seemed like a slam dunk.

We were not prepared for several disappointments:

  • The language itself is quite rigid and uncompromising and doesn’t lend itself well at all to succinct expression of generic ideas. Even with generics, there is an uncomfortable amount of casting required when working at a more abstract level.
  • The quality of third-party libraries left much to be desired. Hibernate is only one example with rather grave problems; Tapestry is stable enough when deployed, but has problems in development. Evaluations of other software components often resulted in many products, only a handful of which were usable. A profusion of available software is only useful if it actually saves you time, effort and frustration. That was almost never the case, in our experience.
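
To make the casting complaint concrete, here is a minimal sketch rather than code from Munjari: a generic attribute bag of the kind an abstract data layer tends to need. All names (`AttributeBag`, `put`, `get`) are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: a heterogeneous attribute bag keyed by name.
class AttributeBag {
    private final Map<String, Object> values = new HashMap<String, Object>();

    void put(String name, Object value) {
        values.put(name, value);
    }

    // Type erasure means T is unknown at run time, so this cast is unchecked:
    // the compiler warns, and a wrong T only fails later, at the call site.
    @SuppressWarnings("unchecked")
    <T> T get(String name) {
        return (T) values.get(name);
    }

    // The checked alternative forces callers to pass a Class token around.
    <T> T get(String name, Class<T> type) {
        return type.cast(values.get(name));
    }
}

class AttributeBagDemo {
    public static void main(String[] args) {
        AttributeBag bag = new AttributeBag();
        bag.put("title", "Munjari");
        bag.put("rows", Integer.valueOf(42));

        String title = bag.get("title");               // unchecked cast hidden inside get()
        Integer rows = bag.get("rows", Integer.class); // checked, but noisier
        System.out.println(title + ": " + rows);
    }
}
```

Either the cast is hidden behind a suppressed warning or every caller carries an extra `Class` argument; neither reads as a succinct expression of a generic idea.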

Though there are literally dozens of web frameworks, they mostly support POJOs (Plain Old Java Objects). That is, they make it possible to render literally anything. They do not, however, provide any default rendering of objects or any framework that takes redundant work off of an application developer’s hands. In its native form (without, for example, the BeanForm library), Tapestry requires that you place all controls by hand, binding each to a property of the POJO, and writing in labels by hand. Even with BeanForm, the meta-data describing the form is defined in the page containing the form.

In essence, it seemed that many of the difficult bits of building applications – those that can be generalized for all web applications – were always left as “an exercise for the reader”. We increasingly noticed the time saved by using a framework being eaten up by the time invested bolting missing functionality onto a design that was often quite resistant to change. More and more, we found ourselves thinking: if we were going to all of this effort, we could at least invest it in a design we believed in.

Environment

We began by using Tomcat as a local development server, which was an absolutely painful decision. Though it’s the industry standard for server deployment, it was sorely lacking as a development server. The key feature of a web server used for development is speed. It needs to start and stop quickly. Tomcat took, on average, about 5–10 seconds to start. Combined with the Tapestry start time of 10–30 seconds, this was completely unacceptable for development.

A server needs to be able to disable all caching or reload automatically anything that has changed. We started on Tomcat 4, which had shaky support for automatic reload, then moved to Tomcat 5, which was better, but still flawed. It seemed to understand that it should reload changed files, but still got confused, after which you were once again left to dig through its various on-disk caches to clean them out manually. Combined with a horrendous memory leak in Tapestry (discussed below), we were lucky to get 2 or 3 runs out of the server before it got unusably slow or simply generated an “out of memory” error.

After finding several tips online, we moved to the Jetty Java Container, which was considerably faster and more stable. The server start time itself was tiny, leaving us only with the Tapestry startup time to slow us down.

Even using Jetty, we were surprised to find how often the server still needed to be restarted because the running JVM could not be hot-patched. It seems that this capability is limited to altering the implementation of already-defined methods. When it worked, it was wonderful – you could make a change, save the file and the execution point would jump back to the start of the function. It just didn’t work like that very often, for various reasons:

  • Many modern Java libraries (including Hibernate and Tapestry) use annotations, a feature new to Java 1.5. Various library subsystems read these annotations at application startup to adjust the loaded code accordingly. Therefore, most annotation changes cannot be fixed using the hot-patch mechanism.
  • The hot-patcher cannot add code to a JVM – not even a constant!
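
The first point can be sketched in a few lines. This is an illustration of the general pattern only, not how Hibernate or Tapestry are actually implemented; the annotation `TableName` and the class `MappingRegistry` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

// Hypothetical annotation, standing in for the mapping annotations a library defines.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface TableName {
    String value();
}

@TableName("projects")
class Project {
}

// Reads annotations once and caches the result, as a startup scan would.
class MappingRegistry {
    private final Map<Class<?>, String> tables = new HashMap<Class<?>, String>();

    void register(Class<?> type) {
        TableName annotation = type.getAnnotation(TableName.class);
        if (annotation != null) {
            tables.put(type, annotation.value());
        }
    }

    // Later lookups never inspect the class again; they see only the cached value.
    String tableFor(Class<?> type) {
        return tables.get(type);
    }
}

class MappingDemo {
    public static void main(String[] args) {
        MappingRegistry registry = new MappingRegistry();
        registry.register(Project.class); // happens once, at application startup
        System.out.println(registry.tableFor(Project.class));
    }
}
```

Because the registry only consults its cache after startup, an edited annotation value has no effect until the scan runs again, which in practice means restarting the application.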

Compared to web development with other scripting languages, like PHP, development with Java was incredibly slow and annoying, getting in the way at nearly every step.

Tapestry

Tapestry is a Java web application framework, which takes a component-based approach to building web pages. Each page is a tree of components, each of which renders HTML output; together, they define the output sent back to the user. The engine within which this all occurs is a complex network of services that are instantiated and connected by the required companion technology, called Hivemind. Hivemind is an IOC (Inversion of Control) container; suffice it to say that it is a way of building an application, whose components are only very loosely coupled using Java interfaces.
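
For readers unfamiliar with the idea, loose coupling through interfaces looks roughly like the following. This is a hand-wired sketch and does not use Hivemind’s API; `Greeter`, `EnglishGreeter` and `WelcomePage` are hypothetical names.

```java
// Hypothetical service interface: pages depend on this, not on an implementation.
interface Greeter {
    String greet(String name);
}

class EnglishGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// The component never constructs its own dependency; it is handed one.
class WelcomePage {
    private final Greeter greeter;

    WelcomePage(Greeter greeter) {
        this.greeter = greeter;
    }

    String render(String user) {
        return "<p>" + greeter.greet(user) + "</p>";
    }
}

class WiringDemo {
    public static void main(String[] args) {
        // A container automates this wiring step from configuration;
        // here it is done by hand.
        Greeter greeter = new EnglishGreeter();
        WelcomePage page = new WelcomePage(greeter);
        System.out.println(page.render("Marco"));
    }
}
```

An IOC container takes over the two wiring lines in `main`, deciding from its configuration which implementation each component receives.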

If that all sounds horribly complicated, it’s because it is. Determining what exactly is going on during the handling of a page request is fraught with danger and uncertainty. Problems include:

  • Completely useless URLs for most links generated by Tapestry components. Most require an implicit context in the session that is invalidated when the server is restarted. This makes development a nightmare of repetitively clicking the same sequence of links to return to the desired point in an application.
  • Synchronization with the Hibernate session used to process the request is left entirely up to the application. Tapestry ships with no support, tutorials, FAQs or samples on how to integrate a Tapestry application with the industry-standard ORM as a data back-end. This is exactly the type of support we expected, and did not get, from something calling itself a framework.
  • Hivemind is only configurable using XML files. Though the elements of the XML schema are documented, there are almost no samples showing how to combine the elements to achieve particular effects. The syntax/semantics for similar tasks often involved completely different constructs, making it difficult to apply previous experience when tackling new requirements. Any change to the XML requires a restart of the server and looking at the output console to determine whether there were any problems.
  • Loading the Hivemind registry is relatively quick (1–2 seconds). Actually using something from Tapestry invokes an instantiation and wiring orgy that takes between 10 and 30 seconds (depending on processor speed, hard drive seek speed and the phase of the moon).
  • For development, you have to turn off caching of pages and components, so that you can make changes without restarting the server. Tapestry can do this, but there’s a known bug: it fails to reclaim memory for a large number of objects when run in this mode. No one in the Tapestry world has been able to track down this bug and fix it. In one of our applications, we measured a 2MB leak per page refresh. This made the application nearly untestable in uncached mode.
  • The documentation is, after more than a year, still filled with a confusing mix of Tapestry 3 and 4 for the more important tasks and still lacks examples or support for anything but the simplest tasks. This is improving, but it’s not there yet.

Over the last year, Tapestry has fragmented more and more, with the founder of the library, Howard Lewis Ship, moving on to work on Tapestry 5, a new web framework, which is completely incompatible with Tapestry 4. Just as Tapestry 4 was more-or-less completely incompatible with Tapestry 3. Though Tapestry 5 looks quite promising and seems to address many of the problems found with the development model of Tapestry 4, it is still a long way off and we need a solution now.

Hibernate

Of all of the components that disappointed us, it’s Hibernate that came as the biggest surprise. Hibernate is an enormous open-source project with wide industry support and is used by hundreds, if not thousands, of applications around the world. Our attempts to shoehorn it into service were met with strong resistance. Upon digging deeper, we discovered we were neither at fault, nor alone. However, though others online were encountering the same bugs, they were far more likely to forgive Hibernate than we were.

Among the bigger problems were:

  • Datasets whose shape changed depending on the query. Retrieving a set of objects might give a list of objects (A), each of which has a list of sub-objects (B). Changing a restriction on the (B) results in a change in the number of objects in (A), with some objects appearing twice. Filtering to remove duplicates is an exercise left up to the reader.
  • The most common inheritance model generates extremely inefficient queries, essentially joining the base table with all possible descendants in order to determine the type of the object. In order to support lazy-loading, all objects are actually dynamically-generated proxy objects. However, they are not necessarily proper descendants of the expected objects, rendering polymorphism and instanceof unusable.
  • Many solutions we found online – and this was common for many Java ORM solutions – stopped short of mapping complex queries to the database. And, in many cases, anything more than a single join was considered complex. Often the solutions offered by the community involved reading a superset into memory and performing local filtering. Granted, there are cases in which this is the only solution, but we expect a mature ORM to have a much better “best effort” than Hibernate is able to offer.
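
The first two problems can be shown without Hibernate at all. The sketch below uses a hand-written proxy as a stand-in for a generated one, and a plain list as a stand-in for a joined result; all class names are hypothetical, and the de-duplication assumes the objects implement `equals` and `hashCode` sensibly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class Animal {
    String describe() { return "animal"; }
}

class Cat extends Animal {
    @Override
    String describe() { return "cat"; }
}

// Stand-in for a generated lazy-loading proxy: it subclasses the declared
// base type and forwards calls to the real object once it is loaded.
class AnimalProxy extends Animal {
    private final Animal target;

    AnimalProxy(Animal target) { this.target = target; }

    @Override
    String describe() { return target.describe(); }
}

class OrmPitfalls {
    // Removes repeated entries while keeping first-seen order.
    static <T> List<T> distinct(List<T> rows) {
        return new ArrayList<T>(new LinkedHashSet<T>(rows));
    }

    public static void main(String[] args) {
        Animal loaded = new AnimalProxy(new Cat());
        System.out.println(loaded.describe());     // delegates to the Cat
        System.out.println(loaded instanceof Cat); // false: the proxy is not a Cat

        // A join can repeat a parent once per matching child row.
        List<String> joined = Arrays.asList("A1", "A1", "A2");
        System.out.println(distinct(joined));
    }
}
```

Method calls on the proxy behave as expected, but any code that checks the concrete type, or casts to it, fails even though the underlying row really is a `Cat`.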

We came to the conclusion that our time was better invested in another solution than in programming around Hibernate’s rather egregious errors.

The Future

After a post-mortem like that, we decided to start fresh and looked around at other web frameworks. Our search came up with some interesting candidates:

Python/Django
Now this is more like it! Django deserves to be called a framework. Meta-data stands at the center, with data access, an automatically generated “admin” interface (with lists and data-entry) and a forms library all rendered automatically from it. Combine this with elegant URL handling/routing and a straightforward API and it is a very good-looking, elegant library. The documentation is also the best we’ve ever seen, bar none. Its only drawback is that it uses Python, which may be difficult to bring into corporate environments (unless you sell software to Google).
Groovy/Grails
Groovy is a Java-like language, which runs in the JVM, but extends Java with support for functional constructs (like closures), implicit types and dynamic properties. It is ostensibly a scripting version of Java that puts the language on a level with other hot languages of today, like Python and Ruby, but does so within the established platform of the JVM. Grails is a web development framework, which uses Groovy to provide a subset of the generic framework support provided by something like Django.
Cayenne
As a replacement for Hibernate as the ORM, there is Cayenne, which gets almost all of the basic stuff right. It, too, quits a bit early for more complex queries and mapping duties, but is still head and shoulders above Hibernate. We’ve done some basic tests (throwing stuff at it that caused us untold woe in Hibernate) and it came through just fine.
.NET
The latest Visual Studio from Microsoft includes everything in one integrated package, from an incredible amount of documentation to a fully modern IDE with code completion for everything and easy debugging. Initial investigations have shown that the libraries for the web framework seem quite well thought out, as do those for GUI development.

So, that’s the latest news about Munjari. We’re working hard, but have had to backtrack a bit. We’ve put some bad experiences behind us and have learned from them. We’ll let you know as soon as we can which direction we’ve chosen and what kind of progress we’re making with our new solution.