Submitting a CPAN module

We're in the process of moving our internal wiki from MediaWiki to XWiki. To achieve this goal, I've written a dialect plug-in for HTML::WikiConverter, a framework that converts raw HTML to a given wiki dialect. More on that in a later article. To give the source back to the community, we've decided to submit the new module to CPAN, the Perl module archive.

To actually upload something you have to become a Perl author, and to become a Perl author you need to apply for an account on PAUSE (the Perl Authors Upload Server). Using the PAUSE interface, it's a matter of two minutes to fill in some fields. What can take a while is the approval process, since it's done by hand. After two hours, my account was ready and I was able to log in.

Packing the module

After getting my account info, I tried to locate information about what my packaged module should contain. Failing to find anything on that matter (apart from the fact that every .pm needs to have a version), I've taken another module as a template.

My module has the following structure:

/Changes
  Short change history; not much as of now

/Makefile
  Makefile, generated by perl Makefile.PL

/Makefile.PL
  Makefile generator

/MANIFEST
  List of files

/META.yml
  Meta description of the module, built automatically

/README
  Readme generated by perldoc

/lib/HTML/WikiConverter/XWiki.pm
  My module

/t/00-load.t
  Tests whether the module can be loaded

/t/boilerplate.t
  Checks the README and Changes files for leftover boilerplate

/t/pod-coverage.t
  Checks the POD coverage (skipped if Test::Pod::Coverage is not installed)

/t/pod.t
  Checks the POD for validity (skipped if Test::Pod is not installed)

/t/runtests.pl
  Needed by xwiki.t

/t/xwiki.t
  Runs a few tests (~90) to see if we're generating sane output

With:

perl Makefile.PL
make dist

I've got all files tarred, gzipped and ready to upload.
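For reference, the Makefile.PL itself is only a few lines. Here's a minimal sketch based on ExtUtils::MakeMaker (the AUTHOR and PREREQ_PM entries are placeholders, not the module's actual metadata):

use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME          => 'HTML::WikiConverter::XWiki',
    VERSION_FROM  => 'lib/HTML/WikiConverter/XWiki.pm',  # picks up the mandatory $VERSION
    ABSTRACT_FROM => 'lib/HTML/WikiConverter/XWiki.pm',
    AUTHOR        => 'Your Name <you@example.com>',
    PREREQ_PM     => {
        'HTML::WikiConverter' => 0,  # the framework this dialect plugs into
        'Test::More'          => 0,  # used by the t/ test scripts
    },
);

perl Makefile.PL generates the Makefile from this, and make dist writes META.yml and rolls the tarball.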

Namespace or not?

After the upload you are encouraged to register your namespace, but the documentation doesn't explain what this has to do with indexing or getting your module listed. Disregarding all that, I've registered the namespace HTML::WikiConverter::XWiki and am still awaiting confirmation.

Finally

Meanwhile the indexer has reached my module and it's available for download here. Or, even more conveniently, via the CPAN shell:

perl -MCPAN -e 'install HTML::WikiConverter::XWiki'
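
Once installed, the new dialect is used like any other HTML::WikiConverter dialect -- a quick sketch (the input HTML is just a placeholder):

use HTML::WikiConverter;

my $wc = HTML::WikiConverter->new( dialect => 'XWiki' );
print $wc->html2wiki( '<b>Hello, XWiki!</b>' );
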
Cleaning Up Unneeded Changelists

Synchronizing Perforce Depots presented a script that automatically pulled information from a remote Perforce server into a local depot. Occasionally, the script will be able to create the changelist, but will, for one reason or another, be unable to submit it. In that case, we need to go in and clean up these "dangling" changelists manually.

Below are the commands to execute on the server:

p4 login
p4 revert -n //depot/...
p4 revert //depot/...
p4 change -d changelist#
  1. Log in to the server as the user to whom the changelists belong (in our case, we named the Perforce admin user "root"). This ensures that the Perforce login command will work correctly.[^1]
  2. Check the list of files that will be reverted. Use the -n option to preview without reverting.
  3. Revert all open files. If there are files open that should not be reverted or that belong to changelists that will not be deleted, use one or more revert commands with more restrictive paths to clean up the files.
  4. Once the lists are empty, delete each changelist using the change command and the -d option.

Automating the Process

We don't have to execute this very often, so we haven't automated it yet. However, it should be relatively easy to do. It's possible to retrieve the list of pending changelists for a user (again, assuming the user is named root):

p4 changelists -s pending -u root

This returns a list of open changelists, each of which is displayed in the following format:

Change 1234 on 2007/02/07 by root@perforce-server *pending* '[Changelist description]'

It shouldn't be too hard to write a script that does the following:

  1. Get the list of pending changelists for a given user:

     p4 changelists -s pending -u root

  2. For each changelist, do the following:

     1. Extract the changelist number using something like grep or sed or whatever.
     2. Revert its files, using the -c option to restrict the revert to that changelist:

        p4 revert -c 1234 //depot/...

     3. Delete it:

        p4 change -d 1234


A working script is an exercise left up to the reader. ;-)
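
That said, here's a rough sketch in Perl of what such a script could look like (untested; it assumes the p4 client is on the PATH, that you're logged in with an appropriate client workspace, and that the user is named root):

#!/usr/bin/perl
# Sketch: revert and delete all pending changelists belonging to one user.
use strict;
use warnings;

my $user  = 'root';
my $depot = '//depot/...';

# Each line of output looks like:
# Change 1234 on 2007/02/07 by root@perforce-server *pending* '...'
for my $line (`p4 changelists -s pending -u $user`) {
    next unless $line =~ /^Change (\d+) /;
    my $change = $1;

    # Revert any files still open in this changelist, then delete it.
    system('p4', 'revert', '-c', $change, $depot) == 0
        or warn "revert failed for changelist $change\n";
    system('p4', 'change', '-d', $change) == 0
        or warn "could not delete changelist $change\n";
}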

------------------------------------------------------------------------


[^1]: If you need to use another user name, then you'll have to either pass the user name and password as options to each command or set the P4USER and P4PASSWD environment variables.
Using P4D 2006.1/104454 and P4 2007.1.11.3813
Loadbalancing with mod_jk and Apache

Now that we've built a recent version of mod_jk, let's use one of the newly gained features: load balancing.

Suppose we have two hosts, node1 and node2. Node1 runs an Apache and a Tomcat instance. On node2 we've got another Tomcat.


A browser will connect to the host that's running Apache. Since the load on a single server running a web application can be pretty severe, we're going to share the burden of serving servlets across multiple hosts (in our case, two). And we're going to make mod_jk do that for us.

First let's take a look at what's needed to get Apache to talk with mod_jk. These lines should go into your httpd.conf:

LoadModule jk_module /usr/lib/apache/1.3/mod_jk.so

JkWorkersFile /etc/apache/workers.properties
JkLogFile /var/log/apache/mod_jk.log
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "

<VirtualHost *>
  ServerAdmin example@example.com
  ServerAlias www.example.com
  DocumentRoot /var/www
  ServerName example.com
  JkMount /* loadbalancer
  JkMount /status/* status
  ErrorLog /var/log/apache/example-com-error_log
  CustomLog /var/log/apache/example-com-access_log combined
</VirtualHost>

The LoadModule line loads the module itself; JkWorkersFile tells mod_jk where to look for its worker configuration. We're going to create this file in a short while, but let's first look at the other options.

JkMount /* loadbalancer
  JkMount /status/* status

These lines map all requests under '/' to our worker named loadbalancer and requests under '/status/' to the worker named status. The workers themselves are specified in the workers.properties file.

Let's have a look at this file:

# workers to contact, that's what you have in your httpd.conf
worker.list=loadbalancer,status

#setup node1
worker.node1.port=8009
worker.node1.host=localhost
worker.node1.type=ajp13
worker.node1.lbfactor=50

#setup node2
worker.node2.port=8009
worker.node2.host=host2
worker.node2.type=ajp13
worker.node2.lbfactor=100

#setup the load-balancer
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=node1,node2
worker.loadbalancer.sticky_session=True
#worker.loadbalancer.sticky_session_force=True

# Status worker for managing load balancer
worker.status.type=status

We need to supply mod_jk with a list of top level workers; in our case, these are loadbalancer and status.

worker.list=loadbalancer,status

The configuration for our status worker is as easy as it gets:

worker.status.type=status

Configuring the workers that will actually do the hard work is no different in a load-balanced environment:

worker.node1.port=8009
worker.node1.host=localhost
worker.node1.type=ajp13

This is exactly what you'd do if you had only one Tomcat to connect to. Now comes the part that isn't in the standard playbook:

worker.node1.lbfactor=50
[...]
worker.node2.lbfactor=100

These lines tell the load balancer to spread the load over node1 and node2 in a ratio of 1:2. It's the ratio that matters, not the numbers themselves; lbfactors of 2 and 4 would yield the same result.

We haven't even defined our loadbalancer worker yet, so let's do that now:

worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=node1,node2
worker.loadbalancer.sticky_session=True
#worker.loadbalancer.sticky_session_force=True

We're defining a worker named loadbalancer with its type set to lb (short for load balancer) and assigning node1 and node2 to handle the load.

Sticky sessions are an important feature if you rely on jSessionIDs and are not using any session-replication layer. If sticky_session is True, a request is always routed back to the node that assigned its jSessionID. If that host gets disconnected, crashes or becomes otherwise unreachable, the request will be forwarded to another host in our cluster (although not too successfully, as the session ID is invalid in its context). You can prevent this from happening by setting sticky_session_force to True; in that case, if the host handling the requested session ID fails, Apache returns an internal server error (500) instead.

Now we've told mod_jk about our setup. If you are using sticky sessions, you'll need to tell Tomcat to append its node ID to the session ID. This ID needs to match worker.<name>.jvm_route, which by default is the worker's name (in our case node1 and node2).

Search for the Engine tag in your server.xml and add the following attribute:

jvmRoute="node1"
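
In context, the Engine element would then look something like this (the name and defaultHost values shown are the stock Tomcat defaults and may differ in your server.xml):

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">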

Do that on both Tomcat installations. If you don't, the load balancing will work but only for the first request per session. The following lines will appear in your log file:

[Thu Oct 26 17:28:36 2006] [3986:0000] [info]  get_most_suitable_worker::jk_lb_worker.c (672): all workers are in error state for session 1AB31B3F1F72D673E59D42F4A79E364C
[Thu Oct 26 17:28:36 2006] [3986:0000] [error] service::jk_lb_worker.c (984): All tomcat instances failed, no more workers left for recovery
[Thu Oct 26 17:28:36 2006] [3986:0000] [info]  jk_handler::mod_jk.c (1986): Service error=0 for worker=loadbalancer

This means that mod_jk can't find the node that issued your session ID.

How to build mod_jk on Debian

As the version of mod_jk in Debian stable is somewhat outdated (what could possibly have changed in three years?), we've decided to go ahead and build our own. Here's a digest of how we actually built and installed the module.

First we'll need the apache-development package:

apt-get install apache-dev

After that's completed, download the source from your favourite mirror.

Untar, configure and make (you know the drill):

tar xvzf tomcat-connectors-1.2.19-src.tar.gz
cd tomcat-connectors-1.2.19-src/native
./configure --with-apxs=/usr/bin/apxs
make

Since we don't have a /usr/libexec directory and the Makefile generated by configure fails because of that, we have to copy the files manually:

cp apache-1.3/mod_jk.so.0.0.0 /usr/lib/apache/1.3/mod_jk.so

Last but not least, paste the following text into /usr/lib/apache/1.3/500mod_jk.info:

LoadModule: jk_module /usr/lib/apache/1.3/mod_jk.so
Directives:
 JkMountCopy
 JkMount
 JkAutoMount
 JkWorkersFile
 JkLogFile
 JkLogLevel
 JkLogStampFormat
 JkAutoAlias
 JkRequestLogFormat
 JkExtractSSL
 JkHTTPSIndicator
 JkCERTSIndicator
 JkCIPHERIndicator
 JkSESSIONIndicator
 JkKEYSIZEIndicator
 JkOptions
 JkEnvVar
Description: Tomcat connector for Java servlets and web applications

All that's left to do is to configure Apache to actually load our newly built module. A quick guide on how to achieve that can be found here.

Interfaces in Delphi - Part II

This is the second of a two-part article on interfaces. Part one is available here.

In part one, we saw how to use non-reference-counted interfaces to prevent objects from magically disappearing when using interfaces in common try...finally...FreeAndNil() cases. Though this brings the interface problem under control, there is further danger.

Dangling Interfaces

A dangling interface is another problem that arises even when using non-reference-counted interfaces. In this case, the crash happens because an object has been freed, but there are still (often implicit) references to it in interfaces. Anytime a reference to an interface is removed -- set to nil -- the function _Release is called on the object behind the interface. If this object has already been freed, there is a rather nasty crash deep in library code.

A nice use of interfaces is as a return type, so that objects from various inheritance hierarchies can be used from common code. To better illustrate this problem, consider the two interfaces below:

IRow = **interface**
  **function** ValueAtIndex( aIndex: integer ): variant; 
**end**;

ITable = **interface**
  **procedure** GoToFirst;
  **procedure** GoToNext;
  **function** IsPastEnd: boolean;
  **function** CurrentRow: IRow;
**end**;

The two interfaces describe a way of generically iterating a table and retrieving values for each column in a row. Now, take a look at a concrete implementation for the table iterator.1

TRow = **class**( TNonReferenceCountedObject, IRow )
**protected**
  Values: array of variant;
**public**
  **function** ValueAtIndex( aIndex: integer ): variant; 
**end**;

TTable = **class**( TNonReferenceCountedObject, ITable )
**protected**
  Index: integer;
  Rows: TObjectList;
**public**
  **procedure** GoToFirst;
  **procedure** GoToNext;
  **function** IsPastEnd: boolean;
  **function** CurrentRow: IRow;
**end**;

The implementation is not shown, but assume that each row allocates a buffer for its values and that the table allocates and frees its rows when destroyed. Assume further the naive implementation for the remaining methods -- they are not salient to this discussion.

The example that follows iterates this table in a seemingly innocuous way, but one that causes a crash ... sometimes. That's what makes this class of problem even more difficult -- its unpredictability. The lines of code that change a row's reference count are marked with the resulting count in a comment. This helps show what is happening behind the scenes and explains the ensuing crash.

**procedure** DoSomething;
**var**
  rowSet: TTable; // assumption: a concrete TTable, so that FreeAndNil applies
  val1, val2: variant;
**begin**
  rowSet:= CreateRowSet; // CreateRowSet (not shown) builds a TTable and returns it
  **try**
    rowSet.GoToFirst;
    **while not** rowSet.IsPastEnd **do begin**
      val1:= rowSet.CurrentRow.ValueAtIndex( 0 ); // (1)
      val2:= rowSet.CurrentRow.ValueAtIndex( 1 ); // (2)
      rowSet.GoToNext;
    **end**;
  **finally**
    FreeAndNil( rowSet );
  **end**;
**end**; // (1) CRASH!

The code looks harmless enough; it is not obvious at all that CurrentRow returns an interface. The two references to an IRow are left "dangling" in the sense that the code holds no named references to them. But they exist nonetheless and will only be cleared when exiting the function scope -- after the objects to which they refer have been freed.

The way to fix this -- and to work completely safely with interfaces -- is to use only explicit references to interfaces. DoSomething is rewritten below:

**procedure** DoSomething;
**var**
  rowSet: TTable; // assumption: a concrete TTable, as above
  row: IRow;
  val1, val2: variant;
**begin**
  rowSet:= CreateRowSet;
  **try**
    rowSet.GoToFirst;
    **while not** rowSet.IsPastEnd **do begin**
      row:= rowSet.CurrentRow; // (1)
      **try**
        val1:= row.ValueAtIndex( 0 );
        val2:= row.ValueAtIndex( 1 );
      **finally**
        row:= nil; // (0)
      **end**;
      rowSet.GoToNext;
    **end**;
  **finally**
    FreeAndNil( rowSet );
  **end**;
**end**;

Interfaces are very useful, but Delphi Pascal's implementation leaves a lot to be desired. It is possible to write completely safe code with them, but it takes a lot of practice and care. And, as seen in the examples above, interfaces can easily be hidden in with statements and mixed with objects, so that crashes remain a mystery if the presence of a rogue interface goes undetected.



Using Delphi 7.0


  1. The class TNonReferenceCountedObject is assumed to be an implementation of the IUnknown methods that prevents reference counting, as illustrated earlier in the article.

Interfaces in Delphi - Part I

This is the first of a two-part article on interfaces. Part two is available here.

Delphi Pascal, like many other languages that refuse to implement multiple inheritance, regardless of how appropriate that solution often is, added interfaces to the mix several years ago. However, Borland failed to add garbage collection at the same time, opting instead for a COM-like reference-counting model, which automatically frees the object behind an interface when there are no more references to that interface. Any plain object references to it are left on their own.

The Perils of Reference Counting

This is not just a theoretical problem; it's extraordinarily easy to provoke this situation. The definitions below show a simple interface and a class that uses that interface:

ISomeInterface = **interface**
  **procedure** DoSomethingGreat;
**end**;

TSomeObject = **class**( TInterfacedObject, ISomeInterface )
  **procedure** DoSomethingGreat;
**end**;

Now imagine that an application has a library of functions that accept an interface of type ISomeInterface (like DoSomething in the example below). Given the definition above, if it has an instance of TSomeObject, it can magically profit from this library, even though the library doesn't know anything about TSomeObject or its inheritance chain. ProcessObjects below uses this library function in the simplest and most direct way possible.

**procedure** DoSomething( aObj: ISomeInterface );
**begin**
  aObj.DoSomethingGreat;
**end**;

**procedure** ProcessObjects;
**var**
  obj: TSomeObject;
**begin**
  obj:= TSomeObject.Create;
  **try**
    DoSomething( obj );
  **finally**
    obj.Free;
  **end**;
**end**;

At first glance, there is nothing wrong with this code. However, executing it results in an access violation (crash). Why? The short answer is that references to the object (as opposed to references to the interface) do not increase the reference count on the object. To better illustrate this point, let's unroll the DoSomething function into ProcessObjects to make the interface assignment explicit. This is shown below, with the reference count of obj noted in a comment on each relevant line:

**procedure** ProcessObjects;
**var**
  obj: TSomeObject;
  aObj: ISomeInterface;
**begin**
  obj:= TSomeObject.Create; // (0)
  **try**
    aObj:= obj; // (1)
    aObj.DoSomethingGreat; 
    aObj:= nil;  // (0) obj is freed automatically!
  **finally**
    obj.Free;
  **end**;
**end**;

With reference-counted objects, as soon as the reference count reaches 0, the object is automatically freed. Programming with this kind of pattern is, at best, a touchy affair, so most experienced Delphi programmers have learned one of two things about interfaces:

  1. Don't touch them, they're poison
  2. Use only non-reference-counted versions

A non-reference-counted interface implementation overrides the _AddRef and _Release methods to always return 1, so that the object behind the interface is never automatically released. This avoids a lot of crashes, but not all of them. Part two will show how to avoid the dreaded dangling interface.
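
For reference, a minimal sketch of such a base class might look like the following (this is an assumption of how the TNonReferenceCountedObject used in part two could be declared, not necessarily the original implementation):

**type**
  TNonReferenceCountedObject = **class**( TObject, IUnknown )
  **protected**
    **function** QueryInterface( const IID: TGUID; out Obj ): HResult; stdcall;
    **function** _AddRef: integer; stdcall;
    **function** _Release: integer; stdcall;
  **end**;

**function** TNonReferenceCountedObject.QueryInterface( const IID: TGUID; out Obj ): HResult;
**begin**
  // S_OK and E_NOINTERFACE come from the Windows unit
  **if** GetInterface( IID, Obj ) **then**
    Result:= S_OK
  **else**
    Result:= E_NOINTERFACE;
**end**;

**function** TNonReferenceCountedObject._AddRef: integer;
**begin**
  Result:= 1; // the count never changes ...
**end**;

**function** TNonReferenceCountedObject._Release: integer;
**begin**
  Result:= 1; // ... so the object is never freed automatically
**end**;

Descendants of this class can hand out interfaces without ever being freed behind your back; freeing them remains your responsibility.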

Continue to part two.


Using Delphi 7.0

Up to date with updates

We here at Encodo Systems AG have a thing for shiny new things, such as shiny, fast and geeky servers. Since every server has to be upgraded in time to ensure that we're the only ones who have access to them, we need to know whether they are up to date.

If you're encircled by a herd of Debian servers, here is a script that you might find useful (we certainly do). Add it to your crontab and you'll receive an email whenever an update is due on one of your servers.

For your copy-paste convenience:

#!/bin/bash
# apt_update_check.sh 1.0
# Check for available security updates on systems running APT
#
# BEGIN LICENSE BLOCK
#
#  Copyright (c) 2006 Encodo Software AG, Patrick Staehlin <patrick@encodo.ch>
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of version 2 of the GNU General Public License
#  as published by the Free Software Foundation.
#
#  A copy of that license should have arrived with this
#  software, but in any event can be snarfed from www.gnu.org.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
# END LICENSE BLOCK

# config
LAST_RUN="/tmp/update-last-run";
THIS_RUN="/tmp/update-this-run";
EMAIL="my-email@host.tld";

# setup
results='';
if [ ! -e $LAST_RUN ]; then
  touch $LAST_RUN;
fi

# get the list of packages to be updated
apt-get -qq update && apt-get -dqq upgrade && apt-get -sqq upgrade | grep Inst | cut -d\  -f2,3 --output-delimiter=_ > $THIS_RUN;

if [ $? -ne 0 ]; then
  exit 0;
fi

# check this list against the last run
for pkg in `cat $THIS_RUN`; do
  res=`fgrep $pkg $LAST_RUN | wc -l`;
  if [ $res -eq 0 ]; then
    results="$results$pkg\n";
  fi
done

# mark this run as the last run
mv $THIS_RUN $LAST_RUN

# if we had results mail them to $EMAIL
if [ -n "$results" ]; then
  echo -ne "Host: ${HOSTNAME}\nPackages:\n --  --  --  --  --  -- --\n${results} --  --  --  --  --  -- --\n\nPlease type apt-get upgrade to upgrade this host." |   mail -s "Upgrades available on $HOSTNAME" $EMAIL
fi
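
To wire it up, add a line like the following to your crontab (the path and schedule are just an example):

# check for pending updates every morning at 06:30
30 6 * * * /usr/local/bin/apt_update_check.sh
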
Show Scanned Files in Kaspersky

In order to show the files actually being scanned and/or analyzed by Kaspersky, do the following:

  1. First make sure that all logging is enabled.
  2. Select the Protection tab from the main window
  3. Select View Reports from the bottom left
  4. Double-click Real-time file system protection (running...) to open the report
  5. Select the Report tab to see which files are being accessed/scanned

From here, you can see which files are scanned and why some were ignored (usually because of your preferences).

Enable All Logging in Kaspersky

Kaspersky only logs errors and warnings by default. For more detailed reports on individual files scanned, do the following:

  1. Open Kaspersky
  2. Select the Settings tab from the main window
  3. Select Additional Settings from the bottom left
  4. In the General tab, check the Log all reports box
  5. Select Ok to save your settings