From MediaWiki to XWiki part II

Published by Patrick in Tips and Tricks

Previously…

In part I we talked about how to get the information out of our old wiki and transform it to the new format.

The harvested and converted content of our old MediaWiki now sits in a flat directory on our hard drive. Getting these hundreds of files into XWiki by hand would mean a serious copy-and-paste session, which we’re not willing to endure. So we’re getting some help from Groovy.

Importing

XWiki has an XML-RPC interface which allows us to sneak our pages in. Using Groovy to talk to this interface spares us from writing huge chunks of custom code. Groovy lets you open a so-called XML-RPC proxy that talks directly to an XML-RPC API. Just create the proxy and call the remote methods on it as if they were local, much like a COM object with late binding (and without the other hassles of COM objects):

import groovy.net.xmlrpc.XMLRPCServerProxy  // part of the separate Groovy XML-RPC module

serverProxy = new XMLRPCServerProxy("http://myserver/xwiki/xmlrpc/confluence")
token = serverProxy.confluence1.login(username, password)

The code above authenticates us against the XWiki running on “myserver”. The returned token is the authentication token that we have to pass to XWiki with every subsequent call.
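Since the token has to accompany every call, it can be convenient to wrap login and logout around a block of work. The following is a minimal sketch and not part of the original script: `withSession` is a hypothetical helper name, and it assumes the endpoint also exposes the `logout` call of the Confluence-style XML-RPC API.

```groovy
// Hypothetical helper: log in, run the given work with the token, then log out.
// `proxy` is anything that exposes confluence1.login / confluence1.logout,
// for example the XMLRPCServerProxy created above.
def withSession(proxy, String username, String password, Closure work) {
    def token = proxy.confluence1.login(username, password)
    try {
        // every API call made inside `work` needs this token
        return work(token)
    } finally {
        // assumption: the endpoint supports logout(token)
        proxy.confluence1.logout(token)
    }
}
```

With the proxy from above it might be called as `withSession(serverProxy, username, password) { token -> ... }`, so that the session is closed even if an import step throws.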

Now we’re going to import the exported MediaWiki pages, selecting the files with a regular expression over the filename (which is the page title):

// enumerate all files in dirname whose name matches the pattern
new File(dirname).eachFileMatch( ~"${pattern}" ) { f ->
  try {
    // the page id is "<space>.<filename>"
    spaceAndFile = "${space}." + f.getName()
    println "Importing ${spaceAndFile}…"
    // fetch the page object for that id and fill in the converted content
    page = serverProxy.confluence1.getPage(token, spaceAndFile)
    page["content"] = f.getText().replaceAll("MySpacePlaceholder", space)
    serverProxy.confluence1.storePage(token, page)
    // remove the source file only after a successful upload
    f.delete()
  } catch (Exception e) {
    println "Cannot upload the page!:\n ${e}"
    throw e
  }
}

The variables used in this piece of code are:

dirname
The directory our pages are located in.
pattern
Filename pattern, a regular expression matched against each filename. In part I we exported all of our pages, converted them to the XWiki markup, and used the page title (URI) as the filename.
space
The name of the XWiki-space to import the pages into.

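Every occurrence of “MySpacePlaceholder” is replaced by the value of space, and the source file is deleted after a successful import. If you don’t care about spaces at all, specify the space as “Main” and the pattern as “.*”. Because the snippet compiles the pattern as a regular expression, “.*” is what matches every filename, whereas a bare “*” is not a valid expression.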
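The two string operations inside the loop, building the page id and filling in the placeholder, can be tried in isolation before touching the wiki. This is only a sketch: `pageId` and `fillPlaceholder` are hypothetical helper names, and the sample text is made up for illustration.

```groovy
// Hypothetical helpers mirroring the two string operations in the import loop.

// Page id as passed to getPage: "<space>.<file name>"
String pageId(String space, String fileName) {
    "${space}.${fileName}"
}

// Replace every literal occurrence of the placeholder with the target space.
// replace() is used here instead of replaceAll() so that the space name is
// taken as plain text and not interpreted as a regex replacement string.
String fillPlaceholder(String text, String space) {
    text.replace("MySpacePlaceholder", space)
}

def sample = "Link to MySpacePlaceholder.Install and MySpacePlaceholder.FAQ"
println pageId("Main", "Install")        // prints: Main.Install
println fillPlaceholder(sample, "Main")  // prints: Link to Main.Install and Main.FAQ
```

The switch to `replace()` is a design choice of this sketch: it only matters when the space name contains characters such as `$` or `\`, which `replaceAll()` treats specially.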

You can fetch the whole file (with command-line handling, etc.) here: import.groovy script.

Running the import

If you’re happy with only one space, this solution will suit you fine. Go to the directory to which you’ve exported the wiki-content and run the following line:

groovy import.groovy . ".*" Main

This will import all articles in the current directory and put them into the “Main” space.
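Because the script deletes each source file after uploading it, it may be worth previewing which files a pattern selects first. The sketch below is a dry run that never contacts the wiki. `previewImport` is a hypothetical helper, not part of import.groovy, and it compiles the pattern as a regular expression in the same way as the loop shown earlier.

```groovy
// Hypothetical dry run: list the page ids an import would create.
List<String> previewImport(File dir, String pattern, String space) {
    def ids = []
    // eachFileMatch compares the whole file name against the compiled pattern
    dir.eachFileMatch(~pattern) { f ->
        // skip subdirectories whose names happen to match
        if (f.isFile()) ids << "${space}.${f.name}".toString()
    }
    ids.sort()
}

// Example: preview everything in the current directory for the "Main" space.
previewImport(new File("."), ".*", "Main").each { println it }
```

Narrowing the pattern here, before running the real import, is a cheap way to check that a regular expression picks up exactly the page titles intended for a given space.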

If you’d like to sort the articles into spaces in advance (like we did), you’ll have to wait for my next article, which describes a more sophisticated replacement of the MySpacePlaceholder markers.