Java Wikipedia API

The main purpose of the Java Wikipedia API (Bliki engine) is to render Wikipedia syntax into HTML.

For an overview of the complete API, please see the Bliki Javadoc.

Special features in the Wikipedia to HTML renderer:

  • At the moment the <math> tag creates HTML suitable for the jsMath JavaScript package: http://www.math.union.edu/~dpvc/jsMath/ (if you override the IWikiModel#isMathtranRenderer() method, the renderer produces image links to the http://www.mathtran.org service instead).
  • Syntax highlighting for code snippets is enabled by default and can be switched off in the IWikiModel#showSyntax() method. Surround the code in tags like: <source lang=java>...</source>, <source lang=php>...</source>, <source lang=csharp>...</source>, <source lang=javascript>...</source>, <source lang=xml>...</source>, <source lang=abap>...</source>.
  • The Interwiki link targets from the default Configuration (see INTERWIKI_STRINGS) are supported.

Testpages

Download

You can download the latest release here:

Forum

There is a Bliki Wikipedia Engine Forum for discussing the Wikipedia Engine API.

Wikipedia text to HTML Example

The general idea behind the wiki to HTML "WikiModel" is that the common wiki syntax rendering is hidden in the internal WikipediaParser. Users of the API should derive a class from WikiModel or AbstractWikiModel and handle any application-specific behaviour there.

A simple wiki to html fragment looks like this:

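	public static void main(String[] args) {
		// ${image} and ${title} are placeholders that the engine fills with
		// the image name and the page title when it builds link URLs
		WikiModel wikiModel = new WikiModel(
				"http://www.mywiki.com/wiki/${image}",
				"http://www.mywiki.com/wiki/${title}");
		String htmlStr = wikiModel.render("This is a simple [[Hello World]] wiki tag");
		System.out.print(htmlStr);
	}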
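
To illustrate what the two constructor arguments mean, here is a minimal sketch of how a ${title} placeholder could be expanded into a link target. The class LinkTemplateSketch and its expand method are hypothetical and not part of Bliki. Mapping spaces to underscores follows the usual Wikipedia URL convention and is only an assumption about what the engine does internally.

```java
public class LinkTemplateSketch {

    // Hypothetical helper, not part of the Bliki API: substitutes a page
    // title into a URL template that contains the ${title} placeholder.
    public static String expand(String template, String title) {
        // Assumption: spaces become underscores, as in Wikipedia URLs.
        String urlTitle = title.trim().replace(' ', '_');
        // String.replace works on literal text, so "${title}" is not a regex.
        return template.replace("${title}", urlTitle);
    }

    public static void main(String[] args) {
        String template = "http://www.mywiki.com/wiki/${title}";
        // prints http://www.mywiki.com/wiki/Hello_World
        System.out.println(expand(template, "Hello World"));
    }
}
```

The ${image} placeholder in the first constructor argument can be read the same way, with the image name in place of the page title.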

For example, you can override the WikiModel#parseInternalImageLink() method to change the default rendering behaviour of the [[Image:...]] tag.

public class WikiTestModel extends WikiModel {
  public WikiTestModel(String imageBaseURL, String linkBaseURL) {
    super(imageBaseURL, linkBaseURL);
  }

  public void parseInternalImageLink(StringBuffer writer, String imageNamespace, String name) {
    ...

    ...
  }
}

To avoid cross-site scripting risks, the rendering engine doesn't allow the style attribute by default. You can mark the style attribute as allowed in a static block of your WikiModel implementation.

  static {
    TagNode.addAllowedAttribute("style");  
    ...
  }
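
The idea behind such an allow-list can be sketched independently of Bliki: attributes that are not explicitly allowed are dropped before the HTML is written. The class AttributeAllowListSketch below is a hypothetical illustration and does not show how TagNode is implemented.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public final class AttributeAllowListSketch {

    // Names of the attributes that survive filtering (stored in lower case).
    private final Set<String> allowed = new HashSet<>();

    public AttributeAllowListSketch(String... names) {
        for (String name : names) {
            allow(name);
        }
    }

    // Adds one attribute name to the allow-list.
    public void allow(String name) {
        allowed.add(name.toLowerCase(Locale.ROOT));
    }

    // Returns a copy of the attributes that contains only allowed names.
    public Map<String, String> filter(Map<String, String> attributes) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            if (allowed.contains(entry.getKey().toLowerCase(Locale.ROOT))) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("class", "note");
        attributes.put("style", "color: red");

        AttributeAllowListSketch filter = new AttributeAllowListSketch("class");
        System.out.println(filter.filter(attributes).keySet()); // [class]
        filter.allow("style");
        System.out.println(filter.filter(attributes).keySet()); // [class, style]
    }
}
```

Allowing style trades safety for flexibility, so it is worth enabling only when the wiki text comes from a trusted source.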

Look in the WikiModel.java and AbstractWikiModel.java sources for an example.

A more advanced example can be found in the HTMLCreatorTest.java file. The first time you run it, the Tom Hanks wiki source is downloaded from Wikipedia through the Wikipedia API. The downloaded wiki texts and templates are stored in an Apache Derby database, and the associated images are saved to the image directory c:/temp/WikiImages, which must already exist. The first run creates a new Derby database in the directory C:\temp\WikiDB. Every subsequent run downloads only the Tom Hanks wiki source, because the associated templates and images are already cached in the Derby database and in the image directory:

	public static void testCreator001() {
		String[] listOfTitleStrings = { "Tom Hanks" };
		User user = new User("", "", "http://en.wikipedia.org/w/api.php");
		user.login();
		String mainDirectory = "c:/temp/";
		// the following subdirectory should not exist if you would like to create a
		// new database
		String databaseSubdirectory = "WikiDB";
		// the following directory must exist for image downloads
		String imageDirectory = "c:/temp/WikiImages";
		WikiDB db = null;

		try {
			db = new WikiDB(mainDirectory, databaseSubdirectory);
			APIWikiModel wikiModel = new APIWikiModel(user, db, "${image}", "${title}", imageDirectory);
			DocumentCreator creator = new DocumentCreator(wikiModel, user, listOfTitleStrings);
			creator.setHeader(HTMLConstants.HTML_HEADER1 + HTMLConstants.CSS_STYLE + HTMLConstants.HTML_HEADER2);
			creator.setFooter(HTMLConstants.HTML_FOOTER);
			creator.renderToFile(mainDirectory + "TomHanks.html");
		} catch (IOException e) {
			e.printStackTrace();
		} catch (Exception e1) {
			e1.printStackTrace();
		} finally {
			if (db != null) {
				try {
					db.tearDown();
				} catch (Exception e) {
					e.printStackTrace();
				}
			}
		}
	}
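
The caching behaviour described above (download on the first request, read from the local store afterwards) can be sketched without Derby. In this hypothetical TemplateCacheSketch class an in-memory map stands in for the database and a function stands in for the network download. None of these names come from Bliki.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public final class TemplateCacheSketch {

    // Stands in for the Derby database that holds downloaded wiki texts.
    private final Map<String, String> cache = new HashMap<>();
    // Stands in for the download through the Wikipedia API.
    private final Function<String, String> downloader;
    private int downloads = 0;

    public TemplateCacheSketch(Function<String, String> downloader) {
        this.downloader = downloader;
    }

    // Returns the cached text, downloading it only on the first request.
    public String get(String title) {
        String text = cache.get(title);
        if (text == null) {
            text = downloader.apply(title);
            downloads++;
            cache.put(title, text);
        }
        return text;
    }

    public int downloadCount() {
        return downloads;
    }

    public static void main(String[] args) {
        TemplateCacheSketch cache =
                new TemplateCacheSketch(title -> "text of " + title);
        cache.get("Template:Example");
        cache.get("Template:Example"); // served from the cache
        System.out.println(cache.downloadCount()); // prints 1
    }
}
```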

HTML to Wikipedia text example

The HTML 2 Wiki parser is derived from this project:

A simple html to wiki fragment looks like this:

		HTML2WikiConverter conv = new HTML2WikiConverter();
		conv.setInputHTML("<b>hello<em>world</em></b>");
		String result = conv.toWiki(new ToWikipedia());
		// JUnit convention: expected value first, actual value second
		assertEquals("'''hello''world'''''", result);
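
The five trailing apostrophes in the expected result are the closing italic marker ('') followed by the closing bold marker ('''). The naive sketch below makes that mapping visible with plain string substitution. It is a hypothetical illustration only, because the real converter parses the HTML rather than using regular expressions.

```java
public final class BoldItalicSketch {

    // Naive tag-to-markup substitution for bold and italic tags only.
    public static String toWiki(String html) {
        return html
                .replaceAll("(?i)</?(b|strong)>", "'''") // bold: three apostrophes
                .replaceAll("(?i)</?(em|i)>", "''");     // italic: two apostrophes
    }

    public static void main(String[] args) {
        // prints '''hello''world'''''
        System.out.println(toWiki("<b>hello<em>world</em></b>"));
    }
}
```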

You can use the HTML to Wikipedia text converter online here:

Sourcecode to syntax highlighted HTML converter

A syntax highlighter is available for the languages: ABAP (SAP R/3), C#, Java, JavaScript, PHP, XML(HTML).

A conversion from Java to syntax highlighted HTML looks like this:

import info.bliki.wiki.tags.code.JavaCodeFilter;
import info.bliki.wiki.tags.code.SourceCodeFormatter;

public class HelloWorld {
	
	public static void main(String[] args) {
		String javaCode = "public class HelloWorld {\n" + 
				"	public static void main(String[] args) {\n" + 
				"		System.out.println(\"Hello World\");\n" + 
				"	}\n" + 
				"}\n" + 
				"";
		SourceCodeFormatter f = new JavaCodeFilter();
		String result;
		String coding1 = "<pre class=\"java\" style=\"border: 1px solid #b4d0dc; background-color: #ecf8ff;\">";
		String coding3 = "</pre>";
		result = f.filter(javaCode);
		result = coding1 + result + coding3;
		System.out.println(result);
	}
}

You can use the source code to syntax highlighted HTML converter online here:

Wikipedia text to PDF conversion

There's partial support for converting a Wikipedia text into a PDF document.

See for example the PDFCreatorTest.java file:

	public static void testPDF001() {
		String[] listOfTitleStrings = { "Tom Hanks" };
		User user = new User("", "", "http://en.wikipedia.org/w/api.php");
		user.login();
		WikiDB db = null;
		String mainDirectory = "c:/temp/";
		// the following subdirectory should not exist if you would like to create a
		// new database
		String databaseSubdirectory = "WikiDB";
		// the following directory must exist for image downloads
		String imageDirectory = "c:/temp/WikiImages";
		try {
			db = new WikiDB(mainDirectory, databaseSubdirectory);
			APIWikiModel myWikiModel = new APIWikiModel(user, db, "${image}", "file:///c:/temp/${title}", imageDirectory);
			DocumentCreator creator = new DocumentCreator(myWikiModel, user, listOfTitleStrings);

			creator.renderPDFToFile(mainDirectory, "Tom_Hanks.pdf", HTMLConstants.CSS_STYLE);
		} catch (Exception e) {
			e.printStackTrace();
		} finally {
			if (db != null) {
				try {
					db.tearDown();
				} catch (Exception e) {
					e.printStackTrace();
				}
			}
		}
	}

The PDF generation is based on the Flying Saucer All-Java XHTML Renderer project. See these webpages for more information:

Wikipedia text to Docbook conversion

There's also partial support for converting a Wikipedia text into a Docbook document.

	// some wiki text stored in a test string:
	public final static String TEST1 = "...";

	public void testDocbook() {
		WikiModel myWikiModel = new WikiTestModel("file:///c:/temp/${image}", "file:///c:/temp/${title}");
		String renderedXHTML = myWikiModel.render(TEST1);
		DocbookGenerator gen = new DocbookGenerator();
		try {
			String output = gen.create(renderedXHTML, DocbookGenerator.HEADER_TEMPLATE, DocbookGenerator.FOOTER, "Big Docbook Test");
			System.out.println(output);
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

Helper classes for the Wikimedia api.php

Helper classes for the MediaWiki api.php interface can be found in the package info.bliki.api.

For example, this snippet determines the categories used by the "Main Page" and "API" (http://meta.wikimedia.org/wiki/API) pages of the Wikimedia Meta-Wiki:

	public static void testQueryCategories001() {
		String[] listOfTitleStrings = { "Main Page", "API" };
		User user = new User("", "", "http://meta.wikimedia.org/w/api.php");
		user.login();
		List<Page> listOfPages = user.queryCategories(listOfTitleStrings);
		for (Page page : listOfPages) {
			// print page information
			System.out.println(page.toString());
			for (int j = 0; j < page.sizeOfCategoryList(); j++) {
				Category cat = page.getCategory(j);
				// print every category in this page
				System.out.println(cat.toString());
			}
		}
	}
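
For orientation, a category query against api.php is an ordinary HTTP request. The sketch below builds such a URL from the standard MediaWiki parameters action=query, prop=categories and titles, with several titles separated by a | character. The class CategoryQueryUrlSketch is hypothetical, and whether info.bliki.api sends exactly this request is an assumption.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public final class CategoryQueryUrlSketch {

    // Builds a MediaWiki API URL that asks for the categories of the titles.
    public static String buildUrl(String apiUrl, String... titles) {
        // The MediaWiki API expects multiple titles joined with '|'.
        String joined = String.join("|", titles);
        // Requires Java 10 or later for the Charset overload of encode().
        String encoded = URLEncoder.encode(joined, StandardCharsets.UTF_8);
        return apiUrl + "?action=query&prop=categories&format=xml&titles=" + encoded;
    }

    public static void main(String[] args) {
        System.out.println(
                buildUrl("http://meta.wikimedia.org/w/api.php", "Main Page", "API"));
    }
}
```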

Development

The latest source code can be found in the project's SVN repository:

For usage examples please look at the JUnit tests:

How to use SVN:

Similar Java projects