Wednesday, March 10, 2010

JavaXPCom : Setup a Java Browser

I wrote earlier about setting up a Java browser using Mozilla's Gecko engine and SWT at http://techdior.blogspot.com/2010/03/java-web-browser-with-gecko.html. It turns out there is a simpler solution that needs neither SWT nor all that threading code. It also sits closer to XPCOM than the other approach, at least at first glance. So here we go.

Download the Gecko SDK, aka the XULRunner SDK, from Mozilla. After installation, run
xulrunner.exe --register-user
Add xulrunner-sdk\bin to your PATH variable. If that directory is not on the PATH, you will see the following exception
Exception in thread "main" java.lang.UnsatisfiedLinkError: C:\xulrunner\javaxpcomglue.dll: Can't find dependent libraries
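
Since a missing PATH entry only shows up as that rather cryptic error, it can help to check the PATH from Java before initializing anything. The following is a minimal sketch using only the JDK. PathCheck and isOnPath are hypothetical names of my own, not part of XULRunner or JavaXPCOM, and the default directory is simply the one used in this post.

```java
import java.io.File;

// Hypothetical helper, not part of XULRunner or JavaXPCOM.
class PathCheck {

    // Returns true if 'dir' appears as an entry of the given PATH-style string.
    // Entries are compared as absolute paths; symlinks and ".." are not resolved.
    static boolean isOnPath(String pathValue, File dir) {
        if (pathValue == null) {
            return false;
        }
        File wanted = dir.getAbsoluteFile();
        for (String entry : pathValue.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            // File.equals ignores case on Windows and respects it elsewhere.
            if (new File(entry).getAbsoluteFile().equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed location; pass your own xulrunner-sdk\bin as the first argument.
        File greDir = new File(args.length > 0 ? args[0] : "D:/softwares/xulrunner-sdk/bin");
        boolean ok = isOnPath(System.getenv("PATH"), greDir);
        System.out.println(greDir + (ok ? " is on the PATH" : " is NOT on the PATH"));
    }
}
```

If the check reports that the directory is missing, fix the PATH and restart Eclipse so that the new environment is picked up.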

Now create a new Java project in Eclipse. Add MozillaInterfaces.jar and MozillaGlue.jar from xulrunner-sdk\lib\ to the Java build path, then write a simple class to drive the browser. Here is the full class


import java.io.File;

import org.mozilla.interfaces.nsIAppStartup;
import org.mozilla.interfaces.nsIDOMWindow;
import org.mozilla.interfaces.nsIServiceManager;
import org.mozilla.interfaces.nsIWindowCreator;
import org.mozilla.interfaces.nsIWindowWatcher;
import org.mozilla.xpcom.Mozilla;

public class TestClass {

public static void main(String[] args) throws Exception {

Mozilla mozilla = Mozilla.getInstance();
File grePath = new File("D:/softwares/xulrunner-sdk/bin");
LocationProvider locProvider = new LocationProvider(grePath);

mozilla.initialize(grePath);
mozilla.initEmbedding(grePath, grePath, locProvider);

// Now we need to start an XUL application, so we get an instance of the
// XPCOM service manager
nsIServiceManager serviceManager = mozilla.getServiceManager();

// Now we need to get the @mozilla.org/toolkit/app-startup;1 service:
nsIAppStartup appStartup = (nsIAppStartup) serviceManager
.getServiceByContractID("@mozilla.org/toolkit/app-startup;1",
nsIAppStartup.NS_IAPPSTARTUP_IID);

// Get the nsIWindowCreator interface of the app-startup service
nsIWindowCreator windowCreator = (nsIWindowCreator) appStartup
.queryInterface(nsIWindowCreator.NS_IWINDOWCREATOR_IID);

// Get the window watcher service
nsIWindowWatcher windowWatcher = (nsIWindowWatcher) serviceManager
.getServiceByContractID(
"@mozilla.org/embedcomp/window-watcher;1",
nsIWindowWatcher.NS_IWINDOWWATCHER_IID);

// Set the window creator obtained above
windowWatcher.setWindowCreator(windowCreator);

// Create the root XUL window:
nsIDOMWindow win = windowWatcher.openWindow(null,
"http://www.google.com", "mywindow",
"chrome,resizable,centerscreen", null);

// Set this as the active window
windowWatcher.setActiveWindow(win);

// Hand over the application to xpcom/xul, this will block:
appStartup.run();

mozilla.termEmbedding();

}

}


This uses the LocationProvider class, which is more or less boilerplate unless you want to play with the package structure. It is presented next.

import java.io.File;

import org.mozilla.xpcom.IAppFileLocProvider;

public class LocationProvider implements IAppFileLocProvider {

File grePath;

public LocationProvider(File grePath) {
this.grePath = grePath;
}

public File getFile(String aProp, boolean[] aPersistent) {
File file = null;
if (aProp.equals("GreD") || aProp.equals("GreComsD")) {
file = grePath;
if (aProp.equals("GreComsD")) {
file = new File(file, "components");
}
} else if (aProp.equals("MozBinD") || aProp.equals("CurProcD")
|| aProp.equals("ComsD") || aProp.equals("ProfD")) {
file = grePath;
if (aProp.equals("ComsD")) {
file = new File(file, "components");
}
}
return file;
}

public File[] getFiles(String aProp) {
//System.out.println(aProp);

File[] files = null;
if (aProp.equals("APluginsDL")) {
files = new File[1];
files[0] = new File(grePath, "plugins");
}
return files;
}
}

Saturday, March 6, 2010

Java Web Browser with Gecko

For my latest project on analyzing web pages for user-actionable items, I wanted to create a customizable web browser. The idea is to have a controlled environment in which the web page is rendered. It would have been simple to do this with JavaScript that runs before the page is loaded, for example via a Firefox extension. However, I wanted to do it in the back end, in a headless application. The first step is to run a Gecko browser in Java.

This turned out to be simpler than expected. First, download swt.jar from Eclipse. It provides a simple widget framework inside which our browser is rendered. Once we have that, download the Gecko SDK, aka the XULRunner SDK, from Mozilla. After installation, run
xulrunner.exe --register-user

Note: I tried xulrunner.exe --register-global, but it failed on my Windows 7 machine, perhaps due to user restrictions.

Now create a new Java project in Eclipse. Add swt.jar to the Java build path, along with MozillaInterfaces.jar and MozillaGlue.jar from xulrunner-sdk\lib\. Create a simple class with the following code


Display display = new Display();
Shell shell = new Shell(display);
shell.setSize(800, 600);
shell.open();
Browser browser = new Browser(shell, SWT.MOZILLA);
browser.setBounds(shell.getClientArea());
browser.setUrl("http://www.google.com");


That is the basic code. However, it needs to be modified to handle timing issues: page loads are asynchronous, and the browser has to be driven from the SWT event thread. The entire Java file, with comments, is reproduced below. It comes from an independent source on the net, which I am unable to find at the moment and so cannot reference.

Now we just need to find a way to plug into the browser environment, get access to the DOM, execute some JavaScript, and probably write some extensions to existing DOM elements.


import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.eclipse.swt.SWT;
import org.eclipse.swt.SWTError;
import org.eclipse.swt.browser.Browser;
import org.eclipse.swt.browser.ProgressEvent;
import org.eclipse.swt.browser.ProgressListener;
import org.eclipse.swt.widgets.Display;
import org.eclipse.swt.widgets.Shell;

public class SimpleBrowserWithGo {

// We will need SWT display to execute methods
// into the SWT event thread.

Browser browser;
private Display display;

// Latch used to manage page loading
// Uses a count of 1, so when the browser starts loading
// a page, we create a new latch, which will be
// decremented when the page is loaded.
// Volatile because it is written by the caller's thread
// and read by the SWT event thread.
private volatile CountDownLatch latch;

// Default timeout to 60 seconds
private long defaultTimeout = 60000;

/**
* Creates a web browser which is able to load pages waiting until the page
* is completely loaded.
*
*/
public SimpleBrowserWithGo() {

// Use a latch to wait for the browser initialization.
final CountDownLatch initLatch = new CountDownLatch(1);

// MozillaBrowser needs a window manager to work. We are using SWT
// for the graphical interface, so we need to execute MozillaBrowser
// methods in the SWT event thread. If we used another thread, those
// methods might not work properly and could throw an exception,
// breaking the execution flow and crashing our application.
new Thread("SWT-Event-Thread") {
@Override
public void run() {

display = new Display();
Shell shell = new Shell(display);

shell.setSize(800, 600);
shell.open();

// SWT.MOZILLA requests the Gecko-based browser, which relies on
// the XULRunner registered earlier with --register-user. If the
// browser cannot be created, the constructor throws an SWTError.
try {
browser = new Browser(shell, SWT.MOZILLA);
} catch (SWTError e) {
System.out.println("Could not instantiate Browser: "
+ e.getMessage());
e.printStackTrace();
return;
}

// Adapt browser size to shell size
browser.setBounds(shell.getClientArea());

// Listens for page loading status.
browser.addProgressListener(new ProgressListener() {
public void changed(ProgressEvent event) {
}

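public void completed(ProgressEvent event) {
// When a page is loaded, decrement the latch,
// whose count will be 0 after this call. The latch is
// still null if a page completes before go() is called.
CountDownLatch current = latch;
if (current != null) {
current.countDown();
}
}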
});

// Release the initialization latch, which has value 1,
// so after this call its value will be 0.
initLatch.countDown();

while (!shell.isDisposed()) {
if (!display.readAndDispatch()) {
display.sleep();
}
}

System.exit(0);
}
}.start();

try {
// Waits until the initialization latch is released.
initLatch.await();
} catch (InterruptedException e) {
// Restore the interrupt flag instead of clearing it
Thread.currentThread().interrupt();
}
}

/**
* Loads a URL into the browser and waits until the page is fully loaded.
*
* @param url the address to load
* @throws IOException if the page does not finish loading within the timeout
*/
public void go(final String url) throws IOException {

// Creates a latch with count 1
latch = new CountDownLatch(1);

// Uses the SWT event thread to execute the method to
// load an URL in the browser.
display.syncExec(new Runnable() {
public void run() {
browser.setUrl(url);
}
});

// Waits for the finish of the page loading, or for a given
// timeout in case that the loading doesn't finish in a
// reasonable time.
boolean timeout = waitLoad(defaultTimeout);
if (timeout) {
throw new IOException("Timeout waiting page loading.");
}

}

private boolean waitLoad(long millis) {
try {
// Uses the latch created by the 'go' method to wait until
// the page has finished loading (which happens when our
// ProgressListener receives a 'completed' event), or until
// the given timeout expires if loading doesn't finish in a
// reasonable time.
boolean timeout;
timeout = !latch.await(millis, TimeUnit.MILLISECONDS);

if (timeout) {
// If the timeout expired, then we will stop
// page loading.
display.syncExec(new Runnable() {
public void run() {
browser.stop();
}
});
// Wait until loading has stopped
latch.await(millis, TimeUnit.MILLISECONDS);
}
return timeout;
} catch (InterruptedException e) {
throw new Error(e);
}
}

public static void main(String[] args) {

// Instantiate our simple web browser
SimpleBrowserWithGo simpleBrowser = new SimpleBrowserWithGo();

try {
// Use the new functionality to load some URLs
// with our browser.
// simpleBrowser.go("http://www.google.com");
// Thread.sleep(3000);
// simpleBrowser.go("http://www.urjc.es");
// Thread.sleep(3000);
simpleBrowser.go("http://www.mozilla.org");
Thread.sleep(3000);
System.in.read();
} catch (IOException e) {
System.err.println("Problems calling go method.");
e.printStackTrace();
} catch (InterruptedException e) {
System.err.println("Problems calling sleep.");
e.printStackTrace();
Thread.currentThread().interrupt();
}

Runtime.getRuntime().halt(0);

}

}
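
The waiting technique in go() and waitLoad() does not depend on SWT or Gecko, so it can be tried in isolation. Here is a minimal sketch using only the JDK: a background task stands in for the page load, and a CountDownLatch with a count of 1 is awaited with a timeout. LatchWaitSketch and runAndWait are hypothetical names for illustration. Unlike waitLoad, which returns true on timeout, this sketch returns true when the work finished in time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Stand-alone illustration of the waiting technique; no SWT or Gecko involved.
class LatchWaitSketch {

    // Runs 'work' on a background thread and waits up to timeoutMillis for it.
    // Returns true if the work completed in time, false on timeout or interrupt.
    static boolean runAndWait(final Runnable work, long timeoutMillis) {
        final CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                work.run();
            } finally {
                // Plays the role of ProgressListener.completed()
                done.countDown();
            }
        }, "worker");
        worker.setDaemon(true); // do not keep the JVM alive after a timeout
        worker.start();
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still see it
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Runnable slowTask = () -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println("fast task finished in time: " + runAndWait(() -> { }, 1000));
        System.out.println("slow task finished in time: " + runAndWait(slowTask, 100));
    }
}
```

In the browser class the "work" is started with browser.setUrl() on the SWT event thread and the latch is counted down by the progress listener, but the waiting side is the same.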

Thursday, June 18, 2009

Compass (http://www.compass-project.org/) makes integrating search functionality into your website a breeze, though only if you are building it in Java. I have experimented a lot with other languages and frameworks such as PHP, Python, and Rails, but I always come back to Java, mostly because of its compelling library support and, more recently, Compass. Maybe it is my naivety, but I couldn't find anything half as interesting anywhere else.

So, back to Compass. Lucene is a well-known search library for Java, but it was a bit difficult to set up and use until someone built a user-friendly layer on top. Don't get me wrong, Lucene is pretty good in itself and can be set up if you are ready to spend a week (two weeks if you want to optimize). However, that involves writing a lot of code, which we no longer have to do with Compass. For example, when querying on dates in Lucene, you have to convert the dates into a lexicographically sortable form (YYYY-MM-DD) before you can run range queries on them, whereas Compass has that functionality built in via a DateConverter. Compass also integrates much more cleanly with Hibernate than plain Lucene ever did. So much for the rant. Let's see it in action.
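
Before the configuration, a quick aside on the date point. An index that compares terms as strings only orders dates correctly if the most significant field comes first and every field is zero-padded. The sketch below shows that idea with plain JDK classes. It is not Lucene or Compass API, and SortableDates, toSortable and inRange are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Illustration only: why YYYY-MM-DD strings can be compared like dates.
class SortableDates {

    // ISO_LOCAL_DATE yields yyyy-MM-dd: year first, zero-padded fields.
    private static final DateTimeFormatter SORTABLE = DateTimeFormatter.ISO_LOCAL_DATE;

    static String toSortable(LocalDate date) {
        return date.format(SORTABLE);
    }

    // Inclusive range check done purely on strings, the way a
    // term-based index would compare the stored values.
    static boolean inRange(String value, String from, String to) {
        return value.compareTo(from) >= 0 && value.compareTo(to) <= 0;
    }

    public static void main(String[] args) {
        String from = toSortable(LocalDate.of(2009, 1, 1));
        String to = toSortable(LocalDate.of(2009, 6, 30));
        String hit = toSortable(LocalDate.of(2009, 6, 18));
        String miss = toSortable(LocalDate.of(2010, 3, 10));
        // prints: 2009-06-18 in range: true
        System.out.println(hit + " in range: " + inRange(hit, from, to));
        // prints: 2010-03-10 in range: false
        System.out.println(miss + " in range: " + inRange(miss, from, to));
    }
}
```

A format such as DD-MM-YYYY would break this, because "01-01-2010" sorts before "18-06-2009" as a string.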

We need two configuration files, assuming we have the data model ready.
compass.cfg.xml - Compass specific configuration


<compass-core-configuration>
<compass>
<setting name="compass.engine.connection">indexes</setting>
<mapping resource="search/model.cpm.xml"/>
</compass>
</compass-core-configuration>


model.cpm.xml - Mapping from Models to Indexes

<compass-core-mapping package="com.model">
  <class name="User" alias="user">
    <id name="id"/>
    <dynamic-meta-data name="text" converter="ognl">toString()</dynamic-meta-data>
    <constant>
      <meta-data>type</meta-data>
      <meta-data-value>user</meta-data-value>
    </constant>
    <component name="profile"/>
    <property name="userName"/>
    <property name="email"/>
  </class>
</compass-core-mapping>


Other than this, we need to add a HibernateGpsDevice as an entity listener. It listens for any updates made to the entities and mirrors them into the search index. I use Spring, so all this config goes into my application context XML.

<bean id="hibernateGpsDevice" class="org.compass.gps.device.hibernate.HibernateGpsDevice">
  <property name="name"><value>hibernateDevice</value></property>
  <property name="sessionFactory"><ref local="sessionFactory"/></property>
  <property name="nativeExtractor"><bean class="org.compass.spring.device.hibernate.SpringNativeHibernateExtractor"/></property>
</bean>

<bean id="compassGps" class="org.compass.gps.impl.SingleCompassGps" init-method="start">
  <property name="compass"><ref bean="compass"/></property>
  <property name="gpsDevices">
    <list>
      <ref bean="hibernateGpsDevice"/>
    </list>
  </property>
</bean>


Once that is done, we can create or re-create the search index by calling compassGps.index(). Now we just need to create a Compass object and our search is in order.

<bean id="compass" class="org.compass.spring.LocalCompassBean">
  <property name="configLocation" value="classpath:search/compass.cfg.xml"/>
  <property name="transactionManager" ref="transactionManager"/>
</bean>


And now do the actual search, with pagination and all.

final CompassSearchHelper compassHelper = new CompassSearchHelper(this.compass, pageSize);
return new SearchResults(compassHelper.search(new CompassSearchCommand("email: test@test.com", page)));


And we are done. Happy searching.