Sat 29 May 2010
Quite possibly my best single line of code ever. Certainly my best single line of Java ever.
And it’s in a unit test.
private static String W = "Президент Буш";
What’s yours?
Sun 26 Apr 2009
We are starting to use Thrift and needed an Ant build recipe. Here’s what we came up with. It works well, and the only thing that looks like an abstraction leak to me is that I needed to know the package (the Java namespace) of the generated Thrift code and the name of one of the generated files. The primary goal was to avoid running the Thrift generator when the generated code is already newer than the .thrift files. There isn’t a one-to-one mapping between .thrift files and generated output, so if any of the .thrift files is newer than the generated code, it all gets recreated.
I also didn’t want to have to copy the thrift output to someplace else, so a javac target was added to just treat the “gen-java” thrift output as a new source directory for direct java compilation. The normal ant target to compile the java code can now just depend on “thrift-gen”.
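The up-to-date rule described above can be sketched in plain Java. This is only an illustration of the timestamp comparison, not the Ant recipe itself; the class name `ThriftStaleness`, the `src/thrift` directory and the marker file path are all made up for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ant or Thrift: decides whether the
// Thrift generator needs to run again.
class ThriftStaleness {

    // True when the generated marker is missing (timestamp 0) or when any
    // .thrift source was modified after the marker was written.
    static boolean isStale(List<Long> sourceTimes, long generatedTime) {
        if (generatedTime == 0L) {
            return true;
        }
        for (long sourceTime : sourceTimes) {
            if (sourceTime > generatedTime) {
                return true;
            }
        }
        return false;
    }

    // File-based wrapper; File.lastModified() returns 0 for a missing file.
    static boolean isStale(File thriftDir, File generatedMarker) {
        List<Long> times = new ArrayList<Long>();
        File[] files = thriftDir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.getName().endsWith(".thrift")) {
                    times.add(f.lastModified());
                }
            }
        }
        return isStale(times, generatedMarker.lastModified());
    }

    public static void main(String[] args) {
        // Assumed layout: sources in src/thrift, one known generated file as the marker.
        File sources = new File("src/thrift");
        File marker = new File("gen-java/com/example/ExampleService.java");
        System.out.println(isStale(sources, marker) ? "regenerate" : "up to date");
    }
}
```

Checking against a single known generated file is the same compromise as in the build: it is why the name of one generated file has to be known up front.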
Sat 21 Feb 2009
JNA is surely deserving of all the praise it has been getting. It’s being used on some pretty high profile projects like JRuby with great success. After having done JNI the hard way, the painful, tortuous, despicable, bang-head-on-keyboard-while-wondering-if-ReleaseStringUTFChars-applies-here and why-the-jvm-is-segfaulting-again way, well I have a deep appreciation for JNA.
Still, who wants to go write a bunch of useless Java interfaces for stuff that is already built into the native library itself? Not me. So that’s where Jython and JRuby come in.
This week I needed Jython/Python access to some native modules, namely ssdeep for fuzzy hashing. There’s already a pretty nice solution for connecting pure python to native libraries — you can either use swig or pyrex. The pyrex piece for ssdeep has been mostly written here. Needed to add the fuzzy_hash_buf method into that mix but it was nice and easy. From inside pure python with ssdeepmodule.so (via pyrex) and libfuzzy.so (from ssdeep) (or .dylib for Mac or .dll on Windows) sitting there on your LD_LIBRARY_PATH, you get to do this coolness:
from ssdeep import ssdeep
import sys, os
f = open("/bin/ls","rb")
data = f.read()
f.close()
ss = ssdeep()
fuzzy_hash = ss.fuzzy_hash_buf(data)
Pretty nice, you have to admit. But from my pure Java p2p data-driven workflow framework, I really wanted to do this from Jython to keep from having to start the interpreter up in a subprocess over and over. Pyrex extensions do not work in Jython. Makes perfect sense. M’kay. I could write a whole bunch of lame JNI code to hook libfuzzy.so in there. Or I could use JNA and write some non-DRY interface in Java and figure out all the details of the types and so forth. Or … I could just push all that code down into the Python module that I’m going to call.
from com.sun.jna import NativeLibrary, Function, Memory
import sys,os
class ssdeep:
    fuzlib = None
    hash_func = None
    FUZZY_HASH_SIZE = 116
    def __init__(self):
        # look up libfuzzy and the native function once, at construction time
        self.fuzlib = NativeLibrary.getInstance('fuzzy')
        self.hash_func = self.fuzlib.getFunction('fuzzy_hash_buf')
    def hash_data(self,data):
        # buffer that the native call fills with the hash string
        ptr = Memory(self.FUZZY_HASH_SIZE)
        i = self.hash_func.invokeInt([data,len(data),ptr])
        return ptr.getString(0,False)
With a class and method conveniently named exactly the same as the pyrex module I can make it all flexible enough to work either way:
try:
    # try the pyrex extension module
    from ssdeep import ssdeep
except ImportError:
    try:
        # try the jna wrapper when in jython
        from ssdeepjna import ssdeep
    except ImportError:
        # write tmp files and just exec the dumb thing
        pass
Which is all pretty nice I think. Not too many worries about creating interfaces or other crazy things. Seems very efficient, a little extra packaging and we are good to go.
Well there was one problem in getting different values from the hashes when the data was binary. Turns out the JNA layer needs to be told how to convert data with -Djna.encoding=8859_1 on the JVM command line. Since I usually run with -Dfile.encoding=UTF-8 and in a UTF-8 locale, this made all the difference.
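To see why the encoding matters for binary data, here is a small Java sketch (the class name `EncodingRoundTrip` is made up for the illustration). Pushing arbitrary bytes through a String and back only preserves them when the charset maps every byte value to a character, which ISO-8859-1 does and UTF-8 does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: round-trips raw bytes through a String.
class EncodingRoundTrip {

    // Decode the bytes with the charset, then encode the result again.
    static byte[] roundTrip(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // Bytes above 0x7f, as found in any binary file.
        byte[] raw = {(byte) 0x7f, (byte) 0x80, (byte) 0xff, (byte) 0xfe};

        // ISO-8859-1 maps each byte to exactly one char, so nothing is lost.
        System.out.println(Arrays.equals(raw, roundTrip(raw, StandardCharsets.ISO_8859_1)));

        // These bytes are not valid UTF-8; decoding substitutes replacement
        // characters, so the bytes that come back are different.
        System.out.println(Arrays.equals(raw, roundTrip(raw, StandardCharsets.UTF_8)));
    }
}
```

With a UTF-8 default, the native library ends up hashing the mangled bytes rather than the original file contents, which would explain the differing hashes.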
If setting jna.encoding for the whole JVM is inconvenient, or you want to encode things differently only sometimes, the extra steps in the Python layer would be something like
from java.lang import String
def wrap_hash_buf(self,data):
    # round-trip through ISO-8859-1 so every byte value survives unchanged
    javastr = String(data,"8859_1")
    jbytes = javastr.getBytes("8859_1")
    return self.hash_data(jbytes)
The same type of thing would work just as well from JRuby.
Welcome to the sweet spot. Code on, baby!
Mon 24 Mar 2008
The JVM has the -Xnoclassgc argument to inhibit class garbage collection. That is so 2006. If you have a long-running server JVM, this is most likely going to leak memory.
Specifically, if your process
then you have a slow, psssssssssssssssssssss sound coming from your JVM.
Do you seriously think that in 2008 you know that none of the jars on your classpath use reflection or serialization?
We live in a world now where JRuby comes along and may generate holder and wrapper classes during runtime and do all sorts of stuff to the internals trying to get things to byte-compile down. Only geniuses can truly understand this stuff.
Instead of trying to save a few millis over the course of a month of server JVM run time by turning off the class GC, make it ConcurrentMarkSweep enabled instead:
java -server -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled
and let’s relegate -Xnoclassgc to the scrap-heap. It probably doesn’t do what you think it does.
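If you are not sure how a given JVM was started, a quick check along these lines may help. It is just a sketch using the standard java.lang.management beans; the class name `ClassGcCheck` is made up, and the counts it prints are only a rough hint about class unloading, not a leak detector.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical diagnostic class: reports whether class GC was switched off
// and how many classes have been loaded and unloaded so far.
class ClassGcCheck {

    // True when the JVM arguments contain the -Xnoclassgc switch.
    static boolean disablesClassGc(List<String> jvmArgs) {
        return jvmArgs.contains("-Xnoclassgc");
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();

        System.out.println("-Xnoclassgc set: " + disablesClassGc(jvmArgs));
        System.out.println("loaded now:      " + classes.getLoadedClassCount());
        System.out.println("unloaded so far: " + classes.getUnloadedClassCount());
    }
}
```

A loaded count that only ever climbs on a long-running server, with nothing ever unloaded, is the slow leak described above.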
Sun 3 Feb 2008
It all started with Log4j — Java, log4j and I sort of grew up together. I loved log4j. But along came j.c.l, jdk1.4 logging and I had to try them all out. When building a library that other people are supposed to embed into something of theirs, though, turning the logging choice over to them makes a lot of sense. As our little project grew and needed to bring in other components we always appreciated those that did logging the way we wanted it done.
Jetty was a standout in that area — I first learned the coolness of dependency injection from studying Jetty. Jetty provided our first introduction to Slf4j. Slf4j is sort of the ultimately flexible logging system. You build to the Slf4j API and drop in the statically bound jar file for the back end implementation you want (e.g. log4j, logback, log4juli). This is a fantastic idea and thanks are due to Ceki Gülcü for log4j, logback and slf4j.
Quite some time back we felt the pain of just trying out a new logging system. Since every file has imports and a factory method call to get the logger, it is painful to switch. So we insulated ourselves from that by creating our own Logger API, an abstract base class that pretty closely matched what we needed and were using from log4j. We had a Logger.getLogger(String) method, debug, info, warn, error and fatal levels. And we had a LogManager that acted as the factory for it all, returning a wrapper around log4j loggers that met our API. Life was good.
We added Slf4j into the mix at first only to get logging from some Jakarta components that were using j.c.l out into the log4j backend where we wanted them. Slf4j was fantastic at that.
Then the haunting question — why aren’t we doing in our own library the very thing that we enjoy about the logging capabilities of Jetty? Jetty has a soft dependency on Slf4j and falls back to some type of logging to stdout when it isn’t present. Cool. We can do that too.
The friction came when I realized that our nice little log4j-based API abstraction needed NDC (log4j’s nested diagnostic context). The Slf4j API does not support NDC. Major bummer. We find NDC incredibly helpful in debugging a complex multi-threaded system. We have pushContext(String), popContext() and clearContext() methods in our logging abstraction and I don’t really want to give them up.
But hey! Why not make a soft dependency on those features just like we are doing for Slf4j? So in the new Slf4jLogger that we use to encapsulate our Slf4j capability, we can check for the presence of org.apache.log4j.NDC and grab those methods we need for NDC. If they aren’t there, ok, we drop the NDC information, but if they are there, then we have it all!
Here’s some code to add the soft dependency on NDC. First, in the static initializer of our Slf4jLogger class:
// requires: import java.lang.reflect.Method;
// Support methods for ndc capability if log4j is available
private static Method ndcClear;
private static Method ndcPop;
private static Method ndcPush;
static {
    /* Not shown - Slf4j soft dependency loading */
    Class ndcz = null;
    String NDC = "org.apache.log4j.NDC";
    try
    {
        // prefer the context class loader, fall back to the defining loader
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        ndcz = cl==null?Class.forName(NDC):cl.loadClass(NDC);
        ndcClear = ndcz.getMethod("clear",new Class[]{});
        ndcPop = ndcz.getMethod("pop",new Class[]{});
        ndcPush = ndcz.getMethod("push",new Class[]{String.class});
    } catch (Throwable t) { /* log4j not present: NDC stays disabled */ }
}
So that grabs the underlying methods we need when they are present and doesn’t carp when they aren’t.
We can wrap that up in a simple function to let us know when NDC capability is present:
protected static boolean isNDCEnabled ()
{
    return ndcPush != null;
}
And protect the actual methods that are the NDC part of the logger API:
/**
 * Push the string on the ndc context stack if the capability is supported
 * @param s the new context
 */
public void pushContext(String s)
{
    if (isNDCEnabled())
    {
        try
        {
            ndcPush.invoke(null,s);
        }
        catch (Exception e) { /* e.printStackTrace(); */}
    }
}
Since we are pushing the context into the actual Log4j NDC object, we didn’t have to reinvent anything. If Log4j is on the runtime classpath the features will be found. If Slf4j is configured to actually use Log4j as the backend for logging, then the NDC will actually pop out of the log and we still don’t need Log4j to be on the classpath at compile time!
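For anyone who wants to try the pattern outside of our logger, here is a minimal standalone sketch of the same soft-dependency idea. The class `SoftDependency` and its two helpers are invented for the example and are not part of Slf4j or Log4j. It looks a static method up by name and quietly does nothing when the class is not on the classpath.

```java
import java.lang.reflect.Method;

// Hypothetical helper class illustrating a reflective soft dependency.
class SoftDependency {

    // Returns the named public method, or null when the class or the
    // method cannot be found at runtime.
    static Method findMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            ClassLoader cl = Thread.currentThread().getContextClassLoader();
            Class<?> c = (cl == null) ? Class.forName(className) : cl.loadClass(className);
            return c.getMethod(methodName, paramTypes);
        } catch (Throwable t) {
            return null; // optional dependency is absent
        }
    }

    // Invokes a static method if it was found; returns null otherwise
    // or when the call itself fails.
    static Object invokeStatic(Method m, Object... args) {
        if (m == null) {
            return null;
        }
        try {
            return m.invoke(null, args);
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Method push = findMethod("org.apache.log4j.NDC", "push", String.class);
        System.out.println("NDC available: " + (push != null));
        invokeStatic(push, "request-42"); // silently skipped without log4j
    }
}
```

Run it with and without log4j on the classpath and the only thing that changes is the reported availability; nothing needs log4j at compile time.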
So have your Slf4j and your NDC too!
Sun 3 Feb 2008
String s = "This is a test of the Google Syntax highligher plugin.";
Mon 14 Jan 2008
Bill: What happens if I specify -Xmx multiple times on the command line?
Me: I don’t know, let’s find out.
java -Xms1024m -Xmx99999 -Xmx1024m -version
java version "1.5.0_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05-237)
Java HotSpot(TM) Client VM (build 1.5.0_13-119, mixed mode)
java -Xms1024m -Xmx1024m -Xmx99999 -version
Error occurred during initialization of VM
Incompatible initial and maximum heap sizes specified
Abort trap
Me: The last one always wins.
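Another way to see which value won is to ask the running JVM. This is a tiny sketch (the class name `HeapCheck` is made up); note that the number reported by Runtime.maxMemory() can come out a little below the -Xmx value, depending on the JVM and collector.

```java
// Hypothetical check: prints the maximum heap the JVM will attempt to use.
class HeapCheck {

    // Maximum heap in megabytes, as reported by the runtime.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("effective max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it as `java -Xmx512m -Xmx1024m HeapCheck` and then with the two flags swapped should show the last flag taking effect.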
Mon 14 Jan 2008
Suppose you have a Java interface
public interface Filter {
    int filter(Payload p);
    int filter(java.util.Collection<Payload> plist);
}
and an abstract class that implements the interface:
public abstract class AbstractFilter implements Filter {
    public int filter(java.util.Collection<Payload> plist) {
        // the return value is an assumption here: the sum of the per-payload results
        int total = 0;
        for (Payload p : plist) {
            total += filter(p);
        }
        return total;
    }
}
I sometimes do stuff like this, providing convenience methods in an
abstract base class to enrich an API without putting a burden
on those wishing to implement it, and not tempting anyone to
cut-and-paste a bunch of code around.
Since the filter(Payload p) method is declared on the interface and
because AbstractFilter is, well, abstract, I don’t need to mention
filter(Payload p) in it.
But suppose someone wants to provide an implementation:
public class FooFilter extends AbstractFilter {
    @Override
    public int filter(Payload p) {
        ...
    }
}
This fails on JDK 1.5 with “method does not override a method from its superclass”.
This will compile without warning on JDK 1.6, and we could just run around and tell
everyone not to use @Override in that situation, but, it’s clearer and better for everyone
and compiles without warnings in both JDK 1.5 and 1.6 if we just add this line to
the AbstractFilter class
public abstract int filter(Payload p);
This is nice because it gives a place to put some javadoc for the AbstractFilter class,
returns search results if someone is grepping the source for filter methods, and it
allows the @Override to work as it should.
So +3 and -0, on that. It is an all around winner.
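Putting the pieces together, a compilable sketch might look like the following. The `Payload` class, the `LengthFilter` implementation and the choice to sum the per-payload results are all assumptions made so the example is self-contained.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical stand-in for the real Payload type.
class Payload {
    final String body;
    Payload(String body) { this.body = body; }
}

interface Filter {
    int filter(Payload p);
    int filter(Collection<Payload> plist);
}

abstract class AbstractFilter implements Filter {
    /** Filter a single payload; redeclared here so subclasses can use @Override. */
    public abstract int filter(Payload p);

    /** Convenience method: filters each payload and sums the results (an assumption). */
    public int filter(Collection<Payload> plist) {
        int total = 0;
        for (Payload p : plist) {
            total += filter(p);
        }
        return total;
    }
}

// Made-up implementation that returns the payload length.
class LengthFilter extends AbstractFilter {
    @Override
    public int filter(Payload p) {
        return p.body.length();
    }
}

class FilterDemo {
    public static void main(String[] args) {
        Filter f = new LengthFilter();
        // "ab" has length 2 and "cde" has length 3, so this prints 5
        System.out.println(f.filter(Arrays.asList(new Payload("ab"), new Payload("cde"))));
    }
}
```

The implementer writes only the single-payload method and gets the collection version for free, with @Override checked against the abstract redeclaration.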
Fri 4 Jan 2008
Thanks to a little help from the JRuby user mailing list, I learned that the behavior we had been seeing is not a bug. Local variables that come from an eval are created in ThreadLocal storage. I think I already should have known this, as I was not having similar issues with global variables.
But in order to keep multiple users from interfering with each other and never being able to use local variables, I changed our RubyConsole object to run on its own thread. It now implements the HttpSessionBindingListener interface so that I can properly cleanup when the session expires. So there is happiness in JRuby integration land again.
I also learned how to capture the stdout and stderr from the JRuby environment and stuff them into ByteArrayOutputStream instances so that I can capture the data and put it on the HTML response along with the toString of any actual RubyObject that comes back from the eval call. So our JRuby console over AJAX HTTP looks a lot more like what happens in irb on a local terminal. This isn’t currently possible to do with pure BSF, but with a little help from JRuby objects it works by setting up like this when a new BSFManager is created.
protected ByteArrayOutputStream stdout = new ByteArrayOutputStream();
Ruby runtime = Ruby.getDefaultInstance();
IRubyObject out = new RubyIO( runtime, stdout );
manager.declareBean("stdout", out, RubyIO.class);
manager.declareBean("defout", out, RubyIO.class);
manager.declareBean(">", out, RubyIO.class);
And then after every eval call, we can get any available stdout data by calling
stdout.toString();
stdout.reset();
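The read-then-reset step is easy to get wrong if it gets scattered around, so it can be wrapped up in one place. This is a plain-Java sketch of that idea only; `OutputCapture` is a made-up class, and it uses an ordinary PrintStream where the console hands the stream to JRuby.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical wrapper: collects output in memory and hands it back in chunks.
class OutputCapture {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final PrintStream stream = new PrintStream(buffer, true);

    // The stream that the evaluated code should write to.
    PrintStream stream() {
        return stream;
    }

    // Returns everything written since the last call and empties the buffer.
    String drain() {
        stream.flush();
        String text = buffer.toString();
        buffer.reset();
        return text;
    }

    public static void main(String[] args) {
        OutputCapture capture = new OutputCapture();
        capture.stream().print("hello from the first eval");
        System.out.println("[" + capture.drain() + "]");
        System.out.println("[" + capture.drain() + "]"); // nothing new was written
    }
}
```

Calling drain() after each eval gives exactly the output of that eval, which is what gets appended to the HTML response.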
There are some other nice ways to do this without BSF and I still need to investigate whether there are improvements in JSR 223 in JDK 6 that would make this easier. But it works really nicely. Way to go JRuby guys!