main()

Madhu's Workshop

Reflecting On Binary Files


Now all we need to do is translate the class file specification into a set of classes, which serves as our grammar:

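// One field per item of the ClassFile structure, in specification order.
// Each count field is followed by the table it sizes.
public class ClassFile {
    int magic_1;                  // 4 bytes, 0xCAFEBABE in a valid class file
    short minorVersion_2;
    short majorVersion_3;
    // hack, special name -- see Parser.parse
    short constant_pool_count_4;
    CPEntry[] constantPool_5;
    short accessFlags_6;
    short thisClass_7;            // index into the constant pool
    short superClass_8;           // index into the constant pool
    short interfacesCount_9;
    short[] interfaces_10;
    short fieldCount_11;
    FieldInfo[] fields_12;
    short methodCount_13;
    MethodInfo[] methods_14;
    short attributesCount_15;
    AttributeInfo[] attributes_16;
}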

This would all work beautifully if it weren't for a few non-uniform details in the class file specification. The constant pool contains one fewer entry than the count stored in the file. The reasoning is that the first element (index 0) represents null and is not included. This could have been accommodated in a more uniform manner, but the designers chose not to do so. Similarly, long and double constants each occupy two elements in the constant pool, a decision the virtual machine specification itself describes as a poor choice in retrospect. Even so, with a couple of minor hacks we can deal with it. You will also notice the trailing numbers in the field names. They are used to sort the fields, since reflection does not guarantee the order in which a class's fields are returned.
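The two quirks and the field-ordering trick described above can be sketched as follows. This is only an illustration under my own assumptions, not the code from Parser.parse: the class name GrammarQuirks, its Header stand-in, and the helper names are all hypothetical. The tag numbers and entry sizes follow the constant pool layout in the virtual machine specification.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class; none of these names come from the article's Parser.
class GrammarQuirks {

    // Tiny stand-in grammar class, declared out of order on purpose.
    static class Header {
        short minorVersion_2;
        int magic_1;
        short majorVersion_3;
    }

    // Numeric suffix after the last underscore, e.g. "magic_1" -> 1.
    static int ordinal(Field f) {
        String name = f.getName();
        return Integer.parseInt(name.substring(name.lastIndexOf('_') + 1));
    }

    // Declared fields sorted by suffix, because getDeclaredFields() promises no order.
    static List<Field> orderedFields(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic()) fields.add(f); // skip compiler- or tool-generated fields
        }
        fields.sort(Comparator.comparingInt(GrammarQuirks::ordinal));
        return fields;
    }

    // Counts the entries physically stored for a given constant_pool_count.
    static int countStoredEntries(DataInputStream in, int constantPoolCount) throws IOException {
        int stored = 0;
        int index = 1; // index 0 is reserved and never written to the file
        while (index < constantPoolCount) {
            int tag = in.readUnsignedByte();
            skipBody(in, tag);
            stored++;
            index += (tag == 5 || tag == 6) ? 2 : 1; // Long and Double use two slots
        }
        return stored;
    }

    // Skips the bytes that follow a constant pool tag.
    static void skipBody(DataInputStream in, int tag) throws IOException {
        int length;
        switch (tag) {
            case 1: length = in.readUnsignedShort(); break;              // Utf8: length, then bytes
            case 3: case 4: length = 4; break;                           // Integer, Float
            case 5: case 6: length = 8; break;                           // Long, Double
            case 7: case 8: case 16: case 19: case 20: length = 2; break; // Class, String, MethodType, Module, Package
            case 15: length = 3; break;                                  // MethodHandle
            case 9: case 10: case 11: case 12: case 17: case 18: length = 4; break; // refs, NameAndType, (Invoke)Dynamic
            default: throw new IOException("Unknown constant pool tag " + tag);
        }
        in.readFully(new byte[length]);
    }

    // Synthetic pool for the demo: one Integer (slot 1) and one Long (slots 2 and 3).
    static byte[] samplePool() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(3); out.writeInt(42);
        out.writeByte(5); out.writeLong(7L);
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        for (Field f : orderedFields(Header.class)) {
            System.out.println(f.getName());
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(samplePool()));
        // A count of 4 covers indexes 1..3, yet only two entries are stored.
        System.out.println(countStoredEntries(in, 4));
    }
}
```

The sample pool shows both quirks at once: a count of 4 describes indexes 1 through 3, but only two entries are physically present because the Long occupies two of them.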

Here's a main() method to get things going:

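import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;

public static void main(String[] args) throws Exception {
    File file = new File("bin/com/madhu/picovm/Parser.class");
    byte[] data = new byte[(int) file.length()];
    // readFully keeps reading until the array is full; a single read() may stop early
    DataInputStream in = new DataInputStream(new FileInputStream(file));
    try {
        in.readFully(data);
    } finally {
        in.close();
    }
    Parser p = new Parser();
    Object o = null;
    // Parse the same bytes 100 times; the printed time covers all 100 passes
    long start = System.currentTimeMillis();
    for (int i = 0; i < 100; i++) {
        o = p.parse(new ByteArrayInputStream(data), ClassFile.class);
    }
    long time = System.currentTimeMillis() - start;
    System.out.println(o);
    System.out.println("Time: " + time + "ms");
}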

That's it! We have a complete parser for a small but powerful class of grammars in fewer than 100 lines of code!

The grammar is easily translated from the spec, which is actually larger than the parser itself. As a comparison, a fully hand-coded parser with classes for each structure took me several days to complete and test. The parser above and the "grammar" classes took only a few hours. There are many more details in the class file, such as Code attributes, that can be accommodated just as easily. More importantly, I can tackle other binary formats just by defining the grammar. The same technique can also be used in reverse to write a class file.
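Running the technique in reverse might look something like the sketch below. It is a minimal illustration under stated assumptions, not the article's Parser: the ReflectiveWriter class and its Header stand-in are hypothetical names, and only int and short fields are handled. Arrays and nested structures such as the constant pool are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; handles int and short fields only.
class ReflectiveWriter {

    // Stand-in for the first three fields of the ClassFile grammar.
    static class Header {
        int magic_1;
        short minorVersion_2;
        short majorVersion_3;
    }

    // Fields sorted by the number after the last underscore in their names.
    static List<Field> ordered(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic()) fields.add(f);
        }
        fields.sort(Comparator.comparingInt((Field f) -> {
            String n = f.getName();
            return Integer.parseInt(n.substring(n.lastIndexOf('_') + 1));
        }));
        return fields;
    }

    // Writes each field in grammar order, choosing the width from the field type.
    static void write(Object value, DataOutputStream out)
            throws IOException, IllegalAccessException {
        for (Field f : ordered(value.getClass())) {
            f.setAccessible(true);
            Class<?> t = f.getType();
            if (t == int.class) {
                out.writeInt(f.getInt(value));
            } else if (t == short.class) {
                out.writeShort(f.getShort(value));
            } else {
                throw new IOException("Unsupported field type: " + t);
            }
        }
    }

    // Mirror image of write(): fills each field in grammar order.
    static void read(Object target, DataInputStream in)
            throws IOException, IllegalAccessException {
        for (Field f : ordered(target.getClass())) {
            f.setAccessible(true);
            Class<?> t = f.getType();
            if (t == int.class) {
                f.setInt(target, in.readInt());
            } else if (t == short.class) {
                f.setShort(target, in.readShort());
            } else {
                throw new IOException("Unsupported field type: " + t);
            }
        }
    }

    // Convenience wrapper that serializes an object to a byte array.
    static byte[] toBytes(Object value) throws IOException, IllegalAccessException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        write(value, new DataOutputStream(bytes));
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Header h = new Header();
        h.magic_1 = 0xCAFEBABE;
        h.minorVersion_2 = 0;
        h.majorVersion_3 = 50;
        byte[] data = toBytes(h);

        Header copy = new Header();
        read(copy, new DataInputStream(new ByteArrayInputStream(data)));
        System.out.println(data.length + " bytes, major version " + copy.majorVersion_3);
    }
}
```

Because the reader and the writer walk the same sorted field list, a single grammar class drives both directions.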

Some of you might wonder about performance. Historically, reflection has been slow, so I made some timing measurements and it's not bad at all. On average, it takes about 16 ms to parse the 6 KB Parser class file itself on a 1.5 GHz Pentium. Your mileage will probably vary. You might be able to do better with a hand-coded parser, but given a choice, I'll let the computer do the hard work!
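If you want to repeat this kind of measurement, a small helper along these lines is one way to get an average per run. It is a rough sketch and not the harness used for the numbers above: ParseTimer and averageMillis are hypothetical names, and the warm-up passes are my own addition so that the JIT compiler has a chance to compile the code before timing starts.

```java
// Hypothetical timing helper; not the harness used for the article's figures.
class ParseTimer {

    // Runs the task warmupRuns times untimed, then returns the mean
    // wall-clock time in milliseconds over timedRuns further runs.
    static double averageMillis(Runnable task, int warmupRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / timedRuns;
    }

    public static void main(String[] args) {
        final long[] sink = new long[1]; // keeps the loop from being optimized away
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                sink[0] += i;
            }
        };
        System.out.println("Average: " + averageMillis(work, 20, 100) + " ms");
    }
}
```

To time the parser, the Runnable would wrap a single parse call in place of the placeholder loop.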


Madhu Siddalingaiah is a consultant focusing on modern technologies such as wireless, embedded and enterprise systems. He helps organizations reach new markets and reduce costs through strategic use of information technology. Madhu has worked with a number of high-profile clients in many industries including health care, energy, aerospace, defense, and high energy physics.

Madhu has authored several books, the latest titled "XML and Web Services Unleashed". He is a popular presenter at technology conferences all over the world.


Copyright © 2008,  Ken North Computing, LLC
Last modified: March 31, 2008