Java micro benchmark with jmh and Netbeans

Note that jmh has evolved significantly since I wrote this post and some of the information below might be obsolete.

jmh (Java Microbenchmark Harness) is an open source micro-benchmarking tool for Java, developed as part of the OpenJDK project. I have been using it for a few weeks and found it easy to use and very useful. One advantage it has over Caliper is that it runs on Windows.

Installation

The installation process is fairly straightforward using Maven. For example, with Netbeans, it can be done by following these steps:

  • Download source (you need to have Mercurial installed):

  • Open, compile and install the library:

    • Netbeans then proposes to open the project: click Open Projects
    • Select the top project and click open
    • Right click on the project > Custom > Goals
    • In Goals, type: clean install -DskipTests=true

Create a Microbenchmark Project

  • Menu File > New Project
  • Select Maven / Java Application > Next
  • Let’s call it performance
  • Enter a GroupID (I use com.assylias for this example)
  • Click Finish

Let’s now configure the dependencies and allow the project to run jmh:

  • In the project’s Project Files, select and edit pom.xml
  • Use the following dependencies and build settings:

<dependencies>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-core</artifactId>
        <version>1.0-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.0</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <finalName>microbenchmarks</finalName>
                        <transformers>
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                <mainClass>org.openjdk.jmh.Main</mainClass>
                            </transformer>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

Sample benchmark

Let’s try to benchmark something to see if it works. We could for example try to find the best method to copy an array. The four candidates we are going to test are:

  • Object[] newArray = originalArray.clone();
  • Object[] newArray = Arrays.copyOf(originalArray, originalArray.length);
  • System.arraycopy(originalArray, 0, newArray, 0, originalArray.length); (with newArray allocated beforehand)
  • and a plain old loop
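
Before benchmarking, it can be worth checking in plain Java that the four approaches really do the same thing, i.e. that each returns an equal but independent copy. The following is only a sketch: the class name CopyCandidates and its method names are made up for this illustration and are not part of jmh.

```java
import java.util.Arrays;

public class CopyCandidates {

    // Candidate 1: clone() on the array itself
    static int[] viaClone(int[] src) {
        return src.clone();
    }

    // Candidate 2: Arrays.copyOf with the same length
    static int[] viaCopyOf(int[] src) {
        return Arrays.copyOf(src, src.length);
    }

    // Candidate 3: System.arraycopy into a pre-allocated array
    static int[] viaArraycopy(int[] src) {
        int[] copy = new int[src.length];
        System.arraycopy(src, 0, copy, 0, src.length);
        return copy;
    }

    // Candidate 4: plain old loop
    static int[] viaLoop(int[] src) {
        int[] copy = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            copy[i] = src[i];
        }
        return copy;
    }

    public static void main(String[] args) {
        int[] original = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        int[][] copies = {
            viaClone(original), viaCopyOf(original), viaArraycopy(original), viaLoop(original)
        };
        for (int[] copy : copies) {
            // same content, different array instance: prints "true true" for each copy
            System.out.println(Arrays.equals(original, copy) + " " + (copy != original));
        }
    }
}
```

This only checks correctness; it says nothing about speed, which is what the benchmark is for.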

In your project source package, right click and add a new class, let’s call it ArrayCopy, and copy the following code:

import java.util.Arrays;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.BenchmarkType;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

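/**
 * Compares four ways of copying a small int array.
 * Each benchmark returns the copy so that the JIT cannot eliminate it as dead code.
 */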
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class ArrayCopy {

    private int[] array = {1, 2, 3, 4, 5, 6, 7, 8, 9};

    @GenerateMicroBenchmark(BenchmarkType.AverageTimePerOp)
    public int[] clone_() {
        int[] copy = array.clone();
        return copy;
    }

    @GenerateMicroBenchmark(BenchmarkType.AverageTimePerOp)
    public int[] arrayCopy() {
        int[] copy = new int[array.length];
        System.arraycopy(array, 0, copy, 0, array.length);
        return copy;
    }

    @GenerateMicroBenchmark(BenchmarkType.AverageTimePerOp)
    public int[] copyOf() {
        int[] copy = Arrays.copyOf(array, array.length);
        return copy;
    }

    @GenerateMicroBenchmark(BenchmarkType.AverageTimePerOp)
    public int[] loop() {
        int[] copy = new int[array.length];
        for (int i = 0; i < array.length; i++) {
            copy[i] = array[i];
        }
        return copy;
    }    
}

Finally, let’s create the launcher that will run the micro-benchmark – I use this runner class (alternatively you can run it from the command line but I’m lazy and prefer running it from the IDE):

import java.io.IOException;
import org.openjdk.jmh.Main;

public class RunTest {
    private static final String TEST = ".*ArrayCopy.*"; //uses regexp

    public static void main(String[] args) throws IOException {
        Main.main(getArguments(TEST, 5, 5000, 1));
    }

    private static String[] getArguments(String className, int nRuns, int runForMilliseconds, int nThreads) {
        return new String[]{className,
            "-i", "" + nRuns,
            "-r", runForMilliseconds + "ms",
            "-t", "" + nThreads,
            "-w", "5000ms",
            "-wi", "3",
            "-v"
        };
    }
}

Clean and Build the project (CTRL+F11) and run it (SHIFT+F6 with the RunTest class selected).

You should get a detailed output of the performance of the various methods and a summary table that looks like this:

Benchmark                       Thr    Cnt  Sec         Mean   Mean error          Var    Units
c.a.p.g.a.ArrayCopy.arrayCopy     1     10    1       11.947        0.049        0.002  nsec/op
c.a.p.g.a.ArrayCopy.clone_        1     10    1       11.801        0.368        0.128  nsec/op
c.a.p.g.a.ArrayCopy.copyOf        1     10    1       11.783        0.115        0.013  nsec/op
c.a.p.g.a.ArrayCopy.loop          1     10    1       17.985        0.109        0.011  nsec/op

Next steps

The jmh project comes with a few samples which are very interesting and useful to read. It is also useful to check the usage, for example with the printUsage method in the example above (or by running it from the command line with no argument: java -jar microbenchmarks.jar).


Size of objects in Java

I need to optimise the size taken by some of my objects, as there will be millions of them and a factor of 3 starts to make a difference on a standard desktop PC, especially if it is a 32-bit machine.

JVM parameters

To get (fairly) accurate results using Runtime.getRuntime().freeMemory() I have used the following JVM parameters (on hotspot 7u11 x64 – some parameters might not be available on other JVMs): -server -Xms2000m -Xmx2000m -verbose:gc -XX:-UseTLAB -XX:+UseCompressedOops. The various flags do the following:

  • -Xms and -Xmx are well known and respectively define the initial and maximum heap size. I set them high (2GB) to prevent the garbage collector from running during the test.
  • -verbose:gc makes the JVM print to the console when a GC is run. That makes it possible to check visually that no GC ran during the tests.
  • -XX:-UseTLAB turns off thread-local allocation buffers, i.e. it asks the JVM not to allocate memory in per-thread chunks (which it otherwise does for efficiency). Turning that option off gives more accurate and stable results.
  • -XX:+UseCompressedOops asks the JVM to compress references from 8 to 4 bytes. Note that it is on by default on HotSpot 7 x64.
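
To make sure the flags were actually picked up (for example when launching from an IDE), the running JVM can be asked for its own arguments. This is a small sketch using the standard management API; the class name ShowJvmFlags and its helper methods are invented for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class ShowJvmFlags {

    // Arguments passed to the JVM itself (e.g. -Xms2000m), not the program arguments
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Maximum heap the JVM will attempt to use, in megabytes
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("JVM arguments: " + jvmArguments());
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
    }
}
```

The reported maximum heap can be somewhat lower than the -Xmx value, so it is better read as a rough confirmation than as an exact figure.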

Tests

The objective of my test was to determine the size of various classes that I could use to store tick data. The issue is that some ticks come with extra information and some don’t. I was wondering whether I should declare all the possible fields and leave them null when not available, or use an array or EnumMap instead. The only constraint here is memory footprint.

There are a few interesting observations. They are obviously empirical and might vary depending on several factors, including architecture (32 or 64 bits) and JVM:

  • an int takes 4 bytes
  • BUT an empty class (no members) and a class that has an int take the same space in memory (16 bytes), due to memory alignment
  • a null reference takes 4 bytes (same as int, due to compressed oops), with the same alignment observation
  • whether members are instantiated when declared or via a constructor does not make a difference (4 bytes)
  • an Object takes 16 bytes in memory
  • Adding 6 null Strings to my class (from TickBasic to TickComplete) adds 24 bytes (56 to 80 bytes), i.e. a size increase of about 43%
  • The natural design (80 bytes for TickComplete) is better than trying to use a “lazy” data structure with an array (128 bytes) or an EnumMap (112 bytes)

The test code at the bottom is fairly extensive but not very complicated. The output is (64-bit machine):

Object: 16 bytes
Object array (empty): 4 bytes
Object array (full) - incremental size: 16 bytes
Object with 1 int: 16 bytes
Object with 2 ints: 24 bytes
Object with 3 ints: 24 bytes
Object with 1 long: 24 bytes
Object with 2 longs: 32 bytes
Object with 3 longs: 40 bytes
Object with 1 null Object: 16 bytes
Object with 2 null Objects: 24 bytes
Object with 3 null Objects: 24 bytes
Object with 1 allocated Object: 16 bytes
Object with 2 allocated Objects: 24 bytes
Object with 3 allocated Objects: 24 bytes
TickBasic: 56 bytes
TickComplete: 80 bytes
TickCompleteWithConstructor: 80 bytes
TickArray: 128 bytes
TickEnumMap: 112 bytes
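
The figures for the simple classes are consistent with a back-of-the-envelope rule: header plus fields, rounded up to the next multiple of 8. The sketch below assumes a 12-byte object header (64-bit HotSpot with compressed oops) and 8-byte alignment; both values are assumptions for this setup, not something the test measures directly, and the class name ShallowSizeEstimate is made up. It also ignores how the JVM lays out fields inside the object, so it is an estimate only.

```java
public class ShallowSizeEstimate {

    // Assumption: 12-byte object header (64-bit HotSpot with compressed oops)
    static final int HEADER_BYTES = 12;
    // Assumption: objects are aligned on 8-byte boundaries
    static final int ALIGNMENT = 8;

    // Header plus field bytes, rounded up to the next multiple of ALIGNMENT
    static int estimate(int fieldBytes) {
        int raw = HEADER_BYTES + fieldBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println("no fields:    " + estimate(0));      // 16
        System.out.println("1 int:        " + estimate(4));      // 16
        System.out.println("2 ints:       " + estimate(2 * 4));  // 24
        System.out.println("3 ints:       " + estimate(3 * 4));  // 24
        System.out.println("1 long:       " + estimate(8));      // 24
        System.out.println("2 longs:      " + estimate(2 * 8));  // 32
        System.out.println("3 longs:      " + estimate(3 * 8));  // 40
        System.out.println("3 references: " + estimate(3 * 4));  // 24 (4-byte compressed references)
    }
}
```

The estimate covers the object itself only; classes such as TickBasic also allocate a DateTime per instance, which is counted in the measured figures above.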

Test Code

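import java.util.EnumMap;
import java.util.Map;
import org.joda.time.DateTime; // assumption: DateTime is the Joda-Time class

public class TestMemory {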

    private static final int SIZE = 100_000;
    private static Runnable r;
    private static Object[] array;
    private static Object o;
    private static Object o1 = new Object();
    private static Object o2 = new Object();
    private static Object o3 = new Object();

    private static void test(Runnable r, String name, int numberOfObjects) {
        long mem = Runtime.getRuntime().freeMemory();
        r.run();
        System.out.println(name + ": " + (mem - Runtime.getRuntime().freeMemory()) / numberOfObjects + " bytes");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(System.getProperty("java.vm.name"));
        DateTime date = new DateTime(); //for some reason the result gets biased if we don't initialise DateTime first

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new Object(); } };
        test(r, "Object", SIZE);

        r = new Runnable() { public void run() { array = new Object[SIZE]; } };
        test(r, "Object array (empty)", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) array[i] = new Object(); } };
        test(r, "Object array (full) - incremental size", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith1Int(); } };
        test(r, "Object with 1 int", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith2Ints(); } };
        test(r, "Object with 2 ints", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith3Ints(); } };
        test(r, "Object with 3 ints", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith1Long(); } };
        test(r, "Object with 1 long", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith2Longs(); } };
        test(r, "Object with 2 longs", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith3Longs(); } };
        test(r, "Object with 3 longs", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith1NullObject(); } };
        test(r, "Object with 1 null Object", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith2NullObjects(); } };
        test(r, "Object with 2 null Objects", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith3NullObjects(); } };
        test(r, "Object with 3 null Objects", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith1Object(); } };
        test(r, "Object with 1 allocated Object", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith2Objects(); } };
        test(r, "Object with 2 allocated Objects", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new ObjectWith3Objects(); } };
        test(r, "Object with 3 allocated Objects", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new TickBasic(); } };
        test(r, "TickBasic", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new TickComplete(); } };
        test(r, "TickComplete", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new TickCompleteWithConstructor(new DateTime(), TickType.TRADE); } };
        test(r, "TickCompleteWithConstructor", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new TickArray(); } };
        test(r, "TickArray", SIZE);

        r = new Runnable() { public void run() { for (int i = 0; i < SIZE; i++) o = new TickEnumMap(); } };
        test(r, "TickEnumMap", SIZE);
    }

    public static class ObjectWith1Int  { int i; }
    public static class ObjectWith2Ints { int i, j; }
    public static class ObjectWith3Ints { int i, j, k; }
    public static class ObjectWith1Long  { long i; }
    public static class ObjectWith2Longs { long i, j; }
    public static class ObjectWith3Longs { long i, j, k; }
    public static class ObjectWith1NullObject  { Object o; }
    public static class ObjectWith2NullObjects { Object o, p; }
    public static class ObjectWith3NullObjects { Object o, p, q; }
    public static class ObjectWith1Object  { Object o = o1; }
    public static class ObjectWith2Objects { Object o = o1; Object p = o2; }
    public static class ObjectWith3Objects { Object o = o1; Object p = o2; Object q = o3; }

    public static class TickBasic {
        DateTime time = new DateTime();
        TickType type = TickType.TRADE;
        double value;
        int size;
    }

    public static class TickComplete {
        DateTime time = new DateTime();
        TickType type = TickType.TRADE;
        double value;
        int size;
        String cc;
        String ec;
        String mc;
        String bbc;
        String bsc;
        String rc;
    }

    public static class TickCompleteWithConstructor {
        DateTime time;
        TickType type;
        double value;
        int size;
        String cc;
        String ec;
        String mc;
        String bbc;
        String bsc;
        String rc;

        public TickCompleteWithConstructor(DateTime time, TickType type) {
            this.time = time;
            this.type = type;
        }
    }

    public static class TickArray {
        Object[] o = new Object[TickFields.values().length];
    }

    public static class TickEnumMap {
        Map<TickFields, Object> values = new EnumMap<>(TickFields.class);
    }

    public static enum TickFields {

        TIME("time"),
        TYPE("type"),
        VALUE("value"),
        SIZE("size"),
        CONDITION_CODE("conditionCode"),
        EXCHANGE_CODE("exchangeCode"),
        MIC_CODE("micCode"),
        BROKER_BUY_CODE("brokerBuyCode"),
        BROKER_SELL_CODE("brokerSellCode"),
        RPS_CODE("rpsCode");
        private String code;

        TickFields(String code) {
            this.code = code;
        }
    }

    public static enum TickType {

        TRADE,
        BID,
        MID,
        ASK;
    }
}