2012/12/29

Code Index 6 released

This time I focus on for-loops and repeated code.

FunctionLength

Test code:

int test_xxxxx_speed(int x){
  x+=x;
  x+=x;
  // ... (x+=x; repeated xxxxx times in total)
  x+=x;
  return x;
}

Samsung Galaxy Note 2 run:

Linux OpenJDK 7 run:

com.luzi82.codeindex.FunctionLength START
test_00001_speed: 10.3M lines, 96.8ns/line
test_00010_speed: 103M lines, 9.67ns/line
test_00100_speed: 658M lines, 1.52ns/line
test_01000_speed: 1.33G lines, 751ps/line
test_10000_speed: 141M lines, 7.06ns/line
com.luzi82.codeindex.FunctionLength END

Observation:

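  • On Android, the more repeated lines a function has, the faster each line runs.
  • On OpenJDK, 1000 repeated lines is the sweet spot (751ps/line); 10000 repeated lines is roughly 9 times slower per line (7.06ns/line).
  • The compiler does not let me build a function with 100000 repeated lines.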
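
The 100000-line failure is consistent with the JVM limit of 65535 bytes of bytecode per method. Here is a minimal sketch that generates the test source for a given repeat count and estimates its bytecode size. It assumes each x+=x; on an int local compiles to four one-byte instructions (iload, iload, iadd, istore). That per-line cost and the names RepeatSketch, generateSource, estimateBytecodeSize and fitsInOneMethod are my assumptions for illustration, not taken from the Code Index source.

```java
final class RepeatSketch {
    // JVM class-file limit: a method's code array must be shorter than 65536 bytes.
    static final int MAX_METHOD_BYTES = 65535;

    // Assumed cost of one "x+=x;" on an int local in slot 0..3:
    // iload_n, iload_n, iadd, istore_n, one byte each.
    static final int BYTES_PER_LINE = 4;

    /** Builds the source text of test_NNNNN_speed with `repeat` copies of x+=x. */
    static String generateSource(int repeat) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("int test_%05d_speed(int x){\n", repeat));
        for (int i = 0; i < repeat; i++) {
            sb.append("  x+=x;\n");
        }
        sb.append("  return x;\n}\n");
        return sb.toString();
    }

    /** Rough estimate: repeated lines plus "iload, ireturn" for the return statement. */
    static long estimateBytecodeSize(int repeat) {
        return (long) repeat * BYTES_PER_LINE + 2;
    }

    /** True if the estimated size stays within the per-method limit. */
    static boolean fitsInOneMethod(int repeat) {
        return estimateBytecodeSize(repeat) <= MAX_METHOD_BYTES;
    }

    public static void main(String[] args) {
        for (int repeat = 1; repeat <= 100000; repeat *= 10) {
            System.out.println(repeat + " lines: ~" + estimateBytecodeSize(repeat)
                    + " bytes, fits=" + fitsInOneMethod(repeat));
        }
    }
}
```

Under this estimate, 10000 lines still fit in one method while 100000 lines do not, which matches what the compiler accepted.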

ForLoop

Test code:

int test_forloop_xxx_yy(int x){
  // when xxx=imm, use i--
  // when xxx=ipp, use i++
  // when xxx=mmi, use --i
  // when xxx=ppi, use ++i
  // when yy=gt/lt, use > / <
  // when yy=ne, use !=
  // shown here: xxx=ppi, yy=lt
  for(int i=0;i<1000;++i){
    x+=x;
  }
  return x;
}

int test_repeat(int x){
  x+=x;
  x+=x;
  // ... (x+=x; repeated 1000 times in total)
  x+=x;
  return x;
}

Samsung Galaxy Note 2 run:

Linux OpenJDK 7 run:

com.luzi82.codeindex.ForLoop START
test_forloop_imm_gt: 10.5G line/s, 94.6ps/line
test_forloop_imm_ne: 10.0G line/s, 99.4ps/line
test_forloop_ipp_lt: 10.5G line/s, 95.1ps/line
test_forloop_ipp_ne: 10.2G line/s, 98.0ps/line
test_forloop_mmi_gt: 10.0G line/s, 99.7ps/line
test_forloop_mmi_ne: 10.1G line/s, 98.8ps/line
test_forloop_ppi_lt: 10.4G line/s, 95.4ps/line
test_forloop_ppi_ne: 10.0G line/s, 99.0ps/line
test_repeat: 1.32G line/s, 759ps/line
com.luzi82.codeindex.ForLoop END

Observation:

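  • In a for-loop, the choice of i++ / ++i / i-- / --i makes little difference.
  • In a for-loop, the choice of i<MAX / i!=MAX makes little difference.
  • On Android, repeated code is faster than the for-loop.
  • On OpenJDK, the for-loop is faster: about 95 to 100ps/line against 759ps/line for test_repeat.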
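
To make the naming scheme concrete, here are four of the eight variants written out. This is only a sketch: the class name LoopVariants and the count parameter n (in place of the fixed 1000) are hypothetical, added so the variants can be checked against each other. It is not the Code Index source.

```java
final class LoopVariants {
    // xxx=ipp, yy=lt: count up with i++, stop when i<n fails
    static int ippLt(int x, int n) {
        for (int i = 0; i < n; i++) { x += x; }
        return x;
    }

    // xxx=ppi, yy=ne: count up with ++i, stop when i!=n fails (n must be >= 0)
    static int ppiNe(int x, int n) {
        for (int i = 0; i != n; ++i) { x += x; }
        return x;
    }

    // xxx=imm, yy=gt: count down with i--, stop when i>0 fails
    static int immGt(int x, int n) {
        for (int i = n; i > 0; i--) { x += x; }
        return x;
    }

    // xxx=mmi, yy=ne: count down with --i, stop when i!=0 fails (n must be >= 0)
    static int mmiNe(int x, int n) {
        for (int i = n; i != 0; --i) { x += x; }
        return x;
    }

    public static void main(String[] args) {
        // Every variant runs the body n times, so all four print the same value.
        System.out.println(ippLt(3, 10));
        System.out.println(ppiNe(3, 10));
        System.out.println(immGt(3, 10));
        System.out.println(mmiNe(3, 10));
    }
}
```

Since each form executes the body exactly n times, any timing difference between them comes from the loop bookkeeping alone, which is what the test measures.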

Code Index 5 released

Update

  • Added a test comparing JNI array access methods.
  • Reduced the number of cases in the byte array copy/fill tests.
  • Test time per case increased to 2 sec.

JniGetReleaseByteArray @ Samsung Galaxy Note 2

Observations

  • Access time is not affected by array size.
  • Access time is not affected by the JNI_ABORT flag.
  • Get/ReleasePrimitiveArrayCritical is slower than Get/ReleaseByteArrayElements.
  • For short arrays, a plain Java for-loop is faster.

Facts

  • Android src: dalvik/vm/Jni.cpp, lines 2351 and 2594.
  • The implementation ignores JNI_ABORT.
  • Array size may affect the cost of the native pinPrimitiveArray call, but I am not sure.
  • The isCopy argument is always set to false.
  • The implementations of Get/ReleasePrimitiveArrayCritical and Get/ReleaseByteArrayElements are the same, except that Get/ReleasePrimitiveArrayCritical returns void*. Maybe that is why it is slower.

2012/12/26

Code Index 3.0 released

Updates

  • New icon.
  • JNI test is added.
  • Benchmark scores are more readable.
  • Test time per function is shortened to 1 sec.
  • App is forced into portrait mode to prevent screen rotation from triggering a second test run.

Test run on Samsung Galaxy Note 2

  • System.arraycopy is the fastest, even faster than JNI. Android has optimized it in the Dalvik VM. I am thinking of challenging its speed with assembly code, but I do not know assembly.
  • For array fill, Arrays.fill is slower than for-loop{a[i]=0} because Arrays.fill is itself just for-loop{a[i]=0}, so the extra function call only wastes CPU.
  • The fastest way to fill an array is JNI, but I wonder whether the JNI call adds extra memory load.
  • Android uses System.arraycopy heavily, so they optimized it. I think Arrays.fill is also a good candidate for optimization.
  • On Stack Overflow, you can find lots of people making claims about performance without running any benchmark.
  • I will try int-array copy in the next release.

2012/12/23

Code Index 2.0 released

Just 3.5 hours later, 2.0 is released.

ByteArrayFill: Arrays.fill vs for-loop{a[i]=x} vs System.arraycopy


In Sun J2SE:
com.luzi82.codeindex.ByteArrayFill START
test_Arrays_fill: 100000: 82531.7/s
test_Arrays_fill: 1000000: 8102.7/s
test_Arrays_fill: 10000000: 201.4/s
test_manualfill: 100000: 59810.0/s
test_manualfill: 1000000: 6200.7/s
test_manualfill: 10000000: 200.7/s
test_System_arraycopy: 100000: 80760.0/s
test_System_arraycopy: 1000000: 7536.2/s
test_System_arraycopy: 10000000: 201.3/s
com.luzi82.codeindex.ByteArrayFill END
As expected, Arrays.fill > System.arraycopy >> for-loop{a[i]=x}, although at 10000000 bytes all three converge at about 201/s.

In Android:
Surprise: System.arraycopy >>> for-loop{a[i]=x} > Arrays.fill.
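
For reference, here is a sketch of the three fill strategies being compared. The System.arraycopy variant uses a doubling copy (write one byte, then repeatedly copy the filled prefix onto the rest). That is one common way to fill with System.arraycopy and is an assumption here; the actual test may do it differently, for example by copying from a pre-filled array. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class FillSketch {
    // for-loop{a[i]=x}
    static void manualFill(byte[] a, byte value) {
        for (int i = 0; i < a.length; i++) {
            a[i] = value;
        }
    }

    // Arrays.fill
    static void arraysFill(byte[] a, byte value) {
        Arrays.fill(a, value);
    }

    // System.arraycopy, doubling the filled prefix on each pass (assumed approach)
    static void arraycopyFill(byte[] a, byte value) {
        int len = a.length;
        if (len == 0) {
            return;
        }
        a[0] = value;
        int filled = 1;
        while (filled < len) {
            // Copy at most the already-filled prefix, so source and target never overlap.
            int chunk = Math.min(filled, len - filled);
            System.arraycopy(a, 0, a, filled, chunk);
            filled += chunk;
        }
    }

    public static void main(String[] args) {
        byte[] a = new byte[100000];
        byte[] b = new byte[100000];
        byte[] c = new byte[100000];
        manualFill(a, (byte) 7);
        arraysFill(b, (byte) 7);
        arraycopyFill(c, (byte) 7);
        System.out.println(Arrays.equals(a, b) && Arrays.equals(b, c));
    }
}
```

The doubling copy needs only about log2(length) calls to System.arraycopy, so nearly all the work happens inside the optimized copy routine rather than in a Java loop.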

I am going to add a JNI test in the next release.

Code Index 1.0 released

Ok, just released.
The app is used to test the performance and behavior of Java code.
You can get the source from https://github.com/luzi82/CodeIndex .

Here is the first test case:

ByteArrayCopy: System.arraycopy vs for-loop{dest[i]=src[i]}

In Sun J2RE:
com.luzi82.codeindex.ByteArrayCopy START
test_System_arraycopy: 100000: 53210.0/s
test_System_arraycopy: 1000000: 5215.3/s
test_System_arraycopy: 10000000: 152.8/s
test_manualcopy: 100000: 42231.6/s
test_manualcopy: 1000000: 4033.9/s
test_manualcopy: 10000000: 150.8/s
com.luzi82.codeindex.ByteArrayCopy END

In Android:

Conclusion: System.arraycopy wins. Use System.arraycopy, especially on Android.
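
Here is a rough sketch of how such a comparison can be measured, reporting calls per second like the log above. The class CopySketch, the helper callsPerSecond and the 200 ms window are assumptions made for illustration (the lambdas need Java 8 or later); this is not the Code Index harness, and it does no JIT warm-up, so treat the numbers as indicative only.

```java
import java.util.Arrays;

final class CopySketch {
    // for-loop{dest[i]=src[i]}
    static void manualCopy(byte[] src, byte[] dest) {
        for (int i = 0; i < src.length; i++) {
            dest[i] = src[i];
        }
    }

    // System.arraycopy over the whole array
    static void systemCopy(byte[] src, byte[] dest) {
        System.arraycopy(src, 0, dest, 0, src.length);
    }

    /** Runs body repeatedly for at least durationNanos and returns calls per second. */
    static double callsPerSecond(Runnable body, long durationNanos) {
        long start = System.nanoTime();
        long calls = 0;
        long elapsed;
        do {
            body.run();
            calls++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        return calls * 1e9 / elapsed;
    }

    public static void main(String[] args) {
        int size = 100000;
        byte[] src = new byte[size];
        byte[] dest = new byte[size];
        Arrays.fill(src, (byte) 1);
        long window = 200_000_000L; // 200 ms per case, an arbitrary choice

        System.out.printf("test_System_arraycopy: %d: %.1f/s%n", size,
                callsPerSecond(() -> systemCopy(src, dest), window));
        System.out.printf("test_manualcopy: %d: %.1f/s%n", size,
                callsPerSecond(() -> manualCopy(src, dest), window));
    }
}
```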