Tuesday, October 4, 2011

JavaOne 2011: Virtualizing Your Java Applications: Best Practices

Justin Murray started his presentation ["Virtualizing Your Java Applications: Best Practices" (21860)] about five minutes early and began by saying that virtualization has matured to the point where people no longer need to be concerned about taking advantage of it. He said his presentation is about a year old and is the work of a team of people. The relatively small room (Nikko Carmel I/II) was packed and appeared to be close to standing-room only (ushers helped get everyone seated). The combination of the morning's strategy keynote going slightly over, the people jammed into the ballroom as they tried to exit, and the early start of this presentation led to a lot of people coming in late, which was a little distracting. I don't fault the speaker for starting early because he had a lot of good information to cover. The biggest problem was the logistical delay forced on people departing a keynote that had already gone over its planned end time.

Murray had seven best practices for virtualization, of which only the first two (virtual machine memory and virtual CPUs/physical CPUs/threads) are particular to Java. Most people make their virtualization mistakes in the memory area rather than in the processing/CPU area. Murray pointed out that Java developers do not have to do anything to use virtualization; they don't need to change any code and they don't need to change any settings.

  1. Virtual Machine Memory
  2. CPUs
  3. Disk I/O
  4. Network I/O
  5. Timekeeping
  6. Micro-benchmarks
  7. Monitoring and Management

An easy-to-understand initial best practice for virtualizing Java applications is: "The Java heap needs to be in physical memory all of the time." He also recommended using large memory pages. Don't cram so many virtual machines onto a host that the combined maximum heap space of all the VMs exceeds the physical memory available.
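That last rule is simple arithmetic, and a minimal sketch of it might look like the following. The class name, method names, and memory figures are my own illustrative assumptions and did not come from Murray's slides.

```java
import java.util.List;

// Hypothetical helper: names and numbers are illustrative, not from the talk.
class HeapFitCheck {

    /** Sums the maximum Java heap (in MB) configured across all VMs on one host. */
    static long totalMaxHeapMb(List<Long> maxHeapPerVmMb) {
        long total = 0;
        for (long mb : maxHeapPerVmMb) {
            total += mb;
        }
        return total;
    }

    /** True when the combined maximum heaps fit in the host's physical memory. */
    static boolean fitsInPhysicalMemory(List<Long> maxHeapPerVmMb, long hostPhysicalMb) {
        return totalMaxHeapMb(maxHeapPerVmMb) <= hostPhysicalMb;
    }

    public static void main(String[] args) {
        // Assumed example: three VMs on a host with 16 GB of physical memory.
        List<Long> heaps = List.of(4096L, 4096L, 6144L);
        long hostMb = 16384L;
        System.out.println("Total max heap (MB): " + totalMaxHeapMb(heaps));
        System.out.println("Fits in physical memory: " + fitsInPhysicalMemory(heaps, hostMb));
    }
}
```

This only compares heap totals against physical memory; a real sizing exercise would also need to leave room for the guest operating systems and the JVMs' non-heap memory.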

Murray's first formal best practice was "1. Virtual Machine Memory: Size the virtual machine's memory to leave adequate space." He noted that there is a formula for calculating how much memory to allocate, one that takes into account the many demands on that memory. He said that this formula is documented in a white paper and in a colleague's book. Murray said that while it's possible to have too much JVM memory for a small application, he generally favors having plenty of memory allocated for JVMs to use in virtualization. Murray also recommended using VMware Distributed Resource Scheduler (DRS). His one other advertising pitch was for vFabric Elastic Memory for Java (EM4J), which is delivered with VMware's "flavor of Tomcat" called tc Server. Murray stated that while overcommitting memory is a concern for Java in most cases, it is not when using tc Server and EM4J. Murray also used esxtop.
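I did not capture the formula itself, so the following is only a rough sketch of the kind of calculation involved and not the formula from the white paper. It assumes the virtual machine's memory is the guest operating system's needs plus the JVM's, and that the JVM's needs are the maximum heap plus non-heap memory plus thread stacks. All names and numbers are hypothetical.

```java
// Hypothetical estimate: an assumed breakdown, not the white paper's formula.
class VmMemoryEstimate {

    /** Assumed JVM footprint: max heap + non-heap memory + thread stacks (rounded up to MB). */
    static long jvmMemoryMb(long maxHeapMb, long nonHeapMb, int threadCount, long stackKbPerThread) {
        long stacksKb = threadCount * stackKbPerThread;
        long stacksMb = (stacksKb + 1023) / 1024; // round up to whole megabytes
        return maxHeapMb + nonHeapMb + stacksMb;
    }

    /** Assumed VM size: JVM footprint plus what the guest operating system needs. */
    static long vmMemoryMb(long jvmMb, long guestOsMb) {
        return jvmMb + guestOsMb;
    }

    public static void main(String[] args) {
        // Illustrative inputs only: 4 GB heap, 256 MB non-heap, 200 threads with 512 KB stacks.
        long jvmMb = jvmMemoryMb(4096, 256, 200, 512);
        long vmMb = vmMemoryMb(jvmMb, 500);
        System.out.println("Estimated JVM memory (MB): " + jvmMb);
        System.out.println("Estimated VM memory (MB): " + vmMb);
    }
}
```

The point of a breakdown like this is that the heap is only one of several consumers, so sizing the virtual machine to the heap alone leaves no "adequate space" for anything else.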

After covering memory issues, Murray moved on to CPUs. He started this section with observations about threads and virtual CPUs. His slide stated "A Java thread executes on one vCPU at any one time" and "A vCPU is scheduled on one physical CPU at any one time." Murray added that most customers he has worked with are using four virtual CPUs, but that up to 32 are now supported. Threads are often waiting on a monitor or socket to be freed, and Murray maintains that most Java applications are not fully using their threads. Generally, Murray favors using "the lowest number of virtual CPUs that is practical for your application." Murray stated that using the command-line option -Xgcthreads is the equivalent of saying, "I know better than the JVM."
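One easy way to see the thread-to-vCPU relationship from inside a guest is to ask the JVM how many processors it sees and how many threads are live. This is a small sketch using standard JDK calls; the class and method names are mine, and treating "processors visible to the JVM" as "vCPUs assigned to the virtual machine" is an assumption that holds only when nothing else restricts the process.

```java
import java.lang.management.ManagementFactory;

// Hypothetical reporter: class and method names are illustrative.
class VcpuReport {

    /** Processors the JVM can see; inside a guest this is typically the vCPU count. */
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Builds a one-line summary of live threads versus visible processors. */
    static String describe(int liveThreads, int processors) {
        return liveThreads + " live threads on " + processors + " visible processor(s)";
    }

    public static void main(String[] args) {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        System.out.println(describe(threads, visibleProcessors()));
    }
}
```

A thread count far above the processor count is not a problem by itself, since many of those threads are usually blocked, which fits Murray's advice to start with few vCPUs.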

Murray stated that virtualization is not afraid of I/O any longer. He said that from a Java virtualization perspective, network I/O is more important than disk I/O. Murray pointed out that swapping is not good for a JVM running on a physical machine and is likewise not good for a JVM running on a virtual machine.

Murray offered a good piece of advice that should be more commonly understood than it seems to be: benchmark only your own application, and make decisions only on the basis of benchmarks of that particular application. He added that using a "representative subset" of the application is not good enough.

For monitoring and management, Murray recommended starting with one JVM process per machine and then scaling up to the number of desired JVMs in a virtual machine one at a time to determine the upper limit. His last bullet on that slide recommended using vCenter or esxtop to see which portions of the Java application are consuming resources.

Murray's "short story" is that Java developers generally do not need to adjust JVM garbage collection strategy, thread pool sizing, or JDBC connection pool sizing any differently for virtualization than they do for physical servers. The only command-line option that should be used is for specifying large pages.
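To confirm that a running JVM was actually started with a large pages option, the JVM's own input arguments can be inspected. The sketch below assumes a HotSpot JVM, where the flag is -XX:+UseLargePages; Murray did not name a specific flag in my notes, and the class and method names here are mine.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical check: assumes HotSpot's -XX:+UseLargePages flag.
class LargePagesCheck {

    static final String LARGE_PAGES_FLAG = "-XX:+UseLargePages";

    /** True when the given JVM arguments explicitly enable large pages. */
    static boolean hasLargePagesFlag(List<String> jvmArgs) {
        return jvmArgs.contains(LARGE_PAGES_FLAG);
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM at startup (not the application's own args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Large pages requested: " + hasLargePagesFlag(jvmArgs));
    }
}
```

This only shows that the flag was requested. Whether large pages are actually in use also depends on the guest operating system being configured to provide them.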

Murray talked about using Capacity Planner for about 30 days of data collection to see which servers are good candidates for virtualization. Murray pointed out that you cannot create hardware out of thin air, so applications already running on heavily utilized hardware will not be helped by virtualization.

Murray concluded with, "Java Middleware and Applications should be virtualized."

Murray works for VMware and VMware offers an Enterprise Java Applications on VMware Best Practices Guide. Murray referenced VMware's Technical White Papers during his presentation.