Friday, May 31, 2013

JDK-6962930[2] requested that the string table size be made configurable. That bug was resolved on 04/25/2011, and the fix is available in JDK 7. Another JDK bug[3] requests that the default size of the string table (i.e., 1009) be increased.

In this article, we will examine the following topics:

What string table is

How to find the number of interned strings in your applications

The tradeoff between memory footprint and lookup cost

String Table

In Java, string interning[1] is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool, which is the string table in HotSpot.

The size of the string table (a chained hash table) is configurable in JDK 7. When the overflow chains become long, performance can degrade. The current default size of the string table is 1009 buckets, which is too small for applications that stress the string table. Note that the string table itself is allocated in native memory, but the strings are Java objects.
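To make the interning behavior concrete, here is a minimal sketch using the standard String.intern() method. The class and method names (InternDemo, sameAfterIntern) are made up for illustration.

```java
public class InternDemo {
    /** Returns true when two strings resolve to one canonical instance after interning. */
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // Created with "new", so these are two distinct objects with equal contents.
        String first = new String("string-table");
        String second = new String("string-table");

        System.out.println(first == second);                // false: different objects
        System.out.println(first.equals(second));           // true: same characters
        System.out.println(sameAfterIntern(first, second)); // true: one pooled copy
    }
}
```

Each call to intern() is a lookup in the string table, which is why the length of the bucket chains matters for applications that intern heavily.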

Finding Number of Interned Strings in the Applications

HotSpot provides a product-level option named PrintStringTableStatistics, which can be used to print hash table statistics[4]. For example, for one of our applications (hereafter referred to as JavaApp), it prints out the following information:

The conclusion is that increasing the string table size from 60013 to 277331 helps JavaApp's performance a little, at the expense of a larger memory footprint. In this case the benefit is minimal, so keeping the string table size at 60013 is good enough.
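The tradeoff can be estimated with simple arithmetic: in a chained hash table, the average chain length is the number of entries divided by the number of buckets. The sketch below is only an illustration. The interned-string count is a made-up figure, not a measurement from JavaApp, and the class name is hypothetical. In HotSpot the bucket count is set with -XX:StringTableSize=<n>.

```java
public class StringTableLoad {
    /** Average number of entries per bucket in a chained hash table. */
    static double averageChainLength(long entries, int buckets) {
        return (double) entries / buckets;
    }

    public static void main(String[] args) {
        long internedStrings = 500_000L; // hypothetical count, for illustration only
        int[] tableSizes = {1009, 60013, 277331};
        for (int buckets : tableSizes) {
            System.out.printf("%d buckets -> %.2f entries per bucket on average%n",
                    buckets, averageChainLength(internedStrings, buckets));
        }
    }
}
```

More buckets shorten the chains that each lookup has to walk, but the bucket array itself takes more native memory.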

There is insufficient native memory for the Java Runtime Environment to continue.

In this article, we will discuss what native memory is and how to debug running out of native memory in JRockit.

Native Memory vs. Heap Memory

There are two types of memory used by the JVM and its applications, both of which are allocated from system memory:

Java Heap

Java heap is the area of memory used by the JVM to do dynamic memory allocation.

The amount of memory used for the heap can be controlled by the following command options:

-Xms2g

-Xmx2g

Heap memory can be garbage collected[4].
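To check which heap limits a running JVM actually picked up, the standard Runtime API can be queried from inside the application. This is a minimal sketch; the class name HeapLimits and the helper toMiB are made up for illustration.

```java
public class HeapLimits {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the upper bound the JVM will try to use for the heap.
        System.out.println("max heap (MiB):          " + toMiB(rt.maxMemory()));
        // totalMemory() is the heap currently reserved by the JVM.
        System.out.println("committed heap (MiB):    " + toMiB(rt.totalMemory()));
        // freeMemory() is the unused part of the committed heap.
        System.out.println("free in committed (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

Running it with -Xms2g -Xmx2g should report a maximum close to 2048 MiB, although the exact figure can vary by JVM and collector.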

Native Memory

Internal JVM memory management is, to a large extent, kept off the Java heap and allocated natively in the operating system, through system calls like malloc. This non-heap system memory allocated by the JVM is referred to as native memory.

For JRockit, increasing the amount of available native memory is done implicitly by lowering the maximum Java heap size using -Xmx.

If the heap is too large, it may well be the case that not enough native memory is left for JVM internal usage—bookkeeping, code optimizations, and so on. In that case, the JVM may have no other choice than to throw an OutOfMemoryError from native code (for example, from line 537 of compilerfrontend.c in the previous example).

One example is when several parallel threads perform code optimizations in the JVM. Code optimization typically is one of the JVM operations that consumes the largest amounts of native memory, though only when the optimizing JIT is running and only on a per-method basis.

There are also mechanisms that allow the Java program, and not just the JVM, to allocate native memory, for example through JNI calls. If a JNI call executes a native malloc to reserve a large amount of memory, this memory will be unavailable to the JVM until it is freed.
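JNI needs native code to demonstrate, so as a stand-in here is a sketch using direct NIO buffers, another standard mechanism (not mentioned above) by which a Java program can obtain memory that typically lives outside the Java heap. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

public class NativeAllocationDemo {
    /** Allocates a direct buffer, whose backing storage is typically outside the Java heap. */
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        int size = 16 * 1024 * 1024; // 16 MiB, an arbitrary size for illustration
        ByteBuffer offHeap = allocateOffHeap(size);
        ByteBuffer onHeap = ByteBuffer.allocate(size); // ordinary heap-backed buffer

        System.out.println("direct buffer is direct: " + offHeap.isDirect());
        System.out.println("heap buffer is direct:   " + onHeap.isDirect());
    }
}
```

Memory obtained this way competes with the JVM's own native allocations, so it does not show up in the Java heap figures.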

Code Buffers

JRockit is unique in that it has no bytecode interpreter[1]. The native code is emitted into a code buffer and executed whenever the function it represents is called. There are two main problems associated with this compile-only strategy:

Larger compile-code size

This problem is mitigated by garbage collecting code buffers with methods no longer in use.

Long compilation time for large methods

This problem is solved by having a sloppy mode for the JIT.

Sometimes JRockit will use a lot of time generating a relatively large method, the typical example being a JSP.

However, once finished, the response time for accessing that JSP will be better than that of an interpreted version.

The problem of running out of memory for metadata in JRockit is not that different from the one in HotSpot, except that it is native memory instead of heap memory. There are, however, two differences:

Cleaning up stale metadata is always enabled by default in JRockit

UseCodeGC = true (default)

Allow GC of discarded compiled code

FreeEmptyCodeBlocks = true (default)

Free unused code memory

There is no fixed size limit, by default, for the space used to store metadata

JRCMD[2]

When JRockit runs out of native memory and throws an OutOfMemoryError, JRCMD can be used for debugging. JRCMD is a small command-line tool that can be used to interact with a running JRockit instance and to track native memory usage.

There is no need to pre-configure the JVM or the application to be able to attach the tool later. Also, the tool adds virtually no overhead, making it suitable for use in live production environments.

The tools.jar in the JDK contains an API for attaching to a running JVM—the Java Attach API. This framework is utilized by JRCMD to invoke diagnostic commands.

For debugging OOM, you can invoke jrcmd with the print_memusage command and the displayMap argument:

In the header section, the first column contains the name of a memory space (e.g., "Java Heap") and the second column shows how much memory is mapped for that space. The third column contains details.

In the map section, the first column shows the category of memory chunks:

THREAD: Thread related, for example thread stacks.

INT: Internal use, for example pointer pages.

HEAP: Chunk used by JRockit for the Java heap.

OS: Mapped directly from the operating system, such as third party DLLs or shared objects.

MSP: Memory space. A memory space is a native heap with a specific purpose, for example native memory allocation inside the JVM.

GC: Garbage collection related, for example live bits.

CODE: Compiled code.

When tracking native memory leaks, it is useful to look at how much the memory usage changes over time. You can do this by establishing a baseline first:

$jrcmd 411 print_memusage scale=M baseline

The argument baseline is used to establish a point from which to start measuring. The scale argument modifies the unit of the amounts of memory in the printout (default is KB). Once print_memusage is executed with the baseline argument, subsequent calls will include differentials against the baseline. This can facilitate the monitoring of memory usage changes over time.
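The idea of a differential against a baseline can be sketched in a few lines. This is only a conceptual illustration, not how jrcmd is implemented. The class name and all figures are made up, and the category names are borrowed from the map section described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MemUsageDiff {
    /** Returns current minus baseline for every category in the current sample. */
    static Map<String, Long> diff(Map<String, Long> baseline, Map<String, Long> current) {
        Map<String, Long> delta = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            // A category missing from the baseline is treated as starting at zero.
            long before = baseline.getOrDefault(e.getKey(), 0L);
            delta.put(e.getKey(), e.getValue() - before);
        }
        return delta;
    }

    public static void main(String[] args) {
        // Hypothetical samples in MB, for illustration only.
        Map<String, Long> baseline = new LinkedHashMap<>();
        baseline.put("HEAP", 2048L);
        baseline.put("CODE", 64L);

        Map<String, Long> later = new LinkedHashMap<>();
        later.put("HEAP", 2048L);
        later.put("CODE", 96L);

        System.out.println(diff(baseline, later)); // {HEAP=0, CODE=32}
    }
}
```

A category whose differential keeps growing across samples while the heap stays flat is a natural first suspect for a native memory leak.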

Thursday, May 16, 2013

Since the writing of this blog article, things have changed.[5,6] For example, in WebLogic Server 12.1.2, the following files have moved from wlserver/server/lib to ORACLE_HOME/oracle_common/modules/oracle.jdbc_11.2.0:

ojdbc5.jar

ojdbc6.jar

ojdbc6dms.jar

ojdbc5_g.jar

ojdbc6_g.jar

Updated (12/09/2014)

Added a new section "List of Thin JDBC Driver Versions"

There are built-in JDBC drivers installed with WebLogic Server. For example, the 11g version of the Oracle Thin driver (ojdbc6.jar for JDK 6) is bundled with WebLogic Server. That driver can be found in:

WL_HOME/server/lib

If you plan to use a different version of any of the drivers installed with WebLogic Server, you can replace the driver file in WL_HOME/server/lib with an updated version of the file or add the new file to the front of your CLASSPATH. However, be warned that if you replace the default JDBC driver in WLS, you might miss some enhancements that were shipped with it. For example, you should not replace the one from WLS with the one from Oracle JDBC (e.g., D:\download\oracle\JDBC\JDBCDrivers\11.2.0.3\ojdbc6.jar) without consultation.

Copies of Oracle Thin drivers and other supporting files (e.g., a debug version named ojdbc6_g.jar) can also be found in

WL_HOME/server/ext/jdbc/

There is a subdirectory in this folder for each DBMS. If you need to revert to the version of the driver installed with WebLogic Server, you can copy the file from WL_HOME/server/ext/jdbc/ to WL_HOME/server/lib.

Manifest File

Built-in JDBC driver files are listed in the manifest of weblogic.jar (see below), so they are loaded when weblogic.jar is loaded (when the server starts). Therefore, you do not need to add them to your CLASSPATH[3].

Be warned that upgrading the driver (say, from 11.2.0.3 to 11.2.0.4) by downloading a new copy is not the right way. You could be missing some bug fixes needed by WebLogic Server. So, always consider applying the patch provided by Oracle to upgrade your WLS JDBC driver appropriately.

The JDBC Thin driver (i.e., ojdbc6.jar) is a pure Java, Type IV driver that can be used in applications and applets. It is platform-independent and does not require any additional Oracle software on the client side. The JDBC Thin driver communicates with the server using SQL*Net to access Oracle Database.

The JDBC Thin driver allows a direct connection to the database by providing an implementation of SQL*Net on top of Java sockets. The driver supports the TCP/IP protocol and requires a TNS listener on the TCP/IP sockets on the database server.

To use the Oracle Type 4 JDBC drivers, you create a JDBC data source in your WebLogic Server configuration and select the JDBC driver to create the physical database connections in the data source. Applications can then look up the data source on the JNDI tree and request a connection.
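The lookup step can be sketched with the standard JNDI and JDBC APIs. This is a minimal sketch under assumptions: the JNDI name jdbc/HypotheticalDS and the class name are made up, and the code assumes it runs where a JNDI provider is configured (for example, inside the server). Outside such an environment the lookup fails and the sketch reports that the name is not bound.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class DataSourceLookup {
    /** Looks up a data source by JNDI name and requests a connection from it. */
    static Connection connect(Context ctx, String jndiName)
            throws NamingException, SQLException {
        DataSource ds = (DataSource) ctx.lookup(jndiName);
        return ds.getConnection();
    }

    /** Returns true if the name resolves in the default initial context. */
    static boolean isBound(String jndiName) {
        try {
            new InitialContext().lookup(jndiName);
            return true;
        } catch (NamingException e) {
            // No provider configured, or nothing bound under this name.
            return false;
        }
    }

    public static void main(String[] args) {
        String jndiName = "jdbc/HypotheticalDS"; // hypothetical name, for illustration
        if (!isBound(jndiName)) {
            System.out.println(jndiName + " is not bound in this environment");
            return;
        }
        try (Connection conn = connect(new InitialContext(), jndiName)) {
            System.out.println("connection open: " + !conn.isClosed());
        } catch (NamingException | SQLException e) {
            System.out.println("lookup or connect failed: " + e.getMessage());
        }
    }
}
```

Because the application only sees the DataSource interface, the driver version behind the data source can change on the server without any change to this code.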

Sunday, May 5, 2013

There is a large amount of unstructured data in the real world. In 1998, Merrill Lynch cited a rule of thumb that somewhere around 80-90% of all potentially usable business information may originate in unstructured form[1].

There is no doubt that we need to process this information to extract meaning and create structured data about it. One approach is to use Oracle Database to manage such data, which includes multimedia (aka "rich media"). This book:

aims to help readers understand and manage unstructured data using Oracle Database.

Dealing with Unstructured Data

Unstructured data refers to information that either does not have a pre-defined data model or does not fit well into relational tables[1].

Relational data can be considered a subset of structured data. Besides relational databases, structured data can also be stored in non-relational formats such as XML, inverted list databases[2], or object databases[4]. XML does not conform to the formal structure of data models, but it nonetheless contains tags to separate semantic elements and enforce hierarchies of records and fields within the data. Therefore, one can consider XML semi-structured.

It's possible to store unstructured data in a column of a relational table, which is structured. The traditional approach has been to just treat it as a blob (binary large object), but with a greater understanding of the variety of unstructured data types that exist (e.g., video, audio, photographs, and documents), the need to manage them has grown.

Metadata

To manage unstructured data, metadata is crucial. It is the data that describes the unstructured data and gives meaning to it. Metadata can be used for

Searching (covered in Chapter 4, Searching the Multimedia Warehouse)

Annotation

Adding meaning to unstructured data objects

Relating unstructured data objects (or adding structure)

Matching data stored in relational databases

It is envisaged that in the future technology will improve to the point that algorithms will be able to identify objects and people in a video or photo, and understand sounds and complex speech in audio files. When that point is reached, the need for metadata may be reduced or limited to a smaller scope.

Oracle Database

In the past few years, with changes in database technology and improvements in disk performance and storage, it now makes business sense to use the Oracle database to store and manage all of an organization's unstructured data.

For a database management system to begin to correctly handle the unstructured data, it must have support for objects[4]. The use of a database that can support objects makes it a lot easier to manage large volumes of digital objects. Though these objects can be stored in a file system, there are now advantages to having them stored inside the database. In Oracle, both relational data and objects are supported. After adding Online Analytical Processing (OLAP)[5] and XML[6], Oracle database grew from being relational to one supporting most structures.

Oracle Multimedia uses blobs and new types, which can be accessed and used as required. In addition, it supports a variety of methods that simplify the act of loading and manipulating digital objects. This is covered in Chapter 7, Techniques for Creating a Multimedia Database.

Most databases allow unstructured data to be stored in them, but do not support the management, control, and manipulation of that data. Even though Oracle is a market leader in unstructured data management, a large number of major improvements are still needed. This is covered in Chapter 9, Understanding the Limitations of Oracle Products.

Scalability

When working with multimedia and unstructured data, a row in the database can be 10 GB in size, which could be greater than an entire relational database. Therefore, traditional tuning techniques might fail as the rules regarding them no longer make sense.

For example, in a multimedia warehouse (covered in Chapter 4), the concept of trying to achieve logical data consistency is not attempted, as it becomes apparent that the amount of data that is fuzzy forms the bulk of most of the digital objects. So, novel solutions to tuning problems are needed.

Oracle Multimedia also has built-in support for scalability. For example, the new 11g SecureFiles BLOBs use parallel techniques that allow files to be loaded much faster than with traditional BLOBs.

On the hardware front, technology also offers help. With the recent introduction of low-cost terabyte SATA disks, and with the use of low-cost SANs, the ability to store a petabyte is within the reach of a number of organizations.

To handle large amounts of unstructured data, the issues seen and the solutions provided by Oracle include:

Hitting limits on the image size

An Oracle BLOB can be unlimited in size

Reaching internal structural limits within the database (max number of files that store data)

Oracle's use of tablespaces allows a large number of multimedia files to be stored in it

With the ability to control where a blob is stored, files can be split across multiple tablespaces and devices

Dealing with fragmentation

By using locally managed tablespaces, fragmentation is removed as a performance issue.

The efficient management of those images (for example, backup/recovery)

Using partitioning on LOBS allows a very large number of multimedia files to be stored and efficiently managed

Using RMAN, Oracle can be configured to back up large amounts of data

When dealing with multimedia, one has to look at the different dimensions of scalability to understand how Oracle Database handles it. These include managing the CPU, memory, disk I/O, and network bandwidth. All of the above are covered in Chapter 8, Tuning.

