Tier architecture

The rules surrounding technology are constantly changing. Decisions and architectures based on current technology might easily become out of date with hardware changes. To best understand how multimedia and unstructured data fit and can adapt to the changing technology, it's important to understand how and why we arrived at our different current architectural positions. In some cases we have come full circle and reinvented concepts that were in use 20 years ago. Only by learning from the lessons of the past can we see how to move forward to deal with this complex environment.

In the past 20 years a variety of architectures have come about in an attempt to satisfy some core requirements:

Allow as many users as possible to access the system

Ensure those users had good performance for accessing the data

Enable those users to perform DML (insert/update/delete) safely and securely (safely implies ability to restore data in the event of failure)

The goal of a database management system was to provide an environment where these points could be met. The first databases were not relational. They were heavily I/O focused, as the computers did not have much memory and the idea of caching data was deemed too expensive. The servers had kilobytes, and eventually megabytes, of memory. This memory was needed first and foremost by the programs running on them. The most efficient architecture was to use pointers to link the data together. The architecture that emerged naturally was hierarchical, and a program would navigate the hierarchy to find rows related to each other. Users connected in via a dumb terminal. This was a monitor with a keyboard that could process input and output from a basic protocol and display it on the screen. All the processing of information, including how the screen should display it (using simple escape sequence commands), was controlled in the server.

Traditional no tier

The mainframes used a block mode structure, where the user would enter a screen full of data and press the Enter key. After doing this the whole screen of information was sent to the server for processing. Other servers used asynchronous protocols, where each letter, as it was typed, was sent to the server for processing. This method was not as efficient as block mode because it required more server processing power to handle the data coming in. It did provide a friendlier interface for data entry as mistakes made could be relayed immediately back to the user. Block mode could only display errors once the screen of data was sent, processed, and returned.

As more users started using these systems, the amount of data in them began to grow and the users wanted to get more intelligence out of the data entered. Requirements for reporting appeared as well as the ability to do ad hoc querying. The databases were also very hard to maintain and enhance as the pointer structure linked everything together tightly. It was very difficult to perform maintenance and changes to code. In the 1970s the relational database concept was formulated and it was based on sound mathematical principles. In the early 1980s the first conceptual relational databases appeared in the marketplace with Oracle leading the way.

The relational databases were not received well. They performed poorly and used a huge amount of server resources. Though they achieved a stated goal of being flexible and adaptable, enabling more complex applications to be built more quickly, the performance overhead of performing joins proved to be a major issue. Benefits could be seen in them, but they could not be considered usable in any environment that required tens to hundreds or thousands of concurrent users. The technology wasn't there to handle them.

To achieve better performance initially, the relational database vendors focused on a changing hardware feature: memory. By the late 1980s computer servers were starting to move from 16 bit to 32 bit. Memory capacity was increasing and its price was dropping. By adapting to this, the vendors managed to take advantage of memory and improved join performance.

The relational databases in effect achieved a balancing act between memory and disk I/O. Accessing a disk was about a thousand times slower than accessing memory. Memory was transient, meaning that if there was a power failure, any data stored in memory would be lost. Memory was also measured in megabytes, while disk was measured in gigabytes. Disk was not transient and was generally reliable, but it still required safeguards to be put in place to protect from disk failure.

So the balancing act the databases performed involved caching data in memory that was frequently accessed, while ensuring any modifications made to that data were always stored to disk. Additionally, the database had to ensure no data was lost if a disk failed. To improve join performance the database vendors came up with their own solutions involving indexing, optimization techniques, locking, and specialized data storage structures. Databases were judged on the speed at which they could perform joins.

The flexibility and ease with which applications could be updated and modified compared to the older systems soon made the relational database popular and a must-have. As all relational databases conformed to an international SQL standard, there was a perception that a customer was never locked into a proprietary system and could move their data between different vendors. Though there were elements of truth to this, the reality has shown otherwise. The Oracle Database's key strength was that you were not locked into the hardware; it offered the ability to move a database from a mainframe to Windows to Unix. This portability across hardware effectively broke the stranglehold a number of hardware vendors had, and opened up the competition, enabling hardware vendors to focus on the physical architecture rather than the operating system within it.

In the early 1990s with the rise in popularity of the Apple Macintosh, the rules changed dramatically and the concept of a user friendly graphical environment appeared. The Graphical User Interface (GUI) screen offered a powerful interface for the user to perform data entry. Though it can be argued that data entry was not (and is still not) as fast as data entry via a dumb terminal interface, the use of colors, varying fonts, widgets, comboboxes, and a whole repository of specialized frontend data entry features made the interface easier to use and more data could be entered with less typing. Arguably, the GUI opened up the computer to users who could not type well. The interface was easier to learn and less training was needed to use the interface.

Two tier

The GUI interface had one major drawback: it was expensive to run on the CPU. Some vendors experimented with running the GUI directly on the server (the Solaris operating system offered this capability), but it became obvious that this solution would not scale.

To address this, the two-tier architecture was born. It used a GUI running on an Apple Macintosh, Microsoft Windows, or another windowing environment (Microsoft Windows wasn't the only GUI to run on Intel platforms) to handle the display processing. This was achieved by moving the application display to the computer the user was sitting at, thus splitting the GUI presentation layer and application from the database. This seemed like an ideal solution, as the database could now focus on handling and processing SQL queries and DML without being burdened with application processing as well. As there was no agreed network protocol, a number had to be supported, including named pipes, LU6.2, DECnet, and TCP/IP. The database also had to handle character-set conversion as data moved between the client and the server. The client might be running on a 16-bit platform using US7ASCII as the character set, while the server might be running on 32-bit using EBCDIC. The network suddenly became very complex to manage.

What proved to be the ultimate showstopper with this architecture had nothing to do with the scalability of client or database performance, but rather with something that is always neglected in any architecture: the scalability of maintenance. An environment of a hundred users, each with their own computer accessing the server, requires a team of experts to manage those computers and ensure the software on them is correct. Application upgrades meant upgrading hundreds of computers at the same time, a time-consuming and manual task. Compounding this, if the client computer was running multiple applications, upgrading one might impact the others. Even applying an operating system patch could impact other applications. Users might also install their own software on their computers and impact the application running on them. A lot of time was spent supporting users and ensuring their computers were stable and could correctly communicate with the server.

Three tier

Specialized software vendors tried to come to the rescue by offering the ability to lock down a client computer so it could not be modified, and by allowing remote access to the computer to perform updates. Even then, the maintenance side proved very difficult to deal with, and when the idea of a three tier architecture was pushed by vendors, it was very quickly adopted as the ideal solution to move towards because it critically addressed the maintenance issue.

In the mid 1990s the rules changed again. The Internet started to gain in popularity and the web browser was invented. The browser opened up the concept of a smart presentation layer that is very flexible and configured using a simple markup language. The browser ran on top of HTTP, which uses TCP/IP as the underlying network protocol.

The idea of splitting the presentation layer from the application became a reality as more applications appeared in the browser. The web browser was not an ideal platform for data entry, as the HTTP protocol was stateless, making it very hard to perform transactions in it. The HTTP protocol could scale, though. Actual usage involved exactly the same concepts as block mode data entry on mainframe computers: in a web browser all the data is entered on the screen and then sent in one go to the application handling it.

The web browser also pushed the idea that the operating system the client is running on is immaterial. The web browsers were ported to Apple computers, Windows, Solaris, and Unix platforms.

The web browser also introduced the idea of a standard for the presentation layer. All vendors producing a web browser had to conform to the agreed HTML standard. This ensured that any application that conformed to HTML would run on any web browser.

The web browser pushed the concept that the presentation layer had to run on any client computer (later on, any mobile device as well) irrespective of the operating system and what else was installed on it. The web browser was essentially immune from anything else running on the client computer. If all the client had to use was a browser, maintenance on the client machine would be simplified.

HTML had severe limitations and it was not designed for data entry. To address this, the Java language came about and provided the concept of an applet which could run inside the browser, be safe, and provide an interface to the user for data entry. Different vendors came up with different architectures for splitting their two tier application into a three tier one.

Oracle achieved this by taking their Oracle Forms product and moving it to the middle application tier, and providing a framework where the presentation layer would run as a Java applet inside the browser. The Java applet would communicate with a process on the application server, which would give it instructions for how to draw the display. When the Forms product was replaced with JDeveloper, the same concept was maintained and enhanced. The middle tier became more flexible, and multiple middle application tiers could be configured, enabling more concurrent users.

The three tier architecture has proven to be an ideal environment for legacy systems, giving them a new life and enabling them to be put in an environment where they can scale. The three tier environment nevertheless has a major flaw preventing it from truly scaling: the bottleneck between the application layer and the database. The three tier environment is also designed for relational databases, not for multimedia databases. In this architecture, if the digital objects are stored in the database, then to be delivered to the customer they need to pass through the application-database network (exacerbating the bottleneck capacity issues), and from there be passed to the presentation layer.

Those building in this environment naturally gravitate to the idea that the best location for the digital objects is the middle tier. This then leads to issues of security, backup, management, and all the other issues previously cited as reasons why storing the digital objects in the database is ideal. The seemingly logical conclusion is to move the database to the middle tier to address this. In reality, the logical conclusion is to move the application tier back into the database tier.

Virtualized architecture

In the mid 2000s the idea of virtualization began to appear in the marketplace. Virtualization was not really a new idea; the concept had existed on the IBM MVS environment since the late 1980s. What made this virtualization concept powerful was that it could run Windows, Linux, Solaris, and Mac environments within it. A virtualized environment is basically a complete operating system running within another operating system. If the computer server had sufficient power and memory, it could run multiple virtual machines (VMs). A snapshot could be taken of a VM, which involved capturing a view of its disk and memory and storing it. It then became possible to roll back to the snapshot.

A VM could be easily cloned (copied) and backed up. VMs could also be easily transferred to different computer servers. The VM was not tied to a physical server and the same environment could be moved to new servers as their capacity increased.

A VM environment became attractive to administrators simply because it was easy to manage. Rather than running five separate servers, an administrator could have one server with five VMs in it.

The VM environment arrived at a critical moment in the evolution of computer servers. Prior to 2005, most computer servers had one or two CPUs in them. Advanced servers could have as many as 64 (for example, the Sun E10000), but generally one or two was the norm. The reason was that computer power was doubling roughly every two years, following Moore's law. By around 2005 the market began to realize that there was a limit to the speed of an individual CPU, due to physical limitations in the size of the transistors in the chips. The solution was to grow the CPUs sideways, and the concept of cores came about. A CPU could be broken down into multiple cores, where each one acted like a separate CPU but was contained in one chip. With the introduction of smart threading, the number of virtual cores increased further. A single CPU could now simulate eight or more CPUs.

This concept changed the rules. A server can now run with a large number of cores, whereas 10 years ago it was physically limited to one or two CPUs. Previously, if a process went wild and consumed all the resources of one CPU, it impacted all users. In the multicore CPU environment, a rogue process will not impact the others. In a VM environment the controlling operating system (also called a hypervisor, which can be hardware, firmware, or software based) can constrain VMs to certain cores as well as to CPU thresholds within those cores. This allows a VM to be fenced in. Amazon took this concept, and the cloud environment formed from it.

This architecture is now moving along a new path where users remote desktop into their own VM on a server. The user now needs only a simple laptop (resulting in the demise of the tower computer) to remote desktop (or equivalent) into the VM. They become responsible for managing their own laptop, and in the event of an issue it can be replaced, or wiped and reinstalled with a base operating system. This simplifies the management. As all the business data and application logic are in the VM, the administrator can now control it, easily back it up, and access it.

Though this VM cloud environment seems like a good solution to the maintenance scalability issue, a spanner has been thrown in the works: at the same time as VMs were becoming popular, the mobile phone was evolving into a portable handheld device with applications running on it.

Mobile applications architecture

The iPhone, iPad, Android, Samsung, and other devices have caused a disruption in the marketplace as to how the relationship between the user and the application is perceived and managed.

These devices are simpler and, on the face of it, employ a variety of architectures, including two tier and three tier. Quality control of the application is managed by having an independent and separate environment where the user can obtain applications for the mobile device. The strict controls Apple employs for iTunes are primarily to ensure that Trojan code or viruses are not embedded in an application, with the result that a mobile device does not require complex and constantly updated anti-virus software.

Though the interface is not ideal for heavy data entry, the applications are naturally designed to be very friendly and use touch-screen controls. Their low cost combined with their simple interface has made these devices an ideal product for most people, and in a number of cases they are replacing the need for a laptop. Application vendors whose applications naturally lend themselves to this environment are taking full advantage of it to provide a powerful interface for clients to use.

The result is that there are two architectures today that exist and are moving in different directions. Each one is popular and resolves certain issues. Each has different interfaces and when building and configuring a storage repository for digital objects, both these environments need to be taken into consideration.

For a multimedia environment the ideal solution for implementing the application is based on the Web. This is because the web environment over the last 15 years has evolved into one which is very flexible and adaptable for dealing with the display of those objects. From the display of digital images to streaming video, the web browser (sometimes with plugins to improve the display) is ideal. This includes the display of documents.

The browser environment, though, is not strong for the editing of these digital objects. Adobe Photoshop, GIMP, GarageBand, Office, and a whole suite of other products are available that are designed to edit each type of digital object well. This means that currently the editing of those digital objects requires a different solution from the one used for the loading, viewing, and delivery of those digital objects.

There is no right solution for the tier architecture to manage digital objects. The N-Tier model moves the application and database back into the database tier. An HTTP server can also be located in this tier or for higher availability it can be located externally.

Optimal performance is achieved by locating the application as close to the database as possible. This reduces the network bottleneck. By locating the application within the database (in Oracle this is done by using PL/SQL or Java) an ideal environment is configured where there is no overhead between the application and database.

The N-Tier model also supports the concept of having the digital objects stored outside the environment and delivered using other methods. This could include a streaming server. The N-Tier model also supports the concept of transformation servers. Scalability is achieved by adding more tiers and spreading the database between them. The model also deals with the issue of the connection to the Internet becoming a bottleneck. A database server in the tier is moved to another network to help balance the load.

For Oracle this can be done using RAC to achieve a form of transparent scalability. In most situations, though, scalability at the server is achieved using manual methods involving a form of application partitioning.

Basic database configuration concepts

When a database administrator first creates a database that they know will contain digital objects, they will be confronted with some basic database configuration questions covering key sizing features of the database.

When looking at the Oracle Database there are a number of physical and logical structures built inside the database. To avoid confusion with other database management systems, it's important to note that an Oracle Database is a collection of schemas, whereas in other database management systems the terminology for a database equates to exactly one schema. This confusion has caused a lot of issues in the past. An Oracle Database administrator will say it can take 30 minutes to an hour to create a database, whereas a SQL Server administrator will say it takes seconds. In Oracle, creating a schema (the equivalent of a SQL Server database) also takes seconds to perform.
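To make the terminology difference concrete, what SQL Server calls creating a database maps in Oracle to creating a user (and hence a schema), which takes seconds. A minimal sketch follows; the user name, tablespace, and grants are illustrative:

```sql
-- In Oracle, a "schema" (roughly a SQL Server database) is created
-- as a user plus its privileges. Names and quota are illustrative.
CREATE USER photo_repo IDENTIFIED BY "ChangeMe_123"
  DEFAULT TABLESPACE users
  QUOTA UNLIMITED ON users;

GRANT CREATE SESSION, CREATE TABLE, CREATE PROCEDURE TO photo_repo;
```

Any tables the user then creates live in the photo_repo schema, inside the one Oracle Database.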

For the physical storage of tables, the Oracle Database is composed of logical structures called tablespaces.

The tablespace is designed to provide a transparent layer between the developer creating a table and the physical disk system, and to ensure the two are independent. Data in a table that resides in a tablespace can span multiple disks, disk subsystems, or a network storage system. A subsystem equating to a RAID structure is covered in greater detail at the end of this article.

A tablespace is composed of many physical datafiles. Each datafile equates to one physical file on the disk. The goal when creating a datafile is to ensure its allocation of storage is contiguous, in that the operating system doesn't split its location into different areas on the disk (RAID and NAS structures store the data in different locations based on their core structure, so this rule does not apply to them). A contiguous file will result in less disk activity when full tablespace scans are performed. In some cases, especially when reading in very large images, this can improve performance.

A datafile is divided (when using locally managed tablespaces, the default in Oracle) into fixed-size extents. Access to the extents is controlled via a bitmap which is managed in the header of the tablespace (which will reside on a datafile). An extent is based on the core Oracle block size. So if the extent is 128 KB and the database block size is 8 KB, 16 Oracle blocks will exist within the extent. An Oracle block is the smallest unit of storage within the database. Blocks are read into memory for caching, updated, and changes stored in the redo logs. Even though the Oracle block is the smallest unit of storage, as a datafile is an operating system file, based on the type of server filesystem (UNIX can be UFS and Windows can be NTFS), the unit of storage at this level can change.
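The extent-to-block relationship described above can be inspected directly from the data dictionary. A sketch follows; the segment name PHOTOS is illustrative:

```sql
-- For each extent of a segment, show its size and how many Oracle
-- blocks it contains. PHOTOS is an illustrative table name.
SELECT e.extent_id,
       e.bytes / 1024 AS extent_kb,
       e.blocks       AS oracle_blocks
FROM   dba_extents e
WHERE  e.segment_name = 'PHOTOS'
ORDER  BY e.extent_id;
```

With a 128 KB extent and an 8 KB block size, each row returned would show 16 Oracle blocks per extent.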

The default in Windows was once 512 bytes, but with NTFS it can be as high as 64 KB. This means that every time a request is made to the disk to retrieve data from the filesystem, it does a read that returns this amount of data. So if the Oracle block size was 8 KB and the filesystem block size was 64 KB, when Oracle requests a block to be read in, the filesystem will read in 64 KB, return the 8 KB requested, and discard the rest. Most filesystems cache this data to improve performance, but this example highlights how, in some cases, not balancing the database block size with the filesystem block size can result in wasted I/O. The actual answer is operating system and filesystem dependent, and it also depends on whether Oracle is doing read-aheads (using the init.ora parameter db_file_multiblock_read_count).

When Oracle introduced the Exadata they put forward the idea of putting smarts into the disk layer. Rather than the database working out how best to retrieve the physical blocks of data, the database passes a request for information to the disk system. As the Exadata knows about its own disk performance, channel speed, and I/O throughput, it is in a much better position for working out the optimal method for extracting the data. It then works out the best way of retrieving it based on the request (which can be a query). In some cases it might do a full table scan because it can process the blocks faster than if it used an index. It now becomes a smart disk system rather than a dumb/blind one. This capability has changed the rules for how a database works with the underlying storage system.

ASM—Automated Storage Management

In Oracle 10g, Oracle introduced ASM primarily to improve the performance of Oracle RAC (clustered systems, where multiple separate servers share the same database on the same disk). It replaces the server filesystem and can handle mirroring and load balancing of datafiles. ASM takes the filesystem and operating system out of the equation and enables the database administrator to have a different degree of control over the management of the disk system.
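A minimal sketch of ASM replacing the filesystem follows; the disk device paths, disk group name, and sizes are illustrative:

```sql
-- Run against the ASM instance: create a mirrored (NORMAL REDUNDANCY)
-- disk group from two raw devices. Paths and names are illustrative.
CREATE DISKGROUP data NORMAL REDUNDANCY
  FAILGROUP fg1 DISK '/dev/sdb1'
  FAILGROUP fg2 DISK '/dev/sdc1';

-- In the database instance, a datafile is then placed in the disk
-- group rather than on an operating system filesystem:
CREATE TABLESPACE image_data DATAFILE '+DATA' SIZE 10G;
```

ASM itself decides where in the disk group the datafile's storage lives, handling the mirroring and striping that the administrator would otherwise manage at the filesystem or RAID level.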

Block size

The database block size is the fundamental unit of storage within an Oracle Database. Though the database can support different block sizes, a tablespace is restricted to one fixed block size.

The block sizes available are 4 KB, 8 KB, 16 KB, and 32 KB (a 32 KB block size is valid only on 64-bit platforms). The current tuning mentality says it's best to have one block size for the whole database. This is based on the idea that the one block size makes it easier to manage the SGA and ensure that memory isn't wasted.

If multiple block sizes are used, the database administrator has to partition the SGA into multiple areas and assign each a block size. So if the administrator decided to have the database at 8 KB and 16 KB, they would have to set up database startup parameters indicating the size of each cache:

DB_8K_CACHE_SIZE = 2G
DB_16K_CACHE_SIZE = 1G

The problem the administrator faces is that it can be hard to match memory usage to table usage. In the above scenario the tables residing in the 8 KB blocks might be accessed a lot more than the 16 KB ones, meaning the memory allocation needs to be adjusted to deal with that. This balancing act of tuning invariably results in the decision that, unless exceptional situations warrant otherwise, it's best to keep to the same database block size across the whole database. This makes the job of tuning simpler.

As is always the case when dealing with unstructured data, the rules change. The current thinking is that it's more efficient to store the data in a large block size. This ensures there is less wasted overhead and fewer block reads when reading in a row of data. The challenge is that the size of the unstructured data can vary dramatically. It's realistic for an image thumbnail to be under 4 KB in size. This makes it an ideal candidate to be stored in the row with the other relational data. Even if an 8 KB block size is used, the thumbnail and other relational data might happily exist in the one block. A photo, however, might be 10 MB in size, requiring a large number of blocks to store it. If a 16 KB block size is used, it requires about 64 blocks to store 1 MB (allowing for some block-header overhead that requires extra storage overall).

An 8 KB block size requires about 130 blocks per megabyte. To store 10 MB, the number of blocks increases tenfold; for an 8 KB block size that is over 1,300 reads for one small 10 MB image. With images now coming close to 100 MB in size, this figure again increases by a factor of 10. It soon becomes obvious that a very large block size is needed. When storing video at over 4 GB in size, even a 32 KB block size seems too small.

As is covered later in the article, unstructured data stored in an Oracle BLOB does not have to be cached in the SGA. In fact, caching it is discouraged, because in most situations the data is not likely to be accessed on a frequent basis. This generally holds true, but there are cases, especially with video, where it does not; that situation is covered later. Under the assumption that thumbnails are accessed frequently and should be cached, while originals are accessed infrequently and should not be, the conclusion is that it becomes practical to split the SGA in two. The unstructured, uncached data is stored in a tablespace using a large block size (32 KB), and the remaining data is stored in a more reasonable 8 KB block. The SGA cache for the 32 KB block size is kept to a bare minimum, as it will not be used, thus sidestepping the perceived memory waste of splitting the SGA in two.
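The split described above can be sketched as follows; the tablespace, table, and cache sizes are illustrative, not a definitive configuration:

```sql
-- A small, rarely used cache for the 32 KB tablespace that holds the
-- uncached originals. Sizes and names are illustrative.
ALTER SYSTEM SET db_32k_cache_size = 64M SCOPE = BOTH;

CREATE TABLESPACE lob_ts
  DATAFILE '/u01/oradata/db1/lob_ts01.dbf' SIZE 100G
  BLOCKSIZE 32K;

-- Thumbnails stay with the row in the default 8 KB tablespace and are
-- cached; originals go to the 32 KB tablespace and are not cached.
CREATE TABLE photo (
  photo_id  NUMBER PRIMARY KEY,
  thumbnail BLOB,
  original  BLOB
)
LOB (original) STORE AS SECUREFILE (
  TABLESPACE lob_ts
  NOCACHE
);
```

The NOCACHE clause keeps reads of the originals from flooding the buffer cache, which is why the 32 KB cache can be kept to a bare minimum.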

A simple test was done using three tablespace block sizes. The aim was to see if the block size would impact load and read times. The load involved reading in 67 TIFF images totaling 3 GB in size. The result was that the tablespace block size made no statistically significant difference. The test was done using a 50 MB extent size and, as shown in the next segment, this size will impact performance. So to correctly understand how important block size can be, one has to look not only at the block size but also at the extent size.

UNIFORM extent size and AUTOALLOCATE

When creating a tablespace to store unstructured data, the next step after the block size is determined is to work out the most efficient extent size. As a table might contain data ranging from hundreds of gigabytes to terabytes, determining the extent size is important. The larger the extent, the greater the potential to waste space if the table doesn't use it all. The smaller the extent size, the greater the risk that the table will grow into tens or hundreds of thousands of extents. As a locally managed tablespace uses a bitmap to manage access to the extents and is generally quite fast, having it manage tens of thousands of extents might be pushing its performance capabilities.

There are two methods available to the administrator when creating a tablespace. They can manually specify the extent size using the UNIFORM SIZE clause, or they can let the Oracle Database calculate it using the AUTOALLOCATE clause. Tests were done to determine the optimal extent size when AUTOALLOCATE was not used. AUTOALLOCATE is a more set-and-forget method, and one goal was to see whether this clause was as efficient as manually setting the size.
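The two methods can be sketched side by side; datafile paths and sizes are illustrative:

```sql
-- Fixed (UNIFORM) extents: every extent is exactly 100 MB.
CREATE TABLESPACE image_uniform
  DATAFILE '/u01/oradata/db1/img_uni01.dbf' SIZE 50G
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 100M;

-- System-managed extents: Oracle chooses, and grows, the extent size.
CREATE TABLESPACE image_auto
  DATAFILE '/u01/oradata/db1/img_auto01.dbf' SIZE 50G
  EXTENT MANAGEMENT LOCAL AUTOALLOCATE
  SEGMENT SPACE MANAGEMENT AUTO;
```

With UNIFORM the administrator owns the sizing decision; with AUTOALLOCATE, Oracle starts with small extents and allocates progressively larger ones as the segment grows.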

Locally managed tablespace UNIFORM extent size

This section covers testing performed to try to find an optimal extent and block size. The results showed that a block size of 16384 (16 KB) is ideal, though 8192 (8 KB) is acceptable. A block size of 32 KB was not tested. An administrator who might be tempted to think that the larger the extent size, the better the performance, would be surprised that the results show this is not always the case; an extent size between 50 MB and 200 MB is optimal.

For reads with SECUREFILES the number of extents was not a major performance factor but it was for writes.

When compared to the AUTOALLOCATE clause, there was no real performance improvement or loss when it was used. The testing showed that an administrator can use this clause knowing they will get a good all-round result when it comes to performance. The syntax for configuration is as follows:

EXTENT MANAGEMENT LOCAL AUTOALLOCATE SEGMENT SPACE MANAGEMENT AUTO

Repeated tests showed that this configuration produced optimal read/write times without the database administrator having to worry about what the extent size should be. For a 300 GB tablespace it produced a similar number of extents as when a 50 MB extent size was used.
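A complete creation statement using this clause might look like the following sketch; the tablespace name, datafile path, and sizes are illustrative assumptions only:

```sql
-- Illustrative only: name, path, and sizes are assumptions
CREATE TABLESPACE image_data
  DATAFILE '/u01/oradata/orcl/image_data01.dbf'
  SIZE 10G AUTOEXTEND ON NEXT 1G
  EXTENT MANAGEMENT LOCAL AUTOALLOCATE
  SEGMENT SPACE MANAGEMENT AUTO;
```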

As has been covered, once an image is loaded it is rarely updated. In a relational database, fragmentation within a tablespace is caused by repeated creation and dropping of schema objects and extents of different sizes, resulting in physical storage gaps which are not easily reused. Storage is lost. This is analogous to disk storage in the Microsoft Windows environment. After a period of time the disk becomes fragmented, making it hard to find contiguous storage and to locate similar items together. Locating all the pieces of a file as close together as possible can dramatically reduce the number of disk reads required to read it in. With NTFS (a Microsoft disk filesystem format), the system administrator can determine on creation whether extents are autoallocated or fragmented. This is similar in concept to Oracle tablespace creation. Testing was not done to check whether the fragmentation scenario is avoided with the AUTOALLOCATE clause. The database administrator should therefore be aware of the tablespace usage and whether it is likely to be stable once rows are added (in which case AUTOALLOCATE can be used, simplifying storage management). If it is volatile, the UNIFORM clause might be a better option.

Temporary tablespace

For working with unstructured data, the primary use of the TEMPORARY tablespace is to hold the contents of temporary tables and temporary lobs. A temporary lob is used for processing a temporary multimedia object. In the following example, a temporary blob is created. It is not cached in memory. A multimedia image type is created and loaded into it, information is extracted, and the blob is freed. This is useful if images are stored temporarily outside the database. This is not the same as using a bfile, which Oracle Multimedia supports; the bfile is a permanent pointer to an image stored outside the database.
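A minimal sketch of that lifecycle is shown below; the processing step is elided, and cache => FALSE keeps the temporary lob out of the Buffer Cache:

```sql
DECLARE
  v_blob BLOB;
BEGIN
  -- Allocate an uncached temporary blob in the TEMPORARY tablespace
  DBMS_LOB.CREATETEMPORARY(v_blob, cache => FALSE);
  -- ... load the image and extract information here ...
  DBMS_LOB.FREETEMPORARY(v_blob);
EXCEPTION
  WHEN OTHERS THEN
    -- Free on failure as well, so the tablespace doesn't leak storage
    DBMS_LOB.FREETEMPORARY(v_blob);
    RAISE;
END;
/
```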

It's important when using this tablespace to ensure that all code, especially on failure, calls the dbms_lob.freetemporary function so that storage leakage doesn't occur. Leakage will result in the tablespace continuing to grow until it runs out of room. In this case the only way to clean it up is either to stop all database processes referencing it and then resize the datafile (or drop and recreate the temporary tablespace after creating another interim one), or to restart the database and mount it. The tablespace can then be resized or dropped and recreated.

UNDO tablespace

The UNDO tablespace is used by the database to store sufficient information to roll back a transaction. In a database containing a lot of digital objects, the size of the database just for storage of the objects can exceed terabytes. In this situation the UNDO tablespace can be sized larger, giving the database administrator added opportunity to perform flashback recovery from user error. It's reasonable to size the UNDO tablespace at 50 GB, even growing it to 100 GB. The larger the UNDO tablespace, the further back in time the administrator can go, and the greater the breathing space between user failure, the failure being detected and reported, and the database administrator doing the flashback recovery.

The following is an example flashback SQL statement. The as of timestamp clause tells Oracle to find rows that match the timestamp going back from the current time, so that we can look at a table as it was an hour ago:
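A sketch of such a statement, assuming a hypothetical table called myimage:

```sql
select * from myimage
as of timestamp systimestamp - interval '1' hour;
```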

SYSTEM tablespace

The SYSTEM tablespace contains the data dictionary. In Oracle 11g R2 it also contains any compiled PL/SQL code (where PLSQL_CODE_TYPE=NATIVE). The recommended initial starting size of the tablespace should be 1500 MB.

Redo logs

The following test results highlight how important it is to get the size and placement of the redo logs correct. The goal was to determine what combination of database parameters and redo/undo size were optimal. In addition, an SSD was used as a comparison. Based on the result of each test, the parameters and/or storage was modified to see whether it would improve the results. When it appeared an optimal parameter/storage setting was found, it was locked in while the other parameters were tested further. This enabled multiple concurrent configurations to be tested and an optimal result to be calculated.

The test involved loading 67 images into the database. Each image varied in size between 40 to 80 MB resulting in 2.87 GB of data being loaded. As the test involved only image loading, no processing such as setting properties or extraction of metadata was performed. Archiving on the database was not enabled. All database files resided on hard disk unless specified.

In between each test a full database reboot was done. The test was run at least three times with the range of results shown as follows:

Analysis

The tests show a huge improvement when the redo logs were moved to a Solid State Drive (SSD). The conclusion that can be drawn, though, is that this seemingly optimal step might be self-defeating. A number of SSD manufacturers acknowledge there are limitations with SSDs when it comes to repeated writes. The mean time to failure might be 2 million hours for reads, but for writes the failure rate can be much higher. Modern SSDs and flash cards offer much improved wear-leveling algorithms to reduce failures and make performance more consistent. No doubt improvements will continue in the future.

A redo log by its nature sustains constant, heavy writes. So moving the redo logs to the SSD might quickly result in it becoming damaged and failing. For an organization that performs one very large load of multimedia on configuration, the solution might be to initially keep the redo logs on SSD and, once the load is finished, move them to a hard drive.

Increasing the size of the redo logs from 50 MB to 1 GB improves performance, and all databases containing unstructured data should have a redo log size of at least 1 GB. The number of logs should be at least 10; preferred is 50 to 100. As is covered later, disk is cheaper today than it once was, and 100 GB of redo logs is not as large a volume of data as it once was. The redo logs should always be mirrored.

The placement or size of the UNDO tablespace makes no difference with performance.

The redo buffer size (LOG_BUFFER) showed a minor improvement when it was increased, but the results were inconclusive as the figures varied. A value of LOG_BUFFER=8691712 showed the best results, and database administrators might use this figure as a starting point for tuning.

Changing the multiblock read count (DB_FILE_MULTIBLOCK_READ_COUNT) from the default value of 64 to 256 showed no improvement. As the default value (in this case 64) is set by the database as optimal for the platform, the conclusion that can be drawn is that the database has set this figure to a good size.

Moving the original images to an SSD showed another huge improvement in performance. This highlighted how the I/O bottleneck of reading from disk and writing to disk (redo logs) is critical for digital object loading.

The final test involved moving the datafile containing the table to the SSD. It highlighted a realistic issue that DBAs face in dealing with I/O. The disk speed and seek time might not be critical in tuning if the bottleneck is the actual time it takes to transfer the data between the disk and the server. In this test case the datafile was moved to the same SSD as the redo logs, resulting in I/O competition. In the previous tests the datafile was on the hard disk, and the database could write to the disk (one I/O channel) and to the redo logs (a separate I/O channel) without one impacting the other. Even though the SSD is an order of magnitude faster than the disk, it quickly became swamped with calls for reads and writes. The lesson is that it's better to have multiple smaller SSDs on different I/O channels into the server than one larger channel. Sites using a SAN will soon realize that even though the SAN might offer speed, unless it provides multiple I/O channels into the server, its channel to the server will quickly become the bottleneck, especially if the datafiles and the images for loading are all located on the server.

The original tuning notion of separating datafiles onto separate disks, practiced more than 15 years ago, still makes sense when it comes to image loading into a multimedia database. It's important to stress that this is a tuning issue for image loading, not for running the database in general. Tuning the database in general is a completely different story and might result in a completely different architecture.

Oracle Securefile architecture

In Oracle 11g, Oracle rearchitected the management of lobs. This new structure is called SECUREFILES. The storage structure also included a number of new optimizations and security features. It is anticipated that the older storage method (now called BASICFILES) will be made obsolete in the next major release.

The storage architecture also introduced a new method for indexing the lobs, enabling faster access to parts of it. Reading and writing were also improved.

The following sections describe some of the key storage options a database administrator might want to consider when setting up tables that hold unstructured data. Example table creation statements are included at the end of the article.

Enabling storage in row

This option is useful for small lobs, ones that are less than 4000 bytes. When this option is enabled, Oracle stores the first 4000 bytes in the row, just as if the column were a RAW(4000) one. If the lob is larger than 4000 bytes, the remainder is stored in the specialized lob storage area. The 4000 bytes includes system information about where the actual lob is located.
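A sketch of enabling this option on a securefile lob; the table and column names are illustrative:

```sql
create table small_docs (
  pk  number primary key,
  doc blob
)
lob (doc) store as securefile (
  enable storage in row
);
```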

CHUNK

Oracle indicates in its documentation that this parameter is now only advisory and is there for backwards compatibility. It was used with the BASICFILE type lob and can now be omitted.

Logging

Logging indicates whether the lob on any DML is logged to the redo logs. If LOGGING is specified, in the event of media failure the database can roll forward and restore the lob from archives.

If NOLOGGING is specified then the lob cannot be recovered as information is not written to the redo logs. It ensures faster load time at the expense of recovery in the event of failure. If a tablespace containing only lobs is created and lobs are loaded into it with NOLOGGING, once the tablespace is made read only and backed up, the need for logging is removed making this a viable option. Generally though, LOGGING should be used.

If a lob is to be cached in the Buffer Cache, LOGGING is enabled. A third option, FILESYSTEM_LIKE_LOGGING, also ensures recovery in the event of failure. Data dictionary information about the new space for the lob is not logged, but extent information is logged. This ensures sufficient information exists to restore the lob.

Cache

Cache determines how a lob is stored in the Oracle Buffer Cache. The three values are as follows:

CACHE: This caches the lob. Considering the size of lobs, caching should be carefully considered. One large lob could swap out the whole Buffer Cache. The standard rule is to cache thumbnails, but not cache web quality or larger lobs.

NOCACHE: This doesn't cache the lob. This is the recommended value to use unless there is justification in caching it.

CACHE READS: This caches the lob only if it's read but not written. Oracle currently does not support being able to indicate which lobs are to be cached (lob row value indicated caching).

Managing duplicate images

Specifying DEDUPLICATE will result in Oracle checking on loading whether an identical blob already exists. If so, it will store the blob only once and maintain a pointer structure to it. This option is best used in a multimedia warehouse, where duplicate images are likely.

The DEDUPLICATE option calculates a number of different checksum and comparison routines on the image before comparing it to existing ones. Multiple routines are used to ensure that false positives do not occur. If a match is detected, only the original image is stored. The time to load an image with DEDUPLICATE is as long as loading a normal image; the major advantage this option provides is the saving on storage.
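The clause is part of the securefile storage definition. A sketch with illustrative names:

```sql
create table photo_warehouse (
  pk    number primary key,
  image blob
)
lob (image) store as securefile (
  deduplicate
  nocache
);
```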

Retention

This parameter is used to determine how the lob is to be used for read consistency and/or flashback. A value of AUTO will enable sufficient information to be maintained for read consistency. A value of MIN or MAX will impact the undo retention used by flashback.

Lob compression

This enables the lob to be compressed using LZ compression, as used in the utl_compress supplied package. Lob compression on binary data does not achieve any noticeable results; in some cases it can result in the lob becoming larger. Multimedia images need to be compressed using alternate methods. It will, though, produce storage benefits if character data is stored in the lob, and major benefits if the data stored is XML.

The recommendation is not to compress unless the data can be shown to compress, and the storage reduction gained is balanced against the extra CPU usage to decompress the lob when retrieved.
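For lobs holding character or XML data, compression is enabled in the storage clause. The following is a sketch with illustrative names:

```sql
create table xml_store (
  pk  number primary key,
  doc clob
)
lob (doc) store as securefile (
  compress medium
);
```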

Encryption

The encryption clause enables a lob to be stored in an encrypted form in the database. Enabling it has to be a business decision as performance will be impacted by encryption. This includes CPU to encrypt as well as a large amount of CPU to decrypt it every time it's accessed. The amount of CPU time is proportional to the size of the lob being encrypted.

If encryption is used to protect the data from database administrators, alternative methods should be investigated, as administrators have a number of tools in their arsenal which they can use to easily determine what the encrypted content is.

Read-only tablespace

There are some key advantages to making a tablespace read-only. If a large number of images are loaded into the database and the table becomes fixed or static (not changing), then by making the tablespace read-only, all its datafiles need be backed up only once. This reduces the backup footprint.

Recovery is quicker too, as the database does not have to apply any transactional data to the contents of the tablespace.

The tablespace also becomes easier to transport between databases. The datafiles associated with the tablespace, plus a small export of data dictionary information, enable it to be moved and reattached to another database. This allows the administrator to perform large-volume copies of the images in the table. Such an option is useful when copying images between an internal site and an external site.

Where does Oracle Multimedia fit in?

The Oracle Database supports three types of binary data types:

Binary Large Object (BLOB): This is a data type that contains binary data. Two structures exist in Oracle. The traditional one is called BASICFILE, and a newer, faster one introduced in Oracle 11g is called SECUREFILE. The BASICFILE format will be desupported in a later release. The BLOB replaces the LONG data type, which was the standard data type for any binary data. The LONG data type had a number of key limitations, including only one being allowed per table and minimal support for controlling its storage. As of Oracle 11g a BLOB does not have a practical maximum size.

Character Large Object (CLOB): This is a lob designed for text data. It is very similar in behavior to a VARCHAR2 of unlimited length. In Oracle 11g the difference is minor and text functions can be applied to a CLOB. A CLOB is useful for storing XML data. A CLOB is controlled by the database character set, whereas an NCLOB is controlled by the national character set as configured by the database administrator. A document like Microsoft Word or a PDF is stored as a BLOB, because these documents can contain binary data which cannot be handled by a character set.

Binary File (BFILE): This is a pointer to a file outside the database. The BFILE is configured by specifying an Oracle Directory (covered later) and a physical filename. The physical filename can include a path embedded in it. Once set up, the BFILE behaves much like a BLOB. A BFILE is read-only, and if the physical file it points to is moved, the BFILE becomes orphaned, in that it now points to a non-existent file.

When using PL/SQL, the key routine for manipulating these types is the DBMS_LOB package. This package contains routines to read, load, and write to the types. In some cases when piecewise updates are done on a BLOB, the Oracle 11g PL/SQL limit of 32 KB is encountered, requiring the manipulation to be done using the RAW data type. Oracle provides a supplied package called UTL_RAW that can be used to perform additional binary manipulation of this data.

Oracle Multimedia is a set of data types and methods (built in Java) that encompass the BLOB, CLOB, and BFILE and include the ability to reference a file via an HTTP reference. Each type is designed to identify a particular type of digital object and in some cases enable it to be transformed. The four major Multimedia types are as follows:

ORDSYS.ORDIMAGE

ORDSYS.ORDVIDEO

ORDSYS.ORDAUDIO

ORDSYS.ORDDOC

All these types are defined in a schema called ORDSYS. Each of these types has a set of supplied methods that can be used against the digital object. Object-oriented style syntax can be used, or the traditional PL/SQL calls can be used.

All these four types reference a core type called ORDSYS.ORDSOURCE. This type contains the BLOB, CLOB, BFILE, or HTTP reference and effectively hides (encapsulates) the digital object within its layer.

The ORDSYS data types are central to Oracle Multimedia. All tables need to reference these types to use the methods. Though it's possible for an architect to design their own data type based on a BLOB, to ensure a common standard usage it's more efficient to use the object-oriented concept of extending the Oracle Multimedia ORDSYS data type. This means either creating a new type referencing the ORDSYS type within it, or actually extending the type and adding additional attributes to it.

Understanding the ORDSYS data types

This section covers methods for creating and working with tables using the Oracle Multimedia ORDSYS data types.

Creating a table

As the ORDSYS Multimedia types reference ORDSYS.ORDSOURCE, which in turn references a BLOB, the database administrator has the option of specifying all the storage parameters for each of the columns. For example, a table creation statement looks like the following:
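A sketch of such a statement; the table name, lob name, and tablespace are assumptions chosen to match the queries shown next:

```sql
create table myimage (
  pk      number primary key,
  myphoto ordsys.ordimage
)
-- Naming the nested lob makes it easy to find in USER_LOBS
lob (myphoto.source.localdata) store as securefile l_myphoto (
  tablespace image_data
  disable storage in row
  nocache
  logging
);
```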

By naming the lobs it becomes easier to identify them. A lob doesn't have to be explicitly named; if it isn't, Oracle assigns a system-generated name to it. The lob can be found by querying the SEGMENT_NAME column in the USER_LOBS data dictionary view or the SEGMENT_NAME column in the USER_SEGMENTS data dictionary view.

How to query?

Data in an Oracle type is referenced using dot notation and an alias: the column name, dot, the column name in the type, dot, the column name in the subtype, and so on, for however deep the type is nested.

To retrieve the local column found in ORDSYS.ORDSOURCE, referenced by the ORDSYS.ORDIMAGE type using the previous example column, the query is as follows:

select m.myphoto.source.local from myimage m;

In an index creation statement the alias column isn't used:

create index i_myimage_1 on myimage(myphoto.width);

Multimedia methods

Methods are very similar to PL/SQL functions but are more an official object-oriented calling mechanism. Methods are built very much like functions and are attached to a type.

Each of the ORDSYS multimedia types has a number of associated methods that perform a variety of functions, including retrieving values, updating values, and processing a digital image.

In the following example using the test_load table, the image will be converted to thumbnail size.

The process command accepts text as input (enabling dynamic modification when calling) and can perform a number of commands. In most cases when an image is processed, the row has to have been retrieved using a for update clause to ensure it's exclusively locked.
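A sketch of such a conversion, assuming test_load has a primary key column pk and an ORDSYS.ORDIMAGE column called vimg (both are assumptions):

```sql
declare
  v_img ordsys.ordimage;
begin
  -- Lock the row exclusively before processing
  select t.vimg into v_img
  from test_load t
  where t.pk = 1
  for update;
  -- Scale the image so its longest side is 128 pixels
  v_img.process('maxScale=128 128');
  update test_load t set t.vimg = v_img where t.pk = 1;
  commit;
end;
/
```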

Some methods can be called like a SQL function. These ones have to have been built to return a SQL data type:

select u.vimg.getSourceType() from test_load u;

This SQL statement returns whether the image is a file or an HTTP reference.

Creating a schema

The following section goes through an example schema creation configuration:
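A minimal schema creation sketch; the password, tablespaces, and privilege list are assumptions and should be adjusted to site standards:

```sql
create user multimedia identified by mypassword
  default tablespace users
  temporary tablespace temp
  quota unlimited on users;

grant create session, create table, create procedure,
      create type, create sequence to multimedia;
```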

The following two permissions are optional but can prove useful. They allow the schema to access the dbms_lock.sleep command (useful for asynchronous processing) and the dbms_file_transfer package, allowing the schema to manipulate external multimedia files.

grant execute on dbms_file_transfer to multimedia;
grant execute on dbms_lock to multimedia;

Oracle HTTP servers

There are two HTTP gateway servers available in Oracle. One uses the Oracle listener to act as an HTTP server and can deliver data from the database using PL/SQL stored procedures. The second involves using Apache and the Mod PL/SQL gateway. Each has advantages and disadvantages.

The Apache HTTP server offers additional capabilities such as virtual directory configuration, SSL, URL rewrites, Single Sign On, and a whole range of open source plugins. The embedded gateway has limitations but is simple to configure and is designed to work with Mod PL/SQL (and Oracle Apex). It's lightweight and is embedded in the Oracle listener, requiring one less process to manage.

Configuring the Oracle embedded gateway

The goal of this section is to show you how to configure Oracle Database for the embedded PL/SQL gateway. For Oracle XE, the embedded gateway comes configured for Apex by default.

All configuration is done via SQL*Plus. SQL*Plus represents the lowest common interface available across all Oracle versions. The command-line interface when used with a text editor enables quick and easy modification of most Oracle features.

Tools like Oracle SQL Developer and Enterprise Manager can also be used, but it is best to have a good understanding of the core capabilities before using GUI development tools:

Use SQL*Plus to connect to the database as SYS:

sqlplus / as sysdba

Check to make sure the database dispatchers parameter is correctly configured to allow the database to talk to the listener:
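The check, and a hedged example of setting the parameter, might look as follows; replace orcl with your own SID:

```sql
show parameter dispatchers
-- A typical value is: (PROTOCOL=TCP) (SERVICE=orclXDB)

-- If it is missing or wrong, it can be set like this:
alter system set dispatchers = '(PROTOCOL=TCP) (SERVICE=orclXDB)' scope=both;
```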

Here orcl is the SID (unique identifier) of the database installed on your machine. You can find the SID of your database using the following command:

select name from v$database;

NAME
--------
ORCL

If this parameter is not set or it is set incorrectly, the database will not be able to talk to the listener and the embedded gateway will not work.

Unlock the anonymous account:

alter user anonymous ACCOUNT UNLOCK;

The anonymous account is used by Oracle to connect to the database to find additional connection information about the embedded gateway. The gateway also uses Oracle XML DB, so the query in the next step checks that Oracle XML DB is enabled.

Check that Oracle XML DB, required for the gateway, is enabled. This is done by checking that user SYS has the XDBADMIN role:

select * from dba_role_privs where granted_role = 'XDBADMIN';

All web interfaces use a port to connect to the database. The Oracle Listener (which will handle all incoming HTTP requests and will act as an HTTP server, similar to Apache), needs to know which port to listen on for incoming HTTP requests. Talk to your network administrators to determine which port you can use.

The following script will open up two ports: one for FTP and one for HTTP requests.

Windows platforms can access port 80 without any problems. Unix platforms require root access for ports with numbers less than 1024. This means starting up the listener with permissions equivalent to root, which can get quite tricky. This is why on Unix most sites use HTTP ports above 1024.

In the following example port 80 is configured for HTTP and port 2121 for FTP:
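Using the dbms_xdb package, the two ports can be set as follows:

```sql
exec dbms_xdb.setHttpPort(80);
exec dbms_xdb.setFtpPort(2121);
commit;
```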

In this step we create the HTTP DAD (Database Access Descriptor). The DAD defines the mapping between the virtual directory and the actual schema.

An example URL used to access it could be http://localhost:80/mm/program. Here /mm/ is the virtual directory and needs to be mapped to the schema in the database.

To do this step, we use the dbms_epg PL/SQL package. We define the DAD and control which PL/SQL programs will access it.

Perform the following steps to define the mapping between the virtual directory and the actual schema:

Create the DAD.

Note that the first call will return an error if the DAD doesn't exist yet and can be ignored:

exec dbms_epg.drop_dad( 'PICTION' );

This command defines the DAD, named PICTION, and maps it to the virtual path /mm/. It is case sensitive. We can fine-tune it to control which programs (PL/SQL packages) it can access, but for now let us give it full access:

exec dbms_epg.create_dad( 'PICTION','/mm/*');

We then map it to a schema. With the embedded gateway, we don't need to manage a password. If we had used Mod PL/SQL and Apache, the password would be stored in a dads.conf file (in encrypted form).
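The mapping can be done with dbms_epg.set_dad_attribute; the schema name used here matches the example schema created earlier:

```sql
exec dbms_epg.set_dad_attribute( 'PICTION', 'database-username', 'MULTIMEDIA' );
```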

These parameters are important security safeguards: they ensure that packages which can be accessed from within SQL*Plus by the multimedia schema cannot be accessed from the URL. This is one of the important steps to safeguard against URL modification.

We then have to authorize the DAD. This is important, because we are effectively saying we are allowing access to the schema without a password using the embedded gateway.

exec dbms_epg.authorize_dad( 'PICTION','MULTIMEDIA');
commit;

Note that these commands help if you cannot get in on port 8890:

call dbms_xdb.setHttpPort(8890);
alter system register;

Configuring Apache

Apache offers an open source multithreaded HTTP Server that can integrate with Oracle.

The Oracle Database can work with either Apache 1.3 or Apache 2.0; the integration is done via Mod PL/SQL. Mod PL/SQL with Apache 1.3 superseded the integration Oracle provided when it shipped its own web server.

The multi-threaded nature of the connection means that Apache establishes a number of fixed connections to the database (client/server). Multiple HTTP users then perform requests on a round-robin basis across these fixed connections. This means there is no guarantee that a user performing multiple HTTP requests will be using the same physical connection to the database. This is a key issue that has to be considered in development: the PL/SQL program (or Java program) cannot rely on session variables or the contents of temporary tables being there from call to call. Apache manages the load balancing and, based on usage, can increase or decrease the number of physical connections to the database. The result is an incredibly scalable architecture enabling thousands of concurrent requests to access the database.

Because of the flexibility of the Oracle architecture, either Apache 1.3 or Apache 2.0 can connect to any version of Oracle. As Apache 2.0 included a number of default security fixes, it is recommended that Apache 2.0 be installed wherever possible.

As the Apache install is a client/server configuration, if one treats Apache as a client application, the logic behind its installation is simplified. The challenge for the database administrator is that between Apache 1.3 and Apache 2.0 the installation changes dramatically.

This means that once Apache is installed, the TNS layer (the Oracle Database network layer found with all client/server applications, not to be confused with the JDBC thin-tier connection) needs to be configured, along with a DAD (Database Access Descriptor, containing information about connecting to the database). The database itself needs a LISTENER configured.

It is possible to use Apache with Mod PL/SQL and just have it rewrite (effectively redirect) all requests to the Oracle embedded gateway. This is a realistic and workable option that can prove useful in environments where Apache is the main HTTP server handling all HTTP traffic and the network administrator is not comfortable having Mod PL/SQL installed, or where the Apache network is managed by a third-party company.

The Apache HTTP server is required if SSL is needed, as the Oracle embedded gateway is not SSL compliant.

Basic diagnostics

The administrator can do a quick check on Unix or Windows to see if the HTTP server is listening to requests. If it's not listening, this could be either because the service is not running or a firewall is blocking access to it.

The command is as follows:

telnet localhost 80

Assuming the HTTP server is listening on port 80, this command will either fail (indicating it cannot access the HTTP server) or open up a screen prompting for further information. The administrator can then type quit to exit.

The telnet.exe program is no longer available by default on all platforms, even Unix ones. For Windows it was once bundled as part of an extras pack. Now it comes included in Cygwin (a Unix shell on Windows) or can be downloaded from the Internet.

Testing whether the HTTP server is alive and listening for requests on the server is the first of many potential steps that might need to be performed when diagnosing bad connections to the server.

Windows

When Apache 2.0 is installed on Windows, a Windows service is created allowing the administrator to start and stop it. Changing the owner of the service is not recommended in this version (it was possible in older versions), as experience has shown it becomes difficult to connect to it.

The dads.conf file contains information enabling Apache to connect to the database. Using the example installation, the file is found at C:\oracle\apache2\instances\instance1\config\OHS\ohs1\mod_plsql.

On installation the file is empty and descriptors need to be added to it. The following is an example configuration:
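A sketch of a descriptor; the location, connect string, and credentials are illustrative assumptions (the password shown would be replaced by its encrypted form, as covered next):

```apache
<Location /mm>
  SetHandler pls_handler
  PlsqlDatabaseUsername      multimedia
  PlsqlDatabasePassword      mypassword
  PlsqlDatabaseConnectString localhost:1521:orcl
  PlsqlAuthenticationMode    Basic
</Location>
```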

The definitions of the parameters in the dads.conf file are detailed as follows. The dads.conf file can support multiple definitions. Once a DAD is configured, the password has to be encrypted using a Perl program provided by Oracle. The documentation provided with this release is incorrect and can cause administrators to spend a lot of time trying to work out how to encrypt it.

The following code from DOS shows how to invoke Perl to encrypt the password in the dads.conf file using the example install:

This assumes Oracle version 11.2.0.2, with Apache2 installed in c:\oracle\apache2 and the database in c:\oracle\ora11gR2.
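The session might look like the following sketch; dadTool.pl is the Oracle-supplied obfuscation script, and the paths follow the example install:

```
set ORACLE_HOME=c:\oracle\ora11gR2
set PATH=%ORACLE_HOME%\perl\bin;%PATH%
cd c:\oracle\apache2\instances\instance1\config\OHS\ohs1\mod_plsql
perl dadTool.pl -o
```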

As can be seen, reference is made to the Oracle 11g R2 install, which contains the version of Perl required to do the encryption. Once the program is run, the output will indicate the number of passwords obfuscated (that is, encrypted).

The encryption algorithm used is linked to the installed version of Apache2. So it's not possible to copy a dads.conf file from one environment to another and expect it to work. If multiple schemas are referenced in the dads.conf file and they all have the same password, they will be encrypted identically. This means it's possible to configure a new DAD just by copying it in the file including the encrypted password.

Unix

When Apache is installed on Unix, you need to execute the following at the command prompt:
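The original command listing is not reproduced here. With an Oracle HTTP Server install, a typical sketch is shown below; the install path is an assumption, and opmnctl is the OHS control utility:

```shell
cd /u01/oracle/apache2/instances/instance1/bin
./opmnctl startall   # start the HTTP server and other managed components
./opmnctl status     # confirm the HTTP server process is alive
```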

HTTPD.CONF file

The master parameter configuration file for Apache is called httpd.conf. It's found in the directory (if the install shown previously is used):

C:\oracle\apache2\instances\instance1\config\OHS\ohs1

There are a lot of parameters in the file and it is self-documenting, but for an administrator who just wants to modify the basic values, the following are some of the core parameters, based on an Apache install on Windows (keep in mind that on Unix, root needs to start the Apache server if it's going to listen on any port less than 1024). Windows was chosen because most documentation one finds on Apache configuration covers only Unix.

ServerName: This is the DNS name of the server. On installation the identified name of the server is used. Multiple DNS names can share the one HTTP server using the concept of a virtual host.

ServerName: www.site1.com

ServerRoot: This is the top level directory, where all the HTTP server configuration and log files are kept.

ServerRoot: C:\config

DocumentRoot: This is the location where all files to be delivered are stored.
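Following the pattern of the other parameters, a typical (illustrative) value is:

DocumentRoot "C:\web\home"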

Directory: This is a set of parameters that can be used to control the security and access of a physical location. It is commonly used to ensure users can't see files that they are not allowed to access. See the alias section for a detailed example:

<Directory "C:\web\home">

DirectoryIndex: This is the default filename to be used when an HTTP request does not specify a file.
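A typical (illustrative) value is:

DirectoryIndex index.html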

Listen: This is the TCP/IP port number that the HTTP server should listen on for requests. Port 80 is the default HTTP port, but Unix requires root to start the HTTP server on it. The standard default value for Unix is 7777.

Listen 80

Virtual hosts

A virtual host allows multiple DNS sites to use the one HTTP server. In the following example, the sites www.site1.com (configured previously) and www.site2.com serve different pages and directories, as the configuration parameter values point to different locations.
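The original configuration listing is not reproduced here; a sketch of such a setup (the document root paths are illustrative) might be:

```apache
NameVirtualHost *:80

<VirtualHost *:80>
    ServerName www.site1.com
    DocumentRoot "C:\web\site1"
</VirtualHost>

<VirtualHost *:80>
    ServerName www.site2.com
    DocumentRoot "C:\web\site2"
</VirtualHost>
```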

Apache rewrites

Most sites now have firewalls and dedicated servers for handling outside traffic coming to an internal site. As the HTTP packets are required to be rerouted, there is likely going to be a requirement to change the location of the server, its port number, and even the name of the DAD being used.
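As a hedged sketch, a mod_rewrite rule that forwards an externally visible path to an internal server on a different port and DAD name might look like the following; all hostnames, ports, and DAD names are illustrative, and the [P] flag requires mod_proxy to be loaded:

```apache
RewriteEngine On
# Requests arriving at /pls/public are proxied to the internal
# server, changing the port and the DAD name along the way.
RewriteRule ^/pls/public/(.*)$ http://internal-server:7777/pls/multimedia/$1 [P,L]
```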

External locations and security

When working with digital objects, it is inevitable that the database administrator will need to configure some type of external access to load external files into the database server. There are four available methods, each one using a very different loading mechanism.

Oracle directory

A number of external access points can be used when accessing external files. The DIRECTORY object is the one that Oracle Multimedia uses. Others include utl_file, and it is also possible to use Java to access directories directly. But let's first define an external directory that the multimedia schema can access.

The following is an example on Windows:

create or replace directory LOADING_DIR as 'C:\multimedia_files';

The following is an example on Unix:

create directory LOADING_DIR as '/u01/multimedia_files';

Granting access to a directory

Once created, access to the directory can be given using Oracle grants:

grant read on directory LOADING_DIR to multimedia;

Access is also implicitly given to any subdirectory. Access to the subdirectory requires appending the subdirectory location to the filename. So if the directory c:\multimedia_files\jpg existed, a program can access the jpg directory by prefixing jpg\ to the filename as follows:

crec.vimg.importfrom(ctx, 'file','LOADING_DIR', 'jpg\myimg.jpg');

UTL_FILE

Digital images can be loaded into the database and written back to disk using the supplied PL/SQL package called utl_file. This package has been bundled with the database since before the DIRECTORY concept was introduced. It was originally designed to enable files to be read into the database as well as written back to disk. With later versions of Oracle, support in the package was extended to reading files as bytes (raw data), enabling a digital image to be loaded into a BLOB without relying on the Oracle Multimedia methods. The exact physical location of the directory is specified:

UTL_FILE.FOPEN( '/u01/myimages','myfile.jpg','rb',32767);

The security is different from that of the DIRECTORY object and requires the configuration of the database parameter called utl_file_dir. It is a global setting and can't be fine-tuned using grants the way a DIRECTORY can. When utl_file_dir = * is set, a PL/SQL program can access any directory on the server. By repeating this parameter and specifying a physical directory each time, access can be fenced into those directories only:

utl_file_dir = c:\myimages
utl_file_dir = v:\imagedir

The downside of utl_file_dir is that this database parameter requires a complete database restart to come into effect. The DIRECTORY command does not have this limitation. Also, only the directories specified can be accessed; subdirectories cannot, so every directory has to be listed individually. If a large load across thousands of directories were to be run, configuring a utl_file_dir parameter for each one would prove impractical.

By Oracle 10g, Oracle had integrated support with the DIRECTORY object, allowing a configured directory to be referenced rather than specifying an external physical path:

UTL_FILE.FOPEN( 'LOADING_DIR','myfile.jpg','rb',32767);

UTL_TCP

Access to images isn't just limited to reading/writing from an external physical directory. Digital images can also be loaded into the database via FTP, HTTP, and SMTP. In Oracle 10 there was no security built into the database controlling ports, allowing any program to access a port and read/write to it using the supplied utl_tcp package.

In Oracle 11, Oracle tightened this up by introducing a basic concept of a firewall inside the database. Access to ports is disabled by default, and the DBMS_NETWORK_ACL_ADMIN package has to be called to give a schema access to a TCP/IP port.
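A minimal sketch of such a call follows; the ACL filename, schema name, host, and port are illustrative and would be replaced with site-specific values:

```sql
-- Run as SYS. Creates an ACL granting the MULTIMEDIA schema the
-- connect privilege, then assigns it to port 80 on a remote host.
begin
  dbms_network_acl_admin.create_acl(
    acl         => 'multimedia_acl.xml',
    description => 'Allow MULTIMEDIA to use TCP/IP',
    principal   => 'MULTIMEDIA',
    is_grant    => true,
    privilege   => 'connect');
  dbms_network_acl_admin.assign_acl(
    acl        => 'multimedia_acl.xml',
    host       => 'www.remote-site.com',
    lower_port => 80,
    upper_port => 80);
  commit;
end;
/
```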

Java

Java running inside the database can access TCP/IP ports and external files. Java code like the following requires special permission to access external files:
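The original Java listing is not reproduced here. As a hedged sketch, a helper class loaded into the database (for example with loadjava) might read an external file into a byte array like this; the class and method names are illustrative:

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ExternalFileReader {
    // Reads an external file into a byte array, suitable for passing
    // into a BLOB. When run inside the database, this works only after
    // the schema is granted java.io.FilePermission (shown next).
    public static byte[] readFile(String path) throws IOException {
        FileInputStream in = new FileInputStream(path);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[32767];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            in.close();
        }
    }
}
```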

The dbms_java routine is used to give a schema the privileges to access external resources. The following anonymous PL/SQL block, when run as SYS, will give the schema MULTIMEDIA access to run Java programs that can read and write to any external location:
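A minimal sketch of such a block follows; note that <<ALL FILES>> grants unrestricted file access, so in practice the permission would usually be narrowed to a specific directory:

```sql
-- Run as SYS: allow the MULTIMEDIA schema's Java code to read and
-- write any external file location.
begin
  dbms_java.grant_permission(
    'MULTIMEDIA',
    'SYS:java.io.FilePermission',
    '<<ALL FILES>>',
    'read,write');
end;
/
```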

Discussing Raid, SSD, SANs, and NAS

When working with unstructured data there are two key considerations that need to be taken into account. The first is that there is likely to be a large amount of data (possibly tens of terabytes) to be loaded in. This requires an infrastructure that can handle large contiguous writes. The hardware used must be able to support high speed storage of hundreds of megabytes of data in bursts.

The other is that reading the data will result in large reads, potentially concurrent, at possibly random locations across any part of the storage. If access is not random and the frequently used data is well known, the option of caching those well-known data items to improve performance must be considered. The I/O (or network) channel can become a bottleneck, and the best solution for performance is to support multiple channels.

Solid State Disk

A Solid State Disk (SSD) is the equivalent of persistent memory, in that if power is lost the contents are not lost. It is a form of flash memory, marketed primarily as a hard drive replacement. This hardware technology has grown in popularity in the last three years as the technology behind it has matured. With traditional memory, if the power is lost, so are the contents of the memory. An SSD behaves like a disk except that it has no moving parts, uses a lot less power, has high speed I/O, is more resistant to shock, is not susceptible to magnetic fields, and doesn't suffer from the overheating that a hard drive can. The SSD is quite thin (making it ideal for laptops). An SSD also has a low read latency (the time to find the data, which on a hard drive involves spinning the platters and moving the heads to a location) because all data in the SSD has equal access time. The SSD has a very high mean time between failures (MTBF) for reads, but most current models have a finite write endurance. Most have built-in error correction for when data can't be correctly written.

An SSD is ideal for applications with heavy reads but not large data volumes. Currently, the largest SSD is 512 GB, and it is anticipated the size will increase to over 2 TB in the next four years. The cost of an SSD is far greater than that of a disk. Target areas for SSDs are the operating system kernel, the database kernel, and any software installs. Read-only tablespaces are also well suited to SSDs, as are the SYSTEM and SYSAUX tablespaces (again, depending on what applications are running and how heavy the I/O load is).

An important factor when reviewing the use of an SSD is the performance of the disk controller it is plugged into. This is the hardware that enables the SSD to communicate with the CPU. Even though the SSD might offer high speed read/write times, the controller might be a bottleneck. An SSD in a NAS will only be useful if the network connection is high speed (for example, a 10 Gb Ethernet connection). Disk controllers for SANs are improving each year, and the latest models can offer improvements in performance when an SSD is used in place of a high speed disk. Depending on the motherboard, Peripheral Component Interconnect Express (PCIe) can offer very high speed throughput from the SSD to the CPU.

Though SSDs might seem ideally suited to improving redo log performance, using them for the REDO logs is not recommended, as the redo log is the most write-intensive structure in the database and the constant writes will likely shorten the SSD's life.

Unlike with a hard disk drive, mirroring an SSD will not reduce the likelihood of failure, because in a mirror the same level of writes happens to both drives, resulting in approximately the same failure time for each. Striping would be a more effective option.

Raid 0: stripe across both disks

This involves taking two (or more) disks and alternating the block writes between them, which evenly distributes all data across the disks. Using this method, reads and writes are typically twice as fast (three times faster if three disks are used, and so on). Raid 0 offers an ideal solution for improving disk read/write times, provided the disks are on separate I/O channels. Most motherboards with built-in Raid inherently support this.

The major disadvantage of Raid 0 is that if one disk is lost, all data is lost, even the data on the surviving disks. Raid 0 is therefore used in conjunction with Raid 1, or only when the database backups and archives are proven reliable.

Raid 1: mirror

This involves taking two (or more) disks of the same size. When data is written to one disk, the same data is written to the other disk. The write speed is as fast as the slowest disk. The read speed, though, can be twice as fast, as both disks can be utilized to read the data, each reading separate parts and working in tandem.

Mirroring is a preferred solution as it is highly reliable. If one disk fails, the other will continue and performance will not be affected. The damaged disk can be transparently replaced and then resilvered (brought back into sync) with the surviving disk.

Raid 1 is an ideal solution for storing key database tablespaces (for example, SYSTEM, UNDO) or for storing the redo logs. It's also ideal for storing the operating system, database kernel, and any other application software because read performance is improved.

Raid 0+1: stripe then mirror

This involves a minimum of four disks. Two disks are striped as per Raid 0 (making it appear as one logical disk, sometimes called a volume). The resultant disk is then mirrored with the other two drives.

This ensures the read/write performance is greatly improved, and if a disk is lost the system will continue. If one disk is lost, its entire stripe set goes offline and the mirrored copy carries on. Whether a second disk failure is catastrophic then depends on where it occurs: a disk in the already-failed stripe set changes nothing (the system continues on), while a disk in the surviving stripe set causes complete failure.

Raid 1+0: mirrors then stripe

This is also referred to as Raid 10. It involves a minimum of four disks. Two disks are mirrored together, creating a new logical disk, and the other two disks are mirrored together, also creating a new logical disk. These two resultant logical disks are then striped as per Raid 0. Writes are evenly split and then sent to the mirrored pairs. In Raid 0+1, writes are evenly distributed and sent to the striped disk, which is then mirrored; the entire stripe is mirrored. With Raid 1+0, as the striping is done first, one write goes to one mirrored pair and the other write goes to the other mirrored pair. This subtle change in ordering improves the reliability and performance of the Raid structure.

As with Raid 0+1, if one drive is lost the system continues to run, and the next drive to fail determines whether or not there is catastrophic failure. The reliability advantage of Raid 1+0 is seen when more than four drives are used.

Raid 5: parity check

This requires a minimum of three disks. A block of data is written to one disk, another block to the next disk, and a third disk holds parity information, which can be used to rebuild the block on the first or second disk if either is lost. The parity blocks are distributed evenly across all the disks.

This involves much slower writes as the parity information has to be calculated and written to disk. The parity is calculated using a logical xor function.
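The xor-based recovery can be sketched in a few lines of Java: the parity block is the xor of the data blocks, and a lost block is rebuilt by xor-ing the surviving block with the parity. The class and values below are purely illustrative:

```java
public class RaidParity {
    // parity = disk1 XOR disk2; any lost block is recovered by
    // XOR-ing the surviving block with the parity block.
    public static byte[] xor(byte[] a, byte[] b) {
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] disk1 = {10, 20, 30};
        byte[] disk2 = {7, 14, 21};
        byte[] parity = xor(disk1, disk2);     // written to the parity disk
        byte[] rebuilt = xor(disk2, parity);   // disk1 lost: rebuild it
        System.out.println(java.util.Arrays.equals(rebuilt, disk1));
    }
}
```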

The advantage of Raid 5 is a saving on disk space. Whereas Raid 1 requires one mirror disk for every disk, Raid 5 only requires one additional drive for three or more disks. Raid 5 can survive one disk failure; if a second disk fails, there is catastrophic failure.

Raid 6: double parity check

This option requires a minimum of four disks, as parity checks are now stored across two disks. Only when three disks are lost does catastrophic failure occur. This option provides an additional layer of safety that Raid 5 does not give, allowing the database administrator extra breathing space in the event of media failure.

Raid 5 and Raid 6 have in the last five years received a large amount of negative press regarding their use for storing an Oracle Database. The testing done and the conclusions drawn are that Raid 1+0 is better, more reliable, and faster. With individual disks now at the 3 TB size, it's feasible for a relational database to fit happily on a Raid 1+0 structure, even if each disk is only 1 TB in size (four such disks providing 2 TB of storage).

With unstructured data, the size of the database could grow into the tens to hundreds of terabyte range. As ideal as Raid 1+0 is, it just might not be cost effective to use it. If a large portion of the unstructured data is partitioned and made read only, a more realistic solution might be to use Raid 5 for the storage of all the partitions. Raid 6 might even be considered to be an option if backups are taken to tape and the time to restore from tape might take weeks. These structures might not give as good a performance as Raid 1+0, but will still prove to be highly reliable and storage scalable, allowing for large amounts of unstructured data to be stored, and even in the event of disk failure, no recovery is needed. If the Raid structure manages the parity data within specialized hardware controllers, the write overhead can be reduced further. As there are a large number of disks, the administrator might be in a position to mix up the Raid structure, possibly having a Raid 1+0 for the operating system and key database files, while using Raid 5 for the tablespaces containing multimedia data.

NAS

A Network Attached Storage (NAS) device is usually a low-cost computer acting as a fileserver. Access to it is via the network, and it typically uses a Raid disk structure to improve reliability. Storing the database on a NAS is not ideal, as the network connection will very quickly become a bottleneck (the exception is covered later). Tablespaces containing multimedia that is infrequently accessed could be stored on a NAS, but the database administrator would have to factor in and balance network speed.

With the recent introduction of 10 Gb Ethernet, the rules have changed regarding the NAS. Rather than the network being the bottleneck, attaching multiple 10 Gb Ethernet adapters enables a large amount of data to be passed at high speed from the disk system to the computer, potentially faster than a locally attached disk system. As this is new technology and still expensive, most hardware servers do not yet offer 10 Gb Ethernet. This of course will change in the next five years, enabling the NAS to become a popular high speed storage medium.

SAN

A Storage Area Network (SAN) is similar in some ways to a NAS; a SAN differentiates itself in that it can be treated by the server as a disk (offering high speed access). A SAN works at the block level and enables high speed replication (mirroring) to remote locations by quickly passing block changes to the remote server. A SAN is an ideal solution for storing large volumes of database files and then using the backup capabilities of the SAN to back up and recover the database, rather than using the database backup tools (such as rman). It has to be carefully stated that before a database administrator jumps in and relies solely on the SAN to perform backups, they should consult the vendor to confirm the capability is supported with Oracle, and in addition should run some high I/O tests and a trial recovery to ensure that a restore is actually possible.

A SAN can internally handle a large number of disks, and most vendors support adding new disks to the existing SAN enabling it to grow in size. For storing and managing multimedia digital objects, a SAN is an attractive solution.

The downside of a SAN is the efficiency with which it is hooked up to the computer server. If it uses only one channel, a large amount of I/O will result in the channel becoming a bottleneck that wouldn't normally be seen if the disks were distributed across multiple channels on the server.

Setting up Oracle XE to run Oracle Multimedia

Oracle XE has two major limitations that most database administrators would deem to be showstopper issues, ruling XE out for use with multimedia. The first issue is the 11 GB database storage limit. If the decision is made to store all digital images outside the database and just use a BFILE or HTTP pointer to reference them, a large amount of database storage remains available for relational data. An application with auditing can manage tens of thousands of images of any size. The administrator now has to manage backups of external files and deal with the potential for orphaned images, but can still manage a healthy volume of digital images using the remaining powerful features found in the database as well as PL/SQL.

The second major issue is that Oracle XE doesn't come with Java in the database and all the Oracle Multimedia methods are written in Java. This means there is no image processing, metadata extractions, watermarking, or identification. An experienced developer could write their own equivalent functions in PL/SQL, or use Open Source tools to achieve these capabilities. Also, not all digital image applications require use of these capabilities. Using the htp, utl_file, or dbms_lob routines (which do not require Java), an application can be written that uses the basic functions to develop a web-based application that manages and delivers multimedia.

Even with these limitations it's still possible to create the ORDSYS type definitions without the methods, so that the Oracle Multimedia object structure can be used. That way, the schema can be ported seamlessly to the standard or enterprise editions of the database.

The scripts to install the types can be found in the $ORACLE_HOME\ord\im\admin directory (not included with XE but included with a version installed with Multimedia). Scripts to review include the following:

ordsrcsp.sql

ordispec.sql

ordimssp.sql

ordisits.sql

ordaspec.sql

ordvspec.sql

orddspec.sql

Summary

There are numerous ways to configure an Oracle Database to store large volumes of unstructured data. Some of the key areas to review include the database block size, the size of the UNDO tablespace, the placement and sizes of the redo logs, and the configuration and extent size of the tablespaces used to store the data. Different architectural configurations are available based on the number of disks, CPUs, and available memory on the server. The use of solid state drives can improve load time performance.

Even though the Oracle XE database has no built in support for Java and Oracle multimedia, its usage should not be dismissed outright as a database for storing multimedia as its core architecture offers features and performance characteristics that still make it ideal to use.
