Volume Configuration Guide

(For Minimizing Disk I/O Contention)

Overview


When a database processes a transaction, it generally records a redo log first, following the WAL (Write-Ahead Logging) protocol, so that the change can be recovered later. Only after the redo log has been recorded is the data actually changed, saved or deleted. Both redo logs and data require persistent physical storage, which can be allocated either on the same disk or on physically separate disks.

However, the user should note that significant disk I/O contention may occur if redo logs and data are placed on the same disk while numerous transactions are being processed.

This guide describes how Altibase writes redo logs and data and how to configure disk volumes to minimize disk I/O contention.

This guide is up to date as of Altibase version 6.5.

What Causes Disk I/O


Significant disk I/O contention may occur if the same physical disk is used to write redo logs and store memory table datafiles and disk table datafiles.

Redo Log

Altibase’s redo logs record all changes made by transactions and are crucial for data recovery.

Redo logs contain information about any changes made to data, as well as changes made to any resources required to handle transactions.

Altibase’s redo logs are first written to a memory-mapped (mmap) area and are then periodically flushed to the redo log file by the LogSyncThread. The changes held in the memory-mapped area are guaranteed to reach the file even if the database process terminates abnormally. Therefore, the log file cannot be lost except in the case of a power failure or an OS hang (if this issue is encountered, please refer to the Altibase Troubleshooting Guide).

This method is used by default in order to minimize frequent disk I/O and maximize performance. However, the redo log writing method can be changed if high transaction performance is not required.

Check Point

As a hybrid database, Altibase provides support for both in-memory tables and disk tables. In this section, activities that incur disk I/O for memory tables will be discussed.

During the startup stage, the data contained in memory tables must be loaded from disk. The storage space for in-memory data is managed in pages, each with a fixed size of 32K. Each page is divided into slots according to the table's record size, and each slot stores a record.

When data is changed, saved or deleted by a transaction, the page holding the modified data is registered to an internally managed list of dirty pages. The process of saving these dirty pages to physical storage on disk is referred to as a checkpoint.

Due to the fact that memory is a volatile storage medium, the checkpoint process is necessary to provide data durability in the case of situations such as power failures.

As the number of transactions being processed by in-memory tables increases, the number of dirty pages that must be flushed to disk by the checkpoint process will increase as well. If redo logs and in-memory table datafiles are located on the same physical disk, this load may cause disk I/O contention. In order to avoid any possible performance degradation, it is highly recommended to store redo logs and datafiles on separate physical disks.

For example:

Classification    Disk Configuration
Redo Logs         /ALTIBASE_REDO_LOG
Datafiles         /ALTIBASE_DATA
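
The directory names above can then be referenced from Altibase’s configuration. As a minimal sketch, the redo log and memory datafile locations might be pointed at the separate disks through altibase.properties; the property names below (LOG_DIR, ARCHIVE_DIR, MEM_DB_DIR) and the subdirectory layout are assumptions, so verify them against the manual and the altibase.properties file shipped with your version:

  • LOG_DIR     = /ALTIBASE_REDO_LOG/logs          # redo log file location (assumed property name)
  • ARCHIVE_DIR = /ALTIBASE_REDO_LOG/archive_logs  # archived redo log location (assumed property name)
  • MEM_DB_DIR  = /ALTIBASE_DATA/dbs               # memory table datafile location (assumed property name)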

On-Disk DB

In the previous section, we discussed the process of storing redo logs and causes for disk I/O for in-memory tables. This section will describe causes for disk I/O for disk tables, as well as configuration recommendations.

Because disk tables keep all of their data in files on disk, queries against them would otherwise have to access the physical disk every time a transaction is executed, which is extremely costly from a disk I/O perspective. Therefore, virtually every disk-based DBMS creates and utilizes a temporary in-memory storage area commonly referred to as a buffer. By keeping frequently used data in the buffer, disk I/O costs can be reduced significantly.

The size of this memory buffer is configured by the user, and the buffer is managed in pages of 8K.
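
The buffer size itself is set through a property in altibase.properties; the property name and value below are an assumption based on recent Altibase versions and only illustrate the idea, so check the manual for the exact property in your release:

  • BUFFER_AREA_SIZE = 134217728   # total disk buffer size in bytes (128 MB here); assumed property name, illustrative value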

When a query is executed against a disk table, Altibase first searches the buffer area to determine whether it already contains the requested data. If it does not, Altibase loads the data from the physical datafile. Any modified pages are registered to the Flush List, and when a flush occurs the pages registered to this list are written to disk. If the buffer area lacks sufficient space, the LRU algorithm identifies rarely accessed pages and writes them to disk; this process is also referred to as BufferReplace.

If the datafiles for in-memory tablespaces and disk tablespaces are allocated to the same physical disks, performance may degrade if frequent checkpoints and BufferReplaces cause excessive disk I/O.

Therefore, for maximum performance the user should always allocate memory and disk datafiles to separate physical disks.

For example:

Classification             Disk Configuration
Redo Logs                  /ALTIBASE_REDO_LOG
Memory Table Datafiles     /ALTIBASE_MEMORY_DATA
Disk Table Datafiles       /ALTIBASE_DISK_DATA

Undo TableSpace

When a transaction modifies data in a memory table, an undo image is maintained in memory for recovery purposes: the original data is copied to a separate area, and the update is applied out of place to the copied data. This approach is known as MVCC (Multi-Version Concurrency Control).

In contrast, disk tables copy the original data to the undo tablespace. The transaction then modifies the data located in the original location. This method is known as an in-place update.

For recovery, the original image stored in the undo tablespace is copied back to its original location. However, it is important to note that the undo tablespace is updated constantly as disk tables process transactions. Therefore, the user should consider placing disk table datafiles and undo tablespace datafiles on separate physical disks to prevent disk I/O related performance issues.

For example:

Classification               Disk Configuration
Redo Logs                    /ALTIBASE_REDO_LOG
Memory Table Datafiles       /ALTIBASE_MEMORY_DATA
Disk Table Datafiles         /ALTIBASE_DISK_DATA
Undo Tablespace Datafiles    /ALTIBASE_DISK_UNDO

Recommendations for Effective Data File Configuration


As described in the previous sections, significant disk I/O contention may occur if the same physical disk is used to write redo logs and store memory table datafiles and disk table datafiles. This issue becomes most apparent when the system is under significant load. Redo logs, memory table datafiles and disk table datafiles should be placed on separate physical disks to reduce the chance of disk I/O related performance degradation.

Disk I/O

[Figure 1: Disk I/O contention when redo logging, checkpointing and buffer management share one disk]

As depicted visually above, avoiding disk bottlenecks is difficult if redo log writing, checkpointing and buffer management all occur on a single disk.

Configuration Example (1)

Adhering to the configuration below is highly recommended. If this configuration is not feasible due to your current system environment, at a bare minimum the redo logs should be segregated onto a separate physical disk.

Classification               Disk Configuration
ALTIBASE HOME                /ALTIBASE
Redo Logs                    /ALTIBASE_REDO_LOG
Memory Table Datafiles       /ALTIBASE_MEMORY_DATA
Disk Table Datafiles         /ALTIBASE_DISK_DATA
Disk Indexes                 /ALTIBASE_DISK_INDEX
Undo Tablespace Datafiles    /ALTIBASE_DISK_UNDO
  • ALTIBASE_HOME is not reserved solely for the binaries, headers, libraries and other files needed for development and operation; it also stores critical trace files, and should therefore be allocated its own physical disk for maximum performance.
  • There is no need to allocate a separate physical disk for memory indexes, because memory indexes are rebuilt in memory after the data is loaded during Altibase’s STARTUP phase; changes to memory indexes are not separately logged to disk.
  • Placing disk table datafiles and disk indexes on separate physical disks is recommended to minimize the disk I/O contention that can occur from changes such as datafile expansion.
  • This document does not describe disk I/O considerations related to backup procedures. Please refer to the Backup/Recovery document for more information.
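
At the operating-system level, Configuration Example (1) simply amounts to giving each classification its own mount point. The sketch below prepares such a layout, assuming each path is the mount point of a separate physical disk and that the instance runs under an OS account named altibase (both assumptions about your environment):

    # create one directory per classification, each backed by its own physical disk
    mkdir -p /ALTIBASE_REDO_LOG /ALTIBASE_MEMORY_DATA /ALTIBASE_DISK_DATA /ALTIBASE_DISK_INDEX /ALTIBASE_DISK_UNDO
    # allow the Altibase OS account to write to them
    chown altibase:altibase /ALTIBASE_REDO_LOG /ALTIBASE_MEMORY_DATA /ALTIBASE_DISK_DATA /ALTIBASE_DISK_INDEX /ALTIBASE_DISK_UNDO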

Configuration Example (2)

If physical disk availability is limited, the following configuration is recommended as a bare minimum.

Classification                     Disk Configuration
Redo Logs                          /ALTIBASE_REDO_LOG
ALTIBASE HOME and All Datafiles    /ALTIBASE

This configuration is also reasonable if Altibase’s hybrid functionality is not utilized: when only memory tables or only disk tables are used, the risk of disk I/O contention is reduced.

Configuration Example (3)

If memory tables are used sparingly and the vast majority of data and processing is handled by disk tables, the following configuration can be considered.

Classification                              Disk Configuration
Redo Logs                                   /ALTIBASE_REDO_LOG
ALTIBASE HOME and Memory Table Datafiles    /ALTIBASE
Disk Table Datafiles 1 (Complex Tasks)      /ALTIBASE_DISK_COMPLEX
Disk Table Datafiles 2 (Simple Tasks)       /ALTIBASE_DISK_SIMPLE

Placing tablespaces that regularly process complex queries and tablespaces that typically process simple queries on separate physical volumes is an effective method of dispersing disk I/O. However, if the environment frequently executes BufferReplace processes, this configuration may suffer from performance degradation.

File System


This section describes the characteristics and configuration of the file systems supported by Altibase.

Supported File Systems

Altibase supports most file systems, with the exception of those that do not support mmap or direct I/O.

To utilize direct I/O, it may be necessary to change the mount options of the file system in question. If the file system does not support direct I/O at all, the properties of Altibase itself must be changed instead (so that buffered I/O is used). Please refer to the manual or the altibase.properties file for more information regarding the relevant settings.

OS         File System       Characteristics
Solaris    UFS               To use Direct I/O, mount options must be changed.
           VxFS
           ZFS               Direct I/O is not supported. DB properties must be changed.
HP         HFS
           JFS               To use Direct I/O, mount options must be changed.
           VxFS              To use Direct I/O, mount options must be changed.
AIX        JFS
           VxFS
Windows    NTFS
           FAT32
Linux      Ext2/Ext3/Ext4

Unsupported File Systems

Attempting to run Altibase on the following filesystems may result in issues due to their lack of support for mmap or direct I/O.

  • Raw storage devices: Altibase cannot access files placed on a raw storage device. The purpose of using a raw device is to let the database control the OS’s file caching directly; for this purpose, Altibase can use a file system that supports direct I/O instead of a raw device.
  • Some NFS (Network File System) and NAS (Network Attached Storage) configurations: errors may occur when Altibase attempts to create a datafile and/or logfile on an NFS/NAS file system that lacks support for mmap.

Disk I/O Optimization


The performance of Altibase is closely tied to disk I/O performance. This section describes several methods of improving disk I/O performance.

Striping

Striping is a method used to distribute and store file blocks across multiple disks. It dramatically improves concurrent file input/output performance.

The chosen striping method has a great impact on overall disk performance. In general, either RAID 0 + 1 (Striping + Mirroring) or RAID 5 is used for speed and stability.

The chosen RAID configuration method will also impact the number of physical disks that are required. A storage expert should be consulted to ensure that the appropriate configuration is chosen for the database’s size and number of available disks.
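
As a concrete illustration only, the commands below create a striped and mirrored (RAID 10) volume with Linux software RAID and mount it for disk table datafiles; hardware RAID controllers or a volume manager are equally valid, and the device names and mount point are assumptions about your environment:

    # build a RAID 10 array (striping + mirroring) from four disks
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
    # create a file system on the array and mount it where the disk table datafiles will live
    mkfs.ext4 /dev/md0
    mount /dev/md0 /ALTIBASE_DISK_DATA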

Change OS File Cache Setup

A proper file cache setup can prevent the memory used by Altibase from being swapped out, minimizing the performance degradation caused by swapping-related disk I/O delays at the OS layer.

The file cache is a system buffer managed by the operating system to relieve the bottleneck caused by the speed difference between main memory and secondary storage. Each operating system manages its file cache according to its own policy, which is usually closely related to its swapping policy. Swapping is useful for running an application or handling a datafile that cannot fit entirely in main memory, but it delays disk I/O at the OS layer and can cause inconsistent database performance; in severe cases, swapping may cause the database itself to hang. Therefore, it is important to take swapping into consideration.

To guarantee consistent response times from Altibase, the file cache and the kernel's swapping behavior should be configured in advance so that swapping is minimized.
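
As one example, on Linux the kernel’s willingness to swap and to let dirty file cache pages accumulate can be tuned with sysctl. The parameters below are standard Linux kernel settings, but the values are purely illustrative; the per-OS setup guides listed below remain the authoritative reference:

    # reduce the kernel's preference for swapping out application memory
    sysctl -w vm.swappiness=1
    # flush dirty file cache pages to disk earlier and in smaller batches (illustrative values)
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=10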

Please refer to the documents below for information on how to configure the cache properly:

  • HPUX Setup Guide for Altibase
  • AIX Setup Guide for Altibase
  • Solaris Setup Guide for Altibase
  • Linux Setup Guide for Altibase

Direct I/O

The OS file system maintains a memory area referred to as the file buffer cache. As shown in the buffered I/O diagram below, it improves disk access performance by caching the blocks accessed during file I/O.

[Figure 2: Buffered I/O through the OS file buffer cache]

However, when the database also caches data at the application level in its own buffer cache, overhead is incurred because data must first move from disk into the OS file buffer cache before being copied again into the database's buffer cache. This process is known as "double copying" and results in increased CPU and memory consumption.

In this situation, direct I/O can reduce database CPU and memory utilization because direct I/O does not pass through the OS’s file cache. This reduction in CPU and memory utilization can result in improved overall performance.

To allow Altibase to input/output datafiles and log files using direct I/O, Altibase’s properties must be configured as follows:

  • DIRECT_IO_ENABLED = 1   # 0: Buffered I/O, 1: Direct I/O
  • DATABASE_IO_TYPE  = 1   # 0: Buffered I/O, 1: Direct I/O
  • LOG_IO_TYPE       = 1   # 0: Buffered I/O, 1: Direct I/O

Some operating systems or file systems do not support direct I/O for files, or lack support for direct I/O at the application level; in these cases, additional configuration changes may be needed before direct I/O can be used. The mount options used to enable direct I/O for specific operating systems and file systems are shown in the following table:
OS                             File System       Required Action
Solaris                        UFS               None
                               VxFS              Mount with convosync=direct
                               ZFS               Direct I/O is not supported
HP                             HFS               None
                               JFS               None
                               VxFS              Mount with convosync=direct
AIX                            JFS               Mount with -o dio
                               VxFS              Mount with convosync=direct
Windows                        NTFS              None
                               FAT32             None
Linux (kernel 2.4 or later)    Ext2/Ext3/Ext4    None
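
For instance, the VxFS and AIX JFS rows above correspond roughly to mount invocations like the following; the device names and mount point are placeholders, and the exact syntax should be confirmed in your operating system documentation:

    # Solaris/HP-UX VxFS: enable direct I/O through the convosync mount option
    mount -F vxfs -o convosync=direct /dev/vx/dsk/datadg/datavol /ALTIBASE_DISK_DATA
    # AIX JFS/JFS2: enable direct I/O with the dio mount option
    mount -o dio /dev/fslv00 /ALTIBASE_DISK_DATA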

When to use Direct I/O

When the database size exceeds the amount of available system memory and the disk buffer size is large, the use of direct I/O is often beneficial.

If the database size is large and large amounts of changes are made frequently, large amounts of disk I/O activity will occur during processes such as checkpoints. If direct I/O is not used, the duplicated copies of data in the OS file cache and the DB buffer cache may result in excessive CPU and memory utilization. If this situation occurs regularly, the use of direct I/O should be considered.

When to use Buffered I/O

In most cases, buffered I/O is preferable for performance because buffered I/O supports multi-block reads. Buffered I/O also pre-fetches required disk pages to improve input and output speed.

Page size

Altibase uses the term page size for what is commonly referred to elsewhere as block size. Altibase's page size is fixed at 32K for memory tables and 8K for disk tables; these page sizes cannot be changed.

In addition, changing the operating system’s block size to match Altibase’s page size is not recommended. Benchmark testing has shown that mismatches between Altibase’s page size and the operating system’s block size have no measurable impact on overall performance.

Copyright ⓒ 2000~2016 Altibase Corporation. All Rights Reserved.

This document is for informational purposes only. The information contained herein is not warranted to be error-free and is subject to change without notice. Decisions pertaining to Altibase's product characteristics, features and development roadmap are at the sole discretion of Altibase. Altibase may own related patents, trademarks, copyrights or other intellectual property rights in the products and/or features discussed in this document.