
Keep relevant data closest to the CPU in memory once it has been read from disk. More memory reduces the need for costly “page-in” operations from disk by reducing the need to “page-out” data to make space for new data. Memory bus speed is still much slower than CPU bus speeds, often becoming a bottleneck as CPU speeds increase. It’s important to have the fastest memory speed and FSB that your chipset will support. More CPU cores allow you to parallelize workloads. A multithreaded database takes advantage of multi-processing by distributing a query into several threads across multiple CPUs, drastically increasing the query’s efficiency while reducing its processing time. Faster disks with high bandwidth and low seek times maximize read performance into memory for CPUs to process complex queries. OLAP databases benefit from this because they scan large datasets frequently. Using RAID allows you to aggregate disk I/O by striping data across several spindles, drastically decreasing the time it takes to read data into memory and write back onto the disks during commits, while also providing massive storage space, redundancy and fault tolerance.

Set realistic goals; know the hardware’s expected limitations. Measure current performance. Analyze the results (research upgrades and possible performance problems). Modify the system. Benchmark again. Repeat as needed.

Client issues a query across the network. The database server searches cache and memory for the database extents; if they’re not found in memory, they’re located on disk. Disks then seek out the blocks containing the database extents and begin loading the data into memory. Memory pages are then fed into the CPU’s cache and ultimately into the CPU for processing. Results are found and sent back to the client.

Each CPU has several cores. Internal Clock Speed: operations processed per second, in MHz or GHz (the advertised figure). External Clock Speed: the speed at which the FSB is accessed (the typical bottleneck). Memory Clock Speed: the speed at which RAM is given requests for data (another bottleneck). PostgreSQL is multi-process, with one Unix process per DB connection. A single connection can only use one CPU; it is not multithreaded.

CPU speed has increased roughly 70% each year; memory speed hasn’t kept up. DDR (double data rate) memory sends data to the CPU at both the top and bottom of the clock cycle, doubling throughput, yet memory remains a bottleneck. The memory tradeoff is typically speed for capacity: faster means less capacity and more expense. The further from the CPU you go, the slower and larger-capacity the storage. Disk is also the only permanent storage, and it holds swap.

Here you see how the Intel processors share a common FSB, dividing its bandwidth per CPU. Access to memory must happen at this reduced bandwidth, through the northbridge memory controller, and into the memory banks. AMD’s approach places a northbridge controller directly on each processor, so there are no external chipsets to deal with. Each processor features three point-to-point HyperTransport links, delivering 3.2 GB/s of bandwidth in each direction (6.4 GB/s full duplex). So AMD’s scalability was better in the earlier days of Xeon multiprocessing.

“Second Generation” (Harpertown) Xeon Processors E5200/E5400:
==========================================
Each CPU has a clock speed of 2GHz, 12MB of L2 cache, and a 1333MHz FSB (1600MHz max on other models). The read bandwidth for each DDR2 667MHz memory channel is 5.325 GB/s, which gives a total read bandwidth of 21.3 GB/s across four memory channels. Write bandwidth through the same four channels is 10.7 GB/s. Overall effective memory bandwidth is then 32 GB/s: 21.3 GB/s read plus 10.7 GB/s write.

5500-series "Nehalem-EP" (Gainestown) adds (December 2008):
=================================
Integrated memory controller supporting 2-3 DDR3 memory channels. A point-to-point processor interconnect called “QuickPath” (like AMD’s HyperTransport), bypassing the FSB. Hyper-threading, doubling the hardware threads per core.
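
As a sanity check on those figures (assuming the standard 64-bit, i.e. 8-byte, DDR2 channel width, which the slide doesn’t state explicitly):

$$ 667\ \mathrm{MT/s} \times 8\ \mathrm{B} \approx 5.3\ \mathrm{GB/s\ per\ channel}, \qquad 4 \times 5.3\ \mathrm{GB/s} \approx 21.3\ \mathrm{GB/s\ read} $$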

There are 3 delays associated with reading or writing to a hard drive: seek time, rotational delay, and transfer time.

Seek Time is the time it takes for the drive’s read/write head to be physically moved into the correct place for the data being sought. Rotational Delay is the time required for the addressed area of the disk to rotate into a position where it is accessible by the read/write head, typically measured in milliseconds. Transfer Time is the time it takes to transfer data from the disk through the read/write head, across the storage bus, and into memory for processing by the CPU.

Seek time and rotational delay are heavily influenced by the disk’s rotational speed (RPM), where the data sits on the platters, how many platters the disk has, and the diameter of the platters. Generally speaking, the faster a disk spins, the lower its seek times will be. Also, the closer data is to the outer circumference of the platter, the faster it will be sought and the lower its rotational delay will be.

Bandwidth/Throughput (Transfer Time): once data is located, this is the raw rate at which data is transferred from disk into memory. This can be aggregated using RAID, which will be discussed later. SATA-I bandwidth is 1.5Gb/s, which translates into ~150MB/s real speed. SATA-II and SAS bandwidth is 3Gb/s, which translates into ~300MB/s real speed. Generally speaking, the higher the data density of the platter, the more data is sent through the read/write head per block, resulting in higher throughput and lower transfer times.
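
A quick worked example for rotational delay: on average the platter must turn half a revolution before the addressed sector passes under the head, so

$$ t_{\mathrm{rot}} \approx \frac{1}{2} \cdot \frac{60}{\mathrm{RPM}}\ \mathrm{s} \;\approx\; 4.2\ \mathrm{ms\ at\ 7200\ RPM}, \qquad 2.0\ \mathrm{ms\ at\ 15{,}000\ RPM} $$

which is exactly why the 15,000RPM SAS drives discussed below halve this component of latency.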

Buffer/Cache: write-back cache. Data normally written to disk by the CPU is first written into the disk’s cache. This allows for higher write performance, with the risk that data stored in cache isn’t flushed to disk before a power failure. During idle machine cycles, the data are written from the cache onto the disk. Write-back caches improve performance because a write to the high-speed cache is faster than a write to normal RAM or disk; this cache helps address the disk-to-memory subsystem bottleneck. I’ve enabled write-back caching on all of our RAID arrays. RAID will be discussed later.

4. Track Data Density: Defines how much information can be stored on a given track. The higher the track data density, the more information the disk can store on one track. If a disk can store more data on one track, it does not have to move the head to the next track as often. This means that the higher the recording density, the lower the chances are that the head will have to be moved to the next track to get the required data.

RAID: “Redundant Array of Inexpensive Disks” (n = number of drives in array). RAID systems improve performance by allowing the controller to exploit the capabilities of multiple hard disks to get around the performance-limiting mechanical issues that plague individual hard disks. Different RAID implementations improve performance in different ways and to different degrees, but all improve it in some way.

RAID0 “Striping” (n): Fastest due to no parity; raw cumulative speed. A single drive failure causes the entire array to fail. “All-or-none.”
RAID1 “Mirroring” (n/2): Each drive is mirrored; speed and capacity are half of RAID0; requires an even number of disks. The entire source or mirror set can go bad before data is jeopardized.
RAID5 “Striping w/ Parity” (n-1): Fast, with one drive’s worth of capacity set aside for fault tolerance. Only one drive can fail before the array is lost.
RAID6 “Striping w/ Dual Parity” (n-2): Fast, with two drives’ worth of capacity set aside for fault tolerance. Two drives can fail before the array is lost.
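
Plugging those capacity factors into a hypothetical array of six 1TB drives:

$$ \mathrm{RAID0}: 6\ \mathrm{TB} \qquad \mathrm{RAID1}: 6/2 = 3\ \mathrm{TB} \qquad \mathrm{RAID5}: (6-1) = 5\ \mathrm{TB} \qquad \mathrm{RAID6}: (6-2) = 4\ \mathrm{TB} $$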

Normal PCI’s bandwidth is 132MB/s; AGP 8x is 2.1GB/s. PCI Express outperforms PCI significantly. PCIe is bidirectional/full-duplex, allowing data to flow in both directions simultaneously (doubling throughput):

PCIe x1 = 500MB/s (250MB/s each way)
PCIe x2 = 1GB/s (500MB/s each way)
PCIe x4 = 2GB/s (1GB/s each way)
PCIe x8 = 4GB/s (2GB/s each way)
PCIe x16 = 8GB/s (4GB/s each way)
Even PCIe x32 = 16GB/s (8GB/s each way)

Internally, you want to use PCIe, not plain PCI, which is old and slow in comparison. AGP is also obsolete since PCIe’s introduction; graphics cards now use this interface as well. Regular PCI is a bottleneck in modern computers.

All of our 2950 servers have PERC 6/i RAID controllers built in; the “i” means “integrated” on the motherboard. I found that our throughput was significantly slower than expected, despite having 6 SATA-II drives, even in RAID0. The settings we selected for the RAID virtual drives were: Stripe Element Size: 64KB; Read Policy: Adaptive Read-Ahead (to optimize large read operations); Write Policy: Write-Back. We were seeing read speeds in RAID5 of approximately 150-225MB/s across 4 drives, which we knew was way too slow given the hardware.

After rebuilding the array several times and searching around on the Internet, I came across Dell’s PERC firmware update site, which showed that a newer release was available, v6.2.0-0013, promising “performance enhancements including significant improvements in random-write performance, multi-threaded write performance, and reduction in maximum and average I/O response times.”

I couldn’t flash the PERC controllers without a floppy, so I had to create a bootable FreeDOS CD with the updated PERC firmware in a subdirectory, allowing me to successfully flash the controller’s BIOS. Later, I discovered that Dell’s OpenManage CD provides a tool to handle BIOS updates; however, I wasn’t able to get it working, so the FreeDOS solution worked out.

I also dug around and found that I could set filesystem read-ahead through “hdparm” in Linux, telling the OS to read ahead 2048 blocks whenever a read operation is performed (sketched below). I set this in /etc/rc.local to persist after a reboot.

Once the PERC controller was flashed and the Linux filesystem read-ahead was set, performance increased dramatically: we’re now seeing just over 500MB/s reads in RAID5/6. This significantly reduces the time it takes to load tables into memory for complex queries, thereby reducing overall query execution time. Performance is now on par with GreenPlum without having to pay $40,000/year in licensing.
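
A minimal sketch of that read-ahead change; the device name /dev/sda is a placeholder for whatever block device actually backs the array:

```sh
# Set filesystem read-ahead to 2048 sectors for the RAID block device
# (device name is illustrative; substitute your array's device).
hdparm -a 2048 /dev/sda

# Persist across reboots by appending the same command to /etc/rc.local:
echo 'hdparm -a 2048 /dev/sda' >> /etc/rc.local
```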


Database Types: The time and expense involved in retrieving answers from databases means that a lot of business-intelligence information often goes unused. The reason: most operational databases (OLTP) are designed to store your data, not to help you analyze it. The solution: an online analytical processing (OLAP) database, a specialized database designed to help you extract business-intelligence information from your data.

A connection from an application program to the PostgreSQL server has to be established. The parser stage checks the query transmitted by the application program for correct syntax and creates a query tree. The rewrite system takes the query tree created by the parser stage and looks for any rules (stored in the system catalogs) to apply to it, performing the transformations given in the rule bodies. The planner/optimizer takes the (rewritten) query tree and creates a query plan that will be the input to the executor. It does so by first creating all possible paths leading to the same result. For example, if there is an index on a relation to be scanned, there are two paths for the scan: a simple sequential scan, or using the index. The cost of executing each path is estimated, and the cheapest path is chosen. The executor recursively steps through the plan tree and retrieves rows in the way represented by the plan. The executor makes use of the storage system while scanning relations, performs sorts and joins, evaluates qualifications, and finally hands back the rows derived.
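
You can watch the planner make exactly this seq-scan-vs-index choice with EXPLAIN; a minimal illustration against a hypothetical table t with an index on id:

```sql
-- EXPLAIN shows the plan the planner/optimizer chose for this predicate:
-- an Index Scan if the index is cheaper, a Seq Scan otherwise.
EXPLAIN SELECT * FROM t WHERE id = 42;
```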

GreenPlum and PostgreSQL: We found that despite the claims above, GreenPlum was overpriced, slow, and problematic. Furthermore, our GreenPlum PSA database grew to exceed the hardware we had in place, requiring us to constantly delete old tables by hand. To replace GreenPlum while maintaining the table structures already in place, we opted to go with PostgreSQL, aware that it’s not pre-optimized for OLAP/data-warehouse applications. The mindset was that we could tweak PostgreSQL to match the actual performance we saw from GreenPlum without having to pay for an expensive license. Understanding how to do that requires an understanding of PostgreSQL’s query-execution characteristics and its configuration-file concepts.

max_connections sets the maximum number of client connections per server. Several performance parameters use max_connections as part of their tuning formulas.

Shared buffers: As the name implies, this is the shared memory dedicated to PostgreSQL’s buffer cache. Too much and you risk paging.

Working memory (work_mem): You need to consider what you set max_connections to in order to size this parameter correctly. This is a setting where data-warehouse systems, where users submit very large queries, can readily make use of many gigabytes of memory. This size is applied to each and every sort done by each user, and complex queries can use multiple working-memory sort buffers. Set it to 50MB with 30 users submitting queries, and you are soon using 1.5GB of real memory. Furthermore, if a query involves doing merge sorts of 8 tables, that requires 8 times work_mem.
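
That sizing rule as a rough upper-bound formula, using the numbers from the example above:

$$ \mathrm{peak\ sort\ memory} \approx \mathrm{connections} \times \mathrm{sorts\ per\ query} \times \mathrm{work\_mem} \;=\; 30 \times 1 \times 50\ \mathrm{MB} = 1.5\ \mathrm{GB} $$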

Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can provide several benefits (a minimal sketch follows this list):

Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. Partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.

When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of a sequential scan of that partition instead of using an index and random-access reads scattered across the whole table.

Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. ALTER TABLE is far faster than a bulk operation, and it entirely avoids the VACUUM overhead caused by a bulk DELETE.

Seldom-used data can be migrated to cheaper and slower storage media.
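
A minimal sketch of range partitioning as it is done in the PostgreSQL of this era (inheritance plus CHECK constraints); the table and column names are illustrative, not from our schema:

```sql
-- Parent table holds no rows itself; children carry the data.
CREATE TABLE measurement (
    logdate  date NOT NULL,
    reading  integer
);

-- One child per month; the CHECK constraint lets the planner skip
-- irrelevant partitions when constraint_exclusion is enabled.
CREATE TABLE measurement_2009m01 (
    CHECK (logdate >= DATE '2009-01-01' AND logdate < DATE '2009-02-01')
) INHERITS (measurement);

-- Bulk delete of a month: drop its partition. No DELETE, no VACUUM overhead.
DROP TABLE measurement_2009m01;
```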

Vacuuming ensures the databases remain ACID: Atomic, Consistent, Isolated, and Durable.
Atomicity: guarantees that either all of the tasks of a transaction are performed, or none.
Consistency: only valid data will ever be written to the database.
Isolation: other operations cannot access or see the data in an intermediate state during a transaction.
Durability: once the user has been notified of the transaction’s success, the transaction will persist and not be undone, thus surviving a system failure.
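
In practice this is a routine maintenance command; a minimal example (the table name is illustrative):

```sql
-- Reclaim dead row versions left behind by UPDATEs/DELETEs
-- and refresh the planner's statistics in one pass.
VACUUM ANALYZE measurement;
```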

Disk I/O was roughly 90MB/s. The database volume was constantly around 90% capacity, forcing Mike to manually delete tables; space was at a premium. Licensing with GreenPlum was expensive ($20,000 per 6 months, i.e. $40,000/yr.), so it made sense to leverage the underlying free open-source code and scrap the proprietary distributed DB solution.

Through a process of elimination and online research (Google and PostgreSQL forums) we have settled on the following settings in the PSA server’s configuration file (excerpted below):

max_connections = 25. Mike confirmed that only 10-15 PSA users ever really connect at any given time, so this setting allows for spikes while remaining conservative enough not to inflate work_mem, since work_mem uses max_connections in its memory-allocation formula.

shared_buffers = 4096MB. This number comes from best practices: 1/4 of total physical memory (16GB).
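
Those two values in postgresql.conf form (only the parameters quoted above; everything else is left at its default):

```
# postgresql.conf excerpt for the PSA server
max_connections = 25        # 10-15 typical users, headroom for spikes
shared_buffers  = 4096MB    # 1/4 of 16GB physical RAM, per best practice
```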

4.
Designing Information Structures for Performance and Reliability: Discussion Outline

Systematic tuning essentially follows these five steps:
1. Assess the problem and establish numeric values that categorize acceptable behavior. (Know the system’s specifications and set realistic goals.)
2. Measure the performance of the system before modification. (Benchmark)
3. Identify the part of the system that is critical for improving performance: the “bottleneck”. (Analyze)
4. Modify that part of the system to remove the bottleneck. (Upgrade/Tweak)
5. Measure the performance of the system after modification. (Benchmark)
Repeat steps 3-5 as needed. (Continuous Improvement)

5.
I. Database Server Performance: Data Flow Concepts

DB files are stored in the filesystem on disk in blocks.
A “job” is requested, initiating a “process thread”; associated files are read into memory “pages”.
Memory pages are read into the CPU’s cache as needed.
“Page-outs” to disk occur to make space as needed.
“Page-ins” from disk are what slows down performance.
Once in CPU cache, jobs are processed in threads per CPU (or “core”).

6.
I. Database Server Performance: Hardware & Operating System Considerations

Server Performance Considerations:

CPU: Each CPU has at least one core; each core processes jobs (threads) sequentially based on the job’s priority. Higher-priority jobs get more CPU time. Multi-threaded jobs are distributed evenly across all cores (“parallelized”).

Internal Clock Speed: operations the CPU can process internally per second, in MHz, as advertised.
External Clock Speed: speed at which the CPU interacts with the rest of the system, also known as the front-side bus (FSB).
Memory Clock Speed: speed at which RAM is given requests for data.

Important PostgreSQL performance note: PostgreSQL uses a multi-process model, meaning each database connection has its own Unix process. Because of this, all multi-CPU operating systems can spread multiple database connections among the available CPUs. However, if only a single database connection is active, it can only use one CPU. PostgreSQL does not use multi-threading to allow a single process to use multiple CPUs.

7.
I. Database Server Performance: Hardware & Operating System Considerations

Server Performance Considerations:

Memory Architecture and FSB (Front Side Bus): On Intel-based computers the CPU interfaces with memory through the “North Bridge” memory controller, across the FSB (front-side bus). FSB speed and the NorthBridge MMU (memory management unit) drastically affect the server’s performance, since they determine how fast data can be fed into the CPU from memory.

Unless special care is taken, a database server running even a simple sequential scan on a table will spend 95% of its cycles waiting for memory to be accessed. This memory-access bottleneck is even more difficult to avoid in more complex database operations such as sorting, aggregation and join, which exhibit a random access pattern. Database algorithms and data structures should therefore be designed and optimized for memory access from the outset.

8.
I. Database Server Performance: Hardware & Operating System Considerations

Intel “Xeon” based systems: Memory Access Challenges

The FSB is a fixed frequency and requires a separate chip to access memory.
Newer processors will run at the same fixed FSB speed; memory access is delayed by passing through the separate controller chip.
Both processors share the same front-side bus, effectively halving each processor’s bandwidth to memory and stalling one processor while the other is accessing memory or I/O.
All processor-to-system I/O and control must use this one path.
One interleaved memory bank serves both processors, again effectively halving each processor’s bandwidth to memory: half the bandwidth of a two-memory-bank architecture.
All program access to graphics, PCI(e), PCI-X or other I/O must pass through this bottleneck.

15.
I. Database Server Performance: Hardware & Operating System Considerations

Disk Considerations (secondary storage):

Seek Time/Rotational Delay: how fast the read/write head is positioned appropriately for reading/writing, and how fast the addressed area is placed under the read/write head for data transfer.
SATA (Serial Advanced Technology Attachment) drives are cheap and come in sizes up to 2.5TB, typically maxing out at 7200RPM (the “VelociRaptor” is the exception at 10,000RPM).
SAS (Serial Attached SCSI) drives are twice as fast (15,000RPM) and typically twice as expensive, with roughly 1/5 the max capacity of SATA (~450GB).

Bandwidth/Throughput (Transfer Time): the raw throughput rate at which data is transferred from disk into memory. This can be aggregated using RAID, which will be discussed later.
SATA-I bandwidth is 1.5Gb/s, which translates into ~150MB/s real speed.
SATA-II and SAS bandwidth is 3Gb/s, which translates into ~300MB/s real speed.

16.
I. Database Server Performance: Hardware & Operating System Considerations

Disk Considerations (secondary storage):

Buffer/Cache: Disks contain intelligent controllers, read cache and write cache. When you ask for a given piece of data, the disk locates the data and sends it back to the motherboard. It also reads the rest of the track and caches this data on the assumption that you will want the next piece of data on the disk. This data is stored locally in its read cache. If, sometime later, you request the next piece of data and it is in the read cache, the disk can deliver it with almost no delay.

Write-back cache improves performance, because a write to the high-speed cache is faster than writes to normal RAM or disk; this cache aids in addressing the disk-to-memory subsystem bottleneck. Most good drives feature a 32MB buffer cache.

17.
I. Database Server Performance: Hardware & Operating System Considerations

Disk Considerations:

4. Track Data Density: Defines how much information can be stored on a given track. The higher the track data density, the more information the disk can store. If a disk can store more data on one track, it does not have to move the head to the next track as often. This means that the higher the recording density, the lower the chances are that the head will have to be moved to the next track to get the required data.

18.
I. Database Server Performance: Hardware & Operating System Considerations

Disk Considerations:

5. RAID (n = number of drives in array): “Redundant Array of Inexpensive Disks”. Pools disks together to aggregate their throughput by “striping” data in segments across each disk. Also provides fault tolerance.

RAID0 “Striping” (n): Fastest due to no parity; raw cumulative speed. A single drive failure causes the entire array to fail. “All-or-none.”
RAID1 “Mirroring” (n/2): Each drive is mirrored; speed and capacity are half of RAID0; requires an even number of disks. The entire source or mirror set can go bad before data is jeopardized.
RAID5 “Striping w/ Parity” (n-1): Fast, with one drive’s worth of capacity set aside for fault tolerance. Only one drive can fail before the array is lost.
RAID6 “Striping w/ Dual Parity” (n-2): Fast, with two drives’ worth of capacity set aside for fault tolerance. Two drives can fail before the array is lost.

20.
I. Database Server Performance: Hardware & Operating System Considerations

Filesystem Considerations

As an easy performance boost with no downside, make sure the file system on which your database is kept is mounted "noatime", which turns off the access-time bookkeeping.
XFS is a 64-bit filesystem and supports a maximum filesystem size of 8 binary exabytes minus one byte. On 32-bit Linux systems, XFS is “limited” to 16 binary terabytes.
Journal updates in XFS are performed asynchronously to prevent a performance penalty.
Files and directories in XFS can span allocation groups, and each allocation group manages its own inode tables (unlike EXT3/EXT2), providing scalability and parallelism: multiple threads and processes can perform I/O operations on the same filesystem simultaneously.
On a RAID array, a “stripe unit” can be specified within XFS at creation time. This maximizes throughput by aligning inode allocations with RAID stripe sizes. (A sketch of both settings follows.)
XFS provides a 64-bit sparse address space for each file, which allows both for very large file sizes and for holes within files for which no disk space is allocated.
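
A minimal sketch of both settings; the device name, mountpoint, and stripe-width figure are illustrative assumptions, not our actual layout:

```sh
# Create XFS aligned to the RAID stripe: su = stripe element size,
# sw = number of data-bearing disks (e.g. 64KB stripe, 5 data disks in RAID5).
mkfs.xfs -d su=64k,sw=5 /dev/sdb1

# Mount the database filesystem with noatime; the matching /etc/fstab
# entry makes it permanent:
mount -o noatime /dev/sdb1 /var/lib/pgsql
#   /dev/sdb1  /var/lib/pgsql  xfs  noatime  0 0
```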

21.
I. Database Server Performance: Hardware & Operating System Considerations

Takeaways from Hardware Performance Concepts:

Keep relevant data closest to the CPU in memory once it has been read from disk.
More memory reduces the need for costly “page-in” operations from disk by reducing the need to “page-out” data to make space for new data.
Memory bus speed is still much slower than CPU bus speeds, often becoming a bottleneck as CPU speeds increase. It’s important to have the fastest memory speed and FSB that your chipset will support.
More CPU cores allow you to parallelize workloads. A multithreaded database takes advantage of multi-processing by distributing a query into several threads across multiple CPUs, drastically increasing the query’s efficiency while reducing its processing time.
Faster disks with high bandwidth and low seek times maximize read performance into memory for CPUs to process complex queries. OLAP databases benefit from this because they scan large datasets frequently.
Using RAID allows you to aggregate disk I/O by striping data across several spindles, drastically decreasing the time it takes to read data into memory and write back onto the disks during commits, while also providing massive storage space, redundancy and fault tolerance.

24.
II. Software & Application Considerations: OLAP and OLTP

Database Types:

OLAP (Online Analytical Processing):
OLAP databases should only receive historical business data and remain isolated from OLTP (transactional) databases: summaries, not transactions.
Data in OLAP databases never changes; OLTP data constantly changes.
OLAP databases typically contain fewer tables, arranged into a “star” or “snowflake” schema.
The central table in this star schema is called the “fact table”. The leaf tables are called “dimension tables”. The facts within a dimension table are called “members”. The joins between the dimension and fact tables allow you to browse through the facts across any number of dimensions.
The simple design of the star schema makes it easier to write queries, and they run faster. An OLTP database could involve dozens of tables, making query design complicated; in addition, the resulting query could take hours to run.
OLAP databases make heavy use of indexes because they help find records in less time. In contrast, OLTP databases avoid them because they lengthen the process of inserting data.

25.
II. Software & Application Considerations: OLAP and OLTP

Database Types:

OLAP (Online Analytical Processing):
The process by which OLAP databases are populated is called Extract, Transform, and Load (ETL). No direct data entries are made into an OLAP database, only summative bulk ETL transactions.
A cube aggregates the facts at each level of each dimension in a given OLAP schema. Because the cube contains all of the data in an aggregated form, it seems to know the answers to queries in advance. This arrangement of data into cubes overcomes a limitation of relational databases.

34.
“In-database” compression, 3-10x disk space reduction, with corresponding I/O improvement.
License was $20,000 every 6 months ($40,000/yr.).
It’s important to note that PostgreSQL is free and can be modified to perform similarly to GreenPlum. We did just that with our PSA server reconstruction project.

35.
II. Software & Application Considerations: PostgreSQL Tweaks

PostgreSQL tweaks explained:

PostgreSQL is tweaked through a configuration file called “postgresql.conf”. This flat file contains several dozen parameters which the master PostgreSQL service, “postmaster”, reads at startup. Changes made to this file require the “postgresql” service to be bounced (restarted) via the command, as root: “service postgresql restart”.

Corresponding postgresql.conf parameters affecting query performance:

Maximum Connections (max_connections): Determines the maximum number of concurrent connections to the database server. Keep in mind that this figure is used as a multiplier for work_mem.
Shared Buffers (shared_buffers): Determines how much memory is dedicated to PostgreSQL for caching data. If you have a system with 1GB or more of RAM, a reasonable starting value for shared_buffers is 1/4 of the memory in your system.
Working Memory (work_mem): If you do a lot of complex sorts and have a lot of memory, increasing the work_mem parameter allows PostgreSQL to do larger in-memory sorts which, unsurprisingly, will be faster than their disk-based equivalents.

36.
II. Software & Application Considerations: PostgreSQL Tweaks

PostgreSQL tweaks explained:

Shared Buffers
PostgreSQL does not directly change information on disk. Instead, it requests data be read into the PostgreSQL shared buffer cache. PostgreSQL backends then read/write these blocks, and finally flush them back to disk.
Backends that need to access tables first look for needed blocks in this cache. If they are already there, the backend can continue processing right away. If not, an operating system request is made to load the blocks, either from the kernel disk buffer cache or from disk. These can be expensive operations.
The default PostgreSQL configuration allocates 1000 shared buffers; each buffer is 8 kilobytes. Increasing the number of buffers makes it more likely backends will find information in cache... to a limit. The change can be made with a postmaster command-line flag or by changing the value of shared_buffers in postgresql.conf.
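
One way to see how well the buffer cache is working is the per-table hit ratio from the standard pg_statio_user_tables statistics view; a minimal sketch:

```sql
-- Rough buffer-cache hit ratio per table: hits vs. blocks that had to
-- be requested from the OS. Low ratios on hot tables suggest more
-- shared_buffers could help.
SELECT relname,
       heap_blks_hit::float
         / NULLIF(heap_blks_hit + heap_blks_read, 0) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY hit_ratio;
```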

37.
II. Software & Application Considerations: PostgreSQL Tweaks

PostgreSQL tweaks explained:

Shared Buffers: “How much is too much?”
Setting shared_buffers too high results in expensive “paging”, which severely degrades the database’s performance.
If everything doesn’t fit in RAM, the kernel starts forcing memory pages out to a disk area called swap, moving pages that have not been used recently. This operation is called a swap pageout. Pageouts are not a problem, because they happen during periods of inactivity.
What is bad is when these pages have to be brought back in from swap, meaning an old page that was moved out to swap has to be moved back into RAM. This is called a swap pagein. This is bad because while the page is moved from swap, the program is suspended until the pagein completes. (A quick way to watch for this follows.)
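
A quick way to watch for pageins on Linux while the database is under load; sustained non-zero “si” suggests shared_buffers or work_mem is sized too aggressively:

```sh
# Print memory/swap statistics once per second; the 'si' column is pages
# swapped in per second, 'so' is pages swapped out.
vmstat 1
```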

38.
II. Software & Application Considerations: PostgreSQL Tweaks

PostgreSQL tweaks explained:

Horizontal “Range” Partitioning:
Also known as “sharding”, this involves putting different rows into different tables for improved manageability and performance.
Benefits of partitioning include:
Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. Partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.
When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of a sequential scan of that partition instead of using an index and random-access reads scattered across the whole table.
Seldom-used data can be migrated to cheaper and slower storage media.

39.
II. Software & Application Considerations: PostgreSQL Tweaks

PostgreSQL tweaks explained:

Partitioning (cont.)
The benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.
The following forms of partitioning can be implemented in PostgreSQL:
Range Partitioning (aka “Horizontal”): the table is partitioned into "ranges" defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. For example, one might partition by date ranges, or by ranges of identifiers for particular business objects.
List Partitioning: the table is partitioned by explicitly listing which key values appear in each partition. (A sketch follows.)
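
A minimal sketch of list partitioning using the same inheritance-plus-CHECK approach shown earlier for ranges; table, column, and key values are illustrative:

```sql
-- Each child holds an explicit list of key values, enforced by CHECK
-- so the planner can exclude irrelevant partitions.
CREATE TABLE orders (
    region  text NOT NULL,
    total   numeric
);

CREATE TABLE orders_east (
    CHECK (region IN ('NY', 'NJ', 'CT'))
) INHERITS (orders);

CREATE TABLE orders_west (
    CHECK (region IN ('CA', 'OR', 'WA'))
) INHERITS (orders);
```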