3 Number of Passes of External Sort N B=3 B=5 B=9 B=17 B=129 B= , , , ,000, ,000, ,000, ,000,000, Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 7 Internal Sort Algorithm Quicksort is a fast way to sort in memory. An alternative is tournament sort (a.k.a. heapsort ) Top: Read in B blocks Output: move smallest record to output buffer Read in a new record r insert r into heap if r not smallest, then GOTO Output else remove r from heap output heap in order; GOTO Top Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 8 More on Heapsort Fact: average length of a run in heapsort is 2B The snowplow analogy Worst-Case: What is min length of a run? How does this arise? Best-Case: B What is max length of a run? How does this arise? Quicksort is faster, but... Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 9

Sorting revisited How did we use a binary search tree to sort an array of elements? Tree Sort Algorithm Given: An array of elements to sort 1. Build a binary search tree out of the elements 2. Traverse

Heaps CSE 0 Winter 00 March 00 1 Array Implementation of Binary Trees Each node v is stored at index i defined as follows: If v is the root, i = 1 The left child of v is in position i The right child of

In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

Data storage Tree indexes Rasmus Pagh February 7 lecture 1 Access paths For many database queries and updates, only a small fraction of the data needs to be accessed. Extreme examples are looking or updating

Magento & Zend Benchmarks Version 1.2, 1.3 (with & without Flat Catalogs) 1. Foreword Magento is a PHP/Zend application which intensively uses the CPU. Since version 1.1.6, each new version includes some

DBMS / Business Intelligence, SQL Server Orsys, with 30 years of experience, is providing high quality, independant State of the Art seminars and hands-on courses corresponding to the needs of IT professionals.

C H A P T E R12 Query Processing Practice Exercises 12.1 Assume (for simplicity in this exercise) that only one tuple fits in a block and memory holds at most 3 blocks. Show the runs created on each pass

Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might

Virtual Memory COMP375 Computer Architecture and Organization You never know when you're making a memory. Rickie Lee Jones Design Project The project is due 1:00pm (start of class) on Monday, October 19,

Analysis of Algorithms I: Binary Search Trees Xi Chen Columbia University Hash table: A data structure that maintains a subset of keys from a universe set U = {0, 1,..., p 1} and supports all three dictionary

CS473 - Algorithms I Lecture 9 Sorting in Linear Time View in slide-show mode 1 How Fast Can We Sort? The algorithms we have seen so far: Based on comparison of elements We only care about the relative

Converting a Number from Decimal to Binary Convert nonnegative integer in decimal format (base 10) into equivalent binary number (base 2) Rightmost bit of x Remainder of x after division by two Recursive

Data Structures and Algorithm Analysis (CSC317) Intro/Review of Data Structures Focus on dynamic sets We ve been talking a lot about efficiency in computing and run time. But thus far mostly ignoring data

Computer Organization and Architecture Chapter 4 Cache Memory Characteristics of Memory Systems Note: Appendix 4A will not be covered in class, but the material is interesting reading and may be used in

Effective Java Programming measurement as the basis Structure measurement as the basis benchmarking micro macro profiling why you should do this? profiling tools Motto "We should forget about small efficiencies,

Dedicated Hosting The best of all worlds. Build your server to deliver just what you want. Only pay for what you use with no long term contracts. High availability, your server is in the cloud. Dedicated

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next

Summer Student Project Report Dimitris Kalimeris National and Kapodistrian University of Athens June September 2014 Abstract This report will outline two projects that were done as part of a three months

CIS 631 Database Management Systems Sample Final Exam 1. (25 points) Match the items from the left column with those in the right and place the letters in the empty slots. k 1. Single-level index files