Computer Science 4611
Database Management Systems

Introduction

In programming assignments 2-5, you will implement a simple DBMS.
For an overview of how you will construct the DBMS, read
this page.
For this assignment you will implement some of the query oriented operations
of the query language provided for you.
You will use the previously provided code for the Disk Space Manager, your
code for the Buffer Manager layer, your code for Heap Files, and your code
for the System Catalog from previous assignments.

You can find your team
assignment here.
You should begin by reviewing the book information on implementing query
operators.
This material has also be covered in class.
Most of the code to integrate the query language will be based on code you
developed in the Catalog part of this lab.
You will need to complete two functions: QLDeleteStatementNode::evaluate and
QLQueryItemStatementNode::evaluate which can be found in the file ql_evaluate.C.

To make a local copy of the code you need to implement you should download
the file raq.tar.Z.
This is a tared archive file.
To unpack this file you should do the following:

% uncompress raq.tar
% tar xvf raq.tar

This will create a directory RAQueryLanguage that contains the code provided to
you as well as skeletons for the code you need to write.
This code comes with a provided make file makefile.
It is unlikely you will need to modify this file (the only file(s) you should
need to change are ql_evaluate.C and possibly ql_parse.h).
The directory also includes a number of files that you will likely not need
to change (though you may adjust things as you need).
These will be discussed below.
The code also comes with a testing program ql_tester.
This test program will parse a series of query statements in a language discussed
below and then attempt to evaluate those statements.
When running the program you will type:

ql_tester QUERY_FILE

If you want to preserve the changes resulting from your queries you should add
the name of the database at the end of this line:

ql_tester QUERY_FILEDATABASE_FILE

This test program is automatically constructed by the makefile by simply
typing "make".
To recompile all of the code you first type "make clean" which will eliminate
all current .o and executable files and then type "make" again to recompile.
Note that you will have to copy your buf.h, buf.C, hfpage.h, hfpage.C,
heapfile.h, heapfile.C, scan.h, scan.C, attrcatalog.h, attrcatalog.C,
relcatalog.h, relcatalog.C, and attrcatalog.o files to this
directory for it to compile.

For those who did not get the first two parts of the project working, you can
download the archive raqa.tar.Z.
This has three libraries with working copies of the earlier code.

A Simple Query Language

Our test program will parse files consisting of a series of query statements.
The file ql_batter in the provided directories includes a sample query file
using the language described below.
The parser will not attempt to evaluate any query until all of the query
statements have been parsed.
Once the parser has parsed the entire file, it will print its version of
the queries parsed and then attempt to evaluate each query.

A statement in our query language will come from the following and end with
a semi-colon (;):

SHOW CATALOG

Print the contents of the catalog relations.

SHOW RELATION RelName

Print the contents of the relation RelName.

CREATE RELATION RelName ( Name1 : TYPE1, Name2 : TYPE2, ... )

Create the relation named RelName. The attribute names and their
types are listed between the parentheses. The possible types are
INT, FLOAT or STRING(Number) where
Number is a positive integer indicating the size of the string.

INSERT INTO RelName FROM "FileName"

Load the relation RelName with tuples from the file FileName.

INSERT INTO RelName ( Value1, Value2, ... )

Insert a tuple with values Value1, Value2, ... into relation
RelName. Note that the values must be provided in order and have
appropriate types for inserting into this relation.

DELETE (Condition) FROM RelName

Delete all of the tuples from RelName where condition Condition
is true.

A statement may also consist of a recursively defined query involving
SELECT, PROJECT and JOIN statements (and corresponding conditions).
Query items are constructed recursively using the following possible
components:

SELECT (Condition) (QueryItem)

Select those tuples from the query QueryItem that match
the condition.

PROJECT (RV1.AN1,RV2.AN2,...) (QueryItem)

Project the attributes RV1.AN1, RV2.AN2, ... from those tuples
produced by QueryItem. The RV values are relation
variables introduced as shown below. The AN values are
attribute names.

JOIN (Condition) (QueryItem1)(QueryItem2)

Join those tuples produced by QueryItem1 and QueryItem2
where the condition Condition is true. Note, the resulting
tuples should include all of the attributes from both relations.

[ RelationNameRelationVariable ]

Defines the relation variable RelationVariable which is
connected to the relation RelationName.

The conditions used in DELETE, SELECT and JOIN conditions are also recursively
defined using various components. A condition may be:

Code to implement the scanning and parsing aspect of this query language
can be found in the files ql_scan.C and ql_parse.C and the corresponding .h
files ql_scan.h and ql_parse.h.
The code for implementing these operations as queries can be found in
ql_evaluate.C.

Completing the Provided Code

You will be implementing the DELETE, SELECT, PROJECT and JOIN operations.
The relevant class structures for these operations include QLDeleteStatementNode
(DELETE), QLSelectQueryItem (SELECT), QLProjectQueryItem (PROJECT), and
QLJoinQueryItem (JOIN).
You will also need to look at the class QLQueryItemStatementNode, and the
classes used to define Conditions and the attribute list for PROJECT
(QLAttributeListNode).

In order to execute these operations you will need to further check the
parsed operations (not all of the checking has been done).
For the DELETE, SELECT and JOIN operations you need to check the provided
condition.
Things to check:

The final result of the condition should be an integer.

Any RelationVariable.AttributeName combination should correspond to
a relation variable and attribute name produced by the arguments to
the operation (e.g. SELECT (T.ID == G.A) FROM ([Teams T]) would have
two problems if ID is not an attribute of the relation Teams (the other
problem is that the relation variable G is not defined).

The comparison operators (e.g., ==, !=, etc.) must be applied to
arguments of the same type (INT and INT, etc.). The one exception you
make should be to allow comparisons of INT values and FLOATs.

The logical operators (AND, OR, NOT) must be applied only to integer
arguments.

You can define the operation to check this process recursively (to check
a two operand operation call your routine recursively on each argument to
get its type, then compare the types that are returned).
If a condition violates the type requirements you may simply terminate the
evaluation process.

Similarly, you will need to create a routine that calculates whether a
condition is true based on a tuple provided to that routine. This routine
can also be written recursively similarly to how you write the type checking
routine for conditions.

In order to construct the query processing routines you may want to add to
the existing classes or define your own class structure. Queries involving
SELECT, PROJECT and JOIN can be answered by constructing a structure similar
to the Scan structure for each type of query item in a query tree.
For example, a query item of the form [RelationVariable] can be made to
correspond to a scan that produces tuples from Relation one at a time
(using the getNext feature).
A SELECT is then hooked to a scan that produces tuples and the SELECT's getNext
routine runs until one of the tuples provided to the SELECT meets the
Condition.
The PROJECT and JOIN operations are similarly defined.
Note that the JOIN may be implemented using a simple nested loops approach.

Error Protocol and Debugging

Be sure to follow the error protocol described in new_error.h.
Note that you will likely want to add new error codes and new error messages
to the tables provided for you.

The make file compiles the code using the -g flag. This means that you
can debug the executables produced using gdb.
I have also set up the code with a command line debugging system.
When running either of the test programs you can add command line arguments
of db, bm, hf, or gory.
These turn on debugging flags in the Disk Space module (db), Buffer Manager
module (bm), HeapFile (hf) and some extended (gory details) flags (gory).
Note that you may want to add debugging commands in your Buffer Manager
code following this protocol.

What to Turn In, and When

Print out your versions of all of the files you changed.
You should test your code using the test routine ql_tester and
print out the results.
Try to completely test your code.
In your test directory I have included a sample query file ql_batter.
This set of commands is a starting point.
You should add more test files and comprehensively test your code and turn
in these results.

Once you have completed testing, write up a team report of how your code is implemented.
This report should give an overview of how you completed the evaluation code
and any changes you made to the classes provided and any new classes you added.
It should also discuss the algorithms you used to solve the problem.
This report should be at least two pages long but no longer than four pages.
Each team member should also write up an individual report (at least half
a page but no more than a page) discussing their contributions to the coding
process and how the overall team interaction went.