Maurice Calhoun - Datasets Everywhere

Most commonly a data set corresponds to the contents of
a single database table, or a single statistical data
matrix, where every column of the table represents a
particular variable, and each row corresponds to a given
member of the data set in question. The data set lists
values for each of the variables, such as height and weight
of an object, for each member of the data set. Each value
is known as a datum. The data set may comprise data for one
or more members, corresponding to the number of rows.

To me, a dataset is an array of items. MySQL, MSSQL, Oracle,
PostgreSQL, SQLite, MongoDB, etc. all contain datasets (tables
or, in MongoDB's case, collections). To access these datasets
in a relational database you write a query in a language
called SQL (Structured Query Language). After executing the
query, the result is often itself a set of data. Awesome... so
what's the problem? Remember, a dataset is an array of items,
not just a database object.

What if our dataset does not come from a database, and when
we receive it, it comes in the form of an array, serialized
data, CSV, Excel, XML, etc.? How do we query these items? I know
that most of the time these formats are the results of a query,
but what if they were the origin? How do we query them, and what
does that query language look like? My programming language of
choice is PHP, and in that language we do not query arrays, we
parse them. Parsing arrays can be messy and problematic. Here is
an example:
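
A minimal sketch of that kind of parsing code, assuming $company
is an array of associative arrays. The last_name and salary keys
are borrowed from the SQL statement later in this post; the
first_name key and the sample rows are made up for illustration.

```php
<?php
// Hypothetical sample data; the real $company array is not shown in this post.
$company = [
    ['first_name' => 'John', 'last_name' => 'Doe',   'salary' => 60000],
    ['first_name' => 'Jane', 'last_name' => 'Doe',   'salary' => 45000],
    ['first_name' => 'Sam',  'last_name' => 'Smith', 'salary' => 70000],
];

// Attempt 1: a hand-written loop that collects the matching rows.
$matches = [];
foreach ($company as $employee) {
    if ($employee['last_name'] === 'Doe' && $employee['salary'] > 50000) {
        $matches[] = $employee;
    }
}

// Attempt 2: the same filter expressed with array_filter().
// array_values() re-indexes the result, because array_filter() keeps the original keys.
$filtered = array_values(array_filter(
    $company,
    function (array $employee): bool {
        return $employee['last_name'] === 'Doe' && $employee['salary'] > 50000;
    }
));
```

Both attempts give the same rows, but each new question (a
different column, a sort order, a limit) means writing another
loop or callback by hand.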

That works, but it is inconsistent. There are so many what
ifs. The above are parsing functions, not query statements.
But $company is a dataset, and if it were in a database, the
previous requests would have been consistent and concise,
because we would have written a query statement (SQL):

Select * From company Where last_name = 'Doe' and salary > 50000;

Databases are not special, and their datasets are not more
important than any other kind. Datasets are datasets no matter
where they come from. The only important datasets are the ones
you are currently using. But I cannot query them, because the
language I use does not see my datasets as database tables.
What I would like to do is run a SQL statement like the one
above directly against an array such as $company.

I have tried some methods to get this to work, like writing
regular expressions to parse a query statement. That's
crazy. Have you seen the MySQL SELECT syntax? That would be a
massive undertaking. Why reinvent the wheel when this syntax
already exists?

After exhausting that possibility, I started to look back at
the idea of a database. But here are my reservations about
that:

I would have to create a database

I would have to also create the tables and columns

I would also have to manage that database

That is a lot to do just to query a dataset. Or is it?
The part that concerned me most was the management of the
database. Why? We already have the dataset; we don't need to
store it, we just need to query it.

I then realized that SQLite lets you use a database file or
create the database in memory. In memory... That's what I'm
talking about. It all starts with

$pdo = new PDO('sqlite::memory:');

Now I can create my tables and columns on the fly, and I can
also import my data. I can run queries on my data in a clean
and concise way. The best part about this solution is that
there is no need for database management. The data has already
been managed somewhere else; I just needed to query it.
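
As a rough sketch of that workflow, assuming the pdo_sqlite
extension is enabled and reusing a made-up $company array (the
column names and sample rows are illustrative, not from a real
system), it could look something like this:

```php
<?php
// Hypothetical sample data standing in for the $company dataset.
$company = [
    ['first_name' => 'John', 'last_name' => 'Doe',   'salary' => 60000],
    ['first_name' => 'Jane', 'last_name' => 'Doe',   'salary' => 45000],
    ['first_name' => 'Sam',  'last_name' => 'Smith', 'salary' => 70000],
];

// Open a throwaway in-memory SQLite database and make errors throw exceptions.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Create the table and its columns on the fly.
$pdo->exec('CREATE TABLE company (first_name TEXT, last_name TEXT, salary INTEGER)');

// Import the array with one prepared statement, wrapped in a single transaction.
$insert = $pdo->prepare(
    'INSERT INTO company (first_name, last_name, salary)
     VALUES (:first_name, :last_name, :salary)'
);
$pdo->beginTransaction();
foreach ($company as $employee) {
    $insert->execute($employee); // array keys map to the named placeholders
}
$pdo->commit();

// Run the query from earlier in the post, with the values bound as parameters.
$select = $pdo->prepare(
    'SELECT * FROM company WHERE last_name = :last_name AND salary > :salary'
);
$select->bindValue(':last_name', 'Doe', PDO::PARAM_STR);
$select->bindValue(':salary', 50000, PDO::PARAM_INT);
$select->execute();
$results = $select->fetchAll(PDO::FETCH_ASSOC);
```

Here $results holds the matching rows as associative arrays. The
database lives only inside this connection, so there is no file
to clean up afterwards.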

The type of dataset does not make a difference. All we need to
do is convert whatever we have to an array and import it. The
bonus is that the database is only around for as long as we
need it.
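
To illustrate the "convert to an array, then import" step, here
is a hedged sketch that starts from CSV text instead of a PHP
array. The importDataset() helper, the sample CSV, and the table
name are all hypothetical. The helper declares every column as
NUMERIC so that number-like strings compare as numbers; that is
a simplifying assumption, and it would strip leading zeros from
values such as postal codes.

```php
<?php
/**
 * Hypothetical helper: create a table from an array of associative rows
 * and load the rows into it. Column names come from the first row's keys.
 */
function importDataset(PDO $pdo, string $table, array $rows): void
{
    if ($rows === []) {
        return; // nothing to import
    }

    // Identifiers cannot be bound as parameters, so quote them by hand.
    $quote = function (string $name): string {
        return '"' . str_replace('"', '""', $name) . '"';
    };

    $columns = array_map($quote, array_keys($rows[0]));
    // NUMERIC affinity: number-like text is stored as a number, other text stays text.
    $definitions = array_map(function (string $column): string {
        return $column . ' NUMERIC';
    }, $columns);
    $placeholders = implode(', ', array_fill(0, count($columns), '?'));

    $pdo->exec('CREATE TABLE ' . $quote($table) . ' (' . implode(', ', $definitions) . ')');
    $insert = $pdo->prepare(
        'INSERT INTO ' . $quote($table) . ' (' . implode(', ', $columns) . ') VALUES (' . $placeholders . ')'
    );

    $pdo->beginTransaction();
    foreach ($rows as $row) {
        $insert->execute(array_values($row));
    }
    $pdo->commit();
}

// Made-up CSV input; the first line is treated as the header row.
$csv = "last_name,salary\nDoe,60000\nDoe,45000\nSmith,70000";

// Convert the CSV text into an array of associative rows.
$lines  = explode("\n", $csv);
$header = str_getcsv(array_shift($lines), ',', '"', '\\');
$rows   = [];
foreach ($lines as $line) {
    $rows[] = array_combine($header, str_getcsv($line, ',', '"', '\\'));
}

// Import the rows into an in-memory database and query them with plain SQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
importDataset($pdo, 'company', $rows);

$results = $pdo
    ->query("SELECT last_name, salary FROM company WHERE last_name = 'Doe' AND salary > 50000")
    ->fetchAll(PDO::FETCH_ASSOC);
```

The same helper would work for rows decoded from XML, Excel, or
serialized data, as long as they end up as an array of
associative arrays with the same keys in every row.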