'SQL Everywhere Edition' - What. How. Why.

Description

Meet some of the minds behind 'SQL Everywhere Edition', a database technology that, among other things, enables you to embed a database right into your Windows application that you can program against in ways your are accustomed to when working against
a full SQL server application.

What is 'SQL Everywhere Edition'?? You'll find out exactly what this technology is from the key minds behind it: Anil Nori, Distinguished Engineer and database guru, Technical Lead Steve Asker and Software Engineer Carlton Lane.

The Discussion

SQL Server Everywhere is a relational store and does not have the file system integration like WinFS did. Yes, it does enable some of the WinFS scenarios. BTW, you can store data from multiple applications in the same .sdf file if they are created by the
same user. That is, you can run multiple applications accessing the same .sdf file. The difference is the database (dlls) are embedded in the apps.

Yes, we are exploring XML support. Do you want to just store XML docs or do you also want to query (e.g. using XPath, XQuery)?

As I replied to some other mail, it is possible for multiple apps to share the same database (.sdf file). SO, there can be n:1 relationship between the apps and the database. The key is it is a single user database (but can be accessed by multiple apps).

Yes two different two databases can be synchronized. Currently, SQL Server Everywhere supports a Hub and Spoke sync model. That is, several Everywhere databases can sync to a server DB (e.g. SQL Server). SO, it is possible for two .sdf s to sync via SQL Server.

In future, we do plan to support Peer-to-Peer sync, where two Everywhere DBs can synchronize with each other.

Very informative video, great job guys. My friend was telling me recently that he was getting compiler errors when he was calling the Dispose() method of the SqlCeDataReader object. Is this still an issue?

Several scenarios that I thought were possible through WinFS can be done using this edition. Well, I cannot use data of other applications but persistence and query processor by themselves give lot of power to an application and cuts down lot of custom code.
Above all it is free.

This has been around for a while. Apache Derby (formerly
IBM Cloudscape) is DB2 compatible, open source, and written entirely in Java. It can be embedded in an application or run as a server. If I recall the footprint was only like 2MB.

Hi Anil,This looks like a nice initiative for application scenarios which you mentioned (like flying in a plane and needing the app to run).

My question from that view was, how would SQL Everywhere support distributed apps (with a centralized database?). Correct me if I am wrong, is SQL Everywhere targeted only for 1-to-1 scenarios between an app and an database? What if both you and me are on a
plane, and running the same app and making local changes to our copies of database (either to disk or in-memory). Would there be an options to 'synch' our copies later on?

Yes, there have been other small footprint SQL databases. SQL Server Everywhere is different from Derby (Cloudscape) in couple of ways: - SQL Server Everywhere is componentized -- the storage engine, the query processor, API providers, and the sync components are separate DLLs. So, it is possible for an application to choose just the storage engine without taking the other components. This
provides extensibility -- in fact, you could right your own query processor on top of the storage engine

- SQL Server Everywhere runs on the devices as well as on the win32 platforms. I am not sure if Cloudscape runs on devices

- SQL Server Everywhere provides flexible sync components. Sync is an important part of the Everywhere scenarios

Anil, during the interview, you basically dismissed (but justified) the loss of stored procedures. I then started musing over this. I suppose I should preface what I’m about to say by stating up front that I’ve only recently started to work with Relational
databases (about 15 months, ago), so if I’m off-base, please be kind.

Anyway, I basically asked myself, “What problems do stored procedures solve ?” and “Is there a way of leveraging these solutions, so that business logic follows, or is synchronised along with the data ?”. Compounding this is the issue of Synchronisation, itself.

Synchronisation, first. A few days ago, there was a presentation at the Sydney .NET users group by Geoff Orr, where he discussed the use of Partitioned Tables, a new (?) feature of SQL 2005 Enterprise, and that seems like a perfect fit for disconnected data
sources. In this scenario, each disconnected-user maintains a Partition within each of the (Partitioned) Tables on a Server, and thus user synchronisations are “sandboxed”. Any thoughts on this ?

Now moving to the business logic. Is the reason why your not supporting stored procedures centred around the SQL engine’s size, and if so, would there be any benefit to re-examining how Stored procedures are implemented ?. Eg The stored procedures actually
being (re)compiled into .NET Intermediate Language, extending IL, if necessary ?

I suppose I should actually read up on how and when SQL compiles stored procedures, but with experts around... . What I’m picturing is that TSQL is initially compiled into IL, and the IL form of the stored procedures are synchronised. If appropriate/necessary,
the IL form is then further processed (JIT-compiled) by the respective data storage engine.

Obviously, there are going to be issues with what or how much of the business logic needs to stay with the data (appropriate synchronisation). The point being that at the enterprise-level, we can continue to move business logic into the data-tier (following
best practices, yes ?). I’m imagining some form of (high-tech) code analysis, or a rudimentary (programmer-controlled) form of procedure tagging (attributed code).

I wish I had more time to look into WinFS and LINQ. I mean, does LINQ surface stored procedures as methods of a (database) object ?

I think arguing about suporting stored procedures in SQL/e based only on where to put your business logic is pointless, since wheter placing the BL on the data base or not, you're going to end up some other large SQL sentence like queries or complex CRUD
methods that are worth placing on your data base out of pure "well-formedness" of your code.

One of the other, reasons why not to support SP, is that the SDF is an pure data format, hence safe to send over, say, e-mail. This arguement comes from the usage of Extendes Stored Procedures that can execute external programs becoming a source of attacks.

However, simply do not support Extended stored procedures. But leave regular Stored Procedures. I don't see how security of your box are going to be afected by store procedures that can only access/modify the data within the file.

SP are required to have a really painless upgrade to a fully fledged SQL Server product. I'd even dare to say that leave the excution plan part of SP and just store the SQL sentence to allow it be executed SP-style, in order to keep SQL/e lean and mean (something
a really love about it, I most say)

Raptor

PS: Anyway, You guys are really in the right direction with SQL/e. Having tried the CTP, I just love it.

I think having SQL Everywhere databases be data-only is the right decision. Binding executable code to the format, even if done in a safe manner, could undermine the compentized model due to additional coupling. For instance, it might be harder to keep
the storage engine and query analyzer separate. The ability to pick and choose, as Anil mentioned, is definitely one of the strengths of this product.

A better solution, perhaps, would build on Richard's reasoning: make T-SQL a .Net language. This might streamline migration considerably. Although SQL Server 2005 can handle SPs in C# and VB, so migrating may actually not be that difficult. (I certainly appreciate
that some SPs are more easily written in T-SQL... though LINQ may blur this distinction.)

You mail and questions touch on multiple aspects: partitioning, sync, and stored proc. Let me express my thoughts on the first two in this response. I will cover stored procs in a separate replay, as it has wider interest and broader impact.

Richard, I think your thoughts on partitioning are consistent with distributed applications but SQL Server 2005 partitioning is too low level and at a physical abstraction. Besides table partitioning in commercial DBMS is primarily provided to support high
availability and scalability. Most DBMSs provide a single table abstraction and seldom support direct operations on partition (the exception is in Tandem's NonStop SQL, which provided partition level autonomy). Except for some physical admin operations, most
of operations are at the table level. Therefore, such partioning does not really help build SOA (service oridented architecture) apps.

However, the general idea of partitioning data and the logic is a scalable way to build large applications.

As for sync, data sync is critical in building distributed apps. BTW, I distinguish between synchronization and replication. Synchronization is between any data sources -- i.e the participating data sources can be heterogeneous. Replication is homogeneous
synchronization -- i.e. the participating data sources are same (or subsets). While data sync is is practical, logic sync is complex and not pragmatic. That is, synchronizing the logic, whenever data is synchronized, that is associated with data. While logic
associated with data can be propogated when the data is synchronized, it is realistic in a highly controlled and homogeneous environments. This is fundamentally contradictory to loosely connected, distributed, SOA based applications. Most loosely coupled and
disconnected applications do not requie symmetic logic and synchronization of locig. Since SSEv (SQL Server Everywhere Edition) focus is "edge" apps in the loosely couple application architecture, SSEv does not pursue logic synchronization.

Now let's get to the crux of Richards mail -- "To sproc or Not To sproc." This I want to continue in a separate mail

Today we have a know issue with SqlCeDataReader::Dispose(bool). Currently developers cannot access this function due to its protection level. We have written a KB article regarding this issue.
https://support.microsoft.com/?kbid=912432

At present the work around is to use the Dispose() function without any parameters.

﻿I think having SQL Everywhere databases be data-only is the right decision. Binding executable code to the format, even if done in a safe manner, could undermine the compentized model due to additional coupling. For instance, it might be harder to keep
the storage engine and query analyzer separate. The ability to pick and choose, as Anil mentioned, is definitely one of the strengths of this product.

So,Store them in: the "Storage Engine"Analyze them in: the "Query Analyzer"

Which means, Storage Engine without the Query analyzer (or the other way around): No stored procedures, Period!! at least as we know'em.

The point is, offer the option.

Raptor

PS: I'd like to know who wants to use SQL/e with out the Query Analyzer and the T-SQL subset it provides, which really the single MOST important feature of it.

So,Store them in: the "Storage Engine"Analyze them in: the "Query Analyzer"

Which means, Storage Engine without the Query analyzer (or the other way around): No stored procedures, Period!! at least as we know'em.

The point is, offer the option.

Raptor

PS: I'd like to know who wants to use SQL/e with out the Query Analyzer and the T-SQL subset it provides, which really the single MOST important feature of it.

First to make sure we all are on the same page, SQL Server Everywhere (SSEv -- btw, this is the preferred acronym), does support Storage enginge and T-SQL Query Processor. It also supports Query Analyzer. You can use SSMS (management studio) and look at the
query plans. SSEv is well integrated with SSMS.

Yes, SSEv supports only a subset of T-SQL but a rich SQL subset. What we do not support is the procedural T-SQL (stored procedures) and that is what seem to be the issue. While store procs are interesting they are not a show stopper.

There is huge interest in SSEv internally and externally. Several Windows apps will be using SSEv technology in Vista. Currently, most customers are using it on devices but many are interested in running these apps on the desktop as well.

BTW, there many apps which just want to use the storage engine. They just want key based access like in ISAMs. AD (Active Directory) is a good example which just uses the Jet storage engine. It does not use the query processor; Exchage is another example. So,
most of the Jet usage is just using the storage engine.

Again, I go back to the componentization arguments. We provide the storage engine and the QP; app can chose what it wants to use.

Hi Anil,I think that you can help me in this field to you congenial.I must to realize a database application for an Automation system consisting of a Industrial PC with Operative System microsoft windows embedded.

1° question: Is Microsoft sql server 2005 everywere compatible for Microsoft windows embedded?(I ask this because in specific of its is not clear)

2° question: In the case that it's compatible, if I instal it in a industrial Pc can it to act by database server or necessity of a remote server?

I have been meaning to post a reply on store procedures but did not get around to doing it. I think the discussion on Stored Procedures (sprocs) is somewhat “religious.” I am sure there are some who want to use sprocs for most of their application logic
and there is also huge trend where much of the logic is outside the database (so use sprocs in a specialized way). Here are some of my personal thoughts.

Prior to Microsoft, I have done both database server development and as well as multi-tier application development. My views on application logic development have evolved over time.

Stored procedures really came into play during the Client – Server days. Sybase is probably the first one to introduce T-SQL procedures in late 80s. Oracle followed it with PL/SQL and DB2 with DB2 Stored Procedures. The main purpose was to push logic into
the database - -in fact, that was the only place you could place it because clients were not rich clients. This happened after the TP monitor-era. As the DBMSs got rich in functionality, most of the TP monitor capabilities were subsumed by the DBMS. Until
Oracle Applications Release 10.7 (even may be Release 11), most of the application logic is coded as PL/SQL code running in the Oracle RDBMS.

How much of the different types of logic is hosted in stored procs (in the database) depended on the specific database system and the application. For example, earlier Oracle applications placed all the logic (data, business process, and presentation) in
stored procedures. The database server was the hosting server for apps; there is probably couple million lines of PL/SQL code running in the database server. Even the forms were generated in the DBMS (like HTML pages being generated in the web servers).
Oracle workflow engine ran in the DBMS and modeled and interpreted the business processes. Note that not necessarily all the logic was related to data. This is partly due to the richness of the PL/SQL language itself (it is pretty much a full-fledged modern
block-structured programming language). Such 2-tier architecture, while my look simpler, has its limitations both in application scalability and in application portability. As the applications got complex and the number of users increased, DBMSs had to be
installed on bigger and expensive boxes. Also porting the apps to other databases became a problem; of course this was not a consideration for Oracle applications. BTW packaged applications like SAP, which ran on multiple DBMSs, coded most of the business
process and presentation logic in ABAP and hosted it in the mid-tier. SAP wanted to be database agnostic; therefore, used very little of the vendor-specific capabilities. However, for performance reasons, SAP does host large number of stored procedures in
the DBMS.

Early to mid 90s saw the emergence of the browser, application server and multi (3) –tier applications. By late 90s, Java and .NET application server became very common. In the 3 –tier app stack, the business process and presentation logic moved to the app
server tier. Database hosted the data and the really data-centric logic (as stored procedures) and the browser provided client interaction via HTML. Note that the HTML pages were still generated in the application/web server using ASP/JSP technologies. The
key take away is some of the stored procedure logic moved to the mid-tier. In fact, even Oracle apps (11 and 11i) started moving to 3-tier architecture – starting with presentation logic in the mid-tier to more business process logic in the application
server.

What followed the app server era is the web services and application integration era. Web services allowed data and business processes to be composed into richer and flexible applications. This led to the generalization of the application stack to Service
Oriented Architecture (SOA), where data and the data-centric logic could be hosted in different service boundaries. Data as the authoritative source resided in some back-end database or application but caches (or copies) of the same data could reside in other
service tiers, as reference data. For example, consider a Product Catalog browsing application, where Product data may be coming from one or more backend apps but the integrated product data could be cached in the mid-tier for performance and scalability reasons.
With large amounts of cached data, applications demanded rich data processing on the cached data – like key based access, filtering, reporting etc. In was unacceptable and inefficient to push the data into a traditional relational database, just to query
data or to run stored procedure logic. In such architectures, even the data-centric logic could not be locked inside the DBMS. These applications require rich data access and management capabilities on the data that is part of the application (physically
in the same process). This has lead to the emergence of in-memory/embedded DBMSs. An interesting and relevant question is whether stored procedures make sense in these embedded/in-memory DB environments, where data co-resides with the application logic. The
applications can write even the data-centric logic in an abstraction that is close to the host programming environment (e.g. Linq) and run on the data residing in the application. This is no longer needs to be stored procedure logic.

More recent phenomenon is the emergence of Web 2.0, where even some of the rich presentation (generation) and web service integration (mash-ups) is occurring in the browser. The trend is to move logic, therefore data, even closer to the client. This fits-in
with SOA apps, where the clients (e.g. browser) become “edge” applications. As these edges can be anywhere (e.g. on the desktops, devices), the need for disconnected and occasionally connected “edge” applications is increasing. Every app wants to become like
an Outlook app with Cached mode, where users interact with the application the same way whether the app is connected to or disconnected from the backend data sources. This requires rich data processing capabilities in the clients now. Again, what does it
mean to have stored procedures in these clients?

I have described the structure of the multi-tiered application logic and the role of stored procedures in such environments. I happened to believe that stored procedures would be scoped primarily to the database servers, capturing data-centric logic for
performance reasons. A class of such stored procedures includes specialized user-defined functions (UDFs) and user-defined aggregates (UDAs). It should be noted that embedded and in-memory DBs also need to provide some mechanisms to support user-defined aggregates.
This could be achieved by registering application defined aggregates with the database so that they can be invoked on a per row basis.

Another reason why people want stored procedures is to provides encapsulation for some application logic and require the same encapsulation to run on all tiers. For example, the desire to run the same T-SQL stored procedures (app logic) on SQL Server on
the backend as well as in SQL Server Everywhere in the mid and client tiers. This is the down-sizing requirement – that is, requirement to run the server-side logic on the client tiers. I am nor sure how practical this is but I believe this by far the most
compelling argument in favor of stored procedures in SQL Server Everywhere and in all tiers. For SQL Server Everywhere, we have to weigh this requirement (it benefits) with the negative impact it will have on the footprint and the complexity of SQL Server
Everywhere. Some replies here suggested compiling all DB logic to some IL and supporting such IL on all tiers. This is an interesting thought and requires further investigation.

In summary,

• I believe stored procedures are definitely useful and will continue to be critical for data-centric logic in DB servers.

• Multi-tiered architectures will host more logic outside the database; the benefits of stored procedures outside the back-end database tiers is not very clear

• Currently the requirements for small footprint and the embedded-ness of SQL Server Everywhere outweigh the stored procedure support requirement.

﻿Hi Anil,I think that you can help me in this field to you congenial.I must to realize a database application for an Automation system consisting of a Industrial PC with Operative System microsoft windows embedded.

1° question: Is Microsoft sql server 2005 everywere compatible for Microsoft windows embedded?(I ask this because in specific of its is not clear)

2° question: In the case that it's compatible, if I instal it in a industrial Pc can it to act by database server or necessity of a remote server?

I guess you are asking does SQL Server Everywhere Edition run on Windows Eembedded platforms?

1. I believe SQL Server Everywhere runs on embedded platforms, for example on XP/Embedded. However, currently it is not certified on the embedded platforms. That mean, currently it is not supported on embedded platforms officially

2. I did not quite understand the 2nd question. Are you asking if it runs as a server -- the answer is no.

Anil,thanks for posting your thoughts about Stored Procedures and, more specifically, their history – it’s good to see that at least some of the “folk-lore” of Stored Procedures that I have heard, do have some basis in fact. It also, perhaps, explains the
“enthusiasm” of some the responses (sensitivity to Ba$tardisations of the past).

So what was I raving on about ?

As I stated up front, I do not have an in-depth understanding SQL, and more specifically, SQL Server. I see, now, that some of the things that I referred to as “Business Logic”, fall into the category of “Validation”, and can be achieved without the use of
(user-defined) “Stored Procedures”. Though, I suspect, these functionalities would actually be specialisations of Stored Procedures, behind the scenes.

It’s with this naivety that lead me to propose ways of working-around things that have already been side-stepped via this Data-centric logic: “Appropriate Synchronisation” (between SQL Server and SQL EveryWhere) being achievable via Code-Analysis and/or Code-Attribution
of Stored Procedures.

So thanks for taking the time to point this out. Though in hindsight, it is a little embarrassing, on my part – I knew that, sort of thing…

Anyway, I had a sense of deja-vou when you talked about memory-based databases.

I’ve been pretty much a low-level, bits’n’Bytes developer for a couple of decades, now. And the reason why I just recently had to start using SQL was due a sub-contracting job where they were building an Account Reporting (read-only) application that needed
to have Query responses in tenths of a second. The problem was that the “Database solution” was quoted as requiring a dozen or more Servers, and a similar number of Administrators !?. Well to me, this sounds like someone wants to buy a new house, or another
agenda is at work. But, they’re the experts, I suppose.

Anyway, I was asked to (re)write a pilot of memory-based data-storage engine, on the cheap, and I decided to start from scratch, but without all nice, flexible stuff such as being able to define the structure of the data, instead opting to hard-code this
– the existing RecordSet/DataSet stuff I looked at being far too slow – less than 10K fine. Some were OK to ~100K, but forget a Million or more. So, the end result was a COM object that has no trouble storing (caching) 3.5 Million records in ~200MB of RAM,
and it’s methods being able to return results (hard-coded queries) in the low tenths of a second – 25 Test clients continually hitting the COM server, 6 threads (usually). Onsite, I estimate they’re loading between 6 and 9 Million records – at least in the
stress testing. 4GB RAM in the Server.

This seems to have been successful, and now I need to evolve this. One of the things they’d like, is to be able to use SQL in their queries: “Yeah, OK. And for my next magic trick …”. So now the question (yay !!):

Would there be a way to cherry-pick features out of SQL EveryWhere, so I could, for example, use the Query Engine ? And better still, be able to “pre-compile” queries, in place of stored procedures ? - the aim, obviously, being able to replace my existing
hard-coded “storage engine”.

PS to Site developers: My paragraph formatting is being lost. I hope it isn't lost when I post this edit (again)...

crifo wrote:﻿Hi Anil,I think that you can help me in this field to you congenial.I must to realize a database application for an Automation system consisting of a Industrial PC with Operative System microsoft windows embedded.

1° question: Is Microsoft sql server 2005 everywere compatible for Microsoft windows embedded?(I ask this because in specific of its is not clear)

2° question: In the case that it's compatible, if I instal it in a industrial Pc can it to act by database server or necessity of a remote server?

I guess you are asking does SQL Server Everywhere Edition run on Windows Eembedded platforms?

1. I believe SQL Server Everywhere runs on embedded platforms, for example on XP/Embedded. However, currently it is not certified on the embedded platforms. That mean, currently it is not supported on embedded platforms officially

2. I did not quite understand the 2nd question. Are you asking if it runs as a server -- the answer is no.

Can you clarify your questions?

anil

Looks like I mis-spoke about the SQL Server Everywhere Edition support on XP/embedded. I believe we do certify it and support it. So, the good news is it will be supported on XP/e when SSEv ships later in the year.

However, the usage is still as a embedded in the application. That is, it cannot be used as a service.

I am tech lead on an unmanaged C++ project. I hope you might be able to answer these questions on SSEv:

1.) Can I use unmanaged C++ to work with SSEv; must I use Visual C++ for Devices?

2.) Can I use an ISAM interface? 2.a.) If so, can you comment on the ISAM performance relative to going through the query engine?2.b.) Where can I find good detailed information on using the ISAM interface?

3.) Is it possible to use OLEDB from unmanaged C++ in order to create my own schema?3.a.) Can I do this with ISAM?

4.) I have heard that there is a 128 MB limit on the DB file size by default. Also, I have heard that setting the connection string to the 4 GB limit does not work from unmanaged code. Please comment.

I just started experimenting with this - and so far it really looks like the perfect fit to several desktop applications that I'm working on.So thanks for making this available!

Two questions:

1) It appears that "TOP" (as in "SELECT TOP 5 * FROM...") is not supported.Did I miss something - and if not - could you explain why this was left out?Is there a better way, or some reason this should not be used.

2) For desktop applications - What are the major differences between using SQL Everywhere and Access?