Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS


Abstract

The invention establishes the context in which data exchanged between dissimilar relational database management systems can be mutually understood and preserved. The invention accomplishes this by establishing layers of descriptive information which isolate machine characteristics, levels of support software, and user data descriptions. Optimized processing is achieved by processing the different descriptor levels at different times during the development and execution of the database management systems. Minimal descriptive information is exchanged between the cooperating database management systems. For systems which match, data conversion is completely avoided. For systems which do not match, data conversion is minimized.

Description

Background of the Invention

Technical Field

This invention relates to the characterization of data for interprocess exchange. More particularly, the invention relates to establishing the context in which data exchanged between dissimilar (heterogeneous) relational database management systems can be mutually understood and preserved.

Description of the Prior Art

Currently there is great interest in joining together multiple database management sites to form a distributed system which provides any user at any site with access to data stored at any other site; see, for example, AN INTRODUCTION TO DATABASE SYSTEMS, Vol. 1, by C. J. Date (4th Edition, 1986), at pp. 587-62~. Date envisions that each site would constitute an entire database system with its own database management system (DBMS), terminals, users, storage, and CPU.

In a distributed database system such as the type described by Date, the DBMS at any site may operate on a machine type which is different than the machine type of another site. Indeed, there may be as many different machine types as sites. For example, the IBM Corporation (Assignee of this patent application) has DBMSs which operate on System/370 machines, AS/400 machines, and PS/2 machines.

The machines upon which the DBMSs of a heterogeneous database system run all represent information in different internal formats. For example, numeric information on PS/2 machines is stored with the bytes in low order to high order sequence. On other machines, such information may be stored in high order to low order sequence. For floating point information, there are IEEE floating point machines and hexadecimal floating point machines. Character information is processed in many different code representations, the choice of which reflects historical or cultural roots.

As DBMSs grow and evolve over time, they may be embodied in a series of versions or releases. Each of these may require additional information to be exchanged in a distributed database system. When these changes are introduced, all sites must be informed.

When a database program is written, compiled and executed entirely in one environment (machine and DBMS), it rarely is sensitive to the exact representation of the data which it processes. The data compiled into the program and the data stored in database structures are all represented identically so the operations behave as expected. Thus, a COMPARE command executed in a single database environment can always be made to manipulate data correctly, just by using the high level language operations of the system.

Thus, given disparity in machine types and the ever-evolving nature of DBMSs, it is inevitable that a distributed database system can be heterogeneous in the sense that any site may manage a database by means of a combination of machine and DBMS which is different from the combination at another site.

Provision is made in the prior art for solving the problems of machine and system incompatibility in a distributed, heterogeneous database system. Three solutions are of interest.

The earliest solution may be termed "application beware". This solution usually starts as a connection between identical database systems which grows over time to incorporate some machines which differ slightly from the original. In these solutions, there is no way for the system to automatically handle the differences, with the result that the application program was given this responsibility. If access to heterogeneous databases was needed badly enough, the application was written to make any necessary accommodations.

The second solution utilizes a canonical representation of data. This approach calls for conversion of data into a single, generic (canonical) representation before transport from one database site to another. Superficially, this solves the problem of automating the system to handle differences between differing databases. However, this approach requires many extra conversions which are inefficient, and introduces many conversion errors, making the approach inaccurate. For example, conversion of a floating point number always requires rounding off, with a concomitant loss of accuracy. When converting from one to another floating point representation, say, from IEEE to hexadecimal, precision is lost. In changing from hexadecimal to IEEE, scale is lost. Where character translations are performed, many of the special characters are lost because of lack of equivalence between character codes. In this solution, conversion errors which do occur are introduced at a point in the process far removed from the application. This increases the difficulty of identifying and responding to errors.

The last solution employs a gateway conversion in which a central facility is responsible for matching any database representation to any other. Ideally this reduces the inefficiency, inaccuracy, and error propensity of the canonical representation since conversions can be avoided when they aren't needed. However, inter-site communication is lengthy, slow, and expensive. The gateway is a single node to which all inter-site paths connect for all interactions. Instead of a request and response between the two participating sites, there are two requests and two responses for every data transfer. When conversions are required, they are still done in a part of the distributed system which is remote from the application.

Thus, there is an evident need in distributed, heterogeneous database systems to support effective and accurate exchange of data, while reducing the number of conversions, and the communications overhead. It is also desirable to perform any needed conversion at the site where the request for the data to be converted was generated.
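The representation differences described in this background can be made concrete with a short Python sketch. It is illustrative only and not part of the disclosed method: the 4-byte integer width is an assumption, and Python's built-in `cp437` and `cp037` codecs stand in for a personal computer code page and a mainframe EBCDIC code page.

```python
import struct

value = 100

# The same integer in the two byte orders (a 4-byte width is assumed here).
low_to_high = struct.pack("<i", value)   # low-order byte first
high_to_low = struct.pack(">i", value)   # high-order byte first

text = "ABC"

# The same characters under an ASCII-based and an EBCDIC code page.
pc_chars = text.encode("cp437")
mainframe_chars = text.encode("cp037")

print(low_to_high.hex(), high_to_low.hex())
print(pc_chars.hex(), mainframe_chars.hex())
```

The two byte strings for each value carry the same meaning, yet neither site can interpret the other's bytes without knowing which representation was used.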

Summary of the Invention

This invention describes a method and system for establishing the context in which data exchanged between heterogeneous relational DBMSs can be mutually understood and preserved. Particularly, a sequence of information transfer is described which enables a database to request or receive data expressed in a non-native form. Thus, with the practice of this invention, a DBMS may receive data in a foreign format and itself perform the necessary conversion of the data.

According to the invention, a database in a distributed, heterogeneous database system contains predefined descriptions of all machine environments in the system (machine descriptors) and predefined descriptions of database language structures (system descriptors) for each DBMS with which it can perform data exchange. When a database operation begins which requires two heterogeneous databases to conduct an information exchange, a communication link is established between them. Next, each DBMS identifies its machine and system descriptors to the other. This establishes a data context and is done only once for the life of the communication link. Once established, requests can be sent and data received.

When data is sent to the receiving DBMS, specific descriptions of the data precede the data itself and refer to the machine and system descriptors earlier identified. When the data is received, information contained in the specific descriptions enables a conversion process at the receiver to interpret the data by referencing machine and system descriptors. Taken together, the specific descriptions, and the machine and system descriptors which they reference, precisely characterize the environment where the data originated and establish, at the receiver, a context for predictable conversion of the data into a format which is native to the machine/system combination of the receiving site.

The invention reduces the total overhead required in making and responding to a data request, and attenuates the processing required by the receiver DBMS to interpret the sent data, while increasing the speed with which a request for data is serviced.
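The one-time exchange of descriptor identifiers summarized above might be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `SiteNames`, `Site`, and `connect` names are hypothetical, the name triple (PS/2, 437, SQLAM3) is taken from the example given later in the description, and the mainframe's system level is assumed to be SQLAM3 as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteNames:
    machine: str
    code_page: str
    system_level: str


class Site:
    def __init__(self, names, acceptable_partners):
        self.names = names
        self.acceptable_partners = set(acceptable_partners)
        self.partner = None  # data context, set once per connection

    def accepts(self, partner_names):
        return partner_names in self.acceptable_partners


def connect(requester, server):
    """Exchange names in both directions; raise ConnectionError on a mismatch."""
    if not server.accepts(requester.names):
        raise ConnectionError("server does not support the requester")
    if not requester.accepts(server.names):
        raise ConnectionError("requester does not support the server")
    # Only names cross the link; each side already holds the descriptors.
    requester.partner = server.names
    server.partner = requester.names


PC = SiteNames("PS/2", "437", "SQLAM3")
MAINFRAME = SiteNames("System/370", "037", "SQLAM3")  # level is an assumption

pc_site = Site(PC, acceptable_partners=[PC, MAINFRAME])
mainframe_site = Site(MAINFRAME, acceptable_partners=[PC, MAINFRAME])
connect(pc_site, mainframe_site)
```

After `connect` returns, each site knows which predefined descriptors apply to its partner for the life of the link, so later data transfers need only carry references to them.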

Brief Description of the Drawings

FIG. 1A is an illustration of user data which is to be transferred between non-equivalent database systems.

FIG. 1B is a top level representation of the relationships of all descriptors which define the environmental context of the user data illustrated in FIG. 1A.

FIGS. 2A, 2B, 2C and 2D illustrate machine descriptors.

FIGS. 3A and 3B illustrate system descriptors together with their references to machine descriptors.

FIGS. 4A and 4B illustrate user data descriptors of the invention with references to machine and system descriptors.

FIG. 5 illustrates the procedure of the invention in the example of command and data flows to set up a connection and transfer a set of user data.

FIG. 6 illustrates a representative system architecture on which the invention may be practiced.

FIG. 7 illustrates the procedure of the invention in the example of command and data flows between a requesting and serving machine passing through an intermediate DBMS system.

Description of the Preferred Embodiment

When used herein, the term "descriptor" means a unit of data used to characterize or categorize information. The term "data object" is taken to mean a data structure element that is necessary for the execution of a program and that is named or otherwise specified by a program. A "reference" is a construct such as a pointer which designates a declared object. Descriptors, data objects, and references are all understood in the context of a programming language. In relational DBMSs, a well-known relational language is SQL. An "environmental context" is an information set which describes the database system which originates a block of user data. Data is in "native form" if it is made up of data types and control information in a form used by the database system which is processing it; otherwise, it is "non-native" or "foreign". A database system "environment" is the set of logical and physical resources used to support database management.


For relational database systems, the SQL language describes several different data types. These types include INTEGER, FLOAT, VARCHAR, and many more. Depending on the machine on which an SQL database manager is implemented, the actual bit representations for data values having SQL data types vary. For example, the IBM System/370 employs a hexadecimal floating point representation, the IBM AS/400 employs the IEEE format, and on IBM OS/2 machines, IEEE byte reversed formats are used. These differences are implied by the machine environment and are not formally exposed to application programs executing in the environments. Furthermore, there are many SQL identified standard control blocks, such as the SQL communication area (SQLCA), which is defined in terms of SQL types. The SQLCAs in the machine environments defined above are not identical.

This invention formalizes a method and means for exchanging information between heterogeneous DBMSs about machine characteristics and DBMS language structures such that little or no descriptive information must flow during the exchange of data in order to convert data to a native form. Additionally, when database sites match, no conversions are performed at all, thus preserving completely the integrity of the data being exchanged. When these sites do not match, data conversions are performed close to the ultimate point of use where errors introduced by imperfect conversions can be dealt with according to the needs of the requesting application.

This invention is described using simple machines, database management systems, and user data. Those skilled in the art will understand that detailed implementations will require support for many more generic data types than are described hereinbelow, many more DBMS information blocks, and user data with many more fields. The extensions to handle these cases are manifest and would only serve to unnecessarily obscure the invention if presented herein.

FIG. 1A illustrates user data in hexadecimal format. The data are shown in two forms, 10 and 20. The user data 10 appears as it would in a typical personal computer environment. The data 20 has the same meaning as the data

10 except that it appears in the form it would have in a typical mainframe environment. The user data forms illustrated in FIG. 1A form the example upon which explanation of the invention is based. When the user data is to be sent from the personal computer environment to the mainframe environment as illustrated by the direction 31-30, it must be changed in form from 10 to 20. Conversely, if the user data is to be sent from the mainframe to the personal computer environment, that is, direction 30-31, it must be changed in form from 20 to 10.

The actual data being transferred consists of three rows. Each row is an entity containing all of the fields necessary to describe the outcome of an SQL statement. With reference to the rows in each user data, the first fields 11 and 21 are SQL communication areas which describe the outcome of an SQL statement. In this regard the SQL statement is assumed to be a command in the SQL language which results in the manipulation of data in a database. In the example, the data affected by the outcome of the SQL statement consists of five fields. In the first row, the first fields 12 and 22 contain the integer value 100, the second fields 13 and 23 contain the character value "ABC", the third fields 14 and 24 contain the integer value 80, the fourth fields 15 and 25 hold the character value "ZYX", and the last fields 16 and 26 contain the floating point value 12.3. The second row of each representation in FIG. 1A consists of an SQL communication area followed by the integer value 200, "DEF", the integer value 160, the characters "WVU", and the floating point value 45.6. Similarly, the third row includes an SQL communication area, integer 300, characters "GHI", integer 240, characters "TSR", and floating point 78.9.

It will be appreciated that the integers in the user data 10 are in low to high sequence, and in high to low sequence in the user data 20. Characters are coded in ASCII in the personal computer user data and in EBCDIC in the mainframe user data. Floating point is IEEE low to high in fields 16 of the user data 10 and hexadecimal excess-64 in fields 26 of the user data 20. The communication areas 11 in the personal computer machine use formats which are different than the SQL communication areas 21 in the mainframe environment.

No distinction has yet been made as to which direction the data in FIG. 1A will be transferred. That is, no statement has been made as to which environment is the server and which the receiver of data. The method of this invention is completely symmetric and reversible. However, for the purpose of explanation, assume that a request for data is made by a personal computer DBMS, that the receiver will receive data and that a mainframe DBMS receives the request and sends the data. Relatedly, in FIG. 1A, the requested data is user data 20 which must be rendered ultimately into the form represented by user data 10. The invention provides for establishing a context by which the receiver can accept user data 20 and prepare for conversion of that data into the format represented by user data 10. This invention is not concerned with the actual conversion process itself, but rather with a method and means for delaying conversion until the data reaches the location where it is to be processed.

In FIG. 1B, there is illustrated a personal computer machine environment 40 ("receiver") and a mainframe machine environment 50 ("server"). The personal computer environment can include, for example, the IBM PS/2 product programmed with an OS/2 SQL-based database management system. For convenience, the machine name of the personal computer 40 is indicated by reference numeral 41. The DBMS which runs on the machine 40 utilizes a code to represent characters and control function meanings.
For this purpose, a code page maps all code points to the graphic characters and control function meanings utilized by the DBMS. The code page is represented by reference number 42. The DBMS is a language-based system, and in the example, it is assumed that the DBMS is an SQL-based system. Further, it is assumed that the version or level of the language is SQLAM3. This is denoted by reference numeral 46 in FIG. 1B.

A complete characterization of the personal computer environment which processes user data 10 therefore must include an indication of the type of machine which processes the data, representation of machine-level information such as the data types which exist in the machine, and information showing in what form the data exists in the machine. In the example of FIG. 1A, the data types are integers, floating point numbers, and character strings. In the personal computer, integers are represented in low to high sequence, characters are in ASCII, and floating point is IEEE low to high. All of this information is represented in the invention by a machine-level representation descriptor. In FIG. 1B, the machine-level representation descriptor for the personal computer machine is indicated by reference numeral 44.

The context for conversion also requires information describing characteristics of the language in which the DBMS sending the information is written. In this description it is assumed that all of the DBMS sites are written in one or another version of an SQL-based language. In order to accommodate non-identical representations of control information produced by these varying versions, a system-level language characteristic descriptor must be available to the converting machine in order to convert the language-specific portions of the user data. In FIG. 1A, the language-specific portions are the communication areas. In order to provide this system-level information about the language characteristics, the converting machine is provided a system-level information unit descriptor. In FIG. 1B, the system-level descriptor for the personal computer language environment is indicated by reference numeral 45.

The context is completed by the provision of application level information specific to the user data being transferred. This information is given in the form of an application level user data descriptor. For the personal computer machine environment, this descriptor is indicated by reference numeral 47. The application level descriptor includes a control block for conveying information between cooperating DBMSs. This descriptor is in the form of a prefix appended to the user data which is sent to the receiver. This prefix contains information setting forth the machine and system-level characteristics of the transferred data. The system-level information provides a reference to the system-level information in the system-level descriptor corresponding to the language version of the serving system, as well as a reference to the machine-level information in the machine-level descriptor representing the serving machine. In FIG. 1B, reference numeral 47a represents the system-level reference made in the application level descriptor 47, while reference numeral 47b indicates the machine-level reference made in the descriptor. As FIG. 1B illustrates, reference is also made by the system-level descriptor to the machine-level descriptor. In FIG. 1B, the reference numeral 45a represents the machine-level reference made in the descriptor 45.

It should be evident that a user data context can be conveyed in whole each time user data is sent to a receiver by appending machine, system, and application level descriptors to the data. Appropriately, this would be employed in an asynchronous communication environment. In a synchronous communication environment, context could be completely conveyed by transfer of machine and system-level descriptors initially, followed by transmission of application level descriptors and binding of the received application level descriptors to the machine and system descriptors at the receiver's site. In either case, the machine and system-level descriptors would have to be transferred, in whole, at least once between the sites. This transfer is not required in this invention.

Thus, in FIG. 1B, assuming that the mainframe machine is sending data, its machine-level and system-level descriptors 54 and 55 must be made available to the requesting personal computer machine 40 and linked to application level descriptors 57 in order to convert the user data of FIG. 1A from the representation 20 to the representation 10. Relatedly, the descriptors 54, 55 and 57 must be made available to the machine 40.
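A receiver-side conversion driven by such descriptors might look like the following sketch. It is a simplified illustration, not the patented conversion process: the table layout and the names `MACHINE_DESCRIPTORS`, `ROW_FIELDS`, `encode_row`, and `convert_row` are hypothetical, the SQL communication area and the floating point field are left out, and Python's `cp037` and `cp437` codecs are assumed as the EBCDIC and ASCII code pages.

```python
# Hypothetical machine-level descriptor tables: generic type -> representation.
MACHINE_DESCRIPTORS = {
    "System/370": {"INTEGER": "big", "CHARACTER": "cp037"},
    "PS/2": {"INTEGER": "little", "CHARACTER": "cp437"},
}

# Hypothetical application-level description of one row: (generic type, length).
ROW_FIELDS = [("INTEGER", 4), ("CHARACTER", 3), ("INTEGER", 4), ("CHARACTER", 3)]


def encode_row(values, machine):
    """Lay out one row in the native form of the named machine."""
    rep = MACHINE_DESCRIPTORS[machine]
    parts = []
    for (kind, length), value in zip(ROW_FIELDS, values):
        if kind == "INTEGER":
            parts.append(value.to_bytes(length, rep["INTEGER"], signed=True))
        else:
            parts.append(value.encode(rep["CHARACTER"]))
    return b"".join(parts)


def convert_row(raw, sender, receiver):
    """Convert a received row to the receiver's native form, field by field."""
    if sender == receiver:
        return raw  # matching systems: no conversion at all
    src, dst = MACHINE_DESCRIPTORS[sender], MACHINE_DESCRIPTORS[receiver]
    parts, offset = [], 0
    for kind, length in ROW_FIELDS:
        field = raw[offset:offset + length]
        offset += length
        if kind == "INTEGER":
            number = int.from_bytes(field, src["INTEGER"], signed=True)
            parts.append(number.to_bytes(length, dst["INTEGER"], signed=True))
        else:
            parts.append(field.decode(src["CHARACTER"]).encode(dst["CHARACTER"]))
    return b"".join(parts)


row = (100, "ABC", 80, "ZYX")
sent = encode_row(row, "System/370")            # form 20 (mainframe)
received = convert_row(sent, "System/370", "PS/2")  # form 10 (personal computer)
```

The sender transmits its native bytes unchanged; all conversion work happens at the receiver, and is skipped entirely when the two machine names match.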
Similarly, for data transferred in the opposite direction, that is, from the personal computer machine 40 to the mainframe machine 50, the descriptors 44, 45 and 47 would have to be made available to the mainframe machine 50 in order to convert user data 10 into the form of user data 20 in FIG. 1A.

In the practice of this invention, it is asserted that, during the implementation of each DBMS site which participates in a distributed, heterogeneous database system, a decision is made as to which specific machine representations will be supported at the site. It is asserted that the "native" representation of the machine on which the site's DBMS will run is supported. That is, if data from another DBMS identical to the receiving site's DBMS is received, the representation is "native" and requires no conversion. Additionally, other machine types will be supported as partners. Generally, the "non-native" representations of data types will be converted to "native" ones at the receiving site before processing.

In the preferred embodiment, there is maintained at each site a list of acceptable machine partner types. Thus, at the site of the personal computer machine 40, a list of acceptable machine types includes the machine and code page corresponding to identifiers 51 and 52 in the machine 50. The list indexes to a set of descriptors which include the machine-level and system-level descriptors for all acceptable system partners. Thus, if the personal computer machine 40 receives simply identification of the machine and language present at the site 50, these identifications can be brought to the list which will index to the necessary descriptors in the list of descriptors, the indexed descriptors corresponding to the descriptors 54 and 55.

Therefore, when the personal computer 40 generates a request for data from the DBMS running on the machine 50, the receiver machine 40 sends its machine, code page, and system-level name (PS/2, 437, SQLAM3) to the system to which connection is desired, in this case, the machine 50. If the machine 50 finds these names unacceptable (not in its list) then an error is returned to the receiver and the connection is broken. If the names are acceptable to the machine 50, the machine 50 assumes the status of server and completes connection by sending its machine and system-level names to the receiving machine 40. If these names are acceptable to the receiver, then the connection is established. Otherwise the connection is broken.

Having established the connection, the receiver sends a data request to the provider/server machine 50. The provider/server constructs a user data descriptor 57 according to the characteristics of the data requested, in this case according to the characteristics of user data 20 in FIG. 1A. In this case, the descriptor is built to reflect the presence of the SQL communication area 21 and the five user data fields 22, 23, 24, 25, and 26. The provider/server machine 50 then sends the descriptor 57 and the user data 20 to the receiver machine 40. Upon receipt, the receiver uses the descriptors obtained in response to the machine and system-level identifier sent by the machine 50 and the references contained in the application level descriptor 57 to convert the user data 20 from the server's representations to the receiver's representations for subsequent processing.

It should be appreciated that the identifiers exchanged between the machine and the user data descriptor must be in a canonical form which is understood by both systems. In the preferred embodiment, this form is defined by a particular code page in EBCDIC. With this common basis for understanding, each site in the distributed system will recognize and correctly interpret the identifiers and user data descriptors of any other site.

Refer now to FIGS. 2A-2D for an understanding of the structure and contents of machine-level representation descriptors according to the invention. The descriptors for four machine types are illustrated. The descriptor 60 represents a machine exemplified by an IBM PS/2 executing a DBMS which uses code page 437 for character coding. The descriptor 70 characterizes a mainframe machine of the System/370 type whose DBMS uses code page 037. The descriptor 80 is for an AS/400 machine using code page 037. The descriptor 90 is for a hypothetical machine called OTHER using code page 850 for character encoding. The descriptors in these figures represent the machines in very simplified terms. The representations correspond to generic data types which are recognized and used by all database sites in a distributed system.
In the figures, the data types are INTEGER for integer numbers, FLOAT for floating point numbers, and CHARACTER for non-numeric information, such as letters of an alphabet. Each generic data type is also described in terms of a format which is particular to the machine whose descriptor is illustrated.

The machine-level descriptors of FIGS. 2A-2D are shown as multi-field data objects, with the fields containing information relating to a generic data type or a data type format. Thus, detailed specifications of the INTEGER data type are contained in multi-field sections 62, 72, 82, and 92 of the machine-level descriptors. Detailed specifications of the representation of floating point numbers for these machines are contained in sections 64, 74, 84, and 94 of the descriptors. Character representation specifications are contained in sections 66, 76, 86, and 96 of the descriptors.

In the illustrated descriptors, each generic data type is identified initially by a marker definition (MD). For integers, the markers are in fields 67, 77, 87, and 97 of the descriptors. The markers for each generic type are identical in each machine descriptor set. They identify unambiguously the generic type of the representation which follows. Each MD is followed by a type definition (TD). The type definition shows exactly, to the bit, how the data is represented in the machine identified by the descriptor.

Taking integers as an example, the four TDs shown have three different bit representations. Two machines represent, in fields 88 and 78, that integers are binary values with the bytes ordered from high to low significance. One machine uses binary values, but reverses the order of the bytes, storing the low order byte first, as indicated in field 68 of the descriptor 60. The machine represented by the descriptor 90 represents integers in decimal digits in a packed format as indicated by field 98. The TDs shown are not all the same. However, the TDs which are the same mean exactly the same thing and are represented in exactly the same way among the descriptors. Thus, for example, integer high-to-low in fields 78 and 88 of descriptors 70 and 80 is exactly one representation.

In the invention, the MD-TD pairs can each represent a family of types. Integers of length 1, 2, 3, 4 ... bytes all can share the same descriptor.
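The MD-TD pairing described above can be pictured as a lookup table keyed by marker. The following Python sketch is only an illustration: the TD strings are hypothetical labels rather than the bit-level encodings of FIGS. 2A-2D, the FLOAT entries follow the representations named earlier in the description, and no FLOAT entry is given for OTHER because the text does not state one.

```python
# MD (generic type marker) -> TD (exact representation), per machine.
# TD strings are hypothetical labels, not the actual descriptor encodings.
MACHINE_DESCRIPTORS = {
    ("PS/2", "437"): {
        "INTEGER": "binary, low-to-high",
        "FLOAT": "IEEE, low-to-high",
        "CHARACTER": "code page 437",
    },
    ("System/370", "037"): {
        "INTEGER": "binary, high-to-low",
        "FLOAT": "hexadecimal",
        "CHARACTER": "code page 037",
    },
    ("AS/400", "037"): {
        "INTEGER": "binary, high-to-low",
        "FLOAT": "IEEE, high-to-low",
        "CHARACTER": "code page 037",
    },
    ("OTHER", "850"): {
        # FLOAT for OTHER is not stated in the text, so it is left out.
        "INTEGER": "packed decimal",
        "CHARACTER": "code page 850",
    },
}


def types_needing_conversion(sender, receiver):
    """Return the generic types (MDs) whose TDs differ between two machines."""
    src = MACHINE_DESCRIPTORS[sender]
    dst = MACHINE_DESCRIPTORS[receiver]
    return sorted(md for md in src if md in dst and src[md] != dst[md])
```

Because identical TDs mean exactly the same representation, a simple equality test on TDs is enough in this sketch to decide which fields a receiver would have to convert.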
Similarly, for characters the actual code page to be associated with the type is a separate parameter. The code page specified in the TDs 69, 79, 89, 99 is a default which can be overridden. This is explained below with reference to FIGS. 3A and 3B.

The significance of a TD is not limited to association with a single generic data type. Thus, the same TD may be used as the representation for several different generic data types or MDs. For example, if there were a generic type called WEIGHT, it could be represented by floating point or integer numbers. It may be the case that different machines would support it differently. Such variation is supported fully by this invention. Thus, a generic data type is named specifically by the MD; how the data type is represented is specified by the TD.

One additional feature that will be evident to the ordinarily skilled practitioner is that the order of the generic type specifications (MD-TD pairs) is not important to the significance of the machine-level descriptors. The meaning set by the MD is used to establish linkage to an appropriate TD from a higher level descriptor, such as a system-level descriptor.

FIGS. 3A and 3B illustrate the structure and contents of system-level descriptors and how those descriptors are referenced to machine-level descriptors. In FIGS. 3A and 3B, the system-level descriptors illustrate two different levels of language support. In this regard, the descriptor 100 of FIG. 3A is for a level called SQLAM3 while the descriptor 120 of FIG. 3B is for a level called SQLAM4. The system-level descriptor for an SQL information block which includes status of the SQL calls, the SQLCA, consists of markers (MDs), grouping descriptors (GDs), and row descriptors (RDs).

Consider now the two system-level descriptors 100 and 120. In these descriptors, the first markers, 102 and 122, respectively, identify groups of fields which are to be included in the SQL communication area. These markers give generic meaning to the group of fields which follows, and are the same at all levels of DBMS implementation. In each descriptor, the group marker is identified as SQLCAGRP. In the system-level descriptors, the group markers are followed by grouping descriptors (GD) 104 and 124, which indicate exactly which fields should be included in the SQL communication areas for user data. A grouping descriptor has the property that members of its described group are ordered but not yet arrayed into a linear vector of fields.


Another property of grouping descriptors can be appreciated with reference to FIG. 3A where the grouping descriptor 104 includes machine-level references, in this case, Integer-4 and Character-80. These members of the indicated group are references to data types identified by marker definitions in machine-level descriptors. As FIG. 3A shows, these references provide reference directly to the integer and character MDs of the machine-level descriptor 60 (which is identical with the descriptor of FIG. 2A). The references are indicated by bindings 112 and 114. In the invention, the value specified in a reference is used to override the default length built into the machine descriptor. Thus, reference 114 in FIG. 3A overrides the machine length of its respective integer representation in the machine-level descriptor 60 to four bytes. In FIG. 3B, references 133 and 134 also override the machine length to four bytes for integers.

In FIG. 3B, the system-level descriptor 120 denotes one more field in the SQL communication area than the previous system-level descriptor 100 for SQLAM3. One more value therefore is returned to status in a CA of user data at this higher level. It is contemplated in the practice of the invention that, during connection processing, each of the requesting and receiving DBMSs tells the other what to expect. Preferably, the systems agree to operate at the same system descriptor level, and agree to the lower one.

The group description, 104 in FIG. 3A and 124 in FIG. 3B, is followed by another marker, 106 in FIG. 3A and 126 in FIG. 3B. These markers indicate that a row description for an SQL communication area is following. In each case, the row descriptor references the group described earlier, 110 in FIG. 3A and 130 in FIG. 3B. This vectorizes all of the fields and completes the definition of information which can be exchanged.
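The relationship between grouping descriptors, machine-level references, and level agreement might be sketched as below. This is a hedged illustration only: the dictionary layout and the function names are hypothetical, the two-field SQLAM3 group follows the Integer-4 and Character-80 members named above, and the type of the additional SQLAM4 field is an assumption (the text says only that one more field is present and that integers are overridden to four bytes).

```python
# Hypothetical system-level descriptors: level name -> SQLCAGRP member list.
# Each member is (MD referenced in the machine descriptor, overriding length).
SYSTEM_DESCRIPTORS = {
    "SQLAM3": [("INTEGER", 4), ("CHARACTER", 80)],
    # The extra field's type is assumed for illustration.
    "SQLAM4": [("INTEGER", 4), ("CHARACTER", 80), ("INTEGER", 4)],
}
LEVEL_ORDER = ["SQLAM3", "SQLAM4"]  # lowest level first

# Hypothetical TD labels for the PS/2 machine-level descriptor.
PS2_MACHINE = {"INTEGER": "binary, low-to-high", "CHARACTER": "code page 437"}


def bind_group(level, machine_descriptor):
    """Resolve each group member to (TD, length) for one machine."""
    return [(machine_descriptor[md], length)
            for md, length in SYSTEM_DESCRIPTORS[level]]


def agree_level(level_a, level_b):
    """Both sides operate at the lower of the two offered levels."""
    return min(level_a, level_b, key=LEVEL_ORDER.index)


def row_length(level):
    """A row descriptor lays the group out as a linear vector of fields."""
    return sum(length for _, length in SYSTEM_DESCRIPTORS[level])
```

For example, `bind_group("SQLAM3", PS2_MACHINE)` yields the machine-specific layout of the communication area, while the same system-level entry bound to a different machine descriptor would yield a different layout without any change to the system-level descriptor itself.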
In this row form, the SQL communication area can be exchanged on commands which do not involve user data, but which do involve system status information. The row descriptors 108 and 128 are identical in both levels; they refer to groups which contain differing information (104 and 124), which, in turn, reference different machine descriptors 60 and 70. The row descriptors therefore make an actual block dependent on a language level or version, and make the block specific with regard to an identified machine, as well. The system-level descriptor blocks illustrated in FIGS. 3A and 3B of either level can map onto any of the machine descriptors, independent of the order of the descriptors in the machine. The reference is to the MD entries in the machine descriptors. The generic type of field in the SQL communication area is linked to the representation of that generic type in the machine-level descriptor. As thus described, a system-level descriptor can provide a machine-independent way to specify the contents of blocks of information to be exchanged between DBMSs. Such descriptors provide both group descriptions (which will be used by higher-level descriptors) and row descriptors which describe complete objects which may be exchanged. The row descriptors included in the system-level descriptor can be referenced by another row descriptor to produce an array. This is illustrated and discussed below with reference to FIG. 4. The system-level descriptors are established early in the implementation of a DBMS and assembled into a set. The set required for any level of implementation can be given a name which any other implementation will understand. In this regard, the name of the system-level descriptor 100 in FIG. 3A is "SQLAM3". In FIG. 3B, the name of the descriptor 120 is "SQLAM4". In the practice of the invention, it is these names alone which are exchanged when a communication connection is made, thereby obviating the need to exchange the descriptors themselves. Refer now to FIG. 4 for an understanding of the structure and content of a user data descriptor and how its linkages with machine- and system-level descriptors provide a complete context which enables a receiver DBMS to understand and convert user data from non-native to native form.
The user data descriptor 200 provides a system-level- and machine-level-independent way to describe user data. It is contemplated that this descriptor would not be built until run time, since the particulars of user data are not known until after installation of a DBMS and creation and population of the relational tables in the DBMS, and until receipt of a particular request for data.

The user data descriptor 200, like the system- and machine-level descriptors discussed previously, consists of markers and other descriptors. A first marker MD 211 tags a group 212 consisting of an SQL communication area reference 213 to the system-level descriptor 100, and five references 214, 215, 216, 218, and 220 to the machine-level descriptor 70. These six references correspond to the six fields which make up a respective row of user data 20, which is reproduced for convenience in FIG. 4B. Thus, the group 212 identifies the fields within a row of the user data, field by field, tells what generic type of data is in each field, and specifies the size of the data. For example, the group 212 has a first reference SQLCAGRP-0 which indicates that the first field in a row of user data will consist of an SQL communication area. As shown in FIG. 4B, a communication area "CA" is in the first field 21 of each row of the user data 20. Similarly, the second field, field 22, in a user data row is a two-byte integer, the third field, field 23, is a three-byte character, the fourth field, field 24, is a two-byte integer, the fifth field, field 25, is a three-byte character, while the last field, field 26, is a four-byte floating point number. The six references, 213, 214, 215, 216, 218, and 220, result in a group with seven fields, two from the system-level descriptor (Integer-4 and Character-80) and five from the user data. All of these references are independent of a particular language level and a specific machine. It is true that the described data is very much dependent on these descriptive levels, but the key of this invention is that the description of the data is not. Returning to the explanation of the user data descriptor 200 in FIG. 4A, the next MD marker 224 tags an RD row descriptor with a reference 226 to the marker 211. This reference is used to define a row of elements in the row of user data.
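The way six references expand into seven fields can be sketched as follows. This is an illustrative reading of the group 212, not the actual descriptor encoding: the tuple layout, the dictionary `SYSTEM_GROUPS`, and the function `flatten_group` are hypothetical, and the generic type name "Float" is assumed for the floating point field.

```python
# Group defined by the system-level descriptor (Integer-4 and Character-80).
SYSTEM_GROUPS = {"SQLCAGRP": [("Integer", 4), ("Character", 80)]}

# Group 212 of the user data descriptor: one reference to the system-level
# group followed by five references to machine-level types.
USER_DATA_GROUP = [
    ("SQLCAGRP", 0),   # SQL communication area, length taken from the group
    ("Integer", 2),    # field 22
    ("Character", 3),  # field 23
    ("Integer", 2),    # field 24
    ("Character", 3),  # field 25
    ("Float", 4),      # field 26 (type name assumed for illustration)
]


def flatten_group(group, system_groups):
    """Expand group references into an ordered list of (type, length) fields."""
    fields = []
    for name, length in group:
        if name in system_groups:
            # A reference to a system-level group contributes all its members.
            fields.extend(system_groups[name])
        else:
            fields.append((name, length))
    return fields


def row_length(fields):
    """Total number of bytes occupied by one row of the flattened fields."""
    return sum(length for _, length in fields)
```

Under these assumptions, `flatten_group(USER_DATA_GROUP, SYSTEM_GROUPS)` produces seven fields, and `row_length` of the result is 98 bytes (84 bytes of communication area followed by 14 bytes of user data).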
As explained above, such a row would include seven fields, the two fields of the SQL communication area followed by the five user data fields, in order. This completely enumerates the fields for one object which can be exchanged between heterogeneous DBMSs. It includes information in the communication area as to the status related to the request which fetched the data, along with the data. MD 224 is followed by an RD 232 with a modifier "1", which indicates that one copy of each element of the defined group should be included in the final row. The final MD 228 tags an RD 234 with a reference 230 to the row descriptor 232. This descriptor is used to make an array or table of the set of rows comprising the user data 20 of FIG. 4B. The modifier "0" in the RD 234 indicates that the referenced row should be repeated as many times as necessary to include all the data which follows. This is of significant importance in queries since it is seldom possible to predict the final number of rows in an answer set before sending the first one (and the user data descriptor) to the receiver. In summary, the user data descriptor provides a mode of describing data which is independent of the system-level and machine-level artifacts in the described data. The data described includes DBMS status information and the user data proper. The user data descriptor thus accommodates the definition of any data which a user can retrieve from (or send to) a DBMS. When the user data descriptor is bound to both a system-level descriptor and a machine-level descriptor, actual data can be understood in physical terms. Thus, the references (bindings) to the machine-level descriptor 70 and system-level descriptor 100 give physical meaning to the referencing members of the group 212. It is asserted that these references can be implemented conventionally by standard binding techniques. FIG. 5 illustrates a procedure for establishing a conversion context using the data objects whose structures and functions have been described above. The procedure includes generic commands for requesting and completing a communication connection and for requesting and providing data. The procedure of FIG.
5 is illustrated as a simplified sequence which shows only the parameters which must, of necessity, be exchanged for transferring user data to a requesting machine from a sending machine and establishing translation contexts in the machines for conversion of the data. It is reiterated that the procedure of the invention is not concerned with how data is converted, but rather with where the data is converted. Particularly, the invention enables the receiver of the data to convert the data for establishing particular values for those parameters. The structure of FIG. 5 places all of the requesting machine actions in the left-hand side of the drawing and all of the sending machine actions in the right-hand side. Thus, the requesting machine first receives a user request for data which is maintained by a DBMS in the sending machine. Next, the requesting machine executes a REQUEST-CONNECTION command sequence directed to the sending machine. The REQUEST-CONNECTION sequence includes parameters to identify the requesting machine, its character code, and its language level. In the example, the requesting machine is identified as a DBMS executing on a PS/2 personal computer, utilizing a specific code page (437) for character coding, and operating at a specific DBMS language level (SQLAM3). The parameters in the REQUEST-CONNECTION step are no more than identifiers. Next, in step 250, the sending machine receives the connection request, validates the machine- and system-level identifiers, and uses these identifications to index to machine- and system-level descriptors. Assuming that the sending machine recognizes and possesses the machine- and system-level descriptors identified in the connection request, it executes a COMPLETE-CONNECTION step including parameters which provide machine- and system-level identification to the requesting machine. In step 252, the COMPLETE-CONNECTION response received from the sending machine is validated in the requesting machine and the identifiers are used to index the appropriate machine- and system-level descriptors. For example, in the example of FIG. 5, the requesting machine would index to the machine and system descriptors 70 and 100 illustrated in FIG.
2B, 3B, and 4A. Next, in step 254, the requesting machine assembles a data request and sends it as a REQUEST-DATA command to the sending machine. In step 256, the sending machine responds to the REQUEST-DATA command by obtaining the requested data and building a user-data descriptor such as the descriptor 200 illustrated in FIG. 4A, and executes a PROVIDE-DATA command by sending the user-data descriptor and the user data as, for example, the first row in the user data 20 in FIGS. 1B and 4B. The requesting machine then, in step 258, receives the transmitted descriptor and user data and calls a conversion process to convert the data. According to the needs of a particular implementation, the bindings between the user data descriptor and the indexed machine and system descriptors can be done in step 258 or left for the called conversion process. In FIG. 5 a second REQUEST-DATA command is sent, and its parameters are processed, with the server, in step 260, obtaining the data. The remaining user data is returned with another PROVIDE-DATA command having as its parameters the user data. The requesting machine again processes the data according to the user data descriptor received in step 258 and to the system level, machine type, and code page values received during the connection portion of the process. Now if the receiver requires more data, it issues another REQUEST-DATA command, with the server returning the data with another PROVIDE-DATA command, and so on. An architecture for implementing the procedure of FIG. 5 is illustrated in FIG. 6. In FIG. 6, the requesting machine is assumed to be, as set out above, a PS/2 personal computer, while the server is assumed to be a mainframe of a System/370 type. The receiver is identified by reference numeral 280, the server by 282. It is asserted that the receiver 280 is conventionally structured with a CPU, storage, a communications adapter for communicating with the sending machine 282, and a user interface for receiving commands from and providing responses to a user. In the storage reside an application program 300 and a relational DBMS of the SQL type which includes an SQL application manager 306 and a communications agent 310. In addition, two dictionaries, 342 and 346, are stored which include machine descriptors (dictionary 342) and system-level descriptors (dictionary 346). The sending machine 282 can comprise a conventionally-programmed System/370 mainframe computer with an SQL-type DBMS 334.
An SQL application manager 318 interfaces a communication agent 314 with the DBMS 334. Two dictionaries 322 and 326 are provided for storing system-level descriptors (dictionary 322) and machine-level descriptors (dictionary 326).
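The connection portion of FIG. 5, in which only identifiers are exchanged and each machine indexes its own dictionaries, might be sketched as follows. This is a simplified model under stated assumptions, not the disclosed implementation: the classes `ConnectionParams` and `Site`, their method names, and the function `connect` are hypothetical, and the dictionary contents are left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionParams:
    """Identifiers carried by REQUEST-CONNECTION and COMPLETE-CONNECTION."""
    mach: str       # machine type identifier
    code_page: str  # character code identifier
    dbms: str       # system-level descriptor name, e.g. "SQLAM3"


class Site:
    """One end of the connection, holding its identifiers and dictionaries."""

    def __init__(self, own, machine_dictionary, system_dictionary):
        self.own = own
        self.machine_dictionary = machine_dictionary
        self.system_dictionary = system_dictionary
        self.peer_descriptors = None  # set once the peer has been validated

    def _index(self, params):
        # Validate the received identifiers by indexing the local dictionaries.
        key = (params.mach, params.code_page)
        if key not in self.machine_dictionary:
            raise ValueError(f"unknown machine/code page: {key}")
        if params.dbms not in self.system_dictionary:
            raise ValueError(f"unknown system level: {params.dbms}")
        return self.machine_dictionary[key], self.system_dictionary[params.dbms]

    def on_request_connection(self, params):
        # Sending machine (step 250): validate, index, reply with own identifiers.
        self.peer_descriptors = self._index(params)
        return self.own

    def on_complete_connection(self, params):
        # Requesting machine (step 252): validate and index the reply.
        self.peer_descriptors = self._index(params)


def connect(requester, sender):
    """Run both connection steps; only identifiers cross the link."""
    reply = sender.on_request_connection(requester.own)
    requester.on_complete_connection(reply)
```

In this sketch the descriptors themselves never travel between the two `Site` objects; each side resolves the other's identifiers locally, which is the point made in the description about exchanging names rather than descriptors.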

The process of FIG. 5 is initiated when the application program 300 issues a database management request. For example, the request can be an SQL statement OPEN. This request results in a call 304 to the application manager 306. The receiver's application manager 306 first has to establish a connection with the desired database management system. This is accomplished by constructing a REQUEST-CONNECTION command in accordance with FIG. 5, with parameter values which describe itself (machine and code page designations) and the level of DBMS services desired (in this case, SQLAM3), as well as parameters (not shown) which identify the sending DBMS. This command is passed at 308 to the communications agent 310 which is responsible for establishing the communication link with the identified sending machine. The REQUEST-CONNECTION command is sent over a conventional communications link 312 to the communications agent 314 responsible for the other end of the communication channel. The agent 314 examines the DBMS parameter (FIG. 5) to determine which level of server SQL application manager to invoke. Once the determination and invocation are completed, the agent 314 forwards at 316 the request to the invoked SQL application manager 318. The manager 318 accesses at 320 the DBMS level dictionary 322 to obtain a descriptor which matches the receiver's desired level of function. (In the preferred implementation, the content of this descriptor is known early in the development cycle and for processing efficiency is "built into" the application manager 318.) Similarly, the manager 318 accesses at 324 the identified machine level descriptor from the dictionary 326. This includes descriptors matching the machine designated in the MACH parameter of the REQUEST-CONNECTION command and a code page identified in the CODE PAGE parameter.
Having verified that the receiver's environment can be handled, the application manager 318 constructs a COMPLETE-CONNECTION command as illustrated in FIG. 5. The command states the sending machine's characteristics in the MACH and CODE PAGE parameters and the level of system function which the sending machine will support in the DBMS parameter. It is contemplated in the invention that the connection steps may repeat in order to support negotiation to a different system level than that originally requested. Next, the COMPLETE-CONNECTION response is returned to the requesting machine 280 by sending it on 328 to the agent 314 which communicates it on the communication link 312 to the receiver agent 310. The receiver agent 310 sends the COMPLETE-CONNECTION command and parameters at 330 to the application manager 306. In a manner similar to the manager 318, the manager 306 accesses the system and machine-level descriptors which describe the server machine and system characteristics from the dictionaries 342 and 346. Having validated that connection has been established, and having obtained the machine- and system-level descriptors at both ends of the connection, processing continues. The receiver's application manager 306 constructs a REQUEST-DATA command corresponding to the OPEN request from the application program 300. This command is sent to the server's application manager 318 via 308, 310, 312, 314, and 316. The application manager 318 recognizes a "first request" and issues an OPEN command 332 to the local DBMS 334, which returns the status in the form of an SQL communication area 336. Next, the manager 318 issues a request on 332 to determine the format of the answer set. The format of the answer set is returned on 336 to the server application manager 318 which constructs the user data descriptor for the data. Assuming that data is buffered into the manager 318 and that there is room in the buffer, the server's application manager 318 issues an SQL FETCH request on 332 to the DBMS 334 which returns a first row of data on 336. (It is assumed that the data to be returned corresponds to the user data 20 in FIGS. 1A and 4B.) Next, the server application manager 318 places the data in a reply buffer and issues a PROVIDE-DATA command to send the data to the receiver's application manager 306 via 328, 314, 312, 310, and 330.
It is contemplated, but not required, that the server's application manager 318 may read ahead to fill buffers in anticipation of the next REQUEST-DATA command. Upon receiving data with the first PROVIDE-DATA command, the receiver's application manager 306 processes the user data descriptor to verify correctness and to prepare to convert data subsequently received. Assuming no errors are found, the application manager 306 then sends the result of the OPEN back to the application program 300 on 338. From now on, the receiver's application manager 306 may, but is not required to, read ahead requesting additional buffers from the server's application manager 318 in anticipation of FETCH requests. Having successfully opened a cursor, the application program 300 issues an SQL FETCH which is processed by the receiver's application manager 306. At this time, the receiver's application manager 306 has a row of data to return, and converts the data and returns it on 338 to the application program 300. The application 300 processes the data and SQL communication area received and subsequently issues another FETCH command on 304 to obtain another row of data. The receiver's application manager 306, having no data to satisfy this request, sends another REQUEST-DATA command to the server's application manager 318 via 308, 310, 312, 314, and 316. The server's application manager 318, having read ahead, has a buffer containing the last two rows of data. These rows are sent with the PROVIDE-DATA command to the receiver's application manager 306 via 328, 314, 312, 310, and 330. The application manager 306 then converts the first row of the buffer using the previously constructed conversion descriptors and passes it on 338 to the application 300. To complete the example, the application 300 issues another SQL FETCH request on 304 to request the last row of the answer set. The receiver's application manager 306 converts this last row of data and returns it to the application. In this example, end-of-query is understood by the application by content of the result achieved. Additional FETCH requests would be required as would an SQL communication area indicating end-of-query, which would have been constructed to allow the DBMS 334 to signal this condition to the application.
However, this is beyond the scope of this invention and is not discussed. A data converter is not illustrated as an element in any of the figures, it being understood that data format conversion is well-known in the art. For example, U.S. Patent 4,559,614 of Peek et al., assigned to the Assignee of this application, describes in detail how conversion from a first to a second internal code format is accomplished for data transmitted between two dissimilar computer systems. To the extent necessary, this patent is incorporated herein by reference.
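Although the conversion itself is outside the invention, a short sketch may help show how a receiver could apply an established context to one row. It is an assumption-laden illustration, not the method of the referenced patent: the function `convert_row`, the context dictionaries, and the choice of EBCDIC code page 037 for the System/370 side are hypothetical (only code page 437 for the PS/2 is named in the description), and only integer and character fields are handled.

```python
# Hypothetical machine contexts: byte order and a Python codec name.
SYSTEM_370 = {"byte_order": "big", "codec": "cp037"}    # code page assumed
PS2 = {"byte_order": "little", "codec": "cp437"}        # code page 437


def convert_row(raw, fields, source, target):
    """Convert one row of bytes from the source to the target representation.

    fields is an ordered list of (generic type, length) pairs. When source
    and target contexts match, the row is returned untouched, so no
    conversion work is done for matching systems.
    """
    if source == target:
        return raw
    out = bytearray()
    offset = 0
    for field_type, length in fields:
        chunk = raw[offset:offset + length]
        offset += length
        if field_type == "Integer":
            # Reinterpret the integer in the target machine's byte order.
            value = int.from_bytes(chunk, source["byte_order"], signed=True)
            out += value.to_bytes(length, target["byte_order"], signed=True)
        elif field_type == "Character":
            # Re-encode characters from the source to the target code page.
            out += chunk.decode(source["codec"]).encode(target["codec"])
        else:
            raise ValueError(f"unsupported field type: {field_type}")
    return bytes(out)
```

For instance, a two-byte integer followed by a three-byte character field sent from the System/370 context would have its integer bytes reversed and its characters re-encoded for the PS/2 context, while a row exchanged between two identical contexts would pass through unchanged.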

It will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without departing from the spirit and scope of the invention. Accordingly, the scope of protection of this invention is limited only by the following claims and their equivalents.

Claims (28)

1. A combination for establishing a data conversion context at a first computer system which receives data from a second computer system, wherein the second computer system stores and processes data in a format different than the first computer system, the combination comprising: (a) descriptors in the first computer system defining machine and language characteristics of a plurality of computer systems; (b) means for obtaining from said descriptors, first and second descriptors which respectively define machine and language characteristics for said second computer system; (c) means for receiving data from the second computer system; and (d) means for combining the first and second descriptors with data descriptors in the received data which describe characteristics of data native to the second computer system to produce a context for converting the received data to data which is native to the first computer system.

2. A method for use by a user processor in making data requests to a server processor which stores and processes data in a format different than the user processor, comprising the user processor executed steps of: storing a dictionary of processor descriptors; creating a communication link to a server processor by: communicating to the server processor an identifier denoting the user processor; and receiving from the server processor an identifier denoting the server processor; sending a request for data to the server processor in the user processor data format, the request including a descriptor of the user processor data format; receiving from the server processor data responsive to the request for data, the data being in the server processor data format, and being accompanied by a descriptor of the server processor data format; and using the identifier denoting the server processor and the descriptor of the server processor data format, converting the data to the user processor data format.

3. The method of claim 2, wherein the step of creating a communication link further includes validating the identifier denoting the server processor.

4. The method of claim 2, wherein the identifier denoting the server processor includes identification of a server processor machine descriptor.

5. The method of claim 4, wherein: the storing step includes storing a plurality of server processor machine descriptors at the user processor; and the step of converting includes: obtaining a server processor machine descriptor from the dictionary in response to the identifier denoting the server processor; and in response to the server processor machine descriptor, converting bit representations in the data received from the server processor to bit representations used by the user processor.

6. The method of claim 2, wherein the identifier denoting the server processor includes identification of a computer language descriptor.

7. The method of claim 6, wherein: the storing step includes storing a plurality of computer language descriptors in the dictionary; and the step of converting includes: obtaining a computer language descriptor from the dictionary in response to the identifier denoting the server processor; and in response to the computer language descriptor, converting control fields in the data received from the server processor to control fields of a computer language used by the user processor.

8. The method of claim 2, wherein the identifier denoting the server processor includes identification of a server processor code page.

9. The method of claim 8, wherein: the storing step includes storing a plurality of server processor code page descriptors in the dictionary; and, the step of converting includes: obtaining a server processor code page descriptor from the dictionary in response to the identifier denoting the server processor; and in response to the code page descriptor, converting character representations in the data received from the server processor to character representations used by the user processor.

10. The method of claim 8, wherein: the storing step includes storing a plurality of server processor code page descriptors in the dictionary; and, the step of converting includes: obtaining a code page descriptor from the dictionary in response to the identifier denoting the server processor; and in response to the code page descriptor, converting control function representations in the data received from the server processor to control function representations used by the user processor.

11. A combination for use by a user processor in making data requests to a server processor which stores and processes data in a format different than the user processor, the combination comprising, in the user processor: means for creating a communication link to a server processor by: communicating to the server processor an identifier denoting the user processor; and receiving from the server processor an identifier denoting the server processor; a dictionary of server processor descriptors; means for sending a request for data to the server processor on the communication link in the user processor data format, the request including a descriptor of the user processor data format; means for receiving from the server processor data responsive to the request for data, the data being in the server processor data format and being accompanied by a descriptor of the server processor data format; and means for converting the data to the user processor data format in response to the identifier denoting the server processor and the descriptor of the server processor data format.

12. The combination of claim 11, wherein the means for creating a communication link is further for validating the identifier denoting the server processor.

13. The combination of claim 11, wherein the identifier denoting the server processor includes identification of a server processor machine descriptor.

14. The combination of claim 13, wherein: the server processor descriptors include a plurality of server processor machine descriptors; and, the means for converting includes: means for obtaining a server processor machine descriptor from the dictionary in response to the identifier denoting the server processor; and means for converting bit representations in the data received from the server processor to bit representations used by the user processor in response to the server processor machine descriptor.

15. The combination of claim 11, wherein the identifier denoting the server processor includes identification of a computer language descriptor.

16. The combination of claim 15, wherein: the server processor descriptors include a plurality of computer language descriptors; and, the means for converting includes: means for obtaining a computer language descriptor from the dictionary in response to the identifier denoting the server processor; and means for converting control fields in the data received from the server processor to control fields of a computer language used by the user processor.

17. The combination of claim 11, wherein the identifier denoting the server processor includes identification of a server processor code page.

18. The combination of claim 17, wherein: the server processor descriptors include a plurality of server processor code page descriptors; and, the means for converting includes: means for obtaining a server processor code page descriptor from the dictionary in response to the identifier denoting the server processor; and means for converting character representations in the data received from the server processor to character representations used by the user processor in response to the code page descriptor.

19. The combination of claim 17, wherein: the server processor descriptors include a plurality of code page descriptors; and, the means for converting includes: means for obtaining a code page descriptor from the dictionary in response to the identifier denoting the server processor; and means for converting control function representations in the data received from the server processor to control function representations used by the user processor in response to the code page descriptor.

20. In a system including a first database system for managing a first database including data having a first data format and a second database system for managing a second database including data having a second data format, a method for converting data transmitted between the first and second database systems, the method including the steps of: storing at the first database system and at the second database system respective sets of machine descriptors describing machine data formats and character code sets, and system descriptors describing system language characteristics; sending a request for connection from the first database system to the second database system, the request for connection including information identifying a first machine on which the first database system executes, a character code used by the first machine, and a system language used by the first database system; in response to the request for connection: validating a first machine descriptor and a first system descriptor stored at the second database system, the first machine descriptor describing a machine data format and a character code used by the first machine and the first system descriptor describing a system language used by the first database system; and sending a connection response from the second database system to the first database system, the connection response including information identifying a second machine on which the second database system executes, a character code used by the second machine, and a system language used by the second database system; in response to the connection response, validating a second machine descriptor and a second system descriptor stored at the first database system, the second machine descriptor describing a machine data format and character code used by the second machine and the second system descriptor describing a system language used by the second database system; sending from the first database system to the second database system a database query command containing data in the first data format; at the second database system, converting the data in the database query command into the second data format using the validated first machine descriptor and first system descriptor; at the second database system, obtaining resulting data from the second database in response to the database query command and the data in the second data format; transmitting the resulting data to the first database system without converting the resulting data; and at the first database system, converting the resulting data from the second data format to the first data format using the second machine descriptor and second system descriptor.

21. The method of claim 20, wherein the system further includes a first database system unit connected between the first database system and the second database system, wherein: the step of sending the database query command includes receiving the database query command at the first database system unit and sending the database query command from the first database system unit to the second database system without converting the data in the database query command into a data format native to the first database system unit.

22. The method of claim 21, wherein the system further includes a second database system unit connected between the first database system and the second database system, wherein: the step of transmitting the resulting data includes receiving the resulting data at the second database system unit and sending the resulting data from the second database system unit to the first database system without converting the data format of the resulting data at the second database system unit.

23. The method of claim 20, wherein the first database system is a first version of a database management system and the second database system is a second version of the database management system, the first and second versions being non-identical.

24. The method of claim 20, wherein the first database system is for executing on a first digital computer which represents data in a first internal format and the second database system is for executing on a second digital computer which represents data in a second internal format, the first and second internal formats being non-equivalent.

25. In a distributed database system including a plurality of database managers, each database manager for executing in a respective digital computer, a method for minimizing conversion of data transferred between database managers of the plurality of database managers, comprising the steps of:

storing at a first database manager and at a second database manager respective sets of machine descriptors describing machine data formats and character code sets, and system descriptors describing system language characteristics;

sending a request for connection from the first database manager to the second database manager, the request for connection including information identifying a first machine on which the first database manager executes, a character code used by the first machine, and a system language used by the first database manager;

in response to the request for connection:

validating a first machine descriptor and a first system descriptor stored at the second database manager, the first machine descriptor describing a machine data format and a character code used by the first machine and the first system descriptor describing a system language used by the first database manager; and

sending a connection response from the second database manager to the first database manager, the connection response including information identifying a second machine on which the second database manager executes, a character code used by the second machine, and a system language used by the second database manager;

in response to the connection response, validating a second machine descriptor and a second system descriptor stored at the first database manager, the second machine descriptor describing a machine data format and character code used by the second machine and the second system descriptor describing a system language used by the second database manager;

sending a database query command containing input data from the first database manager to the second database manager, the input data having a first data format used by the first database manager;

at the second database manager, if the second database manager uses a data format which is non-equivalent to the first data format used by the first database manager, using the validated first machine descriptor and first system descriptor to convert the input data from the first data format into a second data format used by the second database manager, otherwise, leaving the input data in the first data format;

at the second database manager, obtaining resulting data from a database in the data format used by the second database manager;

sending the resulting data in the data format used by the second database manager to the first database manager; and

at the first database manager, if the data format used by the first database manager is not equivalent to the data format used by the second database manager, using the validated second machine descriptor and second system [language] descriptor to convert the resulting data into the first data format used by the first database manager.
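The connection exchange and the conditional conversion recited in claim 25 can be illustrated with a minimal sketch. It is not the claimed implementation: the class and function names, the descriptor tables, the identifier strings, and the use of Python codec names as stand-ins for character codes are all assumptions made for illustration. Only a single 4-byte integer field is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineDescriptor:
    byte_order: str  # "little" or "big"
    char_code: str   # Python codec name standing in for a character code


# Hypothetical descriptor sets, stored identically at every database manager.
MACHINE_DESCRIPTORS = {
    "PS/2": MachineDescriptor("little", "cp437"),
    "System/370": MachineDescriptor("big", "cp037"),
}
SYSTEM_DESCRIPTORS = {"SQL-LEVEL-1": {"level": 1}}


class DatabaseManager:
    def __init__(self, machine, system):
        self.machine = machine
        self.system = system
        self.peer_machine = None  # validated machine descriptor of the partner
        self.peer_system = None   # validated system descriptor of the partner

    def identify(self):
        # Only identifiers are exchanged, not the descriptors themselves.
        own = MACHINE_DESCRIPTORS[self.machine]
        return {"machine": self.machine, "char_code": own.char_code,
                "system": self.system}

    def validate(self, info):
        # KeyError: the partner names a descriptor this site does not store.
        machine = MACHINE_DESCRIPTORS[info["machine"]]
        system = SYSTEM_DESCRIPTORS[info["system"]]
        if machine.char_code != info["char_code"]:
            raise ValueError("character code does not match stored descriptor")
        self.peer_machine, self.peer_system = machine, system

    def receive_integer(self, raw):
        """Return (bytes in the local format, whether conversion happened)."""
        own = MACHINE_DESCRIPTORS[self.machine]
        if self.peer_machine.byte_order == own.byte_order:
            return raw, False   # equivalent formats: leave the data as sent
        return raw[::-1], True  # non-equivalent: reverse the byte order


def connect(first, second):
    second.validate(first.identify())  # request for connection
    first.validate(second.identify())  # connection response


first = DatabaseManager("PS/2", "SQL-LEVEL-1")
second = DatabaseManager("System/370", "SQL-LEVEL-1")
connect(first, second)

sent = (1986).to_bytes(4, "little")  # first manager's native format
received, converted = second.receive_integer(sent)
print(int.from_bytes(received, "big"), converted)
```

In this sketch the sender never converts; the receiver decides from the validated descriptors whether any work is needed, so two managers on matching machines exchange the bytes untouched.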

26. In a system in which a user database manager controls a first database and a server database manager controls a second database, a method for minimizing the conversion of data communicated between the user and server database managers, the method including the steps of:

storing at the user database manager and at the server database manager respective sets of machine descriptors describing machine data formats and character code sets, and system descriptors describing system language characteristics;

sending a request for connection from the user database manager to the server database manager, the request for connection including information identifying a first machine on which the user database manager executes, a character code used by the first machine, and a system language used by the user database manager;

in response to the request for connection:

validating a first machine descriptor and a first system descriptor stored at the server database manager, the first machine descriptor describing a machine data format and a character code used by the first machine and the first system descriptor describing a system language used by the user database manager; and

sending a connection response from the server database manager to the user database manager, the connection response including information identifying a second machine on which the server database manager executes, a character code used by the second machine, and a system language used by the server database manager;

in response to the connection response, validating a second machine descriptor and a second system descriptor stored at the user database manager, the second machine descriptor describing a machine data format and character code used by the second machine and the second system descriptor describing a system language used by the server database manager;

sending from the user database manager to the server database manager a command containing input data, the input data having a data format native to the user database manager;

at the server database manager, converting the input data from the data format native to the user database manager into a data format native to the server database manager using the validated first machine descriptor and first system descriptor;

at the server database manager, executing the command by processing the input data after converting the input data;

in response to executing the command, obtaining resulting data from the second database, the resulting data having the data format native to the server database manager;

sending the resulting data in the data format native to the server database manager to the user database manager; and

at the user database manager, converting the resulting data from the data format native to the server database manager into the format native to the user database manager using the validated second machine descriptor and second system descriptor.
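As a rough illustration of the receiver-side conversion in claim 26, the sketch below converts one record from a sender's native format into a receiver's native format using two machine descriptors. The record layout (one 32-bit integer followed by an 8-byte character field), the descriptor dictionaries, and the function name are assumptions for illustration; the byte-order prefixes and the `cp437` and `cp037` codecs are standard Python facilities used here as stand-ins for machine data formats and character codes.

```python
import struct

# Hypothetical machine descriptors: a struct byte-order prefix and a codec name.
USER_MACHINE = {"byte_order": "<", "char_code": "cp437"}    # low-to-high order
SERVER_MACHINE = {"byte_order": ">", "char_code": "cp037"}  # high-to-low, EBCDIC

RECORD_LAYOUT = "i8s"  # assumed layout: 32-bit integer, 8-byte character field


def convert_record(raw, source, target):
    """Convert one record from the source machine's format to the target's."""
    number, text = struct.unpack(source["byte_order"] + RECORD_LAYOUT, raw)
    # Re-encode the characters; both codecs are single-byte, so length is kept.
    text = text.decode(source["char_code"]).encode(target["char_code"])
    return struct.pack(target["byte_order"] + RECORD_LAYOUT, number, text)


# Input data leaves the user database manager in its native format.
user_record = struct.pack("<i8s", 1986, "DATE    ".encode("cp437"))

# The server converts on receipt, using the validated descriptor of the user.
server_record = convert_record(user_record, USER_MACHINE, SERVER_MACHINE)

# Resulting data travels back in the server's format; the user converts it.
round_trip = convert_record(server_record, SERVER_MACHINE, USER_MACHINE)
```

Each side converts only what it receives, so data is sent in the sender's native format in both directions.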

27. In a system including a processor system for processing data having a first data format and a database system for managing a database including data having a second data format, a combination for converting data transmitted between the processor system and the database system, the combination including:

processor system storage storing a set of machine descriptors describing machine data formats, a character code set, and a set of system descriptors describing system language characteristics;

database system storage storing a set of machine descriptors describing machine data formats, a character code set, and a set of system descriptors describing system language characteristics;

a communications link connecting the processor system and the database system for communication;

means in the processor system for sending a request for connection from the processor system to the database system, the request for connection including information identifying a first machine on which the processor system executes, a character code used by the first machine, and a system language used by the processor system;

means in the database system responsive to the request for connection for validating a first machine descriptor and a first system descriptor stored in the database system storage, the first machine descriptor describing a machine data format and a character code used by the first machine and the first system descriptor describing a system language used by the processor system;

means in the database system for sending a connection response from the database system to the processor system, the connection response including information identifying a second machine on which the database system executes, a character code used by the second machine, and a system language used by the database system;

means in the processor system responsive to the connection response for validating a second machine descriptor and a second system descriptor stored in the processor system storage, the second machine descriptor describing a machine data format and a character code used by the second machine and a second system descriptor describing a system language used by the database system;

means in the processor system responsive to the connection response for sending a database query command containing data in the first data format from the processor system to the database system;

means in the database system for:

converting the database query command into the second data format using the validated first machine descriptor and first system descriptor;

obtaining resulting data from the database in response to the database query command and the data in the second format; and

transmitting the resulting data to the processor system without converting the resulting data; and

means in the processor system for converting the resulting data from the second data format to the first data format using the validated second machine descriptor and second system descriptor.

28. The combination of claim 27, further including:

a router machine connected in the communications link between the processor system and the database system, the router machine including:

means for receiving the database query command;

means for determining that no conversion of the data in the database query command is necessary at the router machine; and

means for sending the database query command to the database system without converting the data in the database query command.
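The pass-through behavior of the router machine in claim 28 might be sketched as follows. The command structure, the `Router` class, and the rule that a node converts only when it is the command's destination are assumptions for illustration, not details taken from the claims.

```python
class Router:
    """Intermediate node that forwards commands without touching their data."""

    def __init__(self, name, deliver):
        self.name = name
        self.deliver = deliver  # callable handing a command to the next node

    def conversion_needed(self, command):
        # Assumed rule: only the node that will process the data converts it.
        return command["destination"] == self.name

    def forward(self, command):
        if self.conversion_needed(command):
            raise ValueError("command is addressed to the router itself")
        # Pass the command on as received, still in the sender's data format.
        self.deliver(command)


delivered = []  # stands in for the database system's receive queue
router = Router("router-1", delivered.append)

command = {
    "destination": "database-system",
    "source_machine": "PS/2",                 # lets the receiver pick a descriptor
    "payload": (1986).to_bytes(4, "little"),  # data in the sender's format
}
router.forward(command)
```

Because the router never interprets the payload, it needs no descriptors of its own, and the data is converted at most once, at the database system.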