SGML and XML are metalanguages - languages for describing other languages - which let users design their own customized markup languages for limitless different types of documents.

SGML is very large, powerful, and complex. It has been in heavy industrial and commercial use for over a decade, and there is a significant body of expertise and software to go with it. XML is a lightweight cut-down version of SGML which keeps enough of its functionality to make it useful but removes all the optional features which make SGML too complex to program for in a Web environment.

HTML is just one of the SGML or XML applications, the one most frequently used on the Web.

The Web is becoming much more than a static library. Increasingly, users are accessing the Web for 'Web pages' that aren't actually on the shelves. Instead, the pages are generated dynamically from information available to the Web server. That information can come from databases on the Web server, from the site owner's enterprise databases, or even from other Web sites.

And that dynamic information needn't be served up raw. It can be analyzed, extracted, sorted, styled, and customized to create a personalized Web experience for the end-user. To coin a phrase, web pages are evolving into web services.

For this kind of power and flexibility, XML is the markup language of choice. You can see why by comparing XML and HTML. Both are based on SGML - but the difference is immediately apparent:

Both of these may look the same in your browser, but the XML data is smart data. HTML tells how the data should look, but XML tells you what it means. With XML, your browser knows there is a product, and it knows the model, dealer, and price. From a group of these it can show you the cheapest product or closest dealer without going back to the server.
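As a rough illustration of what "smart data" allows, here is a minimal Python sketch that picks the cheapest product from a small XML listing without any further request to a server. The tag names (products, product, model, dealer, price) and the sample values are assumptions made for this example, not taken from any real catalog.

```python
import xml.etree.ElementTree as ET

# Hypothetical product listing; tag names and values are illustrative only.
CATALOG_XML = """
<products>
  <product>
    <model>Alpha 100</model>
    <dealer>Northside Audio</dealer>
    <price>249.00</price>
  </product>
  <product>
    <model>Beta 200</model>
    <dealer>City Sound</dealer>
    <price>199.50</price>
  </product>
  <product>
    <model>Gamma 300</model>
    <dealer>Harbor Hi-Fi</dealer>
    <price>329.99</price>
  </product>
</products>
"""


def cheapest_product(xml_text):
    """Return the model, dealer and price of the lowest-priced product."""
    root = ET.fromstring(xml_text.strip())
    products = root.findall("product")
    # Because each value is tagged, the price can be compared as a number.
    best = min(products, key=lambda p: float(p.findtext("price")))
    return {
        "model": best.findtext("model"),
        "dealer": best.findtext("dealer"),
        "price": float(best.findtext("price")),
    }


if __name__ == "__main__":
    print(cheapest_product(CATALOG_XML))
```

The same lookup against an HTML rendering of the list would need to guess which piece of text is the price, which is the point the comparison is making.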

Unlike HTML, with XML you create your own tags, so they describe exactly what you need to know. Because of that, your client-side applications can access data sources anywhere on the Web, in any format. New "middle-tier" servers sit between the data sources and the client, translating everything into your own task-specific XML.

But XML data isn't just smart data, it's also a smart document. That means when you display the information, the model name can be a different font from the dealer name, and the lowest price can be highlighted in green. Unlike HTML, where text is just text to be rendered in a uniform way, with XML text is smart, so it can control the rendition.

And you don't have to decide whether your information is data or documents; in XML, it is always both at once. You can do data processing or document processing or both at the same time. With that kind of flexibility, it's no wonder that we're starting to see a new Web of smart, structured information. It's a "Semantic Web" in which computers understand the meaning of the data they share.

A DTD is a formal description in XML Declaration Syntax of a particular type of document. It sets out what names are to be used for the different types of element, where they may occur, and how they all fit together.

The XML Specification explicitly says XML uses ISO 10646, the international standard 31-bit character repertoire which covers most human (and some non-human) languages. This is currently congruent with Unicode and is planned to be a superset of Unicode.

An XML parser is a software module that reads XML documents and gives applications access to their content. It typically builds a structured tree and returns the result to the calling application, such as a browser. In that sense a parser acts as a processor that determines the structure and properties of the data, and its output can then be used to produce a display on screen. There are numerous parsers available and some of these are listed below:

The Xerces Java Parser
The Xerces Java Parser is primarily used for building XML-aware web servers and for ensuring the integrity of e-business data expressed in XML.

XP and XT

XP is an XML parser and XT is an XSLT processor. Both are written in Java, and James Clark contributed them to the community. XP detects all documents that are not well-formed. It aims to be the fastest conformant XML parser in Java and gives high performance. XT, on the other hand, implements XSL transformations and works on top of a parser such as XP.

SAX

The Simple API for XML (SAX) was originated by the members of a public mailing list (XML-DEV). It gives an event-based approach to XML parsing: instead of going from node to node, it goes from event to event. Events include encountering an XML tag, detecting an error and so on.

It is well suited to small XML parsers and to applications that need speed. It should be used when the input must be processed quickly and economically.
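To make the event-based idea concrete, here is a small sketch using the SAX support in Python's standard library (xml.sax). The handler class name TagCounter and the helper count_tags are made up for this example. The parser calls the handler once per event, and no tree is kept in memory.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Records each start/end event and counts how often each tag opens."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser each time an opening tag is encountered.
        self.counts[name] = self.counts.get(name, 0) + 1
        self.events.append(("start", name))

    def endElement(self, name):
        # Called by the parser each time a closing tag is encountered.
        self.events.append(("end", name))


def count_tags(xml_text):
    """Parse the text event by event and return the populated handler."""
    handler = TagCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler


if __name__ == "__main__":
    result = count_tags("<list><item/><item/></list>")
    print(result.counts)
    print(result.events)
```

A document that is not well-formed is reported as an error event; with the default error handler this surfaces as an xml.sax.SAXParseException.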

XML parser

It runs on any platform where there is a Java virtual machine. It is sometimes called XML4J. It has an interface which allows you to take a string of XML-formatted text, identify the XML tags and use them to extract the tagged information.

Kerberos is one of the most important authentication systems available to developers and network architects. Its aim is simple: to provide single sign-on to an environment comprising multiple systems and protocols. Kerberos allows mutual authentication and, importantly, secure encrypted communication between users and systems. It differs from many authentication systems in that it does not rely on security tokens but relies on each user or system to maintain and remember a unique password.

When a user authenticates against the local operating system, normally there is an agent running which is responsible for sending an authentication request to a central Kerberos server. This authentication server responds by sending the credentials in encrypted format back to the agent. This local agent then will attempt to decrypt the credentials using the password which has been supplied by the user or local application. If the password is correct, then the credentials can be decrypted and the user validated.

After successful validation the user is also given authentication tickets which allow them to access other Kerberos-authenticated services. In addition to this, a set of cipher keys is supplied which can be used to encrypt all the data sessions. This is important for security, which is especially relevant when a single authentication system serves a wide range of different applications and systems.

Once validation is complete, no further authentication is necessary: the ticket will allow access until it expires. So although the user does need to remember a password to authenticate, only one is required to access any number of systems and shares on the network. There are a lot of configuration options to finely tune Kerberos, particularly in a Windows environment where Kerberos is used primarily to access Active Directory resources. You can restrict access based on a whole host of factors in addition to the primary authentication. It is effective for authentication in a fluid environment where users may log on to many different systems and applications, even when these systems keep changing their IP address (note: http://www.changeipaddress.net/ )

One major reason that Kerberos has become so successful is that it is freely available. Anyone can download and use the code free of charge, which means it is widely utilised and is constantly developed and improved too. There are also many commercial implementations of Kerberos, such as those from Microsoft and IBM (Global Sign On); these normally have additional features and a management system. There have been concerns over various security flaws in Kerberos; however, because it is open source, these have been fixed in the latest implementation, Kerberos V.

There are, of course, many tools for configuring, installing and troubleshooting DNS, and many of them can make life an awful lot easier. Here are some of the most popular ones, which exist on various platforms.

nslookup

This utility is probably the oldest and most widely used DNS tool available. Its primary function is to run individual and specific queries on all manner of resource records. It is even possible to perform zone transfers using this tool, which is why it's so important.

ipconfig

This tool is often used daily to release and renew DHCP addresses. However, it can also be used to perform some DNS functions, so it's certainly a useful client tool to get to grips with. There are a couple of very useful switches which supply DNS-related functionality. The /displaydns switch will return the contents of the client resolver cache. It will show you the Record Name, Type, TTL, Data Length and RR Data. The client uses cached data to return these records until the TTL expires, at which point it will query a name server. The /flushdns switch is used for erasing the contents of the resolver cache. In troubleshooting terms this means that cached data will not be used and a fresh request will be sent to a name server. Finally, the /registerdns switch refreshes the client's DHCP leases and re-registers its DNS names.

netdiag

This is one of the most useful general diagnostic tools you will find in a Windows environment. It performs a long list of network connectivity tests, including a specific DNS test. Using the switch /test:DNS the program will check each active network card and see whether it has an A record registered in the domain. The additional switch /DEBUG can be used in conjunction with this to produce verbose output to the screen, which is extremely helpful in troubleshooting DNS issues. It can be found in the Windows support tools directory which is on the installation disks and shares. It's surprisingly useful when checking a DNS service or programs.

dnsdiag

This utility is especially useful for checking email issues that are DNS related. A DNS misconfiguration can cause all sorts of email issues, as many have experienced. It functions by simulating all the DNS-related activities which would be done by an SMTP agent when delivering email. There is a caveat in its use for this sort of diagnostic work: you'll need to run it on a computer which has either an Exchange or an SMTP agent installed locally.

Most of these tools can be used to solve a huge range of DNS-related issues, so they're worth getting to grips with. A great test is to use them with a new installation or DNS design, or to run through the tools to check out that Smart DNS free trial you got working.

If you want to write programs that can utilize DNS messages then you must understand the format. So where will you find all the queries and responses that DNS uses to resolve addresses? The majority are carried over UDP, with each message fully contained within a single UDP datagram. They can also be carried over TCP, but in this instance they are prefixed with a 2 byte value which indicates the length of the query or response. The extra 2 bytes are not included in this length, a point which is important!
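The 2 byte prefix can be sketched in a few lines of Python. The function names frame_for_tcp and unframe_from_tcp are invented for this example; the prefix is written in network (big-endian) byte order and counts only the message, not itself.

```python
import struct


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its 2 byte length for sending over TCP."""
    if len(message) > 0xFFFF:
        raise ValueError("message too long for a 2 byte length prefix")
    # "!H" is an unsigned 16-bit integer in network byte order.
    return struct.pack("!H", len(message)) + message


def unframe_from_tcp(data: bytes) -> bytes:
    """Strip the 2 byte length prefix and return exactly one message."""
    if len(data) < 2:
        raise ValueError("length prefix is missing")
    (length,) = struct.unpack("!H", data[:2])
    body = data[2:2 + length]
    if len(body) != length:
        raise ValueError("message is shorter than its length prefix says")
    return body


if __name__ == "__main__":
    framed = frame_for_tcp(b"example-dns-message")
    print(framed[:2], unframe_from_tcp(framed))
```

Over UDP the datagram boundary already tells the receiver where the message ends, which is why the prefix is only needed on a TCP stream.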

All DNS communication uses a single format, simply called a message. Every function in DNS, from simple queries to Smart DNS functions, uses this very same format. The format of the message follows this basic template –

Header

Question – For the Name Server

Answer – Answering the Question

Authority – Point Towards Authority

Additional – Additional Information

Some sections will be missing depending on the query, however the header will always be present. This is because within the header you’ll find fields which specify which of the remaining sections are indeed present, also whether the message is a query or a response and finally if there are any specific codes present.
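As a sketch of how a program might read those header fields, the following Python function unpacks the fixed 12 byte header laid out in RFC 1035: an identifier, a flags word, and four counts saying how many entries each of the following sections holds. The names DnsHeader and parse_header are made up for this example, and only a few of the flag bits are decoded.

```python
import struct
from typing import NamedTuple


class DnsHeader(NamedTuple):
    id: int
    is_response: bool  # QR bit: False for a query, True for a response
    opcode: int
    rcode: int         # response code, 0 means no error
    qdcount: int       # entries in the Question section
    ancount: int       # entries in the Answer section
    nscount: int       # entries in the Authority section
    arcount: int       # entries in the Additional section


def parse_header(message: bytes) -> DnsHeader:
    """Decode the first 12 bytes of a DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", message[:12])
    return DnsHeader(
        id=ident,
        is_response=bool(flags & 0x8000),
        opcode=(flags >> 11) & 0xF,
        rcode=flags & 0xF,
        qdcount=qd,
        ancount=an,
        nscount=ns,
        arcount=ar,
    )


if __name__ == "__main__":
    # Hand-built header: a response with one question and one answer.
    sample = struct.pack("!6H", 0x1234, 0x8180, 1, 1, 0, 0)
    print(parse_header(sample))
```

A section whose count is zero is simply absent from the message, which is how the header signals which sections are present.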

The name of each section following the header is derived from its actual use; it's all pretty common sense stuff. The Question section is indeed a question directed at a Name Server, and within this section are fields which define the question.

QTYPE – Query Type

QCLASS – Query Class

QNAME – Query Domain Name

If you are programming or developing any application which relies on this functionality, like the best Smart DNS service for example, it is important to understand these fields properly. Programmers will also need to understand the specific format of each field. The QNAME represents the domain name being queried as a sequence of labels. Each one of these labels consists of a length octet followed by that number of octets.
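The label encoding can be sketched as follows. The function names encode_qname and decode_qname are invented for this example. Per RFC 1035 a label is at most 63 octets and the name is terminated by a zero-length label; message compression pointers are deliberately not handled here.

```python
def encode_qname(domain: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero octet."""
    if domain in ("", "."):
        return b"\x00"  # the root name is just the terminating octet
    out = bytearray()
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"invalid label length: {label!r}")
        out.append(len(raw))  # the length octet
        out += raw            # followed by that many octets
    out.append(0)             # zero-length label marks the end of the name
    return bytes(out)


def decode_qname(data: bytes) -> str:
    """Decode an uncompressed QNAME back into dotted form."""
    labels = []
    pos = 0
    while True:
        length = data[pos]
        pos += 1
        if length == 0:
            break
        labels.append(data[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels)


if __name__ == "__main__":
    wire = encode_qname("www.example.com")
    print(wire, decode_qname(wire))
```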

As the complexity of networks increases, with diverse systems and multiple infrastructure components such as varied routers and switches (all from different vendors and suppliers), managing these systems in a standard way becomes much more difficult. The network might run on a standard protocol but in any larger organisation a whole host of subsystems and protocols will exist. This can be a nightmare to manage for both support teams and application developers seeking to get their systems to run correctly within the environment.

SNMP, the Simple Network Management Protocol, seeks to provide a common framework to control all these network elements. Its core function is to divide the network into two kinds of component, manager and agent, in order to define these elements and centralize control and monitoring between diverse systems. It's quite a simple protocol which operates on a request-reply basis between an SNMP manager and an SNMP agent. The variables defined by the agent are included in the management information base (MIB) and can be set or queried by the manager.

The variables are in turn identified by object identifiers which are arranged in a hierarchical naming scheme. These are normally very long numerical values which are abbreviated into a simple name specifically for support staff to be able to read. These are further divided, for example to control many routers from a specific vendor by assigning object identifiers to each instance.
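A small sketch of that name mapping follows. The three object identifiers in the table are from the standard MIB-II system group; the table name KNOWN_OIDS and the function describe_oid are invented for this example, and a real manager would load these names from MIB files rather than hard-code them.

```python
# A few entries from the MIB-II system group (1.3.6.1.2.1.1).
KNOWN_OIDS = {
    (1, 3, 6, 1, 2, 1, 1, 1): "sysDescr",
    (1, 3, 6, 1, 2, 1, 1, 3): "sysUpTime",
    (1, 3, 6, 1, 2, 1, 1, 5): "sysName",
}


def parse_oid(text: str) -> tuple:
    """Turn a dotted string such as '1.3.6.1.2.1.1.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


def describe_oid(text: str) -> str:
    """Replace the longest known numeric prefix with its readable name."""
    oid = parse_oid(text)
    for cut in range(len(oid), 0, -1):
        name = KNOWN_OIDS.get(oid[:cut])
        if name is not None:
            # Whatever follows the named prefix identifies the instance.
            return name + "".join(f".{n}" for n in oid[cut:])
    return ".".join(str(n) for n in oid)


if __name__ == "__main__":
    print(describe_oid("1.3.6.1.2.1.1.5.0"))
```

The trailing number left after the name is the instance suffix, which is how one variable definition can refer to many concrete instances.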

There are lots of groups of SNMP variables, such as system, interface, address translation, IP, ICMP, TCP and UDP. These groups can be used to either manage or query specific devices on a network. You can use the queries to get information about any aspect of the network, such as requesting an MTU or querying for the correct IP addresses of a specific device (note that this could be fake).

The other key function of SNMP is that of SNMP traps, which are a way for the agent to notify the manager that something significant has happened. This is of course essential in order to manage a network properly and identify problems before they become significant. Traps allow the agent to initiate communication with the manager, whereas the majority of the communication flows from the manager to the agent in the form of controls and queries. Usually these SNMP traps are sent to UDP port 162 on the managing device. They used to be sent in the clear and could be intercepted, but later versions such as SNMPv3 provide authentication and privacy. This security could be supplemented by having the support and admin staff use a VPN, especially when accessing the manager remotely from outside the internal network over the internet.

Any circuit-level tunneling through a proxy server, such as SOCKS or SSL, will allow most protocols to be passed through a standard proxy gateway. Whenever you see a statement like this, you should remember that it implies that the proxy does not actually understand the protocol but merely relays it transparently. For instance, the popular tunneling protocol SSL is able to tunnel virtually any TCP-based protocol without problem; it's often used to add some protection for weak protocols like FTP and Telnet.

But it can create a bit of a headache for a proxy administrator. Not only can all sorts of protocols be allowed access to a network, but often the administrator has no knowledge of the contents due to encryption. There are some short-term solutions which will provide a limited amount of protection, for example blocking access based on port numbers. That is, only allow specific ports to be tunneled, such as 443 for HTTPS and 636 for secure LDAP. This can work well, but remember that some advanced security programs like Identity Cloaker allow the configuration of the port, allowing protocols and applications to be tunneled on non-standard ports, a bit like a proxy unblocker. It is therefore not an ideal solution, and not one that can be relied upon in the longer term to keep a network and proxy secure.
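The port-based restriction can be sketched as a simple allow-list check. This is only an illustration: the "host:port" target format, the set name ALLOWED_TUNNEL_PORTS and the function tunnel_allowed are assumptions made for this example, not the configuration syntax of any particular proxy product.

```python
# Ports the proxy is willing to tunnel blindly: HTTPS and secure LDAP.
ALLOWED_TUNNEL_PORTS = frozenset({443, 636})


def tunnel_allowed(target: str, allowed=ALLOWED_TUNNEL_PORTS) -> bool:
    """Decide whether a tunnel request to 'host:port' should be accepted."""
    host, sep, port_text = target.rpartition(":")
    if not sep or not host or not port_text.isdigit():
        # No usable port in the request, so refuse rather than guess.
        return False
    return int(port_text) in allowed


if __name__ == "__main__":
    for target in ("example.com:443", "example.com:21", "example.com"):
        print(target, tunnel_allowed(target))
```

As the text notes, this only checks the port number. A client that runs some other protocol over port 443 passes the check, because the proxy never looks inside the tunnel.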

The obvious solution, of course, is to utilise a proxy server that can verify the protocol that is being transmitted. This requires an awful lot more intelligence built into the proxies, but it is possible. It does require a bigger overhead, and it does make the proxy server more expensive and perhaps more complicated and trickier to manage. However, without this sort of intelligence or something similar, you leave open the possibility of an FTP session being set up through an SSL tunnel, for example.

In some ways proxies already do some of this, and protocols that are proxied at the application level rather than tunneled cannot be exploited like this. Examples include HTTP, FTP and even Gopher. These cannot be used to trick entry, simply because there is no 'dumb', direct tunnel: the proxy understands the protocol and will only relay legitimate responses.

Zone transfers are an important part of distributing changes between name servers. Every domain on the internet (and within private networks for that matter) must have a master server which contains the definitive records of names and addresses for that domain. Zone transfers are the mechanism which allows any changes on the master server to be distributed out to the slave name servers, which could be spread far and wide. It's important that these are done regularly even if changes are not frequent, if only to ensure the validity of the current name space.

For example, when a slave name server restarts, or at periodic intervals, it will contact the master server if possible and check for updated records. If the server finds updates then it will request a zone transfer from the master server. This is simply a transfer of zone maps and DNS records from the master to the slave name server, and it performs the core function of keeping a DNS service up to date. Unlike the majority of DNS transactions, the protocol used in this instance is TCP. The main reason is that a zone transfer can contain a huge amount of data, and TCP is the best transport mechanism usually available to ensure reliable delivery.

The zone transfer is obviously a huge target for any hacker who wants to attempt to compromise a domain or specific server. Being able to intercept or even modify zone transfers gives an attacker the potential to take over any system. Modifying addresses will be difficult for any attacker, but even intercepting these transfers can be very dangerous. A zone file includes the details of every device on a specific network or domain, including all the IP addresses assigned to every device. Some of these hosts will typically be non-internet facing for security reasons, so it's important that zone transfers are secured. This relies on the configuration of the name servers themselves, and in particular how zone transfers are accomplished. In BIND 4.9.4 and later, for example, you can modify parameters so that only certain IP addresses or subnets are authorised to send and receive transfers. There are other useful security features implemented in later versions of BIND too.

For older systems you should ideally look to update; however, blocking traffic to port 53, which is the standard port for DNS traffic, could be viable. These port-blocking solutions can often cause other difficulties with specific applications and internal devices, and you may very well end up blocking legitimate traffic and breaking applications, such as the vice director's international VPN which he uses to watch ITV in Spain. This sounds an unlikely scenario for DNS security measures, but it's one that has happened to me!

The TCP window size is basically the method employed by the receiving client/host to inform the sender how much buffer space is currently available for data within that connection. It's a flow control system which ensures that the receiving host doesn't get overloaded with data. It's very important that this is a dynamic figure, which allows for various receive rates based on all sorts of outside factors such as network speed. For example, the window size will become much smaller when data has been received but not yet processed by the receiving host. If the buffer becomes full, perhaps when communicating with a fast VPN system, then the window will be set to zero, which tells the sender to temporarily stop transmitting data packets. When some of the data has been processed and there is some room in the buffer, the receiving device will send a window size update to resume the flow of data.
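The behaviour described above can be mimicked with a toy model. This is not real TCP: the Receiver class and the send function are invented for this example and simply track how the advertised window shrinks as data arrives, hits zero when the buffer is full, and reopens once the application has processed some data.

```python
class Receiver:
    """Toy receiver with a fixed-size buffer (all sizes in bytes)."""

    def __init__(self, buffer_size: int) -> None:
        self.buffer_size = buffer_size
        self.buffered = 0  # received but not yet processed

    @property
    def window(self) -> int:
        # The advertised window is whatever room is left in the buffer.
        return self.buffer_size - self.buffered

    def receive(self, nbytes: int) -> int:
        accepted = min(nbytes, self.window)
        self.buffered += accepted
        return accepted

    def process(self, nbytes: int) -> int:
        # The application consumes data, which frees buffer space.
        done = min(nbytes, self.buffered)
        self.buffered -= done
        return done


def send(receiver: Receiver, pending: int) -> int:
    """Send as much as the window allows; return the bytes still pending."""
    allowed = min(pending, receiver.window)
    receiver.receive(allowed)
    return pending - allowed


if __name__ == "__main__":
    rx = Receiver(buffer_size=1000)
    left = send(rx, 1500)
    print("pending:", left, "window:", rx.window)
    rx.process(300)
    left = send(rx, left)
    print("pending:", left, "window:", rx.window)
```

Note that the sender never decides how much to push; it only ever looks at the window the receiver advertises.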

From this explanation we can see that the TCP window size is mostly controlled by the receiving host. This allows control of the TCP session and prevents the client becoming overloaded. It's worth bearing this in mind because it's probably natural to assume that the data flow is controlled by the device sending, not the device receiving. In much of networking analysis this principle holds, which when you think about it is entirely logical, to ensure that both devices operate within their own operational limits.

The TCP window size is of course of special interest to hackers, security and intrusion detection analysts, as it gives some very useful information about the client you are talking to. For instance, if you use tools like Nmap, you can fire data packets at an unknown system and fingerprint the operating system by analyzing the response and how the TCP window size is set. For example, most Windows systems have default TCP receive window sizes defined in the registry, which will not normally change under normal circumstances. For Nmap and other fingerprinting tools, the TCP window size is a useful way of identifying a client operating system with minimal interaction with the system. Some of the best VPN software also allows you to control the flow of data in order to manipulate and identify clients using the TCP window size.

Its other use for security specialists is in honeypots and IDS systems like Snort and La Brea. La Brea can effectively slow down a connection from an attacker by modifying the TCP window size. In many ways it can thwart an attack, or at least make it a much more time-consuming and cumbersome task.

Any identity system which is automated needs some way of both creating and distributing authorization and authentication assertions. One of the most famous is of course Kerberos, which has its own methods for dealing with this requirement. However, many digital systems are now starting to use SAML, the Security Assertion Markup Language, which is becoming the de facto security credential standard.

SAML uses XML as a standard to represent security credentials, but it also defines a protocol for requesting and receiving the credential data from a SAML-based authority service. One of the key benefits of SAML is that using it is pretty straightforward, and this fact alone has increased its usage considerably. A client will make a request about a subject to the SAML authority. The authority in turn makes assertions about the identity of the subject with regard to a particular security domain. To take one simple example, the subject could be identified by an email address linked to its originating DNS domain.

So what exactly is a SAML authority? It is quite simply a service (usually online) that responds to SAML requests. The responses to these requests are called assertions. There are three different types of SAML authorities which can be queried: authentication authorities, attribute authorities and policy decision points (PDPs). These types of authorities all return distinct types of assertions –

SAML authentication assertions

SAML attribute assertions

SAML authorization assertions

Although there are three different definitions here, in practice most authorities are set up to produce every type of assertion. Sometimes, in very specific applications, you'll find an authority that is designed to produce only a specific subset, but this is quite rare, especially in online applications, although such authorities are sometimes used for proxy authorisation. All assertions contain certain common elements, however, such as IDs for issuers, time stamps, assertion IDs, and subjects including security domains and names.

Each SAML attribute request will begin using a standard syntax – <samlp:Request…..> – and the content then refers to the specific parts of the request. This could be virtually anything, but in practice it's often something straightforward like asking which department or domain an email is associated with.

Just like every other type of communication method that exists online, you can use encryption for securing XML documents. In fact it is recommended, if possible, that all important XML documents be encrypted completely before being transmitted across the wire. The document would then be decrypted using the appropriate key when it reaches its correct destination.

There is a problem with this, however, in that when you encrypt something you also obfuscate the entire message. This means that, unfortunately, some parts of an XML message will need to be sent in clear text. Take for example SOAP messages, a format that computers use to exchange remote procedure calls (RPC) over the internet. Although you can encrypt certain parts of the SOAP message, at a minimum the headers must be in clear text, otherwise intermediary devices would not be able to see routing and other important information.

The other alternative is to encrypt the channel itself, typically using something like SSL or SSH. This ensures that the message is protected in transit by encrypting the entire channel. However, there is another issue here: channel encryption only protects the message between the two endpoints, and anywhere else the message will be in clear text. These problems were real issues for XML developers, and to combat them the XML encryption standard was developed.

The primary goal of this standard is to allow the partial and secure encryption of any XML document. The encryption standard, very much like other XML standards like the signature protocol has quite a lot of different parts. This is to enable the standard to deal with all sorts of different contingencies, however the core functions are quite simple and easy to follow.

Any encrypted element in an XML document is identified using the <EncryptedData> element. This element consists of two distinct parts –

An optional <KeyInfo> element that gives information about the key. This element is actually the same one that is defined in the XML signature specification.

A <CipherData> element that either includes the actual encrypted data inside a <CipherValue> element or contains a reference to the encrypted data enclosed in a <CipherReference> element.

For instance, XML encryption may be used in something like an online payment system which sends orders through an XML document. The order document may contain all the information about the order, including sensitive information like the payment details and credit card numbers, all contained in a single element. In this example most of the order should be left in clear text so that it can be processed quickly, but the payment information should be encrypted and decrypted only when the payment is actually being processed. XML encryption allows this by encrypting only certain parts of the document, i.e. the payment information.
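The shape of such a partially encrypted order can be sketched in Python. This is a structural illustration only: the order and payment tag names are hypothetical, the function names are invented, and the encrypt argument is a caller-supplied function standing in for a real cipher (Python's standard library does not include one, so a real system would use a cryptography library and would normally also record the algorithm and key information). The EncryptedData, CipherData and CipherValue element names and the namespace come from the W3C XML Encryption specification.

```python
import base64
import xml.etree.ElementTree as ET

XENC_NS = "http://www.w3.org/2001/04/xmlenc#"


def wrap_as_encrypted_data(ciphertext: bytes) -> ET.Element:
    """Build EncryptedData > CipherData > CipherValue around the ciphertext."""
    enc = ET.Element(f"{{{XENC_NS}}}EncryptedData", {"Type": XENC_NS + "Element"})
    cipher_data = ET.SubElement(enc, f"{{{XENC_NS}}}CipherData")
    value = ET.SubElement(cipher_data, f"{{{XENC_NS}}}CipherValue")
    value.text = base64.b64encode(ciphertext).decode("ascii")
    return enc


def encrypt_elements(root: ET.Element, tag: str, encrypt) -> ET.Element:
    """Replace every child element named `tag` with an EncryptedData element.

    `encrypt` is a caller-supplied bytes -> bytes function (an assumption of
    this sketch); everything else in the document is left in clear text.
    """
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != tag:
                continue
            tail, child.tail = child.tail, None
            replacement = wrap_as_encrypted_data(encrypt(ET.tostring(child)))
            replacement.tail = tail
            parent.remove(child)
            parent.insert(index, replacement)
    return root


if __name__ == "__main__":
    order = ET.fromstring(
        "<order><item>book</item><payment><card>4111</card></payment></order>"
    )
    # Placeholder transformation, NOT real encryption.
    encrypt_elements(order, "payment", lambda data: data[::-1])
    print(ET.tostring(order).decode("ascii"))
```

The item element stays readable for order processing, while the payment element is carried only as opaque cipher data until something holding the key decrypts it.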

SGML and XML are both languages that are used for defining markup languages. More specifically, they are metalanguage formalisms that facilitate the definition of descriptive markup languages for the purpose of electronic information encoding and interchange. SGML and XML support the definition of markup languages that are hardware- and software-independent, as well as application-processing neutral.

SGML is an International Standard, defined in the document ISO 8879:1986. Information Processing - Text and Office Systems - Standard Generalized Markup Language (SGML), as amended. A key philosophical commitment underlying SGML is separating the representation of information structure and content from information processing specifications. Information objects modeled through an SGML markup language are named and described (using attributes and subelements) in terms of what they are (from a defined perspective) not in terms of how they are to be displayed or otherwise processed.

XML (Extensible Markup Language) is a dialect of SGML that is designed to enable 'generic SGML' to be served, received, and processed on the World Wide Web. XML originated in 1996, as a result of frustration with the deployment of SGML on the Internet. The SGML family of standards, which includes SGML (the modeling framework), DSSSL (the transformation framework for presentation) and HyTime (the linking and timing framework), consists of ISO standards that proved difficult to implement and aroused little interest outside of specialist fields of expertise. XML simplified the requirements for implementation, with the specific intention of enabling deployment of markup applications on the Internet.

Both SGML and XML are supported by a suite of companion standards addressing such features as transformation, presentation, linking, and event triggering. A broad range of commercial and public-domain software has been developed to assist users with markup implementation.