Authoring dynamic websites with SXML


Peter Bex, February

1 Introduction

There are roughly two ways of dynamically generating websites. One way is the PHP way (or Perl, Ruby, etc). This means you simply write some HTML code and sprinkle code with side-effects in between. There are clear disadvantages to this. For example, operating on fragments of code must be done on the string level, which is too low to do meaningful post-processing without writing ad-hoc HTML parsers. It also means that potential attackers of your site can insert malicious or obnoxious HTML and scripts into the output relatively easily, unless you take great care to escape HTML characters.

The other way is to use XML. Then you need to learn a number of different XML technologies like XSL, which includes XSLT and XPath or XQuery. On top of that, you still need to use a scripting language to express your business logic (XExpr, or any other scripting language like PHP). XML is also quite hard to read for a human being because of its verbosity.

Any Scheme hacker who has done some moderate to heavy web programming will be annoyed by this state of affairs. Why can't we just use one tool to do it all? Well, we can! By using SXML instead of these other technologies, you can use your existing knowledge of Scheme and a handful of procedures that can assist you in transforming XML in a completely functional way. Another advantage is that if you happen to have some existing XSL code, you do not have to discard it. You can simply take that code and feed it XML output from your SXML code without any problems.

There is quite a bit of information available at the SSAX project page, but in my opinion it's quite fragmented and too academic. That's why I decided to write this hands-on tutorial.

This tutorial is aimed at people who have never worked with SXML. It is assumed the reader is familiar with XHTML and has a working knowledge of Scheme. No knowledge of the corresponding XML technologies is assumed, but it may make it easier for you to understand. If you do not know Scheme yet, you may want to check out to see what it's all about.

2 What is SXML?

SXML is simply a way to write XML as s-expressions. The official specification for SXML can be found at

A simple XHTML page looks like this:

    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head>
        <title>An example page</title>
      </head>
      <body>
        <h1 id="greeting">Hi, there!</h1>
        <p>This is just an &gt;&gt;example&lt;&lt; to show XHTML &amp; SXML.</p>
      </body>
    </html>

We can translate this to SXML by hand [1] and obtain the following:

    (html (@ (xmlns "http://www.w3.org/1999/xhtml") (xml:lang "en") (lang "en"))
      (head (title "An example page"))
      (body (h1 (@ (id "greeting")) "Hi, there")
            (p "This is just an >>example<< to show XHTML & SXML.")))

Each element's tag pair is replaced by a set of parentheses. The tag's name is not repeated at the end; it is simply the first symbol in the list. The element's contents follow, which are either elements themselves or strings.

There is no special syntax required for XML attributes. In SXML they are simply represented as just another node, which has the special name @. This can't cause a name clash with an @ tag, because @ is not allowed as a tag name in XML. This is a common pattern in SXML: any time a tag is used to indicate a special status or something that is not possible in XML, a name is used that does not constitute a valid XML identifier.

We can also see that there's no need to escape otherwise meaningful characters like & and > as &amp; and &gt; entities. All string content is automatically escaped because it is considered to be pure content, and has no tags or entities in it. This also means it is much easier to insert autogenerated content, and there is no danger that we might forget to escape user input when we display it to other users (which could lead to all kinds of nasty cross-site scripting attacks or other annoyances).

[1] If we had translated the XHTML to SXML with a parser like SSAX, we'd end up with a slightly different structure, because it would interpret and encode the namespace information differently. To keep things simple, we'll just treat namespaces as simple attributes here.
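Because attributes are just another node, ordinary list operations are enough to get at them. The following is a minimal sketch in plain Scheme, with no SXML library involved. The helper names sxml-attributes and sxml-children are hypothetical, made up for this illustration; they are not part of SSAX or of any SXML library.

```scheme
;; A small SXML element, written by hand as in the example above.
(define heading '(h1 (@ (id "greeting") (class "big")) "Hi, there"))

;; Hypothetical helper: does this element start with an (@ ...) node?
(define (has-attribute-node? element)
  (let ((rest (cdr element)))
    (and (pair? rest)
         (pair? (car rest))
         (eq? (car (car rest)) '@))))

;; Hypothetical helper: the attribute list of an element,
;; or the empty list if there is no (@ ...) node.
(define (sxml-attributes element)
  (if (has-attribute-node? element)
      (cdr (car (cdr element)))
      '()))

;; Hypothetical helper: everything in the element except
;; the tag name and the attribute node.
(define (sxml-children element)
  (if (has-attribute-node? element)
      (cdr (cdr element))
      (cdr element)))

(sxml-attributes heading)                   ; => ((id "greeting") (class "big"))
(sxml-children heading)                     ; => ("Hi, there")
(sxml-attributes '(p "no attributes here")) ; => ()
```

Nothing here is specific to XML: the element is an ordinary list, and the attribute node is found by comparing the first symbol of a sublist with @.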

3 SXML for websites

Now that we know how to translate any X(HT)ML document to SXML, let's see how we can write SXML that gets translated to XHTML. The following illustrates the typical pattern we'll see a lot when generating websites:

    (define document
      '(html (@ (xmlns "http://www.w3.org/1999/xhtml") (xml:lang "en") (lang "en"))
         (head (title "An example page"))
         (body (h1 (@ (id "greeting")) "Hi, there")
               (p "This is just an >>example<< to show XHTML & SXML."))))

    (SRV:send-reply (pre-post-order document universal-conversion-rules))

The call to SRV:send-reply has the side-effect of displaying the HTML to the current output port, so if you want it in a string you'll have to explicitly capture the current output port (e.g., with with-output-to-string or some other implementation-specific procedure).

The procedure pre-post-order is the core of SXSLT. Right now we've only used it as a translator from generic SXML to something SRV:send-reply can output. If you just try to run (SRV:send-reply document), you'll see the output is some kind of dumb concatenation of the flattened SXML tree. What pre-post-order does here is transform the SXML tree to a semi-flattened form of the SXML that can be concatenated, so an XML string can be created by SRV:send-reply. The universal-conversion-rules are rules that tell it how it can do that. Don't worry if you don't understand this yet. We'll look at pre-post-order in much more detail in a few moments.

4 Semantic content

If you only used the information above, you'd already have a very useful tool at your disposal. You can view any XML tree as a simple Scheme list. This means that any operation you can perform on lists, you can perform on SXML as well.

A simple but useful example is when we would like to describe our pages in a more semantic way. For example, we would like to be able to write the following:

    (define semantic-page
      '(page "Welcome to my homepage"
             (navigation)
             (greeting "Hi there")
             (p "This is a nice example page")
             (footer)))

We could use the same structure in every page. Actually, if every single page has a navigation and footer, we could even leave those out. We can see how this is a much more semantic way to describe our page. To actually transform this to valid XHTML, we could use the following code (which could be common code we include in all pages in our site):

    (define (translator content)
      (cond
        ((null? content) '())
        ((list? (car content))
         ;; Recurse down into lists
         (cons (translator (car content)) (translator (cdr content))))
        ((eq? (car content) 'page)
         `(html (@ (xmlns "http://www.w3.org/1999/xhtml") (xml:lang "en") (lang "en"))
                (head (title ,(cadr content)))
                (body ,@(translator (cddr content)))))
        ((eq? (car content) 'greeting)
         `(h1 (@ (id "greeting")) ,(cadr content)))
        ((eq? (car content) 'navigation)
         (cons '(ul (li (a (@ (href "home")) "homepage"))
                    (li (a (@ (href "about")) "about this site"))
                    (li (a (@ (href "contact")) "contact us")))
               (translator (cdr content))))
        ((eq? (car content) 'footer)
         '(p "Copyright (c) 2007"))
        (else (cons (car content) (translator (cdr content))))))

    (define document (translator semantic-page))

    (SRV:send-reply (pre-post-order document universal-conversion-rules))

I'm sure you'll agree this explicit rewriting of the SXML tree with custom code is not exactly fun. We'd like to have some kind of generalised way to do these rewrites, without having to explicitly write the behaviour every time. In other words, we'd like to define our transformations in a sort of stylesheet DSL. This is exactly what SXSLT is. We can write the above as follows:

    (define my-rules
      `((page . ,(lambda (tag page-title . contents)
                   `(html (@ (xmlns "http://www.w3.org/1999/xhtml") (xml:lang "en") (lang "en"))
                          (head (title ,page-title))
                          (body ,@contents))))
        (navigation . ,(lambda (tag)
                         '(ul (li (a (@ (href "home")) "homepage"))
                              (li (a (@ (href "about")) "about this site"))
                              (li (a (@ (href "contact")) "contact us")))))
        (greeting . ,(lambda (tag str)
                       `(h1 (@ (id "greeting")) ,str)))
        (footer . ,(lambda (tag) '(p "Copyright (c) 2007")))
        (*text* . ,(lambda (tag str) str))
        (*default* . ,(lambda x x))))

    (SRV:send-reply
      (pre-post-order (pre-post-order semantic-page my-rules)
                      universal-conversion-rules))

Not only is the SXML shorter to write and less error-prone, but it is also clearer what is happening. Every high-level tag we defined is listed on the left, and the transformation code to run on that tag is shown on the right. If you would like to take a look at the generated SXML code, do the following:

    (pre-post-order semantic-page my-rules)

4.1 Slowing down a bit

Let's look at what happens here in more detail by investigating one rule up close:

    (greeting . ,(lambda (tag str)
                   `(h1 (@ (id "greeting")) ,str)))

The pre-post-order procedure walks the SXML tree in almost the same way our custom code did. The custom code simply looked at every element in the tree to see if it matched one of the expected symbols. But pre-post-order only looks at tags, i.e. the first symbol of a sublist. If the first rule does not match, it looks at the next rule, much like our custom code. If it finds a match for the tag, the tag name and all of its child nodes are passed to the transformation procedure as arguments. If there are no matches at all, the *default* rule is applied, which in this case leaves the content untouched. The *text* rule is applied to all leaf nodes (i.e., non-list nodes, which can be strings or symbols among other things). More about these special rules later.

In our case, the greeting element has only one element under it, the greeting's text. This is put inside an h1. If we would like the name of the page to be printed smaller, we could simply modify this rule and every page would have its name printed smaller. It would also allow us to attach an id or class to it so we can target it with CSS for further styling.

If we look at the SXML code again for a second, we see that the greeting element looks very much like a procedure call to the lambda defined above:

    (greeting "Hi there")

The only difference is that the lambda accepts one more argument: the tag's name. This can be useful if you use the same procedure for several rules (or for a *default* rule).

5 Tree traversal methods

We have only seen part of pre-post-order's power. The procedure is called that way because there are two different orders in which one can traverse an SXML tree: inside-out or outside-in. Let's look at another example:

    (define counter '(child-count (children)))

    (define counting-rules
      `((child-count . ,(lambda (tag children)
                          `(kids ,(length children) ,children)))
        (children . ,(lambda (tag)
                       ;; Just create 10 child tags
                       (list-tabulate 10 (lambda _ '(child)))))
        (*text* . ,(lambda (tag str) str))
        (*default* . ,(lambda x x))))

    (pre-post-order counter counting-rules)

This is a simple set of rules. The children rule generates 10 child elements. The child-count rule simply counts its children and puts the number in front of them. The question is now: will it count 1 or 10? What it prints depends on whether pre-post-order traverses the tree pre-order or post-order. Go ahead and try it out. You'll see that the default order (the order we've seen up till now) is actually post-order, or inside-out. The children are generated first, and the resulting subtree is used in the call to the child-count rule. The result is

    (kids 10 ((child) (child) (child) (child) (child)
              (child) (child) (child) (child) (child)))

If we don't like this behaviour, we can change the child-count rule's order:

    (child-count *preorder* . ,(lambda (tag children)
                                 `(kids ,(length children) ,children)))

This will produce the following result:

    (kids 1 (children))

Wait a minute! That's not what we expected, is it? The (children) element isn't transformed anymore! That's because *preorder* rules block the transformation process. To obtain truly outside-in behaviour, we need to explicitly call pre-post-order in the rule:

    (child-count *preorder* . ,(lambda (tag children)
                                 (pre-post-order
                                   `(kids ,(length children) ,children)
                                   counting-rules)))

This results in the correct response of

    (kids 1 ((child) (child) (child) (child) (child)
             (child) (child) (child) (child) (child)))

We could've just called pre-post-order on the children, but the shown pattern is so common that there is a shortcut:

    (child-count *macro* . ,(lambda (tag children)
                              `(kids ,(length children) ,children)))

This does exactly the same as calling pre-post-order on a *preorder* rule's result. Be careful not to introduce endless loops this way! If the macro rule returns an element that is transformed by another rule, which in turn produces something the first rule matches, there may be no end to the transformations.

It is tempting to make everything a *macro* rule, because very often rules produce new content that also contains tags that need to be rewritten. There are many examples where we need *macro*, even if we don't really care about the order of transformation. Here is one:

    (kids . ,(lambda (tag . contents)

The kids tag is of course not a valid HTML tag, so we probably want to reduce it further. If we use the *preorder* rule, the kids node is obviously not reduced to an h2. But if we use the original post-order rule (the one without *preorder* or *macro*), the result doesn't have pre-post-order applied to it either. Calling pre-post-order on a post-order rule's result is wasteful because it will traverse the whole subtree again. However, if we use *macro*, it will traverse the subtree only once. Unfortunately, we'll have to traverse the children rule first, and the resulting tag as well, so we can't really evade traversing the tree twice.

    (child-count . ,(lambda (tag children)
                      (pre-post-order
                        `(kids ,(length children) ,children)
                        counting-rules)))
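The three flavours of the child-count rule discussed in this section can be compared side by side. The sketch below assumes CHICKEN 5 with the sxml-transforms and srfi-1 eggs installed; other Scheme systems load pre-post-order and list-tabulate differently. The names make-counting-rules and count-kids are hypothetical helpers introduced only for this comparison.

```scheme
;; Assumption: CHICKEN 5 with the sxml-transforms and srfi-1 eggs.
(import sxml-transforms srfi-1)

(define counter '(child-count (children)))

;; The transformation shared by all three variants.
(define (count-kids tag children)
  `(kids ,(length children) ,children))

;; Hypothetical helper: build a ruleset around one child-count rule.
(define (make-counting-rules child-count-rule)
  `(,child-count-rule
    (children . ,(lambda (tag)
                   ;; Just create 10 child tags
                   (list-tabulate 10 (lambda _ '(child)))))
    (*text* . ,(lambda (tag str) str))
    (*default* . ,(lambda x x))))

(define post-order-rules
  (make-counting-rules `(child-count . ,count-kids)))
(define preorder-rules
  (make-counting-rules `(child-count *preorder* . ,count-kids)))
(define macro-rules
  (make-counting-rules `(child-count *macro* . ,count-kids)))

;; Post-order: (children) is expanded first, so ten kids are counted.
(define post-result (pre-post-order counter post-order-rules))

;; *preorder*: the rule sees the untransformed (children) node,
;; counts one kid, and the result is not traversed any further.
(define pre-result (pre-post-order counter preorder-rules))

;; *macro*: one kid is counted, then the result is traversed again,
;; which expands (children) into the ten child elements.
(define macro-result (pre-post-order counter macro-rules))

(write post-result) (newline)
(write pre-result) (newline)
(write macro-result) (newline)
```

Keeping the transformation in a single procedure makes it clear that only the rule's traversal mode differs between the three rulesets, not the code that builds the kids element.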

    (link (@ (rel "stylesheet") (type "text/css") (href "layout.css")))
    (title ,title))
    (*text* . ,(lambda (tag str) str))
    (*default* . ,(lambda x x))))

    (define doctype-rules
      `((doctype . ,(lambda (tag)
                      (string-append
                        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\""
                        " \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">")))))

    (define content '(page "test" (h1 "blah")))

    (SRV:send-reply
      (pre-post-order (pre-post-order content my-rules)
                      (append doctype-rules universal-conversion-rules)))

7 A few useful tools

In order to streamline all of this a bit, it's nice to build a few helper procedures. I will now give you a few tools I've found useful when writing my own pages.

7.1 Hassle-free output

    ;; Requires SRFI-1, for fold
    (define (sxml-apply-rules content . rules)
      (fold (lambda (rules content)
              (pre-post-order content rules))
            content rules))

    (define (output-html content . rules)
      (SRV:send-reply (apply sxml-apply-rules content rules)))

These two [2] are very useful, since we often end up nesting a lot of calls to pre-post-order, resulting in a big bunch of code just to send the output to the user's browser. That's what sxml-apply-rules is for: now we can just call (sxml-apply-rules document my-rules universal-conversion-rules). If we don't want the resulting SXML, we can call output-html instead of sxml-apply-rules; it not only applies pre-post-order for each of the rulesets, but also sends the output directly to the browser.

[2] Chicken users will find these in the spiffy-utils egg.

7.2 Entities

I have not yet explained how to output HTML entities like &gt;. If you simply try to output an entity as a string, you'll see that the universal-conversion-rules will recode the & to &amp;, which means that &gt; will look like &amp;gt; in the final output. This is definitely not what we want. We'll have to define a rule that we can add to the universal-conversion-rules. We'll take the rule from the Chicken Scheme system, which already provides one for exactly this reason in the default universal-conversion-rules of its sxml-transforms egg:

    (define universal-conversion-rules
      (append universal-conversion-rules
              `((& . ,(lambda (tag . elts)
                        (map (lambda (elt) (string-append "&" elt ";"))
                             elts))))))

Now we can just write a page like this:

    ;; 10 > 1 and 1 < 10
    (define document
      '((page "Entities example"
              "10 > 1 and 1 " (& "lt") " 10")))

And now it doesn't matter how many rulesets we apply to this, since only the final universal-conversion-rules translates it.

7.3 Adding classes

Very often, you need to conditionally add a class to an already existing piece of content. It's quite useful to have a procedure that does this.

    ;; Uses sxml-match, SRFI-1 for lset-union
    ;; and SRFI-13 for string-tokenize and string-join
    (define (add-classnames content . new-names)
      ;; If there are no new names, we can simply return the content.
      (if (null? new-names)
          content
          ;; Add the classnames in a clean way, by comparing them
          ;; against the existing tags and only adding them if they're
          ;; not already there.
          (let ((add (lambda (old-names)
                       (string-join
                         (lset-union string=? new-names
                                     (string-tokenize old-names))))))
            ;; Little hack to force the tag to get matched.
            (sxml-match (cons 'tag (cdr content))
              ((tag (class ,old-names) . ,rest) . ,body) (,(car content) (class ,(add
              ((tag (,(car content) (class ,(add
              ((tag . ,body) (,(car content) (class ,(add

    ;; Example use:
    (add-classnames '(p (@ (class "even")) "blah") "selected")
    => (p (@ (class "selected even")) "blah")

Here, I've used the sxml-match library by Jim Bender. This is a pattern matching library which doesn't match s-expressions literally, but knows about SXML. This means, among other things, that it disregards attribute ordering. That's why it's possible to match the class in any position even though it's listed as the first attribute in the pattern. This library is a valuable addition to our toolkit.

I've hacked around a bit to make it match any tag we feed it, by replacing the tag itself in the input to the matcher with a preselected tag called simply tag. This is because the first element, like in a macro expression, can't be variable. I recommend reading the documentation on the sxml-match library if you would like to know more. The library is part of a bigger web framework called WebIt!, which also includes a Scheme DSL for generating CSS.

It is certainly possible to exclusively use sxml-match for generating your output by macro translation instead of pre-post-order. The disadvantage of this approach is that rulesets are not composable like they are with pre-post-order. Otherwise, it seems to be pretty much equivalent in functionality. On the other hand, if you don't like the extra dependency, you could also leave out sxml-match and write the add-classnames procedure manually, but it's not going to look pretty.

7.4 Getting child nodes and attributes

Often, you only want to look at the child nodes of an element. SXML can be tricky because it treats attribute nodes as regular child nodes. This means you sometimes want to skip those, if they're there. On other occasions, you want to be able to assume there are attributes to make your code simpler to follow. These two procedures will help with this:

    (define (child-nodes contents)

    table-rules page-rules (append doctype-rules universal-conversion-rules))

Our layout.css can look something like this:

    .even { background-color: #aaff00; }

Now every even row in the table will have a lime background color. Of course, you need to write a details.sxml for this page to work as it should.

9 More information

If you would like to know more about SXML, visit the SSAX project homepage and Oleg Kiselyov's SXML page. You can find not only the official specification of SXML here, but also information about other SXML technologies (including how to write XML-to-SXML parsers).

Happy Scheming!
