The Text Encoding Initiative (TEI) is one of the longest-lived and most influential projects in the field now known as the Digital Humanities. Its purpose is to provide guidelines for the creation and management in digital form of every type of data created and used by researchers in the Humanities, such as source texts, manuscripts, archival documents, ancient inscriptions, and many others. As its name suggests, its primary focus is on text rather than sound or video, but it can usefully be ...

JATS is an application of NISO Z39.96-2012, which defines a set of XML elements and attributes for tagging journal articles and describes three article models.
The content on this site is the supporting documentation for the standard. JATS is a continuation of the NLM Archiving and Interchange DTD work begun in 2002 by NCBI.

Recent proposals for creating digital scholarly editions (DSEs) through the crowdsourcing of transcriptions and collaborative scholarship, for the establishment of national repositories of digital humanities data, and for the referencing, sharing, and storage of DSEs, have underlined the need for greater data interoperability. The TEI Guidelines have tried to establish standards for encoding transcriptions since 1988. However, because the choice of tags is guided by human interpretation, TEI-XML encoded files are in general not interoperable. One way to fix this problem may be to break down the current all-in-one approach to encoding so that DSEs can be specified instead by a bundle of separate resources that together offer greater interoperability: plain text versions, markup, annotations, and metadata. This would not only facilitate the development of more general software for handling DSEs, but also enable existing programs that already handle these kinds of data to function more efficiently.

Andornot is an independent consulting firm incorporated in 1995 and based in Vancouver, Canada. For nearly 20 years we have helped a wide range of corporations, law firms, public institutions, government organizations, non-profits, archives and museums utilize the latest information management solutions.

The eXtensible Text Framework (XTF) is a powerful open source platform for providing access to digital content. Developed and maintained by the California Digital Library (CDL), XTF functions as the primary access technology for the CDL's digital collections and other digital projects worldwide.

This is an anonymized dump of all user-contributed content on the Stack Exchange network. Each site is formatted as a separate archive consisting of XML files zipped via 7-zip using bzip2 compression. Each site archive includes Posts, Users, Votes, Comments, PostHistory and PostLinks. For complete schema information, see the included readme.txt.

URL patterns use an extremely simple syntax. Every character in a pattern must match the corresponding character in the URL path exactly, with two exceptions. At the end of a pattern, /* matches any sequence of characters from that point forward. The pattern *.extension matches any file name ending with extension. No other wildcards are supported, and an asterisk at any other position in the pattern is not a wildcard.
When more than one pattern matches a request, the container chooses among them as follows. First, the container prefers an exact path match over a wildcard path match. Second, the container prefers to match the longest pattern. Third, the container prefers path matches over filetype matches. Finally, the pattern <url-pattern>/</url-pattern> always matches any request that no other pattern matches.
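
The matching and precedence rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the container's actual implementation: the function names `match_kind` and `select_pattern` are hypothetical, and the sketch assumes that a pattern such as /catalog/* also matches the bare path /catalog.

```python
def match_kind(pattern, path):
    """Classify how `pattern` matches `path`.

    Returns 'exact', 'path', 'extension', 'default', or None (no match).
    """
    if pattern == "/":
        # The default pattern matches any request.
        return "default"
    if pattern.endswith("/*"):
        # Wildcard path match: everything after the prefix is accepted.
        # Assumption: the bare prefix itself also matches.
        prefix = pattern[:-2]
        if path == prefix or path.startswith(prefix + "/"):
            return "path"
        return None
    if pattern.startswith("*."):
        # Filetype match: the file name must end with ".extension".
        filename = path.rsplit("/", 1)[-1]
        return "extension" if filename.endswith(pattern[1:]) else None
    # An asterisk anywhere else is an ordinary character.
    return "exact" if pattern == path else None


def select_pattern(patterns, path):
    """Pick the winning pattern: exact, then longest path, then filetype, then default."""
    rank = {"exact": 0, "path": 1, "extension": 2, "default": 3}
    best_key, best_pattern = None, None
    for pattern in patterns:
        kind = match_kind(pattern, path)
        if kind is None:
            continue
        # Lower rank wins; within the same rank the longer pattern wins.
        key = (rank[kind], -len(pattern))
        if best_key is None or key < best_key:
            best_key, best_pattern = key, pattern
    return best_pattern


patterns = ["/catalog/*", "/catalog/items/*", "*.jsp", "/catalog/index.html", "/"]
for request in ["/catalog/index.html", "/catalog/items/42",
                "/catalog/page.jsp", "/other/page.jsp", "/other/readme.txt"]:
    print(request, "->", select_pattern(patterns, request))
```

Ranking by a (kind, negative length) pair keeps the four preferences in one comparison, so the order in which patterns are declared does not affect the result.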

The site mlat.uzh.ch is a Latin text (meta-)repository and tool that is still under development. Users should take into account that many functions do not yet work satisfactorily. This Corpus Córporum is being developed at the University of Zurich under the direction of Ph. Roelli, Institute of Medieval Latin Studies. The project uses exclusively free and open software and is non-commercial. Our main goals are:
- To provide a platform into which standardised (TEI) xml-files of Latin texts can be loaded (if you would like to share your texts, please contact us) and downloaded (unless copyrights restrict this).
- To present these texts in a way that they may be read online. Latin words in the text can be resolved to their grammatical form by clicking them (powered by Perseus and TreeTagger). The entries in the following dictionaries are then displayed: Georges (Latin-German), Lewis and Short (Latin-English) and DuCange (mediaeval Latin).
- To make these texts searchable in complex manners (including by lemma). Searches, wordlists and concordances can be generated for the current level at the bottom left of the page (we use the open-source software Sphinx).
- To be able to use the platform to publish Latin texts online (cf. the Richard Rufus Project's corpus).
- Texts may be downloaded as TEI xml or txt-files for non-commercial use (soon also as pdf).

Update: Minor correction in the last two rows of the table -- thanks to a comment by Michael Ludwig. I will talk about the efficiency of this and other related XPath expressions in my next post. In my first post I provided a compact one-liner XPath expression that obtains all duplicate items in a given…

We provide the content of legislation in XML format using a Legislation Schema that includes both metadata and the content of legislation. The Legislation Schema uses Dublin Core for metadata, XHTML for tables and MathML for formulae.

An XRC file is an XML file with all of its elements in the http://www.wxwidgets.org/wxxrc namespace. For backward compatibility, the http://www.wxwindows.org/wxxrc namespace is accepted as well (and treated as identical to http://www.wxwidgets.org/wxxrc), but it shouldn't be used in new XRC files. An XRC file contains definitions for one or more objects -- typically windows. The objects may themselves contain child objects.
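
As a rough illustration of the namespace rule, the following Python sketch checks that every element of a document sits in one of the two accepted namespaces and reports whether the older one is in use. It relies only on the standard library's `xml.etree.ElementTree`. The helper name `check_xrc_namespaces` and the element and attribute names in the sample document are illustrative assumptions, not taken from the text above.

```python
import xml.etree.ElementTree as ET

XRC_NS = "http://www.wxwidgets.org/wxxrc"
LEGACY_NS = "http://www.wxwindows.org/wxxrc"  # accepted, but not for new files


def check_xrc_namespaces(xml_text):
    """Return (all_elements_ok, uses_legacy_namespace) for an XRC document."""
    root = ET.fromstring(xml_text)
    seen = set()
    for element in root.iter():
        tag = element.tag
        # ElementTree writes namespaced tags as "{namespace}localname".
        namespace = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
        seen.add(namespace)
    return seen <= {XRC_NS, LEGACY_NS}, LEGACY_NS in seen


# Hypothetical sample: a window object containing one child object.
sample = """
<resource xmlns="http://www.wxwidgets.org/wxxrc">
  <object class="wxFrame" name="main_frame">
    <object class="wxButton" name="ok_button"/>
  </object>
</resource>
"""

ok, legacy = check_xrc_namespaces(sample)
print("all elements in an accepted namespace:", ok)
print("uses the legacy namespace:", legacy)
```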