Longhorn To Get NUI Foundation Platform "In other words, if you want to tell your Longhorn computer to find a file, you will have the option of saying the word "Find," evoking a dialog box and/or drop-down menu on your screen.

This kind of capability is predicated on a new type of file system, Lee says.

"First, we need a file system that is more of a structured database," Lee explains. "You can't reason with everything being a file type. That's why we need WinFS," the Windows File System at the heart of SQL Server "Yukon" (and the data-store component of WinFS that will be embedded in Longhorn), Lee says.

"Then, we need to ask, if the richer (data) store lets you reason about nouns, why not verbs ... like 'format,' 'delete,' 'print'?" Lee continues. "If you think about 'find' as a verb, then everything fits into a reasonable taxonomy.""

Friday, September 26, 2003

Trust Metrics (idea) "Trust metrics are going to be one of the building blocks of future online communities. Advogato Trust Metric has the most visible example, but the concept of a trust metric is of great importance in computer security as well as in building community."

Marking Up Bureaucracy "Much other work is currently under discussion, but very few individuals were willing to discuss upcoming implementations; that said, it can be inferred that many projects lean toward making publicly available information (like that found on FirstGov and on THOMAS) available via a public, web-services based API. And the occasional tantalizing PDF shows that sites which present information to the public are thinking in terms of taxonomies, which, along with agenda proceedings from a September 8, 2003 conference on "Semantic Technologies for eGov" indicate that the US Government is not shying away from the promise of the Semantic Web."

Thursday, September 25, 2003

Metaweb "The Metaweb is a collaborative structure for learning. In our first phase, we are annotating the ideas and historical period explored in Neal Stephenson's novel Quicksilver, seeding the Metaweb with an initial base of information. We are currently working on 109 articles, and hope you will expand and relate these and many other entries."

Wednesday, September 24, 2003

Berners-Lee: Web inventor endorses Safari "During a lecture last night at the Royal Society in London, Tim Berners-Lee revealed that he invented the World Wide Web using a NeXT computer. He presented his lecture using Apple's OS X Web browser Safari on a PowerBook. He also referenced the Web's potential by talking about the possibilities of iCal, Apple's calendar program.

Berners-Lee was discussing one of the Web's many futures, in what he calls the Semantic Web. Devised by himself, this is due to become the next big thing, he told the audience. It will enhance the supply and exchange of information and data for the benefit of the Web user."

Berners-Lee Talks Up Semantic Web "What if the World Wide Web were one giant database, linking both human readable documents and machine readable data in a way useful to both mankind and machine?"

"By implementing products based on RDF as an EAI "hub," companies can link together documents, and data stored in disparate databases, and pull related concepts together when analyzing the information. That sort of thing can be done with XML Web services today, but it can be a laborious task, Berners-Lee explained."

Distributed Computing Economics and the Semantic Web "Now along comes Gray, making an argument that, when you think about it, implies that the semantic web, as currently conceived, might just be all wrong. His basic point is that it's far cheaper to vend high-level apis than give access to the data (because the cost of shipping large amounts of data around is prohibitive). Since the semantic web is basically a data web, one wonders: why doesn't Gray's argument apply?"

Agoric Open Systems "Agoric systems should form an attractive knowledge medium. In a large, evolving system, where the participants have great but dispersed knowledge, an important principle is: "In the incentive structure lies the power". In particular, the incentives of a distributed, charge-per-use market can widen the knowledge engineering bottleneck by encouraging people to create chunks of knowledge and knowledge-based systems that work together."

I've been reading bits and pieces of SP-4221 The Space Shuttle Decision. It's a great insight into the management of projects and also a source of technical details such as why aluminum was picked over titanium, why it has a large delta wing, etc.

Some interesting quotes:
"Troubleshooting, also, was hit-and-miss. We all have had the experience of taking a car to a garage for repair, having a mechanic replace a part, paying the bill - and finding that the problem remains unsolved. Such experiences were also common in the airline industry. The American Airlines managers wrote that...over a recent six-month period, 44 percent of the components replaced during maintenance of the air conditioning system did not eliminate the pilot's complaint. Fifty-two percent of the replacements in the autopilot system did not eliminate the pilot's complaint."

And:
"An in-house review...showed that NASA's principal automated spacecraft programs had increased in price by more than threefold...Gemini had gone from an initial estimate of $529 million, late in 1961, to a final expenditure of $1.283 billion. Apollo, with a program cost estimated at $12.0 billion in mid-1963, ballooned to $21.35 billion by the time of the first moon landing in July 1969...What had caused these overruns? Here too, cost meant people. Major overruns resulted when large technical staffs drew salaries to little effect, as when projects encountered technical stumbling blocks, forcing major redesigns. Such difficulties brought delays and pushed up costs by wasting much of the earlier work."

Failure and Exceptions "Having one big catch clause on the outside really only works if your exception handling philosophy is simply to die...But pretty much all that a try catch block like that can do is blow the request away. There's no ability to respond gracefully. There's no ability to take account of local context to cope and adapt, which is really one of the key hallmarks of truly reliable software."
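The contrast drawn in that quote can be sketched in a few lines of Python. This is only an illustration and is not taken from the interview: the function names (`load_setting`, `handle_request_outer`, `handle_request_local`) and the default timeout are made-up assumptions.

```python
# Hypothetical example: all names and values here are illustrative assumptions.
DEFAULTS = {"timeout": 30}


def load_setting(settings, key):
    """Parse one setting as an int; raises KeyError or ValueError on bad input."""
    return int(settings[key])


def handle_request_outer(settings):
    """One big catch on the outside: any failure blows the whole request away."""
    try:
        timeout = load_setting(settings, "timeout")
        return {"status": "ok", "timeout": timeout}
    except Exception:
        return {"status": "failed"}


def handle_request_local(settings):
    """Catch near the failure, where local context allows a graceful fallback."""
    try:
        timeout = load_setting(settings, "timeout")
    except (KeyError, ValueError):
        # Local knowledge: a missing or malformed timeout has a safe default.
        timeout = DEFAULTS["timeout"]
    return {"status": "ok", "timeout": timeout}


bad = {"timeout": "soon"}
outer_result = handle_request_outer(bad)
local_result = handle_request_local(bad)
```

With the malformed setting, the outer-catch version can only report failure, while the local version knows enough about the failing step to carry on with a default.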

Monday, September 22, 2003

Funky to Go "The power behind Google is that the company owns the algorithms used to find data from the featureless mess of HTML that exists today. The more sophisticated the data storage, the less important the algorithms, and the less edge that Google has. Microsoft, by controlling the origination of much of this data can build in the missing knowledge about the data and basically undercut the ground on which the House of Google is written...But I wouldn't be surprised if Microsoft isn't working on creating its own metadata and ontology XML vocabulary and data model, one that it will share with others, of course, putting it at the center of knowledge-based query in the years to come."

jBNC is a Java toolkit for training, testing, and applying Bayesian Network Classifiers. Implemented classifiers have been shown to perform well in a variety of artificial intelligence, machine learning, and data mining applications.

From the Catalyst story:
"The water supplies of Melbourne and Adelaide are well below 50% capacity and in Perth their reservoirs are less than a quarter full. If the next 18 months are as dry as the last, these cities and their six million residents face a water crisis."

"The heatwaves and fires that we experienced in Australia recently and in Europe currently are indeed a glimpse of the future. We would expect more heatwaves, more droughts and of course a greater stress on people living in cities."

"But while the rest of the world gets warmer due to greenhouse gases, Antarctica is cooling due to ozone loss. So there’s a bigger difference in temperature between the equator and poles. The combined effect of ozone and greenhouse, it seems, is making the polar vortex spin faster."

Unweaving the tangled web of dumb data "Semantic encoding can be particularly useful in inference engines. Encouraging relationships between pieces of information enables you to analyse that information for new relationships...Using such technologies within the corporate firewall is one thing, but building a whole new web based on them is quite another. If we could create a second generation web using semantic technology, the benefits would be huge...The semantic web is not likely to hit your browser any time soon, but the semantic intranet just might. The underlying technology has been on the agenda since the mid-to-late 1990s, but it is now starting to move from theory into commercial products as companies begin to release RDF-capable knowledge management systems and inference engines. UK-based Inference Networks is one such firm, and in the US, Amblit Technologies has a semantic browser, and Intellidimension has an RDF data management system."

Thursday, September 18, 2003

Some of the features include:
* Content is organised into discrete spaces.
* Permissioning per space, on a user or group basis.
* Textile-based text formatting.
* Page templating allowing rapid creation of boilerplate pages.
* Exporting of a whole space or single page to PDF or HTML.
* Dump/restore of the database to XML, with daily backup option.
* Full text searching across all pages visible to a user.
* Multiple RSS feeds for the application and each space.
* Importing of page content from plain text files.
* Attach arbitrary files to any page.
* Tracking of all internal and external links.
* Flexible user and group management.

Wednesday, September 17, 2003

This was a recent one-day seminar held by TopQuadrant at the White House Conference Center that included "...solution stories from both vendors of semantic technologies and agencies that are already using them in applications."

Tuesday, September 16, 2003

"In the context of new work on distributed computation, Semantic Web Services (SWSs) go beyond current services by adding ontologies and formal knowledge to support description, discovery, negotiation, mediation and composition. This formal knowledge is often strongly related to informal materials. For example, a service for multi-media content delivery over broadband networks might incorporate conceptual indices of the content, so that a smart VCR (such as next generation TiVO) can reason about programmes to suggest to its owner. Alternatively, a service for B2B catalogue publication has to translate between existing semi-structured catalogues and the more formal catalogues required for SWS purposes. To make these types of services cost-effective we need automatic knowledge harvesting from all forms of content that contain natural language text or spoken data."

Why pen computing could put you out of business "Being born in an English-speaking nation has, for the last century or so, meant never having to say you're sorry. People all over the world are still dying to try out their English on you... At some level, I'm all for technology lowering the linguistic barriers to communication and commerce...Globalisation is happening, for good or ill, bringing with it something we English-speaking first-worlders will just have to deal with: Tomorrow, we'll be a lot less special than we are today."

Metanology Joins Open-Source Tools Group "Our development team was able to produce an advanced programming tool with features that exceed those provided by non-Eclipse based competitors. Instead of spending our programming effort creating infrastructure, we were able to focus on the features of MDE that would create value for our users."

Sunday, September 14, 2003

Metatomix(TM), Inc. Closes $8.3 Million Venture Round to Bring Real-Time Visibility and Integration Software Technologies to Commercial and Government Markets "Metatomix, founded in December 2000, has developed its Real-Time Visibility and SMARTE(TM) (Surveillance Monitoring and Real-Time Events) Suites based on its innovative application of Semantic Web-based architectures. The Semantic Web extends the Internet by allowing data-driven interactions to enable greater access to all types of information. This evolution extends the original Web concept of people interacting with computers to the sea changing philosophy of machines interacting with machines. Metatomix has harnessed this data-driven network computing technique to deliver real-time information integration and visibility software for commercial applications."

Metatomix has a piece of software they call the Hologram Store which "...is stored in a form that allows the data to be queried and coalesced from a variety of perspectives. The data is captured and expressed in Resource Description Framework (RDF), a W3C (World Wide Web Consortium) standard that is a form of XML. By using this data format, new data sources are easily added without the requirement to redesign or reprogram the data model."

Saturday, September 13, 2003

"Beyond the great wall of data on the Internet lies a goldmine for enterprises called the Semantic Web.

Based on standards pioneered by the W3C, the Massachusetts Institute of Technology, Hewlett-Packard and a network of grassroots communities, the Semantic Web uses the Resource Description Framework (RDF) to piece together a variety of applications using XML for syntax and URLs for naming."

e-Business & Web Services: The Missing Semantic Metadata Link "The Universal Data Element Framework (UDEF) is a cross-industry metadata identification strategy designed to facilitate convergence and interoperability among e-business and other standards. The objective of the UDEF is to provide a means of real-time identification for semantic equivalency, as an attribute to data elements within e-business document and integration formats."

A recent discussion on RDFIG "I work for a cellphone firmware company, where I have been pitching the idea that we would be smart to start putting some semweb code into our product. If anybody is curious, I'd like to briefly describe my strategy and hear any feedback you folks might have...The internal marketing involves finding pre-deployment internal apps. A few of these are PIM data, putting timestamps and text annotations on photos and sound bites, and metadata descriptors for internal applets. There is a lot of interest in tracking user preferences."

Friday, September 12, 2003

Recording Query Results "This document describes a way to record the results of queries where the queries are from languages that return bound variables. The recording of the results is in RDF, enabling graph comparison to be used for testing whether two sets of query results are equivalent."

I'm not sure whether I'd read this before or not. It will be good to get a proper HTTP client in with Joseki support for Kowari.
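A rough sketch of why an order-independent representation helps when testing query results. It uses plain Python sets as a stand-in rather than RDF, the variable names and values are made up, and it ignores blank nodes, which real RDF graph comparison has to handle.

```python
def as_result_set(rows):
    """Turn a list of {variable: value} solutions into an order-free set."""
    return frozenset(frozenset(row.items()) for row in rows)


def equivalent(rows_a, rows_b):
    """True when both result lists contain the same distinct solutions."""
    return as_result_set(rows_a) == as_result_set(rows_b)


# Two runs returning the same bindings in a different row and column order.
run_1 = [
    {"x": "http://example.org/a", "name": "Alice"},
    {"x": "http://example.org/b", "name": "Bob"},
]
run_2 = [
    {"name": "Bob", "x": "http://example.org/b"},
    {"name": "Alice", "x": "http://example.org/a"},
]
# A run that is missing one solution.
run_3 = [{"x": "http://example.org/a", "name": "Alice"}]

same = equivalent(run_1, run_2)
different = equivalent(run_1, run_3)
```

Because sets are used, duplicate solutions collapse into one; a test that cares about duplicates would need a multiset instead.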

Data visualisation: is it coming of age? "Fractal Edge's unique approach to data visualisation has enabled the power of the underlying data to be greatly enhanced. Data is better explained and understood within a defined vision or context. Huge volumes of rapidly changing information may be presented on screen in visual form without losing detail. Bottlenecks in the access and delivery mechanisms for data are eliminated. Data may be colour coded for ease of macro presentation and identification purposes and the structure of the data may be easily arranged depending on criteria important to the user.

Fractal Edge applications map data neutrally. They can aggregate multiple proprietary and third party data sources to bring a combined view of relevant sources of information. Basic adapters are available to CSV and Excel and more sophisticated adapters to Windows and Bloomberg, the financial data and market information providers. One-way and two-way links are available to data sources. Fractal applications allow launch of functionality from underlying applications."

Here's another article on IBM's Web Fountain: " The result is a new online service called WebFountain. A big computer at IBM hoovers up web pages and information from other sources such as newsgroups, syndicated content and newswires. Each incoming page is analysed to determine what language it is in. The context—a news report, a page on a company's website, a web-log entry—is determined. Verbs, nouns, adjectives, proper nouns, place names and even entire phrases are extracted, and are analysed for positive or negative connotations. The page is also classified by category—is it about baseball, Iranian politics or global warming?"

Sunday, September 07, 2003

A TALES syntax for RDF "It is becoming increasingly important to have the ability to link documents along more dimensions than just hyperlinks and to specify partial metadata. The standard model for this is the W3C's RDF. Using RDF, a cross-planet, distributed relational system can be set up, relating resources by predicates...We feel that TALES provides a system for a concise, yet explanatory format for RDF queries. With the current plans for an RDF layer in Plone, it is felt that developing a good syntax for this tool to use in formulating its queries would be beneficial."

Pondering RDF Path "Most RDF Path proposals to date are sketchy, and do not provide clear equivalents of the facilities in XPath, or do not account for the fact that selections of a resource occurring as a node and an arc causes problems."

Saturday, September 06, 2003

RDF Templates "RDF Templates (RDFT) are an XML format for creating representations of RDF graphs. In a similar way to XSLT, RDF Templates define template rules with patterns which are matched against nodes. Template rules specify output actions and further node selections which trigger further template operation. However, instead of acting on an XML tree, RDFT acts upon an RDF graph. Nodes are specified using a 'nodepath' syntax which defines conditional node/arc/node graph traversals. A macro definition facility is provided to reduce long nodepaths to easier to read strings."
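To illustrate what a node/arc/node traversal means, here is a minimal sketch over a set of triples. It does not use RDFT's actual nodepath syntax, which the quote does not show; the `follow` helper and the example data are assumptions made for illustration.

```python
# Hypothetical graph as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:doc1", "dc:creator", "ex:alice"),
    ("ex:doc2", "dc:creator", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:alice", "foaf:knows", "ex:bob"),
}


def follow(triples, start_nodes, arcs):
    """Walk from start_nodes along each arc (predicate) in turn."""
    current = set(start_nodes)
    for arc in arcs:
        # Step node -> arc -> node: keep objects reachable over this arc.
        current = {o for (s, p, o) in triples if s in current and p == arc}
    return current


# "The name of the creator of doc1"
names = follow(TRIPLES, {"ex:doc1"}, ["dc:creator", "foaf:name"])
# "The names of people the creator of doc1 knows"
friend_names = follow(
    TRIPLES, {"ex:doc1"}, ["dc:creator", "foaf:knows", "foaf:name"]
)
```

A template rule in the XSLT style would then attach an output action to the nodes such a path selects.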

The transcript includes quotes such as "I think you should see this...It’s just a kid.", "One little thing can solve an incredibly complex problem.", "Knowledge amplification. What he learns, we all learn. What he knows, we all benefit from.", "Plumbing, it’s all about the tools." and "THE FUTURE IS OPEN". Remind you of anything else?

Thursday, September 04, 2003

"TKS has been available commercially for two years, but is in the final stages of release under the Mozilla Public License, version 1.1. Release under the MPL is anticipated in October, 2003."

Cool, this isn't the currently distributed version of TKS either. This is the new, you-beaut 64-bit version, with streaming results and disk-based queries. In the next few days there should be some more information about this. I'd like to get some sort of paper for WWW2004 done - maybe based on the data structures used to store RDF.

Tuesday, September 02, 2003

"The Streaming API for XML (StAX) parsing will specify a Java-based, pull-parsing API for XML. The streaming API gives parsing control to the programmer by exposing a simple iterator based API. This allows the programmer to ask for the next event (pull the event) and allows state to be stored in a procedural fashion."
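StAX itself is a Java API, so the following is not StAX. As an assumption-laden illustration of the same pull style, Python's standard library offers `xml.dom.pulldom`, where the caller asks for each event in turn. The sample document and the `read_items` helper are made up.

```python
from xml.dom import pulldom

XML = "<feed><item id='1'>first</item><item id='2'>second</item></feed>"


def read_items(xml_text):
    """Pull events one at a time and collect (id, text) for each <item>."""
    events = pulldom.parseString(xml_text)
    items = []
    for event, node in events:  # each iteration pulls the next event
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            # Pull the rest of this element so its text child is available.
            events.expandNode(node)
            items.append((node.getAttribute("id"), node.firstChild.data))
    return items


items = read_items(XML)
```

The loop body is ordinary procedural code: the program decides when to advance and keeps its state in local variables, rather than in callbacks as with a push parser such as SAX.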

"If you're reading this and thinking "all he's saying is that lucene is useful for indexing xml fragments," you're halfway there. And if you're an XML lunatic who then says "Hey! Wow! And the world, in its entirety, is entirely composed of XML (or possibly RDF) fragments," then you've gone way too far. What I know is that the world is mostly made up of semi-structured data and I know that database schemas often evolve at a ferocious rate because, when we impose more structure, we often get it wrong.

And so now what I'm wondering is if I was completely off base in 1997. That is, I'm wondering if Moore's law really says that relational databases are going to become vastly less important over time, because for most applications there's a less-structured (and less efficient) way to do things that's more convenient for the programmers."

Well, being a Semantic Web blogger, I'd have to say that the relational model, as it is applied in RDF, is only going to increase in usage, not decrease. Things like JDBM (used for the backend of LDAPd) and hopefully TKS will become increasingly commoditized into being just something you throw in like logging.

Monday, September 01, 2003

I'm trying to think of what to call an open source project. I started thinking of Australian animals, especially ones that start with "j" like jumbuck or jabiru. Jabiru was good because it's the biggest stork, stork -> store, wading in data, etc. I got all excited over "tyuk" which sounds like "chook" as both mean chicken. Lots of chicken references. Pretty sad. Any ideas? Taipan? Cod? Krill?