semantic web
Submitted by administrator on Sat, 05/17/2008 - 19:12.
One of the early design concepts behind the XHTML Role Attribute Module, which has got a little lost, is that elements from any language can provide a handy source of role values too.
To reconstruct the logic:
A role value is simply a URI, or resource. The reason for this is so that the extensibility hook that we're creating puts us straight into the world of RDF.
Now, some values of @role will need to be invented. This might be because they simply don't exist, or because we want the values to be 'cross-cutting', and apply to many different mark-up languages.
But there are many values that already exist, that are suitable for use in a variety of situations. For example, XForms has a hint element, that can apply to its form controls:
<xf:input ref="surname">
  <xf:label>Surname:</xf:label>
  <xf:hint>Please enter your surname or family name</xf:hint>
</xf:input>
The semantics of 'XForms hint' are pretty well defined, so it should be straightforward to apply them to other situations. For example, an Ajax library could pick up a hint and do something with it in an (X)HTML document, even without XForms:
<input name="surname" />
<div role="xf:hint">
  Please enter your name
</div>
SVG
This whole topic came up recently because someone asked whether it would be possible to add some new values for role which would identify paragraphs, sections, headers, and so on, and that could be used in languages like SVG; but the answer is that if we use the XHTML p, section, h1, h2, etc., values then we don't need to invent new roles:
<svg:text role="xh:h1">Metadata</svg:text>
<svg:text role="xh:p">
  Metadata is data about data...which is also data...kind of turtles all the way down...
</svg:text>
As you can see, a role-aware voice system would be able to provide feedback to a user in any mark-up language, simply by knowing XHTML role values.
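To make the idea concrete, here is a minimal Python sketch of how a role-aware processor might resolve role values to URIs, whatever the host language. The prefix bindings in PREFIXES and the function names are assumptions made for illustration; the post does not say which URIs xh and xf are bound to.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix bindings: the post does not state which URIs
# "xh" and "xf" map to, so these are illustrative assumptions only.
PREFIXES = {
    "xh": "http://www.w3.org/1999/xhtml#",
    "xf": "http://www.w3.org/2002/xforms#",
}


def expand_role(role, prefixes=PREFIXES):
    """Expand a prefixed role value such as 'xh:h1' into a full URI."""
    prefix, sep, reference = role.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand role value: {role!r}")
    return prefixes[prefix] + reference


def roles_in(markup):
    """Return (role URI, text) pairs for every element carrying @role."""
    root = ET.fromstring(markup.strip())
    found = []
    for element in root.iter():
        role = element.get("role")
        if role is not None:
            # Collapse whitespace so indented mark-up gives clean text.
            text = " ".join("".join(element.itertext()).split())
            found.append((expand_role(role), text))
    return found


SVG_DOC = """
<svg:svg xmlns:svg="http://www.w3.org/2000/svg">
  <svg:text role="xh:h1">Metadata</svg:text>
  <svg:text role="xh:p">Metadata is data about data...</svg:text>
</svg:svg>
"""

for uri, text in roles_in(SVG_DOC):
    print(uri, "->", text)
```

The processor never looks at the element names, only at the role URIs, which is why the same code would work on (X)HTML, SVG or any other mark-up that carries @role.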
Submitted by administrator on Thu, 05/01/2008 - 12:50.
May is going to be pretty busy with talks about XForms, RDF and RDFa coming up.
First up is my talk XForms, REST, XQuery...and skimming at XTech 2008. The talk embraces themes I've been pursuing for a couple of years now; that as we put more functionality into the client, and servers get 'cleverer', it becomes much easier to build sophisticated web applications. Of course server technologies are moving so fast now that this whole approach is making more and more sense, so I'm looking forward to taking in recent developments in my talk. For example, both Amazon and Google effectively have 'databases in the cloud' that can be used to store data and query it, via APIs, with literally no configuration.
A few weeks later I'm going to be giving a tutorial on RDF at SemTech. This is an interesting development for me, because RDF and the semantic web were always my first interests--before XForms and before XHTML 2. (As well as writing RDF parsers, and designing applications, I also contributed chapters on RDF and RDFS to a couple of books on metadata and XML.)
But one problem I always had when trying to build semantic-web applications was that defining the user interface was pretty hairy. This was partly because RDF Schema is tricky to process, but also because HTML was insufficiently powerful in its core feature-set, so the translation from RDF to HTML involved a lot of work.
The need for a user interface language that was much richer than HTML was therefore why I got involved in the XForms standard (and worked with a team of people to produce the first fully conforming XForms processor, formsPlayer). So although it may not seem directly connected to the semantic web, I believe that in the coming period XForms will start to become a key part of the semantic web's architecture.
Another problem I kept coming up against whilst developing for the semantic web was the difficulty in actually publishing metadata. In particular I always found it frustrating that there was a lot of really useful metadata just sitting in ordinary web pages, and no-one could get at it. Attempting to resolve this problem gave rise to RDFa, and I'm excited that the RDFa in XHTML working draft is extremely close to becoming a stable recommendation. And as interest in RDFa grows, I'm pleased to say that some of my other presentations in May will be 'tech talks' on RDFa at Yahoo!, eBay and Google. (I'm really excited that I might be getting to meet some of the guys behind Yahoo!'s SearchMonkey.)
My final talk of the month will be at the excitingly-named Kings of Code, and I'm looking forward to talking about XHTML, XHTML 2, HTML 5, XAML, and anything else I can think of in relation to web languages.
Submitted by administrator on Fri, 12/14/2007 - 16:51.
I've not yet had a chance to write up my experiences at XML 2007 in Boston last week, but an announcement today by Amazon prompts me to at least summarise one of the talks I did, called XForms, REST, XQuery...skimming. (Slides)
It's a theme I've touched on a little in this blog, and have spoken on before (at an XML UK meeting, last year, and at XTECH 2007, this year), and the idea goes something like this: building web applications by dynamically generating user interfaces that contain data is awkward, and difficult to maintain. Web services (which include REST and APP) remove some of the dependency on databases by hiding the data behind a simple abstraction layer--a kind of 'ODBC for the web'--and XQuery takes this further by giving us a standard language to get at the data.
But XForms is the icing on the cake, since it gives us a rich-client language that can be deployed like traditional web applications, yet is one that ensures a clean separation of the data and the user interface.
During the course of the presentation I showed how incredibly easy it is to create an application with XForms, since forms can be first created on the desktop--loading and saving their data from local XML files--before being deployed to any type of web server. By creating static forms that load their own data, rather than the more common model of dynamically generating static files that contain data, we end up with files that are completely agnostic about where they are delivered from. My demonstrations took them from the desktop, to a simple web server, to eXist (an XML database).
But these applications could even run in 'the cloud', with no web-server. That may sound odd, but I could store any of these XForms applications in SVN or Amazon's S3 and they would work as well as if they were in IIS or Apache. This is the ultimate in simple maintenance and future-proofing, since you are no longer reliant on any particular operating system or server-side language...the forms just 'are'.
GData
The final step of my presentation was to take this a little further and show an XForm that interacts with Google's GData; the form is simple, and accepts a longitude, latitude and comment which is then inserted as a row into a Google Spreadsheet, using Atom Publishing Protocol (APP). Another form is then used to retrieve the entire spreadsheet, and each row is used to create a pin on a map.
The great thing about this demonstration is that it shows that not only are our forms now agnostic about where they run--desktop, server-side, S3, etc.--but now the data becomes agnostic; we don't care how the data is stored, all we are concerned about is the protocol used to interact with it.
And crucially, XForms' ability to get at this data shows that it really is the missing link when it comes to building a new kind of web application.
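As a rough illustration of the row-insertion step, the Python sketch below builds the kind of Atom entry that a form might submit over APP. It does no network I/O. The column names (latitude, longitude, comment) come from the description above, but the gsx extension namespace and the exact entry shape are assumptions that should be checked against the GData documentation; in the demonstration it is the XForm itself that produces and submits the entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Assumed extension namespace for spreadsheet columns; verify before use.
GSX_NS = "http://schemas.google.com/spreadsheets/2006/extended"

ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("gsx", GSX_NS)


def build_row_entry(latitude, longitude, comment):
    """Build an Atom entry carrying one spreadsheet row as a string."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    columns = (
        ("latitude", latitude),
        ("longitude", longitude),
        ("comment", comment),
    )
    for name, value in columns:
        cell = ET.SubElement(entry, f"{{{GSX_NS}}}{name}")
        cell.text = str(value)
    return ET.tostring(entry, encoding="unicode")


print(build_row_entry(51.5, -0.12, "Alexandra Palace"))
```

With APP, an entry like this would be sent with an HTTP POST to the collection's feed URI; the client needs to know only the protocol, not how the rows are stored.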
SimpleDB
But this generalisation about data just took another leap with the announcement today by Amazon of a closed beta called SimpleDB. This will allow programmers to store and query for data in a manner similar to the way that GData works.
By putting data into the cloud, and then using an easily deployed, declarative rich-client language like XForms, it is possible for the architecture of web applications to be altered in a quite fundamental way, in a direction that is completely different to what we're used to.
Submitted by administrator on Wed, 10/10/2007 - 10:41.
I have been using the eXist native XML database/web server for about nine months now and it is starting to change the way I think about metadata management.
My latest project for a financial institution requires us to quickly build XForms to manage various metadata as well as data. What I am finding is that my old method of storing metadata in XML files on a file system and then transforming the metadata using Apache Ant was a complex process. My new approach is to store the XML directly in eXist. This used to be a little bit hard, since I thought that you had to use the eXist web interface to upload each XML file and Ant scripts to back up your eXist database.
This all changed when I was shown how to use the Microsoft Windows WebDAV tool. Copying files to eXist and backing up the entire data store is just a drag and drop using Windows.
Now an entire new set of metadata web services is becoming much easier to build. Take the simple task of building a pick list of enumerated values for a form. XForms allows you to use the select1 control and specify an itemset using an XPath expression. I can now just load the data elements into an instance and grab the values and labels directly from the enumerations in the metadata registry files.
The only drawback to this is that you load more metadata (like the full definitions) than you need to build the form. But once again eXist comes to the rescue. It takes just a few lines of code to create a little web service (using XQuery) that you pass a code table to and that returns just the label/value pairs. Using this method the selection lists are always up to date and don't require any "batch" updates.
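The service described above is written in XQuery inside eXist; the Python sketch below only illustrates the transformation it performs, reducing a registry entry to the label/value pairs a form needs. The registry format (dataElement, enumeration, label, value, definition) and the output element names are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry entry; real registry files will differ.
REGISTRY = """
<dataElement name="AccountTypeCode">
  <definition>The kind of account held by a customer.</definition>
  <enumerations>
    <enumeration>
      <value>CHK</value>
      <label>Checking</label>
      <definition>An account used for day-to-day payments.</definition>
    </enumeration>
    <enumeration>
      <value>SAV</value>
      <label>Savings</label>
      <definition>An interest-bearing deposit account.</definition>
    </enumeration>
  </enumerations>
</dataElement>
"""


def code_table_items(registry_xml):
    """Return an <items> document holding only label/value pairs."""
    root = ET.fromstring(registry_xml.strip())
    items = ET.Element("items")
    for enumeration in root.iter("enumeration"):
        item = ET.SubElement(items, "item")
        # Copy the two fields the form needs; definitions are dropped.
        ET.SubElement(item, "label").text = enumeration.findtext("label")
        ET.SubElement(item, "value").text = enumeration.findtext("value")
    return ET.tostring(items, encoding="unicode")


print(code_table_items(REGISTRY))
```

A select1 itemset could then bind to the item elements of the returned document, so the form carries none of the longer definitions.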
What I am learning from this is that in the past, metadata management was usually an afterthought: something that the coding standards people used to enforce database column naming conventions. But with metadata being stored in eXist, metadata becomes part of application services. Building apps is just assembling forms that pull metadata from the registry in real time.
If you are concerned that the metadata registry server will be overloaded with requests for information each time forms load, remember that these services are RESTful, so the results can be cached and don't have to be regenerated. I still have more to learn about how to make these services fast, but since metadata is small it can usually stay in RAM, so disk I/O is very limited.
All of these developments are just small pieces of the puzzle of putting well-managed metadata at the core of your enterprise development methodologies. It is really the heart of the model-driven enterprise.
Let me know if you are creating metadata web services. I would like to know what things you feel are useful to your users.
- Dan
Submitted by administrator on Tue, 08/28/2007 - 19:23.
A couple of recent discussions in the RDFa and microformat communities concern areas of particular interest to those of us working on Sidewinder, a semantic web applications framework.
The initial discussion is taking place on the microformats lists, and concerns how to allow authors to indicate what actions are available to be performed on items appearing in a document. The second discussion is taking place on the 'RDF in XHTML Task Force' list; this post provides a good summary of some of the issues.
All in all I find these very exciting discussions, because they concern exactly the types of use-case that prompted me to get involved with the XHTML work at the W3C a number of years ago.
This was because I'd been trying to create the kind of flexible user interface that these threads are describing--no doubt just as lots of other people had--and in my own endeavours I ran up against a number of very serious problems that made me conclude that it was pretty much impossible with the technologies available at the time. And since I still haven't seen a convincing solution to the problem of creating extremely flexible user interfaces, I've concluded that the issues I ran up against are of quite a fundamental nature.
Some of the problems that seem to me to be absolutely necessary to solve are:
- an HTML page contains insufficient metadata about what its content 'means', making it difficult to work out what kind of UI constructs to render;
- even if you are able to work out what the data means, HTML is not itself powerful enough to express the kinds of complex user interfaces that you would want to 'bind' to this underlying data;
- and even if you work out what the data means and define complex UI components, you still can't define binding rules that indicate what widget to use with what data;
- the browser offers only one 'paradigm' for interacting with information of interest, whilst we often want to create applications that can make use of the same rich features.
For every problem...
The first problem--that HTML is not 'rich' enough--is now largely solved by RDFa. It has taken quite a long time to get here, but I think the effort that has gone into getting RDFa right is going to be worth it. The key thing that RDFa does is to get RDF into HTML--once you've got that, all sorts of possibilities open up.
The second problem--that it's difficult to define rich user interfaces with only HTML--is largely solved by XForms. That XForms is a solution is currently not obvious to most people so XForms remains peripheral in application development at the moment. Of course it is possible to define widgets using script but it quickly gets very messy. There is also an enormous problem of re-use, in the recursive sense; most Ajax libraries are works of art, but very few can support the kind of complexity we need. Take the example shown here:

Here we have widgets that are made up of other widgets to an arbitrary depth, but the key thing is that the binding mechanism is based on abstractions; it could be the run-time data type or an abstract widget type, and in both cases it means that at any point in the hierarchy a different widget could be swapped in without disturbing anything above or below. This kind of complexity is extremely difficult to achieve with procedural languages. (See also Introduction to custom controls and Understanding the MVC separation.)
The third problem--the use of binding rules to indicate which widget should be connected to what data--is still a little in the air. One part of it we've solved by specifying binding 'rules' using XPath selectors that are data type aware. This means that we can indicate that data of type 'geo location' should have one set of behaviour bound to it, whilst data of type 'time' can have a different set. These are essentially binding stylesheets and I think this is where we can answer the question posed on the RDFa list as to whether it is the author, the end-user or the browser vendor that should be in control of the widgets--the answer is all of them! By allowing users to express binding rules that override those from authors, we can achieve the best of both worlds.
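The override behaviour can be sketched as a small cascade, much like the one used for stylesheets. This Python sketch is an assumption-laden illustration, not how Sidewinder does it: it matches on a plain data type string rather than an XPath selector, and the origin names, precedence order, data type names and widget names are all hypothetical.

```python
from dataclasses import dataclass

# Assumed precedence: user rules beat author rules, which beat defaults.
PRECEDENCE = {"browser": 0, "author": 1, "user": 2}


@dataclass(frozen=True)
class BindingRule:
    origin: str     # "browser", "author" or "user"
    data_type: str  # e.g. "geo:location" or "xsd:time"
    widget: str     # name of the widget to bind


def resolve_widget(data_type, rules):
    """Pick the widget from the highest-precedence rule that matches."""
    candidates = [rule for rule in rules if rule.data_type == data_type]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: PRECEDENCE[rule.origin]).widget


RULES = [
    BindingRule("browser", "xsd:time", "text-input"),
    BindingRule("author", "xsd:time", "svg-clock"),
    BindingRule("author", "geo:location", "map-pin"),
    BindingRule("user", "xsd:time", "speaking-clock"),
]

print(resolve_widget("xsd:time", RULES))
print(resolve_widget("geo:location", RULES))
```

Because the user has a rule for the time type, it wins over the author's; for geo location the author's rule applies because nobody has overridden it.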
Finally, the problem that the only paradigm we have for interacting with web-based data is 'browsing' is what we are trying to solve with Sidewinder. The ultimate goal is to have a framework that can be used to build any type of internet-facing application, whether web or desktop based. By allowing all applications to make use of the same semantic functionality and the same binding rules we can allow the users themselves to get control of their data and their applications. (See also Web 2.0, Copernicus and Sparticus: Moving the centre of the web.)

Once you have this kind of 'platform' (see Platform 2.0) then things really start to open up. I could build a clock widget using Silverlight, perhaps, and have that appear anywhere that the time data type is used--whether in my messaging client, my email client, my Twitter gadget, my Facebook desktop notification system, and so on. But you might choose a clock built using SVG, whilst someone else might decide to have a clock that speaks. But any of us could swap one of these behaviours for another at will, for a component we have created or one that we have downloaded from elsewhere--the ultimate, flexible, programming platform.
Submitted by administrator on Wed, 08/22/2007 - 08:42.
The XHTML.com site have published the first part of a two-part series on what might be wrong with the web, and how we might fix it. The idea is that the first part consists of short opinion pieces by people involved in various ways with the web, and the second part is a collection of responses from anyone who wants to send something in.
Contributors are Chris Wilson, Daniel Glazman, Joe Clark, Doug Geoffray, Roberto Scano, Jeffrey Veen, Dave Raggett, Mike Andrews, James Pearce, Nova Spivack and myself.
In my comments I focused on the need for a better way to create standards:
I actually have a very positive view of the transition to the 'future Web', and I see many of the things that one might say we need for a better Web already emerging. For example, we need the differences between browsers to be removed, but due to the high quality of libraries such as YUI, Dojo, and Prototype this is well underway. We also need there to be a convergence of application development techniques, so that the same languages can be used for the Web, desktop applications, gadgets, widgets, and so on; but with an explosion of interest in HTML, JavaScript and XForms--as well as the appearance of multi-language platforms like Silverlight--this trend also looks set to continue.
So whilst I might say that these things are crucial to the future of the Web, they are also very much underway, and barring a collective overnight loss of memory by the development community, look certain to continue. But there is one thing that to me doesn't look so assured, and that is whether standards are agreed upon around Ajax, and more generally, the whole approach to standards development.
In general, standards (at least those from the W3C) are developed in a kind of 'kitchen-sink' style, where some specification tries to include just about everything that can be devised on a particular topic--and therefore takes years to write. A good example is SVG, and even, to some extent, my own favourite, XForms. In both cases there are many useful things that could be factored out of these specifications to be made available elsewhere as smaller, more manageable components, but it is only recently that this approach has started to be tried. Good examples of the 'bite-size' approach are the 'role attribute', 'access', and RDFa, which provide techniques for adding metadata to web documents as well as making them more accessible. (See also The XHTML role attribute: small and perfectly formed and Using RDFa in XHTML 1.)
A good measure that some are learning that this is a promising approach to writing specifications is the backplane initiative at the W3C. But unfortunately, an illustration that lessons haven't been learned is the W3C's embracing of HTML 5, which is not only a case study in kitchen-sink design--let's throw everything in there--but a case study too of the problems caused by the 'not invented here' mindset.
If the W3C continues to support the creation of enormous specifications that take years rather than months to complete, then it will almost certainly become increasingly irrelevant to the 'future Web'. Of course, based on the W3C's lack of coherent leadership around the whole HTML 5 question, some might not see that as necessarily a bad thing, but it does raise the question as to whether other organisations can fill the space with a better process. For example, can the Open Ajax Alliance take a much more focused approach to standards writing, creating small specifications that can be easily combined? If so, it's possible that the vacuum would be filled, and the future of the 'future Web' would stand on a firmer footing.
Whilst I'm not quite sure where the future of standards creation will come from, the sheer dynamism we see in the Web development space makes me certain that it is imminent. If we can devote even a fraction of that creative energy to evolving a new approach to standards development, the future of the 'future Web' looks very bright indeed. But if we don't, then for the foreseeable future we are looking at increasing fragmentation, and the lack of standardisation in the browser merely being transferred to a 'lack of standardisation in the libraries'.
Anyone is free to respond to the discussion, which needs to be done by September 10th. They'll then use the responses to create 'part 2'.
Submitted by administrator on Fri, 06/15/2007 - 19:16.
So tomorrow morning it's off to Alexandra Palace for the start of Hack Day London. My wife is raising an eyebrow as I ready my sleeping-bag, and she may have a point...but I'm secretly quite looking forward to hanging out with a lot of smart people, hearing about their good ideas, with no-one around to tell me what time to go to bed. For myself, I'm hoping to get involved in projects relating to metadata and rich-clients. I'm sure there will be people working in this area, but if not, I've got plenty of new things we've been experimenting with that I might be able to get others to take a look at.
RDFa Parsing
Perhaps the most interesting is the use of our RDFa parser within a blog or other document. By adding metadata to elements in a document you are providing a hook which gives you more information about some item. Onto this hook you can 'hang' further functionality or more information.
To illustrate, let's say I have a cookery blog on which I mentioned Canteen Cuisine by Marco Pierre White. Given that I probably want to make some money from the site with my Google Ads and reseller links, I will most likely place a link to a site like Amazon around the words "Canteen Cuisine":
I found a good recipe in
<a href="http://www.amazon.com/gp/product/0091808189?ie=UTF8&tag=escuelerie-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0091808189">
  Canteen Cuisine
</a>
which you really should buy so that I get a kick-back.
Obviously this satisfies my commercial yearnings, but it doesn't really contribute to the great 'mass' of semantic information that others might find useful. It doesn't actually tell you which book is being referred to, only that there is a link to a page on Amazon. It would be far better--and easier, as it happens--to simply tag the book with an ISBN number, and leave the creation of links to Amazon, or Barnes and Noble, to some other layer of behaviour. The key is to use RDFa to indicate a unique identifier for the book, and also to indicate an object type, i.e., to indicate that the item we're dealing with is really a book:
I found a good recipe in
<span xmlns:bib="http://somebig.org/" about="urn:ISBN:0091808189" class="bib:book">
  Canteen Cuisine
</span>
which you really should buy so that I get a kick-back.
If you know RDF then you'll understand that the generated triples are:
<urn:ISBN:0091808189> rdf:type <http://somebig.org/book> .
If you don't know RDF don't worry; the main point is that we've added mark-up that says nothing more than:
"urn:ISBN:0091808189" represents a book.
But that simple fact is pretty useful, since now we could use the ISBN number to go and get more information from Amazon, and then display the data in a tooltip, or add links so that a reader of the cookery blog could buy the book or add it to their wish-list.
Before we look at how we get that data, it's worth pointing out that Amazon does offer the Quick Linker widget for TypePad users, but you'll see that the mark-up looks very strange:
<a type=amzn asin=B000012345>I love this item</a>
In some ways it's similar to what I've shown with RDFa, where we use a small amount of mark-up as a hook to generate richer functionality, but in their case it's an Amazon-specific solution. A key design goal of RDFa is to provide a generic solution that would work with all types of data.
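For readers who want to see the generic approach at work, the Python sketch below derives the rdf:type triple from the book mark-up shown earlier, using only about, class and the xmlns:bib declaration. It is a deliberately tiny illustration and not our RDFa parser: it handles just this one pattern, ignores prefix scoping, and the function name is made up for the example.

```python
import io
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def type_triples(markup):
    """Yield (subject, rdf:type, class URI) for elements with @about and @class."""
    prefixes = {}
    triples = []
    source = io.BytesIO(markup.encode("utf-8"))
    # "start-ns" events arrive before the "start" event of the element
    # that declares the prefix, so the mapping is known in time.
    for event, data in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = data
            prefixes[prefix] = uri  # simplification: prefixes never go out of scope
            continue
        subject = data.get("about")
        curie = data.get("class")
        if subject is None or curie is None:
            continue
        prefix, sep, reference = curie.partition(":")
        if sep and prefix in prefixes:
            triples.append((subject, RDF_TYPE, prefixes[prefix] + reference))
    return triples


POST = (
    '<p>I found a good recipe in '
    '<span xmlns:bib="http://somebig.org/" about="urn:ISBN:0091808189" '
    'class="bib:book">Canteen Cuisine</span>.</p>'
)

for triple in type_triples(POST):
    print(triple)
```

Nothing in the function is specific to books or to Amazon; a different class value would simply produce a different type URI.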
RDFa action handlers
Once the RDFa parser has 'found' items in the page it then does something with them. Up until now all of our experiments have been to simply display data in a different way, such as showing FOAF information as a card with the person's picture, or event information as a pin on a map. The Operator Firefox extension (which now supports RDFa as well as microformats) takes a slightly different approach and allows the user to do things with the data, such as add an hCard to their contacts database.
The Amazon book experiment is a first step towards combining these approaches. The action handler that gets run when a book is found actually goes to get further data about the book from Amazon, before it shows a panel that contains an image of the book and its title. This means that we don't need to know any URLs for the book on Amazon or where its images are located, since they are derived in the action handler; all we need is the ISBN number.
Yahoo! Pipes
But an interesting twist here is that we don't go directly to Amazon to get the data, but instead go via Yahoo! Pipes. There are many reasons for doing this--not least that it fits well with my ideas about skimming--but the main one is simply because Pipes gives us JSON data for any feed we can access, which works nicely with the model we're building here. By running all of our data requests through Pipes we can provide a single format to our RDFa processor, even if we decide to change the source of our data at some later point, or add further services into the pipe to add more metadata.
The pipe that we've defined for Amazon actually takes the URI of the ISBN number (such as urn:ISBN:0091808189) rather than just the ISBN number, because at some point I'd like this to be a proper RDF query. So I'm actually saying, "give me everything you know about this URI", and whilst that URI currently represents a book, in the future the same query might be able to return information about URIs that represent people, ships, planets, and so on.
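The flow from parsed triple to rendered panel can be sketched as a handler registry keyed on type URI. This Python sketch is an illustration under stated assumptions: the registry, the decorator, the handler and the shape of the returned data are all hypothetical, and fake_fetch stands in for the JSON that a pipe would return, so no network request is made.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BOOK_TYPE = "http://somebig.org/book"

HANDLERS = {}  # maps a type URI to the action handler for that type


def action_handler(type_uri):
    """Decorator that registers a handler for items of the given type."""
    def register(func):
        HANDLERS[type_uri] = func
        return func
    return register


@action_handler(BOOK_TYPE)
def show_book_panel(subject, fetch):
    # Ask the data source for everything it knows about this URI.
    details = fetch(subject)
    return {
        "subject": subject,
        "title": details.get("title"),
        "image": details.get("image"),
    }


def run_handlers(triples, fetch):
    """Run the matching handler for every rdf:type triple found in a page."""
    panels = []
    for subject, predicate, obj in triples:
        handler = HANDLERS.get(obj) if predicate == RDF_TYPE else None
        if handler is not None:
            panels.append(handler(subject, fetch))
    return panels


def fake_fetch(uri):
    # Stand-in for the JSON a pipe might return; values are made up.
    return {"title": "Canteen Cuisine", "image": "cover.jpg"}


found = [("urn:ISBN:0091808189", RDF_TYPE, BOOK_TYPE)]
print(run_handlers(found, fake_fetch))
```

Passing fetch in as a parameter mirrors the point made above: the handler depends on a single data format, so the source behind it can change without touching the handler.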
What's next?
Everything we've been working on so far is about providing a solid widget framework that can be used to enhance any page on any site, in a simple and consistent way. I'm sure that at Hack Day there will be people working on ideas to do with enhancing blogs, ideas related to using the semantic web, microformats (whether in the traditional or RDFa style), and many other things besides, and so I'm really hoping that this will be an opportunity to move some of this work on, in the context of real applications that people want to build.