Feed items

RDFa hacking at Hack Day

Hack Day: London, June 16/17 2007

So tomorrow morning it's off to Alexandra Palace for the start of Hack Day London. My wife is raising an eyebrow as I ready my sleeping-bag, and she may have a point...but I'm secretly quite looking forward to hanging out with a lot of smart people, hearing about their good ideas, with no-one around to tell me what time to go to bed.
For myself, I'm hoping to get involved in projects relating to metadata and rich-clients. I'm sure there will be people working in this area, but if not, I've got plenty of new things we've been experimenting with that I might be able to get others to take a look at.

RDFa Parsing

Perhaps the most interesting is the use of our RDFa parser within a blog or other document. By adding metadata to elements in a document you are providing a hook which gives you more information about some item. Onto this hook you can 'hang' further functionality or more information.

To illustrate, let's say I have a cookery blog on which I mentioned Canteen Cuisine by Marco Pierre White. Given that I probably want to make some money from the site with my Google Ads and reseller links, I will most likely place a link to a site like Amazon around the words "Canteen Cuisine":

I found a good recipe in <a href="http://www.amazon.com/gp/product/0091808189?ie=UTF8&tag=escuelerie-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0091808189">
Canteen Cuisine
</a> which you really should buy so that I get a kick-back.

Obviously this satisfies my commercial yearnings, but it doesn't really contribute to the great 'mass' of semantic information that others might find useful. It doesn't actually tell you which book is being referred to, only that there is a link to a page on Amazon. It would be far better--and easier, as it happens--to simply tag the book with an ISBN number, and leave the creation of links to Amazon, or Barnes and Noble, to some other layer of behaviour.
The key is to use RDFa to indicate a unique identifier for the book, and also to indicate an object type, i.e., to indicate that the item we're dealing with is really a book:

I found a good recipe in <span xmlns:bib="http://somebig.org/" about="urn:ISBN:0091808189" class="bib:book">
Canteen Cuisine
</span> which you really should buy so that I get a kick-back.

If you know RDF then you'll understand that the generated triple is:

<urn:ISBN:0091808189> rdf:type <http://somebig.org/book> .

If you don't know RDF don't worry; the main point is that we've added mark-up that says nothing more than:

"urn:ISBN:0091808189" represents a book.

But that simple fact is pretty useful, since now we could use the ISBN number to go and get more information from Amazon, and then display the data in a tooltip, or add links so that a reader of the cookery blog could buy the book or add it to their wish-list.
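As a rough illustration of how a processor might arrive at the triple shown earlier, here is a minimal sketch in Python. It is not the parser discussed in this post: the class name TypedItemParser is made up for the example, and it makes the simplifying assumption that the xmlns declaration sits on the same element as the about and class attributes (a real RDFa processor would also honour declarations inherited from ancestor elements).

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class TypedItemParser(HTMLParser):
    """Hypothetical helper: collects (subject, rdf:type, object) triples from
    elements that carry both an `about` attribute and a prefixed `class`."""

    def __init__(self):
        super().__init__()
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        subject = attr.get("about")
        curie = attr.get("class") or ""
        if not subject or ":" not in curie:
            return
        prefix, local = curie.split(":", 1)
        # Simplification: only look for the namespace on this same element.
        namespace = attr.get("xmlns:" + prefix)
        if namespace is None:
            return
        self.triples.append((subject, RDF_TYPE, namespace + local))


markup = (
    'I found a good recipe in <span xmlns:bib="http://somebig.org/" '
    'about="urn:ISBN:0091808189" class="bib:book">Canteen Cuisine</span>'
)
parser = TypedItemParser()
parser.feed(markup)
for subject, predicate, obj in parser.triples:
    print(f"<{subject}> <{predicate}> <{obj}> .")
```

The prefix in class="bib:book" is expanded against the xmlns:bib declaration, which is where the full type URI http://somebig.org/book comes from.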

Before we look at how we get that data, it's worth pointing out that Amazon does offer the Quick Linker widget for TypePad users, but you'll see that the mark-up looks very strange:

<a type=amzn asin=B000012345>I love this item</a>

In some ways it's similar to what I've shown with RDFa, where we use a small amount of mark-up as a hook to generate richer functionality, but in their case it's an Amazon-specific solution. A key design goal of RDFa is to provide a generic solution that would work with all types of data.

RDFa action handlers

Once the RDFa parser has 'found' items in the page it then does something with them. Up until now all of our experiments have been to simply display data in a different way, such as showing FOAF information as a card with the person's picture, or event information as a pin on a map. The Operator Firefox extension (which now supports RDFa as well as microformats) takes a slightly different approach and allows the user to do things with the data, such as add an hCard to their contacts database.

The Amazon book experiment is a first step towards combining these approaches. The action handler that gets run when a book is found actually goes to get further data about the book from Amazon, before it shows a panel that contains an image of the book and its title. This means that we don't need to know any URLs for the book on Amazon or where its images are located, since they are derived in the action handler; all we need is the ISBN number.
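The idea of running a handler whenever an item of a given type is found can be sketched as a small registry keyed on the type URI. This is only an illustration in Python, not the framework's actual code: the names action_handler, run_handlers and amazon_link are invented for the example, and the handler here simply builds a link modelled on the Amazon URL shown earlier in this post, whereas the real handler fetches further data before showing a panel.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BOOK = "http://somebig.org/book"

# Registry mapping an rdf:type URI to the handlers to run for items of that type.
HANDLERS: Dict[str, List[Callable[[str], dict]]] = {}


def action_handler(type_uri: str):
    """Decorator (hypothetical) that registers a handler for one type URI."""
    def register(func):
        HANDLERS.setdefault(type_uri, []).append(func)
        return func
    return register


@action_handler(BOOK)
def amazon_link(subject: str) -> dict:
    """Derive a link from the ISBN URI alone; no Amazon URL is in the page."""
    isbn = subject.rsplit(":", 1)[-1]
    # Assumption: URL pattern copied from the hand-written link in this post.
    return {"subject": subject, "href": "http://www.amazon.com/gp/product/" + isbn}


def run_handlers(triples: List[Triple]) -> List[dict]:
    """Run every registered handler for each typed item that was found."""
    results = []
    for subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue
        for handler in HANDLERS.get(obj, []):
            results.append(handler(subject))
    return results


found = [("urn:ISBN:0091808189", RDF_TYPE, BOOK)]
print(run_handlers(found))
```

Because handlers are looked up by type, supporting another kind of item would mean registering another handler rather than changing the mark-up in the page.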

Yahoo! Pipes

But an interesting twist here is that we don't go directly to Amazon to get the data, but instead go via Yahoo! Pipes. There are many reasons for doing this--not least that it fits well with my ideas about skimming--but the main one is simply because Pipes gives us JSON data for any feed we can access, which works nicely with the model we're building here. By running all of our data requests through Pipes we can provide a single format to our RDFa processor, even if we decide to change the source of our data at some later point, or add further services into the pipe to add more metadata.

The pipe that we've defined for Amazon actually takes the URI of the ISBN number (such as urn:ISBN:0091808189) rather than just the ISBN number, because at some point I'd like this to be a proper RDF query. So I'm actually saying, "give me everything you know about this URI", and whilst that URI currently represents a book, in the future the same query might be able to return information about URIs that represent people, ships, planets, and so on.
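To make the "give me everything you know about this URI" request concrete, here is a hedged sketch in Python of how such a call might be put together and its JSON read back. Everything specific in it is an assumption made for illustration: the endpoint address, the pipe identifier, the parameter names (_id, _render, uri) and the shape of the response (items nested under "value") are placeholders, since the post does not give the real pipe's details. No network request is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical values: the real pipe's address and identifier are not given.
PIPE_ENDPOINT = "http://pipes.example.org/pipe.run"
PIPE_ID = "HYPOTHETICAL_PIPE_ID"


def build_pipe_url(resource_uri: str) -> str:
    """Build a request that passes the whole URI, not just the ISBN."""
    query = urlencode({"_id": PIPE_ID, "_render": "json", "uri": resource_uri})
    return PIPE_ENDPOINT + "?" + query


def items_from_response(body: str) -> list:
    """Return the list of items from a JSON response (assumed shape)."""
    data = json.loads(body)
    return data.get("value", {}).get("items", [])


print(build_pipe_url("urn:ISBN:0091808189"))

# A made-up response, standing in for what the pipe might send back.
sample = '{"value": {"items": [{"title": "Canteen Cuisine"}]}}'
print(items_from_response(sample))
```

Because the caller only ever sees the list of items, the source behind the pipe could be swapped or extended without touching this side of the code.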

What's next?

Everything we've been working on so far is about providing a solid widget framework that can be used to enhance any page on any site, in a simple and consistent way. I'm sure that at Hack Day there will be people working on ideas to do with enhancing blogs, ideas related to using the semantic web, microformats (whether in the traditional or RDFa style), and many other things besides, and so I'm really hoping that this will be an opportunity to move some of this work on, in the context of real applications that people want to build.

Update on CDI XForms

Well, the new version of the Firefox XForms extension came out last week, and all my forms are working again. I am working on a version for Orbeon as well and I have a lot of thoughts about Orbeon vs. Firefox, but will save that for a later post.
I have posted a version of my [...]

Xform lipsynch study

xform, mofukaz! :)

Author: liviutz

Keywords: xform lip synch

Added: May 31, 2007

Mozilla XForms Project

FormFaces.com

XForms Essentials

Impressions of Sem-Tech -07

I just returned from the 2007 Semantic Technology Conference in San Jose California. It was a great conference and opened my mind to several new ideas. Well worth the time!

The conference was held over four days and had around 125 presentations, including tutorials and research projects. There were almost 800 attendees. This is the third Semantic Technology Conference that I have attended and the second at which I presented a paper.

Here are some high-level observations and some patterns that I detected.

The Semantic Web gets the “Web 3.0” Label

Most people at the conference have tried to embrace the idea that the semantic web will be adopting the popular culture label “Web 3.0”. The final straw was the Nov 2006 NYT article by John Markoff, which set the blogosphere abuzz. It described a web that includes technologies to enable intelligent reuse of data. Wikipedia, after a long discussion about whether “Web 3.0” deserved an entry, finally undeleted the page and let it stand. See http://en.wikipedia.org/wiki/Web_3.0 and check out the discussion page on the article for more.

From If to When to How

Eric Miller (now at Zepheira) calls some of the new technologies “recombinant data”. Eric spent about five years with the W3C. He is perhaps the most well-connected person in the world with accurate knowledge of who is using semantic web technologies to solve real business problems today. His observation was that three years ago, at the first Semantic Technology Conference, we were wondering if the semantic web would take off. Last year the speculation was about when semantic web technologies would start to become commonplace. This year the focus was on debates about how the semantic web should be implemented.

Venture Capitalists are Becoming Educated on the Semantic Web

One example of this is the fact that last year, most semantic-web startup companies had to carefully explain what the semantic web was to venture capital companies when trying to get their initial rounds of funding. This year, many venture capital companies not only had some level of understanding of the semantic web but were asking each of their potential startups how their technologies fit into the semantic web. Other companies that did not have a semantic-web focus were now coming to the conference to get educated.

Some of the companies that got VC last year have already been purchased and absorbed by larger firms. These companies were replaced by new venture-funded companies.

Consensus on the RDF/SPARQL Foundation

One of the first things that struck me was the consistent use of RDF and triple stores to solve many hard problems. The use of RDF and SPARQL seemed to be the primary factor in whether people thought you were really using semantic web technologies or not. If you were not using RDF, you were not really in the club, just an outsider looking in.

OWL Fragmentation

Another thing that surprised me was the discord about the use of OWL and its relative sub-functions. Central to this was the rise of use cases giving examples of simple things that OWL could not do. Much of this centered around DLP (Data Log Programming).

A good example of a non-OWL solution was the use of SKOS to store things typically stored in a metadata registry. SKOS is a great example of a simple standard, built on top of RDF that attempts to solve common problems without getting overly complex. See SKOS in Wikipedia.

We are also starting to see that rules must be exchanged between systems in semantically precise ways, hence the need for a Rule Interchange Format (RIF). It was nice to see vendors like Fair Isaac supporting complex business rules running INSIDE the web browser using XForms. They rock!

REST is Deep

Perhaps my favorite presentation was given by David Wood and Brian Sletten from Zepheira. In this presentation, David and Brian gave a demo of the NetKernel system. They demonstrated how NetKernel embraces REST at a much deeper level than I had previously anticipated. Now they were not yet generating XForms from an XML Schema but it showed a great example of convergent evolution of my ideas and theirs.

Case Studies

This year also started to show examples of new startup companies actually using semantic web technologies to differentiate themselves in the marketplace. However, because they are all trying to differentiate themselves, many of the actual technologies they were using were not disclosed.

RDF Taggers, Harvesters, Linkers and Analyzers

The conference seemed to have three sets of problems that everyone agreed on. The first was how to harvest RDF from a web page or any other resource. Most of these presentations related to getting RDF out of unstructured and structured data. There was a lot of discussion of microformats (pros and cons).

iReader Rocks

The coolest demo I saw was the iReader demo from http://www.syntactica.com. This is an awesome Firefox (and IE) extension that does concept mapping from unstructured text. The people behind this have been doing research on linguistics for about 40 years and have only recently got a round of venture capital to start to publicize this tool. But they are not yet converting to RDF for storage in other systems.

URL Design Patterns

One common theme that came up was the need for best practices for good URL design. Everyone said the same few things: the design of URLs is very important; most people screw it up the first time and then have to redo their designs; and there are not a lot of good documents out there, and the ones that are available are at least five years old. The people from the W3C did take notes on this.

Reception of My Paper

My presentation was titled “The Semantics of Declarative Systems”. This talk covered how, by using a set of small languages with precise semantics, you can build entire applications that allow non-programmers to draw pictures of their business requirements and generate working apps. The purpose of this talk was really to test the metaphors I could use to explain these concepts to non-computer scientists. I got some positive feedback and was happy to see convergent evolution of our designs with those of other organizations.

The URL to the paper is here:

http://www.danmccreary.com/presentations/sem-web-07

The Semantics of Declarative Systems

I put the presentation that I gave at the 2007 Semantic Technology Conference on my web site. The link is here:

http://www.danmccreary.com/presentations/sem-web-07

Please give me feedback on both the PowerPoint presentation (note the builds and notes pages) as well as the paper.

I am specifically interested in whether the metaphors worked for a non-technical audience.

- Dan