Friday, 27 April 2007

Wednesday, 25 April 2007


Describing the behaviour of complex interacting systems is difficult.

Here is a clip on YouTube of Rooney's second goal against Milan on Tuesday night.

Here is a diagram of that move as shown in the Independent the following day.

Is this more complex to describe than a web application, or less?

Features of a football move
  • Concurrent movement of 22 actors (players) to be tracked
  • Only two kinds of actor - home, away (or 22 if positions are needed)
  • Only a few actors at any time are significant to the outcome
  • Only a single object, the ball, is interchanged
Features of a web application interchange
  • Several different kinds of actor, all with very different behaviour
  • Each interchange involves different objects
  • Objects are created dynamically
  • Receipt of an object (such as a script or page) alters the actor's behaviour greatly
  • Actors have fixed locations but you have to find them
Sequence diagrams are a useful technique for explaining moves in a web application. They can be drawn using a number of CASE tools, such as any which supports UML, but the tight binding to the rest of an object model can create difficulties. Diagrams are also tedious to maintain, and there is a case for generating diagrams from a textual description.

As a small demo, I have created an application in which an XML description of a sequence diagram is transformed into HTML using either XQuery or XSLT 2.0.

(Saxon not yet activated on this server)

Of course this application will require a little more work to make it more configurable.
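
As a rough illustration of generating a diagram from a textual description, here is a minimal sketch in PHP rather than the XQuery or XSLT 2.0 used in the demo. The XML vocabulary (sequence, actor, message with from and to attributes) and the function name sequenceToHtml are invented for this example; they are not the format used by the demo application. The output is a plain HTML table with one column per actor and one row per message.

```php
<?php
// Hypothetical XML description of a sequence diagram: a list of actors
// followed by the messages passed between them, in order.
$description = <<<'XML'
<sequence>
  <actor id="browser">Browser</actor>
  <actor id="server">Web server</actor>
  <actor id="db">Database</actor>
  <message from="browser" to="server">GET /places</message>
  <message from="server" to="db">SELECT query</message>
  <message from="db" to="server">result rows</message>
  <message from="server" to="browser">HTML page</message>
</sequence>
XML;

// Render the description as an HTML table. The message text goes in the
// sender's column and the word "receives" in the receiver's column.
function sequenceToHtml(string $xml): string
{
    $doc = simplexml_load_string($xml);
    $ids = [];
    $html = "<table>\n<tr>";
    foreach ($doc->actor as $actor) {
        $ids[] = (string) $actor['id'];
        $html .= "<th>" . htmlspecialchars((string) $actor) . "</th>";
    }
    $html .= "</tr>\n";
    foreach ($doc->message as $message) {
        $html .= "<tr>";
        foreach ($ids as $id) {
            if ($id === (string) $message['from']) {
                $cell = htmlspecialchars((string) $message);
            } elseif ($id === (string) $message['to']) {
                $cell = "receives";
            } else {
                $cell = "";
            }
            $html .= "<td>$cell</td>";
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

print sequenceToHtml($description);
```

Because the diagram is just data, adding a message or an actor means editing the XML only; the rendering code stays the same.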

Thursday, 19 April 2007

Week 22 - Metadata and media types

This week's lecture will look at the idea of metadata more generally, and in particular at how metadata supports applications such as browsers.

Powerpoint Slides

Links for this topic:

Thursday, 22 March 2007

Timeline workshop

SIMILE timeline is a JavaScript module which provides an API with which a programmer can create a display of a set of events defined in an XML file.


  1. Work through the tutorial, creating a simple timeline as far as the section on Differentiating the Two Bands
  2. Create a new XML file of events in the required format, representing the University calendar or some other set of events

Wednesday, 21 March 2007

Week 21 - Revision

There will be a lecture this Friday to help you prepare your revision over Easter. We will look at the exam structure and last year's paper. I also want to gather your thoughts on topics which would benefit from additional revision material to present after Easter, although we still have material to cover after Easter.

The workshop will be on the representation of time using the tutorial on the SIMILE timeline project. Think of this as a GoogleEarth of time.

Tuesday, 20 March 2007

Multiple nodes in PHP

Several of you have asked about looping round elements in XML. This is obviously desirable because, although only six places are asked for in the data, the code should not have to change when a new place is added.

Here's how to do it - [Caution - this code is unchecked]

// assume places.xml contains multiple places, each with a name and description

$placesxml = simplexml_load_file("places.xml");

$places = $placesxml->xpath("//Place");

// $places is now an array of XML elements, each a Place element

foreach ($places as $place) {

    // $place now points to each Place in the XML document in turn,
    // so we can use object references to access the elements within a Place

    print $place->name . " " . $place->description;

    // or even an XPath expression as well

    // and of course, if there are repeated elements in Place, such as Link, with elements
    // url and text, we can use an inner loop to work with these

    foreach ($place->Link as $link) {
        print "<a href='{$link->url}'>{$link->text}</a>";
    }
}

Wednesday, 14 March 2007

Week 20 - Semantic Web - 16th of March

We are going to continue to look at the Semantic Web. This week we are going to look at this excellent introduction to the Semantic Web, Schema Languages and Ontology Languages. Then, for self-study, read the section on RDF Schema in the RDF primer that I gave out last week.

Further reading - see the links for last week.

Sunday, 4 March 2007

Week 19 - RDF - 9th of March

In the next two weeks we are going to be looking at RDF and RDF Schema languages. We will be reading the RDF primer so please take a look at this document.

See also a short set of slides and the handout for the session.

Further reading:

Wednesday, 28 February 2007

Week 18 - XSLT

XSLT is a rather different language from XQuery for transforming XML documents, but it shares much of the same functionality. It is more widely used than XQuery, partly because a number of XSLT processors are readily available for server-side use, and partly because browsers now include an XSLT processor so that the transformation can be done in the client.

I also want to show how XML schemas can be used - for validation of an XML document, and in InfoPath to create a data entry form.


Based on the sample data, stylesheet and CSS in this directory

1. Copy these files to a directory of your own. There is a zip of the files you need.

2. Test distillery-2.xml to make sure it is working as it does in my directory

3. Make a simple modification to the CSS to change the output

4. Add another simple template to the XSLT to display another item of information in the file.

5. Modify the stylesheet for distilleries to display

6. Use InfoPath to create a simple form for either the whisky data or for your own data.

Thursday, 22 February 2007

Week 17 - XQuery and XML database

This week we look at an alternative technology for working with XML. This uses a language called XQuery, which is an extension of XPath. There are many XQuery implementations and we will be using one which is part of an open-source project providing a native XML database.

Monday, 19 February 2007

Periodic table of Visualization Methods has published this great Periodic Table of methods of visualisation. This displays around 100 diagram types, with examples and a multi-faceted classification by:
  • simple to complex
  • data/information/concept/strategy/metaphor/compound
  • process/structure
  • detail/overview
  • divergence/convergence
The web page uses a JavaScript library to display an example of a diagram type when you mouse-over its box. A neat trick but perhaps not very accessible, so I took the liberty of massaging this table to create a full listing of all the diagram types in alphabetical order. This format is more convenient for my purposes when teaching, and is a nice example of XML-scraping using XQuery.

These listings are made by:
  1. taking the HTML source of the Periodic table
  2. loading it into the eXist database. The source is accepted by eXist even though it is not well-formed XML - missing quotes, bare <>
  3. writing a query in XQuery to generate the page:
    1. Find the html document with 'Periodic' in the title
    2. Find all the A tags,
    3. Get the onmouseover attribute
    4. use some string functions to get the name and the source of the image from this string
    5. sort by name
    6. generate a div per tag
Here is the basic XQuery script for the plain listing:

List of methods

for $item in data(/HTML[contains(.//TITLE,'Periodic')]//A/@onmouseover)
let $name := lower-case(substring-before(substring-after($item,"window.status='"),"';"))
let $pix := substring-before(substring-after($item,'src="'),'">')
where string-length($pix) > 0
order by $name
return <div><a href="{$pix}">{$name}</a></div>

In fact, instead of running a query against the raw HTML, I wrote a slightly different query to generate a simple XML file in which the basic data is stored in alphabetical order. Using an intermediate file also allows me to correct a couple of typos in the method names, and of course it is faster to generate the page. In addition, I've added the facility for a user to group methods and tag the group. Some links to Google images and Wikipedia have been added too. There's a lot more that could be done with this.
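
For comparison, the scraping steps listed above (find the A tags, take the onmouseover attribute, cut out the name and image source with string functions, sort by name, emit a div per tag) can be sketched in PHP using the DOM extension. This is only a sketch under assumptions: the sample HTML and the exact layout of the onmouseover string are invented, and the helper names between and listMethods are hypothetical. It also skips the step of selecting the document by its title, since it works on a single page.

```php
<?php
// Invented sample page; the real page's onmouseover strings are laid out differently.
$html = <<<'HTML'
<html><head><title>A Periodic Table (sample)</title></head><body>
<a href="#" onmouseover="window.status='Pie Chart'; preview.src='pie.gif';">Pi</a>
<a href="#" onmouseover="window.status='Bar Chart'; preview.src='bar.gif';">B</a>
<a href="#">no handler</a>
</body></html>
HTML;

// Equivalent of substring-before(substring-after($s, $start), $end):
// returns '' when either delimiter is missing.
function between(string $s, string $start, string $end): string
{
    $from = strpos($s, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start);
    $to = strpos($s, $end, $from);
    return $to === false ? '' : substr($s, $from, $to - $from);
}

// Returns an array of image source keyed by lower-cased method name, sorted by name.
function listMethods(string $html): array
{
    libxml_use_internal_errors(true);   // tolerate HTML that is not well formed
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $methods = [];
    foreach ($xpath->query('//a/@onmouseover') as $attr) {
        $name = strtolower(between($attr->value, "window.status='", "';"));
        $pix = between($attr->value, "src='", "';");
        if ($pix !== '') {
            $methods[$name] = $pix;
        }
    }
    ksort($methods);
    return $methods;
}

foreach (listMethods($html) as $name => $pix) {
    print '<div><a href="' . htmlspecialchars($pix) . '">' . htmlspecialchars($name) . "</a></div>\n";
}
```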

Now what would be nice would be to get the raw data including the class names as XML so it could be re-organised and extended, without having to descend to scraping.

Sunday, 18 February 2007

Hotlinks - week 16

Friday, 16 February 2007

Coursework tips


Here is the full set of icons which Google supplies.

See Lecture 16 for an example of its use


Google Earth will display the location of a point in either decimal degrees or degrees minutes seconds - you can select which in the options.

With GoogleMaps, the lat and long appear in the URL of a place - you may have to zoom in and out to get it to appear in this format.

If your data has locations in degrees, minutes and seconds, you can convert to decimal degrees using the formula

decimal-degrees := degrees + minutes / 60 + seconds /3600

You will have to take account of the direction too. N and E are positive, S and W negative.
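
The formula and the sign rule can be combined in a small PHP function. This is a minimal sketch; the function name dmsToDecimal and the use of a single-letter direction argument are my own choices for illustration.

```php
<?php
// Convert degrees, minutes and seconds plus a compass direction
// (N, S, E or W) into signed decimal degrees.
function dmsToDecimal(float $degrees, float $minutes, float $seconds, string $direction): float
{
    $dir = strtoupper($direction);
    if (!in_array($dir, ['N', 'S', 'E', 'W'], true)) {
        throw new InvalidArgumentException("Direction must be N, S, E or W");
    }
    $decimal = $degrees + $minutes / 60 + $seconds / 3600;
    // N and E are positive, S and W are negative
    return ($dir === 'S' || $dir === 'W') ? -$decimal : $decimal;
}

printf("%.4f\n", dmsToDecimal(51, 30, 0, 'N'));   // prints 51.5000
printf("%.4f\n", dmsToDecimal(2, 35, 24, 'W'));   // prints -2.5900
```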

Wednesday, 14 February 2007

Week 16 - designing an XML vocabulary, XML Schema

This week we turn our attention to the design of an XML document or documents. We look at designing the schema for a single document using the QSEE CASE tool to generate an XML schema - the top-down route. We also look at a bottom-up approach using Trang.

The workshop looks at creating a simple XML document to describe Whisky Distilleries.

Saturday, 10 February 2007

Pipes and Filters Architecture

Yahoo have recently launched Pipes, a visual programming environment for creating a mashup RSS feed from user inputs and available RSS sources. XML languages for defining pipelines are emerging.

Here are some bloggers on the subject:
Issues with these languages include
  • whether the pipeline itself is expressed in XML (and thus processable with XML tools)
  • whether non-XML data streams are allowed. For example where an intermediate file is non-XML (e.g. Graphviz dot) or the output is non XML (a GIF image)
The concept of pipes, in which the output of one process is connected to the input of another, originates in the Unix operating system - Unix Pipe
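
To make the idea concrete, here is a minimal sketch of a pipeline in PHP in which each filter is a function and the output of one becomes the input of the next. The helper name pipeline, the three filters and the sample feed are all invented for illustration; none of this is how Yahoo Pipes or any XML pipeline language is implemented.

```php
<?php
// Build a pipeline from any number of filters. Calling the result
// passes the input through each filter in turn.
function pipeline(callable ...$filters): callable
{
    return function ($input) use ($filters) {
        return array_reduce(
            $filters,
            function ($carry, $filter) {
                return $filter($carry);
            },
            $input
        );
    };
}

// Filter 1: RSS text in, array of item titles out.
$extractTitles = function (string $rss): array {
    $titles = [];
    foreach (simplexml_load_string($rss)->channel->item as $item) {
        $titles[] = (string) $item->title;
    }
    return $titles;
};

// Filter 2: array in, sorted array out.
$sortTitles = function (array $titles): array {
    sort($titles);
    return $titles;
};

// Filter 3: array in, HTML list out (a non-XML-aware stage would fit here too).
$toHtmlList = function (array $titles): string {
    $html = "<ul>";
    foreach ($titles as $title) {
        $html .= "<li>" . htmlspecialchars($title) . "</li>";
    }
    return $html . "</ul>";
};

$sampleFeed = '<rss version="2.0"><channel><title>Sample feed</title>'
    . '<item><title>Wind</title></item><item><title>Rain</title></item>'
    . '</channel></rss>';

$feedToList = pipeline($extractTitles, $sortTitles, $toHtmlList);
print $feedToList($sampleFeed) . "\n";   // prints <ul><li>Rain</li><li>Wind</li></ul>
```

Each filter knows nothing about its neighbours, so stages can be reordered or replaced as long as the output type of one matches the input type of the next.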

Ant is widely used in the Java community as a build tool, but can perform XML pipelining.

To use a pipe architecture, we need component filters to carry out standard transformations.
  • Dapper is a tool for scraping HTML pages to create an XML or RSS feed. The neat thing about this tool is that you can give it a number of similar pages and Dapper will try to infer which data items differ from page to page, and how to recognise each item. You then name the items you want to scrape and can form these into an HTML, XML or RSS feed.
  • RSS or Atom to PDF e.g. BBC Bristol Weather

Thursday, 8 February 2007

Coursework 2

Here is the specification for the second Coursework. This is an individual assignment in which you will build a simple mashup based on GoogleEarth. This brings together the results of workshops in which PHP and SimpleXML are used to transform RSS and generate kml, and in which a basic XML schema and data are developed.

Specification HTML Word

Wednesday, 7 February 2007

Lecture and workshop week 15

In this workshop you will be preparing to create a dynamic overlay for GoogleEarth.

The key learning points are

  • GoogleEarth is extended with user defined overlays, either static or dynamic
  • the XML vocabulary is called kml
  • a minimal valid kml file needs very few elements
  • the file must have a Mime type of application/
  • there are geocoding services which will translate a place name to its latitude and longitude
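
To show how few elements a minimal file needs, here is a sketch in PHP that builds a kml document containing a single placemark. The function name placemarkKml and the sample place are my own. The namespace used is the current OGC one for KML 2.2, which is an assumption on my part; the version of GoogleEarth used in the workshop may expect a different namespace. Note that kml coordinates are written longitude first, then latitude.

```php
<?php
// Assumption: the OGC KML 2.2 namespace; older files use a Google-specific one.
const KML_NS = 'http://www.opengis.net/kml/2.2';

// Build a kml document with one Placemark holding a name and a Point.
function placemarkKml(string $name, float $latitude, float $longitude): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $kml = $doc->appendChild($doc->createElementNS(KML_NS, 'kml'));
    $placemark = $kml->appendChild($doc->createElementNS(KML_NS, 'Placemark'));

    $nameElement = $placemark->appendChild($doc->createElementNS(KML_NS, 'name'));
    $nameElement->appendChild($doc->createTextNode($name));

    $point = $placemark->appendChild($doc->createElementNS(KML_NS, 'Point'));
    $coordinates = $point->appendChild($doc->createElementNS(KML_NS, 'coordinates'));
    // kml order is longitude,latitude
    $coordinates->appendChild($doc->createTextNode("$longitude,$latitude"));

    return $doc->saveXML();
}

print placemarkKml('Bristol', 51.45, -2.59);
```

Building the document with DOM rather than string concatenation means that characters such as & in a place name are escaped automatically.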

Friday, 2 February 2007

Workshop 2 - SimpleXML in PHP

Using the xpath function in SimpleXML in PHP is a bit tricky, so here is how to do the decoding:

Create an XML file like this, called bbcCodes.xml

and these PHP statements do the lookup:

$name = $_REQUEST["name"];

$places = simplexml_load_file("bbcCodes.xml");

$codes = $places->xpath("//Place[name='$name']/code");

print $codes[0];

Here it is running -

This PHP script uses the xpath function, which returns an array of SimpleXMLElements (since a match will in general produce a sequence of elements), so you need to pick out the first one (assuming there is only one match).
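
Here is a self-contained version of the lookup that can be run without a separate file. It is a sketch under assumptions: the structure of the XML (Place elements with name and code children) is inferred from the XPath expression above, while the root element name Places, the code values and the function name lookupCode are made up. It also handles two cases the short version ignores: a name with no match, and a name containing a single quote, which would break the XPath string.

```php
<?php
// Hypothetical contents for bbcCodes.xml; the codes are placeholders, not real BBC ids.
$xml = <<<'XML'
<Places>
  <Place><name>Bristol</name><code>0001</code></Place>
  <Place><name>Bath</name><code>0002</code></Place>
</Places>
XML;

// Return the code for a place name, or null if there is no match.
function lookupCode(SimpleXMLElement $places, string $name): ?string
{
    // A single quote in the name would end the XPath string literal early,
    // so such names are rejected rather than spliced into the expression.
    if (strpos($name, "'") !== false) {
        return null;
    }
    $codes = $places->xpath("//Place[name='$name']/code");
    // xpath() returns an array of matches; take the first if there is one
    return $codes ? (string) $codes[0] : null;
}

$places = simplexml_load_string($xml);
print lookupCode($places, 'Bath') . "\n";                  // prints 0002
print var_export(lookupCode($places, 'Leeds'), true) . "\n"; // prints NULL
```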

Lecture week 14 - XML and XPath

In this lecture I will discuss character encoding, an issue which arose from the workshop last week. It turns out that the essential problem here is the same as the problem which namespaces try to solve - how to mix data from multiple sources (here in multiple languages).

Then we do a bit of revision on XML structures and well-formedness, introducing the XML diagrammer in QSEE.

Then I look at the basics of XPath, a language for selecting parts of an XML document.

This leads into the continuation of last week's worksheet, extending the PHP script with the means to enter a place name and get the formatted forecast for that area.

Thursday, 1 February 2007

Workshop 2 - continuing with the weather feed.

Last week you wrote a PHP script using SimpleXML to fetch an RSS feed from the BBC and formatted a page to display the forecast which was embedded in the RSS.
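
As a reminder of the shape of that script, here is a minimal sketch. It is not the worksheet solution: it parses an invented sample feed held in a string instead of fetching the BBC feed, and the function name forecastToHtml is hypothetical. The element names (channel, item, title, description) are the standard RSS 2.0 ones.

```php
<?php
// Invented sample feed; a real script would load the feed from its URL
// with simplexml_load_file() instead.
$rss = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Sample forecast feed</title>
    <item>
      <title>Saturday: sunny intervals</title>
      <description>Max Temp: 15C, Min Temp: 7C</description>
    </item>
    <item>
      <title>Sunday: light rain</title>
      <description>Max Temp: 12C, Min Temp: 6C</description>
    </item>
  </channel>
</rss>
XML;

// Format the channel title as a heading and each item as a list entry.
function forecastToHtml(SimpleXMLElement $feed): string
{
    $html = "<h1>" . htmlspecialchars((string) $feed->channel->title) . "</h1>\n<ul>\n";
    foreach ($feed->channel->item as $item) {
        $html .= "<li><b>" . htmlspecialchars((string) $item->title) . "</b>: "
            . htmlspecialchars((string) $item->description) . "</li>\n";
    }
    return $html . "</ul>\n";
}

print forecastToHtml(simplexml_load_string($rss));
```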

You noted that the three feeds (The Weather Channel, BBC and Yahoo) handle detailed weather data very differently, which illustrates the point that merely using XML doesn't solve the problems of communicating complex data. We also encountered problems with namespaces and attributes with the Yahoo feed. However, the worksheet is about the BBC feed, so we will avoid this problem for the moment.

The last part of the worksheet asks you to parameterise the script so it can be used for different locations, identified by name. This is a problem because the feeds are identified by an id internal to the BBC.

To solve this problem you can add your own data file which contains pairs of place names and the corresponding BBC codes. This data could be held in any of several forms, such as a simple text file or a MySQL table, but for this part of the course you will create a small XML file to hold these pairs and then use a bit of XPath to find the matching record.

We will cover the basics of XPath in the lecture and how it is used in SimpleXML.

Next week we will be looking at creating more complex XML - kml files to create overlays for Google Earth. In preparation, please take a look at the introduction to GoogleEarth in this blog.

Thursday, 25 January 2007

Term 2 Schedule

Week  Date      Topics                                                               Lecturer
13    26 Jan    Recap on XML, trees, Simple XML, intro to workshop, RSS, namespaces  CW
14    2 Feb     XPath, XML structures
15    9 Feb     XML and Google Earth                                                 CW
16    16 Feb    XML Schemas, schema creation and induction                           CW
17    23 Feb    XQuery and XML databases
18    2 March   XSLT, Schema driven input
19    9 March   Triples, RDF                                                         MB
20    16 March  Ontologies                                                           MB
21    23 March  Preparation for Revision                                             CW/MB
25    20 April  Multimodal - Voice + XML, Visualisation
26    27 April  XML in business                                                      CW
27    4 May     Revision

Coursework 1 marked

Coursework 1 is ready for collection. At the back is a feedback sheet showing the breakdown of marks by section. Section 2 has been broken into three parts: for the report, for the site and its functionality, and for the way in which it was implemented in PHP, CSS and HTML. There are also comments on the coursework itself.

Marks range from 55 to 75 with an average of 65.

Generic feedback is here and will be handed out in the lecture.

Tuesday, 23 January 2007

Workshop 1 Term 2 - RSS and PHP

A voice message (2 min 16 secs)

In this workshop we will continue the work with PHP and the SimpleXML class by using this approach to transform data from an RSS feed (a weather feed from the BBC).

You will also compare three sources of data - the Weather Channel, Yahoo and the BBC - to identify differences in both the structure and content of these data sources, and explore the reasons for these differences.

You will also be introduced to the notion of namespaces and the basics of location data, in preparation for work with Google Earth in the next workshop.

Monday, 22 January 2007

Lecture Week 14

In the lecture this Friday we will cover the following topics:
  • A recap of trees, XML and the Simple XML interface in PHP
  • an overview of the schedule for this term
  • outline of the coursework for this term
  • introduction to the workshop on RSS
Attendance was very poor for the last three lectures of last term. I appreciate that you all put a lot of work into the assignment. Those who missed will, we assume, have looked at (or even listened to) the missed lectures and the workshop sessions which are all available from this blog.

Thursday, 18 January 2007

Google Earth

A brief message from the module leader:

This term, we will be studying a number of XML 'vocabularies' or languages. One which has received a great deal of attention is kml - Keyhole Markup Language. Keyhole Corp was acquired by Google in 2004 and their software is the basis of GoogleEarth (GE). kml is the XML language which defines user additions called 'overlays' to the base digital imagery. A kml file is created when you create placemarks and other features in GE and save them as a file. kml is the plain text format, and kmz is a zip compressed format. These files can then be shared by providing a link on a web site, or by adding them to a GE community site. Moreover, kml files can now be accessed by GoogleMaps.

Where location data for a subject of interest is available from another source, kml can be generated dynamically using a server-side script such as PHP or XQuery. This is the aspect which we will be exploring in tutorials.


Local examples