DSA2006: 2007

Friday, 27 April 2007

Last year's exam paper

Last year's Summer exam paper

Worked and commented answers to the Multiple Choice Questions

Wednesday, 25 April 2007

Describing the behaviour of complex interacting systems is difficult.

Here is a clip of Rooney's second goal against Milan on Tuesday night on YouTube.

Here is a diagram of that move as shown in the Independent the following day.

Is this more complex to describe than a web application, or less?

Features of a football move

Concurrent movement of 22 actors (players) to be tracked
Only two kinds of actor - home, away (or 22 if positions are needed)
Only a few actors at any time are significant to the outcome
Only a single object, the ball, is interchanged

Features of a web application interchange

Several different kinds of actor, all with very different behaviour
Each interchange involves different objects
Objects are created dynamically
Receipt of an object (such as a script or page) alters the actor's behaviour greatly
Actors have fixed locations but you have to find them

Sequence diagrams are a useful technique for explaining moves in a web application. They can be drawn using a number of CASE tools, such as any which supports UML, but the tight binding to the rest of an object model can create difficulties. Diagrams are also tedious to maintain, and there is a case for generating diagrams from a textual description.

As a small demo, I have created an application using an XML description of a sequence diagram transformed into HTML using either XQuery or XSLT 2.0.

(Saxon not yet activated on this server)

the xml description of a simple interaction
the XQuery script - run it
the XSLT 2.0 stylesheet

Of course this application will require a little more work to make it more configurable.
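As a rough illustration of the idea of generating a diagram from a textual description, here is a minimal sketch in Python. The XML vocabulary (sequence, actor, message and their attributes) is invented for this example and is not the format used by the demo above; the output is a plain text listing rather than an HTML diagram.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: these element and attribute names are
# assumptions made for illustration, not the demo's actual format.
SEQUENCE_XML = """
<sequence>
  <actor id="browser" name="Browser"/>
  <actor id="server" name="Web server"/>
  <message from="browser" to="server" label="GET /index.php"/>
  <message from="server" to="browser" label="200 OK (HTML page)"/>
</sequence>
"""


def render_sequence(xml_text):
    """Return one text line per message, in document order."""
    root = ET.fromstring(xml_text)
    # Map actor ids to display names so messages can refer to actors by id.
    names = {actor.get("id"): actor.get("name") for actor in root.findall("actor")}
    lines = []
    for step, msg in enumerate(root.findall("message"), start=1):
        sender = names[msg.get("from")]
        receiver = names[msg.get("to")]
        lines.append(f"{step}. {sender} -> {receiver}: {msg.get('label')}")
    return lines


if __name__ == "__main__":
    print("\n".join(render_sequence(SEQUENCE_XML)))
```

Because the description is just data, adding a step to the interaction means adding one message element; nothing has to be redrawn by hand.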

Thursday, 19 April 2007

Week 22- Metadata and media types

This week's lecture will look at the idea of metadata more generally, and in particular at how metadata supports applications such as browsers.
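One piece of metadata a browser relies on is the media type sent with each response. A server typically derives it from the filename extension. The sketch below, in Python, shows that idea with a small hand-written table; the function name and the table contents are illustrative assumptions, and a real server reads a much longer list from its configuration.

```python
import os

# A small, hand-written extension table for illustration only.
MEDIA_TYPES = {
    ".html": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".xml": "application/xml",
    ".png": "image/png",
    ".kml": "application/vnd.google-earth.kml+xml",
}


def content_type_for(filename, default="application/octet-stream"):
    """Pick a media type from the filename extension (case-insensitive)."""
    extension = os.path.splitext(filename)[1].lower()
    # Unknown extensions fall back to a generic binary type.
    return MEDIA_TYPES.get(extension, default)


if __name__ == "__main__":
    for name in ("index.html", "page.XHTML", "notes.xyz"):
        print(name, "->", content_type_for(name))
```

The browser then chooses how to handle the content from this declared type, which is why a wrong or missing mapping on the server can cause a page to be downloaded rather than displayed.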

Powerpoint Slides

Links for this topic:

Metadata (poor)
MIME type

IANA international registry MIME media types
MIME type
XHTML

File Systems

Filename Extension

NTFS

File Managers

Windows Explorer

Browsers

FireFox

How IE sniffs the content-type

Handling Mime Types in IE

Web Server

Apache

Adding file extensions

XHTML file type

MIME types

Thursday, 22 March 2007

Timeline workshop

SIMILE timeline is a JavaScript module which provides an API with which a programmer can create a display of a set of events defined in an XML file.

Work

Work through the tutorial, creating a simple timeline as far as the section on Differentiating the Two Bands
Create a new XML file of events in the required format which represents the University calendar or some other set of events

Wednesday, 21 March 2007

Week 21 - Revision

There will be a lecture this Friday to help you prepare your revision over Easter. We will look at the exam structure and last year's paper. I also want to gather your thoughts on topics which would benefit from additional revision material to present after Easter, although we still have material to cover after Easter.

The workshop will be on the representation of time using the tutorial on the SIMILE timeline project. Think of this as a GoogleEarth of time.

Tuesday, 20 March 2007

Multiple nodes in PHP

Several of you have asked about looping round elements in XML. This is obviously desirable because, although only six places are asked for in the data, the code should not have to change when a new place is added.

Here's how to do it - [Caution - this code is unchecked]



// assume places.xml contains multiple places, each with a name and description

$placesxml = simplexml_load_file("places.xml");

$places = $placesxml->xpath("//Place");

// $places is now an array of XML elements, each a Place element

foreach ($places as $place) {

    // $place now points to each Place in the XML document in turn,
    // so we can use object references to access the elements within a Place

    // echo accepts several arguments; print takes only one
    echo $place->name, $place->description;

    // or even an XPath expression as well

    // and of course, if there are repeated elements in Place, such as Link, with elements
    // url and text, we can use an inner loop to work with these

    foreach ($place->Link as $link) {
        print "<a href='{$link->url}'>{$link->text}</a>";
    }
}

Wednesday, 14 March 2007

Week 20 - Semantic Web - 16th of March

We are going to continue to look at the Semantic Web. This week we are going to look at this excellent introduction to the Semantic Web, Schema Languages and Ontology Languages. Then, for self-study, read the section on RDF Schema in the RDF primer that I gave out last week.

Further reading - see the links for last week.

Sunday, 4 March 2007

Week 19 - RDF - 9th of March

In the next two weeks we are going to be looking at RDF and RDF Schema languages. We will be reading the RDF primer, so please take a look at this document.

See also a short set of slides and the handout for the session.

Further reading:

Eric Miller's tutorial on RDF at DC-2002, October 13-17, 2002, Florence, Italy.
Ivan Herman's talk at WWW2006 in Edinburgh on the Semantic Web.
Henry Story's video presentation about RDF produced by Sun Microsystems.
Peter Patel-Schneider's Google TechTalk on the Semantic Web.

Wednesday, 28 February 2007

Week 18 - XSLT

XSLT is a rather different language for transforming XML documents than XQuery, but it shares much of the same functionality. It is more widely used than XQuery, partly because there are a number of XSLT processors readily available to use server-side, and partly because browsers now include an XSLT processor so that the transformation can be made in the client.

I also want to show how XML schemas can be used - for validation of an XML document, and in InfoPath to create a data entry form.

slides

Workshop

Based on the sample data, stylesheet and CSS in this directory

http://www.cems.uwe.ac.uk/~cjwallac/apps/scotch/

1. Copy these files to a directory of your own. There is a zip of the files you need.

2. Test distillery-2.xml to make sure it is working as it does in my directory

3. Make a simple modification to the CSS to change the output

4. Add another simple template to the XSLT to display another item of information in the file.

5. Modify the stylesheet for distilleries to display

6. Use InfoPath to create a simple form for either the whisky data or for your own data.

Thursday, 22 February 2007

Week 17 - XQuery and XML database

This week we look at an alternative technology for working with XML. This uses a language called XQuery which is an extension of XPath. There are many XQuery implementations and we will be using one which is part of an open-source project providing a native XML database.

slides
Distillery example

eXist Database
XQuery on W3Schools

Monday, 19 February 2007

Periodic table of Visualization Methods

Visual-Literacy.org has published this great Periodic Table of methods of visualisation. This displays around 100 diagram types, with examples and a multi-faceted classification by:

simple to complex
data/information/concept/strategy/metaphor/compound
process/structure
detail/overview
divergence/convergence

The web page uses a JavaScript library to display an example of a diagram type when you mouse-over its box. A neat trick but perhaps not very accessible, so I took the liberty of massaging this table to create a full listing of all the diagram types in alphabetical order. This format is more convenient for my purpose when teaching, and is a nice example of XML-scraping using XQuery.

These listings are made by:

taking the HTML source of the Periodic table
loading it into the eXist database. The source is accepted by eXist even though it is not well formed XML - missing quotes, bare <>
writing a query in XQuery to generate the page.

Find the html document with 'Periodic' in the title
Find all the A tags,
Get the onmouseover attribute
use some string functions to get the name and the source of the image from this string
sort by name
generate a div per tag

Here is the basic XQuery script for the plain listing:

List of methods

for $item in data(/HTML[contains(.//TITLE,'Periodic')]//A/@onmouseover)
let $name := lower-case(substring-before(substring-after($item,"window.status='"),"';"))
let $pix := substring-before(substring-after($item,'src="'),'">')
where string-length($pix) >0
order by $name
return
<div><a href="http://www.visual-literacy.org/periodic_table/{$pix}">{$name}</a>
</div>
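To make the string handling in that query concrete, here is a small Python sketch of the same substring-after / substring-before extraction. The sample onmouseover value is made up for illustration (the real attribute on the page may differ in detail), and the two helper functions only mimic the XPath functions for a non-empty marker.

```python
def substring_after(text, marker):
    """Text following the first occurrence of marker, or '' if absent."""
    _, found, rest = text.partition(marker)
    return rest if found else ""


def substring_before(text, marker):
    """Text preceding the first occurrence of marker, or '' if absent."""
    head, found, _ = text.partition(marker)
    return head if found else ""


# Hypothetical attribute value, shaped like the one the query expects.
SAMPLE = """window.status='Mind Map';show('<img src="images/mindmap.gif">')"""


def extract(item):
    """Return (name, image path) from one onmouseover string."""
    name = substring_before(substring_after(item, "window.status='"), "';").lower()
    pix = substring_before(substring_after(item, 'src="'), '">')
    return name, pix


if __name__ == "__main__":
    print(extract(SAMPLE))
```

An attribute with no src part yields an empty image path, which is what the where clause in the query filters out.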

In fact, instead of running a query against the raw HTML, I wrote a slightly different query to generate a simple XML file in which the basic data was stored in alphabetical order. Using an intermediate file also allows me to correct a couple of typos in the method names, and of course it is faster to generate the page. In addition, I've added the facility for a user to group methods and tag the group. Some links to Google images and Wikipedia have been added too. There's a lot more that could be done with this.

Now what would be nice would be to get the raw data including the class names as XML so it could be re-organised and extended, without having to descend to scraping.

Sunday, 18 February 2007

Hotlinks - week 16

Bloglines Image Wall
Mashups tagged with science
Elliotte Rusty Harold's XML predictions for 2007

Friday, 16 February 2007

Coursework tips

Icons

Here is the full set of icons which Google supply.

See Lecture 16 for an example of its use

Locations

Google Earth will display the location of a point in either decimal degrees or degrees minutes seconds - you can select which in the options.

With GoogleMaps, the lat and long appear in the URL of a place - you may have to zoom in and out to get it to appear in this format.

If your data has locations in degrees, minutes and seconds, you can convert to decimal degrees using the formula

decimal-degrees := degrees + minutes / 60 + seconds /3600

You will have to take account of the direction too. N and E are positive, S and W negative.
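The conversion, including the sign for the direction, can be sketched in Python as follows. The function name and the example coordinates are my own choices for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, direction):
    """Convert degrees, minutes, seconds plus N/S/E/W to decimal degrees."""
    if direction not in ("N", "S", "E", "W"):
        raise ValueError("direction must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    # N and E are positive, S and W are negative.
    return -value if direction in ("S", "W") else value


if __name__ == "__main__":
    # Example values only: 51 deg 27 min 0 sec N, 2 deg 35 min 24 sec W
    print(dms_to_decimal(51, 27, 0, "N"))
    print(dms_to_decimal(2, 35, 24, "W"))
```

The first call gives approximately 51.45 and the second approximately -2.59, since 35/60 + 24/3600 = 0.59.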

Wednesday, 14 February 2007

Week 16 - designing an XML vocabulary, XML Schema

This week we turn our attention to the design of an XML document or documents. We look at designing the schema for a single document using the QSEE CASE tool to generate XML schema - the top-down route. We also look at a bottom-up approach using trang.

The workshop looks at creating a simple XML document to describe Whisky Distilleries.

Saturday, 10 February 2007

Pipes and Filters Architecture

Yahoo have recently launched pipes, a visual programming environment for creating a mashup RSS feed from user inputs and available RSS sources. XML languages for defining pipelines are emerging.

Here are some bloggers on the subject:

John Musser (ProgrammableWeb)
Tim O'Reilly.
Fred Stutzman (about the need to change web applications to support fine-grained RSS feeds.)
TechCrunch
Kurt Cagle (on XML pipelines)
Jeni Tennison's Xtech paper is an excellent overview of XML pipelines

Issues with these languages include

whether the pipeline itself is expressed in XML (and thus processable with XML tools)
whether non-XML data streams are allowed, for example where an intermediate file is non-XML (e.g. Graphviz dot) or the output is non-XML (a GIF image)

The origin of pipes as a concept, in which the output of one process is connected to the input of another, is in the Unix operating system - Unix Pipe
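The pipe idea can be sketched in a few lines of Python, using generators as the filters so that each stage consumes the output of the previous one. The filter names and the sample data are invented for this illustration; this is not how Yahoo Pipes or any XML pipeline language is implemented.

```python
from functools import partial


def read_lines(text):
    """Source: yield the lines of a block of text one at a time."""
    for line in text.splitlines():
        yield line


def grep(pattern, lines):
    """Filter: pass on only the lines containing pattern."""
    for line in lines:
        if pattern in line:
            yield line


def upper(lines):
    """Filter: convert each line to upper case."""
    for line in lines:
        yield line.upper()


def pipeline(source, *filters):
    """Connect the output of each stage to the input of the next."""
    stream = source
    for stage in filters:
        stream = stage(stream)
    return stream


if __name__ == "__main__":
    titles = "XML pipelines\nRSS mashups\nXML schemas"
    for line in pipeline(read_lines(titles), partial(grep, "XML"), upper):
        print(line)
```

Each filter knows nothing about its neighbours beyond the shape of the stream, which is what makes the components reusable in different pipelines.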

Ant is widely used in the Java community as a build tool, but can perform XML pipelining.

To use a pipe architecture, we need component filters to carry out standard transformations.

Dapper is a tool for scraping HTML pages to create an XML or RSS feed. The neat thing about this tool is that you can give it a number of similar pages and Dapper will try to infer which data items differ page to page, and how to recognise each item. You then name the items you want to scrape and can form these into an HTML, XML or RSS feed.
RSS or Atom to PDF e.g. BBC Bristol Weather

Thursday, 8 February 2007

Coursework 2

Here is the specification for the second Coursework. This is an individual assignment in which you will build a simple mashup based on GoogleEarth. This brings together the results of the workshops in which PHP and SimpleXML are used to transform RSS and generate kml, and in which a basic XML schema and data are developed.

Specification HTML Word

Wednesday, 7 February 2007

Lecture and workshop week 15

In this workshop you will be preparing to create a dynamic overlay for GoogleEarth.

The key learning points are

GoogleEarth is extended with user defined overlays, either static or dynamic
the XML vocabulary is called kml
a valid kml file needs very few elements to create a minimal file
the file must have a MIME type of application/vnd.google-earth.kml+xml
there are geocoding services which will translate a place name to its latitude and longitude
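To show how few elements a minimal file needs, here is a sketch in Python that builds a kml document containing a single placemark. The function name and the example place are assumptions, and the namespace used is the current OGC KML 2.2 one, which may differ from the version referred to in the worksheet. Note that kml coordinates are written longitude first, then latitude.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
# The media type a server should declare when serving the file.
KML_MIME_TYPE = "application/vnd.google-earth.kml+xml"


def placemark_kml(name, latitude, longitude):
    """Return a minimal kml document with one named point."""
    # Serialise the kml namespace as the default (unprefixed) namespace.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # kml order is longitude,latitude,altitude
    coordinates = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
    coordinates.text = f"{longitude},{latitude},0"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    print("Content-Type:", KML_MIME_TYPE)
    print(placemark_kml("Bristol", 51.45, -2.59))
```

Building the document with an XML library rather than by string concatenation means that characters such as & in a place name are escaped automatically.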

Resources

blog entry on Google Earth
my wiki entries on Geocoding and Location
slides
worksheet
simple PHP to kml script

Friday, 2 February 2007

Workshop 2 - SimpleXML in PHP

Using the xpath function in SimpleXML in PHP is a bit tricky, so here is how to do the decoding:

Create an XML file like this, called bbcCodes.xml

and these PHP statements do the lookup:

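// the place name is passed as a request parameter
$name = $_REQUEST["name"];

$places = simplexml_load_file("bbcCodes.xml");

// xpath returns an array of the matching code elements
// (a name containing a quote character would break this expression)
$codes = $places->xpath("//Place[name='$name']/code");

// print the first match, if there is one
if ($codes) print $codes[0];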

Here it is running -

This PHP script uses the xpath function, which returns an array of SimpleXMLElements (since a match can in general produce a sequence of elements), so you need to pick out the first one (assuming there is only one match).
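For comparison, here is the same lookup sketched in Python. The sample data is invented: the Place, name and code element names follow the XPath expression above, but the Places root element and the code values are assumptions, not real BBC identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical data in the shape the XPath expression implies.
BBC_CODES_XML = """
<Places>
  <Place><name>Bristol</name><code>0001</code></Place>
  <Place><name>Bath</name><code>0002</code></Place>
</Places>
"""


def lookup_code(xml_text, place_name):
    """Return the code of the first Place with the given name, or None."""
    root = ET.fromstring(xml_text)
    for place in root.iter("Place"):
        # Comparing in Python avoids building a path expression from user input.
        if place.findtext("name") == place_name:
            return place.findtext("code")
    return None


if __name__ == "__main__":
    print(lookup_code(BBC_CODES_XML, "Bath"))
```

Returning None for an unknown place makes the "no match" case explicit, which the array indexing in the PHP version leaves to the caller to check.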

Lecture week 14 - XML and XPath

In this lecture I will discuss character encoding, an issue which arose from the workshop last week. It turns out that the essential problem here is the same as the problem which namespaces try to solve - how to mix data from multiple sources (here in multiple languages).

Then we do a bit of revision on XML structures and well-formedness, introducing the XML diagrammer in QSEE.

Then I look at the basics of XPath, a language for selecting parts of an XML document.

This leads into the continuation of last week's worksheet, extending the PHP script with the means to enter a place name and get the formatted forecast for that area.

Thursday, 1 February 2007

Workshop 2 - continuing with the weather feed.

Last week you wrote a PHP script using SimpleXML to fetch an RSS feed from the BBC and formatted a page to display the forecast which was embedded in the RSS.

You noted that the ways in which detailed weather data was handled by the three feeds (The Weather Channel, BBC and Yahoo) were very different, which illustrates the point that merely using XML doesn't solve the problems of communicating complex data. We also encountered problems with namespaces and attributes with the Yahoo feed. However, the worksheet is about the BBC feed so we will avoid this problem for the moment.

The last part of the worksheet asks you to parameterise the script so it can be used for different locations, identified by name. This is a problem because the feeds are identified by an id internal to the BBC.

To solve this problem you can add your own data file which contains pairs of place names and the corresponding BBC codes. This data could be held in any of several forms - as a simple text file or as a MySQL table - but for this part of the course you will create a small XML file to hold these pairs and then use a bit of XPath to find the matching record.

We will cover the basics of XPath in the lecture and how it is used in SimpleXML.

Next week we will be looking at creating more complex XML - kml files to create overlays for Google Earth. In preparation, please take a look at the introduction to GoogleEarth in this blog.

Thursday, 25 January 2007

Term 2 Schedule

Week no	Date	Topics	Lecturer
13	26 Jan	Recap on XML, trees, SimpleXML, intro to workshop, RSS, namespaces	CW
14	2 Feb	XPath, XML structures	CW
15	9 Feb	XML and Google Earth	CW
16	16 Feb	XML Schemas, schema creation and induction	CW
17	23 Feb	XQuery and XML databases	CW
18	2 March	XSLT, Schema driven input	CW
19	9 March	Triples, RDF	MB
20	16 March	Ontologies	MB
21	23 March	Preparation for Revision	CW/MB
25	20 April	Multimodal - Voice + XML, Visualisation	CW
26	27 April	XML in business	CW
27	4 May	Revision

Coursework 1 marked

Coursework 1 is ready for collection. At the back is a feedback sheet showing the breakdown of marks by section. Section 2 has been broken into three parts: for the report, for the site and its functionality, and for the way in which it was implemented in PHP, CSS and HTML. There are also comments on the coursework itself.

Marks range from 55 to 75 with an average of 65.

Generic feedback is here and will be handed out in the lecture.

Tuesday, 23 January 2007

Workshop 1 Term 2 - RSS and PHP

A voice message (2 min 16 secs)

In this workshop we will continue the work with PHP and the SimpleXML class by using this approach to transform data from an RSS feed (a weather feed from the BBC).

You will also compare three sources of data - from the Weather Channel, Yahoo and the BBC - to identify differences in both structure and content of these data sources, and explore the reasons for these differences.

You will also be introduced to the notion of namespaces and the basics of location data, in preparation for work with Google Earth in the next workshop.

Monday, 22 January 2007

Lecture Week 13

In the lecture this Friday we will cover the following topics:

A recap of trees, XML and the SimpleXML interface in PHP
an overview of the schedule for this term
outline of the coursework for this term
introduction to the workshop on RSS

Attendance was very poor for the last three lectures of last term. I appreciate that you all put a lot of work into the assignment. Those who missed will, we assume, have looked at (or even listened to) the missed lectures and the workshop sessions which are all available from this blog.

Thursday, 18 January 2007

Google Earth

A brief message from the module leader:

This term, we will be studying a number of XML 'vocabularies' or languages. One which has received a great deal of attention is kml - keyhole mark-up language. Keyhole Corp was acquired by Google in 2004 and their software is the basis of GoogleEarth (GE). kml is the XML language which defines user additions called 'overlays' to the base digital imagery. A kml file is created when you create placemarks and other features in GE and save them as a file. kml is the plain text format, and kmz is a zip compressed format. These files can then be shared by providing a link on a web site, or adding to a GE community site. Moreover, kml files can now be accessed by GoogleMap.

Where location data for a subject of interest is available from another source, kml can be generated dynamically using a server-side script such as PHP or XQuery. This is the aspect which we will be exploring in tutorials.

Resources

kml tutorial and reference
GoogleEarth Community
OgleEarth blog by Stefan Geens
A GoogleTech Talk
Using GoogleEarth - blog by John Gardiner, a GE developer (and keen mountain biker)
KoKae - a GE tutorial from Richard Treves, a geographer at Southampton University
kml Schema

Local examples

A generated kml map of entertainment places in Bristol - view in Google Map