XSLT is a rather different language from XQuery for transforming XML documents, but it shares much of the same functionality. It is more widely used than XQuery, partly because a number of XSLT processors are readily available server-side, and partly because browsers now include an XSLT processor, so the transformation can be done on the client.
I also want to show how XML schemas can be used - for validation of an XML document, and in InfoPath to create a data entry form.
Workshop
Based on the sample data, stylesheet and CSS in this directory
http://www.cems.uwe.ac.uk/~cjwallac/apps/scotch/
1. Copy these files to a directory of your own. There is a zip of the files you need.
2. Test distillery-2.xml to make sure it is working as it does in my directory
3. Make a simple modification to the CSS to change the output
4. Add another simple template to the XSLT to display another item of information in the file.
5. Modify the stylesheet for distilleries to display
6. Use InfoPath to create a simple form for either the whisky data or for your own data.
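For step 4 of the workshop, a new template can be very small. The sketch below is only a guess at the data - the element name region is an assumption, so check the actual element names in distillery-2.xml and adjust the match pattern to suit:

```xml
<!-- Hypothetical template: assumes each distillery has a <region>
     child element. Replace "region" with a real element name from
     distillery-2.xml. -->
<xsl:template match="region">
  <p class="region">
    Region: <xsl:value-of select="."/>
  </p>
</xsl:template>
```

Adding a template like this to the stylesheet, plus a matching rule in the CSS for the class, covers steps 3 and 4 together.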
Wednesday, 28 February 2007
Thursday, 22 February 2007
Week 17 - XQuery and XML databases
This week we look at an alternative technology for working with XML: a language called XQuery, which is an extension of XPath. There are many XQuery implementations; we will be using one which is part of an open-source project providing a native XML database.
- slides
- Distillery example
- eXist Database
- XQuery on W3Schools
Monday, 19 February 2007
Periodic table of Visualization Methods
Visual-Literacy.org has published this great Periodic Table of methods of visualisation. This displays around 100 diagram types, with examples and a multi-faceted classification by:
- simple to complex
- data/information/concept/strategy/metaphor/compound
- process/structure
- detail/overview
- divergence/convergence
These listings are made by:
- taking the HTML source of the Periodic table
- loading it into the eXist database. The source is accepted by eXist even though it is not well-formed XML - missing quotes, bare < and >
- writing a query in XQuery to generate the page.
- Find the html document with 'Periodic' in the title
- Find all the A tags,
- Get the onmouseover attribute
- Use some string functions to get the name and the source of the image from this string
- Sort by name
- Generate a div per tag
List of methods
for $item in data(/HTML[contains(.//TITLE,'Periodic')]//A/@onmouseover)
let $name := lower-case(substring-before(substring-after($item,"window.status='"),"';"))
let $pix := substring-before(substring-after($item,'src="'),'">')
where string-length($pix) > 0
order by $name
return
<div><a href="http://www.visual-literacy.org/periodic_table/{$pix}">{$name}</a>
</div>
In fact, instead of running a query against the raw HTML, I wrote a slightly different query to generate a simple XML file in which the basic data was stored in alphabetical order. Using an intermediate file also allowed me to correct a couple of typos in the method names, and of course it is faster to generate the page. In addition, I've added the facility for a user to group methods and tag the group. Some links to Google Images and Wikipedia have been added too. There's a lot more that could be done with this.
Now what would be nice would be to get the raw data including the class names as XML so it could be re-organised and extended, without having to descend to scraping.
Sunday, 18 February 2007
Hotlinks - week 16
- Bloglines Image Wall
- Mashups tagged with science
- Elliotte Rusty Harold's XML predictions for 2007
Friday, 16 February 2007
Coursework tips
Icons
Here is the full set of icons which Google supply.
See Lecture 16 for an example of its use
Locations
Google Earth will display the location of a point in either decimal degrees or degrees minutes seconds - you can select which in the options.
With GoogleMaps, the lat and long appear in the URL of a place - you may have to zoom in and out to get it to appear in this format.
If your data has locations in degrees, minutes and seconds, you can convert to decimal degrees using the formula
decimal-degrees := degrees + minutes/60 + seconds/3600
You will have to take account of the direction too. N and E are positive, S and W negative.
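The formula and the sign rule together can be wrapped in one small function - a quick sketch in Python here, though it translates directly to PHP:

```python
def to_decimal_degrees(degrees, minutes, seconds, direction):
    """Convert degrees/minutes/seconds to decimal degrees.

    direction is one of 'N', 'S', 'E', 'W'; southern and western
    coordinates come out negative, which is what kml expects.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if direction in ('S', 'W') else value

# Bristol is roughly 51 deg 27' 0" N, 2 deg 35' 0" W
print(to_decimal_degrees(51, 27, 0, 'N'))  # roughly 51.45
print(to_decimal_degrees(2, 35, 0, 'W'))   # roughly -2.58
```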
Wednesday, 14 February 2007
Week 16 - designing an XML vocabulary, XML Schema
This week we turn our attention to the design of an XML document or documents. We look at designing the schema for a single document using the QSEE CASE tool to generate an XML schema - the top-down route. We also look at a bottom-up approach using trang.
The workshop looks at creating a simple XML document to describe Whisky Distilleries.
Saturday, 10 February 2007
Pipes and Filters Architecture
Yahoo have recently launched Pipes, a visual programming environment for creating a mashup RSS feed from user inputs and available RSS sources. XML languages for defining pipelines are also emerging.
Here are some bloggers on the subject:
- John Musser (Programmable Web)
- Tim O'Reilly.
- Fred Stutzman (about the need to change web applications to support fine-grained RSS feeds.)
- TechCrunch
- Kurt Cagle (on XML pipelines)
- Jeni Tennison's Xtech paper is an excellent overview of XML pipelines
XML pipeline approaches differ in:
- whether the pipeline itself is expressed in XML (and thus processable with XML tools)
- whether non-XML data streams are allowed, for example where an intermediate file is non-XML (e.g. Graphviz dot) or the output is non-XML (a GIF image)
Ant is widely used in the Java community as a build tool, but can perform XML pipelining.
To use a pipe architecture, we need component filters to carry out standard transformations.
- Dapper is a tool for scraping HTML pages to create an XML or RSS feed. The neat thing about this tool is that you can give it a number of similar pages and Dapper will try to infer which data items differ from page to page, and how to recognise each item. You then name the items you want to scrape and can form these into an HTML, XML or RSS feed.
- RSS or Atom to PDF e.g. BBC Bristol Weather
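The pipe-and-filter idea itself can be sketched in a few lines: a pipeline is just composition of filters, where each filter takes a document and returns a transformed document. The two filters below are trivial placeholders for illustration, not real components:

```python
from functools import reduce

def pipeline(*filters):
    """Compose filters left to right: each filter's output feeds the next."""
    return lambda doc: reduce(lambda d, f: f(d), filters, doc)

# Two toy filters over a 'document' (here just a string of markup)
strip_blank_lines = lambda doc: "\n".join(
    line for line in doc.splitlines() if line.strip())
upper_item_tags = lambda doc: doc.replace("<item>", "<ITEM>").replace("</item>", "</ITEM>")

process = pipeline(strip_blank_lines, upper_item_tags)
print(process("<item>rss entry</item>\n\n<item>another</item>"))
```

Real pipeline tools (Ant, Yahoo Pipes, the XML pipeline languages above) add buffering, XML awareness and error handling, but the composition model is the same.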
Thursday, 8 February 2007
Coursework 2
Here is the specification for the second Coursework. This is an individual assignment in which you will build a simple mashup based on GoogleEarth. This brings together the results of workshops in which PHP and SimpleXML are used to transform RSS and generate kml, and a basic XML schema and data are developed.
Specification HTML Word
Wednesday, 7 February 2007
Lecture and workshop week 15
In this workshop you will be preparing to create a dynamic overlay for GoogleEarth.
The key learning points are
- GoogleEarth is extended with user defined overlays, either static or dynamic
- the XML vocabulary is called kml
- a minimal valid kml file needs very few elements
- the file must have a Mime type of application/vnd.google-earth.kml+xml
- there are geocoding services which will translate a place name to its latitude and longitude
- blog entry on Google Earth
- my wiki entries on Geocoding and Location
- slides
- worksheet
- simple PHP to kml script
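To illustrate how little a minimal kml file needs, here is a sketch of a single-placemark file. The placemark is invented for illustration; note that coordinates are given in longitude,latitude,altitude order:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.1">
  <Placemark>
    <name>Bristol</name>
    <Point>
      <coordinates>-2.5833,51.45,0</coordinates>
    </Point>
  </Placemark>
</kml>
```

Serve this with the Mime type above and Google Earth will fly to the point.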
Friday, 2 February 2007
Workshop 2 - SimpleXML in PHP
Using the xpath function in SimpleXML in PHP is a bit tricky, so here is how to do the decoding:
Create an XML file like this, called bbcCodes.xml
and these PHP statements do the lookup:
$name = $_REQUEST["name"];
$places = simplexml_load_file("bbcCodes.xml");
$codes = $places->xpath("//Place[name='$name']/code");
print $codes[0];
Here it is running -
This PHP script uses the xpath function which returns an array of SimpleXMLElements (since this match will usually produce a sequence of elements) so you need to pick out the first one (assuming there is only one match)
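The bbcCodes.xml file itself is not reproduced above, but the XPath //Place[name='$name']/code implies a shape like the one below. The place names and codes here are invented examples, not real BBC feed ids. The same lookup is sketched with Python's standard library to show the XPath idea outside PHP:

```python
import xml.etree.ElementTree as ET

# A guessed shape for bbcCodes.xml, inferred from the XPath
# //Place[name='$name']/code -- names and codes are invented.
xml_doc = """
<Places>
  <Place><name>Bristol</name><code>10</code></Place>
  <Place><name>Bath</name><code>11</code></Place>
</Places>
"""

places = ET.fromstring(xml_doc)
name = "Bristol"
# ElementTree supports a subset of XPath; this mirrors the PHP query
codes = places.findall(f".//Place[name='{name}']/code")
print(codes[0].text)
```

As in the PHP version, the query returns a list of matches, so the first element is picked out.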
Lecture week 14 - XML and XPath
In this lecture I will discuss character encoding, an issue which arose from the workshop last week. It turns out that the essential problem here is the same as the problem which namespaces try to solve - how to mix data from multiple sources (here in multiple languages).
Then we do a bit of revision on XML structures and well-formedness, introducing the XML diagrammer in QSEE.
Then I look at the basics of XPath, a language for selecting parts of an XML document.
This leads into the continuation of last week's worksheet, extending the PHP script with the means to enter a place name and get the formatted forecast for that area.
Thursday, 1 February 2007
Workshop 2 - continuing with the weather feed.
Last week you wrote a PHP script using SimpleXML to fetch an RSS feed from the BBC and formatted a page to display the forecast which was embedded in the RSS.
You noted that the way detailed weather data is handled by the three feeds (The Weather Channel, BBC and Yahoo) differs greatly, illustrating the point that merely using XML doesn't solve the problems of communicating complex data. We also encountered problems with namespaces and attributes with the Yahoo feed. However, the worksheet is about the BBC feed, so we will avoid this problem for the moment.
The last part of the worksheet asks you to parameterise the script so it can be used for different locations, identified by name. This is a problem because the feeds are identified by an id internal to the BBC.
To solve this problem you can add your own data file which contains pairs of place names and the corresponding BBC codes. This data could be held in any of several forms - as a simple text file or as a MySQL table - but for this part of the course you will create a small XML file to hold these pairs and then use a bit of XPath to find the matching record.
We will cover the basics of XPath in the lecture and how it is used in SimpleXML.
Next week we will be looking at creating more complex XML - kml files to create overlays for Google Earth. In preparation, please take a look at the introduction to GoogleEarth in this blog.