This helps me organise information about what I do and may help others too.
Let's look at the last NCeSS Service Delivery Board minutes... Seems they're not available yet, but the preparatory material gives me a good idea of what it was all about.
Blimey, how time flies! It's not that I've been doing nothing (I've not been doing nothing), I've just been suffering from a lack of organisation since returning from a week's holiday in Ireland. Ireland was a fantastic break and I was ready for it, or so I thought, but my return to work was chaotic and I did not know what to do first. This was as much a systems failure as anything else. More and more I find that the less I implement Dublin Core Metadata Institute recommendations, the more work I have. Eeek, it could all grind to a halt!
So, I'm still trying to use this blog to drive my information management... Let me track back... What have I done since my last entry:
Introduction to e-Infrastructure: Enabling the research of the future
The workshop was good.
Mike Mineter went through a lot of important introductory stuff most of which I had seen before. I'm sure it was a great help for others there that were just getting into e-Science.
I learned some useful background on Shibboleth. Our admin guys at Leeds University seem quite advanced on the implementation of it. It should enable much greater collaboration within academia especially for courses. Penn State and Leeds have set up a federation for developing joint degree programmes. They have a token guinea pig student taking a module. It was said that Swiss and some Scandinavian universities are well advanced in this area.
The DAME guys have been successful in getting yet more funds, so they can keep going for some time. This has to be a good thing :)
There were various hints that e-Science money is running out and that traditional e-Scientists are looking for application areas outside the harder sciences. I think this is a good thing...
Sorry I felt a little uncomfortable giving my presentation. It was partly the place being a little tight - not much room to wave arms and move around the screen. I was nervous, there were a number of important people there and I didn't manage to relax during my presentation and get into the swing of things. I don't think I presented well, but the practice should help me do better next time...
It was hard to describe all of NCeSS, outline the future of e-Infrastructure for e-Social Science and go into any detail of Modelling and Simulation in 20 minutes. I referred to the notes that accompany the slides in the PowerPoint presentation I prepared. I wanted the audience to be checking out the URLs and feeding back...
I got onto dodgy ground at one stage while I was covering a grey area regarding data (there are many of them). I don't recall my exact words, but I bolted upright in bed last night... I said something along the lines of: "sometimes it is best to relax and not worry too much about usage/license agreements". I was trying to make several points, but this was a bit of a foot in mouth thing to say and I got a bit befuddled. I didn't mean it, it just came out wrong. It was not an attempt at a joke or anything, I was thinking out loud. No excuse... shoot me at dawn!
Social science metadata for discovery is usually not available, most of the data is old and non-standard and for the most part non-XML. Making data securely available on the NGS requires something like Shibboleth and encryption/decryption. For much social science data you are not to disclose some patterns you identify. The data and usage agreements are complex, so data providers often want to see your outputs before allowing publication. This is an issue when there are a number of data providers involved. I'm not planning on worrying about what I said too much. I haven't done anything really bad I hope, but I feel like what I said could be interpreted as encouraging something I shouldn't... Can I offer a similar defence to Robert:
"I know that you believe that you understood what you think I said, but I am not sure you
realise that what you heard is not what I meant..." Robert McCloskey, US State Department
Spokesman. On the back cover of Eats Shites and Leaves: Crap English and how to use it
A. Parody ISBN 1-84317-098-1
I'm not going to lose any more sleep over it, hopefully. I'll try to back this up by putting more explanation on-line if you want...
Thanks to everyone involved. I look forward to our on-going collaboration.
I realise my blog has been way too formal (boring and not really conversational). Also it has not been nearly colourful enough or contained nearly enough interesting material.
I've just checked out Jody Garnett's Blog. He is a bit of a hero who develops cool stuff and keeps us informed about all sorts of goings on in open geospatial stuff. So, Jody pointed us to this OGC Web Services Demo. I'd like to reiterate that if you have 15 minutes spare this is well worth consuming.
I think I'm going to drop standard English as it's too long winded...
Time to figure out RSS. Should probably use software once figured out how works. How to test if works...
In preparation for the next MASS meeting I read
Pontius R.G. (Jr), Huffaker D., Denman K. (2004)
Useful techniques of validation for spatially explicit land-change models.
Ecological Modelling, Volume 179, Issue 4, Pages 445-461.
Section 1.2 Paragraph 5:
Useful definition of noise and over training/calibrating a model based on a sample.
Section 1.3 Paragraph 2:
"...it is helpful to use a validation technique that:
(a) budgets the sources of error,
(b) compares the model to a Null model,
(c) compares the model to a Random model,
(d) performs the analysis at multiple scales."
and
"It is important to compare the model to both a Null model and a Random model in order
to assess the additional predictive power, if any, that the model provides. Scale is
important to consider during any comparison of maps, because results can be sensitive
to scale and certain patterns may be evident at only certain scales (Kok et al., 2001
and Quattrochi and Goodchild, 1997)."
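To fix the idea of points (b) and (c) in my own head, here is a minimal sketch of comparing a model against a Null model and a Random model. It is not the procedure from the paper: I am assuming the Null model means "predict no change from the baseline" and the Random model means "change the same number of cells as the model, at random locations", and the tiny binary maps and function names are made up for illustration.

```python
import random

def percent_correct(predicted, reference):
    """Share of cells where the predicted category equals the reference."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)

def null_model(baseline):
    """Assumed Null model: predict no change, so the baseline map persists."""
    return list(baseline)

def random_model(baseline, n_change, rng):
    """Assumed Random model: flip n_change cells at random (binary maps only)."""
    predicted = list(baseline)
    for i in rng.sample(range(len(baseline)), n_change):
        predicted[i] = 1 - predicted[i]
    return predicted

# Hypothetical flattened binary maps (0 = undeveloped, 1 = developed).
baseline  = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
reference = [0, 1, 0, 0, 1, 1, 1, 0, 0, 0]   # what actually happened
model     = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]   # what the model predicted

# Give the Random model the same quantity of change as the real model.
n_change = sum(m != b for m, b in zip(model, baseline))
rng = random.Random(42)
scores = {
    "model": percent_correct(model, reference),
    "null": percent_correct(null_model(baseline), reference),
    "random": percent_correct(random_model(baseline, n_change, rng), reference),
}
print(scores)
```

In this made-up case the model and the Null model both get 8 cells out of 10 right, which is exactly the situation the comparison is meant to expose: the model gets one change right and one wrong, so it adds nothing over assuming persistence.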
Section 2.2:
The neural network modelling of MEDALUS III and MedAction relates to this. In predicting
the quantity of land types, the models were calibrated on the baseline (existing data)
and then a prediction was made using the same cut-off and neural network parameters.
However, for the population modelling a linear interpolation of NIDI's forecasts was
used to constrain the results. A model that predicts the locations of something is
implicitly predicting the number of locations or amount of that thing.
Section 2.5:
Considering scale issues is good. The details of aggregation in the paper do not
mention the many different aggregations that can result. The different aggregations
of this type are illustrated in Figure 1 of Turner (2000). It would be less biased to
consider all possible aggregations at each level of aggregation. Some average could
then be used. However, aggregating in this way is inherently biased because squares
are not rotationally symmetric. What is less biased in principle is drawing values
into a statistic based on circular regions. Doing this brings up questions of whether
distance weighting should be applied. Usually some kind of distance weighting is
desirable, and often it is a monotonic function with further away values being
weighted less. More complex non-monotonic weighting can be applied by subtracting one
such weighting from another. It is the distance weightings that ramp up monotonically
to some maximum and then back down again monotonically that focus on a particular
scale. Such weightings are useful for studying distributions of plant species and
human settlement. When comparing whether two distribution surfaces are similar,
various Geographically Weighted Statistics (GWS) may be of interest. Turner (2006)
provides more details on raster based GWS.
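The two kinds of distance weighting described above can be sketched roughly as follows. This is only an illustration under my own assumptions: a linear decay kernel stands in for "some monotonic function", the ring kernel is built by subtracting a narrow linear kernel from a wide one, and the function names are hypothetical rather than taken from Turner (2006).

```python
import math

def linear_decay(d, bandwidth):
    """Monotonic kernel: weight 1 at the centre, falling to 0 at the bandwidth."""
    return max(0.0, 1.0 - d / bandwidth)

def ring_weight(d, inner, outer):
    """Non-monotonic kernel: a wide kernel minus a narrow one.

    The weight ramps up from 0 at the centre to a maximum at d == inner,
    then back down to 0 at d == outer, so it focuses on one scale.
    """
    return linear_decay(d, outer) - linear_decay(d, inner)

def weighted_mean(grid, row, col, weight):
    """Distance-weighted mean of raster values around cell (row, col)."""
    total = 0.0
    weight_sum = 0.0
    for r, line in enumerate(grid):
        for c, value in enumerate(line):
            w = weight(math.hypot(r - row, c - col))  # circular, not square, support
            total += w * value
            weight_sum += w
    return total / weight_sum if weight_sum else float("nan")

# Hypothetical 5 x 5 raster with a single high value off-centre.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][4] = 10.0

near_focus = weighted_mean(grid, 2, 2, lambda d: linear_decay(d, 3.0))
ring_focus = weighted_mean(grid, 2, 2, lambda d: ring_weight(d, 2.0, 4.0))
print(near_focus, ring_focus)
```

Because the weights depend only on Euclidean distance from the centre cell, the region drawn into the statistic is circular, which is the less biased option argued for above.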
Section 2.6:
"The Null Resolution is the resolution at which the accuracy of the predictive model
matches the accuracy of the Null model."
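As a toy illustration of that definition, the sketch below scans accuracies measured at coarser and coarser resolutions and reports the first one where the model is at least as accurate as the Null model. The numbers are invented, the function name is hypothetical, and it assumes the model starts out less accurate than the Null model at the finest resolution.

```python
def null_resolution(resolutions, model_accuracy, null_accuracy):
    """Return the first resolution where the model matches or beats the Null model.

    All three lists are ordered from fine to coarse resolution.
    Returns None if the model never catches up.
    """
    for resolution, model, null in zip(resolutions, model_accuracy, null_accuracy):
        if model >= null:
            return resolution
    return None

# Hypothetical accuracies at cell sizes 1, 2, 4, 8 and 16 (in units of the raw cell).
resolutions    = [1, 2, 4, 8, 16]
model_accuracy = [0.80, 0.84, 0.88, 0.93, 0.97]
null_accuracy  = [0.86, 0.87, 0.90, 0.92, 0.95]

print(null_resolution(resolutions, model_accuracy, null_accuracy))
```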
Section 4.3: The bias of masking
"Whatever the statistical criterion, it is dangerous to mask out parts of the study
area during the validation phase. Results of statistical analysis can be extremely
sensitive to any procedure that ignores parts of the study area."
This is why I decided against masking non-road areas in my PhD studies of the distribution
of Road Accidents.
References
Kok K., Farrow A., Veldkamp T.A. and Verburg P., 2001. A method and application of multi-scale validation in spatial land use models. Agr. Ecosyst. Environ. 85 1-3, pp. 223-238.
Quattrochi D.A., Goodchild M.F., (Eds.), 1997. Scale in Remote Sensing and GIS. Lewis Publishers, Boca Raton, FL.
Turner A.G.D. (2006) Raster Based Geographically Weighted Statistics for Studying the
Spatial Change of Incidence Distributions Over Time: An Application to Stats 19 Personal
Injury Road Accident Data, PhD Working Paper.
Turner A.G.D. (2000) Density Data Generation for Spatial Data Mining Applications.
Paper presented at the 5th International Conference on GeoComputation, England, September.
Browsed High Performance Computing for Statistical Inference. - Worth checking out/attending...
Drafted a proposal for a Regional Information Sharing Grid GIS Project: Yorkshire from the Air. - It is now up to Robin to give feedback...
Wrote a definition of geomorphometrics: Terrain surface geometry measures. Metrics for geomorphology. - Link more details on this research interest.
Attended the River Basin Processes and Management Research Cluster meeting where Alona Armstrong and Lee Brown outlined their research.
Had a look at the Unidata NetCDF Java Library Home Page. - Well worth looking at integrating with Grids for GeoTools.
Wrote a definition of e-Social Science: Social Science using Grid Computing. A subset of e-Science. - e-Social Science is also what I am trying to do, albeit from a computational geography angle.
Browsed around the following Wikipedia pages: Social Science, Portal.
Browsed around The Institute for Fiscal Studies Web Site and had a close look at The English Longitudinal Study of Ageing. - I was pointed to this by my colleague Justin Keen who works with me on The MoSeS Project.
Browsed around the following Wikipedia pages: Anarchy, Anarchism, Government, State, Law, Lawlessness. - There are multiple meanings of many terms, but it seems that the terms anarchy and anarchism are well defined, and it seems a misuse of the term anarchy when disorganisation, chaos or lawlessness is meant. However, it does get used in that way. This was related to a conversation with my colleagues Stuart Hodkinson and Paul Chatterton.
Skim read Armstrong M.P., Cowles M.K., Wang S. (2005) Using a Computational Grid for Geographic Information Analysis: A Reconnaissance. Pages 355-375. In The Professional Geographer Vol. 57 Issue 3 Pages 339-494. - Looks good! Read this in more detail...
Browsed contents of Accident Analysis & Prevention Volume 37, Issue 4, pages 591-806. - Much here to come back to, but didn't spot any maps!
Skim read Hewson P.J. (2005) Epidemiology of child pedestrian casualty rates: Can we assume spatial independence? Pages 651-659. In Accident Analysis & Prevention Volume 37, Issue 4, pages 591-806. - More relevant to Richard Thompson (David Clarke's PhD student).
Skim read Hewson P.J. (2005) A statistical profile of road accidents during cross-flow turns. Pages 721-730. In Accident Analysis & Prevention Volume 37, Issue 4, pages 591-806. - Read this in more detail. Reply to David Clarke's email.
Wrote a definition of humanosphere: The space-time region that humans influence. - Not sure if I've ever seen the word before, but think it a useful term!
Read Moss, Scott and Edmonds, Bruce (2005). 'Towards Good Social Science'. Journal of Artificial Societies and Social Simulation 8(4) - I agree with most of this. Another important aim for scientific simulation is for results to be easily replicable, not merely theoretically replicable. If there is some randomness implicit then this needs explicit capturing in provenance when a simulation is run so that the simulation can be run again to perform exactly the same. There are many benefits of doing this, and it is in some ways more important than making the source code of the program that performed the simulation available although I would argue that this is important too!
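A minimal sketch of what I mean by capturing randomness in provenance: the simulation below is a made-up random walk, and the provenance record layout is just my own assumption, but the principle is that the seed is stored alongside the run so the run can be repeated exactly.

```python
import json
import random

def run_simulation(seed, steps=5):
    """Toy random walk that returns its results plus a provenance record."""
    rng = random.Random(seed)  # private generator, so other code cannot disturb the stream
    position = 0
    path = []
    for _ in range(steps):
        position += rng.choice([-1, 1])
        path.append(position)
    # Everything needed to repeat the run exactly (hypothetical record layout).
    provenance = {"seed": seed, "steps": steps, "generator": "random.Random"}
    return path, provenance

first_path, record = run_simulation(seed=20051031)

# Store the provenance, read it back later, and replay the run from it.
saved = json.dumps(record)
restored = json.loads(saved)
second_path, _ = run_simulation(seed=restored["seed"], steps=restored["steps"])

print(first_path == second_path)
```

The replayed run matches the first one step for step, which is the "easily replicable" property rather than the merely theoretical one.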
Browsed JASSS Volume 8, Issue 4 October, 2005 - Lots here to come back to!
Attended the Biosystems Reading Group - Discussion based on the following papers:
- How can simulation using Agent Based Models be made more scientific?
Attended the Ecology and Global Change Research Cluster meeting. A seminar given by Richard Law on Spatial Patterns and Inferences about Dynamics in Plant Communities - Illustrated an example of torus type distance weighting being useful. Interesting notions of pair and multiple densities. Often it is interesting when things appear in twos! Illustrated the Janzen-Connell hypothesis as described in Hyatt L.A., Rosenberg M.S., Howard T.G., Bole G., Fang W., Anastasia J., Brown K., Grella R., Hinman K., Kurdziel J.P., Gurevitch J. (2003) The distance dependence prediction of the Janzen-Connell hypothesis: a meta-analysis. OIKOS 103: 590-602. Described the use of the inhomogeneous K-function (as described here and here) for work on cancer epidemiology by Diggle P.J. and colleagues. - All the talk of kernels, scales and distance was refreshing. To encourage further collaboration I emailed Richard and pointed him to work on GAM/K.
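For my own notes, a rough sketch of what I take torus type distance to mean: the rectangular study plot is treated as if its opposite edges were joined, so points near opposite edges count as neighbours. This is my reading of the idea rather than anything taken from the seminar, and the function name and plot dimensions are made up.

```python
import math

def torus_distance(p, q, width, height):
    """Shortest distance between two points on a plot whose edges wrap around.

    Assumes both points lie inside the plot, 0 <= x < width and 0 <= y < height.
    """
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, width - dx)    # going across the left/right edge may be shorter
    dy = min(dy, height - dy)   # likewise for the top/bottom edge
    return math.hypot(dx, dy)

# Two plants near opposite edges of a hypothetical 10 x 10 plot.
a = (0.5, 0.5)
b = (9.5, 0.5)
print(torus_distance(a, b, 10, 10))
```

Any distance weighting can then be fed these wrapped distances, so that points near the plot boundary get as many neighbours as points in the middle.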
Browsed from http://geobloggers.blogspot.com/ - Useful set of links.
Edited http://en.wikipedia.org/wiki/First_flush - Adding links for hydrology and runoff.
Read Reitsma F., Albrecht J. (2005) Implementing a new data model for simulating process. International Journal of Geographical Information Science Vol. 19, No. 10, November, pages 1073-1090. - Focuses on storing system state at each time step of a dynamic model. The method is prototyped with a watershed runoff simulation. Emailed the reference to The Multi Agent Systems and Simulation Research Interest Group and Brian Irvine.
Read Rushton G. (2004) Book Review of Spatial Epidemiology: Methods and Applications (2001) Edited by P. Elliot, J. Wakefield, N. Best, and D. Briggs (Oxford: Oxford University Press) ISBN 0-19-851532-4. International Journal of Geographical Information Science Vol. 18, No. 6, September, pages 627-629. - Elaborates on the need for provenance data without actually calling it that. This is a useful reference for MoSeS work.
Read Albani M., Klinkenberg B., Andison D.W., Kimmins J.P. (2004) The choice of window size in approximating topographic surfaces from Digital Elevation Models. International Journal of Geographical Information Science Vol. 18, No. 6, September, pages 577-593. DOI: 10.1080/13658810410001701987 - "Presents a general analytical method to estimate the propagation of elevation errors to the principal derived topographic variables (slope, aspect and surface curvatures) as calculated with the quadratic approximation method with variable evaluation window size of Wood (1996). It expands the work of Florinsky (1998b) to incorporate evaluation windows of sizes larger than 3x3, and considers spatially correlated elevation error." (Taken from the conclusion) Like the paper a lot! It has an excellent conclusion and is well referenced. Much of the referenced work should be looked at for GEOG5060 and Geomorphometrics work. Paper should be on the reading list for the GEOG5060 students. As should: Wood, J. D. (1996) The geomorphological characterisation of Digital Elevation Models. PhD thesis, University of Leicester.
Read Shortridge A.M. (2004) Geometric variability of raster cell class assignment. In International Journal of Geographical Information Science Vol. 18, No. 6, September, pages 539-558. DOI: 10.1080/13658810410001702012 - Reports a set of experiments concerning square celled rasterisation of vector data and variability of changing cell resolution and origin. The focus is on classified area data. I liked this paper! The rasterisations being considered were square celled, but the paper did not discuss rasters with a triangular/hexagonal cell structure. In relation to this, there was no discussion of rotational variance and the alignment of the cells on axes. It was especially pleasing to see the work of Steve Carver and Chris Brunsdon referenced.
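To convince myself how much the raster origin alone can matter, here is a small sketch. It is not one of the experiments from the paper: I am assuming a simple rule where a cell is assigned to the area if its centre falls inside it, and the rectangle, cell size and function name are invented.

```python
import math

def rasterised_area(xmin, ymin, xmax, ymax, cell_size, origin):
    """Area of the square cells whose centres fall inside an axis-aligned rectangle.

    The raster has cell edges at origin + i * cell_size on both axes.
    """
    def centres_inside(lo, hi):
        # Scan a generous range of cell indices that covers the interval.
        first = math.floor((lo - origin) / cell_size) - 1
        last = math.ceil((hi - origin) / cell_size) + 1
        return sum(
            lo <= origin + (i + 0.5) * cell_size <= hi
            for i in range(first, last + 1)
        )
    return centres_inside(xmin, xmax) * centres_inside(ymin, ymax) * cell_size ** 2

# Hypothetical area feature: a 2.4 x 2.4 square, so the true area is 5.76.
for origin in (0.0, 0.5):
    print(origin, rasterised_area(0.2, 0.2, 2.6, 2.6, 1.0, origin))
```

Shifting the origin by half a cell changes the rasterised area from 9 cells to 4 for the same feature, with the true area sitting in between.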
Browsed The CeLSIUS Website - That of the support team for academic users of the Office for National Statistics' Longitudinal Study. The LS is something we are looking to use for MoSeS.
Read Rogerson, P. A. (2001) Monitoring point patterns for the development of space-time clusters. Journal of the Royal Statistical Society (A), 164, pp. 87 - 96. - Adaptation of cumulative sum methods for use with Knox's space-time statistic and application to Burkitt's lymphoma in Uganda. Contains a useful description and equations for a local version of the Knox test for space-time interactions. From the description this method is similar to that of GAMK-T of Stan Openshaw et al., which should have been referenced. It may be worth contacting the author and using this method in your PhD.
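As a reminder of the basic idea underneath, here is a sketch of the plain (global) Knox statistic: count the pairs of cases that are close in both space and time. It is not Rogerson's local or cumulative sum version, nor GAMK-T, and the thresholds and case list are invented.

```python
import math
from itertools import combinations

def knox_statistic(events, space_threshold, time_threshold):
    """Count pairs of events that are close in both space and time.

    Each event is an (x, y, t) tuple.
    """
    close_pairs = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(events, 2):
        near_in_space = math.hypot(x1 - x2, y1 - y2) <= space_threshold
        near_in_time = abs(t1 - t2) <= time_threshold
        if near_in_space and near_in_time:
            close_pairs += 1
    return close_pairs

# Hypothetical cases: (x, y, day of onset).
cases = [(0, 0, 1), (1, 0, 2), (0, 1, 30), (10, 10, 2), (10, 11, 3)]
print(knox_statistic(cases, space_threshold=2, time_threshold=5))
```

On its own the count means little; it is usually judged against what would be expected if times and places were unrelated, for example by shuffling the onset times among the locations many times and recounting.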
Browsed Applied Geography Volume 26, Issue 1 (January 2006) and Volume 25 (2005).
Skim read Atkinson D.M., Deadman P., Dudycha D., Traynor S. (2005) Multi-criteria evaluation and least cost path analysis for an arctic all-weather road. - Looks good and has no reference to my colleague Steve Carver's work, so I pointed him to it.
Skim read Rocchini D., Di Rita A. (2005) Relief effects on aerial photos geometric correction. - Looks good and related to some work my colleague Erling Dalen is doing, so I pointed him to it.