Intelligent Systems Research Student
Intelligent Systems Laboratory, University of Bristol,
MVB, Woodland Road, Bristol BS8 1UB, UK
Tel: +44-117-33-14310 (ILRT)
- e-Research and e-Science
- Structured data integration
- Semantic Web
- Linked Data
- Big Data
I have enjoyed (probably far too much) being a part-time postgraduate research student since 2003 and recently submitted my thesis (September 2013). My supervisor is Professor Peter Flach.
Title: e-Science and e-Research Frameworks for Profiling and Matching Heterogeneous Data
Abstract: e-Science aims to advance scientific discovery by enabling large-scale, distributed research collaborations across networks of online instruments, data and computational services. e-Research broadens these aims to encompass all research, both scientific and non-scientific, by adopting and adapting the same digital research and data science concepts, tools and workflows. In this thesis we explore e-Science and e-Research approaches to the software engineering problem of building research intelligence tools that profile and match heterogeneous data about researchers and their organisations. One motivating use case is submission sifting, which matches submitted conference or journal papers to potential peer reviewers based on the similarity between the paper's abstract and the reviewer's publications as found in online bibliographic databases. Our implementation, SubSift, demonstrates that a novel application of the vector space model from information retrieval produces useful results in practice; SubSift has already been used to support major conferences and journals. We demonstrate that submission sifting can be abstracted to a generic workflow of re-usable web service components that constitute a general framework, not immediately available elsewhere, for analysing heterogeneous textual content ranging from documents to web sites, blog posts and Twitter feeds. Moving beyond text, we introduce a dataflow model that ranges over a class of higher-order relations that are sufficiently expressive to represent a wide variety of data types and structures. Our proof-of-concept implementation, JSONMatch, is used to demonstrate that the combination of this model and higher-order representation provides a powerful Big Data approach to analysing heterogeneous data. Finally, we propose a formalism for querying heterogeneous structured data, elevating Codd's relational algebra to a higher-order algebra defined on the basic terms of a higher-order logic. An extension incorporates approximate joins on arbitrarily complex structured data and is shown experimentally to have promise for future work.
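As a rough illustration of the submission sifting idea described in the abstract, the Python sketch below ranks reviewers by the cosine similarity between tf-idf vectors of a paper's abstract and each reviewer's publication text. It is not SubSift's code: the function names, the simple tokeniser and the smoothed idf weighting are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep runs of ASCII letters (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed idf (assumed weighting; other variants are equally valid).
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_reviewers(abstract, reviewers):
    """Rank reviewers (name -> publication text) by similarity to the abstract."""
    names = list(reviewers)
    vectors = tfidf_vectors([abstract] + [reviewers[name] for name in names])
    paper, profiles = vectors[0], vectors[1:]
    scores = [(name, cosine(paper, vec)) for name, vec in zip(names, profiles)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_reviewers(abstract_text, {"reviewer name": "concatenated titles and abstracts", ...})` returns a list of `(name, score)` pairs, highest score first. A reviewer whose publications share no terms with the abstract scores 0.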
My publications are listed in the following repositories:
- PhD-related publications: Department of Computer Science
- All my publications: Explore Bristol Research
Occasional teaching-related activities include supervision of projects, and preparation and marking of assignments.
- Artificial Intelligence and Logic Programming (undergraduate/MSc)
- Introduction to Machine Learning (MEng/MSc)
- Advanced Topics in Machine Learning and Data Mining (MSc)
- Advanced Computing MSc in Machine Learning and Data Mining
Reviewed paper submissions for the following books, journals, conferences and workshops:
- International Journal on Semantic Web and Information Systems (IJSWIS)
- Book - Logical and Relational Learning: From ILP to MRDM, De Raedt, Luc, 2008, Springer
- IEEE International Conference on Data Mining (ICDM)
- International Conference on Machine Learning (ICML)
- Mining and Learning with Graphs workshop at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)
- Fundamenta Informaticae
- International Symposium on Intelligent Data Analysis (IDA)
- e-Research projects, most recently data.bris and SubSift.
- Crowdsourcing and citizen science projects, such as NatureLocator.
- Semantic Web and Web 2.0 projects, most recently Visualising China and STARS.
- SaaS Web applications, such as Bristol Online Surveys.
- e-Learning applications, such as Virtual Training Suite and WinEcon.
Before joining the University in 1992, I worked for several years as a computer games developer for companies like Melbourne House, Mastertronic, Virgin Games and Electronic Arts.