Department of Computer Science

MSc Thesis: Weather Talk - extracting weather information by text mining

Vasileios Lampos
advised by Prof. Nello Cristianini

Weather Talk MSc Thesis
Weather Talk Visualisation Tool

Using this visualisation tool, one can compare the official weather map with the inferred one, under each of the investigated schemes, for 120 days.
Weather Talk Poster
[ MSc Thesis ] [ Visualisation Tool ] [ Poster ]

Part of the abstract
The main aim of this project was to design and implement a system able to infer the weather state of a location for a specific date by applying Bayesian inference models and statistical analysis to web observations. Additionally, we investigated various linear combinations of probabilistic schemes in which traffic information, the previous day's weather or a weather prior probability contributes to the final decision. As a final extension, we visualised the weather inference results on a map.
Software packages and a weather ontology were developed for data collection and preprocessing. Parameterised Bayesian belief networks expressed the probabilistic correlation between the inferred and the official weather observations. During training, we determined the optimal parameters and then tested their absolute and relative performance. Experimental results indicate that both the absolute and the relative (p-value) performance of most schemes is significant. As a result, one may assume that similar or even more sophisticated information extraction models in different contexts will be able to deliver useful conclusions.
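As a rough illustration of the idea described in the abstract, the following minimal sketch infers a weather state from words observed in web text and then linearly combines that result with a second probabilistic source, such as the previous day's weather. It is not the thesis implementation: a plain naive Bayes model stands in for the parameterised Bayesian belief networks, and the state names, word likelihoods, weights and function names are all hypothetical values chosen for the example.

```python
from math import exp, log

# Hypothetical two-state weather ontology (an assumption for this sketch).
STATES = ("rain", "sun")


def posterior(words, likelihoods, prior):
    """Naive Bayes posterior P(state | words), computed in log space."""
    log_scores = {}
    for state in STATES:
        score = log(prior[state])
        for word in words:
            # Words missing from the table get a small floor probability.
            score += log(likelihoods[state].get(word, 1e-3))
        log_scores[state] = score
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    unnormalised = {s: exp(v - top) for s, v in log_scores.items()}
    total = sum(unnormalised.values())
    return {s: v / total for s, v in unnormalised.items()}


def combine(distributions, weights):
    """Linear combination of probability distributions over STATES."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return {
        s: sum(w * d[s] for w, d in zip(weights, distributions))
        for s in STATES
    }


def infer(distribution):
    """Return the most probable weather state."""
    return max(distribution, key=distribution.get)


# Made-up word likelihoods P(word | state); not taken from the thesis.
likelihoods = {
    "rain": {"umbrella": 0.30, "wet": 0.20, "beach": 0.02},
    "sun": {"umbrella": 0.05, "wet": 0.03, "beach": 0.25},
}
prior = {"rain": 0.5, "sun": 0.5}

# Evidence from web text, then a made-up previous-day distribution.
text_post = posterior(["umbrella", "wet"], likelihoods, prior)
previous_day = {"rain": 0.4, "sun": 0.6}

# The weights would be tuned during training; 0.7 and 0.3 are placeholders.
combined = combine([text_post, previous_day], [0.7, 0.3])
print(infer(combined), combined)
```

Because the weights sum to one, the combined result is again a probability distribution, so further sources such as traffic information could be added as extra terms in the same way.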

Part of the introduction
The aim of this project is to apply statistical analysis to web observations and to design Bayesian inference schemes in order to extract information about the weather state of a location on a specific date. Documents may include blogs, newsgroups, and news articles, but not official weather-related sources. A further challenge is the implementation of a data fusion model able to combine traffic information with weather observations and achieve better performance.

Part of the conclusions
Weather Talk forms a web mining framework with an embedded ontology that bases its decisions on Bayesian theory. Within a period of three months, without the computational power needed, and with all the limitations mentioned in this chapter, we managed to infer the two major weather states of a location with a success rate of 63.51%. The most important outcome of this project is therefore that this kind of information extraction is possible, and that it should now be applied to different contexts.

© 1995-2011 University of Bristol