Data Mining Techniques
Date: 22nd - 23rd February 2012
Duration: 2 days
Delivered by: Professor Brian Francis
Registration deadline has passed.
Please contact psc@lancaster.ac.uk for more information about this course.
Fees:
- External from industry/commerce - £440
- External from an academic institution/public sector/charity - £120
- External postgraduate student - £60
- Lancaster University staff - £120
- Lancaster University postgraduate student - £60
- Members of Mathematics and Statistics at Lancaster University - £0
The main aim of data mining is to extract knowledge, or information, from very large databases. This course covers many of the concepts that are fundamental to understanding and successfully applying data mining methods. Statistical concepts are discussed without mathematically complex formulation. Practical sessions will use the latest versions of standard software rather than specialist data mining software, because the emphasis of the course is on techniques rather than on any particular data mining package.
Topics covered:
- formulating research objectives that can be translated into suitable analytical methods;
- data structure and organisation;
- model comparison and assessment;
- data splitting;
- assessing and interpreting predictive models;
- introduction to variable selection;
- benefits and drawbacks of neural networks;
- benefits and drawbacks of regression trees;
- cluster analysis and latent class analysis;
- bootstrap and cross-validation.
By the end of the course, participants should be able to:
- identify empirical problems and determine suitable analytical methods;
- understand the difficulties presented by massive, opportunistic data;
- understand the concepts of using logistic regression, neural networks, projection methods and decision trees for predictive modelling;
- prepare data for analysis, including partitioning data;
- train, assess and compare regression models, neural networks and decision trees;
- understand the advantages and disadvantages of cluster analysis, latent class modelling and other latent variable methods.
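
As an informal illustration of the data partitioning and cross-validation topics listed above, the following is a minimal sketch in Python using only the standard library. It is not taken from the course materials, and the course software is not named here; the function names `train_test_split` and `k_fold_indices`, the 70/30 split and the choice of five folds are all assumptions made for the example.

```python
import random


def train_test_split(n_rows, test_fraction=0.3, seed=0):
    """Randomly partition row indices 0..n_rows-1 into training and test sets."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    # The first n_test shuffled indices form the test set, the rest the training set.
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train, validation) position lists for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled positions into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hold out 30% of a hypothetical 100-row data set for final assessment.
train_rows, test_rows = train_test_split(100, test_fraction=0.3)

# Cross-validate within the training portion only; the positions yielded
# here index into train_rows, not into the original data set.
folds = list(k_fold_indices(len(train_rows), k=5))
```

Each model under comparison would be fitted on the `train` positions of a fold and scored on its `validation` positions, with the held-out `test_rows` kept back for a final assessment of the chosen model.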