
Data Mining Techniques

Date: 22nd - 23rd February 2012
Duration: 2 days
Delivered by: Professor Brian Francis

Registration deadline has passed.
Please contact psc@lancaster.ac.uk for more information about this course.

Cost
The course fees include all supporting documentation and refreshments.
  • External from industry/commerce - £440
  • External from an academic institution/public sector/charity - £120
  • External postgraduate student - £60
  • Lancaster University staff - £120
  • Lancaster University postgraduate student - £60
  • Members of Mathematics and Statistics at Lancaster University - £0
Course description

The main aim of data mining is to extract knowledge, or information, that is stored in very large databases. This course covers many of the concepts that are fundamental to understanding and successfully applying data mining methods. Statistical concepts are discussed without mathematically complex formulation. Practical sessions will use the latest versions of standard software rather than dedicated data mining software; the emphasis of the course is on techniques rather than on any particular data mining package.

Topics covered
  • formulating research objectives that can be translated into suitable analytical methods;
  • data structure and organisation;
  • model comparison and assessment;
  • data splitting;
  • assessing and interpreting predictive models;
  • introduction to variable selection;
  • benefits and drawbacks of neural networks;
  • benefits and drawbacks of regression trees;
  • cluster analysis and latent class analysis;
  • bootstrap and cross-validation.
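
As a rough illustration of three of the topics listed above (data splitting, cross-validation and the bootstrap), the following minimal Python sketch uses only the standard library. It is not taken from the course materials; the function names train_test_split, k_fold_indices and bootstrap_sample, and the default values for the test fraction and number of folds, are assumptions chosen for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, validation_idx) pairs for k-fold cross-validation.

    Each of the n row indices appears in exactly one validation fold.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


def bootstrap_sample(rows, seed=0):
    """Draw len(rows) items from rows with replacement."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in rows]


if __name__ == "__main__":
    data = list(range(10))  # stand-in for real observations
    train, test = train_test_split(data)
    print(len(train), len(test))
    for train_idx, val_idx in k_fold_indices(len(data), k=5):
        print(sorted(val_idx))
    print(bootstrap_sample(data))
```

In practice a model would be fitted on each training portion and assessed on the held-out portion; the sketch only shows how the rows are partitioned or resampled.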
Learning
Students will learn by applying the concepts and techniques covered in the course to real data sets, and will be encouraged to examine issues of empirical interest in these data.
Successful students will be able to:
  • identify empirical problems and determine suitable analytical methods;
  • understand the difficulties presented by massive, opportunistic data;
  • understand how logistic regression, neural networks, projection methods and decision trees are used for predictive modelling;
  • prepare data for analysis, including partitioning data;
  • train, assess and compare regression models, neural networks and decision trees;
  • understand the advantages and disadvantages of cluster analysis, latent class modelling and other latent variable methods.
Cancellation Policy
Registrations are transferable to another course or individual at any time. Full refunds will be given for cancellations made 10 or more working days before the course start date. Otherwise, the full course fee will be charged.