Predictive analytics and data science for big data - Presciient

Data science skills are of vital and growing importance in commercial, governmental and not-for-profit organisations. Those in the management, risk, customer and IT functions increasingly need skills and/or literacy in this area.

This course introduces a range of data mining tools and techniques as they are commonly used in business.

Course outline

The most important data science tools are the predictive modelling methods. Accordingly, the course will develop attendees’ literacy in the strengths, characteristics and correct application of a range of these methods, from relatively simple linear models through to complex and powerful random forests. Generalised Linear Models, Support Vector Machines, Decision Trees, Gradient Boosting Machines and Neural Networks will be covered along the way.

The course will also teach how to frame predictive modelling problems correctly, prepare data suitably, evaluate model accuracy and stability, interpret results and interrogate models.
The two key styles of predictive modelling – operational and explanatory – will be described and explored.

As well as predictive modelling, the course will cover a range of other key data science tools, including:

  • Data exploration and visualisation: univariate summaries, correlation matrices, heat maps, hierarchical clustering.
  • Cluster analysis – used for segmentation and anomaly detection.
  • Other “unsupervised” outlier detection tools.
  • Association analysis – used in retail market basket analysis and the assessment of risk groupings.

This course will primarily be taught using Rattle, a graphical interface for predictive modelling and data science in R.

Additional topics

The following additional topics may be covered depending on the pace and interests of the class:

  • Principal components analysis – used to segment and interpret multivariate data.
  • Link and network analysis visualisation – which provides a simple and compelling way to communicate and analyse relationships, and is commonly applied in forensics, human resources and law enforcement.
  • Frequent item set analysis.

Who should attend?

This course is for anyone with an interest in data science. No prior knowledge of R is required.

Course outcomes

Attendees will, by the end of the course:

  • Understand the fundamentals of predictive modelling.
  • Have developed the ability to assess the effectiveness and fitness for purpose of any predictive modelling tool or technique.
  • Understand a range of unsupervised data mining techniques.
  • Know how to effectively prioritise analytical resources in a data science context.

Course instructor

The course will be taught by Presciient Director, Dr Eugene Dubossarsky. Eugene is a founder and Fellow of the Institute of Analytics Professionals of Australia (IAPA); Director, University of New South Wales School of Mathematics and Statistics Industry Advisory Board; and a recognised industry leader in Business Analytics. Eugene is a professional data scientist with 20 years’ experience programming in R and its parent language, S.

Prerequisites

No prior knowledge of R is required to take this course.
