Predictive analytics and data science for big data - Presciient

Our leading course has transformed the machine-learning and data-science practice of the many managers, sponsors, key stakeholders, entrepreneurs and beginning data-science practitioners who have attended it.

This course is an intuitive, hands-on introduction to data science and machine learning. The training focuses on central concepts and key skills, leaving the trainee with a deep understanding of the foundations of data science and even some of the more advanced tools used in the field.

The course also covers key issues of data science practice in a work environment, and directs trainees to a range of further learning directions.

The skills taught are transferable to all software platforms. The course involves no coding and requires no coding knowledge or experience. A tool with a graphical user interface is used so trainees can focus on learning the central skills and ideas.

Key skills taught include building, assessing, selecting and deploying predictive models, and applying some of the most commonly used methods in the field, from generalised linear models (GLMs) to advanced methods such as random forests.

See what former trainees are saying about this course.

Course outline

The most important data science tools are the predictive modelling methods. Accordingly, the course will develop attendees’ literacy in the strengths, characteristics and correct application of a range of these methods, from relatively simple linear models through to complex and powerful random forests. Generalised linear models, support vector machines, decision trees, gradient boosting machines and neural networks will be covered along the way.

It will also teach how to frame predictive modelling problems correctly, prepare data suitably, evaluate model accuracy and stability, interpret results and interrogate models.
The two key styles of predictive modelling – operational and explanatory – will be described and explored.

As well as predictive modelling, the course will cover a range of other key data science tools, including:

  • Data exploration and visualisation: univariate summaries, correlation matrices, heat maps, hierarchical clustering.
  • Cluster analysis – used for segmentation and anomaly detection.
  • Other “unsupervised” outlier detection tools.
  • Association analysis – used in retail market basket analysis and the assessment of risk groupings.

This course will primarily be taught using Rattle, a graphical interface for predictive modelling and data science in R.

Additional topics

The following additional topics may be covered depending on the pace and interests of the class:

  • Principal components analysis – used to segment and interpret multivariate data.
  • Link and network analysis visualisation – which provides a simple and compelling way to communicate and analyse relationships, and is commonly applied in forensics, human resources and law enforcement.
  • Frequent item set analysis.

Who should attend?

This course is for anyone with an interest in data science. No prior knowledge of R is required.

Course outcomes

Attendees will, by the end of the course:

  • Understand the fundamentals of predictive modelling.
  • Have developed the ability to assess the effectiveness and fitness for purpose of any predictive modelling tool or technique.
  • Understand a range of unsupervised data mining techniques.
  • Know how to effectively prioritise analytical resources in a data science context.

Course instructor

The course will be taught by Presciient Director, Dr Eugene Dubossarsky. Eugene is a founder and Fellow of the Institute of Analytics Professionals of Australia (IAPA); Director, University of New South Wales School of Mathematics and Statistics Industry Advisory Board; and a recognised industry leader in Business Analytics. He is a professional data scientist with 20 years’ experience programming in R and its parent language, S.

Prerequisites

No prior knowledge of R is required to take this course.
