Data processing and visualisation
Data is typically messy, poorly organised for analysis purposes, and difficult to interpret in its raw form. When it comes to understanding the risks, opportunities, trends and groupings hidden within it, a picture is worth a thousand words. The skills to effectively prepare, transform, explore and display data are vital for every data analysis practitioner.
This course builds on basic skills with R to provide analysts with a range of powerful tools to clean, prepare, explore, summarise, analyse, filter and search data, as well as a range of techniques to display data visually. These skills are fundamental for turning data, as it is often found in business environments, into information that is ready for communication and reporting.
Course outline
The data analysis component will present a range of data processing tools and techniques suitable for handling and transforming different data types and structures, alongside the R tools which support them, including:
- plyr
- reshape2
- stringr and other string processing tools
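As a rough illustration of how these three packages can fit together, the sketch below cleans, reshapes and summarises a small table. The `sales` data frame, its column names and its values are hypothetical and invented purely for this example; it is not course material, only one plausible workflow.

```r
library(plyr)
library(reshape2)
library(stringr)

# Hypothetical sales records with inconsistently typed region labels
sales <- data.frame(
  region = c(" north", "North ", "south", "SOUTH"),
  q1 = c(10, 20, 30, 40),
  q2 = c(15, 25, 35, 45),
  stringsAsFactors = FALSE
)

# stringr: trim surrounding whitespace and normalise case
sales$region <- str_to_lower(str_trim(sales$region))

# reshape2: convert from wide (one column per quarter) to long format
long <- melt(sales, id.vars = "region",
             variable.name = "quarter", value.name = "amount")

# plyr: split by region and quarter, sum each group, combine the results
totals <- ddply(long, c("region", "quarter"), summarise,
                total = sum(amount))
```

Here `totals` holds one row per region and quarter. Cleaning the labels first matters: without it, " north" and "North " would be treated as separate groups.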
The visualisation component of the course will cover the necessary statistical theory, as well as the tools for visualising time series, categories, risk, uncertainty and other relationships in data across many dimensions:
- Line plots
- Scatter plots
- Histograms
- Image plots
- 3D plotting
Different applications of format, shape, colour, and commentary to provide artistic, compelling and understandable visualisations will also be covered. R tools used will include:
- lattice
- graphics
- ggplot2
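For a sense of what one of these tools looks like in use, the following is a minimal sketch of a line plot built with ggplot2. The `revenue_df` data frame and its values are hypothetical, made up only for this illustration, and the styling choices are one option among many.

```r
library(ggplot2)

# Hypothetical monthly revenue for two products
revenue_df <- data.frame(
  month   = rep(1:12, times = 2),
  product = rep(c("A", "B"), each = 12),
  revenue = c(100 + 5 * (1:12), 80 + 8 * (1:12))
)

# Map month and revenue to the axes and product to colour,
# then layer lines and points on top of that mapping
p <- ggplot(revenue_df, aes(x = month, y = revenue, colour = product)) +
  geom_line() +
  geom_point() +
  labs(title = "Monthly revenue by product (illustrative data)",
       x = "Month", y = "Revenue")
```

Calling `print(p)` renders the plot. Mapping `product` to colour draws one line per product and generates the legend automatically.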
Who should attend?
This is a course for any data analyst. It does not require specialised statistical knowledge.
Course outcomes
By the end of the course, attendees will have a solid conceptual and practical grounding in tools and techniques for data handling, transformation, analysis and visualisation in a business context.
Course instructor
The course will be taught by Presciient Director, Dr Eugene Dubossarsky. Eugene is a founder and Fellow of the Institute of Analytics Professionals of Australia (IAPA); Director, University of New South Wales School of Mathematics and Statistics Industry Advisory Board; and a recognised industry leader in Business Analytics. Eugene is a professional data miner with 20 years’ experience programming in R and its parent language, S.
Prerequisites
It is recommended that attendees have completed Presciient’s two-day Introduction to R course, or equivalent.