This class will explore the many unique applications and extensions of random forests, many of which are implemented in R packages such as randomForest.
Access to these methods lets the user easily solve problems that are not amenable to other methods, including deep learning.
Topics will include:
- A brief overview of the random forest algorithm.
- Out-of-sample estimates on training data, with applications in fraud, risk and outlier detection: each training record can be scored using only the trees that did not see it (out-of-bag prediction), so random forests can make honest predictions on training data, unlike most other methods.
- Single-model quantile regression: estimating a full distribution, not just the mean. Vital for risk-based estimation.
- The proximity matrix: a powerful visualisation, clustering and insights tool unique to random forests.
- Random forests as an unsupervised learning method: outlier detection and clustering when there are no target values, which is vital for fraud detection.
- ranger, a fast, flexible implementation of random forests in R.
- extraTrees (Extremely Randomized Trees), an extension to random forests that often improves accuracy.
- Dealing with small data sets and small classes.
A range of other topics, including recent works and extensions of existing packages, may also be covered.
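As a rough taste of three of the topics above (out-of-bag prediction, the proximity matrix and unsupervised outlier scoring), here is a minimal sketch using the randomForest package. The built-in iris data, the seed and the variable names are assumptions chosen for illustration only; this is not course material.

```r
# Assumes the randomForest package is installed:
# install.packages("randomForest")
library(randomForest)

set.seed(42)  # make the bootstrap samples reproducible

# Supervised forest on the built-in iris data, keeping the proximity matrix
rf <- randomForest(Species ~ ., data = iris, ntree = 500, proximity = TRUE)

# predict() without newdata returns out-of-bag predictions:
# each row is scored only by the trees that did not train on it
oob_pred <- predict(rf)
oob_accuracy <- mean(oob_pred == iris$Species)

# Proximity: how often two rows end up in the same terminal node
prox <- rf$proximity

# Unsupervised mode: leave out the response, so the forest learns to
# separate the real rows from a synthetic, shuffled copy of them
urf <- randomForest(x = iris[, 1:4], ntree = 500, proximity = TRUE)

# outlier() converts a proximity matrix into one outlyingness score per row
scores <- outlier(urf$proximity)
head(sort(scores, decreasing = TRUE))  # the most unusual rows
```

Rows with large scores are far, in proximity terms, from the rest of the data. Passing `rf` and `iris$Species` to `MDSplot()` gives a quick two-dimensional view of the same proximity structure.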
Trainees are expected to be familiar with R, the basics of machine learning and out-of-sample error estimation, and the basic workings of the random forest algorithm.
Equipment
Computers running R with practical examples from all core components will be provided to attendees. These may not necessarily be configured for the most advanced topics, which may involve demonstrations only, especially if requiring server or cloud functionality.
Course instructor
The course will be led by Presciient director Dr Eugene Dubossarsky. He is the head of the Sydney Users of R Forum. Eugene is also Principal Founder of Analyst First, an international analytics industry organisation. He is a founder of the Institute of Analytics Professionals of Australia (IAPA); Director, University of New South Wales School of Mathematics and Statistics Industry Advisory Board; and a recognised industry leader in business analytics. Eugene has 20 years’ experience programming in R and its parent language, S.
Testimonials
The Introduction to R course provided clear and logical assistance to getting up and running with R. More than that, the real value was in providing guidance on the myriad of online resources and introducing me to a network of passionate and helpful R users. Eugene is a knowledgeable and approachable teacher. I wouldn’t hesitate in recommending the course. I feel that I am now fully on the road to applying R and using data to improve efficiency across my organisation.
—James Orton, Data and IT Manager, UNICEF Australia
I have been trying to convert my Stata programming skills to R, however, there have been many times where I just wanted to sit down with someone and have them explain the fundamentals of programming in R. Sure, a number of books and websites have helped me become familiar with R, however, I still didn’t feel ready to translate all of my familiar Stata commands to R (e.g. I am comfortable plotting graphics using ggplot2, however, revert back to Stata for data manipulation). I knew that a more effective way to learn and feel confident would be to sit down with someone and have them explain how they use R, how they clean data, how they plot graphics, etc. I knew that once I felt comfortable with cleaning my data in R, analysis would be less of an issue—I’m happy to research the specifics on my own.
Thank you Eugene for advancing my R skills. I especially appreciate the time spent explaining the fundamentals of data manipulation — i.e. the code one needs to know before running any basic or sophisticated analysis. The pace of the workshop was perfect.
—Dr Chelsea Wise, Lecturer, Marketing, UTS Business School
Discounts
Please ask about our discounts for group bookings.
Feedback
Email enquiries@presciient.com with any questions about the course, including requests for more detail, specific content you would like to see covered, or queries regarding prerequisites and suitability. If you would like to attend but for any reason cannot, please also let us know.
Variation
Course material may vary from what is advertised due to the demands and learning pace of attendees. Additional material may be presented along with or in place of what is advertised.
Cancellation
The course may be cancelled by the organisers with full refund of fees.