The Complete Guide to Spark Machine Learning - Part 1
If you want to be recognized as a machine learning expert based on large-scale data, from understanding the core framework of Spark machine learning, to SQL-based data processing through difficult practical problems, to data analysis through business domain analysis, and to the ability to implement optimized machine learning models, please join this course.
This course is prepared for Intermediate Learners.
What you will learn!
Implementing Machine Learning Models in Spark
Detailed understanding of DataFrame, the foundation of Spark's data processing
Understand the various technical elements that make up the Spark Machine Learning Framework
Mastering Spark's Machine Learning Pipeline
Ability to use SQL for data analysis
SQL-based Feature Engineering Techniques
Implementing models with XGBoost and LightGBM in Spark
Model hyperparameter tuning method based on Bayesian optimization
Improve your data analysis and ML model implementation skills simultaneously through challenging real-world problems.
Data analysis method based on analysis domain
Various data visualization techniques
Data analysis + feature engineering + ML implementation, Grab three competencies at once.
With Apache Spark Machine learning meets.
Apache Spark, the leader in open source large-scale distributed processing solutions, has met with Machine Learning .
Many large domestic companies and financial institutions are using Apache Spark to analyze large amounts of data and create machine learning models. Since Spark is based on a distributed data processing framework, it can process large amounts of data and create ML models by expanding capacity on a few to dozens of servers . Therefore, it can overcome the limitations of scikit-learn, which can only implement machine learning models on a single server.
Also good at data processing/analysis As a machine learning expert I will help you grow.
The 'Spark Machine Learning Complete Guide - Part 1' course will help you grow into a machine learning expert who is skilled in data processing and analysis beyond learning how to implement machine learning models in Spark.
In order to grow into a true machine learning expert, it is very important not only to have ML implementation skills, but also to have the ability to process and combine business data to create ML models. To this end, you will learn how to process data using SQL, which is most commonly used in large-scale data processing in practice, and data analysis techniques based on business domain analysis through practice.
It is designed to help you develop data processing/analysis and ML implementation capabilities through detailed theoretical explanations and practical training.
The problems you will face We will solve it for you.
Implementing machine learning models on Spark is not easy. This is because it faces many problems that existing data scientists or machine learning experts have not experienced, such as unique machine learning APIs and frameworks based on the Spark architecture, and data processing based on SQL.
In this course, The Complete Guide to Spark Machine Learning, we will develop your ability to solve the problems you encounter .
The first half of the lecture 'Spark Machine Learning Complete Guide - Part 1'
The first half of the lecture consists of detailed theoretical explanations and abundant practical exercises on various elements that make up the Spark Machine Learning Framework, such as DataFrame, SQL, Estimator, Transformer, Pipeline, and Evaluator. Through this, you will be able to implement ML models in Spark easily and quickly .
We will also go into detail about how to use XGBoost and LightGB in Spark, and how to tune hyperparameters using HyperOpt based on Bayesian optimization.
The second half of the lecture 'Spark Machine Learning Complete Guide - Part 1'
The second half of the lecture will improve your real-world data processing/analysis skills and machine learning model implementation skills at the same time through hands-on practice on Kaggle's Instacart Market Basket Analysis competition . The Kaggle Instacart competition is a difficult competition, and the data set consists of e-commerce order processing tables (products, orders, and order products).
Through this data set, you will learn in detail how to process and analyze business data based on SQL, perform feature engineering, how to derive analysis domains from business, and how to create models based on the derived features.
This is Part 1 of the 'Spark Machine Learning Complete Guide' course that is being released this time. Part 2 of the course will be released later, and will cover text analysis, recommendations, and time series analysis.
💻 Please check before taking the class!
All of the practical codes in this lecture are based on Python. Scala is not covered, so please refer to this before selecting a lecture.
The practice environment Please check.
The hands-on training uses Databricks. Databricks provides a notebook environment that allows you to create Spark-based applications on the cloud without installing Spark.
Databricks is officially available for free use for 14 days as a Community version. And in the video lecture ' Managing Spark Clusters on Databricks and Using Databricks Even After 2 Weeks of Signing Up ' in Section 0, I explain how you can continue to use it for free after 14 days, so please watch that video carefully (for explanation about the Databricks Community version, please refer to the link ).
You can download the lecture practice code and lecture explanation materials from ‘Download the practice code and explanation materials.’
Player knowledge This is a required course.
This course is designed assuming that students have knowledge of Chapter 5 (Regression) of the Complete Guide to Python Machine Learning or equivalent, and that they have a very basic understanding of SQL . Please refer to the above when selecting a course.
It would be helpful to know the basics of Spark, but even if you don't, you will have no problem following the lecture.
Please check out the player lecture!
The Complete Guide to Python Machine Learning
Stop teaching theory-based machine learning. From core machine learning concepts to practical skills, easily and accurately.
Are you curious about the interview with the knowledge sharer? (Click)
Recommended for these people!
Who is this course right for?
Anyone who wants to implement machine learning using Spark
Those who want to implement machine learning based on large-scale data
Anyone who wants to improve their data processing techniques for machine learning using SQL
Anyone who wants to learn the entire process of processing data into the desired format and creating an ML model based on it in practice
Anyone who wants to improve data analysis, feature engineering capabilities, and ML implementation
Need to know before starting?
Understanding up to Chapter 5 (Regression) of the Complete Guide to Python Machine Learning or equivalent prior knowledge
I first got to know Professor Kwon Chul-min through the Complete Guide to Python Machine Learning. Thanks to that lecture, I, a non-major, was able to not give up on this field that I had been thinking of giving up on.
I am currently working in this field and studying steadily by taking Infraon lectures. I wanted to thank the teacher, so I first thanked the teacher in the Q&A session, and the teacher encouraged me that if I continued to study, I would be able to achieve what I had worked for.
I plan to continue to listen to the teacher's lectures in the future. ^^ㅎㅎ He really teaches so well.
Professor Kwon Chul-min, I would like to take this opportunity to sincerely thank you.
I am even more impressed that you left such a touching review. I think I should be the one to thank you for the writing that instantly rewards the hard work you put into creating the lecture. If you continue to work hard like this, you will definitely achieve everything you want. Thank you.
I am a student who has been attending Kwon Chul-min's lecture series! Thank you for continuing to provide high-quality lectures! And I have seen several Spark lectures in Scala and Java, but this is the first time I have seen a lecture that teaches Spark in Python, so I think it was even better! Although I have not completed the course yet, I still like how he tries to teach simple grammar as easily as possible! And I also like how he provides various practice materials to encourage repeated mastery! I look forward to other lectures in the future!