
The Complete Guide to Spark Machine Learning - Part 1

If you want to be recognized as a machine learning expert on large-scale data, please join this course. It covers the core framework of Spark machine learning, SQL-based data processing through challenging practical problems, data analysis grounded in the business domain, and the implementation of optimized machine learning models.

(4.9) 25 reviews

916 students

Apache Spark
Machine Learning (ML)
Big Data
Data Engineering

This course is designed for intermediate learners.

What you will learn!

  • Implementing Machine Learning Models in Spark

  • Detailed understanding of DataFrame, the foundation of Spark's data processing

  • Understand the various technical elements that make up the Spark Machine Learning Framework

  • Mastering Spark's Machine Learning Pipeline

  • Ability to use SQL for data analysis

  • SQL-based Feature Engineering Techniques

  • Implementing models with XGBoost and LightGBM in Spark

  • Model hyperparameter tuning method based on Bayesian optimization

  • Improve your data analysis and ML model implementation skills simultaneously through challenging real-world problems.

  • Data analysis methods grounded in the business domain

  • Various data visualization techniques

Data analysis + feature engineering + ML implementation:
gain three competencies at once.

Apache Spark
meets machine learning.

Apache Spark, the leading open source solution for large-scale distributed processing, meets machine learning.

Many large domestic companies and financial institutions use Apache Spark to analyze large amounts of data and build machine learning models. Because Spark is built on a distributed data processing framework, it can process large amounts of data and build ML models by scaling out across a few to dozens of servers. It can therefore overcome the limitation of scikit-learn, which can only build machine learning models on a single server.


I will help you grow
into a machine learning expert
who is also skilled at data processing and analysis.

The 'Spark Machine Learning Complete Guide - Part 1' course goes beyond teaching you how to implement machine learning models in Spark. It will help you grow into a machine learning expert who is also skilled in data processing and analysis.

To grow into a true machine learning expert, you need not only ML implementation skills but also the ability to process and combine business data to build ML models. To this end, you will learn through hands-on practice how to process data with SQL, the tool most commonly used for large-scale data processing in practice, along with data analysis techniques based on business domain analysis.

The course is designed to develop your data processing, analysis, and ML implementation skills through detailed theoretical explanations and practical training.


I will help you solve
the problems you will face.

Implementing machine learning models on Spark is not easy. You will run into many problems that data scientists and machine learning practitioners have not faced before, such as the distinctive machine learning APIs and frameworks built on the Spark architecture, and SQL-based data processing.

This course, The Complete Guide to Spark Machine Learning, will develop your ability to solve the problems you encounter.

The first half of the course 'Spark Machine Learning Complete Guide - Part 1'

The first half of the course consists of detailed theoretical explanations and plenty of practical exercises on the elements that make up the Spark Machine Learning Framework, such as DataFrame, SQL, Estimator, Transformer, Pipeline, and Evaluator. Through this, you will be able to implement ML models in Spark easily and quickly.

We will also go into detail about how to use XGBoost and LightGBM in Spark, and how to tune hyperparameters using HyperOpt based on Bayesian optimization.

The second half of the course 'Spark Machine Learning Complete Guide - Part 1'

The second half of the course improves your real-world data processing and analysis skills together with your machine learning implementation skills through hands-on practice on Kaggle's Instacart Market Basket Analysis competition. The Kaggle Instacart competition is a difficult one, and its data set consists of e-commerce order processing tables (products, orders, and order products).

Using this data set, you will learn in detail how to process and analyze business data with SQL, how to perform feature engineering, how to derive analysis domains from the business, and how to build models on the derived features.
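As a small illustration of SQL-based feature engineering on tables of this shape, the sketch below aggregates order lines into per-user features. To keep it runnable without a cluster, it uses Python's built-in `sqlite3` as a stand-in for Spark SQL; in Spark, a similar query would be run with `spark.sql` against temporary views. The tables are tiny hand-made samples with a reduced set of columns, and the feature names (`order_cnt`, `item_cnt`, `reorder_ratio`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy versions of the orders and order-products tables (made-up rows).
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, user_id INTEGER, order_number INTEGER);
    CREATE TABLE order_products (order_id INTEGER, product_id INTEGER, reordered INTEGER);

    INSERT INTO orders VALUES (1, 100, 1), (2, 100, 2), (3, 200, 1);
    INSERT INTO order_products VALUES
        (1, 10, 0), (1, 11, 0),
        (2, 10, 1), (2, 12, 0),
        (3, 11, 0);
    """
)

# One row per user: number of orders, number of items bought,
# and the share of items that were reorders.
feature_sql = """
SELECT o.user_id,
       COUNT(DISTINCT o.order_id) AS order_cnt,
       COUNT(*)                   AS item_cnt,
       AVG(op.reordered)          AS reorder_ratio
FROM orders o
JOIN order_products op ON o.order_id = op.order_id
GROUP BY o.user_id
ORDER BY o.user_id
"""

user_features = conn.execute(feature_sql).fetchall()
for row in user_features:
    print(row)
# prints:
# (100, 2, 4, 0.25)
# (200, 1, 1, 0.0)
```

Features like these can then be joined back to the training rows and fed to a model.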

The course being released now is Part 1 of the 'Spark Machine Learning Complete Guide'. Part 2 will be released later and will cover text analysis, recommendations, and time series analysis.

💻 Please check before taking the class!

  • All of the practice code in this course is written in Python. Scala is not covered, so please keep this in mind before choosing the course.

Please check
the practice environment.

The hands-on training uses Databricks. Databricks provides a notebook environment that allows you to create Spark-based applications on the cloud without installing Spark.

Databricks is officially available for free for 14 days as a Community version.
In Section 0, the video lecture 'Managing Spark Clusters on Databricks and Using Databricks Even After 2 Weeks of Signing Up' explains how you can continue to use it for free after the 14 days, so please watch that video carefully (for details on the Databricks Community version, please refer to the link).

You can download the lecture practice code and lecture explanation materials from ‘Download the practice code and explanation materials.’


Prerequisite knowledge
is required for this course.

This course assumes that students have studied up to Chapter 5 (Regression) of the Complete Guide to Python Machine Learning, or have equivalent knowledge, and that they have a very basic understanding of SQL. Please keep this in mind when choosing the course.

It would be helpful to know the basics of Spark, but even if you don't, you will have no problem following the lecture.

Please check out the prerequisite course!

The Complete Guide to Python Machine Learning

No more theory-only machine learning.
From core machine learning concepts to practical skills, explained simply and accurately.


Recommended for
these people!

Who is this course right for?

  • Anyone who wants to implement machine learning using Spark

  • Those who want to implement machine learning based on large-scale data

  • Anyone who wants to improve their data processing techniques for machine learning using SQL

  • Anyone who wants to learn the entire process of processing data into the desired format and creating an ML model based on it in practice

  • Anyone who wants to improve data analysis, feature engineering capabilities, and ML implementation

What do you need to know before starting?

  • Understanding up to Chapter 5 (Regression) of the Complete Guide to Python Machine Learning or equivalent prior knowledge

  • Understanding SQL Basics

Hello
This is dooleyz3525

24,943

Students

1,160

Reviews

3,916

Answers

4.9

Rating

13

Courses

(Former) Encore Consulting

(Former) Oracle Korea

Freelance AI consultant

Author of The Complete Guide to Python Machine Learning

Curriculum

117 lectures ∙ (24hr 27min)

Course Materials:

Lecture resources

Reviews

25 reviews

4.9

  • gomjong

    Reviews 8

    Average Rating 4.9

    5

    100% enrolled

    Thanks to you, I learned about Spark and gained confidence in Kaggle challenges. Thank you!

    • freedom07

      Reviews 7

      Average Rating 5.0

      5

      93% enrolled

      I first got to know Professor Kwon Chul-min through the Complete Guide to Python Machine Learning. Thanks to that lecture, I, a non-major, was able to stay in a field I had been thinking of giving up on. I now work in this field and keep studying steadily by taking Inflearn lectures. I wanted to thank the teacher, so I first did so in the Q&A, and he encouraged me that if I kept studying I would achieve what I had been working toward. I plan to keep taking his lectures in the future. ^^ Haha. He really teaches so well. Professor Kwon Chul-min, I would like to take this opportunity to thank you sincerely.

      • dooleyz3525
        Instructor

        I am all the more moved that you left such a touching review. I should be the one thanking you: a review like this instantly rewards the hard work that went into creating the lecture. If you keep working hard like this, you will surely achieve everything you want. Thank you.

    • indizz4933

      Reviews 1

      Average Rating 5.0

      5

      100% enrolled

      Thank you for explaining it step by step.

      • yhzzz123

        Reviews 12

        Average Rating 5.0

        5

        54% enrolled

        I am a student who has been attending Kwon Chul-min's lecture series! Thank you for continuing to provide high-quality lectures! And I have seen several Spark lectures in Scala and Java, but this is the first time I have seen a lecture that teaches Spark in Python, so I think it was even better! Although I have not completed the course yet, I still like how he tries to teach simple grammar as easily as possible! And I also like how he provides various practice materials to encourage repeated mastery! I look forward to other lectures in the future!

        • egs41

          Reviews 53

          Average Rating 5.0

          5

          10% enrolled

          The instructor's diction and voice made it easy to focus, and the content was solid. Please continue to make good lectures. Thank you.


          $77.00