Organizations need skilled, forward-thinking Big Data practitioners who can apply their business and technical skills to unstructured data such as tweets, posts, pictures, audio files, videos, sensor data, and satellite imagery and more to identify behaviors and preferences of prospects, clients, competitors, and others. In this short course you'll gain practical skills when you learn how to work with Apache Spark for Data Engineering and Machine Learning (ML) applications. You will work hands-on with Spark MLlib, Spark Structured Streaming, and more to perform extract, transform and load (ETL) tasks as well as Regression, Classification, and Clustering. The course culminates in a project where you will apply your Spark skills to an ETL for ML workflow use-case. NOTE: This course requires that you have foundational skills for working with Apache Spark and Jupyter Notebooks. The Introduction to Big Data with Spark and Hadoop course from IBM will equip you with these skills and it is recommended that you have completed that course or similar prior to starting this one.
Certificate Available ✔
Get Started / More InfoWant to know how to query and process petabytes of data in seconds? Curious about data analysis that scales automatically as your data grows? Welcome to the Data...
In this 1-hour long project-based course, you will learn how to perform core job-related tasks within the SQL Server Management Studio (SSMS) environment. You will...
This course introduces you to the core concepts, processes, and tools you need to know in order to get a foundational knowledge of data engineering. You will gain...
In Relational Databases for Beginners, you will learn invaluable information about how data is stored with the MySQL database. This guided project provides hands-on...