By the end of this project, you will use the Apache Spark Structured Streaming API with Python to stream data from two different sources, store a dataset in a MongoDB database, and join two datasets. The Apache Spark Structured Streaming API can continuously stream data from various sources, such as the file system or a TCP/IP socket. One application is continuously capturing data from weather stations to build a historical record.
Certificate Available ✔
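To give a feel for what the project above involves, here is a minimal sketch, not the project's actual code. It assumes PySpark is installed, and every name in it (`STATION_SCHEMA_DDL`, `build_joined_stream`, `run_console_demo`, the `station_id`, `station_name`, and `temperature` columns, and the default host and port) is hypothetical. It reads one stream from a directory of CSV files and one from a TCP socket, then joins them on a shared key. The MongoDB step is left out.

```python
# Hypothetical schema for station metadata CSV files (an assumption, not from the course).
STATION_SCHEMA_DDL = "station_id STRING, station_name STRING"


def build_joined_stream(spark, stations_dir, host="localhost", port=9999):
    """Return a streaming DataFrame joining socket readings with file-based station data.

    `spark` is an existing SparkSession. PySpark is imported lazily so the
    module can be loaded without Spark being present.
    """
    from pyspark.sql.functions import col, split

    # Source 1: the file system. File sources need an explicit schema.
    stations = (
        spark.readStream
        .schema(STATION_SCHEMA_DDL)
        .option("header", "true")
        .csv(stations_dir)
    )

    # Source 2: a TCP/IP socket. Each text line arrives in a single `value` column.
    lines = (
        spark.readStream
        .format("socket")
        .option("host", host)
        .option("port", port)
        .load()
    )

    # Assumed line format: "<station_id>,<temperature>".
    parts = split(col("value"), ",")
    readings = lines.select(
        parts.getItem(0).alias("station_id"),
        parts.getItem(1).cast("double").alias("temperature"),
    )

    # Inner stream-stream join on the shared key. Without watermarks the
    # join state grows without bound, so this is only suitable for a demo.
    return readings.join(stations, on="station_id", how="inner")


def run_console_demo(stations_dir):
    """Start the joined stream and print each micro-batch to the console."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("weather-stream-sketch").getOrCreate()
    joined = build_joined_stream(spark, stations_dir)
    query = joined.writeStream.format("console").outputMode("append").start()
    query.awaitTermination()
```

To try it, one option is to start a local socket server with a tool such as `nc -lk 9999`, call `run_console_demo` with a directory that receives station CSV files, and type lines like `S1,21.5` into the socket.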
Big Data Engineers and professionals with NoSQL skills are highly sought after in the data management industry. This Specialization is designed for those seeking...
This is a self-paced lab that takes place in the Google Cloud console. Create a Google Cloud SQL PostgreSQL instance. Perform SQL operations using the GCP Console...
In this course, we look at the common challenges data analysts face and how to solve them with the big data tools on Google Cloud. You’ll pick up some...
The course provides a general overview of the main methods in the machine learning field. Starting from a taxonomy of the different problems that can be solved through...