Certificate in Spark & Python
Gain hands-on experience in Spark & Python and advance your career to the next level.
Overview
Why Spark & Python
Spark is a data processing engine built for speed, ease of use and advanced analytics. It supports multiple workloads through a unified engine whose components are libraries accessible through unified APIs in languages such as Java and Python. It can be deployed in different environments, read data from various sources and interact with a wide range of applications.
In This Course
The course teaches you to analyze large datasets using Jupyter notebooks, MapReduce and Spark, and to perform data science and data engineering at scale.
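As a small taste of the MapReduce model mentioned above, here is a minimal sketch of a word count in plain Python. It is an illustration only, not course material or Spark code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped counts to get one total per word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Chain the three phases: map, shuffle, reduce."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["spark and python", "python and hadoop"]
    print(word_count(sample))
```

In PySpark, the same idea is typically expressed on an RDD with `flatMap`, `map` and `reduceByKey`, with the engine handling the shuffle and distributing the work across machines.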
The Sollers Advantage
Sollers’ Certificate in Spark & Python program is employer-backed and customized based on industry requirements. Designed by our employer partner, the program gets you hands-on experience in the domains of Spark & Python and serves as a quick solution to help you accelerate your career.
Learning Outcomes
- Use Python and Spark together to analyze Big Data.
- Use Spark DataFrames.
- Classify Customer Churn with Logistic Regression.
- Use Spark with Random Forests for Classification.
- Learn how to use Spark’s Gradient Boosted Trees.
- Use Spark’s MLlib to create Powerful Machine Learning Models.
- Learn about the Databricks platform.
- Get set up on Amazon Web Services EC2 for Big Data Analysis.
- Learn how to use the Amazon EMR (Elastic MapReduce) service.
- Learn how to leverage the power of Linux with a Spark Environment.
Syllabus
- Introduction to Big Data
- Hadoop Architecture
- HDFS
- Hadoop commands
- MapReduce
- YARN, JobTracker (HDP 1.0)
- Hive Introduction
- Hive Tables
- Hive Table Partitions, buckets, Skewing
- Sub-queries
- Kafka Introduction
- Kafka architecture
- Zookeeper
- Partitions and replication
- Python Basics
- Spark Introduction
- RDD Transformations
- RDD Actions
- Pair RDD
- Shared Variables
- Data Frames
- SparkSQL
- PySpark
- MLlib clustering
- Introduction to Oozie
- Components of Oozie
- Oozie Actions: HDFS, MapReduce
- Complex workflows
- Oozie coordinator
- Installation of Hadoop Multi-Node Cluster
- Configuration of Hadoop, Hive, YARN
- Installation of Spark Multi-Node Cluster
- Complete the project
INTERNSHIP
At the end of the course, Sollers assigns you a project in which you gain practical experience with the key Spark and Python concepts covered during the course. You graduate only after successfully completing the project.
To be placed with our industry-leading corporate partners, students are expected to demonstrate a certain level of skill and expertise upon successful completion of the program.
Work with the latest tools
INSTRUCTORS
Our instructors are highly experienced in the industry. They also give you the personal attention you need and guide you every step of the way.
Course Duration
Starting soon
Reserve your spot
Limited seats only
For information about fees or reserving your spot, contact our Admissions Team.
Credit transfers applicable for alumni
Career Guidance
After completion of the program, we assist our students with interview coaching, resume-building sessions, mock interviews and job readiness training to prepare them for the corporate world.
We provide exclusive one-on-one sessions with our industry-based career advisors, who offer guidance on everything from resume feedback to interview Q&As and job preparation.