Big Data Engineering
Leveraging Cloud for Big Data Analytics
3 Weeks Live Sessions | 4th Weekend Campus Immersion
Big Data is essentially a special application of Data Science. The data sets are so large that they require infrastructure specially designed to handle them, such as Hadoop, Spark, and NoSQL databases. Deploying data analytics on Big Data requires a thorough understanding of Big Data infrastructure and technologies. This bootcamp gives you the understanding and skills you need to work with big data.
Talk to Admissions: 9740-991-601
Why this Bootcamp
Work on Real-life Data Science Problems
Take your career head-on by working on projects within a competency-based learning paradigm. The quality of the time you spend and the outcome you reach matter far more than the quantity.
Work 1:1 with a Mentor
We pair you with a mentor who has extensive professional and academic knowledge of the field. You’ll have one-on-one conversations with your mentor, and receive useful feedback on improving your work.
We Will Keep You Engaged
Our mentors are here to keep you motivated, answer questions, provide feedback, and help deepen your understanding of essential tools and techniques. Learn through live online classes and face-to-face sessions. Learning works best when you can ask questions and clarify your doubts with the faculty.
What You Will Learn
Unit 1: Cloud Computing Foundation
■ Cloud Service Models and Operation, Cloud Resources, Multitenancy
■ Virtual Hybrid/Dynamic Cloud Datacenter, and outsourcing enterprise IT infrastructure to the Cloud
■ Cloud use cases and scenarios for enterprise
■ Cloud Economics and Pricing Model
Unit 2: Cloud and Big Data, Big Data Infrastructure and Components
■ Overview of major Cloud-based Big Data Platforms: AWS, Microsoft Azure, Google Cloud Platform (GCP). Introduction to MapReduce/Hadoop
■ Hadoop Ecosystem and Components
■ HDFS and Cloud-Based File Systems
■ HBase, Hive and Pig, YARN; MapReduce/Hadoop Programming and Tools
Unit 3: SQL and NoSQL Databases
■ SQL basics (recap of the Database and SQL course)
■ NoSQL database types and overview
■ Column-based databases and their use (e.g. HBase)
■ Modern large-scale databases: AWS Aurora, Azure Cosmos DB, Google Spanner
Unit 4: Data Streams and Streaming Analytics
■ Data Streams and Stream Analytics
■ Spark Architecture and Components
■ Popular Spark platforms (e.g. Databricks), Spark Programming and Tools
Unit 5: Big Data Management and Security
■ Enterprise Big Data Architecture and Large Scale Data Management
■ Data Structures, Data Warehouses, Distributed Systems
■ CAP Theorem, ACID and BASE Properties
■ Cloud-Based Services, Data Lakes
■ Big Data Security challenges, Data Protection
■ Access Control and Identity Management
1. Run MapReduce tasks, e.g. word count; run a ranking algorithm; run a Pregel graph algorithm (shortest path).
2. For a given enterprise profile, select and recommend the enterprise Big Data infrastructure, services and components. Also create a Data Management Plan (DMP), a cost assessment and a deployment plan.
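To give a flavour of the word count task in item 1, here is a minimal sketch of the MapReduce pattern in plain Python. It is not Hadoop code and not necessarily what the bootcamp uses: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made purely for illustration, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, a step the framework normally does."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, used only to exercise the sketch.
sample = ["big data on the cloud", "big data with spark"]
counts = word_count(sample)
print(counts)
```

On a real cluster the mapper and reducer would run in parallel across many machines, and the shuffle step would be handled by the framework; the three-stage structure stays the same.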
The ability to learn hands-on with real industry data and deliver insights to an industry jury is the best part of the program. Data Science and its application to Decision Science, taught by practitioner faculty, is the biggest highlight of the program. Strongly recommend it.