Big Data Engineering – Online Bootcamp

Big Data Engineering

Leveraging Cloud for Big Data Analytics

3 Weeks Live Sessions | 4th Weekend Campus Immersion

Big Data is essentially a special application of Data Science. The data sets are so large that they require infrastructure specially designed to handle them, such as Hadoop, Spark and NoSQL databases. Deploying data analytics on Big Data requires a thorough understanding of Big Data infrastructure and technologies. This bootcamp gives you the understanding and skills you need to work with big data.

Overview of Big Data Infrastructure and Technologies

Practical use of Big Data Technologies and competent implementation in your organization

Cloud Computing and Big Data

Learn to work with existing cloud-based platforms and tools for Big Data processing and data analytics

Why this Bootcamp

Work on Real-life Data Science Problems

Take your career head-on by working on projects under a competency-based learning paradigm. The quality of the time you spend and the outcome you achieve matter far more than the quantity.

Work 1:1 with a Mentor

We pair you with a mentor who has extensive professional and academic knowledge of the field. You’ll have one-on-one conversations with your mentor, and receive useful feedback on improving your work.

We Will Keep You Engaged

Our mentors are here to keep you motivated, answer questions, provide feedback, and help deepen your understanding of essential tools and techniques. Learn through live online classes and face-to-face sessions. Learning works best when you can ask questions and clarify your doubts with the faculty.

What You Will Learn

■ Cloud Service Models and Operation, Cloud Resources, Multitenancy
■ Virtual Hybrid/Dynamic Cloud Datacenter, and outsourcing enterprise IT infrastructure to the Cloud
■ Cloud use cases and scenarios for enterprise
■ Cloud Economics and Pricing Model

■ Overview of major Cloud-based Big Data Platforms: AWS, Microsoft Azure, Google Cloud Platform (GCP)
■ Introduction to MapReduce/Hadoop
■ Hadoop Ecosystem and Components
■ HDFS and Cloud Based File Systems
■ HBase, Hive, Pig and YARN
■ MapReduce/Hadoop Programming and Tools

■ SQL basics (recollection from Database and SQL course)
■ NoSQL database types and overview
■ Column-based databases and their use (e.g. HBase)
■ Modern large-scale databases: AWS Aurora, Azure Cosmos DB, Google Spanner

■ Data Streams and Stream Analytics
■ Spark Architecture and Components
■ Popular Spark platforms (e.g. Databricks), Spark Programming and Tools

■ Enterprise Big Data Architecture and Large Scale Data Management 
■ Data Structures, Data Warehouses, Distributed Systems
■ CAP Theorem, ACID and BASE Properties
■ Cloud Based Services, Data Lakes
■ Big Data Security challenges, Data Protection 
■ Access Control and Identity Management

1. Run MapReduce tasks, e.g. word count; run a ranking algorithm; run a Pregel graph algorithm (shortest path).

2. For a given enterprise profile, select and recommend the enterprise Big Data infrastructure, services and components. Also create a Data Management Plan (DMP), a cost assessment and a deployment plan.
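
To give a feel for the word count task listed in item 1 above, here is a minimal single-machine sketch of the map, shuffle and reduce phases in plain Python. It is not Hadoop code and is not taken from the course material; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions chosen purely for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key (a framework would do this step)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data needs big infrastructure", "data lakes store data"]
    print(word_count(sample))
```

On a real cluster the mapper and reducer would run in parallel on different nodes and the framework would handle the shuffle; the sketch only shows how the three phases fit together.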

The ability to learn hands-on with real industry data and deliver insights to an industry jury is the best part of the program. Data Science and its application to Decision Science, taught by practitioner faculty, is the biggest highlight of the program. Strongly recommend it.

Vinod Tiwari

Senior Analyst, TCS

Is this program right for you? Get advice from a Senior Counselor.
