Big data is the field concerned with ways to analyze, systematically extract information from, or otherwise handle data sets that are too large or complex for traditional data-processing application software. Big data challenges include data capture, storage, analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data sourcing.
Our Big Data training programme comprises three courses:
1) Hadoop
2) Spark
3) ETL
Hadoop course overview
- Introduction to Hadoop
- Introduction to Big Data
- Hadoop Distributed File System (HDFS)
- MapReduce Fundamentals
- MapReduce Programming in Java
- NoSQL
- HBase
- Hive
- Pig
- Sqoop
- HCatalog
- Flume
Apache Spark course overview
- Batch and Real-Time Analytics with Apache Spark
- Scala (Object-Oriented and Functional Programming)
- Collections
- Object-Oriented Programming
- Integrations
- Spark Core
- Cassandra (NoSQL Database)
- Spark Integration with NoSQL (Cassandra) and Amazon EC2
- Spark Streaming
- Spark SQL
- Spark machine learning library (MLlib)
ETL course overview
- Introduction to ETL
- Data Warehouse concepts
- Data Mart
- Data Warehouse Architectures
- Building a dimensional model
- Introduction to Data Extraction, Transformation and Loading (ETL)
- Data Integration and ETL
- ETL Testing concepts
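To give a flavour of the extract, transform and load steps named in the ETL outline above, here is a minimal sketch in Python. It is an illustration only, not course material: the sample data, the `orders` table and the function names (`extract`, `transform`, `load`, `run_etl`) are all hypothetical, and it uses only the standard library (`csv`, `sqlite3`) rather than a dedicated ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for an export from a source system.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,4.25
3,alice,n/a
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise customer names, convert amounts to floats,
    and skip rows whose amount cannot be parsed."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject the bad record instead of loading it
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a target table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three steps in order and return total amount per customer."""
    load(transform(extract(text)), conn)
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory database for the demo
    print(run_etl(RAW_CSV, connection))
```

Keeping the three steps as separate functions means each one can be checked on its own, which is the kind of check the ETL testing topic is about: for example, confirming that the transform step rejects the row with an unparseable amount before anything is loaded.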