4+ years of development experience with the Hadoop ecosystem (Spark, Scala, Oozie, Pig, Hive, HDFS, MapReduce) and/or NoSQL technologies such as Cassandra and MongoDB, including experience with real-time and stream processing systems. POC experience or training won’t be considered.
Excellent knowledge of Core Java, UNIX shell scripting, or PL/SQL stored procedures is required.
Should have knowledge of different Hadoop distributions such as CDH 4/5, Hortonworks, MapR, and IBM BigInsights.
Strong foundational knowledge of and experience with a range of big data components such as Hadoop/YARN, HDFS, MapReduce, Oozie, Falcon, Pig, Hive, ZooKeeper, Sqoop, and Flume.
Develop MapReduce programs or Hadoop Streaming jobs.
Develop Pig scripts/HiveQL queries for analyzing structured, semi-structured, and unstructured data flows.
Should have knowledge of table definitions, file formats, UDFs, data layout (partitions and buckets), debugging, and performance optimization.