Senior Hadoop Developer

St. Louis, Missouri

Senior Hadoop Developer to develop, create, and modify general computer applications software or specialized utility programs.

Job responsibilities and duties include:

Implement data analytics processing algorithms on Big Data batch and stream processing frameworks (e.g., Hadoop MapReduce, Python, Spark, Scala, Kafka).
Perform data acquisition, preparation, and analysis using a variety of data programming techniques in Spark with Scala.
Work on complex issues where analysis of situations and data requires an in-depth evaluation of variable factors.
Load data from different datasets and decide which file format is most efficient for a task. Hadoop Developers source large volumes of data from diverse data platforms into the Hadoop platform.
Install, configure, and maintain an enterprise Hadoop environment.
Build distributed, reliable, and scalable data pipelines to ingest and process data in real time. Hadoop Developers deal with fetching impression streams, transaction behaviors, clickstream data, and other unstructured data.
Define Hadoop job flows and manage Hadoop jobs using a scheduler.
Review and manage Hadoop log files.
Design and implement column family schemas of Hive and HBase within HDFS.
Assign schemas and create Hive tables with suitable formats and compression techniques.
Mentor Big Data Developers on best practices and strategic development.
Develop efficient Pig and Hive scripts with joins on datasets using various techniques.
Apply different HDFS formats and structures, such as Parquet and Avro, to speed up analytics.
Fine-tune Hadoop applications for high performance and throughput.
Troubleshoot and debug any Hadoop ecosystem runtime issues.
Develop and document technical design specifications.
Design and develop data integration solutions (batch and real-time) to support enterprise data platforms including Hadoop, RDBMS, and NoSQL.
Lead technical meetings as required, convey ideas clearly, and tailor communication to the audience (technical and non-technical).
Implement Spark Streaming architecture and integrate it with JMS queues using custom receivers.
Develop and deploy API services in Java Spring.
Create Hive and HBase data source connections to Spring.
Implement multi-threading in Java/Scala.

This position has no direct reports and does not supervise any other personnel.

Minimum requirements:

Bachelor’s degree in Computer Science, Applied Computer Science, Engineering, or any related field of study, plus at least two (2) years of experience in the job offered or in any related position(s).

Qualified applicants must also have demonstrable proficiency, skill, experience, and knowledge with the following:

1. Hadoop/Big Data Ecosystem and Architecture
2. Hive, Spark, HBase, Sqoop, Impala, Kafka, Flume, Oozie, and MapReduce
3. Programming experience in Java, Scala, Python, and Shell Scripting
4. SQL and data modelling

Work from home benefit offered.