Big Data Engineer Job Vacancy

Brief description:

  • Preferred Technical and Professional Expertise
  • Good to have at least one of the certifications listed here: DP-200, DP-201, DP-203, or AZ-204 (data engineering track)
  • 2–7 years of recent experience in data engineering.
  • Bachelor’s Degree or more in Computer Science or a related field.
  • A solid track record of data management showing your flawless execution and attention to detail.
  • Strong knowledge of and experience with statistics.
  • Programming experience, ideally in Python, Spark, Kafka, or Java, and a willingness to learn new programming languages to meet goals and objectives.
  • Experience in C, Perl, JavaScript, or other programming languages is a plus.
  • Knowledge of data cleaning, wrangling, visualization, and reporting, with an understanding of the best, most efficient use of associated tools and applications to complete these tasks.
  • Deep knowledge of data mining, machine learning, natural language processing, or information retrieval.
  • Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources.
  • Experience in the airline domain is a plus.


Preferred skills:


Python, Spark, Kafka, or Java

Closing on:

08/02/2021

Contact email:

[email protected]

What is Kafka?

Apache Kafka is an open-source stream-processing software platform developed by the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka can connect to external systems (for data import/export) via Kafka Connect and provides Kafka Streams, a Java stream processing library. Kafka uses a binary TCP-based protocol that is optimized for efficiency and relies on a “message set” abstraction that naturally groups messages together to reduce the overhead of the network roundtrip. This “leads to larger network packets, larger sequential disk operations, contiguous memory blocks […] which allows Kafka to turn a burst stream of random message writes into linear writes.”
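The batching idea behind the “message set” abstraction can be sketched in a few lines of plain Python. This is a toy illustration only, not the Kafka client API; the class and method names are invented for the example.

```python
# Toy sketch of message-set batching: instead of sending each message over
# the network individually, the producer accumulates messages into batches,
# turning many small writes into fewer large, sequential ones.
# Illustrative only -- not the real Kafka producer API.

class BatchingProducer:
    def __init__(self, batch_size=3):
        self.batch_size = batch_size
        self._buffer = []
        self.sent_batches = []  # stands in for network sends / disk writes

    def send(self, message):
        self._buffer.append(message)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self._buffer:
            # One "roundtrip" carries a whole message set.
            self.sent_batches.append(list(self._buffer))
            self._buffer.clear()

producer = BatchingProducer(batch_size=3)
for i in range(7):
    producer.send(f"event-{i}")
producer.flush()  # drain the partial final batch

# 7 messages were delivered in only 3 batches (3 + 3 + 1).
print(len(producer.sent_batches))  # 3
```

Seven individual sends become three batched transfers, which is the efficiency the quoted passage describes.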

Apache Kafka is built around the commit log: users can publish data to it and subscribe to it from any number of systems or real-time applications. Example applications include managing passenger and driver matching at Uber, providing real-time analytics and predictive maintenance for British Gas smart home products, and powering numerous real-time services across LinkedIn.
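The commit-log model above can be illustrated with a minimal in-memory version: an append-only list of records plus a per-consumer read offset. This is a simplified sketch with invented names; real Kafka partitions the log, persists it to disk, and replicates it across brokers.

```python
# Minimal sketch of a commit log with subscribers. Illustrative only.

class CommitLog:
    def __init__(self):
        self._records = []   # append-only log of published records
        self._offsets = {}   # consumer name -> next offset to read

    def publish(self, record):
        self._records.append(record)

    def subscribe(self, consumer):
        # New consumers start reading from the current end of the log.
        self._offsets.setdefault(consumer, len(self._records))

    def poll(self, consumer):
        # Return every record the consumer has not yet seen,
        # then advance its offset past them.
        start = self._offsets[consumer]
        new = self._records[start:]
        self._offsets[consumer] = len(self._records)
        return new

log = CommitLog()
log.subscribe("analytics")
log.publish({"user": "u1", "action": "login"})
log.publish({"user": "u2", "action": "click"})
print(log.poll("analytics"))  # both records, in publish order
print(log.poll("analytics"))  # [] -- the offset has advanced
```

Because each consumer tracks only an offset into a shared, ordered log, any number of independent applications can read the same stream at their own pace, which is what makes the Uber- and LinkedIn-style use cases above possible.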


