
AI/Big Data DevOps Engineer

Quick summary

  • AI at scale
  • 100+ TBs of data
  • 100+ servers
  • Main technology stack for this role: CentOS, Kafka, Flink, Druid, Hadoop, Spark, Node.js, Zookeeper, MongoDB
  • Key tools: Ansible, Jenkins, Docker, Kubernetes, Bash, Nagios, Telegraf, InfluxDB, Git, Bitbucket, Jira, Confluence
  • Level: Senior (more than 5 years of experience)
  • Location: Warsaw, Poland

About Deep BI, Inc.

Deep.BI is a data platform for media companies. It can save up to 95% of the cost of building and maintaining an in-house big data solution, as well as years of development time.

Data plays a fundamental role in every aspect of media, including:

  • new product development
  • increasing audience engagement
  • monetization (subscription, ads, branded content)

Deep.BI makes data collection, integration, storage, analytics and usage easy. It removes the complexity of implementing big data technology and thus minimizes risk and cost. We use a modern, real-time stack including Node.js, Kafka, Flink & Druid. We built our own highly available (HA), hybrid data cloud (now ~400 cores) and we're scaling it horizontally.
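To give a flavour of the ingestion edge of such a stack, here is a minimal, purely illustrative sketch of a Node.js service publishing tracking events to Kafka, written in TypeScript with the kafkajs client. The broker address, topic name and event shape are assumptions for the example, not Deep.BI's actual setup; in the real pipeline a stream processor such as Flink would consume the topic and Druid would ingest the results.

    import { Kafka } from "kafkajs";

    // Illustrative only: broker list, topic name and event fields are assumptions.
    const kafka = new Kafka({ clientId: "tracking-ingest", brokers: ["localhost:9092"] });
    const producer = kafka.producer();

    interface PageViewEvent {
      userId: string;
      url: string;
      timestamp: number;
    }

    // Publish a single page-view event to the "pageviews" topic, keyed by user
    // so that events for one user land on the same partition.
    async function publish(event: PageViewEvent): Promise<void> {
      await producer.connect();
      await producer.send({
        topic: "pageviews",
        messages: [{ key: event.userId, value: JSON.stringify(event) }],
      });
      await producer.disconnect();
    }

    publish({ userId: "u-123", url: "https://example.com/article", timestamp: Date.now() })
      .catch(console.error);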

We also experiment with a conversational user interface for our analytics platform, where customers get insights from chatbots. As a next step, we are working on bot-to-bot communication to automate processes (RPA, Robotic Process Automation).

We're a young startup and a small team of enthusiasts, with solid financing from well-known business angels and our first big media customers from the US and Europe.

We invite the best, passionate people. Let's talk and find out if there's a fit.

Responsibilities

In this role you will build and operate a variety of distributed data delivery and processing systems such as Kafka, Flink, Spark, Hadoop and more. Your key focus is on creating scalable artificial intelligence data pipelines. You will face challenges around large scale, complex upgrades, and the automation and deployment of services in micro-service, multi-tenant environments. You will also solve complex system issues and continuously improve the performance of the services.

  • Building artificial intelligence training and prediction data pipelines
  • Development and maintenance of Deep.BI platform infrastructure
  • Handling hundreds of terabytes of data
  • Managing 100+ servers
  • Cloning our current hybrid cloud infrastructure to datacenters on different continents
  • Providing architectural solutions for complex data issues resulting from large scale and rapid growth
  • Building and maintaining high-performance, fault-tolerant, scalable distributed software systems
  • Continuous improvement of operation processes and procedures, focusing on engineering approach and automation tools development
  • Maintaining and scaling the GPU-equipped data science cluster

Required qualifications

  • 5 years of experience administering large-scale Linux-based clusters
  • Minimum 3 years of hands-on experience maintaining Kafka, Flink or Spark, and Hadoop
  • Desire to build a great product used globally
  • Curiosity and a desire to continue learning new processes and technologies
  • Ability to work independently and as part of a team
  • Good interpersonal and communication skills
  • Intellectual curiosity, along with excellent problem-solving skills, including the ability to disaggregate issues, identify root causes and recommend solutions

Preferred skills

  • Familiarity with the following technologies would be an advantage: Druid, Node.js, Zookeeper, MongoDB, Parquet, MySQL
  • Experience in working with cloud environments (AWS, Azure, Google Cloud Platform) as well as private data centers
  • Experience with large scale deployments using Docker and Kubernetes or similar technologies
  • A university degree in Computer Science or a related engineering field

Our offer

  • Salary: 13-20k PLN (depending on experience, different types of contract available) + paid holidays (20 or 26 days)
  • Work in a young startup with solid financing, among passionate and friendly people
  • Private medical care
  • Stock option plan
  • Flexible working hours, possibility of occasional remote work
  • Each member of the team has real influence on the product, a state-of-the-art big data & AI platform
  • Great office location: a beautiful co-working space on Senatorska Street