As a Data Engineer, you will develop innovative software using distributed data processing frameworks and techniques. Ocelot Data Engineers define and build data pipelines that enable our clients to make faster, better, data-informed business decisions. You will work in a team environment with software engineers, analysts, and data scientists, with the opportunity to mentor colleagues on your own team and across other engineering teams.
- Primary required skills: Python, unit testing, AWS, Terraform, Bash scripting, and Spark on AWS (Glue/EMR).
- Hands-on experience implementing, debugging, identifying performance bottlenecks, and fine-tuning batch and real-time big data integration frameworks in private or public cloud using various technologies (Azure Databricks, Hadoop, Spark, Kafka, AWS EMR, etc.).
- Experience applying principles, best practices, and trade-offs of schema design to various types of database systems: relational (Oracle, Postgres, MySQL, etc.), NoSQL (HBase, DynamoDB, MongoDB, etc.) and in-memory (ElastiCache) with understanding and proficiency in data manipulation techniques.
- Experience designing efficient ETL infrastructure that ingests data from a variety of sources.
- Experience in one or more general purpose programming languages (Java, Scala, Python, etc.).
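To illustrate the kind of Python and unit-testing work the role involves, here is a minimal sketch of a pipeline transform written as a pure function so it can be tested without any Spark or AWS infrastructure. The function name and record fields are hypothetical examples, not part of this posting.

```python
def normalize_records(records):
    """Drop records missing an 'id' and normalize the 'email' field.

    Hypothetical example: keeping transforms as pure functions like
    this makes them easy to unit test before wiring them into a
    Spark/Glue job.
    """
    cleaned = []
    for rec in records:
        if rec.get("id") is None:
            continue  # skip records without a primary key
        out = dict(rec)
        email = out.get("email")
        if isinstance(email, str):
            out["email"] = email.strip().lower()  # canonicalize casing/whitespace
        cleaned.append(out)
    return cleaned
```

A transform in this shape drops straight into a `unittest` or `pytest` suite, and the same logic can later be applied per-partition in a Spark job.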
Nice to have
- Experience implementing a data lake architecture.
- Experience with cloud-based data workflow orchestration services (AWS Data Pipeline, GCP DataFlow, Azure Data Factory).
- Experience with Business Intelligence platforms.
- Knowledge of API development (proper microservice separation, HTTP verb usage) and distributed microservice architectures providing elasticity, redundancy, failover, and intelligent routing.
- Familiarity with DevOps practices, specifically an understanding of OS and container management (Docker, Kubernetes, Cloud Foundry).