Data Engineer
We Are:
At Data Society, we provide bespoke, leading-edge data and AI solutions for Fortune 1,000 companies and federal, state, and local government organizations. We partner with our clients to educate, equip, and empower their workforces with the skills they need to achieve their goals and expand their impact. We are building the workforces of the future, from data literacy for every employee to training that equips data engineers and data scientists with the most advanced AI and machine learning skills.
Job Summary:
The Data Engineer is responsible for building scalable, performant data pipelines that power critical operational and analytical applications. The engineer works closely with our data science teams to run ML/AI models on top of enterprise-scale data and builds the supporting data scaffolding to orchestrate, test, and monitor data systems. This is a customer-facing role within a cross-functional team, so the ability to manage timelines, work both autonomously and collaboratively, and communicate effectively is a must. Because this position supports federal contracts with security requirements, you must be a US citizen to qualify.
Responsibilities:
- Build full data pipelines spanning ingestion, processing/transformation, loading, visualization, and analysis
- Design and manage large-scale data warehouses, lakehouses, and/or data marts
- Build and optimize data transformation pipelines using tools like dbt to support data flow from ingestion through analytics
- Champion data governance principles and quality standards, ensuring data lineage, documentation, and metadata are maintained
- Create efficient, performant SQL-based data queries and Python-based data processing jobs
- Balance computational load, performance, and cost across pipelines and workloads
Skills you bring:
- Advanced degree in Statistics, Applied Mathematics, Data Science, Computer Science, Operations Research, or another closely related quantitative or mathematical discipline.
- 5+ years of data and analytics engineering in cloud environments.
- Expertise in SQL, Python, and schema design, with experience in data cataloging and governance tools.
- Experienced with data transformation and ETL best practices.
- Experienced with data orchestration tools like Airflow, transformation frameworks like dbt, and cloud deployment tools like Terraform.
- Demonstrated exceptional oral and written communication skills.
- The ability to work independently and in a team environment.
- The ability to work effectively across functions, levels and disciplines.
- Strong problem solving and critical thinking skills.
- Superior teamwork skills and a desire to learn, contribute, and explore.
- Experience with Snowflake, Databricks, Kafka, Flume, Spark, or Flink is a plus.
Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.
This position is remote but based out of the Washington, DC area, with travel to client sites in DC as needed.