
Automating manual and repetitive data engineering tasks with AI
Datafold is the data engineering automation platform. A major part of our business is delivering large-scale data engineering projects, such as data platform migrations, with AI, at a fixed price and on a guaranteed timeline. We are not a typical services provider or SI. We are a venture-backed software company that reimagined automation from the ground up and has been delivering projects up to 6x faster than alternatives.
We partner heavily with leading data platforms, including Databricks and Snowflake, to deliver large-scale complex migrations.
Datafold is hiring a Forward Deployed Data Engineer to own the delivery of our AI-automated data platform migration projects. This is a high-ownership, customer-facing role at the intersection of data engineering, AI, and project leadership.
As a Forward Deployed Data Engineer leveraging Datafold's AI technology, you will own migration projects end-to-end: gain a deep understanding of the customer's data platform, manage project execution, and remove blockers along the way so that each migration is delivered on time.
Using our proprietary AI tooling to plan and deliver migration projects, you will do work equivalent to that of a full team of consultants.
Own 1–4 concurrent migration projects end-to-end: scoping, planning, execution, and customer handoff
Be the primary customer contact: run weekly check-ins, manage stakeholder expectations, and escalate risks early before they compound
Configure Datafold's Migration Agent and oversee the migration execution
Partner with Datafold's engineering team to execute migrations
Help refine and scale our product and delivery playbook as the team grows
3–6 years in data consulting, professional services, or a customer-facing data engineering role
Excellent communication skills — equally comfortable in an exec check-in and a technical design session
Strong grasp of the modern data stack: dbt, Snowflake, Databricks, orchestration tools, and major patterns (stored procedures, streaming, incremental processing)
Extreme ownership mentality — you identify, surface, and fix problems and rally the team to help without being told
AI power user who uses AI every day and keeps learning how to use it more effectively
Exposure to legacy data stack and patterns (ETL, stored procedures, etc.) and data platform migration projects is a strong plus
Datafold is an equal opportunity employer and does not discriminate against any employee or applicant for employment based on race, color, religion, sex, national origin, age, disability, genetic information, sexual orientation, gender identity, marital status, military status, or any other protected characteristic. We are committed to providing equal employment opportunities to all individuals. We strive to create an inclusive and diverse work environment where all employees are valued and unique perspectives are respected and celebrated.
About Datafold
At Datafold, we build tools for data practitioners to automate the most error-prone and time-consuming parts of the data engineering workflow: testing data to guarantee its quality. While data quality (just like software quality) is a complex and multifaceted problem, we draw from decades of our team’s combined experience in the data domain to build opinionated tools our users love. Specifically, we believe that:
Data quality is a byproduct of a great data engineering workflow. That means, rather than building yet-another-app for data practitioners to switch to and from, we insert our tools in the existing workflows, for example, in CI/CD for deployment testing and IDEs for testing during development.
Data quality issues should be addressed before deploying the code. Most data quality issues are bugs in the code that processes data, and applying a proactive, shift-left approach is the most effective way to achieve high shipping velocity and data quality simultaneously.
Lack of metadata (data about data) is the biggest gap in the data engineering workflow. We bring powerful tools such as data diffing and column-level lineage to every data engineer’s workflow to help them validate the code and underlying data and fully understand the dependencies in complex data pipelines.
Datafold is used by data teams at Patreon, Thumbtack, Substack, and AngelList, among others, and has raised $22M from YC, NEA & Amplify Partners.