Data Engineering For Data Scientists

publish date

Dec 7, 2022

duration

44

min

Difficulty

Intermediate

Beginner

Case details

A high-level introduction to data engineering for data scientists. In this fast-paced talk, you’ll learn how adopting data engineering best practices and tools can improve your data science projects and help you deliver better, more reliable results in less time. We’ll discuss data architecture and design principles, and explore open source tools you can use today, including:

- Running Jupyter notebooks in production with Papermill and nbdev
- Improving data quality with Great Expectations, and monitoring models with Evidently.ai
- Writing unit tests for your pandas and Spark dataframes with pandera
- Writing reusable SQL with dbt, a data transformation tool that is changing how data teams work
- Orchestrating workflows with Apache Airflow, a better approach than fragile and frustrating cron jobs or Lambdas
- Version-controlling your data alongside your code with DVC

910 Foulk Road, Suite 201

Wilmington, DE 19803, USA

© 2025 Geekle. All rights reserved.
