Testing and documenting data

Intro

Luca Gilli, CTO and founder of Clearbox.AI, shares tips on making data pipelines more robust by writing unit tests and documentation with the open-source library great_expectations.

We then look at how data profiling can optimize test creation, drawing on the experience gained while developing the StructuredDataProfiling library.

Material

📚 Meetup material:

Github

📚 great_expectations website and repository:

➡️ https://greatexpectations.io/

➡️ https://github.com/great-expectations/great_expectations

📚 StructuredDataProfiling repository:

➡️ https://github.com/Clearbox-AI/StructuredDataProfiling

Meetup video