12 Comments
Alejandro Aboy

Amazing! Have you tried something like this in prod? I wonder how it could be enriched with GitHub MCP, or even with Jira ticket creation for bug reporting, for example.

Andres Vourakis

Yes, this is one of the first AI workflows I built early on, but I use it primarily during analysis, not as a way to detect issues within our data pipeline.

What I presented here is a lightweight version that could be customised for specific use cases.

Madison Mae

This is so cool!

Andres Vourakis

Thank you!

Jonah Wiener-Brodkey

Could you theoretically connect to a database instead of uploading a CSV?

Andres Vourakis

100%. You can use an MCP server to connect to your database of choice and bring in your data that way 👌

Mikhail Mikushin

I see in the result that rows with missing values are removed or imputed. Do you specify which imputation method is applied here? It can significantly change the distribution of values.

Andres Vourakis

You can specify that in the main prompt or in the additional user instructions. It’s up to you which method you prefer.

Jose Parreño Garcia

We have done something similar, where our stack is:

Streamlit interface

Connection to Databricks Genie for text-to-SQL

Connection to OpenAI for reasoning and text-to-Plotly

Andres Vourakis

Awesome! I'd like to hear more. Are you writing about it anytime soon?

I'm planning to share more AI workflows I'm currently using at work.

Anuj

Nice build, will try this today.

Andres Vourakis

Awesome, let me know how it goes!
