What is our primary use case?
My main use case for Dagster Labs is data pipeline orchestration.
A specific example is running data models on a schedule. Dagster integrates with dbt, a data modeling tool, so we can schedule dbt pipelines and connect them with other tools. This integration is particularly useful. On the monitoring side, we can add data quality monitors that run on a schedule, and Dagster Labs integrates with Slack for alerts.
We also use Dagster Labs to integrate with AWS components such as AWS Batch jobs. We use Dagster Pipes to connect those jobs to our orchestration runs within Dagster, even though the processing itself runs in AWS services. This is not unique, but it is one of our use cases. Another use case is triggering remote Spark jobs, which is also part of data pipeline orchestration and is one of Dagster Labs' official integrations.
What is most valuable?
Dagster Labs offers excellent features, including many out-of-the-box integrations with all the major players in the data landscape. Another great feature is the polished UI and developer experience. Unlike some other products with clunky UIs, Dagster Labs provides a pleasant single-pane-of-glass experience. It is easy to glance at pipeline health and asset lineage, and everything important for a data engineer is one or two clicks away. It is also an extensible product, so power users can easily adapt the available libraries and code to fit their use cases.
A standout feature is the dbt integration, which is probably well known and regarded as the industry's top integration. It is what prompted my organization, and others where I have worked, to use Dagster Labs.
Dagster Labs has positively impacted my organization mainly in cost management, by allowing us to move away from a more locked-down tool, dbt Cloud. We migrated all our dbt Cloud pipelines to Dagster Labs, which saved us money.
I do not see a reduction in employee hours or an improvement in efficiency, since users were accustomed to dbt Cloud and it allowed more business users to interact with the tool. With Dagster Labs, the savings are primarily raw savings from the lower platform price, and they come at the cost of more limited access to the previous pipelines.
What needs improvement?
One point of improvement for Dagster Labs would be adding integrations with other great tools such as SQLMesh, which many people are interested in, as shown by its traction and the frequent requests on GitHub. Additionally, the billing for Dagster Labs Cloud Plus is somewhat restrictive regarding the number of seats, which often pushes users towards the enterprise plan. This is a concern for mid-sized companies.
For how long have I used the solution?
I have been using Dagster Labs since 2021.
What do I think about the stability of the solution?
Dagster Labs is stable, with a really good release cycle.
What do I think about the scalability of the solution?
Dagster Labs scales with your compute environment, and Dagster Labs Cloud allows for serverless scaling. There are limitations on workload size, but moving to your own cloud platform resolves the scalability issues, and Kubernetes is particularly well suited for this.
How are customer service and support?
I only used customer support while setting things up and have not needed to reach out since. My experience was good, and the support team was responsive.
Which solution did I use previously and why did I switch?
Previously, we used dbt Cloud, which is a tool designed primarily for dbt and has even worse pricing, before switching to Dagster Labs. Before that, we also used open source alternatives such as Prefect and Airflow, which are acceptable tools, with Airflow being the industry standard. However, I believe Dagster Labs is moving in the right direction.
How was the initial setup?
My experiences with pricing, setup cost, and licensing vary. My current organization opted for Dagster Labs open source, and we are considering moving to the cloud offering. In a previous consultancy role, we selected the standard plan. The pricing was acceptable, but the number of seats became limiting, pushing us to choose a simpler plan. The setup cost was negligible because I had prior experience setting up Dagster Labs, and licensing was not a major concern for us.
What was our ROI?
It is difficult to quantify a return on investment. I am an individual contributor, not on the leadership side of the organization, so I do not have access to the relevant metrics.
Which other solutions did I evaluate?
Before choosing Dagster Labs, we evaluated Airflow, specifically its AWS offering, and also looked at Prefect in other roles.
What other advice do I have?
I would rate Dagster Labs an eight out of ten. I chose an eight out of ten mainly due to the pricing and the way it handles seats, which can be unwelcoming for smaller organizations. As a product, it is one of the best I have had the pleasure to work with.
My advice for others looking into Dagster Labs is to be wary of the pricing model. Getting the most out of Dagster Labs depends on a solid grasp of the tool and its extensibility, which lets users expand their use cases creatively while avoiding overcomplicated pipelines. Dagster Labs' documentation offers a solid foundation to begin working.
I think it is crucial for Dagster Labs to keep maintaining the open source solution, as I believe it is a good gateway for new customers and essential for the data landscape, despite the for-profit nature of the company.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)