Authors: Samhita Alla and David Espejo
Halfway through 2023, we’re thrilled to share Flyte’s latest developments, features, and enhancements. For those unfamiliar, Flyte is an open source orchestration platform designed for building data and machine-learning pipelines. Our mission is to provide you with the tools to construct seamless production-grade pipelines on a scalable and reliable infrastructure.
Since the beginning of the year, we have shipped a series of releases, integrated Flyte with other tools and platforms, grown our community, and delivered numerous enhancements and new features. Throughout, our focus has been on making Flyte more reliable, easier to use, and more robust.
Releases: 1.3.x to 1.8.x
From version 1.3.x to the latest 1.8.x release, Flyte has undergone continuous enhancements, introducing new features, improving performance, and addressing valuable user feedback. Highlights include:
- Human-in-the-loop tasks. With the introduction of gate nodes, Flyte lets you incorporate sleeps, approval requests, and input prompts within your workflows. Whether adding pauses for manual intervention or incorporating approval steps, human-in-the-loop tasks enable more dynamic and interactive workflows (see the gate-node sketch after this list).
- Simplified Flyte deployment. Flyte’s backend components have been consolidated into a single binary, making deployment smoother and more straightforward.
- PodTemplate at the task level. Define and customize pod specifications directly within task definitions, giving you fine-grained control over how task pods are scheduled and run (see the pod-template sketch after this list).
- Stream files and directories. Flyte supports streaming data from remote file systems, so you can read and write data without transferring entire files or directories. This improves performance, especially when dealing with large files or datasets (see the streaming sketch after this list).
- Data subsystem revamp. The data persistence layer has undergone a complete overhaul, with Flyte exclusively utilizing `fsspec` for handling input and output operations.
- Runtime metrics UI. The enhanced timeline view now breaks node executions down into time-series data, providing a more detailed understanding of workflow performance.
- ImageSpec. ImageSpec lets you define and build container images for Flyte tasks and workflows directly in Python, specifying just the components you need without writing a Dockerfile (see the ImageSpec sketch after this list).
- Flyte agents. This feature simplifies authoring, testing, and deploying backend plugins. With Flyte agents, you can easily harness the power of plugins without diving deep into Golang. These agents act as long-running stateless services, eliminating the need to create a new pod for each task, which reduces overhead and provides scalability benefits.
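To make a few of these features more concrete, here is a minimal sketch of a human-in-the-loop workflow built with gate nodes. The task and node names are illustrative; in practice the approval would be granted from the Flyte UI or `flytectl`.

```python
from datetime import timedelta

from flytekit import approve, task, wait_for_input, workflow


@task
def generate_report() -> str:
    return "quarterly numbers look good"


@task
def publish(report: str, title: str) -> str:
    return f"{title}: {report}"


@workflow
def review_wf() -> str:
    report = generate_report()
    # Pause until a human approves the report; approve() passes the
    # upstream value through once approval is granted.
    approved_report = approve(report, "approve-report", timeout=timedelta(hours=2))
    # Block until a human supplies a title for the published report.
    title = wait_for_input("report-title", timeout=timedelta(hours=1), expected_type=str)
    return publish(report=approved_report, title=title)
```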
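Next, a sketch of a task-level pod template, assuming the `kubernetes` Python client is available; the labels, annotation, and toleration are placeholders.

```python
from flytekit import PodTemplate, task
from kubernetes.client import V1Container, V1PodSpec, V1Toleration


@task(
    pod_template=PodTemplate(
        primary_container_name="primary",
        labels={"team": "ml-platform"},           # placeholder label
        annotations={"example.com/owner": "ml"},  # placeholder annotation
        pod_spec=V1PodSpec(
            containers=[V1Container(name="primary")],
            tolerations=[
                V1Toleration(key="gpu", operator="Exists", effect="NoSchedule")
            ],
        ),
    )
)
def train() -> str:
    return "trained"
```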
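Here is a rough sketch of file streaming with `FlyteFile.open()` (flytekit 1.5 or later); the uppercase copy logic is purely illustrative.

```python
from flytekit import task
from flytekit.types.file import FlyteFile


@task
def uppercase_copy(ff: FlyteFile) -> FlyteFile:
    # Stream the input into a new remote file without downloading the
    # whole object to local disk first.
    out = FlyteFile.new_remote_file()
    with ff.open("r") as src, out.open("w") as dst:
        for line in src:
            dst.write(line.upper())
    return out
```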
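And a minimal ImageSpec sketch; the registry and package pin are placeholders.

```python
from flytekit import ImageSpec, task

# The image is built from this spec at registration time -- no Dockerfile.
pandas_image = ImageSpec(
    python_version="3.10",
    packages=["pandas==2.0.3"],
    registry="ghcr.io/my-org",  # hypothetical registry
)


@task(container_image=pandas_image)
def summarize(n: int) -> float:
    import pandas as pd

    return float(pd.Series(range(n)).mean())
```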
You can access the detailed release notes on the Flyte blog.
Integrations
We’re excited to showcase a range of integrations developed to enhance Flyte’s capabilities.
- Dask. Spin up ephemeral Dask clusters within your Flyte workflows, similar to the existing support for Ray and Spark (see the Dask sketch after this list).
- PyTorch elastic training. Distributed training with PyTorch elastic (`torchrun`) is now supported, so you can configure and run elastic training tasks and scale your training workloads efficiently (see the elastic-training sketch after this list).
- Revamped Kubeflow plugins. Declare specifications such as images, resources, and restart policy for different replica groups. The refactored Kubeflow plugins include TFJob, MPIJob, and PyTorchJob.
- DuckDB. Run DuckDB queries from within your workflows using the `DuckDBQuery` Flyte task (see the DuckDB sketch after this list).
- Banana. Orchestrate machine learning pipelines using Flyte, enabling on-demand and scalable model serving with Banana.
- Databricks. Submit Spark jobs to Databricks via the Databricks Jobs API, building on the Kubernetes Spark integration.
- LangChain. Orchestrate and track your LangChain experiments using the FlyteCallback in a LangChain LLM, chain, or agent (see the callback sketch after this list).
- HuggingFace. Run your existing Hugging Face workflows on spot instances with the help of the FlyteCallback, striking a practical balance between cost efficiency and usability.
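To illustrate a few of these integrations, here is a minimal Dask sketch, assuming `flytekitplugins-dask` is installed and the Dask Kubernetes operator is enabled in your cluster; the worker count and resources are arbitrary.

```python
from flytekit import Resources, task
from flytekitplugins.dask import Dask, WorkerGroup


@task(
    task_config=Dask(workers=WorkerGroup(number_of_workers=4)),  # arbitrary size
    requests=Resources(cpu="1", mem="2Gi"),
)
def sum_range(n: int) -> int:
    from distributed import Client

    # The plugin provisions an ephemeral Dask cluster for this task, so a
    # bare Client() should connect to it inside the task environment.
    client = Client()
    return client.submit(sum, range(n)).result()
```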
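A bare-bones elastic training sketch with `flytekitplugins-kfpytorch`; real training code would build a model and wrap it in DistributedDataParallel.

```python
from flytekit import task
from flytekitplugins.kfpytorch import Elastic


@task(task_config=Elastic(nnodes=2, nproc_per_node=4))  # arbitrary sizes
def train_fn() -> int:
    import torch.distributed as dist

    # torchrun handles the rendezvous; every launched process joins this group.
    dist.init_process_group(backend="gloo")
    return dist.get_rank()
```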
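A `DuckDBQuery` sketch, assuming `flytekitplugins-duckdb` is installed; the query and input DataFrame are illustrative.

```python
import pandas as pd

from flytekit import kwtypes, task, workflow
from flytekitplugins.duckdb import DuckDBQuery

# The input DataFrame is queryable under its parameter name ("mydf").
top_rows = DuckDBQuery(
    name="top_rows",
    query="SELECT * FROM mydf WHERE score > 0.5",
    inputs=kwtypes(mydf=pd.DataFrame),
)


@task
def load() -> pd.DataFrame:
    return pd.DataFrame({"score": [0.2, 0.7, 0.9]})


@workflow
def wf() -> pd.DataFrame:
    return top_rows(mydf=load())
```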
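And a rough sketch of the LangChain callback, assuming a LangChain version that ships `FlyteCallbackHandler` and an `OPENAI_API_KEY` in the environment.

```python
from langchain.callbacks import FlyteCallbackHandler
from langchain.chat_models import ChatOpenAI

# Attach the Flyte callback so prompts, responses, and metrics from this
# LLM call are surfaced in the Flyte task that runs it.
llm = ChatOpenAI(callbacks=[FlyteCallbackHandler()], temperature=0)
llm.predict("Write a haiku about workflow orchestration.")
```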
Large Language Models
When developing applications based on Large Language Models (LLMs), efficient underlying infrastructure management is key to optimal performance and scalability. This is where Flyte truly excels. We’ve prepared a series of informative blog posts that delve into the intricacies of leveraging Flyte for large language model applications. These posts will provide valuable insights and guidance to help you maximize Flyte’s capabilities.
- Building FlyteGPT on Flyte with LangChain
- Orchestrate and Track Your LangChain Experiments with Flyte
- Fine Tuning vs. Prompt Engineering Large Language Models
- Fine-Tuning Insights: Lessons from Experimenting with RedPajama Large Language Model on Flyte Slack Data
LLMs can transform how we understand and generate human-like text, revolutionizing our approach to natural language processing and communication. These models have the potential to power groundbreaking applications, and we are committed to making significant contributions to this transformative field in the foreseeable future.
Community
The overarching goals of Flyte as a project are only feasible thanks to the collaborative nature of open source. Over the past 6 months, the Flyte community has discussed and started adopting a series of tools and processes to reduce barriers to contribution and ensure that the values of openness, inclusion, and autonomy are consistently cultivated:
- A new governance model defines roles and processes to foster ownership and collaboration among community members. The model also reintroduces the Technical Steering Committee, with members from different organizations overseeing the health and future development of the project.
- An updated RFC process, focused on transparency, ease of collaboration, and accountability, ensures that all ideas are heard and that accepted proposals have a path to implementation.
- New communication channels, including the bi-weekly contributor’s meetup and an announcements mailing list hosted by the AI & Data Foundation.
- Working Groups (WGs) and Special Interest Groups have been added to the governance model, with the first WG, the Config Overrides Working Group, already in operation.
Open source community health is an ongoing process, so we continue to listen to the community’s needs and make every effort to lower the barriers to access, collaboration, and contribution.
Events
We had the pleasure of hosting and speaking at numerous events in the first half of this year. Below, you will find some of the recordings from these insightful sessions:
- The Fundamentals of Type-safe, Reproducible, and Scalable Data Pipelines
- Training and Ensuring Reliability of ML Models at Wolt: The Power of ArgoCD, Flyte, and Argo Workflows
- The Python Data Ecosystem: Navigating a fragmented landscape
- Robust and End-to-End Cloud Native Machine Learning & Data Processing Platform
- Run Your Data and Machine Learning Workflows on Kubernetes with Flyte
- Fine-tuning Language Models with Declarative ML Orchestration
What lies ahead
At Flyte, our mission is clear: to empower every organization to build scalable and reliable machine learning and data platforms. Informed by practical use cases collected from users across diverse industries, Flyte addresses the many facets of building such platforms to provide a comprehensive solution.
We understand that many infrastructure teams are in the process of migrating or considering a migration to Kubernetes in the near future. This is where Flyte truly shines. By leveraging Kubernetes, Flyte enables teams to accelerate the development of machine learning and data products while remaining compliant with the rest of the organizational infrastructure.
Looking ahead, we are committed to making the orchestration of large language models seamless with Flyte. Our goal is to empower machine learning practitioners to run these models at scale without having to wrestle with scaling complexities themselves. We are actively working on turnkey integrations and interfaces to make this process as smooth as possible.
Meanwhile, we’re laser-focused on creating the best possible documentation experience, so every user can navigate and harness Flyte’s capabilities effortlessly. We also want to make Flyte as performant and reliable as possible by continually optimizing its features.
Furthermore, we want to simplify the contribution process to Flyte. We value community collaboration and actively encourage developers to contribute and participate in shaping the future of Flyte.
Although we have achieved significant milestones, our journey is far from over. We invite you to join us on this exciting path. If you share our enthusiasm for Flyte, its vision, and growth, we encourage you to connect with us on our Slack channel and follow us on Twitter. To learn more about Flyte, visit our repository and explore our documentation.
Together, let’s shape the future of scalable and reliable machine learning and data platforms with Flyte.