Data Sovereignty in Vector Stores: A Deep Dive into Qdrant’s Hybrid Cloud
Learn how Qdrant's Hybrid Cloud offers data sovereignty and compliance by unifying cloud, on-premises, and edge environments into a single service.
With the rise of RAG applications in enterprises, data compliance has become a key concern. Much of the data that powers enterprise RAG applications is sensitive and comes with various compliance requirements. Therefore, when enterprises build AI solutions, they need a stack that lets them control the infrastructure so that they comply with the rules and regulations of each geography they operate in. This is especially true when the AI application uses a vector store, because the data must be carefully protected both in transit and at rest.
To address these concerns, Qdrant has recently launched a Hybrid Cloud offering. This solution provides an alternative path where organizations do not need to share any data or API keys with Qdrant, yet can efficiently manage their vector database.
What’s interesting is that Qdrant Hybrid Cloud has capabilities similar to Qdrant’s own cloud platform. It uses Kubernetes clusters to unify cloud, on-premises, and edge environments into a single, enterprise-grade managed service.
In order to understand how the Qdrant Hybrid Cloud works, I decided to try it out. In this article, I will walk you through my experience of using it end to end. Let’s dive in.
How It Works
When an enterprise onboards a Kubernetes cluster as a Hybrid Cloud Environment, they can deploy the Qdrant Kubernetes Operator and Cloud Agent into this cluster. These components manage Qdrant databases within the organization’s Kubernetes cluster and establish an outgoing connection to Qdrant Cloud at cloud.qdrant.io on port 443.
This outgoing connection lets the cluster use the same cloud management features, and transport the same telemetry, as any managed Qdrant Cloud cluster.
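Since everything depends on that outgoing connection, it can be worth confirming that the cluster's network allows it before onboarding. The following is a minimal Python sketch, not part of Qdrant's tooling: the function name `can_reach` and its `connect` parameter are my own, and it only checks that a TCP connection to cloud.qdrant.io on port 443 can be opened. It does not validate TLS, proxies, or anything the Operator and Cloud Agent do on top of that.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0,
              connect=socket.create_connection) -> bool:
    """Return True if an outbound TCP connection to host:port can be opened.

    `connect` is injectable (an assumption of this sketch) so the check
    can be exercised without touching the network.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
    conn.close()
    return True
```

Run from a machine or pod inside the cluster's network, `can_reach("cloud.qdrant.io", 443)` returning `False` would suggest that a firewall or egress policy is blocking the connection.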
Platform Deployment Options
Qdrant Hybrid Cloud supports deployment on various managed Kubernetes platforms, including but not limited to:
Akamai (Linode)
Amazon Web Services (AWS)
Civo
DigitalOcean
Google Cloud Platform
Microsoft Azure
Oracle Cloud Infrastructure
OVHcloud
Red Hat OpenShift
Scaleway
STACKIT
Vultr
Each platform has specific prerequisites and installation steps, which are detailed in the Qdrant Hybrid Cloud Setup Guide.
Setting Up the Hybrid Cloud
In this post, I will show you how to set up Qdrant’s Hybrid Cloud environment using a DigitalOcean Kubernetes cluster.
Set Up the Kubernetes Cluster in DigitalOcean
To start with the Qdrant Hybrid Cloud setup, you’ll need a Kubernetes cluster. This can be deployed on any cloud platform, on-premises, or in an edge environment. In this example, we’re using DigitalOcean to create and manage our Kubernetes cluster.
Go to DigitalOcean and fire up a Kubernetes cluster. Once it is provisioned, the cluster shows up in my DigitalOcean account.
After deploying the Kubernetes cluster, the next step is to verify all its components to ensure everything is set up correctly.
We can see that our cluster is now up and running. It should be noted that this cluster is not tied to any Qdrant infrastructure yet. We will now integrate this DigitalOcean Kubernetes cluster with the Qdrant Hybrid Cloud infrastructure.
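If you would rather script this check than inspect the cluster by hand, here is a minimal Python sketch. It assumes `kubectl` is installed and already pointed at the DigitalOcean cluster; the function names are my own. It reads the output of `kubectl get nodes -o json` and lists any node whose `Ready` condition is not `True`.

```python
import json
import subprocess


def not_ready_nodes(nodes_json: str) -> list[str]:
    """Return the names of nodes whose Ready condition is not "True"."""
    names = []
    for node in json.loads(nodes_json).get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(node["metadata"]["name"])
    return names


def fetch_nodes_json() -> str:
    """Call kubectl (assumed to be on PATH and configured for the cluster)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

An empty list from `not_ready_nodes(fetch_nodes_json())` means every node reports `Ready`, which is a reasonable baseline before connecting the cluster to Qdrant.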
Set Up a Hybrid Cloud Environment on Qdrant
To integrate our Kubernetes cluster with the Qdrant Hybrid Cloud infrastructure, we’ll navigate to Qdrant’s Dashboard and access the Hybrid Cloud section.
We will then create a Hybrid Cloud Environment. We will need to enter the name of the Hybrid Cloud Environment as well as the Kubernetes namespace for the Qdrant components. Once set, a namespace with that name will appear on our DigitalOcean cluster. For now, we will leave the remaining settings at their defaults. You can experiment with different configurations; to learn more about the advanced setup, see the Qdrant Hybrid Cloud Setup Guide.
Once we’ve created the environment, we will be provided with a one-time installation command that we need to execute in our DigitalOcean cluster. To preserve data sovereignty, Qdrant never receives any API keys for our cluster, which is why we have to run this installation command ourselves.
The dashboard generates this one-time command specifically for our DigitalOcean cluster.
Configure Your DigitalOcean Cluster with the Hybrid Cloud Environment of Qdrant
Now, we will run this one-time installation command in the terminal.
Our cluster, which is deployed on DigitalOcean, is now integrated with the Hybrid Cloud Environment of Qdrant. We can verify this by looking at the last four namespaces in the cluster's namespace list.
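One way to make this verification repeatable is to snapshot the namespace list before and after running the installation command and compare the two. The sketch below assumes the snapshots come from `kubectl get namespaces -o name`, which prints one `namespace/<name>` entry per line. The function names are my own, and the namespace names in the usage note are hypothetical.

```python
def namespace_names(kubectl_output: str) -> set[str]:
    """Parse `kubectl get namespaces -o name` output into bare names."""
    return {
        line.split("/", 1)[1].strip()
        for line in kubectl_output.splitlines()
        if "/" in line
    }


def added_namespaces(before: str, after: str) -> list[str]:
    """Return namespaces present in `after` but not in `before`, sorted."""
    return sorted(namespace_names(after) - namespace_names(before))
```

For example, if the "before" snapshot contains `namespace/default` and `namespace/kube-system`, and the "after" snapshot additionally contains `namespace/qdrant`, then `added_namespaces(before, after)` returns `["qdrant"]`. The namespace you entered when creating the Hybrid Cloud Environment should be among the results.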
The dashboard waits for you to run the above command in the cluster. Once that is done, you can go ahead and click on the Continue button.
Create New Clusters on DigitalOcean from Hybrid Cloud
Now, let’s go ahead and create Qdrant clusters on DigitalOcean from the Hybrid Cloud itself. First, we’ll check that every component reports a ready state; if so, we’ll proceed to create the new clusters.
Next, we will choose the hardware specs of the cluster to be created on our DigitalOcean infrastructure, directly from the Qdrant dashboard.
Once that is done, we can see that our cluster is firing up. All of this is happening inside our DigitalOcean cluster; we are only managing it from Qdrant’s Hybrid Cloud Environment.
We can also verify this from the terminal.
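A scripted version of that terminal check could look like the sketch below. It assumes you feed it the output of `kubectl get pods -n <namespace> -o json`, where the namespace is the one chosen for the Hybrid Cloud Environment; the function name and the choice to treat `Running` and `Succeeded` as healthy are my own. Keep in mind that a `Running` phase does not guarantee that every container has passed its readiness probe.

```python
import json

# Pod phases that this sketch treats as healthy.
HEALTHY_PHASES = {"Running", "Succeeded"}


def unhealthy_pods(pods_json: str) -> dict[str, str]:
    """Map pod name to phase for every pod that is not in a healthy phase."""
    problems = {}
    for pod in json.loads(pods_json).get("items", []):
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase not in HEALTHY_PHASES:
            problems[pod["metadata"]["name"]] = phase
    return problems
```

An empty dictionary means every pod in the namespace has at least reached the `Running` phase, which should agree with what the Qdrant dashboard reports.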
Finally, we can see from the Qdrant dashboard that the cluster is up and running.
Now, using this endpoint, we can connect to our Qdrant vector database and build our RAG application as usual.
If you’re also interested in learning how to build a RAG application, you can follow my other blog post, where I built a chatbot using the RAG Stack. You could also try integrating Qdrant’s Hybrid Cloud and RBAC into that RAG-powered chatbot.
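As a quick smoke test of the endpoint, we can ask Qdrant's REST API for the list of collections. The sketch below uses only the Python standard library and rests on a few assumptions: that the endpoint is reachable from wherever the script runs, that it is passed as a full base URL (6333 is Qdrant's default REST port), and that an API key is needed only if authentication is enabled on the cluster. The function names and the example URL are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def build_collections_request(endpoint: str,
                              api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for Qdrant's /collections REST route."""
    url = endpoint.rstrip("/") + "/collections"
    # Qdrant expects the key in the "api-key" header when auth is enabled.
    headers = {"api-key": api_key} if api_key else {}
    return urllib.request.Request(url, headers=headers, method="GET")


def collection_names(response_body: str) -> list[str]:
    """Extract collection names from a /collections JSON response."""
    payload = json.loads(response_body)
    return [c["name"] for c in payload["result"]["collections"]]


def list_collections(endpoint: str, api_key: Optional[str] = None,
                     timeout: float = 10.0) -> list[str]:
    """Send the request and return the collection names."""
    request = build_collections_request(endpoint, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return collection_names(response.read().decode("utf-8"))
```

On a freshly created cluster, a call such as `list_collections("http://qdrant.example.internal:6333")` should return an empty list, which is enough to confirm that the database is answering before you start building on it.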
Future Notes
In this post, I explored how organizations can manage their data for RAG-based applications efficiently without that data ever leaving their infrastructure. If you are looking to build a data-sovereign architecture for your AI application, do give it a spin!
References
This article was originally published on: https://quamernasim.medium.com/data-sovereignty-in-vector-stores-a-deep-dive-into-qdrants-hybrid-cloud-b9e47aa163f7