What you build
A user describes what they want to watch in natural language (“a suspenseful space movie”) and the system finds the best matches from the database, optionally filtered by genre, year, or minimum rating. The diagram below shows how data flows from raw movie records through embedding and into a searchable vector store.
Prerequisites
Before starting, make sure the following are in place.
- Python 3.10 or later.
- pip available in your environment (verify with pip --version).
- A virtual environment activated (recommended: python -m venv .venv && source .venv/bin/activate).
- An Actian VectorAI DB server running (default: localhost:6574).
- Internet access on first run: sentence-transformers downloads the embedding model (all-MiniLM-L6-v2, approximately 90 MB) from Hugging Face when you first call SentenceTransformer(EMBED_MODEL).
- At least 512 MB of free memory to load the embedding model.
Step 1: Install dependencies
The following command installs the Actian VectorAI SDK and the sentence embedding library. Run it inside your virtual environment.
| Package | Purpose |
|---|---|
| actian-vectorai-client | Official Python SDK: async/sync clients, Filter DSL, gRPC transport. |
| sentence-transformers | Open-source library for generating text embeddings. |
Step 2: Import libraries and configure
The following snippet imports every class needed for this tutorial and sets three constants that identify the server address, collection name, and embedding model. Running it loads the model into memory and prints the resolved configuration so you can confirm the values before proceeding.
| Import | Purpose |
|---|---|
| AsyncVectorAIClient | Manages the gRPC connection to VectorAI DB. |
| Distance | Enum for similarity metrics (Cosine, Dot, Euclid). |
| Field | Builds type-safe conditions on payload fields. |
| FilterBuilder | Combines conditions with boolean logic (AND / OR / NOT). |
| PointStruct | A data point: ID + vector + payload (metadata). |
| VectorParams | Configuration for the vector space: dimension + distance. |
| HnswConfigDiff | Tuning parameters for the HNSW search index. |
Expected output
The model loader may print a BertModel LOAD REPORT warning about embeddings.position_ids marked as UNEXPECTED. This can be safely ignored: it is a known artefact of loading sentence-transformer weights and does not affect embedding quality.
Step 3: Connect to the server
The following snippet opens a gRPC connection to the server, calls health_check(), and prints the server’s version information. If the connection fails, an exception is raised inside the async with block and the error message identifies the problem.
Expected output
localhost:6574.
When check_connection() runs, the async with AsyncVectorAIClient(...) block manages the gRPC connection lifecycle. The client opens a channel to SERVER, runs the coroutine body including health_check(), and closes the channel when the block exits, so resources are released even if something fails. The sequence is as follows.
- AsyncVectorAIClient(url=SERVER) creates a client instance.
- async with opens a gRPC channel and verifies the server is reachable.
- health_check() pings the server and returns status information.
- When the async with block exits, the connection is closed cleanly.
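The lifecycle above can be modelled without a running server. The following is a minimal sketch that uses a hypothetical `DemoClient` class (a stand-in written for this example, not the Actian SDK client) to show why `async with` releases the connection even when the body raises.

```python
import asyncio


class DemoClient:
    """Hypothetical stand-in for a gRPC client; not the Actian SDK class."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.events: list[str] = []

    async def __aenter__(self) -> "DemoClient":
        self.events.append("open")    # a real client would open its channel here
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        self.events.append("close")   # runs on normal exit and on error
        return False                  # do not swallow exceptions

    async def health_check(self) -> dict:
        self.events.append("health_check")
        return {"status": "ok"}       # illustrative payload only


async def check_connection(client: DemoClient, fail: bool = False) -> None:
    async with client:
        await client.health_check()
        if fail:
            raise RuntimeError("simulated failure inside the block")


ok_client = DemoClient("localhost:6574")
asyncio.run(check_connection(ok_client))

bad_client = DemoClient("localhost:6574")
try:
    asyncio.run(check_connection(bad_client, fail=True))
except RuntimeError:
    pass  # the error still propagates; only the cleanup is guaranteed

print(ok_client.events)   # ['open', 'health_check', 'close']
print(bad_client.events)  # ['open', 'health_check', 'close']
```

Both runs record the same three events, which is the property the tutorial relies on: the close step is not skipped when something inside the block fails.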
Step 4: Create a collection
A collection is a named container for vectors. Think of it as a table in a relational database, but optimized for similarity search. The following snippet calls get_or_create, which creates the collection if it does not already exist. On subsequent runs it reuses the existing collection without error.
| Parameter | Value | Meaning |
|---|---|---|
| size=384 | Vector dimension | Must match the embedding model’s output dimension. |
| distance=Distance.Cosine | Similarity metric | Cosine similarity is ideal for sentence transformers. |
| m=16 | HNSW graph connections | Each node connects to 16 neighbours, which balances speed and recall. |
| ef_construct=128 | Build-time search width | Higher values improve index quality at the cost of build time. |
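To make the distance and size rows concrete, here is a small plain-Python sketch of cosine similarity. It is an illustration of the metric itself, not of how the server computes it, and the three-element vectors are toy values (the tutorial's model produces 384 values per vector).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


v = [1.0, 2.0, 3.0]
scaled = [2.0 * x for x in v]    # same direction, twice the length
orthogonal = [3.0, 0.0, -1.0]    # dot product with v is 0

print(round(cosine_similarity(v, scaled), 6))      # 1.0
print(round(cosine_similarity(v, orthogonal), 6))  # 0.0
```

The score depends on direction only, so scaling a vector does not change it. The length check mirrors the reason the configured dimension has to match the model's output: vectors of different sizes cannot be compared.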
Why use get_or_create
get_or_create is safe to call repeatedly. When the collection does not yet exist, the SDK creates it and returns True. When the collection already exists, the SDK skips creation and returns False. This boolean return value lets you log whether a new collection was provisioned, and your scripts become idempotent — safe to re-run without side effects.
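The create-if-missing behaviour described above can be sketched with an in-memory model. `CollectionRegistry` is a hypothetical class written for this illustration; it only mimics the boolean return value described in the text and does not reproduce the SDK's signature.

```python
class CollectionRegistry:
    """Hypothetical in-memory model of create-if-missing semantics."""

    def __init__(self) -> None:
        self._collections: dict[str, dict] = {}

    def get_or_create(self, name: str, config: dict) -> bool:
        """Return True if the collection was created, False if it already existed."""
        if name in self._collections:
            return False              # existing collection is reused untouched
        self._collections[name] = dict(config)
        return True


registry = CollectionRegistry()
config = {"size": 384, "distance": "Cosine"}

created_first = registry.get_or_create("Movies", config)
created_again = registry.get_or_create("Movies", config)

print(created_first, created_again)  # True False
```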
Sync barrier: Always call collections.get_info() immediately after get_or_create() in the same connection block. This confirms the collection is fully committed on the server before the next step opens a new connection to write into it. Without it, a subsequent points.upsert() may raise CollectionNotFoundError even though creation appeared to succeed.
Expected output
Step 5: Create embedding helpers
Expected output
384.
Batching matters for three reasons.
- Speed: embed_texts processes all texts in a single forward pass through the model, which is significantly faster than calling embed_text in a loop.
- Efficiency: Batching reduces CPU and memory overhead compared to encoding one string at a time.
- Best practice: Always batch when embedding more than a few texts.
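The batching idea can be shown without loading a model. In the sketch below, `fake_encode` is a stand-in encoder and `embed_in_batches` is a hypothetical helper (neither is part of the tutorial's code or of sentence-transformers); the point is only that the encoder is called once per batch rather than once per text.

```python
from typing import Callable, Iterator, Sequence

Vector = list[float]


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    texts: Sequence[str],
    encode_batch: Callable[[Sequence[str]], list[Vector]],
    batch_size: int = 32,
) -> list[Vector]:
    """Call encode_batch once per batch instead of once per text."""
    vectors: list[Vector] = []
    for chunk in batched(texts, batch_size):
        vectors.extend(encode_batch(chunk))
    return vectors


call_sizes: list[int] = []


def fake_encode(chunk: Sequence[str]) -> list[Vector]:
    """Stand-in encoder: records the batch size and returns one toy vector per text."""
    call_sizes.append(len(chunk))
    return [[float(len(text))] for text in chunk]


plots = [f"plot {i}" for i in range(10)]
vectors = embed_in_batches(plots, fake_encode, batch_size=4)
print(len(vectors), call_sizes)  # 10 [4, 4, 2]
```

Ten texts with a batch size of 4 produce three encoder calls. With a batch size of 10 or more, the same input would go through in a single call.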
Step 6: Prepare your data
Each movie becomes a point in the collection. A point has three parts.
- ID: A unique identifier (integer or UUID string).
- Vector: An embedding of the movie’s plot description.
- Payload: Structured metadata (genre, year, rating, and so on).
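As a rough picture of that three-part shape, the sketch below uses a hypothetical `MoviePoint` dataclass in place of the SDK's PointStruct. The record is illustrative: the plot text is a placeholder, the ID is chosen for the example, and the three-value vector stands in for a real 384-dimensional embedding.

```python
from dataclasses import dataclass, field


@dataclass
class MoviePoint:
    """Hypothetical stand-in for a point: ID + vector + payload."""
    id: int
    vector: list[float]
    payload: dict = field(default_factory=dict)


# Illustrative record; not copied from the tutorial's dataset.
movie = {
    "title": "Interstellar",
    "plot": "placeholder plot description",
    "genre": "sci-fi",
    "year": 2014,
    "rating": 8.7,
    "director": "Christopher Nolan",
}

toy_vector = [0.1, 0.2, 0.3]  # a real embedding from this tutorial's model has 384 values

point = MoviePoint(id=0, vector=toy_vector, payload=movie)
print(point.id, len(point.vector), sorted(point.payload))
# 0 3 ['director', 'genre', 'plot', 'rating', 'title', 'year']
```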
Step 7: Embed and store the data
Expected output
After a successful upsert and flush, the stored count matches the number of points sent. The total reported by get_vector_count confirms all ten movies were persisted.
- embed_texts converts all 10 plots into 384-dimensional vectors in one batch.
- Each movie becomes a PointStruct with an integer ID, the plot vector, and the full metadata as payload.
- points.upsert sends the points to the server (“upsert” means insert-or-update).
- vde.flush ensures the data is persisted to disk immediately.
- vde.get_vector_count confirms how many vectors are stored.
Step 8: Run your first semantic search
| Parameter | Value | Purpose |
|---|---|---|
| vector | Query embedding | The search finds vectors closest to this one. |
| limit=5 | Top 5 results | Maximum number of results to return. |
| with_payload=True | Include metadata | Returns title, genre, year, and other fields with each result. |
Expected output
limit is a maximum, not a guarantee. The number of results returned depends on how many points in the collection score above the internal threshold. Scores reflect the specific model and dataset — do not compare absolute score values across different models or collections.
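The "maximum, not a guarantee" behaviour is easy to see in a brute-force model. The sketch below ranks every stored vector by cosine similarity, which is an assumption made for clarity (the server uses an HNSW index rather than an exhaustive scan), and `score_threshold` is a hypothetical parameter of this example.

```python
import heapq
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a non-zero vector to unit length so a dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def search(points, query, limit=5, score_threshold=None):
    """Brute-force ranking: return at most `limit` (id, score) pairs, best first."""
    q = normalize(query)
    scored = []
    for point_id, vector in points.items():
        score = sum(a * b for a, b in zip(q, normalize(vector)))
        if score_threshold is None or score >= score_threshold:
            scored.append((point_id, score))
    return heapq.nlargest(limit, scored, key=lambda pair: pair[1])


toy_points = {
    0: [1.0, 0.0],
    1: [0.8, 0.6],
    2: [0.0, 1.0],
}

hits = search(toy_points, query=[1.0, 0.0], limit=5)
print([point_id for point_id, _ in hits])    # [0, 1, 2]

strict = search(toy_points, query=[1.0, 0.0], limit=5, score_threshold=0.5)
print([point_id for point_id, _ in strict])  # [0, 1]
```

With only three points stored, a limit of 5 returns three results, and a threshold reduces the list further.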
Step 9: Filter by metadata
Filters restrict the candidate set before vector ranking. Actian VectorAI DB provides the Field and FilterBuilder classes for this purpose.
Filter by genre
Expected output
Filter by minimum rating
Expected output
Step 10: Combine multiple filters
FilterBuilder supports three types of boolean logic.
| Method | Meaning | SQL equivalent |
|---|---|---|
| .must() | All conditions must match. | AND |
| .should() | At least one condition should match. | OR |
| .must_not() | Exclude any points that match. | NOT |
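The three groups in the table can be modelled in plain Python. This sketch evaluates predicates against payload dictionaries on the client side; it illustrates the boolean semantics only and is not the FilterBuilder implementation. The helper names and the sample records are hypothetical.

```python
from typing import Callable

Predicate = Callable[[dict], bool]


def field_eq(key: str, value) -> Predicate:
    """Predicate: the payload field equals value."""
    return lambda payload: payload.get(key) == value


def field_gte(key: str, value) -> Predicate:
    """Predicate: the payload field exists and is >= value."""
    return lambda payload: key in payload and payload[key] >= value


def matches(payload: dict, must=(), should=(), must_not=()) -> bool:
    """Evaluate AND / OR / NOT groups of predicates against one payload."""
    if not all(pred(payload) for pred in must):
        return False    # every must condition has to hold
    if should and not any(pred(payload) for pred in should):
        return False    # at least one should condition has to hold
    if any(pred(payload) for pred in must_not):
        return False    # any must_not match excludes the point
    return True


# Placeholder records for illustration only.
movies = [
    {"title": "A", "genre": "sci-fi", "year": 2014, "rating": 8.7},
    {"title": "B", "genre": "crime", "year": 1994, "rating": 8.9},
    {"title": "C", "genre": "sci-fi", "year": 1982, "rating": 8.1},
]

top_scifi = [m["title"] for m in movies
             if matches(m, must=[field_eq("genre", "sci-fi"), field_gte("rating", 8.5)])]
recent_non_crime = [m["title"] for m in movies
                    if matches(m, must=[field_gte("year", 1990)],
                               must_not=[field_eq("genre", "crime")])]
crime_or_high = [m["title"] for m in movies
                 if matches(m, should=[field_eq("genre", "crime"), field_gte("rating", 8.5)])]

print(top_scifi, recent_non_crime, crime_or_high)  # ['A'] ['A'] ['A', 'B']
```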
Expected output
Step 11: Retrieve a specific movie by ID
Expected output
Step 12: Update movie metadata
Payload fields can be updated without re-embedding the vector.
Expected output
set_payload merges the provided fields into the existing payload. Three properties define its behaviour.
- Merge behaviour: Only the specified fields are updated. All other fields in the existing payload remain unchanged.
- No re-embedding: The vector stays the same — only the metadata is modified, so there is no reprocessing cost.
- Immediate effect: Subsequent searches and retrievals reflect the updated values right away.
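The merge behaviour resembles a dictionary update. The sketch below models it with an in-memory store; the `set_payload` function here is a local stand-in for illustration, and the point ID, vector, and starting payload are example values.

```python
def set_payload(store: dict, point_id: int, new_fields: dict) -> None:
    """Merge new_fields into the stored payload; the vector is left untouched."""
    store[point_id]["payload"].update(new_fields)


store = {
    0: {
        "vector": [0.1, 0.2, 0.3],   # toy vector; never modified below
        "payload": {"title": "Interstellar", "rating": 8.7},
    }
}

vector_before = list(store[0]["vector"])
set_payload(store, 0, {"rating": 8.8})

print(store[0]["payload"])                  # {'title': 'Interstellar', 'rating': 8.8}
print(store[0]["vector"] == vector_before)  # True
```

Only the rating changes; the title and the vector are exactly as they were before the call.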
Add new fields
set_payload can also add entirely new keys to a point.
Expected output
The snippet calls add_tags with movie ID 0 and a list of four descriptive tags. The set_payload call merges the new tags field into the existing payload, leaving all previously stored fields (title, plot, genre, year, rating, and director) unchanged.
Step 13: Delete points
Delete by ID
Expected output
The vector count drops from 10 to 9, confirming that movie ID 9 (Blade Runner 2049) was removed from the collection.
The snippet passes ID 9, corresponding to “Blade Runner 2049”, the last movie in the dataset, to points.delete(). After the deletion, vde.get_vector_count reads the updated total and prints it so you can confirm the point was removed.
Delete by filter
Step 14: Count points
The following snippet counts the total number of points in the collection, then runs filtered counts to check how many sci-fi movies exist, how many have a rating of 8.8 or higher, and how many were directed by Christopher Nolan.
Expected output
Counts reflect the state after Step 12 (Interstellar’s rating updated to 8.8) and Step 13 (Blade Runner 2049 deleted).
The rating >= 8.8 count is 6, not 5 as you might expect from the raw dataset. This is because Interstellar’s rating was updated from 8.7 to 8.8 in Step 12 before this count runs.
Step 15: Inspect collection status
Expected output
collections.get_info() and vde.get_state() return raw integer enum values, not strings. Use the STATUS_MAP and VDE_STATE_MAP dictionaries above to convert them. A VDE state of active (integer 0) means the collection is ready for searches regardless of the Status value. A Status of red after deletions is normal and does not affect search quality.
This code connects to the server, calls collections.get_info to retrieve the collection’s operational status and vector configuration, then calls vde.get_state to read the current VDE lifecycle state, and finally calls vde.get_vector_count to confirm the number of stored vectors. All three values are printed together so you can verify the collection is healthy and correctly configured before running searches.
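Converting an integer enum value to a label is a dictionary lookup. The sketch below includes only the one mapping stated in the text (VDE state 0 is active); the remaining codes belong in the tutorial's own STATUS_MAP and VDE_STATE_MAP and are deliberately left out here. The `describe` helper is hypothetical.

```python
# Only the value named in the text; other codes are intentionally omitted.
VDE_STATE_MAP = {0: "active"}


def describe(code: int, mapping: dict[int, str]) -> str:
    """Return the label for an integer enum value, or a readable fallback."""
    return mapping.get(code, f"unknown({code})")


print(describe(0, VDE_STATE_MAP))  # active
print(describe(7, VDE_STATE_MAP))  # unknown(7)
```

Using a fallback instead of indexing directly keeps a status report from failing with a KeyError if the server returns a code the map does not list.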
Step 16: List all collections
Expected output
Because only one collection was created in this tutorial, collections.list() returns a single entry. The count in the header updates automatically as collections are added or removed.
The snippet calls collections.list(), which returns the names of all collections currently provisioned on the server. In this tutorial only one collection has been created, so the output lists Movies as the single entry.
Step 17: Put it all together — a complete search function
Expected output
Three calls are made with different queries and filter combinations. Each block shows the active filters and how many results matched before the ranked list is printed.
One of the earlier calls filters on min_rating >= 8.8. The third combines exclude_genre="crime" with min_year=1990. Each call prints the query, active filters, result count, and ranked movies with truncated plot descriptions.
Step 18: Cleanup
Expected output
The vector count reflects the state of the collection after all previous steps. The flush confirmation line indicates that any pending writes have been safely persisted to disk.
The snippet calls vde.flush to ensure any pending writes are persisted to disk. The two lines that delete the collection are commented out: they are safe to uncomment when the tutorial data is no longer needed, but the collection is preserved by default so the data remains available for further experimentation.
What you learned
| Concept | API | What it does |
|---|---|---|
| Connect | AsyncVectorAIClient(url=...) | Open a gRPC connection to VectorAI DB. |
| Health check | client.health_check() | Verify the server is reachable. |
| Create collection | collections.get_or_create(vectors_config=VectorParams(...)) | Define a vector space with dimension and distance metric. |
| Embed text | SentenceTransformer.encode() | Convert text to a numerical vector. |
| Store data | points.upsert(collection, points=[PointStruct(...)]) | Insert or update points with vectors and metadata. |
| Persist | vde.flush(collection) | Write pending data to disk. |
| Semantic search | points.search(collection, vector=..., limit=5) | Find the most similar vectors. |
| Filter (equality) | Field("genre").eq("sci-fi") | Match a specific value. |
| Filter (range) | Field("rating").gte(8.5) | Numeric comparison. |
| Filter (exclude) | FilterBuilder().must_not(...) | Exclude matching points. |
| Combine filters | FilterBuilder().must(...).must(...).build() | Boolean AND/OR/NOT logic. |
| Get by ID | points.get(collection, ids=[0]) | Retrieve specific points. |
| Update metadata | points.set_payload(collection, payload={...}, ids=[0]) | Merge new fields into existing payloads. |
| Delete by ID | points.delete(collection, ids=[0]) | Remove specific points. |
| Delete by filter | points.delete(collection, filter=...) | Remove points matching conditions. |
| Count | vde.get_vector_count() + points.scroll() | Total and filtered counts. |
| Collection info | collections.get_info(collection) | Status and configuration. |
| Collection state | vde.get_state(collection) | VDE lifecycle state. |
| List collections | collections.list() | All collection names on the server. |
| Delete collection | collections.delete(collection) | Remove a collection entirely. |
Common patterns quick reference
Pattern 1: Search with optional filters
Use is not None rather than a truthiness check to avoid silently skipping valid falsy values such as 0.0.
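A short sketch of the difference follows. The conditions are collected as plain tuples, which is a simplification for this example (the tutorial's code builds them with Field and FilterBuilder), and `build_conditions` is a hypothetical helper.

```python
def build_conditions(genre=None, min_year=None, min_rating=None):
    """Collect (field, operator, value) tuples for every filter that was supplied."""
    conditions = []
    if genre is not None:
        conditions.append(("genre", "eq", genre))
    if min_year is not None:
        conditions.append(("year", "gte", min_year))
    if min_rating is not None:   # `if min_rating:` would drop a valid 0.0
        conditions.append(("rating", "gte", min_rating))
    return conditions


print(build_conditions(min_rating=0.0))  # [('rating', 'gte', 0.0)]
print(build_conditions())                # []
```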
Pattern 2: Upsert is idempotent
Calling upsert with the same ID replaces the existing point, so ingestion scripts can be re-run safely without creating duplicates. This makes bulk ingestion pipelines robust to restarts.
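The replace-by-ID behaviour can be modelled with a dictionary keyed by point ID. This is a sketch of the semantics only; the local `upsert` function and the sample points are illustrative, not the SDK call.

```python
def upsert(store: dict[int, dict], points: list[dict]) -> None:
    """Insert each point, or replace the existing point that has the same ID."""
    for point in points:
        store[point["id"]] = {"vector": point["vector"], "payload": point["payload"]}


store: dict[int, dict] = {}
batch = [
    {"id": 0, "vector": [0.1, 0.2], "payload": {"title": "first"}},
    {"id": 1, "vector": [0.3, 0.4], "payload": {"title": "second"}},
]

upsert(store, batch)
upsert(store, batch)   # simulated re-run after a restart
print(len(store))      # 2

upsert(store, [{"id": 1, "vector": [0.5, 0.6], "payload": {"title": "second, revised"}}])
print(len(store), store[1]["payload"]["title"])  # 2 second, revised
```

Running the same batch twice leaves two points, and a later write to ID 1 replaces that point rather than adding a third.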
Pattern 3: Always flush after writes
Call vde.flush() immediately after points.upsert() to ensure data survives server restarts. Without it, recent writes may be lost if the server crashes.
Pattern 4: Use get_or_create for collections
get_or_create is safe to run on every application startup. It creates the collection if it does not exist and does nothing if it already does, so startup code does not need a separate existence check.
Next steps
- Predicate filters — Master the full Filter DSL with all field types and operators.
- Similarity search fundamentals — Explore search parameters, score thresholds, and pagination.
- Use open-source embedding models — Choose the right model for production.
- Optimizing retrieval quality — Tune HNSW parameters and search settings.