What is a Vector Database?

A vector database is, at its essence, a database system specifically designed to store and process vectorized data. Unlike conventional relational databases, which organize information in tables, rows, and columns, vector databases work with vectors: arrays of numerical values that represent points in multidimensional space.

How Does a Vector Database Work?

A vector database is a type of database used in various machine learning use cases. It is specialized for the storage and retrieval of vector data.

What are embeddings?

An embedding is a piece of data, such as a word, that has been converted into an array of numbers known as a vector. The vector encodes patterns of relationships between that piece of data and others.

Embeddings

Taken together, the numbers that make up a vector act as a multi-dimensional map for measuring similarity.

Object to Vector Transformation

Let’s look at an example on a 2D graph. The words "dog" and "puppy" are often used in similar situations.

2D Graph

So, in a word embedding, these two words would be represented by vectors that are close together.

Embedding of Word

This is a simplified 2D example. In reality, a vector has hundreds of dimensions that capture the rich, complex relationships between words.
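To make "close together" concrete, here is a minimal sketch that compares toy 2D vectors using cosine similarity, one common way to measure closeness. The coordinates are made up for illustration and do not come from any real embedding model; the word "car" is added only as a contrast.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2D coordinates, chosen by hand for illustration only.
embeddings = {
    "dog": [0.9, 0.8],
    "puppy": [0.85, 0.9],
    "car": [0.9, -0.7],
}

dog_puppy = cosine_similarity(embeddings["dog"], embeddings["puppy"])
dog_car = cosine_similarity(embeddings["dog"], embeddings["car"])

print(f"dog vs puppy: {dog_puppy:.3f}")
print(f"dog vs car:   {dog_car:.3f}")
```

With these toy values, "dog" and "puppy" score much closer to 1.0 than "dog" and "car" do, which is the sense in which their vectors are close together.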

Example

Images can also be turned into vectors. Google's similar-image search works this way: sections of an image are broken down into arrays of numbers, which makes it possible to find images whose vectors closely resemble one another.

Image Sections
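As a rough illustration of turning an image into an array of numbers, the sketch below flattens tiny grayscale pixel grids into vectors and compares them by Euclidean distance. This is an assumption-laden toy: real image search systems typically use learned embeddings rather than raw pixels, and the 3x3 "images" and the helper name `image_to_vector` are hypothetical.

```python
import math

def image_to_vector(pixels):
    """Flatten a 2D grid of 0-255 grayscale values into a list of floats in [0, 1]."""
    return [value / 255 for row in pixels for value in row]

# Hypothetical 3x3 grayscale images: two bright ones and one dark one.
bright_a = [[250, 240, 245], [235, 255, 250], [245, 240, 250]]
bright_b = [[245, 250, 240], [240, 250, 255], [250, 235, 245]]
dark = [[10, 5, 0], [15, 10, 5], [0, 20, 10]]

vec_a = image_to_vector(bright_a)
vec_b = image_to_vector(bright_b)
vec_dark = image_to_vector(dark)

# Smaller distance means the two vectors (and so the images) are more alike.
near = math.dist(vec_a, vec_b)
far = math.dist(vec_a, vec_dark)

print(f"bright_a vs bright_b: {near:.3f}")
print(f"bright_a vs dark:     {far:.3f}")
```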

Once an embedding is created, it can be stored in a database. A database full of such embeddings is considered a vector database.

Vector Database
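The idea of "a database full of embeddings" can be sketched as a small in-memory store. The class name `TinyVectorStore`, its methods, and the sample vectors are all hypothetical; production vector databases add persistence, indexing, and much more. The sketch only shows storing vectors under ids and retrieving the nearest ones by brute force.

```python
import math

class TinyVectorStore:
    """Hypothetical in-memory store: keeps id -> vector and searches by brute force."""

    def __init__(self, dimensions):
        self.dimensions = dimensions
        self._items = {}

    def add(self, item_id, vector):
        """Store a vector under an id, rejecting vectors of the wrong length."""
        if len(vector) != self.dimensions:
            raise ValueError(f"expected {self.dimensions} dimensions, got {len(vector)}")
        self._items[item_id] = list(vector)

    def search(self, query, k=3):
        """Return the ids of the k stored vectors closest to `query`, nearest first."""
        scored = [(math.dist(query, vector), item_id)
                  for item_id, vector in self._items.items()]
        scored.sort()
        return [item_id for _, item_id in scored[:k]]

# Made-up 2D vectors for illustration only.
store = TinyVectorStore(dimensions=2)
store.add("dog", [0.9, 0.8])
store.add("puppy", [0.85, 0.9])
store.add("car", [0.9, -0.7])
store.add("truck", [0.8, -0.8])

results = store.search([0.88, 0.85], k=2)
print(results)
```

Brute-force search compares the query against every stored vector, which is fine for a handful of items but is the part real systems replace with an index.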

A vector database can be used in several ways: search, where results are ranked by relevance to a query string; clustering, where text strings are grouped by similarity; recommendations, where items with related text strings are recommended; and classification, where text strings are classified by their most similar label.
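Of these uses, classification by most similar label is easy to sketch: each label gets its own vector, and an item is assigned the label whose vector is most similar to the item's vector. The label names, the 2D vectors, and the `classify` helper below are assumptions for illustration; in practice both labels and items would be embedded by the same model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical label embeddings (made-up values).
label_vectors = {
    "sports": [1.0, 0.0],
    "cooking": [0.0, 1.0],
}

def classify(vector):
    """Return the label whose vector is most similar to `vector`."""
    return max(label_vectors,
               key=lambda label: cosine_similarity(vector, label_vectors[label]))

first = classify([0.7, 0.3])
second = classify([0.2, 0.9])
print(first, second)
```

Search and recommendation follow the same pattern, except that they rank many stored items by similarity instead of picking a single best label.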

Here is another example of how a vector database typically works:

A vector database operates like a super-fast library for storing and retrieving high-dimensional data. It stores each item as a vector whose numerical values represent various features of the data. These vectors are organized (indexed) so that similar ones can be found quickly.

When you ask a question or submit a query, the database finds the most relevant vectors and returns the corresponding results. It is as if you had an enchanted librarian who can effortlessly locate what you need, even when the data seems complicated.
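The text does not say how vectors are organized for fast lookup. One family of techniques, shown here purely as an illustration and not as what any particular vector database does, is locality-sensitive hashing with random hyperplanes: vectors pointing in similar directions tend to land in the same bucket, so a query only needs to be compared against one bucket instead of everything. All function names and parameters below are hypothetical.

```python
import random

def make_hyperplanes(num_planes, dimensions, seed=0):
    """Generate random hyperplane normals; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dimensions)]
            for _ in range(num_planes)]

def bucket_key(vector, hyperplanes):
    """One bit per hyperplane: '1' if the vector is on its positive side, else '0'."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)

hyperplanes = make_hyperplanes(num_planes=4, dimensions=3)
buckets = {}

def index(item_id, vector):
    """Place an item in the bucket named by its hash key."""
    buckets.setdefault(bucket_key(vector, hyperplanes), []).append(item_id)

def candidates(query):
    """Return only the ids that share the query's bucket."""
    return buckets.get(bucket_key(query, hyperplanes), [])

index("a", [1.0, 2.0, 3.0])
index("b", [-1.0, -2.0, -3.0])

# A vector pointing in the same direction as "a" hashes to the same bucket.
same_direction = candidates([2.0, 4.0, 6.0])
print(same_direction)
```

The trade-off is that this kind of lookup is approximate: a truly similar vector can occasionally fall into a neighboring bucket, which is why practical systems use several hash tables or other index structures.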