Beginner's Guide to Semantic Kernel with Dapper, PostgreSQL, and PGVector in C#
Introduction
Semantic Kernel is an open-source framework by Microsoft that integrates machine learning and natural language processing to build AI applications.
It offers tools for tasks like language understanding, text generation, and semantic search, emphasizing modularity and easy customization.
This tutorial walks through setting up a C# Web API for uploading and querying text files, using PostgreSQL with PGVector as the vector database, configuring the database services and dependencies, creating embeddings with Azure OpenAI, and using the Semantic Kernel Memory plugin for persistent storage and intelligent querying.
Prerequisites
A C# compatible IDE, such as Visual Studio, Rider, or VS Code
An account with a compatible AI service (for this tutorial, I am using Azure OpenAI)
Docker
Setting up Postgres with PGVector
To use Semantic Kernel, we need a vector database. Vector databases are specialized databases designed to store, index, and query high-dimensional vectors, enabling efficient similarity search and retrieval for tasks like machine learning and natural language processing.
For this tutorial, I’m going to use PostgreSQL. PostgreSQL is an open-source relational database management system known for its robustness, extensibility, and support for advanced data types and complex queries. Through the use of the PGVector extension, it is also very capable of handling vector embeddings. The easiest way to get started with this is to set up Docker and deploy the pgvector container with the following command:
docker run --name postgres -e POSTGRES_PASSWORD=mysecretpassword -p 5432:5432 -d pgvector/pgvector:pg16
Creating the Application
For this tutorial, I’m going to set up a new C# Web API that allows us to upload and query a text file.
Open up the IDE and start a new web project.
Installing the Dependencies
Before we start, we’re going to need to install the following packages from NuGet into our project:
Dapper
Pgvector.Dapper
And if you’re using Azure OpenAI, then the following:
Microsoft.AspNetCore.OpenApi
Microsoft.KernelMemory.Core
Microsoft.KernelMemory.MemoryDb.Postgres
Microsoft.KernelMemory.SemanticKernelPlugin
Microsoft.SemanticKernel
Microsoft.SemanticKernel.Connectors.OpenAI
Different AI providers will require different connector packages in place of some of those above.
Configuring the Database
The first thing we need to do is set up our database and enable the PGVector extension. The full DbContext.cs is here, but in summary, we have an Init() function that is called during program start-up to create the database:
private async Task CreateDatabase()
{
    // create database if it doesn't exist; connect to the default "postgres"
    // database, since ours may not exist yet
    await using var connection = new NpgsqlConnection(
        DbConstants.ConnectionString.Replace(DbConstants.DatabaseName, "postgres"));

    var sqlDbCount = $"SELECT EXISTS(SELECT datname FROM pg_catalog.pg_database WHERE datname = '{DbConstants.DatabaseName}');";
    var dbCount = await connection.ExecuteScalarAsync<int>(sqlDbCount);
    if (dbCount == 0)
    {
        var sql = $"CREATE DATABASE \"{DbConstants.DatabaseName}\"";
        await connection.ExecuteAsync(sql);
    }
}
and then enable the PGVector extension:
private async Task ConfigureVectors()
{
    await using var connection = CreateConnection() as NpgsqlConnection;

    // enable the pgvector extension if it isn't already enabled
    await using (var cmd = new NpgsqlCommand("CREATE EXTENSION IF NOT EXISTS vector", connection))
    {
        await cmd.ExecuteNonQueryAsync();
    }

    // reload types so Npgsql picks up the newly created vector type
    await connection.ReloadTypesAsync();
}
It is important to call ReloadTypesAsync() on the connection to ensure the vector types are loaded into our application.
In our CreateConnection() function, we also configure our data source to use vectors:
public IDbConnection CreateConnection()
{
    var dataSourceBuilder = new NpgsqlDataSourceBuilder(DbConstants.ConnectionString);
    dataSourceBuilder.UseVector();
    var dataSource = dataSourceBuilder.Build();
    return dataSource.OpenConnection();
}
We also need to configure Dapper to handle vector types, by calling SqlMapper.AddTypeHandler(new VectorTypeHandler()); in our startup code. In this example, I have moved the startup code that creates and configures the database into an extension method, called when the program starts.
public static void ConfigureDatabaseServices(this WebApplication app)
{
    SqlMapper.AddTypeHandler(new VectorTypeHandler());

    using var scope = app.Services.CreateScope();
    var context = scope.ServiceProvider.GetRequiredService<DbContext>();
    Task.Run(() => context.Init()).Wait();
}
Adding AI
To use AI in our application, we need to split the process into two parts: loading the data, which involves creating embeddings from it, and querying those embeddings once they are stored.
Embeddings
Embeddings are dense vector representations of data, like words or images, that capture relationships and similarities. This makes processing and analysis in machine learning and natural language processing tasks more efficient.
In our example, we use AzureOpenAITextEmbeddingGeneration to create embeddings. After creating the embeddings, we store them in our database for querying.
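As a rough sketch, generating an embedding and storing it with Dapper might look something like the following. Note the embedding service API is marked experimental and may differ between Semantic Kernel versions, and the deployment name, endpoint, key, and `documents` table here are placeholders, not part of the tutorial's actual code:

```csharp
// Sketch only: deployment, endpoint, key, and table name are placeholders.
var embeddingService = new AzureOpenAITextEmbeddingGenerationService(
    deploymentName: "text-embedding-ada-002",
    endpoint: "https://your-resource.openai.azure.com/",
    apiKey: "your-api-key");

// Generate a vector for a piece of text
ReadOnlyMemory<float> embedding =
    await embeddingService.GenerateEmbeddingAsync("Some text to embed");

// Store it via Dapper, relying on the VectorTypeHandler registered earlier
await using var connection = CreateConnection() as NpgsqlConnection;
await connection.ExecuteAsync(
    "INSERT INTO documents (content, embedding) VALUES (@content, @embedding)",
    new { content = "Some text to embed", embedding = new Vector(embedding) });
```

In practice the Kernel Memory plugin (below) handles this chunk-and-embed step for us, but it is useful to see what is happening under the hood.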
Adding the Memory
To store the vectors in our PostgreSQL database, we can use the Semantic Kernel Memory plugin.
The Semantic Kernel Memory plugin enhances the Semantic Kernel framework by enabling persistent memory storage for embeddings and other data. This allows applications to remember context and improve their ability to retrieve relevant information over time. It helps in tasks like conversational AI and recommendation systems by facilitating more intelligent interactions and better performance.
Again, we set up the Kernel and Memory at startup through an extension method. The full code for setting up Semantic Kernel and Memory is available here, but to summarise:
Firstly, I had to specify the embedding dimensions as 1536 (EmbeddingDimensions = 1536), since PGVector has a limit on vector dimensions, which the newest OpenAI models exceed. 1536 should be sufficient for most use cases.
You can also see some other configurations around partition size and tokens. You can adjust these to best suit your use case or leave them as they are.
Then, we add the PostgreSQL connection string and build the kernel memory.
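A minimal sketch of that memory configuration, assuming the Kernel Memory packages listed earlier (property and method names may vary between Kernel Memory versions, and all endpoint, key, and deployment values are placeholders):

```csharp
// Sketch only - all credential and deployment values are placeholders.
var embeddingConfig = new AzureOpenAIConfig
{
    Auth = AzureOpenAIConfig.AuthTypes.APIKey,
    APIKey = "your-api-key",
    Endpoint = "https://your-resource.openai.azure.com/",
    Deployment = "text-embedding-ada-002",
    EmbeddingDimensions = 1536 // keep within PGVector's dimension limit
};

var chatConfig = new AzureOpenAIConfig
{
    Auth = AzureOpenAIConfig.AuthTypes.APIKey,
    APIKey = "your-api-key",
    Endpoint = "https://your-resource.openai.azure.com/",
    Deployment = "gpt-4o"
};

// Wire the AI services and the Postgres vector store into Kernel Memory
var memory = new KernelMemoryBuilder()
    .WithAzureOpenAITextEmbeddingGeneration(embeddingConfig)
    .WithAzureOpenAITextGeneration(chatConfig)
    .WithPostgresMemoryDb(new PostgresConfig { ConnectionString = DbConstants.ConnectionString })
    .Build<MemoryServerless>();
```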
Finally, we build the Kernel Memory, build the Kernel, then add the memory to the Kernel through the plugin:
//...omitted for brevity
var kernelBuilder = Kernel.CreateBuilder();
kernelBuilder.Services.AddAzureOpenAIChatCompletion(
    AzureOpenAiChatCompletionDeployment, AzureOpenAiEndpoint, AzureOpenAiApiKey);
var kernel = kernelBuilder.Build();
var plugin = new MemoryPlugin(memory, "kernelMemory");
kernel.ImportPluginFromObject(plugin, "memory");
Consuming the Kernel
To use the Kernel we have just built, for this tutorial I have created a simple service that can import text into memory or query the memory to produce a response:
public class SemanticKernelService(Kernel kernel, IKernelMemory memory) : ISemanticKernelService
{
    public async Task ImportText(string text)
    {
        await memory.ImportTextAsync(text);
    }

    public async Task<string> Query(string query)
    {
        var result = await memory.AskAsync(query);
        return result.Result;
    }
}
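The service's constructor needs the Kernel and the memory from the dependency injection container. The registrations aren't shown in this tutorial, but a minimal sketch, assuming `kernel` and `memory` were built as above inside Program.cs, might look like:

```csharp
// Sketch only - assumes `kernel` and `memory` are the instances built earlier.
// Registering them as singletons lets SemanticKernelService resolve both.
builder.Services.AddSingleton(kernel);
builder.Services.AddSingleton<IKernelMemory>(memory);
builder.Services.AddScoped<ISemanticKernelService, SemanticKernelService>();
```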
We also create a couple of endpoints in our Program.cs: one to upload a text file and create embeddings from it, and one to query the data:
//...omitted for brevity
app.MapPost("Query",
        async (ISemanticKernelService semanticKernelService, string query) =>
        {
            var result = await semanticKernelService.Query(query);
            return TypedResults.Ok(result);
        })
    .WithOpenApi(operation =>
    {
        operation.Summary = "Query Transcriptions";
        operation.Description = "Query Transcriptions";
        operation.Tags = new List<OpenApiTag> { new() { Name = "AI" } };
        return operation;
    });
app.MapPost("/upload",
        async (IFormFile file) =>
        {
            // Read the whole file as text; ReadToEndAsync avoids the partial
            // reads a single ReadAsync call can return
            using var reader = new StreamReader(file.OpenReadStream());
            var text = await reader.ReadToEndAsync();

            var semanticKernelService = app.Services.GetRequiredService<ISemanticKernelService>();
            await semanticKernelService.ImportText(text);
            return TypedResults.Ok();
        })
    .DisableAntiforgery();
app.Run();
Running Our Application
We can run our application, upload a text file, and query it through the built-in Swagger page.
In the example below, I simply uploaded this text file and then asked a question via the query endpoint:
I hope you enjoyed this tutorial and found it useful. If you did, please give me a shout out.