AI - Oracle

Vector Search in Oracle Database 26ai

The use case: Help Me Learn My New Car

As much as folks are lamenting about the new cloud version of Oracle support, I have other issues on my mind- like trying to acclimate to my first electric vehicle, even though I already owned a gas-powered Mini Cooper. 

As I was looking for a good use case to test new vector search with Oracle 26ai, it occurred to me that I could use something better than a key word search when using the manual to my new car. 

What if I used the manual, which is public information and could be easily made searchable as my use case?

Imagine you run a support portal (internal or customer-facing) with lots of short “support notes”:

  • “My MINI won’t charge with a public CCS fast charger…”
  • “The app shows charging error after an OTA update…”
  • “My EV plug latch is stuck…”

You discover quickly that keyword search often fails because people describe the same problem in different words. Vector search solves this by comparing meaning, not exact terms.  I’ve been facing something similar as I try to come up to speed with my car, so I decided to see if AI could help me out.

So in this blog post, I will work on:

  1. store the user manual and its embedding vector in a table
  2. create an HNSW vector index for fast nearest-neighbor search
  3. run a query that returns the most semantically similar sections in the manual to stop having to search for the info and often, failing.

Oracle 26ai is a strong fit because it lets you do this inside the database with a native VECTOR type, SQL functions, and built-in vector indexing options.

Create Object, Load Data, ETC.

Assumptions

  • You already have embeddings (from your app / model pipeline), or
  • You’ll generate embeddings in-database via DBMS_VECTOR.UTL_TO_EMBEDDING using a model you’ve loaded (optional section below). Oracle Documentation

Dimension note: pick a dimension that matches your embedding model (e.g., 768, 1024, 1536, etc.).

Table to store notes and vectors

1) Drop & create a table for support notes + embeddings

begin

execute immediate 'drop table support_notes purge';
 exception
 when others then
   if sqlcode != -942 then raise; end if;
end;

/
create table support_notes (
note_id     number generated by default as identity primary key,
created_at  timestamp default systimestamp not null,
title       varchar2(200) not null,
body        clob not null,
-- VECTOR(d) is the native vector datatype (d = embedding dimension)
embedding   vector(1536) not null
);

Oracle’s native VECTOR datatype is designed for storing embeddings directly in tables, so this was surprisingly easy.

Insert sample data (I have two options)

Option A: Insert embeddings you already have

Below I show the “shape” of the insert. Replace the […] with real embedding values from your model pipeline.

Make sure to insert precomputed embeddings (example shape; replace with real vectors) when using this yourself:

insert into support_notes (title, body, embedding) values (
  'Fast charging fails at public station',
  'CCS fast charger starts then stops after 5-10 seconds. Happens on multiple stations.',
  vector('[0.0123, -0.0456, 0.0789, ...]')  -- 1536 floats
);
insert into support_notes (title, body, embedding) values (
'Charging error after software update',
'After an OTA update, the car reports a charging error intermittently, especially on Level 2.',
vector('[0.0011, -0.0234, 0.0567, ...]')
);

commit;

Option B (this is optional), generate embeddings in the database:

Oracle 26ai supports generating embeddings via DBMS_VECTOR.UTL_TO_EMBEDDING (using JSON params and an in-database model you’ve loaded).

This Assumes you already loaded an ONNX embedding model named ‘doc_model’.

var params clob;

exec :params := '{"provider":"database","model":"doc_model"}';

insert into support_notes (title, body, embedding)
values (
'EV plug latch stuck',
'The charging connector latch won’t release unless I unlock twice.',
dbms_vector.utl_to_embedding('The charging connector latch won’t release unless I unlock twice.', json(:params))
);

commit;

Create the vector index (HNSW)

For fast similarity search, create an HNSW index, i.e. we’ll create an HNSW vector index (in-memory neighbor graph):

create vector index support_notes_hnsw_idx
on support_notes (embedding)
organization inmemory neighbor graph
distance cosine
with target accuracy 90;

I discovered how to do this pattern for HNSW vector indexes in Oracle AI Vector Search in the following blog.

Run A Top-K Similar Vector Search

You provide a query vector and ask for the nearest neighbors by cosine distance.  The following query will find the five most similar notes in the manual as part of an input vector.  We’ll need to replace the :qvec with query embedding(VECTOR(1536)) and in a lot of apps, you can just compute this embedding in the app tier and then bind it.

select
 note_id,
 title,
 vector_distance(embedding, :qvec, cosine) as dist
from support_notes
order by dist
fetch first 5 rows only;

VECTOR_DISTANCE is one of the core vector SQL functions used for similarity search, so it makes this easy, too.

Why use Oracle for vector search?

  • One platform for OLTP + AI search: keep your transactional data, security model, and vector similarity search in the same database instead of bolting on a separate vector store. Oracle positions AI Vector Search specifically around native VECTOR storage + in-database generation options.
  • First-class indexing choices: Oracle supports multiple vector index types (including HNSW and others) and documents when to use them.
  • In-database embedding support: if you want, you can load an ONNX embedding model and generate embeddings with DBMS_VECTOR.
  • Hybrid search path: if you later want “keywords + meaning”, Oracle documents hybrid vector indexing that combines text and vector search in one approach.

Below is a single, run-it-in-SQLcl demo script for Oracle Database 26ai that does hybrid search (keyword + vector) over a realistic “2024 MINI Cooper SE user manual” use case.

  • Dimension: VECTOR(1536)
  • Distance: COSINE
  • Hybrid: Oracle Text (keyword) + Vector index (semantic)
  • Embeddings: generated in-db via DBMS_VECTOR.UTL_TO_EMBEDDING (you’ll plug in your model/provider)
-- mini_manual_hybrid_demo.sql
-- Hybrid search demo for Oracle 26ai: Oracle Text + Vector Search (COSINE, 1536 dims)
-- Use case: Chunked 2024 MINI Cooper SE manual content (simplify manual navigation)

set echo on
set feedback on
set pagesize 200
set linesize 200
set serveroutput on
whenever sqlerror exit sql.sqlcode

prompt =========================================================
prompt 0) Preconditions
prompt - You need an embedding model accessible to DBMS_VECTOR
prompt - Update the PARAMS JSON below to match environment
prompt =========================================================

--------------------------------------------------------------------------------
-- 1) Drop old objects (safe rerun)
--------------------------------------------------------------------------------
begin
  execute immediate 'drop table mini_manual_chunks purge';
exception when others then
  if sqlcode != -942 then raise; end if;
end;
/

begin
  execute immediate 'drop sequence mini_chunk_src_seq';
exception when others then
  if sqlcode != -2289 then raise; end if;
end;
/

--------------------------------------------------------------------------------
-- 2) Create table: manual chunks + metadata + embedding
--------------------------------------------------------------------------------
prompt =========================================================
prompt 1) Create storage table for chunked manual text + vectors
prompt =========================================================

create table mini_manual_chunks (
  chunk_id        number generated by default as identity primary key,
  source_id       number not null,
  doc_title       varchar2(200) not null,
  section_path    varchar2(500) not null,  -- e.g. "Charging > troubleshooting"
  page_hint       varchar2(50),
  chunk_text      clob not null,
  -- dimension is fixed at 1536 (match your embedding model)
  embedding       vector(1536) not null,
  created_at      timestamp default systimestamp not null
);

create sequence mini_chunk_src_seq start with 1000 increment by 1;

--------------------------------------------------------------------------------
-- 3) Helper: embedding params + convenience procedure to insert chunks
--------------------------------------------------------------------------------
prompt =========================================================
prompt 2) Setup embedding params + insert helper
prompt =========================================================

-- IMPORTANT: Update this JSON for environment.
--  - {"provider":"database","model":"MINI_MANUAL_EMBED_1536"}
--  - {"provider":"oci","credential_name":"OCI_CRED","endpoint":"...","model":"..."}
--  - {"provider":"openai","credential_name":"...","model":"text-embedding-3-large"}
--
-- Keep the dimension consistent with VECTOR(1536).
var EMBED_PARAMS clob
begin
  :EMBED_PARAMS := '{
    "provider": "database",
    "model": "MINI_MANUAL_EMBED_1536"
  }';
end;
/

create or replace procedure add_manual_chunk(
  p_source_id    in number,
  p_doc_title    in varchar2,
  p_section_path in varchar2,
  p_page_hint    in varchar2,
  p_chunk_text   in clob
) as
  v_emb vector(1536);
begin
  -- Generate embedding in-database
  v_emb := dbms_vector.utl_to_embedding(
             p_chunk_text,
             json(:EMBED_PARAMS)
           );

  insert into mini_manual_chunks(source_id, doc_title, section_path, page_hint, chunk_text, embedding)
  values (p_source_id, p_doc_title, p_section_path, p_page_hint, p_chunk_text, v_emb);
end;
/
show errors

--------------------------------------------------------------------------------
-- 4) Insert realistic sample manual chunks (MINI Cooper SE focused)
--    The ones I chose are short, “chunkable” blocks like you'd create after parsing the PDF.
--------------------------------------------------------------------------------
prompt =========================================================
prompt 3) Insert realistic MINI Cooper SE manual-style sample chunks
prompt =========================================================

declare
  v_src number := mini_chunk_src_seq.nextval;
begin
  add_manual_chunk(
    v_src,
    '2024 MINI Cooper SE Owner''s Manual' 
    'Charging > Overview',
    'p. 110',
    q'[
Charging your vehicle: You can charge using AC charging (Level 1/Level 2) or DC fast charging when equipped and when supported by the charging station.
Always follow the instructions on the charging station and ensure the connector is fully seated before leaving the vehicle.
If charging does not start, verify the vehicle is unlocked (if required), the connector is latched, and the station is authorized.
]'
  );

  add_manual_chunk(
    v_src,
    '2024 MINI Cooper SE Owner''s Manual',
    'Charging > AC charging > Tips',
    'p. 114',
    q'[
AC charging: Use a compatible charging cable and confirm the outlet or EVSE is operating properly.
If the charging cable has status indicators, check for fault lights. Avoid using extension cords unless explicitly permitted by the EVSE manufacturer.
Charging may be reduced or interrupted to protect the battery under extreme temperatures.
]'
  );

  add_manual_chunk(
    v_src,
    '2024 MINI Cooper SE Owner''s Manual',
    'Charging > DC fast charging > Troubleshooting',
    'p. 120',
    q'[
DC fast charging: If the charging session starts and then stops, try re-initiating the session and confirm the connector is fully engaged.
Some stations require re-authorization if the session is interrupted.
If repeated attempts fail, try a different stall or station. Inspect the connector for debris and ensure the charge port area is clear.
]'
  );

  add_manual_chunk(
    v_src,
    '2024 MINI Cooper SE Owner''s Manual',
    'Charging > Charge port & connector',
    'p. 112',
    q'[
Charge port: Keep the charge port and connector clean and dry. Do not insert objects into the charge port.
If the connector does not release, confirm the vehicle is unlocked and follow the recommended release procedure.
Do not force the connector; damage may occur.
]'
  );

  add_manual_chunk(
    v_src,
    '2024 MINI Cooper SE Owner''s Manual',
    'Driver assistance > Lighting > Headlights',
    'p. 68',
    q'[
Automatic headlight control: When set to the default position, the vehicle controls exterior lights based on ambient light conditions.
Daytime running lights may be active during daylight. In low-light conditions, the system turns on headlights and tail lamps automatically.
Driver remains responsible for using appropriate lighting in poor visibility.
]'
  );

  add_manual_chunk(
    v_src,
    '2024 MINI Cooper SE Owner''s Manual',
    'Infotainment > Mobile app > Charging status',
    'p. 142',
    q'[
Charging status in the mobile app: The app may show charging state, estimated completion time, and notifications.
If the app shows an error but the vehicle is charging, refresh the status or verify connectivity.
For persistent issues, confirm the vehicle services are active and the phone has network access.
]'
  );

  commit;
end;
/
prompt Insert complete.

--------------------------------------------------------------------------------
-- 5) Create HYBRID indexes:
--    a) Oracle Text index on chunk_text for keyword search
--    b) Vector index (HNSW) on embedding for semantic search using cosine distance
--------------------------------------------------------------------------------
prompt =========================================================
prompt 4) Create indexes for HYBRID search (Text + Vector)
prompt =========================================================

-- Oracle Text (keyword / BM25-style scoring with CONTAINS)
-- Note: CTXSYS must be installed/configured 
create index mini_manual_text_idx
on mini_manual_chunks(chunk_text)
indextype is ctxsys.context;

-- Vector index (HNSW) for cosine similarity
create vector index mini_manual_vec_idx
on mini_manual_chunks(embedding)
organization inmemory neighbor graph
distance cosine
with target accuracy 90;

--------------------------------------------------------------------------------
-- 6) Queries
--------------------------------------------------------------------------------
prompt =========================================================
prompt 5) DEMO: Pure keyword search (Oracle Text)
prompt =========================================================

-- Keyword query example: "connector does not release"
-- Score is available via SCORE(1)
select
  chunk_id,
  section_path,
  page_hint,
  score(1) as text_score,
  substr(chunk_text, 1, 160) as preview
from mini_manual_chunks
where contains(chunk_text, 'connector AND release', 1) > 0
order by text_score desc
fetch first 5 rows only;

prompt =========================================================
prompt 6) DEMO: Pure vector similarity search (semantic)
prompt =========================================================

-- Create a query embedding from a user question (in-db)
var Q_TEXT clob
var QVEC  vector(1536)

begin
  :Q_TEXT := 'My fast charging starts then stops after a few seconds. What should I check?';
  :QVEC   := dbms_vector.utl_to_embedding(:Q_TEXT, json(:EMBED_PARAMS));
end;
/

select
  chunk_id,
  section_path,
  page_hint,
  vector_distance(embedding, :QVEC, cosine) as dist,
  substr(chunk_text, 1, 160) as preview
from mini_manual_chunks
order by dist
fetch first 5 rows only;

prompt =========================================================
prompt 7) DEMO: HYBRID search (keyword + vector)
prompt =========================================================

-- Hybrid strategy:
--  1) Use keyword query to enforce topical relevance (CONTAINS)
--  2) Use vector distance to order by semantic similarity within that set

var KQ varchar2(4000)
begin
  :KQ := 'charging AND (fast OR DC OR station)';
end;
/

select
  chunk_id,
  section_path,
  page_hint,
  score(1) as text_score,
  vector_distance(embedding, :QVEC, cosine) as dist,
  substr(chunk_text, 1, 160) as preview
from mini_manual_chunks
where contains(chunk_text, :KQ, 1) > 0
order by dist
fetch first 5 rows only;

prompt =========================================================
prompt 8) DEMO: “Simplify the manual” output (a friendly answer-shaped result)
prompt =========================================================

-- In a real app you’d pass the top chunk(s) to your summarizer / RAG layer.
-- For demo, we return the best chunk as “what to do next”.

select
  'Suggested section: ' || section_path || ' (' || page_hint || ')' as suggestion,
  chunk_text as recommended_text
from (
  select section_path, page_hint, chunk_text,
         vector_distance(embedding, :QVEC, cosine) as dist
  from mini_manual_chunks
  where contains(chunk_text, :KQ, 1) > 0
  order by dist
)
fetch first 1 row only;

prompt =========================================================
prompt DONE.
prompt Next step: replace sample chunks with your parsed manual chunks.
prompt =========================================================

It took me a while to get this to work the way I wanted, had to view several examples, push some of the code through AI and I’m still fiddling with it, but it’s a start to making it easier for me to search the manual with simple text vs. keywords.

So, Does this Solve My “Simplify the Whole Manual” Goal?

I’m parsing the manual PDF into chunks (200–800 tokens-ish each), each with:

  1. section_path (table of contents path)
  2. page_hint (page or location marker)
  3. chunk_text (the actual chunk)
  4. Then generating embeddings (you can do it in the database, like the script, or in your app pipeline)
  5. Hybrid search:
    • Oracle Text: catches exact terms, model numbers, “DRL”, “CCS”, “release procedure”
    • Vector search: catches meaning (“starts then stops”, “interrupts after a few seconds”, “session fails”)
  6. The output:
    • Retrieve top 3–10 chunks
    • Summarize into “steps to try” + link to section/page

Example Query and Result Set

SQL> select
2 'Suggested section: ' || section_path || ' (' || page_hint || ')' as suggestion,
3 chunk_text as recommended_text
4 from (
5 select section_path, page_hint, chunk_text,
6 vector_distance(embedding, :QVEC, cosine) as dist
7 from mini_manual_chunks
8 where contains(chunk_text, :KQ, 1) > 0
9 order by dist
10 )
11 fetch first 1 row only;
SUGGESTION
Suggested section: Charging > DC fast charging > Troubleshooting (p. 120)


RECOMMENDED_TEXT

DC fast charging: If the charging session starts and then stops, try re-initiating the session and confirm the connector is fully engaged. Some stations require re-authorization if the session is interrupted. If repeated attempts fail, try a different stall or station. Inspect the connector for debris and ensure the charge port area is clear.

Why Oracle for this specific use case of manual simplification?

  • You get one system that can store: chunk text, metadata, vectors, and run both keyword + semantic search in SQL (less glue, fewer moving parts).
  • You can keep security and access control in the database (who can search which manuals / editions) if there was proprietary information which shouldn’t be assessable to the public.
  • You can run this close to your data and integrate it with your existing app patterns (APIs, SQL, PL/SQL), and scale it with indexes designed for the job (Oracle Text + HNSW).

I’ll revisit this as I keep coming up to speed with my new car.  I really knew my 2015 Mini Cooper JCW all the way down to the bolts and I’d like to understand my new 2024 Mini Cooper SE Resolute as well.  I’m hoping this will be a way to speed up that knowledge transfer.

http://about.me/dbakevlar