Oracle AI Vector Search Professional
1. When generating vector embeddings outside the database, what is the most suitable option
for storing the embeddings for later use?
In a CSV file.
In a binary FVEC file with the relational data in a CSV file.
In the database as BLOB (Binary Large Object) data.
In a dedicated vector database.
2. When generating vector embeddings for a new dataset outside of Oracle Database 23ai,
which factor is crucial to ensure meaningful similarity search results?
The choice of programming language used to process the dataset (for example, Python,
Java).
3. You are working with vector search in Oracle Database 23ai and need to ensure the integrity
of your vector data during storage and retrieval. Which factor is crucial for maintaining the
accuracy and reliability of your vector search results?
Using the same embedding model for both vector creation and similarity search.
Regularly updating vector embeddings to reflect changes in the source data.
The specific distance algorithm employed for vector comparisons.
The physical storage location of the vector data.
4. Which DDL operation is NOT permitted on a table containing a VECTOR column in Oracle
Database 23ai?
Creating a new table using CTAS (CREATE TABLE AS SELECT) that includes the VECTOR column
from the original table.
Dropping an existing VECTOR column from the table.
Modifying the data type of an existing VECTOR column to a non-VECTOR type.
Adding a new VECTOR column to the table.
5. Which SQL statement correctly adds a VECTOR column named v with 4 dimensions and
FLOAT32 format to an existing table named my_table?
Add the TARGET ACCURACY clause to the query with a higher value for the accuracy.
Change the index type to HNSW for better accuracy.
Increase the VECTOR_MEMORY_SIZE initialization parameter.
Re-create the index with a higher EFCONSTRUCTION value.
7. What happens when querying with an IVF index if you increase the value of the NEIGHBOR
PARTITION PROBES parameter?
The number of centroids decreases.
Accuracy decreases.
Index creation time is reduced.
More partitions are probed, improving accuracy, but also increasing query latency.
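The trade-off in question 7 can be illustrated with a toy sketch. It assumes a tiny, hand-built set of partitions (the `partitions` list and `ivf_search` function are hypothetical, not Oracle's implementation) and shows how probing more partitions scans more vectors and can find a closer neighbor.

```python
import math

# Hypothetical pre-built IVF partitions: each has a centroid and its vectors.
partitions = [
    {"centroid": (0.0, 0.0), "vectors": [(0.1, 0.2), (1.0, 0.0)]},
    {"centroid": (4.0, 0.0), "vectors": [(2.1, 0.0), (4.2, 0.1)]},
    {"centroid": (9.0, 9.0), "vectors": [(9.1, 8.8)]},
]

def ivf_search(query, probes, top_k=1):
    """Scan only the `probes` partitions whose centroids are closest to the query."""
    nearest = sorted(partitions, key=lambda p: math.dist(query, p["centroid"]))[:probes]
    scanned = [v for p in nearest for v in p["vectors"]]
    best = sorted(scanned, key=lambda v: math.dist(query, v))[:top_k]
    return best, len(scanned)

query = (1.9, 0.0)  # sits near the boundary between the first two partitions
one_probe = ivf_search(query, probes=1)
two_probes = ivf_search(query, probes=2)
print(one_probe)   # misses the true nearest neighbor, scans 2 vectors
print(two_probes)  # finds (2.1, 0.0), but scans 4 vectors
```

With one probe the search is cheaper but returns a farther vector; with two probes it does more work and returns the true nearest neighbor.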
8. Which PL/SQL package is primarily used for interacting with Generative AI services in Oracle
Database 23ai?
DBMS_AI.
DBMS_ML.
DBMS_VECTOR_CHAIN.
DBMS_GENAI.
9. Which SQL function is used to create a vector embedding for a given text string in Oracle
Database 23ai?
GENERATE_EMBEDDING.
CREATE_VECTOR_EMBEDDING.
EMBED_TEXT.
VECTOR_EMBEDDING.
10. Which PL/SQL function converts documents such as PDF, DOC, JSON, XML, or HTML to plain
text?
DBMS_VECTOR.TEXT_TO_PLAIN.
DBMS_VECTOR_CHAIN.UTL_TO_TEXT.
DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS.
DBMS_VECTOR.CONVERT_TO_TEXT.
11. What is the primary purpose of the DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS function in a
RAG application?
12. What is the first step in setting up the practice environment for Select AI?
13. How is the security interaction between Autonomous Database and OCI Generative AI
managed in the context of Select AI?
14. You are storing 1,000 embeddings in a VECTOR column, each with 256 dimensions using
FLOAT32. What is the approximate size of the data on disk?
a) 1 MB.
b) 4 MB.
c) 256 KB.
d) 1 GB.
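The arithmetic behind question 14 can be checked with a short sketch. It counts only the raw payload (4 bytes per FLOAT32 dimension) and ignores any per-vector header or row overhead, which is a simplifying assumption; the function name is hypothetical.

```python
# Raw payload size of FLOAT32 vectors; per-row and header overhead is ignored.
BYTES_PER_FLOAT32 = 4

def raw_vector_bytes(num_vectors: int, dimensions: int) -> int:
    """Return the raw payload size in bytes for a set of FLOAT32 vectors."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

size = raw_vector_bytes(1_000, 256)
print(size)               # 1024000 bytes
print(size / 1_000_000)   # 1.024, i.e. roughly 1 MB
```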
15. Which Oracle Cloud Infrastructure (OCI) service is directly integrated with Select AI?
a) OCI Language.
b) OCI Generative AI.
c) OCI Vision.
d) OCI Data Science.
16. Which is NOT a feature or capability related to AI and Vector Search in Exadata?
a) Native Support for Vector Search Only within the Database Server.
b) Vector Replication with GoldenGate.
c) Loading Vector Data using SQL*Loader.
d) AI Smart Scan.
17. Which statement best describes the core functionality and benefit of Retrieval Augmented
Generation (RAG) in Oracle Database 23ai?
a) It empowers LLMs to interact with private enterprise data stored within the
database, leading to more context-aware and precise responses to user queries.
b) It primarily aims to optimize the performance and efficiency of LLMs by using
advanced data retrieval techniques, thus minimizing response times, and reducing
computational overhead.
c) It allows users to train their own specialized LLMs directly within the Oracle
Database environment using their internal data, thereby reducing the reliance on
external AI providers.
d) It enables Large Language Models (LLMs) to access and process real-time data
streams from diverse sources to generate the most up-to-date insights.
18. If a query vector uses a different distance metric than the one used to create the index, what
happens?
The query fails.
An exact match search is triggered.
The index automatically updates.
A warning is logged, but the query executes.
19. What are the key advantages and considerations of using Retrieval Augmented Generation
(RAG) in the context of Oracle AI Vector Search?
It excels at optimizing the performance and efficiency of LLM inference through advanced
caching and precomputation techniques, leading to faster response times but potentially
higher storage requirements.
It prioritizes real-time data extraction and summarization from various sources to ensure the
LLM always has the most up-to-date information.
It focuses on training specialized LLMs within the database environment for specific tasks,
offering greater control over model behavior and data privacy but potentially requiring more
development effort.
It leverages existing database security and access controls, thereby enabling secure and
controlled access to both the database content and the LLM.
20. Which Python library is used to vectorize text chunks and the user's question in the following
example?
import oracledb

connection = oracledb.connect(user=un, password=pw, dsn=cs)
table_name = 'Page'

with connection.cursor() as cursor:
    # Create the table
    create_table_sql = f"""
        CREATE TABLE IF NOT EXISTS {table_name} (
            id NUMBER PRIMARY KEY,
            payload CLOB CHECK (payload IS JSON),
            vector VECTOR
        )"""
    try:
        cursor.execute(create_table_sql)
    except oracledb.DatabaseError as e:
        raise

connection.autocommit = True

from sentence_transformers import SentenceTransformer
encoder = SentenceTransformer('all-MiniLM-L12-v2')
a) sentence_transformers.
b) oci.
c) oracledb.
d) json.
21. What is the function of the COSINE parameter in the SQL query used to retrieve similar
vectors?
```
topk = 3
sql = f"""select payload, vector_distance(vector, :vector, COSINE) as score
from {table_name} order by score fetch approx first {topk} rows only"""
```
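As a point of reference for question 21, cosine distance is commonly defined as one minus the cosine similarity of the two vectors, so it depends on direction and not on magnitude. The following is a minimal pure-Python sketch of that definition; the function name is hypothetical and this is not the database's implementation.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Same direction, different length: distance is approximately 0.
same_direction = cosine_distance([1.0, 2.0], [2.0, 4.0])
# Perpendicular vectors: distance is 1.
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Ordering rows by this score in ascending order puts the most similar vectors first, which is why the query sorts by `score` before fetching the top rows.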
22. You are tasked with finding the closest matching sentences across books, where each book
has multiple paragraphs and sentences. Which SQL structure should you use?
a. GROUP BY with vector operations.
b. FETCH PARTITIONS BY clause.
c. A nested query with ORDER BY.
d. Exact similarity search with a single query vector.
23. In the following Python code, what is the significance of prepending the source filename to
each text chunk before storing it in the vector database?
```
docs = [{"text": filename + "/" + section, "path": filename}
        for filename, sections in faqs.items()
        for section in sections]

# Sample the resulting data
docs[:2]
```
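To see what the comprehension in question 23 produces, here is a small self-contained run. The `faqs` dictionary below is made-up sample data (file name mapped to a list of sections); the real data in the exercise is not shown in the question.

```python
# Hypothetical FAQ data: file name -> list of text sections.
faqs = {
    "billing.txt": ["How do I update my card?", "When am I charged?"],
    "login.txt": ["How do I reset my password?"],
}

# Each chunk's text carries its source file name as a prefix.
docs = [
    {"text": filename + "/" + section, "path": filename}
    for filename, sections in faqs.items()
    for section in sections
]

for doc in docs[:2]:
    print(doc)
```

Because the file name is part of the text that gets embedded, the source of each chunk becomes part of the context that is embedded and later retrieved.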
24. How does an application use vector similarity search to retrieve relevant information from a
database, and how is this information then integrated into the generation process?
a) Encodes the question and database chunks into vectors, finds the most
similar using cosine similarity, and includes them in the LLM prompt.
b) Trains a separate LLM on the database and uses it to answer, ignoring the
general LLM.
c) Converts the question to keywords, searches for matches, and inserts the
text into the response.
d) Clusters similar text chunks and randomly selects one from the most
relevant cluster.
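The retrieve-then-prompt flow described in question 24 can be sketched in a few lines. This is only an illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, and `chunks` and `build_prompt` are hypothetical names.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical text chunks that would normally live in the database.
chunks = [
    "vector indexes speed up similarity search",
    "data pump exports and imports tables",
    "embedding models turn text into vectors",
]

def build_prompt(question, top_k=2):
    """Rank chunks by similarity to the question and place the best in the prompt."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("how do vector indexes speed up search", top_k=1)
print(prompt)
```

The prompt string would then be sent to the LLM, which generates its answer from the retrieved context.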
25. When using SQL*Loader to load vector data for search applications, what is a critical
consideration regarding the formatting of the vector data within the input CSV file?
26. Which function is used to generate vector embeddings within an Oracle database?
a) DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS.
b) DBMS_VECTOR_CHAIN.UTL_TO_TEXT.
c) DBMS_VECTOR_CHAIN.UTL_TO_EMBEDDINGS.
d) DBMS_VECTOR_CHAIN.UTL_TO_GENERATE_TEXT.
27. Which statement best describes the capability of Oracle Data Pump for handling vector data
in the context of vector search applications?
a) Data Pump can only export and import vector data if the vector embeddings are stored
as BLOB (Binary Large Object) data types in the database.
b) Data Pump treats vector embeddings as regular text strings, which can lead to data
corruption or loss of precision when transferring vector data for vector search.
c) Data Pump provides native support for exporting and importing tables containing
vector data types, facilitating the transfer of vector data for vector search
applications.
d) Because of the complexity of vector data, Data Pump requires a specialized plug-in
to handle the export and import operations involving vector data types.
28. What happens when you attempt to insert a vector with an incorrect number of dimensions
into a VECTOR column with a defined number of dimensions?
29. In Oracle Database 23ai, which data type is used to store vector embeddings for similarity
search?
a) VECTOR2.
b) BLOB.
c) VECTOR.
d) VARCHAR2.
30. What is created to facilitate the use of OCI Generative AI with Autonomous Database?
31. Why would you choose to NOT define a specific size for the VECTOR column during
development?
32. What is the correct order of steps for building a RAG application using PL/SQL in Oracle
Database 23ai?
a. Load ONNX Model, Vectorize Question, Load Document, Split Text into Chunks,
Create Embeddings, Perform Vector Search, Generate Output.
b. Load Document, Split Text into Chunks, Load ONNX Model, Create Embeddings,
Vectorize Question, Perform Vector Search, Generate Output.
c. Vectorize Question, Load ONNX Model, Load Document, Split Text into
Chunks, Create Embeddings, Perform Vector Search, Generate Output.
d. Load Document, Load ONNX Model, Split Text into Chunks, Create Embeddings,
Vectorize Question, Perform Vector Search, Generate Output.
33. What is the primary purpose of a similarity search in Oracle Database 23ai?
34. What is the advantage of using Euclidean Squared Distance rather than Euclidean Distance in
similarity search queries?
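A small check helps with question 34: squaring is monotonic for non-negative values, so ranking by squared Euclidean distance gives the same order as ranking by Euclidean distance while skipping the square root. The sketch below uses made-up sample vectors and hypothetical function names.

```python
import math

def euclidean_squared(a, b):
    """Sum of squared component differences (no square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    """True Euclidean distance: square root of the squared distance."""
    return math.sqrt(euclidean_squared(a, b))

query = [0.0, 0.0]
candidates = {"a": [3.0, 4.0], "b": [1.0, 1.0], "c": [6.0, 8.0]}

by_squared = sorted(candidates, key=lambda k: euclidean_squared(query, candidates[k]))
by_true = sorted(candidates, key=lambda k: euclidean(query, candidates[k]))
print(by_squared, by_true)  # ['b', 'a', 'c'] ['b', 'a', 'c']
```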
35. You need to prioritize accuracy over speed in a similarity search for a dataset of images.
Which should you use?
a. Approximate similarity search with HNSW indexing and target accuracy of 70%.
b. Multivector similarity search with partitioning.
c. Exact similarity search using a full table scan.
d. Approximate similarity search with IVF indexing and target accuracy of 70%.
36. What is the significance of splitting text into chunks in the process of loading data into Oracle
AI Vector Search?
a. To reduce the computational burden on the embedding model.
b. To facilitate parallel processing of the data during vectorization.
c. To minimize token truncation as each vector embedding model has its own
maximum token limit.
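The chunking idea in question 36 can be sketched as follows. The sketch splits on whitespace and treats words as a rough proxy for tokens, which is an assumption; real embedding models count tokens with their own tokenizer, and `split_into_chunks` is a hypothetical helper.

```python
def split_into_chunks(text, max_words):
    """Split text into consecutive chunks of at most `max_words` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

text = "one two three four five six seven eight nine ten"
chunks = split_into_chunks(text, max_words=4)
print(chunks)  # ['one two three four', 'five six seven eight', 'nine ten']
```

Each chunk stays under the chosen limit, so nothing would be cut off when a chunk is passed to a model with that limit.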
37. What is the purpose of the VECTOR_DISTANCE function in Oracle Database 23ai similarity
search?
38. You are tasked with creating a table to store vector embeddings with the following
characteristics: Each vector must have exactly 512 dimensions. The dimensions should be
stored as 32-bit floating point numbers. Which SQL statement should you use?
39. Which function should you use to determine the storage format of a vector?
a. VECTOR_DIMENSION_FORMAT.
b. VECTOR_CHUNKS.
c. VECTOR_NORM.
d. VECTOR_EMBEDDING.
41. You need to generate a vector from the string '[1.2, 3.4]' in FLOAT32 format with 2
dimensions. Which function will you use?
a. TO_VECTOR.
b. VECTOR_DISTANCE.
c. FROM_VECTOR.
d. VECTOR_SERIALIZE.
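As a client-side analogy for question 41, the sketch below parses the same string into a fixed-size array of 32-bit floats using only the Python standard library. It is not the SQL function itself; `parse_float32_vector` is a hypothetical helper that mimics the idea of converting text to a typed vector with a declared dimension count.

```python
import array
import json

def parse_float32_vector(text, dimensions):
    """Parse a '[x, y, ...]' string into an array of 32-bit floats."""
    values = json.loads(text)
    if len(values) != dimensions:
        raise ValueError(f"expected {dimensions} dimensions, got {len(values)}")
    return array.array("f", values)  # typecode 'f' is a C float

vec = parse_float32_vector("[1.2, 3.4]", 2)
print(len(vec), vec.typecode)  # 2 f
```

Note that values such as 1.2 are not exactly representable in 32 bits, so the stored components are close to, but not identical to, the decimal text.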
42. What is the primary purpose of the VECTOR_EMBEDDING function in Oracle Database 23ai?
44. What is the primary function of AI Smart Scan in Exadata System Software 24ai?
45. Which parameter is used to define the number of closest vector candidates considered
during HNSW index creation?
a. EFCONSTRUCTION.
b. VECTOR_MEMORY_SIZE.
c. NEIGHBOURS.
d. TARGET_ACCURACY.
46. You want to quickly retrieve the top-10 matches for a query vector from a dataset of billions
of vectors, prioritizing speed over exact accuracy. What is the best approach?
a. SELECT.
b. UPDATE.
c. DELETE.
d. JOIN ON VECTOR columns.
50. You are asked to fetch the top five vectors nearest to a query vector, but only for a specific
category of documents. Which query structure should you use?
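The shape of the query in question 50 (a relational filter combined with a nearest-neighbor ranking) can be illustrated in plain Python. The `rows` data and `top_k_in_category` function are hypothetical; in the database the same idea would be expressed with a filter predicate plus an ordering by vector distance.

```python
import math

# Hypothetical rows: (doc_id, category, embedding).
rows = [
    (1, "faq", (0.0, 1.0)),
    (2, "blog", (0.1, 0.9)),
    (3, "faq", (1.0, 0.0)),
    (4, "faq", (0.2, 0.8)),
]

def top_k_in_category(query, category, k):
    """Keep only rows in the category, then return the k nearest doc ids."""
    matching = [r for r in rows if r[1] == category]
    matching.sort(key=lambda r: math.dist(query, r[2]))
    return [r[0] for r in matching[:k]]

result = top_k_in_category((0.0, 1.0), "faq", k=2)
print(result)  # [1, 4]
```

The "blog" row is close to the query but is excluded by the category filter before ranking.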
51. What is the primary function of an embedding model in the context of vector search?
52. What is the significance of using local ONNX models for embedding within the database?
53. Which of the following actions will result in an error when using
VECTOR_DIMENSION_COUNT() in Oracle Database 23ai?
a. Providing a vector with a dimensionality that exceeds the specified dimension count.
b. Using a vector with a data type that is not supported by the function.
c. Providing a vector with duplicate values for its components.
d. Calling the function on a vector that has been created with TO_VECTOR().
54. An application needs to fetch the top-3 matching sentences from a dataset of books while
ensuring a balance between speed and accuracy. Which query structure should you use?
55. You are tasked with finding the closest matching sentences across books, where each book
has multiple paragraphs and sentences. Which SQL structure should you use?
56. What is the primary difference between the HNSW and IVF vector indexes in Oracle
Database 23ai?
57. A database administrator wants to change the VECTOR_MEMORY_SIZE parameter for a
pluggable database (PDB) in Oracle Database 23ai. Which SQL command is correct?
58. Which vector index available in Oracle Database 23ai is known for its speed and accuracy,
making it a preferred choice for vector search?
a. Binary Tree (BT) index.
b. Inverted File System (IFS) index.
c. Inverted File System (IFS) index.
d. Hierarchical Navigable Small World (HNSW) index.
59. What is the purpose of the Vector Pool in Oracle Database 23ai?
60. What is the default distance metric used by the VECTOR_DISTANCE function if none is
specified?
a. Euclidean.
b. Hamming.
c. Cosine.
d. Manhattan.
61. In Oracle Database 23ai, which SQL function calculates the distance between two vectors
using the Euclidean metric?
a. L1_DISTANCE.
b. L2_DISTANCE.
c. HAMMING_DISTANCE.
d. COSINE_DISTANCE.
62. What is a key advantage of using GoldenGate 23ai for managing and distributing vector data
for AI applications?
63. What happens when you attempt to insert a vector with an incorrect number of dimensions
into a VECTOR column with a defined number of dimensions?
a. The database pads the vector with zeros to match the defined dimensions.
b. The database ignores the defined dimensions and inserts the vector as is.
c. The database truncates the vector to fit the defined dimensions.
d. The insert operation fails, and an error message is thrown.
64. Which function should you use to determine the storage format of a vector?
a. VECTOR_CHUNKS
b. VECTOR_EMBEDDING
c. VECTOR_NORM
d. VECTOR_DIMENSION_FORMAT
67. What is the first step in setting up the practice environment for Select AI?
a. Optionally create an OCI compartment
b. Create a policy to enable access to OCI Generative AI
c. Drop any compartment that does not use OCI Generative AI
d. Create a new user account with elevated privileges
68. You are tasked with creating a table to store vector embeddings with the following
characteristics: Each vector must have exactly 512 dimensions, and the dimensions should be
stored as 32-bit floating point numbers. Which SQL statement should you use?