HP Vertica 7.1.x GettingStarted
Legal Notices
Warranty
The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be
construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
The information contained herein is subject to change without notice.
Copyright Notice
Copyright 2006 - 2015 Hewlett-Packard Development Company, L.P.
Trademark Notices
Adobe is a trademark of Adobe Systems Incorporated.
Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation.
UNIX is a registered trademark of The Open Group.
Contents
Downloading a VM
Starting the VM
Logging in as dbadmin
Advanced Installation
Querying Data
Uninstalling HP Vertica
Tables
  inventory_fact
  customer_dimension
  date_dimension
  employee_dimension
  product_dimension
  promotion_dimension
  shipping_dimension
  vendor_dimension
  warehouse_dimension
  store_orders_fact
  store_sales_fact
  store_dimension
  online_sales_fact
  call_center_dimension
  online_page_dimension
Sample Scripts
  vmart_query_01.sql
  vmart_query_02.sql
  vmart_query_03.sql
  vmart_query_04.sql
  vmart_query_05.sql
  vmart_query_06.sql
  vmart_query_07.sql
  vmart_query_08.sql
  vmart_query_09.sql
- The HP Vertica Administration Tools (see Running the Administration Tools in this guide for details).
- The HP Vertica Management Console (see Using Management Console in this guide for details).
1 CPU
1024 MB RAM
Bridged Networking
Downloading a VM
The HP Vertica VM is available both as an OVF template (for VMware vSphere 4.0) and as a
VMDK file (for VMware Server 2.0 and VMware Workstation 7.0). Download and install the
appropriate file for your VMware deployment from the myVertica portal at
http://www.vertica.com/documentation (registration required).
Starting the VM
1. Open the appropriate HP Vertica VM image file in VMware. For example, open the VMX file if
you are using VMware Workstation, or the OVF template if you are using VMware vSphere.
2. Navigate to the settings for the VM image and adjust the network settings so that they are
compatible with your VM.
3. Start the VM. For example, in VMware Workstation, select VM > Power > Power On.
The RPM package name that the command returns contains the version and build numbers. If there
is a later version of HP Vertica, download it from the myVertica portal at
https://my.vertica.com/downloads (registration required). Upgrade instructions are provided in the
Installation Guide.
Object owner
dbadmin User
When you create a new database, a single database administrator account, dbadmin, is
automatically created along with a PUBLIC role. The database administrator bypasses all
permission checks and has the authority to perform all database operations, such as bypassing all
GRANT/REVOKE authorizations. The same applies to any user granted the PSEUDOSUPERUSER role.
Note: Although the dbadmin user has the same name as the Linux database administrator
account, do not confuse the concept of a database administrator with a Linux superuser (root)
privilege; they are not the same. A database administrator cannot have Linux superuser
privileges.
Object Owner
An object owner is the user who creates a particular database object; the owner can perform any
operation on that object. By default, only an owner or a database administrator can act on a
database object. In order to allow other users to use an object, the owner or database administrator
must grant privileges to those users using one of the GRANT statements. Object owners are
PUBLIC users for objects that other users own.
PUBLIC User
All users who are neither database administrators nor object owners are PUBLIC users. Newly
created users do not have access to schema PUBLIC by default. Make sure to GRANT USAGE ON
SCHEMA PUBLIC to all users you create.
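The following Python sketch illustrates the advice about schema PUBLIC by building the GRANT statement for each newly created user. The helper name grant_public_usage and the user names are hypothetical, and the identifier check is a deliberately conservative assumption rather than HP Vertica's full identifier rules.

```python
import re

# Conservative identifier check: an assumption for this sketch only.
# HP Vertica accepts a wider range of identifiers than this pattern.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def grant_public_usage(user: str) -> str:
    """Return the statement that gives `user` access to schema PUBLIC."""
    if _IDENTIFIER.fullmatch(user) is None:
        raise ValueError(f"unsupported user name: {user!r}")
    return f"GRANT USAGE ON SCHEMA PUBLIC TO {user};"


if __name__ == "__main__":
    # Hypothetical user names, for illustration only.
    for name in ("alice", "report_reader"):
        print(grant_public_usage(name))
```

You could paste the printed statements into a vsql session while logged in as the database administrator.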
Logging in as dbadmin
The first time you boot the VM you are automatically logged in and a web page displays further
instructions. To log back into the VM, use the following username and password.
Username: dbadmin
Password: password
Important: The dbadmin user has sudo privileges. Be sure to change the dbadmin and root
passwords with the Linux passwd command.
Tab
Up/Down Arrow
Space
Character
store
online_Sales
Description
vmart_count_data.sql
vmart_define_schema.sql
vmart_gen.cpp
vmart_gen
Data generator executable file.
vmart_load_data.sql
vmart_queries.sql
vmart_query_##.sql
vmart_schema_drop.sql
For more information about the schema, tables, and queries included with the VMart example
database, see the Appendix.
A quick installation that lets you create the example database and start using it immediately.
See Quick Installation Using a Script in this guide for details. Use this method to bypass the
schema and table creation processes and start querying immediately.
Note: Both installation methods create a database named VMart. If you try both installation
methods, you will either need to drop the VMart database you created (see Restoring the
Status of Your Host in this guide) or create the subsequent database with a new name.
However, Hewlett-Packard strongly recommends that you start only one example database at
a time to avoid unpredictable results.
This tutorial uses HP Vertica-provided queries, but you can follow the same set of procedures
later, when you create your own design and use your own queries file.
After you install the VMart database, the database is already running. Connect to it using the steps
in Step 3: Connecting to the Database.
The example database log files, ExampleInstall.txt and ExampleDelete.txt, are written to
/opt/vertica/examples/log.
To start using your database, continue to Connecting to the Database in this guide. To drop the
example database, see Restoring the Status of Your Host in this guide.
Advanced Installation
To perform an advanced-but-simple installation, set up the VMart example database environment
and then create the database using the Administration Tools or Management Console.
Note: If you installed the VMart database using the quick installation method, you cannot
complete the following steps because the database has already been created.
To try the advanced installation, drop the example database (see Restoring the Status of Your
Host in this guide) and perform the advanced installation, or create a new example database
with a different name. However, Hewlett-Packard strongly recommends that you install only
one example database at a time to avoid unpredictable results.
The advanced installation requires the following steps:
Do not change directories while following this tutorial. Some steps depend on being in a
specific directory.
4. Run the sample data generator.
$ ./vmart_gen
5. Let the program run with the default parameters, which you can review in the README file.
Using default parameters
datadirectory = ./
numfiles = 1
seed = 2
null = ' '
timefile = Time.txt
numfactsalesrows = 5000000
numfactorderrows = 300000
numprodkeys = 60000
numstorekeys = 250
numpromokeys = 1000
numvendkeys = 50
numcustkeys = 50000
numempkeys = 10000
numwarehousekeys = 100
numshippingkeys = 100
numonlinepagekeys = 1000
numcallcenterkeys = 200
numfactonlinesalesrows = 5000000
numinventoryfactrows = 300000
gen_load_script = false
Data Generated successfully !
6. If the vmart_gen executable does not work correctly, recompile it as follows, and run the
sample data generator script again.
There is no need for a database administrator password in this tutorial. When you create a
production database, however, always specify an administrator password. Otherwise, the
database is permanently set to trust authentication (no passwords).
6. Select the hosts you want to include from your HP Vertica cluster and click OK.
This example creates the database on a one-host cluster. Hewlett-Packard recommends a
minimum of three hosts in the cluster. If you are using the HP Vertica Community Edition, you
are limited to three nodes.
7. Click OK to select the default paths for the data and catalog directories.
Catalog and data paths must contain only alphanumeric characters and cannot have leading
space characters. Failure to comply with these restrictions could result in database creation
failure.
When you create a production database, you'll likely specify locations other than the default.
See Prepare Disk Storage Locations in the Administrator's Guide for more information.
8. Since this tutorial uses a one-host cluster, a K-safety warning appears. Click OK.
During database creation, HP Vertica automatically creates a set of node definitions based on
the database name and the names of the hosts you selected and returns a success message.
10. Click OK to close the Database VMart created successfully message.
3. Click to select the appropriate existing cluster and click Create Database.
4. Follow the on-screen wizard, which prompts you to provide the following information:
   - Database name, which must be between 3 and 25 characters, starting with a letter, and
     followed by any combination of letters, numbers, or underscores.
   - (Optional) database administrator password for the database you want to create and
     connect to.
   - IP address of a node in your database cluster, typically the IP address of the administration
     host.
5. Click Next.
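The database name rule in step 4 can be checked before you start the wizard. The sketch below is a minimal illustration, assuming the length rule means 3 to 25 characters in total; the function name is_valid_db_name is hypothetical and is not part of any HP Vertica tool.

```python
import re

# One leading letter followed by 2 to 24 letters, digits, or underscores,
# which gives a total length of 3 to 25 characters (assumed reading of the rule).
_DB_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,24}")


def is_valid_db_name(name: str) -> bool:
    """Return True if `name` satisfies the wizard's database name rule."""
    return _DB_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("VMart", "ab", "1mart", "my-db"):
        print(candidate, is_valid_db_name(candidate))
```

Of these candidates, only VMart passes: "ab" is too short, "1mart" does not start with a letter, and "my-db" contains a hyphen.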
To configure and load data into the VMart database, complete the following steps:
If you installed the VMart database using the Quick Installation method, the schema, tables,
and data are already defined. You can choose to drop the example database (see Restoring the
Status of Your Host in this guide) and perform the Advanced Installation, or continue straight to
Querying Your Data in this guide.
Defines two schemas in the VMart database schema: online_sales and store.
VMart=> \i vmart_load_data.sql
Rows Loaded
------------
1826
(1 row)
Rows Loaded
------------
60000
(1 row)
Rows Loaded
------------
250
(1 row)
Rows Loaded
------------
1000
(1 row)
Rows Loaded
------------
50
(1 row)
Rows Loaded
------------
50000
(1 row)
Rows Loaded
------------
10000
(1 row)
Rows Loaded
------------
100
(1 row)
Rows Loaded
------------
100
(1 row)
Rows Loaded
------------
1000
(1 row)
Rows Loaded
------------
200
(1 row)
Rows Loaded
------------
5000000
(1 row)
Rows Loaded
------------
300000
(1 row)
VMart=>
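If you capture the load output in a file, a short script can confirm that every table received rows. The following Python sketch extracts each "Rows Loaded" count from text shaped like the output above. The function name rows_loaded and the sample text are illustrative assumptions; vsql itself does not provide this helper.

```python
import re
from typing import List

# A run of dashes followed (on the same or the next line) by the row count.
_COUNT = re.compile(r"-{3,}\s*(\d+)")


def rows_loaded(output: str) -> List[int]:
    """Return the row counts reported in vsql 'Rows Loaded' output."""
    return [int(count) for count in _COUNT.findall(output)]


if __name__ == "__main__":
    sample = (
        "Rows Loaded\n------------\n1826\n(1 row)\n"
        "Rows Loaded\n------------\n60000\n(1 row)\n"
    )
    counts = rows_loaded(sample)
    print(counts)       # [1826, 60000]
    print(sum(counts))  # 61826
```

With the default generator parameters, you would expect counts such as 60000 (numprodkeys) and 5000000 (numfactsalesrows) to appear in the list.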
Querying Data
The VMart database installs with sample scripts that contain SQL commands that represent
queries that might be used in a real business. Use basic SQL commands to query the database, or
try out the following command. Once you're comfortable running the example queries, you might
want to write your own.
Note: The data that your queries return might differ from the example output shown in this
guide because the sample data generator produces random data.
Type the following SQL command to return the values for the five products with the lowest fat
content in the Dairy department. The command selects the fat content of Dairy department products
from the product_dimension table in the public schema, orders the values from low to high, and
limits the output to the first five (the five lowest fat contents).
VMart=> SELECT fat_content
        FROM ( SELECT DISTINCT fat_content
               FROM product_dimension
               WHERE department_description
               IN ('Dairy') ) AS food
        ORDER BY fat_content
        LIMIT 5;
The preceding example is from the vmart_query_01.sql file. You can execute more sample
queries using the scripts that installed with the VMart database or write your own. For a list of the
sample queries supplied with HP Vertica, see the Appendix.
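To see how the FROM clause subquery behaves without a running HP Vertica database, you can try the same statement against a toy table. The sketch below uses Python's built-in sqlite3 module as a stand-in engine. SQLite is not HP Vertica, the three-column table is a simplified assumption, and the rows are made up for illustration.

```python
import sqlite3

# The query from this section, unchanged apart from whitespace.
QUERY = """
SELECT fat_content
FROM ( SELECT DISTINCT fat_content
       FROM product_dimension
       WHERE department_description IN ('Dairy') ) AS food
ORDER BY fat_content
LIMIT 5;
"""


def lowest_fat_contents(rows):
    """Load `rows` into an in-memory table and run the sample query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_dimension ("
        "product_key INTEGER, department_description TEXT, fat_content INTEGER)"
    )
    conn.executemany("INSERT INTO product_dimension VALUES (?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    # Made-up rows: duplicates and a non-Dairy product are included on purpose.
    toy_rows = [
        (1, "Dairy", 84), (2, "Dairy", 80), (3, "Dairy", 80), (4, "Bakery", 10),
        (5, "Dairy", 82), (6, "Dairy", 81), (7, "Dairy", 83), (8, "Dairy", 90),
    ]
    print(lowest_fat_contents(toy_rows))  # [80, 81, 82, 83, 84]
```

The inner SELECT DISTINCT removes the duplicate value 80 and ignores the Bakery row, so the outer query sorts and limits distinct Dairy values only.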
Note: Creating a database backup on a different cluster does not provide disaster recovery.
The cloned database you create with vbr.py is entirely separate from the original, and is not
kept synchronized with the database from which it is cloned.
If the epoch in the latest snapshot is earlier than the current ancient history mark (AHM).
Before and after you add, remove, or replace nodes in your database cluster.
Note: When you restore a database snapshot, you must restore to a cluster that is identical to
the one where you created the snapshot. For this reason, always create a new snapshot after
adding, removing, or replacing nodes.
Ideally, create regular backups of your full database. You can run the HP Vertica vbr.py utility from
a cron job or other task scheduler.
The script prompts you to answer the following questions regarding the configuration file. Press
Enter to accept the default value in parentheses. See VBR Configuration File Reference in the
Administrator's Guide for information about specific questions.
Snapshot name (backup_snapshot): Example_backup
Backup vertica configurations? (n) [y/n]: y
Number of restore points (1): 1
Specify objects (no default):
Vertica user name (dbadmin): dbadmin
Save password to avoid runtime prompt? (n) [y/n]: y
Password to save in vbr config file (no default): password
Node v_vmart_node0001
Backup host name (no default): localhost
Backup directory (no default): /home/dbadmin
Config file name (backup_snapshot.ini): exampleBackup.ini
Change advanced settings? (n) [y/n]: n
Saved vbr configuration to exampleBackup.ini.
After you answer the required questions, vbr.py generates a configuration file with the information
you supplied. Use the Config file name you specified when you run the --task backup or other
commands. The vbr.py utility uses the configuration file contents for both backup and restore
tasks.
The backup location host has sufficient disk space to store the snapshots.
The user who starts the utility has write access to the target directories on the host backup
location.
Run the vbr.py script from a terminal using the database administrator account from an initiator
node in your database cluster. You cannot run the utility as root.
Use the --task backup and --config-file filename directives as shown in this example.
$ vbr.py --task backup --config-file exampleBackup.ini
Copying
[===============================================] 100%
All child processes terminated successfully.
Committing changes on all backup sites
backup done!
If the utility does not find a configuration file at this location, it fails with an error and exits.
The first time you run the vbr.py utility, it performs a full backup; subsequent runs with the same
configuration file create an incremental snapshot. When creating incremental snapshots, the utility
copies new storage containers, which can include data that existed the last time you backed up the
database, along with new and changed data since then. By default, vbr.py saves one archive
backup, unless you set the restorePointLimit parameter value in the configuration file to a value
greater than 1.
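If you schedule backups from cron or another task scheduler, a small wrapper keeps the command line in one place. The sketch below only assembles and runs the command shown in this section. The function names backup_command and run_backup are hypothetical, and the script assumes that vbr.py is on the PATH of the database administrator account (not root).

```python
import subprocess
from typing import List


def backup_command(config_file: str) -> List[str]:
    """Build the vbr.py backup command for the given configuration file."""
    return ["vbr.py", "--task", "backup", "--config-file", config_file]


def run_backup(config_file: str) -> int:
    """Run the backup and return the exit status reported by the utility."""
    return subprocess.run(backup_command(config_file)).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(backup_command("exampleBackup.ini")))
```

Because subsequent runs with the same configuration file create incremental snapshots, a scheduled job can call run_backup with one fixed file name.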
The backup directory exists and contains the snapshots from which to restore.
The cluster to which you are restoring the backup has the same number of hosts as the one used
to create the snapshot; the node names and the IP addresses must also be identical.
The database you are restoring already exists on the cluster to which you are restoring data; the
database can be completely empty, without any data or schema. As long as the database name
matches the name in the snapshot, and all of the node names match the names of the nodes,
you can restore to it.
To begin a full database snapshot restore, log in using the database administrator's account. You
cannot run the utility as root.
To restore the most recent snapshot, run vbr.py with --task restore and the configuration file
that was used to create the snapshot.
$ vbr.py --task restore --config-file exampleBackup.ini
Copying...
[==================================================] 100%
All child processes terminated successfully.
restore done!
You can restore a snapshot only to the database from which it was taken. You cannot restore a
snapshot into an empty database.
Analyzes your logical schema, sample data, and, optionally, your sample queries.
Creates a physical schema design (a set of projections) that can be deployed automatically or
manually.
Can be run and rerun any time for additional optimization without stopping the database.
Use Database Designer to create a comprehensive design, which allows you to create new
projections for all tables in your database.
You can also use Database Designer to create an incremental design, which creates projections for
all tables referenced in the queries you supply. For more information, see Incremental Design in the
Administrator's Guide.
You can create a comprehensive design with Database Designer using Management Console or
through Administration Tools. You can also choose to run Database Designer programmatically
(See About Running Database Designer Programmatically).
This section shows you how to:
Note: To run Database Designer outside Administration Tools, you must be a dbadmin user. If
you are not a dbadmin user, you must have the DBDUSER role assigned to you and own the
tables for which you are designing projections.
You can choose to create the design manually or use the wizard. To create a design manually, see
Creating a Design Manually in the Administrator's Guide.
Set your browser so that it does not cache pages. If a browser caches pages, you may not be able
to see the new design added.
Follow these steps to use the wizard to create the comprehensive design in Management Console:
1. Log in to Management Console.
2. Verify that your database is up and running.
3. Choose the database for which you want to create the design. You can find the database under
the Recent Databases section or by clicking the Databases and Clusters page.
The database overview page opens:
9. Select the schemas. Because the VMart design is a multi-schema database, select all three
schemas (public, store, and online_sales) for your design in the Select Sample Data window.
Click Next.
If you include a schema that contains tables without data, the design could be suboptimal. You
can choose to continue, but HP Vertica recommends that you deselect the schemas that
contain empty tables before you proceed.
10. Choose the K-safety value for your design. The K-safety value determines the number of buddy
projections you want Database Designer to create.
11. Submit query files to Database Designer in one of two ways:
a. Supply your own query files by selecting the Browse button.
b. Click Use Query Repository, which submits recently executed queries from the
QUERY_REQUESTS system table.
Click Next.
12. In the Execution Options window, select all the options you want. You can select all three
options or fewer.
Analyze statistics: Select this option to run statistics automatically after design deploy to
help Database Designer make more optimal decisions for its proposed design.
Auto-build: Select this option to run Database Designer as soon as you complete the
wizard. This option only builds the proposed design.
Auto-deploy: Select this option for auto-build designs that you want to deploy automatically.
If you chose to automatically deploy your design, Database Designer executes in the
background.
If you did not select the Auto-build or Auto-deploy options, you can click Build Design or
Deploy Design on the Database Designer page.
When the deployment completes, the My Design pane shows Design Deployed.
The event history window shows the details of the design build and deployment.
To run Database Designer with Administration Tools, see Run Database Designer with
Administration Tools in this guide.
4. From the Configuration Menu, click Run Database Designer and then click OK.
5. When the Select a database for design dialog box opens, select VMart and then click OK.
If you are prompted to enter the password for the database, click OK to bypass the message.
Because no password was assigned when you installed the VMart database, you do not need
to enter one now.
6. Click OK to accept the default directory for storing Database Designer output and log files.
7. In the Database Designer window, enter a name for the design, for example, vmart_design,
and click OK. Design names can contain only alphanumeric characters or underscores. No
other special characters are allowed.
8. Create a complete initial design. In the Design Type window, click Comprehensive and click
OK.
9. Select the schemas. Because the VMart design is a multi-schema database, you can select all
three schemas (online_sales, public, and store) for your design. Click OK.
If you include a schema that contains tables without data, the Administration Tools notifies you
that designing for tables without data could be suboptimal. You can choose to continue, but
Hewlett-Packard recommends that you deselect the schemas that contain empty tables before
you proceed.
10. In the Design Options window, accept all three options and click OK.
Optimize with queries: Supplying the Database Designer with queries is especially
important if you want to optimize the database design for query performance. Hewlett-Packard
recommends that you limit the design input to 100 queries.
Update statistics: Accurate statistics help the Database Designer choose the best strategy
for data compression. If you select this option, the database statistics are updated to
maximize design quality.
Deploy design: The new design deploys automatically. During deployment, new projections
are added, some existing projections retained, and any necessary existing projections
removed. Any new projections are refreshed to populate them with data.
11. Because you selected the Optimize with queries option, you must enter the full path to the
file containing the queries that will be run on your database. In this example, it is:
/opt/vertica/examples/VMart_Schema/vmart_queries.sql
The queries in the query file must be delimited with a semicolon (;).
12. Choose the K-safety value you want and click OK. The design K-safety determines the
number of buddy projections you want Database Designer to create.
If you create a comprehensive design on a single node, you are not prompted to enter a K-safety
value.
13. In the Optimization Objective window, select Balanced query/load performance to create
a design that is balanced between database size and query performance. Click OK.
Loads queries from the query file you provided (in this example,
/opt/vertica/examples/VMart_Schema/vmart_queries.sql).
Deploys the design or saves a SQL file containing the commands to create the design, based
on your selections in the Design Options window.
Depending on system resources, the design process could take several minutes. You should
allow this process to complete uninterrupted. If you must cancel the session, use Ctrl+C.
15. When Database Designer finishes, press Enter to return to the Administration Tools menu.
Examine the steps taken to create the design. The files are in the directory you specified to
store the output and log files. In this example, that directory is
/opt/vertica/examples/VMart_Schema. For more information about the script files, see
Uninstalling HP Vertica
Perform the steps in Uninstalling HP Vertica in the Installation Guide.
Optional Steps
You can also choose to:
public
store
online_sales
Each schema contains tables that are created and loaded during database installation. See the
schema maps for a list of tables and their contents:
The VMart database installs with sample scripts that contain SQL commands that represent
queries that might be used in a real business. The sample scripts are available in the Sample
Scripts section of this Appendix. Once you're comfortable running the example queries, you might
want to write your own.
Tables
The three schemas in the VMart database include the following tables:
public Schema | store Schema | online_sales Schema
inventory_fact | store_orders_fact | online_sales_fact
customer_dimension | store_sales_fact | call_center_dimension
date_dimension | store_dimension | online_page_dimension
employee_dimension | |
product_dimension | |
promotion_dimension | |
shipping_dimension | |
vendor_dimension | |
warehouse_dimension | |
inventory_fact
This table contains information about each product in inventory.
Column Name | Data Type | NULLs
date_key | INTEGER | No
product_key | INTEGER | No
product_version | INTEGER | No
warehouse_key | INTEGER | No
qty_in_stock | INTEGER | No
customer_dimension
This table contains information about all the retail chain's customers.
Column Name | Data Type | NULLs
customer_key | INTEGER | No
customer_type | VARCHAR(16) | Yes
customer_name | VARCHAR(256) | Yes
customer_gender | VARCHAR(8) | Yes
title | VARCHAR(8) | Yes
household_id | INTEGER | Yes
customer_address | VARCHAR(256) | Yes
customer_city | VARCHAR(64) | Yes
customer_state | CHAR(2) | Yes
customer_region | VARCHAR(64) | Yes
marital_status | VARCHAR(32) | Yes
customer_age | INTEGER | Yes
number_of_children | INTEGER | Yes
annual_income | INTEGER | Yes
occupation | VARCHAR(64) | Yes
largest_bill_amount | INTEGER | Yes
store_membership_card | INTEGER | Yes
customer_since | DATE | Yes
deal_stage | VARCHAR(32) | Yes
deal_size | INTEGER | Yes
last_deal_update | DATE | Yes
date_dimension
This table contains information about dates. It is generated from a file containing correct date/time
data.
Column Name | Data Type | NULLs
date_key | INTEGER | No
date | DATE | Yes
full_date_description | VARCHAR(18) | Yes
day_of_week | VARCHAR(9) | Yes
day_number_in_calendar_month | INTEGER | Yes
day_number_in_calendar_year | INTEGER | Yes
day_number_in_fiscal_month | INTEGER | Yes
day_number_in_fiscal_year | INTEGER | Yes
last_day_in_week_indicator | INTEGER | Yes
last_day_in_month_indicator | INTEGER | Yes
calendar_week_number_in_year | INTEGER | Yes
calendar_month_name | VARCHAR(9) | Yes
calendar_month_number_in_year | INTEGER | Yes
calendar_year_month | CHAR(7) | Yes
calendar_quarter | INTEGER | Yes
calendar_year_quarter | CHAR(7) | Yes
calendar_half_year | INTEGER | Yes
calendar_year | INTEGER | Yes
holiday_indicator | VARCHAR(10) | Yes
weekday_indicator | CHAR(7) | Yes
selling_season | VARCHAR(32) | Yes
employee_dimension
This table contains information about all the people who work for the retail chain.
Column Name | Data Type | NULLs
employee_key | INTEGER | No
employee_gender | VARCHAR(8) | Yes
courtesy_title | VARCHAR(8) | Yes
employee_first_name | VARCHAR(64) | Yes
employee_middle_initial | VARCHAR(8) | Yes
employee_last_name | VARCHAR(64) | Yes
employee_age | INTEGER | Yes
hire_date | DATE | Yes
employee_street_address | VARCHAR(256) | Yes
employee_city | VARCHAR(64) | Yes
employee_state | CHAR(2) | Yes
employee_region | CHAR(32) | Yes
job_title | VARCHAR(64) | Yes
reports_to | INTEGER | Yes
salaried_flag | INTEGER | Yes
annual_salary | INTEGER | Yes
hourly_rate | FLOAT | Yes
vacation_days | INTEGER | Yes
product_dimension
This table describes all products sold by the department store chain.
Column Name | Data Type | NULLs
product_key | INTEGER | No
product_version | INTEGER | No
product_description | VARCHAR(128) | Yes
sku_number | CHAR(32) | Yes
category_description | CHAR(32) | Yes
department_description | CHAR(32) | Yes
package_type_description | CHAR(32) | Yes
package_size | CHAR(32) | Yes
fat_content | INTEGER | Yes
diet_type | CHAR(32) | Yes
weight | INTEGER | Yes
weight_units_of_measure | CHAR(32) | Yes
shelf_width | INTEGER | Yes
shelf_height | INTEGER | Yes
shelf_depth | INTEGER | Yes
product_price | INTEGER | Yes
product_cost | INTEGER | Yes
lowest_competitor_price | INTEGER | Yes
highest_competitor_price | INTEGER | Yes
average_competitor_price | INTEGER | Yes
discontinued_flag | INTEGER | Yes
promotion_dimension
This table describes every promotion ever done by the retail chain.
Column Name | Data Type | NULLs
promotion_key | INTEGER | No
promotion_name | VARCHAR(128) | Yes
price_reduction_type | VARCHAR(32) | Yes
promotion_media_type | VARCHAR(32) | Yes
ad_type | VARCHAR(32) | Yes
display_type | VARCHAR(32) | Yes
coupon_type | VARCHAR(32) | Yes
ad_media_name | VARCHAR(32) | Yes
display_provider | VARCHAR(128) | Yes
promotion_cost | INTEGER | Yes
promotion_begin_date | DATE | Yes
promotion_end_date | DATE | Yes
shipping_dimension
This table contains information about shipping companies that the retail chain uses.
Column Name | Data Type | NULLs
shipping_key | INTEGER | No
ship_type | CHAR(30) | Yes
ship_mode | CHAR(10) | Yes
ship_carrier | CHAR(20) | Yes
vendor_dimension
This table contains information about each vendor that provides products sold through the retail
chain.
Column Name | Data Type | NULLs
vendor_key | INTEGER | No
vendor_name | VARCHAR(64) | Yes
vendor_address | VARCHAR(64) | Yes
vendor_city | VARCHAR(64) | Yes
vendor_state | CHAR(2) | Yes
vendor_region | VARCHAR(32) | Yes
deal_size | INTEGER | Yes
last_deal_update | DATE | Yes
warehouse_dimension
This table provides information about each of the chain's warehouses.
Column Name | Data Type | NULLs
warehouse_key | INTEGER | No
warehouse_name | VARCHAR(20) | Yes
warehouse_address | VARCHAR(256) | Yes
warehouse_city | VARCHAR(60) | Yes
warehouse_state | CHAR(2) | Yes
warehouse_region | VARCHAR(32) | Yes
store_orders_fact
This table contains information about all orders made at the company's brick-and-mortar stores.
Column Name | Data Type | NULLs
product_key | INTEGER | No
product_version | INTEGER | No
store_key | INTEGER | No
vendor_key | INTEGER | No
employee_key | INTEGER | No
order_number | INTEGER | No
date_ordered | DATE | Yes
date_shipped | DATE | Yes
expected_delivery_date | DATE | Yes
date_delivered | DATE | Yes
quantity_ordered | INTEGER | Yes
quantity_delivered | INTEGER | Yes
shipper_name | VARCHAR(32) | Yes
unit_price | INTEGER | Yes
shipping_cost | INTEGER | Yes
total_order_cost | INTEGER | Yes
quantity_in_stock | INTEGER | Yes
reorder_level | INTEGER | Yes
overstock_ceiling | INTEGER | Yes
store_sales_fact
This table contains information about all sales made at the company's brick-and-mortar stores.
Column Name | Data Type | NULLs
date_key | INTEGER | No
product_key | INTEGER | No
product_version | INTEGER | No
store_key | INTEGER | No
promotion_key | INTEGER | No
customer_key | INTEGER | No
employee_key | INTEGER | No
pos_transaction_number | INTEGER | No
sales_quantity | INTEGER | Yes
sales_dollar_amount | INTEGER | Yes
cost_dollar_amount | INTEGER | Yes
gross_profit_dollar_amount | INTEGER | Yes
transaction_type | VARCHAR(16) | Yes
transaction_time | TIME | Yes
tender_type | VARCHAR(8) | Yes
store_dimension
This table contains information about each brick-and-mortar store within the retail chain.
Column Name | Data Type | NULLs
store_key | INTEGER | No
store_name | VARCHAR(64) | Yes
store_number | INTEGER | Yes
store_address | VARCHAR(256) | Yes
store_city | VARCHAR(64) | Yes
store_state | CHAR(2) | Yes
store_region | VARCHAR(64) | Yes
floor_plan_type | VARCHAR(32) | Yes
photo_processing_type | VARCHAR(32) | Yes
financial_service_type | VARCHAR(32) | Yes
selling_square_footage | INTEGER | Yes
total_square_footage | INTEGER | Yes
first_open_date | DATE | Yes
last_remodel_date | DATE | Yes
number_of_employees | INTEGER | Yes
annual_shrinkage | INTEGER | Yes
foot_traffic | INTEGER | Yes
monthly_rent_cost | INTEGER | Yes
online_sales_fact
This table describes all the items purchased through the online store front.
Column Name | Data Type | NULLs
sale_date_key | INTEGER | No
ship_date_key | INTEGER | No
product_key | INTEGER | No
product_version | INTEGER | No
customer_key | INTEGER | No
call_center_key | INTEGER | No
online_page_key | INTEGER | No
shipping_key | INTEGER | No
warehouse_key | INTEGER | No
promotion_key | INTEGER | No
pos_transaction_number | INTEGER | No
sales_quantity | INTEGER | Yes
sales_dollar_amount | FLOAT | Yes
ship_dollar_amount | FLOAT | Yes
net_dollar_amount | FLOAT | Yes
cost_dollar_amount | FLOAT | Yes
gross_profit_dollar_amount | FLOAT | Yes
transaction_type | VARCHAR(16) | Yes
call_center_dimension
This table describes all the chain's call centers.
Column Name                 Data Type       NULLs
call_center_key             INTEGER         No
cc_closed_date              DATE            Yes
cc_open_date                DATE            Yes
cc_name                     VARCHAR(50)     Yes
cc_class                    VARCHAR(50)     Yes
cc_employees                INTEGER         Yes
cc_hours                    CHAR(20)        Yes
cc_manager                  VARCHAR(40)     Yes
cc_address                  VARCHAR(256)    Yes
cc_city                     VARCHAR(64)     Yes
cc_state                    CHAR(2)         Yes
cc_region                   VARCHAR(64)     Yes
online_page_dimension
This table describes all the pages in the online store front.
Column Name                 Data Type       NULLs
online_page_key             INTEGER         No
start_date                  DATE            Yes
end_date                    DATE            Yes
page_number                 INTEGER         Yes
page_description            VARCHAR(100)    Yes
page_type                   VARCHAR(100)    Yes
Sample Scripts
You can create your own queries, but the VMart example directory includes sample query script
files to help you get started quickly.
You can find the following sample scripts in /opt/vertica/examples/VMart_Schema.
To run any of the scripts, enter the following at the vsql prompt:
=> \i <script_name>
vmart_query_01.sql
-- vmart_query_01.sql
-- FROM clause subquery
-- Return the values for five products with the
-- lowest-fat content in the Dairy department
SELECT fat_content
FROM (
SELECT DISTINCT fat_content
FROM product_dimension
WHERE department_description
IN ('Dairy') ) AS food
ORDER BY fat_content
LIMIT 5;
Output
 fat_content
-------------
          80
          81
          82
          83
          84
(5 rows)
vmart_query_02.sql
-- vmart_query_02.sql
-- WHERE clause subquery
-- Asks for all orders placed by stores located in Massachusetts
-- and by vendors located elsewhere before March 1, 2003:
Output
 order_number | date_ordered
--------------+--------------
        53019 | 2003-02-10
       222168 | 2003-02-05
       160801 | 2003-01-08
       106922 | 2003-02-07
       246465 | 2003-02-10
       234218 | 2003-02-03
       263119 | 2003-01-04
        73015 | 2003-01-01
       233618 | 2003-02-10
        85784 | 2003-02-07
       146607 | 2003-02-07
       296193 | 2003-02-05
        55052 | 2003-01-05
       144574 | 2003-01-05
       117412 | 2003-02-08
       276288 | 2003-02-08
       185103 | 2003-01-03
       282274 | 2003-01-01
       245300 | 2003-02-06
       143526 | 2003-01-04
        59564 | 2003-02-06
...
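The description of vmart_query_02.sql suggests a query of roughly the following shape. This is a sketch under stated assumptions, not necessarily the text of the shipped script: the vendor_state column and the public schema for vendor_dimension are assumed here, and the rows returned may differ from the output listed above.

```sql
-- Sketch only: WHERE clause subqueries filtering on two dimension tables.
-- Assumptions: vendor_dimension is in the public schema and has a vendor_state column.
SELECT order_number, date_ordered
FROM store.store_orders_fact orders
WHERE orders.store_key IN (
        SELECT store_key
        FROM store.store_dimension
        WHERE store_state = 'MA')        -- stores located in Massachusetts
  AND orders.vendor_key NOT IN (
        SELECT vendor_key
        FROM public.vendor_dimension
        WHERE vendor_state = 'MA')       -- vendors located elsewhere
  AND date_ordered < '2003-03-01';
```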
vmart_query_03.sql
-- vmart_query_03.sql
-- Noncorrelated subquery
-- Requests female and male customers with the maximum
-- annual income from customers
Output
 customer_name    | annual_income
------------------+---------------
 James M. McNulty |        999979
 Emily G. Vogel   |        999998
(2 rows)
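A noncorrelated subquery of the kind vmart_query_03.sql describes could be sketched as follows. The inner query runs independently of the outer one and returns one maximum income per gender. The customer_gender column and the public schema for customer_dimension are assumptions made for this illustration; the shipped script may be written differently.

```sql
-- Sketch only: noncorrelated subquery returning the top earner per gender.
-- Assumptions: customer_dimension is in the public schema and has a customer_gender column.
SELECT customer_name, annual_income
FROM public.customer_dimension
WHERE (customer_gender, annual_income) IN (
        SELECT customer_gender, MAX(annual_income)
        FROM public.customer_dimension
        GROUP BY customer_gender);
```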
vmart_query_04.sql
-- vmart_query_04.sql
-- IN predicate
-- Find all products supplied by stores in MA
SELECT DISTINCT s.product_key, p.product_description
FROM store.store_sales_fact s, public.product_dimension p
WHERE s.product_key = p.product_key
AND s.product_version = p.product_version AND s.store_key IN (
SELECT store_key
FROM store.store_dimension
WHERE store_state = 'MA')
ORDER BY s.product_key;
Output
 product_key | product_description
-------------+----------------------------------------
           1 | Brand #1 butter
           1 | Brand #2 bagels
           2 | Brand #3 lamb
           2 | Brand #4 brandy
           2 | Brand #5 golf clubs
           2 | Brand #6 chicken noodle soup
           3 | Brand #10 ground beef
...
vmart_query_05.sql
-- vmart_query_05.sql
-- EXISTS predicate
-- Get a list of all the orders placed by all stores on
-- January 2, 2003 for the vendors with records in the
-- vendor_dimension table
Output
 store_key | order_number | date_ordered
-----------+--------------+--------------
        98 |       151837 | 2003-01-02
       123 |       238372 | 2003-01-02
       242 |       263973 | 2003-01-02
       150 |       226047 | 2003-01-02
       247 |       232273 | 2003-01-02
       203 |       171649 | 2003-01-02
       129 |        98723 | 2003-01-02
        80 |       265660 | 2003-01-02
       231 |       271085 | 2003-01-02
       149 |        12169 | 2003-01-02
       141 |       201153 | 2003-01-02
         1 |        23715 | 2003-01-02
       156 |        98182 | 2003-01-02
        44 |       229465 | 2003-01-02
       178 |       141869 | 2003-01-02
       134 |        44410 | 2003-01-02
       141 |       129839 | 2003-01-02
       205 |        54138 | 2003-01-02
       113 |        63358 | 2003-01-02
        99 |        50142 | 2003-01-02
        44 |       131255 | 2003-01-02
...
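An EXISTS predicate matching the description of vmart_query_05.sql could be sketched as follows. The subquery is correlated with the outer query through vendor_key and only tests whether a matching vendor row exists. The public schema for vendor_dimension is an assumption, and the shipped script may differ in detail.

```sql
-- Sketch only: EXISTS predicate with a correlated subquery.
-- Assumption: vendor_dimension is in the public schema and is keyed by vendor_key.
SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact orders
WHERE EXISTS (
        SELECT 1                          -- the selected value is irrelevant to EXISTS
        FROM public.vendor_dimension vendors
        WHERE vendors.vendor_key = orders.vendor_key)
  AND date_ordered = '2003-01-02';
```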
vmart_query_06.sql
-- vmart_query_06.sql
-- EXISTS predicate
-- Orders placed by the vendor who got the best deal
-- on January 4, 2004
Output
 store_key | order_number | date_ordered
-----------+--------------+--------------
        45 |       202416 | 2004-01-04
        24 |       250295 | 2004-01-04
       121 |       251417 | 2004-01-04
       198 |        75716 | 2004-01-04
       166 |        36008 | 2004-01-04
        27 |       150241 | 2004-01-04
       148 |       182207 | 2004-01-04
         9 |       188567 | 2004-01-04
       113 |        66017 | 2004-01-04
...
vmart_query_07.sql
-- vmart_query_07.sql
-- Multicolumn subquery
-- Which products have the highest cost,
-- grouped by category and department
Output
 product_description        | sku_number | department_description
----------------------------+------------+------------------------
 Brand #601 steak           | SKU-#601   | Meat
 Brand #649 brooms          | SKU-#649   | Cleaning supplies
 Brand #677 veal            | SKU-#677   | Meat
 Brand #1371 memory card    | SKU-#1371  | Photography
 Brand #1761 catfish        | SKU-#1761  | Seafood
 Brand #1810 frozen pizza   | SKU-#1810  | Frozen Goods
 Brand #1979 canned peaches | SKU-#1979  | Canned Goods
 Brand #2097 apples         | SKU-#2097  | Produce
 Brand #2287 lens cap       | SKU-#2287  | Photography
...
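A multicolumn subquery of the kind vmart_query_07.sql describes compares several columns at once against the rows the subquery returns. The following is a sketch, not necessarily the shipped script: the category_description and product_cost columns are assumed to exist in product_dimension, and the public schema is assumed.

```sql
-- Sketch only: multicolumn subquery picking the costliest product per group.
-- Assumptions: product_dimension has category_description and product_cost columns.
SELECT product_description, sku_number, department_description
FROM public.product_dimension
WHERE (category_description, department_description, product_cost) IN (
        SELECT category_description, department_description, MAX(product_cost)
        FROM public.product_dimension
        GROUP BY category_description, department_description);
```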
vmart_query_08.sql
-- vmart_query_08.sql
-- Using pre-join projections to answer subqueries
-- between online_sales_fact and online_page_dimension
SELECT page_description, page_type, start_date, end_date
FROM online_sales.online_sales_fact f, online_sales.online_page_dimension d
WHERE f.online_page_key = d.online_page_key
AND page_number IN
(SELECT MAX(page_number)
FROM online_sales.online_page_dimension)
AND page_type = 'monthly' AND start_date = '2003-06-02';
Output
 page_description           | page_type | start_date | end_date
----------------------------+-----------+------------+------------
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
 Online Page Description #1 | monthly   | 2003-06-02 | 2003-06-11
(12 rows)
vmart_query_09.sql
-- vmart_query_09.sql
-- Equi join
-- Joins online_sales_fact table and the call_center_dimension
Output
 sales_quantity | sales_dollar_amount | transaction_type | cc_name
----------------+---------------------+------------------+-----------------
              7 |                 589 | purchase         | Central Midwest
              8 |                 589 | purchase         | South Midwest
              8 |                 589 | purchase         | California
              1 |                 587 | purchase         | New England
              1 |                 586 | purchase         | Other
              1 |                 584 | purchase         | New England
              4 |                 584 | purchase         | New England
              7 |                 581 | purchase         | Mid Atlantic
              5 |                 579 | purchase         | North Midwest
              8 |                 577 | purchase         | North Midwest
              4 |                 577 | purchase         | Central Midwest
              2 |                 575 | purchase         | Hawaii/Alaska
              4 |                 573 | purchase         | NY Metro
              4 |                 572 | purchase         | Central Midwest
              1 |                 570 | purchase         | Mid Atlantic
              9 |                 569 | purchase         | Southeastern
              1 |                 569 | purchase         | NY Metro
              5 |                 567 | purchase         | Other
              7 |                 567 | purchase         | Hawaii/Alaska
              9 |                 567 | purchase         | South Midwest
              1 |                 566 | purchase         | New England
...
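An equi join between online_sales_fact and call_center_dimension on call_center_key could be sketched as follows. The cc_name column is taken from the output header above. The shipped script may apply additional filters that are not reproduced here, so this sketch is not guaranteed to return the same rows.

```sql
-- Sketch only: equi join of a fact table to a dimension on their shared key.
-- Assumption: cc_name is a column of call_center_dimension, as the output header suggests.
SELECT f.sales_quantity,
       f.sales_dollar_amount,
       f.transaction_type,
       c.cc_name
FROM online_sales.online_sales_fact f
INNER JOIN online_sales.call_center_dimension c
        ON f.call_center_key = c.call_center_key
ORDER BY f.sales_dollar_amount DESC;
```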