4.3. Indexing JSON Fields

The queries in the previous section all require a full table scan to compute their results. With large data sets, such scans can be costly in both compute cycles and time. If you run these queries frequently, you can speed them up by defining an index on the commonly accessed fields. Here again the FIELD() function comes into play: VoltDB supports function-based indexes, so an index can be defined on the values that FIELD() extracts.

To significantly improve the execution time of the queries in the prior section, create the following two indexes:

CREATE INDEX session_site_moderator
    ON user_session_table
        (field(json_data, 'site'),
         field(json_data, 'moderator'),
         username);

CREATE INDEX session_props
    ON user_session_table
        (field(field(json_data, 'props'), 'download_version'),
         field(field(json_data, 'props'), 'client_language'),
         username);

These are fully functional SQL indexes. Whenever you insert or update a record in user_session_table, VoltDB runs the FIELD() function to extract the specified fields from the JSON value and stores the results in the index. When you later query by those same fields, VoltDB uses the index and avoids the table scan. In addition, because the extracted values are already stored in the index, using it usually avoids processing the JSON string at query time.
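As an illustration, a query along the following lines should be able to use the session_site_moderator index, because its WHERE clause filters on the same FIELD() expressions that the index defines. This is a minimal sketch rather than a query taken from this guide: the comparison values 'VoltDB Forum' and 'true' are hypothetical placeholders.

```sql
-- Sketch only: the literal values below are hypothetical placeholders.
-- The two FIELD() expressions match the first two columns of the
-- session_site_moderator index, and username is its third column.
SELECT username
    FROM user_session_table
    WHERE field(json_data, 'site') = 'VoltDB Forum'
      AND field(json_data, 'moderator') = 'true'
    ORDER BY username;
```

To confirm that the planner picks the index for a given statement, you can inspect its execution plan, for example with the explain directive in sqlcmd, assuming your VoltDB version provides it.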