Search | arXiv e-print repository

Stage: Query Execution Time Prediction in Amazon Redshift

Authors: Ziniu Wu, Ryan Marcus, Zhengchun Liu, Parimarjan Negi, Vikram Nathan, Pascal Pfeil, Gaurav Saxena, Mohammad Rahman, Balakrishnan Narayanaswamy, Tim Kraska

Abstract: Query performance (e.g., execution time) prediction is a critical component of modern DBMSes. As a pioneering cloud data warehouse, Amazon Redshift relies on accurate execution time prediction for many downstream tasks, ranging from high-level optimizations, such as automatically creating materialized views, to low-level tasks on the critical path of query execution, such as admission, scheduling, and execution resource control. Unfortunately, many existing execution time prediction techniques, including those used in Redshift, suffer from cold start issues and inaccurate estimation, and are not robust against workload/data changes. In this paper, we propose a novel hierarchical execution time predictor: the Stage predictor. The Stage predictor is designed to leverage the unique characteristics and challenges faced by Redshift. The Stage predictor consists of three model states: an execution time cache, a lightweight local model optimized for a specific DB instance with uncertainty measurement, and a complex global model that is transferable across all instances in Redshift. We design a systematic approach to use these models that best leverages optimality (cache), instance-optimization (local model), and transferable knowledge about Redshift (global model). Experimentally, we show that the Stage predictor makes more accurate and robust predictions while maintaining a practical inference latency and memory overhead. Overall, the Stage predictor can improve the average query execution latency by 20% on these instances compared to the prior query performance predictor in Redshift.

Submitted 4 March, 2024; originally announced March 2024.

Comments: 15 pages
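
Illustration (not from the paper): the Stage abstract names three components, a cache, a local model with uncertainty measurement, and a global model, but does not say how a query is routed between them. The sketch below assumes one plausible routing order (cache first, then the local model if its uncertainty is low, otherwise the global model). The class name `StagePredictorSketch`, the `uncertainty_threshold` value, and the callable signatures are all hypothetical stand-ins, not Redshift APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class StagePredictorSketch:
    """Toy three-tier predictor; every name here is an assumption for illustration."""

    # Hypothetical local model: returns (predicted_seconds, uncertainty).
    local_model: Callable[[str], Tuple[float, float]]
    # Hypothetical global model: returns predicted_seconds.
    global_model: Callable[[str], float]
    # Assumed cut-off for trusting the local model; not a value from the paper.
    uncertainty_threshold: float = 0.5
    # Maps a previously seen query to its observed execution time.
    cache: Dict[str, float] = field(default_factory=dict)

    def predict(self, query: str) -> Tuple[float, str]:
        """Return (predicted_seconds, name_of_tier_that_answered)."""
        # Tier 1: a repeated query reuses its observed execution time.
        if query in self.cache:
            return self.cache[query], "cache"
        # Tier 2: the instance-specific model, used only when it is confident.
        estimate, uncertainty = self.local_model(query)
        if uncertainty <= self.uncertainty_threshold:
            return estimate, "local"
        # Tier 3: fall back to the model shared across instances.
        return self.global_model(query), "global"

    def record(self, query: str, observed_seconds: float) -> None:
        """Store an observed execution time so later repeats hit the cache."""
        self.cache[query] = observed_seconds
```

In this sketch a query seen for the first time falls through to one of the two models, and calling `record` after it finishes lets the next identical query be answered from the cache.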

arXiv:2307.07526 [pdf, other]

Can I say, now machines can think?

Authors: Nitisha Aggarwal, Geetika Jain Saxena, Sanjeev Singh, Amit Pundir

Abstract: Generative AI techniques have opened the path for new generations of machines in diverse domains. These machines have various capabilities; for example, they can produce images, generate answers or stories, and write code based only on the "prompts" provided by users. These machines are considered 'thinking minds' because they have the ability to generate human-like responses. In this study, we have analyzed and explored the capabilities of artificial intelligence-enabled machines. We have revisited Turing's concept of thinking machines and compared it with recent technological advancements. The objections to and consequences of thinking machines are also discussed in this study, along with available techniques to evaluate machines' cognitive capabilities. We have concluded that the Turing Test is a critical aspect of evaluating machines' ability. However, there are other aspects of intelligence too, and AI machines exhibit most of these aspects.

Submitted 11 July, 2023; originally announced July 2023.

Comments: 11 pages, 3 figures

MSC Class: I.2.m Miscellaneous

arXiv:1812.03385 [pdf]

Biometric Recognition System (Algorithm)

Authors: Rahul Kumar Jaiswal, Gaurav Saxena

Abstract: Fingerprints are the most widely deployed form of biometric identification. No two individuals share the same fingerprint because they have unique biometric identifiers. This paper presents an efficient fingerprint verification algorithm which improves matching accuracy. Fingerprint images get degraded and corrupted due to variations in skin and impression conditions. Thus, image enhancement techniques are employed prior to singular point detection and minutiae extraction. The singular point is the point of maximum curvature. It is determined by computing the normal of each fingerprint ridge and then following these normals inward towards the centre. The local ridge features known as minutiae are extracted using the cross-number method to find ridge endings and ridge bifurcations. The proposed algorithm chooses a radius and draws a circle with the core point as centre, making fingerprint images rotationally invariant and uniform. The radius can be varied according to the accuracy required by the particular application. Morphological techniques such as clean, spur and H-break are employed to remove noise, followed by the removal of spurious minutiae. Templates are created based on feature vector extraction, and databases are built for verification and identification from the fingerprint images taken from the Fingerprint Verification Competition (FVC2002). The minimum Euclidean distance between the saved template and the test fingerprint image template is calculated and compared with the set threshold for the matching decision. For the performance evaluation of the proposed algorithm, various measures (equal error rate (EER), Dmin at EER, accuracy and threshold) are evaluated and plotted. The measures demonstrate that the proposed algorithm is more effective and robust.

Submitted 8 December, 2018; originally announced December 2018.

Comments: Conference
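
Illustration (not from the paper): the final matching step described in the abstract above, taking the minimum Euclidean distance between a test template and the saved templates and comparing it with a threshold, can be sketched as follows. The function names, the representation of a template as a flat list of floats, and the example threshold are assumptions made for this sketch; the paper's actual feature vectors are not given in the abstract.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("templates must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_fingerprint(
    test_template: Sequence[float],
    saved_templates: Dict[str, Sequence[float]],
    threshold: float,
) -> Tuple[Optional[str], float]:
    """Return (best_id, min_distance); best_id is None if no template is close enough."""
    best_id: Optional[str] = None
    best_distance = math.inf
    # Find the saved template with the smallest distance to the test template.
    for template_id, template in saved_templates.items():
        distance = euclidean_distance(test_template, template)
        if distance < best_distance:
            best_id, best_distance = template_id, distance
    # Accept the match only if the minimum distance is within the threshold.
    if best_distance <= threshold:
        return best_id, best_distance
    return None, best_distance


# Hypothetical two-dimensional templates, for illustration only.
saved = {"subject_a": [0.0, 0.0], "subject_b": [3.0, 4.0]}
accepted = match_fingerprint([3.0, 4.5], saved, threshold=1.0)
rejected = match_fingerprint([10.0, 10.0], saved, threshold=1.0)
```

Here `accepted` identifies `subject_b` at distance 0.5, while `rejected` carries `None` because its nearest saved template lies beyond the threshold. Raising the threshold trades fewer false rejections for more false acceptances, which is the trade-off the equal error rate summarizes.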

arXiv:1302.0351 [pdf]

New Dimension Value Introduction for In-Memory What-If Analysis

Authors: Gaurav Saxena, Ruchi Narula, Manish Mishra

Abstract: OLAP systems operate on historical data and provide answers to analysts' queries. Recent in-memory implementations provide significant performance improvement for real-time ad-hoc analysis. The philosophy and techniques of what-if analysis on data warehouse and in-memory data store based OLAP systems have been covered in great detail before, but exploration of new dimension value (attribute) introduction has been limited in the context of what-if analysis. We extend the approach of Andrey Balmin et al. of using a select-modify operator on a data graph to introduce new values for dimensions and measures in a read-only in-memory data store as scenarios. Our system constructs scenarios without materializing the rows and stores the row information as queries. The rows associated with the scenarios are constructed as and when required by an ad-hoc query.

Submitted 11 June, 2013; v1 submitted 2 February, 2013; originally announced February 2013.

Comments: Changes from previous version: 1. Changed references format 2. Added a few more references 3. Added an algorithm to create sub-cube 4. Modified algorithm to process a query for a few errors 5. Rephrased sentences for clarity

Showing 1–4 of 4 results for author: Saxena, G