SQL Server Interview Questions and Answers Updated 2021
Vinod Kumar
SQL Server Performance Tuning Expert
© Pinal Dave SQLAuthority.com
Vinod Kumar
Vinod Kumar currently works as a Director, Cloud Solutions Architecture,
working with Asia Global Downstream customers at Microsoft. He has more
than two decades of experience across various roles spanning product
development, evangelism, technical architecture, people leadership, Research
& Development and more. He holds 30+ Microsoft Certifications on various
technologies to date, and counting. Before joining Microsoft, he was a
Microsoft MVP – SQL Server for more than 3 years.
Acknowledgement
I must express my deepest gratitude to my friend Rick Morelan, who has co-
authored many books with us and mentored us to become better authors and
human beings. He has the unique capability to bring out the best in people
and is always there whenever I need support and guidance.
Today we use computers for various activities, motor vehicles for travelling
to places and mobile phones for conversation. How many of us can claim the
invention of the microprocessor, the basic wheel or the telegraph? Well, in
the same way, this book was not written overnight. The journey of writing
this book goes back many years, and there are many individuals to thank.
To begin with, we want to thank all those interviewers who reject
interviewees by saying they need to know ‘the key things’ besides having high
grades in class. The whole concept of interview questions and answers
revolves around knowing ‘the key things’.
The core concept of this book will always be evolving, and I am sure many of
you will come along with us and give your suggestions to keep this book a
key reference for anybody who wants to start with SQL Server. We trust that
you will keep the core concept of this book at heart and help us keep it
always alive and up to date with the latest information. We want to thank
you for the same.
About this book
As representatives of the IT community, all of us have had our own
experiences of attending interviews – clearing them, coming close to
clearing them, or sometimes, with tons of questions and doubts, failing
miserably. These stories live among the most pleasant, or not so pleasant,
memories of our minds, and we assure you this book will kindle those
memories. We have taken tons of interviews, and most interviews do not
revolve around how deeply you know the technical internals of the subject –
they revolve around how good you are with the basics.
To clear an interview, one doesn’t need to know a subject inside out. A
subject like SQL Server is so vast that every single day we learn something
new about the product, and even a complete lifetime would fly by if we kept
doing this. Moreover, the various roles one can take on with a product like
SQL Server range from Database Developer and Database Modeler to Database
Architect, Database Administrator and many more. Hence, this book is geared
towards demystifying, and refreshing your memory on, the fundamentals, which
sometimes are the most important things for clearing any type of interview
for any role. Some of the concepts discussed are generic and are not tied to
any specific version of SQL Server, but most of the new features introduced
with SQL Server have also been included in this book.
This book is not a shortcut or a sure-shot guide to cracking interviews, but
it gets you prepared in an organized manner. Let us also assure you that this
is not a completely comprehensive guide either, but it is surely a great
starter nevertheless. Use it to guide you and be mentally prepared for the
big day. When faced with that big day, we get overwhelmed and confused about
where to start our preparation, and this book is just that secret recipe in
your arsenal to get geared up. Sometimes these basics will help you narrow
down to a solution quickly when given a scenario.
This book flows in “Question & Answer” mode from start to end to help you
grasp the concepts faster and to the point. Once you understand the
concepts, it becomes easy to solve scenario questions that twist those
concepts. Most companies have a typical way of interviewing based on
scenarios from their own environment, and these are just combinations of
the concepts to fit their need and SLA.
Though each of these chapters is bucketed for convenience, we highly
recommend reading every section irrespective of the role you might be in,
as each section has some interesting trivia about working with SQL Server.
In the industry, the role of the accidental DBA, especially with SQL
Server, is very common. Hence, if you have performed the role of DBA for a
short stint and want to brush up your fundamentals, the respective sections
will be a great skim.
Final Note:
After you complete reading this book, do not stop your learning. There are
over 301 SQL Server Interview Questions and Answers available for free
reading here: https://blog.sqlauthority.com/category/sql-interview-
questions-and-answers/
TABLE OF CONTENTS
ABOUT THE AUTHORS
PINAL DAVE
VINOD KUMAR
ACKNOWLEDGEMENT
ABOUT THIS BOOK
SECTION 1: DATABASE CONCEPTS WITH SQL SERVER
SECTION 2: GENERIC QUESTIONS & ANSWERS FOR DBAS AND DEVS
SECTION 3: COMMON DEVELOPER QUESTIONS
SECTION 4: COMMON TRICKY QUESTIONS
SECTION 5: MISCELLANEOUS QUESTIONS ON SQL SERVER
SECTION 6: DBA SKILLS RELATED QUESTIONS
SECTION 7: DATA WAREHOUSING INTERVIEW QUESTIONS & ANSWERS
SECTION 8: GENERAL BEST PRACTICES
Section 1: Database Concepts with
SQL SERVER
What is RDBMS?
Relational Database Management Systems (RDBMS) are database
management systems that maintain data records and indices in tables.
Relationships may be created and maintained across and among the data
and tables. In a relational database, relationships between data items are
expressed using tables. Interdependencies among these tables are
expressed by data values rather than by pointers. This allows for a high
degree of data independence. An RDBMS can recombine the data items
from different files, providing powerful tools for data usage.
What is Normalization?
Database normalization is a data design and organization process applied
to data structures based on rules that help to build relational databases. In
relational database design, the process of organizing data to minimize
redundancy is called normalization. Normalization usually involves
dividing database data into different tables and defining relationships
between the tables. The objective is to isolate data so that additions,
deletions, and modifications of a field can be made in just one table and
then retrieved through the rest of the database via the defined
relationships.
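As an illustrative sketch (the table and column names here are hypothetical, not from the book), normalization might split a table that repeats customer details on every order into two related tables linked by a data value:

```sql
-- Before: customer details repeated on every order row
-- OrdersFlat (OrderID, CustomerName, CustomerCity, Amount)

-- After normalization: customer data isolated in its own table
CREATE TABLE Customers (
    CustomerID   INT PRIMARY KEY,
    CustomerName VARCHAR(100),
    CustomerCity VARCHAR(100)
);

CREATE TABLE Orders (
    OrderID    INT PRIMARY KEY,
    CustomerID INT REFERENCES Customers (CustomerID), -- relationship expressed by a data value
    Amount     MONEY
);
```

A change to a customer's city now happens in exactly one row of Customers and is visible to every order through the defined relationship.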
What is De-normalization?
De-normalization is the process of attempting to optimize the performance
of a database by adding redundant data. It is sometimes necessary because
current DBMSs implement the relational model poorly. A true relational
DBMS would allow for a fully normalized database at the logical level
while providing physical storage of data that is tuned for high
performance. De-normalization is a technique to move from higher to
lower normal forms of database modeling to speed up database access.
How is ACID property related to Database?
ACID (an acronym for Atomicity, Consistency, Isolation, Durability) is a
concept that Database Professionals generally look for while evaluating
relational databases and application architectures. For a reliable database,
all these four attributes should be achieved:
Atomicity is an all-or-none rule for Database modifications.
Consistency guarantees that a transaction never leaves your database in a
half-finished state.
Isolation keeps transactions separated from each other until they are
finished.
Durability guarantees that the database will keep track of pending
changes in such a way that the server can recover from an abnormal
termination and committed transactions will not be lost.
What are the different Normalization Forms?
There are many different normal forms. Let us see the list.
1NF: Eliminate Repeating Groups
Make a separate table for each set of related attributes, and give each table
a primary key. Each field contains at most one value from its attribute
domain.
2NF: Eliminate Redundant Data
If an attribute depends on only part of a multi-valued key, then remove it
to a separate table.
3NF: Eliminate Columns Not Dependent On Key
If attributes do not contribute to a description of the key, then remove
them to a separate table. All attributes must be directly dependent on
the primary key.
BCNF: Boyce-Codd Normal Form
If there are non-trivial dependencies between candidate key attributes,
then separate them into distinct tables.
4NF: Isolate Independent Multiple Relationships
No table may contain two or more 1:n or n:m relationships that are not
directly related.
5NF: Isolate Semantically Related Multiple Relationships
There may be practical constraints on information that justify separating
logically related many-to-many relationships.
ONF: Optimal Normal Form
A model limited to only simple (elemental) facts, as expressed in Object
Role Model notation.
DKNF: Domain-Key Normal Form
A model free from all modification anomalies is said to be in DKNF.
Remember, these normalization guidelines are cumulative. For a database
to be in 3NF, it must first fulfil all the criteria of a 2NF and 1NF database.
What is a Trigger?
A trigger is a SQL procedure or SQLCLR Code that initiates an action
when an event (INSERT, DELETE or UPDATE) occurs. Triggers are
stored in and managed by the DBMS. Triggers can be used to maintain the
referential integrity of data by systematically changing the data. A trigger
cannot be called or executed; DBMS automatically fires the trigger as a
result of a data modification to the associated table. Triggers can be
considered to be similar to stored procedures in that both consist of
procedural logic that is stored at the database level. Stored procedures,
however, are not event-driven and are not attached to a specific table as
triggers are. Stored procedures are explicitly executed by invoking a
CALL to the procedure while triggers are implicitly executed. Besides,
triggers can also execute stored procedures.
Nested Trigger: A trigger can also contain INSERT, UPDATE and
DELETE logic within itself; so when the trigger is fired because of data
modification, it can also cause another data modification, thereby firing
another trigger. A trigger that contains data modification logic within itself
is called a nested trigger.
There are two types of DML triggers based on when they fire:
a. INSTEAD OF Trigger
b. AFTER Trigger
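A minimal sketch of an AFTER trigger (the Employees and EmployeeAudit tables here are hypothetical) that fires automatically when rows are updated:

```sql
-- Audit every UPDATE on a hypothetical Employees table
CREATE TRIGGER trg_Employees_Audit
ON Employees
AFTER UPDATE
AS
BEGIN
    -- "inserted" is the pseudo-table holding the new row images
    INSERT INTO EmployeeAudit (EmployeeID, ChangedOn)
    SELECT EmployeeID, GETDATE()
    FROM inserted;
END
```

Note that the trigger is never called explicitly; the DBMS fires it as a side effect of the UPDATE statement.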
What is an Index?
An index is a physical structure containing pointers to the data. Indices are
created in an existing table to locate rows more quickly and efficiently. It
is possible to create an index on one or more columns of a table, and each
index is given a name. The users cannot see the indexes; they are just used
to speed up queries. Effective indexes are one of the best ways to improve
performance in a database application. A table scan happens when there is
no index available to help a query. In a table scan, the SQL Server
examines every row in the table to satisfy the query results. Table scans
are sometimes unavoidable, but on large tables, scans have a terrific
impact on performance.
In the case of an indexed view, a query can reference the view directly, or
the query optimizer can select the view if it determines that the view can
be substituted for some or all of the query in the lowest-cost query plan.
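A minimal sketch of index creation (the Orders table and its columns are hypothetical) showing how an index can help a query avoid a table scan:

```sql
-- Non-clustered index to speed up lookups on CustomerID
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON Orders (CustomerID)
INCLUDE (OrderDate);  -- included column lets the index "cover" a common query
```

With this index in place, a query filtering on CustomerID and selecting OrderDate can be satisfied from the index alone instead of scanning every row of the table.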
What is a Cursor?
A cursor is a database object used by applications in the procedural logic
to manipulate data on a row-by-row basis, instead of the typical SQL
commands that operate on all / parts of rows as a set of data.
To work with a cursor, we need to perform some steps in the following
order:
Declare cursor
Open cursor
Fetch row from the cursor
Process fetched row
Close cursor
Deallocate cursor
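The steps above can be sketched end to end as follows (the Customers table is hypothetical):

```sql
DECLARE @Name VARCHAR(100);

DECLARE NameCursor CURSOR FOR            -- 1. declare
    SELECT CustomerName FROM Customers;

OPEN NameCursor;                          -- 2. open
FETCH NEXT FROM NameCursor INTO @Name;    -- 3. fetch a row

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @Name;                          -- 4. process the fetched row
    FETCH NEXT FROM NameCursor INTO @Name;
END

CLOSE NameCursor;                         -- 5. close
DEALLOCATE NameCursor;                    -- 6. deallocate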
Outer Join
A join that includes rows even if they do not have related rows in the
joined table is an Outer Join. You can create three different variations of
outer join to specify the unmatched rows to be included:
Left Outer Join: In Left Outer Join, all the rows in the first-
named table, i.e. "left" table, which appears leftmost in the
JOIN clause, are included. Unmatched rows in the right table
do not appear.
Right Outer Join: In Right Outer Join, all the rows in the
second-named table, i.e. "right" table, which appears rightmost
in the JOIN clause are included. Unmatched rows in the left
table are not included.
Full Outer Join: In Full Outer Join, all the rows in all joined
tables are included, whether they are matched or not.
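As a sketch against hypothetical Customers and Orders tables, a Left Outer Join keeps every customer even when no matching order exists:

```sql
-- All customers; order columns are NULL for customers without orders
SELECT c.CustomerName, o.OrderID
FROM Customers c
LEFT OUTER JOIN Orders o
    ON o.CustomerID = c.CustomerID;
```

Swapping in RIGHT OUTER JOIN would instead keep every order, and FULL OUTER JOIN would keep unmatched rows from both sides.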
Cross Join
A cross join that does not have a WHERE clause produces the Cartesian
product of the tables involved in the join. The size of a Cartesian product
result set is the number of rows in the first table multiplied by the number
of rows in the second table. A common example is when a company wants
to combine each product with a pricing table to analyse each product at
each price.
Self-Join
This is a special case when one table joins itself with one or two aliases to
avoid confusion. A self-join can be of any type, as long as the joined
tables are the same. A self-join is rather unique in that it involves a
relationship with only one table. A common example is when a company
has a hierarchal reporting structure whereby one member of staff reports
to another or a typical part with subparts hierarchy. Self-Join can be Outer
Join or Inner Join.
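The reporting-structure example can be sketched as a self-join (the Employees table, with a ManagerID column pointing back at EmployeeID, is hypothetical):

```sql
-- Pair each employee with his or her manager using two aliases of one table
SELECT e.EmployeeName, m.EmployeeName AS ManagerName
FROM Employees e
LEFT JOIN Employees m
    ON e.ManagerID = m.EmployeeID;  -- outer join keeps the top of the hierarchy (no manager)
```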
What is Identity?
Identity (or AutoNumber) is a column that automatically generates
numeric values. There can be only one IDENTITY Column in a given
table inside SQL Server. A start and increment value can be set, but most
DBAs leave these at 1. A GUID column also generates unique values; however,
unlike an identity value, a GUID cannot be controlled with a seed and
increment. Identity/GUID columns do not need to be indexed.
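A minimal sketch of an identity column (the Invoices table is hypothetical):

```sql
CREATE TABLE Invoices (
    InvoiceID INT IDENTITY(1, 1) PRIMARY KEY,  -- seed 1, increment 1
    Amount    MONEY
);

-- InvoiceID is generated automatically; it is not listed in the INSERT
INSERT INTO Invoices (Amount) VALUES (100);
```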
In what order are the clauses of a SELECT statement logically processed?
The logical query processing phases occur in the following order:
1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP
What is the PRIMARY KEY?
A PRIMARY KEY constraint is a unique identifier for a row within a
database table. Every table should have a primary key constraint to
uniquely identify each row, and only one primary key constraint can be
created for each table. The primary key constraints are used to enforce
entity integrity.
What is the difference between TRUNCATE and DELETE commands?
TRUNCATE removes all rows from a table by deallocating its data pages; it
is minimally logged, cannot have a WHERE clause, resets any identity
counter and does not fire DELETE triggers. DELETE removes rows one at a
time, is fully logged, can use a WHERE clause to remove only selected
rows, and fires DELETE triggers.
What are the different index configurations a table can have?
A table can have one of the following index configurations:
No indexes
A clustered index
A clustered index and many non-clustered indexes
A non-clustered index
Many non-clustered indexes
What are the commonly used RAID levels?
RAID 0 – No Redundancy (Striping)
RAID 1 – Mirroring
RAID 5 – Distributed Parity
RAID 10 – Mirrored and Striped
Types of Sub-query
What are the Authentication Modes in SQL Server? How can it be changed?
There are two authentication modes in SQL Server:
Windows Mode
Mixed Mode – SQL and Windows
To change the authentication mode in SQL Server, go to Start -> Programs ->
Microsoft SQL Server and click SQL Server Management Studio; under Object
Explorer, right-click the server, and then click Properties.
On the Security page, under Server authentication, select the new server
authentication mode, and then click OK.
What are Table Variables and how are they different from Local
Temporary Tables?
A Table variable is like a Local temporary table but has some interesting
differences. The scoping rules of Table variables are the same as any other
variable inside SQL Server. For example, if you define a variable inside a
stored procedure, it can’t be accessed outside the stored procedure.
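The scoping difference can be sketched as follows (the table and column names are hypothetical):

```sql
-- Table variable: follows normal variable scoping rules
DECLARE @Recent TABLE (OrderID INT, Amount MONEY);
INSERT INTO @Recent
SELECT OrderID, Amount FROM Orders WHERE OrderDate > '2021-01-01';

-- Local temporary table: created in tempdb, visible for the whole session
CREATE TABLE #Recent (OrderID INT, Amount MONEY);
```

The table variable @Recent disappears as soon as the batch or procedure that declared it ends, while #Recent lives in tempdb until the session ends or it is explicitly dropped.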
What is the STUFF Function and How Does it Differ from the
REPLACE Function?
STUFF function is used to overwrite existing characters using this syntax:
STUFF (string_expression, start, length, replacement_characters), where
string_expression is the string that will have characters substituted, the
start is the starting position, the length is the number of characters in the
string that are substituted, and replacement_characters are the new
characters interjected into the string. REPLACE function is used to
replace existing characters of all occurrences. Using the syntax REPLACE
(string_expression, search_string, replacement_string), every incidence of
search_string found in the string_expression will be replaced with
replacement_string.
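A quick sketch of both functions side by side:

```sql
-- STUFF: overwrite 9 characters starting at position 4 with 'Server'
SELECT STUFF('SQLAuthority', 4, 9, 'Server');  -- returns 'SQLServer'

-- REPLACE: substitute every occurrence of the search string
SELECT REPLACE('banana', 'na', 'xy');          -- returns 'baxyxy'
```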
What is an Execution Plan? When would you use it? How would
you View the Execution Plan?
An execution plan is a road map: a graphical or textual representation of
the data retrieval methods chosen by the SQL Server query optimizer for a
stored procedure or ad hoc query. It is a very useful tool for a developer
to understand the performance characteristics of a query or stored
procedure, since the plan is what SQL Server will place in its cache and
use to execute the stored procedure or query. Within the SQL
Server Management Studio, there is an option called "Include Actual
Execution Plan" (or use CTRL+M shortcut) under the SQL Editor
Toolbar. If this option is turned on, it will display the query execution plan
in a separate window when the query is executed.
How to Find Out the List of Schema Names and Table Names in the
Database?
We can use any of the following scripts:
SELECT '[' + SCHEMA_NAME(schema_id) + '].[' + name + ']' AS SchemaTable
FROM sys.tables

SELECT '[' + TABLE_SCHEMA + '].[' + TABLE_NAME + ']' AS SchemaTable
FROM INFORMATION_SCHEMA.TABLES
How does using a Separate Hard Drive for Several Database
Objects Improve Performance Right Away?
Separating objects across different physical Hard drives will increase the
number of IOPS that can be handled in parallel for the SQL Server
instance. This is a deployment strategy done by the DBA. A non-clustered
index and tempdb can be created on a separate disk to improve
performance.
How to Find the List of Fixed Hard Drive and Free Space
on Server?
We can use the following Stored Procedure to figure out the number of
fixed drives (hard drive) a system has along with free space on each of
those drives.
EXEC master..xp_fixeddrives
Why can there be only one Clustered Index and not more than one?
A clustered index determines the physical order of data in a table. A set
of data can be stored in only one physical order; that is why only one
clustered index per table is possible.
What is the difference between Line Feed (\n) and Carriage Return
(\r)?
Line Feed – LF – \n – 0x0a – 10 (decimal)
Carriage Return – CR – \r – 0x0D – 13 (decimal)
DECLARE @NewLineChar AS CHAR(2) = CHAR(13) + CHAR(10)
PRINT ('SELECT FirstLine AS FL' + @NewLineChar + 'SELECT SecondLine AS SL')
What is a Hint?
Hints are options and strong suggestions specified for enforcement by the
SQL Server query processor on DML statements. The hints override any
execution plan the query optimizer might select for a query.
There are three different types of hints. Let us understand the basics of
each of them separately.
Join Hint
This hint is used when more than one table is used in a query. Two or
more tables can be joined using different types of joins. This hint forces
the join algorithm that is used – LOOP, HASH or MERGE – for example,
INNER LOOP JOIN. Join hints can be used in SELECT, UPDATE and DELETE
statements.
Query Hint
This hint is used when a certain kind of logic has to be applied to a whole
query. Any hint used in the query is applied to the complete query as
opposed to a part of it. There is no way to specify that only a certain part
of a query should be used with the hint. After any query, the OPTION
clause is specified to apply the logic to this query. A query always has any
of the following statements: SELECT, UPDATE, DELETE, INSERT or
MERGE (SQL 2008); and this hint can be applied to all of them.
Table Hint
This hint is used when a certain kind of locking mechanism of tables has
to be controlled. SQL Server query optimizer always puts the appropriate
kind of lock on tables, when any of the Transact SQL operations
SELECT, UPDATE, DELETE, INSERT or MERGE is used. There are
certain cases when the developer knows when and where to override the
default behaviour of the locking algorithm, and these hints are useful in
those scenarios.
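The three hint types can be sketched together (the Customers and Orders tables are hypothetical, and the hints shown are illustrative, not recommendations):

```sql
-- Join hint: force a merge join between the two tables
SELECT c.CustomerName, o.OrderID
FROM Customers c
INNER MERGE JOIN Orders o
    ON o.CustomerID = c.CustomerID
WHERE o.Amount > 100
OPTION (MAXDOP 1);   -- Query hint: limit the whole query to one CPU

-- Table hint: read without taking shared locks (dirty reads possible)
SELECT * FROM Orders WITH (NOLOCK);
```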
What is the difference between the Index Seek vs. Index Scan?
An index scan means that SQL Server reads all rows in a table, and then
returns only those rows that satisfy the search criteria. When an index scan
is performed, all the rows in the leaf level of the index are scanned. This
essentially means that all the rows of the index are examined instead of the
table directly. This is sometimes compared to a table scan, in which all the
table data is read directly. However, there is usually little difference
between an index scan and a table scan.
An index seek, on the other hand, means that the Query Optimizer relies
entirely on the index leaf data to locate rows satisfying the query
condition. An index seek will be most beneficial in cases where a small
percentage (less than 10 or 15 percent) of rows will be returned. An
index seek will only touch the rows that satisfy the query condition and
the pages that contain these qualifying rows; in terms of performance,
this is highly beneficial when a table has a very large number of rows.
What is the Maximum Size per Database for SQL Server Express?
SQL Server Express supports a maximum size of 4 GB per database, which
excludes the log files. From SQL Server 2008 R2 onwards, this limit has
been increased to 10 GB. This is quite a lot of data for a conventional
application and, when designed properly, can be used efficiently for small
development purposes.
How do We Know if Any Query is Retrieving Large or very little
data?
In one way, it is quite easy to figure this out by just looking at the
result set; however, this method cannot be relied upon every time, as it is
difficult to reach a conclusion when there are many columns and many rows.
It is easy to measure how much data is retrieved from the server to the
client-side. The SQL Server Management Studio has a feature that can
measure client statistics.
All four tabs of the Activity Monitor provide very important information;
however, the one I prefer most is “Recent Expensive Queries.” Whenever I find my
server running slow or having any performance-related issues, my first
reaction is to open this tab and see which query is running slow. I usually
look at the query with the highest number for Average Duration. The
Recent Expensive Queries monitors only show queries that are in the SQL
Server cache at that moment.
What is CTE?
CTE is the abbreviation for Common Table Expression. A CTE is an
expression that can be thought of as a temporary result set that is defined
within the execution of a single SQL statement. A CTE is similar to a
derived table in that it is not stored as an object and lasts only for the
duration of the query.
A CTE can reference itself, thereby creating a recursive CTE. A recursive
CTE is one in which an initial CTE is repeatedly executed to return
subsets of data until the complete result set is obtained. A recursive CTE
contains three elements: the anchor member (the initial query), the
recursive member (which references the CTE itself and is combined with the
anchor using UNION ALL), and a check that terminates the recursion.
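These elements can be sketched with a hypothetical Employees hierarchy (EmployeeID/ManagerID columns assumed):

```sql
WITH OrgChart AS (
    -- Anchor member: the top of the hierarchy
    SELECT EmployeeID, ManagerID, 0 AS Level
    FROM Employees
    WHERE ManagerID IS NULL

    UNION ALL

    -- Recursive member: references the CTE itself
    SELECT e.EmployeeID, e.ManagerID, oc.Level + 1
    FROM Employees e
    JOIN OrgChart oc ON e.ManagerID = oc.EmployeeID
)
SELECT * FROM OrgChart
OPTION (MAXRECURSION 100);  -- explicit safety net for the termination check
```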
Which are the New Data Types Introduced in SQL SERVER 2008?
The GEOMETRY Type: The GEOMETRY data type is a system .NET
common language runtime (CLR) datatype in SQL Server. This type
represents data in a two-dimensional Euclidean coordinate system.
The GEOGRAPHY Type: The GEOGRAPHY datatype’s functions are
the same as with GEOMETRY. The difference between the two is that
when you specify GEOGRAPHY, you are usually specifying points in
terms of latitude and longitude.
New Date and Time Data types: SQL Server 2008 introduces four new
data types related to date and time: DATE, TIME, DATETIMEOFFSET,
and DATETIME2.
DATE: The new DATE data type stores just the date itself. It
is based on the Gregorian calendar and handles years from 1 to
9999.
TIME: The new TIME (n) type stores time with a range of
00:00:00.0000000 through 23:59:59.9999999. Precision is
allowed with this type. TIME supports seconds down to 100
nanoseconds. The n in TIME(n) defines this level of fractional
second precision from 0 to 7 digits of precision.
The DATETIMEOFFSET Type: DATETIMEOFFSET (n) is
the time-zone-aware version of a datetime datatype. The name
will appear less odd when you consider what it is: a date + time
+ time-zone offset. The offset is based on how far behind or
ahead you are from Coordinated Universal Time (UTC) time.
The DATETIME2 Type: It is an extension of the datetime type
in earlier versions of SQL Server. This new datatype has a date
range covering dates from January 1 of year 1 through
December 31 of year 9999. DATETIME2 not only includes the
larger date range but also carries a time component with the
same fractional-second precision that the TIME type provides.
What is CLR?
In SQL Server 2008, objects such as user-defined functions can be created
using CLR (.NET Common Language Runtime) languages such as C# or VB.NET.
This CLR language support extends not only to user-defined functions but
also to stored procedures and triggers. You can develop such CLR add-ons
to SQL Server using Visual Studio.
What is LINQ?
Language-Integrated Query (LINQ) adds the ability to query objects using
.NET languages. The LINQ to SQL object/relational mapping (O/RM)
framework maps .NET classes to SQL Server tables and translates LINQ
queries into T-SQL that is executed on the server.
What is RAISERROR?
RAISERROR generates an error message and initiates error processing for
the session. RAISERROR can either reference a user-defined message
stored in the sys.messages catalog view or build a message dynamically.
The message is returned as a server error message to the calling
application or an associated CATCH block of a TRY…CATCH construct.
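A minimal sketch of an ad hoc RAISERROR message caught by a TRY…CATCH block (the message text and argument are illustrative):

```sql
BEGIN TRY
    -- Severity 16, state 1; %d is substituted with the argument 42
    RAISERROR ('Order %d failed validation.', 16, 1, 42);
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();  -- prints: Order 42 failed validation.
END CATCH
```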
What is XPath?
XPath uses a set of expressions to select nodes to be processed. The most
common expression that you’ll use is the location path expression, which
returns a set of nodes called a node-set. XPath can use both an
unabbreviated and an abbreviated syntax. The following is the
unabbreviated syntax for a location path:
/axisName::nodeTest[predicate]/axisName::nodeTest[predicate]
What is a Filestream?
Filestream allows you to store unstructured large objects (text documents,
images, and videos) in the file system and have these files integrated
within the database. FILESTREAM integrates the SQL Server Database
Engine with New Technology File System (NTFS); it stores the data in
varbinary (max) data type. Using this data type, the unstructured data is
stored in the NTFS file system, and the SQL Server Database Engine
manages the link between the Filestream column and the actual file
located in the NTFS. Using Transact-SQL statements users can insert,
update, delete and select the data stored in FILESTREAM-enabled tables.
What are some of the caveats working with Filestream data type?
Here are some of the interesting considerations with Filestream data type-
The sizes of the BLOBs are limited only by the volume size of the
NTFS file system.
FILESTREAM data must be stored in FILESTREAM filegroups.
FILESTREAM filegroups can be on compressed volumes.
We can use all backup and recovery models with FILESTREAM
data, and the FILESTREAM data is backed up with the structured
data
When using failover clustering, the FILESTREAM filegroups
must be on shared disk resources.
Encryption is not supported on FILESTREAM data.
SQL Server does not support database snapshots for
FILESTREAM filegroups.
Database mirroring does not support FILESTREAM, while log
shipping and replication do support FILESTREAM data types.
What do you mean by TABLESAMPLE?
TABLESAMPLE allows you to extract a sampling of rows from a table in
the FROM clause. The rows retrieved are random and they are not in any
order. The sampling can be based on a percentage or on a number of rows.
You can use TABLESAMPLE when only a sampling of rows is necessary for
the application instead of a full result set.
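Both sampling forms can be sketched against a hypothetical Orders table:

```sql
-- Roughly 10% of the table's pages; the row count is approximate, not exact
SELECT * FROM Orders TABLESAMPLE (10 PERCENT);

-- Approximate row-count form
SELECT * FROM Orders TABLESAMPLE (500 ROWS);
```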
What are Ranking Functions?
Ranking functions return a ranking value for each row in a partition. All
the ranking functions are non-deterministic. The different Ranking
functions are as follows:
ROW_NUMBER () OVER ([<partition_by_clause>]
<order_by_clause>)
Returns the sequential number of a row within a partition of a result set,
starting at 1 for the first row in each partition.
RANK () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the rank of each row within the partition of a result set.
DENSE_RANK () OVER ([<partition_by_clause>] <order_by_clause>)
Returns the rank of rows within the partition of a result set, without any
gaps in the ranking.
NTILE (integer_expression) OVER ([<partition_by_clause>]
<order_by_clause>)
Distributes the rows in an ordered partition into a specified number of
groups.
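The four functions can be compared in a single query over a hypothetical Employees table:

```sql
SELECT EmployeeName, Salary,
    ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum,    -- 1,2,3,4 even on ties
    RANK()       OVER (ORDER BY Salary DESC) AS Rnk,       -- 1,2,2,4 when rows 2 and 3 tie
    DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRnk,  -- 1,2,2,3 on the same tie
    NTILE(2)     OVER (ORDER BY Salary DESC) AS Half       -- distributes rows into 2 groups
FROM Employees;
```

Adding a PARTITION BY clause inside OVER() would restart each ranking within every partition.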
What is ROW_NUMBER()?
ROW_NUMBER() returns a column as an expression that contains the
row’s number within the result set. This is only a number used in the
context of the result set; if the result changes, the ROW_NUMBER() will
change.
What is a ROLLUP Clause?
The ROLLUP clause is used to perform aggregate operations on multiple
levels in a hierarchy. If we want to sum at different levels without adding
any new column, we can do it easily using ROLLUP; we just have to add
WITH ROLLUP to the GROUP BY clause.
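A minimal sketch over a hypothetical Sales table:

```sql
-- Per Region/Product totals, plus per-Region subtotals and a grand total
SELECT Region, Product, SUM(Amount) AS Total
FROM Sales
GROUP BY Region, Product WITH ROLLUP;
```

The subtotal and grand-total rows show NULL in the rolled-up columns, which is how they can be distinguished from ordinary group rows.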
When I Delete any Data from a Table, does the SQL Server reduce
the size of that table?
When data is deleted from a table, SQL Server does not reduce the size of
the table right away; however, it marks those pages as free pages still
belonging to the table. When new data is inserted, it is placed into those
pages first. Once those pages are filled up, SQL Server will allocate new
pages. If you wait for some time, a background process de-allocates the
empty pages, finally reducing the table size.
Section 6: DBA Skills related
Questions
How to Rebuild the Master Database?
The master database is a system database and it contains information about
the running server’s configuration. When SQL Server 2005 is installed, it
usually creates the master, model, msdb, tempdb, resource and distribution
system databases by default. The master database is the only one that is
absolutely a must-have database; without it, SQL Server cannot be started.
This is the reason why it is extremely important to back up the master
database.
To rebuild the Master database, run Setup.exe, verify, and repair a SQL
Server instance, and rebuild the system databases. This procedure is most
often used to rebuild the master database for a corrupted installation of
SQL Server.
How to Copy the Tables, Schema and Views from one SQL Server
to Another?
There are multiple ways to do this -
1) “Detach Database” from one server and “Attach Database” to another
server.
2) Manually script all the objects using SSMS and run the script on a new
server.
3) Use Wizard of SSMS.
What is SQLCMD?
sqlcmd is an enhanced version of isql and osql, and it provides far more
functionality than the other two options. In other words, sqlcmd is a
better replacement for isql (which will be deprecated eventually) and osql
(not included in SQL Server 2005 RTM). sqlcmd can work in two modes:
i) batch and ii) interactive.
CPU utilization
Storage space utilization
What are the System Data Collection Sets predefined inside SQL
Server?
During the installation, three System Data Collection Sets are made
available to DBAs. These can later be configured to monitor SQL Server.
They cannot be deleted.
Disk Usage: Collects data about disk and log usage for all the
databases installed on the system.
Server Activity: Collects resource usage statistics and
performance data from the server and SQL Server.
Query Statistics: Collects query statistics, individual query
text, query plans, and specific queries.
What is a Dedicated Administrator Connection (DAC)?
You can use sqlcmd, which is the command-prompt version (or osql in
SQL Server 2005). A new option, -A, enables the connection to be an
Admin connection.
To enable the admin connection from SSMS, you need to use the ADMIN:
prefix before your server’s name.
What is the SP’s used for creating, starting and stopping a Server-
side trace?
The following are the system SPs that we can use to work with Server side
trace -
Apart from these, many other SQL Server security auditing events are
also captured, like the Add DB User event, DBCC event, Login Failed,
Backup/Restore event, Server Starts and Stops, and many more.
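A minimal sketch of creating and starting a trace with these procedures (the file path is hypothetical):

```sql
DECLARE @traceid INT;

-- Create the trace definition; the .trc extension is appended automatically
EXEC sp_trace_create @traceid OUTPUT, 0, N'C:\Traces\MyTrace';

-- Capture the TextData column (1) for the SQL:BatchCompleted event (12)
DECLARE @on BIT = 1;
EXEC sp_trace_setevent @traceid, 12, 1, @on;

-- Start the trace (status 1 = start, 0 = stop, 2 = close and delete)
EXEC sp_trace_setstatus @traceid, 1;
```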
Row Compression
Row compression changes the format of the physical storage of data. It
minimizes the metadata (column information, length, offsets, etc.)
associated with each record. Numeric data types and fixed-length
strings are stored in a variable-length storage format, just like Varchar.
Page Compression
Page compression allows common data to be shared between rows for
a given page. It uses the following techniques to compress data:
• Row compression.
• Prefix Compression. For every column in a page, duplicate
prefixes are identified. These prefixes are saved in compression
information (CI) headers, which reside after the page header. A
reference number is assigned to each prefix, and that reference
number replaces the prefix wherever it is used.
• Dictionary Compression. Dictionary compression searches for
duplicate values throughout the page and stores them in the CI
structure. The main difference between prefix and dictionary
compression is that the former is restricted to one column while
the latter applies to the complete page.
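Compression is enabled per table or index; for example (the table name is hypothetical):

```sql
-- Enable row compression on a table
ALTER TABLE dbo.SalesOrders REBUILD WITH (DATA_COMPRESSION = ROW);

-- Enable page compression (includes row, prefix and dictionary compression)
ALTER TABLE dbo.SalesOrders REBUILD WITH (DATA_COMPRESSION = PAGE);
```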
What are Wait Types?
There are three categories of waits, namely,
Resource Waits. Resource waits occur when a worker requests access to
a resource that is not available because that resource is either currently
used by another worker or it’s not yet available.
Queue Waits. Queue waits occur when a worker is idle, waiting for work
to be assigned.
External Waits. External waits occur when an SQL Server worker is
waiting for an external event.
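Accumulated wait statistics can be inspected through the sys.dm_os_wait_stats DMV:

```sql
-- Top waits by total wait time since the last restart (or stats clear)
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
```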
What is ‘FILLFACTOR’?
A “FILLFACTOR” is one of the important arguments that can be used
while creating an index.
According to MSDN, FILLFACTOR specifies a percentage that indicates
how much the Database Engine should fill each index page during index
creation or rebuild. The fill factor is always an integer value from 1 to 100.
The fill-factor option is designed for improving index performance and
data storage. By setting the fill-factor value, you specify the percentage of
space on each page to be filled with data, reserving free space on each
page for future table growth.
Specifying a fill-factor value of 70 would imply that 30 per cent of each
page will be left empty, providing space for index expansion as data is
added to the underlying table. Space is reserved between the index rows
rather than at the end of the index. The fill-factor setting applies only
when the index is created or rebuilt.
What is PAD_INDEX?
PAD_INDEX applies the percentage of free space specified by the fill
factor to the intermediate-level pages of the index. The PAD_INDEX
option is useful only when FILLFACTOR is specified.
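Both options are set when creating or rebuilding an index; for example (the table and column names are hypothetical):

```sql
-- Leave 30% free space on leaf pages, and apply the same
-- fill factor to the intermediate-level pages as well
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID)
    WITH (FILLFACTOR = 70, PAD_INDEX = ON);
```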
What are the questions and considerations you will make for
HA/DR design?
Understand prioritized HA/DR requirements for the
application. What are the SLAs set by the customer?
Is the customer comfortable with, and budgeted for, a shared-
storage solution?
What is the recovery point objective (RPO)? This decides the
combination of configurations; for example, failover clustering is
often deployed alongside database mirroring, with clustering used
for local HA, and database mirroring used for DR.
Consider a geo-cluster (or stretch cluster) as a combined
HA/DR solution. This solution requires software to enable the
cluster, and storage-level replication from the storage
vendor.
What is the recovery time objective (RTO)? How fast does the
system have to come back online after, say, a site failure?
Though these are some of the high-level questions, these do help narrow
down to a solution quickly or at least to a couple of options.
There are other special types of backups that we didn't cover, including:
Partial Backup
File Backup
Differential Partial Backup
Differential File Backup
Copy-Only Backups
The PAGE_VERIFY database option can be set to one of three values:
NONE
CHECKSUM
TORN_PAGE_DETECTION
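The recommended CHECKSUM setting is enabled per database; for example (the database name is hypothetical):

```sql
-- CHECKSUM is the recommended page-verification setting
ALTER DATABASE SalesDB SET PAGE_VERIFY CHECKSUM;
```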
What are some of the operations that cannot be done on the Model
Database?
Some of the restrictions when working with the Model database are:
The database cannot be dropped.
The guest user cannot be dropped.
The primary filegroup, primary data file, or log file cannot be removed.
The database cannot be renamed or set to OFFLINE.
The primary filegroup cannot be set to READ_ONLY.
Conventional (Slow):
All the constraints and keys are validated against the data before it is
loaded; this way data integrity is maintained.
Direct (Fast):
All the constraints and keys are disabled before the data is loaded. Once
data is loaded, it is validated against all the constraints and keys. If data is
found invalid or dirty, it is not included in the index, and all future
processes on this data are skipped.
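In SQL Server bulk loads, this choice roughly corresponds to whether constraints are checked during the load; a sketch (the table and file names are hypothetical):

```sql
-- Conventional-style load: validate check constraints during the load
BULK INSERT dbo.StagingOrders
FROM 'C:\Data\orders.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', CHECK_CONSTRAINTS);

-- Without CHECK_CONSTRAINTS (the default), constraints are ignored during
-- the load and must be revalidated afterwards (direct-style load).
```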
Process Goal
OLTP: Snapshot of business processes that do fundamental business
tasks
OLAP: Multi-dimensional views of business activities of planning and
decision making
Database Design
OLTP: Normalized small database. Speed is not an issue because of
the small database size, and normalization will not degrade
performance. This adopts the entity-relationship (ER) model
and an application-oriented database design.
OLAP: De-normalized large database. Speed is an issue because of the
large database size, and de-normalizing will improve performance as
there will be fewer tables to scan while performing tasks. This
adopts the star, snowflake or fact constellation model of subject-
oriented database design.
What is ODS?
ODS is the abbreviation of Operational Data Store ‑ a database structure
that is a repository for near real-time operational data rather than long-
term trend data. The ODS may further become the enterprise-shared
operational database, allowing operational systems that are being re-
engineered to use the ODS as their operational databases.
What is ETL?
ETL is an abbreviation of extract, transform, and load. ETL software
enables businesses to consolidate their disparate data while moving it
from place to place, regardless of the data's form or format. The data
can come from any source; ETL is powerful enough
to handle such disparities. First, the extract function reads data from a
specified source database and extracts a desired subset of data. Next, the
transform function works with the acquired data - using rules or lookup
tables, or creating combinations with other data - to convert it to the
desired state. Finally, the load function is used to write the resulting data
to a target database.
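The three steps can be sketched in a single T-SQL statement (all table and column names are hypothetical):

```sql
-- Extract: read a subset of rows from the source
-- Transform: standardize the country code and compute a total
-- Load: write the result into the target table
INSERT INTO dw.FactSales (OrderID, CountryCode, TotalAmount)
SELECT o.OrderID,
       UPPER(o.Country),
       o.Quantity * o.UnitPrice
FROM src.Orders AS o
WHERE o.OrderDate >= '2021-01-01';
```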
What is VLDB?
VLDB is an abbreviation of Very Large Database. For instance, a one-
terabyte database can be considered as a VLDB. Typically, these are
decision support systems or transaction processing applications serving a
large number of users.
What is MDS?
Master Data Services or MDS helps enterprises standardize the data
people rely on to make critical business decisions. With Master Data
Services, IT organizations can centrally manage critical data assets
companywide and across diverse systems, enable more people to securely
manage master data directly, and ensure the integrity of information over
time.
Report Parameter:
A report parameter is a variable defined at the report level that
allows the personalization of the report at the run time.
Report parameters differ from query parameters in that they are
defined in a report and processed by the report server.
Each time you add a report parameter to the report, a new
member is added to the Parameters collection for you to use in
an expression.
Sources extract data from data stores such as tables and views
in relational databases, files, and Analysis Services databases.
Transformations modify, summarize, and clean data.
Destinations load data into data stores or create in-memory
datasets.
What are the Differences between SSRS 2005 and SSRS 2008
versions?
Report Server 2008 is a complete architectural rewrite; from the report
processing engine and the report renderers to the fact that it no longer
depends on IIS to host the Report Server and Report Manager. The
following areas could have major implications for support:
Report Builder
SharePoint Integration
Client Print Control
Report Manager user interface
Report Server web method APIs
Command-line tools
What is PowerPivot?
PowerPivot comprises numerous client and server
components that provide customers with an end-to-end solution for
creating business intelligence via the familiar interface of Microsoft Excel
2010.
PowerPivot for SharePoint adds server-side applications and features that
support PowerPivot data access and management for workbooks that you
publish to SharePoint. PowerPivot server components load the data,
process queries, perform scheduled data refresh, and track server and
workbook usage in the farm.