Term Paper On Database Architecture
Submitted To: Dr. V. Saravana
Question 1 a) What RDBMS is being used by the organization? Determine the need for selecting the RDBMS.

Ans. A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed easily. A relational database is created using the relational model, and the software used to manage it is called a relational database management system (RDBMS). The relational database is the predominant choice for storing data, over other models such as the hierarchical database model or the network model. It consists of any number of tables, and each table has its own primary key.

An RDBMS is a program that lets you create, update, and administer a relational database. Most commercial RDBMSs use the Structured Query Language (SQL) to access the database, although SQL was invented after the development of the relational model and is not necessary for its use. An RDBMS has several major components: the business logic (the rules that govern what data are to be collected, by whom, when, and how they are to be used), the data themselves (grouped by functions based on the business needs and the specific types of data), and the commands to manipulate the data. What is informally called a database is usually the computer application that carries out the business rules and manipulates the data. Some database applications, such as FileMaker Pro, MS Access, and phpMyAdmin, provide GUI tools to facilitate defining databases and data tables and creating reports from the data. Other tools, such as MySQL, Oracle, and Sybase, are just the tool for creating and manipulating data; the programmer must create the rest of the computer application for input, reports, etc. The leading RDBMS products are Oracle, IBM's DB2 and Microsoft's SQL Server. Despite repeated challenges by competing technologies, as well as the claim by some experts that no current RDBMS has fully implemented relational principles, the majority of new corporate databases are still being created and managed with an RDBMS.

Why use an RDBMS in the Organization? When your Web application reaches a certain size, it needs a good database design behind it, and in fact this "certain size" is much smaller than almost every small-site developer thinks. RDBMSs need not be restrictive or over-architected, as their bad reputation sometimes leads developers to fear. A bit of thought about what your site does quickly turns into a sensible schema design, and it is easy to leave open expandable storage mechanisms, such as a configuration table, within an RDBMS back end. A DBMS stores data in tables where the entries are filed under specific categories and are properly indexed; this allows programmers to have much more structure when saving or retrieving data. A relational database contains data in more than one table, and each table is linked to other tables according to their relationships.
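As a hedged illustration of such a relationship (the table and column names are hypothetical, not taken from the organization in question), the following SQL sketches two tables linked by a primary-key/foreign-key pair:

    CREATE TABLE department (
        dept_id   INT PRIMARY KEY,
        dept_name VARCHAR(50) NOT NULL
    );

    CREATE TABLE employee (
        emp_id   INT PRIMARY KEY,
        emp_name VARCHAR(50) NOT NULL,
        dept_id  INT,
        -- each employee row points at exactly one department row
        FOREIGN KEY (dept_id) REFERENCES department(dept_id)
    );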
b) Identify the major database failures that have happened in the organization.
Ans. An organization's database can fail for several reasons, listed below:
1. Disk Failure
2. Security Breach/Virus Infection
3. Corrupted Data
4. Corruption or Loss of Account and NT Group Information
5. Physical Disaster

1. Disk Failure: A data disk failure may go undetected unless the database is restarted because of it. Possible causes of hard disk errors include (but are not limited to) the following conditions:
A broken connection or wire
A bad network card
A router change
Changes in the firewall
Endpoint reconfiguration
Loss of the drive where the transaction log resides
Operating system or process failure
2. Security Breach/Virus Infection: Maplesoft is investigating a security breach of its administrative database that took place on July 17th, 2012. As a result of the breach, the perpetrators gained access to some email subscription data, including email addresses, first and last names, and company and institution names. Any financial information held by Maplesoft remains secure and has not been affected by this security breach. Viruses are malicious programs that may affect the functionality of your server as well as clients and damage the Entourage database. They may enter your system in various ways, including the Internet, the network, external devices, and email. When you open a virus-infected email in an Entourage account, the database may become corrupted, leading to data loss.
5. Physical Disaster: Companies today rely on around-the-clock access to their data to support a wide range of users and external systems, including web-based self-service applications and integration with business partners' systems. It is no longer acceptable for a failure of some component (hardware or software) to leave data unavailable at any time of the day or night. This 24/7 operation requires reliable and timely disaster recovery (DR) mechanisms for all components of these solutions, including the databases. In the event of a failure, service must be resumed within a very short period of time, with minimal or no data loss. The critical importance of DR is demonstrated by the following figures:
43% of US businesses never reopen after a disaster, and a further 29% close within two years.
This strategy, with some modification, can also be used for user files such as those on a network file server. Backups require that data be backed up and that the consistency and content of the backup be monitored. Ideally, the organization should also have a backup server that is a complete duplicate of the production server. Performing a disk-to-disk backup before the scheduled full tape backup is a second level of disaster prevention: it tolerates a tape backup failure and still provides the means to completely recover a hard drive in the event of a crash. A backup server (disk-to-disk) can be expensive, but it also serves a number of additional purposes:
It can be used as a reporting server, so that reports are not run against the production database, where they could cause performance problems.
It provides a complete test environment for testing software upgrades, unit testing for conversions, testing data loads, data entry testing, and all other aspects of testing.
It provides a failover solution if the main server goes down.
It can also be used as a web server for any report deployment.
Routine and Process for a Daily Backup: These backup routines could be varied where an organization is unable to shut down its production server, or in cases where the time window is too small for a full database export, backup, copy and restore.

3. The DBMS vendor-supplied database backup routine is run on the production server.
4. The DBMS instance on the production server is shut down.
5. The cold backup of all the data, the DBMS instance and the copied-out individual files is done.
11. The DBMS vendor-supplied database backup routine is run on the backup server.
12. The DBMS instance on the backup server is shut down.
13. The cold backup of the backup server is done.
14. The DBMS instance on the backup server is brought back online.

This strategy has the advantage of providing complete redundancy on dual servers, so that failover is easily done if one of them goes down. A minimal shell sketch of part of this routine follows.
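The sketch below covers only steps 3-5 above and assumes a MySQL production server; the host name backupsrv, the service name, and all paths are illustrative, not from the original routine:

    #!/bin/bash
    # Step 3: run the vendor-supplied backup routine (mysqldump here).
    mysqldump --all-databases --single-transaction > /backups/full-$(date +%F).sql
    # Copy the export over to the backup server.
    scp /backups/full-$(date +%F).sql backupsrv:/backups/
    # Step 4: shut the DBMS instance down (service name varies by platform).
    systemctl stop mysql
    # Step 5: take the cold, file-level backup, then bring the instance back up.
    tar czf /backups/cold-$(date +%F).tgz /var/lib/mysql
    systemctl start mysql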
Q. b) List the runtime performance measures taken by the DBA to improve performance.
Ans. Planning Ahead: Before tuning, a DBA should answer questions such as:
Which infrastructure makes the most sense: centralized or distributed?
How can I minimize the impact of business growth on application and database performance?
Are strong information management and security policies in place?
Automation:
Virtualization:
Storage management: If an application needs more storage to support information processing, temporary space can be allocated from within the application.
Memory management: Automatically tune the total amount of memory consumed by database processes to minimize the need for manual configuration.
Load management: Harness a database that can invisibly manage I/O during intensive processing.
Availability and Reliability: Increase uptime with smart availability and reliability. Considering the following points when designing high-availability and disaster recovery solutions can help DBAs rest easier:
Assess the potential cost of downtime. By planning for the worst, you'll be prepared no matter what happens.
Balance business risk against cost to determine the level of recovery and availability that's right for your business. Minutes of downtime can equate to millions in lost revenue for some, but not every business needs uninterrupted data delivery or instant failover.
Automate regular database backup procedures, and practice failover and recovery often (a minimal cron sketch follows this list).
Extend recovery responsibilities and training across the team; DBAs may not always be available when disaster strikes.
Simplify and accelerate the process of setting up primary and standby databases and adding nodes or servers. Choose a database that makes it easy to scale recovery solutions by spreading workloads across more servers to match business growth.
Minimize manual processes when recovering databases; DBAs will be busy enough in the event of a disaster, so automate what you can. By implementing self-healing databases that can take care of minor issues on their own, you can prevent avoidable downtime without constant monitoring.
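As a hedged illustration of automating regular backups (assuming a MySQL back end and cron; the schedule and paths are illustrative, not prescribed by the text):

    # crontab entry: compressed logical backup every night at 01:30
    # (% must be escaped as \% inside a crontab line)
    30 1 * * * mysqldump --all-databases --single-transaction | gzip > /backups/nightly-$(date +\%F).sql.gz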
Question 3 Provide details of each component and draw a diagram of a database in the networking environment you have analyzed in question 1.
Ans. (Diagram of the database in the networking environment, with components including the academic record.)
Question 4.
Investigate the various backup strategies used in MySQL and Oracle. Which RDBMS provides effective storage management techniques? Justify your answer with an example. Make use of the University/School library or internet resources for your investigation.
Ans. Oracle Data Guard makes it possible to back up a production database using a valid physical standby database as the target for the backups, and those same backups can be used to restore and recover the production database.
Backup strategies:
It is a good idea to schedule both logical and binary backups. They each have their use cases and add redundancy: if there is an issue with one backup tool, it is unlikely to affect the other.
Store your backups on more than one server. In addition to local copies, store backups offsite. Look at the cost of S3 or S3+Glacier; it's worth the peace of mind!
Test your backups, and if you have a test environment, load them there periodically. You can also spin up an EC2 instance to load your backups onto. In addition, rolling forward 24 hours of binlogs is a good test.
Store your binlogs off your primary server so you can perform point-in-time recovery. Store your binlogs offsite for disaster recovery scenarios.
Run pt-table-checksum periodically (e.g. once a month) and make sure your servers' data stays consistent. Checksumming is important: backups are typically pulled off a slave, and it is vital that it has the same data as the master.

There are several ways to take backups (some good, some bad, and some that depend on your situation). Here's the thought process I use for choosing a backup strategy.
MYSQL BACKUPS:
Amazon S3 for MySQL: I discuss S3 here, but other cloud-based storage can be used as well. S3 is just the most popular in this category and is in wide use. Details:
s3cmd: we have been using the version from GitHub, mostly for multipart upload support. This prevents us from having to split files up before uploading to S3.
There is a released alpha version of this tool here. You can now set bucket lifecycle properties so that data over X days old is archived to Glacier and data over Y days old is removed. This is a very convenient feature and allows you to cost-effectively store long-term backups with little additional work.
Tips/Tricks:
Use add-header=x-amz-server-side-encryption:AES256 to use the server-side encryption feature, which helps with some types of compliance. We also have the capability to encrypt all files with gpg prior to upload via a separate script. A hedged example upload command follows.
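The sketch below assumes the alpha s3cmd with multipart support mentioned above; the bucket name, chunk size, and file path are illustrative, not from the original text:

    # Upload a compressed backup to S3 in 100 MB multipart chunks,
    # asking S3 to encrypt it server-side (AES-256) at rest.
    s3cmd put --multipart-chunk-size-mb=100 \
        --add-header=x-amz-server-side-encryption:AES256 \
        /backups/full-backup.sql.gz s3://example-backup-bucket/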
mysqlbinlog 5.6: Last year, Percona IT director Tamas Kozak had a great blog post that showed how mysqlbinlog in 5.6 could be used. With mysqlbinlog 5.6, you can now pull binary logs in real time to another server.
Useful for mirroring the binlogs on the master to a second server.
Allows you to roll forward backups even after losing the master, which is very useful for disaster recovery. You can have your backups in S3 and mysqlbinlog --stop-never running on a small EC2 instance. This allows for a very low-cost disaster recovery plan that ensures you will not lose data even in the worst-case scenarios.
Takes very few resources to run; it can run almost anywhere with disk space, and it writes out binlog files sequentially.
In practice you also want to:
Ensure it stays running, and restart it if it appears to be hanging.
Verify the file is the same on master and slave.
Re-transfer files that are partially transferred.
Compress the files after successful transfer.
A hedged example of the streaming command follows.
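A minimal sketch of the real-time binlog pull, assuming MySQL 5.6 or later; the host name, replication user, and starting binlog file name are illustrative:

    # Stream binary logs from the master continuously, writing raw
    # binlog files to the current directory.
    mysqlbinlog --read-from-remote-server --host=master.example.com \
        --user=repl --password --raw --stop-never mysql-bin.000001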
xtrabackup:
It can restore an entire server very fast. Often the limiter on how fast a backup can be restored to another server is how fast you can transfer data over your network: if you have a 1GB network and 1TB of data, it could take a while.
It can compress the DB on the fly.
It can back up a server at approximately the maximum rate the server allows, given its I/O system.
It can typically execute a backup with little to no major impact on the server. For example, in xtrabackup 2.0.5+, the time taken for FLUSH TABLES WITH READ LOCK is normally under 1 second.
Tips/Tricks:
If you have a lot of non-transactional tables (i.e. MyISAM), use the rsync option. This rsyncs a copy of all the frm files and all the MYD/MYI files, and then does a second rsync while under a global lock. This means that where you may previously have been locked for hours when you had many non-transactional tables, you can now be locked for under a second. Even with InnoDB only, this can greatly cut down on the lock time by syncing the frm files.
Enable slave-info when backing up from a slave so you know the position you are at in the master's binlogs. A hedged example invocation follows.
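A minimal sketch of such a backup run, assuming Percona XtraBackup's innobackupex wrapper; the target directory is illustrative:

    # Take a compressed backup from a slave, recording the master's
    # binlog position and using rsync to shorten the global lock.
    innobackupex --slave-info --rsync --compress /backups/xtrabackup/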
mydumper:
Consistent backups between MyISAM and InnoDB tables: the global read lock is only held until the MyISAM tables are dumped.
We are researching how we could further improve lock times when non-transactional tables are used.
Almost no locking if you are not using MyISAM tables.
Built-in compression.
Each table is dumped to a separate file. This is very important for making it easy to restore single tables: you can quickly restore a single table instead of restoring your entire backup just to find one tiny table you need. This is actually the most common type of restore needed, so it is important to make this operation as painless as possible.
Compressed mydumper backups are typically 3x-5x smaller than compressed xtrabackup backups. We typically upload mydumper backups to S3 rather than xtrabackup backups, given the time needed to upload and download; though this depends on the available bandwidth, it should be factored into your restore time.
Tips/Tricks:
Run with kill-long-queries to avoid nasty problems with FLUSH TABLES WITH READ LOCK.
compress: compresses tables per file and should typically be enabled by default. The time needed to uncompress is not a limiting factor on restore time when done inline. A hedged example invocation follows.
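A minimal sketch of a mydumper run using the options above; the output directory is illustrative:

    # Dump each table to its own compressed file, killing queries that
    # block FLUSH TABLES WITH READ LOCK rather than waiting on them.
    mydumper --compress --kill-long-queries --outputdir /backups/mydumper/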
Oracle Backup
When performing an Oracle backup, you create a representative copy of the present original data. If or when the original data is lost, the DBA can use the backup to reconstruct the lost information. This database copy includes important parts of the database, such as the control file, archive logs, and datafiles. In the event of a media failure, the database backup is the key to successfully recovering data. A few common questions related to database backup in general are:
The frequency of the backup
Choosing a strategy for the backup
The type of backup

Frequent and regular whole-database or tablespace backups are essential for any recovery scheme. The frequency of backups should be based on the rate or frequency of changes to database data, such as insertions, updates, and deletions of rows in existing tables, and the addition of new tables. If a database's data changes at a high rate, the backup frequency should be proportionally high. When the Oracle database is created, the DBA has to plan beforehand for the protection of the database against potential failures. There are two modes of handling an Oracle backup, according to which the DBA can choose an appropriate strategy:

NOARCHIVELOG mode: If it is acceptable to lose a limited amount of data in the event of a disk failure, you can operate the database in NOARCHIVELOG mode and avoid the extra work required to archive filled online redo log files.

ARCHIVELOG mode: If it is not acceptable to lose any data, the database must be operated in ARCHIVELOG mode, ideally with a multiplexed online redo log. If you need to recover to a past point in time to correct a major operational or programmatic change to the database, be sure to run in ARCHIVELOG mode and perform control file backups whenever making structural changes. A hedged sketch of enabling ARCHIVELOG mode and taking a backup follows.
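The following is a minimal sketch of the ARCHIVELOG workflow, run from SQL*Plus as SYSDBA; it illustrates the general procedure, not a specific site's routine:

    -- a clean restart into MOUNT state is required to change the mode
    SHUTDOWN IMMEDIATE;
    STARTUP MOUNT;
    ALTER DATABASE ARCHIVELOG;
    ALTER DATABASE OPEN;

Once in ARCHIVELOG mode, RMAN can back up the database together with its archived redo logs:

    BACKUP DATABASE PLUS ARCHIVELOG;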
Provide the details of the Listener running at the database side. Explore the TNSNAMES.ORA file, the SQLNET.ORA file, and other parameters required for Oracle database connectivity.
Ans. The Oracle listener is a cause of many issues when attempting to configure it for use. Because the listener is usually configured and then forgotten about, it is sometimes overlooked and only learned about when there are errors. So it is a good idea to cover some of the basics, along with some error messages that you might come across which involve the listener and the configuration of the listener.ora and tnsnames.ora files. Several configuration files and listener types exist on the database side, such as LISTENER.ORA, SQLNET.ORA, TNSNAMES.ORA, the remote listener, and non-default listeners.
The default domain for the computer. This domain is automatically appended to any unqualified net service name. For example, if the default domain is set to us.example.com, then Oracle Database resolves db in the connect string CONNECT scott/tiger@db as db.us.example.com.
A naming method the server uses to resolve a name to a connect descriptor. The following is a sample SQLNET.ORA file created during a preconfigured database configuration install:

    NAMES.DEFAULT_DOMAIN = your_network_domain_name
    NAMES.DIRECTORY_PATH = (TNSNAMES, ONAMES, HOSTNAME)

When you set a default domain name, that domain name is automatically appended to any unqualified name, so typing @Live will be interpreted as @Live.ss64.com. To force a lookup without the DEFAULT_DOMAIN being appended, append a dot to the tnsname: TNSPING LIVE.
TNSNAMES.ORA File: A TNSNAMES.ORA file is created on each node with net service names. A connect identifier is an identifier that maps to a connect descriptor. A connect descriptor contains the following information:
The network route to the service, including the location of the listener through a protocol address.
The SERVICE_NAME for an Oracle Database Release 8.1 or later.
A minimal sample entry is sketched below.
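In this sample the net service name, host, and service name are all illustrative:

    ORCL =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
        (CONNECT_DATA =
          (SERVICE_NAME = orcl.example.com)))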
Multiple Descriptions in tnsnames.ora: A tnsnames.ora file can contain net service names with one or more connect descriptors, and each connect descriptor can contain one or more protocol addresses. A DESCRIPTION_LIST defines a list of connect descriptors.

Multiple Address Lists in tnsnames.ora: The tnsnames.ora file also supports connect descriptors with multiple lists of addresses, each with its own characteristics. For example, two address lists can be presented: the first address list features client load balancing and no connect-time failover, affecting only the protocol addresses within that list; the second features connect-time failover and no client load balancing, affecting only the protocol addresses within its ADDRESS_LIST. The client first tries the first or second protocol address at random, then tries protocol addresses three and four sequentially. A hedged sample of such an entry follows.
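In this sketch the host and service names are illustrative; the first ADDRESS_LIST load-balances without failover, and the second fails over without load balancing:

    SALES =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (LOAD_BALANCE = on)
          (FAILOVER = off)
          (ADDRESS = (PROTOCOL = TCP)(HOST = host1.example.com)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = host2.example.com)(PORT = 1521)))
        (ADDRESS_LIST =
          (LOAD_BALANCE = off)
          (FAILOVER = on)
          (ADDRESS = (PROTOCOL = TCP)(HOST = host3.example.com)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = host4.example.com)(PORT = 1521)))
        (CONNECT_DATA = (SERVICE_NAME = sales.example.com)))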
Connecting to Oracle Database: You can connect to Oracle Database only through a client program, such as SQL*Plus or SQL Developer.
1. Connecting to Oracle Database from SQL*Plus
2. Connecting to Oracle Database from SQL Developer
3. Connecting to Oracle Database as User HR

Connecting to Oracle Database from SQL*Plus: SQL*Plus is a client program with which you can access Oracle Database. This section shows how to start SQL*Plus and connect to Oracle Database. To connect to Oracle Database from SQL*Plus:
1. If you are on a Windows system, display a Windows command prompt.
2. At the command prompt, type sqlplus and press the Enter key. SQL*Plus starts and prompts you for your user name. Type your user name and press the Enter key. SQL*Plus then prompts you for your password.
3. Type your password and press the Enter key.

Connecting to Oracle Database from SQL Developer: SQL Developer is a client program with which you can access Oracle Database. This section assumes that SQL Developer is installed on your system, and shows how to start it and connect to Oracle Database. If SQL Developer is not installed on your system, see the Oracle Database Developer's Guide for installation instructions.

Connecting to Oracle Database as User HR: This section shows how to unlock the HR account and connect to Oracle Database as the user HR, who owns the HR sample schema that the examples and tutorials in this document use. To do the tutorials and examples in this document, and to create the sample application, you must connect to Oracle Database as the user HR from SQL Developer. The HR sample schema is the development environment for the sample application.
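As a hedged illustration of the HR steps above (the password shown is a placeholder you must choose yourself), the account can be unlocked from SQL*Plus as SYSDBA:

    ALTER USER hr ACCOUNT UNLOCK;
    ALTER USER hr IDENTIFIED BY your_hr_password;

After that, connecting as HR from a command prompt is simply:

    sqlplus hr@orcl

where orcl is an illustrative net service name defined in tnsnames.ora; SQL*Plus will prompt for the password.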