DBMS explanation
1. Project Structure:
```
hr_analytics/
├── src/
│ ├── database/
│ │ ├── config.py # Database configuration
│ │ ├── init_db.py # Create tables and import data
│ │ └── db_operations.py # CRUD operations
├── sql/
│ ├── schema/
│ │ └── create_tables.sql # Table definitions
│ └── queries/
│ ├── q1_turnover.sql
│ └── ...
├── notebooks/
│ └── analysis.ipynb
├── .env # Database credentials
└── requirements.txt
```
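2. Database Configuration (src/database/config.py):
```python
import os

from dotenv import load_dotenv

# Load DB_USER and DB_PASSWORD from the .env file
load_dotenv()

DB_CONFIG = {
    'host': 'localhost',
    'database': 'hr_analytics',
    'user': os.getenv('DB_USER'),
    'password': os.getenv('DB_PASSWORD'),
    'port': 5432
}
```
3. Database Initialization (src/database/init_db.py):
```python
import pandas as pd
from sqlalchemy import create_engine

# Resolves when run as `python src/database/init_db.py` (script directory is on sys.path)
from config import DB_CONFIG

def get_connection_string():
    return (
        f"postgresql://{DB_CONFIG['user']}:{DB_CONFIG['password']}"
        f"@{DB_CONFIG['host']}:{DB_CONFIG['port']}/{DB_CONFIG['database']}"
    )

def create_database():
    """
    Import the CSV data into the employees table.
    The hr_analytics database itself must already exist.
    """
    engine = create_engine(get_connection_string())
    # Read CSV
    df = pd.read_csv('data/HRDataset_v14.csv')
    # If the CSV headers are mixed-case, lowercase them so the unquoted
    # names used in the queries (department, salary) match in PostgreSQL
    df.columns = [col.strip().lower() for col in df.columns]
    # Create table (replaces any existing employees table)
    df.to_sql('employees', engine, if_exists='replace', index=False)

if __name__ == "__main__":
    create_database()
```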
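One caveat with the connection string: if the password contains characters such as `@`, `:` or `/`, the plain f-string produces a URL that cannot be parsed correctly. Below is a minimal sketch of one way around this, assuming a dict with the same keys as `DB_CONFIG`. The function name `build_connection_string` and the sample credentials are hypothetical, not part of the project.

```python
from urllib.parse import quote_plus


def build_connection_string(config):
    """Build a PostgreSQL URL, percent-encoding the user and password."""
    user = quote_plus(config['user'])
    password = quote_plus(config['password'])
    return (
        f"postgresql://{user}:{password}"
        f"@{config['host']}:{config['port']}/{config['database']}"
    )


# Hypothetical credentials, for illustration only
example_config = {
    'host': 'localhost',
    'database': 'hr_analytics',
    'user': 'hr_user',
    'password': 'p@ss/word',
    'port': 5432,
}
print(build_connection_string(example_config))
```

SQLAlchemy also provides `sqlalchemy.engine.URL.create(...)`, which takes the parts as separate arguments and may be the tidier option if you would rather not encode by hand.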
4. Setup Steps:
Install the requirements with `pip install -r requirements.txt`:
```text
# requirements.txt
pandas
psycopg2-binary
sqlalchemy
python-dotenv
jupyter
matplotlib
seaborn
```
5. VS Code Setup:
- Install PostgreSQL extension
- Connect to database:
```json
// VS Code PostgreSQL connection
{
"name": "HR Analytics",
"server": "localhost",
"port": 5432,
"database": "hr_analytics",
"username": "your_username"
}
```
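6. Database Operations (src/database/db_operations.py):
```python
from sqlalchemy import create_engine
from config import DB_CONFIG  # assumes config.py is on the import path

class DBOperations:
    def __init__(self):
        # Build the SQLAlchemy connection URL from the shared config
        self.conn_string = (
            f"postgresql://{DB_CONFIG['user']}:{DB_CONFIG['password']}"
            f"@{DB_CONFIG['host']}:{DB_CONFIG['port']}/{DB_CONFIG['database']}"
        )
        self.engine = create_engine(self.conn_string)
```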
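The notebook code calls `db.execute_query(query)`, but the class as shown only sets up the engine. Here is a minimal sketch of what such a method might look like. To stay runnable without a PostgreSQL server, it uses the standard-library `sqlite3` module as a stand-in. The class name `SketchDBOperations`, the sample rows, and the list-of-dicts return type are all assumptions. In the project itself the method would more likely return `pd.read_sql(query, self.engine)`, so that the result is a DataFrame.

```python
import sqlite3


class SketchDBOperations:
    """Hypothetical stand-in for DBOperations, backed by in-memory SQLite."""

    def __init__(self):
        self.conn = sqlite3.connect(':memory:')
        # Rows can then be converted to dicts keyed by column name
        self.conn.row_factory = sqlite3.Row

    def execute_query(self, query, params=()):
        """Run a SELECT and return the rows as a list of dicts."""
        cursor = self.conn.execute(query, params)
        return [dict(row) for row in cursor.fetchall()]


db = SketchDBOperations()
# Made-up sample rows, for illustration only
db.conn.execute(
    'CREATE TABLE employees (employee_name TEXT, department TEXT, salary REAL)'
)
db.conn.executemany(
    'INSERT INTO employees VALUES (?, ?, ?)',
    [('A', 'IT', 70000), ('B', 'IT', 90000), ('C', 'Sales', 50000)],
)

results = db.execute_query(
    'SELECT department, COUNT(*) AS employee_count, '
    'ROUND(AVG(salary), 2) AS avg_salary '
    'FROM employees GROUP BY department ORDER BY department'
)
print(results)
```

The query here omits the `::numeric` cast, which is PostgreSQL-specific syntax that SQLite does not accept.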
7. Using in VS Code:
```sql
-- sql/queries/q1_turnover.sql
SELECT department,
COUNT(*) as employee_count,
ROUND(AVG(salary)::numeric, 2) as avg_salary
FROM employees
GROUP BY department;
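```
8. Analysis in the Notebook (notebooks/analysis.ipynb):
```python
import matplotlib.pyplot as plt
import seaborn as sns

# Assumes the DBOperations class from db_operations.py is importable here
# and provides an execute_query() method that returns a DataFrame
db = DBOperations()

# Execute query
with open('../sql/queries/q1_turnover.sql', 'r') as file:
    query = file.read()
results = db.execute_query(query)

# Create visualization
plt.figure(figsize=(12, 6))
sns.barplot(data=results, x='department', y='avg_salary')
plt.xticks(rotation=45)
plt.title('Average Salary by Department')
plt.show()
```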
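The relative path `'../sql/queries/q1_turnover.sql'` resolves only when the working directory is `notebooks/`. A small helper that takes the queries directory explicitly can make this less fragile. The sketch below assumes one query per `.sql` file. The name `load_query` is hypothetical, and the demonstration writes to a throwaway temporary directory rather than the real `sql/queries/`.

```python
from pathlib import Path
import tempfile


def load_query(queries_dir, name):
    """Return the text of <queries_dir>/<name>.sql."""
    path = Path(queries_dir) / f"{name}.sql"
    if not path.is_file():
        raise FileNotFoundError(f"No query file at {path}")
    return path.read_text(encoding='utf-8')


# Demonstration against a temporary directory with a placeholder query
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'q1_turnover.sql').write_text('SELECT 1;', encoding='utf-8')
    demo_query = load_query(tmp, 'q1_turnover')

print(demo_query)
```

In the notebook this might be called as `load_query('../sql/queries', 'q1_turnover')`, which gives a clear error message when the file is missing.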
9. Multi-device Workflow:
On each device:
```bash
# First time setup
git clone <your-repo>
createdb hr_analytics # Create PostgreSQL database
python src/database/init_db.py # Import data
# Daily workflow
git pull # Get latest queries
# Work on queries/analysis
git add sql/queries/*.sql notebooks/*.ipynb
git commit -m "Updated analysis"
git push
```
10. Example SQL Operations:
```sql
-- Read
SELECT * FROM employees WHERE department = 'IT';

-- Update
UPDATE employees
SET salary = 80000
WHERE employee_name = 'John Doe';

-- Delete
DELETE FROM employees
WHERE employee_name = 'John Doe';
```
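When statements like these are issued from Python, values are better passed as parameters than formatted into the SQL string. A minimal sketch follows, using the standard-library `sqlite3` module as a stand-in so that it runs without a PostgreSQL server. The table layout and the sample row are assumptions, and an `INSERT` is included to cover the create step. Note that `sqlite3` uses `?` placeholders, whereas psycopg2 uses `%s`.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE employees (employee_name TEXT, department TEXT, salary REAL)'
)

# Create: values are bound to the ? placeholders, not pasted into the string
conn.execute(
    'INSERT INTO employees VALUES (?, ?, ?)', ('John Doe', 'IT', 75000)
)

# Read
rows = conn.execute(
    'SELECT employee_name, salary FROM employees WHERE department = ?', ('IT',)
).fetchall()

# Update
conn.execute(
    'UPDATE employees SET salary = ? WHERE employee_name = ?',
    (80000, 'John Doe'),
)
updated = conn.execute(
    'SELECT salary FROM employees WHERE employee_name = ?', ('John Doe',)
).fetchone()[0]

# Delete
conn.execute('DELETE FROM employees WHERE employee_name = ?', ('John Doe',))
remaining = conn.execute('SELECT COUNT(*) FROM employees').fetchone()[0]
conn.commit()
```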
Would you like me to provide more details about any specific part or show how to
handle any particular analysis task?