General Data Engineering Questions
9. What are the key differences between SQL and NoSQL databases?
11. What is your experience with SQL? Can you write complex queries?
14. Can you discuss your experience with Power BI for data visualization?
15. Describe your experience with MongoDB and when you would choose it
over SQL databases.
16. Explain the difference between batch processing and stream processing.
17. What are some common algorithms you have used for predictive
modeling?
21. What is your experience with cloud platforms (e.g., AWS, Azure)?
24. Describe your approach to setting up and managing data pipelines in the
cloud.
25. Can you describe your role and contributions to the Industrial Helmet
Monitoring System project?
26. How did you handle challenges during your internship at Gilbert Research
Center?
27. Describe the methodologies you used for your predictive analysis of air
quality.
28. What inspired you to lead the Malicious Domain Detection project?
31. Describe a time when you faced a significant challenge in a project and
how you overcame it.
33. Given the ‘employees’ and ‘projects’ tables, how would you query for the
five lowest-paid employees who have completed at least three projects?
34. Write a SQL query to find the three highest-revenue items sold yesterday
in a fast-food restaurant database.
35. How would you use SQL to calculate the percentage of customers who
order drinks with their meal?
36. Explain the concept of incremental load versus initial load in ETL
processes.
37. Given two tables, employees and departments, how would you select the
top three departments with at least ten employees making over 100K?
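For question 37, one possible answer can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive solution: the column names (`departments.id`, `departments.name`, `employees.department_id`, `employees.salary`) and the helper names `build_demo_db` and `top_departments` are hypothetical, since the question does not give a schema. It also takes a literal reading of the question: keep departments with at least ten employees earning over 100K and rank them by that count. An interviewer might instead intend a ranking by the share of high earners, which would need a different query.

```python
import sqlite3

# Literal reading of question 37: count employees earning over 100K per
# department, keep departments with at least ten of them, return the top three.
QUERY = """
SELECT d.name   AS department,
       COUNT(*) AS high_earners
FROM employees AS e
JOIN departments AS d ON d.id = e.department_id
WHERE e.salary > 100000
GROUP BY d.id, d.name
HAVING COUNT(*) >= 10
ORDER BY high_earners DESC, d.name
LIMIT 3;
"""


def build_demo_db(high_earner_counts):
    """Create an in-memory database with a hypothetical schema and demo rows.

    high_earner_counts maps a department name to the number of employees
    who earn over 100K; three lower-paid employees are added to each.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE departments (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE employees (
            id            INTEGER PRIMARY KEY,
            department_id INTEGER NOT NULL REFERENCES departments(id),
            salary        REAL NOT NULL
        );
        """
    )
    for name, n_high in high_earner_counts.items():
        cur = conn.execute("INSERT INTO departments (name) VALUES (?)", (name,))
        dept_id = cur.lastrowid
        rows = [(dept_id, 120000.0)] * n_high + [(dept_id, 60000.0)] * 3
        conn.executemany(
            "INSERT INTO employees (department_id, salary) VALUES (?, ?)", rows
        )
    return conn


def top_departments(conn):
    """Run the query and return a list of (department, high_earners) tuples."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    demo = build_demo_db({"A": 12, "B": 10, "C": 9, "D": 15, "E": 11})
    for department, high_earners in top_departments(demo):
        print(department, high_earners)
```

The salary filter sits in `WHERE` so that only high earners are counted, while the ten-employee threshold sits in `HAVING` because it applies to the aggregated group rather than to individual rows.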
Advanced Topics
38. Can you explain the three approaches to implementing row versioning in
databases?
39. How would you implement a function to calculate the root mean squared
error of a regression model?
40. Describe how you would encode a categorical variable with thousands of
distinct values.
41. What are Type I and Type II errors in the context of statistical testing, and
why are they important?
Additional Questions
42. What is your approach to selecting and evaluating third-party tools for
integration into projects?
44. Can you describe a situation where you had to work with stakeholders to
gather requirements for a data project?
45. Discuss the challenges you faced when bringing together data from
different sources and how you resolved them.