
Datasets For ESA Vol 1

The document outlines various datasets related to student performance, mental health, and adaptability in education. It includes details on datasets such as the Student Depression Dataset, Students Performance in Exams, and Higher Education Students Performance Evaluation, among others, highlighting their features and potential applications in research and machine learning. Ethical considerations and data privacy are emphasized, particularly for sensitive information regarding student mental health.

Uploaded by

loanb8391

Datasets for ESA Vol 1

1. Student Depression Dataset (Student Depression Dataset.csv)


A student depression dataset typically contains data aimed at analyzing, understanding,
and predicting depression levels among students. It may include features such as demographic
information (age, gender), academic performance (grades, attendance), lifestyle habits (sleep
patterns, exercise, social activities), mental health history, and responses to standardized
depression scales.
These datasets are valuable for research in psychology, data science, and education to
identify factors contributing to student depression and to design early intervention strategies.
Ethical considerations like privacy, informed consent, and anonymization of data are crucial in
working with such sensitive information.
- ID: Unique identifier for each student.
- Age: Age of the student.
- Gender: Gender (e.g., Male, Female).
- City: Geographic region.
- CGPA: Grade Point Average or other academic scores.
- Sleep Duration: Average daily sleep duration.
- Profession:
- Work Pressure:
- Academic Pressure:
- Study Satisfaction:
- Job Satisfaction:
- Dietary Habits:
And much more

2. Students Performance in Exams (StudentsPerformance.csv)


This data set consists of the marks secured by the students in various subjects.
3. Students Adaptability Level in Online Education
(students_adaptability_level_online_education.csv)
For a beginner in machine learning, this dataset is a good opportunity to try techniques for
predicting students' adaptability level in online education. It includes the following features:
Gender
Age
Education Level
Institution Type
IT Student
Location in Town
Load-shedding
Financial Condition
Internet Type
Network Type
Class Duration
Self LMS
Device
4. Higher Education Students Performance Evaluation (student_prediction.csv)
The data was collected from the Faculty of Engineering and Faculty of Educational
Sciences students in 2019. The purpose is to predict students' end-of-term performances using
ML techniques.
1- Student Age (1: 18-21, 2: 22-25, 3: above 26)
2- Sex (1: female, 2: male)
3- Graduated high-school type: (1: private, 2: state, 3: other)
4- Scholarship type: (1: None, 2: 25%, 3: 50%, 4: 75%, 5: Full)
5- Additional work: (1: Yes, 2: No)
6- Regular artistic or sports activity: (1: Yes, 2: No)
7- Do you have a partner: (1: Yes, 2: No)
8- Total salary if available (1: USD 135-200, 2: USD 201-270, 3: USD 271-340, 4: USD 341-410,
5: above 410)
9- Transportation to the university: (1: Bus, 2: Private car/taxi, 3: bicycle, 4: Other)
10- Accommodation type in Cyprus: (1: rental, 2: dormitory, 3: with family, 4: Other)
11- Mother's education: (1: primary school, 2: secondary school, 3: high school, 4: university,
5: MSc., 6: Ph.D.)
12- Father's education: (1: primary school, 2: secondary school, 3: high school, 4: university,
5: MSc., 6: Ph.D.)
13- Number of sisters/brothers (if available): (1: 1, 2: 2, 3: 3, 4: 4, 5: 5 or above)
14- Parental status: (1: married, 2: divorced, 3: died - one of them or both) (note: listed as
"Kids" in the dataset)
15- Mother's occupation: (1: retired, 2: housewife, 3: government officer, 4: private sector
employee, 5: self-employment, 6: other)
16- Father's occupation: (1: retired, 2: government officer, 3: private sector employee,
4: self-employment, 5: other)
17- Weekly study hours: (1: None, 2: <5 hours, 3: 6-10 hours, 4: 11-20 hours, 5: more than 20
hours)
18- Reading frequency (non-scientific books/journals): (1: None, 2: Sometimes, 3: Often)
19- Reading frequency (scientific books/journals): (1: None, 2: Sometimes, 3: Often)
20- Attendance to the seminars/conferences related to the department: (1: Yes, 2: No)
21- Impact of your projects/activities on your success: (1: positive, 2: negative, 3: neutral)
22- Attendance to classes (1: always, 2: sometimes, 3: never)
23- Preparation to midterm exams 1: (1: alone, 2: with friends, 3: not applicable)
24- Preparation to midterm exams 2: (1: closest date to the exam, 2: regularly during the
semester, 3: never)
25- Taking notes in classes: (1: never, 2: sometimes, 3: always)
26- Listening in classes: (1: never, 2: sometimes, 3: always)
27- Discussion improves my interest and success in the course: (1: never, 2: sometimes, 3:
always)
28- Flip-classroom: (1: not useful, 2: useful, 3: not applicable)
29- Cumulative grade point average in the last semester (/4.00): (1: <2.00, 2: 2.00-2.49, 3:
2.50-2.99, 4: 3.00-3.49, 5: above 3.49)
30- Expected Cumulative grade point average in the graduation (/4.00): (1: <2.00, 2: 2.00-2.49,
3: 2.50-2.99, 4: 3.00-3.49, 5: above 3.49)
31- Course ID
32- OUTPUT Grade (0: Fail, 1: DD, 2: DC, 3: CC, 4: CB, 5: BB, 6: BA, 7: AA)
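Because every attribute above is stored as an integer code, a small lookup step makes the records readable before analysis. The following is a minimal sketch: the code-to-label maps are copied from the list above, but the record keys (`weekly_study_hours`, `attendance`, `grade`) are illustrative names, not necessarily the headers used in student_prediction.csv.

```python
# Code-to-label maps copied from the attribute list above.
OUTPUT_GRADE = {0: "Fail", 1: "DD", 2: "DC", 3: "CC", 4: "CB", 5: "BB", 6: "BA", 7: "AA"}
WEEKLY_STUDY_HOURS = {1: "None", 2: "<5 hours", 3: "6-10 hours",
                      4: "11-20 hours", 5: "more than 20 hours"}
ATTENDANCE = {1: "always", 2: "sometimes", 3: "never"}


def decode(value, mapping):
    """Return the label for an integer code, or None if the code is not listed."""
    return mapping.get(int(value))


def decode_row(row, column_maps):
    """Decode the columns that have a map; pass other columns through unchanged."""
    return {
        column: decode(value, column_maps[column]) if column in column_maps else value
        for column, value in row.items()
    }


# Hypothetical record; the keys are illustrative, not the CSV's actual headers.
row = {"weekly_study_hours": "3", "attendance": "1", "grade": "5"}
maps = {
    "weekly_study_hours": WEEKLY_STUDY_HOURS,
    "attendance": ATTENDANCE,
    "grade": OUTPUT_GRADE,
}
decoded = decode_row(row, maps)
print(decoded)
```

Keeping the maps as plain dictionaries means an unlisted code decodes to None instead of raising, which makes bad values easy to spot.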

5. Student Study performance (Student Study data.csv)


A statistical study of how students' academic performance relies on their age, marital status,
sleeping hours, and other relevant details that may affect their study performance.

To help you make these predictions, the dataset provides a set of students' records, noting
whether each day was a weekend, with study performance measured by CGPA. Marital status
(married or unmarried) was also collected.
File and Data Field Descriptions
Attributes
Students
Name: Student's name
Date: Date when the data was collected
Day: Day of the week
Marital_Status: Student's marital status
your Gender: Gender of the student
Study Hour: Hours the student studied
Your Sleep Hour: Total hours the student slept in a day
Weekend: Whether the day was a weekend where the student lives
What is your age: Age of the student
what is your cgpa: Current CGPA of the student
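The field names above mix casing, spaces, and question-style phrasing, so renaming them once at load time keeps later analysis simpler. The sketch below assumes the CSV headers are spelled exactly as listed; the short names on the right are my own choice, and the sample rows are made up to stand in for the real file.

```python
import csv
import io

# Headers as listed above, mapped to shorter names (the new names are illustrative).
RENAME = {
    "Name": "name",
    "Date": "date",
    "Day": "day",
    "Marital_Status": "marital_status",
    "your Gender": "gender",
    "Study Hour": "study_hours",
    "Your Sleep Hour": "sleep_hours",
    "Weekend": "weekend",
    "What is your age": "age",
    "what is your cgpa": "cgpa",
}


def read_clean(handle):
    """Yield each CSV record with renamed keys and whitespace-trimmed values."""
    for record in csv.DictReader(handle):
        yield {
            RENAME.get(key.strip(), key.strip()): (value or "").strip()
            for key, value in record.items()
        }


# Made-up sample standing in for the real file.
sample = io.StringIO(
    "Name,Study Hour,Your Sleep Hour,what is your cgpa\n"
    "Student A,4,7,3.5\n"
)
rows = list(read_clean(sample))
print(rows)
```

To use it on the actual file, pass an open file handle instead of the `io.StringIO` sample.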
6. Student Performance Data Set (student-por.csv)
These data address student achievement in secondary education at two Portuguese schools.
The data attributes include student grades and demographic, social, and school-related
features, and were collected using school reports and questionnaires. Two datasets are
provided regarding the performance in two distinct subjects: Mathematics (mat) and Portuguese
language (por). In [Cortez and Silva, 2008], the two datasets were modeled under
binary/five-level classification and regression tasks. Important note: the target attribute G3
has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year
grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd-period grades.
It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful
(see paper source for more details).
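The note about G1 and G2 amounts to a choice of which columns go into the feature set. A minimal sketch of that choice follows; the function name and the sample record are hypothetical, and only the column names G1, G2, and G3 come from the dataset description.

```python
def split_features(record, use_prior_grades=False):
    """Separate one student record into (features, target), where the target is G3.

    With use_prior_grades=False, G1 and G2 are left out of the features,
    which is the harder but more useful setup described above.
    """
    excluded = {"G3"} if use_prior_grades else {"G1", "G2", "G3"}
    features = {key: value for key, value in record.items() if key not in excluded}
    return features, record["G3"]


# Hypothetical record with a few of the dataset's columns.
record = {"studytime": 2, "absences": 4, "G1": 12, "G2": 13, "G3": 13}

hard_features, target = split_features(record)
easy_features, _ = split_features(record, use_prior_grades=True)
print(sorted(hard_features), sorted(easy_features), target)
```

Comparing a model trained on each feature set shows how much of the accuracy comes from the earlier period grades alone.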

7. Student Performance Prediction (mat2.csv)


These data address student achievement in secondary education at two Portuguese
schools. The data attributes include student grades and demographic, social, and school-related
features, and were collected using school reports and questionnaires.
Two datasets are provided regarding the performance in two distinct subjects:
Mathematics (mat.csv) and Portuguese language (por.csv). In [Cortez and Silva, 2008], the two
datasets were modeled under binary/five-level classification and regression tasks.
school: student's school (binary: 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira)
sex: student's sex (binary: 'F' - female or 'M' - male)
age: student's age (numeric: from 15 to 22)
address: student's home address type (binary: 'U' - urban or 'R' - rural)
famsize: family size (binary: 'LE3' - less or equal to 3 or 'GT3' - greater than 3)
Pstatus: parent's cohabitation status (binary: 'T' - living together or 'A' - apart)
Medu: mother's education (numeric: 0 - none, 1 - primary education (4th grade),
2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
Fedu: father's education (numeric: 0 - none, 1 - primary education (4th grade),
2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
Mjob: mother's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g.
administrative or police), 'at_home' or 'other')
Fjob: father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g.
administrative or police), 'at_home' or 'other')
reason: reason to choose this school (nominal: close to 'home', school 'reputation',
'course' preference or 'other')
guardian: student's guardian (nominal: 'mother', 'father' or 'other')
traveltime: home to school travel time (numeric: 1 - <15 min., 2 - 15 to 30 min.,
3 - 30 min. to 1 hour, or 4 - >1 hour)
studytime: weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours,
or 4 - >10 hours)
failures: number of past class failures (numeric: n if 1<=n<3, else 4)
schoolsup: extra educational support (binary: yes or no)
famsup: family educational support (binary: yes or no)
paid: extra paid classes within the course subject (Math or Portuguese) (binary: yes or no)
activities: extra-curricular activities (binary: yes or no)
nursery: attended nursery school (binary: yes or no)
higher: wants to take higher education (binary: yes or no)
internet: Internet access at home (binary: yes or no)
romantic: with a romantic relationship (binary: yes or no)
famrel: quality of family relationships (numeric: from 1 - very bad to 5 - excellent)
freetime: free time after school (numeric: from 1 - very low to 5 - very high)
goout: going out with friends (numeric: from 1 - very low to 5 - very high)
Dalc: workday alcohol consumption (numeric: from 1 - very low to 5 - very high)
Walc: weekend alcohol consumption (numeric: from 1 - very low to 5 - very high)
health: current health status (numeric: from 1 - very bad to 5 - very good)
absences: number of school absences (numeric: from 0 to 93)

These grades are related to the course subject, Math or Portuguese:
G1 - first period grade (numeric: from 0 to 20)
G2 - second period grade (numeric: from 0 to 20)
G3 - final grade (numeric: from 0 to 20, output target)
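Several attributes above are text-valued (binary yes/no or nominal), so most machine learning methods need them converted to numbers first. The sketch below shows one common approach, assuming yes/no maps to 1/0 and the job columns are one-hot encoded. The column and category names are taken from the list above; the function name and sample record are hypothetical.

```python
# Category and column names taken from the attribute list above.
JOBS = ["teacher", "health", "services", "at_home", "other"]
YES_NO_COLUMNS = ["schoolsup", "famsup", "paid", "activities",
                  "nursery", "higher", "internet", "romantic"]


def encode(record):
    """Turn yes/no columns into 1/0 and one-hot encode Mjob and Fjob."""
    encoded = {}
    for column in YES_NO_COLUMNS:
        if column in record:
            encoded[column] = 1 if record[column] == "yes" else 0
    for column in ("Mjob", "Fjob"):
        if column in record:
            for job in JOBS:
                # One indicator column per job category, e.g. "Mjob_health".
                encoded[f"{column}_{job}"] = 1 if record[column] == job else 0
    return encoded


# Hypothetical record with a few of the dataset's columns.
student = {"internet": "yes", "romantic": "no", "Mjob": "health"}
features = encode(student)
print(features)
```

The ordinal attributes (Medu, studytime, famrel, and so on) are already numeric and can usually be used as they are.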

8. Student Flexibility in Online Learning


(students_adaptability_level_online_education.csv)
For a beginner in machine learning, this dataset is a good opportunity to try techniques for
predicting students' adaptability level in online education.
9. Student Performance & Behavior Dataset (Students_Grading_Dataset.csv)
This dataset contains 5,000 real records collected from a private learning provider.
The dataset includes key attributes necessary for exploring patterns, correlations, and insights
related to academic performance.
1. Student_ID: Unique identifier for each student.
2. First_Name: Student’s first name.
3. Last_Name: Student’s last name.
4. Email: Contact email (can be anonymized).
5. Gender: Male, Female, Other.
6. Age: The age of the student.
7. Department: Student's department (e.g., CS, Engineering, Business).
8. Attendance (%): Attendance percentage (0-100%).
9. Midterm_Score: Midterm exam score (out of 100).
10. Final_Score: Final exam score (out of 100).
11. Assignments_Avg: Average score of all assignments (out of 100).
12. Quizzes_Avg: Average quiz scores (out of 100).
13. Participation_Score: Score based on class participation (0-10).
14. Projects_Score: Project evaluation score (out of 100).
15. Total_Score: Weighted sum of all grades.
16. Grade: Letter grade (A, B, C, D, F).
17. Study_Hours_per_Week: Average study hours per week.
18. Extracurricular_Activities: Whether the student participates in extracurriculars
(Yes/No).
19. Internet_Access_at_Home: Does the student have access to the internet at home?
(Yes/No).
20. Parent_Education_Level: Highest education level of parents (None, High School,
Bachelor's, Master's, PhD).
21. Family_Income_Level: Low, Medium, High.
22. Stress_Level (1-10): Self-reported stress level (1: Low, 10: High).
23. Sleep_Hours_per_Night: Average hours of sleep per night.
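Since the records carry names and email addresses, it is worth stripping direct identifiers before any analysis. The following is a minimal sketch, assuming a salted SHA-256 hash is acceptable as a replacement identifier. This is pseudonymization and not full anonymization; the function name, salt value, and sample record are hypothetical.

```python
import hashlib

# Columns from the list above that identify a person directly.
DIRECT_IDENTIFIERS = ("First_Name", "Last_Name", "Email")


def pseudonymize(record, salt):
    """Drop names and email, and replace Student_ID with a salted hash prefix."""
    cleaned = {key: value for key, value in record.items()
               if key not in DIRECT_IDENTIFIERS}
    digest = hashlib.sha256((salt + str(record["Student_ID"])).encode("utf-8")).hexdigest()
    cleaned["Student_ID"] = digest[:16]  # shortened for readability
    return cleaned


# Made-up record; the values are not from the dataset.
record = {
    "Student_ID": "S1000",
    "First_Name": "Ana",
    "Last_Name": "Example",
    "Email": "ana@example.com",
    "Final_Score": 88,
}
out = pseudonymize(record, salt="example-salt")
print(out)
```

The same salt yields the same replacement ID, so records stay linkable across files; keep the salt private, because anyone who has it can recompute the mapping.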
10. Student Performance on an Entrance Examination
(Student_Performance_on_an_Entrance_Examination.csv)
This dataset contains comprehensive information regarding candidates' performance in
a common entrance examination, alongside various demographic and academic indicators.
It is designed to support analysis into the factors influencing success in competitive exams
and can serve as a valuable resource for educational researchers and data scientists.
1. Examination Performance: Data reflecting the candidate’s results in the entrance
examination.
2. Candidate Demographics
a. Sex: Gender of the candidate.
b. Caste: Caste classification of the candidate.
3. Coaching Details: Information on whether the candidate attended coaching classes within
Assam, outside Assam, or did not attend any coaching.
4. Educational Background
a. Board Details: Names of the boards where the candidate studied during Class X and
Class XII.
b. Medium of Instruction: The medium used for teaching during Class XII.
5. Academic Performance
a. Class X Percentage: Marks secured at the Class X level.
b. Class XII Percentage: Marks secured at the Class XII level.
6. Parental Occupation: Occupation details for both the candidate's father and mother, which
can help analyze socioeconomic influences on performance.

11. Personalized Learning & Adaptive Education Dataset


(personalized_learning_dataset.csv)
This dataset is designed to support research on adaptive learning systems, personalized
education, and predictive student success modeling. It captures rich interaction logs from
online education platforms, including student engagement, quiz performance, learning
preferences, and dropout likelihood.
1. Student_ID – Unique identifier for each student
2. Age – Student's age (15-50 years)
3. Gender – Male, Female, Other
4. Education_Level – High School, Undergraduate, Postgraduate
5. Course_Name – Online course enrolled (e.g., Machine Learning, Python Basics,
Data Science)
6. Time_Spent_on_Videos (mins) – Total minutes spent watching videos
7. Quiz_Attempts – Number of attempts per quiz
8. Quiz_Scores (%) – Percentage score in quizzes
9. Forum_Participation (posts) – Number of forum discussions participated in
10. Assignment_Completion_Rate (%) – Percentage of completed assignments
11. Engagement_Level – Low, Medium, High (Based on activity metrics)
12. Final_Exam_Score (%) – Percentage score in the final exam
13. Learning_Style – Visual, Auditory, Reading/Writing, Kinesthetic
14. Feedback_Score (1-5) – Student rating of the course
15. Dropout_Likelihood (Yes/No) – Whether the student is likely to drop out
12. Student Performance (Student_Performance.csv)
The Student Performance Dataset is a dataset designed to examine the factors
influencing academic student performance. The dataset consists of 10,000 student records,
with each record containing information about various predictors and a performance index.
1. Hours Studied: The total number of hours spent studying by each student.
2. Previous Scores: The scores obtained by students in previous tests.
3. Extracurricular Activities: Whether the student participates in extracurricular
activities (Yes or No).
4. Sleep Hours: The average number of hours of sleep the student had per day.
5. Sample Question Papers Practiced: The number of sample question papers the
student practiced.
13. Students Performance Dataset (Student_performance_data _.csv)
This dataset contains comprehensive information on 2,392 high school students,
detailing their demographics, study habits, parental involvement, extracurricular activities, and
academic performance. The target variable, GradeClass, classifies students' grades into distinct
categories, providing a robust dataset for educational research, predictive modeling, and
statistical analysis.
1. Student Information
a. Student ID
b. Demographic Details
c. Study Habits
2. Parental Involvement
3. Extracurricular Activities
4. Academic Performance
5. Target Variable: Grade Class

Student Information
Student ID
StudentID: A unique identifier assigned to each student (1001 to 3392).
Demographic Details
Age: The age of the students ranges from 15 to 18 years.
Gender: Gender of the students, where 0 represents Male and 1 represents
Female.
Ethnicity: The ethnicity of the students, coded as follows: 0: Caucasian; 1:
African American; 2: Asian; 3: Other
ParentalEducation: The education level of the parents, coded as follows: 0: None; 1:
High School; 2: Some College; 3: Bachelor's; 4: Higher
Study Habits
StudyTimeWeekly: Weekly study time in hours, ranging from 0 to 20.
Absences: Number of absences during the school year, ranging from 0 to 30.
Tutoring: Tutoring status, where 0 indicates No and 1 indicates Yes.
Parental Involvement
ParentalSupport: The level of parental support, coded as follows: 0: None; 1:
Low; 2: Moderate; 3: High; 4: Very High
Extracurricular Activities
Extracurricular: Participation in extracurricular activities, where 0 indicates No
and 1 indicates Yes.
Sports: Participation in sports, where 0 indicates No and 1 indicates Yes.
Music: Participation in music activities, where 0 indicates No and 1 indicates
Yes.
Volunteering: Participation in volunteering, where 0 indicates No and 1 indicates
Yes.

Academic Performance
GPA: Grade Point Average on a scale from 2.0 to 4.0, influenced by study habits,
parental involvement, and extracurricular activities.
GradeClass: Classification of students' grades based on GPA: 0: 'A' (GPA >= 3.5); 1: 'B'
(3.0 <= GPA < 3.5); 2: 'C' (2.5 <= GPA < 3.0); 3: 'D' (2.0 <= GPA < 2.5); 4: 'F' (GPA < 2.0)
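The GradeClass thresholds above translate directly into a small function, which can be useful for checking the target column against GPA. This is a sketch of the stated rule only; the function name is hypothetical.

```python
def grade_class(gpa):
    """Map a GPA to the GradeClass code using the thresholds listed above."""
    if gpa >= 3.5:
        return 0  # 'A'
    if gpa >= 3.0:
        return 1  # 'B'
    if gpa >= 2.5:
        return 2  # 'C'
    if gpa >= 2.0:
        return 3  # 'D'
    return 4      # 'F'


for gpa in (3.8, 3.2, 2.7, 2.1, 1.5):
    print(gpa, grade_class(gpa))
```

Each boundary value belongs to the higher class, matching the `>=` in the definition, so a GPA of exactly 3.0 is classed as 'B'.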
