A collection of datasets containing the evaluation grades of 1044 Portuguese students in two core classes (Mathematics and Portuguese).

Format

  • mat_df is a data frame with 395 observations.

  • por_df is a data frame with 649 observations.

  • full_df is a combination of mat_df and por_df, resulting in 1044 observations.
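
As a minimal sketch of how the combined data frame relates to the two subject-level ones, the snippet below stacks two tiny stand-in data frames row-wise with rbind(). The stand-ins (mat_small, por_small) and their values are made up for illustration and carry only three of the 33 columns. Whether full_df was built exactly this way is an assumption.

```r
# Hypothetical stand-ins for mat_df and por_df (illustrative values, 3 of 33 columns).
mat_small <- data.frame(school = c("GP", "MS"), age = c(15L, 17L),
                        G3 = c(10L, 14L), stringsAsFactors = FALSE)
por_small <- data.frame(school = c("GP", "GP", "MS"), age = c(16L, 18L, 15L),
                        G3 = c(12L, 9L, 15L), stringsAsFactors = FALSE)

# Row-wise combination: both data frames must have the same column names.
full_small <- rbind(mat_small, por_small)

# The row counts add up, mirroring 395 + 649 = 1044 in the real data.
nrow(full_small) == nrow(mat_small) + nrow(por_small)
```

Note that stacking the rows this way does not record which subject a row came from, so an analysis that needs that distinction would have to add its own indicator column before combining.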

These data frames all have 33 variables:

school

Character vector, student's school

binary: "GP" - Gabriel Pereira or "MS" - Mousinho da Silveira

sex

Character vector, student's sex

binary: "F" - female or "M" - male

age

Integer, student's age, from 15 to 22

address

Character vector, student's home address type

binary: "U" - urban or "R" - rural

famsize

Character vector, family size

binary: "LE3" - less than or equal to 3 or "GT3" - greater than 3

Pstatus

Character vector, parents' cohabitation status

binary: "T" - living together or "A" - apart

Medu

Integer, mother's education

ordinal:

  • 0 - none,

  • 1 - primary education (4th grade),

  • 2 - 5th to 9th grade,

  • 3 - secondary education,

  • 4 - higher education

Fedu

Integer, father's education

ordinal:

  • 0 - none,

  • 1 - primary education (4th grade),

  • 2 - 5th to 9th grade,

  • 3 - secondary education,

  • 4 - higher education

Mjob

Character vector, mother's job,

nominal:

  • "teacher" - teacher,

  • "health" - health care related,

  • "services" - civil services (e.g. administrative or police),

  • "at_home" - at home,

  • "other" - other

Fjob

Character vector, father's job,

nominal:

  • "teacher" - teacher,

  • "health" - health care related,

  • "services" - civil services (e.g. administrative or police),

  • "at_home" - at home,

  • "other" - other

reason

Character vector, reason to choose this school

nominal:

  • "home" - close to home,

  • "reputation" - school reputation,

  • "course" - course preference,

  • "other" - other

guardian

Character vector, student's guardian

nominal: "mother", "father" or "other"

traveltime

Integer, home to school travel time

nominal/incremental:

  • 1 - less than 15 min.,

  • 2 - 15 to 30 min.,

  • 3 - 30 min. to 1 hour,

  • 4 - greater than 1 hour

studytime

Integer, weekly study time

nominal/incremental:

  • 1 - less than 2 hours,

  • 2 - 2 to 5 hours,

  • 3 - 5 to 10 hours,

  • 4 - greater than 10 hours

failures

Integer, number of past class failures

nominal/incremental: n if 1 <= n < 3, else 4

schoolsup

Character vector, extra educational support

binary: "yes" or "no"

famsup

Character vector, family educational support.

binary: "yes" or "no"

paid

Character vector, extra paid classes within the course subject (Math or Portuguese)

binary: "yes" or "no"

activities

Character vector, extra-curricular activities

binary: "yes" or "no"

nursery

Character vector, attended nursery school

binary: "yes" or "no"

higher

Character vector, wants to pursue higher education

binary: "yes" or "no"

internet

Character vector, internet access at home

binary: "yes" or "no"

romantic

Character vector, in a romantic relationship

binary: "yes" or "no"

famrel

Integer, quality of family relationships

numeric: from 1 - very bad to 5 - excellent

freetime

Integer, free time after school

numeric: from 1 - very low to 5 - very high

goout

Integer, going out with friends

numeric: from 1 - very low to 5 - very high

Dalc

Integer, workday alcohol consumption

numeric: from 1 - very low to 5 - very high

Walc

Integer, weekend alcohol consumption

numeric: from 1 - very low to 5 - very high

health

Integer, current health status

numeric: from 1 - very bad to 5 - very good

absences

Integer, number of school absences

numeric: from 0 to 93

G1

Integer, first period grade

numeric: from 0 to 20

G2

Integer, second period grade

numeric: from 0 to 20

G3

Integer, final grade

numeric: from 0 to 20
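
Because several variables are stored as integer codes or "yes"/"no" strings, a common first step is to recode them before analysis. The sketch below shows one possible approach in base R on a small made-up data frame. The demo data frame, the new column names, and the label strings (which paraphrase the level descriptions above) are hypothetical and not part of the dataset.

```r
# Hypothetical example rows that follow the documented codings.
demo <- data.frame(
  Medu     = c(0L, 4L, 2L, 3L),
  internet = c("yes", "no", "yes", "yes"),
  stringsAsFactors = FALSE
)

# Map the integer education codes 0-4 to an ordered factor.
edu_labels <- c("none", "primary (4th grade)", "5th to 9th grade",
                "secondary", "higher")
demo$Medu_f <- factor(demo$Medu, levels = 0:4, labels = edu_labels,
                      ordered = TRUE)

# Turn a "yes"/"no" character column into a logical one.
demo$internet_lgl <- demo$internet == "yes"

# Ordered factors support comparisons between levels.
demo$Medu_f >= "secondary"
```

Keeping the original columns and writing the recoded values to new ones makes it easy to check the mapping against the codings listed above.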

Reference paper

P. Cortez and A. Silva. Using Data Mining to Predict Secondary School Student Performance. In A. Brito and J. Teixeira Eds., Proceedings of 5th FUture BUsiness TEChnology Conference (FUBUTEC 2008) pp. 5-12, Porto, Portugal, April, 2008, EUROSIS, ISBN 978-9077381-39-7.

This paper is available at http://www3.dsi.uminho.pt/pcortez/student.pdf