Research Methods for the Learning Sciences - upenn.edu


Core Methods in Educational Data Mining
EDUC 691, Spring 2019

The Homework
- Let's go over Basic Homework 1
- Who did the assignment in Python?
- Who did the assignment in RapidMiner?

RapidMiner folks
- How well did you succeed in making the tool work?
- What were some of the biggest challenges?

Python folks
- How well did you succeed in making the tool work?
- What were some of the biggest challenges?

Did it make a difference?
- When you ran Decision Tree/W-J48 with and without student as a variable in the data set?
- What was the difference?
- Why might RapidMiner and Python produce different results for this?

Removing student from the model
- How did you remove student from the model?
- There were multiple ways to accomplish this

How would you know…
- If you were over-fitting to student?
- Or any variable, for that matter?

What are some variables…
- That could cause your model not to apply to new data sets you might be interested in?
- Student is one example… what else?

Did it make a difference?
- What happens when you turn on cross-validation?

Questions? Comments?
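The over-fitting-to-student question above can be made concrete. The sketch below is an illustration, not from the slides: the data, the 12-students-times-10-rows layout, and the scikit-learn setup are all assumed. It contrasts ordinary row-level cross-validation (which can put the same student in both train and test folds) with cross-validation that holds out whole students:

```python
# Sketch: row-level vs. student-level cross-validation (illustrative data;
# scikit-learn assumed). Holding out whole students is one way to check
# whether a model is over-fitting to the student variable.
import numpy as np
from sklearn.model_selection import cross_val_score, GroupKFold, KFold
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 4))                    # made-up features
y = (X[:, 0] + rng.normal(size=120) > 0).astype(int)
students = np.repeat(np.arange(12), 10)          # 12 students, 10 rows each

clf = DecisionTreeClassifier(random_state=0)

# Row-level CV: rows from one student can land in both train and test folds.
row_cv = cross_val_score(clf, X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0))

# Student-level CV: each fold holds out entire students.
student_cv = cross_val_score(clf, X, y, groups=students,
                             cv=GroupKFold(n_splits=5))

print(round(row_cv.mean(), 3), round(student_cv.mean(), 3))
```

If the row-level estimate is noticeably higher than the student-level one on real data, that gap is evidence the model is leaning on student identity rather than generalizable patterns.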
Concerns?

How are you liking RapidMiner and Python?
- Other RapidMiner or Python questions?

Note…
- Python and RapidMiner have a different set of algorithms available
- Python's set tends to be more recent
- But it's not totally clear they are *better*
- We'll come back to this when we discuss Hand

What is the difference between a classifier and a regressor?

What are some things you might use a classifier for, in education?
- Bonus points for examples other than those in the BDE videos

Any questions about any classification algorithms?
- Do folks feel like they understood logistic regression? Any questions?

Logistic Regression
- m = 0.5A - B + C

Why would someone use a decision tree rather than, say, logistic regression?

Has anyone used any classification algorithms outside the set discussed/recommended in the videos?
- Say more?

Other questions, comments, concerns about lectures?

Did anyone read the Hand article?
- Thoughts?

What is Hand's main thesis?
- Who thinks it makes sense?
- Who thinks he's probably wrong?
- Please present arguments in favor of each perspective

If he is wrong
- Why do simple algorithms work well for many problems?

If he is right
- Why have some algorithms like recurrent neural networks become so popular?
- Note that many of the key successes have been in very large-scale data sets like voice recognition

One of Hand's key arguments
- Data points trained on are not usually drawn from the same distribution as the data points where the classifier will be applied
- Is this a plausible argument for educational data mining?

One of Hand's key
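Hand's train/deployment mismatch argument can be illustrated with a toy sketch. Everything here is made up (the data, the drifted label rule, and the scikit-learn setup are assumptions, not from the slides), and it shows one flavor of the mismatch, concept drift: a model fit where the label boundary sits at one point degrades when the relationship in the deployment data has moved.

```python
# Sketch of Hand's argument that training data and deployment data differ
# (illustrative 1-D data; scikit-learn assumed).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training data: the label is "x > 0".
X_train = np.linspace(-2, 2, 81).reshape(-1, 1)
y_train = (X_train[:, 0] > 0).astype(int)
clf = LogisticRegression().fit(X_train, y_train)

# Deployment data: the relationship has drifted, the label is now "x > 1".
X_new = X_train.copy()
y_new = (X_new[:, 0] > 1).astype(int)

acc_train = clf.score(X_train, y_train)
acc_new = clf.score(X_new, y_new)
print(acc_train, acc_new)   # accuracy drops on the drifted data
```

The classifier keeps its learned boundary near 0, so points between the old and new boundary are now systematically misclassified, which is exactly the kind of failure a held-out test set drawn from the training distribution would never reveal.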
arguments
- Is this a plausible argument for large-scale voice recognition technology?

Another of Hand's key arguments
- Data points trained on are often treated as certainly true and objective
- But they are often arbitrary and uncertain
- Is this a plausible argument for educational data mining?
- Is this a plausible argument for large-scale speech recognition?

Note
- Hand refers to these issues as over-fitting
- But they are a specific type of over-fitting that is relevant to some problems and not to others
- And it is different from the common idea that over-fitting comes from limited data

Another of Hand's key arguments
- Researchers and practitioners usually do best when working with an algorithm they know very well
- And therefore more recent algorithms win competitions, because those are the algorithms the researcher knows best and wants to prove are better

Momentary digression
- Who here is familiar with data competitions like the KDD Cup, Kaggle competitions, and the ASSISTments Longitudinal Challenge?

Some counter-evidence to Hand
- Recent algorithms win a lot of data mining competitions these days (where lots of people are trying their best)
- Those of you who like Hand, how would you respond to this?
- One possible rejoinder: these are usually well-defined problems where the training set and eventual test set resemble each other a lot

Another practical question
- Should you pick one algorithm that seems really appropriate?
- Run every algorithm
  that will actually run for your data?
- Something in between?

My typical lab practice
- Pick a small number of algorithms that
  - Have worked on past similar problems
  - Fit different kinds of patterns from each other

Is it really the algorithm?
- Or is it the data you put into it?
- We'll come back to this in the Feature Engineering lecture in a month

Questions? Comments?

Creative HW 1
- Questions about Creative HW 1?
- Questions? Concerns?

Other questions or comments?

Next Class
- February 13: Behavior Detection
- Readings:
  - Baker, R.S. (2015) Big Data and Education. Ch. 1, V5; Ch. 3, V1, V2.
  - Sao Pedro, M.A., Baker, R.S.J.d., Gobert, J., Montalvo, O., Nakama, A. (2013) Leveraging Machine-Learned Detectors of Systematic Inquiry Behavior to Estimate and Predict Transfer of Inquiry Skill. User Modeling and User-Adapted Interaction, 23(1), 1-39.
  - Kai, S., Paquette, L., Baker, R.S., Bosch, N., D'Mello, S., Ocumpaugh, J., Shute, V., Ventura, M. (2015) A Comparison of Face-based and Interaction-based Affect Detectors in Physics Playground. Proceedings of the 8th International Conference on Educational Data Mining, 77-84.
- Creative HW 1 due

The End
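As a footnote to the "My typical lab practice" slide: one way to act on "pick a small number of algorithms that fit different kinds of patterns" is to cross-validate a short candidate list. The sketch below is illustrative only; the data and the particular candidate set (a linear model, a tree, and a probabilistic model) are assumptions, not recommendations from the slides.

```python
# Sketch: comparing a small set of algorithms that fit different kinds of
# patterns (illustrative data; scikit-learn assumed).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - X[:, 1] > 0).astype(int)   # made-up labeling rule

# One linear, one tree-based, one probabilistic model: different pattern shapes.
candidates = {
    "logistic": LogisticRegression(),
    "tree": DecisionTreeClassifier(random_state=0),
    "naive_bayes": GaussianNB(),
}

results = {name: cross_val_score(model, X, y, cv=5).mean()
           for name, model in candidates.items()}
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```

On real educational data, the more important filter is the first bullet on that slide: whether an algorithm has worked on past similar problems, which no synthetic comparison can tell you.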