Monday, December 12, 2016

Training Effectiveness Literature Review


>> Teresa Duncan: Good morning, everybody. Welcome to "Using Alternative Student Growth Measures for Evaluating Teacher Performance." My name is Teresa Duncan. I am facilitating today's meeting. I am also the director of REL Mid-Atlantic. Today's presenter will be Brian Gill. He is the lead author of the report that this webinar is based on, and he is a senior fellow at Mathematica Policy Research.

Before we get started, I wanted to review a few of the features of our webinar platform with you. As you'll see, this software is modular, and you've got different sections of the screen representing different ways you can interact with us. The presenter's photo is up in the upper left-hand corner. That's me. Below that is a box called Questions About Content, so

please feel free to type in any reactions, questions, or comments that you have as we go along, and we will try to address those questions as we move forward through the webinar. You'll also see, down at the bottom left-hand corner, that we are closed captioning this, and so that will be available immediately after this webinar as an archived file so that folks who are hearing impaired are also able to take advantage of this particular webinar.

Moving down to the lower middle of your screen are resources to download. You'll see that you can download a copy of the PowerPoint slides that we'll be using today, as well as a copy of the report that this webinar is based on. That report is also available on the REL Mid-Atlantic website. Over to the bottom right, we have Technical Issues. If you have any challenges, please do type that in, and then one of our hosts in the background will be able to

work with you to try to resolve any technical issues that you might encounter. Hopefully, you won't be encountering any. And finally, we have a couple of web links down in the lower right-hand corner. Your feedback is very important to us, and so we hope that you'll take a few minutes at the end of this webinar to fill out the evaluation survey. And there is also, for those of you who would like to submit

this experience, this time that you've spent here with us, as contact hours for your professional development credits, we are able to provide a certificate of attendance. That is an option that you can take, if you wish, at the end of the evaluation survey. And there's also a link to our forum on our website, which I'll describe a little bit more later on. So it's a very easy format. I think it's pretty intuitive.

And so we hope that you will participate and take advantage of these interactive features. A couple of other items that we will be using today to make this a little more interactive, more dynamic, are our polls and open-ended questions. You'll see that one type of poll is just a set of bullets, where you can select and endorse a particular option and tell us what's going on from your end.

Another type of open-ended poll allows you to type in your comments and reactions. And so we hope that that will make this a little more dynamic and facilitate some discussion among our participants. So, the agenda today: I've gone through the welcome. I'll tell you a little bit about REL Mid-Atlantic and introduce Brian to you all. And then we'll go through the presentation, have a question

and answer period, and then close it up, have some concluding remarks, and adjourn. This is scheduled to go through 1 p.m. today, and we hope that you'll be able to stay for the whole session. Just a few words about REL Mid-Atlantic. The REL program, the Regional Educational Laboratory program, is a national program, and it's set up with a regional focus. And of course, this is Mid-Atlantic.

We serve Delaware, Maryland, New Jersey, Pennsylvania, and the District of Columbia. The mission of the RELs is really all about technical assistance around data use and data-driven decision making. We pair that technical assistance with the conduct of different types of research studies, and we hope to develop useful, accessible, research-based products that we're able to share with stakeholders in our region, as well as nationally.

And so we invite you to visit our website, www.relmid-atlantic.org, and see what resources there might interest you. Before we start, I'd like to ask about who is joining our webinar today. So, Sharon, if you could put that poll up? If you could tell us a little bit about who you are and give us a sense of who's attending today. [pause]

Lots of researchers in the audience. Okay, we have a pretty diverse group here. So thank you. Thank you for joining us, and we look forward to your questions and your input. So, let's close out this window and move forward. I'd like to welcome Dr. Brian Gill of Mathematica. He is one of our senior researchers here at the REL, and he has led several of our studies, several of which

will be coming up in the next several months. He is a leading expert on value-added metrics of school and teacher performance, on charter schools, school choice, and supplemental educational services. He has published in many different educational journals, including Economics of Education Review, Educational Evaluation and Policy Analysis, and the Journal of Labor Economics. He is widely published and really is a tremendous resource for our REL.

I'd also like to welcome Dr. Judy Ferguson, who joins us from New Jersey. She is a veteran district administrator. She was also a former president of the New Jersey Association of School Administrators. So she is truly a veteran practitioner. I think we've got an excellent group today, not just presenters but also participants. And so let me turn this over to Brian now.

>> Brian Gill: Great. Thanks so much, Teresa. And thanks, Judy, for participating, and thanks to all of you for joining this webinar today. I confess that this is the first time I have done one quite like this, but I will look forward to trying to keep up with your comments online as I go through my slides. And there will be a couple of places where I'll pause specifically to try to invite input. You won't be able to do that over the phone, but you'll have

the opportunity to type in any questions you might have, and so Judy and I will hope to engage you in some discussion. Let me see if I can advance the slides. No...yes. So as Teresa mentioned, this webinar is based on a report that was released a few weeks ago that involved a review of the literature on what we called alternative student growth measures that are being used for evaluating teacher performance and teacher effectiveness.

And so what I'm going to try to do today is describe some of the findings from that literature review. As I'm sure most if not all of you know, states are now mandating that student achievement growth be included as a component of measures of teacher evaluation. This is coming partly as a result of policy changes at the federal level, where both the Race to the Top program and the waiver flexibility process for the Elementary and Secondary Education Act require states to include student achievement

growth as at least one component in teacher evaluation measures. In the Mid-Atlantic region, all of the states have moved to adopt student growth measures, and in fact, the great majority of states all over the country are doing this. Now, most of the discussion around student growth measures has related to the quantitative measures that are typically applied to state standardized assessments in reading and math in grades 3 or 4 through 8. And usually the calculation of student growth and applying it

to teachers involves something that is called either a value-added statistical method or a student growth percentile measure. Those things are complex and involve a lot of methodological discussion, but in one sense, it's relatively straightforward to figure out what we want to do with them in those grades where you have standardized state assessments in successive grades in the same subject. The challenge, of course, is that in most places, the

majority of teachers can't have a growth measure that is based on those kinds of measures, because they're teaching grades or subjects that are outside of the state's standardized testing regimes. So again, as many of you probably know, states and districts are struggling with "Well, how are we going to measure student achievement growth associated with all the rest of our teachers, whether they are kindergarten teachers, or high school teachers, or art teachers, or special

education teachers?" And the point of our literature review was to try to find out what's known about how this can be done, about applying alternative growth measures to all those teachers who aren't in reading and math in grades 4 through 8. And we found that there are two broad approaches being used. The first one is to use the same kind of sophisticated statistical methods, value-added models or student growth

percentiles, and apply those to alternative student outcomes. That is, apply those to assessments that go beyond what states typically have in reading and math in grades 3 or 4 through 8. So that involves the same kind of statistical adjustments, but using them for different tests and different kinds of student outcomes, and I'll talk a good bit about that. The second broad approach is a more qualitative one, and that involves using teacher-developed or teacher-selected student

learning objectives, which in some states, New Jersey, for example, are called student growth objectives, so SLOs or SGOs. In those instances, there's not necessarily a need for a common standardized assessment, because, instead, the teachers have some discretion to develop or select an assessment. So what I'll be talking about is both of those general categories and what the literature has told us about the measures being used

and what we can learn from them. To start, then, let's talk about the statistical models, the value-added measures and similar measures that are being applied to other assessments. First of all, we found in some places that value-added models have been applied to commercially available assessments, to tests like the Stanford 9 and other widely available commercial assessments. And the literature finds some encouraging evidence of both

validity and reliability in using those commercially available assessments for getting value-added measures of teacher performance. By "validity," we mean that there's evidence that they're measuring what we want them to measure. And in this particular instance, the research shows that value-added measures based on commercially available assessments tend to correlate positively with measures based on observations of teachers' professional practice and with

measures based on student surveys, which is encouraging. It suggests that these are measuring something about teachers' contributions to student achievement that educators should care about. And then by "reliability," we're interested in the stability of the measures, in the year-to-year consistency, and in the extent to which they're telling us something that is really about the teacher rather than just random variation. Here again, the evidence is encouraging.

The value-added measures that use the commercially available assessments show year-to-year consistency for individual teachers that is comparable to what we see with value-added models applied to annual state assessments. So that's good news. Now, the evidence also suggests that when using value-added measures in teacher evaluation, it's a good idea to try to increase the reliability by averaging across multiple years of teaching and

by making what the statisticians call shrinkage adjustments. Doing both of those things can reduce the amount of random error in the growth measures. That's true regardless of whether the value-added models are being applied to a commercially available assessment or to a state assessment. Either way, those are good practices to help improve reliability. What else do we know about value-added models?
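Before going on, the two practices just mentioned, multiyear averaging and shrinkage, can be sketched in a few lines of code. This is a hypothetical illustration only: the teacher counts, the error variances, and the simple method-of-moments shrinkage formula are assumptions for the sketch, not the specific models used in any of the studies reviewed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: raw value-added estimates for 50 teachers over
# 2 years, plus an assumed estimation-error variance for each teacher.
raw_va = rng.normal(0.0, 0.15, size=(50, 2))   # 50 teachers x 2 years
error_var = np.full(50, 0.08 ** 2)             # variance of the estimation error

# Step 1: average across years, which cuts the error variance in half.
avg_va = raw_va.mean(axis=1)
avg_error_var = error_var / raw_va.shape[1]

# Step 2: empirical Bayes "shrinkage" -- pull each estimate toward the
# overall mean in proportion to how noisy it is.
total_var = avg_va.var()
signal_var = max(total_var - avg_error_var.mean(), 0.0)
reliability = signal_var / (signal_var + avg_error_var)   # in [0, 1]
shrunk_va = reliability * avg_va + (1 - reliability) * avg_va.mean()

# Shrunken estimates are less dispersed than the raw averages,
# reflecting the removal of some random error.
print(avg_va.std(), shrunk_va.std())
```

The design choice being illustrated is that a noisy estimate is never taken at face value: the less reliable it is, the more it is pulled toward the average.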

Well, the Pittsburgh Public Schools have been doing something interesting the past few years, with some help from some folks at Mathematica. Several years ago, Pittsburgh started developing assessments that are aligned with individual courses across subjects, particularly in secondary grades 6 through 12. These were not designed initially for purposes of teacher evaluation. They were designed to try to improve the consistency of the

curriculum across the district and to produce standardized course-specific measures of student learning. But it turns out, as we were able to show in doing some analysis, that it is possible to apply the value-added statistical adjustments to these locally developed, curriculum-based assessments, just as it is for commercially available assessments and state standardized assessments. But the curriculum-based assessments cover a whole lot more grades and subjects than the state assessments do.

So it makes it possible in Pittsburgh to measure value-added for teachers in most middle and high school courses, so not just reading and math in grades 4 through 8, but in math, English language arts, science, and social studies, in nearly every course in grades 6 through 12. And in this case, as in the case of the commercially available assessments, we find some encouraging evidence of both the validity and the reliability of the value-added measures. Teachers who do well in terms of value-added on the state

assessments also tend to do well in terms of value-added on these curriculum-based assessments, which is encouraging from a validity perspective. And then in terms of reliability, we find that teachers at both ends of the spectrum in terms of measured value-added on these curriculum-based assessments are statistically distinguishable from average. So there's real information there about the teacher's contribution to student achievement.
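The phrase "statistically distinguishable from average" can be made concrete with a toy example. The estimates and standard errors below are made up, and the 95 percent confidence-interval rule is one common convention, not necessarily the one used in the Pittsburgh analyses.

```python
import numpy as np

# Hypothetical value-added estimates and standard errors for 6 teachers,
# centered so that 0.0 represents the district average.
estimates = np.array([0.30, 0.05, -0.02, -0.28, 0.10, -0.35])
std_errors = np.array([0.10, 0.09, 0.11, 0.10, 0.12, 0.10])

# A teacher is "statistically distinguishable from average" when the
# 95% confidence interval around the estimate excludes zero.
lower = estimates - 1.96 * std_errors
upper = estimates + 1.96 * std_errors
distinguishable = (lower > 0) | (upper < 0)

for est, flag in zip(estimates, distinguishable):
    print(f"{est:+.2f}  distinguishable from average: {flag}")
```

Notice that only the teachers at the ends of the spectrum clear the bar; estimates near zero cannot be separated from average, which is exactly the pattern described above.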

Finally, there's even some indication, though very little information on this, that value-added statistical models could be applied not only to commercial assessments and to locally developed course-specific assessments but perhaps even to some nontest outcomes. There's been some preliminary research done, also in the Pittsburgh Public Schools, on schoolwide value-added for student outcomes like attendance and course completion. So here, not assessments at all.

There's no clear evidence of the validity of those results, unfortunately. There is some evidence of their reliability, in that it's possible to statistically distinguish the highest and lowest performing schools from average. Now, there is an open question about whether these kinds of outcomes would be suitable for teacher evaluation. In Pittsburgh, they've been used only for school-level results, because the notion is that these are the kinds of outcomes that

are the responsibility of the school as a whole rather than of individual teachers. So that summarizes what we found with respect to the statistical models, to the application of value-added measures to other kinds of outcomes beyond the state assessments. And let me pause there, because I want to give Judy a chance to comment, and I'd like to give all the rest of you the opportunity to type in any questions or comments you might

have specifically related to the value-added models and their application to these other measures, and then we'll turn to talking about student learning objectives and student growth objectives. >> Judy Ferguson: Brian? You know, as a secondary schoolteacher originally, I'm encouraged by the Pittsburgh work in terms of designing end-of-course tests that can be administered across either grade level or curriculum and have some clear connection to what's taught.

So I think that's a great direction in which to go. I have huge concerns, as you've already mentioned, with measures such as attendance and course completion and how they relate or don't relate to teacher performance. The question is, you know, does achievement improve because you completed the course? Or did you stay in the course because you were achieving? Or did you drop out of the course for other reasons? Really, you have to isolate the impact of the teacher on the

effects rather than rely on a simple correlation, which is problematic. What's the cause and what's the effect there? As for the schoolwide measures, I think it's great to look at things like attendance in particular. A lot of teachers, a lot of administrators and staff members have an impact on attendance. One teacher in a classroom really cannot move attendance alone and, therefore, shouldn't be measured on

that variable. Perhaps at the college level, where attendance and course completion are specific to an individual teacher, there may be more hope for it as an indicator of performance, because as a college student, I can go to a class or not go to class, I can complete the course or not complete the course. So those are my reactions at this point. I think those kinds of measures are useful schoolwide.

They could even be used for a group of teachers, a group of staff members, for merit pay as a schoolwide effort, if you can improve achievement through improving attendance and through improving course completion. >> Brian Gill: Thanks, Judy. I think that makes a whole lot of sense. And I think those are exactly the reasons that in Pittsburgh, for these kinds of measures, they are looking at them at the schoolwide level rather than for individual teachers.

We thought it was worth including them in the literature review anyway, partly because some of the analysis that's related to teacher evaluation is also potentially relevant to principal evaluation, which is something that districts and states are considering alongside this, and some of these measures might be appropriate for principals, if not for teachers. And more generally, to let people recognize that the purpose of the value-added statistical

method is to try to identify an educator's contribution to student outcomes. And the general point here is that we're seeing good reasons to believe that those statistical methods could be useful for a variety of different kinds of outcomes, not just the standardized assessments in reading and math in grades 4 through 8. So, in Pittsburgh and a few other places, they've demonstrated that it's possible to go beyond that, that it's

possible to come up with value-added measures of teacher performance that make it possible to include a lot more teachers and that involve using student outcomes specifically designed to be well aligned with the curriculum, in the case of these curriculum-based assessments. And I see that there are several questions here; let's see if I can go to some of the ones that are showing up on the screen. One is "Were the studies reviewed primarily for

elementary grades?" No. Again, with respect to the commercial assessments, they were often in elementary grades going up to middle grades. But the course-specific assessments were in secondary grades, 6 up through 12. And that's useful because, as you know, very few states have standardized assessments for most courses at the high school level, but there are clearly opportunities to develop those and to apply the statistical models to them.

Another question is about whether they're being used to evaluate special education teachers, and that is an area that requires a lot more research. To my knowledge, it's relatively rare that they are being used for that purpose. The problem is that you have to have assessments that will do a decent job of measuring the achievement, and the changes in the achievement, for whatever population of kids is being served.

There are almost certainly some kinds of special education students who could be included in measures related to the value-added of teachers, but it's going to depend a lot on what kinds of assessments are available for them and whether there are a large number of teachers serving similar kinds of students. So far, I would have to say that there's really very little known about the extent to which value-added models can be validly applied to special education teachers.

I would expect that there will be a good bit more research on that in coming years, but I wish I could provide more information now. "What states are considering using attendance in value-added models?" That's a good question, and I don't know enough about the policy regime for that. I don't know of any state that is considering using it for teacher evaluation purposes.

Whether there are some states that may be using it at the school level, I don't know. As I mentioned, it's certainly the case in Pittsburgh that they're interested in using attendance and other kinds of nontest outcomes for purposes of measuring a school's performance, though not for teachers' performance. Let's see. Oh, these things are scrolling upward rather than downward. We've got some answers as well as some questions.

So, someone is pointing out that in New Jersey, there are regulations saying that students are only included if they attend for 70 percent of the year. Different states have different regulations about that sort of thing. You can see these answers, I think, as well, so I'll let you read those. We have at least one more question here: "How do you recommend communicating the uncertainty in these

value-added estimates?" That is a great question, and it's a real challenge, a real communications challenge. I think one key point to make with regard to uncertainty is that any measurement of anything has some uncertainty in it. So, if you think about value-added models in the context of teacher evaluation generally, they are always going to be only a part of a teacher's evaluation, and a large part of it is going to remain, and should remain, based

on observations of professional practice. Now, in fact, the principal's observation of a classroom, or more broadly a principal's year-long evaluation of a teacher's professional practice, has some uncertainty in it as well, but that's not always recognized, because it typically isn't quantified. And so there's a sense in which the uncertainty or the reliability of a value-added measure is not conceptually any different from the uncertainty around the classroom observation,

where what the principal sees depends on what day of the week it is, on how the kids were doing that day, and on the particular lesson chosen. So the first thing, I think, is to make sure that there's a recognition that any measure has some uncertainty in it, and the aim is to try to reduce that uncertainty in any way possible. And so that's why, when we are talking with districts and states about value-added models, we recommend that they average across multiple years of

teaching to reduce the uncertainty. And we recommend that they apply the statistical shrinkage adjustment to reduce the uncertainty. And there's pretty good evidence suggesting that when you do those sorts of things in your value-added models, you can reduce the level of uncertainty to a level that is as good as what you would get in almost any measure of job performance used outside of teaching.

So I think that describing the broader context is a key point, and also explaining what the method is doing to try to reduce the uncertainty. Okay. I think that deals with all of the questions that came up here. And so this was great. So thank you for those. Let me turn then to the student learning objectives, the second broad category here. So student learning objectives, or student growth objectives as

they're called in New Jersey and some other places, have a great advantage: they can be used as growth measures for pretty much any teacher, any grade, any subject. The way they usually work is that individual teachers choose classwide growth targets that are supposed to be based on students' abilities at the beginning of the school year. In most places, they have to be approved by the school principal. And there's some variation across different states and

districts in exactly how much discretion the teacher has to select the outcome measure. In some places, districts place constraints and say you need to use this measure, but you can choose what the target is going to be for this measure. In other places, it's more wide open and allows teachers to select with discretion. There is not much known about the reliability and validity of student learning objectives.

We couldn't find any studies that have looked at reliability, at how much random error there is in the student learning objectives. So not nearly as much is known about that as there is for the value-added models. There's a little bit of evidence of validity. A couple of studies have found small positive correlations between a teacher's success in achieving the SLO goal and the teacher's value-added.
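The kind of validity check just described, correlating SLO attainment with value-added at the teacher level, can be sketched on simulated data. Everything below is hypothetical: the sample size, the attainment rates, and the strength of the relationship are assumptions chosen only to mimic a "small positive correlation," not figures from the studies reviewed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical teacher-level data: the share of SLO targets each teacher
# met, and the same teacher's value-added score, built so the two are
# only weakly related.
n = 200
value_added = rng.normal(0.0, 1.0, n)
slo_attainment = 0.6 + 0.03 * value_added + rng.normal(0.0, 0.15, n)
slo_attainment = np.clip(slo_attainment, 0.0, 1.0)   # keep shares in [0, 1]

# Pearson correlation between the two teacher-level measures.
r = np.corrcoef(slo_attainment, value_added)[0, 1]
print(f"correlation: {r:.2f}")
```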

That's encouraging. But frankly, there's not much evidence out there yet. And when you think about it, it's not too surprising that there's less evidence on the reliability and validity of SLOs than on value-added models, because the flip side of the fact that they can be applied to any teacher of any grade and subject, and that they allow this customization, is that it's challenging to ensure their reliability and validity. When teachers have a lot of discretion about selecting the

SLOs and designing them, then it's going to be hard to make sure that they are comparable across teachers in terms of the difficulty of achieving the objectives. Moreover, they sometimes have to rely on teacher-developed assessments, and most often, there's no information available about the reliability and validity of those underlying assessments in measuring the students' learning. So there's an enormous amount that's kind of inherently unknowable in a process that involves this much variation and

teacher-level discretion. They do present some implementation challenges. Implementing student learning objectives or student growth objectives across a district requires quite a bit of training, not only of teachers, who will be responsible each year for developing the SLOs, but also of principals, who will be responsible for approving them and playing some role in trying to make sure that they are comparable for different teachers across different grades and subjects and classrooms in

their schools. They involve a substantial amount of work for both teachers and principals. Apart from the training, even once the educators have been trained, they've got to actually do this at the beginning of every school year for each class of students. Some districts try to improve the reliability and validity of the SLOs by having a sort of district-level auditing that looks over them and provides a second layer of approval beyond

what the principal provides. That seems sensible as a way to try to promote reliability and validity. There's no evidence out there on whether it succeeds in doing that or not, but it seems sensible. But of course, it creates an additional level of burden at the central office, besides the burden at the classroom and school levels. In some places that have been using SLOs, teachers have raised

questions about their fairness. Again, not surprisingly, given that it's hard to make sure that these things are consistent across classrooms. And then a final implementation issue that's a potential challenge is that using SLOs or SGOs for purposes of evaluating teachers, using them for what is in essence a high-stakes purpose, might undermine their value for purposes of instructional planning. And I think most of the folks, at least early on, who have been

champions of SLOs and SGOs have promoted them at least in part, and maybe primarily, because they view them as valuable tools for teachers in planning their instruction over the course of the year; it can be extremely helpful to spend time analyzing your students' needs and abilities and setting targets for their achievement growth over the course of the year. When those are also going to be used as a high-stakes evaluation for the teacher, that creates a little bit of a conflict

of interest. That is, for purposes of instructional planning, teachers are probably going to want to set very ambitious goals and targets. But if their evaluation is going to depend on whether they meet those goals and targets, that implicitly, at least, might create some pressure to lower the expectations a bit. So that is something that I think districts and states need to think about as they put this together.

So that's what we learned in the literature review on student learning objectives and student growth objectives. I will pause again here, before turning back to a general summary of the findings, just to ask what questions you all have and what comments Judy has about student learning objectives and student growth objectives. >> Judy Ferguson: Well, I'll start, Brian. >> Brian Gill: Thank you. >> Judy Ferguson: I think you've really identified the value and

the concerns about using student growth objectives for measuring teacher performance. Let me start with the positive. The opportunities: certainly it's another opportunity for professional development, for teachers to learn and grow individually and collectively, to learn how to measure their instruction, to work toward instructional improvement. We see things like professional learning communities, better conversations between principals and teachers about classroom

instruction, more reflection, more self-assessment, and all of that. So those are all the good things about it. The concern is with the reliability and validity as you get further and further away from standardized measures toward a classroom-made test, and then using that to evaluate teachers for retention or promotion; that's where the big problems exist. Like you, I agree that the value of designing and measuring

growth measures like this for teachers is that they can improve their instruction. The danger is that it's almost like a pseudoscience. We've created this elaborate statistical methodology around something that we know really is judgment. And we're back to the principal evaluating the teacher based on a lot of professional judgment and teachers evaluating students based on their professional judgment. So we have to be very careful that we don't see this as all

scientific and proven and, therefore, think we can make really important decisions about teachers' livelihoods based on something that is, at this point, not statistically valid or reliable, as I said. I don't want to beat a dead horse on that one. It is a tool, it is one of many tools, and I think it's a great tool, but we don't want to misuse it in education. And the other question I have, and then I'll turn it back to others to ask questions, is what return on investment are we

getting here. We're putting a lot of time and energy and money into another way to measure student achievement. We have lots of ways to measure student achievement already. So that's the kind of devil's advocate position here to start the conversation. Is it really worth the time, money, and effort, considering what we're getting from it at this point in terms of knowing more about what students have learned, how they've grown, and

how teachers have impacted that learning? So I'll turn it back to you. >> Brian Gill: Okay, thanks so much, Judy. Those are really great and provocative thoughts. I appreciate that. Nobody has done a sort of cost-benefit study of student learning objectives, or for that matter of value-added models, but I think it's an entirely appropriate question to ask, particularly because the level of

individualization of these things necessarily implies that there's a whole lot of work that goes into them. and there's a risk, as you say, of making something look like it's telling you more than it is, by enshrining it in policy even when, in practice, we know that there are serious limitations on the validity and reliability of an assessment or a target developed by any individual teacher who, frankly, is not in a position to be able to assess the validity and reliability of his or her own decisions there.

let me see -- take a look at some of the questions that are coming up here. one person asked where we can find resources and best practices for creating slos. one place i know of that has done a ton of work on this, and is probably worth checking out, is an organization called the community training and assistance center, ctac. and they've done a lot of work on implementation of slos and professional development and also some of the

research on slos. you can find them at ctacusa.com, or if you just go to google and look up student learning objectives and ctac, which again is the community training and assistance center, you'll find some resources on their website. i'm sure that there are more out there; that's the one i'm most familiar with. oh, wow!

somebody actually brought up the link. thank you so much. i don't even know where that came from, but i'm impressed. instant result. for whoever is interested in that question, the link is now listed up there at the bottom of poll 3. okay, what other questions came up here? "given the issues with slos and sgos, what other ways can student growth be measured for non-tested subjects and grades?"

that's a great question, and i wish i had a good answer for it. you know, i think the key issue here is to think about what one's primary purpose is. and if your primary purpose in trying to measure achievement growth is to use it for purposes of teacher evaluation, for a relatively high-stakes measure of a teacher's contribution to student achievement, then i think that implies that reliability and validity are very important. and so in that case, i think it's necessary -- if we want to

ensure comparability, it's necessary to think about, well, how can we make sure that there's at least some consistency across a school and a district within a particular subject and grade -- say, can we develop some assessments? that is, do something like they've done in pittsburgh? now some states are adding assessments, particularly at the secondary level, that are course specific. if those become more prevalent in secondary grades and subjects, there will be more opportunities to apply

value-added models there, but frankly we're a long way, i think, from having any sort of standardized assessments that can usefully measure learning growth in, say, arts or music. and so there really isn't, as far as i know, any great solution to that at the moment. i would say that that's one of the reasons that having really good measures of professional practice is absolutely critical. i would say it remains critical even in grades and subjects where we have good value-added measures and good

standardized assessments. but it's absolutely essential in grades and subjects where that can't be done. i wish i had a better answer to that. okay. "the real value in using vams or slos is what it tells the teacher in terms of how to improve instruction. what research has been done in this area?" that is an interesting question, and i think that's one where a lot of research is just getting started.

one of the things that i always say to districts and states, particularly when they're considering value-added models, is that by themselves, value-added estimates don't actually provide any information about how to improve instruction. even if they provide a really good and valid and reliable measure of a teacher's contribution to student achievement, they don't, by themselves, provide any support or any information about how a teacher can improve

value-added. and there again, i think that the research needs to improve a lot in terms of connecting high value-added to particular teacher practices, and frankly, as researchers, we're just getting started on that. but i think that even in the absence of a lot of rigorous research on that, it should be possible for school districts and schools to look themselves at value-added results, or student learning objective results for that

matter, in the context of information about professional practice. and even if they're not in a position to do that in a rigorous, quantitative kind of way, i think that there's enough knowledge of professional practice among educators that it ought to be possible to use that knowledge to help in coaching teachers to improve their instructional practice if their value-added or student learning objective results don't look great.

"how should performance measure data be collected from teachers?" i confess, i'm not quite sure i understand what that means. i guess it depends on what sort of performance measure it is, because it strikes me that all the things we've been talking about are different kinds of performance measures, and so how the data would be collected would depend on the type of measure. "what weight do slos typically have in evaluations?"

that is very often determined by state and district policy, and it varies in different places. some states give a lot of discretion to school districts to determine how much weight that's going to be; other states don't. a fairly typical practice is to say that slos will have -- in non-tested grades and subjects -- whatever weight value-added is supposed to have in tested grades and subjects, but that's not by any means universal, and there are some places where slos and value-added are even used for

the same teachers. you know, having a value-added estimate for a teacher doesn't mean the teacher doesn't also get an slo. and so it's some combination of things, where the weight of the slo might depend on whether the teacher also has a value-added measure. so i don't think i should say anything more definitive about that, because in fact, it does vary depending on what state and district you're in.

"are you seeing more slo tasks that are performance based or pencil-and-paper tests?" i haven't seen any studies that have specifically made that comparison. certainly both kinds of tasks exist, and my assumption is that which is used more commonly would depend a whole lot on what grade and subject we're talking about. but i'm not aware of anybody who has systematically catalogued those things to see which are being used more commonly.

and so i think that covers all the questions i see up there. all right. well, thank you again for those. more good questions. let me just go back to a summary briefly, and then we can talk more about any remaining issues that folks are interested in. so, brief summary: as we heard, there is good evidence that value-added statistical models can be successfully applied to alternative student assessments, including commercially available

assessments and home-grown assessments, if those are consistent across the grade and subject within a district, at least. slos are broadly applicable to all teachers and sometimes correlate with value-added, which is encouraging from a validity perspective, but they involve some pretty substantial implementation challenges and some threats to validity and reliability. so i guess here we are interested in learning from you

what sort of alternative measures you are seeing being used in the schools and districts where you work, and hoping that that might inspire a little bit of discussion. and that is apparently poll 4. there it is. so yes, let us know what kinds of alternative growth measures you are seeing being used for purposes of teacher evaluation. my sense is that nearly every state is now using slos or sgos

in some form or another because, again, student growth measures are being mandated for use in teacher evaluation, and nobody has standardized assessments for every grade and subject. but i'd be interested in hearing about any variance on those practices, or from anybody who is seeing standardized assessments other than the conventional state tests being used and having value-added statistical models applied to them. okay, here we're seeing a few.

"teachers using portfolios designed with students where the rubric is standardized." okay, well, that's a very interesting one and seems like a promising way to go to try to create some standardization while still having it be applicable to all sorts of grades and subjects. maybe we can do some research on that in the future. "district-developed assessments, nationally recognized assessments, industry-certified exams, student projects and portfolios." "students meeting or exceeding expected growth on

the pre- and post northwest evaluation association assessments." that's an interesting one. that's a kind of hybrid. it's using a standardized assessment from nwea, which i know many, many schools and districts across the country are using for diagnostic purposes. one opportunity i would suggest looking at there is actually formally applying a value-added statistical model to nwea assessments rather than just counting the number of students

meeting or exceeding expected growth, because just counting the number of students meeting or exceeding expected growth might not fully account for student differences. however, i also recognize that there are many districts that don't have the resources or the size to get value-added models systematically applied, in which case meeting or exceeding expected growth might be a good alternative solution. "instructionally sensitive summative learning targets [district assessment]." i'd be curious to know what that means.
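[editor's note: to make the distinction brian is drawing concrete -- counting students who beat an expected-growth cutoff versus formally modeling a teacher's contribution -- here is a minimal, hypothetical python sketch of a value-added-style regression. the function name and the toy data are invented for illustration; real value-added models condition on many more student covariates and use shrinkage estimators.]

```python
import numpy as np

def value_added_estimates(pre, post, teacher_ids):
    """Minimal value-added sketch: regress post-test scores on
    pre-test scores plus teacher indicator variables, and read
    each teacher's effect off the indicator coefficients
    (relative to the first teacher, the reference category)."""
    teachers = sorted(set(teacher_ids))
    n = len(pre)
    # Design matrix: intercept, pre-test score, one dummy column
    # per teacher after the first.
    X = np.column_stack(
        [np.ones(n), np.asarray(pre, dtype=float)]
        + [np.asarray([1.0 if t == tid else 0.0 for t in teacher_ids])
           for tid in teachers[1:]]
    )
    beta, *_ = np.linalg.lstsq(X, np.asarray(post, dtype=float), rcond=None)
    effects = {teachers[0]: 0.0}  # reference teacher
    for i, tid in enumerate(teachers[1:]):
        effects[tid] = float(beta[2 + i])
    return effects
```

unlike a simple count of students meeting an expected-growth target, the regression adjusts each student's outcome for where that student started, so a teacher assigned lower-scoring students is not automatically penalized.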

sounds interesting. "document-based questions with department-created rubric." yeah, again, i'm not sure what that is. "in arizona, also using nwea assessments with student growth percentiles." that's very interesting. that seems like a very promising direction in which to go also. so, good. it's clear that everybody is struggling with this and trying to figure out good ways to do it.
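[editor's note: for readers unfamiliar with the student growth percentiles mentioned in that last response, here is a deliberately simplified, hypothetical python sketch of the underlying idea: rank a student's current score against "academic peers" who started from the same prior score. the function name and data are invented; operational sgp systems use quantile regression over full prior-score histories rather than exact prior-score matching.]

```python
from bisect import bisect_left

def simple_growth_percentile(prior, current, peers):
    """Toy growth percentile: the percentile rank of a student's
    current score among peers who had the same prior score.
    `peers` is a list of (prior_score, current_score) pairs."""
    same_prior = sorted(c for p, c in peers if p == prior)
    if not same_prior:
        raise ValueError("no academic peers with that prior score")
    # Count peers the student outscored, then scale to 0-100.
    below = bisect_left(same_prior, current)
    return round(100 * below / len(same_prior))
```

the appeal for teacher evaluation is the same as for value-added: growth is judged relative to similar students, not against a single absolute cutoff.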

oh, here's another question. "if you're using an rti," which i'm guessing here means a response to intervention framework, "doing universal screening several times per year, might that be a good way to measure student growth schoolwide or for individual teachers?" that's a good question. i don't know enough about the rti framework to know whether that's the case. it strikes me as entirely possible that that would work.

universal screening is certainly something you'd like to see in order to promote validity and reliability. so there might be opportunities to apply value-added statistical models to the screening results from that framework. okay. "teacher selecting common core state standards that have been identified as essential to success in the course in high school math, using a lot of pre- and post-assessment that addresses those common core state standards." so that again is

sort of a hybrid. so the idea is sort of to rely on the common standard, but presumably also on the pre- and post-assessment. i guess the question there would be, you know, how do you ensure consistency of setting standards for different classrooms and different kinds of students. okay. well, this has been very helpful. thanks for your input here. i think we are almost done.

we have one more poll for you asking what action steps you plan to take. that is, if you are working in a school, or a district, or a state, have you learned anything here that might suggest directions to go as you develop growth measures for teacher evaluation? judy, can i ask you, what would you advise a district or state about action steps? >> judy ferguson: go slow.

[laughter] >> brian gill: yeah. that makes a lot of sense. policy is moving very fast at the moment. >> judy ferguson: yeah, i think the dearth of research is a big concern. and if states will take some time before the high-stakes part kicks in to really evaluate the effectiveness of student growth objectives for measuring teacher performance and get more

data, get more information, that would be a wise thing to do. it's probably not going to happen, but it would be a wise thing to look for. we need more research. you know, it's clear in the work that you've done, which is phenomenal, there's just not a whole bunch out there, and we need to invest a lot more time and effort in finding out what is reliable and valid in terms of measuring teacher effectiveness.

>> brian gill: that's a message i always like to hear. i'm glad you said it rather than me. >> judy ferguson: you didn't tell me to say it either. no, i think it's -- while people are writing their responses there, one thing i walked away with today from this webinar and from the reading and preparation for it is that i do think there is a lot of incentive in this process for teachers to learn how to make better assessments, and that's a good thing.

it's the high-stakes part that is so concerning at this very early stage of our knowledge base of what works and what doesn't. so that's my take. >> brian gill: great, thank you. and then we're having a variety of action steps proposed here. "hoping that the parcc, the new assessments from the consortium, and data collected by new jersey smart will provide more support to the process." yes, lots of states are trying to improve

their data systems. new assessments, as you know, are going to be coming online that are supposed to be aligned to common core and individual state standards. and i think there's reason for hope that those will, first of all, provide better information about what students are learning, but then ultimately permit better information about what teachers are contributing to that learning as well. yeah, i hope that you find some of the resources we provided to

be useful. i think we have a link somewhere to our report. if you look at the little resources box, you can download the report there, as well as finding it on the rel mid-atlantic website. for those of you who are interested in learning more about slos, i recommend checking out that ctac website. and good luck to you. i have to confess that it's, you know, it's easier to be in a

researcher's position here than it is to be on the frontlines, trying to make these things work while you're teaching and running your schools. so i think that's our last poll. and the only remaining thing is just any final questions that any of you might have that we haven't already addressed. are there any further questions remaining? [no audible response] >> brian gill: and if not, then i will

turn it back over to teresa to talk a little bit about where some of this information is going and what's going to be happening soon in other events and presentations from rel mid-atlantic. >> teresa duncan: great, thank you so much, brian, and thank you very much, too, judy. we did not lack for questions. so i'm very pleased with the level of engagement that we saw from our audience.

and so with this slide, if you have any additional questions, please feel free to email brian, and there's his email address, bgill@mathematica-mpr.com. and if you have any questions about the rel, its work, and any of its efforts, i'm more than happy to answer and share with you any information that you might need. my email address is teresa.duncan@icfi.com. i also invite you to continue the discussion on the forums on our website.

you'll see the link up there. i think, particularly with some of the follow-up steps, and since this is really an ongoing effort for everybody, we'd love to hear from you and to continue the discussion and learn more from each other's efforts. so please do join us on this online discussion forum. what we're also going to do is collect all of the responses -- we had several questions come in prior to the webinar from our registrants.

and so we didn't get to answer all of those, but we will compile those, as well as those that came in during the webinar, and put them in an faq kind of document, and we'll post that on the forums, so you folks are able to download them and engage in discussion as you wish. so we hope that you'll find that a useful resource, and we hope to see you there. i also wanted to put in a plug for several of our upcoming events.

before the holidays start and everybody runs away for the holidays, we have four events in november, two of which are from our teacher effectiveness webinar series. and so we have one coming up november 7 focusing on pathways into teaching, and then another one on november 21 about the teacher's role in quality classroom interactions. we also have a couple of other webinars. one focuses on a practice guide published

by the department of education, focusing on improving mathematical problem solving in grades 4 through 8. and so that is taking place november 6. and we have a different type of webinar on november 18. it's very hands-on. it's intended to be a tool that might be of use to district staff and administrators, and that is using and creating automated data dashboards. and that is for pc users.

so i'm sorry, macintosh users, we are working on it, but we're not quite there yet. and we are continuing these types of webinars in 2014, so please join our mailing list. you can do that from our website. and we hope that you'll find rel mid-atlantic to be a resource that you go to often. and so i invite you to take a minute to complete our

participant satisfaction feedback survey. your feedback is very important to us. and so we do listen, and we do respond. so please, do fill out that survey. it should take you no more than five minutes. sharon, i think if you could send them over there? i'm assuming that the link is going out to everybody. so thank you very much for participating.
