mroberts@cs.colostate.edu
http://www.cs.colostate.edu/˜mrobertsComputerScienceDepartment,ColoradoStateUniversity
FortCollins,CO,80521
MarkRoberts
Aplanningsystem’sperformanceisbiasedduetomanyfactorsrelatedtoitsdesign.Forexample,therepresenta-tion,decisionpoints,searchcontrol,memoryusage,heuris-ticguidance,andstoppingcriteriaallcanhaveimplicationsforperformance.Probleminstancecharacteristicsalsoim-pactsystemperformance.Theinteractionofthedesignchoiceswiththeprobleminstancemakesitdifficulttose-lectthemostefficientsystemfromthearrayofchoices.Itseemsnaturaltoapplylearningtoaidinallocatingcomputa-tionalresourcesamongaportfolioofplannersthatmayhavecomplementing(orcompeting)searchtechnologies.Suchselectioniscalledtheportfoliostrategy.
Mythesisisthatwecanstudyaportfolioofplanningsys-temsforcluesaboutwhyonealgorithmisfavoredoveran-other.Asecondarythesisisthatwecanuncoveralgorithmicandproblemstructuredependenciesbyexaminingalgorithmperformanceonspecificinstances.Thisresearchfocusesonaseriesofquestions,roughlyaddressedintheorder:
1.Canwemodelplannerperformanceonbenchmarkprob-lemswithsimplefeatures?
2.Isaportfoliobasedonanalysisofandlearningfrompre-viousperformancecompetitivewithexistingplanners?3.Whichportfoliostrategiesconstituteeffectiveselection,ranking,andallocation?
4.Whatcanwelearnfromthedataand/ormodelsthatwillaidusindevelopingandtestingspecifichypotheseslead-ingtostrongerexplanationsofsearchperformanceinplanning?
5.Canweconstructameta-plannerbasedonourfindings?6.Howcantheknowledgewehavegainedsupplementthecurrentbenchmarkproblems?
7.Doourfindingsholdunderrelaxedclassicassumptions?Whatfollowsarehighlightedfindingsforitems(1)and(2)aswellasthecurrentstateofitems(3)and(4);mydisserta-tionwillprovidedetailedanalysesforalloftheseitems.Thispaperpresentsaseveralextensionstotheresultspresentedinthe2006DoctoralConsortium.Weadded9moreplanners(buttwowerenotreliable)aswellas767morechallengingproblemsfromtheIPC5PropositionalandIPC4hard-typeddomains.Wealsoadded30moreadvancedlearningmodelstoouroriginaltwo1.Finally,wemodifiedtheportfoliotouseavarietyofrankingandallocationstrate-1
Anothergraduatestudent,LandonFlom,didthisworkgies(includingrandombaselines)andtheabilitytorunal-gorithmseitherserially(inorderbyranking)orinaround-robinfashionuntilsuccessortimerunsout.Intermsofexaminingthedata,wehaveincludedrecentworkthattakesstepstowardanalyzingquestion4above.
CollectingPerformanceData
Wehavegonethroughseveralmajorrevisionsofdatacol-lection.Ourcurrentconfigurationconsistsof:
4726STRIPSPDDLProblemsfrom385domainsthataretakenfromHoffmann’sdataset(2004),theUCPOPStrictbenchmark,IPCsets(IPC1,IPC2,IPC3EasyTyped,IPC4,IPC5)plus37otherpubliclyavailableproblemsfromtwodomains(SodorandStek).
27Plannersfromourcompletelistof86plannerin-stancesthataredesignedtosamplethevarietyofhistoricalapproaches.Eachplannerisallowed30minutesand768MBmemory.Weused30identicallyconfiguredmachines.38Featuresautomaticallyextractedfromproblemanddomaindefinitions;weincludedfeaturesfrom(Howeetal.1999)plusmanyothers.Wedividethefeaturesintothreecategoriesofincreasingknowledgeandcomputationalcost:domainspecific,instancespecific,actioninteraction.Inearlywork,weexamined20featuresfromHoffmann’s(2004)statespacetaxonomy2.Butthesefeaturesareverycostlyandareexcludedfrommodeling.Welabelthedo-mainandinstance-specificfeaturesas‘fast’andtheactioninteractionandtopologicalfeaturesas‘expensive.’
PerformanceModels
Foreachplanner,weconstructedtwomodels:suc-cessandruntime.Successmodelsoutputabinaryde-cisionandmayalsoestimatetheprobabilityoffind-ingasolutiongivenaprobleminstanceandaplan-ner(P(solutionfound|problem,planner)).Runtimemodelspredictcomputationtimeneededforagivenplannertocom-pleteagivenprobleminstance.
WebuildallmodelswiththeWEKAdataminingpackage(Witten&Frank2005);tostart,weusedtwomodelsthatworkedwellthatwecouldexplain:OneRandJ48.OneRselectsthesinglefeaturethatyieldsthehighestpredictionvalueonthetrainingset,whileJ48isadecisiontreemethod
2
WethankJ¨orgHoffmannforsupplyingthecode.
basedonQuinlan’sC4.5.Themodelsaredistinguishedbasedonthetrainingdatausedtobuildthem:preIPC4data(allbutIPC4andIPC5),preIPC5(allbutIPC5),andchal-lenge3(problemsforwhichonetothreeplannerssucceededandthemediancompletiontimewasoveronesecond).Un-lessmentionedotherwise,resultsarereportedforten-foldcross-validation.Forthesetwomodelswefoundthat:
•Whenpredictingsuccess,J48achieved96.7%averageac-curacy(sdof3.2)forpreIPC4and96.8%averageaccu-racy(sdof2.12)forpreIPC5.
•Whenpredictingtime,theaverageaccuracyusingJ48onlogbinneddatawas95.0%(sdof2.49).Overallplanners,84.2%oftherunsfinishinlessthanonesecondand5%finishedingreaterthan1000seconds.
•Expensivefeaturesdoimproveaccuracy,butnotenoughtojustifytheircomputationalcost.
•InthesuccessOneRmodels,theaveragenumberofnegationsineffectswasthebestpredictorfornineoftheplanners;thepredicatearitywasbestforanotherfour.Thefirstfeaturemayindicatewheretheoftenusedh+heuristicmayhavetrouble;thesecondroughlyinfluencesbranchinginthesearchspace.
Ourmostrecentworkextendsthemodelsetto30moretechniques.Somepreliminaryfindingswithrespecttothesemodelsare:
•Theexpandedmodelsetimprovesaccuracyovertheini-tialmodels;especiallyforruntime.ThemostcommonmodelwasKStar–avariantofK-NearestNeighbors.•ModelstrainedonpreIPC4andtestedonIPC4didnotgeneralizewell,butmodelstrainedonpreIPC5andtestedonIPC5generalizedwell.
ThePortfolioArchitecture
Theportfolioshouldcontainthemostaccuratemodels(thatarenotoverfit).Wesplitchallenge3intoan80%trainingand20%testingsets,wherethetestingsetincludedhalftheproblemsfromIPC4andIPC5(tofurtherfocusthelearningonthemostchallengingproblems).Weevaluatedallmodelsonthetestingdataandselectedthemostaccuratesuccessandruntimemodels.
Theotherthreedecisionmakingcomponents(selection,ranking,andallocation)canbeconfiguredintomanydiffer-entcombinations.Thecomponentsandtheiroptionsare:Selectionrestrictsthesetofpossibleplannerstoonlythoseneededtocovertheproblemset.Weuseasinglestrat-egythatcomputesa“cover”fromthe28plannersusingagreedysetcoveringapproximationalgorithm(Cormenetal.2003)onthe80%challenge3data.Weexcludedfromthefinalsetanyplannerthatonlysolvedoneuniqueprobleminthetrainingset(4plannerstotal).Thereduced“cover”consistsof10planners.
Rankingorderstheexecutionoftheplanners.Therank-ingstrategieswehaveexaminedinclude:
randomrankstheplannersinanarbitraryorder,coverusesthesetcoveringorder,
pSuccessprunesthoseplannerspredictedtofailandorderstherestbydecreasingpredictedprobabilityofsuccess,predTimeordersbyincreasingpredictedtime,and
SimonKadaneordersindecreasingpSuccess
inserialpredTime,whichmini-mizesexpectedcostofsuccesssearch(1975).Allocationdeterminestheruntimeforeachplannerwithinoneoftwoexecutionparadigms:
Serialexecutionrunseachplannertoitsallocatedtimeandquitsatthefirstsuccessorafterallplannershavespenttheirtime.Forserialexecutionweappliedtwoallocationstrategies:
avgPlannerusestheaveragetimetosucceedfortheplanner,and
predTimecomputesthepredictedtimefortheproblemfromthemodel.
Round-robinexecutioncyclesthroughthequeueofplan-nersuntilasingleplannerindicatessuccess,allplannersstopwithfailure,ortheportfolioexceedstheexperimen-taltimelimit(30minutes).Oneachiteration,theplannerbeginswherepreviouslyleftoff;itdoesnotrestartfromscratch.Wesupportoneallocationstrategy:
confIntusestheruntimedistributionofsuc-cessfultrainingrunstoestimatethequantilesq={25,50,75,80,85,90,95,97,99}.Forex-ample,iftheactualsuccessfulruntimesofaplannerare{0.1,0.2,0.3,0.4,0.5,10,100,1000},thenconfIntwillreturnanallocationstrategyof{0.28,0.45,32.5,64.0,95.5,370.0,685.0,1000.0}.Wealsoimplementedarandomallocationschemeforbothexecutionmodelsasabaseline.
Howdoestheportfoliocomparetoexistingplanners?
Wehavecomparedthissimpleportfoliotothemostsuccess-fulandaverageplannersandfoundthat:
•asimpleallocationstrategy(pairedwithpSuccess)thatusestherun-timedistributionscanproduceperformancebetterthantheaverageplannerperformance,
•thebestrankingstrategiesemploytheextendedmodels,•weneedevenmoreaccuratetimemodels;underanyrank-ingstrategyrandomallocationperformedaswellasanyotherallocationstrategy,
•performancetrendssuggestthatround-robinallocationismoresuccessfulthanserialallocation,
•thebestportfoliolags60secondsonaveragebehindan‘oracle’selectionstrategythatusesperfectknowledge,•thebestportfoliois10secondsfasteronaveragethanthebestplanner(SGPlan-2006),and
•outof244testingproblemsfromchallenge3,thebestportfoliosolves51moreproblemsthanthebestplanner.
LearningfromtheData:Hoffmann’sTaxonomy
WeappliedHoffmann’stopologytaxonomytoseeifithelpsexplaintheperformanceoftwogroupsofplanners:Heuris-ticSearch(HS)plannersandNon-heuristicsearch(NHS)planners.Usingstatisticalanalysesonthechallenge3datawehavefoundthat:
•TheperformanceofHSandNHSplannersisdistinctfromoneanotherregardlessofthetaxonomy,
•TheperformanceofNHSplannersappearstobeinsensi-tivetothetaxonomy,
•TheperformanceofHSplannersissensitivetothetaxon-omy,
•Moreworkneedstobedonetoaccountforproblemdiffi-cultyintheexistingtaxonomy.
Theresults,thoughlimited,suggestthatitisusefultominethedatatouncoverspecificperformancedependenciesandthatthisanalysiscanextendexistingknowledge.
Summaryofstatus
Withrespecttothequestionsposedintheintroduction,wehavemadethefollowingcontributions:
1.ModelingPerformanceWecanmodelplannerper-formanceonbenchmarkproblemswithsimplefeaturesex-tractedfromproblemsandplanners.Featurecostandfeatureselectionremainimportanttofurtherenhancingtheportfoliomodels.Giventherelativelypoorperformanceofthetimemodels,wemayneedtolookintometa-learningtechniquesinmachinelearningforbettermodels.
2.BasicPortfolioOurproof-of-conceptportfolioiscom-petitivewithexistingplannersasofIPC4butincludingprob-lemsuptothemostrecentIPC5.
3.PortfolioStrategiesWehavebegunexploringthespaceofportfoliostrategiesandfoundthatround-robinstrategiesthatuseSimon/Kadanerankingappeartobethemostsuccessful.However,moreworkneedstobedoneonmodelingruntimesothattheportfoliocanbetterallocatetime.Therearealsomanymorestrategiesfromthelitera-turethatwecouldemployinourstudy.
4.LinkingPerformanceWehaveshownthatboththemodelsandtherawperformancedataarerichwithinfor-mationthatcanleadtoexplanationsofsearchperformanceinclassicalplanning.Themodelsprovidesomeevidencelinkingparticularfeaturestoperformanceprediction.Butthereremainsmuchworktolinkthesearchbiasofvariousplannerswiththeirperformanceonspecificproblems.Weexpecttoidentifynewfeaturesthathelpexplainindirectac-tioninteractionsasweperformricherdomainanalysis.AreasonableplacetostartmaybeincludingsomefeaturesfromHAP(Vrakasetal.2005),TIM(Fox&Long1998),andLondex(Chen,Zhao,&Zhang2007).Amoresophisti-catedapproachwouldbetousetheanalysisstructuresfrom(Helmert2006)ingeneratingfeaturesandidentifyingprob-lemspecificbias3.
5.Themeta-plannerAsitis,theportfolioisunfitforinclusionasacompetitionplanner.Butweexpecttomakekeyinsightsintowhichcomponentsofplannersarecriti-calforsuccessfulperformanceandimplementaplannerthatcombinesthesecomponents.Wealsohopetoextendouranalysistomodelingdynamicfeatureswhereinwemayin-cludesampledstate-spacefeaturesthatmaychangeplannerbehavioron-line.
6.ExtendingtheBenchmarkProblemsGeneratingspe-cificproblemswithIPCproblemgeneratorswillbeusefulfortestingspecifichypothesesaboutthedependencieswenotice.Wehaveconverted4(butnotyetthoroughlystud-ied)twonewproblemsets,whichwehopewillextendthe
3AthankstoDavidSmithforthissuggestion.4
ChristinaWilliamsaddedtheseproblems.
benchmarkstootherrealisticdomains.Akeyissueistoex-pandtheproblemsetinwaysthatarebothchallengingandrealistic(Watsonetal.1999).
7.RelaxingAssumptionsFinally,thisworkisbasedintheclassicalplanningparadigmandnumerousextensionstoPDDLrelaxclassicalassumptions.Wehopetofullydevelopaprincipledmethodologyforportfolioconstructionandthenextendittotheserelaxedassumptions.
AcknowledgmentsTheauthorwouldliketothankhisad-visorAdeleHowe,thestudentsintheAIgroupatCSU,theICAPScommunity,aswellasalltheauthorsoftheplanningsystems.
Portionsofthisworkappearinothervenues,andthereaderisreferredtotheseortheauthorswebsiteformoredetail:(Roberts&Howe2006),(Roberts,Howe,&Flom2007),(Roberts&Howe2007).
References
Chen,Y.;Zhao,X.;andZhang,W.2007.Longdistancemu-tualexclusionforpropositionalplanning.InInternationalJointConferenceonArtificialIntelligence(IJCAI-07).
Cormen,T.;Leiserson,C.;Rivest,R.;andStein,C.2003.In-troductiontoAlgorithms.MITpress,Cambridge,MA,secondedition.
Fox,M.,andLong,D.1998.TheautomaticinferenceofstateinvariantsinTIM.JAIRVolume9:367–421.
Helmert,M.2006.Newcomplexityresultsforclassicalplanningbenchmarks.InICAPS2006,52–61.
Hoffmann,J.2004.UtilizingProblemStructureinPlanning:AlocalSearchApproach.Berlin,NewYork:Springer-Verlag.
Howe,A.;Dahlman,E.;Hansen,C.;vonMayrhauser,A.;andScheetz,M.1999.Exploitingcompetitiveplannerperformance.InProc.ofECP-99.
Roberts,M.,andHowe,A.2006.Directingaportfoliowithlearning.InRuml,W.,andHutter,F.,eds.,AAAIWorkshoponLearningforSearch.
Roberts,M.,andHowe,A.2007.Localsearchtopology:Impli-cationsforplannerperformance.InICAPS2007,WorkshoponHeuristicsforDomain-independentPlanning,toappear.
Roberts,M.;Howe,A.;andFlom,L.2007.Learnedmodelsofperformanceformanyplanners.InICAPS2007,WorkshopAIPlanningandLearning,toappear.
Simon,H.,andKadane,J.1975.Optimalproblem-solvingsearch:All-or-nonesolutions.ArtificialIntelligence6:235–247.Vrakas,D.;Tsoumakas,G.;Bassiliades,N.;andVlahavas,I.2005.IntelligentTechniquesforPlanning.IdeaGroup.chap-terMachineLearningforAdaptivePlanning,90–120.
Watson,J.;Barbulescu,L.;Howe,A.;andWhitley,L.1999.Algorithmperformanceandproblemstructureforflow-shopscheduling.InProc.ofAAAI-99.
Witten,I.H.,andFrank,E.2005.DataMining:Practicalmachinelearningtoolsandtechniques.SanFrancisco:MorganKaufmann,2ndedition.
因篇幅问题不能全部显示,请点此查看更多更全内容