您的当前位置:首页正文

Harnessing Algorithm Bias in Classical Planning

2023-11-13 来源:星星旅游
HarnessingAlgorithmBiasinClassicalPlanning

mroberts@cs.colostate.edu

http://www.cs.colostate.edu/˜mrobertsComputerScienceDepartment,ColoradoStateUniversity

FortCollins,CO,80521

MarkRoberts

Aplanningsystem’sperformanceisbiasedduetomanyfactorsrelatedtoitsdesign.Forexample,therepresenta-tion,decisionpoints,searchcontrol,memoryusage,heuris-ticguidance,andstoppingcriteriaallcanhaveimplicationsforperformance.Probleminstancecharacteristicsalsoim-pactsystemperformance.Theinteractionofthedesignchoiceswiththeprobleminstancemakesitdifficulttose-lectthemostefficientsystemfromthearrayofchoices.Itseemsnaturaltoapplylearningtoaidinallocatingcomputa-tionalresourcesamongaportfolioofplannersthatmayhavecomplementing(orcompeting)searchtechnologies.Suchselectioniscalledtheportfoliostrategy.

Mythesisisthatwecanstudyaportfolioofplanningsys-temsforcluesaboutwhyonealgorithmisfavoredoveran-other.Asecondarythesisisthatwecanuncoveralgorithmicandproblemstructuredependenciesbyexaminingalgorithmperformanceonspecificinstances.Thisresearchfocusesonaseriesofquestions,roughlyaddressedintheorder:

1.Canwemodelplannerperformanceonbenchmarkprob-lemswithsimplefeatures?

2.Isaportfoliobasedonanalysisofandlearningfrompre-viousperformancecompetitivewithexistingplanners?3.Whichportfoliostrategiesconstituteeffectiveselection,ranking,andallocation?

4.Whatcanwelearnfromthedataand/ormodelsthatwillaidusindevelopingandtestingspecifichypotheseslead-ingtostrongerexplanationsofsearchperformanceinplanning?

5.Canweconstructameta-plannerbasedonourfindings?6.Howcantheknowledgewehavegainedsupplementthecurrentbenchmarkproblems?

7.Doourfindingsholdunderrelaxedclassicassumptions?Whatfollowsarehighlightedfindingsforitems(1)and(2)aswellasthecurrentstateofitems(3)and(4);mydisserta-tionwillprovidedetailedanalysesforalloftheseitems.Thispaperpresentsaseveralextensionstotheresultspresentedinthe2006DoctoralConsortium.Weadded9moreplanners(buttwowerenotreliable)aswellas767morechallengingproblemsfromtheIPC5PropositionalandIPC4hard-typeddomains.Wealsoadded30moreadvancedlearningmodelstoouroriginaltwo1.Finally,wemodifiedtheportfoliotouseavarietyofrankingandallocationstrate-1

Anothergraduatestudent,LandonFlom,didthisworkgies(includingrandombaselines)andtheabilitytorunal-gorithmseitherserially(inorderbyranking)orinaround-robinfashionuntilsuccessortimerunsout.Intermsofexaminingthedata,wehaveincludedrecentworkthattakesstepstowardanalyzingquestion4above.

CollectingPerformanceData

Wehavegonethroughseveralmajorrevisionsofdatacol-lection.Ourcurrentconfigurationconsistsof:

4726STRIPSPDDLProblemsfrom385domainsthataretakenfromHoffmann’sdataset(2004),theUCPOPStrictbenchmark,IPCsets(IPC1,IPC2,IPC3EasyTyped,IPC4,IPC5)plus37otherpubliclyavailableproblemsfromtwodomains(SodorandStek).

27Plannersfromourcompletelistof86plannerin-stancesthataredesignedtosamplethevarietyofhistoricalapproaches.Eachplannerisallowed30minutesand768MBmemory.Weused30identicallyconfiguredmachines.38Featuresautomaticallyextractedfromproblemanddomaindefinitions;weincludedfeaturesfrom(Howeetal.1999)plusmanyothers.Wedividethefeaturesintothreecategoriesofincreasingknowledgeandcomputationalcost:domainspecific,instancespecific,actioninteraction.Inearlywork,weexamined20featuresfromHoffmann’s(2004)statespacetaxonomy2.Butthesefeaturesareverycostlyandareexcludedfrommodeling.Welabelthedo-mainandinstance-specificfeaturesas‘fast’andtheactioninteractionandtopologicalfeaturesas‘expensive.’

PerformanceModels

Foreachplanner,weconstructedtwomodels:suc-cessandruntime.Successmodelsoutputabinaryde-cisionandmayalsoestimatetheprobabilityoffind-ingasolutiongivenaprobleminstanceandaplan-ner(P(solutionfound|problem,planner)).Runtimemodelspredictcomputationtimeneededforagivenplannertocom-pleteagivenprobleminstance.

WebuildallmodelswiththeWEKAdataminingpackage(Witten&Frank2005);tostart,weusedtwomodelsthatworkedwellthatwecouldexplain:OneRandJ48.OneRselectsthesinglefeaturethatyieldsthehighestpredictionvalueonthetrainingset,whileJ48isadecisiontreemethod

2

WethankJ¨orgHoffmannforsupplyingthecode.

basedonQuinlan’sC4.5.Themodelsaredistinguishedbasedonthetrainingdatausedtobuildthem:preIPC4data(allbutIPC4andIPC5),preIPC5(allbutIPC5),andchal-lenge3(problemsforwhichonetothreeplannerssucceededandthemediancompletiontimewasoveronesecond).Un-lessmentionedotherwise,resultsarereportedforten-foldcross-validation.Forthesetwomodelswefoundthat:

•Whenpredictingsuccess,J48achieved96.7%averageac-curacy(sdof3.2)forpreIPC4and96.8%averageaccu-racy(sdof2.12)forpreIPC5.

•Whenpredictingtime,theaverageaccuracyusingJ48onlogbinneddatawas95.0%(sdof2.49).Overallplanners,84.2%oftherunsfinishinlessthanonesecondand5%finishedingreaterthan1000seconds.

•Expensivefeaturesdoimproveaccuracy,butnotenoughtojustifytheircomputationalcost.

•InthesuccessOneRmodels,theaveragenumberofnegationsineffectswasthebestpredictorfornineoftheplanners;thepredicatearitywasbestforanotherfour.Thefirstfeaturemayindicatewheretheoftenusedh+heuristicmayhavetrouble;thesecondroughlyinfluencesbranchinginthesearchspace.

Ourmostrecentworkextendsthemodelsetto30moretechniques.Somepreliminaryfindingswithrespecttothesemodelsare:

•Theexpandedmodelsetimprovesaccuracyovertheini-tialmodels;especiallyforruntime.ThemostcommonmodelwasKStar–avariantofK-NearestNeighbors.•ModelstrainedonpreIPC4andtestedonIPC4didnotgeneralizewell,butmodelstrainedonpreIPC5andtestedonIPC5generalizedwell.

ThePortfolioArchitecture

Theportfolioshouldcontainthemostaccuratemodels(thatarenotoverfit).Wesplitchallenge3intoan80%trainingand20%testingsets,wherethetestingsetincludedhalftheproblemsfromIPC4andIPC5(tofurtherfocusthelearningonthemostchallengingproblems).Weevaluatedallmodelsonthetestingdataandselectedthemostaccuratesuccessandruntimemodels.

Theotherthreedecisionmakingcomponents(selection,ranking,andallocation)canbeconfiguredintomanydiffer-entcombinations.Thecomponentsandtheiroptionsare:Selectionrestrictsthesetofpossibleplannerstoonlythoseneededtocovertheproblemset.Weuseasinglestrat-egythatcomputesa“cover”fromthe28plannersusingagreedysetcoveringapproximationalgorithm(Cormenetal.2003)onthe80%challenge3data.Weexcludedfromthefinalsetanyplannerthatonlysolvedoneuniqueprobleminthetrainingset(4plannerstotal).Thereduced“cover”consistsof10planners.

Rankingorderstheexecutionoftheplanners.Therank-ingstrategieswehaveexaminedinclude:

randomrankstheplannersinanarbitraryorder,coverusesthesetcoveringorder,

pSuccessprunesthoseplannerspredictedtofailandorderstherestbydecreasingpredictedprobabilityofsuccess,predTimeordersbyincreasingpredictedtime,and

SimonKadaneordersindecreasingpSuccess

inserialpredTime,whichmini-mizesexpectedcostofsuccesssearch(1975).Allocationdeterminestheruntimeforeachplannerwithinoneoftwoexecutionparadigms:

Serialexecutionrunseachplannertoitsallocatedtimeandquitsatthefirstsuccessorafterallplannershavespenttheirtime.Forserialexecutionweappliedtwoallocationstrategies:

avgPlannerusestheaveragetimetosucceedfortheplanner,and

predTimecomputesthepredictedtimefortheproblemfromthemodel.

Round-robinexecutioncyclesthroughthequeueofplan-nersuntilasingleplannerindicatessuccess,allplannersstopwithfailure,ortheportfolioexceedstheexperimen-taltimelimit(30minutes).Oneachiteration,theplannerbeginswherepreviouslyleftoff;itdoesnotrestartfromscratch.Wesupportoneallocationstrategy:

confIntusestheruntimedistributionofsuc-cessfultrainingrunstoestimatethequantilesq={25,50,75,80,85,90,95,97,99}.Forex-ample,iftheactualsuccessfulruntimesofaplannerare{0.1,0.2,0.3,0.4,0.5,10,100,1000},thenconfIntwillreturnanallocationstrategyof{0.28,0.45,32.5,64.0,95.5,370.0,685.0,1000.0}.Wealsoimplementedarandomallocationschemeforbothexecutionmodelsasabaseline.

Howdoestheportfoliocomparetoexistingplanners?

Wehavecomparedthissimpleportfoliotothemostsuccess-fulandaverageplannersandfoundthat:

•asimpleallocationstrategy(pairedwithpSuccess)thatusestherun-timedistributionscanproduceperformancebetterthantheaverageplannerperformance,

•thebestrankingstrategiesemploytheextendedmodels,•weneedevenmoreaccuratetimemodels;underanyrank-ingstrategyrandomallocationperformedaswellasanyotherallocationstrategy,

•performancetrendssuggestthatround-robinallocationismoresuccessfulthanserialallocation,

•thebestportfoliolags60secondsonaveragebehindan‘oracle’selectionstrategythatusesperfectknowledge,•thebestportfoliois10secondsfasteronaveragethanthebestplanner(SGPlan-2006),and

•outof244testingproblemsfromchallenge3,thebestportfoliosolves51moreproblemsthanthebestplanner.

LearningfromtheData:Hoffmann’sTaxonomy

WeappliedHoffmann’stopologytaxonomytoseeifithelpsexplaintheperformanceoftwogroupsofplanners:Heuris-ticSearch(HS)plannersandNon-heuristicsearch(NHS)planners.Usingstatisticalanalysesonthechallenge3datawehavefoundthat:

•TheperformanceofHSandNHSplannersisdistinctfromoneanotherregardlessofthetaxonomy,

•TheperformanceofNHSplannersappearstobeinsensi-tivetothetaxonomy,

•TheperformanceofHSplannersissensitivetothetaxon-omy,

•Moreworkneedstobedonetoaccountforproblemdiffi-cultyintheexistingtaxonomy.

Theresults,thoughlimited,suggestthatitisusefultominethedatatouncoverspecificperformancedependenciesandthatthisanalysiscanextendexistingknowledge.

Summaryofstatus

Withrespecttothequestionsposedintheintroduction,wehavemadethefollowingcontributions:

1.ModelingPerformanceWecanmodelplannerper-formanceonbenchmarkproblemswithsimplefeaturesex-tractedfromproblemsandplanners.Featurecostandfeatureselectionremainimportanttofurtherenhancingtheportfoliomodels.Giventherelativelypoorperformanceofthetimemodels,wemayneedtolookintometa-learningtechniquesinmachinelearningforbettermodels.

2.BasicPortfolioOurproof-of-conceptportfolioiscom-petitivewithexistingplannersasofIPC4butincludingprob-lemsuptothemostrecentIPC5.

3.PortfolioStrategiesWehavebegunexploringthespaceofportfoliostrategiesandfoundthatround-robinstrategiesthatuseSimon/Kadanerankingappeartobethemostsuccessful.However,moreworkneedstobedoneonmodelingruntimesothattheportfoliocanbetterallocatetime.Therearealsomanymorestrategiesfromthelitera-turethatwecouldemployinourstudy.

4.LinkingPerformanceWehaveshownthatboththemodelsandtherawperformancedataarerichwithinfor-mationthatcanleadtoexplanationsofsearchperformanceinclassicalplanning.Themodelsprovidesomeevidencelinkingparticularfeaturestoperformanceprediction.Butthereremainsmuchworktolinkthesearchbiasofvariousplannerswiththeirperformanceonspecificproblems.Weexpecttoidentifynewfeaturesthathelpexplainindirectac-tioninteractionsasweperformricherdomainanalysis.AreasonableplacetostartmaybeincludingsomefeaturesfromHAP(Vrakasetal.2005),TIM(Fox&Long1998),andLondex(Chen,Zhao,&Zhang2007).Amoresophisti-catedapproachwouldbetousetheanalysisstructuresfrom(Helmert2006)ingeneratingfeaturesandidentifyingprob-lemspecificbias3.

5.Themeta-plannerAsitis,theportfolioisunfitforinclusionasacompetitionplanner.Butweexpecttomakekeyinsightsintowhichcomponentsofplannersarecriti-calforsuccessfulperformanceandimplementaplannerthatcombinesthesecomponents.Wealsohopetoextendouranalysistomodelingdynamicfeatureswhereinwemayin-cludesampledstate-spacefeaturesthatmaychangeplannerbehavioron-line.

6.ExtendingtheBenchmarkProblemsGeneratingspe-cificproblemswithIPCproblemgeneratorswillbeusefulfortestingspecifichypothesesaboutthedependencieswenotice.Wehaveconverted4(butnotyetthoroughlystud-ied)twonewproblemsets,whichwehopewillextendthe

3AthankstoDavidSmithforthissuggestion.4

ChristinaWilliamsaddedtheseproblems.

benchmarkstootherrealisticdomains.Akeyissueistoex-pandtheproblemsetinwaysthatarebothchallengingandrealistic(Watsonetal.1999).

7.RelaxingAssumptionsFinally,thisworkisbasedintheclassicalplanningparadigmandnumerousextensionstoPDDLrelaxclassicalassumptions.Wehopetofullydevelopaprincipledmethodologyforportfolioconstructionandthenextendittotheserelaxedassumptions.

AcknowledgmentsTheauthorwouldliketothankhisad-visorAdeleHowe,thestudentsintheAIgroupatCSU,theICAPScommunity,aswellasalltheauthorsoftheplanningsystems.

Portionsofthisworkappearinothervenues,andthereaderisreferredtotheseortheauthorswebsiteformoredetail:(Roberts&Howe2006),(Roberts,Howe,&Flom2007),(Roberts&Howe2007).

References

Chen,Y.;Zhao,X.;andZhang,W.2007.Longdistancemu-tualexclusionforpropositionalplanning.InInternationalJointConferenceonArtificialIntelligence(IJCAI-07).

Cormen,T.;Leiserson,C.;Rivest,R.;andStein,C.2003.In-troductiontoAlgorithms.MITpress,Cambridge,MA,secondedition.

Fox,M.,andLong,D.1998.TheautomaticinferenceofstateinvariantsinTIM.JAIRVolume9:367–421.

Helmert,M.2006.Newcomplexityresultsforclassicalplanningbenchmarks.InICAPS2006,52–61.

Hoffmann,J.2004.UtilizingProblemStructureinPlanning:AlocalSearchApproach.Berlin,NewYork:Springer-Verlag.

Howe,A.;Dahlman,E.;Hansen,C.;vonMayrhauser,A.;andScheetz,M.1999.Exploitingcompetitiveplannerperformance.InProc.ofECP-99.

Roberts,M.,andHowe,A.2006.Directingaportfoliowithlearning.InRuml,W.,andHutter,F.,eds.,AAAIWorkshoponLearningforSearch.

Roberts,M.,andHowe,A.2007.Localsearchtopology:Impli-cationsforplannerperformance.InICAPS2007,WorkshoponHeuristicsforDomain-independentPlanning,toappear.

Roberts,M.;Howe,A.;andFlom,L.2007.Learnedmodelsofperformanceformanyplanners.InICAPS2007,WorkshopAIPlanningandLearning,toappear.

Simon,H.,andKadane,J.1975.Optimalproblem-solvingsearch:All-or-nonesolutions.ArtificialIntelligence6:235–247.Vrakas,D.;Tsoumakas,G.;Bassiliades,N.;andVlahavas,I.2005.IntelligentTechniquesforPlanning.IdeaGroup.chap-terMachineLearningforAdaptivePlanning,90–120.

Watson,J.;Barbulescu,L.;Howe,A.;andWhitley,L.1999.Algorithmperformanceandproblemstructureforflow-shopscheduling.InProc.ofAAAI-99.

Witten,I.H.,andFrank,E.2005.DataMining:Practicalmachinelearningtoolsandtechniques.SanFrancisco:MorganKaufmann,2ndedition.

因篇幅问题不能全部显示,请点此查看更多更全内容