<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <font face="Lucida Grande">Fri 2/2 would be good for a phone call

      with Stefanie and Michael.  (Does that work for both of you? -- if

      so, what time is good?  I'm fairly unconstrained.)<br>

      <br>

      Fri 2/2 won't work for a larger group meeting, though -- John will

      be at AAAI.  I'll be traveling on Fri 2/9 but John should be back

      by then, so maybe we could plan a joint group meeting that day --

      do you still have your regular meetings on Fridays?<br>

      <br>

      Marie<br>

      <br>

    </font><br>

    <div class="moz-cite-prefix">On 1/21/18 11:23 AM, Stefanie Tellex

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:05615ee0-d5c1-5a22-f499-4aa211d1db90@cs.brown.edu">I

      agree, for after the winter deadlines.

      <br>

      <br>

      Stefanie

      <br>

      <br>

      On 01/21/2018 04:51 AM, Lawson Wong wrote:

      <br>

      <blockquote type="cite">So have <a class="moz-txt-link-abbreviated" href="mailto:kcaluru@brown.edu">kcaluru@brown.edu</a>

        and <a class="moz-txt-link-abbreviated" href="mailto:miles_holland@brown.edu">miles_holland@brown.edu</a>

        <br>

        <br>

        The reviews look like typical planning-community reviews --

        generally sensible requests but clearly impossible to accomplish

        within the page limit. I guess it's generally hard to please

        planning reviewers unless there are some theoretical results.

        Review 2 actually reads a little like one that Nakul got for his

        paper...

        <br>

        <br>

        I don't know if Michael and Stefanie have answered separately

        regarding a meeting; it certainly sounds helpful to continue

        discussing (AMDP) hierarchy learning. Both the IJCAI and RSS

        deadlines are on that week (1/31 and 2/1 respectively), so if

        possible it may be best to meet after those deadlines, such as

        on Fri 2/2 -- unless the intent was to discuss before the IJCAI

        deadline.

        <br>

        <br>

        -Lawson

        <br>

        <br>

        <br>

        On Sat, Jan 20, 2018 at 6:04 AM, Littman, Michael

        &lt;<a class="moz-txt-link-abbreviated" href="mailto:mlittman@cs.brown.edu">mlittman@cs.brown.edu</a>&gt; wrote:

        <br>

        <br>

            <a class="moz-txt-link-abbreviated" href="mailto:christopher_grimm@brown.edu">christopher_grimm@brown.edu</a> has

        <br>

            graduated.

        <br>

        <br>

        <br>

            On Fri, Jan 19, 2018 at 12:06 PM, Marie desJardins

        <br>

            &lt;<a class="moz-txt-link-abbreviated" href="mailto:mariedj@cs.umbc.edu">mariedj@cs.umbc.edu</a>&gt; wrote:

        <br>

        <br>

                Hi everyone,

        <br>

        <br>

                I wanted to share the initial reviews we received on our

        ICAPS

        <br>

                submission (which I've also attached).  Based on the

        reviews, I

        <br>

                think the paper is unlikely to be accepted, so we are

        working to

        <br>

                see whether we can get some new results for an IJCAI

        submission.

        <br>

                We are making good progress on developing hierarchical

        learning

        <br>

                methods for AMDPs but we need to (a) move to larger/more

        complex

        <br>

                domains, (b) develop some theoretical analysis

        (complexity,

        <br>

                correctness, convergence), and (c) work on more

        AMDP-specific

        <br>

                hierarchy learning techniques (right now we are using an

        <br>

                off-the-shelf method called HierGen that works well but

        may not

        <br>

                necessarily find the best hierarchy for an AMDP

        representation).

        <br>

        <br>

                I'd be very interested to talk more about how this

        relates to

        <br>

                the work that's happening at Brown, and to hear any

        <br>

                feedback/ideas you might have about this work.

        <br>

        <br>

                Michael/Stefanie, could we maybe set up a time for the

        three of

        <br>

                us to have a teleconference?  I'll be on vacation next

        week but

        <br>

                the week after that would be good.  Possible times for

        me -- Mon

        <br>

                1/29 before 11:30am, between 1-2, or after 4pm; Wed 1/31

        before

        <br>

                10am or after 2pm; Thu 2/1 between 11-1:30 or 3-4; Fri

        2/2 any time.

        <br>

        <br>

                BTW, these are the Brown students who are on this list. 

        Please

        <br>

                let me know if anyone should be added or removed.

        <br>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:carl_trimbach@brown.edu">carl_trimbach@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:christopher_grimm@brown.edu">christopher_grimm@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:david_abel@brown.edu">david_abel@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:dilip.arumugam@gmail.com">dilip.arumugam@gmail.com</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:edward_c_williams@brown.edu">edward_c_williams@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:jun_ki_lee@brown.edu">jun_ki_lee@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:kcaluru@brown.edu">kcaluru@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:lsw@brown.edu">lsw@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:lucas_lehnert@brown.edu">lucas_lehnert@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:melrose_roderick@brown.edu">melrose_roderick@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:miles_holland@brown.edu">miles_holland@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:nakul_gopalan@brown.edu">nakul_gopalan@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:oberlin@cs.brown.edu">oberlin@cs.brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:sam_saarinen@brown.edu">sam_saarinen@brown.edu</a>

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:siddharth_karamcheti@brown.edu">siddharth_karamcheti@brown.edu</a>

        <br>

        <br>

                Marie

        <br>

        <br>

        <br>

                -------- Forwarded Message --------

        <br>

                Subject:     ICAPS 2018 review response (submission

        [*NUMBER*])

        <br>

                Date:     Thu, 11 Jan 2018 14:59:19 +0100

        <br>

                From:     ICAPS 2018 <a class="moz-txt-link-rfc2396E" href="mailto:icaps2018@easychair.org">&lt;icaps2018@easychair.org&gt;</a>

        <br>

                To:     Marie desJardins <a class="moz-txt-link-rfc2396E" href="mailto:mariedj@umbc.edu">&lt;mariedj@umbc.edu&gt;</a>

        <br>

        <br>

        <br>

        <br>

                Dear Marie,

        <br>

        <br>

                Thank you for your submission to ICAPS 2018. The ICAPS

        2018 review

        <br>

                response period starts now and ends on January 13.

        <br>

        <br>

                During this time, you will have access to the current

        state of your

        <br>

                reviews and have the opportunity to submit a response. 

        Please keep in

        <br>

                mind the following during this process:

        <br>

        <br>

                * Most papers have a so-called placeholder review, which

        was

        <br>

                   necessary to give the discussion leaders access to

        the reviewer

        <br>

                   discussion. Some of these reviews list questions that

        already came

        <br>

                   up during the discussion and which you may address in

        your response but

        <br>

                   in all cases the (usually enthusiastic) scores are

        meaningless and you

        <br>

                   should ignore them. Placeholder reviews are clearly

        indicated as such in

        <br>

                   the review.

        <br>

        <br>

                * Almost all papers have three reviews. Some may have

        four. A very

        <br>

                   low number of papers are missing one review. We hope

        to get that

        <br>

                   review completed in the next day. We apologize for

        this.

        <br>

        <br>

                * The deadline for entering a response is January 13th

        (at 11:59pm

        <br>

                   UTC-12 i.e. anywhere in the world).

        <br>

        <br>

                * Responses must be submitted through EasyChair.

        <br>

        <br>

                * Responses are limited to 1000 words in total. You can

        only enter

        <br>

                   one response, not one per review.

        <br>

        <br>

                * You will not be able to change your response after it

        is submitted.

        <br>

        <br>

                * The response must focus on any factual errors in the

        reviews and any

        <br>

                   questions posed by the reviewers. Try to be as

        concise and as to the

        <br>

                   point as possible.

        <br>

        <br>

                * The review response period is an opportunity to react

        to the

        <br>

                   reviews, but not a requirement to do so. Thus, if you

        feel the reviews

        <br>

                   are accurate and the reviewers have not asked any

        questions, then you

        <br>

                   do not have to respond.

        <br>

        <br>

                * The reviews are as submitted by the PC members,

        without much

        <br>

                   coordination between them. Thus, there may be

        inconsistencies.

        <br>

                   Furthermore, these are not the final versions of the

        reviews. The

        <br>

                   reviews can later be updated to take into account the

        discussions at

        <br>

                   the program committee meeting, and we may find it

        necessary to solicit

        <br>

                   other outside reviews after the review response

        period.

        <br>

        <br>

                * The program committee will read your responses

        carefully and

        <br>

                   take this information into account during the

        discussions. On the

        <br>

                   other hand, the program committee may not directly

        respond to your

        <br>

                   responses in the final versions of the reviews.

        <br>

        <br>

                The reviews on your paper are attached to this letter.

        To submit your

        <br>

                response you should log on to the EasyChair Web page for

        ICAPS 2018 and

        <br>

                select your submission on the menu.

        <br>

        <br>

                ----------------------- REVIEW 1 ---------------------

        <br>

                PAPER: 46

        <br>

                TITLE: Learning Abstracted Models and Hierarchies of

        Markov Decision Processes

        <br>

                AUTHORS: Matthew Landen, John Winder, Shawn Squire,

        Stephanie Milani, Shane Parr and Marie desJardins

        <br>

        <br>

                Significance: 2 (modest contribution or average impact)

        <br>

                Soundness: 3 (correct)

        <br>

                Scholarship: 3 (excellent coverage of related work)

        <br>

                Clarity: 3 (well written)

        <br>

                Reproducibility: 3 (authors describe the implementation

        and domains in sufficient detail)

        <br>

                Overall evaluation: 1 (weak accept)

        <br>

                Reviewer's confidence: 2 (medium)

        <br>

                Suitable for a demo?: 1 (no)

        <br>

                Nominate for Best Paper Award: 1 (no)

        <br>

                Nominate for Best Student Paper Award (if eligible): 1

        (no)

        <br>

                [Applications track ONLY]: Importance and novelty of the

        application: 6 (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY]: Importance of

        planning/scheduling technology to the solution of the problem: 5

        (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY] Maturity: 7 (N/A (not an

        Applications track paper))

        <br>

                [Robotics track ONLY]: Balance of Robotics and Automated

        Planning and Scheduling: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Evaluation on physical

        platforms/simulators: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Significance of the contribution:

        6 (N/A (not a Robotics track paper))

        <br>

        <br>

                ----------- Review -----------

        <br>

                The paper proposes a method for learning abstract Markov

        decision processes (AMDP) from demonstration trajectories and

        model based reinforcement learning. Experiments show that the

        method is more effective than the baseline.

        <br>

        <br>

                On the positive side, a complete method for learning

        AMDP is given and is shown to work on the problems used in

        the experiments. The proposed model based reinforcement learning

        method based on R-MAX is also shown to outperform the baseline

        R-MAXQ.

        <br>

        <br>

                On the negative side, the method for learning the

        hierarchy, HierGen, is taken from a prior work, leaving the

        adaptation of R-MAX to learn with hierarchy as the main

        algorithmic novelty. No convergence proof for the learning method

        is provided, although it is empirically shown to outperform the

        baseline R-MAXQ. The experiments are done on toy problems,

        indicating that the method is probably not ready for more

        demanding practical problems.

        <br>

        <br>

                Overall, I am inclined to vote weak accept. The problem

        is difficult, so I think that the work does represent progress,

        although it is not yet compelling.

        <br>

        <br>

                ----------------------- REVIEW 2 ---------------------

        <br>

                PAPER: 46

        <br>

                TITLE: Learning Abstracted Models and Hierarchies of

        Markov Decision Processes

        <br>

                AUTHORS: Matthew Landen, John Winder, Shawn Squire,

        Stephanie Milani, Shane Parr and Marie desJardins

        <br>

        <br>

                Significance: 2 (modest contribution or average impact)

        <br>

                Soundness: 3 (correct)

        <br>

                Scholarship: 2 (relevant literature cited but could be

        expanded)

        <br>

                Clarity: 3 (well written)

        <br>

                Reproducibility: 3 (authors describe the implementation

        and domains in sufficient detail)

        <br>

                Overall evaluation: -1 (weak reject)

        <br>

                Reviewer's confidence: 4 (expert)

        <br>

                Suitable for a demo?: 2 (maybe)

        <br>

                Nominate for Best Paper Award: 1 (no)

        <br>

                Nominate for Best Student Paper Award (if eligible): 1

        (no)

        <br>

                [Applications track ONLY]: Importance and novelty of the

        application: 6 (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY]: Importance of

        planning/scheduling technology to the solution of the problem: 5

        (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY] Maturity: 7 (N/A (not an

        Applications track paper))

        <br>

                [Robotics track ONLY]: Balance of Robotics and Automated

        Planning and Scheduling: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Evaluation on physical

        platforms/simulators: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Significance of the contribution:

        6 (N/A (not a Robotics track paper))

        <br>

        <br>

                ----------- Review -----------

        <br>

                The authors introduce a reinforcement learning algorithm

        for AMDPs that learns a hierarchical structure and a set of

        hierarchical models. To learn the hierarchical structure, they

        rely on an existing algorithm called HierGen. This algorithm

        extracts causal structure from a set of expert trajectories in a

        factored state environment.

        <br>

        <br>

                While R-AMDP outperforms R-MAXQ on the two toy problems,

        I think there is a lot more work to do to show that R-AMDP is a

        good basis for developing more general algorithms. First, it

        would be nice to examine the computational complexity of R-AMDP

        (rather than just empirical comparison in Figure 3). Second,

        what if R-AMDP is just getting lucky in the two toy tasks

        presented. Maybe there are other problems where R-AMDP performs

        poorly. Further, stopping the plots at 50 or 60 trials may just

        be misleading since R-AMDP could be converging to a suboptimal

        but pretty good policy early on. It’s also not clear that R-AMDP

        can be scaled to huge state or action spaces. Does the

        hierarchical structure discovered by HierGen lend itself to

        transfer when the dynamics change? It would be nice to have a

        more rigorous analysis of R-AMDP and a longer discussion of its

        potential pitfalls (when should we expect it to succeed and

        when should it fail?). There is a hint of this in the discussion

        about HierGen’s inability to distinguish between

        correlation and causation.

        <br>

        <br>

                While reading the abstract I expected the contribution

        to be in learning the hierarchy. The authors should probably

        change the abstract to avoid this confusion.

        <br>

        <br>

                ----------------------- REVIEW 3 ---------------------

        <br>

                PAPER: 46

        <br>

                TITLE: Learning Abstracted Models and Hierarchies of

        Markov Decision Processes

        <br>

                AUTHORS: Matthew Landen, John Winder, Shawn Squire,

        Stephanie Milani, Shane Parr and Marie desJardins

        <br>

        <br>

                Significance: 3 (substantial contribution or strong

        impact)

        <br>

                Soundness: 3 (correct)

        <br>

                Scholarship: 3 (excellent coverage of related work)

        <br>

                Clarity: 3 (well written)

        <br>

                Reproducibility: 5 (code and domains (whichever apply)

        are already publicly available)

        <br>

                Overall evaluation: 3 (strong accept)

        <br>

                Reviewer's confidence: 4 (expert)

        <br>

                Suitable for a demo?: 3 (yes)

        <br>

                Nominate for Best Paper Award: 1 (no)

        <br>

                Nominate for Best Student Paper Award (if eligible): 1

        (no)

        <br>

                [Applications track ONLY]: Importance and novelty of the

        application: 6 (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY]: Importance of

        planning/scheduling technology to the solution of the problem: 5

        (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY] Maturity: 7 (N/A (not an

        Applications track paper))

        <br>

                [Robotics track ONLY]: Balance of Robotics and Automated

        Planning and Scheduling: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Evaluation on physical

        platforms/simulators: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Significance of the contribution:

        6 (N/A (not a Robotics track paper))

        <br>

        <br>

                ----------- Review -----------

        <br>

                This is only a placeholder review. Please ignore it.

        <br>

        <br>

                ----------------------- REVIEW 4 ---------------------

        <br>

                PAPER: 46

        <br>

                TITLE: Learning Abstracted Models and Hierarchies of

        Markov Decision Processes

        <br>

                AUTHORS: Matthew Landen, John Winder, Shawn Squire,

        Stephanie Milani, Shane Parr and Marie desJardins

        <br>

        <br>

                Significance: 2 (modest contribution or average impact)

        <br>

                Soundness: 2 (minor inconsistencies or small fixable

        errors)

        <br>

                Scholarship: 3 (excellent coverage of related work)

        <br>

                Clarity: 1 (hard to follow)

        <br>

                Reproducibility: 2 (some details missing but still

        appears to be replicable with some effort)

        <br>

                Overall evaluation: -1 (weak reject)

        <br>

                Reviewer's confidence: 3 (high)

        <br>

                Suitable for a demo?: 2 (maybe)

        <br>

                Nominate for Best Paper Award: 1 (no)

        <br>

                Nominate for Best Student Paper Award (if eligible): 1

        (no)

        <br>

                [Applications track ONLY]: Importance and novelty of the

        application: 6 (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY]: Importance of

        planning/scheduling technology to the solution of the problem: 5

        (N/A (not an Applications track paper))

        <br>

                [Applications track ONLY] Maturity: 7 (N/A (not an

        Applications track paper))

        <br>

                [Robotics track ONLY]: Balance of Robotics and Automated

        Planning and Scheduling: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Evaluation on physical

        platforms/simulators: 6 (N/A (not a Robotics track paper))

        <br>

                [Robotics Track ONLY]: Significance of the contribution:

        6 (N/A (not a Robotics track paper))

        <br>

        <br>

                ----------- Review -----------

        <br>

                The paper describes an approach for learning abstract

        models and hierarchies for hierarchies of AMDPs. These

        hierarchies are similar, if not exactly the same, as those used

        by frameworks such as MAXQ, where each task in the hierarchy is

        an MDP with actions corresponding to child tasks. Prior AMDP

        work apparently uses hand-specified models of each task/AMDP,

        which are directly used for planning. This paper extends that

        work by learning the models of each task/AMDP. This is done

        using RMAX at each task. There is not a discussion of

        convergence guarantees of the approach. Apparently convergence

        must occur in a bottom-up way. Experiments are shown in two

        domains and with two hierarchies in one of the domains (Taxi).

        The approach appears to learn more efficiently than a prior

        approach R-MAXQ. The exact reasons for the increased efficiency

        were not exactly clear based on my understanding from the paper.

        <br>

        <br>

                The paper is well-written at a high level, but the more

        technical and formal descriptions could be improved quite a bit.

        For example, the key object AMDP, is only described informally

        (the tuple is not described in detail). Most of the paper is

        written quite informally.  Another example is that Table 1 talks

        about "max planner rollouts", but I didn't see where rollouts

        are used anywhere in the algorithm description.

        <br>

        <br>

                After reading the abstract and introduction, I expected

        that a big part of the contribution would be about actually

        learning the hierarchy. However, that does not seem to be the

        case. Rather, an off-the-shelf approach is used to learn

        hierarchies and then plugged into the proposed algorithm for

        learning the models of tasks. Further, this is only tried for

        one of the two experimental domains. The abstract and

        introduction should be more clear about the contributions of the

        paper.

        <br>

        <br>

                Overall, I was unclear about what to learn from the

        paper. The main contribution is apparently algorithm 1, which

        uses R-MAX to learn the models of each AMDP in a given

        hierarchy. Perhaps this is a novel algorithm, but it feels like

        more of a baseline in the sense that it is the first thing that

        one might try given the problem setup. I may not be appreciating

        some type of complexity that makes this not be straightforward.

        This baseline approach would have been more interesting if some

        form of convergence result was provided, similar to what was

        provided for R-MAXQ.

        <br>

        <br>

        <br>

                The experiments show that R-AMDP learns faster and is

        more computationally efficient than R-MAXQ. I was unable to get

        a good understanding for why this was the case. This is likely

        due to the fact that I was not able to revisit the R-MAXQ

        algorithm and it was not described in detail in this paper. The

        authors do try to explain the reasons for the performance

        improvement, but I was unable to follow exactly. My best guess

        based on the discussion is that R-MAXQ does not try to exploit

        the state abstraction provided for each task by the hierarchy

        ("R-MAXQ must compute a model over all possible future states in

        a planning envelope after each action"). Is this the primary

        reason or is there some other reason? Adding the ability to

        exploit abstractions in R-MAXQ seems straightforward, though

        maybe I'm missing something.

        <br>

        <br>

                ------------------------------------------------------

        <br>

        <br>

                Best wishes,

        <br>

                Gabi Röger and Sven Koenig

        <br>

                ICAPS 2018 program chairs

        <br>

        <br>

        <br>

                _______________________________________________

        <br>

                Robot-learning mailing list

        <br>

                <a class="moz-txt-link-abbreviated" href="mailto:Robot-learning@cs.umbc.edu">Robot-learning@cs.umbc.edu</a>

        <br>

        <a class="moz-txt-link-freetext" href="https://lists.cs.umbc.edu/mailman/listinfo/robot-learning">https://lists.cs.umbc.edu/mailman/listinfo/robot-learning</a>

        <br>

        <br>

        <br>

        <br>


        <br>

        <br>

        <br>

        <br>

        <br>


        <br>

        <br>

      </blockquote>

      <br>


      <br>

    </blockquote>

    <br>

    <div class="moz-signature">-- <br>

      Dr. Marie desJardins

      <br>

      Associate Dean for Academic Affairs

      <br>

      College of Engineering and Information Technology

      <br>

      University of Maryland, Baltimore County

      <br>

      1000 Hilltop Circle

      <br>

      Baltimore MD 21250

      <br>

      <br>

      Email: <a class="moz-txt-link-abbreviated" href="mailto:mariedj@umbc.edu">mariedj@umbc.edu</a>

      <br>

      Voice: 410-455-3967

      <br>

      Fax: 410-455-3559</div>

  </body>

</html>