[Robot-learning] Collaboration with Brown

Marie desJardins mariedj at umbc.edu
Tue Aug 21 19:23:24 EDT 2018


Yes, that's the basic gist. When I talked to Reid at the PI meeting last
year, he expressed concern about coordination and collaboration. I think
the main thing is to be able to articulate, in both reports, what UMBC is
doing, what Brown is doing, and what we're collaborating on - and how that
collective set of activities is providing a coherent overall project. The
"collaborative project" should be more than just what we're doing, plus
what you're doing.

Marie


On Tue, Aug 21, 2018, 5:21 PM Littman, Michael <mlittman at cs.brown.edu>
wrote:

> Not sure what you mean about "this collaboration issue". As far as I know,
> the description Dave sent is the main point of direct collaboration.
>
> Are you saying you think that we need to find more points of contact to
> soothe the PM?
>
> On Tue, Aug 21, 2018 at 10:57 AM Marie desJardins <mariedj at umbc.edu>
> wrote:
>
>> Thanks, David, this is really helpful.
>>
>> Michael/Stefanie, can you share your thoughts on this collaboration issue?
>>
>> Marie
>>
>> On 8/14/18 11:57 AM, Abel, David wrote:
>>
>>> Hi all,
>>>
>>> I received the following comment from Reid Simmons with a request to
>>> revise and resubmit the annual report:
>>>
>>> "Unclear whether all the reported work was done at UMBC or some was done
>>> at Brown. If some of the work was done by collaborators, please indicate
>>> this; if the work was done fully at UMBC, please indicate what types of
>>> collaboration were done in the past year (and what are expected in the
>>> coming year)."
>>>
>>> The submitted version is attached.  Can some combination of John,
>>> Michael, Stefanie, Nakul, and David provide some input about
>>> collaborations?  I do know that Reid has expressed some concern in the past
>>> about how/whether the two project sites are coordinating, so it would be
>>> good to emphasize the ways in which the work at the two sites is
>>> coordinated and complementary.
>>>
>>
>> John and I have been collaborating on a project together since around
>> March. I don't see the project described in the attached AMDP writeup, so
>> here's a brief description.
>>
>> At a high level, we're investigating whether we can improve how option
>> models are computed, both in terms of (1) learning options and their
>> models, and (2) using options to plan (as part of a hierarchy or on their
>> own). The main insight we're exploiting to improve over current option
>> models is that the option model shouldn't depend on the exact number of
>> lower-level actions taken in an execution of the option. Instead, we offer
>> a variant of options that retains a *rough estimate* of the number of
>> lower-level actions taken on a per-state basis. This value is most critical
>> in figuring out how much to discount future plans.
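>>
>> As a rough sketch of why the step count matters for discounting (the
>> symbols $P_\gamma$, $k$, and $\hat{k}$ are assumed here for illustration
>> and are not taken from the writeup): the usual multi-time option model
>> folds the discount into the transition model by summing over every
>> possible duration $k$,
>>
>> $$P_\gamma(s' \mid s, o) = \sum_{k=1}^{\infty} \gamma^{k} \Pr(s', k \mid s, o),$$
>>
>> whereas a variant that keeps only a rough duration estimate
>> $\hat{k}(s, o)$ might, under that assumption, take the form
>>
>> $$\hat{P}_\gamma(s' \mid s, o) = \gamma^{\hat{k}(s, o)} \Pr(s' \mid s, o).$$
>>
>> If $\hat{k}(s, o)$ is taken to be the expected duration and
>> $\gamma \in (0, 1)$, then summing both forms over $s'$ and applying
>> Jensen's inequality gives
>> $\gamma^{\hat{k}(s, o)} \le \mathbb{E}[\gamma^{k} \mid s, o]$, so this
>> hypothetical form can under-weight, but never over-weight, the total
>> discounted mass.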
>>
>> So far we've shown:
>>
>>    1. A sample-complexity bound for learning options under this new
>>    model. (How many samples $(s, o, s')$ are needed to determine
>>    *roughly* how many lower-level actions will be taken when $o$ is
>>    executed in $s$?)
>>    2. A bound on the value function when using the new, learned option
>>    model, compared to using the usual option models.
>>    3. John has conducted some really interesting experiments in a
>>    variety of Taxi instances that showcase the potential of the method. In
>>    short: we can learn faster, and with lower variance, if we use the new
>>    option model.
>>
>> We have several ongoing subtasks:
>>
>>    - Use the new option model to inform the option reward model, too.
>>    - Prove results similar to (1) and (2) above for the new option
>>    reward model.
>>    - Target option models with low variance.
>>
>> Our writeup is here
>> <https://www.sharelatex.com/project/5ab3e0446f167e439582055a>. Hope this
>> helps! Let me know if there is any other information that would be useful.
>>
>> Best,
>> -Dave
>>
>>
>>>
>>> Michael
>>>
>>> --
>>> Dr. Marie desJardins
>>> Associate Dean for Academic Affairs
>>> College of Engineering and Information Technology
>>> University of Maryland, Baltimore County
>>> 1000 Hilltop Circle
>>> Baltimore MD 21250
>>>
>>> Email: mariedj at umbc.edu
>>> Voice: 410-455-3967
>>> Fax: 410-455-3559
>>>
>>> _______________________________________________
>>> Robot-learning mailing list
>>> Robot-learning at cs.umbc.edu
>>> https://lists.cs.umbc.edu/mailman/listinfo/robot-learning