<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Aptos;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Aptos",sans-serif;
mso-ligatures:standardcontextual;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#467886;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Aptos",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:11.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">We are looking for a motivated PhD student to join the Division of Speech, Music and Hearing (TMH) at KTH Royal Institute of Technology in Stockholm.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This project aims to advance Human-Robot Interaction (HRI) by enhancing embodied AI, integrating multimodal social cues and task-related actions into foundation models to enable robots to communicate in a more natural and human-like manner.
It addresses the current limitations of Large Language Models, which cannot comprehend or generate essential social cues such as facial expressions, gestures, and gaze, or perform task-specific behaviors. The project focuses on three key
objectives: integrating multimodal perception into AI models, training these models to produce both verbal and non-verbal outputs, and developing new metrics to evaluate their performance in HRI scenarios.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The <a href="https://wasp-sweden.org/" target="_blank">Swedish AI program WASP</a> funds this project.
<a href="https://wasp-sweden.org/graduate-school/" target="_blank">WASP's graduate school</a> fosters a strong multi-disciplinary, international network among PhD students, researchers, and industry through research visits, partner universities, and visiting
lecturers.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The candidate must have a degree in Computer Science or a related field. Documented proficiency in written and spoken English, as well as programming skills, is required. Some experience with artificial intelligence, robotics, human-robot interaction, or multimodal
machine learning is preferred.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The student will start before mid-January of 2024, and the application deadline is August 31<sup>st</sup>. Application details are available through KTH’s dedicated recruitment system:
<a href="https://www.kth.se/lediga-jobb/739179?l=en">https://www.kth.se/lediga-jobb/739179?l=en</a><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Best regards,<br>
André Pereira<br>
Researcher @ KTH Royal Institute of Technology<br>
School of Electrical Engineering and Computer Science<br>
Division of Speech, Music and Hearing (TMH)<o:p></o:p></p>
</div>
</body>
</html>