<div dir="ltr">
[Apologies for multiple postings]<br>-------------------------------------------------------------------------<br>Authorship Identification of SOurce COde (AI-SOCO)<br>Website: <a href="https://sites.google.com/view/ai-soco-2020/" target="_blank">https://sites.google.com/view/ai-soco-2020/</a><br><br>To be organized at FIRE 2020 (<a href="http://fire.irsi.res.in/fire/2020/home" target="_blank">http://fire.irsi.res.in/fire/2020/home</a>)<br>10 - 13 December<div>Virtual Conference <br>-------------------------------------------------------------------------------<br><br>--------------------------<br>Task Description:<br>--------------------------<br><br>General
authorship identification is essential for detecting deceptive misuse of
others' content and for exposing the owners of anonymous harmful
content. This is done by revealing the author of that
content. Authorship Identification of SOurce COde (AI-SOCO) focuses on
uncovering the author of a given piece of code. This helps address
issues related to cheating in academic, work, and open source
environments. It can also help identify the authors of
malware around the world.<br><br>The dataset is composed of
source codes collected from the open submissions in the Codeforces
online judge. Codeforces is a platform that hosts competitive
programming contests, where each contest consists of multiple
problems to be solved by the participants. A Codeforces participant can
solve a problem by writing a solution for it using any of the available
programming languages on the website, and then submitting the solution
through the website. The solution's result can be correct (accepted) or
incorrect (wrong answer, time limit exceeded, etc.).<br><br>In our
dataset, we selected 1000 users and collected 100 source codes from each
one. So, the total number of source codes is 100,000. All collected
source codes are correct and written using the C++ programming language.
For each user, all collected source codes are from unique problems.<br><br>Given
a set of source codes labeled with their writers, task
participants should build systems that can identify the writer
of any new, previously unseen source code from the predefined
writers list.<br><br>The full task description can be found at: <a href="https://sites.google.com/view/ai-soco-2020/" target="_blank">https://sites.google.com/view/ai-soco-2020/</a><br><br><br>------------<br>Timeline<br>------------<br><br>8th June - Open track websites<br>8th June - Training and development data release<br>31st July - Test data release<br>7th September - Run submission deadline<br>20th September - Results declared<br>31st October - Working notes and overview papers due (tentative)<br>10th-13th December - FIRE 2020<br><br><br>----------------<br>Organizers<br>----------------<br><br>Ali Fadel, Jordan University of Science and Technology, Jordan<br>Husam Musleh, Jordan University of Science and Technology, Jordan<br>Ibraheem Tuffaha, Jordan University of Science and Technology, Jordan<br>Mahmoud Al-Ayyoub, Jordan University of Science and Technology, Jordan<br>Yaser Jararweh, Duquesne University, USA<br>Elhadj Benkhelifa, Staffordshire University, UK<br>Paolo Rosso, Universitat Politècnica de València, Spain<br><br>For regular updates, subscribe to our mailing list: <a href="mailto:ai-soco-fire@googlegroups.com" target="_blank">ai-soco-fire@googlegroups.com</a><br><br>Regards,<br>Organizers of the Authorship Identification of SOurce COde (AI-SOCO) Task</div>
</div>