View on GitHub

ULITE-ws

Understanding LIterature references in academic full TExt at JCDL 2022

Welcome to ULITE workshop at JCDL 2022

JCDL logo

Introduction

Understanding literature references is crucial for the process of maintaining access to previous research as well as gaining new insights from the structure and content that published scientific papers build on.

With the growing number of publications and academic full texts, curating databases of literature manually becomes impossible. By automatic extraction and parsing of references we release the research community from this burden and create different opportunities for deeper analysis of literature. One aspect of this is understanding of publication history: how authors influence each other, which institutions are involved in certain fields and topics and so on. One way to achieve this is to build a reference graph using the citation information inside the papers. If such a system exists, one can easily access information about further work of a particular author on the topic as well as related work, location of relevant datasets and which other projects the institution is pursuing.

It is worth mentioning that while in the field of Computer Science and some other domains multiple initiatives and infrastructures for automatic processing of scientific literature exist (Semantic Scholar, Google Scholar, AMiner, CiteSeerX, etc.), for many other fields the situation is drastically different. Social sciences, law, and history (to name a few) scholars often have to rely on fragmented data sources of full-text documents to navigate their fields.

Call for papers

Motivation

The workshop on Understanding Literature References in Academic Full Text (ULITE), co-located with JCDL 2022, invites submissions on the topics of processing of literature references.

Topics

We invite submissions on the following and related topics (but are not limited to):

Important dates

All deadlines are 11:59 pm UTC -12h (“anywhere on Earth”).

Submission

Regular papers: All submissions must be written in English, following the CEUR Proceedings template (10 pages for full papers and 4 pages for short papers exclusive of unlimited pages for references) and should be submitted as PDF files to EasyChair (see below).

Poster & demonstration: We welcome submissions detailing original, early findings, works in progress and industrial applications of knowledge entities extraction ande evaluation for a special poster session, possibly with a 2-minute presentation in the main session. Some research track papers will also be invited to the poster track instead, although there will be no difference in the final proceedings between poster and research track submissions. These papers should follow the same format as the research track papers but can be shorter (2 pages for poster and demo papers).

CEURART (incl. LaTeX and Word templates) https://ceurws.wordpress.com/2020/03/31/ceurws-publishes-ceurart-paper-style/

Submit via EasyChair: https://easychair.org/conferences/?conf=ulite2022

Submit a paper

Peer Review Progress and Workshop Format

Our peer review process will be supported by Easychair. Each submission is assigned to 2 to 3 reviewers, preferably at least one expert in Bibliometrics and one expert in NLP or Information Extraction. The programme committee will consist of peer reviewers from all participating communities. Accepted papers are either long papers (15-minute talks) or short papers (5-minute talks).

The workshop is planned to last half a day and will include a keynote, invited papers, paper presentations and a closing panel discussion on the needs of the different stakeholders involved in the field of reference understanding. The expected number of participants is approx. 40 people and we plan the event as an on-site event. If the pandemic situation still holds we will switch to an online format (Nunes et al., 2020). A typical seminar-style room with a projector for 30-50 delegates would be required. A/V capacities would be desirable to potentially invite speakers to give an online presentation. For the on-site poster session, we would require a small number of poster boards. The workshop would follow the general JCDL format (this year: online).

Objectives and Target Audience

The goal of this workshop is to engage communities interested in the broad topic of literature reference understanding and automatic processing of scientific fulltext publications. Our workshop has a focus on working with open infrastructures/tools and offering the extracted information as open data for reuse. Our view is to expose people from one community to the work of the respective other community and to foster fruitful interaction across communities.

The target audience of the workshop are researchers and practitioners, junior and senior, from Natural Language Processing (NLP) and Information Extraction as well as Information Retrieval and Bibliometrics/Scientometrics. These could be IR/NLP researchers interested in potential new application areas for their work as well as researchers and practitioners working with bibliometric data and interested in how IR/NLP methods can make use of such data.

We particularly welcome submissions (research paper and technical contributions like) that address the topics for underrepresented domains such as, references in languages other than English and from fields other than Computer Science.

Outcome

Some of the expected outcomes of the workshop include:

Keynotes and invited speakers

  1. Silvio Peroni (accepted). Silvio Peroni holds a Ph.D. degree in Computer Science and he is an Associate Professor at the Department of Classical Philology and Italian Studies, University of Bologna. He is an expert in document markup and semantic descriptions of bibliographic entities using Semantic Web technologies. In his keynote he will talk about his project Open Citations

Contact

All questions about submissions should be emailed to ulite2022@easychair.org.

Biographies of the Organizers

Anastasiia Iurshina is a PhD student in the Analytic Computing group, University of Stuttgart. She is involved in the OUTCITE project. Her interests include processing of literature references as well as more general natural processing tasks such as entity linking.

Muhammad Ahsan Shahid is a software developer at the GESIS - Leibniz-Institute for the Social Sciences department Knowledge Technologies for the Social Sciences (KTS). He received his Master’s degree in Computer Science from the Technische Universität Berlin. His interests include development, operating data and NLP. He is also a member of the enthusiastic team responsible for OUTCITE.

Tobias Backes is a PhD student at GESIS - Leibniz-Institute for the Social Sciences department Knowledge Technologies for the Social Sciences (KTS). He received his Master’s degree in Language Science and Technology from Saarland University. His interests include entity resolution problems such as author disambiguation, institution resolution and duplicate detection. He is also a software contributor in OUTCITE.

Philipp Mayr is team leader at the GESIS - Leibniz-Institute for the Social Sciences department Knowledge Technologies for the Social Sciences (WTS). He received his PhD in applied informetrics and information retrieval from the Berlin School of Library and Information Science at Humboldt University Berlin. His research group focuses on methods and techniques for interactive information retrieval and data set search. He was the main organizer of the BIR workshops at ECIR 2014-2020 and is the co-PI of the EXCITE and OUTCITE projects.

Steffen Staab is professor for Analytic Computing at University of Stuttgart and professor for Web and Computer Science at University of Southampton. Steffen is a fellow of the European Association for Artificial Intelligence. His research interests lie at the intersection of knowledge graphs and machine learning with scientific knowledge graphs being one of the points where these interests meet.

References