Publications
Publications by category in reverse chronological order.
2026
- Membership Inference Attacks on Finetuned Diffusion Language Models. Yuetian Chen, Kaiyuan Zhang, Yuntao Du, and 5 more authors. 14th International Conference on Learning Representations, 2026.
Diffusion Language Models (DLMs) represent a promising alternative to autoregressive language models, using bidirectional masked token prediction. Yet their susceptibility to privacy leakage via Membership Inference Attacks (MIA) remains critically underexplored. This paper presents the first systematic investigation of MIA vulnerabilities in DLMs. Unlike the autoregressive models’ single fixed prediction pattern, DLMs’ multiple maskable configurations exponentially increase attack opportunities. This ability to probe many independent masks dramatically improves detection chances. To exploit this, we introduce SAMA (Subset-Aggregated Membership Attack), which addresses the sparse signal challenge through robust aggregation. SAMA samples masked subsets across progressive densities and applies sign-based statistics that remain effective despite heavy-tailed noise. Through inverse-weighted aggregation prioritizing sparse masks’ cleaner signals, SAMA transforms sparse memorization detection into a robust voting mechanism. Experiments on nine datasets show SAMA achieves 30% relative AUC improvement over the best baseline, with up to 8x improvement at low false positive rates. These findings reveal significant, previously unknown vulnerabilities in DLMs, necessitating the development of tailored privacy defenses.
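To make the aggregation idea concrete, the sketch below illustrates subset-aggregated sign voting in the spirit of SAMA. It is a minimal illustration, not the authors' implementation: the `masked_loss` callable, the mask densities, and the inverse-density weighting are all assumptions made for exposition.

```python
"""Minimal sketch of a subset-aggregated membership score (SAMA-style).

All function names and the inverse-density weighting are illustrative
assumptions; the paper's actual scoring and aggregation may differ.
"""
import numpy as np

def sama_score(masked_loss, tokens, densities=(0.1, 0.2, 0.4), n_subsets=32, seed=0):
    """Aggregate sign votes from many random masked subsets.

    masked_loss(tokens, mask) -> (target_loss, reference_loss) on the tokens
    hidden by `mask`; a lower target loss than reference loss is weak
    evidence of membership for that subset.
    """
    rng = np.random.default_rng(seed)
    n = len(tokens)
    score = 0.0
    for d in densities:
        weight = 1.0 / d                    # sparser masks get larger weight
        k = max(1, int(round(d * n)))
        for _ in range(n_subsets):
            mask = np.zeros(n, dtype=bool)
            mask[rng.choice(n, size=k, replace=False)] = True
            tgt, ref = masked_loss(tokens, mask)
            score += weight * np.sign(ref - tgt)   # sign statistic: robust to heavy tails
    return score  # larger -> more member-like

if __name__ == "__main__":
    # Toy usage with a fake scoring function standing in for the target/reference DLMs.
    rng = np.random.default_rng(1)
    fake = lambda toks, mask: (rng.normal(1.0, 0.5), rng.normal(1.2, 0.5))
    print(sama_score(fake, list(range(50))))
```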
- Window-based Membership Inference Attacks Against Fine-tuned Large Language Models. Yuetian Chen, Yuntao Du, Kaiyuan Zhang, and 4 more authors. 35th USENIX Security Symposium, 2026.
Most membership inference attacks (MIAs) against Large Language Models (LLMs) rely on global signals, like average loss, to identify training data. This approach, however, dilutes the subtle, localized signals of memorization, reducing attack effectiveness. We challenge this global-averaging paradigm, positing that membership signals are more pronounced within localized contexts. We introduce WBC (Window-Based Comparison), which exploits this insight through a sliding window approach with sign-based aggregation. Our method slides windows of varying sizes across text sequences, with each window casting a binary vote on membership based on loss comparisons between target and reference models. By ensembling votes across geometrically spaced window sizes, we capture memorization patterns from token-level artifacts to phrase-level structures. Extensive experiments across eleven datasets demonstrate that WBC substantially outperforms established baselines, achieving higher AUC scores and 2-3 times improvements in detection rates at low false positive thresholds. Our findings reveal that aggregating localized evidence is fundamentally more effective than global averaging, exposing critical privacy vulnerabilities in fine-tuned LLMs.
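The sketch below illustrates the window-based sign voting described above. It is a simplified stand-in, not the paper's configuration: the window sizes, the per-window vote rule, and the equal weighting across scales are assumptions.

```python
"""Minimal sketch of sliding-window sign voting (WBC-style)."""
import numpy as np

def wbc_score(target_token_losses, reference_token_losses, window_sizes=(4, 8, 16, 32)):
    """Slide windows over per-token losses; each window casts a binary vote."""
    t = np.asarray(target_token_losses, dtype=float)
    r = np.asarray(reference_token_losses, dtype=float)
    votes = []
    for w in window_sizes:                 # geometrically spaced window sizes
        if len(t) < w:
            continue
        for i in range(len(t) - w + 1):
            # Member-like if the target model beats the reference on this local window.
            votes.append(1.0 if t[i:i + w].mean() < r[i:i + w].mean() else -1.0)
    return float(np.mean(votes)) if votes else 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tgt = rng.normal(1.0, 0.3, size=128)   # stand-in per-token losses
    ref = rng.normal(1.1, 0.3, size=128)
    print(wbc_score(tgt, ref))
```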
- Imitative Membership Inference Attack. Yuntao Du, Yuetian Chen, Hanshen Xiao, and 2 more authors. 35th USENIX Security Symposium, 2026.
A Membership Inference Attack (MIA) assesses how much a target machine learning model reveals about its training data by determining whether specific query instances were part of the training set. State-of-the-art MIAs rely on training hundreds of shadow models that are independent of the target model, leading to significant computational overhead. In this paper, we introduce Imitative Membership Inference Attack (IMIA), which employs a novel imitative training technique to strategically construct a small number of target-informed imitative models that closely replicate the target model’s behavior for inference. Extensive experimental results demonstrate that IMIA substantially outperforms existing MIAs in various attack settings while only requiring less than 5% of the computational cost of state-of-the-art approaches.
- Cascading and Proxy Membership Inference Attack. Yuntao Du, Jiacheng Li, Yuetian Chen, and 5 more authors. In Network and Distributed System Security Symposium (NDSS), 2026.
A Membership Inference Attack (MIA) assesses how much a trained machine learning model reveals about its training data by determining whether specific query instances were included in the dataset. We classify existing MIAs into adaptive or non-adaptive, depending on whether the adversary is allowed to train shadow models on membership queries. In the adaptive setting, where the adversary can train shadow models after accessing query instances, we highlight the importance of exploiting membership dependencies between instances and propose an attack-agnostic framework called Cascading Membership Inference Attack (CMIA), which incorporates membership dependencies via conditional shadow training to boost membership inference performance. In the non-adaptive setting, where the adversary is restricted to training shadow models before obtaining membership queries, we introduce Proxy Membership Inference Attack (PMIA). PMIA employs a proxy selection strategy that identifies samples with similar behaviors to the query instance and uses their behaviors in shadow models to perform a membership posterior odds test for membership inference. We provide theoretical analyses for both attacks, and extensive experimental results demonstrate that CMIA and PMIA substantially outperform existing MIAs in both settings, particularly in the low false-positive regime, which is crucial for evaluating privacy risks.
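The following sketch gives one plausible reading of the non-adaptive PMIA idea: select proxies whose behavior resembles the query, then compare the query's target-model score against the proxies' in- and out-of-training score distributions from pre-trained shadow models. Proxy selection by feature distance and the Gaussian likelihoods are simplifications; the paper's actual proxy criterion and test may differ.

```python
"""Illustrative proxy-based posterior odds test (PMIA-style, simplified)."""
import numpy as np
from scipy.stats import norm

def pmia_odds(query_score, query_feat, proxy_feats, proxy_in_scores, proxy_out_scores, k=20):
    """Posterior odds that the query was a training member.

    proxy_in_scores[i] / proxy_out_scores[i]: the i-th candidate proxy's scores in
    shadow models that did / did not train on it (precomputed before queries arrive).
    """
    # 1) Pick the k proxies behaviorally closest to the query instance.
    dists = np.linalg.norm(proxy_feats - query_feat, axis=1)
    idx = np.argsort(dists)[:k]
    in_scores = np.concatenate([proxy_in_scores[i] for i in idx])
    out_scores = np.concatenate([proxy_out_scores[i] for i in idx])
    # 2) Likelihood ratio of the query's target-model score under the two hypotheses.
    p_in = norm.pdf(query_score, in_scores.mean(), in_scores.std() + 1e-8)
    p_out = norm.pdf(query_score, out_scores.mean(), out_scores.std() + 1e-8)
    return p_in / (p_out + 1e-12)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(100, 8))
    ins = [rng.normal(2.0, 0.5, 16) for _ in range(100)]
    outs = [rng.normal(1.0, 0.5, 16) for _ in range(100)]
    print(pmia_odds(1.9, feats[0] + 0.01, feats, ins, outs))
```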
2025
- SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks. Kaiyuan Zhang, Siyuan Cheng, Hanxi Guo, and 8 more authors. 34th USENIX Security Symposium, 2025.
Large language models (LLMs) have achieved remarkable success and are widely adopted for diverse applications. However, fine-tuning these models often involves private or sensitive information, raising critical privacy concerns. In this work, we conduct the first comprehensive study evaluating the vulnerability of fine-tuned LLMs to membership inference attacks (MIAs). Our empirical analysis demonstrates that MIAs exploit the loss reduction during fine-tuning, making them highly effective in revealing membership information. These findings motivate the development of our defense. We propose SOFT (Selective data Obfuscation in LLM Fine-Tuning), a novel defense technique that mitigates privacy leakage by leveraging influential data selection with an adjustable parameter to balance utility preservation and privacy protection. Our extensive experiments span six diverse domains and multiple LLM architectures and scales. Results show that SOFT effectively reduces privacy risks while maintaining competitive model performance, offering a practical and scalable solution to safeguard sensitive information in fine-tuned LLMs.
- Membership Inference Attacks as Privacy Tools: Reliability, Disparity and Ensemble. Zhiqi Wang, Chengyu Zhang, Yuetian Chen, and 3 more authors. ACM Conference on Computer and Communications Security, 2025.
Membership inference attacks (MIAs) pose a significant threat to the privacy of machine learning models and are widely used as tools for privacy assessment, auditing, and machine unlearning. While prior MIA research has primarily focused on performance metrics such as AUC, accuracy, and TPR at low FPR, either by developing new methods to enhance these metrics or by using them to evaluate privacy solutions, we find that this focus overlooks the disparities among different attacks. These disparities, both between distinct attack methods and between multiple instantiations of the same method, have crucial implications for the reliability and completeness of MIAs as privacy evaluation tools. In this paper, we systematically investigate these disparities through a novel framework based on coverage and stability analysis. Extensive experiments reveal significant disparities in MIAs, their potential causes, and their broader implications for privacy evaluation. To address these challenges, we propose an ensemble framework with three distinct strategies to harness the strengths of state-of-the-art MIAs while accounting for their disparities. This framework not only enables the construction of more powerful attacks but also provides a more robust and comprehensive methodology for privacy evaluation.
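As a rough illustration of the kind of coverage and stability analysis described above, the sketch below measures which samples any attack run flags at a fixed FPR and how consistently they are flagged, plus a plain majority-vote ensemble. These metrics and the vote rule are generic stand-ins, not the paper's exact definitions or its three ensemble strategies.

```python
"""Illustrative coverage / stability measurement and a simple vote ensemble."""
import numpy as np

def flags_at_fpr(member_scores, nonmember_scores, fpr=0.01):
    """Binary member predictions at a threshold calibrated to a target FPR."""
    thr = np.quantile(nonmember_scores, 1.0 - fpr)
    return member_scores > thr

def coverage(flag_sets):
    """Fraction of members flagged by at least one attack instantiation."""
    return np.mean(np.any(np.stack(flag_sets), axis=0))

def stability(flag_sets):
    """Per-sample agreement rate across instantiations of the same attack."""
    return np.stack(flag_sets).mean(axis=0)

def majority_vote(flag_sets):
    """One simple ensemble: flag a sample if most attack runs flag it."""
    return np.stack(flag_sets).mean(axis=0) > 0.5

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nonmem = rng.normal(0, 1, 10_000)
    runs = [flags_at_fpr(rng.normal(0.5, 1, 1_000), nonmem) for _ in range(5)]
    print(coverage(runs), stability(runs)[:5], majority_vote(runs)[:5])
```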
- Evaluating the Dynamics of Membership Privacy in Deep Learning. Yuetian Chen, Zhiqi Wang, Nathalie Baracaldo, and 2 more authors. arXiv preprint arXiv:2507.23291, 2025.
Membership inference attacks (MIAs) pose a critical threat to the privacy of training data in deep learning. Despite significant progress in attack methodologies, our understanding of when and how models encode membership information during training remains limited. This paper presents a dynamic analytical framework for dissecting and quantifying privacy leakage dynamics at the individual sample level. By tracking per-sample vulnerabilities on an FPR-TPR plane throughout training, our framework systematically measures how factors such as dataset complexity, model architecture, and optimizer choice influence the rate and severity at which samples become vulnerable. Crucially, we discover a robust correlation between a sample’s intrinsic learning difficulty and its privacy risk, and find that the privacy risk of samples highly vulnerable in the final trained model is largely determined early during training. Our results thus provide a deeper understanding of how privacy risks dynamically emerge during training, laying the groundwork for proactive, privacy-aware model training strategies.
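The sketch below shows one plausible way to place each member sample on the FPR-TPR plane at every training checkpoint, using a simple loss-threshold attack. It is a hypothetical instantiation of the tracking idea, not the paper's exact framework.

```python
"""Sketch: tracking per-sample membership vulnerability across checkpoints."""
import numpy as np

def per_sample_fpr_tpr(member_scores, nonmember_scores):
    """For each member, the (FPR, TPR) of a threshold attack set at its own score."""
    member_scores = np.asarray(member_scores)
    nonmember_scores = np.asarray(nonmember_scores)
    pts = []
    for s in member_scores:
        fpr = np.mean(nonmember_scores >= s)
        tpr = np.mean(member_scores >= s)
        pts.append((fpr, tpr))
    return np.array(pts)

def track_dynamics(checkpoint_losses_members, checkpoint_losses_nonmembers):
    """One (FPR, TPR) trajectory per member sample over training checkpoints."""
    # Higher score = more member-like; use negative loss as the attack score.
    return [per_sample_fpr_tpr(-m, -n)
            for m, n in zip(checkpoint_losses_members, checkpoint_losses_nonmembers)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fake loss trajectories: member losses shrink over 3 checkpoints, nonmembers do not.
    mems = [rng.normal(1.0 - 0.2 * t, 0.3, 200) for t in range(3)]
    nons = [rng.normal(1.0, 0.3, 200) for _ in range(3)]
    for t, pts in enumerate(track_dynamics(mems, nons)):
        print(f"checkpoint {t}: mean per-sample FPR at detection = {pts[:, 0].mean():.3f}")
```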
2024
- Reflections & Resonance: Two-Agent Partnership for Advancing LLM-based Story Annotation. Yuetian Chen and Mei Si. In Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024.
We introduce a novel multi-agent system for automating story annotation through the generation of tailored prompts for a large language model (LLM). This system utilizes two agents: Agent A is responsible for generating prompts that identify the key information necessary for reconstructing the story, while Agent B reconstructs the story from these annotations and provides feedback to refine the initial prompts. Human evaluations and perplexity scores revealed that optimized prompts significantly enhance the model’s narrative reconstruction accuracy and confidence, demonstrating that dynamic interaction between agents substantially boosts the annotation process’s precision and efficiency. Utilizing this innovative approach, we created the “StorySense” corpus, containing 615 stories, meticulously annotated to facilitate comprehensive story analysis. The paper also demonstrates the practical application of our annotated dataset by drawing the story arcs of two distinct stories, showcasing the utility of the annotated information in story structure analysis and understanding.
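A rough sketch of the two-agent refinement loop is given below. The `llm` argument is any text-completion callable; the prompt wording, the number of rounds, and the similarity-based stopping rule are illustrative assumptions rather than the paper's exact protocol.

```python
"""Sketch of a two-agent prompt-refinement loop for story annotation."""
from difflib import SequenceMatcher

def refine_annotation_prompt(llm, story, rounds=3):
    prompt = "List the key information needed to reconstruct this story."
    for _ in range(rounds):
        # Agent A: annotate the story using the current prompt.
        annotation = llm(f"{prompt}\n\nStory:\n{story}")
        # Agent B: reconstruct the story from the annotation alone.
        reconstruction = llm(f"Rewrite the story using only these notes:\n{annotation}")
        # Agent B's feedback drives the next prompt revision.
        feedback = llm(
            "Compare the reconstruction to the original story and suggest how to "
            f"improve the annotation prompt.\nOriginal:\n{story}\n"
            f"Reconstruction:\n{reconstruction}\nCurrent prompt:\n{prompt}"
        )
        prompt = llm(f"Revise the prompt based on this feedback:\n{feedback}")
        if SequenceMatcher(None, story, reconstruction).ratio() > 0.9:
            break  # reconstruction is close enough; stop refining
    return prompt

if __name__ == "__main__":
    echo = lambda p: p[-200:]  # stub LLM so the sketch runs without an API key
    print(refine_annotation_prompt(echo, "Once upon a time ...", rounds=1)[:80])
```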
2023
- Enhancing Sentiment Analysis Results through Outlier Detection Optimization. Yuetian Chen and Mei Si. arXiv preprint arXiv:2311.16185, 2023.
When dealing with text data containing subjective labels like speaker emotions, inaccuracies or discrepancies among labelers are not uncommon. Such discrepancies can significantly affect the performance of machine learning algorithms. This study investigates the potential of identifying and addressing outliers in text data with subjective labels, aiming to enhance classification outcomes. We utilized the Deep SVDD algorithm, a one-class classification method, to detect outliers in nine text-based emotion and sentiment analysis datasets. By employing both a small-sized language model (DistilBERT base model with 66 million parameters) and non-deep-learning machine learning algorithms (decision tree, KNN, logistic regression, and LDA) as classifiers, our findings suggest that the removal of outliers can lead to enhanced results in most cases. Additionally, as outliers in such datasets are not necessarily unlearnable, we experimented with a larger language model, DeBERTa v3 large with 131 million parameters, which can capture very complex patterns in data, and continued to observe performance enhancements across multiple datasets.
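The sketch below illustrates the overall recipe of removing detected outliers before fitting a classifier. It uses `OneClassSVM` and TF-IDF plus logistic regression purely as stand-ins for the paper's Deep SVDD detector and DistilBERT/DeBERTa classifiers, and the per-class detection step and `nu` value are assumptions.

```python
"""Sketch: drop detected outliers per class before training a sentiment classifier."""
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import OneClassSVM

def train_without_outliers(texts, labels, nu=0.2):
    vec = TfidfVectorizer(max_features=5000)
    X = vec.fit_transform(texts)
    labels = np.asarray(labels)
    keep = np.ones(len(labels), dtype=bool)
    for c in np.unique(labels):                        # one-class detector per label
        idx = np.where(labels == c)[0]
        pred = OneClassSVM(nu=nu).fit_predict(X[idx])  # -1 marks likely outliers
        keep[idx[pred == -1]] = False
    kept_idx = np.flatnonzero(keep)
    clf = LogisticRegression(max_iter=1000).fit(X[kept_idx], labels[kept_idx])
    return vec, clf, keep

if __name__ == "__main__":
    texts = ["i love this", "great movie", "wonderful film", "so good", "love it",
             "terrible film", "awful plot", "hated it", "very bad", "worst ever"]
    labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    _, clf, keep = train_without_outliers(texts, labels)
    print("kept:", keep)
```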
- Prompt to GPT-3: Step-by-Step Thinking Instructions for Humor Generation. Yuetian Chen, Bowen Shi, and Mei Si. 14th International Conference on Computational Creativity, 2023.
Artificial intelligence has made significant progress in natural language processing, with models like GPT-3 demonstrating impressive capabilities. However, these models still have limitations when it comes to complex tasks that require an understanding of the user, such as mastering human comedy writing strategies. This paper explores humor generation using GPT-3 by modeling human comedy writing theory and leveraging step-by-step thinking instructions. In addition, we explore the role of cognitive distance in creating humor.
- Automated Visual Story Synthesis with Character Trait Control. Yuetian Chen, Bowen Shi, Peiru Liu, and 2 more authors. Artificial Intelligence and Social Computing, 2023.
Visual storytelling is an art form that has been utilized for centuries to communicate stories, convey messages, and evoke emotions. The images and text must be used in harmony to create a compelling narrative experience. With the rise of text-to-image generation models such as Stable Diffusion, it is becoming more promising to investigate methods of automatically creating illustrations for stories. However, these diffusion models are usually developed to generate a single image, resulting in a lack of consistency between figures and objects across different illustrations of the same story, which is especially important in stories with human characters. This work introduces a novel technique for creating consistent human figures in visual stories. This is achieved in two steps. The first step is to collect human portraits with various identifying characteristics, such as gender and age, that describe the character. The second step is to use this collection to train DreamBooth to generate a unique token ID for each type of character. These IDs can then be used to replace the names of the story characters in the image-generation process. By combining these two steps, we can create controlled human figures for various visual storytelling contexts.
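The sketch below shows the second step of that recipe: swapping character names for their learned identifier tokens before calling a diffusion pipeline. The token map, the checkpoint path, and the prompt are placeholders; only the idea of reusing a per-character token across scenes mirrors the paper, and a DreamBooth-fine-tuned checkpoint is assumed to exist already.

```python
"""Sketch: substituting character names with DreamBooth identifier tokens."""
import torch
from diffusers import StableDiffusionPipeline  # assumes a DreamBooth-tuned checkpoint

# Hypothetical identifier tokens learned via DreamBooth for each character type.
CHARACTER_TOKENS = {"Alice": "sks young woman", "Bob": "zwx old man"}

def illustrate(sentence, pipe):
    prompt = sentence
    for name, token in CHARACTER_TOKENS.items():
        prompt = prompt.replace(name, token)   # same token across all scenes -> consistent figure
    return pipe(prompt).images[0]

if __name__ == "__main__":
    pipe = StableDiffusionPipeline.from_pretrained(
        "path/to/dreambooth-finetuned-checkpoint",  # placeholder path
        torch_dtype=torch.float16,
    ).to("cuda")
    illustrate("Alice hands Bob the letter under the oak tree.", pipe).save("scene1.png")
```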
- Visual Story Generation Based on Emotion and Keywords. Yuetian Chen, Ruohua Li, Bowen Shi, and 2 more authors. 18th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2023.
Automated visual story generation aims to produce stories with corresponding illustrations that exhibit coherence, progression, and adherence to characters’ emotional development. This work proposes a story generation pipeline to co-create visual stories with the users. The pipeline allows the user to control events and emotions in the generated content. The pipeline includes two parts: narrative and image generation. For narrative generation, the system generates the next sentence using user-specified keywords and emotion labels. For image generation, diffusion models are used to create a visually appealing image corresponding to each generated sentence. Further, object recognition is applied to the generated images to allow objects in these images to be mentioned in future story development.
2021
- Automated Cell Recognition using Single-cell RNA Sequencing with Machine Learning. Chengqi Xu, Yuetian Chen, and Yiyang Cao. In 2021 5th International Conference on Computational Biology and Bioinformatics, 2021.
- A review of self-encoding language models for bidirectional representation. Yuetian Chen. In SPIE Vol, 2021.