New Talk is coming this week?

CTA BG
News / Event
accepted paper

New Publication: METAL: Metamorphic Testing Framework for Analyzing Large-Language Model Qualities

New paper accepted! 🎉 🎉 🎉 Excited to share our paper, "METAL: Metamorphic Testing Framework for Analyzing Large-Language Model Qualities" by Sangwon Hyun, Mingyu Guo and @alibabar has been accepted in the 17th IEEE International Conference on Software Testing, Verification and Validation (ICST) 2024 , a Level-A conference within software testing community.

This paper proposes an automated framework, MEtamorphic Testing for Analyzing Large-Language (METAL), to evaluate essential quality attributes of primary tasks in different LLMs. It facilitates the systematic testing of LLM qualities by defining Metamorphic Relations (MRs), which serve as modularized evaluation metrics. The METAL framework can automatically generate hundreds of MRs from templates that cover various QAs and tasks. We believe that this framework is the first open and executable framework to directly evaluate and calculate the comprehensive qualities of LLM outputs utilizing unlimited input text space.

🔗 Full paper can be accessed through arxiv.org/abs/2312.06056.

#largelanguagemodels #Metamorphictesting #Qualityattributes #Textperturbations #LLMQualityTesting

Image
logo

Room 4.54, Ingkarni Wardli, North Terrace, The University of Adelaide, SA, Australia

ali.babar@adelaide.edu.au

+61 8 8313 4478

© 2023 CREST. All Rights Reserved.