Foojay Podcast #99: Testing the Untestable: LLM Security for Java Developers with Tiberius

Author: Frank Delporte

Originally published on Foojay.

Your AI-powered Java application is live in production. But have you actually tested whether it can be jailbroken or manipulated into leaking data it should never reveal? In this episode, Iryna Dohndorf walks us through Tiberius, an open-source security testing library for LLM applications in Java, and explains why everything you know about unit testing needs a rethink when non-determinism is part of the design.

Podcast Apps

You can listen and subscribe to the Foojay Podcast on:

Guests

  • Iryna Dohndorf – Software Engineer at Karakun Group, active member of the Basel One and Devoxx UK program committees, and creator of Tiberius

Content

  • 00:00 Introduction of topic and guest
  • 01:05 The problem Tiberius wants to solve
  • 06:39 How “traditional” unit tests don’t work for LLM integrations
  • 10:23 Scan-Fixture-Validate principle and sharing artifacts
  • 15:15 Using different skills, for example, the grandmother skill
  • 17:33 Testing for required versus forbidden bias
  • 19:35 The probes across nine attack categories used by Tiberius
  • 20:44 Buff mutation testing
  • 26:55 Using Tiberius in your pipelines and when to fail
  • 29:35 Using multi-trial scans
  • 31:14 Fingerprinting: which model you use should not be detectable
  • 32:55 Combining multiple models, model as a judge
  • 34:41 Sharing JSON models to improve tests
  • 36:05 How to get started with Tiberius in Spring and with LangChain4j
  • 36:41 Quarkus not supported yet, plans for the future
  • 39:07 Conclusions and a call for everyone to become a Foojay author
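
To make the multi-trial idea from the outline concrete, here is a minimal sketch in plain Java. It does not use Tiberius and does not reflect its API: the class `MultiTrialCheck`, the methods `failureRate` and `passes`, the fake model, and the `SECRET-123` marker are all hypothetical names chosen for illustration. The point is only that a single assertion on one response says little when output is non-deterministic, so the same probe is run several times and the build decision is based on the observed failure rate.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class MultiTrialCheck {

    /** Calls the model `trials` times and returns the fraction of responses flagged as unsafe. */
    public static double failureRate(Supplier<String> model, Predicate<String> isUnsafe, int trials) {
        if (trials <= 0) {
            throw new IllegalArgumentException("trials must be positive");
        }
        int failures = 0;
        for (int i = 0; i < trials; i++) {
            if (isUnsafe.test(model.get())) {
                failures++;
            }
        }
        return (double) failures / trials;
    }

    /** Decides whether a pipeline step should pass, given a tolerated failure rate. */
    public static boolean passes(double failureRate, double maxFailureRate) {
        return failureRate <= maxFailureRate;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for an LLM call: leaks the secret on every fourth call.
        int[] calls = {0};
        Supplier<String> fakeModel = () ->
                (++calls[0] % 4 == 0) ? "The secret is SECRET-123" : "I cannot share that.";

        // Hypothetical detector: flags any response containing the secret marker.
        Predicate<String> leaksSecret = response -> response.contains("SECRET-123");

        double rate = failureRate(fakeModel, leaksSecret, 20);
        System.out.println("failure rate: " + rate);                       // prints "failure rate: 0.25"
        System.out.println("passes at 0% tolerance: " + passes(rate, 0.0)); // prints "passes at 0% tolerance: false"
    }
}
```

A single run of this fake model would look safe three times out of four, which is why the tolerated failure rate and the number of trials are explicit parameters here. In a real pipeline the supplier would wrap an actual model call, and the threshold would be a team decision.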
