OOP 2026
Quality in the blur: testing and evaluating AI systems
Artificial intelligence impresses and disappoints in equal measure. Chatbots sometimes answer correctly and sometimes incorrectly, yet even the most absurd nonsense (as well as its opposite) is presented with seemingly great confidence. And such systems are supposed to control business-critical processes? The key challenge is therefore: how can the quality of AI systems be measured and assured? How does quality assurance deal with systems that are inherently probabilistic, i.e., that can deliver wrong results even in normal operation?