Sofia / Zurich, October 16, 2024. ETH Zurich, INSAIT, and LatticeFlow AI announce the release of the first EU AI Act evaluation framework for generative AI models.
The release, available at https://compl-ai.org, includes the first technical interpretation of the EU AI Act, mapping regulatory requirements to technical ones, together with a free and open-source framework to evaluate Large Language Models (LLMs) under this mapping. The launch also features the first compliance-centered evaluation of public foundation models from organizations such as OpenAI, Meta, Google, Anthropic, and Alibaba against the EU AI Act technical interpretation.
Thomas Regnier, the European Commission’s spokesperson for digital economy, research, and innovation, commented on the release: “The European Commission welcomes this study and AI model evaluation platform as a first step in translating the EU AI Act into technical requirements, helping AI model providers implement the AI Act.”
First Technical Interpretation of the EU AI Act
The EU AI Act, the first comprehensive AI regulation, entered into force in August 2024. However, the Act outlines high-level regulatory requirements without providing detailed technical guidelines for companies to follow. To address this, the European Commission has launched a consultation on the Code of Practice for providers of general-purpose Artificial Intelligence (GPAI) models, which is tasked with supervising the implementation and enforcement of the AI Act rules on GPAI. The COMPL-AI release can also benefit the GPAI working groups, which can use the technical interpretation document as a starting point for their efforts.
An Open-Source Framework for Evaluating LLMs on Regulations
In addition to the technical interpretation, COMPL-AI includes a free and open-source framework built upon 27 state-of-the-art benchmarks that can be used to evaluate LLMs against these technical requirements.
“We invite AI researchers, developers, and regulators to join us in advancing this evolving project,” said Prof. Martin Vechev, Full Professor at ETH Zurich and Founder and Scientific Director of INSAIT in Sofia, Bulgaria. “We encourage other research groups and practitioners to contribute by refining the AI Act mapping, adding new benchmarks, and expanding this open-source framework. The methodology can also be extended to evaluate AI models against future regulatory acts beyond the EU AI Act, making it a valuable tool for organizations working across different jurisdictions.”
First Compliance-Centered Evaluation of Public Generative AI Models
This launch also includes the first evaluation of public generative AI models from OpenAI, Meta, Google, Anthropic, Alibaba, and others against an actionable interpretation of the EU AI Act. While these models have traditionally been optimized for performance, this is the first time they have been comprehensively assessed for regulatory compliance.
The evaluation reveals key gaps: several high-performing models fall short of meeting regulatory requirements, with many scoring only around 50% on cybersecurity and fairness benchmarks. On the positive side, most models performed well on harmful content and toxicity requirements, showing that companies have already optimized their models in these areas. Additionally, some technical requirements, such as copyright and user privacy protection, remain difficult to benchmark, suggesting the need for further refinement of the regulation to support reliable technical evaluations.
“With this framework, any company — whether working with public, custom, or private models — can now evaluate their AI systems against the EU AI Act technical interpretation. Our vision is to enable organizations to ensure that their AI systems are not only high-performing but also fully aligned with the regulatory requirements such as the EU AI Act,” said Dr. Petar Tsankov, CEO and Co-Founder at LatticeFlow AI.
For more information, including access to the open framework, the mapping to technical requirements, and evaluation results, visit https://compl-ai.org.