Tether's AI research division, QVAC, unveiled QVAC Genesis II on December 22, 2025. The release is a major expansion of the world's largest publicly available synthetic educational dataset for AI pretraining.
Introduction of QVAC Genesis II
QVAC Genesis II adds 107 billion tokens, bringing the dataset to 148 billion tokens across 19 educational domains. By those figures, the earlier release accounted for roughly 41 billion tokens. The expansion is intended to increase the scale and depth of AI training data, with a focus on improving models' reasoning capabilities rather than only their predictive abilities.
Focus on Critical Thinking and Reasoning
By prioritizing critical thinking and reasoning skills in the training data, QVAC aims to support AI systems that can better understand and work with complex information.
Accessibility of the Dataset
The dataset is released under a Creative Commons license, so researchers and developers can freely access and use it in their own AI work.
For comparison, Meta Platforms released Llama 2, an openly available large language model for research and commercial use, in July 2023. That release provided model weights, whereas QVAC Genesis II provides educational training data for AI.







