Meet Arcee-SuperNova: Our Flagship 70B Model, Alternative to OpenAI

Meet Arcee-SuperNova: a groundbreaking model with state-of-the-art abilities in instruction-following and strong alignment with human preferences.

The incredible capabilities of Arcee SuperNova - trained on an orchestra of mostly open source AI tools

This report outlines the development, performance, and deployment strategies of Arcee-SuperNova, our latest model release, which serves as the flagship general model for our OpenAI Migration plan. This model represents a significant evolution in our approach to large language models, particularly in terms of instruction-following capabilities, alignment with human preferences, and customer integration.

Model Composition and Post-Training Techniques

Arcee-SuperNova is the result of integrating new post-training techniques developed in-house. Specifically, the first component distills Llama-3.1-405B-Instruct into Llama-3.1-70B-Instruct, which serves as the foundation, using our DistillKit. We used the logits and attention masks from the 405B model, preserving its instruction-following capabilities while reducing model size.
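To make the idea of logit-based distillation concrete, here is a minimal sketch in plain Python. It is not DistillKit's actual code: the function names, the KL-divergence objective, and the temperature value are all assumptions chosen for illustration. It shows how teacher logits can supervise a student while an attention mask excludes padded positions from the loss.

```python
import math

def log_softmax(logits, temperature):
    """Numerically stable log-softmax over one token's vocabulary logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]

def distillation_loss(student_logits, teacher_logits, attention_mask, temperature=2.0):
    """Mean KL(teacher || student) over positions where the mask is truthy.

    student_logits / teacher_logits: one list of vocabulary logits per token.
    attention_mask: 1 for real tokens, 0 for padding (hypothetical layout).
    """
    total, count = 0.0, 0
    for s_row, t_row, keep in zip(student_logits, teacher_logits, attention_mask):
        if not keep:
            continue  # padded positions contribute nothing to the loss
        log_p = log_softmax(t_row, temperature)  # teacher distribution
        log_q = log_softmax(s_row, temperature)  # student distribution
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
        count += 1
    # The temperature**2 factor is a common convention that keeps gradient
    # magnitudes comparable across temperatures.
    return (temperature ** 2) * total / max(count, 1)
```

The loss is zero when the student reproduces the teacher's distribution at every unmasked position and grows as the two diverge. In practice this term is often combined with a standard cross-entropy loss on the training labels.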

In parallel, we trained a separate model based on base Llama-3.1-70B, using synthetic instruction data generated through our Evol-Kit pipeline. This pipeline played a pivotal role in enhancing the model’s ability to respond to diverse queries with precision and strict adherence to user instructions. To better integrate updates during training, we applied a merging technique every half epoch, which produced smoother and more consistent performance improvements.

A third model, also based on Llama-3.1-70B-Instruct, was trained with additional Direct Preference Optimization (DPO), aimed at refining the model's alignment with human preferences. Although this DPO-enhanced version was included in the final model merge, it carried less weight in the merge than the distilled and instruction-tuned models.
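For readers unfamiliar with DPO, the following sketch computes the standard DPO loss for a single preference pair. It is illustrative only: the function name, the argument names, and the value of beta are assumptions, not the settings used to train SuperNova. Each argument is the summed log-probability a model assigns to a full response.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    The policy is rewarded for raising the chosen response's likelihood,
    relative to a frozen reference model, more than the rejected one's.
    """
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) written in a numerically stable softplus form
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

When the policy and the reference model agree exactly, the loss equals log 2. It falls below that value as the policy shifts probability toward the preferred response and rises above it if the policy moves toward the rejected one.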

The final version of Arcee-SuperNova is a combination of these models, leveraging the strengths of each to achieve strong human preference alignment and state-of-the-art performance in instruction-following.
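The merge method and the per-model weights are not stated in this report, so the sketch below should be read as one simple possibility rather than our actual recipe. It assumes a linear weighted average of parameters, with each checkpoint represented as a plain dictionary from parameter name to a flat list of floats. The function name and the example weights are hypothetical.

```python
def merge_checkpoints(checkpoints, weights):
    """Linearly average parameter values across checkpoints.

    checkpoints: list of dicts mapping parameter name -> list of floats.
    weights: one non-negative weight per checkpoint; normalized to sum to 1.
    """
    if len(checkpoints) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]

    names = set(checkpoints[0])
    if any(set(ckpt) != names for ckpt in checkpoints):
        raise ValueError("checkpoints must share the same parameter names")

    merged = {}
    for name in checkpoints[0]:
        size = len(checkpoints[0][name])
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(norm, checkpoints))
            for i in range(size)
        ]
    return merged

# Hypothetical weights: the DPO model contributes less than the other two.
distilled = {"layer.weight": [1.0, 2.0]}
instruct = {"layer.weight": [3.0, 4.0]}
dpo = {"layer.weight": [5.0, 6.0]}
merged = merge_checkpoints([distilled, instruct, dpo], [0.4, 0.4, 0.2])
```

Real merges operate on tensors with billions of parameters and may use more elaborate schemes than a linear average, but the principle of weighting each source model's contribution is the same.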

Benchmark Performance

IFEval, a commonly used evaluation dataset for general instruction following.
A subset of the Arena-Hard-Auto benchmark, highlighting Arcee-Llama-3.1-SuperNova's performance in human preference scores.
Reasoning, math, and knowledge retrieval benchmarks.

Summary of Achievements

  1. State-of-the-art instruction-following: The model’s ability to follow instructions is a defining feature and the most notable improvement.
  2. Customer alignment: The model has been tuned to strictly adhere to customer needs and preferences, making it suitable for business use cases requiring precision and reliability.
  3. Human preference scores: The model is notably closer to Llama-3.1-405B-Instruct in human preference alignment than Meta's stock Llama-3.1-70B-Instruct is, a critical factor in delivering a high-quality, user-friendly experience.
  4. Benchmark performance: The model outperforms Llama-3.1-70B-Instruct on general benchmarks, highlighting its broader applicability across tasks.
  5. Mathematical query improvement: Arcee-SuperNova performs exceptionally well in handling difficult mathematical queries, a common area where many models fall short.

Deployment and Customer Integration

We are offering Arcee-Llama-3.1-SuperNova-70B in three ways: through a chat interface forked from OpenWebUI, as model weights via the AWS Marketplace for AWS VPC deployment, and via an API at full precision for external benchmarking, should any reputable sources wish to replicate our results. This accessibility ensures that the model's capabilities can be independently verified and used in a variety of settings.

SuperNova is designed to serve as a direct replacement for services from providers like OpenAI and Anthropic, with a key differentiator: it is hosted entirely within the customer's own environment. This means that customer data never leaves their infrastructure and the customer owns the model parameters. This approach addresses critical concerns around data privacy and security, making SuperNova an ideal solution for businesses that handle sensitive information.

As we continue to improve SuperNova, updates will be offered to customers as a drop-in replacement, but only if they choose to adopt them. Customers have complete control over the model weights, and changes to the model occur only with their consent. This empowers businesses to maintain stability in their operations, extend model capabilities, and integrate updates on their own terms.

Additionally, we offer the capability to retrain the model over time using Reinforcement Learning from Human Feedback (RLHF) to more closely align the model with the customer's specific preferences and business needs. We also provide the ability to perform continued training on top of SuperNova for bespoke tasks that are unique to each user. This customization ensures that the model not only meets general performance standards but also excels in areas that are crucial to the customer's operations.

Acknowledgments

We extend our deepest gratitude to Meta for the release of Llama-3.1. Many of the achievements we have made with SuperNova are built directly on top of their groundbreaking work. Without their contributions, the development of Arcee-SuperNova would not have been possible. Their commitment to advancing the field has been instrumental in allowing us to push the boundaries of what's achievable by post-training powerful LLMs.

Arcee-SuperNova marks a new milestone in our model development efforts, offering significant advancements in instruction-following, human preference alignment, and customer-centric deployment. By leveraging a combination of distilled checkpoints, synthetic instruction data, innovative post-training techniques, and flexible integration options, the model delivers competitive performance across several key areas. We are confident that SuperNova will not only meet but exceed the expectations of our customers, setting a new standard for what can be achieved within their own environment.