Partner Spotlight: Arcee AI 🤝 MongoDB Joint customers use MongoDB & Arcee AI to turn data from JSON files into world-class custom language models with practical business use cases, in just a few clicks.
How Do I Prep My Data to Train an LLM? So you want to train a custom language model, and you have the requisite large set of text data. But how do you know the data is *really* ready for model training? Our researchers here at Arcee AI tell you what to look out for.
Optimizing LLM Training with Spectrum Here at Arcee AI, we're the pioneers of training performant and efficient LLMs with Model Merging... And now we bring you *yet another* cutting-edge technique that dramatically speeds up your training and improves your models.
Evolutionary Model Merging For All We've been focused on developing this groundbreaking technique for the community, and we're now excited to announce the launch of this state-of-the-art functionality in MergeKit.
Why Methods Like QLoRA Fall Short in Domain Knowledge Injection Arcee's research shows that standard Continual Pre-Training (CPT) outperforms QLoRA-based CPT when injecting domain knowledge.
How to Merge Llama3 Using MergeKit ... And what do we do at Arcee when an exciting new model drops? We MERGE IT with MergeKit! We walk you through the process and share the initial results.
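As a taste of what the walkthrough covers, here is a minimal sketch of a MergeKit config that SLERP-merges two Llama-3-8B variants layer by layer; the second model path is a hypothetical placeholder for your own fine-tune, and the interpolation weight `t: 0.5` is just an illustrative midpoint:

```yaml
# Minimal MergeKit SLERP config (sketch, not a tuned recipe).
slices:
  - sources:
      - model: meta-llama/Meta-Llama-3-8B-Instruct
        layer_range: [0, 32]
      - model: your-org/llama-3-8b-finetune   # hypothetical fine-tune
        layer_range: [0, 32]
merge_method: slerp
base_model: meta-llama/Meta-Llama-3-8B-Instruct
parameters:
  t: 0.5          # 0.0 = all base model, 1.0 = all fine-tune
dtype: bfloat16
```

Saved as `config.yml`, a config like this is typically run with `mergekit-yaml config.yml ./merged-model`, which writes the merged checkpoint to the output directory.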
Case Study: Innovating Domain Adaptation through Continual Pre-Training and Model Merging We show how Arcee uses the most innovative Continual Pre-Training and Model Merging techniques to deliver high-quality domain-specific language models at a fraction of the cost of our competitors, using medical and patent data.