Arcee + mergekit: our commitment to open source
On Day 3 of March Merge Madness we’re striking a bit of a serious note, with a discussion about our very serious commitment to the open source community.
When Arcee merged with mergekit last month, we promised to uphold mergekit Founder Charles Goddard’s commitment to keeping the repo open source.
When Goddard – a former NASA and Apple engineer – first started experimenting with LLMs, he quickly became interested in the innovative new techniques based on model merging.
Charles started reading all the literature on the topic that he could find. He was impressed by the innovative research in the papers, but disappointed that the accompanying code was rarely released. On the rare occasion that code was included, it wasn't in an easily accessible form – usually just a Jupyter notebook that had barely been looked at, or code written to run on a researcher's particular cluster.
These roadblocks led Charles to one conclusion: if he was going to experiment with model merging, he'd need to write his own code. That led him to create a repo he called "mergekit," which quickly became popular on GitHub.
In February, mergekit merged with Arcee – with Charles joining Arcee's staff as a Senior Research Engineer.
Both Charles and Arcee are committed to keeping the mergekit library open source, as they lead the way in model merging, LLM, and SLM (Small Language Model) research and development.
Check out the video interview below to hear more from Charles about why he’s so passionate about open source.
TRANSCRIPT:
Charles Goddard, mergekit Founder
There are all these huge tech companies that are producing incredible breakthroughs in the fields of natural language processing and artificial intelligence.
But those results are staying more or less siloed behind corporate firewalls and going towards specific products, but not going out to the greater community.
And that's absolutely understandable. They've invested huge amounts of money into that and it's their IP.
But I really want to basically be driving forward the frontiers of human knowledge in a way that's accessible to humans – other than the people building it.
So, I made mergekit open source with the thought that I can do a bunch of experiments with model merging, but that's only so many.
If I release the tools for this kind of thing to the world at large, then suddenly you've got thousands of people playing with this and doing different merges, trying out different techniques, figuring out the best hyperparameters.
In a week, more interesting results come out of the open source community than I could do in months.