Date: April 26, 2024
Apple recently changed its stance as a closed technology company by releasing OpenELM, an open and efficient language model that delivers higher accuracy with 2x fewer pre-training tokens.
Apple has taken a huge leap from being a technologically closed company to one that shares its latest model, OpenELM. It is not claimed to be the fastest, but it delivers 2.36% higher accuracy than OLMo, a comparable open model trained on public datasets, while using 2x fewer pre-training tokens. This move clearly indicates that Apple is not sitting idle in the race for AI leadership.
Apple said its decision to release an open model goes beyond the integrations and API connectivity of its earlier releases and extends to sharing its full training and evaluation framework. "Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations," eleven Apple researchers stated in the accompanying technical paper.
Apple has, however, attached conditions to the use of OpenELM. The accompanying software license is not a recognized open-source license, and Apple reserves the right to pursue patent claims if any derivative work based on OpenELM is deemed to infringe its rights.
The secret behind OpenELM's higher accuracy is a technique called layer-wise scaling, which allocates parameters more efficiently across the transformer's layers. Instead of every layer sharing an identical configuration, each layer gets its own, with dimensions scaled gradually from the input side of the network toward the output side. The result is higher accuracy, as reflected in the model's benchmark scores; a rough illustration of the idea appears below.
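The following Python sketch shows the general shape of layer-wise scaling: the number of attention heads and the feed-forward width are interpolated linearly from the first layer to the last, rather than held constant. The function name, the interpolation ranges, and the toy dimensions are illustrative assumptions, not Apple's exact constants.

```python
# A minimal sketch of layer-wise scaling. The alpha/beta ranges and
# dimensions below are illustrative, not Apple's published values.

def layer_wise_scaling(num_layers: int, d_model: int, head_dim: int,
                       alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Give each transformer layer its own width instead of a uniform one.

    alpha scales the number of attention heads per layer;
    beta scales the feed-forward (FFN) hidden dimension per layer.
    Both are interpolated linearly from the first layer to the last,
    so early layers are narrower and later layers are wider.
    """
    configs = []
    for i in range(num_layers):
        t = i / max(1, num_layers - 1)  # 0.0 at the first layer, 1.0 at the last
        a = alpha[0] + (alpha[1] - alpha[0]) * t
        b = beta[0] + (beta[1] - beta[0]) * t
        num_heads = max(1, round(a * d_model / head_dim))
        ffn_dim = round(b * d_model)
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs

# Example: a 12-layer toy model; each layer gets its own head count and FFN width.
for cfg in layer_wise_scaling(num_layers=12, d_model=768, head_dim=64):
    print(cfg)
```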
OpenELM is trained on public data: the RedPajama dataset (GitHub code, books, Wikipedia, StackExchange posts, ArXiv papers, and more) and the Dolma dataset (Reddit, Wikibooks, Project Gutenberg, and more). The result is a fast, low-power model that works from simple text prompts and can even try to auto-complete a prompt as you type it.
OpenELM is available in pretrained and instruction-tuned variants with 270 million, 450 million, 1.1 billion, and 3 billion parameters. Users are warned to exercise due diligence before relying on the model for anything meaningful, as the company does not claim its generative AI's outputs are fully accurate.
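For readers who want to try the checkpoints, which Apple published on Hugging Face, the snippet below is a minimal sketch of loading one with the transformers library. The model ID and the reuse of the Llama 2 tokenizer reflect Apple's release as I understand it; verify both against the model cards before relying on them.

```python
# A minimal sketch of running an OpenELM checkpoint with Hugging Face
# transformers. Assumptions: the model ID matches Apple's Hugging Face
# release, OpenELM ships custom modeling code (hence trust_remote_code=True),
# and it reuses the Llama 2 tokenizer, which is gated behind Meta's license.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-270M-Instruct"  # 450M, 1.1B, and 3B variants also exist
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Once upon a time there was"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```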
By Arpit Dubey
Arpit is a dreamer, wanderer, and tech nerd who loves to jot down tech musings and updates. Armed with a Bachelor's in Business Administration, a knack for compelling narratives, and a specialization in everything from predictive analytics to FinTech, SaaS, and healthcare, he crafts content that's as strategic as it is compelling. With a Logician mind, he is always chasing sunrises and tech advancements while secretly preparing for the robot uprising.