Google has introduced Edge Gallery, a new app that allows users to run large language models (LLMs) directly on their smartphones, without needing an internet connection. The Android-exclusive app is currently available via the Google AI Edge GitHub repository, with an iOS version said to be “coming soon.”
According to a blog post by the company, “The Google AI Edge Gallery is an open-source Android application that serves as an interactive playground for developers.” The app acts as a testbed for developers and enthusiasts eager to explore AI’s capabilities on the edge, that is, directly on their own devices.
The Edge Gallery app offers users a choice of downloadable models, ranging from compact versions around 500MB to more advanced models weighing in at roughly 4GB. Users need to sign in to the Hugging Face platform and accept usage terms before they can access the models, most of which are open source and freely available.
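Under the hood, a gated Hugging Face download of this kind comes down to an authenticated HTTP fetch: once the user has signed in and accepted the model’s usage terms, their access token is sent with the request. The Kotlin sketch below is purely illustrative, not the app’s own download code; the repository path, file name, and use of OkHttp are assumptions made for the example.

```kotlin
import okhttp3.OkHttpClient
import okhttp3.Request
import java.io.File

// Illustrative sketch of a gated Hugging Face file download.
// The repo path and file name are placeholders, not the exact
// artifacts Edge Gallery pulls.
fun downloadGatedModel(accessToken: String, destination: File) {
    val url = "https://huggingface.co/google/gemma-3n/resolve/main/gemma-3n.task"
    val client = OkHttpClient()
    val request = Request.Builder()
        .url(url)
        // The Bearer token proves the signed-in user has accepted the terms.
        .header("Authorization", "Bearer $accessToken")
        .build()

    client.newCall(request).execute().use { response ->
        require(response.isSuccessful) { "Download failed: HTTP ${response.code}" }
        destination.outputStream().use { out ->
            response.body!!.byteStream().copyTo(out)
        }
    }
}
```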
Available models include Google’s own Gemma 3 and the newly announced Gemma 3n, as well as Alibaba’s Qwen 2.5. Once downloaded, users can interact with these models in three main ways: real-time chat, uploading and interpreting images, or the “Prompt Lab,” a single-turn mode in which the user enters a question or statement and receives a single AI-generated response.
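Edge Gallery sits on top of Google’s AI Edge stack, and the same kind of single-turn, fully on-device generation is available to developers through the MediaPipe LLM Inference API. The Kotlin sketch below is a minimal illustration rather than the app’s own code: the model path and option values are hypothetical and assume a Gemma model file has already been downloaded to local storage.

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Minimal sketch of a single-turn, offline prompt ("Prompt Lab"-style)
// using the MediaPipe LLM Inference API. Model path is illustrative.
fun runSingleTurnPrompt(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma3n.task") // hypothetical local file
        .setMaxTokens(512)                                // cap on prompt + response tokens
        .build()

    // Loads the model from local storage; no network connection is needed.
    val llm = LlmInference.createFromOptions(context, options)

    // One request in, one AI-generated response out.
    val response = llm.generateResponse(prompt)
    llm.close() // release native resources
    return response
}
```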
What sets the app apart is its ability to run entirely offline. Once a model is installed, users don’t need an active data connection to engage with it—perfect for remote environments or users with limited connectivity.
Meet Gemma 3n: Compact, Capable, and Cloud-Free
One standout offering in the Edge Gallery lineup is Google’s Gemma 3n model, designed to work seamlessly on smartphones with minimal memory consumption. Though it’s a small language model, it holds its own on performance metrics.
On the LMArena leaderboard for text tasks, Gemma 3n scored 1293 points. For comparison, OpenAI’s o3-mini model scored slightly higher at 1329, while the o4-mini notched 1379 points. The top performer remains Google’s Gemini 2.5 Pro with a score of 1446.
However, as with any offline model, there are limits. The AI cannot access real-time data or events beyond its training cutoff. For example, Gemma 3n’s knowledge is current only up to June 2024.
By bringing powerful AI capabilities directly onto mobile devices, Google is not only showcasing its technical muscle but also pointing to a future where generative AI can be truly untethered from the cloud.