I switched from LM Studio to llama.cpp, and I’m never going back to a bloated wrapper


Running AI locally sounds like it should be straightforward until you realize that the app making it feel easy is quietly eating the resources you actually need. I spent time with LM Studio before I started noticing that my hardware was working harder to keep the interface alive than to run the model itself. However, Llamma.cpp is much better and can even run on Raspberry Pi.

LM Studio has too much bloat

I ditched the heavy wrappers for raw llama.cpp

Llama next to a task manager Credit: Jorge Aguilar / HowToGeek

When I started running AI locally, I gravitated toward tools like LM Studio. It is pretty easy to see why, since it is very popular thanks to its model search, downloading, and chat interface. It doesn’t feel much different than using any other app on your computer, and you don’t even need a NAS.

All that convenience comes at a price, though, because the packaging just hides what is actually doing the work. LM Studio, Ollama, and GPT4All are all local AI running the same core engine underneath, which is llama.cpp.

What is different is everything that is built around that engine. Heavy GUI managers force your OS to burn memory and CPU cycles just to keep the interface alive. My hardware was spending its budget rendering visual elements and maintaining API translation layers instead of doing the actual AI work. I didn’t spend long on LM Studio because it was clearly going overboard.

The main culprit is that most of these managers are built on Electron, which ships a full Chromium browser engine bundled with a Node.js runtime. That’s expensive even when the AI isn’t doing anything.

In practice, LM Studio alone can sit at 1.40 GB of RAM and pull up to 1.2 GB of GPU VRAM just as background overhead. On an 8 GB card, that’s not a minor inconvenience; it directly determines which models you can even load. Every megabyte the wrapper takes is a megabyte the model doesn’t get.

Running llama.cpp as a native binary cuts all of that out. While other AI may force your PC to waste memory just from the empty UI, llama.cpp keeps its background footprint down low. When it is running, it doesn’t have to be more than a regular browser. Wrappers also add latency. You get prompt ingestion, which is just the wait time before you see the first token. There was a noticeable difference between running llama.cpp and using LM Studio.

Bypassing the wrapper fixed that. There’s another upside, too, because llama.cpp moves fast, and GUI tools always lag behind its release cycle by weeks. Running it directly means new features like multi-modal audio inputs are available the moment they ship.

You get real control for a smaller learning curve

The learning curve of a command-line interface can feel intimidating coming from a GUI. I remember that I had thought that any time I was using a command line, I was likely going to break something on the PC. However, if you switch to raw llama.cpp it’s worth learning.

To get llama.cpp running on your PC, you need files from two places, pull them both into the same local folder, and you’re basically done.

Start at the llama.cpp GitHub repository. Go to the latest release and download the pre-compiled zip that matches your hardware. Create a folder somewhere convenient and unzip everything into it.

Then head to Hugging Face, grab whichever model you want in GGUF format, but a lighter one is smarter for testing, and drop that file into the same folder.

To run it, type cd then the path from the folder. Then name the AI in a script with the first prompt, and you can start talking.

Make sure to use the launch string with the model filename before your first prompt. Here is what I used llama-cli -m meta-llama-3-8b-instruct.Q4_K_M.gguf -ngl 99 -p "Why is running AI via raw llama.cpp better than a heavy GUI wrapper?"

The performance difference is hard to ignore once you see it. Idle VRAM usage drops from several gigabytes to a fraction of one. Prompt processing speeds jump significantly enough that I noticed it on the first request. Stripping out the GUI and tuning things yourself sounds complicated, but you will definitely see the difference.

The trade-off is worth it

The performance gains make it hard to go background

AI for llama on server Credit: Jorge Aguilar / HowToGeek

It’s easy to see why someone would argue that a GUI is better for beginners. Apps like LM Studio offer a comfortable, pick-up-and-play experience that hides the messy side of deployment. If you’re really that into a GUI, I’d recommend GPT4All over LM Studio because it’s not as restrictive or hard on your PC.

You can make this look like a regular chatbot if you run the code with your model and then -ngl 99 and the URL is http://localhost:8080. It just won’t run as well.

To most people, running a language model through a terminal looks like developer territory. Learning to go through directories and set execution parameters takes time, and that can put people off. Convenience would be why you’d head to heavy wrappers. However, treating local AI like a casual desktop app means paying a real performance price for all that graphical overhead.

I’m not willing to give up over a GB of VRAM just to keep an interface running. It is a huge waste. Learning the llama.cpp interface removes all of that, and you only have to learn it once. After that, your machine can focus on the actual work.

Now that I am used to the speed and control, going back to a heavy interface feels like a genuine step backward. It feels like giving up performance just for a pretty interface. Since llama.cpp includes a built-in web server, it’s not like you’re stuck staring at a terminal either. A little work learning a few commands gets you a much faster, cleaner setup.


The terminal is the difference maker

Switching to raw llama.cpp isn’t for everyone. If you’re not comfortable working from a terminal yet, the learning curve is real, even if it’s shorter than it looks. GPT4All is a more reasonable starting point than LM Studio if you want a GUI that doesn’t punish your hardware for existing. That said, once you’ve run a model without the wrapper overhead even once, it’s hard to unsee the difference. For a lot of setups, it’s the difference between loading the model you actually want and settling for something smaller.



Source link

Leave a Reply

Subscribe to Our Newsletter

Get our latest articles delivered straight to your inbox. No spam, we promise.

Recent Reviews


macOS has a built-in screenshot tool that gets the basics right. You can take a screenshot, record your screen, and even annotate your captures. But the moment you want something more, like scrolling capture, advanced annotation tools, or a quick way to share your screenshots via a link, it starts to fall apart.

That’s where CleanShot X comes in. It’s a powerful screenshot and screen recording app for Mac that replaces the built-in screenshot tool. It feels as if the developers looked at the screenshot features in macOS and added everything that was missing.

Over the past few years, the app has added several new features I didn’t know I needed until it offered them. It has become one of my favorite Mac utilities, and in this article, I will show you its features that will convince you to buy the app instantly. 

Scrolling capture saves you from stitching screenshots together

One of the most frustrating limitations of macOS’s screenshot tool is that it can only capture what’s visible on your screen. If I need to capture a long webpage or a full chat history, I am stuck taking multiple screenshots and stitching them together. That wastes an unbelievable amount of time. 

CleanShot X solves this with its scrolling capture feature. I can trigger the scrolling capture, and CleanShot X automatically scrolls through the content and delivers a single image. I don’t even have to manually scroll the page if I don’t want to.

This feature alone saves me hours of time every month. If you have to deal with long screenshots, you should definitely try it out. 

Time delay capture lets you screenshot the impossible

Some screenshots are tricky to take because they require you to trigger something before capturing. For example, sometimes the on-screen feature you want to capture disappears as soon as you use a keyboard shortcut or click anywhere with your mouse. 

Sometimes, the on-screen elements appear for a short time, and by the time you hit the screenshot shortcut, they disappear. CleanShot X’s time delay capture gives me a few seconds to set things up before the screenshot is taken. I trigger the capture, put everything in place, and CleanShot X does the rest. 

It’s a small feature that solves a genuinely annoying problem.

Capture text from images with OCR

I love that CleanShot X has a built-in OCR function. It lets me capture text directly from any image or video on my screen. Although it happens rarely, I have come across websites that don’t let me copy content. With CleanShot X’s OCR function, that’s not an issue. 

I use this constantly when reviewing PDF documents with restricted permissions or watching a video on YouTube. It is far faster than typing things out manually, and it works surprisingly well. There are many apps that let you capture text with OCR, but since CleanShot X has this feature built in, I don’t need to install an extra app. 

Add beautiful backgrounds to your screenshots

If you share screenshots for work, tutorials, or social media, you know how plain a raw screenshot looks. CleanShot X lets me add beautiful backgrounds to my screenshots, turning a flat capture into something that looks polished and share-ready.

For backgrounds, I can choose from solid colors, gradients, or even my current desktop wallpaper. I can also adjust the padding and shadow, align the screenshot to the edges, and adjust the corner radius. It takes a few seconds and makes a huge difference in how professional your screenshots look.

Annotation tools that get the job done

While macOS’s screenshot tool lets you annotate your screenshots, the annotation tools inside CleanShot X are, in my opinion, the best available on the Mac. 

I can add arrows, text labels, shapes, highlights, and more. I can also change the weight and color of annotations. There are also multiple arrow styles I can choose from. I especially like the curved arrow style that lets me curve the arrows and make them pop. 

One of my favorite new additions is the “Highlighter” tool. It snaps to the text in a screenshot, which makes it really easy to highlight it before sharing. 

Then there’s the “Spotlight” tool that highlights your selection by darkening the rest of the screenshot. It’s perfect for drawing someone’s attention to a specific part of a screenshot. 

No matter what annotation tools you need, you can find them and more in CleanShot X. 

Hide sensitive information before you share

You can find hundreds of instances in the news where a prominent figure shared a screenshot and inadvertently revealed private information. Thankfully, CleanShot X has a dedicated tool to blur or black out sensitive information, so such accidents never happen.

I can choose to pixelate, blur, or completely black out the information. The best part is that I can also adjust the strength of these effects. It lets me blend in the hidden information so the blur doesn’t stand out from the rest of the screenshot. 

Video and GIF recording built right in

CleanShot X also lets you record your screen as a video or export directly as an optimized GIF. The GIF export is particularly useful for sharing quick demos or showing someone how to do something without creating a large video file. 

It can record the entire screen, a specific window, or a custom region. It can also show my mouse clicks and keyboard shortcuts. I can record my computer audio, my microphone, and webcam video. 

I love that it automatically adds the webcam video in the corner, so it doesn’t interfere with the rest of the recording. I can also change the video size and shape. All these features make it really easy to create video tutorials. 

Quick share with cloud links

Once you take a screenshot or finish a recording, you need to share it. Of course, you can easily share screenshots via messages or emails. But CleanShot X gives me a better way. 

Whenever I capture something, it opens a quick share overlay. I can use it to instantly upload my screenshots to CleanShot Cloud and grab a shareable link with a single click.

I no longer have to drag files into cloud storage, attach images to emails, or upload to third-party services. I capture it, click share, and paste the link. It is one of those workflow improvements that sounds minor until you use it every single day.

Capture beautiful screenshots with CleanShot X

CleanShot X has become one of my most dependable apps on Mac. In fact, all the screenshots you see in this article or any of my articles have been captured using CleanShot X. Yes, it’s a paid app, but it has paid its cost multiple times over with the time it has saved me. 

CleanShot X is available as a one-time purchase or through a SetApp subscription. If you want unlimited cloud storage, you have to pay for a monthly subscription. That will also get you advanced features like a custom domain and branding, password-protected link sharing, and more. 

For most users, the one-time purchase is more than enough, and it’s what I use. If you spend any time taking screenshots or recording your screen on a Mac, it is absolutely worth every penny.



Source link