
29 Mar 2026

Understanding Google AI Credits: What You're Actually Paying For


If you've subscribed to Google AI Pro [1] (or are eyeing an upgrade), you've probably noticed the word "credits" appearing everywhere — in your billing dashboard, in the Gemini app, and inside your code editor [5]. But what are they? How do they reset? And if your family is on the plan, who's burning through them? [3]

I spent an afternoon untangling this, and the short version is: Google doesn't have one quota system — it has three, each resetting on a different clock. Here's how it all works.


The Three Clocks

The single biggest source of confusion with Google AI Pro is that there are three independent limits running simultaneously, each with its own reset schedule:

Timer     What Resets?                  Applies To
Monthly   1,000 AI Credits              Video / 4K Images / IDE Overages
Weekly    Antigravity Baseline Quota    "Free" high-tier usage in Antigravity
Daily     Prompt & Media Limits         Chat (Thinking/Pro), Images, Music

Understanding which "fuel tank" you're drawing from at any given moment is the key to not running dry at the wrong time.


Monthly AI Credits (The 1,000 Pool)

When you subscribe to AI Pro, Google allocates 1,000 AI Credits [1] at the start of each billing cycle. These are the credits you see in your Google One dashboard with a countdown showing days until reset.

Key facts:

  • No rollover. If you have 380 credits left with 10 days to go, those 380 vanish on your billing date [1]. You start fresh at 1,000.
  • These are for premium "heavy" tasks. High-end video generation (Google Flow/Veo 3.1) [4], advanced image creation (Imagen 4 / Nano Banana 2), and IDE overage billing all draw from this pool.
  • Standard tasks are free. Typing prompts into the Gemini app, generating standard images (Gemini / Imagen 4), and basic music tracks (Lyria 3) do not consume monthly credits up to their daily limits [4].

The Top-Up Exception: If you manually purchase a "Top-up" credit pack, those purchased credits are typically valid for 12 months from the date of purchase and do carry over across billing cycles [1]. Only your subscription credits are use-it-or-lose-it.

What Can You Buy With Credits?

Here are the standard costs for premium generation (2026 pricing) [2, 6]:

Feature                      Credit Cost (Approx.)   What 380 credits buys you
Nano Banana 2 (Imagen 4)     1                       ~380 high-res assets
Veo 3.1 Fast (via Flow)      10                      ~38 cinematic clips
Veo 3.1 Quality (via Flow)   100                     ~3 cinematic clips
IDE Overage (per hour)*      15                      ~25 hours of high-tier use

*Approximation based on typical token consumption in Antigravity.

Tip: If you're close to the end of your billing month with credits to spare, it's the perfect time for high-resolution 4K image generation or experimental Veo video clips. Those leftover credits would vanish anyway, so spending them now costs you nothing "extra" once the new month starts.


Antigravity Weekly Baseline Quota (The IDE Clock)

If you use Google Antigravity (the AI-powered code editor), you'll notice a separate "refresh date" in your settings. This date is typically different from your monthly credit reset.

What it is: Google gives AI Pro users a "Free Baseline" of high-performance model usage (Gemini 3.1 Pro, Claude 4.6 Sonnet, etc.) within Antigravity every week [5, 6].

Key facts:

  • 5-Hour Sprints: Your immediate capacity refreshes every 5 hours [5].
  • 7-Day Hard Cap: There is a weekly baseline limit. If you exhaust this, the 5-hour refresh stops working until your next 7-day reset (the "refresh date" you see) [5].
  • This quota is individual — each account on your family plan has its own personal IDE baseline.

The Overage Toggle

In Antigravity settings, the "Use AI Credits for Overages" toggle lets you decide what happens when your weekly baseline runs out:

  • ON: Antigravity draws from your monthly 1,000-credit pool to keep you coding at full speed.
  • OFF: You're limited to the "Flash" model until the weekly refresh hits.

Daily Quotas (Chat, Music, and Deep Research)

Finally, your day-to-day interactions have their own daily caps that reset at midnight Pacific Time. These never touch your 1,000 monthly credits:

Feature                 Limit Per Day
Thinking Model Chat     300 prompts
Pro Model Chat          100 prompts
Nano Banana 2 (Std)     1,000 images
Lyria 3 (Music)         50 tracks [4]
Deep Research Reports   3 reports

Family Plans: What's Shared, What's Not

Feature                    Shared with Family?      Reset Cadence
1,000 AI Credits           Yes [3] (shared pool)    Monthly
2 TB Storage               Yes [1] (shared pool)    Monthly
Daily Interaction Limits   No (individual)          Daily
Antigravity Baseline       No [5] (individual)      Weekly

The AI Credits are the only shared AI "fuel." If a family member generates a few Veo 3.1 videos, they are spending from the same 1,000-credit bucket you use for your IDE overages [3]. You can monitor this in the Google One → AI Credits Activity dashboard.

Important: Family members must be 18+ for high-tier model access. Under-18 accounts are restricted to Gemini Basic.


References

  1. Google One AI Pro: Membership Overview - Plan pricing and 1,000 credit allocation.
  2. Google One - AI Credits Pricing (2026) - Details on credit reset and top-ups.
  3. Google One Help - Managing Family AI Credit Activity - Shared pool tracking and activity history.
  4. Google Cloud - Expanding AI Creativity with Google Flow and Veo 3.1 - Media generation models and limits.
  5. Google DeepMind - Antigravity IDE Baseline Quota Framework - 5-hour refresh and weekly baseline rules.
  6. Anthropic - Claude Sonnet 4.6 on Google Cloud Vertex AI - February 2026 release news.

9 Mar 2026

Display IMDb Ratings on Einthusan


Surfing niche streaming sites without inline film ratings is a recipe for endless tab-opening and "analysis paralysis." To scratch my own itch, I put together a small userscript called Masala Script to fix this exact problem.

For context, Einthusan is a massive streaming directory for South Asian cinema. While it's an excellent digital archive, exploring its thousands of regional films is tedious because it lacks external metadata like IMDb ratings.

To solve this friction, Masala Script (presently just a single Tampermonkey userscript file) reads the Einthusan page, queries the free OMDB API, and renders IMDb rating badges right next to the title.

Technical Features

While it might seem trivial to fire an API call on page load, building this script properly required addressing a few interesting technical hurdles:

  • Intelligent Fallback Matching via Wikipedia: South Asian movie titles vary wildly in transliteration, so an exact-title search against the OMDB API frequently fails for regional films. However, Einthusan usually provides a Wikipedia link for each movie. I built a transparent scrape fallback: if the direct title/year fetch fails, the script fetches the linked Wikipedia page in the background, extracts the definitive ttXXXXXXX IMDb ID using a regular expression, and then repeats the OMDB query using that exact ID for 100% accuracy (a rough sketch of this flow follows the list).
  • Aggressive Client-Side Local Caching: The free tier of the OMDB API provides 1,000 requests per day. A typical Einthusan browse page can render over 20 movie cards at once. Scrolling through just 50 pages would immediately exhaust the daily allowance and result in rate limits. The script counters this by heavily utilizing the GM_setValue and GM_getValue Tampermonkey APIs—caching successful queries in the browser for 7 days, and failed title lookups for 1 day.
  • Detailed Error Tooltips: Rather than failing silently, any lookup that ultimately misses (or fails due to API config errors) renders a "Fail" or "N/A" UI badge. When hovered, it provides an exact exception traceback or error string so the user knows exactly why the movie metadata wasn't found.
  • Single Page Application Navigation Detection: Einthusan manages categories via AJAX updates, using history.pushState and popstate to load new frames. The userscript monkey-patches and listens to these navigation boundaries, re-attaching its DOM observers whenever content is swapped dynamically.
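
To make the fallback logic concrete, here is a rough shell sketch of the same idea using curl and grep. This is not the userscript's actual code; the API key, title, year, and Wikipedia URL below are placeholders:

#!/bin/bash
# Hypothetical sketch of the title -> Wikipedia -> IMDb-ID fallback (not the real userscript code).
OMDB_KEY="your-omdb-api-key"                          # placeholder
TITLE="Some Movie"; YEAR="2023"                       # placeholder title/year from a movie card
WIKI_URL="https://en.wikipedia.org/wiki/Some_Movie"   # the Wikipedia link Einthusan usually provides

# 1. Try a direct title/year lookup against OMDB.
resp=$(curl -s "https://www.omdbapi.com/?apikey=${OMDB_KEY}&t=${TITLE// /+}&y=${YEAR}")

# 2. If OMDB can't match the transliterated title, pull the ttXXXXXXX id from Wikipedia
#    and retry with that exact id instead.
if ! echo "$resp" | grep -q '"Response":"True"'; then
  imdb_id=$(curl -s "$WIKI_URL" | grep -oE 'tt[0-9]{7,8}' | head -n1)
  [ -n "$imdb_id" ] && resp=$(curl -s "https://www.omdbapi.com/?apikey=${OMDB_KEY}&i=${imdb_id}")
fi

echo "$resp" | grep -o '"imdbRating":"[^"]*"'

The real script does the same dance asynchronously inside the page, and caches both hits and misses as described above.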

The Genesis of Masala Script

Built using modern AI tools, this project demonstrates how rapidly useful and robust browser enhancements can be coded from scratch with minimal manual boilerplate.

Looking at the initial Git history over the course of the day shows how quickly the script escalated from a basic exact-title scraper into a much more mature and intelligent tool:

commit 70027efcd906586cb46928b6da16c46c2402ae25
docs: replace development instructions with a detailed features and notes section in README.

commit 133dbdf7e6123f3a9ac774d820e1a1aa932051f4
Add caching for imdb ratings

commit acb25cc1187884e771efa9e32cf9836461d403c1
feat: Do a fallback search (via Wikipedia), in case first imdb rating fetch fails.

commit 06f3f05be8f6ebd08dfedffb80cf4d53544786f6
feat: Add movie page to the list of pages where this works

Try It Out

If you want to try it out on your next Einthusan movie night:

  1. Install the Tampermonkey browser extension.
  2. Claim a free OMDB API Key.
  3. Install the script by clicking on imdb-einthusan.user.js.

The next time you navigate to any browse or movie page on Einthusan, the script will prompt you for your OMDB API key (only once), and start displaying ratings next to the movie cards!

Although this extension might only interest a very niche subset of users, for those of us who regularly browse Einthusan (and heavily rely on IMDb to filter through the noise), it's a massive quality-of-life add-on to have.

If anyone is looking for new features, has a bug to report, or wants to contribute, feel free to create an issue or submit a pull request on the repository. Happy watching!

23 Nov 2025

PostgreSQL Buildfarm Members: A status update

The PostgreSQL Buildfarm is a global network of machines that continuously test PostgreSQL across a wide range of operating systems, architectures, compilers, and branches. Over the past few years, I have created and maintained several buildfarm members, each with its own quirks and strengths. In this post, I'll share a status update on the following animals: alligator, dodo, woodpecker, leafhopper, massasauga, parula, snakefly, and the newest member, ovenbird.

What is the Buildfarm?

The Buildfarm is essential for PostgreSQL development. It helps catch platform-specific bugs early, ensures code quality, and provides confidence that new changes work everywhere. Each member reports results for multiple branches (like master, REL_18_STABLE, etc.), using different OSes, compilers, and hardware.

About the Architectures

The Open Hardware Frontier: RISC-V

RISC-V is an open-standard instruction set architecture (ISA); unlike most other ISAs, it is provided under open-source licenses that do not require fees to use.

  • ovenbird is my first foray into this architecture, running on a VisionFive 2 board.
  • It (hopefully) represents the future of open hardware, and ensuring PostgreSQL compiles and runs correctly on it is a long-term investment in the open-source ecosystem.

Bridging Windows and Linux: WSL2

Windows Subsystem for Linux (WSL) lets developers run a GNU/Linux environment -- including most command-line tools, utilities, and applications -- directly on Windows, unmodified, without the overhead of a traditional virtual machine or dual-boot setup.

  • woodpecker runs inside a Debian container on WSL2.
  • This setup is crucial for verifying that PostgreSQL behaves correctly in this increasingly popular development environment, which bridges the gap between Windows and Linux.

Small Scale, Big Impact: Raspberry Pi

Raspberry Pi revolutionized low-cost computing and is a fantastic platform for edge cases (pun intended) :)

  • dodo runs on a Raspberry Pi 4 Model B.
  • It helps identify performance regressions and race conditions that might be masked by faster hardware. It also ensures PostgreSQL remains viable for low-powered, IoT and edge computing use cases.

The Rise of ARM in the Cloud: Graviton

Several of the buildfarm animals I’ve created run on the Graviton processors. Graviton is Amazon’s custom ARM-based CPU family, designed for high performance and energy efficiency in AWS cloud environments.

  • Graviton1 (first generation) was introduced in 2018, bringing ARM64 to AWS EC2.
  • Graviton2 (second generation) launched in 2020, offering major improvements in performance and scalability.
  • Graviton3 (third generation) arrived in 2022, further boosting compute, memory bandwidth, and energy efficiency—making it ideal for demanding workloads like database regression testing.
  • Graviton4 (fourth generation) is the latest, offering even greater performance and efficiency for cloud-native workloads. The buildfarm animal 'leafhopper' is one of the first to test PostgreSQL on Graviton4.

Testing PostgreSQL on these platforms helps ensure the database runs smoothly on modern cloud hardware and takes advantage of ARM’s growing ecosystem.

Disclosure: The Graviton machines are provided by my employer. All other machines (including the WSL2, RISC-V, and Raspberry Pi instances) are my personal machines.

Meet the Buildfarm Animals

Here’s a quick overview of the machines I have created and recently worked on:

alligator

  • OS: Ubuntu 24.04 LTS
  • Arch: x86_64
  • Compiler: gcc experimental (nightly build)
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: Tracks the latest GCC changes, often finds compiler regressions before anyone else.

dodo

  • OS: Raspbian GNU/Linux 10
  • Arch: armv7l
  • Compiler: gcc experimental (nightly build)
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: ARM platform, useful for catching issues on lower-powered hardware.

woodpecker

  • OS: Debian 12 (bookworm) on WSL2 (Windows 11)
  • Arch: x86_64
  • Compiler: gcc 12.2.0
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: Runs inside WSL2 on Windows 11, great for testing integration with Windows environments.

leafhopper

  • OS: Amazon Linux 2023
  • Arch: aarch64 (Graviton4, r8g.2xl)
  • Compiler: gcc experimental (hourly build)
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: Created and managed in a work-based environment; leafhopper is one of the first buildfarm animals testing PostgreSQL on Graviton4 hardware.

massasauga

  • OS: Amazon Linux 2
  • Arch: aarch64 (Graviton1)
  • Compiler: gcc experimental (nightly build)
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: Created and managed in a work-based environment; Graviton1 machine—one of the earliest ARM64 regression testers in the buildfarm, still running reliably after several years.

parula

  • OS: Amazon Linux 2 (AL2)
  • Arch: aarch64 (Graviton3, c7g.2xl)
  • Compiler: gcc experimental (nightly build)
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: Created and managed in a work-based environment; focuses on the third generation of AWS Graviton hardware, useful for performance and compatibility.

snakefly

  • OS: Amazon Linux 2 (AL2)
  • Arch: aarch64 (Graviton2)
  • Compiler: gcc experimental (nightly build)
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: Created and managed in a work-based environment; Graviton2-based member, helps ensure ARM64 stability across AWS generations.

ovenbird (newest member)

  • OS: Ubuntu 24.04.3 LTS
  • Arch: riscv64
  • Compiler: gcc 13.3.0
  • Branches: master, REL_18_STABLE, REL_17_STABLE, REL_16_STABLE, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE
  • Notes: The newest addition to the family, ovenbird brings riscv64 architecture to the buildfarm, helping ensure PostgreSQL is tested on cutting-edge open hardware.

Challenges and Rewards

Managing these buildfarm animals means keeping up with OS upgrades, compiler changes, hardware failures, and PostgreSQL branch updates. Some of these machines are especially aggressive about GCC: they check for updates from the GCC git repository every few hours, recompile a fresh GCC, and use it for the next buildfarm run. This helps catch compiler regressions and compatibility issues very early.

If you want to read more about how these GCC compiles work and see the open-source repository, check out my blog post: Compiling latest gcc to test more architectures.

Some of these machines have been running for 3-4 years, and their logs are a treasure trove for debugging tricky platform-specific issues. The diversity of hardware and software helps the PostgreSQL community maintain its reputation for reliability and portability.

Testing with the latest GCC is especially rewarding: it ensures that upstream GCC changes stay in step with the expectations of the PostgreSQL community, and that PostgreSQL continues to compile and pass tests without surprises. A good example is an upstream GCC bug that was found, reported, and fixed, making sure that no GCC changes adversely affect PostgreSQL in the long run. Read more about this incident here: PostgreSQL mailing list discussion of a GCC bug.

Here's another email thread that exemplifies why testing gcc experimental is helpful in ensuring that PostgreSQL compiles and tests stay green: PostgreSQL mailing list - GCC experimental thread.

However, it is also important to note that aggressive testing of GCC HEAD needs to be balanced against the time of PostgreSQL developers. The current buildfarm system does not explicitly distinguish between "production" and "bleeding edge" machines, meaning failures on experimental setups can sometimes be distracting. As discussed in this mailing list thread, there is an ongoing conversation about how to best handle these "platform not believed stable" scenarios to ensure that transient failures on experimental toolchains don't unnecessarily burden the community.

Speaking of new architectures, a few months back I wrote about Testing PostgreSQL on Debian/Hurd (https://www.thatguyfromdelhi.com/2025/08/testing-postgresql-on-debianhurd.html) and planned to add a Hurd machine to the buildfarm. It looks like I've been beaten to the punch! A new member, fruitcrow (https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=fruitcrow&br=master), is already up and running to test PostgreSQL on GNU/Hurd. This is fantastic news—having "competition" in adding diverse buildfarm members is exactly what we want. It shows that more people recognize that a wide array of test environments leads to a more stable PostgreSQL.

Final Thoughts

If you’re interested in contributing to PostgreSQL, running a buildfarm animal is a great way to help. It’s a hands-on way to learn about PostgreSQL internals, compilers, and operating systems, and it’s rewarding to see your machine’s name in the global test results.

14 Aug 2025

Testing PostgreSQL on Debian/Hurd: A Windows + QEMU Adventure

Curiosity often leads to the most interesting technical adventures. This time, I decided to explore something off the beaten path: running Debian GNU/Hurd inside a virtual machine on my Windows 11 host and compiling PostgreSQL from source.

This post is part 1 of a multi-part series documenting the process, challenges, and discoveries along the way. Future parts will dive deeper into advanced topics, automation, and ongoing compatibility work—so if you're interested in PostgreSQL, alternative operating systems, or open source testing, stay tuned!

What is Debian?
Debian is one of the oldest and most respected Linux distributions, known for its stability, vast software repositories, and commitment to free software principles. While most people associate Debian with the Linux kernel, it’s actually a complete operating system that can run on different kernels.

What is GNU/Hurd?
GNU/Hurd is an alternative kernel developed by the GNU Project. Unlike Linux, GNU/Hurd is built on a microkernel architecture (specifically GNU Mach), aiming for greater modularity and flexibility. While GNU/Hurd is still experimental and not as mature or widely used as Linux, it represents a fascinating approach to operating system design.

Debian GNU/Hurd combines the familiar Debian userland (tools, package management, etc.) with the GNU/Hurd kernel, offering a unique environment for open source enthusiasts and OS tinkerers.

My goal for this experiment was to see how far I could get with a modern database stack—specifically, compiling and running PostgreSQL—on this unusual platform.



Setting Up the VM

Instead of the CD image, I used the pre-built disk image available here. After downloading and extracting the .img file, I launched the VM with QEMU using the following command:

qemu-system-x86_64.exe -machine type=pc,accel=whpx,kernel-irqchip=off -boot d -m 4096 -usb -display default,show-cursor=on -drive file=".\debian-hurd-i386-20250807.img",cache=writeback

Explanation of the command:

  • qemu-system-x86_64.exe: Runs QEMU for 64-bit x86 systems (works for 32-bit guests too).
  • -machine type=pc,accel=whpx,kernel-irqchip=off: Specifies a PC-type machine, enables Windows Hypervisor Platform acceleration (WHPX), and disables kernel IRQ chip emulation for compatibility.
  • -boot d: Sets the boot order (strictly, d selects the first CD-ROM drive; with only the disk image attached, QEMU boots from the hard disk anyway).
  • -m 4096: Allocates 4GB of RAM to the VM.
  • -usb: Enables USB support.
  • -display default,show-cursor=on: Uses the default display and ensures the mouse cursor is visible.
  • -drive file=".\debian-hurd-i386-20250807.img",cache=writeback: Uses the extracted Hurd disk image as the hard drive and enables writeback caching for better disk performance.

This boots directly into the installed Debian/Hurd system with improved performance and usability on a Windows 11 host.

Preparing to Build PostgreSQL

Debian/Hurd is minimal out of the box, so the first step was to install all the build tools and libraries required for compiling PostgreSQL:

sudo apt-get update
sudo apt-get install build-essential git libxml2-dev libxslt-dev autotools-dev automake libreadline-dev zlib1g-dev bison flex libssl-dev libpq-dev ccache

This command installs the compiler, linker, version control tools, XML and SSL libraries, autotools, and all other dependencies PostgreSQL may need for a successful build and test cycle.

Downloading and Compiling PostgreSQL

Instead of downloading a release tarball, I cloned the official PostgreSQL git repository and compiled the master branch:

git clone https://github.com/postgres/postgres.git
cd postgres
./configure --prefix=~/proj/localpg
make
make install

This approach ensures you're building the latest development version of PostgreSQL directly from source, and installs it locally to your user's ~/proj/localpg directory.

Setting Up the Database Cluster

PostgreSQL needs a data directory (cluster) to store its databases. Since the installation was local to my user, I simply initialized the cluster and started the server using the full path to the binaries (since they're not in my PATH):

~/proj/localpg/bin/initdb -D ~/proj/localpg/pgdata
~/proj/localpg/bin/pg_ctl -D ~/proj/localpg/pgdata -l logfile start

Connecting and Creating a Table

With the server running, I connected to the database and created a sample table:

~/proj/localpg/bin/psql -d postgres

Inside psql:

CREATE TABLE test_table (id SERIAL PRIMARY KEY, name TEXT);
INSERT INTO test_table (name) VALUES ('Hello from Debian/Hurd!');
SELECT * FROM test_table;

Example output:

CREATE TABLE
INSERT 0 1
 id |         name         
----+----------------------
  1 | Hello from Debian/Hurd!
(1 row)

Running the Test Suite

To ensure the build was solid, I went back to the source directory and ran:

cd ~/postgres
make check

This runs PostgreSQL's regression tests, verifying that the core features work as expected—even on Hurd. The run was mostly fine, except for a few tests that failed; more research is needed on those failures.

Quick QEMU Tip

When working with QEMU, remember that Ctrl-Alt-G is your friend—it releases the mouse and keyboard from the VM window, making it much easier to switch back to your host system.

Adding a Separate Volume for More Disk Space

The base Debian/Hurd image is quite small and can easily run out of space, especially when compiling large projects or running make check. I frequently hit disk full errors during testing.

Solution:

  1. Shut down the VM.

  2. Resize the disk image:

    qemu-img resize debian-hurd-i386-20250807.img +10G
    

    This adds 10GB to the existing disk image.

  3. Restart the VM.

  4. Create a new partition:

    • Use fdisk /dev/hd0 (or the appropriate device) to create a new partition in the extra space (a sketch of this step follows the list).
  5. Format the new partition:

    mkfs.ext4 /dev/hd0s3
    

    (Note: On my setup, the original root partition was /dev/hd0s2, so the new partition created for extra space was /dev/hd0s3. Adjust the device name as needed for your configuration.)

    Although the root volume is of ext2 type (!!!), Debian/Hurd works fine with ext4—so feel free to use ext4 for the new partition.

  6. Mount the new volume:

    mkdir -p /mnt/newvol
    mount /dev/hd0s3 /mnt/newvol
    
  7. Grant non-root user access:

    • As root, change ownership:
      chown robins:robins /mnt/newvol
      
    • Now your non-root user (e.g., robins) can use /mnt/newvol for compiling PostgreSQL and running make check without running out of disk space.
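
For step 4 above, the fdisk part is short. Here is a sketch of the scripted equivalent (hypothetical; double-check the device name and partition number for your image before writing anything):

# Confirm the device and existing partitions first.
sudo fdisk -l /dev/hd0

# Scripted equivalent of the interactive keystrokes n, p, 3, Enter, Enter, w:
# a new primary partition #3 spanning the freed space, then write the table.
printf 'n\np\n3\n\n\nw\n' | sudo fdisk /dev/hd0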

Why use a non-root user for PostgreSQL? PostgreSQL is designed to run as a non-root user for security reasons. Running the database server or its tests as root can expose your system to unnecessary risks and may even cause certain operations to fail. Always use a dedicated non-root user for installation, testing, and day-to-day database operations.

This approach made it possible to complete the build and test cycle without disk space issues.

Final Thoughts

Running Debian/Hurd in a VM on Windows 11 was surprisingly smooth, though some packages and features are less mature than on Linux. Compiling PostgreSQL from scratch was a great way to explore the system's capabilities and compatibility. If you're looking for a fun, geeky weekend project, give Debian/Hurd a try!

Next Steps & What's Still Pending

This is only part 1 of a multi-part series. In future installments, I'll cover:

  • Setting up the PostgreSQL buildfarm for automated testing on Debian/Hurd
  • Deeper investigation into SMP/multi-core support (currently not working)
  • More QEMU optimization and compatibility testing
  • Additional performance tuning and disk management strategies
  • Troubleshooting Perl module installation issues (e.g., LWP::Protocol::https, LWP::Simple, Net::SSLeay), which currently fail to install—more research is needed to understand and resolve these problems.
  • Investigating why make check did not complete successfully (failed on a few tests)—this requires further research.

Some features, like multi-core support, full buildfarm integration, reliable Perl module installation, and passing all PostgreSQL regression tests, are not yet working or fully tested. These will be explored in detail in future posts. Stay tuned!

8 Jul 2024

On-Prem AI chatbot - Hello World!

In continuation of the recent posts...


Finally got an on-premise chatbot running! Once downloaded, the linux box can spin the interface up / down in a second.

(myvenv) ai@dell:~/proj/ollama$ time ollama run mistral
>>> /bye

real    0m1.019s
user    0m0.017s
sys     0m0.009s

That, on a measly ~$70 Marketplace i5/8GB machine, is appreciable (given everything I had read about the NVidia RTX 4090s etc.). Now obviously this doesn't do anything close to 70 tokens per second, but I'm okay with that.

(myvenv) ai@dell:~/proj/ollama$ sudo dmesg | grep -i bogo
[sudo] password for ai:
[    0.078220] Calibrating delay loop (skipped), value calculated using timer frequency.. 6585.24 BogoMIPS (lpj=3292624)
[    0.102271] smpboot: Total of 4 processors activated (26340.99 BogoMIPS)

Next, I wrote a small hello-world script to test the bot. Now where's the fun if it just printed static text?!

(myvenv) ai@dell:~/t$ cat a.py
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")
result=llm.invoke("Why is 42 the answer to everything? Keep it very brief.")
print (result)

And here's the output, in just ......... 33 seconds :)

(myvenv) ai@dell:~/t$ time python a.py
A popular question! The joke about 42 being the answer to everything originated from Douglas Adams' science fiction series "The Hitchhiker's Guide to the Galaxy." In the book, a supercomputer named Deep Thought takes 7.5 million years to calculate the "Answer to the Ultimate Question of Life, the Universe, and Everything," which is... 42!

real    0m33.299s
user    0m0.568s
sys     0m0.104s
(myvenv) ai@dell:~/t$

And, just for kicks, it works across languages / scripts too. Nice!

(myvenv) ai@dell:~/t$ ollama run mistral
>>> भारत की सबसे लंबी नदी कौन सी है?
 भारत की सबसे लंबी नदी गंगा है, जिसका पूरण 3670 किमी होता है। यह एक विश्वमित्र नदी है और बहुप्रकार से कई प्रदेशों के झिल्ले-ढाल में विचलित है।

>>>

(For the curious: the prompt asks which is India's longest river; the model replies in Hindi that it's the Ganga and quotes a length of 3,670 km, though the rest of the reply is garbled.) Again, I'm pretty okay with this for now. I'll worry about speed tomorrow, when I have a script that can test the limits, and that's not today.

Hello World!

7 Jul 2024

Installing Ollama on an old linux box

Trying out Ollama - your 10-year-old box would do too.

TLDR

  • Yes, you CAN install an AI engine locally
  • No, you DON'T need to spend thousands of dollars to get started!
  • Agreed, your AI engine won't be snappy, but it's still great to get started.

Server

You'd realise that pretty much any machine should get you going.

  • I had recently bought a second-hand desktop box (Dell OptiPlex 3020) from FB Marketplace and repurposed it here.
  • For specs, it was an Intel i5-4590 CPU @ 3.30GHz with 8GB of RAM and 250 GB of disk, nothing fancy.
  • It came with an AMD Radeon 8570 (2GB RAM) [4], and the Ollama install process recognized and optimized for the decade-old GPU. Super-Nice!
  • For completeness, the box cost me $70 AUD (~50 USD) in May 2024. In other words, even for a cash-strapped avid learner, there's a very low barrier to entry here.

Install

The install steps were pretty simple [1], but as you may know, the models themselves are huge.

For example, look at these sizes [3]:

  • mistral-7B - 4.1 GB
  • gemma2-27B - 16 GB
  • Code Llama - 4.8 GB

Given that, I'd recommend switching to a decent internet connection. If work allows, this may be a good time to go into the office instead of WFH for this one. (Since I didn't have that luxury, my trusty but slow 60Mbps ADSL+ meant that I really had to work on my patience this weekend.)

The thing that actually tripped me up was that Ollama's threaded downloads really scream, and they ended up clogging my test server (see my earlier blog post that goes into some details [2]).

Run with Nice

With system resources in short supply, it made good sense to ensure that, once installed, Ollama is spun up with the lowest priority.

On an Ubuntu server, I did this by modifying the ExecStart line in Ollama's systemd unit.

ai@dell:~$ sudo service ollama status | grep etc
     Loaded: loaded (/etc/systemd/system/ollama.service; enabled; preset: enabled)

ai@dell:~$ cat /etc/systemd/system/ollama.service | grep ExecStart
ExecStart=nice -n 19 /usr/local/bin/ollama serve

So when I do end up asking some fun questions, ollama is always playing "nice" :D
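
If you want to reproduce this without editing the unit file in place, the usual systemd route is a drop-in override. A minimal sketch, assuming the service is named ollama and the binary lives at /usr/local/bin/ollama:

# Create a drop-in override for the ollama service (opens an editor).
sudo systemctl edit ollama

# In the override, clear ExecStart first, then re-declare it wrapped in nice:
#   [Service]
#   ExecStart=
#   ExecStart=/usr/bin/nice -n 19 /usr/local/bin/ollama serve

# Reload systemd and restart the service so the new priority takes effect.
sudo systemctl daemon-reload
sudo systemctl restart ollama
systemctl show ollama -p ExecStart   # sanity-check the effective command line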




Enjoy ...

Reference:

  1. Install + Quick Start: https://github.com/ollama/ollama/blob/main/README.md#quickstart

  2. Model downloads made my server unresponsive: https://www.thatguyfromdelhi.com/2024/07/ollama-is-missing-rate-limits-on.html

  3. Model sizes are in GBs: https://github.com/ollama/ollama/blob/main/README.md#model-library

  4. Radeon 8570: https://www.techpowerup.com/gpu-specs/amd-radeon-hd-8570.b1325

6 Jul 2024

Ollama is missing --rate-limits on downloads

I am just starting my AI journey, and trying to get Ollama to work on my linux box was an interesting non-AI experience.

I noticed that every time I tried out something new, my linux box reliably got stuck the moment I pulled a new model. htop helped point out that each time I did an ollama pull or ollama run, it spun up a ton of threads.

Often things got so bad that the system became quite unresponsive. Here you can see "when" I triggered the pull:

Reply from 192.168.85.24: bytes=32 time=7ms TTL=64
Reply from 192.168.85.24: bytes=32 time=7ms TTL=64
Reply from 192.168.85.24: bytes=32 time=7ms TTL=64
Reply from 192.168.85.24: bytes=32 time=8ms TTL=64
Reply from 192.168.85.24: bytes=32 time=65ms TTL=64
Reply from 192.168.85.24: bytes=32 time=286ms TTL=64
Reply from 192.168.85.24: bytes=32 time=286ms TTL=64
Reply from 192.168.85.24: bytes=32 time=304ms TTL=64

A little searching led me to an ongoing GitHub thread where a feature like --rate-limit was requested for multiple reasons. Some people were unhappy that a pull clogged their routers; others that it jammed all other downloads and browsing on the machine. I was troubled because my linux box (a not-so-recent but still 6.5k-BogoMIPS 4-vCPU i5) came to a crawl.

While the --rate-limit feature takes shape, here are two solutions that did work for me:

  1. As soon as I started the fetch (ollama run or ollama pull etc.), I used iotop to change the ionice priority to idle. This made the issue go away completely (or at least made the system quite usable). However, it was still frustrating since (unlike top and htop) one had to type the PIDs... and as you may have guessed already, Ollama creates quite a few when it does such a fetch.

Note that doing something like nice -n 19 did not help here. This was because the ollama processes weren't actually consuming (much) CPU for this task at all!

Then I tried ionice, which didn't work either! Since Ollama uses threads, and ionice doesn't take effect on the threads within a parent process, something like the following did not work for me:

# These did not help!

robins@dell:~$ nice -n 19 ollama run mistral # Did not work!
robins@dell:~$ ionice -c3 ollama run mistral # Did not work either!!
  2. After some trial and error, a far simpler solution was to just run a series of commands immediately after triggering a new model fetch. Essentially, it gets the parent PID, and then sets ionice for each of that parent's threads:
pid=`ps -ef | grep "ollama run" | grep -v grep | awk '{print $2}'`
echo $pid
sudo ionice -c3 -p `ps -T -p $pid | awk '{print $2}' | grep -v SPID | tr '\r\n' ' '`

This worked something like this:

robins@dell:~$ pid=`ps -ef | grep "ollama run" | grep -v grep | awk '{print $2}'` && [ ${#pid} -gt 1 ] && ( sudo ionice -c3 -p `ps -T -p $pid | awk '{print $2}' | grep -v SPID | tr '\r\n' ' '` ; echo "done" ) || echo "skip"
skip
robins@dell:~$ pid=`ps -ef | grep "ollama run" | grep -v grep | awk '{print $2}'` && [ ${#pid} -gt 1 ] && ( sudo ionice -c3 -p `ps -T -p $pid | awk '{print $2}' | grep -v SPID | tr '\r\n' ' '` ; echo "done" ) || echo "skip"
done

After the above, iotop started showing idle in front of each of the ollama processes:

Total DISK READ:         0.00 B/s | Total DISK WRITE:         3.27 M/s
Current DISK READ:       0.00 B/s | Current DISK WRITE:      36.76 K/s
    TID  PRIO  USER     DISK READ DISK WRITE>    COMMAND
2692712 idle ollama      0.00 B/s  867.62 K/s ollama serve
2705767 idle ollama      0.00 B/s  852.92 K/s ollama serve
2692707 idle ollama      0.00 B/s  849.24 K/s ollama serve
2693740 idle ollama      0.00 B/s  783.07 K/s ollama serve
      1 be/4 root        0.00 B/s    0.00 B/s init splash
      2 be/4 root        0.00 B/s    0.00 B/s [kthreadd]
      3 be/4 root        0.00 B/s    0.00 B/s [pool_workqueue_release]
      4 be/0 root        0.00 B/s    0.00 B/s [kworker/R-rcu_g]
      5 be/0 root        0.00 B/s    0.00 B/s [kworker/R-rcu_p]
      6 be/0 root        0.00 B/s    0.00 B/s [kworker/R-slub_]

While at it, it was funny to note that the fastest way to see whether the unresponsive system was going to recover (because of what I had just tried) was to keep a separate ping session to the linux box running. On my local network, I knew the system would come back to life within the next few seconds when the pings began acknowledging in 5-8 ms instead of the ~100+ ms during the logjam.

So yeah, +10 on the --rate-limit or something similar!

EDIT: 2 years on - People are still complaining - The issue is still open :( 

Reference:

  1. https://github.com/ollama/ollama/issues/2006

16 Jun 2024

Compiling latest gcc to test more architectures

Of late, I've had two separate needs to compile GCC by hand, and although my first foray into compiling gcc from git took patience, stumbling over the basics was interesting, to say the least.

The first time I realised that an old GCC version could matter was this feedback [1] that one of my buildfarm members was running an old (for its arch) gcc version, something I had almost never paid much attention to. The second was that it led me to newer architectures (more on that below), and how this could repeat itself if/when I end up playing with more architectures.

So finally, I can say I have a framework that frequently checks and recompiles gcc and ensures all my local tests are using the latest and greatest gcc :) (I am happy with how this has taken shape on my home server, and once I am able to port it to my other machines, I don't see why this shouldn't land on GitHub.)

Now admittedly, compiling gcc on a nightly basis was already overkill, but then, what the heck: I went and did it on an hourly basis, just because, well, why not. My personal ask was to:

  1. Incrementally learn how compiling gcc unfolds
  2. Have some fun scripting while at it
  3. ... but most importantly, see whether I could utilize this experience in other experiments where the idea is to forewarn database developers about upcoming changes.

A little more on point 3 above: I oversee a few machines on the postgres buildfarm, and they differ in some aspects:

  1. Different architectures:
    1. aarch64: Gravitons
    2. x86-64: A vanilla off-the-shelf dell workstation
    3. armv7l - Raspberry Pi4
  2. Different GCCs:
    1. 8.3.0 (default in pi4)
    2. 7.3.1 (default on most ALs)
    3. 13.2 (default on Ubuntu)
    4. 14.0.1 (naive attempt at compiling whatever cleared make check)
    5. gcc (experimental nightly)
  3. (Internally I also run some different fuzzing workloads but that's besides the point)
  4. (Future plans - add some form of randomizer to test odd combinations of compilation flags, but more on that in an upcoming post)


Now, if all goes to plan, I should also add two new architectures to the mix. They wouldn't be the snappiest processors on the market (at least not at the price level I'm after), but hey, they should be fun to play with!

  1. loongarch64 - (cough) A recent (but, sure as butterflies, promising) entrant. Loongson has been around for some time (its earlier chips were MIPS64), but my interest has grown of late owing to sporadic reports that it's becoming somewhat competitive, which should be interesting to review.
  2. riscv64 - Another interesting arch that should be fun to try out. Again, I'm not holding my breath that it'd top any charts, but it could still end up being interesting nonetheless.

On the GCC front, getting the setup ready and stable clears the path to upgrading my buildfarm animals one by one, and to start focusing beyond this hurdle.
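
For the curious, the log lines below come out of a loop that is conceptually quite simple. A stripped-down sketch of the idea (the paths, install prefix, and configure flags here are placeholders; the real script lives in the gcc_compile repo [2]):

#!/bin/bash
# Sketch of the hourly check: pull GCC master, rebuild only if HEAD moved.
set -euo pipefail
SRC=~/proj/gcc            # git clone of the GCC repository (placeholder path)
BUILD=~/proj/gcc-build    # separate build tree, as GCC expects
PREFIX=$HOME/local/gcc    # where the freshly built compiler gets installed
STAMP=$(date +%Y%m%d_%H%M)

cd "$SRC"
old=$(git rev-parse --short HEAD)
git pull --ff-only && echo "$STAMP - git pull successful."
new=$(git rev-parse --short HEAD)

if [ "$old" = "$new" ]; then
  echo "$STAMP - No change in gcc version. Quitting."
  exit 0
fi

echo "$STAMP - gcc has changed - [$old] vs [$new]. Recompiling."
# (one-time: run "$SRC"/contrib/download_prerequisites, or install gmp/mpfr/mpc dev packages)
mkdir -p "$BUILD" && cd "$BUILD"
"$SRC"/configure --prefix="$PREFIX" --disable-multilib --enable-languages=c,c++
make -j"$(nproc)" && echo "$STAMP - make successful"
make install       && echo "$STAMP - make install successful."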

gcseb02 20240615_1700 - git checkout successful.
gcseb02 20240615_1700 - git pull successful.
gcseb02 20240615_1700 - No change in gcc version. Quitting.

gcsa36f 20240615_1800 - git checkout successful.
gcsa36f 20240615_1800 - git pull successful.
gcsa36f 20240615_1800 - gcc has changed - [471fb092601] vs [57af69d56e7]. Recompiling.
gcsa36f 20240615_1800 - make successful
gcsa36f 20240615_1800 - make install successful.
gcsa36f 20240615_1800 - gcc version string has changed from [15.0.0 20240615 (experimental) - 471fb092601] to [15.0.0 20240615 (experimental) - 57af69d56e7]

gcsf66a 20240615_1900 - git checkout successful.
gcsf66a 20240615_1900 - Unable to git pull. Are we connected? Quitting.
gcsf66a 20240615_1900 - git switched back to 57af69d56e7.

gcs629f 20240615_2000 - git checkout successful.
gcs629f 20240615_2000 - git pull successful.
gcs629f 20240615_2000 - gcc has changed - [57af69d56e7] vs [6762d5738b0]. Recompiling.
gcs629f 20240615_2000 - make successful
gcs629f 20240615_2000 - make install successful.
gcs629f 20240615_2000 - gcc version string has changed from [15.0.0 20240615 (experimental) - 57af69d56e7] to [15.0.0 20240615 (experimental) - 6762d5738b0]

.
.

gcsc115 20240616_0400 - git checkout successful.
gcsc115 20240616_0400 - git pull successful.
gcsc115 20240616_0400 - No change in gcc version. Quitting.

Reference

2. Compilation script source - https://github.com/robins/gcc_compile
