In the digital realm where every byte of memory counts, I’ve been grappling with a persistent issue in Mess With DNS for the past three years. Periodically, the application would run out of memory and be killed by the dreaded Out-Of-Memory (OOM) killer. For a long while, this wasn’t a top priority: the service would typically be down for just a few minutes while it restarted, and the incidents were rare, happening at most once a day. So I simply turned a blind eye and relegated the problem to the back of my mind. But last week the issue escalated, causing real disruptions, and I finally decided it was time to roll up my sleeves and dig in.
My investigation took me on a winding journey, filled with trials, errors, and valuable lessons. The first thing I needed to understand was the memory landscape of the virtual machine (VM) hosting Mess With DNS. With only about 465MB of RAM at its disposal, the memory was already stretched thin. PowerDNS gobbled up 100MB, Mess With DNS itself consumed 200MB, and hallpass took another 40MB. This left a meager 110MB of free memory. I had previously set GOMEMLIMIT to 250MB, hoping to nudge the garbage collector into action when Mess With DNS’s memory usage got too high. While this helped to some extent, it didn’t completely solve the problem.
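For the record, here’s roughly how that limit gets set. This is just a sketch; setting it in code is equivalent to exporting GOMEMLIMIT=250MiB in the service’s environment:

```go
package main

import "runtime/debug"

func main() {
	// Equivalent to running the binary with GOMEMLIMIT=250MiB in its
	// environment: the Go runtime treats this as a soft heap limit and
	// garbage-collects more aggressively as usage approaches it,
	// instead of waiting for the usual heap-growth target.
	debug.SetMemoryLimit(250 << 20) // 250MB, in bytes (Go 1.19+)
}
```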
The root cause of the recent troubles turned out to be the backup script. I had recently started using restic to back up Mess With DNS’s database. Given the tight memory constraints, restic sometimes demanded more memory than the system could spare, and the backup script itself got OOM killed. This was doubly painful: not only did it risk corrupting the backups, it also meant I had to manually remove restic’s lock before backups could resume, a task I desperately wanted to avoid in my otherwise automated web service ecosystem.
Determined to find a solution, I decided to focus on reducing Mess With DNS’s memory footprint. Memory profiling had revealed that IP addresses were the main culprit behind the high memory usage. When Mess With DNS started up, it loaded a database into memory that let it look up the Autonomous System Number (ASN) for any IP address. This way, when it received a DNS query, it could quickly identify the source IP’s owner, like recognizing that 74.125.16.248 belonged to GOOGLE. The problem was that this database, built from text files totaling just 37MB, was consuming a staggering 117MB in its in-memory form.
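To make the rest of this story concrete, here’s a sketch of the kind of in-memory lookup I’m describing: sorted ranges plus a binary search. The field names are my illustration, not necessarily the exact ones in the real code:

```go
package asndb

import (
	"bytes"
	"net"
	"sort"
)

// Hypothetical shape of the in-memory records; the real field names
// in Mess With DNS may differ.
type IPRange struct {
	StartIP net.IP
	EndIP   net.IP
	Num     int    // ASN, e.g. 15169
	Name    string // e.g. "GOOGLE"
	Country string
}

// lookupASN finds the range containing addr. ranges must be sorted by
// StartIP and non-overlapping, with every address normalized to 16
// bytes via To16() so that bytes.Compare orders them correctly.
func lookupASN(ranges []IPRange, addr net.IP) (IPRange, bool) {
	addr = addr.To16()
	// First range whose end is at or after addr...
	i := sort.Search(len(ranges), func(i int) bool {
		return bytes.Compare(ranges[i].EndIP, addr) >= 0
	})
	// ...then check that its start is at or before addr.
	if i < len(ranges) && bytes.Compare(ranges[i].StartIP, addr) <= 0 {
		return ranges[i], true
	}
	return IPRange{}, false
}
```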
My first attempt at a solution involved SQLite. I thought that storing the data on disk in an SQLite database, with properly indexed tables, would significantly reduce the memory usage. I wrote a Python script to import the tab-separated value (TSV) files into an SQLite database and adjusted my code to query the database instead. Initially, it seemed like a success: after a garbage collection cycle, the memory usage dropped dramatically, since the data now lived on disk. However, I soon ran into a series of problems. Storing IPv6 addresses was awkward because SQLite lacks native support for big integers, so I had to choose between storing them as text or as BLOBs. Worse, lookup performance was abysmal: my microbenchmark showed that the SQLite-based solution could only handle 17,000 IP address lookups per second, a far cry from the 9 million lookups per second achieved by the original binary search-based code. An EXPLAIN QUERY PLAN revealed that the database wasn’t using the indexes as efficiently as I’d hoped, and my attempts to optimize the query only made things worse. Eventually, I abandoned this approach.
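For concreteness, the lookup I was attempting looked roughly like this. The schema, column names, and driver choice here are all assumptions for illustration:

```go
package asndb

import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3" // driver choice is an assumption
)

// lookupASN queries a hypothetical ip_ranges table where start_ip and
// end_ip are stored as 16-byte BLOBs alongside asn/name/country columns.
func lookupASN(db *sql.DB, addr []byte) (asn int64, name string, err error) {
	row := db.QueryRow(`
		SELECT asn, name FROM ip_ranges
		WHERE start_ip <= ? AND end_ip >= ?
		ORDER BY start_ip DESC LIMIT 1`, addr, addr)
	err = row.Scan(&asn, &name)
	return asn, name, err
}
```

A query with range conditions on both start_ip and end_ip is exactly the shape a single btree index handles poorly: the index can narrow only one side of the range, leaving the other to be scanned.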
Next, I turned to a trie data structure, hoping it would offer a more memory-efficient solution. I used the ipaddress-go library to implement IP address lookups with a trie. But this experiment was a complete disaster. The trie implementation used an astronomical 800MB of memory just for the IPv4 addresses and was much slower than the original code, managing only 100,000 lookups per second. Confused and frustrated, I decided to scrap this idea and focus on optimizing my existing array-based structure.
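To give a feel for why tries can be so memory-hungry, here’s a deliberately naive sketch (not the ipaddress-go implementation, which is far more sophisticated): one node per bit, two pointers per node.

```go
package trie

// A naive bitwise trie over IPv4 addresses. Each inserted address can
// allocate up to 32 nodes, and every node carries two 8-byte pointers
// plus bookkeeping, so millions of ranges balloon quickly.
type trieNode struct {
	children [2]*trieNode
	asn      uint32 // meaningful only when terminal is true
	terminal bool
}

// insert walks the 32 bits of ip from most to least significant,
// creating nodes as needed, and marks the final node with the ASN.
func insert(root *trieNode, ip uint32, asn uint32) {
	n := root
	for i := 31; i >= 0; i-- {
		bit := (ip >> uint(i)) & 1
		if n.children[bit] == nil {
			n.children[bit] = &trieNode{}
		}
		n = n.children[bit]
	}
	n.asn = asn
	n.terminal = true
}
```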
This led me to my third and most successful attempt. I identified several areas where I could reduce the memory usage of my IPRange struct. First, I noticed a lot of repetition in the Name and Country fields, as many IP ranges belonged to the same ASN. So I created a separate ASNInfo struct and an ASNPool to store the unique ASN-related information, and then used indexes in the IPRange struct to reference this data. This simple change alone reduced the memory usage from 117MB to 65MB. Then I discovered the netip.Addr type in the Go standard library, a more memory-efficient alternative to the traditional net.IP. Switching to netip.Addr saved an additional 20MB of memory, bringing the total usage down to 46MB.
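Here’s a sketch of what the deduplicated layout might look like; the exact field names are my guesses for illustration:

```go
package asndb

import "net/netip"

// Field names here are assumptions, not necessarily the exact ones
// used in Mess With DNS.
type ASNInfo struct {
	Country string
	Name    string
}

// ASNPool interns each distinct ASNInfo once; ranges refer to it by index.
type ASNPool struct {
	lookup map[ASNInfo]uint32
	infos  []ASNInfo
}

// Add returns the index of info, inserting it if it hasn't been seen.
func (p *ASNPool) Add(info ASNInfo) uint32 {
	if idx, ok := p.lookup[info]; ok {
		return idx
	}
	if p.lookup == nil {
		p.lookup = make(map[ASNInfo]uint32)
	}
	idx := uint32(len(p.infos))
	p.infos = append(p.infos, info)
	p.lookup[info] = idx
	return idx
}

func (p *ASNPool) Get(idx uint32) ASNInfo { return p.infos[idx] }

// IPRange now stores two compact netip.Addr values (fixed-size value
// types, unlike net.IP's slice header plus separate backing array)
// and a 4-byte index instead of repeated Name/Country strings.
type IPRange struct {
	StartIP netip.Addr
	EndIP   netip.Addr
	ASN     uint32
	Idx     uint32 // index into ASNPool
}
```

The nice thing about this design is that the hot path barely changes: the binary search stays the same, and the strings are only fetched from the pool when a range actually matches.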
The real-world debugging process was far from the neat, linear narrative I’ve presented. It was a chaotic dance of trial and error, second-guessing, and repeatedly revisiting previous conclusions. But in the end, I managed to save 70MB of memory. The new version of Mess With DNS did come with a small trade-off: lookups got a bit slower, dropping from 9 million to 6 million per second. However, I considered the memory savings well worth the slightly increased CPU usage.
Throughout this journey, I received some great suggestions from others, like trying Go’s unique package (sketched below), compiling with GOARCH=386, and exploring different data storage formats. While I didn’t implement all of these ideas this time around, they’ve given me plenty of inspiration for future performance-tuning adventures. In the end, this memory-optimization quest wasn’t just about solving a problem; it was a thrilling exploration into the intricacies of data storage, algorithms, and the art of squeezing every bit of efficiency out of my code.
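As a parting sketch, here’s roughly what the unique idea could look like (it requires Go 1.23+); the struct is my own illustration of the suggestion, not something I’ve shipped:

```go
package asndb

import "unique"

// Interning the strings keeps one canonical copy of each distinct
// Name/Country value; each handle is pointer-sized.
type ASNInfo struct {
	Country unique.Handle[string]
	Name    unique.Handle[string]
}

func makeASNInfo(country, name string) ASNInfo {
	return ASNInfo{
		Country: unique.Make(country),
		Name:    unique.Make(name),
	}
}

// Reading the values back: info.Name.Value(), info.Country.Value().
```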