25 May 2010

Open palm, apply to forehead: How to resist programming the easy way

by Ryokin

I’ve learned a great lesson over the last few months in regards to programming complexity. I have been working on a solution to uber-rapid filtration of jump distance within the Navigator. My final goal: to have every shortest path available within a database for rapid joining. Anyone that has researched into this will be familiar with typical path-finding techniques, Dijkstra, A*, or any of a dozen others. However there are fewer techniques for finding all paths simultaneously, as those find the shortest route from A to B. I did tons of research in performance etc… to figure out a good way, and came to the conclusion that a formula called The Floyd-Warshall Algorithm. It is theoretically the fastest All Paths Shortest Route calculator for a ‘Dense graph’….aka a system with many connections, such as Eve Online‘s solar systems.

So….how should I go about this? Well, I had the system list and links list in the database, so why not connect to it, read in the data, then calculate, and write back in. First thought: I already had an algorithm in use on the old navigator, why not just do that for each node to each node. Started that, and calculated that it would be done in about 2 years……so that was a definite no-go. Second thought: My website is PHP, so let’s do Floyd-Warshall in PHP. I wrote several approaches, and tried them all. Each either would not run, or crashed shortly after starting. I assumed that this was due to the script-language paradigm, so I might as well go for something a step down. I moved to Perl, using Wikipedia’s recommended Perl Graph. My first swing at this was a struggle, as I didn’t get the library. So, I thought a bit further about things, and figured “This is something ideal for MatLab!”. I fired up my edition, and labored for hours to connect it up to my database. After finally having success, I tried out the built in Floyd-Warshall…..and promptly ran out of memory. I tried tweaking a bit, but it was just far outside MatLab’s reach.

Surprised at MatLab’s failure, I returned to the Perl code, and finally got something running. I ran through a test data set to calculate how long it would take, and got an estimate of 14 processor days. Yes, 1,209,600 seconds of 100% processor usage. Ouch! Before starting on that, I decided to try Java, as I figured it is vastly faster. I was tiptoeing around C++ as much as I could, because I remembered having issues with connecting to a database. I wrote up the java app, and connected it up to the database after quite a struggle. Then I started things up, and BAM! out of memory. I tweaked the virtual machine to have over a Gig of RAM to chew on, but it still broke instantly. So I tried to just allocate the array needed for Floyd-Warshall. It couldn’t do it.

I threw up my hands…..after 3 months of trying, what was 14 processor days? So I started up the Perl job and crossed my fingers. One day passed, all was good, my site was still responsive (thanks to renice) and it had eaten a good chunk of the day in processor time.
Four days: I was getting to the point of no return, and starting to feel less enthused, but it was working.
Eight days: My website was seeming slow, processor time was struggling, I assumed it might have to take a few extra days to make up for the lost processor time, but whatever.
Eleven days: My website didn’t want to load anymore. I figured it wasn’t a big deal, and I’d fix it in the evening. I tried to log in to check status, but nothing happened. My server was a black hole. I ran out to my colo space to check on it, and it had eaten itself. Kernel panicking and horrendously unresponsive, I hard-reset it. When it came back up, I had an idling processor, and no new entries in my database.

I figured I shouldn’t try this on my production server anymore, so I set up a box at home to run it on, and reduced the number of systems from the just under 8,000 in Eve to the just over 5,000 that are actually reachable (recall that wormhole systems do not have static jumps to other systems). Estimated time was down to six days, I could live with that. And this was a Windows box, so hopefully it wouldn’t let Perl eat itself. One day later I looked at the machine, it has two Gigs of RAM, and the Perl process was using four Gigs, so there was practically no processor utilization as it desperately tried to figure out what to do outside the two Gigs of paging space. I stopped things and set the system to use eight Gigs of paging. Then I started it back up. Two days later it was shut down! Turns out I forgot to activate Windows *sigh*. So I activated, and started the job back up. The next day I looked again, and it had eaten all the room to spare, and was once again idling the processor. This project was just not happening.

So, I caved, and spent a good six hours trying to get C++ to connect to MySQL. Every article I could find said it could be done with Visual Studio, but no mention of Linux based connectivity being reliable. I gave up at the six hour mark, and had the legendary epic face-palm moment. Why was I trying so hard to load this from and into a database? Why not read in from a text file, and write to a CSV? I was such a moron. Sure enough, a two minute export of the needed tables got me the data in a usable form. I started up the hand-written Floyd-Warshall process and two hours and 256Mb of physical memory later I had a 1.4Gb CSV…..How easy was that? But now, how does one go about putting 1.4Gb of CSV into MySQL? Python to the rescue! I set up a loop to interpret and input the data to the database after splitting into 30-some files. I should have done it differently, but three full days later I had the data in the database….and my query time went from 0.00015s to 17s. Ouch! After a few indexing lessons, I’m now finally at a 0.00017s query time and everything is as it should be. Enjoy the newest rendition of the Navigator with all its fancy interface toys, and don’t forget to filter.

tags:

Adventures in Eve

Categories:

Recent posts:

Open palm, apply to forehead: How to resist programming the easy way