A highly unscientific comparison of Snort database performance

At work, we're building a system to aggregate snort sensor data into a central database.  I have a set of scripts that export from remote sensorts and aggregate into a centralized database.  I've been experimenting to see what the "best" database to use for this central database. This is specifically in relation to the snort stuff I'm working on at work, but may be interesting to other snort users as well since I'm operating on essentially the same db structure that snort uses to log directly to a database.  (I only skip a couple of tables in the export/import process.)

The database server is a 1.33Ghz Athlon with 1.5GB of RAM and four 120GB drives doing hardware RAID0.  MySQL and PostgreSQL have had their configs tuned to the point that there is virtually no disk I/O during my highly unscientific "benchmarking" test, except to occasionally flush dirty buffers out to disk.

Updated 2003 Feb 19: I forgot to mention that the database versions I'm using for each are MySQL 3.23.53 and PostgreSQL 7.2.3.  I haven't tried the 4.x series from MySQL or 7.3 series from PostgreSQL yet, although I've gotten a bit of feedback on this page saying that with the latest versions of each, insert performance has equalised as well.  I may update with those at some point in the future if I get a chance.

I have a dataset of ~60K IDS events in four separate custom "export" files that I import, via my perl scripts, into the database.  The perl script is the same script in each case, and only minor runtime changes to the SQL statements based on the database to which I'm importing.  "Single" means I import each file in serial; "Parallel" means I run four imports simultaneously.


Records Seconds R/sec.
MySQL Single 12391 200 61.96

23881 394 60.61

739 12 61.58

23328  370 63.05
Sum/Avg. 60339  976 61.82

MySQL Parallel 12391 601 20.62

23881 958 24.93

739 46 16.07

23328 934 24.98 
Sum/Avg. 60339 2539 23.76

PgSQL Single 12391 258 48.03

23881 513 46.55

739 16 46.19

23328 499 46.75
Sum/Avg. 60339 1286 46.92

PgSQL Parallel    12391 808 15.34

23881 1303 18.33

739 64 11.55

23328 1270 18.37
Sum/Avg. 60339 3445 17.51

So MySQL is about 50% faster at doing unoptimized INSERTs into the database either in serial or in parallel.  They both "degrade" almost identically in records/second (%38 of 'single' speed for MySQL, %37 for PgSQL) when doing parallel inserts.

PostgreSQL is expected to be a little slower as it's doing full transactional inserts.  (Wrapping the entire import procedure inside of a single BEGIN/COMMIT transaction block will very likely break my whole process, however, because of the way I'm doing things.)  Turning off "fsync" in the config is an absolute requirement as otherwise it takes many times longer as the backend waits to fsync the transaction logs after each INSERT.

However, it gets interesting when I start doing query tests.  The query is a fairly complicated one with several joins, GROUP BYs and a couple of ORDER BY clauses that touch a number of the standard tables in the snort database.  (It summarizes all distinct signatures in the db, how many of each, what the sig_class is, how many distinct source and destination IPs for each, and a couple of other aggregate statistics against all ~60K records in the db.)  Tables are indexed identically in both databases.  "Single" means I run the query multiple times in serial and the time is an average of the time it takes to get a response over several runs; "parallel" means I run four simultaneous queries and the time is an average time it takes for EACH query to return a response over several runs.  (i.e. the 'parallel' time is not the SUM of four parallel queries, but the average amount of time it takes EACH separately run query to return a response.)
             

Avg Seconds to return
MySQL Single 9.49
MySQL Parallel 40.39
PgSQL Single 7.29
PgSQL Parallel 28.73

PostgreSQL actually beats MySQL in sheer performance with a fairly complex (but straightforward and certainly not uncommon) query.

Considering the comments on the list recently about how MySQL blows away anything else in performance, and considering MY previous experiences with both databases, I was surprised at the results.

When you also take into account the fact that PostgreSQL is a completely ACID-compliant database, and implements/sticks to ANSI SQL much more closely than MySQL, it's even more impressive.

Of course, none of this really means anything except for my specific implementation.  And the numbers would probably look significantly different if I had 600K records instead of just 60K.  But it's worth noting.