A
highly unscientific comparison of Snort database performance
At work, we're
building a system to aggregate snort sensor data into a central
database. I have a set of scripts that export from remote sensorts
and aggregate into a centralized database. I've been experimenting
to see what the "best" database to use for this central database. This
is specifically in relation to the snort stuff I'm working on at work,
but may be interesting to other snort users as well since I'm operating
on essentially the same db structure that snort uses to log directly to
a database. (I only skip a couple of tables in the export/import
process.)
The database
server is a 1.33Ghz Athlon with 1.5GB of RAM and four 120GB drives doing
hardware RAID0. MySQL and PostgreSQL have had their configs tuned
to the point that there is virtually no disk I/O during my highly
unscientific "benchmarking" test, except to occasionally flush dirty
buffers out to disk.
Updated
2003 Feb 19: I forgot to mention
that the database versions I'm using for each are MySQL 3.23.53 and
PostgreSQL 7.2.3. I haven't tried the 4.x series from MySQL or
7.3 series from PostgreSQL yet, although I've gotten a bit of feedback
on this page saying that with the latest versions of each, insert
performance has equalised as well. I may update with those at
some point in the future if I get a chance.
I have a dataset of
~60K IDS events in four separate custom "export" files that I import,
via my perl scripts, into the database. The perl script is
the same script in each case, and only minor runtime changes to the SQL
statements based on the database to which I'm importing. "Single"
means I import each file in serial; "Parallel" means I run four imports
simultaneously.
|
Records |
Seconds |
R/sec. |
| MySQL Single |
12391 |
200 |
61.96 |
|
23881 |
394 |
60.61 |
|
739 |
12 |
61.58 |
|
23328 |
370 |
63.05 |
| Sum/Avg. |
60339 |
976 |
61.82 |
|
| MySQL Parallel |
12391 |
601 |
20.62 |
|
23881 |
958 |
24.93 |
|
739 |
46 |
16.07 |
|
23328 |
934 |
24.98 |
| Sum/Avg. |
60339 |
2539 |
23.76 |
|
| PgSQL Single |
12391 |
258 |
48.03 |
|
23881 |
513 |
46.55 |
|
739 |
16 |
46.19 |
|
23328 |
499 |
46.75 |
| Sum/Avg. |
60339 |
1286 |
46.92 |
|
| PgSQL Parallel |
12391 |
808 |
15.34 |
|
23881 |
1303 |
18.33 |
|
739 |
64 |
11.55 |
|
23328 |
1270 |
18.37 |
| Sum/Avg. |
60339 |
3445 |
17.51 |
So MySQL is
about 50% faster at doing unoptimized INSERTs into the database either
in serial or in parallel. They both "degrade" almost identically
in records/second (%38 of 'single' speed for MySQL, %37 for PgSQL) when
doing parallel inserts.
PostgreSQL is
expected to be a little slower as it's doing full transactional
inserts. (Wrapping the entire import procedure inside of a single
BEGIN/COMMIT transaction block will very likely break my whole process,
however, because of the way I'm doing things.) Turning off "fsync"
in the config is an absolute requirement as otherwise it takes many
times longer as the backend waits to fsync the transaction logs after
each INSERT.
However, it gets
interesting when I start doing query tests. The query is a fairly
complicated one with several joins, GROUP BYs and a couple of ORDER BY
clauses that touch a number of the standard tables in the snort
database. (It summarizes all distinct signatures in the db, how
many of each, what the sig_class is, how many distinct source and
destination IPs for each, and a couple of other aggregate statistics
against all ~60K records in the db.) Tables are indexed
identically in both databases. "Single" means I run the query
multiple times in serial and the time is an average of the time it takes
to get a response over several runs; "parallel" means I run four
simultaneous queries and the time is an average time it takes for EACH
query to return a response over several runs. (i.e. the 'parallel'
time is not the SUM of four parallel queries, but the average amount of
time it takes EACH separately run query to return a response.)
|
Avg Seconds to return |
| MySQL Single |
9.49 |
| MySQL Parallel |
40.39 |
| PgSQL Single |
7.29 |
| PgSQL Parallel |
28.73 |
PostgreSQL
actually beats MySQL in sheer performance with a fairly complex (but
straightforward and certainly not uncommon) query.
Considering the
comments on the list recently about how MySQL blows away anything else
in performance, and considering MY previous experiences with both
databases, I was surprised at the results.
When you also
take into account the fact that PostgreSQL is a completely
ACID-compliant database, and implements/sticks to ANSI SQL much more
closely than MySQL, it's even more impressive.
Of course, none
of this really means anything except for my specific
implementation. And the numbers would probably look significantly
different if I had 600K records instead of just 60K. But it's
worth noting.