Real World mod_perl Performance Tuning

Ask Bjørn Hansen

Open Source Convention

San Diego, July 2002

Which one is faster?

 my @a = (qw(Lorem ipsum dolor sit amet consectetuer adipiscing elit
             sed diam nonummy nibh euismod tincidunt ut laoreet dolore
             magna aliquam erat volutpat), "\n");
 sub print_many {
   for (@a) { print $_ }
   return 1;
 }
 sub print_join {
   print join "", @a;
   return 1;
 }
 sub print_list {
   print @a;
   return 1;
 }

Measure it, they say

Use the Benchmark module to find out!

                Rate print_many print_list print_join
 print_many  78668/s         --       -60%       -76%
 print_list 194313/s       147%         --       -41%
 print_join 329448/s       319%        70%         --

print join "", @list is MUCH faster
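
A minimal sketch of how a comparison like this might be run with cmpthese from the Benchmark module. Printing to the null device is an assumption made here so that terminal speed does not dominate the timings; rates will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Spec;

my @a = (qw(Lorem ipsum dolor sit amet consectetuer adipiscing elit
            sed diam nonummy nibh euismod tincidunt ut laoreet dolore
            magna aliquam erat volutpat), "\n");

# Assumption: write to the null device instead of the terminal.
open my $out, '>', File::Spec->devnull
    or die "Cannot open null device: $!";

sub print_many { for (@a) { print {$out} $_ } return 1 }
sub print_join { print {$out} join "", @a; return 1 }
sub print_list { print {$out} @a; return 1 }

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds, then prints a comparison table.
cmpthese(-1, {
    print_many => \&print_many,
    print_join => \&print_join,
    print_list => \&print_list,
});
```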


Who cares?!

What is performance?

Show more pages

With less resource usage



Design the architecture right

Design the code right

Optimize the code

(in that order!)

How do we measure performance

Server performance:

 Requests per second:
     Concurrent requests
   / Response time
   = Requests per second

User performance:

  Response time:
     Time for document to start serving
   + Download time
   = Response time

Typical simple server setup

One server with one httpd installation running mod_perl

Typical `top` output

Ouch; the machine is being killed by swap!

 252 processes: 2 running, 245 sleeping, 5 zombie
 CPU states:  1.8% user,  0.0% nice, 13.2% system,  0.9% interrupt, 84.1% idle
 Mem: 396M Active, 50M Inact, 52M Wired, 1272K Cache, 61M Buf, 992K Free
 Swap: 2048M Total, 351M Used, 1697M Free, 12% Inuse, 416K In, 6812K Out
 88477 web    2   0 29508K 17720K select 0   0:02  0.00%  0.00% httpd
 88475 web    2   0 29492K 18104K select 1   0:02  0.00%  0.00% httpd
 88478 web    2   0 29508K 19352K select 0   0:02  0.00%  0.00% httpd
 88467 web   28   0 29516K  3144K pfault 1   0:02  0.00%  0.00% httpd
 88470 web    2   0 29500K 19436K select 0   0:02  0.00%  0.00% httpd
 88472 web   28   0 29500K  8088K pfault 0   0:02  0.00%  0.00% httpd
 88476 web    2   0 29512K  3320K select 0   0:02  0.00%  0.00% httpd
 88471 web    2   0 29484K 20364K select 0   0:02  0.00%  0.00% httpd
 88460 web    2   0 29504K 20300K select 0   0:02  0.00%  0.00% httpd
 88469 web    2   0 29508K  2836K select 1   0:02  0.00%  0.00% httpd
 88468 web    2   0 29504K 19320K select 1   0:02  0.00%  0.00% httpd
 88465 web    2   0 29500K  3232K select 1   0:02  0.00%  0.00% httpd
 88464 web    2   0 29500K 18636K select 1   0:02  0.00%  0.00% httpd
 88450 web   28   0 26696K 14476K pfault 0   0:02  0.00%  0.00% httpd
 88466 web   28   0 29500K 19200K pfault 1   0:02  0.00%  0.00% httpd
 88461 web   28   0 26656K 14180K pfault 0   0:02  0.00%  0.00% httpd
 88457 web   28   0 26976K 13796K pfault 0   0:02  0.00%  0.00% httpd
 88456 web   28   0 26912K  9936K pfault 0   0:02  0.00%  0.00% httpd
 88449 web   28   0 26856K 13540K pfault 0   0:02  0.00%  0.00% httpd
 88453 web   28   0 26944K 13776K pfault 1   0:01  0.00%  0.00% httpd
 88462 web   28   0 27008K 15948K pfault 0   0:01  0.00%  0.00% httpd
 88459 web   28   0 26952K 15328K pfault 1   0:01  0.00%  0.00% httpd
 88454 web   28   0 26952K 15448K pfault 1   0:01  0.00%  0.00% httpd
 88474 web   28   0 26632K  8612K pfault 1   0:01  0.00%  0.00% httpd

What went wrong?

Default httpd.conf

  MaxClients 150

N connections = N fat mod_perl processes

    15 connections per second
  * 4 seconds per request to
    write to the network
  = 60 active mod_perl processes
  + 20 spare processes for peaks
  = 80 mod_perl processes
  * (25MB - 15MB shared) = 10MB memory
    per process
  = 800MB memory usage

Mostly wasted on waiting for slow clients!
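
The same arithmetic as a small Perl sketch, so the numbers can be swapped for your own. The function and parameter names are illustrative only; the values are the ones from the slide.

```perl
use strict;
use warnings;

# Estimate the number of mod_perl processes and their unshared memory.
# All names here are hypothetical, chosen for this example.
sub estimate_memory_mb {
    my (%arg) = @_;
    # Processes tied up at any moment: arrival rate times time per request.
    my $busy      = $arg{connections_per_second} * $arg{seconds_per_request};
    my $processes = $busy + $arg{spare_processes};
    # Only the unshared part of each process costs extra memory.
    my $unshared  = $arg{process_size_mb} - $arg{shared_mb};
    return ($processes, $processes * $unshared);
}

my ($processes, $memory_mb) = estimate_memory_mb(
    connections_per_second => 15,
    seconds_per_request    => 4,
    spare_processes        => 20,
    process_size_mb        => 25,
    shared_mb              => 15,
);
printf "%d processes, about %d MB of unshared memory\n",
    $processes, $memory_mb;
```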

The easy fix, slim back

Set the max number of httpd processes

  MaxClients 15

in httpd.conf

  10MB * 15 processes = 150MB memory

No more swapping

Performance Check

The machine is not crashing

But it's just the same!

    15 processes available
    Each request takes *4* seconds to
    write to the network
    RPS (requests per second) is below 4! 

200ms or 500ms to generate a document doesn't matter much for server performance
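
A rough sketch of why generation time barely matters here. It assumes each process is tied up for the generation time plus the 4 seconds of network writing, and that all 15 processes are always busy; max_rps is a name made up for this example.

```perl
use strict;
use warnings;

# Upper bound on requests per second when every process is busy:
# number of processes divided by the time one request occupies a process.
sub max_rps {
    my ($processes, $generate_seconds, $network_seconds) = @_;
    return $processes / ($generate_seconds + $network_seconds);
}

printf "200ms generation: %.2f requests/second\n", max_rps(15, 0.2, 4);
printf "500ms generation: %.2f requests/second\n", max_rps(15, 0.5, 4);
```

Under these assumptions the ceiling moves only from about 3.57 to about 3.33 requests per second; the slow network write dominates.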

CPUs are almost 100% idle

Move images to a separate server

Set up images.perl.org and make all image references point there

The "static server" can serve all non-dynamic embedded content

Reverse Proxy architecture

Buffer everything through a proxy

Reverse Proxy

  • What will it do?

    Offload the buffering

    Serve static content


    Distribute requests to different backend servers

    All in a "slim" process!

Reverse Proxy, Apache 2.0

  • Apache 2.0

    Very simple to plug into your environment

     LoadModule proxy_module modules/mod_proxy.so
     LoadModule rewrite_module modules/mod_rewrite.so
     ProxyPreserveHost On
     RewriteEngine On 
     RewriteCond  %{REQUEST_URI}    !^/images/
     RewriteCond  %{REQUEST_URI}    !^/proxy-status
     RewriteRule (.*) http://localhost:8933$1 [P]
  • ProxyPreserveHost On

    Preserves the Host: header from the original request for seamless VirtualHosts

  • $r->connection->remote_ip

Reverse proxy, Apache 1.3

     RewriteEngine On 
     RewriteCond  %{REQUEST_URI}    !^/images/
     RewriteCond  %{REQUEST_URI}    !^/proxy-status
     RewriteRule (.*) http://localhost:8933$1 [P]
  • Pre-1.3.25

    To get $r->connection->remote_ip to work, install the mod_perl_add_forward.c module
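
    Behind a proxy the backend sees the proxy's address as the remote IP. A minimal sketch of the recovery logic, assuming the proxy passes the client address in an X-Forwarded-For header. The helper name client_ip and the trusted-proxy hash are hypothetical, this is plain Perl rather than the module's actual code, and it only handles IPv4 addresses.

```perl
use strict;
use warnings;

# Hypothetical helper: decide which address to treat as the client.
#   $peer_ip  address of the TCP peer (normally the proxy)
#   $xff      raw X-Forwarded-For header value, may be undef
#   $trusted  hash ref of proxy addresses whose header we believe
sub client_ip {
    my ($peer_ip, $xff, $trusted) = @_;

    # Never believe the header unless the request came from our own proxy.
    return $peer_ip unless $trusted->{$peer_ip} && defined $xff;

    # The header is a comma separated list; the last entry was added
    # by the proxy closest to us.
    my ($last) = $xff =~ /(\d{1,3}(?:\.\d{1,3}){3})\s*$/;
    return defined $last ? $last : $peer_ip;
}

my %trusted = ('127.0.0.1' => 1);
print client_ip('127.0.0.1', '10.0.0.5, 192.0.2.7', \%trusted), "\n";
```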


  • VirtualHosts

    VirtualHosts will not work out of the box

    Use different ports

VirtualHosts with Apache 1.3 proxy

  • Proxy Setup
     <VirtualHost *>
       ServerName dev.perl.org
       RewriteRule ^/(.*) http://localhost:8932/$1 [P,L]
     </VirtualHost>
     <VirtualHost *>
       ServerName develooper.com
       ServerAlias www.develooper.dk
       RewriteRule ^/(.*) http://localhost:8933/$1 [P,L]
     </VirtualHost>
  • mod_perl setup
     Port 80
     Listen 8932
     Listen 8933
     <VirtualHost *:8932>
       ServerName dev.perl.org
     </VirtualHost>
     <VirtualHost *:8933>
       ServerName develooper.com
     </VirtualHost>

VirtualHost Proxying

Use different backend servers for different "virtual hosts"

images.perl.org served by proxy

URI differentiation

Use different backend servers for different urls

Can be physically different boxes

URI differentiation, examples

Use different backend servers for different urls

 RewriteRule ^/(foo.*) http://localhost:8933/$1 [P,L]
 RewriteRule ^/(.*)    http://localhost:8934/$1 [P,L]

Or different filetypes

 RewriteRule ^/(.*\.tt)$ http://localhost:8933/$1 [P,L]
 RewriteRule ^/(.*)      http://localhost:8934/$1 [P,L]

Proxy to different backend servers

 RewriteRule ^/(.*\.asp) http://win32box/$1 [P,L]
 RewriteRule ^/(.*)$     http://modperl/$1 [P,L]
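
Rule order matters: the first matching rule wins, so the catch-all has to come last. A small Perl sketch that mimics this first-match behaviour for the last pair of rules. It illustrates the matching order only, not how mod_rewrite is implemented; backend_url is a name invented for this example.

```perl
use strict;
use warnings;

# Same patterns and backends as the two rules above, in the same order.
my @routes = (
    [ qr{^/(.*\.asp)} => 'http://win32box/' ],
    [ qr{^/(.*)$}     => 'http://modperl/'  ],
);

# Return the backend URL for a request path, or nothing if no rule matches.
sub backend_url {
    my ($path) = @_;
    for my $route (@routes) {
        my ($pattern, $base) = @$route;
        return $base . $1 if $path =~ $pattern;
    }
    return;
}

print backend_url('/shop/cart.asp'), "\n";
print backend_url('/index.html'), "\n";
```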

Each backend can be run by a different user

group by


Whatever fits

Other proxies and "buffers"

  • Squid

    Good caching engine

    Not as flexible for "reverse proxying" as Apache

  • mod_backhand

    Load balancing proxy module

  • lingerd

    tcp/ip "tweak"

Performance revisited

Handle many front-end requests with little memory usage

mod_perl processes get freed up as fast as possible

Memory issue is eliminated

Requests per second are comparable to load testing on a local LAN

Basic httpd.conf tuning

  MaxRequestsPerChild  2000
  MinSpareServers         1
  MaxSpareServers        10
  MaxClients             10
  StartServers           10

Always keep at least one server ready for a new request

Allow up to 10 idle servers

Allow up to 10 servers

Start 10 servers at startup time

Let each server handle 2000 requests before it gets "respawned"

More proxy configurations

proxy and mod_perl instances sharing each physical box

Even more proxy configurations


mod_status

Indispensable for monitoring how many processes are running and what they are doing

 ExtendedStatus On
 <Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from
 </Location>
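
The machine-readable variant, /server-status?auto, is handy for monitoring scripts. A minimal sketch of a parser for that "Key: value" text follows. The sample input is written by hand for this example (field names can differ between Apache versions), and parse_status is a hypothetical helper, not part of mod_status.

```perl
use strict;
use warnings;

# Parse "Key: value" lines and count the scoreboard slot states.
sub parse_status {
    my ($text) = @_;
    my %status;
    for my $line (split /\n/, $text) {
        my ($key, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $status{$key} = $value;
    }
    my $board = defined $status{Scoreboard} ? $status{Scoreboard} : '';
    my %slots;
    $slots{$_}++ for split //, $board;
    return (\%status, \%slots);
}

# Hand-written sample, assumed to resemble Apache 1.3 output.
my $sample = <<'END';
Total Accesses: 83892
BusyServers: 1
IdleServers: 4
Scoreboard: ____W...
END

my ($status, $slots) = parse_status($sample);
printf "busy=%d idle=%d waiting=%d sending=%d\n",
    $status->{BusyServers}, $status->{IdleServers},
    $slots->{'_'}, $slots->{'W'};
```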

mod_status example, 1.3 mod_perl

 Apache Server Status for dev.perl.org
 Server Version: Apache/1.3.26 (Unix) mod_perl/1.27
 Server Built: Jun 19 2002 19:11:10
 Current Time: Monday, 22-Jul-2002 17:35:16 PDT
 Restart Time: Friday, 19-Jul-2002 02:58:56 PDT
 Parent Server Generation: 0
 Server uptime: 3 days 14 hours 36 minutes 20 seconds
 Total accesses: 83892 - Total Traffic: 1.1 GB
 CPU Usage: u53.77 s5.29 cu.31 cs.02 - .019% CPU load
 .269 requests/sec - 3650 B/second - 13.3 kB/request
 1 requests currently being processed, 4 idle servers
 Scoreboard Key:
 "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
 "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
 "L" Logging, "G" Gracefully finishing, "." Open slot with no current process
 Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request
 0-0 30341 0/120/16120 _ 1.90 236 1 0.0 1.29 201.31 dev.perl.org GET /img/perldbi.s.gif HTTP/1.1
 1-0 30044 0/497/15497 _ 8.58 1604 1 0.0 5.41 229.29 dev.perl.org GET /server-status?auto HTTP/1.1
 2-0 30077 0/465/15465 _ 8.34 867 11 0.0 4.89 192.49 dev.perl.org GET /doc/index.html HTTP/1.1
 3-0 30250 0/291/14791 _ 5.26 36 1 0.0 2.86 169.49 dev.perl.org GET /img/BKG.jpg HTTP/1.1
 4-0 30432 0/28/12528 W 0.77 172 0 0.0 0.53 148.05 dev.perl.org GET /server-status HTTP/1.1
 5-0 - 0/0/8000 . 8.13 15011 14 0.0 0.00 110.99 dev.perl.org GET / HTTP/1.1
 6-0 - 0/0/1025 . 13.90 133226 38 0.0 0.00 23.85 dev.perl.org GET /perl5/news/2002/07/18/580ann/perldelta.pod HTTP/1.1
 7-0 - 0/0/466 . 12.51 133700 11983 0.0 0.00 10.05 dev.perl.org GET /perl5/news/2002/07/18/580ann/perldelta.html HTTP/1.1

mod_status example, 2.0 proxy

 Apache Server Status for x2.develooper.com
 Server Version: Apache/2.0.39 (Unix) mod_ssl/2.0.39 OpenSSL/0.9.6b
 Server Built: Jun 19 2002 19:00:38 
 Current Time: Tuesday, 23-Jul-2002 01:00:10 PDT
 Restart Time: Friday, 19-Jul-2002 06:58:29 PDT
 Parent Server Generation: 5
 Server uptime: 3 days 18 hours 1 minute 41 seconds
 Total accesses: 674608 - Total Traffic: 59.2 GB
 CPU Usage: u1434.86 s849.12 cu0 cs0 - .705% CPU load
 2.08 requests/sec - 191.5 kB/second - 92.0 kB/request
 17 requests currently being processed, 103 idle workers
 Scoreboard Key:
 "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
 "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
 "C" Closing connection, "L" Logging, "G" Gracefully finishing,
 "I" Idle cleanup of worker, "." Open slot with no current process
 Srv	PID	Acc	M	CPU 	SS	Req	Conn	Child	Slot	Client	VHost	Request
 0-5	28288	0/1626/2663	_ 	5.97	9	0	0.0	84.85	136.35	cpan.develooper.com	GET /icons/compressed.gif HTTP/1.0
 0-5	28288	0/1653/2695	_ 	5.86	60	0	0.0	124.06	197.53	cpan.develooper.com	GET /modules/by-module/Parse/Parse-Nibbler-1.10.readme HTTP/1.0
 0-5	28288	0/1470/2446	_ 	5.53	50	0	0.0	92.81	162.17	www.perl.org	GET /Images/download_perl.gif HTTP/1.1
 0-5	28288	0/1815/2917	_ 	6.12	51	0	0.0	157.85	323.43	cpan.develooper.com	GET /modules/ HTTP/1.1
 0-5	28288	0/1576/2775	_ 	5.94	16	0	0.0	122.98	220.87	www.perl.org	GET / HTTP/1.1
 0-5	28288	0/1820/3003	_ 	6.05	18	0	0.0	111.77	182.87	cpan.develooper.com	GET /modules/by-module/Parse/Parse-PerlConfig-0.04.readme HTTP/
 0-5	28288	1/1714/2779	W 	5.34	0	0	0.0	88.39	161.98	x2.develooper.com	GET /server-status HTTP/1.1
 0-5	28288	0/1623/2717	_ 	5.80	10	0	0.0	90.88	130.99	cpan.develooper.com	GET /misc/gif/funet.gif HTTP/1.1
 0-5	28288	0/1557/2892	_ 	5.70	40	0	0.0	128.58	217.49	x2.develooper.com	GET /server-status HTTP/1.1
 0-5	28288	0/1731/2712	_ 	5.85	22	0	0.0	105.78	206.22	x2.develooper.com	GET /server-status HTTP/1.1
 0-5	28288	0/1663/2632	_ 	5.09	54	0	0.0	125.66	240.82	dev.perl.org	GET /perl5/docs/perlhack.html HTTP/1.0
 0-5	28288	0/1524/2579	_ 	5.72	56	0	0.0	148.84	188.10	cpan.develooper.com	GET /misc/gif/funet.gif HTTP/1.0
 0-5	28288	3/1803/3109	K 	5.62	5	0	18.7	134.60	181.67	dev.perl.org	GET /perl5/news/2002/07/18/580ann/ HTTP/1.1
 0-5	28288	0/1805/2909	_ 	5.72	40	0	0.0	77.25	199.43	cpan.develooper.com	GET /authors/id/M/MS/MSERGEANT/DBIx-AnyDBD-2.00.tar.gz HTTP/1.0
 0-5	28288	0/1659/2754	_ 	4.33	68	0	0.0	71.24	122.38	x2.develooper.com	GET /server-status?auto HTTP/1.0
 0-5	28288	0/1605/2645	_ 	6.12	49	0	0.0	119.42	231.39	cpan.develooper.com	GET /misc/gif/funet.gif HTTP/1.1
 0-5	28288	0/1480/2531	_ 	6.25	38	0	0.0	133.94	187.10	cpan.develooper.com	GET /misc/gif/valid-xhtml10.gif HTTP/1.0
 0-5	28288	0/2017/3058	_ 	6.99	30	0	0.0	166.20	238.06	cpan.develooper.com	GET /authors/id/GAAS/URI-1.19.tar.gz HTTP/1.1
 0-5	28288	0/1563/2556	_ 	5.68	3	0	0.0	108.25	164.02	dev.perl.org	GET /perl5/news/2002/07/18/580ann/ HTTP/1.0
 0-5	28288	0/1680/2906	_ 	5.68	69	0	0.0	95.72	197.73	cpan.develooper.com	GET /misc/gif/funet.gif HTTP/1.1
 0-5	28288	0/1538/2660	_ 	5.85	10	0	0.0	96.27	168.83	cpan.develooper.com	GET /misc/gif/valid-xhtml10.gif HTTP/1.1
 0-5	28288	0/1603/2774	_ 	6.35	38	0	0.0	96.72	253.66	cpan.develooper.com	GET / HTTP/1.0
 0-5	28288	0/1643/2624	_ 	5.12	9	0	0.0	117.58	239.07	cpan.develooper.com	GET /icons/unknown.gif HTTP/1.0
 0-5	28288	0/1708/2709	_ 	5.48	16	0	0.0	173.60	290.92	cpan.develooper.com	GET / HTTP/1.0
 0-5	28288	0/1462/2505	_ 	5.05	34	0	0.0	94.49	157.70	cpan.develooper.com	GET /authors/Bob_Dalgleish/ HTTP/1.0
 0-5	28288	0/1339/2493	_ 	5.87	6	0	0.0	100.99	172.34	dev.perl.org	GET /server-status?auto HTTP/1.0
 0-5	28288	0/1511/2818	_ 	6.38	2	0	0.0	123.77	210.89	www.perl.org	GET /Images/download_perl.gif HTTP/1.1
 0-5	28288	0/1596/2805	_ 	5.19	24	0	0.0	104.00	226.04	www.perl.org	HEAD / HTTP/1.0
 0-5	28288	1/1561/2629	K 	4.50	13	0	2.1	109.09	156.80	www.perl.org	GET /Images/download_perl.gif HTTP/1.1
 0-5	28288	0/1637/2732	_ 	5.46	48	0	0.0	123.82	204.87	cpan.deve


  • Browsers/end user proxies can cache from servers

    Set the right headers, Expires, Content-Length, Last-Modified

  • Reverse proxies

    You set the rules

    Only when complete documents can be cached

  • Application can cache from other parts of the system (eg database)

    Cache database queries

    Temporary denormalized tables

    Cache remote data

    Complicated computations

HTTP Headers - Cache control

Make caches expire the document in 48 hours

 $r->header_out('Expires',
                 HTTP::Date::time2str(time + 60*60*24*2));

Modern caches will honor this

 $r->header_out('Cache-Control', "max-age=" . 60*60*24*2);

Don't cache

 $r->header_out('Cache-Control', "no-cache, private");
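
For reference, a sketch of what the two caching headers look like when built by hand with core Perl only. http_date is a stand-in written for this example that formats a timestamp the way HTTP expects; in real code HTTP::Date::time2str does this job.

```perl
use strict;
use warnings;

my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format an epoch time as an HTTP date, always in GMT.
sub http_date {
    my ($epoch) = @_;
    my ($s, $m, $h, $mday, $mon, $year, $wday) = gmtime $epoch;
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAY[$wday], $mday, $MON[$mon], $year + 1900, $h, $m, $s;
}

my $ttl = 60*60*24*2;    # 48 hours in seconds
my %headers = (
    'Expires'       => http_date(time + $ttl),
    'Cache-Control' => "max-age=$ttl",
);
print "$_: $headers{$_}\n" for sort keys %headers;
```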

Conditional GET requests

  • Server sends with the document response
     Last-Modified: Mon, 24 Jun 2002 09:07:15 GMT
  • Browser or cache sends with later request for the same document
     If-Modified-Since: Mon, 24 Jun 2002 09:07:15 GMT
  • Server checks something like the following
     $r->header_out("Last-Modified", ...);
     if ((my $rc = $r->meets_conditions) != OK) {
         return $rc;
     }

    If document hasn't been modified since last time, return

     HTTP/1.1 304 Not Modified

    Otherwise generate the document as usual
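
    The decision itself is simple. A standalone sketch of the If-Modified-Since check in plain Perl follows, as an illustration of the logic rather than what meets_conditions does internally. Both function names are made up for this example, and only the common date format shown above is parsed.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MON;
@MON{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Parse a date such as "Mon, 24 Jun 2002 09:07:15 GMT" into epoch seconds.
# Returns undef for anything it does not understand.
sub parse_http_date {
    my ($str) = @_;
    return undef unless defined $str;
    my ($mday, $mon, $year, $h, $m, $s) =
        $str =~ /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/
        or return undef;
    return undef unless exists $MON{$mon};
    # timegm dies on out-of-range values, so trap that and return undef.
    return eval { timegm($s, $m, $h, $mday, $MON{$mon}, $year) };
}

# 304 if the document is unchanged since the date the client sent, else 200.
sub response_status {
    my ($mtime, $if_modified_since) = @_;
    my $since = parse_http_date($if_modified_since);
    return 304 if defined $since && $mtime <= $since;
    return 200;
}

my $header = 'Mon, 24 Jun 2002 09:07:15 GMT';
my $mtime  = parse_http_date($header);
print response_status($mtime, $header), "\n";        # unchanged document
print response_status($mtime + 60, $header), "\n";   # modified a minute later
```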

HTTP headers; content

Always include the content_type header


Caching and Keep-Alive require Content-Length

  $r->header_out('Content-Length', $length);
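
One thing to watch: Content-Length counts bytes, not characters. A short sketch, assuming the document is sent as UTF-8:

```perl
use strict;
use warnings;
use Encode qw(encode);

# The "\x{f8}" escape is the letter o with a stroke, one character
# that takes two bytes in UTF-8.
my $body  = "Ask Bj\x{f8}rn Hansen\n";
my $bytes = encode('UTF-8', $body);

printf "characters: %d, bytes for Content-Length: %d\n",
    length($body), length($bytes);
```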



Caching example

 $ ab -c 3 -n 100 http://nntp.x.perl.org/group/

Does a slow request to a remote server to fetch data

 Time taken for tests:   10.12586 seconds
 Requests per second:    9.99 [#/sec] (mean)
 Time per request:       300.378 [ms] (mean)
 Time per request:       100.126 [ms] (mean, across all concurrent requests)
 Transfer rate:          75.90 [Kbytes/sec] received

Add Mason caching

  $m->cache_self(expire_in => "12h");
 Time taken for tests:   3.978180 seconds
 Requests per second:    25.14 [#/sec] (mean)
 Time per request:       119.345 [ms] (mean)
 Time per request:       39.782 [ms] (mean, across all concurrent requests)
 Transfer rate:          192.30 [Kbytes/sec] received

Reverse proxy caching

Cache the whole document. Really really efficient.

  • Apache 2.0

    Flexible storage modules

    Make your own backend storage module (in C)

    Disk and memory implemented

    Marked experimental, and it is!

     LoadModule disk_cache_module modules/mod_disk_cache.so
     <VirtualHost *>
       ServerName nntp.x.perl.org
       # mod_cache settings
       CacheOn On
       CacheIgnoreCacheControl on
       CacheIgnoreNoLastMod on
       CacheDefaultExpire 43200
       # mod_disk_cache settings
       CacheRoot /home/web/front/cache/
       CacheSize 50000
       CacheEnable disk /
       CacheDirLevels 5
       CacheDirLength 3
       RewriteEngine On
       RewriteRule (.*) http://localhost:8222$1 [P]
     </VirtualHost>

Full HTTP caching

  • Apache 2.0, ... continued

    If it worked, a benchmark would show several hundreds of requests per second.

    Works without the Content-Length header

    Only the proxy will log cached requests

    Last-Modified and Expires headers control the expiry time


Databases are hard(er) to scale

Reverse proxy minimizes the need for many concurrent database connections

Apache::DBI minimizes the number of new connections made

Summary tables can make the lookups faster

Use faster databases for caching (MySQL as cache for Oracle)

Application caching

  • Lookup tables

    Frequent "name" to "id" lookups

    If the list is small, cache them in memory

     my $cache;
     my $cache_refreshed = 0;
     sub id_to_data {
       my $id = shift;
       return $cache->{$id}
         if ($cache and $cache_refreshed > time-600);
       my $dbh = db_open;
       $cache = $dbh->selectall_hashref('select * from table', 'id');
       $cache_refreshed = time;
       return $cache->{$id};
     }


  • Cache::Cache

    Really easy to use.

    Caching "infrastructure"

       use Cache::FileCache;
       my $cache = Cache::FileCache->new;
       my $customer = $cache->get($name);
       unless ($customer) {
         $customer = get_customer_from_db( $name );
         $cache->set( $name, $customer, "10 minutes" );
       }
       return $customer;

Application caching

  • When and what to cache!

    Be sure to benchmark that it's a good idea first!

    Measure cache hit ratios

  • DBI::Profile

    Gives you list of statements and methods

    How often and how long they take
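
    One way to measure a cache hit ratio is to count hits and misses in the cache wrapper itself. A minimal in-memory sketch: the CountingCache class is hypothetical, written for this example, and the loader stands in for a real database query.

```perl
use strict;
use warnings;

{
    package CountingCache;

    sub new {
        my ($class, %arg) = @_;
        return bless {
            ttl => $arg{ttl} || 600, data => {}, hits => 0, misses => 0,
        }, $class;
    }

    # Return the cached value for $key; call $loader->($key) on a miss
    # or when the cached entry has expired.
    sub fetch {
        my ($self, $key, $loader) = @_;
        my $entry = $self->{data}{$key};
        if ($entry && $entry->{expires} > time) {
            $self->{hits}++;
            return $entry->{value};
        }
        $self->{misses}++;
        my $value = $loader->($key);
        $self->{data}{$key} = { value => $value, expires => time + $self->{ttl} };
        return $value;
    }

    sub hit_ratio {
        my ($self) = @_;
        my $total = $self->{hits} + $self->{misses};
        return $total ? $self->{hits} / $total : 0;
    }
}

my $cache    = CountingCache->new(ttl => 600);
my $db_calls = 0;
my $loader   = sub { $db_calls++; return "customer:$_[0]" };  # fake database lookup

$cache->fetch($_, $loader) for qw(ask ask ask bob);
printf "hit ratio %.2f after %d database calls\n", $cache->hit_ratio, $db_calls;
```

    A low ratio on real traffic suggests the cache costs more than it saves.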

Component caching

Divide pages into smaller fragments and cache each fragment


Sometimes use regular expressions to hack components into "super components"

  $output = $cache->get('foo');
  $dynamic_stuff = get_dynamic_data($r);
  $output =~ s{<!-- DYNAMIC_DATA_HERE -->}{$dynamic_stuff};

Caching summary

Caching is good

Multiple levels of caching better

Be careful not to spend more time managing the cache than you save caching

Caching minimizes the number of database lookups

Benchmarking and Profiling

Do your own!

  • Perl tools

    Benchmark.pm - Benchmark snippets

    Apache::DProf & Devel::DProf - Profile code

    DBI::Profile - Profile database usage

  • HTTP testing

    Run them on a separate box!

    ab - really simple http tester

    http_load - slightly more advanced, list of urls

    Apache Flood - advanced profile driven http tester

Load balancing

  • Why

    Divide load between servers

    Redundancy between servers

  • How

    "Hardware" load balancers

    LVS (Linux Virtual Server, a Linux software solution)

Apache 2

No real world experience yet

(Can be) multithreaded

Run more http threads than "perl threads"

Don't have to use the proxy setup. Maybe.

Lots of stuff doesn't work with threads yet

What was the point again?

Architecture more important than code

Architecture much more important than code

Some architecture improvements are really easy to implement

Premature optimization is a waste of time

Lots of optimizations are a waste of time



Thank you for listening




the mod_perl list