Ask Bjørn Hansen
Open Source Convention
San Diego, July 2002
my @a = (qw(Lorem ipsum dolor sit amet consectetuer adipiscing elit sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat), "\n");
sub print_many { for (@a) { print $_ }; return 1; }
sub print_join { print join "", @a; return 1; }
sub print_list { print @a; return 1; }
Use the Benchmark module to find out!
               Rate print_many print_list print_join
print_many  78668/s         --       -60%       -76%
print_list 194313/s       147%         --       -41%
print_join 329448/s       319%        70%         --
print join "", @list is MUCH faster
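A minimal sketch of how such a comparison could be run with the core Benchmark module. The shortened word list and the use of the null device as the output handle are assumptions made here so the script stands alone; the timings it prints will differ from the table above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Spec;

my @a = (qw(Lorem ipsum dolor sit amet consectetuer adipiscing elit), "\n");

# Assumption: write to the null device so terminal speed does not
# dominate the timings.
open(my $out, '>', File::Spec->devnull)
    or die "Cannot open null device: $!";

sub print_many { for (@a) { print {$out} $_ } return 1 }
sub print_join { print {$out} join "", @a; return 1 }
sub print_list { print {$out} @a; return 1 }

# A negative count means "run each sub for at least that many CPU seconds".
# cmpthese prints a comparison chart like the one on the slide.
cmpthese(-1, {
    print_many => \&print_many,
    print_join => \&print_join,
    print_list => \&print_list,
});
```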
Boring!
Who cares?!
Show more pages
With less resource usage
Faster
Design the architecture right
Design the code right
Optimize the code
(in that order!)
Server performance:
Requests per second: Concurrent requests / Response time = Requests per second
User performance:
Response time: Time for document to start serving + Download time = Response time
One server with one httpd installation running mod_perl
Ouch; the machine is being killed by swap!
252 processes: 2 running, 245 sleeping, 5 zombie
CPU states: 1.8% user, 0.0% nice, 13.2% system, 0.9% interrupt, 84.1% idle
Mem: 396M Active, 50M Inact, 52M Wired, 1272K Cache, 61M Buf, 992K Free
Swap: 2048M Total, 351M Used, 1697M Free, 12% Inuse, 416K In, 6812K Out
  PID USER PRI NICE   SIZE    RES STATE  C TIME  WCPU   CPU COMMAND
88477 web    2    0 29508K 17720K select 0 0:02 0.00% 0.00% httpd
88475 web    2    0 29492K 18104K select 1 0:02 0.00% 0.00% httpd
88478 web    2    0 29508K 19352K select 0 0:02 0.00% 0.00% httpd
88467 web   28    0 29516K  3144K pfault 1 0:02 0.00% 0.00% httpd
88470 web    2    0 29500K 19436K select 0 0:02 0.00% 0.00% httpd
88472 web   28    0 29500K  8088K pfault 0 0:02 0.00% 0.00% httpd
88476 web    2    0 29512K  3320K select 0 0:02 0.00% 0.00% httpd
88471 web    2    0 29484K 20364K select 0 0:02 0.00% 0.00% httpd
88460 web    2    0 29504K 20300K select 0 0:02 0.00% 0.00% httpd
88469 web    2    0 29508K  2836K select 1 0:02 0.00% 0.00% httpd
88468 web    2    0 29504K 19320K select 1 0:02 0.00% 0.00% httpd
88465 web    2    0 29500K  3232K select 1 0:02 0.00% 0.00% httpd
88464 web    2    0 29500K 18636K select 1 0:02 0.00% 0.00% httpd
88450 web   28    0 26696K 14476K pfault 0 0:02 0.00% 0.00% httpd
88466 web   28    0 29500K 19200K pfault 1 0:02 0.00% 0.00% httpd
88461 web   28    0 26656K 14180K pfault 0 0:02 0.00% 0.00% httpd
88457 web   28    0 26976K 13796K pfault 0 0:02 0.00% 0.00% httpd
88456 web   28    0 26912K  9936K pfault 0 0:02 0.00% 0.00% httpd
88449 web   28    0 26856K 13540K pfault 0 0:02 0.00% 0.00% httpd
88453 web   28    0 26944K 13776K pfault 1 0:01 0.00% 0.00% httpd
88462 web   28    0 27008K 15948K pfault 0 0:01 0.00% 0.00% httpd
88459 web   28    0 26952K 15328K pfault 1 0:01 0.00% 0.00% httpd
88454 web   28    0 26952K 15448K pfault 1 0:01 0.00% 0.00% httpd
88474 web   28    0 26632K  8612K pfault 1 0:01 0.00% 0.00% httpd
Default httpd.conf
MaxClients 150
N connections = N fat mod_perl processes
15 connections per second
* 4 seconds for each request to write to the network
= 60 active mod_perl processes
+ 20 spare processes for peaks
= 80 mod_perl processes
* 10MB memory per process (25MB - 15MB shared)
= 800MB memory usage
Mostly wasted on waiting for slow clients!
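The arithmetic above can be sketched as a small function. The function name and argument names are hypothetical; the input figures are the ones from the slide.

```perl
use strict;
use warnings;

# Estimate how many mod_perl processes are tied up and how much
# unshared memory they need. Names are illustrative, not from any module.
sub estimate_memory {
    my (%arg) = @_;
    my $busy  = $arg{requests_per_second} * $arg{seconds_per_request};
    my $total = $busy + $arg{spare_processes};
    my $per_process_mb = $arg{process_size_mb} - $arg{shared_mb};
    return {
        busy_processes  => $busy,
        total_processes => $total,
        memory_mb       => $total * $per_process_mb,
    };
}

my $estimate = estimate_memory(
    requests_per_second => 15,
    seconds_per_request => 4,
    spare_processes     => 20,
    process_size_mb     => 25,
    shared_mb           => 15,
);

printf "%d busy, %d total processes, about %dMB\n",
    @{$estimate}{qw(busy_processes total_processes memory_mb)};
```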
Set the max number of httpd processes
MaxClients 15
in httpd.conf
10MB * 20 processes = 200MB memory
No more swapping
The machine is not crashing
But it's just the same!
15 processes available
Each request takes *4* seconds to write to the network
RPS (requests per second) is below 4!
200ms or 500ms to generate a document doesn't matter much for server performance
CPUs are almost 100% idle
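A rough sketch of why generation time hardly matters here. It assumes that each request holds one process for its generation time plus the 4 seconds spent writing to the client; the slide may have meant the 4 seconds to include generation, so treat the exact figures as illustrative.

```perl
use strict;
use warnings;

# Upper bound on throughput when every request occupies one process
# for generation time plus network write time (an assumed model).
sub max_requests_per_second {
    my ($processes, $generate_seconds, $write_seconds) = @_;
    return $processes / ($generate_seconds + $write_seconds);
}

for my $generate (0.2, 0.5) {
    printf "generate in %.1fs: at most %.2f requests/second\n",
        $generate, max_requests_per_second(15, $generate, 4);
}
```

Under this model, cutting generation time from 500ms to 200ms changes the ceiling by only a fraction of a request per second, because the slow network write dominates.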
Set up images.perl.org and point all image references there
The "static server" can serve all non-dynamic embedded content
Buffer everything through a proxy
Very simple to plug into your environment
LoadModule proxy_module modules/mod_proxy.so
LoadModule rewrite_module modules/mod_rewrite.so
ProxyPreserveHost On
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/images/
RewriteCond %{REQUEST_URI} !^/proxy-status
RewriteRule (.*) http://localhost:8933$1 [P]
Preserves the Host: header from the original request for seamless VirtualHosts
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/images/
RewriteCond %{REQUEST_URI} !^/proxy-status
RewriteRule (.*) http://localhost:8933$1 [P]
To get $r->connection->remote_ip to work, install the mod_proxy_add_forward.c module on the proxy
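A sketch of the backend side, assuming the front-end proxy adds an X-Forwarded-For header to each request. The helper name is hypothetical. It picks the last address in the header, which is the one appended by the nearest proxy; earlier entries come from the client side and are easier to forge.

```perl
use strict;
use warnings;

# Return the address appended by the nearest proxy, i.e. the last entry
# of a comma-separated X-Forwarded-For value, or undef if there is none.
sub client_ip_from_forwarded {
    my ($header) = @_;
    return undef unless defined $header;

    my @hops;
    for my $hop (split /,/, $header) {
        $hop =~ s/^\s+//;
        $hop =~ s/\s+$//;
        push @hops, $hop if length $hop;
    }
    return @hops ? $hops[-1] : undef;
}

print client_ip_from_forwarded('10.0.0.1, 64.81.84.162'), "\n";
```

In a mod_perl 1 handler the value would typically come from `$r->header_in('X-Forwarded-For')` and be written back with `$r->connection->remote_ip($ip)`, ideally only when the request really arrived from your own proxy.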
VirtualHosts will not work out of the box
Use different ports
<VirtualHost *>
  ServerName dev.perl.org
  RewriteRule ^/(.*) http://localhost:8932/$1 [P,L]
</VirtualHost>
<VirtualHost *>
  ServerName develooper.com
  ServerAlias www.develooper.dk
  RewriteRule ^/(.*) http://localhost:8933/$1 [P,L]
</VirtualHost>
<VirtualHost *:8932>
  ServerName dev.perl.org
  ...
</VirtualHost>
<VirtualHost *:8933>
  ServerName develooper.com
  ...
</VirtualHost>
Use different backend servers for different "virtual hosts"
images.perl.org served by proxy
Use different backend servers for different URLs
Can be physically different boxes
Use different backend servers for different URLs
RewriteRule ^/(foo.*) http://localhost:8933/$1 [P,L]
RewriteRule ^/(.*) http://localhost:8934/$1 [P,L]
Or different filetypes
RewriteRule ^/(.*\.tt)$ http://localhost:8933/$1 [P,L]
RewriteRule ^/(.*) http://localhost:8934/$1 [P,L]
Proxy to different backend servers
RewriteRule ^/(.*\.asp) http://win32box/$1 [P,L]
RewriteRule ^/(.*)$ http://modperl/$1 [P,L]
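The rules above are tried in order and the first match wins, so the catch-all has to come last. A small Perl model of that dispatch, using the same two patterns (the function and variable names are hypothetical, and this is not how mod_rewrite is implemented):

```perl
use strict;
use warnings;

# Ordered list of [pattern, backend]; mirrors the RewriteRule order.
my @routes = (
    [ qr{^/(.*\.asp)}, 'http://win32box/' ],
    [ qr{^/(.*)$},     'http://modperl/'  ],
);

# Return the backend URL for a request path, or nothing if no rule matches.
sub backend_url {
    my ($path) = @_;
    for my $route (@routes) {
        my ($pattern, $backend) = @$route;
        return $backend . $1 if $path =~ $pattern;
    }
    return;
}

print backend_url('/shop/cart.asp'), "\n";
print backend_url('/index.html'), "\n";
```

Swapping the two entries would send every request to the mod_perl backend, which is the usual mistake with catch-all rules.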
Each backend can be run by a different user
Group by:
functionality
developers
customers
sites
Whatever fits
Handle many front-end requests with little memory usage
mod_perl processes get freed up as fast as possible
Memory issue is eliminated
Requests per second become comparable to load testing on a local LAN
MaxRequestsPerChild 2000
MinSpareServers 1
MaxSpareServers 10
MaxClients 10
StartServers 10
Always keep at least one server ready for a new request
Allow up to 10 idle servers
Allow up to 10 servers
Start 10 servers at startup time
Let each server handle 2000 requests before it gets "respawned"
proxy and mod_perl instances sharing each physical box
Indispensable for monitoring how many processes are running and what they are doing
ExtendedStatus On
<Location /server-status>
  SetHandler server-status
  Order deny,allow
  Deny from all
  Allow from 1.2.3.5
</Location>
http://dev.perl.org/server-status
http://dev.perl.org/server-status?auto
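The ?auto form returns plain "Key: value" lines that are easy to feed into monitoring scripts. A minimal parsing sketch follows; the sample text is made up for illustration using figures from this talk, and the field names (BusyServers, IdleServers, Scoreboard) are an assumption that varies between Apache versions.

```perl
use strict;
use warnings;

# Parse "Key: value" lines as produced by /server-status?auto.
sub parse_status {
    my ($text) = @_;
    my %status;
    for my $line (split /\r?\n/, $text) {
        next unless $line =~ /^([^:]+):\s*(.*)$/;
        $status{$1} = $2;
    }
    return \%status;
}

# Count scoreboard slots by state ("_" idle, "W" sending reply, "." open slot).
sub scoreboard_counts {
    my ($scoreboard) = @_;
    my %count;
    $count{$_}++ for split //, $scoreboard;
    return \%count;
}

# Sample input for illustration only, not captured from a real server.
my $sample = join "\n",
    'Total Accesses: 83892',
    'BusyServers: 1',
    'IdleServers: 4',
    'Scoreboard: ____W...';

my $status = parse_status($sample);
my $slots  = scoreboard_counts($status->{Scoreboard});
printf "%d sending, %d idle\n", $slots->{W} || 0, $slots->{_} || 0;
```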
Apache Server Status for dev.perl.org
Server Version: Apache/1.3.26 (Unix) mod_perl/1.27
Server Built: Jun 19 2002 19:11:10
Current Time: Monday, 22-Jul-2002 17:35:16 PDT
Restart Time: Friday, 19-Jul-2002 02:58:56 PDT
Parent Server Generation: 0
Server uptime: 3 days 14 hours 36 minutes 20 seconds
Total accesses: 83892 - Total Traffic: 1.1 GB
CPU Usage: u53.77 s5.29 cu.31 cs.02 - .019% CPU load
.269 requests/sec - 3650 B/second - 13.3 kB/request
1 requests currently being processed, 4 idle servers
____W...........................................................
................................................................
................................................................
................................................................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"L" Logging, "G" Gracefully finishing, "." Open slot with no current process
Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request
0-0 30341 0/120/16120 _ 1.90 236 1 0.0 1.29 201.31 64.113.215.94 dev.perl.org GET /img/perldbi.s.gif HTTP/1.1
1-0 30044 0/497/15497 _ 8.58 1604 1 0.0 5.41 229.29 64.81.84.162 dev.perl.org GET /server-status?auto HTTP/1.1
2-0 30077 0/465/15465 _ 8.34 867 11 0.0 4.89 192.49 64.113.215.94 dev.perl.org GET /doc/index.html HTTP/1.1
3-0 30250 0/291/14791 _ 5.26 36 1 0.0 2.86 169.49 64.113.215.94 dev.perl.org GET /img/BKG.jpg HTTP/1.1
4-0 30432 0/28/12528 W 0.77 172 0 0.0 0.53 148.05 12.105.242.45 dev.perl.org GET /server-status HTTP/1.1
5-0 - 0/0/8000 . 8.13 15011 14 0.0 0.00 110.99 205.238.0.250 dev.perl.org GET / HTTP/1.1
6-0 - 0/0/1025 . 13.90 133226 38 0.0 0.00 23.85 195.19.198.1 dev.perl.org GET /perl5/news/2002/07/18/580ann/perldelta.pod HTTP/1.1
7-0 - 0/0/466 . 12.51 133700 11983 0.0 0.00 10.05 62.242.82.142 dev.perl.org GET /perl5/news/2002/07/18/580ann/perldelta.html HTTP/1.1
Apache Server Status for x2.develooper.com
Server Version: Apache/2.0.39 (Unix) mod_ssl/2.0.39 OpenSSL/0.9.6b
Server Built: Jun 19 2002 19:00:38
------------------------------------------------------------------------
Current Time: Tuesday, 23-Jul-2002 01:00:10 PDT
Restart Time: Friday, 19-Jul-2002 06:58:29 PDT
Parent Server Generation: 5
Server uptime: 3 days 18 hours 1 minute 41 seconds
Total accesses: 674608 - Total Traffic: 59.2 GB
CPU Usage: u1434.86 s849.12 cu0 cs0 - .705% CPU load
2.08 requests/sec - 191.5 kB/second - 92.0 kB/request
17 requests currently being processed, 103 idle workers
______W_____K_______________K__________K........................
................................................................
................................................................
______K_______K___W__K_____K____________........................
___W____K____W__K___K_K________K__K_____........................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request
0-5 28288 0/1626/2663 _ 5.97 9 0 0.0 84.85 136.35 213.23.86.3 cpan.develooper.com GET /icons/compressed.gif HTTP/1.0
0-5 28288 0/1653/2695 _ 5.86 60 0 0.0 124.06 197.53 204.123.28.25 cpan.develooper.com GET /modules/by-module/Parse/Parse-Nibbler-1.10.readme HTTP/1.0
0-5 28288 0/1470/2446 _ 5.53 50 0 0.0 92.81 162.17 12.239.20.151 www.perl.org GET /Images/download_perl.gif HTTP/1.1
0-5 28288 0/1815/2917 _ 6.12 51 0 0.0 157.85 323.43 210.255.175.82 cpan.develooper.com GET /modules/ HTTP/1.1
0-5 28288 0/1576/2775 _ 5.94 16 0 0.0 122.98 220.87 217.128.98.45 www.perl.org GET / HTTP/1.1
0-5 28288 0/1820/3003 _ 6.05 18 0 0.0 111.77 182.87 204.123.28.25 cpan.develooper.com GET /modules/by-module/Parse/Parse-PerlConfig-0.04.readme HTTP/
0-5 28288 1/1714/2779 W 5.34 0 0 0.0 88.39 161.98 12.105.242.45 x2.develooper.com GET /server-status HTTP/1.1
0-5 28288 0/1623/2717 _ 5.80 10 0 0.0 90.88 130.99 144.132.6.172 cpan.develooper.com GET /misc/gif/funet.gif HTTP/1.1
0-5 28288 0/1557/2892 _ 5.70 40 0 0.0 128.58 217.49 64.81.84.115 x2.develooper.com GET /server-status HTTP/1.1
0-5 28288 0/1731/2712 _ 5.85 22 0 0.0 105.78 206.22 64.81.84.115 x2.develooper.com GET /server-status HTTP/1.1
0-5 28288 0/1663/2632 _ 5.09 54 0 0.0 125.66 240.82 192.198.152.98 dev.perl.org GET /perl5/docs/perlhack.html HTTP/1.0
0-5 28288 0/1524/2579 _ 5.72 56 0 0.0 148.84 188.10 219.162.170.64 cpan.develooper.com GET /misc/gif/funet.gif HTTP/1.0
0-5 28288 3/1803/3109 K 5.62 5 0 18.7 134.60 181.67 211.123.199.142 dev.perl.org GET /perl5/news/2002/07/18/580ann/ HTTP/1.1
0-5 28288 0/1805/2909 _ 5.72 40 0 0.0 77.25 199.43 212.227.67.1 cpan.develooper.com GET /authors/id/M/MS/MSERGEANT/DBIx-AnyDBD-2.00.tar.gz HTTP/1.0
0-5 28288 0/1659/2754 _ 4.33 68 0 0.0 71.24 122.38 64.81.84.162 x2.develooper.com GET /server-status?auto HTTP/1.0
0-5 28288 0/1605/2645 _ 6.12 49 0 0.0 119.42 231.39 62.236.35.224 cpan.develooper.com GET /misc/gif/funet.gif HTTP/1.1
0-5 28288 0/1480/2531 _ 6.25 38 0 0.0 133.94 187.10 196.3.50.241 cpan.develooper.com GET /misc/gif/valid-xhtml10.gif HTTP/1.0
0-5 28288 0/2017/3058 _ 6.99 30 0 0.0 166.20 238.06 157.25.125.14 cpan.develooper.com GET /authors/id/GAAS/URI-1.19.tar.gz HTTP/1.1
0-5 28288 0/1563/2556 _ 5.68 3 0 0.0 108.25 164.02 194.94.45.191 dev.perl.org GET /perl5/news/2002/07/18/580ann/ HTTP/1.0
0-5 28288 0/1680/2906 _ 5.68 69 0 0.0 95.72 197.73 80.140.10.162 cpan.develooper.com GET /misc/gif/funet.gif HTTP/1.1
0-5 28288 0/1538/2660 _ 5.85 10 0 0.0 96.27 168.83 144.132.6.172 cpan.develooper.com GET /misc/gif/valid-xhtml10.gif HTTP/1.1
0-5 28288 0/1603/2774 _ 6.35 38 0 0.0 96.72 253.66 196.3.50.241 cpan.develooper.com GET / HTTP/1.0
0-5 28288 0/1643/2624 _ 5.12 9 0 0.0 117.58 239.07 213.23.86.3 cpan.develooper.com GET /icons/unknown.gif HTTP/1.0
0-5 28288 0/1708/2709 _ 5.48 16 0 0.0 173.60 290.92 195.55.164.14 cpan.develooper.com GET / HTTP/1.0
0-5 28288 0/1462/2505 _ 5.05 34 0 0.0 94.49 157.70 138.23.89.56 cpan.develooper.com GET /authors/Bob_Dalgleish/ HTTP/1.0
0-5 28288 0/1339/2493 _ 5.87 6 0 0.0 100.99 172.34 64.81.84.162 dev.perl.org GET /server-status?auto HTTP/1.0
0-5 28288 0/1511/2818 _ 6.38 2 0 0.0 123.77 210.89 193.2.73.193 www.perl.org GET /Images/download_perl.gif HTTP/1.1
0-5 28288 0/1596/2805 _ 5.19 24 0 0.0 104.00 226.04 12.35.96.66 www.perl.org HEAD / HTTP/1.0
0-5 28288 1/1561/2629 K 4.50 13 0 2.1 109.09 156.80 80.61.33.32 www.perl.org GET /Images/download_perl.gif HTTP/1.1
0-5 28288 0/1637/2732 _ 5.46 48 0 0.0 123.82 204.87 213.23.86.3 cpan.deve
Set the right headers: Expires, Content-Length, Last-Modified
You set the rules
Only when complete documents can be cached
Cache database queries
Temporary denormalized tables
Cache remote data
Complicated computations
Make caches expire the document in 48 hours
$r->header_out('Expires', HTTP::Date::time2str(time + 60*60*24*2));
Modern caches will honor this
$r->header_out('Cache-Control', "max-age=" . 60*60*24*2);
Don't cache
$r->header_out('Cache-Control', "no-cache, private");
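A sketch that builds both header values from one time-to-live, so Expires and max-age cannot drift apart. It uses POSIX::strftime from the core distribution instead of HTTP::Date so that it runs without extra modules; the function names are hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch time as an HTTP date (always GMT).
# Assumes the C/English locale for day and month names.
sub http_date {
    my ($epoch) = @_;
    return strftime('%a, %d %b %Y %H:%M:%S GMT', gmtime $epoch);
}

# Return matching Expires and Cache-Control values for a TTL in seconds.
# $now is optional and only there to make the function easy to test.
sub cache_headers {
    my ($ttl_seconds, $now) = @_;
    $now = time unless defined $now;
    return (
        'Expires'       => http_date($now + $ttl_seconds),
        'Cache-Control' => "max-age=$ttl_seconds",
    );
}

my %headers = cache_headers(60*60*24*2);
print "$_: $headers{$_}\n" for sort keys %headers;
```

In a handler, each pair would be passed to `$r->header_out` as on the slides above.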
Last-Modified: Mon, 24 Jun 2002 09:07:15 GMT
If-Modified-Since: Mon, 24 Jun 2002 09:07:15 GMT
$r->header_out("Last-Modified", ...);
if ((my $rc = $r->meets_conditions) != OK) { return $rc; }
If document hasn't been modified since last time, return
HTTP/1.1 304 Not Modified
Otherwise generate the document as usual
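The date comparison behind a 304 response can be sketched in plain Perl. This is a simplified illustration of the If-Modified-Since check only, not a reimplementation of `meets_conditions`, which handles more conditions; the helper names are hypothetical and only the RFC 1123 date format shown above is parsed.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4, Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Parse a date like "Mon, 24 Jun 2002 09:07:15 GMT" into epoch seconds.
# Returns undef for anything it does not understand.
sub parse_http_date {
    my ($str) = @_;
    return undef unless defined $str;
    my ($mday, $mon, $year, $hour, $min, $sec) =
        $str =~ /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/
        or return undef;
    return undef unless exists $MONTH{$mon};
    # timegm dies on out-of-range values, so trap that and return undef.
    return eval { timegm($sec, $min, $hour, $mday, $MONTH{$mon}, $year) };
}

# True if the client's copy is still current, i.e. a 304 can be sent.
sub is_not_modified {
    my ($if_modified_since, $last_modified_epoch) = @_;
    my $since = parse_http_date($if_modified_since);
    return 0 unless defined $since;
    return $last_modified_epoch <= $since ? 1 : 0;
}

print is_not_modified('Mon, 24 Jun 2002 09:07:15 GMT', 1024909635)
    ? "304 Not Modified\n" : "200 OK\n";
```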
Always set the Content-Type header
$r->content_type("image/png");
Caching and Keep-Alive require Content-Length
$r->header_out('Content-Length', $length);
ETags
$ ab -c 3 -n 100 http://nntp.x.perl.org/group/
Does a slow request to a remote server to fetch data
Time taken for tests: 10.12586 seconds
Requests per second: 9.99 [#/sec] (mean)
Time per request: 300.378 [ms] (mean)
Time per request: 100.126 [ms] (mean, across all concurrent requests)
Transfer rate: 75.90 [Kbytes/sec] received
Add mason caching
$m->cache_self(expire_in => "12h");
Time taken for tests: 3.978180 seconds
Requests per second: 25.14 [#/sec] (mean)
Time per request: 119.345 [ms] (mean)
Time per request: 39.782 [ms] (mean, across all concurrent requests)
Transfer rate: 192.30 [Kbytes/sec] received
Cache the whole document. Really really efficient.
Flexible storage modules
Make your own backend storage module (in C)
Disk and memory implemented
Marked experimental, and it is!
<VirtualHost *>
  ServerName nntp.x.perl.org
  # mod_cache settings
  CacheOn On
  CacheIgnoreCacheControl on
  CacheIgnoreNoLastMod on
  CacheDefaultExpire 43200
  # mod_disk_cache settings
  LoadModule disk_cache_module modules/mod_disk_cache.so
  CacheRoot /home/web/front/cache/
  CacheSize 50000
  CacheEnable disk /
  CacheDirLevels 5
  CacheDirLength 3
  RewriteEngine On
  RewriteRule (.*) http://localhost:8222$1 [P]
</VirtualHost>
Databases are hard(er) to scale
Reverse proxy minimizes the need for many concurrent database connections
Apache::DBI minimizes the number of new connections made
Summary tables can make the lookups faster
Use faster databases for caching (MySQL as cache for Oracle)
Frequent "name" to "id" lookups
If the list is small, cache them in memory
my $cache;
my $cache_refreshed = 0;
sub id_to_data {
    my $id = shift;
    # Serve from memory if the cache was refreshed in the last 10 minutes
    return $cache->{$id} if ($cache and $cache_refreshed > time-600);
    # Otherwise reload the whole table, keyed by id
    my $dbh = db_open();
    $cache = $dbh->selectall_hashref('select * from table', 'id');
    $cache_refreshed = time;
    return $cache->{$id};
}
Divide pages in smaller fragments and cache each fragment
Benchmark!
Sometimes use regular expressions to hack components into "super components"
$output = $cache->get('foo');
$dynamic_stuff = get_dynamic_data($r);
$output =~ s{<!-- DYNAMIC_DATA_HERE -->}
            {$dynamic_stuff}s;
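A self-contained sketch of the same idea: the expensive page shell is rendered once and cached, and only the per-request part is substituted in. The plain hash stands in for the `$cache` object, and `render_page_shell` and `page_for` are hypothetical names.

```perl
use strict;
use warnings;

my %fragment_cache;   # in-process stand-in for a real cache object
my $renders = 0;      # counts how often the expensive part runs

# Hypothetical expensive renderer for the static part of the page.
sub render_page_shell {
    $renders++;
    return '<html><body><h1>News</h1>'
         . '<!-- DYNAMIC_DATA_HERE --></body></html>';
}

# Build a page for one user from the cached shell plus dynamic data.
sub page_for {
    my ($user) = @_;
    # $output is a copy, so the substitution never touches the cached shell.
    my $output  = $fragment_cache{shell} ||= render_page_shell();
    my $dynamic = "<p>Hello, $user</p>";
    $output =~ s{<!-- DYNAMIC_DATA_HERE -->}{$dynamic};
    return $output;
}

print page_for('ask'), "\n";
print page_for('bob'), "\n";
```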
Caching is good
Multiple levels of caching are better
Be careful not to spend more time managing the cache than you save caching
Caching minimizes the number of database lookups
Do your own!
No real world experience yet
(Can be) multithreaded
Run more http threads than "perl threads"
Don't have to use the proxy setup. Maybe.
Lots of stuff doesn't work with threads yet
Architecture more important than code
Architecture much more important than code
Some architecture improvements are really easy to implement
Premature optimization is a waste of time
Lots of optimizations are a waste of time
http://perl.apache.org/docs/tutorials/apps/scale_etoys/etoys.html
modperl-subscribe@perl.apache.org
Thursday 4.30pm
Thank you for listening
TicketMaster/CitySearch
ValueClick
Viridiana
the mod_perl list