Puma — Web Server written in Ruby
Current version: 5.6.5 (2022-09-13)

A web server implements the HTTP protocol: a client makes an HTTP request and the web server returns an HTTP response.
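
To make the request/response cycle concrete, here is a minimal sketch using only Ruby's standard library. It is not how Puma is implemented; `serve_once` is a hypothetical helper that answers a single request on a local port and then stops.

```ruby
require "socket"
require "net/http"

# Serve exactly one HTTP request on an ephemeral local port, then stop.
# Returns [port, thread] so the caller can connect and wait for completion.
def serve_once(body)
  server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
  port = server.addr[1]
  thread = Thread.new do
    client = server.accept
    # Read the request line and headers up to the blank line that ends them.
    loop do
      line = client.gets
      break if line.nil? || line == "\r\n"
    end
    client.write "HTTP/1.1 200 OK\r\n" \
                 "Content-Type: text/plain\r\n" \
                 "Content-Length: #{body.bytesize}\r\n" \
                 "Connection: close\r\n" \
                 "\r\n" \
                 "#{body}"
    client.close
    server.close
  end
  [port, thread]
end

port, thread = serve_once("Hello World")
response = Net::HTTP.get_response(URI("http://127.0.0.1:#{port}/"))
thread.join
puts response.code # status code sent by the server
puts response.body # body sent by the server
```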


Client (cURL, browser, etc) -> request -> puma.io -> response
$ curl -s -I https://puma.io

HTTP/2 200
server: GitHub.com
content-type: text/html; charset=utf-8
last-modified: Tue, 24 May 2022 22:52:59 GMT
access-control-allow-origin: *
etag: "628d61cb-34e1"
expires: Sun, 18 Sep 2022 10:20:43 GMT
cache-control: max-age=600
x-proxy-cache: MISS
x-github-request-id: 59B0:2663:44A654:4B059D:6326EEA3
accept-ranges: bytes
date: Sun, 18 Sep 2022 10:54:42 GMT
via: 1.1 varnish
age: 0
x-served-by: cache-itm18843-ITM
x-cache: HIT
x-cache-hits: 1
x-timer: S1663498482.219915,VS0,VE184
vary: Accept-Encoding
x-fastly-request-id: 6dd9835997a5b559d511758820251e3876126e84
content-length: 13537

But in practice, there is a layer sitting in between.

Puma can serve static content. But for dynamic content:

Client (cURL, browser, etc) -> request -> puma.io -> rack -> response

Puma implements the Rack interface. For each request, Puma calls the Rack application, which runs Ruby code to generate the response dynamically.

Rack response looks like this:

[200, {}, ["Hello World"]]

Status code. Headers. Body.
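
A small sketch of an application that returns such a triple. It calls the app by hand instead of going through a server, and the `env` hash is a hand-built assumption with only the keys this example reads.

```ruby
# A Rack application is any object that responds to `call(env)`
# and returns [status, headers, body].
app = lambda do |env|
  body = "Hello from #{env["PATH_INFO"]}"
  [200, { "content-type" => "text/plain", "content-length" => body.bytesize.to_s }, [body]]
end

# Hand-built env with only the keys this example reads; a real server
# such as Puma fills in many more.
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/hello" }

status, headers, body = app.call(env)
puts status
puts headers["content-type"]
body.each { |chunk| puts chunk } # the body must respond to `each`
```

Saved in a config.ru file with a final `run app` line, the same lambda could be served by the puma command.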

$ puma git:(master) tree -L 2
├── Rakefile
├── benchmarks
│    ├── local
│    └── wrk
├── bin
│    ├── puma
│    ├── puma-wild
│    └── pumactl
├── config
├── examples
│    ├── CA
│    ├── plugins
│    ├── puma
│    └── qc_config.rb
├── ext
│    └── puma_http11
├── lib
│    ├── puma
│    ├── puma.rb
│    └── rack
├── puma.gemspec
├── tools
│    ├── Dockerfile
│    └── trickletest.rb
└── win_gem_test
    ├── Rakefile_wintest
    ├── package_gem.rb
    └── puma.ps1

Puma’s command-line program is puma.

A worker is a process.
Each worker has many threads.
A socket is an endpoint that listens on a port.
A client makes requests to that port.

client - socket - process
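
A rough sketch of "one process, many threads": a fixed pool of threads pulling jobs from a shared queue. This is not Puma's own thread pool; `run_pool` is a hypothetical helper and the jobs stand in for requests.

```ruby
# Process every job on a fixed pool of threads.
# Returns the results in completion order, which is not deterministic.
def run_pool(size, jobs)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once closed and drained, pop returns nil
  results = Queue.new

  threads = Array.new(size) do
    Thread.new do
      # Each thread keeps taking jobs until the queue is empty.
      while (job = queue.pop)
        results << job.call
      end
    end
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }
end

jobs = (1..10).map { |n| -> { n * n } }
squares = run_pool(5, jobs).sort
p squares
```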

Configuration sources, from lowest to highest precedence:

  • Puma defaults
  • ENV
  • File options: config/puma.rb
  • User options: puma CLI
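
One way to read the list above is as layers that are consulted in order, assuming later entries override earlier ones. The sketch below is only an illustration: `LayeredOptions` is a hypothetical class, the values are made up, and ENV is folded into the default layer for simplicity.

```ruby
# Hypothetical lookup across option layers, highest precedence first.
class LayeredOptions
  def initialize(user: {}, file: {}, default: {})
    @layers = [user, file, default]
  end

  # Return the value from the first layer that defines the key, or nil.
  def fetch(key)
    layer = @layers.find { |options| options.key?(key) }
    layer && layer[key]
  end
end

# Assumed values for illustration only.
options = LayeredOptions.new(
  default: {
    workers: Integer(ENV.fetch("WEB_CONCURRENCY", "0")),
    port: 9292,
    environment: "development"
  },
  file: { workers: 2 },  # as if set in the config file
  user: { port: 3000 }   # as if passed with -p on the command line
)

puts options.fetch(:workers)     # the file layer wins over the default
puts options.fetch(:port)        # the user layer wins over the default
puts options.fetch(:environment) # only the default layer defines it
```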

Puma has 2 modes: Single and Cluster. If you set workers > 0, you’re using Cluster mode.

Single mode: one Puma process.

Cluster mode: a master process fork()s many child processes (workers).
The child processes listen on the socket. Each child process has its own thread pool. You can preload the app.
The master process only handles UNIX signals and boots/kills child processes.

Puma by default runs in Single mode (workers = 0); on MRI the default thread pool is 0 to 5 threads. A common setup is Cluster mode with 2 workers and 5 threads per worker.

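You can preload your app in Cluster mode. The master process loads the application code before forking, so the workers share that memory through copy-on-write. Preload cannot be used with phased restart, because phased restart replaces workers one by one and each new worker would be forked from the master, which still holds the old preloaded code.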
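
A sketch of the difference preloading makes, using plain fork() on a Unix-like system. This is not Puma's code: `load_app` is a hypothetical stand-in for booting the application, and `boot_workers` reports which process booted the app each worker ends up using.

```ruby
# Hypothetical stand-in for an expensive application boot.
# It remembers which process performed the boot.
def load_app
  { booted_by: Process.pid }
end

# Fork `count` workers and return [worker_pid, booted_by_pid] pairs.
def boot_workers(count, preload:)
  app = load_app if preload # loaded once, in the master, before forking
  reader, writer = IO.pipe

  pids = Array.new(count) do
    fork do
      reader.close
      # Without preload, every worker boots its own copy after the fork.
      worker_app = app || load_app
      writer.puts "#{Process.pid} #{worker_app[:booted_by]}"
      writer.close
      exit!(0) # skip at_exit handlers inherited from the master
    end
  end

  writer.close
  lines = reader.read.lines # returns once every worker has closed its end
  reader.close
  pids.each { |pid| Process.wait(pid) }
  lines.map { |line| line.split.map(&:to_i) }
end

preloaded = boot_workers(2, preload: true)
separate  = boot_workers(2, preload: false)
puts preloaded.all? { |_worker, booted_by| booted_by == Process.pid }
puts separate.all? { |worker, booted_by| booted_by == worker }
```

With preload the workers inherit an app that was booted in the master; without it each worker boots its own.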

Code as of puma/puma@88f9cba6.

Rack::Handler.default -> Rack::Handler::Puma.run

module Rack::Handler::Puma
  def self.run(app, **options)
    conf = self.config(app, options)

    # :Silent swaps the stdio log writer for one that writes to in-memory strings.
    log_writer = options.delete(:Silent) ? ::Puma::LogWriter.strings : ::Puma::LogWriter.stdio

    launcher = ::Puma::Launcher.new(conf, :log_writer => log_writer)

    yield launcher if block_given?
    begin
      launcher.run
    rescue Interrupt
      puts "* Gracefully stopping, waiting for requests to finish"
      launcher.stop
      puts "* Goodbye!"
    end
  end
end

$ puma

require "puma/cli"
cli = Puma::CLI.new ARGV
cli.run

You can pass options to puma on the command line, or put them all in a file and load it with -C or --config.

Puma Options
puma <options> <rackup file>
    -b, --bind URI                   URI to bind to (tcp://, unix://, ssl://)
        --bind-to-activated-sockets [only]
                                     Bind to all activated sockets
    -C, --config PATH                Load PATH as a config file
        --no-config                  Prevent Puma from searching for a config file
        --control-url URL            The bind url to use for the control server. Use 'auto' to use temp unix server
        --control-token TOKEN        The token to use as authentication for the control server
        --debug                      Log lowlevel debugging information
        --dir DIR                    Change to DIR before starting
    -e, --environment ENVIRONMENT    The environment to run the Rack app on (default development)
    -f, --fork-worker=[REQUESTS]     Fork new workers from existing worker. Cluster mode only
                                     Auto-refork after REQUESTS (default 1000)
    -I, --include PATH               Specify $LOAD_PATH directories
    -p, --port PORT                  Define the TCP port to bind to
                                     Use -b for more advanced options
        --pidfile PATH               Use PATH as a pidfile
        --preload                    Preload the app. Cluster mode only
        --prune-bundler              Prune out the bundler env if possible
        --extra-runtime-dependencies GEM1,GEM2
                                     Defines any extra needed gems when using --prune-bundler
    -q, --quiet                      Do not log requests internally (default true)
    -v, --log-requests               Log requests as they occur
    -R, --restart-cmd CMD            The puma command to run during a hot restart
                                     Default: inferred
    -s, --silent                     Do not log prompt messages other than errors
    -S, --state PATH                 Where to store the state details
    -t, --threads INT                min:max threads to use (default 0:16)
        --early-hints                Enable early hints support
    -V, --version                    Print the version information
    -w, --workers COUNT              Activate cluster mode: How many worker processes to create
        --tag NAME                   Additional text to display in process listing
        --redirect-stdout FILE       Redirect STDOUT to a specific file
        --redirect-stderr FILE       Redirect STDERR to a specific file
        --[no-]redirect-append       Append to redirected files
    -h, --help                       Show help

WEB_CONCURRENCY — set how many workers to run (0 or >= 2).
NOTIFY_SOCKET — enable systemd integration (requires sd_notify gem)

  • SIGUSR1 for phased restart.
  • SIGUSR2 for hot restart.

Phased restart requires Cluster mode.
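
A sketch of how a Ruby process can react to these two signals on a Unix-like system. It only records which restart was requested; the handlers and the `wait_until` helper are hypothetical and do not reflect Puma's internals.

```ruby
# Poll until the block returns true or the timeout expires.
def wait_until(timeout = 2.0)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
    sleep 0.01
  end
end

# Keep handlers tiny: only a limited set of operations is safe inside a trap.
requested = []
Signal.trap("USR1") { requested << :phased_restart }
Signal.trap("USR2") { requested << :hot_restart }

# Send the signals to ourselves, the way `kill -USR1 <pid>` would from a shell.
Process.kill("USR1", Process.pid)
wait_until { requested.size == 1 }
Process.kill("USR2", Process.pid)
wait_until { requested.size == 2 }

p requested
```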

  • Puma Worker Killer

Minimum: 5 threads
Maximum: 10 threads

puma -t 5:10

WEB_CONCURRENCY=2 puma -t 5:5

In total you will have 10 threads (2 workers × 5 threads each).
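
The arithmetic as a small sketch. `max_total_threads` is a hypothetical helper; it assumes Single mode (0 workers) counts as one process with its own pool.

```ruby
# Upper bound on request threads across all Puma processes.
def max_total_threads(workers:, max_threads:)
  processes = [workers, 1].max # Single mode still runs one process
  processes * max_threads
end

puts max_total_threads(workers: 2, max_threads: 5)  # WEB_CONCURRENCY=2 puma -t 5:5
puts max_total_threads(workers: 0, max_threads: 10) # puma -t 5:10 in Single mode
```

This number is worth comparing with the database connection pool size, since each thread may need its own connection.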

lowlevel_error_handler do |exception|
  # Report the exception, but never let a reporting failure break the error page.
  Rails.error.report(exception, handled: false) rescue nil
  [500, { "Content-Type" => "text/html" }, [File.read("public/500.html")]]
end

Or replace Rails.error.report with any error reporting service.