Hello!
On Sat, Aug 29, 2015 at 3:31 AM, Zhe Zhang wrote:
> I have been studying openresty these days (in order to develop a
> highly-scalable image-download service for my company). While I’m really
> impressed by the power openresty adds to Nginx,
Thank you for trying out OpenResty and glad you like it :)
> I'm somewhat confused about the true meaning of “Nginx sub-request is
> synchronous but non-blocking”.
>
Synchronous describes the coding paradigm: you write your code
synchronously instead of asynchronously (i.e., you do not have to
deal with lots of callbacks, for example).

Nonblocking describes the nature of the I/O: your I/O operations do
not block *any* OS threads, so a single OS thread can handle a lot
of concurrent connections at the same time.
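
For example, with the cosocket API the following handler reads
strictly top-down, yet none of its I/O calls blocks the worker's OS
thread (a minimal sketch; the backend host, port, and line protocol
are made up):

    location /demo {
        content_by_lua_block {
            local sock = ngx.socket.tcp()
            sock:settimeout(1000)  -- 1 second

            -- connect() yields the current Lua coroutine until the
            -- TCP connection is ready; the worker keeps serving
            -- other requests in the meantime
            local ok, err = sock:connect("backend.example.com", 8080)
            if not ok then
                ngx.say("failed to connect: ", err)
                return
            end

            sock:send("ping\n")

            -- receive() also yields instead of blocking the OS thread
            local line, rerr = sock:receive()
            sock:close()

            ngx.say("got: ", line or rerr)
        }
    }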
> I’m listing my questions below. Can you help take a look? Thanks a lot in
> advance!
>
I'd recommend posting such questions on the openresty-en mailing
list in the future. Please see https://openresty.org/#Community for
more details. I'm cc'ing the list in the hope of helping other
users with similar questions.
> I’m using the following code block as an example:
[...]
> location / {
> content_by_lua '
> res = ngx.location.capture("/fast_route")
> if res.status >= 400 then
> res = ngx.location.capture("/slow_route")
[...]
>
> This example basically lets Nginx download from a fast route first (by
> issuing a sub-request), and if that fails, from a slow route (by issuing
> another sub-request).
>
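
For reference, the snippet above filled in completely might look
something like this (a sketch; the error handling and the response
forwarding are my assumptions, not your original code):

    location / {
        content_by_lua_block {
            local res = ngx.location.capture("/fast_route")
            if res.status >= 400 then
                -- the fast route failed; fall back to the slow one
                res = ngx.location.capture("/slow_route")
            end
            if res.status ~= 200 then
                return ngx.exit(res.status)
            end
            ngx.print(res.body)
        }
    }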
One caveat with ngx.location.capture is that it always fully buffers
the response of the subrequest and loads it into the Lua land, so it's
not ideal for large responses. Maybe the standard try_files directive
of nginx is a better fit for this particular use case? Please see

http://nginx.org/en/docs/http/ngx_http_core_module.html#try_files
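
For instance, something along these lines (a sketch; the paths and
the upstream name are made up, and it assumes the "fast route" is a
local file tree):

    location /images/ {
        root /data/fast;
        # serve the file directly if it exists; otherwise fall
        # back to the named location below
        try_files $uri @slow;
    }

    location @slow {
        proxy_pass http://slow.example.com;
    }

With try_files, nginx streams the fallback response instead of
buffering it fully in Lua land.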
> - Question 1
> So, by "synchronous but non-blocking", does it mean that: once the Nginx
> single-threaded worker sends out the request to www.fastroute.com, then,
> rather than waiting for the response from www.fastroute.com and rather than
> proceeding to send a sub-request to /slowroute, the worker process will
> immediately leave the current stack to handle the next event?
>
The worker process never leaves the current C stack. Instead, it
yields the currently running Lua coroutine and gives control back to
the nginx event loop when, for example, an I/O operation cannot
complete immediately. You can say that it immediately leaves the
current *Lua* stack (which resides on the heap rather than on the C
stack).
> - Question 2
>
> Where is the continuation (i.e., the stack) of the current request saved
> before the single thread moves to the next task?
>
The current request's state lives on the heap rather than on the C
stack, so there's really nothing to save explicitly. Basically:

1. On the Lua side, we simply yield the current Lua coroutine, and the
Lua VM just "freezes" the execution state of that coroutine
naturally.

2. On the Nginx side, the request state is also on the heap, anchored
by the event handlers' data structures registered in
epoll/kqueue/etc. There's nothing to save on the C stack or in CPU
registers.
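
You can see the same "freezing" with plain Lua coroutines (a
standalone sketch, independent of ngx_lua's internals):

    local co = coroutine.create(function(a)
        local sum = a
        -- yield suspends execution right here; the local `sum` and
        -- the resume point are preserved inside the coroutine object
        -- on the heap
        local b = coroutine.yield(sum)
        return sum + b
    end)

    print(coroutine.resume(co, 1))  --> true  1
    print(coroutine.resume(co, 2))  --> true  3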
>
> When the Nginx worker process comes back to execute the continuation after
> the response from fastroute is available, the worker process has to grab
> the continuation object from somewhere.
>
The suspended Lua coroutine objects contain the "continuation"
objects. Basically we just need to anchor the Lua coroutines into the
Lua VM's registry (which is a "GC root") so that the Lua GC will not
collect these coroutine objects prematurely.
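
Here is a plain-Lua analogy for that anchoring (just an illustration;
ngx_lua actually does this in C via the Lua registry):

    local anchors = {}  -- stands in for the Lua VM's registry

    local co = coroutine.create(function()
        coroutine.yield()  -- "waiting for I/O"
    end)
    coroutine.resume(co)   -- run until the yield

    anchors[co] = true     -- anchored: reachable from a live table
    collectgarbage("collect")
    print(coroutine.status(co))  --> suspended (still alive)

    anchors[co] = nil      -- request done: allow collection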
>
> - Question 3
>
> How does Nginx help reduce context-switch?
>
Nginx worker processes are single-threaded and usually bound to
individual (logical) CPU cores (via the worker_cpu_affinity
directive), so there is no need to do any context switching among the
worker processes at all. Context switching only happens when multiple
OS threads (or OS processes) compete for the same CPU core.
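
For example, on a 4-core box (a sketch; adjust the masks to your own
CPU topology):

    worker_processes  4;
    # each bitmask pins one worker to one logical core
    worker_cpu_affinity 0001 0010 0100 1000;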
>
> To me, a worker process leaving the current stack to handle the next
> event (which is another stack) is essentially context-switching: a single
> thread switching through a list of stacks from the event queue.
Apparently we have different definitions of context switching.
Context switching usually refers to the switches performed by the
operating system to implement transparent multitasking (transparent
to the userland). It is this OS-level context switching that is
expensive (the OS has to save and restore CPU register values,
virtual memory page tables, and so on).
> Based on
> this, it seems each Nginx worker process is doing context-switching ALL THE
> TIME (if there're lots of requests going on), which makes it hard for me to
> understand why Nginx would help reduce context-switching (compared to
> Apache), as quoted below from
> https://www.nginx.com/blog/inside-nginx-how-we-designed-for-performance-scale/:
>
Nginx uses I/O multiplexing exclusively for concurrent request
handling, while Apache usually performs blocking I/O and relies on
multiple OS threads (or processes) to handle concurrency.
>
> When an NGINX server is active, only the worker processes are busy. Each
> worker process handles multiple connections in a non-blocking fashion,
> reducing the number of context switches.
>
Exactly :)
Best regards,
-agentzh