no copy version of ngx.print

brian · 2014-03-17T08:01:18+00:00


brian

I have an app that mostly fetches from a shared dict and prints using ngx.print. I noticed that both of those copy the data, so I'm copying the data around unnecessarily.

Note: I haven't observed any issues with this so this may be a premature optimization.

Using the new functions exposed for ffi usage, I can avoid the copy on the shdict get. I was looking at ways to "write" a Lua string to the network without copying it. I was thinking I could ref it using luaL_ref, add it to output chain, and register a cleanup that does the luaL_unref. I wasn't sure how this would work with the coroutines, etc.

Thoughts? Should I not worry about this at all?

agentzh

Hello!

On Sun, Mar 16, 2014 at 5:01 PM, Brian Akins wrote:
> I have an app that mostly fetches from a shared dict and prints using
> ngx.print.  I noticed that both of those copy the data, so I'm copying the
> data around unnecessarily.
>

The shdict API copies the data from the shm slabs into the Lua VM, and
then the data gets copied from the Lua VM to the nginx output buffers.

You could skip the trip through the Lua VM if you don't want to do any
processing in Lua. I think you still need to copy to the nginx
temporary output buffers, though, because the shm slabs may get
overwritten while nginx is still trying to write the data out to the
system socket send buffers. The two have different lifetimes.
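
A minimal sketch of the lifetime problem described above, in plain C. The types `shm_slot_t` and `out_buf_t` and both functions are hypothetical stand-ins, not nginx or ngx_lua APIs; real code would also hold the shared-memory lock while copying.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one value stored in a shared-memory slab. */
typedef struct {
    char   data[64];
    size_t len;
} shm_slot_t;

/* Hypothetical stand-in for a request-owned output buffer. */
typedef struct {
    char   data[64];
    size_t len;
} out_buf_t;

/* Overwrite the slot, as another worker might while a response is pending. */
static void slot_set(shm_slot_t *slot, const char *value)
{
    size_t len = strlen(value);

    if (len > sizeof(slot->data)) {
        len = sizeof(slot->data);   /* truncate to the slot capacity */
    }
    memcpy(slot->data, value, len);
    slot->len = len;
}

/* Copy the slot contents into a buffer whose lifetime the request controls. */
static void copy_to_output(const shm_slot_t *slot, out_buf_t *out)
{
    assert(slot->len <= sizeof(out->data));
    memcpy(out->data, slot->data, slot->len);
    out->len = slot->len;
}
```

Once `copy_to_output()` has run, a later `slot_set()` on the same slot no longer affects the pending output. Pointing the output chain directly at `slot->data` would not give that guarantee.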

> Note: I haven't observed any issues with this so this may be a premature
> optimization.

I suggest that we only optimize based on profiling results (like
flame graphs) :)

> Using the new functions exposed for ffi usage, I can avoid the copy on the
> shdict get. I was looking at ways to "write" a Lua string to the network
> without copying it.
> I was thinking I could ref it using luaL_ref, add it to
> output chain, and register a cleanup that does the luaL_unref. I wasn't sure
> how this would work with the coroutines, etc.
>

This looks more expensive to me for common use
cases :)

> Thoughts? Should I not worry about this at all?
>

I suggest we focus on eliminating unnecessary dynamic memory
allocations, which are way more expensive than data copying according
to all the on-CPU flame graphs that I've ever seen for nginx :)

For example, I mentioned skipping the intermediate Lua strings between
shdict and the nginx output buffers, which saves the allocations (and
GC overhead) of the Lua string objects ;) Though this use case is also
very limited because most of the time, we want to do something with
the data in Lua.

Thanks!
-agentzh

brian

Yes, I know the data needs to be copied from the shdict. My general use case is that this data never is touched by Lua, so the overhead of making it a Lua string is not needed. The above (the cleanup, etc, ) would save a copy, albeit with more "bookkeeping" overhead - so it may not matter. FWIW, luaS_newlstr (and similar) dominate my CPU time for this app - it's essentially a "cache" that uses shdicts for storage with Lua doing header manipulation, cache logic, etc.

agentzh

Hello!

On Mon, Mar 17, 2014 at 3:35 AM, Brian Akins wrote:
> My general use case
> is that this data never is touched by Lua, so the overhead of making it a
> Lua string is not needed.

Then you can expose a pure C function for ngx.print() and combine the
implementation of ngx.print() with your shdict:get() method in
lua-resty-core. That way you just skip the ffi.string() call.

> The above (the cleanup, etc, ) would  save a copy,
> albeit with more "bookkeeping" overhead - so it may not matter.

No bookkeeping is needed here. What you really want to avoid is the
Lua string interning overhead, not the data copying.
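
To illustrate why the interning step costs more than the copy: creating a Lua string means hashing the bytes, looking them up in a VM-wide string table, and allocating a new object on a miss. The following is a toy model of that idea only, not LuaJIT's actual implementation; every name in it (`istr_t`, `intern`, `copy_direct`, `intern_allocs`) is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INTERN_BUCKETS 64

/* Toy interned string object: chained in a hash bucket. */
typedef struct istr {
    struct istr *next;
    size_t       len;
    char         data[];          /* C99 flexible array member */
} istr_t;

static istr_t *intern_table[INTERN_BUCKETS];
static size_t  intern_allocs;     /* counts heap allocations made by intern() */

static unsigned long hash_bytes(const char *s, size_t len)
{
    unsigned long h = 5381;

    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char) s[i];
    }
    return h;
}

/* Interning path: hash, search the bucket, allocate and copy on a miss. */
static const istr_t *intern(const char *s, size_t len)
{
    istr_t **bucket = &intern_table[hash_bytes(s, len) % INTERN_BUCKETS];

    for (istr_t *p = *bucket; p != NULL; p = p->next) {
        if (p->len == len && memcmp(p->data, s, len) == 0) {
            return p;             /* already interned: no allocation */
        }
    }

    istr_t *p = malloc(sizeof(*p) + len);
    assert(p != NULL);
    memcpy(p->data, s, len);
    p->len = len;
    p->next = *bucket;
    *bucket = p;
    intern_allocs++;
    return p;
}

/* Direct path: one bounded memcpy into a caller-supplied buffer. */
static size_t copy_direct(char *out, size_t cap, const char *s, size_t len)
{
    assert(len <= cap);
    memcpy(out, s, len);
    return len;
}
```

In this model the direct path never touches the allocator, while the interning path pays for a hash, a bucket walk, and possibly a `malloc()` per distinct value. For a cache serving many distinct bodies, nearly every lookup would be a miss.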

> FWIW,
> luaS_newlstr (and similar) dominate my CPU time for this app

Are you using the latest LuaJIT v2.1? Preferably with resty.core enabled.

Also, if you're using the latest LuaJIT v2.1, the following patch
should help reduce the string intern'ing overhead a lot for busy web
apps:

    http://agentzh.org/misc/luajit/str-tab-min-size.patch

BTW, will you share some on-CPU C-land flame graphs for your apps
under load? See
https://github.com/agentzh/nginx-systemtap-toolkit#sample-bt
It will be much easier and much more effective to talk about
bottlenecks and optimizations with such a graph :)

Regards,
-agentzh

brian

Thanks!

I switched to FreeBSD at home, again.

On the bookkeeping, I was not sure if the "raw string" would get GC'd before it was actually sent to the client. I haven't traced all of that through the ngx_lua code recently. The ref/unref was to guarantee that didn't happen - but may not be necessary.

I may not need any of this, and I may be inventing scenarios in my head that will never happen in real life. I tend to do that ;)

Thanks again.

agentzh

Hello!

On Mon, Mar 17, 2014 at 1:13 PM, Brian Akins wrote:
> I switched to FreeBSD at home, again.
>

Alas. FreeBSD does not have a really good dynamic tracing framework
yet. It has an incomplete dtrace port and the last time I checked it
was not really usable for userland tracing. Simple userland flame
graphs did not work either. No joy.

> On the bookkeeping, I was not sure if the "raw string" would get GC'd before
> it was actually sent to the client.

No, it's not worth doing. ngx_lua already recycles the nginx output
buffers automatically. So you would only be saving a memcpy(), which
is not worth the extra bookkeeping overhead at all.
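
A rough sketch of what recycling output buffers means in general: finished buffers go back on a free list and are handed out again, so steady-state traffic causes no new allocations. This is a generic free-list pool with hypothetical names (`buf_t`, `buf_pool_t`, `pool_get`, `pool_put`), not ngx_lua's actual code.

```c
#include <assert.h>
#include <stdlib.h>

#define BUF_SIZE 4096

/* Hypothetical fixed-size output buffer that can be chained on a free list. */
typedef struct buf {
    struct buf *next;
    size_t      len;
    char        data[BUF_SIZE];
} buf_t;

typedef struct {
    buf_t  *free_list;   /* buffers that have been fully sent */
    size_t  allocs;      /* number of malloc() calls made so far */
} buf_pool_t;

/* Take a buffer from the free list, allocating only when it is empty. */
static buf_t *pool_get(buf_pool_t *pool)
{
    buf_t *b = pool->free_list;

    if (b != NULL) {
        pool->free_list = b->next;
    } else {
        b = malloc(sizeof(*b));
        assert(b != NULL);
        pool->allocs++;
    }
    b->next = NULL;
    b->len = 0;
    return b;
}

/* Return a buffer once its contents have been written out. */
static void pool_put(buf_pool_t *pool, buf_t *b)
{
    b->next = pool->free_list;
    pool->free_list = b;
}
```

With a pool like this the copy into the buffer is the only per-response cost, which is why tracking references to the source string to avoid that copy buys little.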

Regards,
-agentzh