On Sunday, 18 October 2015 17:56:00 UTC+3, Miguel wrote:
It completely makes sense now after your explanation. I just moved the 'require' to 'init_by_lua'.
I come from other languages, where instantiating a var for every single incoming request could be expensive.
To further explain this:
1. init_by_lua / init_by_lua_block / init_by_lua_file
OpenResty creates one global Lua VM in the Nginx master process when Nginx starts and loads its configuration, and the init_by_lua code runs in that VM.
This is the place for things like blocking I/O and other initialization work.
What agentzh said is that you can run things like:
require "resty.redis"
in init_by_lua. The only benefit here, though, is that "resty.redis" is now found in package.loaded. Not a big benefit in this case if you ask me.
You could also do it like this in init_by_lua:
redis = require "resty.redis"
Now you have "resty.redis" in package.loaded and also a global variable called redis. This pollutes the global namespace, so I wouldn't recommend doing it.
What you can also do, but you shouldn't, is this in init_by_lua:
redis = require "resty.redis":new()
redis:connect("127.0.0.1", 6379)
This is a _BIG_ NO NO!!!
Now you are creating a global redis variable and also sharing its state across possibly multiple workers (now think about one worker calling redis:close()) -> DON'T DO THIS!
So by placing just require "resty.redis" in init_by_lua, the only thing you gain is that this code:
https://github.com/openresty/lua-resty-redis/blob/master/lib/resty/redis.lua
is only run once (i.e. the results of running that code are placed in package.loaded).
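To illustrate the caching that require does, here is a minimal sketch in plain Lua (no OpenResty needed). The module name "demo.counter" is made up for the example and is registered through package.preload so that no file on disk is required; the real resty.redis module is cached in the same way.

```lua
-- Hypothetical module "demo.counter", registered via package.preload
-- so the example is self-contained.
local runs = 0
package.preload["demo.counter"] = function()
    runs = runs + 1                  -- counts how often the module body executes
    return { name = "demo.counter" }
end

local a = require "demo.counter"     -- module body runs, result is cached
local b = require "demo.counter"     -- served straight from package.loaded

print(runs)                                  -- 1
print(a == b)                                -- true
print(package.loaded["demo.counter"] == a)   -- true
```

The second require does not execute the module body again; it only returns the table that is already stored in package.loaded.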
Not a huge benefit if you ask me. Say you have 4 workers: without the require in init_by_lua, that code runs 4 times instead of once, one time per worker, and each worker then caches the result in _its own_ package.loaded. So adding require "resty.redis" to init_by_lua is kind of a micro-optimization. Modules that parse files or do other heavy work in their "require" call might benefit more.
Note that init_by_lua runs with privileged rights (usually root), so be careful what you place there (especially "untrusted" code). This can also come in handy if you have files that you want to read on Nginx start but don't want to be readable by the less privileged workers (usually run under an account like nginx or www-data).
A more important optimization with redis is to use a pool of redis connections, which means that you never call close and always call set_keepalive instead.
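A rough sketch of what that could look like in a request context (for example inside content_by_lua_block). The timeout, the pool settings and the key name "some_key" are assumptions picked for illustration, not recommendations; adjust them to your setup.

```lua
-- Runs per request; needs OpenResty with lua-resty-redis installed.
local redis = require "resty.redis"   -- cached in package.loaded after first use

local red = redis:new()               -- a fresh object per request, never shared
red:set_timeout(1000)                 -- 1 second (assumed value)

local ok, conn_err = red:connect("127.0.0.1", 6379)
if not ok then
    ngx.log(ngx.ERR, "failed to connect to redis: ", conn_err)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
end

local value, get_err = red:get("some_key")   -- hypothetical key
if not value then
    ngx.log(ngx.ERR, "failed to read from redis: ", get_err)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
end
if value == ngx.null then
    value = ""                        -- key does not exist
end
ngx.say(value)

-- Hand the connection back to the pool instead of calling red:close().
-- Arguments: max idle time in ms, pool size per worker (assumed values).
local kept, keep_err = red:set_keepalive(10000, 100)
if not kept then
    ngx.log(ngx.ERR, "failed to set keepalive: ", keep_err)
end
```

The connect call then reuses an idle connection from the worker's pool when one is available, so creating the redis object per request stays cheap.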
2. When Nginx has done its initialization it will start n workers
Each worker will receive the global Lua VM environment with copy on write semantics. But after that they start to live on their own. So if you call:
package.loaded["resty.redis"] = nil
in any context other than init_by_lua*, it will only unload the lib from the current worker's package.loaded, and the next time that worker calls require "resty.redis" it will just run redis.lua again and cache the result in package.loaded.
In general I wouldn't add requires to init_by_lua unless I have a really good reason (they will get cached on first use after all). Good reasons are things like blocking I/O that only needs to be done once (as the name says, initialization work).
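The unload-and-reload behaviour can be tried out in plain Lua as well. This is only a sketch: "demo.mod" is a made-up module registered through package.preload, and a single Lua VM stands in for one worker.

```lua
-- Hypothetical module "demo.mod"; its body counts how often it is executed.
local runs = 0
package.preload["demo.mod"] = function()
    runs = runs + 1
    return { run = runs }
end

local first = require "demo.mod"      -- body runs, result cached
package.loaded["demo.mod"] = nil      -- drop it from this VM's cache only
local second = require "demo.mod"     -- body runs again, new result cached

print(runs)              -- 2
print(first == second)   -- false
```

Other workers would not notice any of this, since each of them has its own package.loaded.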