Hello!
On Thu, May 30, 2013 at 6:45 AM, Rohit Yadav wrote:
> With a non-concurrent load test of 1M total requests, a producer using the
> library (in /example/) was able to publish all the messages to the broker
> without any errors. But in the case of concurrent connections, I frequently
> got timeout errors or "writing to a closed socket" errors.
>
When a timeout error happens, you should handle it in Lua and avoid
any further write operations on that socket (the timed-out socket is
closed automatically by ngx_lua).
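For example (just a sketch; the "payload" variable and the bail-out
policy here are placeholders for your own publishing logic):

    local bytes, err = sock:send(payload)
    if not bytes then
        ngx.log(ngx.ERR, "failed to publish: ", err)
        -- on "timeout" (and other) errors the cosocket has already
        -- been closed by ngx_lua, so just bail out here instead of
        -- attempting more reads or writes on the same object
        return
    end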
To trace the causes of the timeout errors, you can use the
ngx-accept-queue and ngx-recv-queue tools in my Nginx Systemtap
Toolkit if you're on Linux:
https://github.com/agentzh/nginx-systemtap-toolkit
The dropwatch tool from Red Hat can serve as a good complement as well.
Other tools in my Nginx Systemtap Toolkit like ngx-sample-bt and
ngx-sample-bt-off-cpu for rendering various kinds of Flame Graphs can
be very useful in spotting bottlenecks and other issues.
Use the tools on both the Nginx worker processes and the RabbitMQ
server processes. It's common that
1. the backend server just cannot catch up with the traffic from Nginx, and/or
2. Nginx itself just cannot catch up with the client traffic.
Also, ensure that you use long enough timeout settings (i.e., via the
cosocket object's settimeout() method call).
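For example (the 10-second value below is arbitrary; tune it for your
own workload):

    local sock = ngx.socket.tcp()
    -- applies to all subsequent connect/send/receive operations:
    sock:settimeout(10000)  -- 10 sec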
> How may I make publishing fault tolerant and avoid duplicating messages in
> a concurrent environment by correctly implementing states and reusing
> sockets via the cosocket API's connection pool? I set the keepalive timeout
> on the cosocket TCP socket to 0 (no timeout, as per the wiki) and saw an
> exponential increase in sockets consumed via the RabbitMQ management web
> interface; for a load of 100k requests, I saw fluctuations between 200 and
> 1200 consumed sockets, with a lot of socket errors.
>
The number of concurrent connections depends on your client requests'
concurrency level. The size of the connection pool does not cap that
concurrency, because connections exceeding the pool size simply become
"short connections" (closed after use instead of being pooled). You
can use Nginx's ngx_limit_conn module to limit the client requests'
concurrency level, as in the sketch below.
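For example (just a sketch; the zone name, the per-address limit of
100, and the location are all made up for illustration):

    http {
        limit_conn_zone $binary_remote_addr zone=perip:10m;

        server {
            location /publish {
                limit_conn perip 100;  # at most 100 concurrent requests per IP
                content_by_lua_file conf/publish.lua;
            }
        }
    }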
Setting an unlimited max idle time for the in-pool connections (i.e.,
specifying 0 for the "max_idle_time" argument in the setkeepalive
call) is generally not a good idea, because a peak in the client
concurrency level may leave behind many extra idle connections that
are never closed.
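For example (the 60-second idle timeout and the pool size of 100 are
just arbitrary starting points):

    -- keep the connection in the per-worker pool with a bounded
    -- max idle time instead of 0 (unlimited):
    local ok, err = sock:setkeepalive(60000, 100)
    if not ok then
        ngx.log(ngx.ERR, "failed to put the connection into the pool: ", err)
        return
    end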
Best regards,
-agentzh