How to set up queue mode properly

Hello everyone, I’m trying to set up ActivePieces in queue mode properly.

I want this:

Main instance: receives all the webhooks, adds the jobs to the Redis queue, and serves the UI.
Workers: only process tasks from the Redis queue.

So I will set AP_QUEUE_MODE=REDIS on the main instance.
I think for the main instance I have to set: AP_FLOW_WORKER_CONCURRENCY=0
And for each worker instance: AP_FLOW_WORKER_CONCURRENCY=20

Do I need to add all the other variables to the workers?

Example:
AP_FRONTEND_URL
AP_QUEUE_MODE
AP_EXECUTION_MODE
… etc?

Or just the concurrency and the database/Redis config for the workers, with all the other variables being “held” by the main instance? Or am I missing something?
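For reference, here is a minimal sketch of the split I’m aiming for, assuming both roles run the same `activepieces/activepieces` image and share a `common.env` file (the image tag, file name, and port mapping are my assumptions, not anything official):

```shell
# common.env holds AP_QUEUE_MODE=REDIS plus the shared Postgres/Redis/secret settings.

# Main instance: serves the UI and receives webhooks, processes no flows.
docker run -d --name ap-main \
  --env-file common.env \
  -e AP_FLOW_WORKER_CONCURRENCY=0 \
  -p 80:80 \
  activepieces/activepieces:latest

# Worker: no exposed port, just consumes jobs from the Redis queue.
docker run -d --name ap-worker-1 \
  --env-file common.env \
  -e AP_FLOW_WORKER_CONCURRENCY=20 \
  activepieces/activepieces:latest
```

The idea is that the only variable that differs between the two roles is the concurrency.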

Also, what is the 5-minute thing? I think it’s this variable: AP_TRIGGER_DEFAULT_POLL_INTERVAL?

Thanks.

@abuaboud could you kindly assist with this query?

Hi @kishanprmr @abuaboud, I tried setting this up using the AI support in the docs, but I think I’m missing some variables…

ENV of my main instance:

AP_FRONTEND_URL=https://xxxxxxx.com
AP_DB_TYPE=POSTGRES
AP_SANDBOX_RUN_TIME_SECONDS=600
AP_POSTGRES_USERNAME=root
AP_REDIS_PORT=32450
AP_FLOW_WORKER_CONCURRENCY=0
AP_ENGINE_EXECUTABLE_PATH=dist/packages/engine/main.js
AP_QUEUE_UI_USERNAME=admin
AP_OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AP_QUEUE_UI_ENABLED=true
AP_POSTGRES_PASSWORD=Hxxxxxxxxxxxxxxxxxxxxxxxxxe
PORT=80
AP_NODE_EXECUTABLE_PATH=/usr/local/bin/node
AP_ENCRYPTION_KEY=lxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx69
AP_POSTGRES_URL=postgresql://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:30862/xxxxx
AP_JWT_SECRET=jrxxxxxxxxxxxxxxxxxxxxx06
AP_EXECUTION_DATA_RETENTION_DAYS=15
AP_ENVIRONMENT=prod
AP_TELEMETRY_ENABLED=false
AP_POSTGRES_PORT=30862
AP_QUEUE_UI_PASSWORD=xxxxxxxxxxxxxxxxx
AP_EXECUTION_MODE=UNSANDBOXED
AP_SIGN_UP_ENABLED=false
AP_WEBHOOK_TIMEOUT_SECONDS=30
AP_POSTGRES_HOST=sfo1.xxxxxxxxxxxxxxxxxxx.com
AP_QUEUE_MODE=REDIS
AP_REDIS_URL=redis://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:32450
AP_POSTGRES_DATABASE=xxxxxxxxxxxxxxxxxx
AP_REDIS_HOST=sfo1.xxxxxxxxxxxxxx.com
AP_TRIGGER_DEFAULT_POLL_INTERVAL=5
AP_TEMPLATES_SOURCE_URL=https://cloud.activepieces.com/api/v1/flow-templates
AP_REDIS_PASSWORD=Exxxxxxxxxxxxxxxxxxxxxxxx64U

This is my worker env config:

AP_JWT_SECRET=jrxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx06
AP_ENCRYPTION_KEY=lxxxxxxxxxxxxxxxxxxxxxxx69
AP_FLOW_WORKER_CONCURRENCY=20
AP_QUEUE_MODE=REDIS
AP_REDIS_URL=redis://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:32450
AP_POSTGRES_URL=postgresql://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:30862/xxxxxxx
AP_DB_TYPE=POSTGRES

The UI and editor work OK…
But when I send a webhook, I can see in the logs that the worker picks up the task and tries to complete it, but then I get this error:

03/28 06:42:03 {"level":30,"time":1711604520873,"pid":8,"hostname":"service-6604dfd4b3a6904a5affa4a5-0","msg":"[FlowWorker#executeFlow] flowRunId=qBZ3Clu28QVRiQVKAvZuV executionType=BEGIN"}

03/28 06:42:03 {"level":30,"time":1711604522517,"pid":8,"hostname":"service-6604dfd4b3a6904a5affa4a5-0","msg":"[FlowWorker#executeFlow] flowRunId=qBZ3Clu28QVRiQVKAvZuV sandboxId=36 prepareTime=1644ms"}

03/28 06:42:03 {"level":50,"time":1711604522532,"pid":8,"hostname":"service-6604dfd4b3a6904a5affa4a5-0","err":{"type":"TypeError","message":"Cannot read properties of undefined (reading 'endsWith')","stack":"TypeError: Cannot read properties of undefined (reading 'endsWith')\n at /usr/src/app/dist/packages/server/api/main.js:1:738424\n at Generator.next (<anonymous>)\n at /usr/src/app/dist/packages/server/api/node_modules/tslib/tslib.js:169:75\n at new Promise (<anonymous>)\n at Object.__awaiter (/usr/src/app/dist/packages/server/api/node_modules/tslib/tslib.js:165:16)\n at t.getServerUrl (/usr/src/app/dist/packages/server/api/main.js:1:737864)\n at Object.<anonymous> (/usr/src/app/dist/packages/server/api/main.js:1:709257)\n at Generator.next (<anonymous>)\n at fulfilled (/usr/src/app/dist/packages/server/api/node_modules/tslib/tslib.js:166:62)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"},"msg":"Cannot read properties of undefined (reading 'endsWith')"}

03/28 06:42:03 {"level":50,"time":1711604522532,"pid":8,"hostname":"service-6604dfd4b3a6904a5affa4a5-0","err":{"type":"TypeError","message":"Cannot read properties of undefined (reading 'endsWith')","stack":"TypeError: Cannot read properties of undefined (reading 'endsWith')\n at /usr/src/app/dist/packages/server/api/main.js:1:738424\n at Generator.next (<anonymous>)\n at /usr/src/app/dist/packages/server/api/node_modules/tslib/tslib.js:169:75\n at new Promise (<anonymous>)\n at Object.__awaiter (/usr/src/app/dist/packages/server/api/node_modules/tslib/tslib.js:165:16)\n at t.getServerUrl (/usr/src/app/dist/packages/server/api/main.js:1:737864)\n at Object.<anonymous> (/usr/src/app/dist/packages/server/api/main.js:1:709257)\n at Generator.next (<anonymous>)\n at fulfilled (/usr/src/app/dist/packages/server/api/node_modules/tslib/tslib.js:166:62)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"},"msg":"[FlowWorker#executeFlow] Error executing flow run idqBZ3Clu28QVRiQVKAvZuV"}

Also this is from the queue UI:

What am I missing?
Thanks.

Pretty disappointed with the lack of support… I’m not asking someone to build a queue setup for me, I’m just asking basic questions about why I get these errors and how to set up queue mode properly… I can’t believe this post was created 8 days ago and no one has had time to say anything…

Hi @yukyo

Currently, both the API and the worker take the exact same environment variables, and we haven’t documented which variables apply to which role, so I recommend giving them the exact same variables (with different concurrency).

We are working on completely separating the worker from the backend; this will take around 1–2 months to be completed and documented.

So if you give them the exact same variables, do you face any issues?

Hi @abuaboud

If I give them the same configuration, but set the main instance to concurrency = 0 and the other instances to 20:

Can you explain why if I send 100 webhooks to be processed, the “main” instance takes about 2 minutes to “add tasks to queue,” and then workers start to process tasks?

I’m very happy with how the workers process tasks. They do perfect work, super fast, as soon as those tasks are available in the Redis (Bull) queue. But when I send 100 webhooks, for example, they take a considerable amount of time to appear in the Redis queue.
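For reproducibility, the kind of burst I’m sending looks roughly like this (the hostname and FLOW_ID in the webhook URL are placeholders for my real flow’s endpoint):

```shell
# Fire 100 webhook requests, up to 20 in flight at a time, printing each HTTP status.
seq 1 100 | xargs -P 20 -I{} curl -s -o /dev/null -w '%{http_code}\n' \
  -X POST "https://xxxxxxx.com/api/v1/webhooks/FLOW_ID" \
  -H 'Content-Type: application/json' \
  -d '{"test": true}'
```

With this, the requests are all accepted within a second or two, but the corresponding jobs only show up in Redis minutes later.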

Can I add multiple instances to “receive the webhooks”? Or only main does that?

Thanks.

Hi @yukyo

Good question. All of the instances can accept webhooks, and they will then be sent to Redis. I am curious why they take so long to be added to the queue on the main instance while it’s fast on the workers.

It’s good to take a look at the main instance’s logs for EXECUTE_TRIGGER and see how much time that takes. Please note the first webhook received will take longer (< 10 seconds).
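To make that concrete, one quick way to pull those timings out of the main instance’s logs; note the log lines below are an illustrative sample I made up, not the exact Activepieces format:

```shell
# Sample log lines in the rough shape of the JSON logs above (format is illustrative).
cat > ap.log <<'EOF'
{"level":30,"msg":"[EngineHelper#execute] operation=EXECUTE_TRIGGER took=8211ms"}
{"level":30,"msg":"[FlowWorker#executeFlow] flowRunId=abc prepareTime=1644ms"}
{"level":30,"msg":"[EngineHelper#execute] operation=EXECUTE_TRIGGER took=312ms"}
EOF

# Keep only the trigger-execution lines so the timings stand out.
grep 'EXECUTE_TRIGGER' ap.log
```

If the first EXECUTE_TRIGGER takes several seconds and the rest are fast, the slowness is the cold start; if they are all slow, the bottleneck is per-webhook.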

In the new architecture we are working on, everything related to the engine will be pushed to Redis and executed by the workers.

Thanks for the response @abuaboud

I thought that setting the main instance with the var “AP_FLOW_WORKER_CONCURRENCY” = 0 would make that instance not PROCESS ANY TASKS and just focus on “receiving the webhooks.”

What is the reason for making the main instance AP_FLOW_WORKER_CONCURRENCY = 0?

So, workers also receive webhooks? Isn’t that inefficient?

I have been using N8N for some years. When you configure queue mode there, you set up a webhook instance (that receives the API requests and adds the tasks to Redis) and worker instances that process the tasks in Redis. So ActivePieces queue mode is like a free-for-all? Everyone does everything?

Could you please also explain to me exactly what this variable does?

What happens if I set it to 1?
Or to 0?

Hi @yukyo,

Currently, yes, it’s like that, but there are two significant improvements we are working on.

Firstly, we’re separating the workers from the backend. Secondly, we’re implementing an optimization for single-tenant machines that isolates execution within the same process, rather than spawning a new “node” process each time, which has been a significant performance bottleneck (CPU usage). It’s still fast, but the throughput isn’t great; the optimization should get us at least a 10x speedup.

Any engine operation is costly, for example EXECUTE_TRIGGER, which checks the trigger payload and applies any transforms to the webhook. Therefore, we plan to move all these operations behind a queue to be handled exclusively by workers.

I’m curious about why you’re moving away from n8n?

Edit: I forgot to say that I updated the environment variable descriptions; please let me know if they’re clear.

Hi @abuaboud

Yeah, the variable descriptions are much clearer now…

The reason I’m trying to move off N8N is that, to me, they’ve lost the “open-source” vibe they had and now focus exclusively on non-self-hosted users… They release very buggy versions, it’s a pain to install npm packages because it requires redeploying everything, and the execution history can’t be externalized, for example to Axiom or any Sentry-style logger, unless you get an Enterprise version (15k euros)… etc. (ActivePieces doesn’t offer externalization of execution history either, but I hope someday that feature will be available for everyone and not only for paid customers.)

In general, I’ve found N8N pretty buggy in the last year, so I am looking for solid alternatives… My current N8N instance processes about 50k requests/day with a main instance + 10 workers…

There is a lot of stuff that I love about ActivePieces: the clean UI is much better, the code node is much better, it’s super easy to add NPM packages, and the interface is very solid. In my opinion, there are some options that should be available in the hosted solution without being paid features, but…

I mean, the target paid users for ActivePieces are people who want to automate their stuff but don’t want to struggle with hosting it. I’m more than happy for ActivePieces, or any other tool like N8N, to have paid features, but don’t put “key features” like Git sync or logging behind a paywall… Make company stuff paid: inviting multiple users, roles, white-labeling, etc.

Back to workers, main, etc.: I think N8N has defined that a lot better than ActivePieces, in that each instance does just one thing. Usually, when you have polyvalent instances, they perform poorly at everything; that’s why I much prefer to have:

  • Webhook instance or instances (when more than one, connected through a balancer): ONLY receive API requests and add the TASKS to the REDIS queue.
  • Workers: ONLY process tasks AVAILABLE in the REDIS queue.

There is still a lot I need to learn about ActivePieces, but overall I like the tool. Right now, though, it feels too slow: when I send a burst of 100 API requests in 1 second, it takes a few minutes to react…

As soon as the tasks are added to the queue, the workers finish them quickly; the issue is the time from when a request is sent to ActivePieces until the workers start to process it…

Happy to share the logs if you want…

Hi,

We usually print how long EXECUTE_TRIGGER takes in the logs and would love to know its overall impact. I’ll wait for the new architecture to kick in before we proceed to version 1.0! :blush:

We lock features based on “buyer persona,” following GitLab’s approach, which we will clarify in our website handbook soon. We offer self-hosted/cloud for all versions.

I assume you are an individual, or at least not an enterprise, but still need Git sync. I would love to hear how you use it, because in Activepieces flow versions are supported for free and are built in.

The main use of Git sync is primarily to construct an approval process (at least that’s how it’s designed in Activepieces), so I would love to know your specific use case there.

As for external logging, I’ve heard of this feature before but never fully understood what problem it solves. What issue are you facing that requires an external logging system? If you could describe it, that would be great.

Hey @abuaboud .

I think we are probably talking about different use cases for Git sync. I would love to have a way to “back up” everything (every workflow, every setting) using Git to GitHub or GitLab or whatever, in case of data loss. Technically the data is not important, but the workflows, webhook URLs, tokens, headers, configurations, npm packages, etc. are…

Well, if I want to look up the data for an execution using an ID from a webhook request, I can’t do that with the current ActivePieces history; I need to know the time and date, which is not always available. Or I just want to chart the data like I do with Axiom for other things, for example Lambda executions, which helps me monitor executions, monitor data, and get a quick view of what’s happening in my ActivePieces instance…

If I’m able to push the data executions to Axiom for example, I can build stuff like this:

Or see the data like this, where I can search by query, field, workflow name, etc…
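For context, shipping run summaries to Axiom is just an HTTP POST to its dataset ingest endpoint; a rough sketch, where the dataset name, token variable, and payload fields are all placeholders of mine, not anything ActivePieces produces today:

```shell
# Push a batch of execution records into a hypothetical Axiom dataset "ap-executions".
# $AXIOM_TOKEN must hold an Axiom API token with ingest permission.
curl -s -X POST "https://api.axiom.co/v1/datasets/ap-executions/ingest" \
  -H "Authorization: Bearer $AXIOM_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[{"flowName": "my-flow", "flowRunId": "qBZ3Clu28QVRiQVKAvZuV",
        "status": "SUCCEEDED", "durationMs": 1644}]'
```

Once records like these are in a dataset, Axiom lets you filter by any field (flowRunId, status, etc.), which is exactly the lookup-by-ID workflow described above.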


Hi @yukyo,

Log search is interesting indeed; I will get back to you on that soon. The Git sync we have is not designed for that purpose; it seems you’re looking for a backup solution, as you described. I’m wondering which database you are using (assuming Postgres). Don’t you have a backup in place for such cases? Are you self-managing it or using a cloud-managed one?

As for other use cases, such as reverting to older versions, this should be supported by the user interface.

Thank you,
Mo.

Hi @abuaboud

Yeah, I’m using Postgres…

Log search is super interesting; you can quickly find patterns by searching for specific values or grouping data…

About the backup: yes, I know I can back up my PostgreSQL database, but I’m not talking about that. I’m talking about being able to push to GitHub a backup file of my ActivePieces instance that contains ALL the workflows, all settings, all pieces, etc.

So at any time I can restore from that backup file, or spin up a whole new ActivePieces instance, import from that file, and everything is ready in seconds…

Copying/restoring a database is a pain… especially when there is “execution data” linked to a specific instance…

PS: Why is ActivePieces soooo RAM-hungry? I was looking at the metrics the other day, and 5 workers eat about 12 GB of RAM constantly…

Thanks.

This topic was automatically closed 15 days after the last reply. New replies are no longer allowed.