Celery
Quick introduction to Celery
Outline
self.__dict__
Use task queues
Celery and RabbitMQ
Getting started with RabbitMQ
Getting started with Celery
Periodic tasks
Examples
self.__dict__
{'name': 'Òscar Vilaplana',
 'origin': 'Catalonia',
 'company': 'Paylogic',
 'tags': ['developer', 'architect', 'geek'],
 'email': '[email protected]',
}
Proposal
- Take a slow task.
- Decouple it from your system.
- Call it asynchronously.
Separate projects
Separate projects allow us to:
- Divide your system into sections
  - e.g. frontend, backend, mailing, report generator...
- Tackle them individually
- Conquer them: declare them Done:
  - Clean code
  - Clean interface
  - Unit tested
  - Maintainable
(but this is not only for Celery tasks)
Coupled Tasks
In some cases, it may not be possible to decouple some tasks. Then we:
- Have some workers in your system's network
  - with access to the code of your system
  - with access to the system's database
- They handle messages from certain queues, e.g. internal.#
Candidates
Processes that:
- Need a lot of memory.
- Are slow.
- Depend on external systems.
- Need a limited amount of data to work (easy to decouple).
- Need to be scalable.
Examples:
- Render complex reports.
- Import big files.
- Send e-mails.
Example: sending complex emails
Create an independent project: yourappmail
- Generator of complex e-mails.
- It needs the templates, images...
- It doesn't need access to your system's database.
- Deploy it on servers of our own, or on Amazon servers
  - We can add/remove them as we need them
- On startup:
  - Join the RabbitMQ cluster
  - Start celeryd
- Normal operation: 1 server is enough
- On high load: start as many servers as needed (tps_peak / tps_per_server)
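The scaling rule above can be sketched as a small capacity calculation. This is an illustration only; the function and variable names (servers_needed, tps_peak, tps_per_server) are assumptions, not from the talk:

```python
import math

def servers_needed(tps_peak, tps_per_server):
    """How many workers to start so that peak throughput is covered."""
    return max(1, math.ceil(tps_peak / float(tps_per_server)))

# e.g. a peak of 1000 tasks/s, with one server handling 300 tasks/s:
servers_needed(1000, 300)  # 4 servers
```

On low load the same formula collapses back to the single server that normal operation needs.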
yourappmail
A decoupled email generator:
- Has a clean API
- Decoupled from your system's db: it needs to receive all information
  - Customer information
  - Custom data
  - Contents of the email
- Can be deployed to as many servers as we need
- Scalable
Not for everything
- Task queues are not a magic wand to make things faster
- They can be used as such (like a cache), but that hides the real problem.
Celery
- Asynchronous distributed task queue
- Based on distributed message passing.
- Mostly for real-time queuing
- Can do scheduling too.
- REST: you can query status and results via URLs.
- Written in Python
- Celery: Message Brokers and Result Storage
Celery's tasks
- Tasks can be async or sync
- Low latency
- Rate limiting
- Retries
- Each task has a UUID: you can ask for the result back if you know the task UUID.
RabbitMQ
- Messaging system
- Protocol: AMQP, an open standard for messaging middleware
- Written in Erlang
- Easy to cluster!
Install the packages from the RabbitMQ website
- RabbitMQ Server
- Management Plugin (nice HTML interface)
  - rabbitmq-plugins enable rabbitmq_management
- Go to http://localhost:55672/cli/ and download the cli.
- HTML interface at http://localhost:55672/
Set up a cluster
rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]}]},{running_nodes,[rabbit@rabbit1]}]
...done.

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.

rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.

rabbit2$ rabbitmqctl cluster rabbit@rabbit1
Clustering node rabbit@rabbit2 with [rabbit@rabbit1] ...done.

rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.
Notes
- Automatic configuration
  - Use a .config file to describe the cluster.
- Change the type of the node
  - RAM node
  - Disk node
Install Celery
- Just pip install celery
Define a task
Example tasks.py
from celery.task import task

@task
def add(x, y):
    print "I received the task to add {} and {}".format(x, y)
    return x + y
Configure username, vhost, permissions
$ rabbitmqctl add_user myuser mypassword
$ rabbitmqctl add_vhost myvhost
$ rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"
Configuration file
Write celeryconfig.py
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "myuser"
BROKER_PASSWORD = "mypassword"
BROKER_VHOST = "myvhost"
CELERY_RESULT_BACKEND = "amqp"
CELERY_IMPORTS = ("tasks", )
Launch daemon
celeryd -I tasks # import the tasks module
Schedule tasks
from tasks import add
# Schedule the task
result = add.delay(1, 2)
value = result.get()  # value == 3
Schedule tasks by name
Sometimes the tasks module is not available on the clients, so the task is scheduled by its name instead:

from celery.execute import send_task

# Schedule the task by name
result = send_task("tasks.add", args=[1, 2])
value = result.get()  # value == 3
print value
Schedule the tasks better: apply_async
task.apply_async has more options:
- countdown=n: the task will run at least n seconds in the future.
- eta=datetime: the task will not run earlier than datetime.
- expires=n or expires=datetime: the task will be revoked in n seconds or at datetime
  - It will be marked as REVOKED
  - result.get will raise a TaskRevokedError
- serializer
  - pickle: default, unless CELERY_TASK_SERIALIZER says otherwise.
  - alternatives: json, yaml, msgpack
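How countdown, eta and expires relate can be sketched with the stdlib alone. The apply_async calls are shown commented out because they need a running broker and the add task; the variable names are illustrative:

```python
from datetime import datetime, timedelta

now = datetime.utcnow()
countdown = 60                       # run at least 60 seconds in the future
eta = now + timedelta(seconds=60)    # the same moment, as an explicit ETA
expires = now + timedelta(hours=1)   # revoke the task if it hasn't run by then

# With a broker running, these could be submitted as:
# add.apply_async((2, 2), countdown=countdown)
# add.apply_async((2, 2), eta=eta, expires=expires)
```

The expiry is deliberately later than the ETA, so the task gets a window in which to run before it is revoked.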
Result
A result has some useful operations:
- successful: True if the task succeeded
- ready: True if the result is ready
- revoke: cancel the task.
- result: if the task has been executed, this contains the result; if it raised an exception, it contains the exception instance
- state:
  - PENDING
  - STARTED
  - RETRY
  - FAILURE
  - SUCCESS
TaskSet
Run several tasks at once. The result keeps the order.
from celery.task.sets import TaskSet
from tasks import add

job = TaskSet(tasks=[
    add.subtask((4, 4)),
    add.subtask((8, 8)),
    add.subtask((16, 16)),
    add.subtask((32, 32)),
])
result = job.apply_async()
result.ready()       # True -- all subtasks completed
result.successful()  # True -- all subtasks successful
values = result.join()  # [8, 16, 32, 64]
print values
TaskSetResult
The TaskSetResult has some interesting properties:
- successful: if all of the subtasks finished successfully (no Exception)
- failed: if any of the subtasks failed.
- waiting: if any of the subtasks is not ready yet.
- ready: if all of the subtasks are ready.
- completed_count: number of completed subtasks.
- revoke: revoke all subtasks.
- iterate: iterate over the return values of the subtasks once they finish (sorted by finish order).
- join: gather the results of the subtasks and return them in a list (sorted by the order in which they were called).
Retrying tasks
If the task fails, you can retry it by calling retry():

@task
def send_twitter_status(oauth, tweet):
    try:
        twitter = Twitter(oauth)
        twitter.update_status(tweet)
    except (Twitter.FailWhaleError, Twitter.LoginError), exc:
        send_twitter_status.retry(exc=exc)

To limit the number of retries, set task.max_retries.
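A common refinement, not shown in the talk, is to wait longer between successive retries. A sketch of such an exponential-backoff policy (the function name and base are assumptions):

```python
def retry_countdown(retries, base=2):
    """Seconds to wait before the next retry: 2, 4, 8, 16, ... (hypothetical policy)."""
    return base * 2 ** retries

[retry_countdown(n) for n in range(4)]  # [2, 4, 8, 16]
```

The computed value would be passed to retry() as its countdown argument, together with exc.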
Routing
apply_async accepts a routing_key parameter; RabbitMQ queues are bound to routing-key patterns, e.g.:

pdf: ticket.#
import_files: import.#

- Schedule the task to the appropriate queue:

import_vouchers.apply_async(args=[filename], routing_key="import.vouchers")
generate_ticket.apply_async(args=barcodes, routing_key="ticket.generate")
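The bindings above use AMQP topic patterns, where # matches zero or more dot-separated words and * matches exactly one. A minimal sketch of that matching rule (this is an illustration, not the broker's implementation):

```python
def topic_matches(pattern, routing_key):
    """Simplified AMQP topic matching: '#' swallows the rest of the key,
    '*' matches exactly one word, anything else must match literally."""
    p_words = pattern.split(".")
    k_words = routing_key.split(".")
    for i, p in enumerate(p_words):
        if p == "#":
            return True
        if i >= len(k_words):
            return False
        if p != "*" and p != k_words[i]:
            return False
    return len(p_words) == len(k_words)

topic_matches("import.#", "import.vouchers")  # True: goes to import_files
topic_matches("ticket.#", "import.vouchers")  # False: pdf queue ignores it
```

So "import.vouchers" lands in the import_files queue and "ticket.generate" in the pdf queue, each consumed by its own pool of workers.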
celerybeat
from celery.schedules import crontab
CELERYBEAT_SCHEDULE = {
    # Executes every Monday morning at 7:30 A.M.
    "every-monday-morning": {
        "task": "tasks.add",
        "schedule": crontab(hour=7, minute=30, day_of_week=1),
        "args": (16, 16),
    },
}
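To see what that entry means in calendar terms: crontab(hour=7, minute=30, day_of_week=1) fires on Mondays at 07:30 (Celery counts Sunday as day 0). A stdlib-only check of that condition, as a sketch; Celery's crontab is far richer than this:

```python
from datetime import datetime

def matches_monday_0730(dt):
    """True if dt falls on a Monday at exactly 07:30.
    Python's weekday() counts Monday as 0, unlike Celery's crontab."""
    return dt.weekday() == 0 and dt.hour == 7 and dt.minute == 30

matches_monday_0730(datetime(2024, 1, 1, 7, 30))  # True: 2024-01-01 was a Monday
```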
There can be only one celerybeat running
- But we can have two machines that check on each other.
Import a big file
tasks.py
@task
def import_bigfile(server, filename):
    with create_temp_file() as tmp:
        fetch_bigfile(tmp, server, filename)
        import_file(tmp)  # the actual import routine (a separate helper, not this task)
        report_result(...)  # e.g. send confirmation e-mail
Import big file: Admin interface, server-side
import tasks

def import_bigfile(filename):
    result = tasks.import_bigfile.delay(filename)
    return result.task_id

class ImportBigfile(View):
    def post_ajax(request):
        filename = request.get('big_file')
        task_id = import_bigfile(filename)
        return task_id
Import big file: Admin interface, client-side
- Post the file asynchronously
- Get the task_id back
- Put up some "working..." message.
- Periodically ask Celery if the task is ready and change "working..." into "done!"
- No need to call Paylogic code: just ask Celery directly
- Improvements:
  - Send the username to the task.
  - Have the task call back the Admin interface when it's done.
  - The Backoffice can send an e-mail to the user when the task is done.
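The client-side polling loop above can be sketched with a stand-in for the Celery result, since the browser only ever asks "is it ready yet?". FakeResult and wait_until_done are hypothetical names for illustration:

```python
import time

class FakeResult:
    """Stand-in for a Celery AsyncResult that becomes ready after a few polls."""
    def __init__(self, polls_until_done):
        self._polls = polls_until_done

    def ready(self):
        self._polls -= 1
        return self._polls <= 0

def wait_until_done(result, poll_interval=0.0):
    """Show 'working...' while polling; switch to 'done!' once the task is ready."""
    status = "working..."
    while not result.ready():
        time.sleep(poll_interval)  # in a browser this would be a timer + AJAX call
    status = "done!"
    return status

wait_until_done(FakeResult(3))  # "done!"
```

In the real page the poll would hit a status URL keyed by the task_id returned from post_ajax.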
Do a time-consuming task.
from tasks import do_difficult_thing

# ... stuff ...
# I have all data necessary to do the difficult thing
difficult_result = do_difficult_thing.delay(some, values)

# I don't need the result just yet, I can keep myself busy
# ... stuff ...

# Now I really need the result
difficult_value = difficult_result.get()