Hello World
The first thing to do is to create a TaskBlaster repository.
This can be done with the tb init command:
$ tb init
Created repository using module "taskblaster.repository" in "/home/myuser/tmprepo".
This will create a directory called tree/ which will eventually contain all of the tasks. It will also create a hidden .taskblaster directory. This directory contains the registry database, where metadata such as the state of each task is stored for efficient access.
You should never edit any files in this directory by hand.
You may view several other important paths using the tb info command:
$ tb info
Module: taskblaster.repository
Code: /home/docs/checkouts/readthedocs.org/user_builds/taskblaster/envs/latest/lib/python3.12/site-packages/taskblaster/repository.py
Root: /home/myuser/tmprepo
Tree: /home/myuser/tmprepo/tree
Registry: /home/myuser/tmprepo/.taskblaster/registry.db (0 entries)
Pythonpath: /home/myuser/tmprepo/src
Tasks: /home/myuser/tmprepo/tasks.py (not created)
Resources: /home/myuser/tmprepo/resources.py (not created)
Read only: False
A simple workflow with a single task
Create a file called workflow.py:
import taskblaster as tb


@tb.workflow
class Workflow:
    greeting = tb.var(default='hello')
    whom = tb.var()

    @tb.task
    def hello(self):
        return tb.node('greet', greeting=self.greeting, whom=self.whom)


def workflow(runner):
    runner.run_workflow(Workflow(whom='world'))
Here the class Workflow is the workflow that will be executed. It contains a single task, hello, which will run the Python function greet with the keyword arguments greeting and whom defined by the input to the workflow. The function greet is assumed to be located in a file tasks.py in the main working directory. The function workflow specifies that the Workflow class should be executed using the default value for greeting and whom='world'.
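To make the roles of the two variables concrete, the following plain-Python sketch mimics what the workflow describes without using TaskBlaster at all. The class WorkflowSketch is a hypothetical stand-in introduced only for illustration; the real tb.var, tb.task and tb.node do considerably more than returning a tuple.

```python
# Plain-Python analogue of the workflow above; this is NOT TaskBlaster code.
class WorkflowSketch:
    def __init__(self, *, whom, greeting='hello'):
        # 'whom' has no default and must be supplied; 'greeting' falls
        # back to 'hello', mirroring tb.var(default='hello').
        self.greeting = greeting
        self.whom = whom

    def hello(self):
        # Describe the task as (target function name, keyword arguments).
        return ('greet', {'greeting': self.greeting, 'whom': self.whom})


target, kwargs = WorkflowSketch(whom='world').hello()
print(target, kwargs)
```

Omitting whom raises a TypeError in this sketch, which loosely corresponds to whom being a workflow variable without a default.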
The next step is thus to create the file called tasks.py with the function greet.
def greet(greeting, whom):
    return f'{greeting}, {whom}!'
You can now run the workflow:
$ tb workflow workflow.py
entry: add new 0/0 tree/hello
This generates some tasks. To see at any time what tasks are there, use:
$ tb ls
state    info       tags        worker      time        folder
──────── ────────── ─────────── ─────────── ─────────── ─────────────────────────────
new      0/0                                            tree/hello
You can see that a task hello has been created and added to the tree.
Use tb view <path to view> to see more detailed information about tasks:
$ tb view .
name: hello
location: /home/myuser/tmprepo/tree/hello
state: new
target: greet(…)
wait for: 0 dependencies
depth: 0
source workflow: <root workflow>
frozen by: (not frozen)
latest handled inputs:
None
handlers:
<None>
handler data:
<None>
parents:
<task has no dependencies>
input:
["greet", {"greeting": "hello", "whom": "world"}]
output:
<task not finished yet>
Task has no run information
No custom actions defined for this task.
Here you can see, for example, the state of the task and the input arguments that were provided. Note that the task has only been added to the tree and has not yet been executed (it is in the new state).
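The input entry in the tb view output pairs the name of the target function with its keyword arguments. As a rough mental model only, such a record could be decoded and executed as in the sketch below. The lookup table TARGETS is an assumption made for illustration; TaskBlaster resolves the target itself, and its internal mechanism is not shown in this tutorial.

```python
import json


def greet(greeting, whom):
    return f'{greeting}, {whom}!'


# Hypothetical lookup table; TaskBlaster locates 'greet' on its own.
TARGETS = {'greet': greet}

# The record exactly as printed under "input:" by tb view.
record = '["greet", {"greeting": "hello", "whom": "world"}]'

# Split the record into the target name and its keyword arguments,
# then call the matching function with those arguments.
target_name, kwargs = json.loads(record)
result = TARGETS[target_name](**kwargs)
print(result)  # hello, world!
```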
To execute the task, run it with the tb run command:
$ tb run .
Starting worker rank=000 size=001
[rank=000 2025-03-17 10:08:36 N/A-0/1] Worker class: —
[rank=000 2025-03-17 10:08:36 N/A-0/1] Required tags: —
[rank=000 2025-03-17 10:08:36 N/A-0/1] Supported tags: —
[rank=000 2025-03-17 10:08:36 N/A-0/1] name: None
tags: —
required_tags: —
resources: None
max_tasks: None
subworker_size: None
subworker_count: None
wall_time: None
[rank=000 2025-03-17 10:08:36 N/A-0/1] Main loop
[rank=000 2025-03-17 10:08:36 N/A-0/1] Running hello ...
[rank=000 2025-03-17 10:08:36 N/A-0/1] Task hello finished in 0:00:00.000880
[rank=000 2025-03-17 10:08:36 N/A-0/1] No available tasks, end worker main loop
Running tb ls shows that the task is now in the done state:
$ tb ls
state    info       tags        worker      time        folder
──────── ────────── ─────────── ─────────── ─────────── ─────────────────────────────
done     0/0                    N/A-0/1     00:00:00    tree/hello
You can now view the output:
$ tb view .
name: hello
location: /home/myuser/tmprepo/tree/hello
state: done
target: greet(…)
wait for: 0 dependencies
depth: 0
source workflow: <root workflow>
frozen by: (not frozen)
latest handled inputs:
None
handlers:
[]
handler data:
<None>
parents:
<task has no dependencies>
input:
["greet", {"greeting": "hello", "whom": "world"}]
output:
'hello, world!'
Run information:
Worker name: N/A-0/1
Start time: 2025-03-17 10:08:36
End time: 2025-03-17 10:08:36
Duration: 0:00:00
Error: None
No custom actions defined for this task.
Congratulations, you finished your first small TaskBlaster workflow!
Adding new tasks
To add a new task to an existing workflow tree, we edit the original workflow.py script and save it to a new file, workflow2.py.
import taskblaster as tb


@tb.workflow
class Workflow:
    greeting = tb.var(default='hello')
    whom = tb.var()
    username = tb.var(default='User')

    @tb.task
    def hello(self):
        return tb.node('greet', greeting=self.greeting, whom=self.whom)

    @tb.task
    def hello_user(self):
        return tb.node('greet', greeting=self.greeting, whom=self.username)


def workflow(runner):
    runner.run_workflow(Workflow(whom='world', username='Tara'))
You can now see what happens when you run the workflow:
$ tb workflow workflow2.py
entry: have done 0/0 tree/hello
       add  new  0/0 tree/hello_user
Try running tb ls:
$ tb ls
state    info       tags        worker      time        folder
──────── ────────── ─────────── ─────────── ─────────── ─────────────────────────────
done     0/0                    N/A-0/1     00:00:00    tree/hello
new      0/0                                            tree/hello_user
As you can see, the old task hello is still done, but there is a new task in the new state in the tree.
Run the new task:
$ tb run .
Starting worker rank=000 size=001
[rank=000 2025-03-17 10:08:37 N/A-0/1] Worker class: —
[rank=000 2025-03-17 10:08:37 N/A-0/1] Required tags: —
[rank=000 2025-03-17 10:08:37 N/A-0/1] Supported tags: —
[rank=000 2025-03-17 10:08:37 N/A-0/1] name: None
tags: —
required_tags: —
resources: None
max_tasks: None
subworker_size: None
subworker_count: None
wall_time: None
[rank=000 2025-03-17 10:08:37 N/A-0/1] Main loop
[rank=000 2025-03-17 10:08:37 N/A-0/1] Running hello_user ...
[rank=000 2025-03-17 10:08:37 N/A-0/1] Task hello_user finished in 0:00:00.000860
[rank=000 2025-03-17 10:08:37 N/A-0/1] No available tasks, end worker main loop
$ tb ls
state    info       tags        worker      time        folder
──────── ────────── ─────────── ─────────── ─────────── ─────────────────────────────
done     0/0                    N/A-0/1     00:00:00    tree/hello
done     0/0                    N/A-0/1     00:00:00    tree/hello_user
The task is now marked as done and you can view the output with tb view.
Creating a conflict
We will now investigate what happens if we change the input to the workflow.
Change the input argument whom in workflow2.py
import taskblaster as tb


@tb.workflow
class Workflow:
    greeting = tb.var(default='hello')
    whom = tb.var()
    username = tb.var(default='User')

    @tb.task
    def hello(self):
        return tb.node('greet', greeting=self.greeting, whom=self.whom)

    @tb.task
    def hello_user(self):
        return tb.node('greet', greeting=self.greeting, whom=self.username)


def workflow(runner):
    runner.run_workflow(Workflow(whom='new world', username='Tara'))
and save it as a new file workflow3.py.
Now run the workflow:
$ tb workflow workflow3.py
entry: conflict done 0/0 ❄ C tree/hello
       have     done 0/0     tree/hello_user
You are informed that there is a conflict for the task hello.
You can also view the information about the conflict using the tb ls command:
$ tb ls -c sfcC
state    folder                        conflict    conflict info
──────── ───────────────────────────── ─────────── ───────────────
done     tree/hello                    conflict    Input changed. Old input ["greet", {"greeting": "hello", "whom": "world"}] New input: ["greet", {"greeting": "hello", "whom": "new world"}]
done     tree/hello_user               none        No conflict
The additional argument -c sfcC specifies that we want to see the state, folder, conflict state and conflict info (try tb ls --help to see all options). From the output we can see which task has a conflict and what the previous input to the function was. Note that the state of the task is still done, so no output files have been deleted. The conflict state merely informs the user which tasks are affected by the change of input parameters. We can choose to change the conflict state to resolved, meaning that we have noticed the conflict but want to continue working with this task based on the old input parameters:
$ tb resolve tree/hello
$ tb ls -csfcC
state    folder                        conflict    conflict info
──────── ───────────────────────────── ─────────── ───────────────
done     tree/hello                    resolved    Input changed. Old input ["greet", {"greeting": "hello", "whom": "world"}] New input: ["greet", {"greeting": "hello", "whom": "new world"}]
done     tree/hello_user               none        No conflict
The conflict state has now changed to resolved. If we change our minds, we can mark it as a conflict again:
$ tb unresolve tree/hello
$ tb ls -csfcC
state    folder                        conflict    conflict info
──────── ───────────────────────────── ─────────── ───────────────
done     tree/hello                    conflict    Input changed. Old input ["greet", {"greeting": "hello", "whom": "world"}] New input: ["greet", {"greeting": "hello", "whom": "new world"}]
done     tree/hello_user               none        No conflict
If we decide that we still want to go with the old input parameters, we can run the old workflow again:
$ tb workflow workflow2.py
entry: have done 0/0 tree/hello
       have done 0/0 tree/hello_user
$ tb ls -csfcC
state    folder                        conflict    conflict info
──────── ───────────────────────────── ─────────── ───────────────
done     tree/hello                    none        No conflict
done     tree/hello_user               none        No conflict
We can see that the conflict has disappeared.
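The conflict states seen so far can be summarised in a small model. This is a sketch of the behaviour observed in the listings above, not TaskBlaster's implementation; the function conflict_state and its resolved flag are hypothetical names chosen for illustration.

```python
def conflict_state(stored_input, workflow_input, resolved=False):
    """Return the conflict label for a finished task (illustrative model)."""
    if stored_input == workflow_input:
        # Model assumption: matching inputs always mean no conflict.
        return 'none'
    # Inputs differ: the user may have acknowledged this with tb resolve.
    return 'resolved' if resolved else 'conflict'


old = ['greet', {'greeting': 'hello', 'whom': 'world'}]
new = ['greet', {'greeting': 'hello', 'whom': 'new world'}]

print(conflict_state(old, new))                 # conflict (workflow3.py)
print(conflict_state(old, new, resolved=True))  # resolved (after tb resolve)
print(conflict_state(old, old))                 # none (workflow2.py again)
```

In every case the task state itself stays done; only the conflict label changes.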
However, suppose we want to run the task with the new input. We then have to unrun the task to first put it in the new state. This will delete the output from the task.
$ tb unrun tree/hello --force
unrun: done hello
1 task were unrun.
$ tb ls -csfcC
state    folder                        conflict    conflict info
──────── ───────────────────────────── ─────────── ───────────────
new      tree/hello                    none        No conflict
done     tree/hello_user               none        No conflict
The task is now in the new state. Rerun workflow3.py to apply the new input parameters:
$ tb workflow workflow3.py
entry: update new  0/0 tree/hello
       have   done 0/0 tree/hello_user
You can now see that the conflict has disappeared and the state has changed to new. With tb view you can also verify that the output has been deleted.
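The labels printed by tb workflow in this tutorial (add, have, conflict and update) follow a pattern that can be sketched as below. This is only a model inferred from the output shown here: the function classify and the dictionary layout are hypothetical, and TaskBlaster may distinguish further cases.

```python
def classify(tree, name, new_input):
    """Guess the label tb workflow would print for one task (toy model)."""
    if name not in tree:
        return 'add'        # task is not in the tree yet
    state, stored_input = tree[name]
    if stored_input == new_input:
        return 'have'       # same input, nothing to do
    if state == 'new':
        return 'update'     # not run yet, so the input can be replaced
    return 'conflict'       # already run with a different input


old = ['greet', {'greeting': 'hello', 'whom': 'world'}]
new = ['greet', {'greeting': 'hello', 'whom': 'new world'}]

print(classify({}, 'hello', old))                        # add
print(classify({'hello': ('done', old)}, 'hello', old))  # have
print(classify({'hello': ('done', old)}, 'hello', new))  # conflict
print(classify({'hello': ('new', old)}, 'hello', new))   # update
```

The last two calls mirror this section: the changed input first produces a conflict for the finished task, and only after tb unrun has put the task back in the new state is the input updated.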
It is now possible to run the task with the new input parameters:
$ tb run tree/hello
Starting worker rank=000 size=001
[rank=000 2025-03-17 10:08:38 N/A-0/1] Worker class: —
[rank=000 2025-03-17 10:08:38 N/A-0/1] Required tags: —
[rank=000 2025-03-17 10:08:38 N/A-0/1] Supported tags: —
[rank=000 2025-03-17 10:08:38 N/A-0/1] name: None
tags: —
required_tags: —
resources: None
max_tasks: None
subworker_size: None
subworker_count: None
wall_time: None
[rank=000 2025-03-17 10:08:38 N/A-0/1] Main loop
[rank=000 2025-03-17 10:08:38 N/A-0/1] Running hello ...
[rank=000 2025-03-17 10:08:38 N/A-0/1] Task hello finished in 0:00:00.000886
[rank=000 2025-03-17 10:08:38 N/A-0/1] No available tasks, end worker main loop
You can then verify that both the input and the output have been updated:
$ tb view tree/hello
name: hello
location: /home/myuser/tmprepo/tree/hello
state: done
target: greet(…)
wait for: 0 dependencies
depth: 0
source workflow: <root workflow>
frozen by: (not frozen)
latest handled inputs:
None
handlers:
[]
handler data:
<None>
parents:
<task has no dependencies>
input:
["greet", {"greeting": "hello", "whom": "new world"}]
output:
'hello, new world!'
Run information:
Worker name: N/A-0/1
Start time: 2025-03-17 10:08:38
End time: 2025-03-17 10:08:38
Duration: 0:00:00
Error: None
No custom actions defined for this task.