Redun

Here, we’ll see how to track redun workflow runs with LaminDB.

Note

This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.

!lamin init --storage ./test-redun-lamin --schema bionty
Operations to perform:
  Apply all migrations: bionty, contenttypes, lamindb
Running pre-migrate handlers for application contenttypes
Running pre-migrate handlers for application lamindb
Running pre-migrate handlers for application bionty
Running migrations:
  Applying lamindb.0069_squashed... OK (1.982s)
  Applying lamindb.0070_lamindbv1_migrate_data... OK (1.193s)
  Applying lamindb.0071_lamindbv1_migrate_schema... OK (4.951s)
  Applying lamindb.0072_remove_user__branch_code_remove_user_aux_and_more... OK (0.798s)
  Applying lamindb.0073_merge_ourprojects... OK (1.393s)
  Applying lamindb.0074_lamindbv1_part4... OK (4.158s)
  Applying bionty.0041_squashed... OK (11.058s)
  Applying bionty.0042_lamindbv1... OK (6.628s)
  Applying bionty.0043_lamindbv2_part2... OK (2.172s)
  Applying bionty.0044_alter_cellline_space_alter_cellmarker_space_and_more... OK (2.291s)
  Applying bionty.0045_rename_aux_cellline__aux_rename_aux_cellmarker__aux_and_more... OK (2.268s)
  Applying bionty.0046_alter_cellline__aux_alter_cellmarker__aux_and_more... OK (2.271s)
  Applying lamindb.0075_lamindbv1_part5... OK (5.580s)
  Applying bionty.0047_lamindbv1_part5... OK (3.227s)
  Applying contenttypes.0001_initial... OK (0.011s)
  Applying contenttypes.0002_remove_content_type_name... OK (0.157s)
Running post-migrate handlers for application contenttypes
Adding content type 'contenttypes | contenttype'
Running post-migrate handlers for application lamindb
Adding content type 'lamindb | feature'
Adding content type 'lamindb | param'
Adding content type 'lamindb | user'
Adding content type 'lamindb | artifact'
Adding content type 'lamindb | collection'
Adding content type 'lamindb | collectionartifact'
Adding content type 'lamindb | featurevalue'
Adding content type 'lamindb | artifactfeaturevalue'
Adding content type 'lamindb | paramvalue'
Adding content type 'lamindb | artifactparamvalue'
Adding content type 'lamindb | run'
Adding content type 'lamindb | runparamvalue'
Adding content type 'lamindb | storage'
Adding content type 'lamindb | transform'
Adding content type 'lamindb | ulabel'
Adding content type 'lamindb | collectionulabel'
Adding content type 'lamindb | artifactulabel'
Adding content type 'lamindb | space'
Adding content type 'lamindb | transformulabel'
Adding content type 'lamindb | artifactproject'
Adding content type 'lamindb | artifactreference'
Adding content type 'lamindb | collectionproject'
Adding content type 'lamindb | collectionreference'
Adding content type 'lamindb | person'
Adding content type 'lamindb | project'
Adding content type 'lamindb | reference'
Adding content type 'lamindb | transformproject'
Adding content type 'lamindb | transformreference'
Adding content type 'lamindb | schema'
Adding content type 'lamindb | artifactschema'
Adding content type 'lamindb | schemafeature'
Adding content type 'lamindb | schemaparam'
Running post-migrate handlers for application bionty
Adding content type 'bionty | artifactcellline'
Adding content type 'bionty | artifactcellmarker'
Adding content type 'bionty | artifactcelltype'
Adding content type 'bionty | artifactdevelopmentalstage'
Adding content type 'bionty | artifactdisease'
Adding content type 'bionty | artifactethnicity'
Adding content type 'bionty | artifactexperimentalfactor'
Adding content type 'bionty | artifactgene'
Adding content type 'bionty | artifactorganism'
Adding content type 'bionty | artifactpathway'
Adding content type 'bionty | artifactphenotype'
Adding content type 'bionty | artifactprotein'
Adding content type 'bionty | artifacttissue'
Adding content type 'bionty | cellline'
Adding content type 'bionty | cellmarker'
Adding content type 'bionty | celltype'
Adding content type 'bionty | developmentalstage'
Adding content type 'bionty | disease'
Adding content type 'bionty | ethnicity'
Adding content type 'bionty | experimentalfactor'
Adding content type 'bionty | gene'
Adding content type 'bionty | organism'
Adding content type 'bionty | pathway'
Adding content type 'bionty | phenotype'
Adding content type 'bionty | protein'
Adding content type 'bionty | source'
Adding content type 'bionty | tissue'
Adding content type 'bionty | schemacellmarker'
Adding content type 'bionty | schemagene'
Adding content type 'bionty | schemapathway'
Adding content type 'bionty | schemaprotein'
 initialized lamindb: testuser1/test-redun-lamin

Amend the workflow

import lamindb as ln
import json
 connected lamindb: testuser1/test-redun-lamin

Let’s amend a redun workflow.py to register input & output artifacts in LaminDB:

  • To track the workflow run in LaminDB, add (see on GitHub):

    ln.track(params=params)
    
  • To register the output file via LaminDB, add (see on GitHub):

    ln.Artifact(output_path, description="results").save()
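
Put together, the two additions might sit in the workflow's entry point roughly as follows. This is a minimal sketch, not the tutorial's actual workflow.py: the function name `track_and_register` is hypothetical, and the `ln` module is passed in as an argument (an assumption of this sketch) so the snippet can be exercised with a stand-in when no LaminDB instance is configured. In a real script you would `import lamindb as ln` and pass that module in.

```python
from pathlib import Path
from typing import Any


def track_and_register(ln: Any, params: dict[str, Any], output_path: Path) -> Any:
    """Track a workflow run and register its output file.

    `ln` is expected to be the imported `lamindb` module; passing it in
    explicitly is an assumption of this sketch to keep it testable.
    """
    # Start a tracked run and record the workflow parameters on it.
    ln.track(params=params)
    # ... the workflow steps that write `output_path` would run here ...
    # Register the output file as an artifact of the current run.
    artifact = ln.Artifact(output_path, description="results").save()
    # Mark the run as finished.
    ln.finish()
    return artifact
```

Keeping the tracking calls in one place makes it easier to see which parameters and which output file end up attached to the run.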
    

Run redun

Let’s see what the input files are:

!ls ./fasta
KLF4.fasta  MYC.fasta  PO5F1.fasta  SOX2.fasta

And call the workflow:

!redun run workflow.py main --input-dir ./fasta --tag run=test-run  1> redun_stdout.txt 2>redun_stderr.txt
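
The same call can be made from plain Python instead of the notebook's `!` shell escape. The sketch below only rebuilds the command line shown above and sends stdout and stderr to two separate files; the helper names `build_redun_command` and `run_redun` are made up for illustration, and `run_redun` assumes the `redun` executable is on the PATH.

```python
import subprocess
from pathlib import Path


def build_redun_command(input_dir: str, tag: str, script: str = "workflow.py") -> list[str]:
    """Return the argv list for the `redun run` call used in this guide."""
    return ["redun", "run", script, "main", "--input-dir", input_dir, "--tag", tag]


def run_redun(input_dir: str, tag: str, stdout_path: Path, stderr_path: Path) -> int:
    """Run the workflow, writing stdout and stderr to separate files."""
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        completed = subprocess.run(
            build_redun_command(input_dir, tag), stdout=out, stderr=err, check=False
        )
    # Hand back the exit code so the caller can decide how to react to failures.
    return completed.returncode
```

Calling `run_redun("./fasta", "run=test-run", Path("redun_stdout.txt"), Path("redun_stderr.txt"))` would then correspond to the shell line above.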

Inspect the output:

!cat redun_stdout.txt
 connected lamindb: testuser1/test-redun-lamin
 running outside of synched git repo, cloning https://github.com/laminlabs/redun-lamin into /home/runner/.cache/lamindb/redun-lamin
! code blob hash e37260d21609885c5a16a79fe01de24b0f074df7 was found in non-default branch(es): origin/slacknotif
 created Transform('taasWKawCiNA0000'), started new Run('EDruK7Uz...') at 2025-01-14 21:01:21 UTC
! git repo /home/runner/.cache/lamindb/redun-lamin already exists locally
! code blob hash e37260d21609885c5a16a79fe01de24b0f074df7 was found in non-default branch(es): origin/slacknotif
 created Transform('taasWKawCiNA0000'), started new Run('AjDvah2V...') at 2025-01-14 21:01:22 UTC
→ params: input_dir=./fasta, amino_acid=C, enzyme_regex=[KR], missed_cleavages=0, min_length=4, max_length=75, executor=Executor.default
! folder is outside existing storage location, will copy files from ./fasta to /home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/fasta
?25l
downloading... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% -:--:--
downloading... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━  74% 0:00:01
downloading... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
downloading... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
?25h
 finished Run('AjDvah2V') after 0d 0h 0m 6s at 2025-01-14 21:01:28 UTC
File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=a5ecf173)

And the last line of stderr, where redun writes its log:

!tail -1 redun_stderr.txt
[redun] Execution duration: 12.66 seconds
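
If the duration is needed as a number, it can be parsed from that last line. A small sketch, assuming the line keeps the `Execution duration: <number> seconds` shape shown above; `parse_execution_duration` is a hypothetical helper, not part of redun.

```python
import re
from typing import Optional

# Matches e.g. "Execution duration: 12.66 seconds" anywhere in the line.
_DURATION_RE = re.compile(r"Execution duration: ([0-9]+(?:\.[0-9]+)?) seconds")


def parse_execution_duration(line: str) -> Optional[float]:
    """Return the duration in seconds, or None if the line does not match."""
    match = _DURATION_RE.search(line)
    return float(match.group(1)) if match else None


print(parse_execution_duration("[redun] Execution duration: 12.66 seconds"))  # 12.66
```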

View data lineage:

artifact = ln.Artifact.filter(description="results", suffix=".tgz").one()
artifact.view_lineage()

Track the redun execution id

To be able to query LaminDB by the redun execution ID, store the ID on the run record:

# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json") as file:
    redun_exec = json.loads(file.readline())
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='AjDvah2V8Xw60bTrOkCv', started_at=2025-01-14 21:01:22 UTC, finished_at=2025-01-14 21:01:28 UTC, reference='70307f22-9649-4042-b6b7-e7de177ab1bf', reference_type='redun_id', space_id=1, transform_id=1, report_id=6, environment_id=5, created_by_id=1, created_at=2025-01-14 21:01:22 UTC)
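
Reading the ID, and later finding the run by it, can be wrapped in small helpers. This is a sketch under assumptions: it relies on the export having one JSON object with an `id` key on its first line, as the code above does; the helper names are hypothetical; and `ln` stands for the imported `lamindb` module, passed in explicitly so the functions stay easy to test.

```python
import json
from pathlib import Path
from typing import Any


def read_redun_exec_id(path: Path) -> str:
    """Read the execution ID from the first line of a `redun log` JSON export."""
    with open(path) as file:
        first_line = file.readline()
    if not first_line.strip():
        raise ValueError(f"{path} is empty; did the redun export succeed?")
    return json.loads(first_line)["id"]


def find_runs_by_redun_id(ln: Any, exec_id: str) -> Any:
    """Query LaminDB runs whose reference is the given redun execution ID."""
    # Uses the same two fields that were set on the run record above.
    return ln.Run.filter(reference=exec_id, reference_type="redun_id")
```

With a connected instance, `find_runs_by_redun_id(ln, read_redun_exec_id(Path("redun_exec.json"))).one()` would be expected to return the run saved above.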

Track the redun run report

Attach a run report:

report = ln.Artifact(
    "redun_stderr.txt",
    description=f"Redun run report of {redun_exec['id']}",
    run=False,
    visibility=0,
).save()
artifact.run.report = report
artifact.run.save()
Run(uid='AjDvah2V8Xw60bTrOkCv', started_at=2025-01-14 21:01:22 UTC, finished_at=2025-01-14 21:01:28 UTC, reference='70307f22-9649-4042-b6b7-e7de177ab1bf', reference_type='redun_id', space_id=1, transform_id=1, report_id=8, environment_id=5, created_by_id=1, created_at=2025-01-14 21:01:22 UTC)

View transforms and runs in LaminHub

[Image: hub]

View the database content

ln.view()
****************
* module: core *
****************
Artifact
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux _branch_code
id
7 A1EEzDtCtlQHvFNn0000 data/results.tgz results .tgz None None 83556 BCOM2cx_DSzQn4Wu4Ip6dA None None md5 False False 1 1 None None True 2.0 2025-01-14 21:01:33.101173+00:00 1 None 1
4 Hr8tju9S02Qz88vL0000 fasta/PO5F1.fasta None .fasta None None 477 -7iJgveFO9ia0wE1bqVu6g None None md5 True False 1 1 None None True NaN 2025-01-14 21:01:23.959035+00:00 1 None 1
3 mfDXA89XArRzSTIw0000 fasta/SOX2.fasta None .fasta None None 414 C5q_yaFXGk4SAEpfdqBwnQ None None md5 True False 1 1 None None True NaN 2025-01-14 21:01:23.958471+00:00 1 None 1
2 esRNQNhzhAeJt7o50000 fasta/KLF4.fasta None .fasta None None 609 LyuoYkWs4SgYcH7P7JLJtA None None md5 True False 1 1 None None True NaN 2025-01-14 21:01:23.957725+00:00 1 None 1
1 blGLxbZyHJpDjSWn0000 fasta/MYC.fasta None .fasta None None 536 WGbEtzPw-3bQEGcngO_pHQ None None md5 True False 1 1 None None True NaN 2025-01-14 21:01:23.956527+00:00 1 None 1
Param
name dtype is_type _expect_many space_id type_id run_id created_at created_by_id _aux _branch_code
id
7 executor str None False 1 None None 2025-01-14 21:01:20.570411+00:00 1 None 1
6 max_length int None False 1 None None 2025-01-14 21:01:20.570343+00:00 1 None 1
5 min_length int None False 1 None None 2025-01-14 21:01:20.570280+00:00 1 None 1
4 missed_cleavages int None False 1 None None 2025-01-14 21:01:20.570216+00:00 1 None 1
3 enzyme_regex str None False 1 None None 2025-01-14 21:01:20.570118+00:00 1 None 1
2 amino_acid str None False 1 None None 2025-01-14 21:01:20.570047+00:00 1 None 1
1 input_dir str None False 1 None None 2025-01-14 21:01:20.569827+00:00 1 None 1
ParamValue
value hash space_id param_id created_at created_by_id _aux _branch_code
id
1 ./fasta None 1 1 2025-01-14 21:01:22.677237+00:00 1 None 1
2 C None 1 2 2025-01-14 21:01:22.677304+00:00 1 None 1
3 [KR] None 1 3 2025-01-14 21:01:22.677351+00:00 1 None 1
4 0 None 1 4 2025-01-14 21:01:22.677404+00:00 1 None 1
5 4 None 1 5 2025-01-14 21:01:22.677455+00:00 1 None 1
6 75 None 1 6 2025-01-14 21:01:22.677503+00:00 1 None 1
7 Executor.default None 1 7 2025-01-14 21:01:22.677549+00:00 1 None 1
Run
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux _branch_code
id
1 EDruK7UznZesaE6eb44l None 2025-01-14 21:01:21.265278+00:00 NaT None None None 0 1 1 NaN None NaN None 2025-01-14 21:01:21.265330+00:00 1 None 1
2 AjDvah2V8Xw60bTrOkCv None 2025-01-14 21:01:22.649053+00:00 2025-01-14 21:01:28.871947+00:00 70307f22-9649-4042-b6b7-e7de177ab1bf redun_id None 0 1 1 8.0 None 5.0 None 2025-01-14 21:01:22.649097+00:00 1 None 1
Storage
uid root description type region instance_uid space_id run_id created_at created_by_id _aux _branch_code
id
1 x0awepL3rlqv /home/runner/work/redun-lamin/redun-lamin/docs... None local None iQlBPgD8uaqR 1 None 2025-01-14 21:00:55.230571+00:00 1 None 1
Transform
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux _branch_code
id
1 taasWKawCiNA0000 workflow.py None script """workflow.py."""\n\n# This code is a copy fr... Lg_7dutVm-W285sKmwyz3A https://github.com/laminlabs/redun-lamin/blob/... url 1 None None True 2025-01-14 21:01:21.261136+00:00 1 None 1
ULabel
uid name is_type description reference reference_type space_id type_id run_id created_at created_by_id _aux _branch_code
id
1 JdiyQep5 redun None None None None 1 None 2 2025-01-14 21:01:23.915650+00:00 1 None 1
******************
* module: bionty *
******************
Organism
uid name ontology_id scientific_name synonyms description space_id source_id run_id created_at created_by_id _aux _branch_code
id
1 1dpCL6Td human NCBITaxon:9606 homo_sapiens None None 1 1 2 2025-01-14 21:01:26.250930+00:00 1 None 1
Protein
uid name uniprotkb_id synonyms description length gene_symbol ensembl_gene_ids space_id source_id organism_id run_id created_at created_by_id _aux _branch_code
id
4 3qNrC4hwnDC9 PO5F1_HUMAN POU domain, class 5, transcription... Q01860 Octamer-binding protein 3|Oct-3|Octamer-bindin... class 5, transcription factor 1 360 POU5F1 ENST00000259915.13 [Q01860-1];ENST00000376243.... 1 22 1 2 2025-01-14 21:01:28.852370+00:00 1 None 1
3 38rbzWPtKmb2 SOX2_HUMAN Transcription factor SOX-2 P48431 317 SOX2 ENST00000325404.3; 1 22 1 2 2025-01-14 21:01:28.004894+00:00 1 None 1
2 6ThKerPbf6DR KLF4_HUMAN Krueppel-like factor 4 O43474 Epithelial zinc finger protein EZF|Gut-enriche... 513 KLF4 ENST00000374672.5 [O43474-1]; 1 22 1 2 2025-01-14 21:01:27.149445+00:00 1 None 1
1 36jnmKHdiT9m MYC_HUMAN Myc proto-oncogene protein P01106 Class E basic helix-loop-helix protein 39|bHLH... 454 MYC ENST00000377970.6 [P01106-1];ENST00000524013.2... 1 22 1 2 2025-01-14 21:01:26.259675+00:00 1 None 1
Source
uid entity organism name in_db currently_used description url md5 source_website space_id dataframe_artifact_id version run_id created_at created_by_id _aux _branch_code
id
103 5JnV BioSample all ncbi False True NCBI BioSample attributes s3://bionty-assets/df_all__ncbi__2023-09__BioS... 918db9bd1734b97c596c67d9654a4126 https://www.ncbi.nlm.nih.gov/biosample/docs/at... 1 None 2023-09 None 2025-01-14 21:00:55.568608+00:00 1 None 1
102 MJRq bionty.Ethnicity human hancestro False True Human Ancestry Ontology https://github.com/EBISPOT/hancestro/raw/3.0/h... 76dd9efda9c2abd4bc32fc57c0b755dd https://github.com/EBISPOT/hancestro 1 None 3.0 None 2025-01-14 21:00:55.568526+00:00 1 None 1
101 6vJm bionty.DevelopmentalStage mouse mmusdv False False Mouse Developmental Stages http://aber-owl.net/media/ontologies/MMUSDV/9/... 5bef72395d853c7f65450e6c2a1fc653 https://github.com/obophenotype/developmental-... 1 None 2020-03-10 None 2025-01-14 21:00:55.568446+00:00 1 None 1
100 10va bionty.DevelopmentalStage mouse mmusdv False True Mouse Developmental Stages https://github.com/obophenotype/developmental-... https://github.com/obophenotype/developmental-... 1 None 2024-05-28 None 2025-01-14 21:00:55.568365+00:00 1 None 1
99 7Zm9 bionty.DevelopmentalStage human hsapdv False False Human Developmental Stages http://aber-owl.net/media/ontologies/HSAPDV/11... 52181d59df84578ed69214a5cb614036 https://github.com/obophenotype/developmental-... 1 None 2020-03-10 None 2025-01-14 21:00:55.568284+00:00 1 None 1
98 1GbF bionty.DevelopmentalStage human hsapdv False True Human Developmental Stages https://github.com/obophenotype/developmental-... https://github.com/obophenotype/developmental-... 1 None 2024-05-28 None 2025-01-14 21:00:55.568201+00:00 1 None 1
97 1atB Drug all chebi False False Chemical Entities of Biological Interest s3://bionty-assets/df_all__chebi__2024-07-27__... https://www.ebi.ac.uk/chebi/ 1 None 2024-07-27 None 2025-01-14 21:00:55.568107+00:00 1 None 1

Delete the test instance:

!rm -rf test-redun-lamin
!lamin delete --force test-redun-lamin
 deleting instance testuser1/test-redun-lamin