The DataHub Workshops

Upcoming Workshop Series 2025/26

We will offer monthly workshops for all DataHub users starting in October 2025.
Registration will open on 25.08.2025. Access details will be provided after registration.

Figure 1: DataHub Workshop Series in Winter Semester 2025/26.

DataHub Crash Course: Services and Workflow (online)

This course will provide an overview of our common data management workflow, our technical platform, and our services. It conveys the minimum knowledge needed to start using the DataHub for versioned storage, sharing, and publishing of your research code and data. It will also provide an overview of high-performance computing resources and interactive computing with Jupyter.

The workshop is open to everyone in the DataHub consortia. We particularly invite people who are or will be in supervisory roles to join the course.

Date: 08.10.2025, 09:00-12:00
Location: Online

MaRC3 Cluster Introduction and Handling Sensitive Data (online)

This course will cover basic concepts of high-performance computing (HPC) and will prepare you to use the MaRC3 cluster in Marburg. Prior experience with the command line and Linux systems is helpful but not required.

Following the cluster introduction, there will be two modules on handling sensitive personal research data in the context of MaRC3 and the DataHub. Prior experience with git version control or having taken the DataHub Crash Course is highly recommended.

This course is mandatory for anyone who wants to work with sensitive data on the MaRC3 cluster and/or on the DataHub platforms.

Date: November 2025 (3 h, to be announced).
Location: Online

Git, Git-LFS, GitLab Basic Workshop (in-person)

This hands-on course will enable you to use version control with git in a local and in a distributed/collaborative context via our TAM GitLab. The course will provide a basic understanding of git and its most important commands. It will include the concepts of branching and merging parallel streams of work and will enable you to judge the advantages and limitations of git version control.

You will learn how to use Git-LFS to manage even large amounts of research data efficiently. By demonstrating the usage of GitLab groups, visibility and memberships, we will enable you to collaborate in multiple distributed teams and projects.

Date: December 2025 (4 h, to be announced).
Location: In-person.

Open Science, RDMPs, Metadata and Publication (online)

This course aims to make your research process as smooth as possible and your research products as visible and valuable as possible. First, we will look at the most underestimated phase of the research data life cycle: planning data management. We will cover a broad understanding of metadata, the choice of interoperable and sustainable data formats and platforms, and finally the definition of re-usable pieces of work. We will discuss how this is reflected in research data management plans (RDMPs) and demonstrate how to set up an RDMP using the RDMO software.

These early investments pay off when it comes to publishing your research. We will discuss the advantages and limitations of typical publication platforms such as OSF, Zenodo, institutional repositories and subject-specific repositories like our own DataHub Repository. You will learn how to bundle and publish meaningful, richly described and independently reusable research objects (data, code, textual publications) with persistent identifiers (DOIs). To increase findability and re-usability, we will cover typical licenses and how to interlink your work with other published work using defined relations.

Date: January 2026 (3 h, to be announced)
Location: to be announced.

MaRC3 and Jupyter Hands-on Workshop (in-person)

In this hands-on course, we will practice how to scale up your computations using the MaRC3 high-performance computing cluster. You will learn how to use git/GitLab from this central compute environment, how to manage software and dependencies using virtual environments, and how to use the Slurm scheduling system to submit real-world compute jobs. Experienced HPC users will give first-hand advice on using compute resources effectively and avoiding typical pitfalls. We will discuss showcases that demonstrate the capabilities of high-performance computing.

With our Jupyter_hpc service, we also offer a modern solution for browser-based interactive computing in various programming languages. Because Jupyter_hpc runs within MaRC3 and includes the available filesystems and software, it bridges the gap between local and central code execution. You will learn about the Jupyter notebook format as a powerful way of simultaneously generating, visualizing and describing research data in interleaved sections of code/results and text/documentation. We will demonstrate how to use the capabilities of Jupyter_hpc for your research and discuss in which cases conventional HPC usage is more advisable.

Date: February 2026 (3 h, to be announced)
Location: In-person.