C. Download data to Duke Compute Cluster (DCC)


1. Check that you have permission to download the project


Go to the [Duke Data Service Portal](https://dataservice.duke.edu/#/login) to log in to the Duke Data Service (DDS).

Click Log In. You will be prompted for your NetID and password.

The first time you log in, you will be asked to authorize Duke Data Service. Click Authorize.





You will be directed to your dashboard.

If you are the person who placed the order with the Sequencing Core and received a data delivery email from us, your project should appear on your dashboard.

If you do not see the NGS project listed on your dashboard, it means you are not the owner of that data and need to request access. The owner, i.e. the person who placed the order, can follow the instructions under heading 1.1 to grant you access.

1.1. Grant a collaborator access to your NGS data

If you are the owner of the data and need to grant other investigators download permission, log in to the [Duke Data Service Portal](https://dataservice.duke.edu/#/login) and find the project you want to share on your dashboard.

Click on the icon in the upper-right corner of your project. This will open a drop-down menu.

Select 'Add Project Members'.

This opens a pop-up window.

Under 'Add A New Member', enter the name of the person you want to share the data with.

DDS is linked to Duke's directory. Begin typing the name of a collaborator and select it from the drop-down menu. Data on DDS can only be shared with people who have NetIDs (i.e., people who are in the Duke directory).

You must assign the new team member a project role. Select either the 'Project Administrator' or the 'File Downloader' role to give download privileges.

Please note that the 'Project Administrator' role also grants that person permission to delete files. A 'File Downloader' can download files but cannot modify or delete them on DDS.



2. Log into Duke Compute Cluster (DCC)

On a Mac, open a terminal window by going to Applications > Utilities and clicking Terminal. You can also press Command + Space to open Spotlight search, type terminal, and then press Enter.

SSH into DCC by typing the following command, replacing NetID with your own NetID, and authenticate using your NetID credentials. Note that to connect to the cluster, you must be on the Duke network, either through a wired or wireless connection to that network or through the Duke VPN.

ssh NetID@dcc-slogin.oit.duke.edu
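
If you connect often, an SSH host alias can save some typing. The following is an optional sketch, not part of the official instructions: the alias name dcc and the helper name add_dcc_alias are made up for illustration, and NetID is a placeholder. The helper appends a Host entry to your SSH config file only if that alias is not already there.

```shell
# Hypothetical helper: append a "dcc" host alias to an SSH config file.
# Usage: add_dcc_alias <NetID> [config_file]
add_dcc_alias() {
  netid="$1"
  config="${2:-$HOME/.ssh/config}"
  mkdir -p "$(dirname "$config")"
  touch "$config"
  # Do nothing if an entry for the alias already exists.
  if grep -q '^Host dcc$' "$config"; then
    return 0
  fi
  printf 'Host dcc\n    HostName dcc-slogin.oit.duke.edu\n    User %s\n' "$netid" >> "$config"
}

# Example (replace NetID with your own):
# add_dcc_alias NetID
# ssh dcc
```

With the alias in place, ssh dcc should behave like the full command above. The network requirement (Duke network or VPN) still applies.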

3. Config file setup

The DDSClient tools, which you will need to download your data, are written in Python and have been installed on the Duke Compute Cluster.

However, before you can use those tools, you will need to create a config file containing an agent_key and a user_key in your home directory on DCC.

To set it up:

1. Go to the [Duke Data Service Portal](https://dataservice.duke.edu/#/login)

2. Click Login and log in with your NetID

3. Click the menu button in the top-left corner







4. Click **Software Agents**












5. Click **ADD NEW AGENT** button











6. Fill in data for a new agent and click **SUBMIT**.













Pick a descriptive name for your software agent, as it will be associated with the projects you upload.



7. Click **CREDENTIALS** on the software agent you just created.








8. Click **COPY CREDENTIALS TO CLIPBOARD**.














Your user key and agent key are now on your clipboard.

9. Go to your terminal window and paste your clipboard into a new file

The new file must be named .ddsclient (do not add .txt or any other extension) and must be located in your home directory.

To do this, type the following commands into your terminal window:

cd ~

This will bring you to your home directory.

nano .ddsclient

This will open the nano text editor within the terminal window and create a new, empty file named .ddsclient for you to paste your clipboard into.

On your keyboard, press Command + V to paste your clipboard into the terminal window. Then press Control + X to exit the editor.

When prompted, press Y to save the changes, then press Enter to confirm the file name and exit the editor.

You should now be able to use ddsclient!
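
Before moving on, you may want to confirm that the file was saved correctly. The following is a minimal sketch, assuming the pasted credentials contain the literal names agent_key and user_key. The helper name check_dds_config is hypothetical. Because the file holds credentials, the sketch also restricts it so that only you can read it.

```shell
# Hypothetical helper: sanity-check a ddsclient config file.
# Usage: check_dds_config [config_file]
check_dds_config() {
  config="${1:-$HOME/.ddsclient}"
  if [ ! -f "$config" ]; then
    echo "Not found: $config"
    return 1
  fi
  # The file holds credentials, so make it readable and writable by you only.
  chmod 600 "$config"
  # Both key names should appear somewhere in the file.
  for key in agent_key user_key; do
    if ! grep -q "$key" "$config"; then
      echo "Missing $key in $config"
      return 1
    fi
  done
  echo "OK: $config"
}

# Example: check the default location in your home directory.
# check_dds_config
```

If the check reports a missing key, repeat step 8 and paste the credentials again.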




4. Load DDSClient tools and download project

While logged into DCC, you can now load the DDSClient toolkit by typing

module load Python/3.8.1

You can list all the projects you have access to by typing

ddsclient list

and you can download an entire project by typing

ddsclient download -p ProjectName Folder

where ProjectName is the name of the project you want to download, as listed by the ddsclient list command, and Folder is the path to the directory you want the data to be downloaded to. Currently, ddsclient requires that this directory either be empty or not exist. It will create Folder (named like the project) if it doesn't exist.
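
Since the download needs an empty or non-existent target directory, you can check that before starting. The following is a minimal sketch. The helper name dir_ready_for_download is hypothetical, and ProjectName and Folder are placeholders as above.

```shell
# Hypothetical helper: succeed if the target is usable for a download,
# i.e. it does not exist yet or it is an existing but empty directory.
# Usage: dir_ready_for_download <Folder>
dir_ready_for_download() {
  dest="$1"
  # A path that does not exist yet is fine.
  if [ ! -e "$dest" ]; then
    return 0
  fi
  # An existing directory is fine only if it has no entries (hidden ones included).
  if [ -d "$dest" ] && [ -z "$(ls -A "$dest")" ]; then
    return 0
  fi
  echo "$dest already exists and is not an empty directory" >&2
  return 1
}

# Example (ProjectName and Folder are placeholders):
# dir_ready_for_download Folder && ddsclient download -p ProjectName Folder
```

Chaining the two commands with && means the download only starts when the check passes.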