2 Basics

2.1 Project directories on the FDZ servers

Each research project receives its own directory (fdzXXXX) per FDZ data set. Different FDZ data products cannot be used together or merged at the individual observation level, even if they are used in the same project. If you are using more than one FDZ data product, you can check which FDZ data product is stored in which project directory. To do this, you can create overview lists of the available files for each project directory during remote execution. You can find an example in the FDZ templates (master.do).
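The overview list mentioned above could look roughly like the following sketch. It is not the code from the FDZ template master.do; it only assumes that the global $orig points to the orig folder of the current project directory (the fallback to the working directory is an addition so that the lines also run where $orig is not defined).

```stata
* Sketch: list all files in the orig folder of the current project directory.
* ASSUMPTION: $orig is defined (automatically at the FDZ, via profile.do at home).
* The fallback to "." is only there so the sketch runs without that global.
local dirpath = cond("$orig" == "", ".", "$orig")

* The extended macro function "dir" returns the file names as a list
local files : dir "`dirpath'" files "*"

display as text "Files in `dirpath':"
foreach f of local files {
    display as result "  `f'"
}
```

Repeating these lines with $data instead of $orig lists the user-generated data sets in the same way.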

Each project directory contains the subdirectories listed in Table 1. Creating any other directories or subdirectories is not allowed.

Table 1: Subdirectories in the project directory

Name | Description | Access
-----|-------------|-------
orig | All requested original data is provided in this directory. This folder may also contain external aggregate data (see Section 2.2.3). You have read-only permission for this directory. | The data in the directories orig and data can also be accessed by programs via JoSuA.
data | All user-generated data sets are stored here. A maximum of 30 GB is allowed per directory. If you exceed this maximum, you must reduce the storage space used. You can find an example of this in the FDZ templates. | The data in the directories orig and data can also be accessed by programs via JoSuA.
prog | This folder contains all scripts for on-site use (e.g. Stata do-files, R scripts, .m files) as well as supplementary files (e.g. ado-files, R packages). | Scripts and output files are not automatically synchronised between JoSuA and the project directory.
log | This folder contains all result files of the on-site use, including graphs. | Scripts and output files are not automatically synchronised between JoSuA and the project directory.
doc | This folder is used to take notes during your on-site stay. | These documents are only available during your stay and cannot be released.
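One way to stay below the 30 GB limit of the data directory is to keep only the variables you need and to let Stata choose the smallest possible storage types before saving. The following is a minimal sketch of that idea, not the example from the FDZ templates. It uses the auto data set shipped with Stata as a stand-in, and the variable selection and the file name auto_small.dta are purely hypothetical.

```stata
* Sketch: reduce the size of a data set before saving it to the data folder.
* ASSUMPTION: the auto data set is only a stand-in for your own working data.
sysuse auto, clear

* Keep only the variables that later steps actually use (hypothetical selection)
keep make price mpg

* Record the bytes per observation before and after compress
quietly describe
local width_before = r(width)
compress
quietly describe
local width_after = r(width)
display "Bytes per observation: `width_before' -> `width_after'"

* ASSUMPTION: $data is defined; the fallback only lets the sketch run elsewhere
local target = cond("$data" == "", ".", "$data")
save "`target'/auto_small.dta", replace
```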

If you use remote data processing via remote desktop in your project, your project directory will contain additional subdirectories. These are described in Chapter 6.

2.2 Further resources

2.2.1 Ado-files (Stata)

It is not possible to download ado-files from the internet to our on-site workstations. The FDZ provides the packages of the Statistical Software Components (SSC) Archive (originally hosted at the Boston College Department of Economics (BOCODE) and provided by RePEc) in its network (see the list at https://doku.iab.de/fdz/access/stata_ado.pdf).

  • You can access those packages within the FDZ IT infrastructure. Please use the command below to install the packages. It copies the corresponding files into your prog directory:

fdzinstall packagename

The files are copied to the prog directory of the respective project. Use the command only once per project directory, as repeated executions can lead to unintended updates of the ado-files. The packages are then available in the prog directory and are automatically integrated into Stata. The ado-files installed in the prog folder can be used both during on-site visits and via JoSuA.

  • The ado-file collection provided in the FDZ guest network is updated regularly. Files copied by users to prog directories are not updated by the FDZ. To copy any newer version of an ado package into the prog directory, the command

fdzinstall packagename

must be executed again. The previous version of the ado-file will then be replaced.

To revert to the previous version in the event of any problems caused by updates, fdzinstall offers the repo() option, which can be used to call up a previous version of the ado file collection. A list of available versions can be found here (https://doku.iab.de/fdz/access/stata_ado.pdf).

  • If you intend to use ado-files that are not part of the SSC archive or for which a new version is not yet available at the FDZ, please upload your files to JoSuA (text files only, such as .ado, .do, ...). Ado-files need to be uploaded only once, in the Projects tab under Resources (see Section 10.4). Afterwards, they are available for remote execution in JoSuA. Ado-files uploaded there can also be downloaded and thus used during your on-site visit and in the remote desktop environment.

  • If you want to use ado-files that cannot be uploaded via Resources (e.g. because they contain .mlib or .plugin files), please send them to the FDZ mailbox at least three working days prior to your on-site visit. These ado-files will also be stored in the prog folder.

  • Unfortunately, .mo files cannot be checked. Please create such files in the guest network of the FDZ. For ado-files, there are usually do-files that define the objects and have "function" or "fun" in their names. .mo files can be generated via JoSuA (independent of the mode) in your project directory as follows:

    • Create/modify the do-file that creates the .mo file; please note:

      • Save the created object with a command like this: mata mosave xyz(), replace dir($localprog/x)
      • Here, xyz stands for the name of the created object. The letter after "$localprog/" stands for a possible subfolder in the prog folder of your project directory where the ado-file expects the .mo file (e.g. "$localprog/x" for a file "xyz.mo").
      • Create the necessary subfolder in your prog folder before saving the .mo file with this command:
        capture mkdir $localprog/x
    • Execute the relevant do-file, e.g. by starting it in a master.do via JoSuA.

    • If you want to create the .mo files during an on-site visit, please use $prog/x instead of $localprog/x.
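The steps for creating a .mo file can be put together in a short do-file. The following is a sketch under stated assumptions: the Mata function xyz() and its body are purely hypothetical placeholders for the object your ado-file expects, and the fallback for $localprog is only there so that the lines also run outside JoSuA.

```stata
* Sketch: create a .mo file for a (hypothetical) Mata function xyz().
* ASSUMPTION: $localprog is defined by JoSuA; the fallback to the working
* directory only lets the sketch run where the global does not exist.
if "$localprog" == "" global localprog "`c(pwd)'"

* Create the subfolder first; capture ignores the error if it already exists
capture mkdir "$localprog/x"

* Drop an earlier definition so the do-file can be run repeatedly
capture mata: mata drop xyz()

mata:
// Hypothetical placeholder function; replace it with the real definition
real scalar xyz(real scalar a)
{
    return(2 * a)
}
end

* Save the compiled function as xyz.mo in the subfolder
* (path left unquoted as in the command shown above; this assumes no blanks)
mata: mata mosave xyz(), dir($localprog/x) replace
```

For an on-site visit, the same lines would use $prog in place of $localprog.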

2.2.2 R-packages

R-packages cannot be downloaded from the internet or uploaded via JoSuA. If you require R-packages that are not available on the guest network, please contact the FDZ by email.

2.2.3 Linking external aggregated data

  • Data from external sources that does not contain any personal or establishment references (e.g. the proportion of agricultural land in the total area of a district, the distance between the capital of a country of origin and Berlin, or the daily lunar calendar) is harmless and can always be merged.

  • The linkage of individual observations relating to persons or establishments is neither possible nor permitted!

  • External variables at an aggregate level (e.g. unemployment rates by districts) may be merged to your dataset if they comply with the FDZ data protection guidelines. Each aggregate value must be based on at least three individual observations (in the example, at least three unemployed persons and three non-unemployed persons). This is generally the case for aggregate values from official statistical sources.

  • In this aggregate data set, variables have to be included indicating the respective number of observations underlying each variable or cell. Please specify absolute frequencies only (e.g. number of men and number of women instead of the share of women). Relative frequencies do not readily allow checking compliance with data protection. The calculation of share values should only be done within the FDZ environment.

  • The aggregate data set must be submitted to the FDZ as a Stata data set, together with a description of the data set (including variable descriptions, the aggregation level, and a source citation). Please compress the aggregate data set with the Stata command compress before sending it to the FDZ. Data sets sent as Excel files cannot be made available. Submitting aggregate data within scripts is also not allowed.

  • After inspection and approval by the FDZ, the external data sets will be made available in the orig folder. Please coordinate the linkage of aggregate data with the FDZ staff early on. The provision of external aggregated data in your project folder can take up to 3 working days after a successful approval.

  • If you need data sets from the FDZ’s working tools, please write an e-mail with the names of the desired data sets to the FDZ mailbox. We will then provide the data sets in your orig folder within three working days. The available FDZ working tools can be found on our website at ‘Key working Tools of the FDZ’.
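The requirements for external aggregate data can be illustrated with a small sketch. All values, variable names (district, n_unemployed, n_not_unemployed, persnr) and the file name district_counts.dta are invented for illustration. The first part prepares an aggregate data set with absolute frequencies only, and the second part shows how shares could be computed after merging within the FDZ environment, where the approved file would be located in the orig folder rather than in the working directory.

```stata
* --- Part 1: prepare the aggregate data set before submission ---
* ASSUMPTION: toy data with invented values and hypothetical variable names
clear
input int district long n_unemployed long n_not_unemployed
1001 250 4100
1002  75 1800
1003  12  950
end
label variable district         "District code"
label variable n_unemployed     "Number of unemployed persons"
label variable n_not_unemployed "Number of persons not unemployed"

* Each aggregate value must be based on at least three observations
assert n_unemployed >= 3 & n_not_unemployed >= 3

* Minimise storage types, then save (absolute frequencies only, no shares)
compress
save "district_counts.dta", replace

* --- Part 2: merge and compute shares inside the FDZ environment ---
* ASSUMPTION: toy individual-level data standing in for an FDZ data product
clear
input long persnr int district
1 1001
2 1001
3 1003
end

* At the FDZ, the using file would be "$orig/district_counts.dta"
merge m:1 district using "district_counts.dta", keep(master match) nogenerate

* Share values are calculated only after merging
generate double unemp_rate = n_unemployed / (n_unemployed + n_not_unemployed)
list persnr district unemp_rate
```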

2.3 Setting up a test environment at your own workplace

For the development and testing of scripts, we recommend setting up a test environment at your own workstation in order to best prepare for working via remote execution with JoSuA or on-site. In the following, you will learn how to set up such a test environment outside the FDZ infrastructure that corresponds to the project directory on the FDZ servers. If you also use remote desktop access, you may be able to skip this section.

2.3.1 Setting up a project directory

  • Create a directory "fdz[your project number]" on your PC.

  • Within this directory, create the folders orig, prog, data and log.

  • Do not create any subfolders.

  • For JoSuA and on-site use, the path globals are defined automatically. For your own test environment, however, you have to define the globals $orig, $data, $prog, $log and, if applicable, adopath before running master.do. Do not write the global definitions into master.do but into a file called profile.do (see FDZ templates). This file is automatically executed by Stata and does not have to be called by master.do. You could, for instance, save the file in the current working directory (see also http://www.stata.com/help.cgi?profile). Please do not upload the file profile.do to JoSuA.

  • In case you use more than one data product in your project, please create a project directory for each data set as described above. Please note that users cannot transfer (copy or move) data between project directories at the FDZ.
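A profile.do for the test environment could look like the following sketch. The FDZ template profile.do remains the reference; the root path used here is a hypothetical example that must be adapted to your own machine, and the way the ado search path is narrowed is one possible approach among several.

```stata
* profile.do: test environment only; do not upload this file to JoSuA.
* ASSUMPTION: hypothetical location of the local project directory
local root "C:/projects/fdz1234"

* Path globals that are defined automatically in JoSuA and on-site
global orig "`root'/orig"
global data "`root'/data"
global prog "`root'/prog"
global log  "`root'/log"

* Narrow the ado search path: remove the user-level directories so that only
* official Stata commands and the ado-files in $prog are found
foreach place in SITE PLUS PERSONAL OLDPLACE {
    capture adopath - `place'
}
adopath ++ "$prog"
```

Because Stata runs profile.do automatically at start-up, master.do can use $orig, $data, $prog and $log without defining them itself.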

2.3.2 Downloading and preparing the test data

  • We provide test data for most of the data sets offered by the FDZ. These have the same data structure as the original data, but cannot be used for analysis. Please note the information on the respective test data on our website.

  • Save the test data in the orig folder.

  • Many data products of the FDZ contain sensitive variables, which you have to request separately. In principle, all sensitive variables are included in the test data. Therefore, you need to modify the test data in advance, according to the sensitive variables that you have requested (see FDZ template prepare_test_environment.do).

  • There are no test data for expansion modules. You can create these additional data sets by following the descriptions of the data set in the data report. For this purpose, you may have to fill variables with random numbers (see FDZ template prepare_test_environment.do).
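The two preparation steps described above (removing sensitive variables that were not requested and filling variables of an expansion module with random numbers) might be sketched as follows. This is not the FDZ template prepare_test_environment.do; the toy data and all variable names (persnr, sensitive_var, module_var) are hypothetical.

```stata
* Sketch: adapt test data to the variables actually requested.
* ASSUMPTION: toy data stands in for the downloaded test data in $orig
clear
set seed 12345
set obs 100
generate long persnr = _n
generate byte sensitive_var = runiformint(1, 5)

* Drop sensitive variables that are not part of your data request
local not_requested "sensitive_var"
foreach v of local not_requested {
    capture drop `v'
}

* Expansion module without test data: create the variable described in the
* data report and fill it with random numbers
generate double module_var = runiform()

describe
```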

2.3.3 Usage of FDZ templates

  • You can download the FDZ templates from the FDZ website. They are only available in Stata format. In case it is not possible to execute individual calculations using Stata, please follow the basic structure of these templates using other software (e.g. R).

  • Save the scripts in the prog folder.

  • These FDZ templates were developed using the SIAB test data. You can try out your test environment setup and the FDZ templates by downloading the SIAB test data from our website. For other data sets, you have to adjust the scripts accordingly.

  • For more information on how to use the FDZ templates to prepare for working with the test data at your own workplace and as a basis for designing your own scripts, see the FDZ templates zip archive.

2.3.4 Programme packages and external aggregate data

  • To test merging and analysing external aggregate data in your test environment at home, please save the external data in the orig folder.

  • If you want to use programme packages, save them in the prog folder (see Section 2.2.1 and footnote 1).

2.3.5 Testing scripts before on-site visits and remote execution with JoSuA

  • Check your programmes prior to submission for remote execution with JoSuA or before an on-site visit using test data provided on the FDZ website.

  • Run master.do in your test environment.

  • Please note that the test data are only a fraction of the size of the original data. The original data and the data sets prepared from them are significantly larger, and programmes take correspondingly longer to run.

  • After successful testing, upload your programmes unchanged to JoSuA in Internal Use Mode.


  1. By default, Stata looks for ado-files in multiple directories. To prevent ado-files from being found during testing at home but being unavailable during remote data access or on-site use, restrict the search path in the test environment to $prog (see FDZ template profile.do).