Experiment D.1
Caution
Running this experiment is somewhat complicated, because it actually consists of a large number of sub-experiments: Basically, every point in the final figure is one experiment.
This experiment requires several steps:
First, you need to create the experiment directories: one for each combination of algorithm (signal fitting and signal masking, each with and without observing conditions, and PCA with \(n\) components) and binning_factor. Additionally, you can repeat the whole experiment for different data sets. To create the experiment folders for a given combination, run the following command:

    python $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning/00_create_experiments.py \
        --dataset beta_pictoris__lp \
        --algorithm signal_fitting__oc
This is just an example; as mentioned, you will have to repeat this for different values of dataset and algorithm. For each data set, running this command creates a new subdirectory in $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning named after that data set. Each of these subdirectories in turn contains one subdirectory per algorithm.

If you run the above command for all combinations in the paper, you should get the following structure:
    |--beta_pictoris__lp
    |  |--signal_fitting__oc
    |  |--signal_fitting
    |  |--signal_masking
    |  |--signal_masking__oc
    |  |--pca
    |--beta_pictoris__mp
    |  |--signal_fitting__oc
    |  |--signal_masking
    |  |--pca
    |  |--signal_fitting
    |  |--signal_masking__oc
    |--r_cra__lp
    |  |--signal_fitting__oc
    |  |--signal_masking
    |  |--pca
    |  |--signal_fitting
    |  |--signal_masking__oc
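Creating all of these directories by hand means running the creation command once per combination of data set and algorithm. The following is a minimal sketch of how that could be scripted. It assumes that the values accepted by --dataset and --algorithm match the directory names shown in the tree (this is only confirmed for signal_fitting__oc in the example command); the function names build_create_commands and run_all are hypothetical helpers, not part of the code base.

```python
import itertools
import os
import subprocess
from pathlib import Path

# Assumption: the command-line values equal the directory names in the tree.
DATASETS = ["beta_pictoris__lp", "beta_pictoris__mp", "r_cra__lp"]
ALGORITHMS = [
    "signal_fitting",
    "signal_fitting__oc",
    "signal_masking",
    "signal_masking__oc",
    "pca",
]


def build_create_commands(experiments_dir):
    """Return one argument list per (dataset, algorithm) combination."""
    script = (
        Path(experiments_dir)
        / "appendix"
        / "D.1_fpf-as-function-of-temporal-binning"
        / "00_create_experiments.py"
    )
    return [
        ["python", str(script), "--dataset", dataset, "--algorithm", algorithm]
        for dataset, algorithm in itertools.product(DATASETS, ALGORITHMS)
    ]


def run_all():
    """Run every creation command; stop at the first one that fails."""
    experiments_dir = os.environ["HSR4HCI_EXPERIMENTS_DIR"]
    for command in build_create_commands(experiments_dir):
        subprocess.run(command, check=True)
```

Calling run_all() executes the 15 commands one after the other; build_create_commands() on its own can be used to inspect them first.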
Each of the directories above (i.e., each combination of data set and algorithm) will again contain a set of subdirectories, one for each binning factor. This should look as follows:
    00_create_submit_files.sh
    01_start_jobs.sh
    binning_factor-0001 -> $HSR4HCI_EXPERIMENTS_DIR/main/5.1_first-results/<algorithm>/<dataset>
    binning_factor-0002
    binning_factor-0003
    binning_factor-0005
    binning_factor-0007
    binning_factor-0011
    binning_factor-0015
    binning_factor-0022
    binning_factor-0031
    binning_factor-0044
    binning_factor-0063
    binning_factor-0089
    binning_factor-0126
    binning_factor-0178
    binning_factor-0251
    binning_factor-0355
    binning_factor-0502
    binning_factor-0709
    binning_factor-1002
    binning_factor-1415
    binning_factor-2000
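The directory names in this listing encode the binning factor as a zero-padded integer, which makes them easy to enumerate from a script. As a rough sketch (find_binning_factor_dirs is a hypothetical helper, not part of the code base), the sub-experiments of one data set and algorithm could be collected like this:

```python
import re
from pathlib import Path

# Matches names such as "binning_factor-0044" and captures the number.
BINNING_DIR_PATTERN = re.compile(r"^binning_factor-(\d+)$")


def find_binning_factor_dirs(algorithm_dir):
    """Map each binning factor (as int) to its directory, sorted by factor."""
    found = {}
    for entry in Path(algorithm_dir).iterdir():
        match = BINNING_DIR_PATTERN.match(entry.name)
        # is_dir() follows symbolic links, so linked directories are included
        # as long as their target exists.
        if match and entry.is_dir():
            found[int(match.group(1))] = entry
    return dict(sorted(found.items()))
```

The shell scripts in the same folder are ignored because their names do not match the pattern.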
The two shell scripts (00_create_submit_files.sh and 01_start_jobs.sh) are only relevant if you are running the experiment on an HTCondor cluster; otherwise, you can ignore them.

The binning_factor-0001 experiment is just a symlink to the respective part of experiment 5.1 (to prevent running the same experiment multiple times). If you have not yet run that experiment, now would be a good time. All other directories are “normal” experiment directories, that is, they contain a config.json that describes the experiment.

For each binning_factor (and, of course, each algorithm and dataset…), you need to run the HSR (or PCA) pipeline. For example, you want to run something like this:

    # HSR
    python $HSR4HCI_SCRIPTS_DIR/experiments/single-script/01_run_pipeline.py \
        --experiment-dir $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning/<dataset>/<algorithm>/<binning-factor>

    # PCA
    python $HSR4HCI_SCRIPTS_DIR/experiments/run-pca/01_run_pca.py \
        --experiment-dir $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning/<dataset>/<algorithm>/<binning-factor>
In practice, it is probably useful to do all this programmatically (i.e., via a script).
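One possible shape for such a script is sketched below. It walks the <dataset>/<algorithm>/binning_factor-* directories and builds the matching pipeline command for each one. Two assumptions are baked in: the PCA script is chosen whenever the algorithm directory is called pca, and symlinked directories (such as binning_factor-0001) are skipped because they belong to experiment 5.1. The names build_run_command and iter_run_commands are hypothetical.

```python
from pathlib import Path


def build_run_command(scripts_dir, experiment_dir, algorithm):
    """Return the argument list that runs a single sub-experiment."""
    base = Path(scripts_dir) / "experiments"
    if algorithm == "pca":  # assumption: PCA lives in a directory named "pca"
        script = base / "run-pca" / "01_run_pca.py"
    else:
        script = base / "single-script" / "01_run_pipeline.py"
    return ["python", str(script), "--experiment-dir", str(experiment_dir)]


def iter_run_commands(scripts_dir, d1_dir):
    """Yield one command per binning-factor directory below d1_dir."""
    for experiment_dir in sorted(Path(d1_dir).glob("*/*/binning_factor-*")):
        # Skip symlinks (results are shared with experiment 5.1) and files.
        if experiment_dir.is_symlink() or not experiment_dir.is_dir():
            continue
        algorithm = experiment_dir.parent.name
        yield build_run_command(scripts_dir, experiment_dir, algorithm)
```

Here scripts_dir corresponds to $HSR4HCI_SCRIPTS_DIR and d1_dir to $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning. Each yielded list can be passed to subprocess.run(command, check=True), or written to a job file when working on a cluster.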
Once you have run all the experiments (i.e., all combinations of algorithm, binning factor, and data set), you need to “collect the results”, that is, run the evaluation procedure to compute the logFPF score for every experiment. For each combination of data set and algorithm, you need to run the following command:
    python $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning/01_collect_results.py \
        --dataset <dataset> \
        --algorithm signal_fitting__oc \
        --n-jobs 8
The parameter --n-jobs controls the number of parallel processes that are used to collect the results (this is independent of a potential parallelization on a cluster).

Running this command will create a file metrics__<planet>.tsv (where <planet> is b for all experiments in the paper) in $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning/<dataset>/<algorithm>. These TSV files contain the logFPF score for every binning factor.

Once you have collected the results of all experiments, you can create the final plot by running:
    python $HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning/02_plot_logfpf_over_binning_factor.py \
        --dataset <dataset>
The plot will be placed in:
$HSR4HCI_EXPERIMENTS_DIR/appendix/D.1_fpf-as-function-of-temporal-binning/<dataset>
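
Because the plot depends on the metrics files from the collection step, it can be useful to check that they all exist before running the plotting script. The sketch below does this for one data set. Whether the plotting script actually requires every algorithm to be present is an assumption, and find_missing_metrics is a hypothetical helper rather than part of the code base.

```python
from pathlib import Path


def find_missing_metrics(d1_dir, dataset, algorithms, planet="b"):
    """Return the algorithms that have no metrics__<planet>.tsv for dataset."""
    dataset_dir = Path(d1_dir) / dataset
    return [
        algorithm
        for algorithm in algorithms
        if not (dataset_dir / algorithm / f"metrics__{planet}.tsv").is_file()
    ]
```

An empty return value means that the collection step has produced a metrics file for every listed algorithm.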