Cache File Operation¶

This documents how caching works within EnergyPlus-Launch. Caching is how workflow runs and output data are persisted on disk.

High Level Overview¶

At a high level, caching simply occurs by persisting a JSON file in a run directory. When a workflow starts, if the cache file does not exist, it is created. If it already exists, it is read, updated, and re-written. The cache file includes input parameters, including workflow name, and output parameters as defined by the workflow. When a workflow is done, the cache file for that directory is updated with output data. When a user browses to a folder in EnergyPlus-Launch, if it has a cache file, that is parsed and previous output data is shown.
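The read, update, and re-write cycle described above can be sketched in a few lines. This is a minimal illustration and not the EnergyPlus-Launch implementation: the helper names `load_cache` and `save_cache` and the workflow name "Example Workflow" are hypothetical, and only the file name `.eplaunch` and the root key `workflows` are taken from the class documentation further down.

```python
import json
import tempfile
from pathlib import Path

CACHE_NAME = ".eplaunch"  # same value as the documented CacheFile.FileName


def load_cache(run_dir: Path) -> dict:
    """Hypothetical helper: return the cache contents, or an empty structure."""
    cache_path = run_dir / CACHE_NAME
    if cache_path.exists():
        return json.loads(cache_path.read_text())
    return {"workflows": {}}


def save_cache(run_dir: Path, state: dict) -> None:
    """Hypothetical helper: re-write the whole cache file with the new state."""
    (run_dir / CACHE_NAME).write_text(json.dumps(state, indent=2))


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    state = load_cache(run_dir)  # first run: nothing on disk yet
    state["workflows"]["Example Workflow"] = {"files": {}}
    save_cache(run_dir, state)  # the cache file now exists in the run directory
    reloaded = load_cache(run_dir)  # a later visit parses the existing file
```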

Detailed Operation¶

In real operation within EnergyPlus-Launch, there are complications that make the operation a difficult problem:

EnergyPlus-Launch allows multiple workflows to be running, even within the same folder, and on the same file.
It is uncertain when workflows will complete; two workflows could complete in the same directory at essentially the same time.

The full documentation of the CacheFile class is shown below: CacheFile Class. The GUI creates instances of this class to read or write cache data to disk. These are the important parts of the caching operation in EnergyPlus-Launch:

When a new folder is selected, a CacheFile instance is created to read data from disk, then released.
When a workflow is run, a CacheFile in the current directory is opened and workflow parameters are written, including workflow name, weather file name, and other data.
When a workflow is completed, a CacheFile is retrieved for the workflow’s directory, results are added from the workflow, and the cache is written.

Cache File Layout¶

The cache file is a simple JSON file. At the root of the JSON is an object with a single key, “workflows”, that captures the entire context. The value of this key is another object with one key per workflow. The value of each workflow key is an object with a single key, “files”, whose value is an object keyed by the files that have been run for this workflow. Each file object has two keys: “config” and “result”. The config key captures any input data related to this run; for now it is only the weather file name. The result key captures all the output column data corresponding to this workflow run.

An example of the layout is provided here:

{
  "workflows": {
    "Get Site:Location": {
      "files": {
        "1ZoneEvapCooler.idf": {
          "config": {
            "weather": ""
          },
          "result": {
            "Site:Location []": "Denver Centennial CO USA WMO=724666"
          }
        }
      }
    },
    "EnergyPlus 8.9 SI": {
      "files": {
        "1ZoneEvapCooler.idf": {
          "config": {
            "weather": "MyWeather.epw"
          },
          "result": {
            "Errors": 0,
            "Warnings": 1,
            "Runtime [s]": 1.23,
            "Version": "8.9"
          }
        },
        "RefBldgMediumOfficeNew2004_Chicago.idf": {
          "config": {
            "weather": ""
          },
          "result": {
            "Errors": 0,
            "Warnings": 4,
            "Runtime [s]": 1.58,
            "Version": "8.9"
          }
        }
      }
    }
  }
}
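As an illustration of how this layout can be consumed, the sketch below walks the nested objects and yields one row per cached run. The function name `iter_runs` is hypothetical and the sample is a trimmed copy of the example above; the actual GUI code may read the structure differently.

```python
import json

# Trimmed copy of the layout example above.
sample = json.loads("""
{
  "workflows": {
    "EnergyPlus 8.9 SI": {
      "files": {
        "1ZoneEvapCooler.idf": {
          "config": {"weather": "MyWeather.epw"},
          "result": {"Errors": 0, "Warnings": 1, "Runtime [s]": 1.23, "Version": "8.9"}
        }
      }
    }
  }
}
""")


def iter_runs(cache: dict):
    """Hypothetical helper: yield (workflow, file, config, result) per cached run."""
    for workflow_name, workflow in cache.get("workflows", {}).items():
        for file_name, entry in workflow.get("files", {}).items():
            yield workflow_name, file_name, entry.get("config", {}), entry.get("result", {})


rows = list(iter_runs(sample))
```

Using `.get()` with empty defaults keeps the walk tolerant of entries that have a config block but no result yet, which is the state of a run that has started but not finished.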

Future Work¶

Timestamps need to be added to the run data to easily check for stale results when input files are changed.

CacheFile Class¶

This is the auto-generated documentation of the Cache module that may provide a deeper understanding of the topics described above.

class eplaunch.utilities.cache.CacheFile(working_directory: Path)

Bases: object

Represents the file that is kept in each folder where workflows have been started. Keeps track of the most recent state of the file, with some metadata that is workflow dependent.

Usage:

To ensure thread-safety, this class employs a form of a mutex, where the unique id is the current directory. Any worker function that wants to alter the queue should use the following process:

The worker should call the ok_to_continue() function, which will check the mutex and then wait a predetermined amount of time for the mutex to clear, or fail.
The worker should check the return value of this function and, if False, fail. If True, it should set up a block on the directory by adding the current directory to the cache_files_currently_updating_or_writing list.
The worker can then proceed to read the cache, modify it, and write to disk.
The worker must then release the mutex by removing the current directory from the list.
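The four steps above can be sketched as follows. This is a stand-alone approximation and not the eplaunch source: the list and the two timing constants mirror the documented names and values, but the free function `ok_to_continue(directory, ...)` and the helper `update_cache` are hypothetical (the documented `ok_to_continue()` is a method that takes no arguments).

```python
import time
from pathlib import Path
from typing import Callable, List

# Stand-ins for the documented module-level list and class constants.
cache_files_currently_updating_or_writing: List[Path] = []
QUEUE_CHECK_INTERVAL = 0.1  # seconds between checks (QueueCheckInterval)
QUEUE_TOTAL_CHECK_TIME = 5  # seconds before giving up (QueueTotalCheckTime)


def ok_to_continue(directory: Path,
                   interval: float = QUEUE_CHECK_INTERVAL,
                   total: float = QUEUE_TOTAL_CHECK_TIME) -> bool:
    """Poll until the directory is unblocked; return False on timeout."""
    deadline = time.monotonic() + total
    while directory in cache_files_currently_updating_or_writing:
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


def update_cache(directory: Path, modify: Callable[[], None]) -> bool:
    """Hypothetical worker: run modify() while holding the block on directory."""
    if not ok_to_continue(directory):  # steps 1 and 2: check, and fail if blocked
        return False
    cache_files_currently_updating_or_writing.append(directory)  # set the block
    try:
        modify()  # step 3: read the cache, change it, write it back
    finally:
        # Step 4: always release, even if modify() raised.
        cache_files_currently_updating_or_writing.remove(directory)
    return True
```

Note that in this sketch the check and the append are separate steps, so two threads could in principle both pass the check before either sets the block. A `threading.Lock` around that pair would close the gap; whether the real class does so is not stated here.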

FileName = '.eplaunch'

FilesKey = 'files'

ParametersKey = 'config'

QueueCheckInterval = 0.1

QueueTotalCheckTime = 5

ResultsKey = 'result'

RootKey = 'workflows'

WeatherFileKey = 'weather'

add_config(workflow_name, file_name, config_data) → None

This function is used to add a config data block for a workflow. A config data block contains data that is generally thought of as “input data” for a workflow, such as a weather file for a simulation run.

Parameters:

workflow_name – The name of the workflow to alter, as given by the workflow’s name() method
file_name – The file name of the file to alter
config_data – A map of data to write to this config section

Returns:

None

add_result(workflow_name, file_name, column_data) → None

This function is used to add a result data block for a workflow. A result data block contains data that is generally thought of as “output data” for a workflow, such as energy usage for a simulation run.

Parameters:

workflow_name – The name of the workflow to alter, as given by the workflow’s name() method
file_name – The file name of the file to alter
column_data – A map of data to write to this result section; the keys are expected to be defined by the workflow itself, as given by the get_interface_columns() method

Returns:

None
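To show how the config and result blocks relate to the JSON layout, here is a rough sketch of the two operations acting on a plain dictionary. It is an assumption-laden stand-in: the real methods operate on the instance's workflow_state and take no `state` argument, the helper `_file_entry` is hypothetical, and whether the real code replaces or merges an existing block is not specified in this document (the sketch replaces it).

```python
from typing import Any, Dict


def _file_entry(state: Dict[str, Any], workflow_name: str, file_name: str) -> Dict[str, Any]:
    """Hypothetical helper: return one file's entry, creating parents as needed."""
    workflows = state.setdefault("workflows", {})  # RootKey
    files = workflows.setdefault(workflow_name, {}).setdefault("files", {})  # FilesKey
    return files.setdefault(file_name, {})


def add_config(state: Dict[str, Any], workflow_name: str, file_name: str,
               config_data: Dict[str, Any]) -> None:
    """Store input data under the 'config' key (ParametersKey)."""
    _file_entry(state, workflow_name, file_name)["config"] = dict(config_data)


def add_result(state: Dict[str, Any], workflow_name: str, file_name: str,
               column_data: Dict[str, Any]) -> None:
    """Store output data under the 'result' key (ResultsKey)."""
    _file_entry(state, workflow_name, file_name)["result"] = dict(column_data)


state: Dict[str, Any] = {}
add_config(state, "EnergyPlus 8.9 SI", "1ZoneEvapCooler.idf", {"weather": "MyWeather.epw"})
add_result(state, "EnergyPlus 8.9 SI", "1ZoneEvapCooler.idf", {"Errors": 0, "Warnings": 1})
```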

get_files_for_workflow(current_workflow_name) → Dict

Gets the files that are found in this cache under the given workflow name

Parameters:

current_workflow_name – The name of a workflow (as determined by the name() function on the workflow)

Returns:

A map with keys that are file names found in this workflow

ok_to_continue() → bool

This function does the check-and-wait part of the mutex. If the current directory is not blocked, it immediately returns. If the current directory is blocked, it will attempt to check over a certain amount of time, at a tight interval, to wait on the mutex to be unlocked. Ultimately if it can’t pass, it returns False.

Returns:

True or False, whether it is safe to write to this cache

read() → None

Reads the existing cache file, if it exists, and stores the data in the workflow_state instance variable. If the cache file doesn’t exist, this simply initializes the workflow_state instance variable.

Returns:

None

write() → None

Writes out the workflow state to the previously determined cache file location. Note that this function does not protect for thread-safety! Functions that alter the state of the cache are expected to call write() within their own blocking structure.

Returns:

None

eplaunch.utilities.cache.cache_files_currently_updating_or_writing: List[Path] = []: This is used as the mutex queue, the list of unique directories being altered at a given time
