TM1 Git Branching and Configuration Best Practices

TM1 Git Branching

tm1.json Configuration Best Practices

In our previous article, we discussed the importance of using Git version control when developing TM1 models, especially in a development team of several people. We mentioned that the tm1project.json file acts as a set of rules that map the server state (schema and data) to Git. In this post, we would like to share how our own tm1project.json methodology works.

What is tm1project.json used for, and how can we use it to track our TM1 model in Git?

To link our TM1 model to a Git repository, we need a tm1project.json configuration file. The challenge in TM1 Git tracking is that we want to manage all the components that make up the schema of a database: the schema information, the content of objects, and, in some cases, object data and the code defined in the schema (rules, ETL, TI processes). In addition, we define CI/CD steps that run before and after Git commands in response to schema changes, and we track changes to our model's server configuration.

The tm1project.json file is essentially a custom set of declarative rules that instruct our model, via the Git endpoint of the TM1 REST API, on three things: which objects we want to track in Git and how, which process steps we want to run when pushing to or pulling from Git, and which system settings we want to handle.

In the tm1project.json configuration file, you can list in a simple JSON structure which TM1 objects to include in or exclude from Git tracking, which CI/CD tasks to perform, and which additional data assets to track. Within a single tm1project.json, you can also distinguish between the system environments (dev/test/prod) of a given model and so define different behavior for each CI/CD stage.

TM1 Git tracking different object types - main cases and specialties

Different TM1 object types require different strategies when tracking in Git:

  • 1. Cubes: The TM1 Git integration handles only the structure of cubes and the related object links (rule references, views). If you want to handle cube data in Git, you must manage it separately.
    • Data cubes: Since these usually contain real business data, we recommend not tracking their content, because the data is expected to differ between dev, test, and prod. (IMPORTANT: if the dimensionality of a cube changes, Git will not apply the change. Handle it with a separate ETL change script that migrates the data, deletes the old cube structure, and creates the new one.)
    • Control cubes: These are likely to be handled individually in each TM1 model, according to the development methodology defined for that model. Note that TM1 Git tracking generally skips all cubes whose names start with }. To track any of them, include them explicitly in tm1project.json.
    • Dimension metadata cubes: Because dimension attributes are stored as data in TM1, they do not travel through the Git REST API by default and must be handled individually.
  • 2. Dimensions: The TM1 Git integration covers only the structure of dimensions and their hierarchies, plus the associated object links (subsets). If you want to manage the attribute values of dimension elements in Git, you need a separate, custom approach.
    • Business dimensions: These dimensions make up the data cubes and parameter cubes needed for the system to work. A distinction should be made between:
      • Specific dimensions: Used only in TM1 and usually maintained manually, so they need to be managed in Git individually, according to the needs of the model.
      • Enterprise-level DWH dimensions: It is recommended to leave these out of Git tracking completely and to maintain their content and structure in the ETL processes of the TM1 model.
    • Dimensions containing system objects: These are usually managed by the TM1 engine and are excluded from the Git integration by default. They can be included in Git tracking in special cases, but this is not recommended.
  • 3. Views: Tracking the views defined on cubes requires specific consideration across environments. If the project uses public views, it is advisable to track them. It is important that } views are not tracked, and it is useful to introduce a naming pattern that makes it easy to include or exclude, in tm1project.json, the views that we do not want Git to handle.
  • 4. Subsets: Tracking the subsets defined on dimensions requires the same consideration across environments. If the project uses fixed public subsets, it is advisable to track them. It is important that } subsets are not tracked, and it is useful to introduce a naming pattern that makes it easy to include or exclude, in tm1project.json, the subsets that we do not want Git to handle.
  • 5. Rules: Note that the rules of technical cubes are omitted by default and must be explicitly included in tm1project.json if they are used.
  • 6. Processes: The code of TurboIntegrator (TI) processes, written in TM1's own procedural ETL language, should always be tracked in Git.
  • 7. Chores: These objects, used for simple orchestration and scheduling of TM1 processes, are worth tracking. Special attention is needed, however, because TM1 Git stores both the schedule and the parameters, so chores must be handled individually if parameters or schedules differ between environments.
  • 8. File assets: To solve the more complex problems mentioned in the introduction, tm1project.json lets you include arbitrary files or folders in the TM1 Git integration using regexp-based patterns. These files should sit inside the model's .\data\ folder, organized in a separate subfolder.
  • 9. TM1 configuration: If you also want to track the model's configuration parameters in Git, or hot-promote them through the Git pipeline, tm1project.json provides a way to do this.

Important Disclaimer

TM1 Git does not delete objects from a model; we have to take care of that in a TI process or in some other way. That is, when an object is deleted, it is removed from the Git repository, but it is not deleted from a model that pulls the change unless we ensure this by some action ourselves!
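One way to act on this is to compare what exists in the model with what is tracked in the repository and treat the difference as deletion candidates. The following is a minimal sketch of that comparison only. It assumes the two name lists have already been collected by some other means (for example from the REST API and from the checked-out repository), and the process names "Load Actuals" and "Old Import" are hypothetical. Names are compared ignoring case and spaces, on the understanding that TM1 treats object names that way.

```python
def _normalize(name):
    # Assumption: names are matched ignoring case and spaces.
    return name.replace(" ", "").lower()


def find_orphans(model_objects, repo_objects):
    """Return names that exist in the model but not in the repository, sorted."""
    tracked = {_normalize(name) for name in repo_objects}
    return sorted(name for name in model_objects if _normalize(name) not in tracked)


# Hypothetical example: "Old Import" was removed from Git but still lives in the model.
model_processes = ["zSYS Backup", "Load Actuals", "Old Import"]
repo_processes = ["zsys backup", "LoadActuals"]
print(find_orphans(model_processes, repo_processes))
```

The resulting list is only a set of candidates. Objects that are deliberately excluded from tracking will also show up, so they should be filtered out before anything is deleted.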

Code examples

{
    "Version": "1.0",
    // Git-tracked custom files. These folders sit under the model's data directory,
    // one folder per use case:
    "Files": [
        // Git-controlled datasets: mostly small CSVs holding business configuration data
        // or parameters (volume curves, user settings, etc.) that cannot be sourced via a general
        // ETL pipeline and are usually maintained by business users, but whose changes we want to trace
        "GitControlledDataSet/*.*",
        // Git-controlled model parameters: usually data dumps of our zSYS Maintenance cubes,
        // containing generic parameters that we want to transfer between models
        "GitControlledConfigs/*.*",
        // our custom, manually maintained dimension CSVs, which are maintained by the business
        // and whose changes we want to track between environments
        "GitControlledDimensionCSV/*.*"
    ]
}

// our default tasks implemented by custom TI processes
"Tasks": {
    "Backup": {
        "Process": "Processes('zSYS Backup')",
        "Parameters": [
            { "Name": "pWait", "Value": "1"}
        ]
    },
    // this is handy for dropping all rules before master data changes arrive:
    // for example, if a removed element was previously referenced in a rule, we can detach all rules during the Git migration,
    // apply the master data changes, and then let the Git flow deploy the new rules.
    "PrePullDropRules": {
        "Process": "Processes('zSYS Maintenance Clear All Cube Rule')"
    },
    // update all manually maintained and git tracked dimensions from the above mentioned folder
    "PostPullUpdateAllGitControlledDimFromCSV": {
        "Process": "Processes('zSYS Maintenance Dimension Import All from CSV')"
    },
    // if any maintenance change is needed after deployment (deleting an object that was
    // excluded from the model, changing a system parameter, running an ETL process, etc.), we
    // use the naming convention postpull_SOMETHING, and this process then executes those
    // processes in alphabetical order.
    "PostPullRunAllPostPullProcess": {
        "Process": "Processes('zSYS Maintenance Run All PostPull Prefix Named Process')"
    },
    // handy cleanup task to remove unused subsets and views
    "GarbageCleanUp": {
        "Process": "Processes('zSYS Maintenance View and Subset Cleanup')",
        "Parameters": [
            { "Name": "pRun", "Value": "1"}
        ]
    }
}
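The postpull_ naming convention mentioned in the task comments comes down to two steps: select the processes whose names carry the prefix, then run them in alphabetical order. The real implementation is a TI process that is not shown here, so the Python below is only an illustration of that selection and ordering logic. The process names other than 'zSYS Backup' are hypothetical, and matching the prefix case-insensitively is an assumption.

```python
def postpull_run_order(process_names, prefix="postpull_"):
    """Pick the processes that follow the prefix convention and sort them alphabetically."""
    wanted = prefix.lower()
    selected = [name for name in process_names if name.lower().startswith(wanted)]
    # Sort case-insensitively so that 'PostPull_...' and 'postpull_...' interleave.
    return sorted(selected, key=str.lower)


# Hypothetical process names for illustration only.
names = [
    "postpull_20_reload_rates",
    "zSYS Backup",
    "postpull_10_delete_old_cube",
    "PostPull_15_set_param",
]
for name in postpull_run_order(names):
    print(name)
```

Because the order is purely lexical, numeric steps in the names should use a fixed width (10, 15, 20 or 01, 02, 03). Otherwise postpull_9 would run after postpull_10.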

// generic task execution definitions, NOT specific to any system environment
"PrePull": [],
"PostPull": [
    "Tasks('PostPullUpdateAllGitControlledDimFromCSV')",
    "Tasks('GarbageCleanUp')"
],
"PrePush": [
    "Tasks('GarbageCleanUp')"
]

// generic TM1 object exclusion / inclusion list
"Ignore": [
    "Cubes/Views",
    // special bedrock alternatives :) objects starting with } are ignored by default, but we want to track these
    "!Processes('}bedrock.cube.data.export.ks')",
    "!Processes('}bedrock.cube.data.import.ks')",
    "!Processes('}bedrock.hier.export.ks')",
    "!Processes('}bedrock.hier.import.ks')",
    // to track element attribute rules, the element attribute cubes need to be included
    "!Cubes('}ElementAttributes_Employee')",
    "!Cubes('}ElementAttributes_Organization Units')",
    "!Cubes('}ElementAttributes_Profitability Segments')",
    "!Cubes('}ElementAttributes_Versions')",
    "!Cubes('}ElementAttributes_Simulation Case')",
    // picklists to track
    "!Cubes('}PickList_Employee Settings')",
    "!Cubes('}PickList_Headcount')",
    // example: ignore every } subset in every hierarchy
    "Dimensions/Hierarchies/Subsets('}*')",
    // DWH master data dimensions maintained by the main ETL pipeline, or environment-specific dimensions that cannot be deployed
    "Dimensions('Business Partners')",
    "Dimensions('Business Partner Dummies')",
    "Dimensions('Cost Objects')",
    "Dimensions('Curve Types')",
    "Dimensions('Employee')",
    "Dimensions('Key Account Managers')",
    "Dimensions('Organization Units')",
    "Dimensions('Projects')",
    "Dimensions('Profitability Segments')",
    "Dimensions('Scenarios')",
    "Dimensions('Simulation Case')",
    "Dimensions('Versions')",
    "Dimensions('zSYS Analogic UserPool')",
    "Dimensions('zSYS Analogic System Messages')"
],
// environment specific overrides
"Deployment": {
    "dev": {
        "PrePull": [
            "Tasks('Backup')"
        ]
    },
    "preprod": {
        "PrePull": [
            "Tasks('Backup')"
        ]
    },
    "prod": {
        "PrePull": [
            "Tasks('Backup')"
        ]
    }
}
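Since the hook lists refer to tasks by name, a misspelled name is an easy mistake to make while experimenting. The sketch below checks that every Tasks('...') reference in the generic hooks and in the per-environment Deployment overrides points at a key defined under "Tasks". It expects an already parsed dictionary (the // comments in the listings above would have to be stripped before json.loads accepts the file). It only knows the three hook names used in this article, and it reports anything that is not a Tasks('...') reference, which is an assumption based on the examples here.

```python
import re

# Hook names taken from the listings in this article.
HOOKS = ("PrePull", "PostPull", "PrePush")
TASK_REF = re.compile(r"^Tasks\('(.+)'\)$")


def undefined_task_refs(project):
    """Return (scope, hook, reference) for every hook entry without a matching task."""
    defined = set(project.get("Tasks", {}))
    scopes = {"(top level)": project}
    scopes.update(project.get("Deployment", {}))
    problems = []
    for scope_name, scope in scopes.items():
        for hook in HOOKS:
            for ref in scope.get(hook, []):
                match = TASK_REF.match(ref)
                if match is None or match.group(1) not in defined:
                    problems.append((scope_name, hook, ref))
    return problems


# Trimmed-down project: the CSV import task is referenced but not defined.
project = {
    "Tasks": {
        "Backup": {"Process": "Processes('zSYS Backup')"},
        "GarbageCleanUp": {"Process": "Processes('zSYS Maintenance View and Subset Cleanup')"},
    },
    "PostPull": [
        "Tasks('PostPullUpdateAllGitControlledDimFromCSV')",
        "Tasks('GarbageCleanUp')",
    ],
    "PrePush": ["Tasks('GarbageCleanUp')"],
    "Deployment": {"dev": {"PrePull": ["Tasks('Backup')"]}},
}
print(undefined_task_refs(project))
```

Running a check like this before committing the file catches naming mistakes without a round trip to the server.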

Tips & best practices

  • Before integrating our TM1 model with Git, it is a good idea to consult the IBM and other documentation referenced at the end of this article.
  • The syntax of tm1project.json is not simple. Leave time for experimentation.
  • Before you start, carefully map out which objects you really want to manage in Git and which you don't.
  • Once you start managing your model in Git, do everything through it, i.e. no more manual changes in the environments.
  • It's a good idea to manage everything as code and track it as code (e.g. some maintenance tasks should be written as TI processes and published between environments, not executed manually).
  • TM1 Git does not delete objects, so you always have to take care of that yourself.
  • Never put large files under the folders listed in Files, because they can significantly increase deployment time.
  • Once we have created a project JSON file that fits our model, we are ready to automate our TM1 CI/CD process, which is the topic of our next article.
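The tip about large files is easy to enforce with a small check before each commit. The sketch below walks a folder and lists every file above a size limit. The folder path and the limit are assumptions to be chosen per project, as the article does not prescribe a figure.

```python
from pathlib import Path


def find_large_files(folder, max_bytes):
    """Return (relative_path, size_in_bytes) for each file under folder larger than max_bytes."""
    root = Path(folder)
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > max_bytes:
                hits.append((path.relative_to(root).as_posix(), size))
    return hits


if __name__ == "__main__":
    # Hypothetical location and limit: check the current folder for files over 5 MB.
    for relative_path, size in find_large_files(".", 5 * 1024 * 1024):
        print(f"{relative_path}: {size} bytes")
```

Pointing the function at each Git-controlled folder (GitControlledDataSet, GitControlledConfigs, GitControlledDimensionCSV in the example above) gives a quick list of files that may be better sourced through the ETL pipeline than through Git.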

    References