Flow Building Best Practices

Learn how to build robust and scalable flows.

Getting started with the basics

Flow status

Once you start building a new flow, its status is always draft. In this mode, you can build, debug, and test. Once the flow is ready, set the status to active: only in this status will the scheduler actually schedule and execute the flow; flows in draft or paused are never run automatically. So the status active should be used for all actively running flows. If a flow throws multiple errors, or if you want to prevent it from executing, use the status paused. You can change the status either in the upper left corner of the Flow Builder by clicking on the button:

... or by clicking on the edit button:

Flow triggers

There are several ways to trigger (run) a Flow:

  1. Manual trigger - Clicking the "Run Now" button on the Flow Builder (visible in the screenshot above)

  2. Schedule trigger - via Schedule Helper

  3. Webhook trigger - via Webhook Helper or Connector Webhook Trigger

  4. Flow triggering another flow - via Multi Flow Trigger

Naming flows

Always include your initials (e.g. MM for Max Miller) as well as a date or version in the flow name, e.g. "Salesforce contacts sync with Zendesk MM 04.03.2020". This makes it easy for everyone on your team to quickly recognize who built the flow and whether it is up to date.

Default values

In order to easily check for variable existence and default to an empty string, use this pattern:

{{ property.country | default() }}

This defaults to an empty string if the field/value for country doesn’t exist.

{{ property.country | default("Germany") }}

This defaults to "Germany". This is particularly useful when dealing with an ERP system, for example, where your sales team only fills in the country field when it is not Germany.
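As a quick sketch of how this looks inside a request body (the field names here are hypothetical), the filter can be applied anywhere you reference a variable:

{
  "country": "{{ property.country | default('Germany') }}",
  "city": "{{ property.city | default() }}"
}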

Further best practices

  • Include some comments in the description of the Flow (under the edit button) so that everyone knows what the flow is about

  • Edit each individual Connector description text on the workspace to tell others (or yourself when you return a month later) what each step does

  • Improve the short references (e.g. zen1 for Zendesk); ideally, rename them to something like zendeskContacts or zendesk_contacts, depending on your preference for camel case vs. snake case

  • Once a flow is ready to be deployed, set the status to "active" to let everyone know it is ready to go

  • Break the flow down into its components and try running (debugging) one step at a time rather than building everything at once. Once the first steps work, add more functionality.

  • Initially, use rather small test data sets to properly test the functionality of each step. For CSV, XML, or JSON files, you can e.g. upload test data to AWS S3, Google Drive, or Dropbox and use that. If you query a REST API, try filtering, e.g. by date range, to keep the data set small.

  • Look at other flows you have already built, or find a template on the Flow Template Marketplace.

  • If you send emails, only send them to yourself for testing purposes.

  • If you need more space, you can zoom in and out by holding the CMD key on a Mac or the CTRL key on a PC while scrolling with your mouse.

  • CMD/CTRL + Z and CMD/CTRL + Shift + Z can be used to undo and redo graphical adjustments in Flows (moving Connectors and Helpers on the flow builder canvas)

Advanced best practices

These best practices are most likely not relevant while you are first getting to know the app; however, they will come in handy once you build larger flows, such as multi-entity import flows.

The maximum output size of each step is 50 MB. If this limit is reached, the step will fail with a message pointing out the size. Talk to us if this limit is not sufficient for your use case.

Splitting large Flows into Sub-Flows

Once your Flow grows to cover more edge cases, perform more complicated imports, or handle a variety of data sources, you should think about splitting it into smaller Sub-Flows using the Multi Flow Trigger Helper.

This has multiple advantages:

  • Different parts of the Flow are clearly separated, which makes it easier to see at a glance what each Sub-Flow does and to troubleshoot it

  • You are able to reuse Sub-Flows across multiple Flows, which reduces maintenance and prevents differences between Flows (e.g. if the same 10 steps are needed in multiple Flows, instead of copying the steps into each Flow, you create a separate Flow for them, which is then triggered by the other Flows)

  • The performance of your Flows can be increased if you trigger Sub-Flows "async" (e.g. within a loop), as multiple Sub-Flows can then run in parallel

Preparing large datasets before looping over them

Before looping over data, it often makes sense to do some data preparation with Helpers that handle large datasets with ease, such as the Spreadsheet Helper or a Jinja loop in the Dict Helper's Define Variables action. That way, the same operations run once beforehand instead of in every iteration.

Example: Mapping of multiple flat files

A typical importing scenario is that data from flat files (e.g. CSV files) should be imported to a system via a regular Connector.

However, the data that should be sent in a single API call might be spread across multiple flat files, so some mapping needs to happen before the data can be imported into the system.

Let's suppose the API expects data in the following format, where the main dictionary contains the company data and, inside of it, a list of company addresses:

{
  "id": "...",
  "name": "...",
  ...,
  "addresses": [
    {
      "street": "..."
      ...
    },
    {
      ...
    }
  ]
}

One could achieve this by building a Flow like this:

The following happens in this Flow:

  1. Data is read from the CSV files

  2. A loop iterates over the companies

  3. The address data is filtered down to the addresses of the current company

  4. Another loop iterates over these addresses and prepares them in the expected format

  5. Inside the last step, the company data is prepared with a reference to the address loop, and a request is sent to the API in order to create the company

This means that for each company, the entire company address data set is loaded again, then filtered, and finally looped over. A single iteration might take only a few seconds (depending on the size of the company address data file; for large files, some performance improvements can already be gained by using the Spreadsheet Helper's Query Spreadsheet action), but this quickly adds up across iterations.
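For illustration, the filter step inside such a loop would typically hold a Jinja expression along these lines (a sketch using the field names from the example below):

{{ companies_addresses | selectattr('company_id', '==', company.id) | list }}

Because this expression is evaluated inside the loop, the full address list is scanned once per company.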

Instead, this should happen directly in a Dict Helper with the Define Variables action before the loop:

Inside the Prepare Company Data step, the following happens:

[
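  {# runs once before the loop and builds the full payload for every company; probe() checks whether a field exists, so missing fields become None #}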
  {% for company in companies %}
    {
      "id": "{{ company.id }}",
      "name": {% if probe(company, 'company_name') %}"{{ company.company_name }}"{% else %}None{% endif %},
      ...,
      "addresses": [
        {% for address in companies_addresses | selectattr('company_id', '==', company.id) %}
            {
              "street": {% if probe(address, 'street_1') %}"{{ address.street_1 }}"{% else %}None{% endif %},
              ...
            }
          {% if not loop.last %},{% endif %}
        {% endfor %}
      ]
    }
    {% if not loop.last %},{% endif %}
  {% endfor %}
]

This results in a list like this:

[
  {
    "id": "...",
    "name": "...",
    ...,
    "addresses": [
      {
        "street": "..."
        ...
      },
      {
        ...
      }
    ]
  },
  {
    ...
  }
]

Now the loop simply iterates over this list, and the Create Company step only has to reference {{ company }}, as the data has already been prepared before the loop.

Depending on the data size, the Dict Helper step will take a bit longer; however, this additional time is quickly regained by shaving multiple seconds off each loop iteration.

In one similar production Flow, this increased the Flow speed 14-fold, cutting the run time of an initial import from more than 25 hours to less than 2 hours.

Changing large numbers of dictionaries in a list

In some cases, you might want to do some kind of data cleaning, where you e.g. change the values of a few columns, create a hash of entire rows, etc.

Instead of using the Looper Helper together with a Dict Helper, we recommend either building a Jinja loop inside a Dict Helper or using the Spreadsheet Helper's Query Spreadsheet action, as both can handle large amounts of data in a more performant way than the Looper.

For the Dict Helper option, you can create a Jinja loop similar to this one with the Define Variables action:

[
  {% for row in data %}
    {
      "field_1": "{{ row.field_1 | some_data_cleaning }}",
      "field_2": "{{ row.field_2 | some_data_cleaning }}"
    }
    {% if not loop.last %},{% endif %}
  {% endfor %}
]
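To make this concrete with standard Jinja filters (the field names and cleaning steps here are just examples), such a pass could look like this:

[
  {% for row in data %}
    {
      "email": "{{ row.email | trim | lower }}",
      "name": "{{ row.name | trim | title }}"
    }
    {% if not loop.last %},{% endif %}
  {% endfor %}
]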

For the Spreadsheet Helper option, you can make use of regular SQL.
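As a sketch, assuming (hypothetically) that the helper exposes your sheet as a table named data with the columns used above, such a query could look like this:

SELECT
  TRIM(LOWER(field_1)) AS field_1,
  TRIM(LOWER(field_2)) AS field_2
FROM data;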
