Skip to content

2.3.0 (May 27th. 2021)

Compare
Choose a tag to compare
@savingoyal savingoyal released this 27 May 19:13
· 937 commits to master since this release
0b5cf10

Metaflow 2.3.0 Release Notes

The Metaflow 2.3.0 release is a minor release.

Features

Coordinate larger Metaflow projects with @project

It's not uncommon for multiple people to work on the same workflow simultaneously. Metaflow makes it possible by keeping executions isolated through independently stored artifacts and namespaces. However, by default, all AWS Step Functions deployments are bound to the name of the workflow. If multiple people call step-functions create independently, each deployment will overwrite the previous one.
In the early stages of a project, this simple model is convenient but as the project grows, it is desirable that multiple people can test their own AWS Step Functions deployments without interference. Or, as a single developer, you may want to experiment with multiple independent AWS Step Functions deployments of their workflow.
This release introduces a @project decorator to address this need. The @project decorator is used at the FlowSpec-level to bind a Flow to a specific project. All flows with the same project name belong to the same project.

from metaflow import FlowSpec, step, project, current

@project(name='example_project')
class ProjectFlow(FlowSpec):

    @step
    def start(self):
        print('project name:', current.project_name)
        print('project branch:', current.branch_name)
        print('is this a production run?', current.is_production)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == '__main__':
    ProjectFlow()
python flow.py run

The flow works exactly as before when executed outside AWS Step Functions and introduces project_name, branch_name & is_production in the current object.

On AWS Step Functions, however, step-functions create will create a new workflow example_project.user.username.ProjectFlow (where username is your user name) with a user-specific isolated namespace and a separate production token.

For deploying experimental (test) versions that can run in parallel with production, you can deploy custom branches with --branch

python flow.py --branch foo step-functions create

To deploy a production version, you can deploy with --production flag (or pair it up with --branch if you want to run multiple variants in production)

python project_flow.py --production step-functions create

Note that the isolated namespaces offered by @project work best when your code is designed to respect these boundaries. For instance, when writing results to a table, you can use current.branch_name to choose the table to write to or you can disable writes outside production by checking current.is_production.

Hyphenated-parameters support in AWS Step Functions

Prior to this release, hyphenated parameters in AWS Step Functions weren't supported through CLI.

from metaflow import FlowSpec, Parameter, step

class ParameterFlow(FlowSpec):
    foo_bar = Parameter('foo-bar',
                      help='Learning rate',
                      default=0.01)

    @step
    def start(self):
        print('foo_bar is %f' % self.foo_bar)
        self.next(self.end)

    @step
    def end(self):
        print('foo_bar is still %f' % self.foo_bar)

if __name__ == '__main__':
    ParameterFlow()

Now, users can create their flows as usual on AWS Step Functions (with step-functions create) and trigger the deployed flows through CLI with hyphenated parameters -

python flow.py step-functions trigger --foo-bar 42

State Machine execution history logging for AWS Step Functions

Metaflow now logs State Machine execution history in AWS CloudWatch Logs for deployed Metaflow flows. You can enable it by specifying --log-execution-history flag while creating the state machine

python flow.py step-functions create --log-execution-history

Note that you would need to set the environment variable (or alternatively in your Metaflow config) METAFLOW_SFN_EXECUTION_LOG_GROUP_ARN to your AWS CloudWatch Logs Log Group ARN to pipe the execution history logs to AWS CloudWatch Logs