Public/ClustersAPI.ps1

#requires -Version 3.0
Function Add-DatabricksCluster
{
    <#
            .SYNOPSIS
            Creates a new Spark cluster. This method acquires new instances from the cloud provider if necessary. This method is asynchronous; the returned cluster_id can be used to poll the cluster state. When this method returns, the cluster is in a PENDING state. The cluster is usable once it enters a RUNNING state. See ClusterState.
            You can either specify all single properties of the cluster on your own or provide a cluster object that contains all the properties.
            Single properties will overwrite the values in the cluster object!
            .DESCRIPTION
            Creates a new Spark cluster. This method acquires new instances from the cloud provider if necessary. This method is asynchronous; the returned cluster_id can be used to poll the cluster state. When this method returns, the cluster is in a PENDING state. The cluster is usable once it enters a RUNNING state. See ClusterState.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#create
            .PARAMETER ClusterObject
            A PowerShell object representing the definition of a cluster according to Databricks documentation.
            .PARAMETER NumWorkers
            Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.
            Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.
            .PARAMETER MinWorkers
            The minimum number of workers to provision for this autoscale-enabled cluster.
            .PARAMETER MaxWorkers
            The maximum number of workers to provision for this autoscale-enabled cluster.
            .PARAMETER ClusterName
            Cluster name requested by the user. This doesn't have to be unique. If not specified at creation, the cluster name will be an empty string.
            .PARAMETER SparkVersion
            The Spark version of the cluster. A list of available Spark versions can be retrieved by using the List Zones API call. This field is required.
            .PARAMETER SparkConf
            An object containing a set of optional, user-specified Spark configuration key-value pairs. You can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively. Example Spark confs: {"spark.speculation": true, "spark.streaming.ui.retainedBatches": 5} or {"spark.driver.extraJavaOptions": "-verbose:gc -XX:+PrintGCDetails"}
            .PARAMETER AwsAttributes
            Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.
            .PARAMETER NodeTypeId
            This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads A list of available node types can be retrieved by using the List Node Types API call. This field is required.
            .PARAMETER DriverNodeTypeId
            The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.
            .PARAMETER SshPublicKeys
            SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.
            .PARAMETER CustomTags
            Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:
            Tags are not supported on legacy node types such as compute-optimized and memory-optimized
            Databricks allows at most 45 custom tags
            .PARAMETER ClusterLogConf
            The configuration for delivering Spark logs to a long-term storage destination. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is <destination>/<cluster-id>/driver, while the destination of executor logs is <destination>/<cluster-id>/executor.
            .PARAMETER InitScripts
            The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-id>/init_scripts.
            $init_scripts = @( @{ "dbfs" = @{ "destination" = "dbfs:/databricks/my-init-script.sh"; }; } )
            .PARAMETER SparkEnvVars
            An object containing a set of optional, user-specified environment variable key-value pairs. Key-value pairs of the form (X,Y) are exported as is (i.e., export X='Y') while launching the driver and workers. In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default databricks managed environmental variables are included as well. Example Spark environment variables: {"SPARK_WORKER_MEMORY": "28000m", "SPARK_LOCAL_DIRS": "/local_disk0"} or {"SPARK_DAEMON_JAVA_OPTS": "$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true"}
            .PARAMETER AutoterminationMinutes
            Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination.
            .PARAMETER EnableElasticDisk
            Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to Autoscaling local storage for details.
            .PARAMETER PythonVersion
            Allows you to explicitly set the Python version for the cluster by adding the entry 'PYSPARK_PYTHON' to the SparkEnvVars parameter. Default is Python 2 (2.7)
            For details please refer to https://docs.azuredatabricks.net/user-guide/clusters/python3.html
            .EXAMPLE
            Add-DatabricksCluster -NumWorkers 2 -ClusterName "MyCluster" -SparkVersion "4.0.x-scala2.11" -NodeTypeId 'Standard_DS3_v2'
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(ParameterSetName = "FixedSize", Mandatory = $true, Position = 1)] [int32] $NumWorkers,
        [Parameter(ParameterSetName = "Autoscale", Mandatory = $true, Position = 1)] [int32] $MinWorkers, 
        [Parameter(ParameterSetName = "Autoscale", Mandatory = $true, Position = 2)] [int32] $MaxWorkers, 
        
        [Parameter(ParameterSetName = "ClusterObject", Mandatory = $false, Position = 3)] [object] $ClusterObject,
        
        [Parameter(Mandatory = $false, Position = 3)] [string] $ClusterName, 
        [Parameter(Mandatory = $false, Position = 3)] [string] $SparkVersion, 
        [Parameter(Mandatory = $false, Position = 4)] [hashtable] $SparkConf, 
        [Parameter(Mandatory = $false, Position = 5)] [hashtable] $AwsAttributes, 
        [Parameter(Mandatory = $false, Position = 6)] [string] $NodeTypeId, 
        [Parameter(Mandatory = $false, Position = 7)] [string] $DriverNodeTypeId, 
        [Parameter(Mandatory = $false, Position = 8)] [string[]] $SshPublicKeys, 
        [Parameter(Mandatory = $false, Position = 9)] [hashtable] $CustomTags, 
        [Parameter(Mandatory = $false, Position = 10)] [object] $ClusterLogConf, 
        [Parameter(Mandatory = $false, Position = 11)] [object[]] $InitScripts, 
        [Parameter(Mandatory = $false, Position = 12)] [hashtable] $SparkEnvVars, 
        [Parameter(Mandatory = $false, Position = 13)] [int32] $AutoterminationMinutes, 
        [Parameter(Mandatory = $false, Position = 14)] [bool] $EnableElasticDisk,
        [Parameter(Mandatory = $false, Position = 15)] [string] [ValidateSet("2 (2.7)", "3 (3.5)")] $PythonVersion = "3 (3.5)",
        [Parameter(Mandatory = $false, Position = 16)] [string] [ValidateSet("HighConcurrency", "Standard")] $ClusterMode
    )

    $requestMethod = "POST"
    $apiEndpoint = "/2.0/clusters/create"

    #Set parameters
    Write-Verbose "Building Body/Parameters for final API call ..."
    if($ClusterObject)
    {
        $parameters = $ClusterObject | ConvertTo-Hashtable
    }
    else
    {
        $parameters = @{}
    }
    
    if($PythonVersion) # check if a PythonVersion was explicitly specified
    {
        if(-not $SparkEnvVars) # ensure that the SparkEnvVars variable exists - otherwise create it as empty hashtable
        {
            $SparkEnvVars = @{}
        }
        switch($PythonVersion) # set PYSPARK_PYTHON environment variable accordingly
        { 
            '2 (2.7)'  { $SparkEnvVars | Add-Property -Name 'PYSPARK_PYTHON' -Value '/databricks/python/bin/python' -Force } 
            '3 (3.5)'  { $SparkEnvVars | Add-Property -Name 'PYSPARK_PYTHON' -Value '/databricks/python3/bin/python3' -Force }
        }
        Write-Verbose "PythonVersion set to $PythonVersion"
    }
    
    if($ClusterMode) # check if a ClusterMode was explicitly specified
    {
        if(-not $CustomTags) # ensure that the SparkConf variable exists - otherwise create it as empty hashtable
        {
            $CustomTags = @{}
        }
        switch($ClusterMode) # set PYSPARK_PYTHON environment variable accordingly
        { 
            'Standard'  { $CustomTags | Add-Property -Name "ResourceClass" -Value "Standard" -Force } 
            'HighConcurrency'  { $CustomTags | Add-Property -Name "ResourceClass" -Value "Serverless" -Force }
        }
        Write-Verbose "ClusterMode set to $ClusterMode"
    }

    $parameters | Add-Property -Name "cluster_name" -Value $ClusterName -Force
    $parameters | Add-Property -Name "spark_version" -Value $SparkVersion -Force
    $parameters | Add-Property -Name "node_type_id" -Value $NodeTypeId -Force
    $parameters | Add-Property -Name "spark_conf" -Value $SparkConf -Force
    $parameters | Add-Property -Name "aws_attributes" -Value $AwsAttributes -Force
    $parameters | Add-Property -Name "driver_node_type_id" -Value $DriverNodeTypeId -Force
    $parameters | Add-Property -Name "ssh_public_keys" -Value $SshPublicKeys -Force
    $parameters | Add-Property -Name "custom_tags" -Value $CustomTags -Force
    $parameters | Add-Property -Name "cluster_log_conf" -Value $ClusterLogConf -Force
    $parameters | Add-Property -Name "init_scripts" -Value $InitScripts -Force
    $parameters | Add-Property -Name "spark_env_vars" -Value $SparkEnvVars -Force
    $parameters | Add-Property -Name "autotermination_minutes" -Value $AutoterminationMinutes -NullValue 0 -Force
    $parameters | Add-Property -Name "enable_elastic_disk" -Value $EnableElasticDisk -Force
    
    switch($PSCmdlet.ParameterSetName) 
    { 
        "FixedSize"  { $parameters | Add-Property -Name "num_workers" -Value $NumWorkers -Force } 
        "Autoscale"  { $parameters | Add-Property -Name "autoscale" -Value @{ min_workers = $MinWorkers; max_workers = $MaxWorkers } -Force }
    } 

    $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters
    
    return $result
}

Function Update-DatabricksCluster
{
    <#
            .SYNOPSIS
            Edit the configuration of a cluster to match the provided attributes and size.
            You can edit a cluster if it is in a RUNNING or TERMINATED state. If you edit a cluster while it is in a RUNNING state, it will be restarted so that the new attributes can take effect. If you edit a cluster while it is in a TERMINATED state, it will remain TERMINATED. The next time it is started using the clusters/start API, the new attributes will take effect. An attempt to edit a cluster in any other state will be rejected with an INVALID_STATE error code.
            Clusters created by the Databricks Jobs service cannot be edited.
            You can either specify all single properties of the cluster on your own or provide a cluster object that contains all the properties.
            Single properties will overwrite the values in the cluster object!
            .DESCRIPTION
            Edit the configuration of a cluster to match the provided attributes and size.
            You can edit a cluster if it is in a RUNNING or TERMINATED state. If you edit a cluster while it is in a RUNNING state, it will be restarted so that the new attributes can take effect. If you edit a cluster while it is in a TERMINATED state, it will remain TERMINATED. The next time it is started using the clusters/start API, the new attributes will take effect. An attempt to edit a cluster in any other state will be rejected with an INVALID_STATE error code.
            Clusters created by the Databricks Jobs service cannot be edited.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#edit
             
            .PARAMETER ClusterID
            The ID of the cluster to be edited.
            .PARAMETER ClusterObject
            A PowerShell object representing the definition of a cluster according to Databricks documentation.
            .PARAMETER NumWorkers
            Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.
            Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.
            .PARAMETER MinWorkers
            The minimum number of workers to provision for this autoscale-enabled cluster.
            .PARAMETER MaxWorkers
            The maximum number of workers to provision for this autoscale-enabled cluster.
            .PARAMETER ClusterName
            Cluster name requested by the user. This doesn't have to be unique. If not specified at creation, the cluster name will be an empty string.
            .PARAMETER SparkVersion
            The Spark version of the cluster. A list of available Spark versions can be retrieved by using the List Zones API call. This field is required.
            .PARAMETER SparkConf
            An object containing a set of optional, user-specified Spark configuration key-value pairs. You can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively. Example Spark confs: {"spark.speculation": true, "spark.streaming.ui.retainedBatches": 5} or {"spark.driver.extraJavaOptions": "-verbose:gc -XX:+PrintGCDetails"}
            .PARAMETER AwsAttributes
            Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.
            .PARAMETER NodeTypeId
            This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads A list of available node types can be retrieved by using the List Node Types API call. This field is required.
            .PARAMETER Drive_NodeTypeId
            The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set as the same value as node_type_id defined above.
            .PARAMETER SshPublicKeys
            SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.
            .PARAMETER CustomTags
            Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. Notes:
            Tags are not supported on legacy node types such as compute-optimized and memory-optimized
            Databricks allows at most 45 custom tags
            .PARAMETER ClusterLogConf
            The configuration for delivering Spark logs to a long-term storage destination. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 mins. The destination of driver logs is <destination>/<cluster-id>/driver, while the destination of executor logs is <destination>/<cluster-id>/executor.
            .PARAMETER InitScripts
            The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-id>/init_scripts.
            $init_scripts = @( @{ "dbfs" = @{ "destination" = "dbfs:/databricks/my-init-script.sh"; }; } )
            .PARAMETER SparkEnvVars
            An object containing a set of optional, user-specified environment variable key-value pairs. Key-value pairs of the form (X,Y) are exported as is (i.e., export X='Y') while launching the driver and workers. In order to specify an additional set of SPARK_DAEMON_JAVA_OPTS, we recommend appending them to $SPARK_DAEMON_JAVA_OPTS as shown in the example below. This ensures that all default databricks managed environmental variables are included as well. Example Spark environment variables: {"SPARK_WORKER_MEMORY": "28000m", "SPARK_LOCAL_DIRS": "/local_disk0"} or {"SPARK_DAEMON_JAVA_OPTS": "$SPARK_DAEMON_JAVA_OPTS -Dspark.shuffle.service.enabled=true"}
            .PARAMETER AutoterminationMinutes
            Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. You can also set this value to 0 to explicitly disable automatic termination.
            .PARAMETER EnableElasticDisk
            Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to Autoscaling local storage for details.
            .EXAMPLE
            Update-DatabricksCluster -NumWorkers 2 -ClusterName "MyCluster" -SparkVersion "4.0.x-scala2.11" -NodeTypeId "i3.xlarge"
    #>

    [CmdletBinding(DefaultParametersetName = "FixedSize")]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID, 

        [Parameter(ParameterSetName = "FixedSize", Mandatory = $false, Position = 2)] [int32] $NumWorkers,
        [Parameter(ParameterSetName = "Autoscale", Mandatory = $false, Position = 2)] [int32] $MinWorkers, 
        [Parameter(ParameterSetName = "Autoscale", Mandatory = $false, Position = 3)] [int32] $MaxWorkers, 
        
        [Parameter(Mandatory = $false, Position = 3)] [object] $ClusterObject,
        [Parameter(Mandatory = $false, Position = 3)] [string] $ClusterName, 
        [Parameter(Mandatory = $false, Position = 3)] [string] $SparkVersion, 
        [Parameter(Mandatory = $false, Position = 4)] [hashtable] $SparkConf, 
        [Parameter(Mandatory = $false, Position = 5)] [hashtable] $AwsAttributes, 
        [Parameter(Mandatory = $false, Position = 6)] [string] $NodeTypeId, 
        [Parameter(Mandatory = $false, Position = 7)] [string] $DriverNodeTypeId, 
        [Parameter(Mandatory = $false, Position = 8)] [string[]] $SshPublicKeys, 
        [Parameter(Mandatory = $false, Position = 9)] [hashtable] $CustomTags, 
        [Parameter(Mandatory = $false, Position = 10)] [object] $ClusterLogConf, 
        [Parameter(Mandatory = $false, Position = 11)] [object[]] $InitScripts, 
        [Parameter(Mandatory = $false, Position = 12)] [hashtable] $SparkEnvVars, 
        [Parameter(Mandatory = $false, Position = 13)] [int32] $AutoterminationMinutes, 
        [Parameter(Mandatory = $false, Position = 14)] [bool] $EnableElasticDisk,
        [Parameter(Mandatory = $false, Position = 15)] [string] [ValidateSet("2 (2.7)", "3 (3.5)")] $PythonVersion
    )
    
    $requestMethod = "POST"
    $apiEndpoint = "/2.0/clusters/edit"

    #Set parameters
    Write-Verbose "Building Body/Parameters for final API call ..."
    if($ClusterObject)
    {
        $parameters = $ClusterObject | ConvertTo-Hashtable
    }
    else
    {
        $parameters = @{}
    }
    
    if($PythonVersion) # check if a PythonVersion was explicitly specified
    {
        if(-not $SparkEnvVars) # ensure that the SparkEnvVars variable exists - otherwise create it as empty hashtable
        {
            $SparkEnvVars = @{}
        }
        switch($PythonVersion) # set PYSPARK_PYTHON environment variable accordingly
        { 
            '2 (2.7)'  { $SparkEnvVars | Add-Property -Name 'PYSPARK_PYTHON' -Value '/databricks/python/bin/python' -Force } 
            '3 (3.5)'  { $SparkEnvVars | Add-Property -Name 'PYSPARK_PYTHON' -Value '/databricks/python3/bin/python3' -Force }
        }
        Write-Verbose "PythonVersion set to $PythonVersion"
    }

    $parameters | Add-Property -Name "cluster_id" -Value $ClusterID -Force
    $parameters | Add-Property -Name "cluster_name" -Value $ClusterName -Force
    $parameters | Add-Property -Name "spark_version" -Value $SparkVersion -Force
    $parameters | Add-Property -Name "node_type_id" -Value $NodeTypeId -Force
    $parameters | Add-Property -Name "spark_conf" -Value $SparkConf -Force
    $parameters | Add-Property -Name "aws_attributes" -Value $AwsAttributes -Force
    $parameters | Add-Property -Name "driver_node_type_id" -Value $DriverNodeTypeId -Force
    $parameters | Add-Property -Name "ssh_public_keys" -Value $SshPublicKeys -Force
    $parameters | Add-Property -Name "custom_tags" -Value $CustomTags -Force
    $parameters | Add-Property -Name "cluster_log_conf" -Value $ClusterLogConf -Force
    $parameters | Add-Property -Name "init_scripts" -Value $InitScripts -Force
    $parameters | Add-Property -Name "spark_env_vars" -Value $SparkEnvVars -Force
    $parameters | Add-Property -Name "autotermination_minutes" -Value $AutoterminationMinutes -NullValue 0 -Force
    $parameters | Add-Property -Name "enable_elastic_disk" -Value $EnableElasticDisk -Force
    
    switch($PSCmdlet.ParameterSetName) 
    { 
        "FixedSize"  { $parameters | Add-Property -Name "num_workers" -Value $NumWorkers -Force } 
        "Autoscale"  { $parameters | Add-Property -Name "autoscale" -Value @{ min_workers = $MinWorkers; max_workers = $MaxWorkers } -Force }
    }
    
    $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

    return (ConvertTo-PSObject -InputObject $parameters)
}

Function Start-DatabricksCluster
{
    <#
            .SYNOPSIS
            Starts a terminated Spark cluster given its ID. This is similar to createCluster, except:
            .DESCRIPTION
            Starts a terminated Spark cluster given its ID. This is similar to createCluster, except:
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#start
            .PARAMETER ClusterID
            The cluster to be started. This field is required.
            .EXAMPLE
            Start-DatabricksCluster -ClusterID "1202-211320-brick1"
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/start"
    }

    process {
        #Set parameters
        Write-Verbose "Building Body/Parameters for final API call ..."
        $parameters = @{
            cluster_id = $ClusterID 
        }

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        return $result
    }
}

Function Restart-DatabricksCluster
{
    <#
            .SYNOPSIS
            Restarts a Spark cluster given its id. If the cluster is not in a RUNNING state, nothing will happen.
            .DESCRIPTION
            Restarts a Spark cluster given its id. If the cluster is not in a RUNNING state, nothing will happen.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#restart
            .PARAMETER ClusterID
            The cluster to be started. This field is required.
            .EXAMPLE
            Restart-DatabricksCluster -ClusterID "1202-211320-brick1"
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/restart"
    }
    
    process {
        Write-Verbose "Building Body/Parameters for final API call ..."
        #Set parameters
        $parameters = @{
            cluster_id = $ClusterID 
        }

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        return $result
    }
}

Function Stop-DatabricksCluster
{
    <#
            .SYNOPSIS
            Terminates a Spark cluster given its id. The cluster is removed asynchronously. Once the termination has completed, the cluster will be in a TERMINATED state. If the cluster is already in a TERMINATING or TERMINATED state, nothing will happen.
            .DESCRIPTION
            Terminates a Spark cluster given its id. The cluster is removed asynchronously. Once the termination has completed, the cluster will be in a TERMINATED state. If the cluster is already in a TERMINATING or TERMINATED state, nothing will happen.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#delete-terminate
            .PARAMETER ClusterID
            The cluster to be terminated. This field is required.
            .EXAMPLE
            Stop-DatabricksCluster -ClusterID "1202-211320-brick1"
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/delete"
    }
    
    process {
        Write-Verbose "Building Body/Parameters for final API call ..."
        #Set parameters
        $parameters = @{
            cluster_id = $ClusterID 
        }

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        return $result
    }
}

Function Resize-DatabricksCluster
{
    <#
            .SYNOPSIS
            Resize a cluster to have a desired number of workers. This will fail unless the cluster is in a RUNNING state.
            .DESCRIPTION
            Resize a cluster to have a desired number of workers. This will fail unless the cluster is in a RUNNING state.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#resize
            .PARAMETER ClusterID
            The cluster to be resized. This field is required.
            .PARAMETER NumWorkers
            Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.
            Note: When reading the properties of a cluster, this field reflects the desired number of workers rather than the actual current number of workers. For instance, if a cluster is resized from 5 to 10 workers, this field will immediately be updated to reflect the target size of 10 workers, whereas the workers listed in spark_info will gradually increase from 5 to 10 as the new nodes are provisioned.
            .PARAMETER MinWorkers
            The minimum number of workers to provision for this autoscale-enabled cluster.
            .PARAMETER MaxWorkers
            The maximum number of workers to provision for this autoscale-enabled cluster.
            .EXAMPLE
            Resize-DatabricksCluster -ClusterID "1202-211320-brick1" -NumWorkers 10
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID,
        [Parameter(ParameterSetName = "NumberOfWorkers", Mandatory = $true, Position = 2)] [int32] $NumWorkers,
        [Parameter(ParameterSetName = "Autoscale", Mandatory = $true, Position = 2)] [int32] $MinWorkers, 
        [Parameter(ParameterSetName = "Autoscale", Mandatory = $true, Position = 3)] [int32] $MaxWorkers
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/resize"
    }
    
    process {
        Write-Verbose "Building Body/Parameters for final API call ..."
        #Set parameters
        $parameters = @{
            cluster_id = $ClusterID 
        }
        
        switch($PSCmdlet.ParameterSetName) 
        { 
            "NumberOfWorkers"  { $parameters | Add-Property -Name "num_workers" -Value $NumWorkers -Force } 
            "Autoscale"  { $parameters | Add-Property -Name "autoscale" -Value @{ min_workers = $MinWorkers; max_workers = $MaxWorkers } -Force }
        } 

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        return $result
    }
}

Function Remove-DatabricksCluster
{
    <#
            .SYNOPSIS
            Permanently deletes a Spark cluster. If the cluster is running, it is terminated and its resources are asynchronously removed. If the cluster is terminated, then it is immediately removed.
            .DESCRIPTION
            Permanently deletes a Spark cluster. If the cluster is running, it is terminated and its resources are asynchronously removed. If the cluster is terminated, then it is immediately removed.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#permanent-delete
            .PARAMETER ClusterID
            The cluster to be permanently deleted. This field is required.
            .EXAMPLE
            Remove-DatabricksCluster -ClusterID "1202-211320-brick1"
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/permanent-delete"
    }
    
    process {
        Write-Verbose "Building Body/Parameters for final API call ..."
        #Set parameters
        $parameters = @{
            cluster_id = $ClusterID 
        }

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        # this call does not return any results
        #return $result
    }
}

Function Get-DatabricksCluster
{
    <#
            .SYNOPSIS
            Retrieves the information for a cluster given its identifier. Clusters can be described while they are running, or up to 30 days after they are terminated.
            .DESCRIPTION
            Retrieves the information for a cluster given its identifier. Clusters can be described while they are running, or up to 30 days after they are terminated.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#get
            .PARAMETER ClusterID
            The cluster about which to retrieve information. This field is required.
            .EXAMPLE
            Get-DatabricksCluster -ClusterID "1202-211320-brick1"
            .EXAMPLE
            #AUTOMATED_TEST:List existing clusters
            Get-DatabricksCluster
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $false, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID = $null
    )
    begin {
        $requestMethod = "GET"
        $apiEndpoint = "/2.0/clusters/list"
        if($ClusterID)
        {
            Write-Verbose "ClusterID specified ($ClusterID) - using get endpoint instead of list endpoint..."
            $apiEndpoint =  "/2.0/clusters/get"
        }
    }

    process {
        #Set parameters
        Write-Verbose "Building Body/Parameters for final API call ..."
        $parameters = @{}
        $parameters | Add-Property  -Name "cluster_id" -Value $ClusterID

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        if($ClusterID)
        {
            # if a ClusterID was specified, we return the result as it is
            return $result
        }
        else
        {
            # if no ClusterID was specified, we return the clusters as an array
            return $result.clusters
        }
    }
}

Function Pin-DatabricksCluster            
{
    <#
            .SYNOPSIS
            Note
            .DESCRIPTION
            Note
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#pin
            .PARAMETER ClusterID
            The cluster to pin. This field is required.
            .EXAMPLE
            Pin-DatabricksCluster -ClusterID "1202-211320-brick1"
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/pin"
    }
    
    process {
        Write-Verbose "Building Body/Parameters for final API call ..."
        #Set parameters
        $parameters = @{
            cluster_id = $ClusterID 
        }

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        return $result
    }
}

Function Unpin-DatabricksCluster
{
    <#
            .SYNOPSIS
            Note
            .DESCRIPTION
            Note
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#unpin
            .PARAMETER ClusterID
            The cluster to unpin. This field is required.
            .EXAMPLE
            Unpin-DatabricksCluster -ClusterID "1202-211320-brick1"
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/unpin"
    }
    
    process {
        Write-Verbose "Building Body/Parameters for final API call ..."
        #Set parameters
        $parameters = @{
            cluster_id = $ClusterID 
        }

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        return $result
    }
}

Function Get-DatabricksClusterEvent
{
    <#
            .SYNOPSIS
            Retrieves a list of events about the activity of a cluster. This API is paginated. If there are more events to read, the response includes all the parameters necessary to request the next page of events.
            .DESCRIPTION
            Retrieves a list of events about the activity of a cluster. This API is paginated. If there are more events to read, the response includes all the parameters necessary to request the next page of events.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#events
            .PARAMETER Cluster_Id
            The ID of the cluster to retrieve events about. This field is required.
            .PARAMETER Start_Time
            The start time in epoch milliseconds. If empty, returns events starting from the beginning of time.
            .PARAMETER End_Time
            The end time in epoch milliseconds. If empty, returns events up to the current time.
            .PARAMETER Order
            The order to list events in; either ASC or DESC. Defaults to DESC.
            .PARAMETER Event_Types
            An optional set of event types to filter on. If empty, all event types are returned.
            .PARAMETER Offset
            The offset in the result set. Defaults to 0 (no offset). When an offset is specified and the results are requested in descending order, the end_time field is required.
            .PARAMETER Limit
            The maximum number of events to include in a page of events. Defaults to 50, and maximum allowed value is 500.
            .EXAMPLE
            Get-ClusterEvent -ClusterID <cluster_id> -StartTime <start_time> -EndTime <end_time> -Order <order> -EventTypes <event_types> -Offset <offset> -Limit <limit>
            .EXAMPLE
            #AUTOMATED_TEST:Get cluster Events
            $cluster = Get-DatabricksCluster
            $cluster[0] | Get-DatabricksClusterEvent
    #>

    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory = $true, Position = 1, ValueFromPipelineByPropertyName = $true)] [Alias("cluster_id")] [string] $ClusterID, 
        [Parameter(Mandatory = $false, Position = 2)] [int64] $StartTime, 
        [Parameter(Mandatory = $false, Position = 3)] [int64] $EndTime, 
        [Parameter(Mandatory = $false, Position = 4)] [ValidateSet("ASC", "DESC")] [string] $Order, 
        [Parameter(Mandatory = $false, Position = 5)] [ValidateSet("CREATING",    "DID_NOT_EXPAND_DISK",    "EXPANDED_DISK",    "FAILED_TO_EXPAND_DISK",    "INIT_SCRIPTS_STARTING",    "INIT_SCRIPTS_FINISHED",    "STARTING",    "RESTARTING",    "TERMINATING",    "EDITED",    "RUNNING",    "RESIZING",    "UPSIZE_COMPLETED",    "NODES_LOST")] [string[]]  $EventTypes, 
        [Parameter(Mandatory = $false, Position = 6)] [int] $Offset = -1, 
        [Parameter(Mandatory = $false, Position = 7)] [int] $Limit = -1
    )
    begin {
        $requestMethod = "POST"
        $apiEndpoint = "/2.0/clusters/events"
    }
    
    process {
        Write-Verbose "Building Body/Parameters for final API call ..."
        #Set parameters
        $parameters = @{
            cluster_id = $ClusterID 
        }
    
        $parameters | Add-Property  -Name "start_time" -Value $StartTime
        $parameters | Add-Property  -Name "end_time" -Value $EndTime
        $parameters | Add-Property  -Name "order" -Value $Order
        $parameters | Add-Property  -Name "event_types" -Value $EventTypes
        $parameters | Add-Property  -Name "offset" -Value $Offset -NullValue -1
        $parameters | Add-Property  -Name "limit" -Value $Limit -NullValue -1

        $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

        return $result
    }
}

Function Get-DatabricksNodeType
{
    <#
            .SYNOPSIS
            Returns a list of supported Spark node types. These node types can be used to launch a cluster.
            .DESCRIPTION
            Returns a list of supported Spark node types. These node types can be used to launch a cluster.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#list-node-types
            .EXAMPLE
            #AUTOMATED_TEST:List cluster node types
            Get-DatabricksNodeType
    #>

    [CmdletBinding()]
    param ()

    $requestMethod = "GET"
    $apiEndpoint = "/2.0/clusters/list-node-types"

    Write-Verbose "Building Body/Parameters for final API call ..."
    #Set parameters
    $parameters = @{}

    $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

    return $result.node_types
}

Function Get-DatabricksZone
{
    <#
            .SYNOPSIS
            Returns a list of availability zones where clusters can be created in (ex: us-west-2a). These zones can be used to launch a cluster.
            .DESCRIPTION
            Returns a list of availability zones where clusters can be created in (ex: us-west-2a). These zones can be used to launch a cluster.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#list-zones
            .EXAMPLE
            #AUTOMATED_TEST:List cluster zones
            Get-DatabricksZone
             
    #>

    [CmdletBinding()]
    param() 
    
    $requestMethod = "GET"
    $apiEndpoint = "/2.0/clusters/list-zones"
    
    if($script:dbCloudProvider -in  @("Azure"))
    {
        Write-Warning "API call '$requestMethod $apiEndpoint' is not supported on Cloud Provider '$script:dbCloudProvider'"
        return
    }

    Write-Verbose "Building Body/Parameters for final API call ..."
    #Set parameters
    $parameters = @{}

    $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

    return $result
}

Function Get-DatabricksSparkVersion
{
    <#
            .SYNOPSIS
            Returns the list of available Spark versions. These versions can be used to launch a cluster.
            .DESCRIPTION
            Returns the list of available Spark versions. These versions can be used to launch a cluster.
            Official API Documentation: https://docs.databricks.com/api/latest/clusters.html#spark-versions
            .EXAMPLE
            #AUTOMATED_TEST:List spark versions
            Get-DatabricksSparkVersion
    #>

    [CmdletBinding()]
    param ()

    $requestMethod = "GET"
    $apiEndpoint = "/2.0/clusters/spark-versions"

    Write-Verbose "Building Body/Parameters for final API call ..."
    #Set parameters
    $parameters = @{}

    $result = Invoke-DatabricksApiRequest -Method $requestMethod -EndPoint $apiEndpoint -Body $parameters

    return $result.versions
}