Public/New-DuckDBConnectionWithFunctions.ps1
|
function Out-DuckData { <# .SYNOPSIS Outputs data from a DuckDB data reader as PowerShell custom objects. .DESCRIPTION The Out-DuckData function reads data from a DuckDB data reader and converts each row into a PowerShell custom object. Each column in the result set becomes a property of the object, with the column name as the property name and the column value as the property value. .PARAMETER reader The DuckDB data reader object from which to read the data. This parameter is mandatory. .PARAMETER Help Displays the full help information for this function. .PARAMETER Version Displays the version and release information for this function. .OUTPUTS PSCustomObject Each row from the data reader is output as a PSCustomObject with properties corresponding to the column names and values. .EXAMPLE PS C:\> $conn = New-DuckDBConnection PS C:\> $reader = $conn.sql("SELECT id, name FROM users").ExecuteReader() PS C:\> Out-DuckData -reader $reader Reads data from the 'users' table and outputs each row as a custom object. .NOTES AUTHOR: OpusTecnica - posh@opustecnica.com VERSION: 0.9.2 DATE: 2025-12-23T17:57-05:00 RELEASES: 0.9.1 - Updated on 2025-12-23T17:57-05:00 0.9.0 - Initial release on 2025-12-23 .LINK New-DuckDBConnection #> [CmdletBinding()] param( $reader = $null, [switch]$Help, [switch]$Version ) begin { if ($Help) { Get-Help -Name $MyInvocation.MyCommand -Full $Return = $true } elseif ($Version) { $HelpContent = Get-Help -Name $MyInvocation.MyCommand -Full | Out-String $Notes = [regex]::Match($HelpContent, '(?s)(?<=NOTES\s*).*?(?=(RELATED\sLINKS|--+))').Value -split "`n" | ForEach-Object { if ($_ -notmatch '^\s+$') { $_ -replace '(\s\s)+', ' ' } } Write-Output ($MyInvocation.InvocationName + "`n") $Notes "`n" $Return = $true } else { $Return = $false } } process { if ($Return) { return } if (-not $reader) { throw "The 'reader' parameter is required when not using -Help or -Version." } while ($reader.read()) { # Create a hashtable for the current row $rowObject = [Ordered]@{} for ($columnIndex = 0; $columnIndex -lt $reader.FieldCount; $columnIndex++ ) { # Add field name and value as key-value pair $rowObject[$reader.GetName($columnIndex)] = $reader.GetValue($columnIndex) } # Convert the hashtable to a custom object and add it to the array [PSCustomObject]$rowObject } } end {} } function New-DuckDBConnection { <# .SYNOPSIS Creates a new connection to a DuckDB database. .DESCRIPTION The New-DuckDBConnection function creates a new connection to a DuckDB database using the specified path and configuration options. The connection can be to an in-memory database (default) or to a persistent file-based database. Once created, use the .Open() method or call .sql() which auto-opens the connection. .PARAMETER Path The path to the DuckDB database file. If not specified, the connection will be created to an in-memory database. Use ':memory:' explicitly for in-memory databases, or provide a file path for persistent storage. .PARAMETER AccessMode The access mode for the database connection. Valid values are 'READ_WRITE' (default) and 'READ_ONLY'. .PARAMETER Threads The maximum number of threads DuckDB can use for query execution. If not specified, DuckDB will use all available cores. .PARAMETER MemoryLimit The maximum amount of memory DuckDB can use. Can be specified as a number (bytes) or with units like '1GB', '512MB', etc. .PARAMETER TempDirectory The directory to use for temporary files. If not specified, the system default temporary directory is used. .PARAMETER MaxMemory Alternative to MemoryLimit. The maximum memory limit for the database. .PARAMETER SharedCache For in-memory databases, whether to use a shared cache across connections. Only applicable when Path is ':memory:'. .PARAMETER MotherDuckToken The MotherDuck authentication token for connecting to MotherDuck databases. .PARAMETER AdditionalOptions A hashtable of additional DuckDB configuration options to include in the connection string. Example: @{ 'enable_external_access' = 'false'; 'allow_unsigned_extensions' = 'true' } .PARAMETER Help Displays the full help information for this function. .PARAMETER Version Displays the version and release information for this function. .OUTPUTS DuckDB.NET.Data.DuckDBConnection A DuckDB connection object with extended methods: sql(), CreateFunction(), RemoveFunction(), CloseDB(). .EXAMPLE PS C:\> $conn = New-DuckDBConnection PS C:\> $conn.Open() PS C:\> $conn.sql("SELECT 'Hello DuckDB' as greeting") Creates and opens an in-memory DuckDB connection, then runs a simple query. .EXAMPLE PS C:\> $conn = New-DuckDBConnection -Path 'C:\Data\mydb.duckdb' PS C:\> $conn.sql("CREATE TABLE users (id INT, name VARCHAR)") PS C:\> $conn.sql("INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob')") PS C:\> $conn.sql("SELECT * FROM users") PS C:\> $conn.CloseDB() Creates a persistent database, creates a table, inserts data, queries it, and closes. .EXAMPLE PS C:\> $conn = New-DuckDBConnection -Path 'C:\Data\mydb.duckdb' -AccessMode READ_ONLY PS C:\> $conn.sql("SELECT * FROM users") Opens a database in read-only mode. .EXAMPLE PS C:\> $conn = New-DuckDBConnection -Threads 4 -MemoryLimit '2GB' PS C:\> $conn.sql("SELECT * FROM large_table") Creates an in-memory database with limited threads and memory. .EXAMPLE PS C:\> $conn = New-DuckDBConnection -Path 'md:my_database' -MotherDuckToken 'ey...' PS C:\> $conn.sql("SELECT * FROM my_table") Connects to a MotherDuck database. .EXAMPLE PS C:\> $conn = New-DuckDBConnection -AdditionalOptions @{ 'enable_external_access' = 'false' } PS C:\> $conn.sql("SELECT * FROM read_csv_auto('data.csv')") Creates a connection with additional security restrictions. .NOTES AUTHOR: OpusTecnica - posh@opustecnica.com VERSION: 0.9.2 DATE: 2025-12-23T17:57-05:00 RELEASES: 0.9.1 - Updated on 2025-12-23T17:57-05:00 0.9.0 - Initial release on 2025-12-23 Requires DuckDB.NET.Data.dll (net8.0 build) for UDF support via CreateFunction(). #> [CmdletBinding()] param ( [string]$Path = ':memory:', [ValidateSet('READ_WRITE', 'READ_ONLY')] [string]$AccessMode, [int]$Threads, [string]$MemoryLimit, [string]$TempDirectory, [string]$MaxMemory, [switch]$SharedCache, [string]$MotherDuckToken, [hashtable]$AdditionalOptions, [switch]$Help, [switch]$Version ) begin { if ($Help) { Get-Help -Name $MyInvocation.MyCommand -Full $Return = $true } elseif ($Version) { $HelpContent = Get-Help -Name $MyInvocation.MyCommand -Full | Out-String $Notes = [regex]::Match($HelpContent, '(?s)(?<=NOTES\s*).*?(?=(RELATED\sLINKS|--+))').Value -split "`n" | ForEach-Object { if ($_ -notmatch '^\s+$') { $_ -replace '(\s\s)+', ' ' } } Write-Output ($MyInvocation.InvocationName + "`n") $Notes "`n" $Return = $true } else { $Return = $false } } process { if ($Return) { return } # Build connection string $connectionStringParts = @() # Handle MotherDuck connections if ($MotherDuckToken) { $connectionStringParts += "DataSource=md:$Path" $connectionStringParts += "motherduck_token=$MotherDuckToken" } else { $connectionStringParts += "Data Source=$Path" } # Add configuration options if ($AccessMode) { $connectionStringParts += "ACCESS_MODE=$AccessMode" } if ($Threads) { $connectionStringParts += "threads=$Threads" } if ($MemoryLimit) { $connectionStringParts += "memory_limit=$MemoryLimit" } if ($MaxMemory) { $connectionStringParts += "max_memory=$MaxMemory" } if ($TempDirectory) { $connectionStringParts += "temp_directory=$TempDirectory" } # Handle shared cache for in-memory databases if ($SharedCache -and $Path -eq ':memory:') { $connectionStringParts += 'cache=shared' } # Add additional options if ($AdditionalOptions) { foreach ($key in $AdditionalOptions.Keys) { $connectionStringParts += "$key=$($AdditionalOptions[$key])" } } $connectionString = $connectionStringParts -join ';' Write-Verbose "Connection string: $connectionString" [DuckDB.NET.Data.DuckDBConnection]::new($connectionString) } end {} } function Add-DuckDBSqlMethod { <# .SYNOPSIS Executes a SQL query on a DuckDB connection. .DESCRIPTION This script method executes a SQL query on a DuckDB connection. It automatically opens the connection if it's not already open. The query results are returned as PowerShell objects that can be piped to other cmdlets or formatted for display. .PARAMETER query The SQL query to be executed. Supports all DuckDB SQL syntax including: - Standard SELECT, INSERT, UPDATE, DELETE statements - DuckDB extensions like read_csv_auto(), read_parquet(), read_json() - Window functions, CTEs, PIVOT/UNPIVOT - User-defined functions registered via CreateFunction() .OUTPUTS PSCustomObject[] Query results as PowerShell objects with properties matching column names. .EXAMPLE PS> $conn.sql("SELECT * FROM Customers") Executes a simple SELECT query and returns all customers. .EXAMPLE PS> $conn.sql("SELECT * FROM read_csv_auto('data.csv') WHERE amount > 100") | Sort-Object amount -Descending | Select-Object -First 10 Reads a CSV file, filters rows, and uses PowerShell pipeline to sort and limit results. .EXAMPLE PS> $results = $conn.sql(@" WITH monthly_sales AS ( SELECT DATE_TRUNC('month', sale_date) as month, SUM(amount) as total FROM sales GROUP BY 1 ) SELECT * FROM monthly_sales ORDER BY month "@) PS> $results | Format-Table -AutoSize Executes a complex query with a CTE and formats results as a table. .EXAMPLE PS> $conn.sql("SELECT getenv('USERNAME') as user, getenv('COMPUTERNAME') as computer") Calls user-defined functions registered with CreateFunction(). .NOTES AUTHOR: OpusTecnica - posh@opustecnica.com VERSION: 0.9.2 DATE: 2025-12-23T17:57-05:00 RELEASES: 0.9.1 - Initial release on 2025-12-23 #> Update-TypeData -TypeName 'DuckDB.NET.Data.DuckDBConnection' -MemberType ScriptMethod -MemberName sql -Force -Value { param([string]$query) if ($this.State -ne 'Open') { $this.Open() } $cmd = $this.CreateCommand() $cmd.CommandText = $query $reader = $cmd.ExecuteReader() Out-DuckData $reader } } function Add-DuckDBCloseDBMethod { <# .SYNOPSIS Closes the DuckDB connection and disposes of any associated resources. .DESCRIPTION The CloseDB method is used to close the DuckDB connection and dispose of any associated resources such as the reader and command objects. Always call this method when you're done with the connection to release database locks and free memory. .INPUTS None. You cannot pipe objects to this method. .OUTPUTS None. This method does not generate any output. .EXAMPLE PS> $conn = New-DuckDBConnection -Path 'mydb.duckdb' PS> $conn.sql("SELECT * FROM users") PS> $conn.CloseDB() Opens a database, runs a query, then properly closes the connection. .EXAMPLE PS> try { $conn = New-DuckDBConnection $conn.sql("SELECT * FROM read_parquet('large_file.parquet')") } finally { $conn.CloseDB() } Uses try/finally to ensure the connection is always closed, even if an error occurs. .NOTES AUTHOR: OpusTecnica - posh@opustecnica.com VERSION: 0.9.2 DATE: 2025-12-23T17:57-05:00 RELEASES: 0.9.1 - Initial release on 2025-12-23 #> Update-TypeData -TypeName 'DuckDB.NET.Data.DuckDBConnection' -MemberType ScriptMethod -MemberName CloseDB -Force -Value { if ($null -ne $reader) { $reader.Dispose() } if ($null -ne $cmd) { $cmd.Dispose() } $this.Close() } } function Add-DuckDBCreateFunctionMethod { <# .SYNOPSIS Registers a user-defined scalar function with the DuckDB connection. .DESCRIPTION The CreateFunction method registers a PowerShell scriptblock as a scalar UDF (User-Defined Function) in DuckDB. This mimics the Python create_function API. The function takes input parameters, processes them through a PowerShell scriptblock, and returns a result that can be used in SQL queries. Supports 1 to 4 input parameters with any combination of supported types: - String types: [string] - Numeric types: [int], [long], [double], [decimal] - Boolean: [bool] - Date/Time: [DateTime] The UDF can be called from any SQL query just like built-in DuckDB functions. .PARAMETER Name The name of the function to register in DuckDB. This name will be used in SQL queries. Function names are case-insensitive in DuckDB. .PARAMETER Function A PowerShell scriptblock that implements the function logic. The scriptblock receives the input value(s) as parameters and should return a single value of the specified ReturnType. The scriptblock is invoked once per row when the function is called in a query. .PARAMETER Parameters An array of .NET types representing the input parameter types. The order must match the parameter order in the scriptblock. Supported types include: [string], [int], [long], [double], [decimal], [bool], [DateTime] .PARAMETER ReturnType The .NET type of the return value. Must match what the scriptblock actually returns. DuckDB will convert the result to the appropriate SQL type. .PARAMETER SideEffects If $true, indicates the function has side effects (is not a pure function). Default is $false (function is pure - same input always produces same output). Pure functions can be optimized by DuckDB (e.g., result caching). .EXAMPLE PS> # Read environment variables in SQL queries PS> $conn.CreateFunction( 'getenv', { param($varName) [System.Environment]::GetEnvironmentVariable($varName) }, @([string]), [string] ) PS> $conn.sql("SELECT getenv('USERNAME') as user, getenv('COMPUTERNAME') as host") user host ---- ---- jsmith WORKSTATION01 .EXAMPLE PS> # Perform arithmetic with custom logic PS> $conn.CreateFunction( 'calculate_tax', { param([double]$amount, [double]$rate) [Math]::Round($amount * $rate / 100, 2) }, @([double], [double]), [double] ) PS> $conn.sql("SELECT product, price, calculate_tax(price, 8.25) as tax FROM products") product price tax ------- ----- --- Widget 29.99 2.47 Gadget 149.99 12.37 .EXAMPLE PS> # Boolean function for filtering PS> $conn.CreateFunction( 'is_valid_email', { param($email) $email -match '^[\w.-]+@[\w.-]+\.\w+$' }, @([string]), [bool] ) PS> $conn.sql("SELECT email FROM users WHERE is_valid_email(email)") .EXAMPLE PS> # String manipulation PS> $conn.CreateFunction( 'slugify', { param($text) ($text -replace '[^\w\s-]', '' -replace '\s+', '-').ToLower() }, @([string]), [string] ) PS> $conn.sql("SELECT title, slugify(title) as slug FROM articles") title slug ----- ---- Hello World! hello-world What's New in 2024? whats-new-in-2024 .EXAMPLE PS> # Multi-parameter function PS> $conn.CreateFunction( 'format_name', { param($first, $last, $title) "$title $first $last" }, @([string], [string], [string]), [string] ) PS> $conn.sql("SELECT format_name(first_name, last_name, 'Dr.') as formal FROM doctors") .EXAMPLE PS> # Use with table data and aggregation PS> $conn.CreateFunction('double_it', { param([int]$n) $n * 2 }, @([int]), [int]) PS> $conn.sql(@" SELECT category, SUM(quantity) as total, SUM(double_it(quantity)) as doubled FROM inventory GROUP BY category "@) .NOTES Requires DuckDB.NET.Data.dll net8.0 build (v1.4.0 or later) for UDF support. The RegisterScalarFunction API is marked experimental in DuckDB.NET. Type Mapping: - PowerShell [string] -> DuckDB VARCHAR - PowerShell [int] -> DuckDB INTEGER - PowerShell [long] -> DuckDB BIGINT - PowerShell [double] -> DuckDB DOUBLE - PowerShell [bool] -> DuckDB BOOLEAN Note: When using with range() or other functions that return BIGINT, you may need to cast to INTEGER: SELECT my_func(i::INTEGER) FROM range(10) t(i) #> # Load the C# helper class source code from external file $ScalarFunctionWrapperCode = Get-Content -Path (Join-Path $PSScriptRoot '..\cs\ScalarFunctionWrapper.cs') -Raw # Check if the type is already loaded if (-not ([System.Management.Automation.PSTypeName]'PaperinikDB.ScalarFunctionWrapper').Type) { # Try to load from pre-compiled DLL first $scalarDllPath = Join-Path $PSScriptRoot '..\lib\net8\PaperinikDB.ScalarFunctionWrapper.dll' if (Test-Path $scalarDllPath) { try { Add-Type -Path $scalarDllPath -ErrorAction Stop Write-Verbose 'ScalarFunctionWrapper loaded from DLL' } catch { Write-Verbose "Could not load ScalarFunctionWrapper DLL, will compile: $_" $scalarDllPath = $null } } else { $scalarDllPath = $null } # Compile from source if DLL not available if (-not $scalarDllPath -or -not ([System.Management.Automation.PSTypeName]'PaperinikDB.ScalarFunctionWrapper').Type) { $duckDbDataAssembly = [DuckDB.NET.Data.DuckDBConnection].Assembly if ($duckDbDataAssembly) { $referencedAssemblies = @( [System.Management.Automation.PSObject].Assembly.Location, [System.Object].Assembly.Location, [System.Collections.Generic.List[object]].Assembly.Location, [System.Collections.Generic.IList[object]].Assembly.Location, [System.Runtime.CompilerServices.DynamicAttribute].Assembly.Location, [Microsoft.CSharp.RuntimeBinder.Binder].Assembly.Location, $duckDbDataAssembly.Location ) | Select-Object -Unique try { Add-Type -TypeDefinition $ScalarFunctionWrapperCode -ReferencedAssemblies $referencedAssemblies -IgnoreWarnings -ErrorAction Stop Write-Verbose 'ScalarFunctionWrapper compiled successfully' } catch { Write-Warning "Could not compile ScalarFunctionWrapper: $_" } } } } Update-TypeData -TypeName 'DuckDB.NET.Data.DuckDBConnection' -MemberType ScriptMethod -MemberName CreateFunction -Force -Value { param( [string]$Name, [scriptblock]$Function, [type[]]$Parameters, [type]$ReturnType, [bool]$SideEffects = $false ) if ($this.State -ne 'Open') { $this.Open() } # Validate parameters $paramCount = $Parameters.Count if ($paramCount -lt 1 -or $paramCount -gt 4) { throw "CreateFunction supports 1 to 4 input parameters. Got $paramCount." } # Create the wrapper instance with the result type $wrapper = [PaperinikDB.ScalarFunctionWrapper]::new($Function, $ReturnType) # Get the Invoke method from the wrapper $invokeMethod = $wrapper.GetType().GetMethod('Invoke') # Create the Action delegate from the wrapper's Invoke method $readerListType = [System.Collections.Generic.IReadOnlyList[DuckDB.NET.Data.DataChunk.Reader.IDuckDBDataReader]] $writerType = [DuckDB.NET.Data.DataChunk.Writer.IDuckDBDataWriter] $actionType = [System.Action`3].MakeGenericType($readerListType, $writerType, [ulong]) $actionDelegate = [System.Delegate]::CreateDelegate($actionType, $wrapper, $invokeMethod) # Build the generic type arguments: input types + return type $genericArgs = [type[]]($Parameters + $ReturnType) # Find the correct RegisterScalarFunction overload based on parameter count $connectionType = $this.GetType() $methods = $connectionType.GetMethods() | Where-Object { $_.Name -eq 'RegisterScalarFunction' -and $_.GetGenericArguments().Count -eq ($paramCount + 1) } if (-not $methods) { throw "No RegisterScalarFunction overload found for $paramCount parameters" } $genericMethod = $methods | Select-Object -First 1 $closedMethod = $genericMethod.MakeGenericMethod($genericArgs) # Determine isPureFunction (opposite of SideEffects) $isPureFunction = -not $SideEffects # Invoke the closed generic method $methodParams = $closedMethod.GetParameters() if ($methodParams.Count -eq 4) { # Has the @params parameter (1-parameter version) $closedMethod.Invoke($this, @($Name, $actionDelegate, $isPureFunction, $false)) } else { $closedMethod.Invoke($this, @($Name, $actionDelegate, $isPureFunction)) } Write-Verbose "Registered scalar function '$Name' with $paramCount parameter(s)" } } function Add-DuckDBCreateTableFunctionMethod { <# .SYNOPSIS Registers a user-defined table function with the DuckDB connection. .DESCRIPTION The CreateTableFunction method registers a PowerShell scriptblock as a table-valued UDF in DuckDB. Unlike scalar functions that return a single value per row, table functions return entire result sets that can be queried like tables using SELECT * FROM my_table_function(). The scriptblock should return an enumerable collection (array, ArrayList, or objects) where each item represents a row. Rows can be: - PSCustomObjects with properties matching column names - Hashtables with keys matching column names - Arrays (positional mapping to columns) - .NET objects with properties matching column names .PARAMETER Name The name of the table function to register in DuckDB. This name will be used in SQL queries with the FROM clause: SELECT * FROM function_name() .PARAMETER Function A PowerShell scriptblock that returns the table data. The scriptblock may receive input parameters if Parameters is specified. It should return an enumerable collection of rows. .PARAMETER Columns A hashtable defining the output columns. Keys are column names, values are .NET types. Example: @{ 'id' = [int]; 'name' = [string]; 'value' = [double] } .PARAMETER Parameters Optional. An array of .NET types for input parameters. If specified, the table function can accept arguments: SELECT * FROM my_func('arg1', 123) .EXAMPLE PS> # Simple table function returning static data PS> $conn.CreateTableFunction( 'get_users', { @( [PSCustomObject]@{ id = 1; name = 'Alice'; active = $true } [PSCustomObject]@{ id = 2; name = 'Bob'; active = $false } [PSCustomObject]@{ id = 3; name = 'Charlie'; active = $true } ) }, @{ 'id' = [int]; 'name' = [string]; 'active' = [bool] } ) PS> $conn.sql("SELECT * FROM get_users() WHERE active = true") id name active -- ---- ------ 1 Alice True 3 Charlie True .EXAMPLE PS> # Table function reading from file system PS> $conn.CreateTableFunction( 'list_files', { param($path) Get-ChildItem -Path $path -File | ForEach-Object { [PSCustomObject]@{ name = $_.Name size = $_.Length modified = $_.LastWriteTime.ToString('yyyy-MM-dd HH:mm:ss') } } }, @{ 'name' = [string]; 'size' = [long]; 'modified' = [string] }, @([string]) ) PS> $conn.sql("SELECT * FROM list_files('C:\Temp') WHERE size > 1000000") .EXAMPLE PS> # Table function calling REST API PS> $conn.CreateTableFunction( 'get_github_repos', { param($username) $repos = Invoke-RestMethod "https://api.github.com/users/$username/repos" $repos | ForEach-Object { [PSCustomObject]@{ name = $_.name stars = $_.stargazers_count language = $_.language url = $_.html_url } } }, @{ 'name' = [string]; 'stars' = [int]; 'language' = [string]; 'url' = [string] }, @([string]) ) PS> $conn.sql("SELECT name, stars FROM get_github_repos('microsoft') ORDER BY stars DESC LIMIT 10") .EXAMPLE PS> # Table function returning environment variables PS> $conn.CreateTableFunction( 'get_env_vars', { [System.Environment]::GetEnvironmentVariables().GetEnumerator() | ForEach-Object { [PSCustomObject]@{ name = $_.Key; value = $_.Value } } }, @{ 'name' = [string]; 'value' = [string] } ) PS> $conn.sql("SELECT * FROM get_env_vars() WHERE name LIKE 'PATH%'") .EXAMPLE PS> # Table function with filtering done in SQL PS> $conn.CreateTableFunction( 'get_processes', { Get-Process | ForEach-Object { [PSCustomObject]@{ name = $_.ProcessName pid = $_.Id cpu = [double]$_.CPU memory_mb = [int]($_.WorkingSet64 / 1MB) } } }, @{ 'name' = [string]; 'pid' = [int]; 'cpu' = [double]; 'memory_mb' = [int] } ) PS> $conn.sql("SELECT name, memory_mb FROM get_processes() WHERE memory_mb > 100 ORDER BY memory_mb DESC") .NOTES Requires DuckDB.NET.Data.dll net8.0 build (v1.4.0 or later) for table function support. Performance Note: The entire result set is materialized when the function is called. For very large datasets, consider using DuckDB's native file reading functions instead. Supported column types: - [string] -> VARCHAR - [int] -> INTEGER - [long] -> BIGINT - [double] -> DOUBLE - [bool] -> BOOLEAN - [DateTime] -> TIMESTAMP #> # Load the C# helper class source code from external file $TableFunctionWrapperCode = Get-Content -Path (Join-Path $PSScriptRoot '..\cs\TableFunctionWrapper.cs') -Raw # Load or compile TableFunctionWrapper if (-not ([System.Management.Automation.PSTypeName]'PaperinikDB.TableFunctionWrapper').Type) { # Try to load from pre-compiled DLL first $tableDllPath = Join-Path $PSScriptRoot '..\lib\net8\PaperinikDB.TableFunctionWrapper.dll' if (Test-Path $tableDllPath) { try { Add-Type -Path $tableDllPath -ErrorAction Stop Write-Verbose 'TableFunctionWrapper loaded from DLL' } catch { Write-Verbose "Could not load TableFunctionWrapper DLL, will compile: $_" $tableDllPath = $null } } else { $tableDllPath = $null } # Compile from source if DLL not available if (-not $tableDllPath -or -not ([System.Management.Automation.PSTypeName]'PaperinikDB.TableFunctionWrapper').Type) { $duckDbDataAssembly = [DuckDB.NET.Data.DuckDBConnection].Assembly $duckDbBindingsAssembly = [DuckDB.NET.Native.NativeMethods].Assembly if ($duckDbDataAssembly -and $duckDbBindingsAssembly) { $referencedAssemblies = @( [System.Management.Automation.PSObject].Assembly.Location, [System.Object].Assembly.Location, [System.Collections.Generic.List[object]].Assembly.Location, [System.Collections.Generic.IList[object]].Assembly.Location, [System.Runtime.CompilerServices.DynamicAttribute].Assembly.Location, [Microsoft.CSharp.RuntimeBinder.Binder].Assembly.Location, $duckDbDataAssembly.Location, $duckDbBindingsAssembly.Location ) | Select-Object -Unique try { Add-Type -TypeDefinition $TableFunctionWrapperCode -ReferencedAssemblies $referencedAssemblies -IgnoreWarnings -ErrorAction Stop Write-Verbose 'TableFunctionWrapper compiled successfully' } catch { Write-Warning "Could not compile TableFunctionWrapper: $_" } } } } Update-TypeData -TypeName 'DuckDB.NET.Data.DuckDBConnection' -MemberType ScriptMethod -MemberName CreateTableFunction -Force -Value { param( [string]$Name, [scriptblock]$Function, [hashtable]$Columns, [type[]]$Parameters = @() ) if ($this.State -ne 'Open') { $this.Open() } # Build column info array in order $columnInfoList = [System.Collections.Generic.List[DuckDB.NET.Data.ColumnInfo]]::new() $columnTypes = [System.Collections.Generic.List[type]]::new() foreach ($col in $Columns.GetEnumerator()) { $columnInfoList.Add([DuckDB.NET.Data.ColumnInfo]::new($col.Key, $col.Value)) $columnTypes.Add($col.Value) } $columnInfoArray = $columnInfoList.ToArray() $columnTypesArray = $columnTypes.ToArray() # Create the wrapper instance with parameter types for reading input values $wrapper = [PaperinikDB.TableFunctionWrapper]::new($Function, $columnInfoArray, $columnTypesArray, $Parameters) # Create delegates based on whether we have input parameters $connectionType = $this.GetType() $paramCount = $Parameters.Count if ($paramCount -eq 0) { # No input parameters - use Func<TableFunction> $bindMethod = $wrapper.GetType().GetMethod('Bind') $bindFuncType = [System.Func[DuckDB.NET.Data.TableFunction]] $bindDelegate = [System.Delegate]::CreateDelegate($bindFuncType, $wrapper, $bindMethod) # Create mapper delegate: Action<object, IDuckDBDataWriter[], ulong> $mapMethod = $wrapper.GetType().GetMethod('MapRow') $writerArrayType = [DuckDB.NET.Data.DataChunk.Writer.IDuckDBDataWriter[]] $mapActionType = [System.Action`3].MakeGenericType([object], $writerArrayType, [ulong]) $mapDelegate = [System.Delegate]::CreateDelegate($mapActionType, $wrapper, $mapMethod) # Find the 0-param RegisterTableFunction method $regMethod = $connectionType.GetMethods() | Where-Object { $_.Name -eq 'RegisterTableFunction' -and $_.GetGenericArguments().Count -eq 0 } | Select-Object -First 1 if (-not $regMethod) { throw 'No RegisterTableFunction overload found for 0 parameters' } $regMethod.Invoke($this, @($Name, $bindDelegate, $mapDelegate)) } else { # With input parameters - use Func<IReadOnlyList<IDuckDBValueReader>, TableFunction> $bindMethod = $wrapper.GetType().GetMethod('BindWithParams') $readerListType = [System.Collections.Generic.IReadOnlyList[DuckDB.NET.Native.IDuckDBValueReader]] $bindFuncType = [System.Func`2].MakeGenericType($readerListType, [DuckDB.NET.Data.TableFunction]) $bindDelegate = [System.Delegate]::CreateDelegate($bindFuncType, $wrapper, $bindMethod) # Create mapper delegate: Action<object, IDuckDBDataWriter[], ulong> $mapMethod = $wrapper.GetType().GetMethod('MapRow') $writerArrayType = [DuckDB.NET.Data.DataChunk.Writer.IDuckDBDataWriter[]] $mapActionType = [System.Action`3].MakeGenericType([object], $writerArrayType, [ulong]) $mapDelegate = [System.Delegate]::CreateDelegate($mapActionType, $wrapper, $mapMethod) # Find the correct RegisterTableFunction overload based on parameter count $regMethods = $connectionType.GetMethods() | Where-Object { $_.Name -eq 'RegisterTableFunction' -and $_.GetGenericArguments().Count -eq $paramCount } if (-not $regMethods) { throw "No RegisterTableFunction overload found for $paramCount parameters" } $genericMethod = $regMethods | Select-Object -First 1 $closedMethod = $genericMethod.MakeGenericMethod($Parameters) $closedMethod.Invoke($this, @($Name, $bindDelegate, $mapDelegate)) } Write-Verbose "Registered table function '$Name' with $($columnInfoArray.Count) column(s) and $paramCount parameter(s)" } } function Add-DuckDBRemoveFunctionMethod { <# .SYNOPSIS Removes a user-defined function from the DuckDB connection. .DESCRIPTION The RemoveFunction method unregisters a previously registered UDF from the DuckDB connection using the SQL DROP FUNCTION statement. After removal, the function can no longer be called in SQL queries. Use this to clean up functions or to re-register them with different logic. .PARAMETER Name The name of the function to remove. Function names are case-insensitive. .EXAMPLE PS> $conn.RemoveFunction('getenv') Removes the 'getenv' function from the connection. .EXAMPLE PS> # Re-register a function with updated logic PS> $conn.RemoveFunction('calculate_tax') PS> $conn.CreateFunction( 'calculate_tax', { param([double]$amount) [Math]::Round($amount * 0.0925, 2) }, # New tax rate @([double]), [double] ) .EXAMPLE PS> # Clean up all custom functions before closing PS> @('getenv', 'slugify', 'is_valid_email') | ForEach-Object { $conn.RemoveFunction($_) } PS> $conn.CloseDB() .NOTES This method uses DROP FUNCTION IF EXISTS, so it won't error if the function doesn't exist. #> Update-TypeData -TypeName 'DuckDB.NET.Data.DuckDBConnection' -MemberType ScriptMethod -MemberName RemoveFunction -Force -Value { param([string]$Name) if ($this.State -ne 'Open') { $this.Open() } # DuckDB doesn't have a direct unregister method, so we use SQL DROP FUNCTION $this.sql("DROP FUNCTION IF EXISTS $Name") } } function Add-DuckDBGetAssemblyVersionsMethod { <# .SYNOPSIS Retrieves the versions of the DuckDB.NET assemblies loaded by the module. .DESCRIPTION The GetAssemblyVersions method returns the versions of the DuckDB.NET assemblies (DuckDB.NET.Data.dll and DuckDB.NET.Bindings.dll) that are loaded in the current PowerShell session. This is useful for verifying which version of DuckDB.NET is being used and for troubleshooting compatibility issues. .OUTPUTS PSCustomObject[] An array of custom objects with Assembly and Version properties. .EXAMPLE PS> $conn.GetAssemblyVersions() Assembly Version --------- ------- C:\path\to\DuckDB.NET.Data.dll 1.0.2.0 C:\path\to\DuckDB.NET.Bindings.dll 1.0.2.0 Returns the versions of the loaded DuckDB.NET assemblies. .NOTES This method inspects the assemblies loaded from the module's lib directory. #> Update-TypeData -TypeName 'DuckDB.NET.Data.DuckDBConnection' -MemberType ScriptMethod -MemberName GetAssemblyVersions -Force -Value { Get-ChildItem -File -Filter 'DuckDB.NET.*' -Path (Get-Module PaperinikDB | Select-Object @{ Name = 'LibPath' ; Expression = { Join-Path (Split-Path $_.Path) 'lib' } }).LibPath | ForEach-Object { [pscustomobject]@{Assembly = ($_.FullName); Version = ([Reflection.Assembly]::LoadFile("$($_.FullName)").GetName().Version) } } } } # Initialize the script methods on the DuckDBConnection type Add-DuckDBSqlMethod Add-DuckDBCloseDBMethod Add-DuckDBCreateFunctionMethod Add-DuckDBCreateTableFunctionMethod Add-DuckDBRemoveFunctionMethod Add-DuckDBGetAssemblyVersionsMethod |