What is a "Scalar" in PowerShell

The term scalar is often used in PowerShell issues and documentation, for example in the about_Comparison_Operators document. I think I have an abstract understanding of its meaning (in fact, I use the word quite often myself), but I am unsure about the concrete PowerShell definition.

Scalar data type (Wikipedia):

A scalar data type, or just scalar, is any non-composite value.

Converting from scalar types (about_Booleans)

A scalar type is an atomic quantity that can hold only one value at a time.

But how exactly would I (script wise) check for a scalar in PowerShell?
E.g., should a DateTime struct or a collection with a single item (such as [Int[]]1) be considered a scalar?
There is no concrete scalar type or interface definition that would allow something like:

If ($Something -is [Scalar]) { ...

So, I guess the concrete PowerShell definition is something like:

$IsScalar =
    ($_ -isnot [Management.Automation.PSCustomObject]) -and
    ($_ -isnot [ComponentModel.Component]) -and           
    ($_ -isnot [Collections.IDictionary]) -and  # Probably covered by ICollection
    ($_ -isnot [Collections.ICollection])         

But I am not sure whether that actually covers it.

Answer by mklement0 (best answer):

To add to the helpful information in your question and in the comments:

There are two, context-dependent definitions of what you might loosely call "scalar" in PowerShell, and while they technically differ, the difference typically doesn't matter:

  • Note:

    • What the two definitions have in common is the following loose definition: a "scalar" is any object that isn't primarily a collection of other objects.
  • In enumeration contexts, notably in the pipeline, and on the LHS of comparison operators, among others:

    • A "scalar" is any object that cannot be or isn't auto-enumerated; that is, it is treated as a single object.

    • The behavior is primarily based on whether a type implements the IEnumerable interface, with selective, hard-coded exceptions; an informal summary is:

      • .NET enumerables except strings, dictionaries, and XML nodes are auto-enumerated.
      • Additionally - even though they are not enumerables per se - so are data tables and enumerators.
      • Any other object is a "scalar", i.e. processed as itself.
    • See the next section for details.

  • In to-Boolean coercions (conversions):

    • A "scalar" is any object that isn't list-like, as exclusively determined by whether its type implements the IList interface.[1]

    • A quick summary of the to-Boolean coercion (conversion) logic (which applies to implicit coercions, such as in conditionals, as well as to explicit ones with a [bool] cast):

      • List-like objects that contain no elements are always $false, those that contain only one object (element) are treated as that object (see below), whereas those with two or more elements are always $true, irrespective of what elements they contain; e.g., [bool] @($false) is $false, but [bool] @($false, $false) is $true

      • The following yield $false:

        • A numeric 0 value, regardless of the specific numeric type
        • The empty string (e.g. '')
        • $null, including the "enumerable null", i.e. the special singleton[2] that signals "no output" from commands, but is treated like $null in expressions; e.g. both [bool] $null and [bool] (Get-ChildItem NoSuchFiles*) yield $false.
      • Any other object yields $true, notably including:

        • any nonempty string, irrespective of its content, e.g. [bool] 'foo' and [bool] 'false'
        • any non-string, non-.NET primitive instance, e.g.
          [bool] [pscustomobject] @{},
          • including any nonempty list-like object that happens to be the only element of a list-like object; in other words: the treat-a-single-element-list-as-that-element logic is not applied recursively; e.g.
            [bool] (, @($false)) is $true (but a nested empty list is $false, e.g. [bool] (, @()))
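
The coercion rules above can be checked directly at the prompt. The following is a minimal sketch that collects the examples from the list. Test-IsBoolScalar is a hypothetical helper name introduced here for illustration only; it merely restates the IList test described above and is not a built-in command.

```powershell
# Hypothetical helper (not a built-in): for to-Boolean purposes an object is
# a "scalar" if its type does not implement the non-generic IList interface.
function Test-IsBoolScalar {
  param([object] $InputObject)
  $InputObject -isnot [System.Collections.IList]
}

# List-like objects: the element count decides.
$emptyList   = [bool] @()                # $false: no elements
$singleList  = [bool] @($false)          # $false: treated as its one element
$doubleList  = [bool] @($false, $false)  # $true:  two or more elements
$nestedList  = [bool] (, @($false))      # $true:  unwrapping is not recursive
$nestedEmpty = [bool] (, @())            # $false: nested empty list

# Scalars: only numeric 0, the empty string, and $null are $false.
$zero        = [bool] 0.0                    # $false
$emptyString = [bool] ''                     # $false
$nullValue   = [bool] $null                  # $false
$falseString = [bool] 'false'                # $true: any nonempty string
$customObj   = [bool] [pscustomobject] @{}   # $true

# The helper applied to two sample values.
$arrayIsScalar  = Test-IsBoolScalar @(1, 2)  # $false: arrays implement IList
$stringIsScalar = Test-IsBoolScalar 'foo'    # $true
```

Storing the results in variables is only a convenience for inspecting them afterwards; the casts behave the same way inside an if conditional.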

The - comparatively rare - scenarios where the two definitions of "scalar" do make a difference:

  • Since the IList interface (ultimately) derives from IEnumerable, and none of the IEnumerable exceptions in enumeration contexts implement IList, the difference comes down to the following scenarios:

  • IEnumerable-implementing types that do not also implement IList, e.g. the objects returned by [System.Linq.Enumerable]::Range():

     # NON-scalar in the pipeline:
     # Auto-enumeration -> nothing is sent through the pipeline,
     # because the enumeration is *empty*
     [System.Linq.Enumerable]::Range(0, 0) | Measure-Object # -> 0
    
     # SCALAR in to-Boolean conversion:
     # The object doesn't implement [IList], and since it is
     # not a .NET primitive type, it is implicitly $true.
     [bool] [System.Linq.Enumerable]::Range(0, 0)  # -> !! $true
    
     # Note that if you force enumeration via @(...),
     # which collects the enumerated objects in an [object[]] array,
     # the [IList] logic does apply:
     [bool] @([System.Linq.Enumerable]::Range(0, 0)) # -> $false
    
  • IEnumerator-implementing types; e.g., the object returned by an explicit .GetEnumerator() call:

     # NON-scalar in the pipeline:
     # Auto-enumeration -> nothing is sent through the pipeline.
     @{}.GetEnumerator() | Measure-Object # -> 0
    
     # SCALAR in to-Boolean conversion:
     # [IEnumerator] doesn't implement [IList], and since it is
     # not a .NET primitive type, it is implicitly $true.
     [bool] @{}.GetEnumerator()  # -> !! $true
    
  • System.Data.DataTable, the only type that is neither an enumerable nor an enumerator and that PowerShell nevertheless auto-enumerates in enumeration contexts:

     # NON-scalar in the pipeline:
     # Auto-enumeration -> nothing is sent through the pipeline,
     # because the data table has *no rows*.
     [System.Data.DataTable]::new() | Measure-Object # -> 0
    
     # SCALAR in to-Boolean conversion:
     # [System.Data.DataTable] doesn't implement [IList], 
     # and since it is not a .NET primitive type, it is implicitly $true.
     [bool] [System.Data.DataTable]::new()  # -> !! $true
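
To see the two definitions side by side, the type tests can be combined into one report. This is only a sketch: Get-ScalarView is a hypothetical helper name, the pipeline test repeats the type checks of the Test-Enumerability function in this answer, and the generic Queue sample is an assumption chosen here because it is an enumerable that does not implement IList.

```powershell
# Hypothetical helper (not a built-in): reports how each of the two
# definitions classifies a given object.
function Get-ScalarView {
  param([object] $InputObject)

  # Enumeration-context view: same type tests as Test-Enumerability.
  $autoEnumerated = (
    $InputObject -is    [System.Collections.IEnumerable] -and
    $InputObject -isnot [System.Collections.IDictionary] -and
    $InputObject -isnot [string]                         -and
    $InputObject -isnot [System.Xml.XmlNode]
  ) -or
    $InputObject -is [System.Data.DataTable] -or
    $InputObject -is [System.Collections.IEnumerator]

  [pscustomobject] @{
    Type             = $InputObject.GetType().Name
    ScalarInPipeline = -not $autoEnumerated
    # To-Boolean view: only IList implementers are list-like.
    ScalarInBoolCast = $InputObject -isnot [System.Collections.IList]
  }
}

$arrayView  = Get-ScalarView (1, 2, 3)   # non-scalar in both views
$stringView = Get-ScalarView 'abc'       # scalar in both views
$hashView   = Get-ScalarView @{ a = 1 }  # scalar in both views
# Assumed sample: Queue`1 is an enumerable that does not implement IList.
$queueView  = Get-ScalarView ([System.Collections.Generic.Queue[int]]::new())
$tableView  = Get-ScalarView ([System.Data.DataTable]::new())
```

Only the last two samples are classified differently by the two definitions: they are auto-enumerated in the pipeline, yet treated as scalars (and therefore $true) in a to-Boolean conversion.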
    

"Scalars" in enumeration contexts:

Enumeration contexts are:

  • The pipeline - as also implicitly used by a single command, e.g. Get-ChildItem $HOME

  • Select operators, notably the LHS of comparison operators.

  • The inputs to select language statements (keyword-based statements), represented as … below:

    • foreach (foreach ($elem in …) { <# body #> })

    • switch (switch (…) { <# body #> })
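
A short sketch of how the contexts listed above treat an array versus a hashtable or a string; the counter variable names are illustrative only.

```powershell
# Pipeline: the array is sent element by element, the hashtable as a whole.
$pipelineArray = (1, 2, 3 | Measure-Object).Count            # 3
$pipelineHash  = (@{ a = 1; b = 2 } | Measure-Object).Count  # 1

# foreach: three iterations for the array ...
$foreachArray = 0
foreach ($elem in 1, 2, 3) { $foreachArray++ }

# ... but a single iteration for the hashtable, which is a "scalar" here.
$foreachHash = 0
foreach ($elem in @{ a = 1; b = 2 }) { $foreachHash++ }

# switch: the default branch runs once per enumerated input object.
$switchArray = 0
switch (1, 2, 3) { default { $switchArray++ } }    # runs 3 times

$switchString = 0
switch ('abc') { default { $switchString++ } }     # runs once: strings are "scalars"
```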

In any such enumeration context, a "scalar" is any object that cannot be or isn't auto-enumerated.

Auto-enumeration means:

  • Instead of processing the object itself, process the objects its enumeration returns, one by one, (ultimately) via the IEnumerable interface (e.g., enumeration of an array returns its elements).

That is:

  • It doesn't matter whether a "scalar" object is a composite value (has properties) or not (e.g. a .NET primitive type such as [int]).

  • While whether an instance's type implements the IEnumerable interface (which makes the instance enumerable on demand from .NET's perspective) is the basis for PowerShell's decision whether to auto-enumerate or not, there are selective, hard-coded exceptions, detailed below.

    • To get PowerShell to enumerate instances of such types on demand, a prominent example of which are hashtables, their .GetEnumerator() method must be called explicitly; e.g., the following enumerates the key-value pairs (System.Collections.DictionaryEntry instances) that make up the sample hashtable and stringifies each; without .GetEnumerator(), the hashtable instance as a whole would be sent to the pipeline:
      @{ foo = 1; bar = 2 }.GetEnumerator() | ForEach-Object ToString

    • The fact that this works implies yet another non-"scalar": any object whose type implements the IEnumerator interface, as returned by the .GetEnumerator() method of IEnumerable-implementing types.[3]
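
The effect of calling .GetEnumerator() explicitly can be made visible by counting what arrives in the pipeline. A minimal sketch, using a sample hashtable and a sample string chosen here for illustration:

```powershell
$hash = @{ foo = 1; bar = 2 }

# Without .GetEnumerator(): the hashtable arrives as one object.
$wholeCount = ($hash | Measure-Object).Count                  # 1

# With .GetEnumerator(): the enumerator is auto-enumerated,
# so each entry arrives separately.
$entryCount = ($hash.GetEnumerator() | Measure-Object).Count  # 2
$allEntries = @($hash.GetEnumerator() |
  Where-Object { $_ -is [System.Collections.DictionaryEntry] }).Count  # 2

# The same applies to strings: one object as-is, one object per character
# via the explicit enumerator.
$stringCount = ('abc' | Measure-Object).Count                 # 1
$charCount   = ('abc'.GetEnumerator() | Measure-Object).Count # 3
```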

The following function encapsulates the exact logic PowerShell uses to determine automatic enumerability;[4] it accepts any object and indicates whether PowerShell would auto-enumerate it in enumeration contexts.

function Test-Enumerability {
  [CmdletBinding()]
  param (
    [Parameter(Mandatory)]
    [object] $InputObject
  )

  (
    $InputObject -is    [System.Collections.IEnumerable] -and
    $InputObject -isnot [System.Collections.IDictionary] -and
    $InputObject -isnot [string]                         -and 
    $InputObject -isnot [System.Xml.XmlNode]
  ) -or
    $InputObject -is    [System.Data.DataTable]          -or
    $InputObject -is    [System.Collections.IEnumerator]
  
}

An informal summary of the above:

  • .NET enumerables except strings, dictionaries,[5] and XML nodes are auto-enumerated.

  • Additionally - even though they are not enumerables per se - the following are automatically enumerated too:

    • data tables, by their rows (.Rows property)
    • enumerators (i.e. objects that enumerate enumerables)
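
The data-table case can be verified with a small table built in memory. This is a sketch under stated assumptions: the column name Id and the two sample rows are made up for illustration.

```powershell
# Build a table with one (illustrative) integer column and two rows.
$table = [System.Data.DataTable]::new()
$null = $table.Columns.Add('Id', [int])
foreach ($id in 1, 2) {
  $row = $table.NewRow()
  $row['Id'] = $id
  $table.Rows.Add($row)
}

# Sending the table itself through the pipeline yields its rows,
# just as sending its .Rows collection does.
$viaTable = ($table | Measure-Object).Count        # 2
$viaRows  = ($table.Rows | Measure-Object).Count   # 2
$rowItems = @($table | Where-Object { $_ -is [System.Data.DataRow] }).Count  # 2
```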

[1] Here's the link to the source code, as of this writing (this logic is highly unlikely to change, however).

[2] This special singleton is [System.Management.Automation.Internal.AutomationNull]::Value; for more information, see this answer.

[3] The same applies to types implementing the generic counterparts of these interfaces, IEnumerator`1 and IEnumerable`1, given that these derive from their non-generic cousins.

[4] Here's the link to the source code, as of this writing (this logic is highly unlikely to change, however).

[5] Note that PowerShell only tests for the non-generic dictionary interface, System.Collections.IDictionary, not also for its generic counterpart, System.Collections.Generic.IDictionary`2. Since the latter does not derive from the former - unlike in the IEnumerable / IEnumerable`1 pair - types that implement only the generic interface unexpectedly are auto-enumerated; a prominent example is System.Dynamic.ExpandoObject; see GitHub issue #15204 for a discussion of this problematic behavior.
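
The difference described in footnote [5] shows up when empty instances are sent through the pipeline. A minimal sketch; the generic Dictionary sample is an assumption added here because that type implements both the generic and the non-generic dictionary interface.

```powershell
# Hashtable implements the non-generic IDictionary: sent as one object.
$hashtableCount = (@{} | Measure-Object).Count    # 1

# Dictionary`2 implements the non-generic IDictionary as well,
# so it is also sent as one object.
$dictCount = ([System.Collections.Generic.Dictionary[string, int]]::new() |
  Measure-Object).Count                           # 1

# ExpandoObject implements only the generic interface, so it is auto-enumerated;
# an empty instance therefore sends nothing through the pipeline.
$expandoCount = ([System.Dynamic.ExpandoObject]::new() | Measure-Object).Count  # 0
```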