en-US/about_binshred.help.txt

TOPIC
    about_binshred
     
SHORT DESCRIPTION
    Describes the syntax and usage of the ConvertFrom-BinaryData cmdlet.
     
LONG DESCRIPTION
    The ConvertFrom-BinaryData cmdlet is a general-purpose cmdlet for parsing
    binary files and content. To direct the parsing of this binary data, you
    describe the file format using a simple text-based template structure.
     
    Its default alias is "binshred".
     
    Most binary file formats are structured into a series of conceptual
    regions: for example, a header, followed by a body, followed by some data
    rows, followed by a footer. These regions usually have properties: a
    header might have a few "magic bytes", followed by a length field,
    followed by a version number.
     
 A simple example
    Consider a simple example of the following binary content:
     
    PS C:\> Format-Hex words.bin
 
               Path: C:\words.bin
 
               00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
 
    00000000   4C 48 02 00 00 00 05 00 00 00 48 65 6C 6C 6F 05  LH........Hello.
    00000010   00 00 00 57 6F 72 6C 64                          ...World
     
    From either documentation or investigation, we've determined that the file
    format has two main portions: a header, followed by a list of words. The
    header starts with a 2-byte ASCII magic signature, followed by a 4-byte
    unsigned integer holding the number of words. After that, each word entry
    has a 4-byte unsigned integer holding the word length, followed by the
    word itself (of that length) in UTF8.
     
    A BinShred Template (.bst) for this file looks like this:
     
        header :
            magic (2 bytes as ASCII)
            wordCount (4 bytes as UINT32)
            words (wordCount items);
        words :
            wordLength (4 bytes as UINT32)
            word (wordLength bytes as UTF8);
     
    A region is identified by a name followed by a colon. Within a region, you
    identify properties by writing each property name followed by the length
    and data type of that property. A semicolon marks the end of a region.
     
    When you supply this template to the ConvertFrom-BinaryData cmdlet, the
    result is an object that represents the data structures contained in that
    binary file.
     
    PS > binshred -Path .\words.bin -TemplatePath .\wordParser.bst
 
    Name       Value
    ----       -----
    magic      LH
    wordCount  2
    words      (...)
 
    PS > (binshred -Path .\words.bin -TemplatePath .\wordParser.bst).Words[0]
 
    Name        Value
    ----        -----
    wordLength  5
    word        Hello
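
    For comparison, the same layout can be parsed by hand. The following
    Python sketch is not part of BinShred. It assumes little-endian integers
    (consistent with the bytes 02 00 00 00 producing a wordCount of 2 above),
    and the name parse_words is made up for illustration.

```python
import struct

# The 24 bytes shown in the Format-Hex output above.
WORDS_BIN = bytes.fromhex(
    "4C48" "02000000"
    "05000000" "48656C6C6F"
    "05000000" "576F726C64"
)

def parse_words(content):
    """Hand-written equivalent of the header/words template (illustrative only)."""
    offset = 0
    # magic (2 bytes as ASCII)
    magic = content[offset:offset + 2].decode("ascii")
    offset += 2
    # wordCount (4 bytes as UINT32), assumed little-endian
    (word_count,) = struct.unpack_from("<I", content, offset)
    offset += 4
    # words (wordCount items)
    words = []
    for _ in range(word_count):
        (word_length,) = struct.unpack_from("<I", content, offset)
        offset += 4
        word = content[offset:offset + word_length].decode("utf-8")
        offset += word_length
        words.append({"wordLength": word_length, "word": word})
    return {"magic": magic, "wordCount": word_count, "words": words}

result = parse_words(WORDS_BIN)
print(result["magic"], result["wordCount"], result["words"][0]["word"])
```

    Each step in the function corresponds to one property line in the
    template, which is the bookkeeping the template lets you avoid writing.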
     
 Supported features in a BinShred Template
  
    Whitespace / Capitalization
        BinShred templates are not sensitive to whitespace or capitalization.
        Newlines or spaces can be added as desired.
         
    //
        A single-line comment. This type of comment does not appear in the
        parsed result objects, unlike the documentation comments described below.
         
    /* */
        A block comment. This type of comment does not appear in the
        parsed result objects, unlike the documentation comments described below.
         
    LABEL : ... ;
        A region of data to be parsed.
 
        The LABEL of the region can be any name of your choice. The colon is
        mandatory, as is the trailing semicolon. The region between the colon
        and semicolon represents properties of that region. The LABEL will be
        used as a property name on the resulting parsed object.
         
    PROPERTY (BYTES bytes as DATATYPE described by LOOKUPTABLE)
        A property definition within a region.
 
        The name of the property ("PROPERTY") can be any name of your choice.
        The parentheses (and their contents) are optional. Without parentheses,
        the property will be treated as a nested property definition, and
        BinShred will look for a LABEL of that name to continue processing.

        If you include parentheses, their contents provide instructions on how
        to interpret that property.
 
        The byte count ("BYTES") is mandatory. It is usually either an
        absolute number (10 bytes ...), or a reference to a property that has
        already been parsed - for example "( header.ByteCount bytes ... )".
         
        You can also specify a native (C#) expression for this value. The
        native expression can refer to properties that would have already been
        parsed, and must return an integer - for example
        "( { return (letterCount * 2); } bytes as Unicode )".
         
        Specifying a byte count with a native expression is much slower than
        specifying it with a direct byte count or property reference, so you
        should avoid it if possible.
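
        To illustrate what a computed byte count such as (letterCount * 2)
        means, here is a rough Python sketch. It is not BinShred code. It
        assumes that "Unicode" corresponds to UTF-16 little-endian (as
        .NET's Encoding.Unicode does) and that letterCount is stored as a
        4-byte little-endian integer. The function name is hypothetical.

```python
import struct

def read_unicode_field(content, offset):
    """Read a UINT32 letterCount, then letterCount * 2 bytes of UTF-16LE text."""
    (letter_count,) = struct.unpack_from("<I", content, offset)
    offset += 4
    # The value a native expression like { return (letterCount * 2); } would yield.
    byte_count = letter_count * 2
    text = content[offset:offset + byte_count].decode("utf-16-le")
    return text, offset + byte_count

# Synthetic sample: letterCount = 2, followed by "Hi" encoded as UTF-16LE.
sample = struct.pack("<I", 2) + "Hi".encode("utf-16-le")
text, next_offset = read_unicode_field(sample, 0)
print(text, next_offset)
```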
 
        The optional "as DATATYPE" section of the parsing instruction
        describes how to interpret these bytes. If not specified, the property
        will use an array of bytes as its data type. Supported data types are:
 
            ASCII, UNICODE, UTF8, UINT64, UINT32, UINT16
            INT64, INT32, INT16, SINGLE, FLOAT, DOUBLE
             
        You can also specify a native (C#) expression for the interpretation
        of these bytes. The native expression can refer to properties that have
        already been parsed. In addition to the properties that have already
        been parsed, three parameters are available to the native expression:
         
            _content: The byte array representing the entire binary content
                being parsed.
            _contentPosition: The current position in the binary content
                being parsed.
            _byteCount: The number of bytes to be parsed, as specified (or
                dynamically evaluated) by the byte count property.
                 
        For example, you could write a native expression to parse a series of
        bytes as ASCII like this:
         
            (4 bytes as {
                return Encoding.ASCII.GetString(_content, _contentPosition, _byteCount);
            })
             
        Specifying a data interpretation with a native expression is much
        slower than specifying a data type directly, so you should avoid it
        if possible.
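
        For readers less familiar with C#, the ASCII expression above reads
        a slice of the content starting at the current position. A rough
        Python analogue (illustrative only; the parameter names mirror
        _content, _contentPosition, and _byteCount) might look like this:

```python
def interpret_ascii(content, content_position, byte_count):
    """Decode byte_count bytes of content, starting at content_position, as ASCII.

    Unlike Encoding.ASCII.GetString, this raises on bytes above 0x7F.
    """
    end = content_position + byte_count
    return content[content_position:end].decode("ascii")

# "LH" followed by unrelated bytes, as in the words.bin example.
print(interpret_ascii(b"LH\x02\x00\x00\x00", 0, 2))
```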
             
        The optional "described by LOOKUPTABLE" section of the parsing
        instruction lets you define a lookup table that maps this property
        value to a more meaningful description. This description will be
        included as a "PROPERTY.description" property.
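
        Conceptually, a description lookup is a mapping from parsed values
        to text. The Python sketch below is only an illustration of that
        idea, not BinShred's implementation. The describe function is
        hypothetical, and returning None for an unknown value is an
        assumption.

```python
# Entries copied from the headerFieldType table in the complex example below.
HEADER_FIELD_TYPE = {
    "BM": "Windows Bitmap",
    "BA": "OS/2 struct bitmap array",
    "CI": "OS/2 struct color icon",
    "CP": "OS/2 const color pointer",
    "IC": "OS/2 struct icon",
    "PT": "OS/2 pointer",
}

def describe(value, lookup_table):
    """Pair a parsed value with its description from the lookup table.

    Returning None for an unknown value is an assumption of this sketch,
    not documented BinShred behavior.
    """
    return {"value": value, "description": lookup_table.get(value)}

print(describe("BM", HEADER_FIELD_TYPE))
```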
 
    PROPERTY (COUNT items)
        A property that is an array of items.
         
        The parentheses are mandatory, as is the COUNT field. The COUNT field
        may be either an absolute number (4 items), or a reference to a
        property that has already been parsed - for example
        "( header.ItemCount items )". You must also define a parsing
        rule that matches this property name to describe the data format of the
        property items.
         
    /** Comment */
        A documentation comment.
         
        If you include this above a property definition, this comment will
        be included as a "PROPERTY.description" property for that region.
        If you include this above a lookup table definition, this comment
        will be added to the "PROPERTY.description" field of the property
        being described by the lookup table.
         
    (Additional properties identified by PROPERTY from LOOKUPTABLE)
        A property inclusion rule.
         
        This is useful when you have a data structure that changes based on
        the value of a property that you've already parsed. For example, a
        'version' property might imply different properties for different
        versions. These additional properties will be included as sibling
        properties of the current region, rather than nested regions.
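
        The idea behind a property inclusion rule can be sketched in Python
        as a dispatch on an already-parsed value. This is an illustration
        under assumptions, not BinShred's implementation: the function names
        are hypothetical, integers are assumed little-endian, and only the
        first two fields of one header variant are handled.

```python
import struct

def parse_bitmap_info_header(content, offset):
    """Parse only bitmapWidth and bitmapHeight, to keep the sketch short."""
    width, height = struct.unpack_from("<ii", content, offset)
    return {"bitmapWidth": width, "bitmapHeight": height}

# Stand-in for a lookup table: parsed value -> parser for the extra properties.
BITMAP_TYPE = {40: parse_bitmap_info_header}

def parse_dib_header(content, offset=0):
    (header_size,) = struct.unpack_from("<I", content, offset)
    region = {"headerSize": header_size}
    parser = BITMAP_TYPE.get(header_size)
    if parser is not None:
        # The extra properties become siblings of headerSize, not a nested region.
        region.update(parser(content, offset + 4))
    return region

# Synthetic bytes: headerSize = 40, then a 640 x 480 bitmap.
sample = struct.pack("<Iii", 40, 640, 480)
print(parse_dib_header(sample))
```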
 
    (Padding to multiple of BYTES bytes)
        A property padding rule within a data region.
         
        The byte count ("BYTES") is mandatory. It may be either an absolute
        number (10 bytes ...), or refer to a property that would have
        already been parsed - for example "( header.ByteCount bytes ... )".
         
        This is useful when you have a region within a data structure that
        must be a multiple of a specified number of bytes - even when the
        properties within that region don't consume that many bytes. The
        remainder is called padding, or sometimes alignment.
         
        For example, in the bitmap file format, each row of pixel data must
        be a multiple of four bytes. If the pixel data itself (3 bytes
        for each pixel) doesn't consume a multiple of four bytes, then you
        can use a padding rule to ensure that it does.
         
        You could write the 'rows' data region this way:
         
            rows :
                pixels (bitmap.dibHeader.bitmapWidth items)
                (padding to multiple of 4 bytes);
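
        The arithmetic behind such a padding rule is a round-up to the next
        multiple. A small Python sketch (function names are hypothetical,
        and 3 bytes per pixel is taken from the example above):

```python
def padding_bytes(consumed, multiple):
    """Bytes needed to round `consumed` up to the next multiple of `multiple`."""
    return (-consumed) % multiple

def padded_row_size(width_pixels, bytes_per_pixel=3, multiple=4):
    """Size in bytes of one row of pixels, including trailing padding."""
    row = width_pixels * bytes_per_pixel
    return row + padding_bytes(row, multiple)

for width in (1, 2, 3, 4):
    print(width, padded_row_size(width))
```

        A row of 2 pixels, for instance, uses 6 bytes of pixel data and
        needs 2 bytes of padding to reach 8.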
         
         
    LOOKUPTABLE : VALUE : LABEL ;
        A lookup table for property inclusion rules.
         
        This form of lookup table is used to identify the region / definition
        that should be used to parse the rest of the data in this region. The
        beginning colon and trailing semicolon are mandatory.
         
        The "VALUE : LABEL" pair can be repeated. Each new pair should be
        placed on its own line for clarity, although this is not required.
         
        Values can be strings, integers, hexadecimal constants, or arrays of
        these three data types.
         
    LOOKUPTABLE : VALUE : "Description" ;
        A lookup table for property descriptions.
         
        This form of lookup table is used to add additional context-sensitive
        documentation to property values when a rule uses the
        "described by LOOKUPTABLE" feature. The beginning colon and
        trailing semicolon are mandatory.
         
        The 'VALUE : "Description"' pair can be repeated. Each new pair should
        be placed on its own line for clarity, although this is not required.
     
 A complex example
  
    The following BinShred template demonstrates many of these concepts by
    parsing simple Windows bitmap files:
     
    // A bitmap file
    bitmap :
            /** The bitmap header */
            header
 
            /** The Device Independent Bitmap header */
            dibHeader
 
            /** The color table */
            colorTable
 
            /** The pixel data */
            pixelData
        ;
 
    header:
            /** The bitmap type */
            headerField (2 bytes as ASCII described by headerFieldType)
 
            /** The size of the entire file */
            fileSize (4 bytes as UINT32)
 
            /** Application specific */
            reserved1 (2 bytes)
 
            /** Application specific */
            reserved2 (2 bytes)
 
            /** Offset to the start of the image bytes */
            imageDataOffset (4 bytes as UINT32)
        ;
 
    headerFieldType :
            BM : "Windows Bitmap"
            BA : "OS/2 struct bitmap array"
            CI : "OS/2 struct color icon"
            CP : "OS/2 const color pointer"
            IC : "OS/2 struct icon"
            PT : "OS/2 pointer"
        ;
         
    dibHeader:
            /** The size of the DIB header */
            headerSize (4 bytes as UINT32)
 
            (additional properties identified by headerSize from bitmapType)
        ;
 
    bitmapType :
            /** Windows 2.0 or later / OS/2 1.x */
            12 : bitmapCoreHeader
 
            /** OS/2 BITMAPCOREHEADER2 - Adds halftoning. */
            64 : os22xBitmapHeader
 
            /** Windows NT, 3.1x or later - Adds 16 bpp and 32 bpp formats. */
            40 : bitmapInfoHeader
 
            /** Undocumented - adds RGB bit masks */
            52 : bitmapV2Header
 
            /** Bitmap with alpha mask */
            56 : bitmapV3Header
 
            /** Windows NT 4.0, 95 or later */
            108 : bitmapV4Header
 
            /** Windows NT 5.0, 98 or later - Adds ICC color profiles */
            124 : bitmapV5Header
        ;
 
    bitmapInfoHeader :
            /** bitmap width in pixels */
            bitmapWidth (4 bytes as INT32)
 
            /** bitmap height in pixels */
            bitmapHeight (4 bytes as INT32)
 
            /** number of color planes. Must be 1. */
            colorPlanes (2 bytes as UINT16)
 
            /** number of bits per pixel, which is the color depth of the image. */
            bitsPerPixel (2 bytes as UINT16)
 
            /** compression method */
            compressionMethod (4 bytes as UINT32 described by compressionMethod)
 
            /** image size. This is the size of the raw bitmap data */
            imageSize (4 bytes as UINT32)
 
            /** horizontal resolution of the image (pixels per meter) */
            horizontalResolution (4 bytes as INT32)
 
            /** vertical resolution of the image (pixels per meter) */
            verticalResolution (4 bytes as INT32)
 
            /** number of colors in the color palette - or 0 to default to 2^n */
            colorsInColorPalette (4 bytes as UINT32)
 
            /** number of important colors used, or 0 when every color is important */
            importantColors (4 bytes as UINT32)
        ;
 
    compressionMethod :
            0 : "BI_RGB - none"
            1 : "BI_RLE8 - RLE 8-bit/pixel"
            2 : "BI_RLE4 - RLE 4-bit/pixel"
            3 : "BI_BITFIELDS"
            4 : "BI_JPEG - OS22XBITMAPHEADER: RLE-24, BITMAPV4INFOHEADER+: JPEG image for printing"
            5 : "BI_PNG - BITMAPV4INFOHEADER+: PNG image for printing"
            6 : "BI_ALPHABITFIELDS - RGBA bit field masks (only Windows CE 5.0 with .NET 4.0 or later)"
            11 : "BI_CMYK - none (only Windows Metafile CMYK)"
            12 : "BI_CMYKRLE8 - RLE-8 (only Windows Metafile CMYK)"
            13 : "BI_CMYKRLE4 - RLE-4 (only Windows Metafile CMYK)"
        ;
 
    colorTable :
        /** Stored in RGBA32 format */
        colorTableEntries (bitmap.dibHeader.colorsInColorPalette items);
 
    colorTableEntries :
        colorDefinition (4 bytes);
 
    pixelData :
        rows (bitmap.dibHeader.bitmapHeight items);
 
    rows :
        pixels (bitmap.dibHeader.bitmapWidth items);
 
    pixels :
        pixel (3 bytes);
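
    As a cross-check of the "header" region in this template, the same 14
    bytes can be unpacked by hand. The Python sketch below is illustrative
    only. It relies on the bitmap format storing integers in little-endian
    order, and the sample bytes are synthetic, with made-up size values.

```python
import struct

# "<" selects little-endian with no padding: 2 + 4 + 2 + 2 + 4 = 14 bytes.
HEADER_FORMAT = "<2sI2s2sI"

def parse_header(content, offset=0):
    """Unpack the fields that the 'header' region of the template describes."""
    field, file_size, reserved1, reserved2, data_offset = struct.unpack_from(
        HEADER_FORMAT, content, offset
    )
    return {
        "headerField": field.decode("ascii"),
        "fileSize": file_size,
        "reserved1": reserved1,  # left as raw bytes, like "(2 bytes)"
        "reserved2": reserved2,
        "imageDataOffset": data_offset,
    }

# Synthetic header bytes; the sizes are made-up values for illustration.
sample = struct.pack(HEADER_FORMAT, b"BM", 70, b"\x00\x00", b"\x00\x00", 54)
header = parse_header(sample)
print(header["headerField"], header["fileSize"], header["imageDataOffset"])
```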