en-US/about_binshred.help.txt
TOPIC
    about_binshred

SHORT DESCRIPTION
    Describes the syntax and usage of the ConvertFrom-BinaryData cmdlet.

LONG DESCRIPTION
    The ConvertFrom-BinaryData cmdlet is a general-purpose cmdlet for
    parsing binary files and content. To direct the parsing of this binary
    data, you describe the file format using a simple text-based template
    structure. Its default alias is "binshred".

    Most binary file formats are structured into a series of conceptual
    regions. For example, a header, followed by a body, followed by some
    data rows, followed by a footer. These regions usually have properties.
    For example, a header might have a few "magic bytes", followed by a
    length field, followed by a version number.

    A simple example

    Consider a simple example of the following binary content:

        PS C:\> Format-Hex words.bin

                   Path: C:\words.bin

                   00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

        00000000   4C 48 02 00 00 00 05 00 00 00 48 65 6C 6C 6F 05  LH........Hello.
        00000010   00 00 00 57 6F 72 6C 64                          ...World

    From either documentation or investigation, we've determined that the
    file format has two main portions: a header, followed by a list of
    words. The header itself has 2 bytes in ASCII as the magic signature,
    followed by an integer representing the number of words. After that,
    each word entry has an integer representing the word length, followed
    by a word (of that length) in UTF8.

    A BinShred Template (.bst) for this file looks like this:

        header :
            magic (2 bytes as ASCII)
            wordCount (4 bytes as UINT32)
            words (wordCount items);

        words :
            wordLength (4 bytes as UINT32)
            word (wordLength bytes as UTF8);

    A region is identified by a name followed by a colon. Within a region,
    you identify properties by writing their property names followed by the
    length and data type of that property. A semicolon identifies the end
    of a region.

    When you supply this template to the ConvertFrom-BinaryData cmdlet
    (alias "binshred"), the resulting object represents the data structures
    contained in that binary file as objects.

        PS > binshred -Path .\words.bin -TemplatePath .\wordParser.bst

        Name                           Value
        ----                           -----
        magic                          LH
        wordCount                      2
        words                          (...)

        PS > (binshred -Path .\words.bin -TemplatePath .\wordParser.bst).Words[0]

        Name                           Value
        ----                           -----
        wordLength                     5
        word                           Hello

    Supported features in a BinShred Template

    Whitespace / Capitalization

        BinShred templates are not sensitive to whitespace or
        capitalization. Newlines or spaces can be added as desired.

    //

        A single-line comment. This type of comment does not appear in the
        parsed result objects, unlike the documentation comments described
        below.

    /* */

        A block comment. This type of comment does not appear in the parsed
        result objects, unlike the documentation comments described below.

    LABEL : ... ;

        A region of data to be parsed. The LABEL of the region can be any
        name of your choice. The colon is mandatory, as is the trailing
        semicolon. The region between the colon and semicolon represents
        properties of that region. The LABEL will be used as a property
        name on the resulting parsed object.

    PROPERTY (BYTES bytes as DATATYPE described by LOOKUPTABLE)

        A property definition within a region. The name of the property
        ("PROPERTY") can be any name of your choice. The parentheses (and
        their contents) are optional. Without parentheses, the property
        will be treated as a nested property definition, and BinShred will
        look for a LABEL of that name to continue processing.

        If you include parentheses, their contents provide instructions on
        how to interpret that property.

        The byte count ("BYTES") is mandatory. This will usually be either
        an absolute number (10 bytes ...), or refer to a property that
        would have already been parsed - for example
        "( header.ByteCount bytes ... )".

        You can also specify a native (C#) expression for this value. The
        native expression can refer to properties that would have already
        been parsed, and must return an integer - for example
        "( { return (letterCount * 2); } bytes as Unicode )".
        Specifying a byte count with a native expression is much slower
        than specifying it with a direct byte count or property reference,
        so you should avoid it if possible.

        The optional "as DATATYPE" section of the parsing instruction
        describes how to interpret these bytes. If not specified, the
        property will use an array of bytes as its data type. Supported
        data types are:

            ASCII, UNICODE, UTF8, UINT64, UINT32, UINT16,
            INT64, INT32, INT16, SINGLE, FLOAT, DOUBLE

        You can also specify a native (C#) expression for the
        interpretation of these bytes. The native expression can refer to
        properties that have already been parsed. In addition to the
        properties that have already been parsed, three parameters are
        available to the native expression:

            _content: The byte array representing the entire binary content
            being parsed.

            _contentPosition: The current position in the binary content
            being parsed.

            _byteCount: The number of bytes to be parsed, as specified (or
            dynamically evaluated) by the byte count property.

        For example, you could write a native expression to parse a series
        of bytes as ASCII like this:

            (4 bytes as {
                return Encoding.ASCII.GetString(_content, _contentPosition, _byteCount);
            })

        Specifying a data interpretation with a native expression is much
        slower than specifying a data type directly, so you should avoid it
        if possible.

        The optional "as described by LOOKUPTABLE" section of the parsing
        instruction lets you define a lookup table that maps this property
        value to a more meaningful description. This description will be
        included as a "PROPERTY.description" property.

    PROPERTY (COUNT items)

        A property that is an array of items. The parentheses are
        mandatory, as is the COUNT field. The COUNT field may be either an
        absolute number (4 items), or refer to a property that would have
        already been parsed - for example "( header.ItemCount items )".

        You must also define a parsing rule that matches this property name
        to describe the data format of the property items.
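
        As a rough illustration of how byte counts and item counts that
        refer to already-parsed properties drive the parse, the following
        PowerShell sketch walks the words.bin bytes from the simple example
        by hand. It does not use BinShred and is not how the cmdlet is
        implemented. The variable names are hypothetical, and the sketch
        assumes the UINT32 fields are little-endian (as the sample bytes
        and the parsed wordCount of 2 suggest) and that it runs on a
        little-endian machine, since BitConverter follows the machine's
        byte order.

```powershell
# Bytes copied from the Format-Hex dump of words.bin in the simple example.
[byte[]] $content = 0x4C, 0x48,
                    0x02, 0x00, 0x00, 0x00,
                    0x05, 0x00, 0x00, 0x00, 0x48, 0x65, 0x6C, 0x6C, 0x6F,
                    0x05, 0x00, 0x00, 0x00, 0x57, 0x6F, 0x72, 0x6C, 0x64

# Current offset into the content (hypothetical name, for illustration only).
$position = 0

# header: magic (2 bytes as ASCII)
$magic = [System.Text.Encoding]::ASCII.GetString($content, $position, 2)
$position += 2

# header: wordCount (4 bytes as UINT32); assumed little-endian
$wordCount = [int] [System.BitConverter]::ToUInt32($content, $position)
$position += 4

# words (wordCount items): the item count comes from a property parsed above
$words = for ($i = 0; $i -lt $wordCount; $i++) {
    # wordLength (4 bytes as UINT32)
    $wordLength = [int] [System.BitConverter]::ToUInt32($content, $position)
    $position += 4

    # word (wordLength bytes as UTF8): the byte count comes from wordLength
    $word = [System.Text.Encoding]::UTF8.GetString($content, $position, $wordLength)
    $position += $wordLength

    [pscustomobject]@{ wordLength = $wordLength; word = $word }
}
```

        Each step advances $position by the byte count it just consumed,
        which is what the template expresses declaratively with
        "wordCount items" and "wordLength bytes".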

    /** Comment */

        A documentation comment. If you include this above a property
        definition, this comment will be included as a
        "PROPERTY.description" property for that region. If you include
        this above a lookup table definition, this comment will be added to
        the "PROPERTY.description" field of the property being described by
        the lookup table.

    (Additional properties identified by PROPERTY from LOOKUPTABLE)

        A property inclusion rule. This is useful when you have a data
        structure that changes based on the value of a property that you've
        already parsed. For example, a 'version' property might imply
        different properties for different versions. These additional
        properties will be included as sibling properties of the current
        region, rather than nested regions.

    (Padding to multiple of BYTES bytes)

        A property padding rule within a data region. The byte count
        ("BYTES") is mandatory. It may be either an absolute number
        (10 bytes ...), or refer to a property that would have already been
        parsed - for example "( header.ByteCount bytes ... )".

        This is useful when you have a region within a data structure that
        must be a multiple of a specified number of bytes - even when the
        properties within that region don't consume that many bytes. The
        remainder is called padding, or sometimes alignment.

        For example, in the bitmap file format, each row of pixel data must
        be a multiple of four bytes. If the pixel data itself (3 bytes for
        each pixel) doesn't consume a multiple of four bytes, then you can
        use a padding rule to ensure that it does. You could write the
        'rows' data region this way:

            rows :
                pixels (bitmap.dibHeader.bitmapWidth items)
                (padding to multiple of 4 bytes);

    LOOKUPTABLE : VALUE : LABEL ;

        A lookup table for property inclusion rules. This form of lookup
        table is used to identify the region / definition that should be
        used to parse the rest of the data in this region. The beginning
        colon and trailing semicolon are mandatory. The "VALUE : LABEL"
        pair can be repeated.
        Each new pair should be placed on separate lines for clarity,
        although it is not required. Values can be strings, integers,
        hexadecimal constants, or arrays of these three data types.

    LOOKUPTABLE : VALUE : "Description" ;

        A lookup table for property descriptions. This form of lookup table
        is used to add additional context-sensitive documentation to
        property values when a rule uses the "as described by LOOKUPTABLE"
        feature. The beginning colon and trailing semicolon are mandatory.
        The "VALUE : "Description"" pair can be repeated. Each new pair
        should be placed on separate lines for clarity, although it is not
        required.

    A complex example

    The following BinShred template demonstrates many of these concepts by
    parsing simple Windows bitmap files:

        // A bitmap file
        bitmap :
            /** The bitmap header */
            header
            /** The Device Independent Bitmap header */
            dibHeader
            /** The color table */
            colorTable
            /** The pixel data */
            pixelData
        ;

        header:
            /** The bitmap type */
            headerField (2 bytes as ASCII described by headerFieldType)
            /** The size of the entire file */
            fileSize (4 bytes as UINT32)
            /** Application specific */
            reserved1 (2 bytes)
            /** Application specific */
            reserved2 (2 bytes)
            /** Offset to the start of the image bytes */
            imageDataOffset (4 bytes as UINT32)
        ;

        headerFieldType :
            BM : "Windows Bitmap"
            BA : "OS/2 struct bitmap array"
            CI : "OS/2 struct color icon"
            CP : "OS/2 const color pointer"
            IC : "OS/2 struct icon"
            PT : "OS/2 pointer"
        ;

        dibHeader:
            /** The size of the DIB header */
            headerSize (4 bytes as UINT32)
            (additional properties identified by headerSize from bitmapType)
        ;

        bitmapType :
            /** Windows 2.0 or later / OS/2 1.x */
            12 : bitmapCoreHeader
            /** OS/2 BITMAPCOREHEADER2 - Adds halftoning. */
            64 : os22xBitmapHeader
            /** Windows NT, 3.1x or later - Adds 16 bpp and 32 bpp formats. */
            40 : bitmapInfoHeader
            /** Undocumented - adds RGB bit masks */
            52 : bitmapV2Header
            /** Bitmap with alpha mask */
            56 : bitmapV3Header
            /** Windows NT 4.0, 95 or later */
            108 : bitmapV4Header
            /** Windows NT 5.0, 98 or later - Adds ICC color profiles */
            124 : bitmapV5Header
        ;

        bitmapInfoHeader :
            /** bitmap width in pixels */
            bitmapWidth (4 bytes as INT32)
            /** bitmap height in pixels */
            bitmapHeight (4 bytes as INT32)
            /** number of color planes. Must be 1. */
            colorPlanes (2 bytes as UINT16)
            /** number of bits per pixel, which is the color depth of the image. */
            bitsPerPixel (2 bytes as UINT16)
            /** compression method */
            compressionMethod (4 bytes as UINT32 described by compressionMethod)
            /** image size. This is the size of the raw bitmap data */
            imageSize (4 bytes as UINT32)
            /** horizontal resolution of the image (pixels per meter) */
            horizontalResolution (4 bytes as INT32)
            /** vertical resolution of the image (pixels per meter) */
            verticalResolution (4 bytes as INT32)
            /** number of colors in the color palette - or 0 to default to 2^n */
            colorsInColorPalette (4 bytes as UINT32)
            /** number of important colors used, or 0 when every color is important */
            importantColors (4 bytes as UINT32)
        ;

        compressionMethod :
            0 : "BI_RGB - none"
            1 : "BI_RLE8 - RLE 8-bit/pixel"
            2 : "BI_RLE4 - RLE 4-bit/pixel"
            3 : "BI_BITFIELDS"
            4 : "BI_JPEG - OS22XBITMAPHEADER: RLE-24, BITMAPV4INFOHEADER+: JPEG image for printing"
            5 : "BI_PNG - BITMAPV4INFOHEADER+: PNG image for printing"
            6 : "BI_ALPHABITFIELDS - RGBA bit field masks (only Windows CE 5.0 with .NET 4.0 or later)"
            11 : "BI_CMYK - none (only Windows Metafile CMYK)"
            12 : "BI_CMYKRLE8 - RLE-8 (only Windows Metafile CMYK)"
            13 : "BI_CMYKRLE4 - RLE-4 (only Windows Metafile CMYK)"
        ;

        colorTable :
            /** Stored in RGBA32 format */
            colorTableEntries (bitmap.dibHeader.colorsInColorPalette items);

        colorTableEntries :
            colorDefinition (4 bytes);

        pixelData :
            rows (bitmap.dibHeader.bitmapHeight items);

        rows :
            pixels (bitmap.dibHeader.bitmapWidth items);

        pixels :
            pixel (3 bytes);
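
    The rows region in this template does not include the padding rule
    described earlier in this topic, so it appears to fit only bitmaps
    whose rows already end on a four-byte boundary. The following
    PowerShell sketch shows the arithmetic behind that rule for 3-byte
    pixels. Get-PaddedRowSize is a hypothetical helper written for this
    illustration; it is not part of BinShred.

```powershell
# Hypothetical helper: computes how many padding bytes a row needs so that
# its total size is a multiple of $Multiple bytes.
function Get-PaddedRowSize {
    param(
        [int] $PixelWidth,          # pixels per row (bitmapWidth)
        [int] $BytesPerPixel = 3,   # matches "pixel (3 bytes)" in the template
        [int] $Multiple = 4         # matches "padding to multiple of 4 bytes"
    )

    # Bytes consumed by the pixel data alone.
    $rawBytes = $PixelWidth * $BytesPerPixel

    # Bytes needed to reach the next multiple (0 if already aligned).
    $paddingBytes = ($Multiple - ($rawBytes % $Multiple)) % $Multiple

    [pscustomobject]@{
        RawBytes     = $rawBytes
        PaddingBytes = $paddingBytes
        RowBytes     = $rawBytes + $paddingBytes
    }
}

# A 5-pixel-wide row uses 15 bytes of pixel data and 1 byte of padding.
Get-PaddedRowSize -PixelWidth 5
```

    A width of 4 pixels needs no padding (12 bytes is already a multiple of
    four), while a width of 5 pixels needs one extra byte per row. Without
    the padding rule, every row after the first would be read from the
    wrong offset in the second case.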