The parser knows a relatively small language:
string A string of ASCII characters
name <string>
value An integer number
| <type>.<name>
| <value> [+-*] <value>
| HERE
| START
| OWNER
| REMAINDER
The value HERE represents the current absolute offset in the file.
The value START represents the absolute offset (in bytes) of the
start of the collection it is used in.
The value OWNER represents the collection that owns the current
collection. You can use OWNER.recordName to access a record in that owner.
The value REMAINDER represents the not-yet-known quantity that takes
up the rest of the data in the datafile (GLOBAL) or collection (LOCAL)
from this point in the file onward. It can be used in arrays to indicate
"as many as fit in the rest of the file or collection".
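As a rough illustration of how HERE and REMAINDER could be resolved at parse time, here is a minimal JavaScript sketch. The helper names (here, remainderCount) and the collectionStart / collectionLength parameters are assumptions made up for this example; they are not taken from the generated parser.

```javascript
// Hypothetical helpers; names and parameters are illustrative assumptions.

// HERE: the current absolute offset in the file.
function here(data) {
  return data.pointer;
}

// REMAINDER used as an array length: how many elements of `elementSize`
// bytes fit between the current pointer and the end of the file (GLOBAL)
// or the end of the enclosing collection (LOCAL).
function remainderCount(data, elementSize, scope, collectionStart, collectionLength) {
  const end = scope === "LOCAL"
    ? collectionStart + collectionLength
    : data.bytecode.byteLength;
  return Math.floor((end - data.pointer) / elementSize);
}

const data = { pointer: 8, marks: [], bytecode: new ArrayBuffer(32) };
const globalCount = remainderCount(data, 2, "GLOBAL");      // (32 - 8) / 2
const localCount = remainderCount(data, 2, "LOCAL", 4, 12); // (16 - 8) / 2
```

Rounding down means a trailing partial element is ignored; whether the real parser should warn about that instead is left open here.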
type BYTE 8 bit value
| ASCII 8 bit ASCII value (expressly not called a 'char')
| SHORT 16 bit signed integer
| USHORT 16 bit unsigned integer
| UINT24 24 bit unsigned integer
| LONG 32 bit signed integer
| ULONG 32 bit unsigned integer
| <type>[<value>] array notation
| <collection name> custom type (think "struct")
| COLLECTION<name>
The COLLECTION<name> type indicates a block of data that conforms
to the collection indicated by the value of <name>. For instance:
ASCII[4] name
COLLECTION<name> ihdr
marks the ihdr record as being of a type indicated by the value of "name";
if name is "IHDR" then ihdr will be a data block that should be interpreted
using the Collection IHDR. This allows for specifications in which blocks
are self-descriptive.
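The self-descriptive lookup described above could be sketched in JavaScript roughly as follows. The collections table, the readAscii and readSelfDescribed helpers, and the single "width" field inside IHDR are all invented for this example, and big-endian byte order is an assumption.

```javascript
// Hypothetical table of collection parsers, keyed by collection name.
// The IHDR body (one 32 bit "width" field) is made up for illustration.
const collections = new Map([
  ["IHDR", function parseIHDR(data) {
    const view = new DataView(data.bytecode);
    const width = view.getUint32(data.pointer); // big-endian assumed
    data.pointer += 4;
    return { width };
  }],
]);

// Read ASCII[length] at the current pointer and advance past it.
function readAscii(data, length) {
  const bytes = new Uint8Array(data.bytecode, data.pointer, length);
  data.pointer += length;
  return String.fromCharCode(...bytes);
}

// ASCII[4] name, then COLLECTION<name>: pick the parser named by the value.
function readSelfDescribed(data) {
  const name = readAscii(data, 4);
  const parse = collections.get(name);
  if (!parse) throw new Error("No collection named " + name);
  return { name, record: parse(data) };
}

// "IHDR" followed by the 32 bit value 256.
const bytes = new Uint8Array([0x49, 0x48, 0x44, 0x52, 0, 0, 1, 0]);
const data = { pointer: 0, marks: [], bytecode: bytes.buffer };
const result = readSelfDescribed(data);
```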
record RESERVED <type>
| <type> <name>
| (GLOBAL|LOCAL|IMMEDIATE)? <type> OFFSET <name> (TO <collection name>)? (RELATIVE TO <name>)?
| RELATIVE <type> OFFSET <name> TO <collection name> FROM <value>
Sometimes specs have "reserved" data, which is a byte, short, etc. that
either acts as a spacer or is a placeholder reserved for use in a future
version of the specification. These will be read and then discarded.
For offsets, we need an indicator that tells us whether the offset is:
GLOBAL relative to the start of this file
LOCAL relative to the start of this collection
IMMEDIATE relative to the next byte's position
Or, if the offset is relative to another record, use RELATIVE TO <name of
that record, within scope>.
RELATIVE offsets point to some collection and are counted FROM the given <value>.
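To make the offset kinds concrete, here is a small JavaScript sketch that turns an offset value into an absolute file position. The resolveOffset function and the ctx fields are hypothetical, and reading IMMEDIATE as "counted from the byte right after the offset record" is an interpretation, not something the text above spells out.

```javascript
// Hypothetical resolver; the function name and ctx fields are assumptions.
function resolveOffset(kind, offset, ctx) {
  switch (kind) {
    case "GLOBAL":    return offset;                       // from the start of the file
    case "LOCAL":     return ctx.collectionStart + offset; // from the start of this collection
    case "IMMEDIATE": return ctx.nextByte + offset;        // from the byte after the offset record (assumed)
    case "RELATIVE":  return ctx.base + offset;            // from a base supplied by another record or a FROM value
    default: throw new Error("Unknown offset kind: " + kind);
  }
}

// A collection starting at byte 100, with a 4 byte offset record at its start.
const ctx = { collectionStart: 100, nextByte: 104, base: 200 };
const positions = ["GLOBAL", "LOCAL", "IMMEDIATE", "RELATIVE"]
  .map((kind) => resolveOffset(kind, 10, ctx));
```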
comparator any language-supported numerical comparator
conditional (<name> <comparator> <value>)
conditioned <conditional> { <collection body> }
Conditioned collection blocks let us generate parsers that can deal with
multiple versions of data blocks (since specs tend to change over the years).
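A conditioned block could plausibly compile to a plain if statement in the generated JavaScript. The sketch below is hand-written rather than real generator output; the record names (version, extra) and big-endian byte order are assumptions.

```javascript
// Hypothetical spec fragment this sketch corresponds to:
//   USHORT version
//   (version >= 1) { USHORT extra }
function parseHeader(data) {
  const view = new DataView(data.bytecode);
  const record = {};
  record.version = view.getUint16(data.pointer); // big-endian assumed
  data.pointer += 2;
  // The conditioned block is only read when its conditional holds.
  if (record.version >= 1) {
    record.extra = view.getUint16(data.pointer);
    data.pointer += 2;
  }
  return record;
}

const v0 = parseHeader({ pointer: 0, marks: [], bytecode: new Uint8Array([0, 0, 0, 7]).buffer });
const v1 = parseHeader({ pointer: 0, marks: [], bytecode: new Uint8Array([0, 1, 0, 7]).buffer });
```

With version 0 the extra field is never read and stays undefined; with version 1 it is read from the following two bytes.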
collection name: <name>
collection identifier: [Defining] Collection <collection name>
There can only be one collection marked "Defining". This is the
collection that is treated as the master collection representing
the start of the binary data file, and its parse function will be
bound to the Parser.parse(data) function for starting a parse run.
collection: <collection identifier> { <collection body> }
collection body: (record | collection | conditioned)*
Collections can be empty, but if not they list records and possibly "private"
collections for use as "structs" in that collection. Parent collections
won't see these private collections.
It has an even smaller set of functions and macros:
VALUE(<name>) Returns the value of the passed record
TERMINATE Will cause the generated parser to print the termination message, and
cancel the parsing process.
WARN Will cause a message to be generated, without terminating the parsing process.
The language also supports /* block comments */ as well as // line comments.
It'll generate JavaScript, for now, because it's a stupidly simple language to generate
code for. I'm sure any number of other target languages can be added later, because
honestly, it's not rocket science =)
This library will simply map the binary data to unpacked (as far as it knows how to)
data in memory, but without any traversal logic; it's simply a data unpacker. This
allows for data->XML or data->JSON unpacking, and probably also the reverse, provided
the information from the .spec file is kept ... somewhere...
The generated code operates on a data object that looks like this:
{
pointer: <number> // should default to 0, indicates the current offset in the data
marks: [] // an array for pointer save/restores
bytecode: arraybuffer // the data array
}
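As an illustration of how code might work with such a data object, here is a short JavaScript sketch. The helpers (makeData, readUShort, readUInt24, withOffset) are hypothetical names, and big-endian byte order is an assumption; the real generated code may look quite different.

```javascript
// Hypothetical helpers over the data object; big-endian order is assumed.
function makeData(arraybuffer) {
  return { pointer: 0, marks: [], bytecode: arraybuffer };
}

// USHORT: 16 bit unsigned integer.
function readUShort(data) {
  const value = new DataView(data.bytecode).getUint16(data.pointer);
  data.pointer += 2;
  return value;
}

// UINT24: DataView has no 24 bit reader, so combine one byte and one short.
function readUInt24(data) {
  const view = new DataView(data.bytecode);
  const value = (view.getUint8(data.pointer) << 16) | view.getUint16(data.pointer + 1);
  data.pointer += 3;
  return value;
}

// Save the pointer on `marks`, jump to an absolute offset, run `fn`,
// then restore the saved pointer even if `fn` throws.
function withOffset(data, offset, fn) {
  data.marks.push(data.pointer);
  data.pointer = offset;
  try {
    return fn(data);
  } finally {
    data.pointer = data.marks.pop();
  }
}

const data = makeData(new Uint8Array([0x00, 0x05, 0x01, 0x02, 0x03]).buffer);
const first = readUShort(data);                 // reads bytes 0..1
const jumped = withOffset(data, 2, readUInt24); // reads bytes 2..4, then restores
```

Using marks as a stack lets nested offset jumps unwind in order, which matches its description as an array for pointer save/restores.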
For an example of a .spec file, have a look at "OpenType.spec".