I want to share a language I've been developing, which I call smartgrammar. It is a language for writing smart scripts that offers a more natural way to develop content. The idea is that you deal with neither forms nor the columns of an SQL statement while designing an AI. Smartgrammar lets you focus on the idea rather than the technical details, and a script written in it makes it quick to see what your AI does.
Here is an example that shows its expressiveness; you only need to understand how smart scripts work.
TYPE CREATURE ENTRY 37127 // Ymirjar Frostbinder
PHASE(0)
CHANCE(100)
FLAGS(30) // normal dungeon, heroic dungeon, normal raid, heroic raid
SMART_EVENT_UPDATE_IC(*initialMin=1000, *initialMax=8000, *repeatMin=8000, *repeatMax=10000)
SMART_ACTION_CAST(*spellId=71270) // Arctic Chill
SMART_TARGET_SELF()
There are many things to improve and features to be implemented. I would like to hear opinions and suggestions. Feel free to ask about any problem you face using it.
I'm not sure what kind of tool you're using, but I would recommend using a tool made for writing languages. This is not the tool itself but the theory behind it: [https://en.wikipedia.org/wiki/LALR_parser](https://en.wikipedia.org/wiki/LALR_parser). In the case of smartgrammar, the tool is [PLY](http://www.dabeaz.com/ply/).
I like your syntax, although I would add optional custom parameter naming. It adds readability over time and when sharing your work. For me it also makes the grammar you proposed more consistent: every number comes with a brief description of what it is, except the event/action/target parameters. PHASE(NUMBER), CHANCE(NUMBER) and FLAGS(NUMBER) tell you the semantic meaning of the number they contain.
This example is trivial because the events are quite obvious, but in complex cases I find it useful.
Also, I would use TC naming for events, actions, targets, etc., but without the SMART_ prefix. I think you're doing this already. Otherwise renaming each label is quite time-consuming.
PHASE(NUMBER), CHANCE(NUMBER), FLAGS(NUMBER) would be optional if using default values. I fixed the speed issue by separating keyword searches for events, actions, and others.
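To make the "optional with defaults" idea concrete, here is a minimal sketch in Python (since PLY is a Python tool). The default values (phase 0, chance 100, flags 0) and the name `resolve_headers` are my own assumptions for illustration, not something taken from smartgrammar.

```python
# Assumed defaults for illustration; the real grammar may choose others.
DEFAULT_HEADERS = {"PHASE": 0, "CHANCE": 100, "FLAGS": 0}


def resolve_headers(parsed):
    """Fill in every header the script omitted with its default value."""
    unknown = set(parsed) - set(DEFAULT_HEADERS)
    if unknown:
        raise ValueError(f"unknown header(s): {sorted(unknown)}")
    # Values written in the script override the defaults.
    return {**DEFAULT_HEADERS, **parsed}


# A script that only wrote FLAGS(30) still ends up with a phase and a chance.
print(resolve_headers({"FLAGS": 30}))
```

With something like this the three header lines can be left out of simple scripts without changing what gets generated.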
The advantage of using an LALR parser is that you don't have to deal with reading the input; the algorithm/framework already does that. You only have to define a set of formal rules, and those rules determine which strings the parser accepts or rejects.
Toy example:
S → AB
A → aA
A → a
B → bB
B → b
If S is the start symbol, these rules generate strings such as ab, aabb, aaabb, etc. The string 'a' won't be accepted because it cannot be built using the rules.
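If it helps, the toy grammar can be checked by hand with a few lines of Python. This is only a sketch: it is a hand-written recognizer, not what an LALR tool would generate, and it assumes the base cases A → a and B → b so that the recursion can stop. The function names are made up for the example.

```python
def parse_a(s, i):
    """A -> a A | a. Return the position after A, or None if A does not match."""
    if i < len(s) and s[i] == "a":
        rest = parse_a(s, i + 1)
        return rest if rest is not None else i + 1
    return None


def parse_b(s, i):
    """B -> b B | b. Return the position after B, or None if B does not match."""
    if i < len(s) and s[i] == "b":
        rest = parse_b(s, i + 1)
        return rest if rest is not None else i + 1
    return None


def accepts(s):
    """S -> A B. Accept only if A then B consume the whole string."""
    after_a = parse_a(s, 0)
    if after_a is None:
        return False
    return parse_b(s, after_a) == len(s)


for text in ["ab", "aabb", "aaabb", "a", "ba"]:
    print(text, accepts(text))
```

The first three strings are accepted and the last two are rejected, which matches the rules: at least one 'a' followed by at least one 'b'.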
In this case the set of rules could be something like this (just a sketch):
S → TYPE_CREATURE(number) C E A T
C → PHASE(number) C
C → CHANCE(number) C
C → FLAGS(number) C
C → empty
E → event_token(number, number, number, number)
A → action_token(number, number, number, number, number, number)
T → target_token(number, number, number)
event_token, action_token, target_token and number are symbols that represent every possible event, action, target or number respectively.
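To show how the sketched rules turn into a working recognizer, here is a small hand-written version in plain Python. Please read it as an illustration under assumptions: it is a recursive-descent parser rather than the LALR parser PLY would generate, it follows the sketch literally (positional numbers, fixed counts of 4, 6 and 3), and it assumes event, action and target tokens can be told apart by their `SMART_EVENT_`, `SMART_ACTION_` and `SMART_TARGET_` prefixes. All class and function names are hypothetical.

```python
import re

# Numbers, upper-case names, and the three punctuation marks the sketch uses.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<number>\d+)|(?P<name>[A-Z_][A-Z0-9_]*)|(?P<punct>[(),]))"
)


def tokenize(text):
    """Turn the source text into a list of (kind, value) pairs."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", "end of input")

    def expect(self, kind, value=None):
        got_kind, got_value = self.peek()
        if got_kind != kind or (value is not None and got_value != value):
            raise SyntaxError(f"expected {value or kind}, got {got_value}")
        self.pos += 1
        return got_value

    def call(self, arity):
        """Parse name(number, ..., number) with exactly `arity` numbers."""
        name = self.expect("name")
        self.expect("punct", "(")
        args = [int(self.expect("number"))]
        for _ in range(arity - 1):
            self.expect("punct", ",")
            args.append(int(self.expect("number")))
        self.expect("punct", ")")
        return name, args

    def prefixed(self, prefix, arity):
        """Parse one event, action or target token, recognised by its prefix."""
        name, args = self.call(arity)
        if not name.startswith(prefix):
            raise SyntaxError(f"expected a {prefix}* token, got {name}")
        return {"name": name, "args": args}

    def script(self):
        # S -> TYPE_CREATURE(number) C E A T
        name, (entry,) = self.call(1)
        if name != "TYPE_CREATURE":
            raise SyntaxError(f"expected TYPE_CREATURE, got {name}")
        # C -> PHASE(number) C | CHANCE(number) C | FLAGS(number) C | empty
        headers = {}
        while self.peek()[1] in ("PHASE", "CHANCE", "FLAGS"):
            key, (value,) = self.call(1)
            headers[key] = value
        node = {
            "entry": entry,
            "headers": headers,
            "event": self.prefixed("SMART_EVENT_", 4),    # E
            "action": self.prefixed("SMART_ACTION_", 6),  # A
            "target": self.prefixed("SMART_TARGET_", 3),  # T
        }
        self.expect("eof")
        return node


SOURCE = """
TYPE_CREATURE(37127)
PHASE(0) CHANCE(100) FLAGS(30)
SMART_EVENT_UPDATE_IC(1000, 8000, 8000, 10000)
SMART_ACTION_CAST(71270, 0, 0, 0, 0, 0)
SMART_TARGET_SELF(0, 0, 0)
"""

ast = Parser(tokenize(SOURCE)).script()
print(ast["event"])
```

Each method corresponds to one rule of the sketch, which is the part a parser generator would write for you from the rule list.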
You give those rules to the tool and it does the string recognition for you. It also builds an [AST](https://en.wikipedia.org/wiki/Abstract_syntax_tree) that helps you generate the output.
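As a rough illustration of the "AST to output" step, here is a Python sketch that turns a tree into one INSERT for the smart_scripts table. Everything specific in it is an assumption on my side: the shape of the tree (a plain dict), the column names, and the numeric ids for the event, action and target are written from memory of the TrinityCore schema and should be checked against your core revision. The quoting helper is deliberately minimal.

```python
# Numeric ids assumed for the names in the example script; verify them
# against the SmartScripts enums of your core revision.
EVENT_IDS = {"SMART_EVENT_UPDATE_IC": 0}
ACTION_IDS = {"SMART_ACTION_CAST": 11}
TARGET_IDS = {"SMART_TARGET_SELF": 1}


def sql_quote(text):
    """Minimal quoting for the sketch: wrap in quotes, double any apostrophe."""
    return "'" + text.replace("'", "''") + "'"


def to_sql(ast, script_id=0, comment=""):
    """Render one parsed script (a plain dict here) as a smart_scripts INSERT."""
    row = {
        "entryorguid": ast["entry"],
        "source_type": 0,  # assumed: 0 means creature
        "id": script_id,
        "event_type": EVENT_IDS[ast["event"]["name"]],
        "event_phase_mask": ast["headers"]["PHASE"],
        "event_chance": ast["headers"]["CHANCE"],
        "event_flags": ast["headers"]["FLAGS"],
        "action_type": ACTION_IDS[ast["action"]["name"]],
        "target_type": TARGET_IDS[ast["target"]["name"]],
    }
    # event_param1..4, action_param1..6, target_param1..3
    for part in ("event", "action", "target"):
        for n, value in enumerate(ast[part]["args"], start=1):
            row[f"{part}_param{n}"] = value
    columns = ", ".join(f"`{name}`" for name in row)
    values = ", ".join(str(value) for value in row.values())
    return (
        f"INSERT INTO `smart_scripts` ({columns}, `comment`) "
        f"VALUES ({values}, {sql_quote(comment)});"
    )


# Hypothetical tree for the Ymirjar Frostbinder example.
example_ast = {
    "entry": 37127,
    "headers": {"PHASE": 0, "CHANCE": 100, "FLAGS": 30},
    "event": {"name": "SMART_EVENT_UPDATE_IC", "args": [1000, 8000, 8000, 10000]},
    "action": {"name": "SMART_ACTION_CAST", "args": [71270, 0, 0, 0, 0, 0]},
    "target": {"name": "SMART_TARGET_SELF", "args": [0, 0, 0]},
}

sql = to_sql(example_ast, comment="Ymirjar Frostbinder - In Combat - Cast Arctic Chill")
print(sql)
```

The point is only that once the tree exists, output generation is a straightforward walk over it; the script author never has to see the column list.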
In the next few days I will also try to implement this alternative syntax.
Maybe you could try an LL parser, which by algorithm design has better error reporting than LR.
I also like the idea of implementing the transpiler in C++ and adding it to the tools directory (the ANTLR4 LL parser with the C++ target and left recursion elimination does a pretty amazing job).
[Attached image: screenshot of the compiler's error reporting]
This picture shows the error reporting I implemented in the compiler for my bachelor's thesis. It uses the LLVM diagnostic engine and the error reporting from ANTLR4. That's probably the best you can get with a parser generator.
I think this is a very good idea. I'm quite skilled in the theoretical part (i.e. formal languages), much less so on the technological side. I strongly agree with the choice of an LL parser since the scripting language doesn't need to be particularly powerful.