Xtext itself and every language infrastructure developed with Xtext are configured and wired up using dependency injection. Xtext may be used in different environments which introduce different constraints. Especially important is the difference between OSGi-managed containers and plain vanilla Java programs. To honor these differences, Xtext uses the concept of ISetup implementations for plain Java scenarios and Eclipse’s extension mechanism when running in an OSGi environment.
For each language an implementation of ISetup is generated. It implements a method called createInjectorAndDoEMFRegistration(), which can be called to initialize the language infrastructure.
Caveat: The ISetup class is intended to be used for runtime and unit testing scenarios only. If you use it in an Equinox scenario, you will very likely break the running application, because entries in the global registries will be overwritten.
The setup method returns an Injector, which can further be used to obtain a parser, etc. It also registers the Factory and the generated EPackages with the respective global registries provided by EMF. So after having run the setup you can start using the EMF API to load and store models of your language.
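For example, in a plain Java scenario the setup of a hypothetical language MyDsl (the class name and file extension below are placeholders) could be used like this:

Injector injector = new MyDslStandaloneSetup().createInjectorAndDoEMFRegistration();
// obtain a resource set and load a model file via the common EMF API
XtextResourceSet resourceSet = injector.getInstance(XtextResourceSet.class);
Resource resource = resourceSet.getResource(URI.createFileURI("./example.mydsl"), true);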
Within Eclipse we have a generated Activator, which creates a Guice Injector using the modules. In addition an IExecutableExtensionFactory is generated for each language, which is used to create IExecutableExtensions. This means that everything which is created via extension points is managed by Guice as well, i.e. you can declare dependencies and get them injected upon creation.
The only thing you have to do in order to use this factory is to prefix the class name with the name of the factory, <MyDsl>ExecutableExtensionFactory, followed by a colon:
<extension point="org.eclipse.ui.editors">
<editor
class="<MyDsl>ExecutableExtensionFactory:
org.eclipse.xtext.ui.editor.XtextEditor"
contributorClass=
"org.eclipse.ui.editors.text.TextEditorActionContributor"
default="true"
extensions="mydsl"
id="org.eclipse.xtext.example.MyDsl"
name="MyDsl Editor">
</editor>
</extension>
Xtext uses Apache’s log4j for logging. It is configured using files named log4j.properties, which are looked up in the root of the Java class path. If you want to change or provide configuration at runtime (i.e. non-OSGi), all you have to do is put such a log4j.properties in place and make sure that it is not overridden by other log4j.properties files in earlier class path entries.
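A minimal log4j.properties could look like this (the appender and log levels are just an example):

log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %-5p [%c] %m%n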
In OSGi you provide configuration by creating a fragment for org.apache.log4j. In this case you need to make sure that no other fragment contributes a log4j.properties file.
Once you have a language you probably want to do something with it. There are two options, you can either write an interpreter that inspects the AST and does something based on that or you translate your language to another programming language or configuration files. In this section we’re going to show how to implement a code generator for an Xtext-based language.
If you go with the default MWE workflow for your language and you haven’t used Xbase, then you’ll be provided with a callback stub that implements IGenerator. It has one method that is called from the builder infrastructure whenever a DSL file has changed or should otherwise be translated. The two parameters passed to this method are the Resource to be processed and an instance of IFileSystemAccess.
The IFileSystemAccess API abstracts over the different file systems the code generator may run on. These are typically Eclipse’s file system, when the code generator is triggered from within the incremental build infrastructure in Eclipse, and java.io.File when the code generator is executed outside Eclipse, say in a headless build.
A very simple implementation of a code generator for the example statemachine language introduced earlier could be the following:
class StatemachineGenerator implements IGenerator {

    override void doGenerate(Resource resource, IFileSystemAccess fsa) {
        fsa.generateFile("relative/path/AllTheStates.txt", '''
            «FOR state : resource.allContents.filter(State).toIterable»
                State «state.name»
            «ENDFOR»
        ''')
    }
}
We use Xtend for implementing code generators, as it is much better suited for that task than Java (or any other language on the planet :-)). Please refer to the Xtend documentation for further details. For Java developers it’s extremely easy to learn, as the basics are similar and you only need to pick up the additional powerful concepts.
You don’t want to deal with platform-specific or even installation-specific paths in your code generator; rather, you want to be able to configure it with some basic outlet roots under which the different generated files are placed. This is what output configurations are made for.
By default every language has a single outlet, which points to <project-root>/src-gen/. The files that go there are treated as fully derived and will be erased by the compiler automatically when a new file should be generated. If you need additional outlets or want a different default configuration, you need to implement the interface IOutputConfigurationProvider. It’s straightforward to understand, and the default implementation gives you a good idea of how to implement it.
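The following sketch shows what a custom provider could look like; the concrete settings are only an example:

public class MyOutputConfigurationProvider implements IOutputConfigurationProvider {

    public Set<OutputConfiguration> getOutputConfigurations() {
        // configure the default outlet pointing to src-gen/
        OutputConfiguration defaultOutput = new OutputConfiguration(IFileSystemAccess.DEFAULT_OUTPUT);
        defaultOutput.setDescription("Output Folder");
        defaultOutput.setOutputDirectory("./src-gen");
        defaultOutput.setCreateOutputDirectory(true);
        defaultOutput.setOverrideExistingResources(true);
        defaultOutput.setCleanUpDerivedResources(true);
        defaultOutput.setSetDerivedProperty(true);
        return Collections.singleton(defaultOutput);
    }
}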
With this implementation you lay out the basic defaults which can be changed by users on a workspace or per project level using the preferences.
Static analysis or validation is one of the most interesting aspects when developing a programming language. The users of your languages will be grateful if they get informative feedback as they type. In Xtext there are basically three different kinds of validation.
Some implementation aspects (e.g. the grammar, scoping) of a language have an impact on what is required for a document or semantic model to be valid. Xtext automatically takes care of this.
The syntactical correctness of any textual input is validated automatically by the parser. The error messages are generated by the underlying parser technology. You can use the ISyntaxErrorMessageProvider API to customize these messages. Any syntax errors can be retrieved from the Resource using the common EMF API.
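For example (a minimal sketch, assuming resource refers to a loaded XtextResource):

List<Resource.Diagnostic> syntaxErrors = resource.getErrors();
List<Resource.Diagnostic> warnings = resource.getWarnings();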
Any broken cross-links can be checked generically. As cross-link resolution is done lazily (see linking), any broken links are resolved lazily as well. If you want to validate whether all links are valid, you will have to navigate through the model so that all installed EMF proxies get resolved. This is done automatically in the editor.
Similar to syntax errors, any unresolvable cross-links will be reported and can be obtained through the same API.
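For example (again a sketch; EcoreUtil.resolveAll forces the resolution of all lazy links first):

EcoreUtil.resolveAll(resource);
List<Resource.Diagnostic> linkingErrors = resource.getErrors();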
The IConcreteSyntaxValidator validates all constraints that are implied by a grammar. A model must fulfill these constraints in order to be serializable.
Example:
MyRule:
({MySubRule} "sub")? (strVal+=ID intVal+=INT)*;
This implies several constraints, e.g. strVal.size() == intVal.size().

The typical use case for the concrete syntax validator is validation in non-Xtext editors that, however, use an XtextResource. This is, for example, the case when combining GMF and Xtext. Another use case is when the semantic model is modified “manually” (not by the parser) and then serialized again. Since it is very difficult for the serializer to provide meaningful error messages, the concrete syntax validator is executed by default before serialization. A textual Xtext editor itself is not a valid use case: here, the parser ensures that all syntactical constraints are met. Therefore, there is no value in additionally running the concrete syntax validator.
There are some limitations to the concrete syntax validator which result from the fact that it treats the grammar as declarative, which is something the parser doesn’t always do:

- Assigned actions (e.g. {MyType.myFeature=current}) are ignored. Unassigned actions (e.g. {MyType}), however, are supported.
- Rule: (foo+=R1 foo+=R2)* implies that foo is expected to contain instances of R1 and R2 in alternating order.

To use concrete syntax validation you can let Guice inject an instance of IConcreteSyntaxValidator and use it directly. Furthermore, there is an adapter which allows using the concrete syntax validator as an EValidator. You can, for example, enable it in your runtime module by adding:
@SingletonBinding(eager = true)
public Class<? extends ConcreteSyntaxEValidator>
bindConcreteSyntaxEValidator() {
return ConcreteSyntaxEValidator.class;
}
To customize error messages please see IConcreteSyntaxDiagnosticProvider and subclass ConcreteSyntaxDiagnosticProvider.
In addition to the aforementioned kinds of validation, which are more or less done automatically, you can specify additional constraints specific to your Ecore model. We leverage existing EMF API and have put some convenience mechanisms on top. Basically, all you need to do is make sure that an EValidator is registered for your EPackage. The validator registry can only be filled programmatically; that means that, contrary to the EPackage and Factory registries, there is no Equinox extension point to populate it.
For Xtext we provide a generator fragment for the convenient Java-based EValidator API. Just add the following fragment to your generator configuration and you are good to go:
fragment =
org.eclipse.xtext.generator.validation.JavaValidatorFragment {}
The generator will provide you with two Java classes: an abstract class generated to src-gen/, which extends the library class AbstractDeclarativeValidator and just registers the EPackages for which this validator introduces constraints, and a subclass of it generated to the src/ folder, intended to be edited by you. That is where you put the constraints.
The purpose of the AbstractDeclarativeValidator is to allow you to write constraints in a declarative way, as the class name already suggests. Instead of writing exhaustive if-else constructs or extending the generated EMF switch, you just add the Check annotation to any method and it will be invoked automatically when validation takes place. Moreover, you can state for which type the respective constraint method applies simply by declaring a typed parameter. This also lets you avoid any type casts. In addition to the reflective invocation of validation methods, the AbstractDeclarativeValidator provides a couple of convenient assertions.
The Check annotation has a parameter that can be used to declare when a check should run: FAST checks run whenever a file is modified, NORMAL checks run when the file is saved, and EXPENSIVE checks run when the file is explicitly validated via the menu option.
All in all this is very similar to how JUnit 4 works. Here is an example:
public class DomainmodelJavaValidator
        extends AbstractDomainmodelJavaValidator {

    @Check(FAST)
    public void checkTypeNameStartsWithCapital(Type type) {
        if (!Character.isUpperCase(type.getName().charAt(0)))
            warning("Name should start with a capital",
                    DomainmodelPackage.TYPE__NAME);
    }
}
You can also implement quick fixes for individual validation errors and warnings. See the section on quick fixes for details.
As noted above, Xtext uses EMF’s EValidator API to register validators. You can run the validators on your model programmatically using EMF’s Diagnostician, e.g.
EObject myModel = myResource.getContents().get(0);
Diagnostic diagnostic = Diagnostician.INSTANCE.validate(myModel);
switch (diagnostic.getSeverity()) {
    case Diagnostic.ERROR:
        System.err.println("Model has errors: " + diagnostic);
        break;
    case Diagnostic.WARNING:
        System.err.println("Model has warnings: " + diagnostic);
}
If you have implemented your validators by extending AbstractDeclarativeValidator, there are helper classes which assist you when testing your validators.
Testing validators typically works as follows:
To create models, you can either use EMF’s ResourceSet to load models from your hard disk or you can utilize the MyDslFactory that EMF generates for each EPackage to construct the tested model elements manually. While the first option has the advantage that you can edit your models in your textual concrete syntax, the second option has the advantage that you can create partial models.
To run the @Check-methods and ensure they raise the intended errors and warnings, you can utilize ValidatorTester as shown by the following example:
Validator:
public class MyLanguageValidator extends AbstractDeclarativeValidator {

    @Check
    public void checkFooElement(FooElement element) {
        if (element.getBarAttribute().contains("foo"))
            error("Only Foos allowed", element,
                    MyLanguagePackage.FOO_ELEMENT__BAR_ATTRIBUTE, 101);
    }
}
JUnit-Test:
public class MyLanguageValidatorTest extends AbstractXtextTests {

    private ValidatorTester<MyLanguageValidator> tester;

    @Override
    public void setUp() throws Exception {
        super.setUp();
        with(MyLanguageStandaloneSetup.class);
        MyLanguageValidator validator = get(MyLanguageValidator.class);
        tester = new ValidatorTester<MyLanguageValidator>(validator);
    }

    public void testError() {
        FooElement model = MyLanguageFactory.eINSTANCE.createFooElement();
        model.setBarAttribute("barbarbarbarfoo");
        tester.validator().checkFooElement(model);
        tester.diagnose().assertError(101);
    }

    public void testError2() {
        FooElement model = MyLanguageFactory.eINSTANCE.createFooElement();
        model.setBarAttribute("barbarbarbarfoo");
        tester.validate(model).assertError(101);
    }
}
This example uses JUnit 3, but since the involved classes from Xtext have no dependency on JUnit whatsoever, JUnit 4 and other testing frameworks will work as well. JUnit runs the setUp() method before each test case and thereby helps to create some common state. In this example, the validator is instantiated by means of Google Guice. As we inherit from AbstractXtextTests, there are plenty of useful methods available, and the state of the global EMF singletons will be restored in the method tearDown(). Afterwards, the ValidatorTester is created and parameterized with the actual validator. It acts as a wrapper for the validator, ensures that the validator has a valid state, and provides convenient access to the validator itself (tester.validator()) as well as to the utility classes which assert diagnostics created by the validator (tester.diagnose()). Please be aware that you have to call validator() before you can call diagnose(). However, you can call validator() multiple times in a row.

While validator() allows calling the validator’s @Check-methods directly, validate(model) leaves it to the framework to call the applicable @Check-methods. However, to avoid side effects between tests, it is recommended to call the @Check-methods directly.
diagnose() and validate(model) return an object of type AssertableDiagnostics, which provides several assert-methods to verify whether the expected diagnostics are present:

- assertError(int code): There must be one diagnostic with severity ERROR and the supplied error code.
- assertErrorContains(String messageFragment): There must be one diagnostic with severity ERROR and its message must contain messageFragment.
- assertError(int code, String messageFragment): Verifies severity, error code and message fragment.
- assertWarning(...): This method is available for the same combinations of parameters as assertError().
- assertOK(): Expects that no diagnostics (errors, warnings etc.) have been raised.
- assertDiagnostics(int severity, int code, String messageFragment): Verifies severity, error code and message fragment.
- assertAll(DiagnosticPredicate... predicates): Allows to describe multiple diagnostics at the same time and verifies that all of them are present. The class AssertableDiagnostics contains static error() and warning() methods which help to create the needed DiagnosticPredicates. Example: assertAll(error(123), warning("some part of the message")).
- assertAny(DiagnosticPredicate predicate): Asserts that a diagnostic exists which matches the predicate.

The linking feature allows for the specification of cross-references within an Xtext grammar. The following things are needed for the linking:
In the grammar a cross-reference is specified using square brackets.
CrossReference :
'[' type=ReferencedEClass ('|' terminal=CrossReferenceTerminal)? ']'
;
Example:
ReferringType :
'ref' referencedObject=[Entity|STRING]
;
The Ecore model inference would create an EClass ReferringType with an EReference referencedObject of type Entity, with its containment property set to false. The referenced object would be identified by a STRING and the surrounding information in the current context (see scoping). If you do not generate but import an existing Ecore model, the class ReferringType (or one of its super types) would need to have an EReference of type Entity (or one of its super types) declared. Also, the EReference’s containment and container properties need to be set to false.
Xtext uses lazy linking by default, and we encourage users to stick to this because it provides many advantages. One of them is improved performance in all scenarios where you don’t have to load the whole closure of all transitively referenced resources. Furthermore, it automatically solves situations where one link relies on other links. Note, however, that cyclic linking dependencies are not supported by Xtext at all.
When parsing a given input string, say
ref Entity01
the LazyLinker first creates an EMF proxy and assigns it to the corresponding EReference. In EMF a proxy is described by a URI, which points to the real EObject. In the case of lazy linking the stored URI comprises the context information given at parse time, which is the EObject containing the cross-reference, the actual EReference, the index (in case it’s a multi-valued cross-reference) and the string which represented the cross-link in the concrete syntax. The latter usually corresponds to the name of the referenced EObject. In EMF a URI consists of information about the resource the EObject is contained in as well as a so-called fragment part, which is used to find the EObject within that resource. When an EMF proxy is resolved, the current ResourceSet is asked. The resource set uses the first part to obtain (i.e. load if it is not already loaded) the resource. Then the resource is asked to return the EObject based on the fragment in the URI. The actual cross-reference resolution is done by LazyLinkingResource.getEObject(String), which receives the fragment and delegates to the implementation of the ILinkingService. The default implementation in turn delegates to the scoping API.
A simple implementation of the linking service is shipped with Xtext and used for any grammar per default. Usually any necessary customization of the linking behavior can best be described using the scoping API.
Using the scoping API one defines which elements are referable by a given reference. For instance, using the introductory example (Fowler’s state machine language) a transition contains two cross-references: one to a declared event and one to a declared state.
Example:
events
nothingImportant MYEV
end
state idle
nothingImportant => idle
end
The grammar rule for transitions looks like this:
Transition :
event=[Event] '=>' state=[State];
The grammar declares that for the reference event only instances of the type Event are allowed and that for the EReference state only instances of type State can be referenced. However, this simple declaration doesn’t say anything about where to find the states or events. That is the duty of scopes.
An IScopeProvider is responsible for providing an IScope for a given context EObject and EReference. The returned IScope should contain all target candidates for the given object and cross-reference.
public interface IScopeProvider {
/**
* Returns a scope for the given context. The scope
* provides access to the compatible visible EObjects
* for a given reference.
*
* @param context the element from which an element shall be
* referenced
* @param reference the reference to be used to filter the
* elements.
* @return {@link IScope} representing the inner most
* {@link IScope} for the passed context and reference.
* Note for implementors: The result may not be
* <code>null</code>. Return
* <code>IScope.NULLSCOPE</code> instead.
*/
IScope getScope(EObject context, EReference reference);
}
A single IScope represents an element of a linked list of scopes. That means that a scope can be nested within an outer scope. Each scope works like a symbol table or a map where the keys are strings and the values are so called IEObjectDescription, which is effectively an abstract description of a real EObject. In order to create IEObjectDescriptions for your model elements, the class Scopes is very useful.
To have a concrete example, let’s deal with the following simple grammar.
grammar org.xtext.example.mydsl.MyScopingDsl with
org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyScopingDsl"
Root:
elements+=Element;
Element:
'element' name=ID ('extends' superElement=[Element])?;
If you want to define the scope for the superElement cross-reference, the following code is one way to go.
@Override
public IScope getScope(EObject context, EReference reference) {
    // We want to define the scope for the Element's superElement cross-reference
    if (context instanceof Element
            && reference == MyDslPackage.Literals.ELEMENT__SUPER_ELEMENT) {
        // Collect a list of candidates by going through the model.
        // EcoreUtil2 provides useful functionality to do that, for example
        // searching for all elements within the root object's tree.
        EObject rootElement = EcoreUtil2.getRootContainer(context);
        List<Element> candidates = EcoreUtil2.getAllContentsOfType(rootElement, Element.class);
        // Scopes.scopeFor creates IEObjectDescriptions and puts them into an IScope instance
        IScope scope = Scopes.scopeFor(candidates);
        return scope;
    }
    return super.getScope(context, reference);
}
There are several useful implementations of IScope shipped with Xtext; we want to mention only some of them here.
The MapBasedScope comes with the efficiency of a map to look up a certain name. If you prefer to deal with Multimaps the MultimapBasedScope should work for you. For situations where some elements should be filtered out of an existing scope, the FilteringScope is the right way to go. As scopes can be nested, we strongly recommend to use FilteringScope only for leaf scopes without nested scopes.
Coming back to our example, one possible scenario for the FilteringScope could be to exclude the context element from the list of candidates as it should not be a super-element of itself.
@Override
public IScope getScope(final EObject context, EReference reference) {
    if (context instanceof Element
            && reference == MyDslPackage.Literals.ELEMENT__SUPER_ELEMENT) {
        EObject rootElement = EcoreUtil2.getRootContainer(context);
        List<Element> candidates = EcoreUtil2.getAllContentsOfType(rootElement, Element.class);
        IScope existingScope = Scopes.scopeFor(candidates);
        // Scope that filters out the context element from the candidates list
        IScope filteredScope = new FilteringScope(existingScope,
                new Predicate<IEObjectDescription>() {
                    public boolean apply(IEObjectDescription input) {
                        return input.getEObjectOrProxy() != context;
                    }
                });
        return filteredScope;
    }
    return super.getScope(context, reference);
}
In the state machine example we don’t have references across model files. Neither is there a concept like a namespace which would make scoping a bit more complicated. Basically, every State and every Event declared in the same resource is visible by their name. However, in the real world things are most likely not that simple: What if you want to reuse certain declared states and events across different state machines and you want to share those as library between different users? You would want to introduce some kind of cross-resource reference.
Defining what is visible from outside the current resource is the responsibility of global scopes. As the name suggests, global scopes are provided by instances of the IGlobalScopeProvider. The data structures (called index) used to store its elements are described in the next section.
In order to make states and events of one file referable from another file, you need to export them as part of a so called IResourceDescription.
An IResourceDescription contains information about the resource itself, primarily its URI, a list of exported EObjects in the form of IEObjectDescriptions, as well as information about outgoing cross-references and the qualified names it references. The cross-references contain only resolved references, while the list of imported qualified names also contains the names that couldn’t be resolved. This information is leveraged by Xtext’s indexing infrastructure in order to compute the transitive hull of dependent resources.
For users, and especially in the context of scoping, the most important information is the list of exported EObjects. An IEObjectDescription stores the URI of the actual EObject, its QualifiedName, as well as its EClass. In addition one can export arbitrary information using the user data map. The following diagram gives an overview on the description classes and their relationships.
A language is configured with default implementations of IResourceDescription.Manager and DefaultResourceDescriptionStrategy, which are responsible for computing the list of exported IEObjectDescriptions. The Manager iterates over the whole EMF model of each Resource and asks the ResourceDescriptionStrategy to compute an IEObjectDescription for each EObject. The ResourceDescriptionStrategy applies getQualifiedName(EObject obj) from the IQualifiedNameProvider to the object, and if it has a qualified name, an IEObjectDescription is created and passed back to the Manager, which adds it to the list of exported objects. If an EObject doesn’t have a qualified name, the element is considered to be not referable from outside the resource and consequently not indexed. If you don’t like this behavior, you can implement and bind your own implementation of IDefaultResourceDescriptionStrategy.
There are also two different default implementations of IQualifiedNameProvider. Both work by looking up an EAttribute ‘name’. The SimpleNameProvider simply returns the plain value, while the DefaultDeclarativeQualifiedNameProvider concatenates the simple name with the qualified name of its parent exported EObject. This effectively simulates the qualified name computation of most namespace-based languages (like e.g. Java).
As already mentioned, the default implementation strategy exports every model element that the IQualifiedNameProvider can provide a name for. This is a good starting point, but when your models become bigger and you have a lot of them, the index will become larger and larger. In most scenarios only a small part of your model should be visible from outside, and hence only a small part of your model needs to be in the index. If you come to that point, bind a custom implementation of IDefaultResourceDescriptionStrategy and create index representations only for those elements that you want to reference from outside the resource they are contained in. From within the resource, references to those filtered elements are still possible as long as they have a name. In summary, there are two ways to control which elements go into the index: the first one is through the IQualifiedNameProvider, but an element without a qualified name is not referable even within the same resource; the second one is through the IDefaultResourceDescriptionStrategy, which still allows you to refer to the element within the same resource.
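A custom strategy could look like the following sketch, where Entity stands for a made-up type from a hypothetical DSL that should be the only kind of indexed element:

public class MyResourceDescriptionStrategy extends DefaultResourceDescriptionStrategy {

    @Override
    public boolean createEObjectDescriptions(EObject eObject,
            IAcceptor<IEObjectDescription> acceptor) {
        // index only Entity instances, everything else is filtered
        if (eObject instanceof Entity)
            return super.createEObjectDescriptions(eObject, acceptor);
        // returning true tells the framework to still visit the children
        return true;
    }
}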
Besides the exported elements, the index contains IReferenceDescriptions, which capture the information about which element references which. They are created through the Manager and the IDefaultResourceDescriptionStrategy, too. If a model element references another model element, the IDefaultResourceDescriptionStrategy creates an IReferenceDescription that contains the URI of the referencing element (sourceEObjectURI) and of the referenced element (targetEObjectURI). In the end these IReferenceDescriptions are very useful to find references and to calculate affected resources.
As mentioned above, in order to calculate an IResourceDescription for a resource the framework asks the Manager which delegates to the IDefaultResourceDescriptionStrategy. To convert between a QualifiedName and its String representation you can use the IQualifiedNameConverter. Here is some Java code showing how to do that:
@Inject IQualifiedNameConverter converter;
Manager manager = // obtain an instance of IResourceDescription.Manager
IResourceDescription description = manager.getResourceDescription(resource);
for (IEObjectDescription eod : description.getExportedObjects()) {
System.out.println(converter.toString(eod.getQualifiedName()));
}
In order to obtain a Manager it is best to ask the corresponding IResourceServiceProvider. That is because each language might have a totally different implementation, and as you might refer from your language to a different language, you cannot reuse your language’s Manager. One basically asks the IResourceServiceProvider.Registry (there is usually one global instance) for an IResourceServiceProvider, which in turn provides a Manager along with other useful services.
If you are running in a Guice enabled scenario, the code looks like this:
@Inject
private IResourceServiceProvider.Registry rspr;
private IResourceDescription.Manager getManager(Resource res) {
IResourceServiceProvider resourceServiceProvider =
rspr.getResourceServiceProvider(res.getURI());
return resourceServiceProvider.getResourceDescriptionManager();
}
If you don’t run in a Guice enabled context you will likely have to directly access the singleton:
private IResourceServiceProvider.Registry rspr =
IResourceServiceProvider.Registry.INSTANCE;
However, we strongly encourage you to use dependency injection.

Now that we know how to export elements to be referable from other resources, we need to learn how those exported IEObjectDescriptions can be made available to the referencing resources. That is the responsibility of the global scoping, which is described in the following section.
If you would like to see what’s in the index, you could use the ‘Open Model Element’ dialog from the navigation menu entry.
Instead of explicitly referring to imported resources, another option is to have some kind of external configuration in order to define what is visible from outside a resource. Java for instance uses the notion of the class path to define containers (jars and class folders) which contain referenceable elements. In the case of Java the order of such entries is also important.
To enable support for this kind of global scoping in Xtext, a DefaultGlobalScopeProvider has to be bound to the IGlobalScopeProvider interface. By default Xtext leverages the class path mechanism since it is well designed and already understood by most of our users. The available tooling provided by JDT and PDE to configure the class path adds even more value. However, it is just a default: you can reuse the infrastructure without using Java and be independent from the JDT.
In order to know what is available in the “world”, a global scope provider which relies on external configuration needs to read that configuration in and be able to find all candidates for a certain EReference. If you don’t want to force users to have a folder and file name structure reflecting the actual qualified names of the referenceable EObjects, you’ll have to load all resources up front and either keep holding them in memory or remember all information which is needed for the resolution of cross-references. In Xtext that information is provided by a so called IEObjectDescription.
Xtext ships with an index which remembers all IResourceDescription and their IEObjectDescription objects. In the IDE-context (i.e. when running the editor, etc.) the index is updated by an incremental project builder. As opposed to that, in a non-UI context you typically do not have to deal with changes, hence the infrastructure can be much simpler. In both situations the global index state is held by an implementation of IResourceDescriptions (note the plural form!). The bound singleton in the UI scenario is even aware of unsaved editor changes, such that all linking happens to the latest maybe unsaved version of the resources. You will find the Guice configuration of the global index in the UI scenario in SharedModule.
The index is basically a flat list of instances of IResourceDescription. The index itself doesn’t know about visibility constraints due to class path restrictions. Rather, these are defined by the referencing language by means of so-called IContainers: while Java might load a resource via ClassLoader.getResource() (i.e. using the class path mechanism), another language could load the same resource using file system paths.
Consequently, the information which container a resource belongs to depends on the referencing context. Therefore an IResourceServiceProvider provides another interesting service, called IContainer.Manager. For a given IResourceDescription, the Manager provides you with the IContainer as well as with a list of all IContainers which are visible from there. Note that the index is globally shared between all languages, while the Manager, which adds the semantics of containers, can be very different depending on the language. The following method lists all resources visible from a given Resource:
@Inject
IContainer.Manager manager;
public void listVisibleResources(
Resource myResource, IResourceDescriptions index) {
IResourceDescription descr =
index.getResourceDescription(myResource.getURI());
for(IContainer visibleContainer:
manager.getVisibleContainers(descr, index)) {
for(IResourceDescription visibleResourceDesc:
visibleContainer.getResourceDescriptions()) {
System.out.println(visibleResourceDesc.getURI());
}
}
}
Xtext ships two implementations of Manager which are usually bound with Guice: The default binding is to SimpleResourceDescriptionsBasedContainerManager, which assumes all IResourceDescription to be in a single common container. If you don’t care about container support, you’ll be fine with this one. Alternatively, you can bind StateBasedContainerManager and an additional IAllContainersState which keeps track of the set of available containers and their visibility relationships.
Xtext offers a couple of strategies for managing containers: If you’re running an Eclipse workbench, you can define containers based on Java projects and their class paths or based on plain Eclipse projects. Outside Eclipse, you can provide a set of file system paths to be scanned for models. All of these only differ in the bound instance of IAllContainersState of the referring language. These will be described in detail in the following sections.
As JDT is an Eclipse feature, this JDT-based container management is only available in the UI scenario. It assumes so-called IPackageFragmentRoots as containers. An IPackageFragmentRoot in JDT is the root of a tree of Java model elements; it usually refers to a source folder of a Java project or a referenced jar.
So for an element to be referable, its resource must be on the class path of the caller’s Java project and it must be exported (as described above).
As this strategy allows to reuse a lot of nice Java things like jars, OSGi, maven, etc. it is part of the default: You should not have to reconfigure anything to make it work. Nevertheless, if you messed something up, make sure you bind
public Class<? extends IContainer.Manager> bindIContainer$Manager() {
return StateBasedContainerManager.class;
}
in the runtime module and
public Provider<IAllContainersState> provideIAllContainersState() {
return org.eclipse.xtext.ui.shared.Access.getJavaProjectsState();
}
in the UI module of the referencing language. The latter looks a bit more difficult than a common binding, as we have to bind a global singleton to a Guice provider. A StrictJavaProjectsState requires all elements to be on the class path, while the default JavaProjectsState also allows models in non-source folders.
If the class path based mechanism doesn’t work for your case, Xtext offers an alternative container manager based on plain Eclipse projects: Each project acts as a container and the project references (Properties → Project References) are the visible containers.
In this case, your runtime module should define
public Class<? extends IContainer.Manager> bindIContainer$Manager() {
return StateBasedContainerManager.class;
}
and the UI module should bind
public Provider<IAllContainersState> provideIAllContainersState() {
return org.eclipse.xtext.ui.shared.Access.getWorkspaceProjectsState();
}
If you need a Manager that is independent of Eclipse projects, you can use the ResourceSetBasedAllContainersState. This one can be configured with a mapping of container handles to resource URIs.
It is unlikely you want to use this strategy directly in your own code, but it is used in the back-end of the MWE2 workflow component Reader. This is responsible for reading in models in a workflow, e.g. for later code generation. The Reader allows to either scan the whole class path or a set of paths for all models therein. When paths are given, each path entry becomes an IContainer of its own.
component = org.eclipse.xtext.mwe.Reader {
// lookup all resources on the class path
// useJavaClassPath = true
// or define search scope explicitly
path = "src/models"
path = "src/further-models"
...
}
We now know how the outer world of referenceable elements can be defined in Xtext. Nevertheless, not everything is available in all contexts and with a global name. Rather than that, each context can usually have a different scope. As already stated, scopes can be nested, i.e. a scope can contain elements of a parent scope in addition to its own elements. When parent and child scope contain different elements with the same name, the parent scope’s element will usually be shadowed by the element from the child scope.
To illustrate that, let’s have a look at Java: Java defines multiple kinds of scopes (object scope, type scope, etc.). For Java one would create the scope hierarchy as commented in the following example:
// file contents scope
import static my.Constants.STATIC;
public class ScopeExample { // class body scope
private Object field = STATIC;
private void method(String param) { // method body scope
String localVar = "bar";
innerBlock: { // block scope
String innerScopeVar = "foo";
Object field = innerScopeVar;
// the scope hierarchy at this point would look like this:
// blockScope{field,innerScopeVar}->
// methodScope{localVar, param}->
// classScope{field}-> ('field' is shadowed)
// fileScope{STATIC}->
// classpathScope{
// 'all qualified names of accessible static fields'} ->
// NULLSCOPE{}
//
}
field = localVar;
}
}
In fact the class path scope should also reflect the order of class path entries. For instance:
classpathScope{stuff from bin/}
-> classpathScope{stuff from foo.jar/}
-> ...
-> classpathScope{stuff from JRE System Library}
-> NULLSCOPE{}
Please find the motivation behind this and some additional details in this blog post.
The imported namespace aware scoping is based on qualified names and namespaces. It adds namespace support to your language, comparable to namespaces in Scala and C#. Scala and C# both allow to have multiple nested packages within one file, and imports can be declared per namespace, such that imported names are only visible within that namespace. See the domain model example: its scope provider extends ImportedNamespaceAwareLocalScopeProvider.
The ImportedNamespaceAwareLocalScopeProvider makes use of the so called IQualifiedNameProvider service. It computes QualifiedNames for EObjects. A qualified name consists of several segments. The default implementation uses a simple name look-up composing the qualified name of the simple names of all containers and the object itself. It also allows to override the name computation declaratively. The following snippet shows how you could make Transitions in the state machine example referable by giving them a name. Don’t forget to bind your implementation in your runtime module.
public class FowlerDslQualifiedNameProvider
        extends DefaultDeclarativeQualifiedNameProvider {

    public QualifiedName qualifiedName(Transition t) {
        if (t.getEvent() == null || !(t.eContainer() instanceof State))
            return null;
        else
            return QualifiedName.create(((State) t.eContainer()).getName(),
                    t.getEvent().getName());
    }
}
The ImportedNamespaceAwareLocalScopeProvider looks up EAttributes with name ‘importedNamespace’ and interprets them as import statements.
Root:
imports+=Import*
childs+=Child*;
Import:
'import' importedNamespace=QualifiedName ('.*')?;
QualifiedName:
ID ('.' ID)*;
By default, qualified names with or without a wildcard at the end are supported. For an import of a qualified name the simple name is made available, as we know it from e.g. Java, where import java.util.Set; makes it possible to refer to java.util.Set by its simple name Set. Contrary to Java, the import is not active for the whole file, but only for the namespace it is declared in and its child namespaces. That is why you can write the following in the example DSL:
package foo {
import bar.Foo
entity Bar extends Foo {
}
}
package bar {
entity Foo {}
}
Of course the declared elements within a package are as well referable by their simple name:
package bar {
entity Bar extends Foo {}
entity Foo {}
}
The following would also be ok:
package bar {
entity Bar extends bar.Foo {}
entity Foo {}
}
See the JavaDocs and this blog post for details.
Value converters are registered to convert the parsed text into a data type instance and vice versa. The primary hook is the IValueConverterService, and the concrete implementation can be registered via the runtime Guice module. Simply override the corresponding binding in your runtime module as shown in this example:
@Override
public Class<? extends IValueConverterService>
bindIValueConverterService() {
return MySpecialValueConverterService.class;
}
The simplest way to register additional value converters is to make use of AbstractDeclarativeValueConverterService, which allows to declaratively register an IValueConverter by means of an annotated method.
@ValueConverter(rule = "MyRuleName")
public IValueConverter<MyDataType> getMyRuleNameConverter() {
return new MyValueConverterImplementation();
}
If you use the common terminals grammar org.eclipse.xtext.common.Terminals, you should extend the DefaultTerminalConverters and override or add value converters by adding the respective methods. In addition to the explicitly defined converters in the default implementation, a delegating converter is registered for each available EDataType. The delegating converter reuses the functionality of the corresponding EMF EFactory.
Many languages introduce a concept for qualified names, i.e. names composed of namespaces separated by a delimiter. Since this is such a common use case, Xtext provides an extensible converter implementation for qualified names. The QualifiedNameValueConverter handles comments and white space gracefully and is capable to use the appropriate value converter for each segment of a qualified name. This allows for individually quoted segments. The domainmodel example shows how to use it.
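For instance, a converter for a QualifiedName rule could be registered like this (a sketch, assuming your converter service extends AbstractDeclarativeValueConverterService and the rule is called QualifiedName):

@Inject
private QualifiedNameValueConverter qualifiedNameValueConverter;

@ValueConverter(rule = "QualifiedName")
public IValueConverter<String> getQualifiedNameConverter() {
    return qualifiedNameValueConverter;
}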
The protocol of an IValueConverter allows throwing a ValueConverterException if something goes wrong. The exception is propagated as a syntax error by the parser or as a validation problem by the ConcreteSyntaxValidator if the value cannot be converted to a valid string. The AbstractLexerBasedConverter is useful when implementing a custom value converter. If the converter needs to know about the rule that it currently works with, it may implement the interface RuleSpecific. The framework will set the rule so that the implementation may use it afterwards.
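As an illustration, a converter for a hypothetical HEX terminal rule (matching strings such as 0xA3) might look like this sketch:

public class HexConverter extends AbstractLexerBasedConverter<Integer> {

    @Override
    public Integer toValue(String string, INode node) throws ValueConverterException {
        try {
            // strip the assumed "0x" prefix and parse the remainder
            return Integer.valueOf(string.substring(2), 16);
        } catch (Exception e) {
            throw new ValueConverterException("Not a valid hex number: " + string, node, e);
        }
    }

    @Override
    protected String toEscapedString(Integer value) {
        return "0x" + Integer.toHexString(value);
    }
}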
Serialization is the process of transforming an EMF model into its textual representation. Thereby, serialization complements parsing and lexing.
In Xtext, the process of serialization is split into the following steps:
Serialization is invoked when calling XtextResource.save(..). Furthermore, the Serializer provides resource-independent support for serialization. Another situation that triggers serialization is applying quick fixes with semantic modifications. Serialization is not called when a textual editor’s contents are saved to disk.
The contract of serialization says that a model which is saved (serialized) to its textual representation and then loaded (parsed) again yields a new model that is equal to the original model. Please be aware that this does not imply that loading a textual representation and serializing it back produces identical textual representations. However, the serialization algorithm tries to restore as much information as possible. That is, if the parsed model was not modified in-memory, the serialized output will usually be equal to the previous input. Unfortunately, this cannot be ensured for each and every case. A use case where this is hardly possible is shown in the following example:
MyRule:
(xval+=ID | yval+=INT)*;
The given MyRule reads ID- and INT-elements which may occur in an arbitrary order in the textual representation. However, when serializing the model all ID-elements will be written first and then all INT-elements. If the order is important it can be preserved by storing all elements in the same list - which may require wrapping the ID- and INT-elements into other objects.
A serialized document represents the state of the semantic model. However, if there is a node model available (i.e. the semantic model has been created by the parser), the serializer
The parse tree constructor usually does not need to be customized since it is automatically derived from the Xtext Grammar. However, it can be helpful to look into it to understand its error messages and its runtime performance.
For serialization to succeed, the parse tree constructor must be able to consume every non-transient element of the to-be-serialized EMF model. To consume means, in this context, to write the element to the textual representation of the model. This can turn out to be a not-so-easy-to-fulfill requirement, since a grammar usually introduces implicit constraints to the EMF model as explained for the concrete syntax validator.
If a model can not be serialized, an XtextSerializationException is thrown. Possible reasons are listed below:
To understand error messages and performance issues of the parse tree constructor, it is important to know that it implements a backtracking algorithm. This basically means that the grammar is used to specify the structure of a tree in which one path (from the root node to a leaf node) is a valid serialization of a specific model. The parse tree constructor’s task is to find this path, with the condition that all model elements are consumed while walking it. The parse tree constructor’s strategy is to take the most promising branch first (the one that would consume the most model elements). If the branch leads to a dead end (for example, if a model element needs to be consumed that is not present in the model), the parse tree constructor goes back along the path until a different branch can be taken. This behavior has two consequences:
SaveOptions can be passed to XtextResource.save(options) and to Serializer.serialize(..). Available options are:

- formatting (default: false). If enabled, it is the formatter’s job to determine all white space information during serialization. If disabled, the formatter only defines white space information for the places in which no white space information can be preserved from the node model, e.g. when new model elements are inserted or there is no node model.
- validating (default: true): Run the concrete syntax validator before serializing the model.

The ICommentAssociater associates comments with semantic objects. This is important in case an element in the semantic model is moved to a different position and the model is serialized: one expects the comments to be moved to the new position in the document as well.
Which comment belongs to which semantic object is surely a very subjective issue. The default implementation behaves as follows, but can be customized:
Transient values are values or model elements which are not persisted (i.e. not written to the textual representation during the serialization phase). If a model contains model elements which cannot be serialized with the current grammar, it is critical to mark them transient using the ITransientValueService, or serialization will fail. The default implementation marks all model elements transient for which eStructuralFeature.isTransient() returns true or eObject.eIsSet(eStructuralFeature) returns false. By default, EMF returns false for eIsSet(..) if the value equals the default value.
If there are calls of data type rules or terminal rules that do not reside in an assignment, the serializer by default doesn’t know which value to use for serialization.
Example:
PluralRule:
'contents:' count=INT Plural;
terminal Plural:
'item' | 'items';
Valid models for this example are contents: 1 item or contents: 5 items. However, it is not stored in the semantic model whether the keyword item or items has been parsed. This is due to the fact that the rule call Plural is unassigned. However, the parse tree constructor needs to decide which value to write during serialization. This decision can be made by customizing IValueSerializer.serializeUnassignedValue(EObject, RuleCall, INode).
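The following sketch illustrates the idea; it assumes that the default implementation ValueSerializer is subclassed and rebound in the runtime module, and that getCount() is the getter generated for the count attribute above:

public class PluralValueSerializer extends ValueSerializer {

    @Override
    public String serializeUnassignedValue(EObject context, RuleCall ruleCall, INode node) {
        if (node == null && "Plural".equals(ruleCall.getRule().getName())) {
            // no node model available: derive the keyword from the semantic model
            return ((PluralRule) context).getCount() == 1 ? "item" : "items";
        }
        return super.serializeUnassignedValue(context, ruleCall, node);
    }
}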
The cross-reference serializer specifies which values are to be written to the textual representation for cross-references. This behavior can be customized by implementing ICrossReferenceSerializer. The default implementation delegates to various other services such as the IScopeProvider or the LinkingHelper each of which may be the better place for customization.
After the parse tree constructor has done its job of creating a stream of tokens which are to be written to the textual representation, and the comment associator has done its work, existing white space from the node model is merged into the stream.
The strategy is as follows: If two tokens follow each other in the stream and the corresponding nodes in the node model follow each other as well, then the white space information in between is kept. In all other cases it is up to the formatter to calculate new white space information.
The parse tree constructor and the formatter use an ITokenStream for their output; the formatter uses one for its input as well. This allows chaining the two components. Token streams can be converted to a String using the TokenStringBuffer and to a Writer using the WriterTokenStream.
public interface ITokenStream {

    void flush() throws IOException;

    void writeHidden(EObject grammarElement, String value);

    void writeSemantic(EObject grammarElement, String value);
}
Formatting is the process of rearranging the text in a document to improve the readability without changing the semantic value of the document. Therefore, a formatter is responsible for arranging line-wraps, indentation, whitespace, etc. in a text to emphasize its structure. A formatter is not supposed to alter a document in a way that impacts the semantic model.
The new formatting API is available since Xtext 2.8. It resolves the limitations of the first API, which had been present since the first version of Xtext. The new API allows to implement formatting not only based on the static structure of the grammar, but also to make decisions based on the actual model structure. Things that are now possible include:
The actual formatting is done by constructing a list of text replacements. A text replacement describes a new text which should replace an existing part of the document. This is described by offset and length. Applying the text replacements turns the unformatted document into a formatted document.
To invoke the formatter programmatically, you need to instantiate a request and pass it to the formatter. The formatter will return a list of text replacements. The document modification itself can be performed by a utility that is part of the formatting API.
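A programmatic invocation could look like this sketch (the collaborators are assumed to be injected, and the region access is built from the resource’s node model):

@Inject Provider<FormatterRequest> requestProvider;
@Inject IFormatter2 formatter;
@Inject TextRegionAccessBuilder regionAccessBuilder;

public List<ITextReplacement> computeReplacements(XtextResource resource) {
    ITextRegionAccess regionAccess = regionAccessBuilder.forNodeModel(resource).create();
    FormatterRequest request = requestProvider.get();
    request.setTextRegionAccess(regionAccess);
    return formatter.format(request);
}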
Implementors of a formatter should extend AbstractFormatter2 and add dispatch methods for the model elements that should be formatted. The format routine has to be invoked recursively if the children of an object should be formatted, too.
The following example illustrates that pattern. An instance of package declaration is passed to the format method along with the current formattable document. In this scenario, the package name is surrounded by a single space, the curly brace is followed by a new line and increased indentation, etc. All elements within that package should be formatted, too, thus format(..) is invoked on these as well.
def dispatch void format(PackageDeclaration p, extension IFormattableDocument doc) {
    p.regionForFeature(ABSTRACT_ELEMENT__NAME).surround[oneSpace]
    p.regionForKeyword("{").append[newLine; increaseIndentation]
    for (AbstractElement element : p.elements) {
        format(element, doc);
        element.append[setNewLines(1, 1, 2)]
    }
    p.regionForKeyword("}").prepend[decreaseIndentation]
}
The API is designed in a way that allows to describe the formatting in a declarative manner by calling methods on the IHiddenRegionFormatter, which is made available inside invocations of prepend, surround or append, to specify the formatting rules. This can be done in arbitrary order; the infrastructure will reorder all the configurations to execute them from top to bottom of the document. If the configuration-based approach is not sufficient for a particular use case, the document also accepts imperative logic that is associated with a given range. The ITextReplacer that can be added directly to the document allows to perform all kinds of modifications to the text in the region that it is associated with.
More detailed information about the API is available as JavaDoc on the org.eclipse.xtext.formatting2 package.
The API in org.eclipse.xtext.formatting is available since the early days of Xtext and is still present in Xtext 2.8. However, it will be deprecated and eventually removed because of the limitations it imposes due to its declarative and static approach. New formatting implementations should be based on the new API in org.eclipse.xtext.formatting2.
A formatter can be implemented via the IFormatter service. Technically speaking, a formatter is a Token Stream which inserts/removes/modifies hidden tokens (white space, line-breaks, comments).
The formatter is invoked during the serialization phase and when the user triggers formatting in the editor (for example, using the CTRL+SHIFT+F shortcut).
Xtext ships with two formatters: the OneWhitespaceFormatter, which simply writes one white space between all tokens, and the declarative formatter described below.
A declarative formatter can be implemented by subclassing AbstractDeclarativeFormatter, as shown in the following example:
public class ExampleFormatter extends AbstractDeclarativeFormatter {
@Override
protected void configureFormatting(FormattingConfig c) {
ExampleLanguageGrammarAccess f = getGrammarAccess();
c.setAutoLinewrap(120);
// find common keywords and specify formatting for them
for (Pair<Keyword, Keyword> pair : f.findKeywordPairs("(", ")")) {
c.setNoSpace().after(pair.getFirst());
c.setNoSpace().before(pair.getSecond());
}
for (Keyword comma : f.findKeywords(",")) {
c.setNoSpace().before(comma);
}
// formatting for grammar rule Line
c.setLinewrap(2).after(f.getLineAccess().getSemicolonKeyword_1());
c.setNoSpace().before(f.getLineAccess().getSemicolonKeyword_1());
// formatting for grammar rule TestIndentation
c.setIndentationIncrement().after(
f.getTestIndentationAccess().getLeftCurlyBracketKeyword_1());
c.setIndentationDecrement().before(
f.getTestIndentationAccess().getRightCurlyBracketKeyword_3());
c.setLinewrap().after(
f.getTestIndentationAccess().getLeftCurlyBracketKeyword_1());
c.setLinewrap().after(
f.getTestIndentationAccess().getRightCurlyBracketKeyword_3());
// formatting for grammar rule Param
c.setNoLinewrap().around(f.getParamAccess().getColonKeyword_1());
c.setNoSpace().around(f.getParamAccess().getColonKeyword_1());
// formatting for Comments
c.setLinewrap(0, 1, 2).before(f.getSL_COMMENTRule());
c.setLinewrap(0, 1, 2).before(f.getML_COMMENTRule());
c.setLinewrap(0, 1, 1).after(f.getML_COMMENTRule());
}
}
The formatter has to implement the method configureFormatting(...) which declaratively sets up a FormattingConfig.
The FormattingConfig consists of general settings and a set of formatting instructions.

setAutoLinewrap(int) defines the amount of characters after which a line-break should be dynamically inserted between two tokens; the default is 80. The instructions setNoLinewrap().???(), setNoSpace().???() and setSpace(space).???() suppress this behavior locally.
By default, the declarative formatter inserts one white space between two tokens. Instructions can be used to specify a different behavior. They consist of two parts: when to apply the instruction and what to do.
To understand when an instruction is applied, think of a stream of tokens where each token is associated with the corresponding grammar element. The instructions are matched against these grammar elements. The following matching criteria exist:
after(element): The instruction is applied after the grammar element has been matched. For example, if your grammar uses the keyword ";" to end lines, this can instruct the formatter to insert a line break after the semicolon.
before(element): The instruction is executed before the matched element. For example, if your grammar contains lists which separate their values with the keyword ",", you can instruct the formatter to suppress the white space before the comma.
around(element): This is the same as before(element) combined with after(element).
between(left, right): This matches if right directly follows left in the document. There may be no other tokens in between left and right.
bounds(left, right): This is the same as after(left) combined with before(right).
range(start, end): The rule is enabled when start is matched and disabled when end is matched. Thereby, the rule is active for the complete region which is surrounded by start and end.
Note that the term token is used slightly differently here compared to the parser/lexer: here, a token is a keyword or the string that is matched by a terminal rule, data type rule or cross-reference. In the terminology of the lexer, a data type rule can match a composition of multiple tokens.
The parameter element can be a grammar’s AbstractElement or a grammar’s AbstractRule. All grammar rules and almost all abstract elements can be matched. This includes rule calls, parser rules, groups and alternatives. The semantics of before(element), after(element), etc. for rule calls and parser rules correspond to the moment at which the parser “passes” this part of the grammar. The stack of called rules is taken into account. The following abstract elements cannot have formatting instructions assigned: actions, e.g. {MyAction} or {MyAction.myFeature=current}.
After having explained how rules can be activated, this is what they can do:
setIndentationIncrement(): Increments indentation by one unit at this position. Whether one unit consists of one tab character or a number of spaces is defined by IIndentationInformation. The default implementation consults Eclipse’s IPreferenceStore.
setIndentationDecrement(): Decrements indentation by one unit.
setLinewrap(): Inserts a line wrap at this position.
setLinewrap(int count): Inserts count line wraps at this position.
setLinewrap(int min, int def, int max): If the number of line wraps that existed at this position before formatting can be determined (i.e. when a node model is present), that number is adjusted to lie within the interval [min, max] and is then reused. In all other cases def line wraps are inserted. Example: setLinewrap(0, 0, 1) will preserve existing line wraps, but won’t allow more than one line wrap between two tokens.
setNoLinewrap(): Suppresses the automatic line wrap that may occur when the line’s length exceeds the defined limit.
setSpace(String space): Inserts the string space at this position. If you use this to insert something other than white space, tabs or newlines, a small puppy will die somewhere in this world.
setNoSpace(): Suppresses the white space between tokens at this position. Be aware that between some tokens a white space is required to maintain a valid concrete syntax.
Sometimes, if a grammar contains many similar elements for which the same formatting instructions ought to apply, it can be tedious to specify them for each grammar element individually. The IGrammarAccess provides convenience methods for this. The find methods are available for the grammar and for each parser rule.
findKeywords(String... keywords): Returns all keywords that equal one of the parameters.
findKeywordPairs(String leftKw, String rightKw): Returns tuples of keywords from the same grammar rule. Pairs are matched nested and sequentially. Example: for Rule: '(' name=ID ('(' foo=ID ')') ')' | '(' bar=ID ')' the call findKeywordPairs("(", ")") returns three pairs.
Although inter-Xtext linking is not done by URIs, you may want to be able to reference your EObject from non-Xtext models. In those cases URIs are used, which are made up of a part identifying the resource and a second part that points to an object. Each EObject contained in a resource can be identified by a so-called fragment.
A fragment is a part of an EMF URI and needs to be unique per resource.
The generic resource shipped with EMF provides a generic path-like computation of fragments. These fragment paths are unique by default and do not have to be serialized. On the other hand, they can be easily broken by reordering the elements in a resource.
With an XMI or other binary-like serialization it is also common and possible to use UUIDs. UUIDs are usually binary and technical, so you don’t want to deal with them in human readable representations.
However, with a textual concrete syntax we want to be able to compute fragments from the human-readable information. We don’t want to force people to use UUIDs (i.e. synthetic identifiers) or fragile, relative, generic paths in order to refer to EObjects.
Therefore one can contribute an IFragmentProvider per language. It has two methods: getFragment(EObject, Fallback) to calculate the fragment of an EObject and getEObject(Resource, String, Fallback) to go the opposite direction. The Fallback interface allows delegating to the default strategy, which usually uses the fragment paths described above.
The following snippet shows how to use qualified names as fragments:
public class QualifiedNameFragmentProvider implements IFragmentProvider {

  @Inject
  private IQualifiedNameProvider qualifiedNameProvider;

  public String getFragment(EObject obj, Fallback fallback) {
    QualifiedName qName = qualifiedNameProvider.getFullyQualifiedName(obj);
    return qName != null ? qName.toString() : fallback.getFragment(obj);
  }

  public EObject getEObject(Resource resource,
                            String fragment,
                            Fallback fallback) {
    if (fragment != null) {
      Iterator<EObject> i = EcoreUtil.getAllContents(resource, false);
      while (i.hasNext()) {
        EObject eObject = i.next();
        String candidateFragment = (eObject.eIsProxy())
            ? ((InternalEObject) eObject).eProxyURI().fragment()
            : getFragment(eObject, fallback);
        if (fragment.equals(candidateFragment))
          return eObject;
      }
    }
    return fallback.getEObject(fragment);
  }
}
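With such a provider in place, EObjects can be resolved through standard EMF API. The following is a minimal sketch, assuming a hypothetical model file model.mydsl and qualified name my.model.Element; the resource set must originate from the language’s injector so that the custom IFragmentProvider is actually in effect:
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.resource.ResourceSet;

public class FragmentLookupExample {

  // Resolves an EObject from a URI whose fragment is a qualified name,
  // e.g. "platform:/resource/project/model.mydsl#my.model.Element".
  public static EObject resolve(ResourceSet resourceSet, String uriString) {
    URI uri = URI.createURI(uriString);
    // EMF locates the resource by the part before '#' and hands the
    // fragment to Resource#getEObject(String), which an XtextResource
    // forwards to the registered IFragmentProvider.
    return resourceSet.getEObject(uri, true);
  }
}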
For performance reasons it is usually a good idea to navigate the resource based on the fragment information instead of traversing it completely. If you know that your fragment is computed from qualified names and your model contains something like NamedElements, you should split your fragment into those parts and query the root elements, the children of the best match and so on.
Furthermore it’s a good idea to have some kind of conflict resolution strategy to be able to distinguish between equally named elements that actually are different, e.g. properties may have the very same qualified name as entities.
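To sketch that idea, the following hypothetical helper assumes a NamedElement type with a getName() method and navigable children (neither is part of Xtext); it matches one fragment segment per containment level instead of iterating over all contents, and it leaves conflict resolution aside:
import java.util.List;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.resource.Resource;

public class SegmentWiseLookup {

  // Hypothetical supertype of the language's named elements; not part of Xtext.
  public interface NamedElement {
    String getName();
    List<? extends EObject> getChildren();
  }

  // Resolves a fragment like "a.b.c" by matching one segment per containment level.
  public static EObject find(Resource resource, String fragment) {
    List<? extends EObject> candidates = resource.getContents();
    EObject match = null;
    for (String segment : fragment.split("\\.")) {
      match = null;
      for (EObject candidate : candidates) {
        if (candidate instanceof NamedElement
            && segment.equals(((NamedElement) candidate).getName())) {
          match = candidate;
          candidates = ((NamedElement) candidate).getChildren();
          break;
        }
      }
      if (match == null) {
        return null; // no element matches this segment
      }
    }
    return match;
  }
}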
Encoding, aka character set, describes the way characters are encoded into bytes and vice versa. Famous standard encodings are UTF-8 or ISO-8859-1. The list of available encodings can be determined by calling Charset.availableCharsets(). There is also a list of encodings and their canonical Java names in the API docs.
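For instance, the following snippet prints the canonical names of all encodings supported by the running JVM:
import java.nio.charset.Charset;

public class ListCharsets {
  public static void main(String[] args) {
    // Prints the canonical name of every charset this JVM supports
    for (String name : Charset.availableCharsets().keySet()) {
      System.out.println(name);
    }
  }
}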
Unfortunately, each platform and/or spoken language tends to define its own native encoding, e.g. Cp1258 on Windows in Vietnamese or MacIceland on Mac OS X in Icelandic.
In an Eclipse workspace, files, folders, and projects can have individual encodings, which are stored in the hidden file .settings/org.eclipse.core.resources.prefs in each project. If a resource does not have an explicit encoding, it inherits the one from its parent recursively. Eclipse chooses the native platform encoding as the default for the workspace root. You can change the default workspace encoding in the Eclipse preferences under Preferences → Workspace → Default text encoding. If you develop on different platforms, you should consider choosing an explicit common encoding for your text or code files, especially if you use special characters.
While Eclipse allows you to define and inspect the encoding of a file, your file system usually doesn’t. Given an arbitrary text file, there is no general strategy to tell how it was encoded. If you deploy an Eclipse project as a jar (even a plug-in), any encoding information not stored in the file itself is lost, too. Some languages define the encoding of a file explicitly, as in the first processing instruction of an XML file. Most languages don’t. Others imply a fixed encoding or offer enhanced syntax for character literals, e.g. the Unicode escape sequences \uXXXX in Java.
As Xtext is about textual modeling, it allows tweaking the encoding in various places.
The plug-ins created by the New Xtext Project wizard are by default encoded in the workspace’s standard encoding. The same holds for all files that Xtext generates in there. If you want to change that, e.g. because your grammar uses/allows special characters, you should manually set the encoding in the properties of these projects after their creation. Do this before adding special characters to your grammar or at least make sure the grammar reads correctly after the encoding change. To tell the Xtext generator to generate files in the same encoding, set the encoding property in the workflow next to your grammar, e.g.
Generator {
    encoding = "UTF-8"
    ...
}
As each language could handle the encoding problem differently, Xtext offers a service here. The IEncodingProvider has a single method getEncoding(URI)
to define the encoding of the resource with the given URI. Users can implement their own strategy, but keep in mind that this is not intended to be a long-running method. If the encoding is stored within the model file itself, it should be extractable in an easy way, like from the first line of an XML file. The default implementation returns the default Java character set in the runtime scenario.
In the UI scenario, when there is a workspace, users will expect the encoding of the model files to be settable the same way as for other files in the workspace. The default implementation of the IEncodingProvider in the UI scenario therefore returns the file’s workspace encoding for files in the workspace and delegates to the runtime implementation for all other resources, e.g. models in a jar or from a deployed plug-in. Keep in mind that you are going to lose the workspace encoding information as soon as you leave this workspace, e.g. deploy your project.
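A custom provider can be as simple as the following sketch; the class name MyEncodingProvider and the fixed UTF-8 answer are assumptions for illustration only:
import org.eclipse.emf.common.util.URI;
import org.eclipse.xtext.parser.IEncodingProvider;

// Hypothetical provider that enforces UTF-8 for all models of the language.
public class MyEncodingProvider implements IEncodingProvider {

  @Override
  public String getEncoding(URI uri) {
    // Keep this fast: it is consulted for every resource that is loaded.
    return "UTF-8";
  }
}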
Unless you want to enforce a uniform encoding for all models of your language, we advise overriding the runtime service only. It is bound in the runtime module using the binding annotation @Runtime:
@Override
public void configureRuntimeEncodingProvider(Binder binder) {
    binder.bind(IEncodingProvider.class)
        .annotatedWith(DispatchingProvider.Runtime.class)
        .to(MyEncodingProvider.class);
}
For the uniform encoding, bind the plain IEncodingProvider to the same implementation in both modules:
@Override
public Class<? extends IEncodingProvider> bindIEncodingProvider() {
    return MyEncodingProvider.class;
}
An XtextResource uses the IEncodingProvider of your language by default. You can override that by passing an option on load and save, e.g.
Map<Object, Object> options = new HashMap<Object, Object>();
options.put(XtextResource.OPTION_ENCODING, "UTF-8");
myXtextResource.load(options);
options.put(XtextResource.OPTION_ENCODING, "ISO-8859-1");
myXtextResource.save(options);
The SimpleProjectWizardFragment generates a wizard that clients of your language can use to create model projects. This wizard expects its templates to be in the encoding of the Generator that created it (see above). As for every new project wizard, its output will be encoded in the default encoding of the target workspace. If your language enforces a special encoding that ignores the workspace settings, you’ll have to make sure yourself that your wizard uses the right encoding.
The source code of the Xtext framework itself is completely encoded in ISO 8859-1, which is necessary to make the Xpand templates work everywhere (they use French quotation marks as delimiters). That encoding is hard-coded into the Xtext generator code. You are likely never going to change that.
Automated tests are crucial for the maintainability and the quality of a software product. That is why it is strongly recommended to write unit tests for your language, too. The Xtext project wizard creates a test project for that purpose. It simplifies the setup procedure both for Eclipse-agnostic tests and for UI tests with JUnit4.
The following is about testing the parser and the linker for the Domainmodel language from the tutorial. It leverages Xtend to write the test case.
First of all, a new Xtend class has to be created. To do so, choose the src folder of the test plug-in, and select New → Xtend Class from the context menu. Provide a meaningful name and enter the package before you hit Finish.
The core of the test infrastructure is the XtextRunner and the language-specific IInjectorProvider. Both have to be provided by means of class annotations:
import org.eclipse.xtext.junit4.InjectWith
import org.eclipse.xtext.junit4.XtextRunner
import org.example.domainmodel.DomainmodelInjectorProvider
import org.junit.runner.RunWith

@InjectWith(DomainmodelInjectorProvider)
@RunWith(XtextRunner)
class ParserTest {
}
This configuration will make sure that you can use dependency injection in your test class, and that the global EMF registries are properly populated before and cleaned up after each test.
The class org.eclipse.xtext.junit4.util.ParseHelper allows parsing an arbitrary string into an AST model. The AST model itself can be traversed and checked afterwards. A static import of Assert leads to concise and readable test cases.
import org.eclipse.xtext.junit4.util.ParseHelper

import static org.junit.Assert.*

...

@Inject
ParseHelper<Domainmodel> parser

@Test
def void parseDomainmodel() {
    val model = parser.parse('''
        entity MyEntity {
            parent: MyEntity
        }
    ''')
    val entity = model.elements.head as Entity
    assertSame(entity, entity.features.head.type)
}
If, in addition to the main language, your tests require other languages for references from/to your main language, you’ll have to parse and load the dependent resources into the same ResourceSet first for cross-reference resolution to work.
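As a rough sketch of this, the following test (written in Java; the Xtend version is analogous) parses two models into one ResourceSet via the ParseHelper overload that accepts a ResourceSet. For brevity both files use the same hypothetical Domainmodel notation from above; for references into a second language you would additionally need the injector provider subclass shown below:
import com.google.inject.Inject;
import com.google.inject.Provider;
import org.eclipse.xtext.junit4.InjectWith;
import org.eclipse.xtext.junit4.XtextRunner;
import org.eclipse.xtext.junit4.util.ParseHelper;
import org.eclipse.xtext.resource.XtextResourceSet;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(XtextRunner.class)
@InjectWith(DomainmodelInjectorProvider.class)
public class CrossResourceTest {

  @Inject
  private ParseHelper<Domainmodel> parser;
  @Inject
  private Provider<XtextResourceSet> resourceSetProvider;

  @Test
  public void parseWithReference() throws Exception {
    XtextResourceSet resourceSet = resourceSetProvider.get();
    // first parse the model that the second one refers to (hypothetical syntax)
    parser.parse("entity Base { }", resourceSet);
    // cross-references into the first resource can now be resolved
    Domainmodel model = parser.parse("entity Sub { parent: Base }", resourceSet);
  }
}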
As your main language’s default generated IInjectorProvider (e.g. DomainmodelInjectorProvider) does not know about any other such dependent languages, they must be initialized explicitly. The recommended pattern for this is to create a new subclass of the generated MyLanguageInjectorProvider in your *.test project and make sure the dependent language is initialized properly. You can then use this new injector provider instead of the original one in your test’s @InjectWith:
class MyLanguageWithDependenciesInjectorProvider extends MyLanguageInjectorProvider {
    override internalCreateInjector() {
        MyOtherLangLanguageStandaloneSetup.doSetup
        return super.internalCreateInjector
    }
}
@RunWith(XtextRunner)
@InjectWith(MyLanguageWithDependenciesInjectorProvider)
class YourTest {
    ...
}
You should not put injector creation for referenced languages in your standalone setup. Note that for the headless code generation use case, the Maven plug-in is configured with multiple setups, so usually there is no problem there.
You may also need to initialize ‘import’-ed Ecore models that are not generated by your Xtext language. This should be done by using an explicit MyModelPackage.eINSTANCE.getName(); in the doSetup() method of the respective language’s StandaloneSetup class. Note that it is strongly recommended to follow this pattern instead of just using @Before methods in your *Test class, because due to internal technical reasons the latter won’t work anymore as soon as you have more than one @Test.
class MyLanguageStandaloneSetup extends MyLanguageStandaloneSetupGenerated {
    def static void doSetup() {
        if (!EPackage.Registry.INSTANCE.containsKey(MyPackage.eNS_URI))
            EPackage.Registry.INSTANCE.put(MyPackage.eNS_URI, MyPackage.eINSTANCE)
        new MyLanguageStandaloneSetup().createInjectorAndDoEMFRegistration
    }
}
This only applies to referenced ‘import’-ed Ecore models and languages based on them that may be used in the test. The dependencies inherited from mixed-in grammars are already listed in the generated super class, and nothing needs to be done for those.