| =============================== | 
 | ORC Design and Implementation | 
 | =============================== | 
 |  | 
 | .. contents:: | 
 |    :local: | 
 |  | 
 | Introduction | 
 | ============ | 
 |  | 
 | This document aims to provide a high-level overview of the design and | 
 | implementation of the ORC JIT APIs. Except where otherwise stated all discussion | 
 | refers to the modern ORCv2 APIs (available since LLVM 7). Clients wishing to | 
 | transition from ORCv1 should see Section :ref:`transitioning_orcv1_to_orcv2`. | 
 |  | 
 | Use-cases | 
 | ========= | 
 |  | 
 | ORC provides a modular API for building JIT compilers. There are a number | 
 | of use cases for such an API. For example: | 
 |  | 
 | 1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions | 
 | compiled from a toy language: Kaleidoscope. | 
 |  | 
 | 2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression | 
 | evaluation. In this use case, cross compilation allows expressions compiled | 
 | in the debugger process to be executed on the debug target process, which may | 
 | be on a different device/architecture. | 
 |  | 
 | 3. High-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's | 
 | optimizations within an existing JIT infrastructure. | 
 |  | 
 | 4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter. | 
 |  | 
 | By adopting a modular, library-based design we aim to make ORC useful in as many | 
 | of these contexts as possible. | 
 |  | 
 | Features | 
 | ======== | 
 |  | 
 | ORC provides the following features: | 
 |  | 
 | **JIT-linking** | 
 |   ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_ | 
 |   into a target process at runtime. The target process may be the same process | 
 |   that contains the JIT session object and jit-linker, or may be another process | 
 |   (even one running on a different machine or architecture) that communicates | 
 |   with the JIT via RPC. | 
 |  | 
 | **LLVM IR compilation** | 
 |   ORC provides off the shelf components (IRCompileLayer, SimpleCompiler, | 
 |   ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process. | 
 |  | 
 | **Eager and lazy compilation** | 
 |   By default, ORC will compile symbols as soon as they are looked up in the JIT | 
 |   session object (``ExecutionSession``). Compiling eagerly by default makes it | 
 |   easy to use ORC as an in-memory compiler for an existing JIT (similar to how | 
 |   MCJIT is commonly used). However ORC also provides built-in support for lazy | 
 |   compilation via lazy-reexports (see :ref:`Laziness`). | 
 |  | 
 | **Support for Custom Compilers and Program Representations** | 
 |   Clients can supply custom compilers for each symbol that they define in their | 
 |   JIT session. ORC will run the user-supplied compiler when a definition of | 
 |   a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not | 
 |   treated specially, and is supported via the same wrapper mechanism (the | 
 |   ``MaterializationUnit`` class) that is used for custom compilers. | 
 |  | 
 | **Concurrent JIT'd code** and **Concurrent Compilation** | 
 |   JIT'd code may be executed in multiple threads, may spawn new threads, and may | 
 |   re-enter ORC (e.g. to request lazy compilation) concurrently from multiple | 
 |   threads. Compilers launched by ORC can run concurrently (provided the client | 
 |   sets up an appropriate dispatcher). Built-in dependency tracking ensures that | 
 |   ORC does not release pointers to JIT'd code or data until all dependencies | 
 |   have also been JIT'd and they are safe to call or use. | 
 |  | 
 | **Removable Code** | 
 |   Resources for JIT'd program representations can be removed once they are no | 
 |   longer needed (see `How to remove code`_). | 
 |  | 
 | **Orthogonality** and **Composability** | 
 |   Each of the features above can be used independently. It is possible to put | 
 |   ORC components together to make a non-lazy, in-process, single threaded JIT | 
 |   or a lazy, out-of-process, concurrent JIT, or anything in between. | 
 |  | 
 | LLJIT and LLLazyJIT | 
 | =================== | 
 |  | 
 | ORC provides two basic JIT classes off-the-shelf. These are useful both as | 
 | examples of how to assemble ORC components to make a JIT, and as replacements | 
 | for earlier LLVM JIT APIs (e.g. MCJIT). | 
 |  | 
 | The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support | 
 | compilation of LLVM IR and linking of relocatable object files. All operations | 
 | are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled | 
 | as soon as you attempt to look up its address). LLJIT is a suitable replacement | 
 | for MCJIT in most cases (note: some more advanced features, e.g. | 
 | JITEventListeners, are not supported yet). | 
 |  | 
 | The LLLazyJIT class extends LLJIT and adds a CompileOnDemandLayer to enable lazy | 
 | compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule | 
 | method, function bodies in that module will not be compiled until they are first | 
 | called. LLLazyJIT aims to provide a replacement for LLVM's original (pre-MCJIT) | 
 | JIT API. | 
 |  | 
 | LLJIT and LLLazyJIT instances can be created using their respective builder | 
 | classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a | 
 | module ``M`` loaded on a ThreadSafeContext ``Ctx``: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   // Try to detect the host arch and construct an LLJIT instance. | 
 |   auto JIT = LLJITBuilder().create(); | 
 |  | 
 |   // If we could not construct an instance, return an error. | 
 |   if (!JIT) | 
 |     return JIT.takeError(); | 
 |  | 
 |   // Add the module. | 
 |   if (auto Err = JIT->addIRModule(ThreadSafeModule(std::move(M), Ctx))) | 
 |     return Err; | 
 |  | 
 |   // Look up the JIT'd code entry point. | 
 |   auto EntrySym = JIT->lookup("entry"); | 
 |   if (!EntrySym) | 
 |     return EntrySym.takeError(); | 
 |  | 
 |   // Cast the entry point address to a function pointer. | 
 |   auto *Entry = (void(*)())EntrySym->getAddress(); | 
 |  | 
 |   // Call into JIT'd code. | 
 |   Entry(); | 
 |  | 
 | The builder classes provide a number of configuration options that can be | 
 | specified before the JIT instance is constructed. For example: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   // Build an LLLazyJIT instance that uses four worker threads for compilation, | 
 |   // and jumps to a specific error handler (rather than null) on lazy compile | 
 |   // failures. | 
 |  | 
 |   void handleLazyCompileFailure() { | 
 |     // JIT'd code will jump here if lazy compilation fails, giving us an | 
 |     // opportunity to exit or throw an exception into JIT'd code. | 
 |     throw JITFailed(); | 
 |   } | 
 |  | 
 |   auto JIT = LLLazyJITBuilder() | 
 |                .setNumCompileThreads(4) | 
 |                .setLazyCompileFailureAddr( | 
 |                    toJITTargetAddress(&handleLazyCompileFailure)) | 
 |                .create(); | 
 |  | 
 |   // ... | 
 |  | 
 | For users wanting to get started with LLJIT a minimal example program can be | 
 | found at ``llvm/examples/HowToUseLLJIT``. | 
 |  | 
 | Design Overview | 
 | =============== | 
 |  | 
 | ORC's JIT program model aims to emulate the linking and symbol resolution | 
 | rules used by the static and dynamic linkers. This allows ORC to JIT | 
 | arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g. | 
 | clang) that uses constructs like symbol linkage and visibility, and weak [3]_ | 
 | and common symbol definitions. | 
 |  | 
 | To see how this works, imagine a program ``myapp`` which links against a pair | 
 | of dynamic libraries: ``libA`` and ``libB``. On the command line, building this | 
 | program might look like: | 
 |  | 
 | .. code-block:: bash | 
 |  | 
 |   $ clang++ -shared -o libA.dylib a1.cpp a2.cpp | 
 |   $ clang++ -shared -o libB.dylib b1.cpp b2.cpp | 
 |   $ clang++ -o myapp myapp.cpp -L. -lA -lB | 
 |   $ ./myapp | 
 |  | 
 | In ORC, this would translate into API calls on a hypothetical CXXCompilingLayer | 
 | (with error checking omitted for brevity) as: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   ExecutionSession ES; | 
 |   RTDyldObjectLinkingLayer ObjLinkingLayer( | 
 |       ES, []() { return std::make_unique<SectionMemoryManager>(); }); | 
 |   CXXCompilingLayer CXXLayer(ES, ObjLinkingLayer); | 
 |  | 
 |   // Create JITDylib "A" and add code to it using the CXX layer. | 
 |   auto &LibA = ES.createJITDylib("A"); | 
 |   CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp")); | 
 |   CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp")); | 
 |  | 
 |   // Create JITDylib "B" and add code to it using the CXX layer. | 
 |   auto &LibB = ES.createJITDylib("B"); | 
 |   CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp")); | 
 |   CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp")); | 
 |  | 
 |   // Create and specify the search order for the main JITDylib. This is | 
 |   // equivalent to a "links against" relationship in a command-line link. | 
 |   auto &MainJD = ES.createJITDylib("main"); | 
 |   MainJD.addToLinkOrder(LibA); | 
 |   MainJD.addToLinkOrder(LibB); | 
 |   CXXLayer.add(MainJD, MemoryBuffer::getFile("main.cpp")); | 
 |  | 
 |   // Look up the JIT'd main, cast it to a function pointer, then call it. | 
 |   auto MainSym = ExitOnErr(ES.lookup({&MainJD}, "main")); | 
 |   auto *Main = (int(*)(int, char*[]))MainSym.getAddress(); | 
 |  | 
 |   int Result = Main(...); | 
 |  | 
 | This example tells us nothing about *how* or *when* compilation will happen. | 
 | That will depend on the implementation of the hypothetical CXXCompilingLayer. | 
 | The same linker-based symbol resolution rules will apply regardless of that | 
 | implementation, however. For example, if a1.cpp and a2.cpp both define a | 
 | function "foo" then ORCv2 will generate a duplicate definition error. On the | 
 | other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different | 
 | dynamic libraries may define the same symbol). If main.cpp refers to "foo", it | 
 | should bind to the definition in LibA rather than the one in LibB, since | 
 | main.cpp is part of the "main" dylib, and the main dylib links against LibA | 
 | before LibB. | 
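
These resolution rules can be modelled without any LLVM code at all. The
following is a minimal sketch in plain C++, not ORC's implementation:
``ToyDylib``, its members, and the integer "addresses" are hypothetical
stand-ins for a JITDylib, and the sketch assumes that a dylib is searched
before the entries in its link order.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a JITDylib: a symbol table plus a link order.
struct ToyDylib {
  std::string Name;
  std::map<std::string, int> Symbols;      // symbol name -> toy "address"
  std::vector<const ToyDylib *> LinkOrder; // searched after this dylib

  explicit ToyDylib(std::string N) : Name(std::move(N)) {}

  // Returns false if this dylib already defines Sym (duplicate definition).
  bool define(const std::string &Sym, int Addr) {
    return Symbols.emplace(Sym, Addr).second;
  }

  // Searches this dylib first, then each dylib in the link order in turn.
  // Returns nullptr if no definition is found.
  const int *lookup(const std::string &Sym) const {
    auto I = Symbols.find(Sym);
    if (I != Symbols.end())
      return &I->second;
    for (const ToyDylib *JD : LinkOrder) {
      auto J = JD->Symbols.find(Sym);
      if (J != JD->Symbols.end())
        return &J->second;
    }
    return nullptr;
  }
};

// Replays the three cases from the text and returns the toy address that a
// reference to "foo" from the main dylib binds to.
inline int resolveFooFromMain() {
  ToyDylib LibA("A"), LibB("B"), Main("main");
  bool FirstOk = LibA.define("foo", 0xA1); // a1.cpp defines foo
  bool DupOk = LibA.define("foo", 0xA2);   // a2.cpp: duplicate within LibA
  bool OtherOk = LibB.define("foo", 0xB1); // b1.cpp: separate dylib, allowed
  assert(FirstOk && !DupOk && OtherOk);
  (void)FirstOk; (void)DupOk; (void)OtherOk;
  Main.LinkOrder = {&LibA, &LibB};
  return *Main.lookup("foo");
}
```

In this model ``resolveFooFromMain`` binds to LibA's definition because LibA
precedes LibB in the main dylib's link order.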
 |  | 
 | Many JIT clients will have no need for this strict adherence to the usual | 
 | ahead-of-time linking rules, and should be able to get by just fine by putting | 
 | all of their code in a single JITDylib. However, clients who want to JIT code | 
 | for languages/projects that traditionally rely on ahead-of-time linking (e.g. | 
 | C++) will find that this feature makes life much easier. | 
 |  | 
 | Symbol lookup in ORC serves two other important functions, beyond providing | 
 | addresses for symbols: (1) It triggers compilation of the symbol(s) searched for | 
 | (if they have not been compiled already), and (2) it provides the | 
 | synchronization mechanism for concurrent compilation. The pseudo-code for the | 
 | lookup process is: | 
 |  | 
 | .. code-block:: none | 
 |  | 
 |   construct a query object from a query set and query handler | 
 |   lock the session | 
 |   lodge query against requested symbols, collect required materializers (if any) | 
 |   unlock the session | 
 |   dispatch materializers (if any) | 
 |  | 
 | In this context a materializer is something that provides a working definition | 
 | of a symbol upon request. Usually materializers are just wrappers for compilers, | 
 | but they may also wrap a jit-linker directly (if the program representation | 
 | backing the definitions is an object file), or may even be a class that writes | 
 | bits directly into memory (for example, if the definitions are | 
 | stubs). Materialization is the blanket term for any actions (compiling, linking, | 
 | splatting bits, registering with runtimes, etc.) that are required to generate a | 
 | symbol definition that is safe to call or access. | 
 |  | 
 | As each materializer completes its work it notifies the JITDylib, which in turn | 
 | notifies any query objects that are waiting on the newly materialized | 
 | definitions. Each query object maintains a count of the number of symbols that | 
 | it is still waiting on, and once this count reaches zero the query object calls | 
 | the query handler with a *SymbolMap* (a map of symbol names to addresses) | 
 | describing the result. If any symbol fails to materialize the query immediately | 
 | calls the query handler with an error. | 
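
The counting behaviour of a query object can be sketched as follows. This is
an illustrative model in plain C++ only: ``ToyQuery``, ``ToySymbolMap``, and
the ``(bool, map)`` handler signature are assumptions made for the example and
do not mirror ORC's real classes, and the sketch ignores locking.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical result type: symbol name -> toy "address".
using ToySymbolMap = std::map<std::string, int>;

// Hypothetical stand-in for a query object.
class ToyQuery {
public:
  // The handler receives (Succeeded, Results).
  using Handler = std::function<void(bool, const ToySymbolMap &)>;

  ToyQuery(std::set<std::string> Symbols, Handler H)
      : Pending(std::move(Symbols)), OnComplete(std::move(H)) {}

  // Called when a materializer has produced a definition for Name.
  void notifyResolved(const std::string &Name, int Addr) {
    if (Done || Pending.erase(Name) == 0)
      return; // already finished, or not a symbol this query waits on
    Results[Name] = Addr;
    if (Pending.empty()) { // outstanding count reached zero
      Done = true;
      OnComplete(true, Results);
    }
  }

  // Called when any symbol fails to materialize: report the error at once.
  void notifyFailed() {
    if (Done)
      return;
    Done = true;
    OnComplete(false, ToySymbolMap());
  }

  std::size_t numPending() const { return Pending.size(); }

private:
  std::set<std::string> Pending; // symbols still being waited on
  ToySymbolMap Results;
  Handler OnComplete;
  bool Done = false;
};
```

The handler runs exactly once: either when the last pending symbol is
resolved, or as soon as a failure is reported.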
 |  | 
 | The collected materialization units are sent to the ExecutionSession to be | 
 | dispatched, and the dispatch behavior can be set by the client. By default each | 
 | materializer is run on the calling thread. Clients are free to create new | 
 | threads to run materializers, or to send the work to a work queue for a thread | 
 | pool (this is what LLJIT/LLLazyJIT do). | 
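
The difference between the two dispatch policies can be illustrated with a
small model. The names below (``ToyTask``, ``ToyDispatcher``,
``ToyWorkQueue``) are hypothetical and are not ORC APIs; the queue is drained
by an explicit call rather than by worker threads so that the sketch stays
deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

using ToyTask = std::function<void()>;              // one materializer's work
using ToyDispatcher = std::function<void(ToyTask)>; // the dispatch policy

// Default-style policy: run the work immediately on the calling thread.
inline ToyDispatcher makeInlineDispatcher() {
  return [](ToyTask T) { T(); };
}

// Alternative policy: park the work on a queue that is serviced later. A real
// client would drain the queue from worker threads.
class ToyWorkQueue {
public:
  ToyDispatcher dispatcher() {
    return [this](ToyTask T) { Tasks.push_back(std::move(T)); };
  }

  void drain() {
    while (!Tasks.empty()) {
      ToyTask T = std::move(Tasks.front());
      Tasks.pop_front();
      T();
    }
  }

  std::size_t size() const { return Tasks.size(); }

private:
  std::deque<ToyTask> Tasks;
};
```

Because the policy is just a callable, swapping one for the other does not
change the code that collects and hands over the work.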
 |  | 
 | Top Level APIs | 
 | ============== | 
 |  | 
 | Many of ORC's top-level APIs are visible in the example above: | 
 |  | 
 | - *ExecutionSession* represents the JIT'd program and provides context for the | 
 |   JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the | 
 |   materializers. | 
 |  | 
 | - *JITDylibs* provide the symbol tables. | 
 |  | 
 | - *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and | 
 |   allow clients to add uncompiled program representations supported by those | 
 |   compilers to JITDylibs. | 
 |  | 
 | - *ResourceTrackers* allow you to remove code. | 
 |  | 
 | Several other important APIs are used implicitly. JIT clients need not be aware | 
 | of them, but Layer authors will use them: | 
 |  | 
 | - *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given | 
 |   program representation (in this example, C++ source) in a MaterializationUnit, | 
 |   which is then stored in the JITDylib. MaterializationUnits are responsible for | 
 |   describing the definitions they provide, and for unwrapping the program | 
 |   representation and passing it back to the layer when compilation is required | 
 |   (this ownership shuffle makes writing thread-safe layers easier, since the | 
 |   ownership of the program representation will be passed back on the stack, | 
 |   rather than having to be fished out of a Layer member, which would require | 
 |   synchronization). | 
 |  | 
 | - *MaterializationResponsibility* - When a MaterializationUnit hands a program | 
 |   representation back to the layer it comes with an associated | 
 |   MaterializationResponsibility object. This object tracks the definitions | 
 |   that must be materialized and provides a way to notify the JITDylib once they | 
 |   are either successfully materialized or a failure occurs. | 
 |  | 
 | Absolute Symbols, Aliases, and Reexports | 
 | ======================================== | 
 |  | 
 | ORC makes it easy to define symbols with absolute addresses, or symbols that | 
 | are simply aliases of other symbols: | 
 |  | 
 | Absolute Symbols | 
 | ---------------- | 
 |  | 
 | Absolute symbols are symbols that map directly to addresses without requiring | 
 | further materialization, for example: "foo" = 0x1234. One use case for | 
 | absolute symbols is allowing resolution of process symbols. E.g. | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   JD.define(absoluteSymbols(SymbolMap({ | 
 |       { Mangle("printf"), | 
 |         { pointerToJITTargetAddress(&printf), | 
 |           JITSymbolFlags::Callable } } | 
 |     }))); | 
 |  | 
 | With this mapping established code added to the JIT can refer to printf | 
 | symbolically rather than requiring the address of printf to be "baked in". | 
 | This in turn allows cached versions of the JIT'd code (e.g. compiled objects) | 
 | to be re-used across JIT sessions as the JIT'd code no longer changes, only the | 
 | absolute symbol definition does. | 
 |  | 
 | For process and library symbols the DynamicLibrarySearchGenerator utility (See | 
 | :ref:`How to Add Process and Library Symbols to JITDylibs | 
 | <ProcessAndLibrarySymbols>`) can be used to automatically build absolute | 
 | symbol mappings for you. However the absoluteSymbols function is still useful | 
 | for making non-global objects in your JIT visible to JIT'd code. For example, | 
 | imagine that your JIT standard library needs access to your JIT object to make | 
 | some calls. We could bake the address of your object into the library, but then | 
 | it would need to be recompiled for each session: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   // From standard library for JIT'd code: | 
 |  | 
 |   class MyJIT { | 
 |   public: | 
 |     void log(const char *Msg); | 
 |   }; | 
 |  | 
 |   void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); } | 
 |  | 
 | We can turn this into a symbolic reference in the JIT standard library: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   extern MyJIT *__MyJITInstance; | 
 |  | 
 |   void log(const char *Msg) { __MyJITInstance->log(Msg); } | 
 |  | 
 | And then make our JIT object visible to the JIT standard library with an | 
 | absolute symbol definition when the JIT is started: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   MyJIT J = ...; | 
 |  | 
 |   auto &JITStdLibJD = ... ; | 
 |  | 
 |   JITStdLibJD.define(absoluteSymbols(SymbolMap({ | 
 |       { Mangle("__MyJITInstance"), | 
 |         { pointerToJITTargetAddress(&J), JITSymbolFlags() } } | 
 |     }))); | 
 |  | 
 | Aliases and Reexports | 
 | --------------------- | 
 |  | 
 | Aliases and reexports allow you to define new symbols that map to existing | 
 | symbols. This can be useful for changing linkage relationships between symbols | 
 | across sessions without having to recompile code. For example, imagine that | 
 | JIT'd code has access to a log function, ``void log(const char*)`` for which | 
 | there are two implementations in the JIT standard library: ``log_fast`` and | 
 | ``log_detailed``. Your JIT can choose which one of these definitions will be | 
 | used when the ``log`` symbol is referenced by setting up an alias at JIT startup | 
 | time: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   auto &JITStdLibJD = ... ; | 
 |  | 
 |   auto LogImplementationSymbol = | 
 |     Verbose ? Mangle("log_detailed") : Mangle("log_fast"); | 
 |  | 
 |   JITStdLibJD.define( | 
 |     symbolAliases(SymbolAliasMap({ | 
 |         { Mangle("log"), | 
 |           { LogImplementationSymbol, | 
 |             JITSymbolFlags::Exported | JITSymbolFlags::Callable } } | 
 |       }))); | 
 |  | 
 | The ``symbolAliases`` function allows you to define aliases within a single | 
 | JITDylib. The ``reexports`` function provides the same functionality, but | 
 | operates across JITDylib boundaries. E.g. | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   auto &JD1 = ... ; | 
 |   auto &JD2 = ... ; | 
 |  | 
 |   // Make 'bar' in JD2 an alias for 'foo' from JD1. | 
 |   JD2.define( | 
 |     reexports(JD1, SymbolAliasMap({ | 
 |         { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } } | 
 |       }))); | 
 |  | 
 | The reexports utility can be handy for composing a single JITDylib interface by | 
 | re-exporting symbols from several other JITDylibs. | 
 |  | 
 | .. _Laziness: | 
 |  | 
 | Laziness | 
 | ======== | 
 |  | 
 | Laziness in ORC is provided by a utility called "lazy reexports". A lazy | 
 | reexport is similar to a regular reexport or alias: It provides a new name for | 
 | an existing symbol. Unlike regular reexports however, lookups of lazy reexports | 
 | do not trigger immediate materialization of the reexported symbol. Instead, they | 
 | only trigger materialization of a function stub. This function stub is | 
 | initialized to point at a *lazy call-through*, which provides reentry into the | 
 | JIT. If the stub is called at runtime then the lazy call-through will look up | 
 | the reexported symbol (triggering materialization for it if necessary), update | 
 | the stub (to call directly to the reexported symbol on subsequent calls), and | 
 | then return via the reexported symbol. By re-using the existing symbol lookup | 
 | mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy | 
 | reexports can be made from multiple threads concurrently, and the reexported | 
 | symbol can be in any state of compilation (uncompiled, already in the process of | 
 | being compiled, or already compiled) and the call will succeed. This allows | 
 | laziness to be safely mixed with features like remote compilation, concurrent | 
 | compilation, concurrent JIT'd code, and speculative compilation. | 
 |  | 
 | There is one other key difference between regular reexports and lazy reexports | 
 | that some clients must be aware of: The address of a lazy reexport will be | 
 | *different* from the address of the reexported symbol (whereas a regular | 
 | reexport is guaranteed to have the same address as the reexported symbol). | 
 | Clients who care about pointer equality will generally want to use the address | 
 | of the reexport as the canonical address of the reexported symbol. This will | 
 | allow the address to be taken without forcing materialization of the reexport. | 
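
The stub mechanism described above can be modelled in a few lines of plain
C++. This is a single-threaded sketch under stated assumptions, not ORC's
implementation: ``ToyLazyStub``, its ``Lookup`` callback, and ``toyFooBody``
are hypothetical, and a null target stands in for "still pointing at the lazy
call-through".

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a lazy reexport of a function of type int(int).
class ToyLazyStub {
public:
  using Fn = int (*)(int);
  using Lookup = std::function<Fn(const std::string &)>;

  ToyLazyStub(std::string Name, Lookup L)
      : BodyName(std::move(Name)), FindBody(std::move(L)) {}

  // The first call takes the call-through path: look up the body (which is
  // where materialization would be triggered), update the stub, then run the
  // body. Later calls go straight to the body.
  int operator()(int Arg) {
    if (!Target) {
      Target = FindBody(BodyName);
      ++NumLookups;
    }
    return Target(Arg);
  }

  bool isResolved() const { return Target != nullptr; }
  int numLookups() const { return NumLookups; }

private:
  std::string BodyName;
  Lookup FindBody;
  Fn Target = nullptr; // null: still pointing at the call-through
  int NumLookups = 0;
};

// Hypothetical function body that the stub reexports.
inline int toyFooBody(int X) { return X + 1; }
```

Note that the stub object and ``toyFooBody`` live at different addresses,
which mirrors the pointer-equality caveat above: taking the address of the
stub never requires the body to be looked up.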
 |  | 
 | Usage example: | 
 |  | 
 | If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and | 
 | ``bar_body``, we can create lazy entry points ``foo`` and ``bar`` in JITDylib | 
 | ``JD2`` by calling: | 
 |  | 
 | .. code-block:: c++ | 
 |  | 
 |   auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable; | 
 |   JD2.define( | 
 |     lazyReexports(CallThroughMgr, StubsMgr, JD, | 
 |                   SymbolAliasMap({ | 
 |                     { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } }, | 
 |                     { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } } | 
 |                   }))); | 
 |  | 
 | A full example of how to use lazyReexports with the LLJIT class can be found at | 
 | ``llvm/examples/OrcV2Examples/LLJITWithLazyReexports``. | 
 |  | 
 | Supporting Custom Compilers | 
 | =========================== | 
 |  | 
 | TBD. | 
 |  | 
 | .. _transitioning_orcv1_to_orcv2: | 
 |  | 
 | Transitioning from ORCv1 to ORCv2 | 
 | ================================= | 
 |  | 
 | Since LLVM 7.0, new ORC development work has focused on adding support for | 
 | concurrent JIT compilation. The new APIs (including new layer interfaces and | 
 | implementations, and new utilities) that support concurrency are collectively | 
 | referred to as ORCv2, and the original, non-concurrent layers and utilities | 
 | are now referred to as ORCv1. | 
 |  | 
 | The majority of the ORCv1 layers and utilities were renamed with a 'Legacy' | 
 | prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM | 
 | 12.0 ORCv1 will be removed entirely. | 
 |  | 
 | Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the | 
 | ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly | 
 | substituted. However there are some design differences between ORCv1 and ORCv2 | 
 | to be aware of: | 
 |  | 
 |   1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules | 
 |      (and other program representations, e.g. Object Files) are no longer added | 
 |      directly to JIT classes or layers. Instead, they are added to ``JITDylib`` | 
 |      instances *by* layers. The ``JITDylib`` determines *where* the definitions | 
 |      reside, the layers determine *how* the definitions will be compiled. | 
 |      Linkage relationships between ``JITDylibs`` determine how inter-module | 
 |      references are resolved, and symbol resolvers are no longer used. See the | 
 |      section `Design Overview`_ for more details. | 
 |  | 
 |      Unless multiple JITDylibs are needed to model linkage relationships, ORCv1 | 
 |      clients should place all code in a single JITDylib. | 
 |      MCJIT clients should use LLJIT (see `LLJIT and LLLazyJIT`_), and can place | 
 |      code in LLJIT's default created main JITDylib (See | 
 |      ``LLJIT::getMainJITDylib()``). | 
 |  | 
 |   2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession | 
 |      manages the string pool, error reporting, synchronization, and symbol | 
 |      lookup. | 
 |  | 
 |   3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than | 
 |      string values in order to reduce memory overhead and improve lookup | 
 |      performance. See the subsection `How to manage symbol strings`_. | 
 |  | 
 |   4. IR layers require ThreadSafeModule instances, rather than | 
 |      std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that | 
 |      Modules that use the same LLVMContext are not accessed concurrently. | 
 |      See `How to use ThreadSafeModule and ThreadSafeContext`_. | 
 |  | 
 |   5. Symbol lookup is no longer handled by layers. Instead, there is a | 
 |      ``lookup`` method on ExecutionSession that takes a list of JITDylibs to scan. | 
 |  | 
 |      .. code-block:: c++ | 
 |  | 
 |        ExecutionSession ES; | 
 |        JITDylib &JD1 = ...; | 
 |        JITDylib &JD2 = ...; | 
 |  | 
 |        auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main")); | 
 |  | 
 |   6. The removeModule/removeObject methods are replaced by | 
 |      ``ResourceTracker::remove``. | 
 |      See the subsection `How to remove code`_. | 
 |  | 
 | For code examples and suggestions of how to use the ORCv2 APIs, please see | 
 | the section `How-tos`_. | 
 |  | 
 | How-tos | 
 | ======= | 
 |  | 
 | How to manage symbol strings | 
 | ---------------------------- | 
 |  | 
 | Symbol strings in ORC are uniqued to improve lookup performance, reduce memory | 
 | overhead, and allow symbol names to function as efficient keys. To get the | 
 | unique ``SymbolStringPtr`` for a string value, call the | 
 | ``ExecutionSession::intern`` method: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     ExecutionSession ES; | 
 |     // ... | 
 |     auto MainSymbolName = ES.intern("main"); | 
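
To see why uniquing makes symbol names cheap to compare and use as keys,
consider the following minimal sketch of a string pool. ``ToyStringPool`` is a
hypothetical illustration in plain C++ and makes no claim about how ORC's
string pool is implemented (it has no reference counting or locking, for
example).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a symbol string pool.
class ToyStringPool {
public:
  // Returns a pointer to the pool's unique copy of S. Pointers to elements of
  // a std::unordered_set remain valid when other elements are inserted.
  const std::string *intern(const std::string &S) {
    return &*Pool.insert(S).first;
  }

  std::size_t size() const { return Pool.size(); }

private:
  std::unordered_set<std::string> Pool;
};
```

Interning equal strings yields the same pointer, so equality checks become
pointer comparisons instead of character-by-character comparisons.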
 |  | 
 | If you wish to perform lookup using the C/IR name of a symbol you will also | 
 | need to apply the platform linker-mangling before interning the string. On | 
 | Linux this mangling is a no-op, but on other platforms it usually involves | 
 | adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is | 
 | based on the DataLayout for the target. Given a DataLayout and an | 
 | ExecutionSession, you can create a MangleAndInterner function object that | 
 | will perform both jobs for you: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     ExecutionSession ES; | 
 |     const DataLayout &DL = ...; | 
 |     MangleAndInterner Mangle(ES, DL); | 
 |  | 
 |     // ... | 
 |  | 
 |     // Portable IR-symbol-name lookup: | 
 |     auto Sym = ES.lookup({&MainJD}, Mangle("main")); | 
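
As a rough illustration of the prefixing step only, the sketch below applies a
single global prefix character to a name. ``toyMangle`` is hypothetical; real
linker mangling is derived from the target's DataLayout and may involve more
than one rule.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified linker mangling: prepend the target's global
// symbol prefix if it has one. A prefix of '\0' means "no prefix".
inline std::string toyMangle(const std::string &Name, char GlobalPrefix) {
  if (GlobalPrefix == '\0')
    return Name;
  return std::string(1, GlobalPrefix) + Name;
}
```

Under this model a Darwin-style target would pass ``'_'`` as the prefix, while
a target with no prefix would leave the name unchanged.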
 |  | 
 | How to create JITDylibs and set up linkage relationships | 
 | -------------------------------------------------------- | 
 |  | 
 | In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by | 
 | calling the ``ExecutionSession::createJITDylib`` method with a unique name: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     ExecutionSession ES; | 
 |     auto &JD = ES.createJITDylib("libFoo.dylib"); | 
 |  | 
 | The JITDylib is owned by the ``ExecutionSession`` instance and will be freed | 
 | when it is destroyed. | 
 |  | 
 | How to remove code | 
 | ------------------ | 
 |  | 
 | To remove an individual module from a JITDylib it must first be added using an | 
 | explicit ``ResourceTracker``. The module can then be removed by calling | 
 | ``ResourceTracker::remove``: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     auto &JD = ... ; | 
 |     auto M = ... ; | 
 |  | 
 |     auto RT = JD.createResourceTracker(); | 
 |     Layer.add(RT, std::move(M)); // Add M to JD, tracking resources with RT | 
 |  | 
 |     RT->remove(); // Remove M from JD. | 
 |  | 
 | Modules added directly to a JITDylib will be tracked by that JITDylib's default | 
 | resource tracker. | 
 |  | 
 | All code can be removed from a JITDylib by calling ``JITDylib::clear``. This | 
 | leaves the cleared JITDylib in an empty but usable state. | 
 |  | 
 | JITDylibs can be removed by calling ``ExecutionSession::removeJITDylib``. This | 
 | clears the JITDylib and then puts it into a defunct state. No further operations | 
 | can be performed on the JITDylib, and it will be destroyed as soon as the last | 
 | handle to it is released. | 
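
The relationship between trackers and definitions can be pictured with a small
model. This is a sketch in plain C++ under stated assumptions:
``ToyTrackedDylib`` and its integer tracker ids are hypothetical, tracker id 0
plays the role of the default tracker, and only symbol names are tracked
(not memory or other resources).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a JITDylib whose definitions are grouped by the
// tracker they were added with.
class ToyTrackedDylib {
public:
  using TrackerId = int;

  TrackerId createResourceTracker() { return NextId++; }

  // Definitions added without an explicit tracker use the default one (id 0).
  void define(const std::string &Sym, TrackerId RT = 0) { Owner[Sym] = RT; }

  // Removes everything that was added under RT; other definitions survive.
  void remove(TrackerId RT) {
    for (auto I = Owner.begin(); I != Owner.end();) {
      if (I->second == RT)
        I = Owner.erase(I);
      else
        ++I;
    }
  }

  // Removes all definitions, leaving an empty but usable table.
  void clear() { Owner.clear(); }

  bool contains(const std::string &Sym) const { return Owner.count(Sym) != 0; }
  std::size_t size() const { return Owner.size(); }

private:
  std::map<std::string, TrackerId> Owner; // symbol -> tracker it belongs to
  TrackerId NextId = 1;
};
```

Removing one tracker leaves definitions owned by other trackers in place, and
a cleared table can accept new definitions afterwards.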
 |  | 
 | An example of how to use the resource management APIs can be found at | 
 | ``llvm/examples/OrcV2Examples/LLJITRemovableCode``. | 
 |  | 
 |  | 
 | How to add the support for custom program representation | 
 | -------------------------------------------------------- | 
 | To add support for a custom program representation, you need a custom ``MaterializationUnit`` | 
 | for the program representation and a custom ``Layer``. The Layer will have two | 
 | operations: ``add`` and ``emit``. The ``add`` operation takes an instance of your program | 
 | representation, builds one of your custom ``MaterializationUnits`` to hold it, then adds it | 
 | to a ``JITDylib``. The ``emit`` operation takes a ``MaterializationResponsibility`` object and an | 
 | instance of your program representation and materializes it, usually by compiling it and handing | 
 | the resulting object off to an ``ObjectLinkingLayer``. | 
 |  | 
 | Your custom ``MaterializationUnit`` will have two operations: ``materialize`` and ``discard``. The | 
 | ``materialize`` function will be called for you when any symbol provided by the unit is looked up, | 
 | and it should just call the ``emit`` function on your layer, passing in the given | 
 | ``MaterializationResponsibility`` and the wrapped program representation. The ``discard`` function | 
 | will be called if some weak symbol provided by your unit is not needed (because the JIT found an | 
 | overriding definition). You can use this to drop your definition early, or just ignore it and let | 
 | the linker drop the definition later. | 
 |  | 
 | Here is an example of an ASTLayer: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     // ... In your JIT class | 
 |     AstLayer astLayer; | 
 |     // ... | 
 |  | 
 |  | 
 |     class AstMaterializationUnit : public orc::MaterializationUnit { | 
 |     public: | 
 |       AstMaterializationUnit(AstLayer &l, Ast &ast) | 
 |       : llvm::orc::MaterializationUnit(l.getInterface(ast)), astLayer(l), | 
 |       ast(ast) {}; | 
 |  | 
 |       llvm::StringRef getName() const override { | 
 |         return "AstMaterializationUnit"; | 
 |       } | 
 |  | 
 |       void materialize(std::unique_ptr<orc::MaterializationResponsibility> r) override { | 
 |         astLayer.emit(std::move(r), ast); | 
 |       }; | 
 |  | 
 |     private: | 
 |       void discard(const llvm::orc::JITDylib &jd, const llvm::orc::SymbolStringPtr &sym) override { | 
 |         llvm_unreachable("functions are not overridable"); | 
 |       } | 
 |  | 
 |  | 
 |       AstLayer &astLayer; | 
 |       Ast &ast; | 
 |     }; | 
 |  | 
 |     class AstLayer { | 
 |       llvm::orc::IRLayer &baseLayer; | 
 |       llvm::orc::MangleAndInterner &mangler; | 
 |  | 
 |     public: | 
 |       AstLayer(llvm::orc::IRLayer &baseLayer, llvm::orc::MangleAndInterner &mangler) | 
 |       : baseLayer(baseLayer), mangler(mangler){}; | 
 |  | 
 |       llvm::Error add(llvm::orc::ResourceTrackerSP &rt, Ast &ast) { | 
 |         return rt->getJITDylib().define(std::make_unique<AstMaterializationUnit>(*this, ast), rt); | 
 |       } | 
 |  | 
 |       void emit(std::unique_ptr<orc::MaterializationResponsibility> mr, Ast &ast) { | 
 |         // compileAst is just a function that compiles the given AST and returns | 
 |         // a `llvm::orc::ThreadSafeModule` | 
 |         baseLayer.emit(std::move(mr), compileAst(ast)); | 
 |       } | 
 |  | 
 |       llvm::orc::MaterializationUnit::Interface getInterface(Ast &ast) { | 
 |           SymbolFlagsMap Symbols; | 
 |           // Find all the symbols in the AST and for each of them | 
 |           // add it to the Symbols map. | 
 |           Symbols[mangler(someNameFromAST)] = | 
 |             JITSymbolFlags(JITSymbolFlags::Exported | JITSymbolFlags::Callable); | 
 |           return MaterializationUnit::Interface(std::move(Symbols), nullptr); | 
 |       } | 
 |     }; | 
 |  | 
 | Take a look at the source code of `Building A JIT's Chapter 4 <tutorial/BuildingAJIT4.html>`_ for a complete example. | 
 |  | 
 | How to use ThreadSafeModule and ThreadSafeContext | 
 | ------------------------------------------------- | 
 |  | 
 | ThreadSafeModule and ThreadSafeContext are wrappers around Modules and | 
 | LLVMContexts respectively. A ThreadSafeModule is a pair of a | 
 | std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A | 
 | ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock. | 
 | This design serves two purposes: providing a locking scheme and lifetime | 
 | management for LLVMContexts. The ThreadSafeContext may be locked to prevent | 
 | accidental concurrent access by two Modules that use the same LLVMContext. | 
 | The underlying LLVMContext is freed once all ThreadSafeContext values pointing | 
 | to it are destroyed, allowing the context memory to be reclaimed as soon as | 
 | the Modules referring to it are destroyed. | 
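
The lifetime rule described above can be modelled without LLVM by using
``std::shared_ptr``. The following is a simplified sketch, not the actual
``ThreadSafeContext`` implementation: ``ToyContext``, ``ToySharedContext`` and
``ToyModule`` are hypothetical stand-ins for ``LLVMContext``,
``ThreadSafeContext`` and ``ThreadSafeModule``, and the instance counter exists
only so that destruction can be observed.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for LLVMContext. It counts live instances so that
// the moment of destruction can be observed.
struct ToyContext {
  static int &liveCount() {
    static int Count = 0;
    return Count;
  }
  ToyContext() { ++liveCount(); }
  ToyContext(const ToyContext &) = delete;
  ~ToyContext() { --liveCount(); }
};

// Hypothetical stand-in for ThreadSafeContext: a cheap-to-copy handle to
// shared state that owns the context and its lock.
class ToySharedContext {
  struct State {
    ToyContext Ctx;
    std::recursive_mutex Mutex;
  };
  std::shared_ptr<State> S;

public:
  ToySharedContext() : S(std::make_shared<State>()) {}
  ToyContext *getContext() const { return &S->Ctx; }
  std::unique_lock<std::recursive_mutex> getLock() const {
    return std::unique_lock<std::recursive_mutex>(S->Mutex);
  }
};

// Hypothetical stand-in for ThreadSafeModule: a module name plus a handle
// that keeps the shared context alive.
struct ToyModule {
  std::string Name;
  ToySharedContext Ctx;
};

// Creates one context shared by two modules, drops the original handle and
// the first module, and reports how many contexts are still alive.
int liveContextsWhileOneModuleRemains() {
  std::unique_ptr<ToyModule> M2;
  {
    ToySharedContext TSCtx;
    ToyModule M1{"M1", TSCtx};
    M2.reset(new ToyModule{"M2", TSCtx});
  } // TSCtx and M1 are destroyed here; M2 still holds the context.
  return ToyContext::liveCount();
}

// Same setup, but every handle is gone before the contexts are counted.
int liveContextsAfterAllModulesDestroyed() {
  {
    ToySharedContext TSCtx;
    ToyModule M1{"M1", TSCtx};
    ToyModule M2{"M2", TSCtx};
  }
  return ToyContext::liveCount();
}
```

In this model the context survives for as long as any module (or any other
handle) refers to it, and is reclaimed as soon as the last one is destroyed.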
 |  | 
 | ThreadSafeContexts can be explicitly constructed from a | 
 | std::unique_ptr<LLVMContext>: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     ThreadSafeContext TSCtx(std::make_unique<LLVMContext>()); | 
 |  | 
 | ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module> | 
 | and a ThreadSafeContext value. ThreadSafeContext values may be shared between | 
 | multiple ThreadSafeModules: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     ThreadSafeModule TSM1( | 
 |       std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx); | 
 |  | 
 |     ThreadSafeModule TSM2( | 
 |       std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx); | 
 |  | 
 | Before using a ThreadSafeContext, clients should ensure that either the context | 
 | is only accessible on the current thread, or that the context is locked. In the | 
 | example above (where the context is never locked) we rely on the fact that | 
 | ``TSM1``, ``TSM2``, and ``TSCtx`` are all created on one thread. If a context is | 
 | going to be shared between threads then it must be locked before accessing or | 
 | creating any Modules attached to it. E.g. | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     ThreadSafeContext TSCtx(std::make_unique<LLVMContext>()); | 
 |  | 
 |     ThreadPool TP(NumThreads); | 
 |     JITStack J; | 
 |  | 
 |     for (auto &ModulePath : ModulePaths) { | 
 |       TP.async( | 
 |         [&]() { | 
 |           auto Lock = TSCtx.getLock(); | 
 |           auto M = loadModuleOnContext(ModulePath, TSCtx.getContext()); | 
 |           J.addModule(ThreadSafeModule(std::move(M), TSCtx)); | 
 |         }); | 
 |     } | 
 |  | 
 |     TP.wait(); | 
 |  | 
 | To make exclusive access to Modules easier to manage, the ThreadSafeModule class | 
 | provides a convenience function, ``withModuleDo``, that implicitly (1) locks the | 
 | associated context, (2) runs a given function object, (3) unlocks the context, | 
 | and (4) returns the result generated by the function object. E.g. | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     ThreadSafeModule TSM = getModule(...); | 
 |  | 
 |     // Count the functions in the module: | 
 |     size_t NumFunctionsInModule = | 
 |       TSM.withModuleDo( | 
 |         [](Module &M) { // <- Context locked before entering lambda. | 
 |           return M.size(); | 
 |         } // <- Context unlocked after leaving. | 
 |       ); | 
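
The lock, run, unlock, return sequence performed by ``withModuleDo`` can be
sketched in plain C++ as a scoped lock around a callback. This is an
illustrative model under stated assumptions rather than the LLVM
implementation: ``ToyModule`` and ``ToyThreadSafeModule`` are hypothetical
names, and a bare ``std::recursive_mutex`` stands in for the lock held by the
shared context.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Module: a named list of function names.
struct ToyModule {
  std::string Name;
  std::vector<std::string> Functions;
  std::size_t size() const { return Functions.size(); }
};

// Hypothetical stand-in for ThreadSafeModule: owns a module and shares the
// lock that guards the module's context.
class ToyThreadSafeModule {
  std::unique_ptr<ToyModule> M;
  std::shared_ptr<std::recursive_mutex> CtxLock;

public:
  ToyThreadSafeModule(std::unique_ptr<ToyModule> Mod,
                      std::shared_ptr<std::recursive_mutex> Lock)
      : M(std::move(Mod)), CtxLock(std::move(Lock)) {}

  // Locks the context, runs F on the module, and returns F's result. The
  // lock is released when Guard goes out of scope, i.e. after F has run.
  template <typename Func> decltype(auto) withModuleDo(Func &&F) {
    std::lock_guard<std::recursive_mutex> Guard(*CtxLock);
    return F(*M);
  }
};

// Builds a two-function module and counts its functions under the lock.
std::size_t demoFunctionCount() {
  auto M = std::make_unique<ToyModule>();
  M->Name = "M";
  M->Functions = {"foo", "bar"};
  ToyThreadSafeModule TSM(std::move(M),
                          std::make_shared<std::recursive_mutex>());
  return TSM.withModuleDo([](ToyModule &Mod) { return Mod.size(); });
}
```

Because the guard is a local object, the lock is also released if the function
object throws, which is one reason to prefer this pattern over manual
lock/unlock calls.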
 |  | 
 | Clients wishing to maximize possibilities for concurrent compilation will want | 
 | to create every new ThreadSafeModule on a new ThreadSafeContext. For this | 
 | reason a convenience constructor for ThreadSafeModule is provided that implicitly | 
 | constructs a new ThreadSafeContext value from a std::unique_ptr<LLVMContext>: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     // Maximize concurrency opportunities by loading every module on a | 
 |     // separate context. | 
 |     for (const auto &IRPath : IRPaths) { | 
 |       auto Ctx = std::make_unique<LLVMContext>(); | 
 |       auto M = std::make_unique<Module>("M", *Ctx); | 
 |       CompileLayer.add(MainJD, ThreadSafeModule(std::move(M), std::move(Ctx))); | 
 |     } | 
 |  | 
 | Clients who plan to run single-threaded may choose to save memory by loading | 
 | all modules on the same context: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     // Save memory by using one context for all Modules: | 
 |     ThreadSafeContext TSCtx(std::make_unique<LLVMContext>()); | 
 |     for (const auto &IRPath : IRPaths) { | 
 |       ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx); | 
 |       CompileLayer.add(MainJD, std::move(TSM)); | 
 |     } | 
 |  | 
 | .. _ProcessAndLibrarySymbols: | 
 |  | 
 | How to Add Process and Library Symbols to JITDylibs | 
 | =================================================== | 
 |  | 
 | JIT'd code may need to access symbols in the host program or in supporting | 
 | libraries. The best way to enable this is to reflect these symbols into your | 
 | JITDylibs so that they appear the same as any other symbol defined within the | 
 | execution session (i.e. they are findable via ``ExecutionSession::lookup``, and | 
 | so visible to the JIT linker during linking). | 
 |  | 
 | One way to reflect external symbols is to add them manually using the | 
 | absoluteSymbols function: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     const DataLayout &DL = getDataLayout(); | 
 |     MangleAndInterner Mangle(ES, DL); | 
 |  | 
 |     auto &JD = ES.createJITDylib("main"); | 
 |  | 
 |     JD.define( | 
 |       absoluteSymbols({ | 
 |         { Mangle("puts"), pointerToJITTargetAddress(&puts)}, | 
 |         { Mangle("gets"), pointerToJITTargetAddress(&gets)} | 
 |       })); | 
 |  | 
 | Using absoluteSymbols is reasonable if the set of symbols to be reflected is | 
 | small and fixed. On the other hand, if the set of symbols is large or variable | 
 | it may make more sense to have the definitions added for you on demand by a | 
 | *definition generator*. A definition generator is an object that can be attached | 
 | to a JITDylib, receiving a callback whenever a lookup within that JITDylib fails | 
 | to find one or more symbols. The definition generator is given a chance to | 
 | produce a definition of the missing symbol(s) before the lookup proceeds. | 
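
The callback-on-failed-lookup idea can be illustrated with a small standalone
model. This is a sketch under stated assumptions and does not use the ORC
classes: ``ToyDylib`` and its ``Generator`` callback type are hypothetical, and
the real generator interface differs (it operates on sets of symbols and
reports failures through ``llvm::Error``).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a JITDylib's symbol table.
class ToyDylib {
public:
  using Address = std::uint64_t;
  // A generator is offered a missing name. It returns true and sets Addr if
  // it can provide a definition.
  using Generator =
      std::function<bool(const std::string &Name, Address &Addr)>;

  void define(const std::string &Name, Address Addr) { Symbols[Name] = Addr; }
  void addGenerator(Generator G) { Generators.push_back(std::move(G)); }

  // Returns true and sets Addr if Name is defined or can be generated.
  bool lookup(const std::string &Name, Address &Addr) {
    auto It = Symbols.find(Name);
    if (It == Symbols.end()) {
      // The lookup failed: give each attached generator a chance to add a
      // definition, then let the lookup proceed.
      for (auto &G : Generators) {
        Address Generated = 0;
        if (G(Name, Generated)) {
          define(Name, Generated);
          break;
        }
      }
      It = Symbols.find(Name);
    }
    if (It == Symbols.end())
      return false;
    Addr = It->second;
    return true;
  }

private:
  std::map<std::string, Address> Symbols;
  std::vector<Generator> Generators;
};

// Looks up "puts" twice in a dylib whose only source for it is a generator.
// Returns how many times the generator was invoked, or -1 on lookup failure.
int generatorCallsForRepeatedLookup() {
  // Made-up addresses standing in for symbols exported by the process.
  const std::map<std::string, ToyDylib::Address> ProcessSymbols = {
      {"puts", 0x1000}, {"gets", 0x2000}};
  int Calls = 0;

  ToyDylib JD;
  JD.addGenerator(
      [&](const std::string &Name, ToyDylib::Address &Addr) -> bool {
        ++Calls;
        auto It = ProcessSymbols.find(Name);
        if (It == ProcessSymbols.end())
          return false;
        Addr = It->second;
        return true;
      });

  ToyDylib::Address A = 0, B = 0;
  if (!JD.lookup("puts", A) || !JD.lookup("puts", B) || A != B)
    return -1;
  return Calls;
}
```

In this model the generator runs only for the first lookup of ``puts``; the
generated definition is then stored in the table, so the second lookup is
served like any other defined symbol.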
 |  | 
 | ORC provides the ``DynamicLibrarySearchGenerator`` utility for reflecting symbols | 
 | from the process (or a specific dynamic library) for you. For example, to reflect | 
 | the whole interface of a runtime library: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     const DataLayout &DL = getDataLayout(); | 
 |     auto &JD = ES.createJITDylib("main"); | 
 |  | 
 |     if (auto DLSGOrErr = | 
 |         DynamicLibrarySearchGenerator::Load("/path/to/lib", | 
 |                                             DL.getGlobalPrefix())) | 
 |       JD.addGenerator(std::move(*DLSGOrErr)); | 
 |     else | 
 |       return DLSGOrErr.takeError(); | 
 |  | 
 |     // IR added to JD can now link against all symbols exported by the library | 
 |     // at '/path/to/lib'. | 
 |     CompileLayer.add(JD, loadModule(...)); | 
 |  | 
 | The ``DynamicLibrarySearchGenerator`` utility can also be constructed with a | 
 | filter function to restrict the set of symbols that may be reflected. For | 
 | example, to expose an allowed set of symbols from the main process: | 
 |  | 
 |   .. code-block:: c++ | 
 |  | 
 |     const DataLayout &DL = getDataLayout(); | 
 |     MangleAndInterner Mangle(ES, DL); | 
 |  | 
 |     auto &JD = ES.createJITDylib("main"); | 
 |  | 
 |     DenseSet<SymbolStringPtr> AllowList({ | 
 |         Mangle("puts"), | 
 |         Mangle("gets") | 
 |       }); | 
 |  | 
 |     // Use GetForCurrentProcess with a predicate function that checks the | 
 |     // allowed list. | 
 |     JD.addGenerator(cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess( | 
 |           DL.getGlobalPrefix(), | 
 |           [&](const SymbolStringPtr &S) { return AllowList.count(S); }))); | 
 |  | 
 |     // IR added to JD can now link against any symbols exported by the process | 
 |     // and contained in the list. | 
 |     CompileLayer.add(JD, loadModule(...)); | 
 |  | 
 | References to process or library symbols could also be hardcoded into your IR | 
 | or object files using the symbols' raw addresses; however, symbolic resolution | 
 | using the JIT symbol tables should be preferred: it keeps the IR and objects | 
 | readable and reusable in subsequent JIT sessions. Hardcoded addresses are | 
 | difficult to read, and usually only good for one session. | 
 |  | 
 | Roadmap | 
 | ======= | 
 |  | 
 | ORC is still undergoing active development. Some current and planned work is | 
 | listed below. | 
 |  | 
 | Current Work | 
 | ------------ | 
 |  | 
 | 1. **TargetProcessControl: Improvements to in-tree support for out-of-process | 
 |    execution** | 
 |  | 
 |    The ``TargetProcessControl`` API provides various operations on the JIT | 
 |    target process (the one which will execute the JIT'd code), including | 
 |    memory allocation, memory writes, function execution, and process queries | 
 |    (e.g. for the target triple). By targeting this API new components can be | 
 |    developed which will work equally well for in-process and out-of-process | 
 |    JITing. | 
 |  | 
 |  | 
 | 2. **ORC RPC based TargetProcessControl implementation** | 
 |  | 
 |    An ORC RPC based implementation of the ``TargetProcessControl`` API is | 
 |    currently under development to enable easy out-of-process JITing via | 
 |    file descriptors / sockets. | 
 |  | 
 | 3. **Core State Machine Cleanup** | 
 |  | 
 |    The core ORC state machine is currently split between ``JITDylib`` and | 
 |    ``ExecutionSession``. Methods are slowly being moved to ``ExecutionSession``. This | 
 |    will tidy up the code base, and also allow us to support asynchronous removal | 
 |    of JITDylibs (in practice deleting an associated state object in | 
 |    ExecutionSession and leaving the JITDylib instance in a defunct state until | 
 |    all references to it have been released). | 
 |  | 
 | Near Future Work | 
 | ---------------- | 
 |  | 
 | 1. **ORC JIT Runtime Libraries** | 
 |  | 
 |    We need a runtime library for JIT'd code. This would include things like | 
 |    TLS registration, reentry functions, registration code for language runtimes | 
 |    (e.g. Objective C and Swift) and other JIT specific runtime code. This should | 
 |    be built in a similar manner to compiler-rt (possibly even as part of it). | 
 |  | 
 | 2. **Remote jit_dlopen / jit_dlclose** | 
 |  | 
 |    To more fully mimic the environment that static programs operate in we would | 
 |    like JIT'd code to be able to "dlopen" and "dlclose" JITDylibs, running all of | 
 |    their initializers/deinitializers on the current thread. This would require | 
 |    support from the runtime library described above. | 
 |  | 
 | 3. **Debugging support** | 
 |  | 
 |    ORC currently supports the GDBRegistrationListener API when using RuntimeDyld | 
 |    as the underlying JIT linker. We will need a new solution for JITLink based | 
 |    platforms. | 
 |  | 
 | Further Future Work | 
 | ------------------- | 
 |  | 
 | 1. **Speculative Compilation** | 
 |  | 
 |    ORC's support for concurrent compilation allows us to easily enable | 
 |    *speculative* JIT compilation: compilation of code that is not needed yet, | 
 |    but which we have reason to believe will be needed in the future. This can be | 
 |    used to hide compile latency and improve JIT throughput. A proof-of-concept | 
 |    example of speculative compilation with ORC has already been developed (see | 
 |    ``llvm/examples/SpeculativeJIT``). Future work on this is likely to focus on | 
 |    re-using and improving existing profiling support (currently used by PGO) to | 
 |    feed speculation decisions, as well as built-in tools to simplify use of | 
 |    speculative compilation. | 
 |  | 
 | .. [1] Formats/architectures vary in terms of supported features. MachO and | 
 |        ELF tend to have better support than COFF. Patches very welcome! | 
 |  | 
 | .. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and | 
 |        ``RemoteObjectServerLayer`` do not have counterparts in the new | 
 |        system. In the case of ``LazyEmittingLayer`` it was simply no longer | 
 |        needed: in ORCv2, deferring compilation until symbols are looked up is | 
 |        the default. The removal of ``RemoteObjectClientLayer`` and | 
 |        ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split | 
 |        across processes; however, this functionality appears not to have been | 
 |        used. | 
 |  | 
 | .. [3] Weak definitions are currently handled correctly within dylibs, but if | 
 |        multiple dylibs provide a weak definition of a symbol then each will end | 
 |        up with its own definition (similar to how weak definitions are handled | 
 |        in Windows DLLs). This will be fixed in the future. |