# guidebook.md The purpose of this guidebook is to lay down principles and guidelines for how to write code and work together at Our Machinery. # OMG-META: About this guidebook ## OMG-META-1: Guidelines are identified by unique identifiers Each guideline in this document is identified by a unique identifier (`OMG-META-1`) as well as a name (`Guidelines are identified by unique identifiers`). Names may change, but identifiers are permanent and can be used to permanently and uniquely refer to a specific guideline. If a guideline is retired, its identifier will be removed and not used again for other purposes. ## OMG-META-2: Prefer to have things in writing Rules, guidelines, and work processes should be written down instead of being kept in people's heads. This makes it easier to refer to them, and discuss them and it makes it easier to onboard new people to the project. In addition to these guidelines, here are other useful resources: * [Workflows (internal doc)](https://paper.dropbox.com/doc/Workflows--AU9nAzZP9QkEMPqXLd56sf~ZAg-i7exDDLxypZmlxf2g24gY) ## OMG-META-3: Purpose of the guidelines The purpose of this guide is to help programmers who are new to *The Machinery* write code in *The Machinery* style and to answer questions programmers might face about how to design or implement certain things. * These guidelines are intended to be a practical help, not to cover every possible situation. We want to keep this document succinct, so we avoid discussing minutiae or things that are already widely known. If something isn't covered by the guidelines and you are uncertain about the best approach, look at the surrounding code and follow its example or talk it through with a colleague. * This is not intended to be a static document. If we notice a problematic pattern in the code or think of new advice that will be helpful, we will extend this document with that information. * Some of the guidelines in this document have good rationales behind them, others are just arbitrary choices, but it's better to pick one alternative and be consistent than to have different choices all over the code. * You may find existing code in the codebase that conflicts with the guidelines. That is usually because the code was written before the guideline. Resist the urge to go in and "fix" all that code. Such fixes do not add much value by themselves and risk introducing new bugs. Instead, only fix the code that needs to be updated for other reasons (bug fixes, new features, etc). Fixing code that you are already working on anyway is a lot safer because you already have a good understanding of the code which makes you less likely to introduce bugs. Also, since you are actively working on the code, chances are that if you introduce a bug you will spot it quickly. # OMG-DESIGN: Design Principles ## OMG-DESIGN-1: Code must be easy to change The primary purpose of these guidelines is to design code that is easy to change and adapt to various situations and circumstances. Being easy to change is the most important property of a codebase as it is the basis for everything else we want to do with the code. Code that is easy to change can adapt to new technical requirements or business goals with ease. Code that can’t change is doomed to stagnate, become technically irrelevant and then die (when it is easier to throw everything away and start from scratch than to keep modifying it). ## OMG-DESIGN-2: Less is more Code is not an asset, it is a liability. The more code you have, the more you have to optimize, debug, modernize, refactor, understand. In short, the more code you have — the harder it is to change. A smaller and simpler codebase is easier to understand, improve and extend. Strategies for keeping the codebase small: - Be minimalistic: Implement the tiniest thing that solves the current problem. Don’t expect future needs. If the code is flexible you can add the functionality when the need occurs. - Build more advanced things out of a few simple building blocks. Arrays and hash tables should cover almost all of your data type needs. - Actively prune unused code paths. - Refactor out unneeded complexity whenever you can. Strive to make the code as small and simple as possible. - Never leave commented out code in the project. That’s what version history is for. “Less is more” is not just about physical lines of code, it is also about adding other unnecessary stuff. An external library does not add any code to the repository, but it adds dependencies and conceptual complexity — more things that you need to understand and more things that can go wrong. Strategies for keeping conceptual complexity down: - Don’t introduce unnecessary abstraction layers, prefer simpler concepts (`char *` over `std::string`). Only abstract when it brings clear value. - Don’t add new programming languages to the codebase. We strive to do everything with C, so developers only need to learn one language, one debugger, etc. - Minimize the use of external libraries. Prefer minimalistic one-file libraries over bigger ones. It can be tempting to rely on an external library as a quick way of implementing a feature. But be wary. Code in external libraries is hard to modify, which violates our first principle. First, you have to understand the external codebase, which can be daunting, and then you either have to fork, which means you take over the maintenance burden or deal with the politics of pushing the changes upstream. The bigger and more complicated the library is, the harder this problem gets. If you only need a small functionality, consider just implementing that instead of adding a library dependency. ## OMG-DESIGN-3: Keep it simple Simpler solutions are easier to understand, easier to reason about, and easier to modify. What is “simple” can be discussed. From our perspective it means: - Fewer levels of abstraction. - Easier to understand *completely* (performance implications, threading implications, etc). - Easier to debug. - More straightforward and easier to follow logic. - Closer to the metal. Examples: - C is simpler than C++. - `const char *` is simpler than `std::string`. - `int[]` is simpler than `std::vector`. - Immutable is simpler than mutable. - Non-template code is simpler than template code. - Single-threaded is simpler than multi-threaded. - boost libraries are not simple. Sometimes complexity is needed. For example, you can’t get good performance on modern hardware without multi-threading and some mutability. The point is to keep things as simple *as possible.* Use a limited set of core data structures and concepts. Use established standards *when they are simple:* - Simple standards: JSON, Web sockets, UTF-8. - Complicated standards: XML, Corba, UTF-16. ## OMG-DESIGN-4: Explicit is better than implicit It is better when programmers can see what is going on than when it is hidden. Avoid doing cute tricks with templates and macros. Think about code that is easy to understand and step through in a debugger. ## OMG-DESIGN-5: Design with performance in mind The performance of a system is a part of its design. When you design a new system you should have clear performance goals in mind — how many objects the system should be able to handle, how many milliseconds it should take to update, etc. A system that doesn’t live up to its performance goals is not finished. Remember that the performance of a platform such as *The Machinery* is significantly different from the performance of an end-user application. Application software only has to be “fast enough” for the user not to become annoyed. In contrast, a platform needs to stay performant across a wide range of use cases, as we cannot predict what the end users will want to do with the platform. In other words, performance is a primary feature of the engine and should always be in the back of your mind. Avoid hidden, unexpected costs — the users should only pay for the features they use. System update costs should be linear in the number of active objects. Profile your code so you know where time is being spent. ## OMG-DESIGN-6: Design for fast iterations For modifications to be easy, the modify-build-test cycle must be fast. Minimizing build times ensures that programmers stay productive. This is accomplished by minimizing header inclusion and avoiding known costly constructs, such as heavy use of C++ templates. Unit tests should be used as much as possible to verify the correctness of the code. This allows changes to be made with greater confidence and less manual testing, which means changes can be made faster. These unit tests should be fast enough to run as a part of the standard build. Unit tests provide the most value for *complicated low-level code*, such as JSON parsing and complicated data structures (*The Truth*). *Simple* low-level code (e.g., vec3 addition) is unlikely to be changed a lot and see regression bugs. And even if it does have bugs, they are usually easy to spot and fix. Thus, unit tests provide limited value. *High-level* code (e.g., UI) is hard to test with unit tests and usually better tested in other ways (hands-on or through recorded UI interaction sessions). Hot-reloading of DLLs should be supported to provide quick testing of changes in a running application. Be mindful of the application's boot time. The boot time is a fixed cost that everyone has to pay whenever they want to test anything in the application. Faster boot times means faster iterations. ## OMG-DESIGN-7: Deliver changes quickly The path from implementing a feature to delivering it to the end user should be short. Regardless of the benefits to end users, there are also many advantages to delivering changes fast *internally.* When we integrate often and quickly, designs can be validated earlier, bugs can be discovered sooner and everybody gets on the same page about the current status of the project. To achieve fast delivery we discourage long-lived branches. Users should be working against the trunk as much as possible. Instead of using branches, experimental and unfinished features should be protected by feature flags. Such feature flags can also be used for quick rollback if critical bugs are discovered. ## OMG-DESIGN-8: Avoid coupling Avoid complicated dependencies between systems. These make the codebase harder to understand and modify. It should be possible to modify, optimize or replace each system on its own. Use abstract interfaces to access shared services such as logging, file systems, memory allocation, etc. That way, these systems can be replaced or mocked for unit tests. ## OMG-DESIGN-9: Everything is a plugin *The Machinery* is a platform built from a toolbox of collaborating plugins. Each plugin is defined by a pure C11 API and a common mechanism is provided for querying for and using these APIs. The plugins are built on a common minimalistic foundation layer that provides useful utility functions (math, containers, etc) as well as interfaces for reasoning about core concepts such as memory, threading, logging, error handling, etc. A product built on top of *The Machinery* can freely mix and match from existing plugins to create an adaptable and custom tailored user experience without extra bloat. It also allows teams to work in isolation on a plugin without having to worry about breaking things for other teams. Plugins can be shared without the need for merging code bases. New functionality should almost always be added as a plugin — be wary about extending the foundation. Also, take care to minimize the dependencies between your plugin and other plugins. Each such dependency is a coupling that will, in the long run, make the code harder to modify. ## OMG-DESIGN-10: Follow Data Oriented Design principles Design your systems around data layouts and data flows first. Memory throughput is often the limiting factor of software performance. Make sure your data layouts and access patterns are cache friendly. Don’t abstract away the nature of the hardware. Avoid object-oriented design principles. They encourage heap allocated individual objects with data hidden behind accessor functions. This leads to bad data access patterns and code flows that are hard to optimize and multi-thread. Instead, lay out the data in the most efficient way and then write functions to operate on that data. Multi-threaded, job-based parallelism should be considered the norm for any high-throughput system. ## OMG-DESIGN-11: Take advantage of modern rendering architecture features All core rendering APIs in The Machinery are architected to run on top of explicit graphics APIs such as Vulkan or DX12. The graphics API agnostic building blocks exposed are more minimalistic than what you typically would find in engines built on top of older generation graphics APIs. A few examples: - Instead of exposing a bunch of different domain-specific resource types for reasoning about buffers (e.g vertex, index buffers, etc.) we only have two types: buffer & image buffer. - Instead of using a slot based binding model for exposing resources to a shader, you bind sets of resources to a resource binder that later gets bound to the shader. - Resource state transitions are the responsibility of the user. - All APIs are Multi-GPU aware allowing them to broadcast commands to any configuration of active devices in a render backend using a device affinity bit-mask. When designing new rendering features it is important to keep this in mind: - Instead of thinking about traditional concepts such as vertex, index and constant buffers think "raw memory". - Design around update frequency and data flows. Keep shuffling memory over PCI-E to a minimum. - Utilize "bindless" concepts, avoid unnecessary updates of resource binders. ## OMG-DESIGN-12: Shared responsibility The codebase should be the shared responsibility and pride of every developer that works on it. If you see a problem, it is your problem. Don’t wait for somebody else to fix it. Fixing issues in parts of the code you’re not familiar with is a good way of understanding more of the codebase. Of course, reach out to the people who know that part of the code, to help you understand it better, and verify that your fix is the right one. When you deliver a feature or a new system, you are responsible for the whole delivery. Design, clean code, documentation, testing, performance, knowledge sharing, etc. A feature is not done just because it works. When you go in to make a change, you should always leave the code in a better state than you found it: cleaner, simpler, faster, easier to understand. When you are fixing a bug, don’t just fix the bug. Ask yourself what the underlying cause was that made the bug slip through compilation, unit tests, developer tests, and QA. Can you somehow write the code so that bugs like this can be detected at compile time? Can you add a unit test that detects the bug? Can you add an assert that prevents misuse of the functions? Can you improve the function names and/or the documentation to make misuse less likely? When you are responding to questions, consider adding them to an FAQ or clarifying the code documentation, so that you won’t have to answer the same question again. ## OMG-DESIGN-13: Write for readability Code is read more often than it is written. Write your code so that it is easy to read and understand for a new programmer (or for yourself coming back to the code later). Code should aim to be plain and straight forward. Avoid tricky constructs, trying to be clever and showing off your programming skills. It is not always 100 % clear cut what is an "unreadable clever hack" and what is a "common programming idiom". For example, consider this code for setting a bit in an integer: ~~~c flags |= (1ULL << TM_RENDER_FLAG__CAST_SHADOWS); ~~~ For someone not used to bit twiddling this may seem like a "hack". However, for someone used to bit twiddling, this is just the standard way of setting a bit and adding an abstraction around it (such as a `SET_BIT()` macro) is just obfuscating things. Familiarize yourself with the idioms that are commonly used in *The Machinery* by reading the code. # OMG-CODEORG: Code organization ## OMG-CODEORG-1: All repository code should be written in C At Our Machinery, we use C/C++ for everything. This includes things like command-line tools and documentation generators which would be tempting to write in a scripting language, such as Ruby or Python. The reason we are using C/C++ for everything is: * Anybody can work on any part of the code. * The same debugger can be used for everything. * Minimize dependencies and version hell. * It's a good test run of our C APIs — they should be easy to work with, if not, something is wrong. * Avoid writing the same code multiple times in different languages. The only exception to this rule is when a platform requires us to use a specific language. E.g. Cocoa on OS X requires us to use Objective-C, while Android requires us to use Java. We will minimize the interface code written in these languages and write the bulk of the code in C/C++. ## OMG-CODEORG-2: Header files (.h) should be written in C11 Header files (`.h`) define public interfaces between modules. All header files should be strict C11 (not C++). We use C rather than C++ as our *interface* language for multiple reasons: * C is a lot simpler than C++. Reading a C header file is easier than reading a C++ header file. * C has a smaller design space than C++. C++ can be written in lots of different "styles", which leads to confusion and fragmentation. * We want to support developing in C — i.e. it should be possible to write modules entirely in C. * C has a stable ABI. Our system is plugin-based and using C interfaces allows us to compile DLLs without worrying about which compiler version and flags we are using. * Most scripting languages are easy to bind to a C API. In contrast, when trying to bind to a C++ API, you can end up with the C++ object model and the scripting language object model fighting each other. Header files should follow the pattern: ~~~c #pragma once #include "api_types.h" // ... ~~~ We use `#pragma once` rather than include guards because it is simpler to write and works with all modern compilers. Note that we don't use `extern "C"` in the header files, as we consider C our main language. To load our header files from C++, you should wrap your includes in an `extern "C"` declaration: ~~~c++ extern "C" { #include "bla.h" } ~~~ ## OMG-CODEORG-3: Header files can only include `"api_types.h"` Header files (.h) should not be dependent on other header files and there should be no `#include` statements in header files. The only exception to this is `"api_types.h"` that may be included in header files in order to get access to standard API types (`bool`, `uint32_t`, etc). The reasons for this rule are: * Better modularity — each header file can be understood in isolation. * Faster compile times — headers including other headers can create massive amounts of code to compile. Look at some of the `.h` files in the project to see how to write headers in this style. Some of the techniques used are: * **Forward declared opaque structs:** (`struct tm_allocator_i;`). This is used to refer to objects defined in other modules. Since we only refer to them by pointers, the header file does not need the full struct definition. * **Opaque data blobs:** (`struct tm_file_o { uint8_t opaque[64]; }`). This is used to represent things like file handles, that we want to keep on the stack but that use different types on different platforms. We use an opaque data blob that is big enough to hold the largest type size and then `memcpy()` data between the opaque type and the platform-specific types. One tricky construct is when you want to implement some form of "inheritance" in C. I.e., something like this: a.h: ~~~c struct foo_base { int x; }; ~~~ b.h: ~~~c struct foo_impl { TM_INHERITS(struct foo_base); int y; }; ~~~ In this case `b.h` cannot be compiled without the definition in `a.h` and it might be tempting to put `#include "a.h"` in `b.h`. Instead, you should make sure that you include `a.h` before `b.h` in any `*.c` file that uses `b.h`. Like this: c.c: ~~~c #include "a.h" #include "b.h" ... ~~~ This might seem overly complicated, but we believe it is preferable to opening the door to header inclusion. Virtual inheritance should be a rare thing and there are advantages to exposing its complexity. To make sure the user gets better error messages if she forgets to include `a.h` you can do something like this: a.h: ~~~c #define FOO_BASE struct foo_base { int x; }; ~~~ b.h: ~~~c #ifndef FOO_BASE #error "You must include \"a.h\" before including \"b.h\"." #endif struct foo_impl { TM_INHERITS(struct foo_base); int y; }; ~~~ ## OMG-CODEORG-4: Implementation files (.c) should be written in C, but C++ can be used in rare cases Interfaces *must* be written in C. Implementations *should* be written in C. It is a smaller and simpler language and writing the code in C probably makes it easier to understand. Programmers working in *The Machinery* codebase should not have to learn C++. However, we acknowledge that there are a few rare cases where the greater expressive power of C++ can simplify the code, or where C++ code is needed to interface with an external C++ based library. In those few, very rare cases, we allow the implementation to be written in C++. Currently the code base consists of 98.9 % C and 1.1 % C++. We feel that this is a good balance. Note that this rule about preferring C only applies to our *internal* code. Clients and customers may have many good reasons to prefer other languages (for example consistency with an existing code base). They can use any language that can interface with our C ABI: C, C++, D, Rust, ML, etc. ## OMG-CODEORG-5: Shared non-API code goes in `.inl` files Header files (`.h`) are only allowed to contain public C11 interfaces. However, there is often a need for other kinds of shared code. One example is internal headers within a large module. These headers are not intended for public use, just for the different parts of the module to communicate with each other. For this kind of shared code, that is not a part of the public APIs we use the `.inl` extension, to keep them physically distinct from the public `.h` files. `.inl` files may include `.h` files and other `.inl` files but you should use this sparingly in order to avoid increasing compile times. ## OMG-CODEORG-6: Most functionality should be built as plugins The design of The Machinery is to have a small core (the `foundation`) that can be easily extended with plugins. Most functionality added to The Machinery should be written as a plugin rather than as a part of the `foundation` since it gives better modularity. ## OMG-CODEORG-7: File names should be unique Avoid having files with the same name in different directories. (For example `math/types.h` and `config/types.h`.) Unique file names are easier to search for. ## OMG-CODEORG-8: Document third-party code All use of third-party code should be documented in the `third-party-code.md` file. # OMG-API: API design ## OMG-API-1: Externally visible names should start with `tm_` or `TM_` This is our namespace prefix and is used to avoid name clashes. `tm_` should be followed by the system name. So functions related to input should be prefixed by `tm_input_` while enums related to input are prefixed `TM_INPUT_`. Static functions and other objects that don't have global visibility don't need to be and shouldn't be prefixed. ## OMG-API-2: APIs should use explicit width integers APIs should use `uint16_t`, `uint32_t`, etc rather than `int`, `short`, etc. We want to always know the exact integer sizes and have them be consistent across platforms. ## OMG-API-3: Unsigned integers are preferred over signed Signed integers should only be used for values that can be negative. Values that can only be positive should be represented by unsigned integers. Be careful when using subtraction operations on unsigned integers -- it is easy to slip up and underflow the integer into `0xffffffff`, for example with code like this: ~~~c for (uint32_t i = n-1; i>0; --i) // Do stuff ~~~ When you are doing stuff like this, consider casting to signed integers unless you're 100 % sure the subtraction won't underflow. ## OMG-API-4: `bool` from `stdbool.h` should be used for boolean types Using an explicit `bool` type for boolean values is clearer than representing them by an integer value. You should include `api_types.h` to get access to `bool` and other standard types. ## OMG-API-6: Strings should be UTF-8 encoded All strings used in APIs should be UTF-8 encoded. ## OMG-API-7: Unchanged pointers should be `const` declared If an API function takes a pointer and doesn't modify the value pointed to, the pointer should be declared `const` in the API. ## OMG-API-8: Use 0 to indicate "no value" API functions that return an integer or a floating point value and want to indicate "no result" should do so by returning `0` and not some other value, such as `-1` or `0xffffffff`. Using `0` for "no value" is less ambiguous and makes it easy to test the value with a simple if-statement. A useful technique is to reserve the first slot in arrays of objects, etc, so that the index `0` will never be used. That way we can use it for the `null` value. If you are unable to use `0` to mean "no value" because it is a valid result of the function you are writing, use a second `bool` return value to indicate whether the value returned is valid or not. ## OMG-API-9: memset(0) should be a valid data initialization Aim to design data structures so that `0` represents the default value. This makes it possible to initialize structs with a simple memory clear or by a `= {0}` assignment. There are some situations where this is not possible. For example, a `tm_vec3_t` representing scale has a natural default value of `{1,1,1}` and a `tm_vec4_t` representing a quaternion has a natural default of `{0,0,0,1}`. Don't try to fight this by adding a weird bias to the values. Just accept that in these cases we cannot achieve our goal of zero-initialization. ## OMG-API-10: Prefer polling over callbacks If your system needs to notify other parts of the code when events occur, it should do so through polling rather than callbacks. Callbacks often create subtle problems because it is hard to know which state the system is in when the callback occurs, what it is safe to do from within the callback, what thread the callback occurs on, etc. Polling does not have these issues. In a real-time application, there is always an update loop, so polling can be done without incurring any big overhead. In cases where you have a lot of objects that might need to be polled, use aggregate polling to reduce the cost. E.g., to avoid having to poll thousands of individual physics objects to see if they collided, expose a single aggregated polling function that returns an array of the physics objects that collided during the last physics update. ## OMG-API-11: Modules are structs with function pointers The Machinery is built as a number of interacting but independent modules. A module is defined by its name and its API. The API is a struct that can contain variables and function pointers: ~~~c // The logger registry stores a list of active loggers. You can register // more loggers to add more backend outputs for log messages. struct tm_logger_api { // Adds a new logger to the registry. void (*add_logger)(tm_logger_i *logger); // Removes a previously added logger from the registry. void (*remove_logger)(tm_logger_i *logger); // Sends a log message to all registered loggers. void (*print)(enum tm_log_type log_type, const char *msg); // Convenience function for printing a formatted string (using `printf` syntax) // to the default logger registry. int (*printf)(enum tm_log_type log_type, const char *format, ...); // A default logger that will print log messages using `printf(...)`. // // On Windows, these messages are also printed using `OutputDebugString()` so they appear in // Visual Studios log console. // // Note that this logger is automatically added. You need to explicitly remove it, if you don't want to use it. tm_logger_i *default_logger; }; ~~~ [Sample module.] The API registry keeps track of all current modules. You can query the registry for a module with a specific name. The registry will return a `void *` which can be cast to a struct of the appropriate type. Note that the `void *` returned is permanent. If the module is reloaded, the function pointers in the module will change, but the pointer to the module is still the same. This means it is legal to query for the API of a module before it is even loaded. Later, when it is loaded, the API will be filled in. ## OMG-API-12: create_xxx(), destroy_xxx() Functions for creating objects that allocate the object's memory should be called `create_xxx()`. The corresponding function for destroying the object should be called `destroy_xxx()`. If the functions don't allocate memory, but just initialize an object that already exists, they should be called `init_xxx()` and `shutdown_xxx()` instead. ## OMG-API-13: Counting functions are prefixed with `num_` Functions that return a count should be prefixed with `num_`, for example, `num_buckets()`. ## OMG-API-14: Type naming conventions * Externally visible objects must be named with `tm_` prefix. * Internal objects should not have `tm_` prefix (to avoid confusion). Every type should have a suffix. We use the following suffixes: * `_api` — API with function pointers that can be queried through the API registry. * `_i` — object interface — struct with function pointers and an `inst` member. * `_vt` — vtable, for objects using vtables in their interface * `_t` — simple/raw/POD data that you can copy freely and create on the stack. * `_o` — classes, opaque data, structs that need initialization, etc. ## OMG-API-15: Use of `char *`, `unsigned char *`, `uint8_t *`, `void *` * An API parameter representing a traditional string (zero-terminated, UTF-8 literal) should use `char *` (for compatibility with C literals). * String processing code may want to cast from `char *` to `uint8_t *` to deal with bytes outside the 0--127 range. * An API parameter representing a block of bytes (not a string) should use `unsigned char *`. * An API parameter representing an untyped pointer to memory should use `void *`. * The internal storage format for arbitrary data buffers should be `unsigned char[]`. !!! note: Note `unsigned char` is preferred over `uint8_t` to represent arbitrary data blocks. The reason is that `char` pointers are "special" in C. Any object pointer can be cast to a `char *` and any object can be represented as `unsigned char [sizeof A]`. The same is not necessarily true of `uint8_t *` (that is implementation dependent). ## OMG-API-16: Parameter order for functions The typical parameter order for a function should look something like this: ~~~c return_value = f(object, parameters, options, helpers, more_return_values) ~~~ * `object` is the "object" (if any) that the function operates on (e.g. for `fread`, its the file handle). * `parameters` are the primary parameters that the function uses (e.g. for `fopen`, its the file name). * `options` are any options that modify how the function operates (e.g. open for reading or writing). * `helpers` are any additional helper objects needed by the function to do its job (e.g. memory allocators). * `more_return_values` -- if the function needs to return more than one value, the extra return values are passed as pointers at the end of the parameter list. Note that often readability is improved by returning multiple values as a `struct` instead of using output parameters. However, adding that extra struct increases the cognitive load a little bit, especially if it is only used in one place. You have to decide on a case-by-case basis whether returning a struct or using return value pointers makes more sense. When passing pointers and sizes we use the `memcpy` parameter order: * Destination pointer comes before source pointer. * Pointer comes before byte counter. !!! note: Note We may want to change this in the future, to support declaring functions with explicit pointer sizes based on other parameters: ~~~c void f(size_t n, char s[static n]); ~~~ This syntax is not yet supported by the Visual Studio C compiler though. ## OMG-API-17: Use SI units Units in APIs should be SI units: seconds, meters, kilograms, etc. I.e., use a `double` parameter that specifies seconds, instead of an `uint32_t` that specifies milliseconds. Buffer sizes should be specified in bytes (not kilobytes or megabytes). Prefer to use `_bytes` rather than `_size` in the name to indicate byte count rather than element count. ## OMG-API-18: API Documentation API documentation is automatically generated using the `docgen` executable — it is built automatically by `tmbuild`. `docgen` parses `.h` files and finds `//` comment blocks that precede code blocks (`#define`, `struct` or function definitions) and turns those comment blocks into comments for the code. Freestanding comment blocks become freestanding text in the documentation. Tips for using `docgen`: * Comments should be written in Markdown ([Markdeep](https://casual-effects.com/markdeep/), to be specific). * Quote variables using backticks. * If you want to exclude something from the documentation, use `// tm_docgen off` to turn off documentation generation and `// tm_docgen on` to turn it back on again. * You can also ignore a single thing with `// tm_docgen_ignore`. * To make a section within a header file or a struct, put a "freestanding" comment there. (I.e. a comment that's not attached to a code block.) * The first "freestanding" comment in a file gets turned into Overview documentation. * You can quickly view the documentation for a single file or a directory with `docgen -e PATH`. API documentation should aim to be succinct and provide the user with just the information that they need. Avoid documentation boilerplate. ## OMG-API-19: Opaque object representations Some modules are capable of creating opaque objects. We support four different kinds of object representations. **Simple objects** are represented by an opaque forward declared struct pointer: ~~~c typedef struct tm_object_o tm_object_o; struct tm_api { tm_object_o *(*create)(); void (*do_stuff)(tm_object_o *o); void (*destroy)(tm_object_o *o); }; ~~~ Functions in the API allow you to manipulate the object. A variation on simple objects, **simple POD objects** are used for objects that are manipulated by value rather than by pointer. It looks something like this: ~~~c typedef struct tm_object_o { uint8_t bytes[32]; } tm_object_o; struct tm_api { void (*init)(tm_object_o *o); void (*do_stuff)(tm_object_o o); }; ~~~ With this approach, the caller is responsible for managing the memory of the object and the object can live on the stack. It can be compared and copied with `memcmp()` and `memcpy()`. On the inside, the API implementation will cast the raw bytes to a concrete `object_t` representation of the object. This representation is not exposed in the API either because it is platform-specific, contains types that we don't want to expose, etc. **Object interfaces** are used when we want several different implementations that implement the same interface. They look like this: ~~~c typedef struct tm_allocator_i { tm_allocator_o *inst; void *(*realloc)(tm_allocator_o *inst, void *ptr, uint64_t old_size, uint64_t new_size, const char *file, uint32_t line); } tm_allocator_i; ~~~ Here `tm_allocator_o *` is an opaque implementation-specific representation of the data for the object. Each implementation of the interface will cast this pointer to a specific type for that implementation. In addition to the data pointer, the interface also contains (implementation specific) function pointers that manipulate the data. Note that these functions all take the instance data as an argument. You would use the interface like this: ~~~c tm_allocator_i *a; void *p = a->realloc(a->inst, 0, 0, 1024, __FILE__, __LINE__); ~~~ Note that with this approach, each object contains all the function pointers of the interface, which consumes a bit of memory (not a lot, since interfaces should typically be small). A more serious drawback is that reloading code is tricky with this approach because all the function pointers in all the objects must be replaced to point to the new reloaded version. **Object interfaces with vtables** can be used to address this. They work as object interfaces, except they put all the function definitions in a vtable, so they can be shared by multiple objects: ~~~c typedef struct tm_allocator_vt tm_allocator_vt; typedef struct tm_allocator_i { tm_allocator_o *inst; tm_allocator_vt *vt; } tm_allocator_i; struct tm_allocator_vt { void *(*realloc)(tm_allocator_o *inst, void *ptr, uint64_t old_size, uint64_t new_size, const char *file, uint32_t line); }; ~~~ And using it becomes: ~~~c void *p = a->vt->realloc(a->inst, 0, 0, 1024, __FILE__, __LINE__); ~~~ Vtables also allow an object to implement an interface and extend it with additional functionality: ~~~c struct tm_my_allocator_vt { TM_INHERITS(struct tm_allocator_vt); void (*set_chunk_size)(tm_allocator_o *inst, uint64_t chunk_size); }; ~~~ This kind of inheritance should be used *very sparingly* as it can create complicated couplings, bad cache behavior and hard to read code. Prefer composition over inheritance. However, there are a few special cases where inheritance can be a good solution. ## OMG-API-20: Enum types should be named Enum types in APIs should be named. Note that since we are using C, this does not provide a namespace for the constants, the constant names still need to be fully qualified: ~~~c enum tm_log_type { TM_LOG_TYPE_INFO, TM_LOG_TYPE_DEBUG, TM_LOG_TYPE_ERROR }; ~~~ If an enum is used just to define a constant, the enum does not need to be named. In this case the enum should be written on a single line, or a `#define` statement should be used: ~~~c enum { TM_UI_MAX_TEXT_INPUT = 32 }; #define TM_UI_MAX_TEXT_INPUT 32 ~~~ ## OMG-API-21: Stable interfaces With our module architecture, changing an API interface will break compatibility with DLLs and plugins using the old interface. You should be conservative and not change the interfaces more than necessary. For example, add new functions at the end of the interface rather than at the beginning, that way existing users are unaffected by the change. However, in a long term perspective, making sure the code can evolve, improve and adapt is more important than keeping backward compatibility with really old code. Currently, since the project is in early development, we don't attempt to keep the APIs stable and you may change them freely. ## OMG-API-22: Use enum types instead of generic types when possible If a variable, field or parameter represents an enum type, use that enum type to declare it, rather than a generic type, such as `uint32_t`. ~~~c void f(enum tm_log_type log_type); struct data_t { enum tm_log_type log_type; }; ~~~ This makes it clear what value is expected and makes the code easier to use. An issue with this is that the C standard does not guarantee anything about the sizes of enum types and we want to make sure that the struct layout (at least for publicly visible structs) is identical across compilers and platforms, so that we have a stable ABI. We assume the following: * Any enum who's values can fit into an `int32_t` is represented as an `int32_t`. * Any enum with bigger values is represented as an `int64_t`. This seems to be true on all platforms and compilers that we currently use. If this is no longer true, for example, if some platform wants to store small enums as `int16_t` we will revise the situation and take additional measures to ensure that all the enums in public API structs use the same number of bytes. A common use case for enums is as bit flags. In this case, we recommend that a variable, field or parameter that holds a combination of such bit flags be represented by the enum type too: ~~~c enum tm_my_type x = MY_TYPE__ONE_FLAG | MY_TYPE__ANOTHER_FLAG; ~~~ The type of `x` is not really `enum my_type` in this case, rather it is "a set of `enum my_type`". However, in most cases, it is not really necessary to make that distinction and just using `enum tm_my_type` for `x` is simpler than introducing a new "set" type. ## OMG-API-23: Thread-safety rules Unless otherwise documented, the following thread-safety rules apply to all API functions: * Functions are not thread-safe. * But they only operate on the objects explicitly passed, so they are thread-safe with respect to operations on other objects. As an example, consider this simple function in the UI API: ~~~ tm_ui_api->make_id(ui); ~~~ This function is not thread-safe. I.e. if two different threads call `make_id(ui)` simultaneously, errors can occur. For example, the two threads may end up with identical UIs even though the UIs are supposed to be unique. However, the function will only operate on the `ui` that is passed, so it *is* safe to call `make_id(ui_1)` on one thread and `make_id(ui_2)` on a different thread. Certain APIs, such as `tm_docking_api` and `tm_profiler_api` are *global*. I.e., their data is saved in a single global data structure. Since the functions in such APIs all touch the same data they are *not* thread-safe (unless otherwise noted). These are the default thread-safety rules. APIs and functions may deviate from these rules, as long as those deviations are clearly noted in the APIs. For example, the `tm_profiler_api` would normally not be thread-safe as it uses a global object. However, it has been explicitly made thread-safe so that we can profile multiple threads simultaneously. Thus, this is noted in the documentation. Similarly, if a function that you would expect to be thread-safe by the default rule is *not* thread-safe (because it touches hidden global state), that needs to be noted too. # OMG-STYLE: Source code style rules ## OMG-STYLE-1: Source code should be formatted using `clang-format` The formatting done by our `clang-format` rules is not perfect, but having an automatic formatting system is more important than having a perfect one. It allows developers to focus on the code rather than formatting issues. * Set up your editor to automatically run `clang-format` on save (both Atom and Visual Studio Code have plugins for this). * Install the pre-commit hook `utils/githooks/pre-commit` to format the source code automatically on commit (this is done automatically by `tmbuild`). Note that different versions of `clang-format` will format the code slightly differently. It is therefore *very important* that everybody uses the same clang-format version, or we'll get formatting diffs on every commit. `tmbuild` automatically installs clang-format version 6.0.0 in your `lib` folder and sets the pre-commit hook to use this version. You may want to point your editor to use the same version to ensure consistent formatting. !!! note: Note The most annoying thing about `clang-format` is that it left-flushes `#if` statements rather than indenting them. This makes the code harder to read — especially when there are multiple nested statements. We should consider patching `clang-format` to fix this in the future. I.e. instead of this: ~~~c dont #if defined(TM_OS_WINDOWS) #define DIRSEP "\\" #else #define DIRSEP "/" #endif static void log_output_debug_string(struct tm_logger_o *inst, enum tm_log_type log_type, const char *msg) { #if defined(TM_OS_WINDOWS) void OutputDebugStringA(const char *s); OutputDebugStringA(msg); #endif } ~~~ We want this: ~~~c #if defined(TM_OS_WINDOWS) #define DIRSEP "\\" #else #define DIRSEP "/" #endif static void log_output_debug_string(struct tm_logger_o *inst, enum tm_log_type log_type, const char *msg) { #if defined(TM_OS_WINDOWS) void OutputDebugStringA(const char *s); OutputDebugStringA(msg); #endif } ~~~ Tips if you find yourself fighting with `clang-format`: * When formatting inlined struct initializers, you can make clang-format split it over multiple lines by putting a `,` after the last element, or keep it as one line by omitting the last `,`: ~~~c a = { .x = 1, .y = 2 }; b = { .x = 1, .y = 2, }; ~~~ * If you can't get the line breaking you want, you can prevent clang-format from merging your lines by putting an empty comment `//` at the end of the line. * Use `// clang-format off` to disable clang formatting for a section of the code and `// clang-format on` to turn it back on again. ## OMG-STYLE-2: If you are writing C++, prefer simple C++ If you are writing code in C++ (C is preferred) you should use a simple style. Avoid class hierarchies, multiple inheritance, template metaprogramming, virtual functions, operator overloading etc. Prefer public data members over accessor functions. In most cases, accessor functions are unnecessary boilerplate and the "encapsulation" they provide does not add any real value. Avoid using `auto`. `auto` makes it harder to see what types are actually being used in the code and leads to more cryptic error messages. Only use `auto` for complicated types, such as lambdas or iterators. Prefer using `static` rather than anonymous namespaces (`namespace { }`) to prevent external linkage. The reason we prefer `static` is that it doesn't create huge namespace scopes that make it difficult to see where the scope begins and ends. For local classes and other things that have external linkage in C++, but cannot use the `static` keyword, use a small `namespace {}` scope wrapped around just that object. ## OMG-STYLE-4: Line length is up to the discretion of the programmer Monitors are wider these days and nobody prints source code, so there is no need to enforce a line width of 80 characters. Long lines can be hard to read, but using short variable names and chopping up a line with line breaks can also make it hard to read -- we leave it up to each programmer to find the right balance. Try to avoid super long lines, but don't religiously line break your code at a fixed width at the cost of readability. If you want a guide for line length -- we think line breaking at 100 characters makes more sense than line breaking at 80. 80 is unnecessarily restrictive and 100 gives a bit more room for longer variables. Documentation should always be line-broken at 100 characters or less. Reading text is different than reading source code. Source code is read line by line, while text is read as continuous paragraphs. Column widths greater than 100 characters makes text really hard to read. Make use of editor features to line break the documentation at 100 characters. For example, in *Visual Studio Code*, set *Word Wrap Column* to 100, install the *Rewrap* extension and press *Alt-Q* to wrap a paragraph. We believe that the *right* way to handle long code lines is for the *editor* to word wrap automatically *with intelligent indentation*. I.e., when you enable word wrap in the editor, the indentation should look similar to what you would get with hand-inserted line breaks. This way, each user can get the editor width that they prefer and the code will look good everywhere. Getting this behavior is actually possible now (finally!) in for example Visual Studio Code. We recommend the following settings: ```json { "editor.wordWrapColumn": 100, "rewrap.wrappingColumn": 100, "editor.rulers": [ 100 ], "editor.wrappingIndent": "indent", "editor.wordWrap": "on", } ``` Note that we word wrap at the window width rather than at the 100 character ruler. This allows the programmers to change the column width based on their needs by resizing the window. Even with this setting we don't recommend writing super long lines. The reason is that there is still a lot of tools that process code line-by-line (for example source control) and those tools become uncomfortable to use if there are a lot of really long lines in the source code. But the above settings makes it easier to deal with the occasional long line. ## OMG-STYLE-7: Unexposed functions and variables should be declared `static` Everything that is not part of the public API should be declared `static` so that it doesn't have external linkage. ## OMG-STYLE-9: Naming Use short names for local variables. Use longer descriptive names for things with bigger scopes (functions, structs, global variables). Outside of local variables, you should not use abbreviations in variable names. Most things are named with `lower_case`. Enums and define constants are named with `UPPER_CASE`. Functions private to a source file can be given the prefix `private__` to emphasize their private nature. ## OMG-STYLE-10: Typedefs Use typedefs to define names for structs and unions that can be used without the `struct` keyword. These names should always be the same as the name of the struct: ~~~c typedef struct tm_vec2_t tm_vec2_t; ~~~ Don't use typedefs for other things. In particular, don't typedef pointer types (hiding the fact that they are pointers). ## OMG-STYLE-11: Goto Use of `goto` should be limited to early exiting of functions that need to do some cleanup before returning. ## OMG-STYLE-12: Commenting Exposed API functions need documentation comments, other code only needs to be commented as necessary. Write your code clearly and use sensible names to reduce the need for comments. `//` comments should be used for all comments. `/*` should only be used to temporarily disable code. ## OMG-STYLE-13: No dead code Unused code, commented out code, functions that are never called, etc, should be removed from the project to reduce the cognitive load of reading the source code. Old code is available in the source history if it is needed. ## OMG-STYLE-14: Macros Macros should be `CAPITALIZED`, unless the macro is "function-like", i.e. it works similar to an ordinary C function, but needs to be a macro for some reason (like `tm_min(a,b)`). Note that this is somewhat of a blurry line. Static inline functions should generally be preferred over functional macros, but there are some situations where you might still want to use a functional macro: * The macro declares variables. (`TM_INIT_TEMP_ALLOCATOR()`) * You want to make use of `__FILE__`, `__LINE__` or `__func__` in the context of the caller. (`tm_alloc()`) * The macro is generic. (`tm_min(a,b)`) * The macro is variadic. (`TM_ERROR()`) * The macro is pattern matched by external tools. (`TM_LOCALIZE()`) * The macro makes use of global variables that are declared at the point where the macro is used, but not where it is defined. (`tm_frame_alloc(), tm_tt_read()`) * You want to make absolutely sure that the call is always inlined. Macros with multiple statements should be enclosed in a `do {...} while (0)` block so that they can be used as part of the normal control flow. For macros that should take static string parameters, enforce that the parameter is a static string by wrapping it in `"" s ""`: ~~~c #define TM_STATIC_HASH(s, v) ("" s "", v) ~~~ ## OMG-STYLE-15: Conditional compilation Prefer `#if defined(X)` to `#ifdef X`. Avoid use of `#if` and `#ifdef` as much as possible. Source code is easier to reason about if every compiler on every platform sees the same code. Don't sprinkle the code with platform-specific `#ifdef`s. Instead, put all the platform-specific stuff in its own files, one per platform: `os.win.c`, `os.osx.c`, etc Use the `TM_IS_DEFINED` macro and a regular `if` statement instead of `#if` whenever possible: ~~~c if (TM_IS_DEFINED(TM_OS_WINDOWS)) do_some_stuff(); else do_some_other_stuff(); ~~~ instead of: ~~~c dont #if defined(TM_OS_WINDOWS) do_some_stuff(); #else do_some_other_stuff(); #endif ~~~ This is preferable, because when we compile, all code paths are checked for syntax errors, not just the platform that we are compiling for. ## OMG-STYLE-16: Use `TM_STATIC_HASH()` for static hash values To create a static hash value, you should use the `TM_STATIC_HASH()` macro, like this: ~~~c const uint64_t name_hash = TM_STATIC_HASH("name", 0xd4c943cba60c270bULL); ~~~ The macro resolves to its second argument, so when compiled this will just be: ~~~c const uint64_t name_hash = 0xd4c943cba60c270bULL; ~~~ `TM_STATIC_HASH()` is defined in `foundation/api_types.h`, so it can be used to define macros in header files. The `hash.exe` utility program can be used to automatically check and patch the arguments of all `TM_STATIC_HASH()` macros in the codebase. So instead of trying to figure out what the right murmur hash value is for a particular string, you can just write: ~~~c const uint64_t name_hash = TM_STATIC_HASH("name",); ~~~ And then run: ~~~c bin/Debug/hash.exe ~~~ in the root of your project, and the code will automatically be patched to: ~~~c const uint64_t name_hash = TM_STATIC_HASH("name", 0xd4c943cba60c270bULL); ~~~ ## OMG-STYLE-17: Use `TM_LOCALIZE()` for localization Every string that is visible in the UI should be wrapped in a `TM_LOCALIZE()` macro for localization. Example: ~~~c const tm_ui_menu_item_t items[] = { { .text = TM_LOCALIZE("File"), .item_id = MENU_FILE }, { .text = TM_LOCALIZE("Edit"), .item_id = MENU_EDIT }, { .text = TM_LOCALIZE("View"), .item_id = MENU_VIEW }, { .text = TM_LOCALIZE("Tools"), .item_id = MENU_TOOLS }, }; ~~~ The macro will call out to the localize system to translate the string to the current UI language. If there is no translation for the string, or the UI language is English, the original string will be returned. Thus, wrapping strings in `TM_LOCALIZE()` allows them to be localized. The Machinery editor includes a *Gibberish* translation option to test localization. When you set the UI language to *Gibberish*, all localized strings will be translated to slightly longer randomized strings. This serves two purposes. First, it allows you to check that you have properly used `TM_LOCALIZE()` everywhere. If you still see any English text after switching to Gibberish, that text has not been localized. Second, it allows you to check that the UI doesn't break if the translated strings are slightly longer than the original. The `localize.exe` program can be used for translation. It will find all the `TM_LOCALIZE()` macros in the project, check them against a list of translations and print out any translations that are still missing. Since we scan for `TM_LOCALIZE()` macros *statically*, `TM_LOCALIZE()` can only use *static* strings. I.e., this is an error: ~~~c dont // Don't do this. sprintf(msg, "Thanks %s!"); msg = TM_LOCALIZE(msg); ~~~ Instead of trying to localize dynamic strings like this, compose dynamic strings out of localized static parts: ~~~c sprintf(msg, TM_LOCALIZE("Thanks %s!"), name); ~~~ !!! Tip: Localization tips * Be wary of concatenating strings out of too many individual parts. Word order might differ between languages and the way you put the string together might not make sense. E.g.: ~~~c dont // Not ideal: This locks down the sentence structure and depends on // "great" and "good" being translated the same way in this sentence // as in all other possible sentences. sprintf(msg, TM_LOCALIZE("Thanks %s, you did %s!"), name, score > 100 ? TM_LOCALIZE("great") : TM_LOCALIZE("good")); ~~~ Better: ~~~c sprintf(msg, score > 100 ? TM_LOCALIZE("Thanks %s, you did great!") : TM_LOCALIZE("Thanks %s, you did good!"), name); ~~~ * Make sure to clearly mark in your APIs which function return localized vs non-localized strings. E.g., if you just see a function: ~~~c const char *name(); ~~~ it is hard to know if it returns a localized name (suitable for displaying in the UI), or a non-localized string for internal use. In some cases, you might need two separate functions for the two different purposes: ~~~c const char *name(); const char *display_name(); ~~~ ## OMG-STYLE-18: All variables that are not modified should be declared `const` Any variable in the code that is not mutated should be const-declared: ~~~c const uint32_t x = y + 3; ~~~ Using `const` this way makes it easier for readers to understand the code. They know that `x` will always have the value that was set on initialization and do not have to scan the code for places where `x` might be modified. Value parameters passed to a function are always assumed to be `const`. However, we do not explicitly declare them as const. Never modify value parameters in your function implementation: ~~~c dont // Don't do this: uint32_t add(uint32_t a, uint32_t b) { a += b; return a; } ~~~ ## OMG-STYLE-19: Don't cast `void *` Don't cast the return value of `tm_alloc(...)` and similar functions, since they return a `void *`, they can be simply assigned to any desired pointer type. Explicit conversion is only needed in C++ code. Don't: ~~~c dont char *p = (char *)tm_alloc(a, 32); ~~~ Instead: ~~~c char *p = tm_alloc(a, 32); ~~~ ## OMG-STYLE-20: Use `__restrict` to prevent pointer aliasing In performance critical code, declare pointers `__restrict` to prevent aliasing. !!! We are using `__restrict` rather than `restrict`, because Visual Studio does not yet support the `restrict` keyword. Also, some Microsoft headers use `restrict` for other purposes, so redefining it will give errors when including those headers. Once Visual Studio has proper `restrict` support we should change this to use `restrict` instead of `__restrict` everywhere. A `restrict` declared pointer means that *within the scope* of that pointer declaration, we promise to access the memory that the pointer points to *only* through that pointer. Don't feel the need to put `restrict` everywhere. You may accidentally introduce a bug if you declare two pointers as `restrict` in situations where they can actually alias. It's not always evident when pointers can alias or not, therefore each use of `restrict` requires a bit of thinking, and it should be used with care and consideration. The Machinery currently compiles with `-fno-strict-aliasing`, but you should write your code to be `-fstrict-aliasing` safe. I.e., you should not use two pointers of different types to access the same memory area (with the exception of `char *`). The reason we're not using strict aliasing is that our main development environment is Visual Studio which doesn't support strict aliasing anyway. We might revisit this in the future. Quick summary of strict aliasing rules: * An object must be accessed through its effective type, a `char` pointer or a union that includes its effective type. * Any object pointer converts to and from `void *`. * Only pointers of the same type can alias, except for `char *` which can alias with everything. * Every object can be viewed as `unsigned char [sizeof A]`. Since we're not compiling with strict aliasing you may want to use `restrict` even on pointers that would not be able to alias under strict aliasing rules -- i.e. pointers of different types. References: * https://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html * https://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html ## OMG-STYLE-21: All block scope variables should be initialized All block scope variables should be initialized at the point of declaration: ~~~c uint32_t x = 0; ~~~ Avoiding bugs from uninitialized variables is more important than the possible performance gains from not initializing variables. ## OMG-STYLE-22: All struct padding should be explicit Because of alignment requirements, C will often insert padding into structs: ~~~c struct s { uint32_t u32; // <--- 4 bytes of padding is inserted here to align u64. uint64_t u64; bool b; // <-- 7 bytes of padding here is inserted to align the size of the struct. } ~~~ We require that all such padding be made implicit with the use of the `TM_PAD()` macro. I.e., the above struct should be written as: ~~~c struct { uint32_t u32; TM_PAD(4); uint64_t u64; bool b; TM_PAD(7); }; ~~~ The reason why we make the padding explicit is that without explicit padding, initializers such as `= {0}` may have the surprising result of not clearing the padding bytes, which can lead to problems when hashing, comparing or serializing structs. The build files are set up to generate warnings whenever the compiler has to add padding to a struct. # OMG-BUILD: Build process ## OMG-BUILD-1: Single button builds through `tmbuild` Running the `tmbuild` executable on a fresh machine should be enough to build the project. ## OMG-BUILD-2: Distribution packages through `tmbuild -p` To build a package of the engine for distribution, use the `-p` option, such as: ~~~sh tmbuild -p release-package.json ~~~ Packages are defined in JSON files. The source repository contains three packages: * `release-package.json`: Builds the binary release of The Machinery for the current platform. * `release-pdbs-package.json`: Builds a package of PDB files for Windows. Must be built immediately after you've built `release-package.json`. * `release-debug-symbols-package.json`: Builds a package of debug symbol files for Linux. Must be built immediately after you've built `release-package.json`. The package builds makes use of the following environment variables: * `TM_LIB_DIR`: The directory where the libraries needed to build The Machinery are installed. The libraries are automatically downloaded to this directory by `tmbuild`. If you don't specify this environment variable, `./lib` will be used. * `TM_SDK_DIR`: The library where The Machinery SDK can be find. When you do a release package build, `tmbuild` will make sure that the `utils` and `samples` distributed in the SDK package can be built, so `TM_SDK_DIR` needs to be set to the directory where the package is being built (i.e., `./build/themachinery`). * `TM_OURMACHINERY_COM_DIR`: Directory where the `ourmachinery.com` repository resides. The package build will update the API documentation in this directory. # OMG-GIT: Git workflow ## OMG-GIT-1: Use trunk-based development At Our Machinery, we use a [trunk-based development](https://trunkbaseddevelopment.com/) workflow. The basic idea behind trunk-based development is to not have long-lived feature branches. Instead, work should be merged incrementally to the `master` trunk, typically at least daily. The workflow is: * Work on your change directly in `master`. * Work on a small isolated feature in a single commit. * When you have something to share, write a commit message that is as descriptive as possible and push it. * Typically the process for this is: `git pull --rebase`, `git push`. If your change has multiple small steps to it, or if you want someone to look at it before you commit it, it can make sense to work on a small private branch instead of directly in master. This lets you commit to backup your work before your change is finished. If you do this, when you are done with your change, merge it with `git merge --squash`. This merges together the multiple *WIP* commits from your private branch into a single commit for the `master` branch. The advantages of using this workflow are: * Everybody sees the same codebase — people are not off on separate branches. * Merge conflicts are reduced. * Integration happens sooner rather than later — problems are discovered sooner. * Commits are smaller and easier to grasp, two months of code doesn't come in as a single chunk. * Trunk history is branch free which simplifies the use of tools like `git-bisect`. A disadvantage often claimed against this workflow is that the master branch becomes more unstable when more people are committing into it. This is true. However, the stability you get with long feature branches is often deceptive — you are just delaying the instability until the feature branches get integrated. Usually, that happens near a release — everybody integrates their feature branches at the same time, and you suddenly have a lot of instability. But near a release is exactly when you don't want a lot of instability. To make sure that you can commit often, break bigger work into smaller steps that can be worked on and tested independently. (That is usually the only way to tackle bigger projects anyway.) Each single step should be something that adds value to the project by moving it forward in the right direction. This makes it easy to course-correct early if we are moving in the wrong direction. For features that are tricky to enable incrementally, consider the use of *feature flags* — compiler or runtime flags that enable or disable the new feature. This allows the feature to be tested as it is being developed, instead of "after integration", as in a feature-branch workflow. But note that developing the feature incrementally is even better than using feature flags, as feature flags segment the developers and can hide integration problems. (Only the developers with certain feature flags set will test your code changes.) In the current setup, developers test and merge their branches to the trunk as soon as they are done. Code review happens before and after: * Code review before — make sure that the feature plan is sound. * Code review after merge — review the changes and look for any problems. * Code review of completed system — does it match the plan, are there changes that need to be made? If the developer so desires, they can also request for a code review before they push, but this is not *required*. It is left to the discretion of the developer. If you feel confident about your code, push it. If you feel unsure, request a review. With time, you will get better and better intuition for what code is safe to push, and what code might need a review. The trust and responsibility to produce high-quality code is on you, but you are not alone. Use your colleagues for help. ## OMG-GIT-2: Prefer rebase to merge The advantage of `rebase` is that it gives a cleaner, more linear commit history than relying on `merge`, especially when there are many contributors. A more linear commit history makes it easier to tell what is going on in the project and use tools such as `git bisect`. The basic rebase workflow is, instead of doing this to push your changes: ~~~sh dont git pull git push ~~~ Do this: ~~~sh git pull --rebase git push ~~~ Conflicts can be trickier to handle in rebase, but if you follow the rule of pushing often you shouldn't run into conflicts too often. Sometimes change history can get messy. For example, you work on a branch, make multiple commits, merge in changes from the master etc. In this case, rebasing can be tricky. Instead, merge it into `master` as a single commit, using `git merge --squash`: ~~~sh # Merge master into your branch, to make sure you have the latest changes # and fix any issues. git merge master # Switch back to master and squash merge. git checkout master git merge --squash my-branch # Commit the changes. git commit ~~~ Now you have all your messy changes and merges applied as a single clean commit on top of master. This technique can also be used in case you get a conflict when using the regular rebase workflow. ## OMG-GIT-3: Use the first line of your commit message as a summary The first line of your git commit message serves as the title and should be a summary of you did (preferably in less than 50 characters, since that works better with some tools, but it's OK to go a bit longer if you can't fit a good summary in 50 characters). After the summary, put a blank line and then follow up with a detailed description of the commit, what problem it's solving and why you are making this change. This text should be line broken at about 72 characters. If your commit includes several distinct changes, list them in bullet points. If your commit is very small, the details are not needed, just the title/summary is fine. If your commit resolves specific bugs, put links to the bugs in the commit. Example: ~~~txt Fix for log not working File logger was using the address of local variables. Fixes: OurMachinery/themachinery-public#538 ~~~ A good guide to writing commit messages is: https://chris.beams.io/posts/git-commit/ If you have to do something surprising and tricky in the code to fix a particular bug, make sure to put a note of what you are doing and why, not just in the commit message, but also in the code itself. People reading your code are unlikely to go back and look at commit messages, so it's important that they can find out what is happening by just looking at the code itself. Example: ~~~c // Note: We can't use a static string in the call to weird_api_function(), because contrary to the // documentation, it will actually modify the string. const char *s = tm_temp_allocator_api->printf(ta, "stuff"); weird_api_function(x); ~~~ ## OMG-GIT-4: Work on hotfixes in the release branch When you do fixes for a hotfix release (e.g. 2021.9.a), the preferred workflow is: * Implement and test the fix in the release branch (e.g. `release/2021.9`). * When you have committed and pushed the fix, merge the release branch back into `master`. An alternative workflow is: * Implement and test the fix in `master`. * Cherry-pick the fix into the release branch. The preferred workflow is preferred for a number of reasons: * Doing the fix in the release branch ensures that it gets tested in the release branch. Since it's the stability of the release branch that matters, testing it there is more important than testing it in `master`. * When you do the fix in the release branch it's easier to ensure that you do a "minimal" fix -- i.e. change as little as possible to fix the bug... when you work in master it's more tempting to do bigger refactorings. * Doing it this way minimizes the merge conflicts... when you cherry-pick commits it's easy to get merge conflicts if some API has been refactored, etc. ## OMG-GIT-5: Post bugs in the public issue tracker Bugs or feature requests should preferably be posted in the public issue tracker at https://github.com/OurMachinery/themachinery-public/issues. Having the bug list public makes it easier to search and easier to check for duplicates. In addition to the public issue tracker, we also have a private issue tracker at https://github.com/OurMachinery/themachinery/issues, accessible only to users with source access. Usually there is no reason to post bugs there. Even if a bug needs to quote some source code, it is usually just a small snippet, that can live on the public issue tracker without problems. ## OMG-GIT-6: Close "nice-to-have"/"low priority" issues if there is no plan to work on them We want users to feel comfortable positing "nice-to-have" feature requests and "small annoyances" in the issue tracker, because it is good for us to know what problems users are having, however small, and sometimes a small issue can be addressed with a quick 5-minute fix. Or, they can be fixed in an overhaul of that particular system. However, there is not much value of keeping tons and tons of such issues open. For any project of this size, there's thousands of "nice-to-have" things you can think of, and if you put them all in the issue tracker, they tend to obscure the more important issues. Therefore, for low priority issues like these, the policy is that if someone has had a quick look at the issue, and there is no plan to work on it in the near future, it should be closed with the tag *later*. This prevents it from taking up too much attention in the issue tracker. The issue will still be around in the history of closed issues, so if more people have the same problem, they can comment on it, or reopen it, and we can consider again if it is the right time to fix it.