Set up the project's sources. Using IntelliJ IDEA is recommended, since it should configure everything out of the box. Alternatively, use the ./gradlew wrapper to build the project.
Create an environment file (.env) in the root directory of the project, with the following variables:
- DATABASE_URL: The JDBC URL to the database; for example:
  - jdbc:sqlite:./intrinsics.sqlite for a local SQLite database
  - jdbc:mariadb://localhost:3306/intrinsics for a MariaDB database (served on the local machine at port 3306), which should already be running
- DATABASE_DRIVER: The JDBC driver class name; for example:
  - org.sqlite.JDBC for SQLite
  - org.mariadb.jdbc.Driver for MariaDB
- DATABASE_USER: the username to connect to the database; not required for SQLite
- DATABASE_PASSWORD: the password to connect to the database; not required for SQLite
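As a quick sanity check for the variables listed above, the following Python sketch parses the contents of an environment file and reports which variables are still missing. It assumes the file uses plain `KEY=VALUE` lines; the helper names `parse_env` and `missing_variables` are made up for this example and are not part of the project.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_variables(env: dict[str, str]) -> list[str]:
    """Return the names of the variables that are required but absent."""
    required = ["DATABASE_URL", "DATABASE_DRIVER"]
    # Credentials are not required for SQLite.
    if not env.get("DATABASE_URL", "").startswith("jdbc:sqlite:"):
        required += ["DATABASE_USER", "DATABASE_PASSWORD"]
    return [name for name in required if name not in env]


sqlite_env = parse_env(
    "DATABASE_URL=jdbc:sqlite:./intrinsics.sqlite\n"
    "DATABASE_DRIVER=org.sqlite.JDBC\n"
)
mariadb_env = parse_env(
    "DATABASE_URL=jdbc:mariadb://localhost:3306/intrinsics\n"
    "DATABASE_DRIVER=org.mariadb.jdbc.Driver\n"
)
print(missing_variables(sqlite_env))   # prints []
print(missing_variables(mariadb_env))  # prints ['DATABASE_USER', 'DATABASE_PASSWORD']
```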
(Optional) change the default port in the src/main/resources/application.yaml configuration file.
Get the intrinsics data from the Intel Intrinsics Guide and post-process it (I plan to automate this step in the future).
1. The downloaded data is a ZIP-file, with the relevant data files being files/data.js and files/perf2.js.
2. Post-process files/data.js to be a valid XML file (when downloaded, it's an XML string in a JS variable):
  1. Remove the var data = " prefix at the beginning of the file
  2. Remove the "; suffix at the end of the file
  3. Replace all \n with actual newlines, and all \" with ", and then remove all trailing \
3. Post-process files/perf2.js to be a valid JSON file (when downloaded, it's a JSON string in a JS variable):
  1. Remove the perf2_js = prefix at the beginning of the file
  2. Replace all {l: with {"l":, and all ,t: with ,"t":
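Until this is automated, the manual steps above can be sketched in Python (3.9 or later). This is only an illustration of the listed replacements, not the project's own tooling: the function names are made up, the sample inputs are invented stand-ins for the real files, and the exact whitespace around the prefixes is an assumption.

```python
import json
import re
import xml.etree.ElementTree as ET


def data_js_to_xml(text: str) -> str:
    """Turn the contents of files/data.js into plain XML text."""
    text = re.sub(r'^\s*var\s+data\s*=\s*"', "", text, count=1)  # remove the prefix
    text = re.sub(r'";\s*$', "", text, count=1)                   # remove the suffix
    text = text.replace("\\n", "\n").replace('\\"', '"')          # unescape \n and \"
    # Remove the backslashes that are left at the end of lines.
    return re.sub(r"\\+$", "", text, flags=re.MULTILINE)


def perf2_js_to_json(text: str) -> str:
    """Turn the contents of files/perf2.js into plain JSON text."""
    text = re.sub(r"^\s*perf2_js\s*=\s*", "", text, count=1)      # remove the prefix
    return text.replace("{l:", '{"l":').replace(",t:", ',"t":')   # quote the keys


# Invented miniature inputs that only mimic the shape described above.
sample_data_js = r'''var data = "<list>\n\
<intrinsic name=\"_mm_add_ps\"/>\n\
</list>";'''
sample_perf2_js = 'perf2_js = {"EXAMPLE_FORM": [{"ExamplePlatform": {l:"4",t:"0.5"}}]}'

root = ET.fromstring(data_js_to_xml(sample_data_js))   # parses as XML
perf = json.loads(perf2_js_to_json(sample_perf2_js))   # parses as JSON
print(root[0].get("name"), perf["EXAMPLE_FORM"][0]["ExamplePlatform"])
```

Parsing the result with `xml.etree.ElementTree` and `json` is a cheap way to confirm that the post-processed files are well-formed before loading them into the database.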
At this point, you are ready to load the data into the database using the application. Run ./gradlew run -reload /path/to/post-processed/data.js /path/to/post-processed/perf2.js
Finally, you can start the application using ./gradlew run

You can also use the ./package.sh script to build a standalone executable (with bundled JVM). It uses the Gradle shadowJar target together with jpackage.

Running

By default, the application runs on port 42024. You can start it using ./gradlew run.

To reload the database (it drops all data, then re-parses everything), use ./gradlew run -reload /path/to/xml /path/to/json.

API

Most of the API is paginated. The default page size is 100, and the default page number is 0. Each paginated response has the following format:

{
  "page": <page number>,
  "totalPages": <total number of pages>,
  "items": []
}

Typically, you can use the "root" API endpoint (e.g. /all) to get the first page of results, and get any other page by appending the page number as a /{page} path parameter (e.g. /all/3). There is one notable exception (/search), where the page number is specified as a query parameter (/search?page=3). Finally, the details endpoint (/details/{id}) is not paginated, and returns a single object.
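A client can hide the difference between the two paging styles behind a small helper. The Python sketch below is one possible approach; `page_path`, `iter_items` and the fake fetch function are hypothetical names for this example, and the canned pages stand in for real server responses.

```python
from typing import Callable, Iterator


def page_path(endpoint: str, page: int = 0) -> str:
    """Return the request path for one page of a paginated endpoint."""
    endpoint = "/" + endpoint.strip("/")
    if endpoint == "/search":
        # /search is the exception: the page is a query parameter.
        return f"/search?page={page}"
    # Page 0 is what the root endpoint returns, so no suffix is needed.
    return endpoint if page == 0 else f"{endpoint}/{page}"


def iter_items(endpoint: str, fetch: Callable[[str], dict]) -> Iterator:
    """Yield every item of a paginated endpoint, page by page."""
    page = 0
    while True:
        response = fetch(page_path(endpoint, page))
        yield from response["items"]
        page += 1
        if page >= response["totalPages"]:
            break


# Canned responses used instead of a running server.
FAKE_PAGES = [
    [{"id": "id-0", "name": "_mm_add_ps"}, {"id": "id-1", "name": "_mm_sub_ps"}],
    [{"id": "id-2", "name": "_mm_mul_ps"}],
]


def fake_fetch(path: str) -> dict:
    page = 0 if path == "/all" else int(path.rsplit("/", 1)[1])
    return {"page": page, "totalPages": len(FAKE_PAGES), "items": FAKE_PAGES[page]}


names = [item["name"] for item in iter_items("/all", fake_fetch)]
print(names)  # prints ['_mm_add_ps', '_mm_sub_ps', '_mm_mul_ps']
```

Against a running instance, `fetch` could instead wrap `urllib.request.urlopen` on `http://localhost:42024` (the default port) and decode the JSON body.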

`GET /all`

Gets a (paginated) list of all intrinsics. For each intrinsic, the following fields are returned:

{
  "id": "<UUID string>",
  "name": "<function name of the intrinsic>"
}

Other pages can be requested using the GET /all/{page} endpoint (GET /all is equivalent to GET /all/0).

`GET /cpuid`

Gets a (paginated) list of all CPUIDs in the database. Each CPUID can be used as a filter for the /search endpoint. The data is a simple list of strings. Examples include PREFETCHI, SSE2, AVX, etc.

`GET /tech`

Gets a (paginated) list of all technologies in the database. Each technology can be used as a filter for the /search endpoint. The data is a simple list of strings. The full list (at the time of writing) is:

AMX
AVX-512
AVX_ALL
MMX
Other
SSE_ALL
SVML

`GET /category`

Gets a (paginated) list of all categories in the database. Each category can be used as a filter for the /search endpoint. The data is a simple list of strings. Examples include Logical, OS-Targeted, Swizzle, etc.

`GET /types`

Gets a (paginated) list of all C/C++(-like) types used by the intrinsics. The types can be used as a filter for the /search endpoint (but currently only based on the return type). The data is a simple list of strings. Examples include __int16, __m128 const *, string literal, etc.

`GET /search`

Searches the database using the given filters. The filters are passed as query parameters, and can be combined. All filters are optional. The following filters are available:

name=[string]: searches based on the name of the intrinsic; matches any name that contains the given string (using LIKE %value%)
return=[string]: searches based on the return type of the intrinsic; exact search only
cpuid=[*]: searches based on the CPUID of the intrinsic; exact search only
tech=[*]: searches based on the technology of the intrinsic; exact search only
category=[*]: searches based on the category of the intrinsic; exact search only
desc=[string]: searches based on the description of the intrinsic; matches any description that contains the given string (using LIKE %value%)
page=[int]: specifies the page number to return (default is 0)

Parameters marked with [*] take JSON lists, so you should pass them as, for example, cpuid=["PREFETCHI", "SSE2"]. The list entries are OR-ed together: in this example, the results contain every intrinsic that matches either of the two CPUIDs.
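Because the list filters contain brackets, quotes and spaces, they need to be percent-encoded when placed in a URL. The following Python sketch builds a /search path from the filters above using only the standard library; `search_path` and its keyword names (such as `return_type` for the `return` filter) are hypothetical, and it assumes the server decodes percent-encoded query strings in the usual way.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def search_path(*, name=None, return_type=None, desc=None, page=None,
                cpuid=None, tech=None, category=None) -> str:
    """Build a /search request path from the optional filters."""
    plain = {"name": name, "return": return_type, "desc": desc, "page": page}
    lists = {"cpuid": cpuid, "tech": tech, "category": category}
    params = {key: str(value) for key, value in plain.items() if value is not None}
    # Filters marked [*] are passed as JSON lists.
    params.update({key: json.dumps(value) for key, value in lists.items() if value is not None})
    query = urlencode(params)  # percent-encodes brackets, quotes and spaces
    return f"/search?{query}" if query else "/search"


path = search_path(name="add", cpuid=["PREFETCHI", "SSE2"], page=3)
print(path)

# Decoding the query again recovers the original filter values.
decoded = parse_qs(urlsplit(path).query)
print(json.loads(decoded["cpuid"][0]))  # prints ['PREFETCHI', 'SSE2']
```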

Passing no filters is equivalent to using GET /all, and data is returned in the same format:

{
   "page": <page number>,
   "totalPages": <total number of pages>,
   "items": [
      {
         "id": "<UUID string>",
         "name": "<function name of the intrinsic>"
      }
   ]
}

`GET /details/{id}`

Gets the details for a single, specific intrinsic. The following data is returned:

{
    "id": "<UUID string>",
    "name": "<function name of the intrinsic>",
    "returnType": "<return type of the intrinsic>",
    "returnVar": "<variable name for the return value, as used in the description; can be null>",
    "description": "<description of the intrinsic>",
    "operations": "<operation of the intrinsic in pseudocode>",
    "category": "<category>",
    "cpuid": "<CPUID>",
    "tech": "<technology>",
    "params": [
        {
            "name": "<parameter name>",
            "type": "<parameter type>"
        }
    ],
    "instructions": [
        {
            "mnemonic": "<instruction mnemonic>",
            "xed": "<Intel XED code>",
            "form": "<instruction argument form>"
        }
    ],
    "performance": [
        {
            "platform": "<platform name>",
            "latency": <latency in cycles; can be null>,
            "throughput": <throughput in CPI; can be null>
        }
    ]
}
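To illustrate how the fields fit together, the Python sketch below turns a details object into a C-like signature and collects the known latencies, skipping null entries. The function names and all sample values (ID, platform names, numbers) are invented for this example; only the field names come from the format above.

```python
def format_signature(details: dict) -> str:
    """Render a C-like signature from returnType, name and params."""
    params = ", ".join(f'{p["type"]} {p["name"]}' for p in details["params"])
    return f'{details["returnType"]} {details["name"]}({params})'


def known_latencies(details: dict) -> dict:
    """Map platform name to latency, skipping entries where latency is null."""
    return {
        entry["platform"]: entry["latency"]
        for entry in details["performance"]
        if entry["latency"] is not None
    }


# Invented sample in the shape of a /details/{id} response.
sample = {
    "id": "00000000-0000-0000-0000-000000000000",
    "name": "_mm_add_ps",
    "returnType": "__m128",
    "returnVar": "dst",
    "description": "placeholder description",
    "operations": "placeholder pseudocode",
    "category": "Arithmetic",
    "cpuid": "SSE",
    "tech": "SSE_ALL",
    "params": [
        {"name": "a", "type": "__m128"},
        {"name": "b", "type": "__m128"},
    ],
    "instructions": [],
    "performance": [
        {"platform": "PlatformA", "latency": 4, "throughput": 0.5},
        {"platform": "PlatformB", "latency": None, "throughput": None},
    ],
}

print(format_signature(sample))  # prints __m128 _mm_add_ps(__m128 a, __m128 b)
print(known_latencies(sample))   # prints {'PlatformA': 4}
```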

`GET /version`

Gets version information for the data. The following data is returned:

{
    "intelVersion": "M.m.p (Major.minor.patch version as reported by Intel)",
    "intelUpdate": "yyyy-MM-dd (date of Intel's last update prior to scraping)",
    "scrapeDate": "yyyy-MM-dd (date of last update)"
}