# Intrinsics API
*The backend API powering [https://simd.jaytux.com](https://simd.jaytux.com/api)*
## Preparation
1. Set up the project's sources. The recommended route is to open the project in IntelliJ IDEA and let it configure everything, which should work out of the box. Alternatively, the `./gradlew` wrapper can be used to build the project.
2. Create an environment file (`.env`) in the root directory of the project, with the following variables:
   - `DATABASE_URL`: The JDBC URL to the database; for example:
     - `jdbc:sqlite:./intrinsics.sqlite` for a local SQLite database
     - `jdbc:mariadb://localhost:3306/intrinsics` for a MariaDB database (served on the local machine at port 3306), which should already be running
   - `DATABASE_DRIVER`: The JDBC driver class name; for example:
     - `org.sqlite.JDBC` for SQLite
     - `org.mariadb.jdbc.Driver` for MariaDB
   - `DATABASE_USER`: the username to connect to the database; not required for SQLite
   - `DATABASE_PASSWORD`: the password to connect to the database; not required for SQLite
3. (Optional) change the default port in the `src/main/resources/application.yaml` configuration file.
4. Get the intrinsics data from the [Intel Intrinsics Guide](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html) and post-process the data (I am planning to automate this in the future, stand by).
   1. The downloaded data is a ZIP-file, with the relevant data files being `files/data.js` and `files/perf2.js`.
   2. Post-process `files/data.js` to be a valid XML file (when downloaded, it's an XML string in a JS variable):
      1. Remove the `var data = "` prefix at the beginning of the file
      2. Remove the `";` suffix at the end of the file
      3. Replace all `\n` with actual newlines, and all `\"` with `"`, and then remove all trailing `\`
   3. Post-process `files/perf2.js` to be a valid JSON file (when downloaded, it's a JSON string in a JS variable):
      1. Remove the `perf2_js = ` prefix at the beginning of the file
      2. Replace all `{l:` by `{"l":`, and all `,t:` by `,"t":`
5. At this point, you are ready to load the data into the database using the application. Run `./gradlew run -reload /path/to/post-processed/data.js /path/to/post-processed/perf2.js`
6. Finally, you can start the application using `./gradlew run`
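
The manual post-processing in step 4 can also be scripted. The following is a minimal Python sketch (Python 3.9+ for `str.removeprefix`), not part of this project: the function names are hypothetical, and "trailing `\`" is interpreted here as a backslash left at the end of a line.

```python
import re


def data_js_to_xml(text: str) -> str:
    """Turn the contents of files/data.js into plain XML (step 4.2)."""
    # Surrounding whitespace is stripped first so the prefix/suffix match.
    text = text.strip()
    text = text.removeprefix('var data = "')
    text = text.removesuffix('";')
    # Unescape the JS string: literal \n becomes a newline, \" becomes ".
    text = text.replace("\\n", "\n").replace('\\"', '"')
    # Assumption: "trailing \" means backslashes at the end of a line.
    return re.sub(r"\\+$", "", text, flags=re.MULTILINE)


def perf2_js_to_json(text: str) -> str:
    """Turn the contents of files/perf2.js into JSON (step 4.3)."""
    text = text.strip().removeprefix("perf2_js = ")
    # Quote the two bare keys named in the steps above.
    return text.replace("{l:", '{"l":').replace(",t:", ',"t":')
```

Each function takes the file contents as a string and returns the converted text, so reading the input files and writing the results (for example with `pathlib.Path.read_text` and `write_text`) is left to the caller.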
## Running
By default, the application runs on port `42024`. You can start it using `./gradlew run`.
To reload the database (it drops all data, then re-parses everything), use `./gradlew run -reload /path/to/xml /path/to/json`.
## API
Most of the API is paginated. The default page size is 100, and the default page number is 0. Each paginated response has the following format:
```json
{
  "page": <page number>,
  "totalPages": <total number of pages>,
  "items": []
}
```
Typically, you can use the "root" API endpoint (e.g. `/all`) to get the first page of results, and get any other page by appending the page number as a `/{page}` path parameter (e.g. `/all/3`). There is one notable exception (`/search`), where the page number is specified as a query parameter (`/search?page=3`). Finally, the details endpoint (`/details/{id}`) is not paginated, and returns a single object.
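
As an illustration of the pagination scheme, here is a small Python sketch that walks every page of a path-paginated endpoint. `BASE_URL`, `fetch_json` and `iter_items` are hypothetical names, and the base URL assumes a local instance on the default port; adjust it for your deployment. It does not apply to `/search`, which takes the page as a query parameter.

```python
import json
from typing import Callable, Iterator
from urllib.request import urlopen

# Assumption: a local instance running on the default port.
BASE_URL = "http://localhost:42024"


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_items(endpoint: str, fetch: Callable[[str], dict] = fetch_json) -> Iterator:
    """Yield the items of every page of e.g. "/all" or "/cpuid"."""
    page = 0
    while True:
        body = fetch(f"{BASE_URL}{endpoint}/{page}")
        yield from body["items"]
        page += 1
        # Stop once the last page reported by the server has been read.
        if page >= body["totalPages"]:
            break
```

For example, `for item in iter_items("/all"): print(item["name"])` would print every intrinsic name, one request per page. The `fetch` parameter exists only so the HTTP call can be swapped out, for instance in tests.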
### `GET /all`
Gets a (paginated) list of all intrinsics. For each intrinsic, the following fields are returned:
```json
{
  "id": "<UUID string>",
  "name": "<function name of the intrinsic>"
}
```
Other pages can be requested using the `GET /all/{page}` endpoint (`GET /all` is equivalent to `GET /all/0`).
### `GET /cpuid`
Gets a (paginated) list of all CPUIDs in the database. Each CPUID can be used as a filter for the `/search` endpoint. The data is a simple list of strings. Examples include `PREFETCHI`, `SSE2`, `AVX`, etc.
### `GET /tech`
Gets a (paginated) list of all technologies in the database. Each technology can be used as a filter for the `/search` endpoint. The data is a simple list of strings. The full list (at the moment of writing) is:
- `AMX`
- `AVX-512`
- `AVX_ALL`
- `MMX`
- `Other`
- `SSE_ALL`
- `SVML`
### `GET /category`
Gets a (paginated) list of all categories in the database. Each category can be used as a filter for the `/search` endpoint. The data is a simple list of strings. Examples include `Logical`, `OS-Targeted`, `Swizzle`, etc.
### `GET /types`
Gets a (paginated) list of all C/C++(-like) types used by the intrinsics. The types can be used as a filter for the `/search` endpoint (currently only on the return type). The data is a simple list of strings. Examples include `__int16`, `__m128 const *`, `string literal`, etc.
### `GET /search`
Searches the database using the given filters. The filters are passed as query parameters, and can be combined. All filters are optional. The following filters are available:
- `name=[string]`: searches based on the name of the intrinsic; substring match (SQL `LIKE '%<string>%'`)
- `return=[string]`: searches based on the return type of the intrinsic; exact search only
- `cpuid=[*]`: searches based on the CPUID of the intrinsic; exact search only
- `tech=[*]`: searches based on the technology of the intrinsic; exact search only
- `category=[*]`: searches based on the category of the intrinsic; exact search only
- `desc=[string]`: searches based on the description of the intrinsic; substring match (SQL `LIKE '%<string>%'`)
- `page=[int]`: specifies the page number to return (default is 0)
Parameters marked by `[*]` are JSON lists (so you should pass them as `cpuid=["PREFETCHI", "SSE2"]`). The values within a list are OR-ed together: the example returns every intrinsic whose CPUID is either `PREFETCHI` or `SSE2`.
Passing no filters is equivalent to using `GET /all`, and data is returned in the same format:
```json
{
  "page": <page number>,
  "totalPages": <total number of pages>,
  "items": [
    {
      "id": "<UUID string>",
      "name": "<function name of the intrinsic>"
    }
  ]
}
```
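Because the list-valued filters are JSON and must be URL-encoded, building `/search` URLs by hand is error-prone. The sketch below shows one way to do it in Python; `search_url` and its keyword names are hypothetical (in particular `return_type`, since `return` is a Python keyword), and the base URL is supplied by the caller.

```python
import json
from urllib.parse import urlencode


def search_url(base_url, *, name=None, return_type=None, cpuid=None,
               tech=None, category=None, desc=None, page=None) -> str:
    """Build a /search URL from the filters documented above."""
    params = {}
    # Plain string filters are passed through as-is.
    for key, value in (("name", name), ("return", return_type), ("desc", desc)):
        if value is not None:
            params[key] = value
    # List filters ([*]) are serialized as JSON lists.
    for key, values in (("cpuid", cpuid), ("tech", tech), ("category", category)):
        if values:
            params[key] = json.dumps(list(values))
    if page is not None:
        params["page"] = page
    # urlencode takes care of percent-encoding brackets, quotes and spaces.
    query = urlencode(params)
    return f"{base_url}/search?{query}" if query else f"{base_url}/search"
```

For instance, `search_url("http://localhost:42024", name="add", cpuid=["SSE2", "AVX"], page=1)` yields a URL whose `cpuid` parameter decodes back to the JSON list `["SSE2", "AVX"]`.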
### `GET /details/{id}`
Gets the details for a single, specific intrinsic. The following data is returned:
```json
{
  "id": "<UUID string>",
  "name": "<function name of the intrinsic>",
  "returnType": "<return type of the intrinsic>",
  "returnVar": "<variable name for the return value, as used in the description; can be null>",
  "description": "<description of the intrinsic>",
  "operations": "<operation of the intrinsic in pseudocode>",
  "category": "<category>",
  "cpuid": "<CPUID>",
  "tech": "<technology>",
  "params": [
    {
      "name": "<parameter name>",
      "type": "<parameter type>"
    }
  ],
  "instructions": [
    {
      "mnemonic": "<instruction mnemonic>",
      "xed": "<Intel XED code>",
      "form": "<instruction argument form>"
    }
  ],
  "performance": [
    {
      "platform": "<platform name>",
      "latency": <latency in cycles; can be null>,
      "throughput": <throughput in CPI; can be null>
    }
  ]
}
```
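As a usage illustration, the following Python sketch consumes an already-decoded details object. Both helper names are hypothetical; the code relies only on the fields listed above and skips performance entries whose `latency` is null.

```python
from typing import Optional


def signature(details: dict) -> str:
    """Render a C-style prototype from a /details/{id} response."""
    params = ", ".join(f'{p["type"]} {p["name"]}' for p in details["params"])
    return f'{details["returnType"]} {details["name"]}({params})'


def lowest_latency(details: dict) -> Optional[dict]:
    """Return the performance entry with the lowest latency, if any."""
    # Latency can be null, so those entries are filtered out first.
    measured = [p for p in details["performance"] if p["latency"] is not None]
    return min(measured, key=lambda p: p["latency"], default=None)
```

`lowest_latency` returns `None` when no platform reports a latency, so callers should handle that case explicitly.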
### `GET /version`
Gets version information for the data. The following data is returned:
```json
{
"intelVersion": "M.m.p (Major.minor.patch version as reported by Intel)",
"intelUpdate": "yyyy-MM-dd (date of Intel's last update prior to scraping)",
"scrapeDate": "yyyy-MM-dd (date of last update)"
}
```
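Since the dates use the `yyyy-MM-dd` format, they can be parsed directly as ISO dates. A minimal Python sketch (the helper name is hypothetical) that reports how old the scraped data is:

```python
from datetime import date


def data_age_days(version: dict, today: date) -> int:
    """Number of days between the last scrape and the given date."""
    # yyyy-MM-dd is the ISO 8601 calendar date format.
    scraped = date.fromisoformat(version["scrapeDate"])
    return (today - scraped).days
```

Passing `date.today()` as the second argument gives the age relative to the current day.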