Single instruction, multiple data (SIMD) instructions, such as AVX on x64 processors, are designed to improve performance. In some areas, such as numerical computation, they offer significant performance gains. In other areas, they provide little gain or even degrade performance.
Do they improve the performance of Node.js? We ran some experiments.
Compiling Node.js
We first built two Node.js instances: one with SIMD enabled and one with SIMD disabled.
We ran the experiments on Debian Bookworm, but the compile commands should work on any POSIX system with minor adaptations. Our CPU was an AMD Ryzen 5 8400F, which is an AMD Zen 4 CPU, and we had 64 GB of RAM.
First, we installed nvm.
Compiling Node.js with SIMD Enabled
We used the following command to compile Node.js with SIMD instructions enabled:
CFLAGS='-march=znver4' CXXFLAGS='-march=znver4' CC='clang-16' CXX='clang++-16' nvm install -s 22.11
CFLAGS='-march=znver4' CXXFLAGS='-march=znver4'
: Specify AMD Zen 4 as the target CPU. (See march.) The compiler may use any instruction that an AMD Zen 4 CPU supports, including AVX instructions.
CC='clang-16' CXX='clang++-16'
: Specify Clang 16 as the compiler.
Compiling Node.js with SIMD Disabled
We used the following command to compile Node.js with SIMD instructions disabled:
CFLAGS='-march=x86-64 -mtune=znver4' CXXFLAGS='-march=x86-64 -mtune=znver4' CC='clang-16' CXX='clang++-16' nvm install -s 22.11
CFLAGS='-march=x86-64 -mtune=znver4' CXXFLAGS='-march=x86-64 -mtune=znver4'
: Specify the generic x86-64 as the instruction set, but optimize for the AMD Zen 4 microarchitecture. (See mtune.) We optimized for AMD Zen 4 to make the benchmarks more comparable: the SIMD-disabled Node.js should be as close as possible to the SIMD-enabled Node.js, except for the absence of SIMD instructions.
CC='clang-16' CXX='clang++-16'
: Specify Clang 16 as the compiler.
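As a sanity check on the two flag sets, one can ask the compiler which SIMD feature macros each -march value predefines. The sketch below was not part of the original experiment. simd_macros is a hypothetical helper name, and the check assumes clang-16 is on the PATH (it is skipped otherwise).

```shell
# simd_macros: reads "#define NAME VALUE" lines on stdin (as printed by
# `clang -dM -E`) and prints the names of SSE/SSSE/AVX/FMA feature macros.
# Hypothetical helper for illustration, not taken from the original setup.
simd_macros() {
  awk '$1 == "#define" && $2 ~ /^__(SSE|SSSE|AVX|FMA)[A-Z0-9_]*__$/ { print $2 }' | sort
}

# Compare the two -march values used above, if clang-16 is installed.
if command -v clang-16 >/dev/null 2>&1; then
  for arch in x86-64 znver4; do
    echo "== -march=$arch"
    clang-16 -march="$arch" -dM -E -x c /dev/null | simd_macros
  done
fi
```

Since SSE and SSE2 belong to the x86-64 baseline, the x86-64 list would be expected to still contain __SSE__ and __SSE2__. Read that way, "SIMD disabled" in this article means "no AVX or other extensions beyond the baseline" rather than "no vector instructions at all".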
Notes
We used znver4 instead of native for better transparency and reproducibility.
We used Clang 16 as the compiler because the default compiler in Debian Bookworm was GCC 12.2.0, which did not support full optimization for the AMD Zen 4 microarchitecture. We did not use a later version of GCC because Clang 16 was readily available in the Debian Bookworm repository, while a later version of GCC would have required us to compile the compiler ourselves.
If you’d like to run the benchmarks on your own system, you can replace znver4 with native. You may also not need to specify the C/C++ compiler if the default compiler works. Also, feel free to replace 22.11 with the Node.js version you are interested in.
Types of Tasks Compared
We compared two kinds of tasks on well-known open source projects: running unit tests and generating documentation. Both tasks can benefit from performance improvements in Node.js, and improved performance results in a more pleasant development experience.
We did not test typical Node.js server applications, since the performance bottlenecks of these applications usually lie outside of Node.js, such as in databases and networks.
Running Unit Tests
We ran the unit tests of three projects with Node.js compiled with and without SIMD. The results are shown below:
Project | with SIMD | w/o SIMD |
---|---|---|
React main branch (16d2bb) | 34.62s | 34.36s |
TypeScript v5.7.2 | 2m 3.2s | 2m 2.3s |
Vitest v2.1.8 | 14.98s | 15.04s |
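We can see that there wasn’t much difference in performance: the running times differed by less than 1% in every project, and the SIMD-enabled build was marginally slower in two of the three.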
Notes
We used the React main branch at commit 16d2bbbd1f1617d636ea0fd271b902a12a763c27 instead of v18.3.1, the latest stable version at the time of writing, because v18.3.1 required an older version of Node.js for running unit tests.
Test Instructions:
- React
- TypeScript
- Vitest, with pnpm run test:ci
For each measurement, we ran the tests twice and used the running time of the second run. This minimized the impact of reading files from disk, since the OS kernel could serve the files from its cache in RAM.
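The run-twice procedure can be sketched as a small shell helper. This is an illustration only, not the harness used for the numbers above. time_second_run is a hypothetical name, and the sketch assumes GNU date (for the %N nanosecond format), as found on Debian.

```shell
# time_second_run CMD [ARGS...]
# Runs the command once to warm the OS file cache, then runs it again and
# prints the wall-clock duration of the second run in seconds.
# Hypothetical helper; assumes GNU date, whose %N prints nanoseconds.
time_second_run() {
  "$@" >/dev/null 2>&1 || true    # warm-up run; its timing is discarded
  start_ns=$(date +%s%N)
  "$@" >/dev/null 2>&1 || true    # measured run
  end_ns=$(date +%s%N)
  awk -v ns="$((end_ns - start_ns))" 'BEGIN { printf "%.2f\n", ns / 1e9 }'
}

# Example usage (inside a project checkout):
#   time_second_run pnpm run test:ci
```

The helper discards the command's output and exit status, so it only measures time. A failing test suite would need to be checked separately.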
Generating Documentation
We generated the documentation of two projects with Node.js compiled with and without SIMD. The results are shown below:
Project | with SIMD | w/o SIMD |
---|---|---|
Vite v6.0.2 | 11.76s | 11.88s |
Docusaurus v3.6.3 | 468.39s | 471.55s |
We can see that there wasn’t much difference in performance here either: the SIMD-enabled build was faster in both projects, but only by about 1% or less.
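To put a number on "not much difference", the relative difference between the two builds can be computed from the tables above. The following is a minimal sketch. rel_diff is a hypothetical helper name, and the TypeScript times are converted to seconds by hand.

```shell
# rel_diff WITH_SIMD WITHOUT_SIMD
# Prints how much slower (+) or faster (-) the SIMD build was, as a
# percentage of the non-SIMD time. Both arguments are durations in seconds.
rel_diff() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.2f%%\n", (a - b) / b * 100 }'
}

rel_diff 34.62 34.36     # React unit tests
rel_diff 123.2 122.3     # TypeScript unit tests (2m 3.2s vs 2m 2.3s)
rel_diff 14.98 15.04     # Vitest unit tests
rel_diff 11.76 11.88     # Vite documentation
rel_diff 468.39 471.55   # Docusaurus documentation
```

For these five measurements the result ranges from about -1.01% (Vite) to +0.76% (React), with no consistent direction across projects.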
Notes
- Generated Vite documentation with pnpm run docs-build.
- Generated Docusaurus documentation with yarn run build:website.
Conclusion
We compiled two Node.js instances, one with SIMD enabled and one disabled. We compared their performance in running unit tests and generating documentation of some well-known open source projects. It turned out there wasn’t much difference in the performance of the tasks that we compared.
Implications
Based on our experiments, programs running on Node.js generally do not seem to benefit from SIMD instructions (at least on x64 CPUs). Unless there is a specific justification, such as a numerically heavy workload, it is unnecessary to rebuild Node.js just to enable SIMD instructions.
Afterthought
- One possible reason for the lack of difference is AVX downclocking.
- A brief scan showed that the node binary without SIMD still contained ~4500 AVX instructions, even though it was built for the generic x64. (The node binary with SIMD contained more than 29000.) It’s unclear where those AVX instructions originated. I speculate that Node.js and/or its dependencies contain some runtime code path selection based on the CPU microarchitecture. Therefore, Node.js and/or its dependencies might have already been optimized with AVX instructions where needed.
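One way to make such a scan is to disassemble the binary with objdump and count lines that look like AVX instructions. The sketch below is a rough heuristic and not necessarily how the counts above were obtained. count_avx is a hypothetical helper name, and it assumes objdump's default AT&T syntax on x86-64.

```shell
# count_avx: reads objdump disassembly (AT&T syntax) on stdin and counts
# lines whose mnemonic starts with "v" and that reference an xmm, ymm or
# zmm register. This is a heuristic: it also counts AVX-512 instructions
# and misses register-less ones such as vzeroupper.
# Hypothetical helper for illustration.
count_avx() {
  grep -cE '[[:space:]]v[a-z0-9]+[[:space:]].*%[xyz]mm[0-9]'
}

# Example usage (disassembling the whole node binary can take a while):
#   objdump -d --no-show-raw-insn "$(command -v node)" | count_avx
```

Running it on both builds would show whether the two binaries differ by roughly the same ratio as reported above, though the absolute numbers depend on the heuristic.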