Open access research
publication repository

Scanning HTML at Tens of Gigabytes per Second on ARM Processors [r-libre/3616]

Lemire, Daniel (In Press). Scanning HTML at Tens of Gigabytes per Second on ARM Processors. Software: Practice and Experience.

File(s) available for this item:
PDF - simdhtml-6.pdf
Content : Draft Version
License : Creative Commons Attribution.
 
Item Type: Journal Articles
Refereed: Yes
Status: In Press
Abstract: Modern processors have instructions to process 16 bytes or more at once. These instructions are called SIMD, for single instruction, multiple data. Recent advances have leveraged SIMD instructions to accelerate parsing of common Internet formats such as JSON and base64. During HTML parsing, they quickly identify specific characters with a strategy called vectorized classification. We review their techniques and compare them with a faster alternative. We measure a 20-fold performance improvement in HTML scanning compared to traditional methods on recent ARM processors. Our findings highlight the potential of SIMD-based algorithms for optimizing Web browser performance.
Depositor: Lemire, Daniel
Owner / Manager: Daniel Lemire
Deposited: 03 Mar 2025 18:36
Last Modified: 11 Mar 2025 20:48
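
The vectorized classification mentioned in the abstract can be illustrated with a small sketch. It is not the paper's code. It is a portable, scalar C emulation of the usual two-table nibble lookup, which a SIMD version would apply to 16 bytes at once with one table-lookup instruction per table. The set of target characters (`<`, `&`, carriage return, and the null byte), the table contents, and the names `LOW_TABLE`, `HIGH_TABLE`, `is_target`, `classify16`, and `scan_html` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignment, one bit per target character:
 *   bit 0: '<'  (0x3C)   bit 1: '&'  (0x26)
 *   bit 2: '\r' (0x0D)   bit 3: '\0' (0x00)
 * A byte is a target when the entry for its low nibble and the entry
 * for its high nibble share at least one bit. */
static const uint8_t LOW_TABLE[16] = {
    [0x0] = 8, [0x6] = 2, [0xC] = 1, [0xD] = 4
};
static const uint8_t HIGH_TABLE[16] = {
    [0x0] = 4 | 8, [0x2] = 2, [0x3] = 1
};

/* Classify a single byte with two 16-entry lookups and one AND. */
static inline int is_target(uint8_t b) {
    return (LOW_TABLE[b & 0x0F] & HIGH_TABLE[b >> 4]) != 0;
}

/* Classify a 16-byte block: bit i of the result is set when block[i]
 * is a target. A SIMD implementation would compute this mask without
 * a per-byte loop; the loop here only emulates that behavior. */
static uint16_t classify16(const uint8_t *block) {
    uint16_t mask = 0;
    for (int i = 0; i < 16; i++) {
        mask |= (uint16_t)(is_target(block[i]) << i);
    }
    return mask;
}

/* Return the index of the first target character, or len if none. */
size_t scan_html(const char *input, size_t len) {
    assert(input != NULL || len == 0);
    const uint8_t *p = (const uint8_t *)input;
    size_t i = 0;
    /* Main loop: one 16-byte block per iteration. */
    for (; i + 16 <= len; i += 16) {
        uint16_t mask = classify16(p + i);
        if (mask != 0) {
            size_t offset = 0;
            while ((mask & 1) == 0) { /* locate the lowest set bit */
                mask >>= 1;
                offset++;
            }
            return i + offset;
        }
    }
    /* Tail: fewer than 16 bytes remain. */
    for (; i < len; i++) {
        if (is_target(p[i])) {
            return i;
        }
    }
    return len;
}
```

Under these assumptions, calling `scan_html(buffer, length)` returns the offset of the first character that needs special handling, so a caller can copy or skip everything before that offset in bulk. Giving each target character its own bit keeps the lookup exact: a byte that shares only a low nibble or only a high nibble with a target produces a zero AND.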
