Open access research
publication repository
publication repository
Klarqvist, Marcus D. R.; Muła, Wojciech, & Lemire, Daniel (In Press). Efficient Computation of Positional Population Counts Using SIMD Instructions. Concurrency and Computation: Practice and Experience. https://doi.org/10.1002/cpe.6304
File(s) available for this item:
PDF
 shortpospopcnt4.pdf
Content : Submitted Version License : Creative Commons Attribution. 

Item Type:  Journal Articles 

Refereed:  Yes 
Status:  In Press 
Abstract:  In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as onehot encoded vectors. For example, given 8 distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of kbit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, ..., k1. If the kbit words are onehot encoded then the sums correspond to a frequency histogram. This multiplesum problem is a generalization of the populationcount problem where we seek the sum of all bit values. Accordingly, we refer to the multiplesum problem as a positional populationcount. Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16bit position population count using less than half of a CPU cycle per 16bit word. Our best approach uses up to 400 times fewer instructions and is up to 50 times faster than baseline code using only regular (nonSIMD) instructions, for sufficiently large inputs. 
Depositor:  Lemire, Daniel 
Owner / Manager:  Daniel Lemire 
Deposited:  19 Mar 2021 12:54 
Last Modified:  04 May 2021 23:47 
RÉVISER 