Simple but slow when table grows say 10M+?
ORDER BY random();
Faster alternatives. Two algorithms. SYSTEM
is more performant than BERNOULLI
, but BERNOULLI
has better randomness.
-- get 1% of row by SYSTEM; 0.1 for 0.1% row
SELECT * FROM user_sessions TABLESAMPLE SYSTEM(1)
-- get 1% of row by BERNOULLI; 0.1 for 0.1% row
SELECT * FROM user_sessions TABLESAMPLE BERNOULLI(1)
SYSTEM is sampling data blocks. BERNOULLI is sampling table rows.
When you see query plan, you will see QUERY PLAN | Sample Scan
.
Also checks out additional module tsm_system_rows that you can do:
SELECT * FROM user_sessions TABLESAMPLE SYSTEM_ROWS(100);
that will give you 100 random rows from your table!