
    private static class BucketInteger extends Bucket<Integer>
        implements SerializableFunction<Integer, Integer> {

      private BucketInteger(int numBuckets) {
        super(numBuckets);
      }

      @Override
      protected int hash(Integer value) {
        return BucketUtil.hash(value);
      }
    }

Why do we use Murmur3 instead of the int value itself?
Test code:

    Bucket bucket = Bucket.get(Types.IntegerType.get(), 10);
    List<Integer> list = new ArrayList<>();
    List<Integer> randomList = new ArrayList<>();
    Random rand = new Random();
    for (int i = 0; i < 10000000; i++) {
      int num = rand.nextInt(10000);
      randomList.add(bucket.apply(num));
      list.add(bucket.apply(i));
    }

    System.out.println("natural list");
    list.stream()
        .collect(Collectors.groupingBy(item -> item, Collectors.counting()))
        .forEach((key, value) -> System.out.println(key + ": " + value));

    System.out.println("random list");
    randomList.stream()
        .collect(Collectors.groupingBy(item -> item, Collectors.counting()))
        .forEach((key, value) -> System.out.println(key + ": " + value));

Result with Murmur3:

    natural list
    0: 1000996
    1: 999922
    2: 1000005
    3: 1000139
    4: 1001529
    5: 998785
    6: 999291
    7: 999777
    8: 999204
    9: 1000352
    random list
    0: 998404
    1: 1005602
    2: 1021090
    3: 1021668
    4: 996615
    5: 1046078
    6: 1023192
    7: 997321
    8: 923906
    9: 966124

Result using the original int value (no hash):

    natural list
    0: 1000000
    1: 1000000
    2: 1000000
    3: 1000000
    4: 1000000
    5: 1000000
    6: 1000000
    7: 1000000
    8: 1000000
    9: 1000000
    random list
    0: 1001911
    1: 997982
    2: 999024
    3: 1000518
    4: 999166
    5: 999954
    6: 1001863
    7: 1000033
    8: 997981
    9: 1001568

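
For anyone who wants to reproduce the comparison without Iceberg on the classpath, here is a rough, self-contained sketch of the two variants. It rests on several assumptions: `BucketSketch` and all of its method names are hypothetical, the Murmur3 routine is a hand-written x86 32-bit version (seed 0, the int widened to a long before hashing) and is not Iceberg's or Guava's code, and the "original value" variant is assumed to be a plain `Math.floorMod(value, numBuckets)` because the exact change used for that run is not shown above.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; not Iceberg's implementation.
final class BucketSketch {

    // Murmur3 x86 32-bit over the 8 little-endian bytes of a long, seed 0.
    static int murmur3Long(long input) {
        int h1 = 0; // seed
        h1 = mixH1(h1, mixK1((int) input));          // low 4 bytes
        h1 = mixH1(h1, mixK1((int) (input >>> 32))); // high 4 bytes
        return fmix(h1, 8);                          // 8 = input length in bytes
    }

    static int mixK1(int k1) {
        k1 *= 0xcc9e2d51;
        k1 = Integer.rotateLeft(k1, 15);
        k1 *= 0x1b873593;
        return k1;
    }

    static int mixH1(int h1, int k1) {
        h1 ^= k1;
        h1 = Integer.rotateLeft(h1, 13);
        return h1 * 5 + 0xe6546b64;
    }

    static int fmix(int h1, int length) {
        h1 ^= length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    // Hashed variant: clear the sign bit, then take the remainder.
    static int hashedBucket(int value, int numBuckets) {
        return (murmur3Long(value) & Integer.MAX_VALUE) % numBuckets;
    }

    // Assumed "original value" variant: the int itself, reduced modulo numBuckets.
    static int identityBucket(int value, int numBuckets) {
        return Math.floorMod(value, numBuckets);
    }

    // Counts how many of the values 0..n-1 land in each bucket.
    static long[] countSequential(int n, int numBuckets, boolean hashed) {
        long[] counts = new long[numBuckets];
        for (int i = 0; i < n; i++) {
            int b = hashed ? hashedBucket(i, numBuckets) : identityBucket(i, numBuckets);
            counts[b]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("hashed:   " + Arrays.toString(countSequential(1_000_000, 10, true)));
        System.out.println("identity: " + Arrays.toString(countSequential(1_000_000, 10, false)));
    }
}
```

`countSequential` only covers the "natural list" case; the random case can be tried by feeding `rand.nextInt(10000)` into the two bucket functions instead of `i`.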