CF 76E - Points
We are given up to one hundred thousand points on a 2D plane. For every unordered pair of points, we must compute the squared Euclidean distance between them and add all those values together.
Rating: 1700
Tags: implementation, math
Solve time: 1m 53s
Verified: yes
Solution
Problem Understanding
We are given up to one hundred thousand points on a 2D plane. For every unordered pair of points, we must compute the squared Euclidean distance between them and add all those values together.
For two points (x1, y1) and (x2, y2), the squared distance is
$(x_1-x_2)^2 + (y_1-y_2)^2$
The task is not asking for the actual Euclidean distance, only the square of it. That matters because squared distances expand into algebraic expressions that can be summed efficiently.
The input size immediately rules out pairwise enumeration. With N = 100000, the number of pairs is about
$\frac{N(N-1)}{2}$
which is roughly five billion pairs in the worst case. Even a very small amount of work per pair would exceed the time limit.
The coordinates themselves are small, between -10000 and 10000, but that does not help enough to justify any quadratic solution. The correct approach must run close to linear time.
There are several easy-to-miss edge cases.
If multiple points coincide, their contribution should be zero. For example:
3
1 1
1 1
1 1
Every pair has distance 0, so the answer is:
0
A careless implementation that assumes all points are distinct could accidentally double count or mishandle duplicates.
Negative coordinates also matter because terms like (x1 - x2)^2 remain positive after squaring. For example:
2
-3 -4
3 4
The squared distance is:
(6)^2 + (8)^2 = 100
Any approach relying on absolute values without squaring carefully can produce incorrect results.
Another subtle issue is integer overflow in languages with fixed-size integers. Suppose all coordinates are near 10000 and N is large. The final answer can become very large, well beyond 32-bit integer range. Python handles this automatically, but in C++ this requires long long.
Approaches
The brute-force approach is straightforward. Iterate over every pair of points, compute the squared distance directly, and add it to the answer.
That means evaluating:
$(x_i-x_j)^2 + (y_i-y_j)^2$
for every i < j.
This works because it exactly matches the problem definition. The issue is scale. With 100000 points, there are about five billion pairs, and each pair requires several arithmetic operations. That is far too slow for a one second limit.
The key observation is that the x and y coordinates are independent inside the squared distance formula. Expanding the expression gives:
$(x_i-x_j)^2 = x_i^2 + x_j^2 - 2x_ix_j$
and similarly for the y coordinates.
So instead of thinking about pairs geometrically, we can think about how many times each coordinate contributes to the final sum.
Consider only the x part first:
$\sum_{i<j}(x_i-x_j)^2$
Expanding all terms:
$\sum_{i<j}(x_i^2+x_j^2-2x_ix_j)$
Every x_i^2 appears exactly N-1 times across all pairs. The cross terms can be rewritten using the square of the total sum:
$\left(\sum x_i\right)^2 = \sum x_i^2 + 2\sum_{i<j}x_ix_j$
Rearranging gives:
$\sum_{i<j}(x_i-x_j)^2 = N\sum x_i^2 - \left(\sum x_i\right)^2$
The same formula works independently for y coordinates.
That transforms the problem from pairwise computation into maintaining only four aggregate values:
sum_x
sum_y
sum_x2
sum_y2
After reading all points once, we can compute the answer directly.
| Approach | Time Complexity | Space Complexity | Verdict |
|---|---|---|---|
| Brute Force | O(N²) | O(1) | Too slow |
| Optimal | O(N) | O(1) | Accepted |
Algorithm Walkthrough
- Read the number of points
N. - Initialize four accumulators:
sum_x, sum_y, sum_x2, and sum_y2.
3. For each point (x, y):
- Add
xtosum_x. - Add
ytosum_y. - Add
x * xtosum_x2. - Add
y * ytosum_y2.
These values are enough because the final formula depends only on sums and sums of squares. 4. Compute the total contribution from x coordinates:
$N\sum x_i^2 - \left(\sum x_i\right)^2$
- Compute the total contribution from y coordinates:
$N\sum y_i^2 - \left(\sum y_i\right)^2$
- Add the two contributions together and print the result.
Why it works
The algorithm relies on algebraic expansion of squared differences. Every pairwise squared distance can be decomposed into independent x and y components. When all expanded terms are summed together, the quadratic pair structure collapses into expressions involving only global sums and sums of squares.
The identity
$\sum_{i<j}(a_i-a_j)^2 = N\sum a_i^2 - \left(\sum a_i\right)^2$
holds for any sequence of numbers. Applying it separately to x coordinates and y coordinates exactly reconstructs the total sum of squared distances over all pairs.
Python Solution
import sys
input = sys.stdin.readline
def solve():
n = int(input())
sum_x = 0
sum_y = 0
sum_x2 = 0
sum_y2 = 0
for _ in range(n):
x, y = map(int, input().split())
sum_x += x
sum_y += y
sum_x2 += x * x
sum_y2 += y * y
ans = n * sum_x2 - sum_x * sum_x
ans += n * sum_y2 - sum_y * sum_y
print(ans)
solve()
The first part of the code reads all points while maintaining four aggregate values. We never store the points themselves because the final formula depends only on totals.
The expressions x * x and y * y compute squared coordinates directly. Using multiplication instead of exponentiation is slightly faster and clearer in competitive programming code.
The final computation mirrors the mathematical derivation exactly. First we compute the total x contribution, then the y contribution, and finally add them together.
One subtle point is that the formula already counts every unordered pair exactly once. There is no division by two anywhere in the implementation because the algebraic identity already accounts for pair counting correctly.
Python integers automatically expand to arbitrary precision, so overflow is not a concern here.
Worked Examples
Example 1
Input:
4
1 1
-1 -1
1 -1
-1 1
| Point | sum_x | sum_y | sum_x2 | sum_y2 |
|---|---|---|---|---|
| (1, 1) | 1 | 1 | 1 | 1 |
| (-1, -1) | 0 | 0 | 2 | 2 |
| (1, -1) | 1 | -1 | 3 | 3 |
| (-1, 1) | 0 | 0 | 4 | 4 |
Now compute:
| Expression | Value |
|---|---|
4 * 4 - 0² |
16 |
4 * 4 - 0² |
16 |
| Total | 32 |
Output:
32
This example shows how symmetry causes the coordinate sums to cancel out. The entire answer comes from the sums of squares.
Example 2
Input:
3
0 0
1 0
3 0
Pairwise squared distances are:
(0,1) -> 1
(0,3) -> 9
(1,3) -> 4
Total:
14
Trace:
| Point | sum_x | sum_y | sum_x2 | sum_y2 |
|---|---|---|---|---|
| (0, 0) | 0 | 0 | 0 | 0 |
| (1, 0) | 1 | 0 | 1 | 0 |
| (3, 0) | 4 | 0 | 10 | 0 |
Now compute:
| Expression | Value |
|---|---|
3 * 10 - 4² |
14 |
3 * 0 - 0² |
0 |
| Total | 14 |
This example demonstrates that the formula also works when all points lie on a single axis.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N) | Each point is processed once |
| Space | O(1) | Only a few accumulator variables are stored |
The solution comfortably fits the limits. Processing one hundred thousand points with constant work per point is easily fast enough for a one second limit.
Test Cases
# helper: run solution on input string, return output string
import sys
import io
def solve():
input = sys.stdin.readline
n = int(input())
sum_x = 0
sum_y = 0
sum_x2 = 0
sum_y2 = 0
for _ in range(n):
x, y = map(int, input().split())
sum_x += x
sum_y += y
sum_x2 += x * x
sum_y2 += y * y
ans = n * sum_x2 - sum_x * sum_x
ans += n * sum_y2 - sum_y * sum_y
print(ans)
def run(inp: str) -> str:
sys.stdin = io.StringIO(inp)
out = io.StringIO()
old_stdout = sys.stdout
sys.stdout = out
solve()
sys.stdout = old_stdout
return out.getvalue().strip()
# provided sample
assert run(
"""4
1 1
-1 -1
1 -1
-1 1
"""
) == "32", "sample 1"
# minimum size
assert run(
"""1
5 7
"""
) == "0", "single point"
# duplicate points
assert run(
"""3
2 2
2 2
2 2
"""
) == "0", "all points identical"
# points on a line
assert run(
"""3
0 0
1 0
3 0
"""
) == "14", "line case"
# negative coordinates
assert run(
"""2
-3 -4
3 4
"""
) == "100", "negative coordinates"
# mixed coordinates
assert run(
"""4
0 0
1 2
-1 3
4 -2
"""
) == "74", "general case"
| Test input | Expected output | What it validates |
|---|---|---|
| Single point | 0 | No pairs exist |
| All points identical | 0 | Duplicate handling |
| Points on a line | 14 | Independent x/y contributions |
| Negative coordinates | 100 | Correct squaring with negatives |
| Mixed coordinates | 74 | General correctness |
Edge Cases
Consider the case where all points are identical:
3
2 2
2 2
2 2
During processing:
| Point | sum_x | sum_y | sum_x2 | sum_y2 |
|---|---|---|---|---|
| (2,2) | 2 | 2 | 4 | 4 |
| (2,2) | 4 | 4 | 8 | 8 |
| (2,2) | 6 | 6 | 12 | 12 |
Now compute:
3 * 12 - 6² = 36 - 36 = 0
for both coordinates, so the total answer is 0.
This works because identical coordinates make the square-of-sums term exactly cancel the multiplied sum-of-squares term.
Now consider negative coordinates:
2
-3 -4
3 4
The accumulators become:
| Point | sum_x | sum_y | sum_x2 | sum_y2 |
|---|---|---|---|---|
| (-3,-4) | -3 | -4 | 9 | 16 |
| (3,4) | 0 | 0 | 18 | 32 |
Final computation:
2 * 18 - 0² = 36
2 * 32 - 0² = 64
Total:
36 + 64 = 100
The squaring naturally removes the sign, matching the geometric distance formula exactly.
Finally, consider the smallest possible input:
1
5 7
There are no pairs at all. The formula still behaves correctly:
1 * 25 - 5² = 0
1 * 49 - 7² = 0
So the answer becomes 0 without requiring any special-case handling.