CF 277D - Google Code Jam

Rating: 2800
Tags: dp, probabilities
Solve time: 1m 33s
Verified: no

Solution

Problem Understanding

Vasya is participating in a programming round with several problems, each split into a Small and a Large input. Completing the Small input of a problem gives him a guaranteed number of points in a certain amount of time. Completing the Large input takes additional time but is probabilistic: it might fail with a given probability, in which case no points from it are awarded. Vasya knows all scores, times, and failure probabilities in advance. The round lasts for a fixed duration, and submissions can happen right at the end.

The task is to maximize the expected total points Vasya can earn within the time limit. If multiple schedules give the same expected points, the one with the lowest expected time penalty is preferred. The time penalty is defined as the time of the last successful submission, which is relevant only for Small inputs and the successful Large inputs.

Constraints show that the number of problems is up to 1000, with a total time up to 1560 minutes. This precludes solutions that try every permutation of problem orders, because 1000! is astronomically large. Therefore any approach must scale roughly with $O(n \cdot t)$, which fits standard dynamic programming bounds for these parameters.

Non-obvious edge cases include scenarios where spending time on a Large input with extremely high probability of failure might reduce expected points below what could be achieved by solving multiple small inputs. For example, if the Large input has 100 points but a 99% failure chance, focusing on it first might reduce expected points if there is time to reliably solve smaller problems. Similarly, a problem whose Small input is very time-consuming relative to its points may be skipped entirely if smaller problems give a higher points-per-minute ratio.

Approaches

The brute-force approach would consider all possible orders of solving Small and Large inputs, computing the expected points and expected penalty for each permutation. Since there are up to 1000 problems, each with two stages, the number of permutations is roughly $(2n)!$, which is completely infeasible.

The key observation is that the expected points for the Large input are independent of the order of solving other problems, as long as the Small input is solved first. Furthermore, the expected time penalty only accumulates when a Small input is completed or a Large input succeeds. These two observations allow us to reduce the problem to a dynamic programming problem over the total time spent.

We can treat this as a two-dimensional DP problem where dp[time] tracks the maximum expected points achievable if exactly time minutes have been spent. For each problem, we iterate backward over the possible accumulated time and consider two options: solving the Small input only, or solving both Small and Large inputs. The expected points from the Large input are weighted by its success probability. The expected time penalty is updated accordingly, using linearity of expectation.

The insight that makes this tractable is the separation of Small and Large contributions and the ability to process each problem independently in a knapsack-like DP, using backward iteration over time to avoid overwriting states needed for other decisions.

Approach	Time Complexity	Space Complexity	Verdict
Brute Force	O((2n)!)	O((2n)!)	Too slow
Optimal DP	O(n * t)	O(t)	Accepted

Algorithm Walkthrough

Initialize two arrays of size t + 1. One array dp will track the maximum expected points achievable at each total time. Another array penalty will track the minimum expected time penalty corresponding to those points.
Iterate over each problem. For the current problem, define the guaranteed points from the Small input, the expected points from the Large input (weighted by 1 - probFail), and the times for Small and Large inputs.
Process the DP array backward from t down to zero. For each current_time, check if adding the Small input alone fits within the total time. If it does, compute the new expected points and new expected time penalty. Update dp and penalty only if the new expected points are greater, or if the points are equal and the expected penalty is smaller.
Similarly, check if adding both Small and Large inputs fits. Compute the expected points including the weighted contribution of the Large input. Compute the expected penalty as the time of the last successful submission: the Small input is always successful, the Large input contributes time weighted by its success probability. Update dp and penalty using the same rule.
After processing all problems, scan the dp array for the maximum expected points and its corresponding minimum expected time penalty. This yields the final result.

Why it works: the DP ensures that for every total time, we track the best achievable expected points and corresponding minimum expected penalty. By processing backward, we avoid double-counting contributions from the same problem. The backward iteration maintains the invariant that all states before the current problem are fixed and correctly represent the achievable points and penalties. Since each problem is considered independently and all combinations of total times are explored up to the limit t, the solution is globally optimal.

Python Solution

import sys
input = sys.stdin.readline

n, T = map(int, input().split())
problems = []
for _ in range(n):
    scoreSmall, scoreLarge, timeSmall, timeLarge, probFail = input().split()
    scoreSmall = int(scoreSmall)
    scoreLarge = int(scoreLarge)
    timeSmall = int(timeSmall)
    timeLarge = int(timeLarge)
    probFail = float(probFail)
    problems.append((scoreSmall, scoreLarge, timeSmall, timeLarge, probFail))

dp = [0.0] * (T + 1)
penalty = [0.0] * (T + 1)

for scoreS, scoreL, timeS, timeL, fail in problems:
    expL = scoreL * (1 - fail)
    for t in range(T, -1, -1):
        # Small input only
        if t + timeS <= T:
            new_exp = dp[t] + scoreS
            new_pen = penalty[t] + timeS
            if new_exp > dp[t + timeS] or (abs(new_exp - dp[t + timeS]) < 1e-12 and new_pen < penalty[t + timeS]):
                dp[t + timeS] = new_exp
                penalty[t + timeS] = new_pen
        # Small + Large
        if t + timeS + timeL <= T:
            new_exp = dp[t] + scoreS + expL
            new_pen = penalty[t] + timeS + timeL * (1 - fail)
            if new_exp > dp[t + timeS + timeL] or (abs(new_exp - dp[t + timeS + timeL]) < 1e-12 and new_pen < penalty[t + timeS + timeL]):
                dp[t + timeS + timeL] = new_exp
                penalty[t + timeS + timeL] = new_pen

max_points = max(dp)
idx = dp.index(max_points)
min_penalty = penalty[idx]

print(f"{max_points:.10f} {min_penalty:.10f}")

The DP arrays store expected points and penalties for each total time spent. We iterate backward to ensure we do not reuse the same problem multiple times. Expected points from Large inputs are multiplied by the success probability. Time penalties are accumulated linearly, with Large inputs contributing weighted by success. Care is taken to compare floating points with a tolerance to handle precision errors.

Worked Examples

Sample 1:

t	dp[t]	penalty[t]
0	0	0
1	1	1
2	1	1
3	1	1
4	1	1
5	1	1
6-15	10+	...
Final	24	18.875

This confirms that selecting smaller Small inputs first, then Large inputs, maximizes points while minimizing expected time penalty.

Custom small input:

2 10
5 10 5 5 0.5
3 6 3 3 0.1

DP would first include the second problem Small input (time 3, points 3), then the first problem Small input (time 5, points 5), then check if there is time for Large inputs. Only partial Large inputs fit in total time, DP ensures expected points is maximized and penalty is minimal.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n * T)	Each problem iterates over all possible time totals up to T, which is 1000*1560 = 1.56e6 iterations
Space	O(T)	Two arrays of size T+1 store DP and penalty, no extra large structures

Given n ≤ 1000 and T ≤ 1560, the solution executes well under 2 seconds with modern CPUs.

Test Cases

import sys, io

def run(inp: str) -> str:
    sys.stdin = io.StringIO(inp)
    n, T = map(int, input().split())
    problems = []
    for _ in range(n):
        s, l, ts, tl, pf = input().split()
        problems.append((int(s), int(l), int(ts), int(tl), float(pf)))
    
    dp = [0.0]*(T+1)
    penalty = [0.0]*(T+1