Kactl
Gracias Mateo
Antonio Mondejar, Pietro Palombini, Iván Renison
2025-04-20
1 Contest ... 1
2 Mathematics ... 1
  2.1 Equations ... 1
  2.2 Recurrences ... 1
  2.3 Trigonometry ... 1
  2.4 Geometry ... 1
  2.5 Derivatives/Integrals ... 2
  2.6 Sums ... 2
  2.7 Series ... 2
  2.8 Probability theory ... 2
  2.9 Markov chains ... 3
3 Data structures ... 3
4 Numerical ... 7
  4.1 Polynomials and recurrences ... 7
  4.2 Optimization ... 7
  4.3 Matrices ... 8
  4.4 Fourier transforms ... 10
5 Number theory ... 11
  5.1 Modular arithmetic ... 11
  5.2 Primality ... 12
  5.3 Divisibility ... 12
  5.4 Fractions ... 13
  5.5 Pythagorean Triples ... 13
  5.6 Primes ... 13
  5.7 Highly composite numbers ... 13
  5.8 Mobius Function ... 13
6 Combinatorial ... 13
  6.1 Permutations ... 13
  6.2 Partitions and subsets ... 14
  6.3 General purpose numbers ... 14
  6.4 Game theory ... 14
7 Graph ... 14
  7.1 Fundamentals ... 14
  7.2 Network flow ... 15
  7.7 Trees ... 19
  7.8 Math ... 22
8 Geometry ... 22
  8.1 Geometric primitives ... 23
  8.2 Circles ... 24
  8.3 Polygons ... 24
  8.4 Misc. Point Set Problems ... 26
  8.5 3D ... 27
9 Strings ... 27
10 Various ... 29
  10.1 Dates ... 29
  10.2 Intervals ... 29
  10.3 Misc. algorithms ... 30
  10.4 Dynamic programming ... 30
  10.5 Debugging tricks ... 30
  10.6 Optimization tricks ... 30

Contest (1)

template.cpp
18 lines
#include <bits/stdc++.h>
using namespace std;

#define fst first
#define snd second
#define pb push_back
#define fore(i, a, b) for (ll i = a, gmat = b; i < gmat; i++)
#define ALL(x) x.begin(), x.end()
#define SZ(x) (ll)(x).size()
#define mset(a, v) memset((a), (v), sizeof(a))
typedef long long ll;
typedef pair<ll, ll> ii;
typedef vector<ll> vi;

int main() {
  cin.tie(0)->sync_with_stdio(0);

}

hash.sh
3 lines
# Hashes a file, ignoring all whitespace and comments. Use for
# verifying that code was correctly typed.
cpp -dD -P -fpreprocessed | tr -d '[:space:]'| md5sum |cut -c-6

problemInteraction.sh
3 lines
# For interactive problems
mkfifo fifo
(./solution < fifo) | (./interactor > fifo)

tester.py
15 lines
system("./good < in > o")
system("./bad < in > o2")
x = open("o", "r").read().strip().split()
y = open("o2", "r").read().strip().split()
for i in range(len(x)):
    if (x[i] != y[i]):
        print("FAILED!!!!!", i + 1)
        exit(0)
print("ok", t + 1)

troubleshoot.txt
52 lines
Pre-submit:
Write a few simple test cases if sample is not enough.
Are time limits close? If so, generate max cases.
Is the memory usage fine?
Could anything overflow?
Make sure to submit the right file.

Wrong answer:
Print your solution! Print debug output, as well.
Are you clearing all data structures between test cases?
Can your algorithm handle the whole range of input?
Read the full problem statement again.
Do you handle all corner cases correctly?
Have you understood the problem correctly?
Any uninitialized variables?
Any overflows?
Confusing N and M, i and j, etc.?
Are you sure your algorithm works?
What special cases have you not thought of?
Are you sure the STL functions you use work as you think?
Add some assertions, maybe resubmit.
Create some testcases to run your algorithm on.
Go through the algorithm for a simple case.
Go through this list again.
Explain your algorithm to a teammate.
Ask the teammate to look at your code.
Go for a small walk, e.g. to the toilet.
Is your output format correct? (including whitespace)
Rewrite your solution from the start or let a teammate do it.

Runtime error:
Have you tested all corner cases locally?
Any uninitialized variables?
Are you reading or writing outside the range of any vector?
Any assertions that might fail?
Any possible division by 0? (mod 0 for example)
Any possible infinite recursion?
Invalidated pointers or iterators?
Are you using too much memory?
Debug with resubmits (e.g. remapped signals, see Various).

Time limit exceeded:
Do you have any possible infinite loops?
What is the complexity of your algorithm?
Are you copying a lot of unnecessary data? (References)
How big is the input and output? (consider scanf)
Avoid vector, map. (use arrays/unordered_map)
What do your teammates think about your algorithm?

Memory limit exceeded:
What is the max amount of memory your algorithm should need?
Are you clearing all data structures between test cases?

2.1 Equations

$$ax^2 + bx + c = 0 \;\Rightarrow\; x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

The extremum is given by $x = -b/2a$.
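
As a small illustration of the quadratic formula in 2.1, the sketch below returns the real roots of ax^2 + bx + c = 0 and the extremum x = -b/2a. The names `quadRoots` and `extremum` are made up for this example and are not part of the notebook; the code assumes a != 0 and uses plain `double` arithmetic with no precision safeguards.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Real roots of a*x^2 + b*x + c = 0 in increasing order (assumes a != 0).
// Hypothetical helper, not part of the notebook.
std::vector<double> quadRoots(double a, double b, double c) {
  double disc = b * b - 4 * a * c;       // discriminant b^2 - 4ac
  if (disc < 0) return {};               // no real roots
  if (disc == 0) return {-b / (2 * a)};  // one double root
  double s = std::sqrt(disc);
  double x1 = (-b - s) / (2 * a), x2 = (-b + s) / (2 * a);
  if (x1 > x2) std::swap(x1, x2);        // keep increasing order when a < 0
  return {x1, x2};
}

// x-coordinate of the extremum (vertex) of a*x^2 + b*x + c.
double extremum(double a, double b) { return -b / (2 * a); }
```

For x^2 - 3x + 2 this gives the roots 1 and 2, with the extremum at x = 1.5. When b^2 is much larger than |4ac|, the subtraction can lose precision, so a contest solution may prefer a more careful variant.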
UNC template hash problemInteraction tester troubleshoot 2025-04-20 2
$$ax + by = e,\quad cx + dy = f \;\Rightarrow\; x = \frac{ed - bf}{ad - bc},\quad y = \frac{af - ec}{ad - bc}$$

In general, given an equation $Ax = b$, the solution to a variable $x_i$ is given by

$$x_i = \frac{\det A_i'}{\det A}$$

where $A_i'$ is $A$ with the $i$'th column replaced by $b$.

2.2 Recurrences

If $a_n = c_1 a_{n-1} + \cdots + c_k a_{n-k}$, and $r_1, \ldots, r_k$ are distinct roots of $x^k - c_1 x^{k-1} - \cdots - c_k$, there are $d_1, \ldots, d_k$ s.t.

$$a_n = d_1 r_1^n + \cdots + d_k r_k^n.$$

Non-distinct roots $r$ become polynomial factors, e.g. $a_n = (d_1 n + d_2) r^n$.

2.3 Trigonometry

2.4.2 Triangles

Side lengths: $a, b, c$
Semiperimeter: $p = \frac{a + b + c}{2}$
Area: $A = \sqrt{p(p - a)(p - b)(p - c)}$
Circumradius: $R = \frac{abc}{4A}$
Inradius: $r = \frac{A}{p}$
Length of median (divides triangle into two equal-area triangles): $m_a = \frac{1}{2}\sqrt{2b^2 + 2c^2 - a^2}$
Length of bisector (divides angles in two): $s_a = \sqrt{bc\left[1 - \left(\frac{a}{b + c}\right)^2\right]}$
Law of sines: $\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c} = \frac{1}{2R}$
Law of cosines: $a^2 = b^2 + c^2 - 2bc\cos\alpha$
Law of tangents: $\frac{a + b}{a - b} = \frac{\tan\frac{\alpha + \beta}{2}}{\tan\frac{\alpha - \beta}{2}}$

2.4.3 Quadrilaterals

With side lengths $a, b, c, d$, diagonals $e, f$, diagonals angle $\theta$, area $A$ and magic flux $F = b^2 + d^2 - a^2 - c^2$:

the polygon and B is the number of integer points on the boundary.

$y = r\sin\theta\sin\phi$, $\theta = \operatorname{acos}(z/\sqrt{x^2 + y^2 + z^2})$
$z = r\cos\theta$, $\phi = \operatorname{atan2}(y, x)$

2.5 Derivatives/Integrals

$\frac{d}{dx}\arcsin x = \frac{1}{\sqrt{1 - x^2}}$
$\frac{d}{dx}\arccos x = -\frac{1}{\sqrt{1 - x^2}}$
$\frac{d}{dx}\tan x = 1 + \tan^2 x$
$\frac{d}{dx}\arctan x = \frac{1}{1 + x^2}$
$\int \tan ax = -\frac{\ln|\cos ax|}{a}$
$\int x\sin ax = \frac{\sin ax - ax\cos ax}{a^2}$
$\int e^{-x^2} = \frac{\sqrt{\pi}}{2}\operatorname{erf}(x)$
$\int x e^{ax}\,dx = \frac{e^{ax}}{a^2}(ax - 1)$

Integration by parts:

$$\int_a^b f(x)g(x)\,dx = [F(x)g(x)]_a^b - \int_a^b F(x)g'(x)\,dx$$

2.6 Sums

$$c^a + c^{a+1} + \cdots + c^b = \frac{c^{b+1} - c^a}{c - 1},\quad c \neq 1$$

2.7 Series

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \ldots,\quad (-\infty < x < \infty)$$
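
The triangle formulas of 2.4.2 (semiperimeter, Heron's area, circumradius, inradius, median) can be checked numerically with a few lines of code. The sketch below is only an illustration: `TriangleInfo` and `triangleInfo` are hypothetical names, and the side lengths are assumed to form a non-degenerate triangle.

```cpp
#include <cassert>
#include <cmath>

// Quantities of a triangle with side lengths a, b, c (section 2.4.2).
// Struct and function names are hypothetical, chosen for this sketch.
struct TriangleInfo { double p, area, R, r, ma; };

TriangleInfo triangleInfo(double a, double b, double c) {
  TriangleInfo t;
  t.p = (a + b + c) / 2;                                        // semiperimeter
  t.area = std::sqrt(t.p * (t.p - a) * (t.p - b) * (t.p - c));  // Heron's formula
  t.R = a * b * c / (4 * t.area);                               // circumradius
  t.r = t.area / t.p;                                           // inradius
  t.ma = 0.5 * std::sqrt(2 * b * b + 2 * c * c - a * a);        // median to side a
  return t;
}
```

For the 3-4-5 right triangle this yields p = 6, A = 6, R = 2.5 (half the hypotenuse) and r = 1.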
2.8 Probability theory

Let $X$ be a discrete random variable with probability $p_X(x)$ of assuming the value $x$. It will then have an expected value (mean) $\mu = E(X) = \sum_x x\,p_X(x)$ and variance $\sigma^2 = V(X) = E(X^2) - (E(X))^2 = \sum_x (x - E(X))^2 p_X(x)$ where $\sigma$ is the standard deviation. If $X$ is instead continuous it will have a probability density function $f_X(x)$ and the sums above will instead be integrals with $p_X(x)$ replaced by $f_X(x)$.

Expectation is linear:

$$E(aX + bY) = aE(X) + bE(Y)$$

For independent $X$ and $Y$,

$$V(aX + bY) = a^2 V(X) + b^2 V(Y).$$

2.8.1 Discrete distributions

Binomial distribution

The number of successes in $n$ independent yes/no experiments, each of which yields success with probability $p$, is $\mathrm{Bin}(n, p)$, $n = 1, 2, \ldots$, $0 \le p \le 1$.

$$p(k) = \binom{n}{k} p^k (1 - p)^{n-k}$$

$$\mu = np,\quad \sigma^2 = np(1 - p)$$

Poisson distribution

$$p(k) = e^{-\lambda}\frac{\lambda^k}{k!},\quad k = 0, 1, 2, \ldots$$

$$\mu = \lambda,\quad \sigma^2 = \lambda$$

2.8.2 Continuous distributions

Uniform distribution

If the probability density function is constant between $a$ and $b$ and 0 elsewhere it is $U(a, b)$, $a < b$.

$$f(x) = \begin{cases} \frac{1}{b - a} & a < x < b \\ 0 & \text{otherwise} \end{cases}$$

$$\mu = \frac{a + b}{2},\quad \sigma^2 = \frac{(b - a)^2}{12}$$

Exponential distribution

The time between events in a Poisson process is $\mathrm{Exp}(\lambda)$, $\lambda > 0$.

$$f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

$$\mu = \frac{1}{\lambda},\quad \sigma^2 = \frac{1}{\lambda^2}$$

Normal distribution

Most real random values with mean $\mu$ and variance $\sigma^2$ are well described by $N(\mu, \sigma^2)$, $\sigma > 0$.

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

If $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent then

$$aX_1 + bX_2 + c \sim N(a\mu_1 + b\mu_2 + c,\; a^2\sigma_1^2 + b^2\sigma_2^2)$$

2.9 Markov chains

A Markov chain is ergodic if the asymptotic distribution is independent of the initial distribution. A finite Markov chain is ergodic iff it is irreducible and aperiodic (i.e., the gcd of cycle lengths is 1). $\lim_{k\to\infty} P^k = \mathbf{1}\pi$.

A Markov chain is an A-chain if the states can be partitioned into two sets $A$ and $G$, such that all states in $A$ are absorbing ($p_{ii} = 1$), and all states in $G$ lead to an absorbing state in $A$. The probability for absorption in state $i \in A$, when the initial state is $j$, is $a_{ij} = p_{ij} + \sum_{k\in G} a_{ik} p_{kj}$. The expected time until absorption, when the initial state is $i$, is $t_i = 1 + \sum_{k\in G} p_{ki} t_k$.

Data structures (3)

OrderStatisticTree.h
Description: A set (not multiset!) with support for finding the n'th element, and finding the index of an element. To get a map, change null_type.
Time: $O(\log N)$
ac5104, 16 lines
#include "ext/pb_ds/assoc_container.hpp"
using namespace __gnu_pbds;

template<class T>
using Tree = tree<T, null_type, less<T>, rb_tree_tag,
    tree_order_statistics_node_update>;

void example() {
  Tree<ll> t, t2; t.insert(8);
  auto it = t.insert(10).fst;
  assert(it == t.lower_bound(9));
  assert(t.order_of_key(10) == 1);
  assert(t.order_of_key(11) == 2);
  assert(*t.find_by_order(0) == 8);
  t.join(t2); // assuming T < T2 or T > T2, merge t2 into t
}

HashMap.h
Description: Hash map with mostly the same API as unordered_map, but ∼3x faster. Uses 1.5x memory. Initial capacity must be a power of 2 (if provided).
23cd06, 7 lines

SegmentTree.h
Description: Zero-indexed max-tree. Bounds are inclusive to the left and exclusive to the right. Can be changed by modifying T, f and neut.
Time: $O(\log N)$
921f55, 19 lines
struct Tree {
  typedef ll T;
  static constexpr T neut = LONG_LONG_MIN;
  T f(T a, T b) { return max(a, b); } // (any associative fn)
  vector<T> s; ll n;
  Tree(ll n = 0, T def = neut) : s(2*n, def), n(n) {}
  void upd(ll pos, T val) {
    for (s[pos += n] = val; pos /= 2;)
      s[pos] = f(s[pos * 2], s[pos * 2 + 1]);
  }
  T query(ll b, ll e) { // query [b, e)
    T ra = neut, rb = neut;
    for (b += n, e += n; b < e; b /= 2, e /= 2) {
      if (b % 2) ra = f(ra, s[b++]);
      if (e % 2) rb = f(s[--e], rb);
    }
    return f(ra, rb);
  }
};

LazySegmentTree.h
Description: Segment tree with ability to add values of large intervals, and compute the sum of intervals. Ranges are [s, e). Can be changed to other things.
Usage: Tree st(n);
st.init(x);
st.upd(s, e, v);
st.query(s, e);
Time: $O(\log N)$.
0fc20e, 53 lines
typedef ll T; typedef ll L; // T: data type, L: lazy type
const T tneut = 0; const L lneut = 0; // neutrals
T f(T a, T b) { return a + b; } // associative
T apply(T v, L l, ll s, ll e) { // new st according to lazy
  return v + l * (e - s); }
L comb(L a, L b) { return a + b; } // cumulative effect of lazy

struct Tree { // example: range sum with range addition
  ll n;
  vector<T> st;
  vector<L> lazy;
  Tree(ll n) : n(n), st(4*n, tneut), lazy(4*n, lneut) {}
  Tree(vector<T> &a) : n(SZ(a)), st(4*n), lazy(4*n, lneut) {
    init(1, 0, n, a);
  }
  void init(ll k, ll s, ll e, vector<T> &a) {
    lazy[k] = lneut;
    if (s + 1 == e) { st[k] = a[s]; return; }
    ll m = (s + e) / 2;
    init(2*k, s, m, a), init(2*k+1, m, e, a);
    st[k] = f(st[2*k], st[2*k+1]);
  }
  void push(ll k, ll s, ll e) {
    if (lazy[k] == lneut) return; // if neutral, nothing to do
    st[k] = apply(st[k], lazy[k], s, e);
    if (s + 1 < e) { // propagate to children
      lazy[2*k] = comb(lazy[2*k], lazy[k]);
      lazy[2*k+1] = comb(lazy[2*k+1], lazy[k]);
    }
    lazy[k] = lneut; // clear node lazy
  }
  void upd(ll k, ll s, ll e, ll a, ll b, L v) {
    push(k, s, e);
    if (s >= b || e <= a) return;
    if (s >= a && e <= b) {
      lazy[k] = comb(lazy[k], v); // accumulate lazy
      push(k, s, e);
      return;
    }
    ll m = (s + e) / 2;
    upd(2*k, s, m, a, b, v), upd(2*k+1, m, e, a, b, v);
    st[k] = f(st[2*k], st[2*k+1]);
  }
  T query(ll k, ll s, ll e, ll a, ll b) {
    if (s >= b || e <= a) return tneut;
    push(k, s, e);
    if (s >= a && e <= b) return st[k];
    ll m = (s + e) / 2;
    return f(query(2*k, s, m, a, b),query(2*k+1, m, e, a, b));
  }
  void upd(ll a, ll b, L v) { upd(1, 0, n, a, b, v); }
  T query(ll a, ll b) { return query(1, 0, n, a, b); }
};

MemoryLazySegmentTree.h
Description: Segment tree with ability to add or set values of large intervals, and compute max of intervals. Can be changed to other things. Use with a bump allocator for better performance, and SmallPtr or implicit indices to save memory.
Usage: Node* tr = new Node(v, 0, SZ(v));
Time: $O(\log N)$.
"../various/BumpAllocator.h" a87495, 53 lines
const ll inf = 1e18;
struct Node {
  typedef ll T; // data type
  struct L { ll toset, toadd; }; // lazy type
  const T tneut = -inf; // neutral elements
  const L lneut = {inf, 0};
  T f (T a, T b) { return max(a, b); } // (any associative fn)
  T apply (T a, L b) {
    return b.toset != inf ? b.toset + b.toadd : a + b.toadd;
  } // Apply lazy
  L comb(L a, L b) {
    if (b.toset != inf) return b;
    return {a.toset, a.toadd + b.toadd};
  } // Combine lazy

  Node *l = 0, *r = 0;
  ll lo, hi; T val = tneut; L lazy = lneut;
  Node(ll lo,ll hi):lo(lo),hi(hi) {} // Large interval of tneut
  Node(vector<T>& v, ll lo, ll hi) : lo(lo), hi(hi) {
    if (lo + 1 < hi) {
      ll mid = lo + (hi - lo)/2;
      l = new Node(v, lo, mid); r = new Node(v, mid, hi);
      val = f(l->val, r->val);
    }
    else val = v[lo];
  }
  T query(ll L, ll R) {
    if (R <= lo || hi <= L) return tneut;
    if (L <= lo && hi <= R) return apply(val, lazy);
    push();
    return f(l->query(L, R), r->query(L, R));
  }
  void upd(ll Le, ll Ri, L x) {
    if (Ri <= lo || hi <= Le) return;
    if (Le <= lo && hi <= Ri) lazy = comb(lazy, x);
    else {
      push(), l->upd(Le, Ri, x), r->upd(Le, Ri, x);
      val = f(l->query(lo, hi), r->query(lo, hi));
    }
  }
  void set(ll L, ll R, ll x) { upd(L, R, {x, 0}); }
  void add(ll L, ll R, ll x) { upd(L, R, {inf, x}); }
  void push() {
    if (!l) {
      ll mid = lo + (hi - lo)/2;
      l = new Node(lo, mid); r = new Node(mid, hi);
    }
    l->lazy = comb(l->lazy, lazy);
    r->lazy = comb(r->lazy, lazy);
    lazy = lneut;
    val = f(l->query(lo, hi), r->query(lo, hi));
  }
};

PersistentSegmentTree.h
Description: Max segment tree in which each update creates a new version of the tree and you can query and update on any version of the tree. Can be changed by modifying T, f and neut.
Usage: Tree rmq(n);
ver = rmq.init(xs);
new_ver = rmq.upd(ver, i, x);
rmq.query(ver, l, u);
Time: $O(\log N)$
83b8e4, 43 lines
struct Tree {
  typedef ll T;
  static constexpr T neut = LONG_LONG_MIN;
  T f(T a, T b) { return max(a, b); } // (any associative fn)

  vector<T> st;
  vi L, R;
  ll n, rt;
  Tree(ll n) : st(1, neut), L(1), R(1), n(n), rt(0) {}
  ll new_node(T v, ll l, ll r) {
    st.pb(v), L.pb(l), R.pb(r);
    return SZ(st) - 1;
  }
  // not necessary in most cases
  ll init(ll s, ll e, vector<T>& a) {
    if (s + 1 == e) return new_node(a[s], 0, 0);
    ll m = (s + e) / 2, l = init(s, m, a), r = init(m, e, a);
    return new_node(f(st[l], st[r]), l, r);
  }
  ll upd(ll k, ll s, ll e, ll p, T v) {
    ll ks = new_node(st[k], L[k], R[k]);
    if (s + 1 == e) {
      st[ks] = v;
      return ks;
    }
    ll m = (s + e) / 2;
    if (p < m) L[ks] = upd(L[ks], s, m, p, v);
    else R[ks] = upd(R[ks], m, e, p, v);
    st[ks] = f(st[L[ks]], st[R[ks]]);
    return ks;
  }
  T query(ll k, ll s, ll e, ll a, ll b) {
    if (e <= a || b <= s) return neut;
    if (a <= s && e <= b) return st[k];
    ll m = (s + e) / 2;
    return f(query(L[k], s, m, a, b), query(R[k], m, e, a, b));
  }
  ll init(vector<T>& a) { return init(0, n, a); }
  ll upd(ll ver, ll p, T v) {return rt = upd(ver, 0, n, p, v);}
  // upd on last root
  ll upd(ll p, T v) { return upd(rt, p, v); }
  T query(ll ver, ll a, ll b) {return query(ver, 0, n, a, b);}
};

SegmentTree2d.h
Description: Query sum of area and make point updates. Bounds are inclusive to the left and exclusive to the right. Can be changed by modifying T, f and unit.
Time: $O(\log N)$
4f243e, 37 lines
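
To make the half-open [b, e) convention of the max-tree in SegmentTree.h concrete, here is a minimal usage sketch. It restates the iterative tree with plain standard types so that it compiles without the notebook's template macros; `MaxTree` is a hypothetical name used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Iterative max segment tree over positions [0, n); queries use [b, e).
// Sketch with plain standard types; MaxTree is a hypothetical name.
struct MaxTree {
  typedef long long T;
  static constexpr T neut = LLONG_MIN;  // neutral element of max
  static T f(T a, T b) { return std::max(a, b); }
  std::vector<T> s; long long n;
  // def is taken by value so that neut is never bound to a reference
  explicit MaxTree(long long n, T def = neut) : s(2 * n, def), n(n) {}
  void upd(long long pos, T val) {      // point assignment
    for (s[pos += n] = val; pos /= 2;)
      s[pos] = f(s[pos * 2], s[pos * 2 + 1]);
  }
  T query(long long b, long long e) {   // max over [b, e)
    T ra = neut, rb = neut;
    for (b += n, e += n; b < e; b /= 2, e /= 2) {
      if (b % 2) ra = f(ra, s[b++]);
      if (e % 2) rb = f(s[--e], rb);
    }
    return f(ra, rb);
  }
};
```

After `MaxTree t(5); t.upd(0, 3); t.upd(2, 7); t.upd(4, -1);`, the call `t.query(0, 5)` covers all positions and returns 7, while `t.query(1, 2)` covers only the untouched position 1 and returns the neutral element.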
    fore(j,m,n) C[j] = (C[j] - coef * B[j - m]) % mod;
    if (2 * L > i) continue;
    L = i + 1 - L, B = T, b = d, m = 0;
  }

  C.resize(L + 1); C.erase(C.begin());
  for (ll& x : C) x = (mod - x) % mod;
  return C;
}

LinearRecurrence.h
Description: Generates the k'th term of an n-order linear recurrence $S[i] = \sum_j S[i - j - 1]\,tr[j]$, given $S[0 \ldots \ge n - 1]$ and $tr[0 \ldots n - 1]$. Faster than matrix multiplication. Useful together with Berlekamp–Massey.
Usage: linearRec({0, 1}, {1, 1}, k) // k'th Fibonacci number
Time: $O(n^2 \log k)$
a61838, 26 lines
typedef vi Poly;
ll linearRec(Poly S, Poly tr, ll k) {
  ll n = SZ(tr);

typedef array<double, 2> P;

pair<double, P> hillClimb(P start, auto&& f) {
  pair<double, P> cur(f(start), start);
  for (double jmp = 1e9; jmp > 1e-20; jmp /= 2) {
    fore(j,0,100) fore(dx,-1,2) fore(dy,-1,2) {
      P p = cur.snd;
      p[0] += dx*jmp, p[1] += dy*jmp;
      cur = min(cur, make_pair(f(p), p));
    }
  }
  return cur;
}

Integrate.h
Description: Simple integration of a function over an interval using Simpson's rule. The error should be proportional to $h^4$, although in practice you will want to verify that the result is stable to desired precision when epsilon changes.
1c4ce6, 6 lines

struct LPSolver {
  ll m, n;
  vi N, B;
  vvd D;

  LPSolver(const vvd& A, const vd& b, const vd& c) :
    m(SZ(b)), n(SZ(c)), N(n+1), B(m), D(m+2, vd(n+2)) {
      fore(i,0,m) fore(j,0,n) D[i][j] = A[i][j];
      fore(i,0,m) {B[i]=n+i; D[i][n]=-1; D[i][n+1]=b[i];}
      fore(j,0,n) { N[j] = j; D[m][j] = -c[j]; }
      N[n] = -1, D[m+1][n] = 1;
    }

  void pivot(ll r, ll s) {
    T *a = D[r].data(), inv = 1 / a[s];
    fore(i,0,m+2) if (i != r && abs(D[i][s]) > eps) {
      T *b = D[i].data(), inv2 = b[s] * inv;
      fore(j,0,n+2) b[j] -= a[j] * inv2;
      b[s] = a[s] * inv2;
    }
    fore(j,0,n+2) if (j != s) D[r][j] *= inv;
    fore(i,0,m+2) if (i != r) D[i][s] *= -inv;
    D[r][s] = inv;
    swap(B[r], N[s]);
  }

  bool simplex(ll phase) {
    ll x = m + phase - 1;
    for (;;) {
      ll s = -1;
      fore(j,0,n+1) if (N[j] != -phase) ltj(D[x]);
      if (D[x][s] >= -eps) return true;
      ll r = -1;
      fore(i,0,m) {
        if (D[i][s] <= eps) continue;
        if (r == -1 || MP(D[i][n+1] / D[i][s], B[i])
                     < MP(D[r][n+1] / D[r][s], B[r])) r = i;
      }
      if (r == -1) return false;
      pivot(r, s);
    }
  }

  T solve(vd &x) {
    ll r = 0;
    fore(i,1,m) if (D[i][n+1] < D[r][n+1]) r = i;
    if (D[r][n+1] < -eps) {
      pivot(r, n);
      if (!simplex(2) || D[m+1][n+1] < -eps) return -inf;
      fore(i,0,m) if (B[i] == -1) {
        ll s = 0;
        fore(j,1,n+1) ltj(D[i]);
        pivot(i, s);
      }
    }
    bool ok = simplex(1); x = vd(n);
    fore(i,0,m) if (B[i] < n) x[B[i]] = D[i][n+1];
    return ok ? D[m][n+1] : inf;
  }
};

IntegerLinearProgramming.h
Description: When A and b have all integer entries and A is totally unimodular then every basic feasible solution is integral so the simplex algorithm can be used to solve ILP. A matrix A is totally unimodular if every square submatrix of A has determinant 0, 1 or −1.

4.3 Matrices

Determinant.h
Description: Calculates determinant of a matrix. Destroys the matrix.
Time: $O(N^3)$
144e26, 15 lines
double det(vector<vector<double>>& a) {
  ll n = SZ(a); double res = 1;
  fore(i,0,n) {
    ll b = i;
    fore(j,i+1,n) if (fabs(a[j][i]) > fabs(a[b][i])) b = j;
    if (i != b) swap(a[i], a[b]), res *= -1;
    res *= a[i][i];
    if (res == 0) return 0;
    fore(j,i+1,n) {
      double v = a[j][i] / a[i][i];
      if (v != 0) fore(k,i+1,n) a[j][k] -= v * a[i][k];
    }
  }
  return res;
}

IntDeterminant.h
Description: Calculates determinant using modular arithmetics. Modulos can also be removed to get a pure-integer version.
Time: $O(N^3)$
d39cf6, 18 lines
const ll mod = 12345;
ll det(vector<vi>& a) {
  ll n = SZ(a); ll ans = 1;
  fore(i,0,n) {
    fore(j,i+1,n) {
      while (a[j][i] != 0) { // gcd step
        ll t = a[i][i] / a[j][i];
        if (t) fore(k,i,n)
          a[i][k] = (a[i][k] - a[j][k] * t) % mod;
        swap(a[i], a[j]);
        ans *= -1;
      }
    }
    ans = ans * a[i][i] % mod;
    if (!ans) return 0;
  }
  return (ans + mod) % mod;
}

SolveLinear.h
Description: Solves A ∗ x = b. If there are multiple solutions, an arbitrary one is returned. Returns rank, or -1 if no solutions. Data in A and b is lost.
Time: $O(n^2 m)$
9863aa, 38 lines
typedef vector<double> vd;
const double eps = 1e-12;

ll solveLinear(vector<vd>& A, vd& b, vd& x) {
  ll n = SZ(A), m = SZ(x), rank = 0, br, bc;
  if (n) assert(SZ(A[0]) == m);
  vi col(m); iota(ALL(col), 0);

  fore(i,0,n) {
    double v, bv = 0;
    fore(r,i,n) fore(c,i,m)
      if ((v = fabs(A[r][c])) > bv)
        br = r, bc = c, bv = v;
    if (bv <= eps) {
      fore(j,i,n) if (fabs(b[j]) > eps) return -1;
      break;
    }
    swap(A[i], A[br]);
    swap(b[i], b[br]);
    swap(col[i], col[bc]);
    fore(j,0,n) swap(A[j][i], A[j][bc]);
    bv = 1/A[i][i];
    fore(j,i+1,n) {
      double fac = A[j][i] * bv;
      b[j] -= fac * b[i];
      fore(k,i+1,m) A[j][k] -= fac*A[i][k];
    }
    rank++;
  }

  x.assign(m, 0);
  for (ll i = rank; i--;) {
    b[i] /= A[i][i];
    x[col[i]] = b[i];
    fore(j,0,i) b[j] -= A[j][i] * b[i];
  }
  return rank; // (multiple solutions if rank < m)
}

SolveLinear2.h
Description: To get all uniquely determined values of x back from SolveLinear, make the following changes:
"SolveLinear.h" 11ffdd, 7 lines
fore(j,0,n) if (j != i) // instead of fore(j,i+1,n)
// ... then at the end:
x.assign(m, undefined);
fore(i,0,rank) {
  fore(j,rank,m) if (fabs(A[i][j]) > eps) goto fail;
  x[col[i]] = b[i] / A[i][i];
fail:; }

SolveLinearBinary.h
Description: Solves Ax = b over $F_2$. If there are multiple solutions, one is returned arbitrarily. Returns rank, or -1 if no solutions. Destroys A and b.
Time: $O(n^2 m)$
54a024, 34 lines
typedef bitset<1000> bs;

ll solveLinear(vector<bs>& A, vi& b, bs& x, ll m) {
  ll n = SZ(A), rank = 0, br;
  assert(m <= SZ(x));
  vi col(m); iota(ALL(col), 0);
  fore(i,0,n) {
    for (br = i; br < n; ++br) if (A[br].any()) break;
    if (br == n) {
      fore(j,i,n) if (b[j]) return -1;
      break;
    }
    ll bc = (ll)A[br]._Find_next(i-1);
    swap(A[i], A[br]);
    swap(b[i], b[br]);
    swap(col[i], col[bc]);
    fore(j,0,n) if (A[j][i] != A[j][bc]) {
      A[j].flip(i); A[j].flip(bc);
    }
    fore(j,i+1,n) if (A[j][i]) {
      b[j] ^= b[i];
      A[j] ^= A[i];
    }
    rank++;
  }

  x = bs();
  for (ll i = rank; i--;) {
    if (!b[i]) continue;
    x[col[i]] = 1;
    fore(j,0,i) b[j] ^= A[j][i];
  }
  return rank; // (multiple solutions if rank < m)
}

MatrixInverse.h
Description: Invert matrix A. Returns rank; result is stored in A unless singular (rank < n). Can easily be extended to prime moduli; for prime powers, repeatedly set $A^{-1} = A^{-1}(2I - AA^{-1}) \pmod{p^k}$ where $A^{-1}$ starts as the inverse of A mod p, and k is doubled in each step.
Time: $O(n^3)$
8a7b27, 35 lines
ll matInv(vector<vector<double>>& A) {
  ll n = SZ(A); vi col(n);
  vector<vector<double>> tmp(n, vector<double>(n));
  fore(i,0,n) tmp[i][i] = 1, col[i] = i;

  fore(i,0,n) {
    ll r = i, c = i;
    fore(j,i,n) fore(k,i,n)
      if (fabs(A[j][k]) > fabs(A[r][c]))
        r = j, c = k;
    if (fabs(A[r][c]) < 1e-12) return i;
    A[i].swap(A[r]); tmp[i].swap(tmp[r]);
    fore(j,0,n)
      swap(A[j][i], A[j][c]), swap(tmp[j][i], tmp[j][c]);
    swap(col[i], col[c]);
    double v = A[i][i];
    fore(j,i+1,n) {
      double f = A[j][i] / v;
      A[j][i] = 0;
      fore(k,i+1,n) A[j][k] -= f*A[i][k];
      fore(k,0,n) tmp[j][k] -= f*tmp[i][k];
    }
    fore(j,i+1,n) A[i][j] /= v;
    fore(j,0,n) tmp[i][j] /= v;
    A[i][i] = 1;
  }

  for (ll i = n-1; i > 0; --i) fore(j,0,i) {
    double v = A[j][i];
    fore(k,0,n) tmp[j][k] -= v*tmp[i][k];
  }

  fore(i,0,n) fore(j,0,n) A[col[i]][col[j]] = tmp[i][j];
  return n;
}

MatrixInverse-mod.h
Description: Invert matrix A modulo a prime. Returns rank; result is stored in A unless singular (rank < n). For prime powers, repeatedly set $A^{-1} = A^{-1}(2I - AA^{-1}) \pmod{p^k}$ where $A^{-1}$ starts as the inverse of A mod p, and k is doubled in each step.
Time: $O(n^3)$
"../number-theory/ModPow.h" a019e9, 37 lines
ll matInv(vector<vi>& A) {
  ll n = SZ(A); vi col(n);
  vector<vi> tmp(n, vi(n));
  fore(i,0,n) tmp[i][i] = 1, col[i] = i;

  fore(i,0,n) {
    ll r = i, c = i;
    fore(j,i,n) fore(k,i,n) if (A[j][k]) {
      r = j; c = k; goto found;
    }
    return i;
found:
    A[i].swap(A[r]); tmp[i].swap(tmp[r]);
    fore(j,0,n)
      swap(A[j][i], A[j][c]), swap(tmp[j][i], tmp[j][c]);
    swap(col[i], col[c]);
    ll v = modpow(A[i][i], mod - 2);
    fore(j,i+1,n) {
      ll f = A[j][i] * v % mod;
      A[j][i] = 0;
      fore(k,i+1,n) A[j][k] = (A[j][k] - f*A[i][k]) % mod;
      fore(k,0,n) tmp[j][k] = (tmp[j][k] - f*tmp[i][k]) % mod;
    }
    fore(j,i+1,n) A[i][j] = A[i][j] * v % mod;
    fore(j,0,n) tmp[i][j] = tmp[i][j] * v % mod;
    A[i][i] = 1;
  }

  for (ll i = n-1; i > 0; --i) fore(j,0,i) {
    ll v = A[j][i];
    fore(k,0,n) tmp[j][k] = (tmp[j][k] - v*tmp[i][k]) % mod;
  }

  fore(i,0,n) fore(j,0,n)
    A[col[i]][col[j]] = tmp[i][j] % mod + (tmp[i][j] < 0)*mod;
  return n;
}

Tridiagonal.h
Description: x = tridiagonal(d, p, q, b) solves the equation system

$$\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_{n-1} \end{pmatrix} = \begin{pmatrix} d_0 & p_0 & 0 & 0 & \cdots & 0 \\ q_0 & d_1 & p_1 & 0 & \cdots & 0 \\ 0 & q_1 & d_2 & p_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & q_{n-3} & d_{n-2} & p_{n-2} \\ 0 & 0 & \cdots & 0 & q_{n-2} & d_{n-1} \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_{n-1} \end{pmatrix}$$

This is useful for solving problems of the type

$$a_i = b_i a_{i-1} + c_i a_{i+1} + d_i,\quad 1 \le i \le n,$$

where $a_0$, $a_{n+1}$, $b_i$, $c_i$ and $d_i$ are known. $a$ can then be obtained from

$$\{a_i\} = \text{tridiagonal}(\{1, -1, -1, \ldots, -1, 1\}, \{0, c_1, c_2, \ldots, c_n\}, \{b_1, b_2, \ldots, b_n, 0\}, \{a_0, d_1, d_2, \ldots, d_n, a_{n+1}\}).$$

Fails if the solution is not unique.
If $|d_i| > |p_i| + |q_{i-1}|$ for all $i$, or $|d_i| > |p_{i-1}| + |q_i|$, or the matrix is positive definite, the algorithm is numerically stable and neither tr nor the check for diag[i] == 0 is needed.
Time: $O(N)$
3c76ca, 26 lines
typedef double T;
vector<T> tridiagonal(vector<T> diag, const vector<T>& super,
    const vector<T>& sub, vector<T> b) {
  ll n = SZ(b); vi tr(n);
  fore(i,0,n-1) {
    if (abs(diag[i]) < 1e-9 * abs(super[i])) { // diag[i] == 0
      b[i+1] -= b[i] * diag[i+1] / super[i];
      if (i+2 < n) b[i+2] -= b[i] * sub[i+1] / super[i];
      diag[i+1] = sub[i]; tr[++i] = 1;
    } else {
      diag[i+1] -= super[i]*sub[i]/diag[i];
      b[i+1] -= b[i]*sub[i]/diag[i];
    }
  }
  for (ll i = n; i--;) {
    if (tr[i]) {
      swap(b[i], b[i-1]);
      diag[i-1] = diag[i];
      b[i] /= super[i-1];
    } else {
      b[i] /= diag[i];
      if (i) b[i-1] -= b[i]*super[i-1];
    }
  }
  return b;
}

4.4 Fourier transforms

FastFourierTransform.h
Description: fft(a) computes $\hat f(k) = \sum_x a[x] \exp(2\pi i \cdot kx/N)$ for all $k$. N must be a power of 2. Useful for convolution: conv(a, b) = c, where $c[x] = \sum a[i]b[x - i]$. For convolution of complex numbers or more than two vectors: FFT, multiply pointwise, divide by n, reverse(start+1, end), FFT back. Rounding is safe if $(\sum a_i^2 + \sum b_i^2)\log_2 N < 9 \cdot 10^{14}$ (in practice $10^{16}$; higher for random inputs). Otherwise, use NTT/FFTMod.
Time: $O(N \log N)$ with $N = |A| + |B|$ (∼1s for $N = 2^{22}$)
71e979, 35 lines
typedef complex<double> C;
typedef vector<double> vd;
void fft(vector<C>& a) {
  ll n = SZ(a), L = 63 - __builtin_clzll(n);
  static vector<complex<long double>> R(2, 1);
  static vector<C> rt(2, 1); // (^ 10% faster if double)
  for (static ll k = 2; k < n; k *= 2) {
    R.resize(n); rt.resize(n);
    auto x = polar(1.0L, acos(-1.0L) / k);
    fore(i,k,2*k) rt[i] = R[i] = i&1 ? R[i/2] * x : R[i/2];
  }
  vi rev(n);
  fore(i,0,n) rev[i] = (rev[i / 2] | (i & 1) << L) / 2;
  fore(i,0,n) if (i < rev[i]) swap(a[i], a[rev[i]]);
  for (ll k = 1; k < n; k *= 2)
    for (ll i = 0; i < n; i += 2 * k) fore(j,0,k) {
      C z = rt[j+k] * a[i+j+k]; // (25% faster if hand-rolled)
      a[i + j + k] = a[i + j] - z;
      a[i + j] += z;
    }
}
vd conv(const vd& a, const vd& b) {
  if (a.empty() || b.empty()) return {};
  vd res(SZ(a) + SZ(b) - 1);
  ll L = 64 - __builtin_clzll(SZ(res)), n = 1 << L;
  vector<C> in(n), out(n);
  copy(ALL(a), begin(in));
  fore(i,0,SZ(b)) in[i].imag(b[i]);
  fft(in);
  for (C& x : in) x *= x;
  fore(i,0,n) out[i] = in[-i & (n - 1)] - conj(in[i]);
  fft(out);
  fore(i,0,SZ(res)) res[i] = imag(out[i]) / (4 * n);
  return res;
}

FastFourierTransformMod.h
Description: Higher precision FFT, can be used for convolutions modulo arbitrary integers as long as $N \log_2 N \cdot \text{mod} < 8.6 \cdot 10^{14}$ (in practice $10^{16}$ or higher). Inputs must be in [0, mod).
Time: $O(N \log N)$, where $N = |A| + |B|$ (twice as slow as NTT or FFT)
"FastFourierTransform.h" 8121b2, 21 lines
template<ll M> vi convMod(const vi &a, const vi &b) {
  if (a.empty() || b.empty()) return {};
  vi res(SZ(a) + SZ(b) - 1);
  ll B=64-__builtin_clzll(SZ(res)), n=1<<B, cut=ll(sqrt(M));
  vector<C> L(n), R(n), outs(n), outl(n);
  fore(i,0,SZ(a)) L[i] = C((ll)a[i] / cut, (ll)a[i] % cut);
  fore(i,0,SZ(b)) R[i] = C((ll)b[i] / cut, (ll)b[i] % cut);
  fft(L), fft(R);
  fore(i,0,n) {
    ll j = -i & (n - 1);
    outl[j] = (L[i] + conj(L[j])) * R[i] / (2.0 * n);
    outs[j] = (L[i] - conj(L[j])) * R[i] / (2.0 * n) / 1i;
  }
  fft(outl), fft(outs);
  fore(i,0,SZ(res)) {
    ll av = ll(real(outl[i])+.5), cv = ll(imag(outs[i])+.5);
    ll bv = ll(imag(outl[i])+.5) + ll(real(outs[i])+.5);
    res[i] = ((av % M * cut + bv) % M * cut + cv) % M;
  }
  return res;
}

NumberTheoreticTransform.h
Description: ntt(a) computes $\hat f(k) = \sum_x a[x] g^{xk}$ for all $k$, where $g = \text{root}^{(\text{mod}-1)/N}$. N must be a power of 2. Useful for convolution modulo specific nice primes of the form $2^a b + 1$, where the convolution result has size at most $2^a$. For arbitrary modulo, see FFTMod. conv(a, b) = c, where $c[x] = \sum a[i]b[x - i]$. For manual convolution: NTT the inputs, multiply pointwise, divide by n, reverse(start+1, end), NTT back. Inputs must be in [0, mod).
Time: $O(N \log N)$
"../number-theory/ModPow.h" a249ae, 34 lines
const ll mod = (119 << 23) + 1, root = 62; // = 998244353
// For p < 2^30 there is also e.g. 5 << 25, 7 << 26, 479 << 21
// and 483 << 21 (same root). The last two are > 10^9.
void ntt(vi &a) {
  ll n = SZ(a), L = 63 - __builtin_clzll(n);
  static vi rt(2, 1);
  for (static ll k = 2, s = 2; k < n; k *= 2, s++) {
    rt.resize(n);
    ll z[] = {1, modpow(root, mod >> s)};
    fore(i,k,2*k) rt[i] = rt[i / 2] * z[i & 1] % mod;
  }
  vi rev(n);
  fore(i,0,n) rev[i] = (rev[i / 2] | (i & 1) << L) / 2;
  fore(i,0,n) if (i < rev[i]) swap(a[i], a[rev[i]]);
  for (ll k = 1; k < n; k *= 2)
    for (ll i = 0; i < n; i += 2 * k) fore(j,0,k) {
      ll z = rt[j + k] * a[i + j + k] % mod, &ai = a[i + j];
      a[i + j + k] = ai - z + (z > ai ? mod : 0);
      ai += (ai + z >= mod ? z - mod : z);
    }
}
vi conv(const vi &a, const vi &b) {
  if (a.empty() || b.empty()) return {};
  ll s = SZ(a) + SZ(b) - 1, B = 64 - __builtin_clzll(s),
     n = 1 << B;
  ll inv = modpow(n, mod - 2);
  vi L(a), R(b), out(n);
  L.resize(n), R.resize(n);
  ntt(L), ntt(R);
  fore(i,0,n)
    out[-i & (n - 1)] = (ll)L[i] * R[i] % mod * inv % mod;
  ntt(out);
  return {out.begin(), out.begin() + s};
}

  Poly res(max(SZ(p), SZ(q)));
  fore(i, 0, SZ(p)) res[i] += p[i];
  fore(i, 0, SZ(q)) res[i] += q[i];
  for (ll& x : res) x %= mod;
  while (!res.empty() && !res.back()) res.pop_back();
  return res;
}
Poly derivate(const Poly& p) { // O(n)
  Poly res(max(0ll, SZ(p)-1));
  fore(i, 1, SZ(p)) res[i-1] = (i * p[i]) % mod;
  return res;
}
Poly integrate(const Poly& p) { // O(n)
  Poly ans(SZ(p) + 1);
  fore(i, 0, SZ(p)) ans[i+1] = (p[i] * inv(i+1)) % mod;
  return ans;
}

Poly takeMod(Poly p, ll n) { // O(n)
  p.resize(min(SZ(p), n)); // p % (x^n)
  while (!p.empty() && !p.back()) p.pop_back();
  return p;
}

Poly inv(const Poly& p, ll d) { // O(n log(n))
  Poly res = {inv(p[0])}; // first d terms of 1/p
  ll sz = 1;
  while (sz < d) {
    sz *= 2;
    Poly pre(p.begin(), p.begin() + min(SZ(p), sz));
    Poly cur = conv(res, pre);
    fore(i, 0, SZ(cur)) if (cur[i]) cur[i] = mod - cur[i];
    cur[0] = cur[0] + 2;
    res = takeMod(conv(res, cur), sz);
  }
  res.resize(d);
  return res;
}
Poly log(const Poly& p, ll d) { // O(n log(n))
  Poly cur = takeMod(p, d); // first d terms of log(p)
  Poly res = integrate(
    takeMod(conv(inv(cur, d), derivate(cur)), d - 1));
  res.resize(d);
  return res;
}

}

vector<Poly> filltree(vi& x) {
  ll k = SZ(x);
  vector<Poly> tr(2*k);
  fore(i, k, 2*k) tr[i] = {(mod - x[i - k]) % mod, 1};
  for (ll i = k; --i;) tr[i] = conv(tr[2*i], tr[2*i+1]);
  return tr;
}
vi evaluate(Poly& a, vi& x) { // O(n log(n)^2)
  ll k = SZ(x); // Evaluate a in all points of x
  if (!SZ(a)) return vi(k);
  vector<Poly> tr = filltree(x), ans(2*k);
  ans[1] = div(a, tr[1]).snd;
  fore(i, 2, 2*k) ans[i] = div(ans[i/2], tr[i]).snd;
  vi r(k);
  fore(i, 0, k) if (SZ(ans[i+k])) r[i] = ans[i+k][0];
  return r;
}
Poly interpolate(vi& x, vi& y) { // O(n log(n)^2)
  vector<Poly> tr = filltree(x);
  Poly p = derivate(tr[1]);
  ll k = SZ(y);
  vi d = evaluate(p, x); // pass tr here for a speed up
  vector<Poly> intr(2*k);
  fore(i, k, 2*k) intr[i] = {(y[i-k] * inv(d[i-k])) % mod};
  for (ll i = k; --i;) intr[i] = add(
    conv(tr[2*i], intr[2*i+1]), conv(tr[2*i+1], intr[2*i]));
  return intr[1];
}

Number theory (5)

5.1 Modular arithmetic

ModularArithmetic.h
Description: Operators for modular arithmetic. Update mod. Use commented code in invert if mod is not prime.
"euclid.h" 433a25, 20 lines

FastSubsetTransform.h
Description: Transform to a basis with fast convolutions of the form $c[z] = \sum_{z = x \oplus y} a[x] \cdot b[y]$, where $\oplus$ is one of AND, OR, XOR. The size of a must be a power of two.
Time: $O(N \log N)$
265ad3, 16 lines

Poly exp(const Poly& p, ll d) { // O(n log(n)^2)
  Poly res = {1}; // first d terms of exp(p)
  for (ll sz = 1; sz < d; ) {
const ll mod = 17; // change to something else
struct Mod {
  ll x;
  Mod(ll xx) : x(xx) {}
  Mod operator+(Mod b) { return Mod((x + b.x) % mod); }
  Mod operator-(Mod b) { return Mod((x - b.x + mod) % mod); }
  Mod operator*(Mod b) { return Mod((x * b.x) % mod); }
void FST(vi& a, bool inv) {
  for (ll n = SZ(a), step = 1; step < n; step *= 2) {
    for (ll i = 0; i < n; i += 2 * step) fore(j,i,i+step) {
      ll &u = a[j], &v = a[j + step]; tie(u, v) =
        inv ? ii(v - u, u) : ii(v, u + v); // AND
inv ? ii(v, u - v) : ii(u + v, u); // OR sz *= 2; Mod operator/(Mod b) { return *this * invert(b); }
ii(u + v, u - v); // XOR Poly lg = log(res, sz), cur(sz); Mod invert(Mod a) {
} fore(i, 0, sz) cur[i] = (mod + (i<SZ(p) ? p[i] : 0) return a ^ (mod - 2);
} - (i<SZ(lg) ? lg[i] : 0)) % mod; // ll x, y, g = euclid(a.x, mod, x, y);
if (inv) for (ll& x : a) x /= SZ(a); // XOR only cur[0] = (cur[0] + 1) % mod; // assert(g == 1); return Mod((x + mod) % mod);
} res = takeMod(conv(res, cur), sz); }
vi conv(vi a, vi b) { } Mod operator^(ll e) {
FST(a, 0); FST(b, 0); res.resize(d); Mod ans(1);
fore(i,0,SZ(a)) a[i] *= b[i]; return res; for (Mod b = *this; e; b = b * b, e >>= 1)
FST(a, 1); return a; } if (e & 1) ans = ans * b;
} return ans;
pair<Poly,Poly> div(const Poly& a, const Poly& b) { }
ll m = SZ(a), n = SZ(b); // O(n log (n) ) , returns {res , rem} };
NTT-operations.h
Description: Some operations on polynomials made fast with NTT. They if (m < n) return {{}, a}; // if min(m−n,n) < 750 it may be
may also work with doubles and FFT, but it’s numerically unstable. inv, Poly ap = a, bp = b; // faster to use quadratic version PrecomputeInverses.h
log and exp return truncated infinite series. Poly elements should not have reverse(ALL(ap)), reverse(ALL(bp)); Description: Pre-computation of modular inverses. Assumes LIM ≤ mod
trailing zeros. The zero polynomial is {}. Poly q = conv(ap, inv(bp, m - n + 1)); and that mod is a prime. 7f666d, 6 lines
"NumberTheoreticTransform.h", "../number-theory/FastInverse.h" b315c7, 102 lines
q.resize(SZ(q) + m - n - SZ(q) + 1, 0), reverse(ALL(q));
Poly bq = conv(b, q); constexpr ll mod = 1e9+7, LIM = 2e5;
typedef vi Poly; fore(i, 0, SZ(bq)) if (bq[i]) bq[i] = mod - bq[i]; array<ll, LIM> inv;
return {q, add(a, bq)}; void initInv() {
Poly add(const Poly& p, const Poly& q) { // O(n) inv[1] = 1;
  fore(i,2,LIM) inv[i] = mod - mod / i * inv[mod % i] % mod;
}

FastInverse.h
Description: Fast modular inverse for a constant modulus.
Time: $O(\log n)$, ≈ 2.7x faster than euclid in CF.
4fccf0, 11 lines
constexpr ll mod = 1e9 + 7;
constexpr ll k = bit_width((unsigned long long)(mod - 2));
ll inv(ll a) {
  ll r = 1;
#pragma GCC unroll(k)
  fore(l, 0, k) {
    if ((mod - 2) >> l & 1) r = r * a % mod;
    a = a * a % mod;
  }
  return r;
}

ModPow.h
c6ee78, 8 lines
const ll mod = 1000000007; // faster if const

ll modpow(ll b, ll e) {
  ll ans = 1;
  for (; e; b = b * b % mod, e >>= 1)
    if (e & 1) ans = ans * b % mod;
  return ans;
}

ModLog.h
Description: Returns the smallest $x > 0$ s.t. $a^x = b \pmod m$, or −1 if no such x exists. modLog(a,1,m) can be used to calculate the order of a.
Time: $O(\sqrt{m})$
0e2062, 11 lines
ll modLog(ll a, ll b, ll m) {
  ll n = (ll) sqrt(m) + 1, e = 1, f = 1, j = 1;
  unordered_map<ll, ll> A;
  while (j <= n && (e = f = e * a % m) != b % m)
    A[e * b % m] = j++;
  if (e == b % m) return j;
  if (gcd(m, e) == gcd(m, b))
    fore(i,2,n+2) if (A.count(e = e * f % m))
      return n * i - A[e];
  return -1;
}

ModSum.h
Description: Sums of mod’ed arithmetic progressions.
modsum(to, c, k, m) = $\sum_{i=0}^{to-1} (ki + c) \% m$. divsum is similar but for floored division.
Time: $\log(m)$, with a large constant.
5c5bc5, 16 lines
typedef unsigned long long ull;

ull sumsq(ull to) { return to / 2 * ((to-1) | 1); }

ull divsum(ull to, ull c, ull k, ull m) {
  ull res = k / m * sumsq(to) + c / m * to;
  k %= m; c %= m;
  if (!k) return res;
  ull to2 = (to * k + c) / m;
  return res + (to - 1) * to2 - divsum(to2, m-1 - c, m, k);
}

ll modsum(ull to, ll c, ll k, ll m) {
  c = ((c % m) + m) % m;
  k = ((k % m) + m) % m;
  return to * c + k * sumsq(to) - m * divsum(to, c, k, m);
}

ModMulLL.h
Description: Calculate $a \cdot b \bmod c$ (or $a^b \bmod c$) for $0 \le a, b \le c \le 7.2 \cdot 10^{18}$.
Time: $O(1)$ for modmul, $O(\log b)$ for modpow
bbbd8f, 11 lines
typedef unsigned long long ull;
ull modmul(ull a, ull b, ull M) {
  ll ret = a * b - M * ull(1.L / M * a * b);
  return ret + M * (ret < 0) - M * (ret >= (ll)M);
}
ull modpow(ull b, ull e, ull mod) {
  ull ans = 1;
  for (; e; b = modmul(b, b, mod), e /= 2)
    if (e & 1) ans = modmul(ans, b, mod);
  return ans;
}

ModSqrt.h
Description: Tonelli-Shanks algorithm for modular square roots. Finds x s.t. $x^2 = a \pmod p$ (−x gives the other solution).
Time: $O(\log^2 p)$ worst case, $O(\log p)$ for most p
"ModPow.h" b167b9, 24 lines
ll sqrt(ll a, ll p) {
  a %= p; if (a < 0) a += p;
  if (a == 0) return 0;
  assert(modpow(a, (p-1)/2, p) == 1); // else no solution
  if (p % 4 == 3) return modpow(a, (p+1)/4, p);
  // a^(n+3)/8 or 2^(n+3)/8 * 2^(n-1)/4 works if p % 8 == 5
  ll s = p - 1, n = 2;
  ll r = 0, m;
  while (s % 2 == 0)
    ++r, s /= 2;
  while (modpow(n, (p - 1) / 2, p) != p - 1) ++n;
  ll x = modpow(a, (s + 1) / 2, p);
  ll b = modpow(a, s, p), g = modpow(n, s, p);
  for (;; r = m) {
    ll t = b;
    for (m = 0; m < r && t != 1; ++m)
      t = t * t % p;
    if (m == 0) return x;
    ll gs = modpow(g, 1LL << (r - m - 1), p);
    g = gs * gs % p;
    x = x * gs % p;
    b = b * g % p;
  }
}

5.2 Primality

Eratosthenes.h
Description: s[i] = smallest prime factor of i (except for i = 0, 1). sieve returns sorted primes less than L. fact returns sorted prime, exponent pairs of the factorization of n.
55dd05, 18 lines
const ll L = 1e6;
array<ll, L> s;
vi sieve() {
  vi p;
  for (ll i = 4; i < L; i += 2) s[i] = 2;
  for (ll i = 3; i * i < L; i += 2) if (!s[i])
    for (ll j=i*i; j < L; j += 2*i) if (!s[j]) s[j] = i;
  fore(i,2,L) if (!s[i]) p.pb(i), s[i] = i;
  return p;
}
vector<ii> fact(ll n) {
  vector<ii> res;
  for (; n > 1; n /= s[n]) {
    if (!SZ(res) || res.back().fst!=s[n]) res.pb({s[n],0});
    res.back().snd++;
  }
  return res;
}

FastEratosthenes.h
Description: Prime sieve for generating all primes smaller than LIM.
Time: $O(n \log\log n)$; LIM=1e9 ≈ 1.5s
9d96d8, 20 lines
const ll LIM = 1e6;
bitset<LIM> isPrime;
vi eratosthenes() {
  const ll S = (ll)round(sqrt(LIM)), R = LIM / 2;
  vi pr = {2}, sieve(S+1); pr.reserve(ll(LIM/log(LIM)*1.1));
  vector<ii> cp;
  for (ll i = 3; i <= S; i += 2) if (!sieve[i]) {
    cp.pb({i, i * i / 2});
    for (ll j = i * i; j <= S; j += 2 * i) sieve[j] = 1;
  }
  for (ll L = 1; L <= R; L += S) {
    array<bool, S> block{};
    for (auto &[p, idx] : cp)
      for (ll i=idx; i < S+L; idx = (i+=p)) block[i-L] = 1;
    fore(i,0,min(S, R - L))
      if (!block[i]) pr.pb((L + i) * 2 + 1);
  }
  for (ll i : pr) isPrime[i] = 1;
  return pr;
}

MillerRabin.h
Description: Deterministic Miller-Rabin primality test. Guaranteed to work for numbers up to $7 \cdot 10^{18}$; for larger numbers, use Python and extend A randomly.
Time: 7 times the complexity of $a^b \bmod c$.
"ModMulLL.h" 1d5cc3, 11 lines
bool isPrime(ull n) {
  if (n < 2 || n % 6 % 4 != 1) return (n | 1) == 3;
  ll s = __builtin_ctzll(n-1), d = n >> s;
  for (ull a : {2,325,9375,28178,450775,9780504,1795265022}) {
    ull p = modpow(a%n, d, n), i = s;
    while (p != 1 && p != n - 1 && a % n && i--)
      p = modmul(p, p, n);
    if (p != n-1 && i != s) return 0;
  }
  return 1;
}

Factor.h
Description: Pollard-rho randomized factorization algorithm. Returns prime factors of a number, in arbitrary order (e.g. 2299 -> {11, 19, 11}).
Time: $O(n^{1/4})$, less for numbers with small factors.
"ModMulLL.h", "MillerRabin.h" fd4221, 18 lines
ull pollard(ull n) {
  ull x = 0, y = 0, t = 30, prd = 2, i = 1, q;
  auto f = [&](ull x) { return modmul(x, x, n) + i; };
  while (t++ % 40 || gcd(prd, n) == 1) {
    if (x == y) x = ++i, y = f(x);
    if ((q = modmul(prd, max(x,y) - min(x,y), n))) prd = q;
    x = f(x), y = f(f(y));
  }
  return gcd(prd, n);
}
vector<ull> factor(ull n) {
  if (n == 1) return {};
  if (isPrime(n)) return {n};
  ull x = pollard(n);
  auto l = factor(x), r = factor(n / x);
  l.insert(l.end(), ALL(r));
  return l;
}
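
The modmul routine in ModMulLL.h estimates the quotient a·b/M in floating point, so the wrapped difference can be off by one multiple of M before the final line folds it back into [0, M). A minimal sketch of a sanity check follows, assuming a compiler with the `unsigned __int128` extension (GCC or Clang); `modmulRef` and `modmulAgrees` are hypothetical helpers introduced only for this illustration and are not part of the notebook.

```cpp
#include <cassert>
typedef long long ll;
typedef unsigned long long ull;

// modmul as listed in ModMulLL.h.
ull modmul(ull a, ull b, ull M) {
  // a * b and M * q both wrap modulo 2^64; q may miss the true quotient by one.
  ll ret = a * b - M * ull(1.L / M * a * b);
  // Fold the result back into [0, M).
  return ret + M * (ret < 0) - M * (ret >= (ll)M);
}

// Hypothetical reference: exact product through the 128-bit extension type.
ull modmulRef(ull a, ull b, ull M) {
  return (ull)((unsigned __int128)a * b % M);
}

// Compares both versions on a deterministic xorshift64 sample with a, b < M.
bool modmulAgrees(ull M, int rounds) {
  ull x = 88172645463325252ULL;
  for (int i = 0; i < rounds; i++) {
    x ^= x << 13; x ^= x >> 7; x ^= x << 17;
    ull a = x % M;
    x ^= x << 13; x ^= x >> 7; x ^= x << 17;
    ull b = x % M;
    if (modmul(a, b, M) != modmulRef(a, b, M)) return false;
  }
  return true;
}
```

Calling `modmulAgrees(1000000007ULL, 1000)` before a contest is a cheap way to confirm that the local compiler and floating-point settings behave as the routine expects; larger moduli rely on the precision of `long double` on the target machine.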
FastDivisors.h
Description: Given the prime factorization of a number, returns all its divisors.
Time: $O(d)$, where d is the number of divisors.
3c6074, 8 lines
vi divisors(vector<ii>& f) {
  vi res = {1};
  for (auto& [p, k] : f) {
    ll sz = SZ(res);
    fore(i,0,sz) for(ll j=0,x=p;j<k;j++,x*=p) res.pb(res[i]*x);
  }
  return res;
}

5.3 Divisibility

euclid.h
Description: Finds two integers x and y, such that $ax + by = \gcd(a, b)$. If you just need gcd, use the built in gcd instead. If a and b are coprime, then x is the inverse of a (mod b).
33ba8f, 5 lines
ll euclid(ll a, ll b, ll &x, ll &y) {
  if (!b) return x = 1, y = 0, a;
  ll d = euclid(b, a % b, y, x);
  return y -= a/b * x, d;
}

Diophantine.h
Description: Returns $(x_0, y_0, dx, dy)$ such that all integer solutions (x, y) to $ax + by = r$ are $(x_0 + k \cdot dx, y_0 + k \cdot dy)$ for integer k.
Time: $O(\log(\min(a, b)))$
"./euclid.h" 12347a, 6 lines
array<ll, 4> diophantine(ll a, ll b, ll r) {
  ll x, y, g = euclid(a, b, x, y);
  a /= g, b /= g, r /= g, x *= r, y *= r;
  assert(a * x + b * y == r); // otherwise no solution
  return {x, y, -b, a};
}

CRT.h
Description: Chinese Remainder Theorem.
crt(a, m, b, n) computes x such that $x \equiv a \pmod m$, $x \equiv b \pmod n$. If $|a| < m$ and $|b| < n$, x will obey $0 \le x < \mathrm{lcm}(m, n)$. Assumes $mn < 2^{62}$.
Time: $\log(n)$
"euclid.h" 04d93a, 7 lines
ll crt(ll a, ll m, ll b, ll n) {
  if (n > m) swap(a, b), swap(m, n);
  ll x, y, g = euclid(m, n, x, y);
  assert((a - b) % g == 0); // else no solution
  x = (b - a) % n * x % n / g * m + a;
  return x < 0 ? x + m*n/g : x;
}

5.3.1 Bézout’s identity
For $a \ne 0$, $b \ne 0$, $d = \gcd(a, b)$ is the smallest positive integer for which there are integer solutions to
$$ax + by = d$$
If (x, y) is one solution, then all solutions are given by
$$\left(x + \frac{kb}{\gcd(a, b)},\; y - \frac{ka}{\gcd(a, b)}\right), \quad k \in \mathbb{Z}$$

phiFunction.h
Description: Euler’s ϕ function is defined as ϕ(n) := # of positive integers ≤ n that are coprime with n. $\phi(1) = 1$, p prime ⇒ $\phi(p^k) = (p - 1)p^{k-1}$, m, n coprime ⇒ $\phi(mn) = \phi(m)\phi(n)$. If $n = p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}$ then $\phi(n) = (p_1 - 1)p_1^{k_1 - 1} \cdots (p_r - 1)p_r^{k_r - 1}$. $\phi(n) = n \cdot \prod_{p|n} (1 - 1/p)$.
$\sum_{d|n} \phi(d) = n$, $\sum_{1 \le k \le n, \gcd(k,n)=1} k = n\phi(n)/2$, $n > 1$
Euler’s: a, n coprime ⇒ $a^{\phi(n)} \equiv 1 \pmod n$, $a^m \equiv a^{m \bmod \phi(n)} \pmod n$.
Generalization: $m \ge \log_2(n)$ ⇒ $x^m \equiv x^{\phi(n) + (m \bmod \phi(n))} \pmod n$.
Fermat’s little thm: p prime ⇒ $a^{p-1} \equiv 1 \pmod p$ for all a with $p \nmid a$.
151aa0, 8 lines
const ll LIM = 5000000;
array<ll, LIM> phi;

void calculatePhi() {
  fore(i,0,LIM) phi[i] = i&1 ? i : i/2;
  for (ll i = 3; i < LIM; i += 2) if (phi[i] == i)
    for (ll j = i; j < LIM; j += i) phi[j] -= phi[j] / i;
}

PointsUnderLine.h
Description: Given a, b > 0, f returns the number of lattice points (x, y) such that $ax + by \le c$ and $x, y > 0$, and g returns the number of lattice points (x, y) such that $ax + by \le c$, $0 < x \le X$ and $0 < y \le Y$.
388aa4, 11 lines
ll f(ll a, ll b, ll c) {
  if (c <= 0) return 0;
  if (a < b) swap(a, b);
  ll m = c / a, k = (a - 1) / b, h = (c - a * m) / b;
  if (a == b) return m * (m - 1) / 2;
  return f(b, a - b*k, c - b*(k*m + h)) + k*m*(m - 1)/2 + m*h;
}
ll g(ll a, ll b, ll c, ll X, ll Y) {
  if (a * X + b * Y <= c) return X * Y;
  return f(a,b,c)-f(a,b,c-a*X)-f(a,b,c-b*Y)+f(a,b,c-a*X-b*Y);
}

5.4 Fractions

ContinuedFractions.h
Description: Given N and a real number $x \ge 0$, finds the closest rational approximation p/q with $p, q \le N$. It will obey $|p/q - x| \le 1/qN$.
For consecutive convergents, $p_{k+1} q_k - q_{k+1} p_k = (-1)^k$. ($p_k/q_k$ alternates between > x and < x.) If x is rational, y eventually becomes ∞; if x is the root of a degree 2 polynomial the a’s eventually become cyclic.
Time: $O(\log N)$
5f8089, 21 lines
// double is safe for N ~ 1e7, use long double for N ~ 1e9
typedef long double ld;

ii approximate(ld x, ll N) {
  ll LP = 0, LQ = 1, P = 1, Q = 0, inf = LLONG_MAX; ld y = x;
  for (;;) {
    ll lim = min(P ? (N-LP) / P : inf, Q ? (N-LQ)/Q : inf),
       a = (ll)floor(y), b = min(a, lim),
       NP = b*P + LP, NQ = b*Q + LQ;
    if (a > b) {
      // If b > a/2, we have a semi-convergent that gives us a
      // better approximation; if b = a/2, we *may* have one.
      // Return {P, Q} here for a more canonical approximation.
      return (abs(x - (ld)NP/(ld)NQ) < abs(x - (ld)P/(ld)Q)) ?
        ii{NP, NQ} : ii{P, Q};
    }
    if (abs(y = 1/(y - (ld)a)) > 3*N) {
      return {NP, NQ};
    }
    LP = P, P = NP, LQ = Q, Q = NQ;
  }
}

FracBinarySearch.h
Description: Given f and N, finds the smallest fraction $p/q \in [0, 1]$ such that f(p/q) is true, and $p, q \le N$. You may want to throw an exception from f if it finds an exact solution, in which case N can be removed.
Usage: fracBS([](Frac f) { return f.p>=3*f.q; }, 10); // {1,3}
Time: $O(\log(N))$
79308c, 20 lines
struct Frac { ll p, q; };

Frac fracBS(auto&& f, ll N) {
  bool dir = 1, A = 1, B = 1;
  Frac lo{0, 1}, hi{1, 1}; // Set hi to 1/0 to search (0, N]
  if (f(lo)) return lo;
  assert(f(hi));
  while (A || B) {
    ll adv = 0, step = 1; // move hi if dir, else lo
    for (ll si = 0; step; (step *= 2) >>= si) {
      adv += step;
      Frac mid{lo.p * adv + hi.p, lo.q * adv + hi.q};
      if (abs(mid.p) > N || mid.q > N || dir == !f(mid))
        adv -= step, si = 2;
    }
    hi.p += lo.p * adv, hi.q += lo.q * adv;
    dir = !dir, swap(lo, hi), A = B, B = !!adv;
  }
  return dir ? hi : lo;
}

5.5 Pythagorean Triples
The Pythagorean triples are uniquely generated by
$$a = k \cdot (m^2 - n^2), \quad b = k \cdot (2mn), \quad c = k \cdot (m^2 + n^2),$$
with $m > n > 0$, $k > 0$, $m \perp n$, and either m or n even.

5.6 Primes
p = 962592769 is such that $2^{21} \mid p - 1$, which may be useful. For hashing use 970592641 (31-bit number), 31443539979727 (45-bit), 3006703054056749 (52-bit). There are 78498 primes less than 1 000 000.
Primitive roots exist modulo any prime power $p^a$, except for p = 2, a > 2, and there are $\phi(\phi(p^a))$ many. For p = 2, a > 2, the group $\mathbb{Z}^\times_{2^a}$ is instead isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_{2^{a-2}}$.

5.7 Highly composite numbers
The number of divisors of n is $n^{O(1/\log\log n)}$. Max number of divisors up to $10^n$:

n         0  1  2   3    4     5      6       7
divisors  1  4  12  32   64    128    240     448
number    1  6  60  840  7560  83160  720720  8648640

n         8         9          10          11           12
divisors  768       1344       2304        4032         6720
number    73513440  735134400  6983776800  97772875200  963761198400

n         13             14              15
divisors  10752          17280           26880
number    9316358251200  97821761637600  866421317361600

n         16                17                 18
divisors  41472             64512              103680
number    8086598962041600  74801040398884800  897612484786617600
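
As a worked example for the divisibility routines on this page, the sketch below repeats euclid and crt from euclid.h and CRT.h (with `swap` taken from `<utility>` instead of the contest template) and applies them to two small systems. The concrete inputs are chosen only for illustration.

```cpp
#include <cassert>
#include <utility>
using namespace std;
typedef long long ll;

// euclid as listed in euclid.h: returns gcd(a, b) and sets x, y with a*x + b*y = gcd.
ll euclid(ll a, ll b, ll &x, ll &y) {
  if (!b) return x = 1, y = 0, a;
  ll d = euclid(b, a % b, y, x);
  return y -= a/b * x, d;
}

// crt as listed in CRT.h: x = a (mod m), x = b (mod n), result in [0, lcm(m, n)).
ll crt(ll a, ll m, ll b, ll n) {
  if (n > m) swap(a, b), swap(m, n);
  ll x, y, g = euclid(m, n, x, y);
  assert((a - b) % g == 0); // else no solution
  x = (b - a) % n * x % n / g * m + a;
  return x < 0 ? x + m*n/g : x;
}
```

With coprime moduli, `crt(2, 3, 3, 5)` gives 8, since 8 leaves remainder 2 modulo 3 and remainder 3 modulo 5. With moduli sharing a factor, `crt(1, 4, 3, 6)` gives 9, the unique solution below lcm(4, 6) = 12; the assert inside crt fires when the two residues disagree modulo gcd(m, n).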
$\sum_{d|n} d = O(n \log\log n)$.

5.8 Mobius Function
$$\mu(n) = \begin{cases} 0 & n \text{ is not square free} \\ 1 & n \text{ has even number of prime factors} \\ -1 & n \text{ has odd number of prime factors} \end{cases}$$

Mobius.h
Description: Computes the Mobius function $\mu(n)$ for all $n < L$.
Time: $O(L \log L)$
50cb20, 6 lines
const ll L = 1e6;
array<int8_t, L> mu;
void calculateMu() {
  mu[1] = 1;
  fore(i,1,L) if(mu[i]) for(ll j=2*i; j<L; j+=i) mu[j]-=mu[i];
}

Mobius Inversion:
$$g(n) = \sum_{d|n} f(d) \Leftrightarrow f(n) = \sum_{d|n} \mu(d) g(n/d)$$
Other useful formulas/forms:
$\sum_{d|n} \mu(d) = [n = 1]$ (very useful)
$g(n) = \sum_{n|d} f(d) \Leftrightarrow f(n) = \sum_{n|d} \mu(d/n) g(d)$
$g(n) = \sum_{1 \le m \le n} f(\lfloor n/m \rfloor) \Leftrightarrow f(n) = \sum_{1 \le m \le n} \mu(m) g(\lfloor n/m \rfloor)$

Combinatorial (6)

IntPerm.h
Description: Permutation -> integer conversion. (Not order preserving.) Integer -> permutation can use a lookup table.
Time: $O(n)$
d7d731, 7 lines
ll permToInt(vi& v) {
  ll use = 0, i = 0, r = 0;
  for (ll x : v)
    r = r * ++i + __builtin_popcountll(use & -(1<<x)),
    use |= 1 << x; // (note: minus, not ~!)
  return r;
}

6.1.2 Cycles
Let $g_S(n)$ be the number of n-permutations whose cycle lengths all belong to the set S. Then
$$\sum_{n=0}^{\infty} g_S(n) \frac{x^n}{n!} = \exp\left(\sum_{n \in S} \frac{x^n}{n}\right)$$

6.1.3 Derangements
Permutations of a set such that none of the elements appear in their original position.
$$D(n) = (n - 1)(D(n - 1) + D(n - 2)) = nD(n - 1) + (-1)^n = \left\lfloor \frac{n!}{e} \right\rceil$$

6.1.4 Burnside’s lemma
Given a group G of symmetries and a set X, the number of elements of X up to symmetry equals
$$\frac{1}{|G|} \sum_{g \in G} |X^g|,$$

6.2.2 Lucas’ Theorem
Let n, m be non-negative integers and p a prime. Write $n = n_k p^k + \dots + n_1 p + n_0$ and $m = m_k p^k + \dots + m_1 p + m_0$. Then $\binom{n}{m} \equiv \prod_{i=0}^{k} \binom{n_i}{m_i} \pmod p$.

6.2.3 Binomials
Kummer’s theorem: the exponent of p in $\binom{n}{k}$ is the number of carries when adding k and n − k in base p, or, equivalently
$$\nu_p\left(\binom{n}{k}\right) = \frac{s_p(k) + s_p(n - k) - s_p(n)}{p - 1}$$
where $s_p(n)$ is the sum of the digits of n in base p. Therefore $\binom{n}{k}$ is odd iff k is a submask of n.
$$\binom{n}{k} = \frac{n}{k}\binom{n-1}{k-1} = \prod_{i=1}^{k} \frac{n+1-i}{i} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
$$\sum_{i=0}^{k} \binom{n+i}{i} = \binom{n+k+1}{k}, \qquad \sum_{i=0}^{n} \binom{i}{k} = \binom{n+1}{k+1}$$
$$\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}, \qquad \sum_{k=0}^{n} k\binom{n}{k} = n 2^{n-1}$$

multinomial.h
Description: Computes $\binom{k_1 + \dots + k_n}{k_1, k_2, \dots, k_n} = \frac{(\sum k_i)!}{k_1! k_2! \cdots k_n!}$.
91eef8, 5 lines
ll multinomial(vi& v) {
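
Lucas' theorem from 6.2.2 turns a binomial modulo a prime into a product of binomials of base-p digits. The following is a minimal sketch of that idea, not code from the notebook: `powmod`, `smallBinom` and `lucas` are hypothetical names, p is assumed prime, and p is assumed small enough that p squared fits in a signed 64-bit integer.

```cpp
#include <cassert>
typedef long long ll;

// Modular exponentiation b^e mod p (hypothetical helper).
ll powmod(ll b, ll e, ll p) {
  ll r = 1; b %= p;
  for (; e; b = b * b % p, e >>= 1)
    if (e & 1) r = r * b % p;
  return r;
}

// C(n, m) mod p for 0 <= n, m < p, through the product formula
// prod_{i=1..m} (n+1-i)/i and a Fermat inverse of the denominator.
ll smallBinom(ll n, ll m, ll p) {
  if (m < 0 || m > n) return 0;
  ll num = 1, den = 1;
  for (ll i = 1; i <= m; i++) {
    num = num * ((n + 1 - i) % p) % p;
    den = den * (i % p) % p;
  }
  return num * powmod(den, p - 2, p) % p;
}

// Lucas: multiply the binomials of matching base-p digits of n and m.
ll lucas(ll n, ll m, ll p) {
  ll res = 1;
  while (n > 0 || m > 0) {
    res = res * smallBinom(n % p, m % p, p) % p;
    n /= p; m /= p;
  }
  return res;
}
```

For p = 2 the digit binomials are 0 exactly when some bit of m is set where the bit of n is clear, which matches the submask statement in 6.2.3: `lucas(6, 2, 2)` is 1 while `lucas(6, 1, 2)` is 0.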
$$S(n, k) = \frac{1}{k!} \sum_{j=0}^{k} (-1)^{k-j} \binom{k}{j} j^n$$

6.3.5 Bell numbers
Total number of partitions of n distinct elements. B(n) = 1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, . . . . For p prime,
$$B(p^m + n) \equiv mB(n) + B(n + 1) \pmod p$$

6.3.6 Fibonacci numbers
$$F_{2n+1} = F_{n+1}^2 + F_n^2, \quad F_{2n} = F_{n+1}^2 - F_{n-1}^2, \quad \sum_{i=1}^{n} F_i = F_{n+2} - 1$$
$$F_{n+i} F_{n+j} - F_n F_{n+i+j} = (-1)^n F_i F_j$$

6.3.7 Labeled unrooted trees
# on n vertices: $n^{n-2}$
# on k existing trees of size $n_i$: $n_1 n_2 \cdots n_k n^{k-2}$
# with degrees $d_i$: $(n - 2)!/((d_1 - 1)! \cdots (d_n - 1)!)$

Description: Calculates shortest paths from s in a graph that might have negative edge weights. Stores the answer in nodes. Unreachable nodes get dist = inf; nodes reachable through negative-weight cycles get dist = -inf. Assumes $V^2 \max |w_i| < \sim 2^{63}$.
Time: $O(VE)$
387d3f, 17 lines
const ll inf = LLONG_MAX;
struct Ed { ll a, b, w, s() { return a < b ? a : -a; }};
struct Node { ll dist = inf, prev = -1; };

void bellmanFord(vector<Node>& nodes, vector<Ed>& eds, ll s) {
  nodes[s].dist = 0;
  sort(ALL(eds), [](Ed a, Ed b) { return a.s() < b.s(); });

  ll lim = SZ(nodes) / 2 + 2; // /3+100 with shuffled vertices
  fore(i,0,lim) for (Ed ed : eds) {
    Node cur = nodes[ed.a], &dest = nodes[ed.b];
    if (abs(cur.dist) == inf) continue;
    ll d = cur.dist + ed.w;
    if (d < dest.dist) dest = {i < lim-1 ? d : -inf, ed.a};
  }
  fore(i,0,lim) for (Ed e : eds)
    if (nodes[e.a].dist == -inf) nodes[e.b].dist = -inf;
}

FloydWarshall.h
Description: Calculates all-pairs shortest path in a directed graph that might have negative edge weights. Input is a distance matrix m, where m[i][j] = inf if i and j are not adjacent. As output, m[i][j] is set to the shortest distance between i and j, inf if no path, or -inf if the path goes through a negative-weight cycle.
Time: $O(N^3)$
7eb90d, 12 lines

enum { ADD, DEL, QUERY };
struct Query { ll type, x, y; }; // You can add stuff for QUERY
struct DynCon {
  vector<Query> q;
  RSUF uf;
  vi mt;
  map<ii, ll> last;
  vector<T> ans;
  DynCon(ll n) : uf(n) {}
  DynCon(vector<D>& d) : uf(d) {}
  void add(ll x, ll y) {
    if (x > y) swap(x, y);
    mt.pb(-1);
    last[{x, y}] = SZ(q);
    q.pb({ADD, x, y});
  }
  void remove(ll x, ll y) {
    if (x > y) swap(x, y);
    ll pr = last[{x, y}];
    mt[pr] = SZ(q);
    mt.pb(pr);
    q.pb({DEL, x, y});
  }
  void query() { // Add parameters if needed
    q.pb({QUERY, -1, -1});
    mt.pb(-1);
  }
  void process() { // Answers all queries in order
    if (q.empty()) return;
    fore(i, 0, SZ(q))
      if (q[i].type == ADD && mt[i] < 0) mt[i] = SZ(q);
    go(0, SZ(q));
  }
  void go(ll s, ll e) {
    if (s + 1 == e) {
      if (q[s].type == QUERY) { // Answer query using DSU
        ans.pb(uf.ans); // Maybe you want to use uf.get(x)
      } // for some x stored in Query
      return;
    }
    ll k = uf.time(), m = (s + e) / 2;
    for (ll i = e; --i >= m;)
      if (0 <= mt[i] && mt[i] < s) uf.join(q[i].x, q[i].y);
    go(s, m);
    uf.rollback(k);
    for (ll i = m; --i >= s;)
      if (mt[i] >= e) uf.join(q[i].x, q[i].y);
    go(m, e);
    uf.rollback(k);
  }
};

7.2 Network flow

PushRelabel.h
Description: Push-relabel using the highest label selection rule and the gap heuristic. Quite fast in practice. To obtain the actual flow, look at positive values only.
Time: $O(V^2 \sqrt{E})$
da4f8b, 48 lines
struct PushRelabel {
  struct Edge {
    ll dest, back;
    ll f, c;
  };
  vector<vector<Edge>> g;
  vi ec;
  vector<Edge*> cur;
  vector<vi> hs; vi H;
  PushRelabel(ll n) : g(n), ec(n), cur(n), hs(2*n), H(n) {}

  void addEdge(ll s, ll t, ll cap, ll rcap=0) {
    if (s == t) return;
    g[s].pb({t, SZ(g[t]), 0, cap});
    g[t].pb({s, SZ(g[s])-1, 0, rcap});
  }

  void addFlow(Edge& e, ll f) {
    Edge &back = g[e.dest][e.back];
    if (!ec[e.dest] && f) hs[H[e.dest]].pb(e.dest);
    e.f += f; e.c -= f; ec[e.dest] += f;
    back.f -= f; back.c += f; ec[back.dest] -= f;
  }
  ll calc(ll s, ll t) {
    ll v = SZ(g); H[s] = v; ec[t] = 1;
    vi co(2*v); co[0] = v-1;
    fore(i,0,v) cur[i] = g[i].data();
    for (Edge& e : g[s]) addFlow(e, e.c);

    for (ll hi = 0;;) {
      while (hs[hi].empty()) if (!hi--) return -ec[s];
      ll u = hs[hi].back(); hs[hi].pop_back();
      while (ec[u] > 0) // discharge u
        if (cur[u] == g[u].data() + SZ(g[u])) {
          H[u] = 1e9;
          for (Edge& e : g[u]) if (e.c && H[u] > H[e.dest]+1)
            H[u] = H[e.dest]+1, cur[u] = &e;
          if (++co[H[u]], !--co[hi] && hi < v)
            fore(i,0,v) if (hi < H[i] && H[i] < v)
              --co[H[i]], H[i] = v + 1;
          hi = H[u];
        } else if (cur[u]->c && H[u] == H[cur[u]->dest]+1)
          addFlow(*cur[u], min(ec[u], cur[u]->c));
        else ++cur[u];
    }
  }
  bool leftOfMinCut(ll a) { return H[a] >= SZ(g); }
};

MinCostMaxFlow.h
Description: Min-cost max-flow. If costs can be negative, call setpi before maxflow, but note that negative cost cycles are not supported. To obtain the actual flow, look at positive values only.
Time: $O(F E \log(V))$ where F is max flow. $O(VE)$ for setpi.
735f9d, 80 lines
#include "ext/pb_ds/priority_queue.hpp"

const ll INF = numeric_limits<ll>::max() / 4;

struct MCMF {
  struct edge {
    ll from, to, rev;
    ll cap, cost, flow;
  };
  ll N;
  vector<vector<edge>> ed;
  vi seen;
  vi dist, pi;
  vector<edge*> par;

  MCMF(ll N) : N(N), ed(N), seen(N), dist(N), pi(N), par(N) {}

  void addEdge(ll from, ll to, ll cap, ll cost) {
    if (from == to) return;
    ed[from].pb(edge{from, to, SZ(ed[to]), cap, cost, 0});
    ed[to].pb(edge{to, from, SZ(ed[from])-1, 0, -cost, 0});
  }

  void path(ll s) {
    fill(ALL(seen), 0);
    fill(ALL(dist), INF);
    dist[s] = 0; ll di;

    __gnu_pbds::priority_queue<ii> q;
    vector<decltype(q)::point_iterator> its(N);
    q.push({ 0, s });

    while (!q.empty()) {
      s = q.top().snd; q.pop();
      seen[s] = 1; di = dist[s] + pi[s];
      for (edge& e : ed[s]) if (!seen[e.to]) {
        ll val = di - pi[e.to] + e.cost;
        if (e.cap - e.flow > 0 && val < dist[e.to]) {
          dist[e.to] = val;
          par[e.to] = &e;
          if (its[e.to] == q.end())
            its[e.to] = q.push({ -dist[e.to], e.to });
          else
            q.modify(its[e.to], { -dist[e.to], e.to });
        }
      }
    }
    fore(i,0,N) pi[i] = min(pi[i] + dist[i], INF);
  }

  ii maxflow(ll s, ll t) {
    ll totflow = 0, totcost = 0;
    while (path(s), seen[t]) {
      ll fl = INF;
      for (edge* x = par[t]; x; x = par[x->from])
        fl = min(fl, x->cap - x->flow);

      totflow += fl;
      for (edge* x = par[t]; x; x = par[x->from]) {
        x->flow += fl;
        ed[x->to][x->rev].flow -= fl;
      }
    }
    fore(i,0,N) for (edge& e : ed[i])
      totcost += e.cost * e.flow;
    return {totflow, totcost/2};
  }

  // If some costs can be negative, call this before maxflow:
  void setpi(ll s) { // (otherwise, leave this out)
    fill(ALL(pi), INF); pi[s] = 0;
    ll it = N, ch = 1; ll v;
    while (ch-- && it--)
      fore(i,0,N) if (pi[i] != INF)
        for (edge& e : ed[i]) if (e.cap)
          if ((v = pi[i] + e.cost) < pi[e.to])
            pi[e.to] = v, ch = 1;
    assert(it >= 0); // negative cost cycle
  }
};

EdmondsKarp.h
Description: Flow algorithm with guaranteed complexity $O(VE^2)$. To get edge flow values, compare capacities before and after, and take the positive values only.
1e8088, 36 lines
template<class T> T edmondsKarp(vector<unordered_map<ll, T>>&
    graph, ll source, ll sink) {
  assert(source != sink);
  T flow = 0;
  vi par(SZ(graph)), q = par;

  for (;;) {
    fill(ALL(par), -1);
    par[source] = 0;
    ll ptr = 1;
    q[0] = source;

    for (ll i = 0; i < ptr; i++) {
      ll x = q[i];
      for (auto e : graph[x]) {
        if (par[e.fst] == -1 && e.snd > 0) {
          par[e.fst] = x;
          q[ptr++] = e.fst;
          if (e.fst == sink) goto out;
        }
      }
    }
    return flow;
out:
    T inc = numeric_limits<T>::max();
    for (ll y = sink; y != source; y = par[y])
      inc = min(inc, graph[par[y]][y]);

    flow += inc;
    for (ll y = sink; y != source; y = par[y]) {
      ll p = par[y];
      if ((graph[p][y] -= inc) <= 0) graph[p].erase(y);
      graph[y][p] += inc;
    }
  }
}
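
To show how the adjacency-map input of EdmondsKarp.h is meant to be filled, here is a small usage sketch. It repeats the edmondsKarp routine with the contest template macros (`vi`, `SZ`, `ALL`, `fst`, `snd`) written out in plain C++, which is an assumption about what those macros expand to; the 4-vertex graph is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <unordered_map>
#include <vector>
using namespace std;
typedef long long ll;

// edmondsKarp from EdmondsKarp.h with the template macros expanded.
// graph[u][v] holds the residual capacity of edge u -> v and is modified in place.
template<class T> T edmondsKarp(vector<unordered_map<ll, T>>& graph,
                                ll source, ll sink) {
  assert(source != sink);
  T flow = 0;
  vector<ll> par(graph.size()), q = par;
  for (;;) {
    // BFS from the source over edges with positive residual capacity.
    fill(par.begin(), par.end(), -1);
    par[source] = 0;
    ll ptr = 1;
    q[0] = source;
    for (ll i = 0; i < ptr; i++) {
      ll x = q[i];
      for (auto e : graph[x]) {
        if (par[e.first] == -1 && e.second > 0) {
          par[e.first] = x;
          q[ptr++] = e.first;
          if (e.first == sink) goto out;
        }
      }
    }
    return flow; // sink unreachable: the flow is maximal
out:
    // Bottleneck along the BFS path, then update residual capacities.
    T inc = numeric_limits<T>::max();
    for (ll y = sink; y != source; y = par[y])
      inc = min(inc, graph[par[y]][y]);
    flow += inc;
    for (ll y = sink; y != source; y = par[y]) {
      ll p = par[y];
      if ((graph[p][y] -= inc) <= 0) graph[p].erase(y);
      graph[y][p] += inc;
    }
  }
}
```

For a graph with capacities 0→1: 3, 0→2: 2, 1→2: 1, 1→3: 2, 2→3: 3, the call `edmondsKarp(g, 0, 3)` returns 5, which matches the cut that separates vertex 0 from the rest. Because the map is updated in place, keep a copy of the original capacities if edge flows are needed afterwards, as the description suggests.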
Dinic.h
Description: Flow algorithm with complexity $O(VE \log U)$ where $U = \max |cap|$. $O(\min(E^{1/2}, V^{2/3})E)$ if U = 1; $O(\sqrt{V} E)$ for bipartite matching.
d44dbc, 42 lines
struct Dinic {
  struct Edge {
    ll to, rev;
    ll c, oc;
    ll flow() { return max(oc - c, 0LL); } // if you need flows
  };
  vi lvl, ptr, q;
  vector<vector<Edge>> adj;
  Dinic(ll n) : lvl(n), ptr(n), q(n), adj(n) {}

  void addEdge(ll a, ll b, ll c, ll rcap = 0) {
    adj[a].pb({b, SZ(adj[b]), c, c});
    adj[b].pb({a, SZ(adj[a]) - 1, rcap, rcap});
  }

  ll dfs(ll v, ll t, ll f) {
    if (v == t || !f) return f;
    for (ll& i = ptr[v]; i < SZ(adj[v]); i++) {
      Edge& e = adj[v][i];
      if (lvl[e.to] == lvl[v] + 1)
        if (ll p = dfs(e.to, t, min(f, e.c))) {
          e.c -= p, adj[e.to][e.rev].c += p;
          return p;
        }
    }
    return 0;
  }
  ll calc(ll s, ll t) {
    ll flow = 0; q[0] = s;
    fore(L,0,31) do { // 'll L=30' maybe faster for random data
      lvl = ptr = vi(SZ(q));
      ll qi = 0, qe = lvl[s] = 1;
      while (qi < qe && !lvl[t]) {
        ll v = q[qi++];
        for (Edge e : adj[v])
          if (!lvl[e.to] && e.c >> (30 - L))
            q[qe++] = e.to, lvl[e.to] = lvl[v] + 1;
      }
      while (ll p = dfs(s, t, LLONG_MAX)) flow += p;
    } while (lvl[t]);
    return flow;
  }
  bool leftOfMinCut(ll a) { return lvl[a] != 0; }
};

MinCut.h
Description: After running max-flow, the left side of a min-cut from s to t is given by all vertices reachable from s, only traversing edges with positive residual capacity.

GlobalMinCut.h
Description: Find a global minimum cut in an undirected graph, as represented by an adjacency matrix.
Time: $O(V^3)$
16cb60, 21 lines
pair<ll, vi> globalMinCut(vector<vi> mat) {
  pair<ll, vi> best = {LLONG_MAX, {}};
  ll n = SZ(mat);
  vector<vi> co(n);
  fore(i,0,n) co[i] = {i};
  fore(ph,1,n) {
    vi w = mat[0];
    size_t s = 0, t = 0;
    fore(it,0,n-ph) { // O(V^2) -> O(E log V) with prio. queue
      w[t] = LLONG_MIN;
      s = t, t = max_element(ALL(w)) - w.begin();
      fore(i,0,n) w[i] += mat[t][i];
    }
    best = min(best, {w[t] - mat[t][t], co[t]});
    co[s].insert(co[s].end(), ALL(co[t]));
    fore(i,0,n) mat[s][i] += mat[t][i];
    fore(i,0,n) mat[i][s] = mat[s][i];
    mat[0][t] = LLONG_MIN;
  }
  return best;
}

GomoryHu.h
Description: Given a list of edges representing an undirected flow graph, returns edges of the Gomory-Hu tree. The max flow between any pair of vertices is given by minimum edge weight along the Gomory-Hu tree path.
Time: $O(V)$ Flow Computations
"PushRelabel.h" a93d73, 13 lines
typedef array<ll, 3> Edge;
vector<Edge> gomoryHu(ll N, vector<Edge> ed) {
  vector<Edge> tree;
  vi par(N);
  fore(i,1,N) {
    PushRelabel D(N); // Dinic also works
    for (Edge t : ed) D.addEdge(t[0], t[1], t[2], t[2]);
    tree.pb({i, par[i], D.calc(i, par[i])});
    fore(j,i+1,N)
      if (par[j] == par[i] && D.leftOfMinCut(j)) par[j] = i;
  }
  return tree;
}

7.3 Matching

hopcroftKarp.h
Description: Fast bipartite matching algorithm. Graph g should be a list of neighbors of the left partition, and btoa should be a vector full of -1’s of the same size as the right partition. Returns the size of the matching. btoa[i] will be the match for vertex i on the right side, or −1 if it’s not matched.
Usage: vi btoa(m, -1); hopcroftKarp(g, btoa);
Time: $O(\sqrt{V} E)$
2bbb99, 42 lines
bool dfs(ll a, ll L, vector<vi>& g, vi& btoa, vi& A, vi& B) {
  if (A[a] != L) return 0;
  A[a] = -1;
  for (ll b : g[a]) if (B[b] == L + 1) {
    B[b] = 0;
    if (btoa[b] == -1 || dfs(btoa[b], L + 1, g, btoa, A, B))
      return btoa[b] = a, 1;
  }
  return 0;
}

ll hopcroftKarp(vector<vi>& g, vi& btoa) {
  ll res = 0;
  vi A(g.size()), B(btoa.size()), cur, next;
  for (;;) {
    fill(ALL(A), 0);
    fill(ALL(B), 0);
    cur.clear();
    for (ll a : btoa) if (a != -1) A[a] = -1;
    fore(a,0,SZ(g)) if (A[a] == 0) cur.pb(a);
    for (ll lay = 1;; lay++) {
      bool islast = 0;
      next.clear();
      for (ll a : cur) for (ll b : g[a]) {
        if (btoa[b] == -1) {
          B[b] = lay;
          islast = 1;
        }
        else if (btoa[b] != a && !B[b]) {
          B[b] = lay;
          next.pb(btoa[b]);
        }
      }
      if (islast) break;
      if (next.empty()) return res;
      for (ll a : next) A[a] = lay;
      cur.swap(next);
    }
    fore(a,0,SZ(g))
      res += dfs(a, 0, g, btoa, A, B);
  }
}

DFSMatching.h
Description: Simple bipartite matching algorithm. Graph g should be a list of neighbors of the left partition, and btoa should be a vector full of -1’s of the same size as the right partition. Returns the size of the matching. btoa[i] will be the match for vertex i on the right side, or −1 if it’s not matched.
Usage: vi btoa(m, -1); dfsMatching(g, btoa);
Time: $O(VE)$
6a75ec, 22 lines
bool find(ll j, vector<vi>& g, vi& btoa, vi& vis) {
  if (btoa[j] == -1) return 1;
  vis[j] = 1; ll di = btoa[j];
  for (ll e : g[di])
    if (!vis[e] && find(e, g, btoa, vis)) {
      btoa[e] = di;
      return 1;
    }
  return 0;
}
ll dfsMatching(vector<vi>& g, vi& btoa) {
  vi vis;
  fore(i,0,SZ(g)) {
    vis.assign(SZ(btoa), 0);
    for (ll j : g[i])
      if (find(j, g, btoa, vis)) {
        btoa[j] = i;
        break;
      }
  }
  return SZ(btoa) - (ll)count(ALL(btoa), -1);
}

MinimumVertexCover.h
Description: Finds a minimum vertex cover in a bipartite graph. The size is the same as the size of a maximum matching, and the complement is a maximum independent set.
"DFSMatching.h" cd3f06, 20 lines
vi cover(vector<vi>& g, ll n, ll m) {
  vi match(m, -1);
  ll res = dfsMatching(g, match);
  vector<bool> lfound(n, true), seen(m);
  for (ll it : match) if (it != -1) lfound[it] = false;
  vi q, cover;
  fore(i,0,n) if (lfound[i]) q.pb(i);
  while (!q.empty()) {
    ll i = q.back(); q.pop_back();
    lfound[i] = 1;
    for (ll e : g[i]) if (!seen[e] && match[e] != -1) {
      seen[e] = true;
      q.pb(match[e]);
    }
  }
  fore(i,0,n) if (!lfound[i]) cover.pb(i);
  fore(i,0,m) if (seen[i]) cover.pb(n+i);
assert(SZ(cover) == res); vi has(M, 1); vector<ii> ret; if (y == at) { // self loop
return cover; fore(it,0,M/2) { nodesComp[at].insert(edgesComp[e] = nComps++);
} fore(i,0,M) if (has[i]) } else if (num[y]) {
fore(j,i+1,M) if (A[i][j] && mat[i][j]) { top = min(top, num[y]);
WeightedMatching.h fi = i; fj = j; goto done; if (num[y] < me) st.pb(e);
Description: Given a weighted bipartite graph, matches every node on the } assert(0); done: } else {
left with a node on the right such that no nodes are in two matchings and the if (fj < N) ret.pb({fi, fj}); ll si = SZ(st), up = dfs(y, e);
sum of the edge weights is minimal. Takes cost[N][M], where cost[i][j] = cost has[fi] = has[fj] = 0; top = min(top, up);
for L[i] to be matched with R[j] and returns (min cost, match), where L[i] is fore(sw,0,2) { if (up == me) {
matched with R[match[i]]. Negate costs for max cost. Requires N ≤ M. ll a = modpow(A[fi][fj], mod-2); st.pb(e); // from si to SZ(st) we have a comp
Time: $O(N^2 M)$
fore(i,0,M) if (has[i] && A[i][fj]) { fore(i, si, SZ(st)) {
178f97, 31 lines ll b = A[i][fj] * a % mod; edgesComp[st[i]] = nComps;
pair<ll, vi> hungarian(const vector<vi> &a) { fore(j,0,M) A[i][j] = (A[i][j] - A[fi][j] * b) % mod; auto [u, v] = edges[st[i]];
if (a.empty()) return {0, {}}; } nodesComp[u].insert(nComps);
ll n = SZ(a) + 1, m = SZ(a[0]) + 1; swap(fi,fj); nodesComp[v].insert(nComps);
vi u(n), v(m), p(m), ans(n - 1); } }
fore(i,1,n) { } nComps++, st.resize(si);
p[0] = i; return ret; } else if (up < me) st.pb(e); // e l s e e i s bridge
ll j0 = 0; // add ”dummy” worker 0 } }
vi dist(m, LLONG_MAX), pre(m, -1); }
vector<bool> done(m + 1);
do { // d i j k s t r a
7.4 DFS algorithms };
return top;
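Stress-test sketch for the WeightedMatching.h contract (cost[N][M], N ≤ M, returns (min cost, match)): a brute force over column permutations, only usable for tiny M. The name bruteAssignment is hypothetical and not part of the notebook; the idea is to compare its cost against hungarian on random small inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <numeric>
#include <utility>
#include <vector>
using ll = long long;
using vi = std::vector<ll>;

// Brute-force reference (hypothetical helper): tries every ordering of the
// M columns and matches row i with the i-th column of the ordering.
// Returns (min cost, match) with row i matched to column match[i].
std::pair<ll, vi> bruteAssignment(const std::vector<vi>& cost) {
  if (cost.empty()) return {0, {}};
  ll n = (ll)cost.size(), m = (ll)cost[0].size();
  assert(n <= m);  // same requirement as the description: N <= M
  vi cols(m);
  std::iota(cols.begin(), cols.end(), 0);
  std::pair<ll, vi> best = {LLONG_MAX, {}};
  do {
    ll sum = 0;
    for (ll i = 0; i < n; i++) sum += cost[i][cols[i]];
    if (sum < best.first) best = {sum, vi(cols.begin(), cols.begin() + n)};
  } while (std::next_permutation(cols.begin(), cols.end()));
  return best;
}
```

Only the cost should be compared when the optimum is not unique, since two correct solvers may return different matchings.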
    ans = Data{};
  }
  void ex(vd& e, vd& a, Data& ne, ll v) {
    ll d = SZ(a); Data b = ne;
    fore(i, 0, d) acc(b, a[i], v, i);
    fill(begin(e), begin(e) + d, b);
    fore(i, 0, d) unacc(e[i], a[i], v, i);
  }
};

RerootLinear.h
Description: Use two operations instead of one to make rerooting linear.
Usually only worth it for non-O(1) operations. Add merge and extend, and
change acc and exclusive. Don't use inheritance.
merge should, given accumulated(p, S) and accumulated(p, T), with S and
T disjoint, return accumulated(p, S ∪ T).
extend should, given the answer for g[p][ei], return b such that
merge(neuts[p], b, p) = accumulated(p, {g[p][ei]}).
Time: Slow O(n)
"Reroot.h" 4cb523, 17 lines

Dilworth's: In a finite poset, the maximum size of an antichain
equals the minimum number of chains needed to partition the
poset.

König's: In a bipartite graph, the number of edges in a
maximum matching equals the number of vertices in a minimum
vertex cover.

Hall's: A bipartite graph (X, Y, E) has an X-saturating
matching iff for all W ⊆ X, |N(W)| ≥ |W|, i.e. it has as many
neighbors as elements.

Geometry (8)

template<class P>
double lineDist(const P& a, const P& b, const P& p) {
  return (double)(b-a).cross(p-a)/(b-a).dist();
}

SegmentDistance.h
Description: Returns the shortest distance between point p and the line
segment from point s to e.
Usage: Point<double> a, b(2,2), p(1,1);
bool onSegment = segDist(a,b,p) < 1e-10;
"Point.h" 5c88f4, 6 lines
typedef Point<double> P;
double segDist(P& s, P& e, P& p) {
  if (s==e) return (p-s).dist();
  auto d = (e-s).dist2(), t = min(d,max(.0,(p-s).dot(e-s)));
  return ((p-s)*d-(e-s)*t).dist()/d;
}
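Self-contained sketch of the segDist formula above. The struct P below is a minimal stand-in written for this example (an assumption, not the notebook's Point.h); it only provides the operations the formula uses. Here t/d is the projection parameter of p onto the segment, clamped to [0, 1]; both sides are kept scaled by d so the only division happens at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Stand-in for Point<double> (assumption): just what segDist needs.
struct P {
  double x, y;
  P operator-(P o) const { return {x - o.x, y - o.y}; }
  P operator*(double k) const { return {x * k, y * k}; }
  bool operator==(P o) const { return x == o.x && y == o.y; }
  double dot(P o) const { return x * o.x + y * o.y; }
  double dist2() const { return x * x + y * y; }
  double dist() const { return std::sqrt(dist2()); }
};

// Distance from p to the segment s-e, same arithmetic as SegmentDistance.h.
double segDist(P s, P e, P p) {
  if (s == e) return (p - s).dist();  // degenerate segment
  double d = (e - s).dist2();
  double t = std::min(d, std::max(0.0, (p - s).dot(e - s)));  // clamp to [0, d]
  return ((p - s) * d - (e - s) * t).dist() / d;
}

// Usage mirroring the "Usage" line: is p on the segment a-b?
bool onSegment(P a, P b, P p) { return segDist(a, b, p) < 1e-10; }
```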
struct PR {
  void ins(ll x) { (a == -1 ? a : b) = x; }
  void rem(ll x) { (a == x ? a : b) = -1; }
  ll cnt() { return (a != -1) + (b != -1); }
  ll a, b;
};

struct F { P3 q; ll a, b, c; };

vector<F> hull3d(const vector<P3>& A) {
  assert(SZ(A) >= 4);
  vector<vector<PR>> E(SZ(A), vector<PR>(SZ(A), {-1, -1}));
#define E(x,y) E[f.x][f.y]
  vector<F> FS;
  auto mf = [&](ll i, ll j, ll k, ll l) {
    P3 q = (A[j] - A[i]).cross((A[k] - A[i]));
    if (q.dot(A[l]) > q.dot(A[i]))
      q = q * -1;
    F f{q, i, j, k};
    E(a,b).ins(k); E(a,c).ins(j); E(b,c).ins(i);
    FS.pb(f);
  };
  fore(i,0,4) fore(j,i+1,4) fore(k,j+1,4)
    mf(i, j, k, 6 - i - j - k);

  fore(i,4,SZ(A)) {
    for (ll j = 0; j < SZ(FS); j++) {
      F f = FS[j];
      if (f.q.dot(A[i]) > f.q.dot(A[f.a])) {
        E(a,b).rem(f.c);
        E(a,c).rem(f.b);
        E(b,c).rem(f.a);
        swap(FS[j--], FS.back());
        FS.pop_back();
      }
    }
    ll nw = SZ(FS);
    fore(j,0,nw) {
      F f = FS[j];
#define C(a, b, c) if (E(a,b).cnt() != 2) mf(f.a, f.b, i, f.c);
      C(a, b, c); C(a, c, b); C(b, c, a);
    }
  }
  for (F& it : FS) if ((A[it.b] - A[it.a]).cross(
    A[it.c] - A[it.a]).dot(it.q) <= 0) swap(it.c, it.b);
  return FS;
};

KMP.h
Description: pi[x] computes the length of the longest prefix of s that ends
at x, other than s[0...x] itself (abacaba -> 0010123). Can be used to find all
occurrences of a string.
Time: O(n)
1bc4d4, 16 lines
vi pi(const string& s) {
  vi p(SZ(s));
  fore(i,1,SZ(s)) {
    ll g = p[i-1];
    while (g && s[i] != s[g]) g = p[g-1];
    p[i] = g + (s[i] == s[g]);
  }
  return p;
}

vi match(const string& s, const string& pat) {
  vi p = pi(pat + '\0' + s), res;
  fore(i,SZ(p)-SZ(s),SZ(p))
    if (p[i] == SZ(pat)) res.pb(i - 2 * SZ(pat));
  return res;
}

Zfunc.h
Description: z[i] computes the length of the longest common prefix of s[i:]
and s, except z[0] = 0. (abacaba -> 0010301)
Time: O(n)
9e784e, 12 lines
vi Z(const string& S) {
  vi z(SZ(S));
  ll l = -1, r = -1;
  fore(i,1,SZ(S)) {
    z[i] = i >= r ? 0 : min(r - i, z[i - l]);
    while (i + z[i] < SZ(S) && S[i + z[i]] == S[z[i]])
      z[i]++;
    if (i + z[i] > r)
      l = i, r = i + z[i];
  }
  return z;
}

Manacher.h
Description: For each position in a string, computes p[0][i] = half length
of longest even palindrome around pos i, p[1][i] = longest odd (half rounded
down).
Time: O(N)
c6bbec, 13 lines
array<vi, 2> manacher(const string& s) {

  ll a=0, N=SZ(s); s += s;
  fore(b,0,N) fore(k,0,N) {
    if (a+k == b || s[a+k] < s[b+k]) {
      b += max(0ll, k-1);
      break;
    }
    if (s[a+k] > s[b+k]) { a = b; break; }
  }
  return a;
}

SuffixArray.h
Description: Builds suffix array for a string. sa[i] is the starting index
of the suffix which is i'th in the sorted suffix array. The returned vector
is of size n + 1, and sa[0] = n. The lcp array contains longest common
prefixes for neighbouring strings in the suffix array: lcp[i] = lcp(sa[i],
sa[i-1]), lcp[0] = 0. rank is the inverse of the suffix array: rank[sa[i]]
= i. lim should be strictly larger than all elements. For larger alphabets use
basic_string<ll> instead of string. The input string must not contain
any zero bytes.
Time: O(n log n)
6f3b67, 17 lines
array<vi, 3> suffixArray(string& s, ll lim = 'z' + 1) {
  ll n = SZ(s) + 1, k = 0, a, b;
  vi rank(ALL(s)+1), y(n), ws(max(n,lim)), sa(n), lcp(n);
  iota(ALL(sa), 0);
  for (ll j = 0, p = 0; p < n; j = max(1ll, j * 2), lim = p) {
    p = j, iota(ALL(y), n - j), fill(ALL(ws), 0);
    fore(i,0,n) if (ws[rank[i]]++, sa[i]>=j) y[p++] = sa[i]-j;
    fore(i, 1, lim) ws[i] += ws[i - 1];
    for (ll i = n; i--;) sa[--ws[rank[y[i]]]] = y[i];
    swap(rank, y), p = 1, rank[sa[0]] = 0;
    fore(i, 1, n) a = sa[i - 1], b = sa[i], rank[b] =
      (y[a] == y[b] && y[a + j] == y[b + j]) ? p - 1 : p++;
  }
  for(ll i = 0, j; i < n - 1; lcp[rank[i++]] = k)
    for(k && k--, j = sa[rank[i] - 1]; s[i+k] == s[j+k]; k++);
  return {sa, lcp, rank};
}

SuffixAutomaton.h
Description: Online algorithm for minimal deterministic finite automaton
that accepts the suffixes of a string s.
Exactly all substrings of s are represented by the states, each state
representing one or more substrings. Let t be the longest string represented
by state v. Then v.len == SZ(t), all strings represented by v only appear in s
as a suffix of t and they are the longest suffixes of t. The rest of the
suffixes of t are found by following the suffix links v.l. p is the state
representing s so terminal states are the ones in the path from p to the root
through suffix links. Also suffix links form the suffix tree of reversed s.
Here you can see the automaton for abcbc:
[Figure: suffix automaton for abcbc]
Complexity is amortized: extend adds 1 or 2 states but can change many
suffix links. Up to 2N states and 3N transitions. For larger alphabets, use
T = ll. For performance consider s.reserve(2*N), and replacing map with
vector or unordered_map.
Time: O(N log K).
ada074, 16 lines
template <class T = char> struct SuffixAutomaton {
  struct State { ll len = 0, l = -1; map<T, ll> t; };
  vector<State> s{1}; ll p = 0;
  void extend(T c) {
    ll k = SZ(s), q; s.pb({s[p].len+1});
    for(;p != -1 && !s[p].t.count(c); p = s[p].l)s[p].t[c] = k;
    if (p == -1) s[k].l = 0;
    else if (s[p].len + 1 == s[q = s[p].t[c]].len) s[k].l = q;
    else {
      s.pb(s[q]), s.back().len = s[p].len + 1;
      for (; p!=-1 && s[p].t[c]==q; p=s[p].l) s[p].t[c] = k+1;
      s[q].l = s[k].l = k+1;
    }
    p = k;
  }
};

SuffixTree.h
Description: Ukkonen's algorithm for online suffix tree construction. Each
node contains indices [l, r) into the string, and a list of child nodes. Suffixes
are given by traversals of this tree, joining [l, r) substrings. The root is 0 (has
l = -1, r = 0), non-existent children are -1. To get a complete tree, append
a dummy symbol – otherwise it may contain an incomplete path (still useful
for substring matching, though).
Time: O(26N)
242d5e, 48 lines
struct SuffixTree {
  static constexpr ll ALPHA = 26; // alphabet size
  ll toi(char c) { return c - 'a'; }
  string a;
  ll N, v = 0, q = 0, m = 2; // v = cur node, q = cur position
  vector<vi> t; // transitions
  vi r, l, p, s; // a[l[i]:r[i]] is substring on edge to i

  void ukkadd(ll i, ll c) {
    if (r[v] <= q)
      if (t[v][c] == -1) return l[t[v][c] = m] = i,
        q = r[v = s[p[m++] = v]], ukkadd(i, c);
      else q = l[v = t[v][c]];
    if (q == -1 || c == toi(a[q])) q++;
    else {
      l[m+1] = i, l[m] = l[v], r[m] = l[v] = q, p[m] = p[v];
      p[t[m][c] = m+1] = p[v] = t[p[m]][toi(a[l[m]])] = m;
      t[m][toi(a[q])] = v, v = s[p[m]], q = l[m];
      while (q < r[m]) v = t[v][toi(a[q])], q += r[v]-l[v];
      if (q == r[m]) s[m] = v;
      else s[m] = m + 2;
      q = r[v]-(q-r[m]), m += 2, ukkadd(i, c);
    }
  }
  SuffixTree(string a) : a(a), N(2*SZ(a)+2), t(N,vi(ALPHA,-1)){
    r = vi(N, SZ(a)), l = p = s = vi(N), t[1] = vi(ALPHA, 0);
    s[0] = 1, l[0] = l[1] = -1, r[0] = r[1] = p[0] = p[1] = 0;
    fore(i,0,SZ(a)) ukkadd(i, toi(a[i]));
  }
  // example: find longest common substring (uses ALPHA = 28)
  ii best;
  ll lcs(ll node, ll i1, ll i2, ll olen) {
    if (l[node] <= i1 && i1 < r[node]) return 1;
    if (l[node] <= i2 && i2 < r[node]) return 2;
    ll mask = 0, len = node ? olen + (r[node] - l[node]) : 0;
    fore(c,0,ALPHA) if (t[node][c] != -1)
      mask |= lcs(t[node][c], i1, i2, len);
    if (mask == 3) best = max(best, {len, r[node] - len});
    return mask;
  }
  static ii LCS(string s, string t) {
    SuffixTree st(s + (char)('z' + 1) + t + (char)('z' + 2));
    st.lcs(0, SZ(s), SZ(s) + 1 + SZ(t), 0);
    return st.best;
  }
};

Hashing.h
Description: Self-explanatory methods for string hashing.
088baa, 48 lines
// Arithmetic mod 2^64-1. 2x slower than mod 2^64 and more
// code, but works on evil test data (e.g. Thue-Morse, where
// ABBA... and BAAB... of length 2^10 hash the same mod 2^64).
// "typedef ull H;" instead if you think test data is random,
// or work mod 10^9+7 if the Birthday paradox is not a problem.
typedef uint64_t ull;
struct H {
  ull x; H(ull x=0) : x(x) {}
  H operator+(H o) { return x + o.x + (x + o.x < x); }
  H operator-(H o) { return *this + ~o.x; }
  H operator*(H o) { auto m = (__uint128_t)x * o.x;
    return H((ull)m) + (ull)(m >> 64); }
  ull get() const { return x + !~x; }
  bool operator==(H o) const { return get() == o.get(); }
  bool operator<(H o) const { return get() < o.get(); }
};
static const H C = (ll)1e11+3; // (order ~ 3e9; random also ok)

struct HashInterval {
  vector<H> ha, pw;
  HashInterval(string& str) : ha(SZ(str)+1), pw(ha) {
    pw[0] = 1;
    fore(i,0,SZ(str))
      ha[i+1] = ha[i] * C + str[i],
      pw[i+1] = pw[i] * C;
  }
  H hashInterval(ll a, ll b) { // hash [a, b)
    return ha[b] - ha[a] * pw[b - a];
  }
};

vector<H> getHashes(string& str, ll length) {
  if (SZ(str) < length) return {};
  H h = 0, pw = 1;
  fore(i,0,length)
    h = h * C + str[i], pw = pw * C;
  vector<H> ret = {h};
  fore(i,length,SZ(str)) {
    ret.pb(h = h * C + str[i] - pw * str[i-length]);
  }
  return ret;
}

H hashString(string& s){H h{}; for(char c:s) h=h*C+c;return h;}

// hashString(s + t) = concat(hashString(s), hashString(t), pw)
// Where pw is C**|t| and can be obtained from a HashInterval
H concat(H h0, H h1, H h1pw) { return h0 * h1pw + h1; }

AhoCorasick.h
Description: Aho-Corasick automaton. It consists of a trie in which each
node except the root has a link to the longest suffix that is also a node in the
trie. That link is used for transitions that are not defined in the trie. Works
with vectors, but for lower case latin strings you can convert the string to vi
and subtract 'a' from each character. Use the function go to get the next state
of the automaton given the current state. Use t[v].leaf to know which strings
end in the state v.
Time: O(N) for constructing, where N is the sum of lengths of the words,
and O(1) for each transition query.
072030, 30 lines
struct AhoCorasick {
  static const ll alpha = 26; // Size of the alphabet
  struct Node {
    array<ll, alpha> next, go; vi leaf; ll p, link, pch;
    Node(ll p = -1, ll pch = -1) : p(p), link(-1), pch(pch) {
      next.fill(-1), go = next;
    }
  };
  vector<Node> t;
  AhoCorasick(vector<vi>& words) : t(1) {
    fore(i, 0, SZ(words)) {
      ll v = 0;
      for (ll c : words[i]) {
        if (t[v].next[c]<0) t[v].next[c]=SZ(t),t.pb(Node(v,c));
        v = t[v].next[c];
      }
      t[v].leaf.pb(i);
    }
  }
  ll getLink(ll v) {
    if (t[v].link < 0) t[v].link = v && t[v].p ?
      go(getLink(t[v].p), t[v].pch) : 0;
    return t[v].link;
  }
  ll go(ll v, ll c) {
    if (t[v].go[c] < 0) t[v].go[c] = t[v].next[c] >= 0 ?
      t[v].next[c] : v ? go(getLink(v),c) : 0;
    return t[v].go[c];
  }
};

Various (10)

10.1 Dates
Dates.h
Description: Convert dates to numbers and vice versa. Days and months
start from 1. 1/1/1 is day number 1721426.
Time: O(1)
a52f52, 14 lines
ll dateToInt(ll y, ll m, ll d) {
  return 1461*(y+4800+(m-14)/12)/4 + 367*(m-2-(m-14)/12*12)/12
    - 3*((y+4900+(m-14)/12)/100)/4 + d - 32075;
}
tuple<ll, ll, ll> intToDate(ll jd) {
  ll x = jd + 68569, n = 4*x/146097;
  x -= (146097*n + 3)/4;
  ll i = (4000*(x + 1))/1461001;
  x -= 1461*i/4 - 31;
  ll j = 80*x/2447, d = x - 2447*j/80;
  x = j/11;
  ll m = j + 2 - 12*x, y = 100*(n - 49) + i + x;
  return {y, m, d};
}

    if (mx.snd == -1) return {};
    cur = mx.fst;
    R.pb(mx.snd);
  }
  return R;
}

  ll L = SZ(res), cur = res.back().snd;
  vi ans(L);
  while (L--) ans[L] = cur, cur = prev[cur];
  return ans;
}
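Quick check of the Dates.h conversion, as a sketch. The two formulas are repeated from Dates.h so the block compiles on its own; daysBetween is a hypothetical helper added only for illustration. Because dates map to consecutive integers, differences of day numbers give day counts directly.

```cpp
#include <cassert>
#include <tuple>
using ll = long long;

// Same formulas as Dates.h, repeated so the sketch is self-contained.
ll dateToInt(ll y, ll m, ll d) {
  return 1461*(y+4800+(m-14)/12)/4 + 367*(m-2-(m-14)/12*12)/12
    - 3*((y+4900+(m-14)/12)/100)/4 + d - 32075;
}
std::tuple<ll, ll, ll> intToDate(ll jd) {
  ll x = jd + 68569, n = 4*x/146097;
  x -= (146097*n + 3)/4;
  ll i = (4000*(x + 1))/1461001;
  x -= 1461*i/4 - 31;
  ll j = 80*x/2447, d = x - 2447*j/80;
  x = j/11;
  ll m = j + 2 - 12*x, y = 100*(n - 49) + i + x;
  return {y, m, d};
}

// Hypothetical helper: whole days from the first date to the second.
ll daysBetween(ll y1, ll m1, ll d1, ll y2, ll m2, ll d2) {
  return dateToInt(y2, m2, d2) - dateToInt(y1, m1, d1);
}
```

For example, dateToInt(1, 1, 1) gives 1721426 as stated in the description, and intToDate maps that number back to (1, 1, 1).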
(or SIGSEGV on gcc 5.4.0 apparently).
• feenableexcept(29); kills the program on NaNs (1),
  0-divs (4), infinities (8) and denormals (16).
• -fsanitize=address -g in compilation to detect some
  memory access errors at run time.
• Several runtime checks:
#define _GLIBCXX_DEBUG 1
#define _GLIBCXX_DEBUG_PEDANTIC 1
#define _GLIBCXX_CONCEPT_CHECKS 1
#define _GLIBCXX_SANITIZE_VECTOR 1

10.6 Optimization tricks
__builtin_ia32_ldmxcsr(40896); disables denormals
(which make floats 20x slower near their minimum value).

10.6.1 Bit hacks
• x & -x is the least bit in x.
• for (int x = m; x; ) { --x &= m; ... } loops
  over all subset masks of m (except m itself).
• c = x&-x, r = x+c; (((r^x) >> 2)/c) | r is the
  next number after x with the same number of bits set.
• fore(b,0,K) fore(i,0,(1 << K))
    if (i & 1 << b) D[i] += D[i^(1 << b)];
  computes all sums of subsets.

10.6.2 Pragmas
• #pragma GCC optimize ("Ofast") will make GCC
  auto-vectorize loops and optimizes floating points better.
• #pragma GCC target ("avx2") can double performance of
  vectorized code, but causes crashes on old machines.
• #pragma GCC optimize ("trapv") kills the program on integer
  overflows (but is really slow).

10.6.3 Other
• cin.exceptions(cin.failbit); will make cin throw on bad
  input.

FastMod.h
Description: Compute a%b about 5 times faster than usual, where b is
constant but not known at compile time. Returns a value congruent to a
(mod b) in the range [0, 2b).
751a02, 8 lines
typedef unsigned long long ull;
struct FastMod {

Randin.h
Description: Fast and secure integer uniform random numbers. For floating
point numbers use uniform_real_distribution.
Time: 2x faster than rand().
1108e7, 7 lines
template <typename T>
T randin(T a, T b) { // Random number in [a, b)
  static random_device rd;
  static mt19937_64 gen(rd());
  uniform_int_distribution<T> dis(a, b - 1);
  return dis(gen);
}

FastInput.h
Description: Read an integer from stdin. Usage requires your program to
pipe in input from file.
Usage: ./a.out < input.txt
Time: About 5x as fast as cin/scanf.
d42732, 17 lines
inline char gc() { // like getchar()
  static char buf[1 << 16];
  static size_t bc, be;
  if (bc >= be) {
    buf[0] = 0, bc = 0;
    be = fread(buf, 1, sizeof(buf), stdin);
  }
  return buf[bc++]; // returns 0 on EOF
}

ll readInt() {
  ll a, c;
  while ((a = gc()) < 40);
  if (a == '-') return -readInt();
  while ((c = gc()) >= 48) a = a * 10 + c - 480;
  return a - 48;
}

BumpAllocator.h
Description: When you need to dynamically allocate many objects and
don't care about freeing them. "new X" otherwise has an overhead of
something like 0.05us + 16 bytes per allocation.
d23557, 8 lines
// Either globally or in a single class:
static char buf[900 << 20];
void* operator new(size_t s) {
  static size_t i = sizeof buf;
  assert(s < i);
  return (void*)&buf[i -= s];
}
void operator delete(void*) {}

SmallPtr.h
Description: A 32-bit pointer that points into BumpAllocator memory.
"BumpAllocator.h" 2dd6c9, 10 lines
template<class T> struct ptr {
  unsigned ind;
  ptr(T* p = 0) : ind(p ? unsigned((char*)p - buf) : 0) {
    assert(ind < sizeof buf);
  }
  T& operator*() const { return *(T*)(buf + ind); }
  T* operator->() const { return &**this; }
  T& operator[](int a) const { return (&**this)[a]; }

char buf[450 << 20] alignas(16);
size_t buf_ind = sizeof buf;

template<class T> struct small {
  typedef T value_type;
  small() {}
  template<class U> small(const U&) {}
  T* allocate(size_t n) {
    buf_ind -= n * sizeof(T);
    buf_ind &= 0 - alignof(T);
    return (T*)(buf + buf_ind);
  }
  void deallocate(T*, size_t) {}
};

SIMD.h
Description: Cheat sheet of SSE/AVX intrinsics, for doing arithmetic
on several numbers at once. Can provide a constant factor improvement
of about 4, orthogonal to loop unrolling. Operations follow the pattern
"_mm(256)?_name_(si(128|256)|epi(8|16|32|64)|pd|ps)". Not all
are described here; grep for _mm_ in /usr/lib/gcc/*/4.9/include/ for
more. If AVX is unsupported, try 128-bit operations, "emmintrin.h" and
#define __SSE__ and __MMX__ before including it. For aligned memory use
_mm_malloc(size, 32) or int buf[N] alignas(32), but prefer loadu/storeu.
d1ecdc, 43 lines
#pragma GCC target ("avx2") // or sse4.1
#include "immintrin.h"

typedef __m256i mi;
#define L(x) _mm256_loadu_si256((mi*)&(x))

// High-level/specific methods:
// load(u)?_si256, store(u)?_si256, setzero_si256, _mm_malloc
// blendv_(epi8|ps|pd) (z?y:x), movemask_epi8 (hibits of bytes)
// i32gather_epi32(addr, x, 4): map addr[] over 32-b parts of x
// sad_epu8: sum of absolute differences of u8, outputs 4xi64
// maddubs_epi16: dot product of unsigned i7's, outputs 16xi15
// madd_epi16: dot product of signed i16's, outputs 8xi32
// extractf128_si256(, i) (256->128), cvtsi128_si32 (128->lo32)
// permute2f128_si256(x,x,1) swaps 128-bit lanes
// shuffle_epi32(x, 3*64+2*16+1*4+0) == x for each lane
// shuffle_epi8(x, y) takes a vector instead of an imm

// Methods that work with most data types (append e.g. _epi32):
// set1, blend (i8?x:y), add, adds (sat.), mullo, sub, and/or,
// andnot, abs, min, max, sign(1,x), cmp(gt|eq), unpack(lo|hi)

int sumi32(mi m) { union {int v[8]; mi m;} u; u.m = m;
  int ret = 0; fore(i,0,8) ret += u.v[i]; return ret; }
mi zero() { return _mm256_setzero_si256(); }
mi one() { return _mm256_set1_epi32(-1); }
bool all_zero(mi m) { return _mm256_testz_si256(m, m); }
bool all_one(mi m) { return _mm256_testc_si256(m, one()); }

int example_filteredDotProduct(int n, short* a, short* b) {
int i = 0; int r = 0;
mi zero = _mm256_setzero_si256(), acc = zero;
while (i + 16 <= n) {
mi va = L(a[i]), vb = L(b[i]); i += 16;
va = _mm256_and_si256(_mm256_cmpgt_epi16(vb, va), va);
mi vp = _mm256_madd_epi16(va, vb);
acc = _mm256_add_epi64(_mm256_unpacklo_epi32(vp, zero),
_mm256_add_epi64(acc, _mm256_unpackhi_epi32(vp, zero)));
}
union {ll v[4]; mi m;} u; u.m=acc; fore(i,0,4) r += u.v[i]; // acc holds 4 x i64
for (;i<n;++i) if (a[i] < b[i]) r += a[i]*b[i]; // <- equiv
return r;
}
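The bit hacks from section 10.6.1 can be checked in isolation with a small sketch like the following. The function names are hypothetical wrappers chosen for this example; `x = (x - 1) & m` is the same update as the `--x &= m` idiom, written out to keep the order of evaluation obvious.

```cpp
#include <cassert>
#include <vector>
using ll = long long;

// Lowest set bit of x (x > 0), e.g. 12 (1100b) -> 4.
ll lowBit(ll x) { return x & -x; }

// All submasks of m except m itself, in decreasing order, ending with 0.
std::vector<ll> properSubmasks(ll m) {
  std::vector<ll> out;
  for (ll x = m; x; ) {
    x = (x - 1) & m;  // same step as `--x &= m`
    out.push_back(x);
  }
  return out;
}

// Next larger number with the same number of set bits as x (x > 0).
ll nextSamePopcount(ll x) {
  ll c = x & -x, r = x + c;
  return (((r ^ x) >> 2) / c) | r;
}

// In place: afterwards D[i] is the sum of the old D[j] over all j that are
// submasks of i. D must have exactly 1 << K entries.
void sumOverSubsets(std::vector<ll>& D, int K) {
  for (int b = 0; b < K; b++)
    for (int i = 0; i < (1 << K); i++)
      if (i & (1 << b)) D[i] += D[i ^ (1 << b)];
}
```

With m = 5 the submask loop visits 4, 1, 0, and sumOverSubsets turns {1, 2, 3, 4} into {1, 3, 4, 10} for K = 2.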