Algorithm Design Laboratory with Applications

Stefano Leucci
Academic Year 2024/2025

Schedule

Office hours: Thursday 14:30 - 16:30. Please send me an email or ask before/after the lectures.

Exam Dates

Lectures and Material

Introduction and Prefix Sums

Course overview.
Introduction to laboratory exercises: exercises format, writing, compiling, evaluating, and debugging a solution. Programming tips, assertions.
The STL library: overview, basic types, containers (arrays, vectors, deques, queues, stacks, (muti-)sets, priority queues), iterators, algorithms: (heaps and sorting, linear search, binary search, deleting and replacing elements).
The prefix sums technique. Counting the number of contiguous subsequences with even total sum: cubic-, quadratic-, and linear-time solutions.

Material

Sorting, Binary Searching, and Sliding Window

Sorting and binary searching with predicates. Computing good upper bounds using exponential search.
The sliding window technique. Examples with cubic-, quadratic-, almost linear-, and linear-time solutions.

Material

Greedy algorithms

Greedy algorithms. Interval scheduling: problem definition, the earliest finish time algorithm, proof of correctness (greedy stays ahead).
Interval partitioning: problem definition, greedy algorithm, proof of correctness (using structural properties).
Scheduling jobs with deadlines to minimize lateness. The Earliest Deadline First algorithm. Proof of correctness through an exchange argument.

Material

Divide and Conquer, Memoization, and Dynamic Programming

The divide and conquer technique and the polynomial multiplication problem.
Recursion and memoization: computing Fibonacci numbers recursively in linear-time.
Introduction to dynamic programming. A trivial example: computing Fibonacci numbers iteratively. The Longest Increasing Subsequence problem: a O(n^2)-time algorithm. The segmented least squares problem: a dynamic programming algorithm running in time O(n^2).

Material

More Dynamic Programming

Maximum-weight independent set on paths (linear-time algorithm), Maximum-weight independent set on trees (linear-time algorithm), Maximum weight independent set on trees with cardinality constraints. Counting the number of ways to distribute budget with stars and bars. Dynamic programming algorithm to optimally distribute budget.

The minimum edit distance problem: definition, quadratic algorithm, techniques for reconstructing optimal solutions from the dynamic programming table.

The Knapsack problem: a dynamic programming algorithm with polynomial running time in the number of items and in the maximum weight. A dynamic programming algorithm with polynomial running time in the number of items and in the optimal value.

Material

The Split and List Technique, the 2-SAT problem

The Subset-Sum problem: definition and a dynamic programming algorithm. Pseudopolynomial-time algorithms. The split and list technique: Improving the trivial O*(2^n)-time algorithm for subset-sum to a O*(2^(n/2))-time algorithm. Techniques for generating all subsets sums of a set, a trivial O(n2^n) algorithm, a O(2^n) algorithm.

A split and list algorithm for the Knapsack problem.

The 1-in-3 Positive SAT problem: definition, improving the trivial O*(2^n)-time algorithm to a O*(2^(n/2))-time split and list algorithm.

The 2-SAT problem: the implication graph, strongly connected components, and topological sorting. Relation between strongly connected components and 2-SAT.

Material

Tarjan's algorithm for SCCs. Introduciton to the Boost Graph Library

Tarjan's algorithm for computing Strongly Connected Components, proof of correctness and analysis of its running time.

The Boost Graph Library: fundamentals, graph representations, vertex and edge descriptors, iterators, internal property maps, external property maps.

Algorithms in BGL: topological sorting, connected components, strongly connected components, named parameters, solving 2-SAT with BGL, single source distances (in DAGS, using Dijkstra algorithm, using Bellman-Ford algorithm), all to all distances (Floyd-Warshall algorithm), minimum spanning trees (using Kruskal algorithm, using Prim algorithm).

Visits in BGL: the BFS visitor, the DFS visitor.

Material

Applications of Network Flows

The maximum flow problem: definition, linear programming formulation, augmenting paths and residual graphs. Flow algorithms (without analysis): Ford-Fulkerson, Edmonds-Karp, and Push-Relabel. Handing multiple sources, vertex capacities, undirected graphs, and minimum capacities.

Network flow in BGL: graph representation, invoking Edmonds-Karp and Push-Relabel.

Flow applications: the minimum s-t cut problem and the max-flow min-cut theorem. Finding the maximum number of edge-disjoint paths, the circulation problem, the maximum bipartite matching problem and Kőnig's theorem, image segmentation.

Min-cost max-flow and applications. Minimum-cost bipartite matching. Min-cost max-flow in BGL: using the Cycle Canceling algorithm, and the Successive Shortest Paths algorithm.

Material

Range Trees

A data structure for orthogonal range searching: Range trees. Problem definition. Solution for the 1-dimensional case using arrays and binary search. Solution for the 1-dimensional case using binary trees. Solution for the 2-dimensional case. Reducing the construction time. Solution for the D-dimensional case for D>2.

Fractional cascading: problem definition, naive solution. Two ideas: cross-linking and fractional cascading. Using cross-linking to improve the query time for 2-dimensional range trees. Extension to D>2.

Material

Oracles for Lowest Common Acestors and Range Minimum Queries

Designing an oracle for Lowest Common Ancestor (LCA) queries: problem definition, trivial solutions, Euler tours and relation to oracles for Range Minimum queries (RMQ).

Designing an oracle for RMQ: problem definition, trivial solutions, the "Sparse Table" oracle, improving the sparse table oracle using blocks. The ±1 RMQ problem. An oracle with optimal size and query time for ±1 RMQ using blocks and block types.

Reducing the problem of designing an oracle for general RMQ to the problem of designing an oracle for LCA queries. Cartesian trees: definition, constructing a Cartesian tree in linear time. Relation between LCA queries in Cartesian trees and general RMQ. An oracle with optimal size and query time for general RMQ.

A linear-size oracle for finding all distinct elements in a range in time proportional to the number of reported elements.

Material

Oracles for Level Ancestor Queries. The String Matching Problem

Designing an oracle for Level Ancestor queries, problem definition and trivial solutions. A first non-trivial oracle using jump pointers.

The long path decomposition of a tree: an oracle using the long path decomposition, tight analysis of the number of paths of the decomposition encountered in a root-to-leaf path and relation with the oracle's query time. Extending the paths of the decomposition into ladders, combining ladders with jump pointers.

The micro-macro tree decomposition, handling the macro-tree, handling micro-trees and bounding the number of distinct micro-tree types, combining the previous oracles into a new oracle with optimal space, preprocessing time, and query time.

Finding occurrences of a pattern in a text and the Rabin-Karp algorithm.

Material

Data structures for String Matching: Tries and Suffix Trees

The problem of maintaining a (possibly dynamic) collection of strings over a common alphabet and the trie data structure. Implementing the find and predecessor operations. Storing trie vertices: dense arrays, sparse arrays and binary search trees, and weight-balanced binary search trees. Improving the time per operation by using indirection. Compressed tries.

Application of tries: sorting strings, and packet routing.

Suffix trees and their applications: string matching, finding the longest prefix that is common to at least two strings, longest repeated substring, finding additional occurrences of a substring, document retrieval. Constructing suffix trees: suffix arrays and longest common prefix (LCP) arrays, constructing a suffix tree in linear time from the Suffix and LCP arrays.

Material

Laboratory Problems

The online judge is accessible at https://judge.stefanoleucci.com.

References