Parallel Depth-First Search for Directed Acyclic Graphs

Depth-first search (DFS) underpins many graph algorithms, yet its inherently sequential nature complicates parallelization. We implemented a work-efficient GPU traversal for directed acyclic graphs (DAGs) that explores independent frontier vertices concurrently. On large graphs it delivers significant speedups over the serial baseline, enabling faster topological analyses.
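
To illustrate what "independent frontier vertices" can mean in a DAG, the sketch below groups vertices into frontiers by repeatedly peeling off vertices whose predecessors have all been processed (a Kahn-style sweep). This is not the GPU implementation summarized above, and it does not produce a DFS ordering; it is a sequential Python illustration of the frontier idea only. The function name `frontier_levels` and the edge-list input format are assumptions made for this example.

```python
from collections import defaultdict


def frontier_levels(num_vertices, edges):
    """Group DAG vertices into frontiers of mutually independent vertices.

    Hypothetical helper for illustration: `edges` is a list of (u, v)
    pairs meaning u -> v, with vertices numbered 0..num_vertices-1.
    """
    children = defaultdict(list)
    indegree = [0] * num_vertices
    for u, v in edges:
        children[u].append(v)
        indegree[v] += 1

    # The first frontier holds every vertex with no incoming edge.
    frontier = [v for v in range(num_vertices) if indegree[v] == 0]
    levels = []
    visited = 0

    while frontier:
        levels.append(frontier)
        visited += len(frontier)
        next_frontier = []
        # No vertex in a frontier is reachable from another vertex in the
        # same frontier, so a parallel version could give each vertex its
        # own thread and replace the decrement below with an atomic one.
        for u in frontier:
            for v in children[u]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    next_frontier.append(v)
        frontier = next_frontier

    if visited != num_vertices:
        raise ValueError("graph contains a cycle")
    return levels


if __name__ == "__main__":
    # Diamond-shaped DAG: 0 -> {1, 2} -> 3 -> 4
    print(frontier_levels(5, [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]))
```

Each inner list is a set of vertices that could be expanded concurrently. The total work is proportional to the number of vertices plus edges, the same as a serial traversal, which is the sense in which a frontier sweep of this kind can be work-efficient.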
