The Depth-First Search (DFS) algorithm is a fundamental building block used in many higher level applications, such as topological sort and connectivity and planarity testing of graphs. We'll briefly review prior results and propose two novel variations of parallel DFS on DAGs. The first traverses the graph three times in a breadth-first search-like fashion. The second assigns a weight to each edge, such that the shortest path from root to a node corresponds to the DFS path. The parallel algorithm visits all nodes in the graph multiple times and as a result computes the DFS parent relationship, pre- (discovery) and post-order (finish) time for every node. In some cases, the parallel DFS on GPU can outperform sequential DFS on CPU by up to 6x. However, the performance of the algorithm depends highly on the structure of the graph, and is related to the length of the longest path and the degree of nodes in the graph.