Jake's CS Notes - Data Structures / Algorithms

//*******************************************************************//
//*************Minimum Spanning Trees - April 17th, 2017************//
//*****************************************************************//

-The room, as we walked in, was strangely empty...presumably the Calc 2 class was not here. I will assume they were eaten by bearsharks.
----------------------------------------------------------

-Another graphs algorithm we're looking at: Minimum Spanning Trees!

-Some terms to know first:
    -A SPANNING SUBGRAPH is a subgraph of "G" that contains all the vertices of G
    -A SPANNING TREE is a subtree that is itself a "free" tree

-A MINIMUM SPANNING TREE (MST) is where we're trying to connect a set of nodes with the minimum number of edges, OR, as in this class, the minimum possible amount of edge weight
    -Why is this useful? Fastest possible road route, wireless connections, etc.
-It partially relies on two properties
    -The CYCLE PROPERTY states that no MST has a cycle/loop, as if a cycle exists in the tree, we can always remove an edge and still have all the nodes connected to each other
        -Plus, it's not really a "tree" if it has a cycle anyway, right?
    -The PARTITION PROPERTY says that if we have arbitrarily split the graph's vertices into 2 groups "U" and "V," then an MST must contain an edge "e" that connects the two parts
        -If there are 2 edges "e" that connect the two parts, and they ahve equal weights, we can choose either one and remove the other
        -Because of this, occasionally a graph CAN have 2 or more MSTs

    -There are two algorithms we'll teach that can be used to find an MST: The Prim-Jarnik Algorithm, and
        -The PRIM-JARNIK algorithm (more commonly, just PRIM'S algorithm) is similar to Dijkstra's algorithm, and relies heavily on the partition property
            -We start at an arbitrary node; for each adjacent node, we record the lowest-weight edge to get from the current node to the node
            -At each step, we then visit the closest adjacent node / node with the smallest weight that we haven't yet visited, add the new adjacent vertices (via BFS), and update the closest edge for each known node (e.g. if we find a smaller-weight edge to an exisiting node, we remove the higher-weight edge)
        -The difference from Dijkstra's is that we don't have to add up the cumulative weights of the paths

        -KRUSKAL's algorithm (didn't get great notes for this...), on the other hand, is based on the cycle property
            -"How do we know if we have a cycle?" We use something called a "disjoint set," which is a little counter-intuitive, but it works
                -We start by saying every node has a "disjoint set" with a root of itself (p.s. the disjoint set itself is a binary tree); when we add an edge, we then look at each node's "disjoint set"
                    -If they don't have the same root, then we combine their disjoint sets by adding them together
                    -If they DO have the same root, then adding their disjoint sets together would create a loop, so we DO NOT create that edge