//****************************************************************************//
//**************** Load Balancing (cont.) - April 21st, 2020 ****************//
//**************************************************************************//

- Currently, if I assign grades from the raw scores, there are 3 people in the A range, 14 in the B range, and 6 in the C range
    - I'd tried to give enough extra credit to avoid having to use a class-wide curve, but given the extraordinary circumstances with the coronavirus, I do think some extra leniency is warranted
    - Because of that, I'm considering a "letter-sign shift" in the grade assignments, where a B+ becomes an A-, an A becomes an A+, etc.
        - This isn't a huge shift - just ~3.3% - but hopefully it should give us a more normal-ish looking curve
    - In addition, it doesn't seem fair for the final to count for a full 30% of the course when I've been less available for office hours since the break, so I'm going to have the final work like this:
        - If the final raises your grade, you can count it as 30%
        - OTHERWISE, it will only count for 15% of your grade (meaning your total course grade will sum to 85%, rather than 100%; see the worked example below)
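    - To make that rule concrete, here's one reading of it in code (my own sketch, not official policy; it assumes the non-final coursework carries the remaining 70%):

            def course_grade(non_final_pct, final_pct):
                # non_final_pct: weighted average of everything except the final (0-100)
                # final_pct: final exam score (0-100)
                with_final = 0.70 * non_final_pct + 0.30 * final_pct          # final at 30%
                without = (0.70 * non_final_pct + 0.15 * final_pct) / 0.85    # final at 15%, out of 85
                return max(with_final, without)  # whichever helps you more

            print(course_grade(90, 60))  # weak final -> the 15% option wins (~84.7 vs 81.0)
            print(course_grade(80, 95))  # strong final -> the 30% option wins (84.5 vs ~82.6)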

- Programming Project 3 is due tonight, so PLEASE make sure you're working on that!

- Homework 4, for extra credit, is still out, and is also due tonight
    - For problem 2, you can think of this Toeplitz matrix as being like a polynomial multiplication problem, like with FFT (see the sketch below)
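    - The connection, in a sketch of my own (not the homework solution): a Toeplitz matrix-vector product is a convolution, so you can embed the matrix in a circulant one and multiply via the FFT in O(n log n):

            import numpy as np

            def toeplitz_matvec(c, r, x):
                # c: first column of the Toeplitz matrix, r: first row (r[0] == c[0]),
                # x: the vector to multiply
                n = len(x)
                # Embed the Toeplitz matrix in a circulant matrix of size 2n-1:
                # its first column is c followed by the reversed tail of r.
                col = np.concatenate([c, r[:0:-1]])
                # A circulant matvec is a circular convolution, which the FFT diagonalizes.
                y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(x, len(col)))
                return y[:n].real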

- "My philosophy is that turning something in late is better than turning in nothing at all, so if you need extra time reach out to me 1-on-1 and, while you probably won't get full credit, something can be arranged"
--------------------------------------------------------------------------------

- Last time, we were talking about LOAD BALANCING
    - One prominent feature of physical simulations is that data tends to have some sort of locality/spatial relations, and the density of the data at points can shift over time
        - Think of a wind tunnel simulation, where the air can shift around and move
    - We had a STATIC load balancing problem, where we wanted to minimize computational imbalances, and minimize communication across cuts - and these 2 goals are usually opposed!
        - Typically, if we end up with cuts that cross a small number of edges, it means only a small amount of data has to be communicated between the partitions
    - The DYNAMIC load balancing problem, then, also tries to minimize the movement of data from the old partitions to the new partitions we choose
        - We tend to prefer incremental algorithms, where small changes in the input result in only gradual changes to the partitions (so little data has to move from the old partition to the new one)

- So, those are our problems; let's take a look at some solutions!
    - There are 2 types of models for these problems:
        - GEOMETRIC models are where coordinates for our data items are available, and are great for problems where the data is in an actual physical space
        - COMBINATORIAL models are more abstract for non-physical problems, and are where we don't have any physical locations but we DO have connectivity information for our data items

- Let's look at some geometric solutions!
    - The RECURSIVE COORDINATE BISECTION (RCB) algorithm divides the work into 2 equal parts by cutting it with a flat line/plane, then recursively divides each of THOSE portions to end up with an equal amount of work, alternating the direction we cut (a code sketch appears after the pros/cons below)
        - Why alternate the dimensions (i.e. horizontal, then vertical, instead of just all horizontal)? Because of a ratio: Surface Area to Volume!
            - By alternating cuts, we're keeping our subdomains as close to square as possible, trying to minimize the size of our cuts
        - This algorithm is hierarchical, and is IMPLICITLY incremental with the cuts changing gradually to match the data, making it a good fit for dynamic load-balancing problems
        - So:
            - Pros:
                - Fairly simple and efficient to implement
                - Regular subdomains
                - Doesn't need connectivity information
            - Cons:
                - No explicit control of communication costs (since we don't use any connectivity information)
                - Can generate disconnected domains
                    - Basically, the partitions it generates aren't guaranteed to line up naturally with the problem space, which can make the subsequent computation more awkward/inefficient
                - Partition quality can be meh
                - Needs geometric coordinates
        - Still, though, this algorithm is commonly used in particle simulations, volume rendering, crash simulations, etc.
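        - To make RCB concrete, here's a minimal NumPy sketch of my own (not from lecture; it assumes equal-weight points and a power-of-2 partition count):

                import numpy as np

                def rcb(points, n_parts, idx=None, depth=0):
                    # points: (n, d) array of coordinates; n_parts: a power of 2.
                    # Returns a list of index arrays, one per partition.
                    if idx is None:
                        idx = np.arange(len(points))
                    if n_parts == 1:
                        return [idx]
                    dim = depth % points.shape[1]              # alternate the cut dimension
                    order = idx[np.argsort(points[idx, dim])]  # sort along that dimension
                    mid = len(order) // 2                      # median cut -> equal work per side
                    return (rcb(points, n_parts // 2, order[:mid], depth + 1)
                            + rcb(points, n_parts // 2, order[mid:], depth + 1))

                # e.g. rcb(np.random.rand(1000, 2), 8) -> 8 index arrays of ~125 points each
                # (with weighted points, you'd cut at the weighted median instead)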
    - Another approach: the SPACE-FILLING CURVE PARTITION (SFC)
        - The idea here is that we have a space-filling curve that completely fills a domain, and we can apply it recursively to obtain as much granularity as we need
            - Given a coordinate, we can find its partition quickly, and vice-versa
        - We can use this for partitioning as follows (sketched in code after this list):
            - Run a space-filling curve on a grid containing all our data points
            - Order objects based on their 1D position along the curve
            - Evenly split the data along that 1D curve (i.e. so the points are evenly distributed)
        - Doing this gives us non-rectangular but still pretty compact sub-domains, and figuring out where to divide is SUPER simple once we have this space-filling curve function (like the Hilbert curve)
            - We can make this algorithm incremental, too, by taking a given grid square and recursively splitting it wherever we need finer granularity!
        - So:
            - Pros:
                - Simple and fast
                - Maintains geometric locality (good for dynamic balancing)
                - Linear ordering of objects can give great cache performance
                    - Partitioning and cache-blocking are distinct, but definitely related
            - Cons:
                - No explicit control of communication costs (again, it's wholly determined by the data's position)
                - Can still generate disconnected subdomains/partitions
                - Partitions are often lower quality than RCB's
                - Still needs geometric coordinates
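        - Here's a small sketch of SFC partitioning (my own code, not from lecture); it uses the Morton/Z-order curve since its key is just a bit interleave, though the Hilbert curve has better locality:

                import numpy as np

                def morton_key(ix, iy, bits=16):
                    # Interleave the bits of integer grid coords -> 1D position on the Z-curve.
                    key = 0
                    for b in range(bits):
                        key |= ((ix >> b) & 1) << (2 * b)
                        key |= ((iy >> b) & 1) << (2 * b + 1)
                    return key

                def sfc_partition(points, n_parts, bits=16):
                    # Quantize 2D points onto a 2^bits x 2^bits grid, order them along
                    # the curve, then split the ordered list evenly among the partitions.
                    lo, hi = points.min(axis=0), points.max(axis=0)
                    grid = ((points - lo) / (hi - lo + 1e-12) * (2**bits - 1)).astype(np.int64)
                    keys = [morton_key(x, y, bits) for x, y in grid]
                    order = np.argsort(keys)
                    return np.array_split(order, n_parts)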

- Then, for combinatorial solutions:
    - First up, MULTI-LEVEL GRAPH PARTITIONING!
        - The idea here is that we repeatedly construct smaller approximations of our graph until we reach one we know how to partition easily, partition that easier problem, and then scale the result back up to the original graph (making slight refinements as we scale up to keep the partition good); a toy sketch follows this list
        - This multi-level approach has a BUNCH of sub-approaches; the default approach is okay for static partitioning, but for dynamic partitioning some approaches use a "diffusive strategy" to take some information from the original graph into account, keeping the changes incremental
        - So:
            - Pros:
                - High-quality partitions known for MANY types of graphs and applications (e.g. highly optimized engineering models, which take edges into account)
                - Can control communication costs
                - Widely used for static partitioning
            - Cons:
                - More expensive than geometric approaches
                - Difficult to make incremental
        - So, what are some applications of graph partitioning? There's tons!
            - Finite element analysis
            - "Multi-physics" simulations (rarely need to rebalance, but need high-quality partitions to start)
            - Linear solvers
                - The matrices here are square and structurally symmetric, and decomposing them based on the mesh indices is a natural fit
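        - A toy sketch of the multi-level idea, written by me (real tools like METIS add edge/vertex weights, heavy-edge matching, and a refinement pass at every level):

                import random

                def coarsen(adj):
                    # One coarsening level via random edge matching.
                    # adj: dict node -> set of neighbors (no self-loops).
                    # Returns (coarse adjacency, fine-node -> coarse-node map).
                    matched, cmap, nxt = set(), {}, 0
                    nodes = list(adj)
                    random.shuffle(nodes)
                    for u in nodes:
                        if u in matched:
                            continue
                        partner = next((v for v in adj[u] if v not in matched), None)
                        matched.add(u); cmap[u] = nxt
                        if partner is not None:
                            matched.add(partner); cmap[partner] = nxt
                        nxt += 1
                    cadj = {i: set() for i in range(nxt)}
                    for u in adj:
                        for v in adj[u]:
                            if cmap[u] != cmap[v]:
                                cadj[cmap[u]].add(cmap[v])
                                cadj[cmap[v]].add(cmap[u])
                    return cadj, cmap

                def multilevel_bisect(adj, coarse_size=8):
                    # Coarsen until the graph is tiny, split the tiny graph in half,
                    # then project the assignment back up to the original graph.
                    if len(adj) <= coarse_size:
                        nodes = list(adj)
                        return {u: int(i >= len(nodes) // 2) for i, u in enumerate(nodes)}
                    cadj, cmap = coarsen(adj)
                    if len(cadj) == len(adj):  # nothing matched; give up on coarsening
                        nodes = list(adj)
                        return {u: int(i >= len(nodes) // 2) for i, u in enumerate(nodes)}
                    cpart = multilevel_bisect(cadj, coarse_size)
                    # (a real implementation would refine here, e.g. Kernighan-Lin/FM)
                    return {u: cpart[cmap[u]] for u in adj}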

- This graph-based approach to load balancing is very popular in scientific computing, but it has a flaw that becomes evident in non-spatial applications: it assumes the number of edge cuts is the same as the communication volume
    - In reality, this ISN'T always true! Some edges get used much more than others (the toy example below makes the mismatch concrete)
    - Graph models also tend to assume a symmetric, square structure (like the square, structurally symmetric matrices above), which not every application has
    - So, is the graph-based model good enough?
        - In mesh-based applications, probably! It works well in practice, and the structure of meshes guarantees low vertex degrees and decent partitions
        - For irregular networks, no, it doesn't; the edge-cut count can badly mis-estimate the actual communication costs
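    - A tiny made-up example of the mismatch: count the cut edges vs. the words actually sent (a boundary vertex sends its data once per remote partition, not once per cut edge):

            def cut_and_volume(adj, part):
                # adj: dict node -> set of neighbors; part: dict node -> partition id
                cut = sum(1 for u in adj for v in adj[u] if u < v and part[u] != part[v])
                # Communication volume: one word per (vertex, remote partition) pair
                volume = len({(u, part[v]) for u in adj for v in adj[u] if part[u] != part[v]})
                return cut, volume

            # A hub vertex 0 in partition 0 with three neighbors in partition 1:
            adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
            part = {0: 0, 1: 1, 2: 1, 3: 1}
            print(cut_and_volume(adj, part))  # -> (3, 4): 3 cut edges, but vertex 0's data crosses once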

- If you want to play around with some of these approaches, a good resource is the Zoltan Toolkit (which was partly developed at Georgia Tech!), which lets you write your code once and freely switch between these algorithms

- So, to compensate for the problems of the graph model, some researchers have tried to devise the HYPERGRAPH MODEL
    - We won't have time to cover the details, but be aware that it's a thing

- Okay, I'll have office hours from 1:00 to 2:00 today to talk about the project if you have any questions; thanks for showing up today and sticking with it through this semester!