//****************************************************************************//
//******************** Search (cont.) - May 24th, 2018 **********************//
//**************************************************************************//

- Today's whiteboard preface:
    "
    - Informed Search
        - BestFS
        - gBestFS (greedy search)
        - A*
            - Admissible
            - Consistent"
- It's not all-caps today! Huzzah!
- (once again, the graph I will not draw has been drawn)

---------------------------------------------

- "So, today we'll finish informed search, and possibly start moving into probability-based agents"
    - After this, you should have everything you need to finish the project
    - "Hopefully, you'll also realize that A* isn't as scary as you might've thought"
- But FIRST, let's just make sure everyone's on the same page in this class
    - We've talked about search problems that require a fully-observable, static, discrete, finite, deterministic environment
    - ALL of the algorithms we'll discuss for this are "planning" algorithms, which means we can boil them down to 2 main steps (a minimal sketch appears at the end of this recap):
        1) INTERNALLY (without taking an action), simulate movement to form a plan (i.e. a path to the goal)
        2) Execute that plan
    - What this means is that these algorithms can't handle any changes in the world while they're planning - the agent just looks at its snapshot of the world
    - Trying to re-run these searches at EVERY step to make them dynamic by brute force would mean running an exponential algorithm an exponential number of times - that's TERRIBLY inefficient
- We've covered one big type of this search problem: UNINFORMED searches, which only require the information in the formal definition of a search problem
    - Pros of uninformed:
        - They can be good because we ONLY need to know what actions we can take - we don't need any special information
        - They also tend to work on a LOT of different problems - anything that can be expressed as a graph (e.g. scheduling problems, etc.)
        - They can find solutions that you weren't expecting, or that are good by a metric you weren't expecting - this is ESPECIALLY helpful for complicated problems with lots of variables
        - They can easily be swapped for one another, since they differ mainly in the data structure used for the open list
    - Cons of uninformed:
        - They tend to be comparatively slow and memory-intensive compared to informed searches
        - They're VERY poor at dealing with non-deterministic problems
            - This was a HUGE problem in the early days of robotics; a robot would make a plan to move, the environment would change, the wheels would move a little faster than the program expected, etc., and the plan would quickly get derailed
    - "Overall, we use uninformed searches when we just don't have a better way of solving it - they're versatile, and for problems that we can't come up with a good heuristic for, they will eventually get a decent answer"
- Now, DFS tries to minimize our search time, even if its results are sub-optimal; with BFS/UCS, we spend more time "thinking" on average but can always find an optimal solution
    - If we're in a situation where the difference between a "good" and "best" path is just a few cents, DFS might be worthwhile for the shorter calculation time; if we're talking thousands of dollars saved by a better path, though, we definitely want an optimal search
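- (Not from the whiteboard - before moving on, a minimal sketch of the "plan, then execute" loop described above; "planning_agent", "search", and "problem.execute" are all made-up placeholder names, not course code:)

        def planning_agent(problem, search):
            # Step 1: simulate INTERNALLY (no actions taken) to form a full plan
            plan = search(problem)
            # Step 2: execute that plan blindly - if the world changes
            # mid-execution, nothing here notices, which is exactly the
            # limitation noted above
            for action in plan:
                problem.execute(action)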
- INFORMED searches get all this information PLUS a "heuristic" function that lets them make more informed decisions and run faster - that's the difference!
    - The heuristic function can be ANY function as long as, for ANY possible state, it returns a real number greater than or equal to 0
        - Negative values don't really make sense, since meaning-wise they translate to "Oh, I went past the goal"
- For BFS, our open list is a queue; for DFS, it's a stack; for ALL the rest we'll talk about in this part of the class, it's a priority queue
    - The weights of the priority queue for each algorithm are, respectively: UCS is g(x); BestFS is g(x) + h(x); gBestFS is h(x) (all three appear in one sketch below)
        - h(x) is our heuristic function, g(x) is our distance from the start node
- Now, I mentioned how important a "getSuccessors" function is (it returns the list of states we can get to in the next step); ALSO important is an "isGoal" function that tells us if we've reached our goal and can stop searching
    - The more possible states we have, the more important these functions become; if we have multiple possible goals, or (worse) a set of rules we want to satisfy instead of a specific set of states, we'll definitely want an "isGoal" function
    - "Think of something like chess; we might have to program in a few hundred possible moves for each piece, but it's a LOT better than programming in every possible configuration of pieces on the board; similarly, for goals, we might define a set of rules we want to have ('opponent king in checkmate', etc.), and that's a LOT better than trying to come up with every possible checkmate configuration"
    - For PacMan, this might not be "required" yet, but you should still do it as a good practice; it'll save you a TON of work when the problems get more complicated
- Alright, so we're all caught up - now back to INFORMED SEARCH!
- So, first off - why should we care about heuristics?
    - Well, basically, heuristics save us time - they're a simpler-to-calculate version of the problem that's technically incorrect, but they let us estimate the answer MUCH faster than solving the problem completely
    - Our algorithms can use these to start searching in the most promising places first, so we don't waste time searching paths that don't work out
- So, BEST-FIRST SEARCH, like we said, is the same as UCS except that its weight when adding new nodes isn't just the distance from the start node, but the distance from the start node PLUS our heuristic function's estimate of the new node's distance from the goal
    - So, the weight we add the node with is now "weightOfCurrentNode + weightOfNeighborBeingAdded + h(s)", where h(s) is the node's estimated cost to get to the goal from there
    - SIDENOTE: For UCS/BestFS/the rest of these searches, it's okay to add "duplicate" nodes (e.g. to add "C" to the list twice), since we can have two potential paths to the same node - the "tracked paths" and weights of these nodes will be different, so they're really not duplicates at all!
- Now, on average, this search will get closer to the goal as it continues searching
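- (Not from the lecture - a sketch showing how one priority queue plus a pluggable weight function covers UCS, BestFS, and gBestFS; "best_first_search", "get_successors", and "is_goal" echo the lecture's terms, but the code itself is illustrative, assuming get_successors(state) returns (neighbor, stepCost) pairs:)

        import heapq, itertools

        def best_first_search(start, is_goal, get_successors, weight):
            # weight(g, s) picks the algorithm:
            #   lambda g, s: g         -> UCS
            #   lambda g, s: g + h(s)  -> BestFS
            #   lambda g, s: h(s)      -> gBestFS (greedy)
            tie = itertools.count()  # tiebreaker so heapq never compares states
            open_list = [(weight(0, start), next(tie), 0, start, [start])]
            closed = set()
            while open_list:
                _, _, g, state, path = heapq.heappop(open_list)
                if is_goal(state):
                    return path  # the "tracked path" that got us to the goal
                if state in closed:
                    continue
                closed.add(state)
                for neighbor, step_cost in get_successors(state):
                    # adding "duplicates" is fine: same state, different path/weight
                    heapq.heappush(open_list, (weight(g + step_cost, neighbor),
                                               next(tie), g + step_cost,
                                               neighbor, path + [neighbor]))
            return None  # open list exhausted - no path to the goal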
- ...HOWEVER, this algorithm can possibly break if we have a poor heuristic
    - For UCS, if the goal node is hidden behind an expensive path, it'll be one of the last things it checks, even if that path is the only way to get to the goal
        - If we had a good heuristic, it could save us a lot of time by factoring in how close that path is to the goal!
    - For BestFS, if we have a good heuristic, it'll get us a good answer - but what if we don't?
- Let's say that we have GREEDY BEST-FIRST SEARCH, where the weight for new nodes is just our heuristic's estimate of the distance to the goal
    - If the path is straightforward, this'll let us go straight to the goal without any wasted searching!
    - HOWEVER, this approach has a MASSIVE problem: we're basing the weights ONLY on our heuristic and NOT on the "real cost" of reaching the node, so it's completely dependent on our heuristic being accurate
        - So, for instance, suppose the goal in a straight-line graph is 3 1-cost edges away, but there's also a "shortcut" edge straight to the node right next to the goal that costs 100; greedy search will take the shortcut instead of the 3 1-cost edges, since the heuristic doesn't take the cost of reaching the nodes into account - it finds a clearly wrong path!
        - This assumes a heuristic that only measures, say, the distance from the node to the goal, but does NOTHING to account for the cost of traveling between our nodes - so it fails to accurately represent the environment
        - Plain-old BestFS wouldn't fall for this - it would still explore that path, so a naive heuristic might increase our search time, but it wouldn't return it as an answer
    - So, greedy search is NOT guaranteed to be optimal, even with a good heuristic
- BestFS IS complete; in the worst case, it just becomes UCS
    - ...however, it is NOT guaranteed to be optimal
    - The general reason for this is that our heuristic might overestimate the cost to the goal; if it does, we might think that we've found the best path to the goal and return early, BEFORE we check a shorter path that we've (falsely) guessed is longer than our current one
    - ...ON THE OTHER HAND, if we can PROVE that our heuristic never overestimates the true cost to the goal, then it WOULD be optimal!
        - At worst, we underestimate the costs of some nodes, which would lead to exploring some useless paths, but we'd still find the right answer in the end
- In FACT, do you want to know what A* is? ALL the A* algorithm is is the BestFS algorithm with a heuristic we've GUARANTEED won't overestimate a node's real cost to the goal
    - That's it!
    - "On an exam, I might ask you stuff like 'What's BestFS with a heuristic of 0?' (it's UCS) or 'With an admissible heuristic?' (A*)"
- This brings us to some ways of evaluating heuristics that we have to quickly define
- An ADMISSIBLE heuristic is one that'll never overestimate the cost to reach the goal from any given state
    - Now, a "feature" of A* is that some problems have multiple admissible heuristics - so in cases like this, how do we decide which one is "better"? After all, a more accurate heuristic means we have to do less searching
    - So, we want the heuristic that's the closest to the actual cost WITHOUT overestimating
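- (Not from the lecture - two example heuristics for a hypothetical grid world where every move costs at least 1, meant to be plugged into the best_first_search sketch above with weight = lambda g, s: g + h(s); the goal coordinates are made up:)

        def h_zero(state):
            # h(s) = 0 can never overestimate, so BestFS with it is just UCS
            return 0

        def h_manhattan(state, goal=(5, 5)):
            # on a grid where every move costs at least 1, Manhattan distance
            # can't overestimate the true cost-to-goal, so it's admissible -
            # and BestFS with an admissible heuristic is exactly A*
            return abs(state[0] - goal[0]) + abs(state[1] - goal[1])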
- We say a heuristic h1 DOMINATES another heuristic (say, h2) if, for all possible inputs, h1(x) >= h2(x) while both are still admissible - i.e. we want a heuristic that's as high as possible while still not overestimating
    - "It's only dominant if it's dominant EVERYWHERE; if h1 is better some places and h2 is better in others, then neither is dominant"
    - Because of this, it can be tricky to find truly "dominant" heuristics that are better everywhere
- Another way of evaluating heuristics is by actually running an experiment and computing a number like this: (total size of the search space) / (avg # of nodes explored using h(x))
    - This is called the INFORMEDNESS of a heuristic; when we can't mathematically prove the dominance of a heuristic, this can be a useful way of deciding which one is better anyway
- Finally, CONSISTENCY - if a heuristic badly underestimates costs, then our searches may be significantly slowed down
    - We say a heuristic is CONSISTENT if its estimate never drops, between two neighboring nodes, by more than the actual cost of the edge between them - i.e. h(A) <= cost(A, B) + h(B)
    - That probably seems abstract, but it just means that if moving from node A to node B "actually" costs 5 distance, then the heuristic's estimate won't change by more than 5 distance when we move from A to B
        - Otherwise, we might be teleporting or something (e.g. two nodes are next to each other but separated by a wall, so the "actual" path from one to the other is long, but the heuristic might think they're close together)
    - An admissible heuristic that isn't consistent will STILL give an optimal result (the ultimate path will be the same) - it'll just take longer to find that answer, since it'll try searching "dead ends"
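- (Not from the lecture - brute-force checks of the two definitions above; they assume you already know the optimal cost-to-goal "true_cost(s)" and the full edge list, so they're only practical for toy graphs, but they make the definitions concrete:)

        def is_admissible(states, h, true_cost):
            # admissible: h never overestimates the real cost-to-goal from any state
            return all(h(s) <= true_cost(s) for s in states)

        def is_consistent(edges, h):
            # consistent: h never drops by more than the edge cost between
            # neighbors, i.e. h(a) <= cost(a, b) + h(b) for every (a, b, cost)
            return all(h(a) <= cost + h(b) for a, b, cost in edges)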